Reprinted with Permission by Quest Software Sept. 2005


Introducing the MySQL information_schema

Roland Bouman  

Introducing the MySQL information_schema database

MySQL 5.02 and later includes an information_schema database. The MySQL information_schema is a basic implementation of the INFORMATION_SCHEMA schema defined in the SQL Standard (ISO/IEC 9075). The purpose of this database is to provide data describing the databases and their constituent objects: metadata.

The information_schema database consists of a set of table-like objects (actually, system views), exposing metadata in a relational format. This allows one to execute arbitrary SELECT statements to retrieve or to format metadata. Metadata is available only for those objects that are accessible to the current user. The metadata is automatically maintained by the server, and the information_schema database is automatically created upon MySQL installation.

In MySQL versions prior to 5.0.2, virtually the only way to obtain metadata is through the MySQL specific SHOW syntax. The possibilities to control the output of the SHOW commands are limited to filtering for data of interest, whereas the information schema allows us to use the full set of relational operations defined by the SQL language. This allows DBA's and application developers to retrieve exactly the data they need to port applications, generate maintenance scripts or document database structure.

Some naming and typography conventions

We will usually refer to the information_schema database by the term "information schema". When we want to refer to the information schema as a particular database among other databases, we will use its actual name. In those cases, we'll call it: "information_schema database", or just plain: "information_schema".

The information schema consists of a set table-like objects called system views. We deliberately use the term "table-like objects", because for most purposes, we can regard these objects as ordinary tables. There are good reasons to assume that these objects are actually temporary tables. There are also good reasons to assume that these objects are actually views. In practice however, the exact nature of these objects is of no consequence for how we use them. Instead of writing "information schema table-like objects", "information schema temporary tables", "information schema system views" or "information schema views", we will refer to them by the term "information schema tables", or just "tables" whenever the context allows it.

Metadata and data dictionaries

We just stated that the information schema tables contain metadata. Literally, metadata means "data about data", but the information schema does not contain data about just any other data. Rather, it contains data that describes databases and their structure in terms of their associated objects. That is, the information schema lists the existence and properties of databases, tables, columns, and so on. Most database management systems contain some sort of collection of database objects that list database metadata. Such a collection of objects is loosely referred to by the term "data dictionary".

The MySQL information_schema database matches the description of a data dictionary. However, it is by no means the only source of MySQL metadata. The built-in mysql system database also contains data about databases and database objects. It even contains human-readable metadata describing the SQL language supported by MySQL. One could argue that the MySQL data dictionary consists of all metadata sources, and would thus include both the mysql and the information_schema databases. Let's take a practical point of view towards this issue. Instead of referring to the MySQL data dictionary, we will refer explictly to the "information schema", meaning the information_schema database. When we need to, we'll refer to the mysql database. We'll use the term "data dictionary" only in its loose sense, and not to refer to either of these databases.

The SQL Standard (ISO/IEC 9075) also defines a data dictionary, the INFORMATION_SCHEMA schema. The MySQL information schema nearly covers all the elements that make up the SQL Standard Feature entitled "Basic information schema". The MySQL implementation shows some extensions, some minor incompatibilities and some less minor incompatibilities compared to the requirements of the feature defined by the standard. However, the majority of requirements for the Standard Feature are met by the MySQL implementation. The exact differences and similarities will not be adressed in this article.

Based upon the metadata contained in the information schema, one could recreate the structure of a particular user database, but not the data contained within it. So, the information schema does not contain the actual data that is stored in those tables and columns, it contains information about those tables and columns. The one exception is the metadata that describes the information_schema database itself, which is also listed by the information schema. The metadata that describes the information_schema database itself is at least in part identical to data stored in the information_schema database. This illustrates the fact that there is no clear distinction between metadata and other data. To some extent, your point of view determines wheter data is "just data", or "data about data".

The information schema is dynamically populated to list only metadata concerning those objects that are accessible to the current user. Accessible are those objects upon which the current user can exert some privilege. Actually, the sort of information contained within the information_schema database is a lot like the sort of information returned by the various SHOW and DESCRIBE statements. However, there are quite some differences when it comes to putting that information to use. In the course of this topic, specific similarities and differences between working with the information schema and the SHOW syntax will be highlighted.

It's a database, isn't it?

Let's take a closer look at the MySQL information schema.

(In the remainder of this article, we'll send some commands to the MySQL server to illustrate some features of the information schema. We did this using MySQL 5.0.10 version of the Server and the standard MySQL command line client tool, mysql.exe. You can use any client tool you like to do this, as long as it allows you to send arbitrary SQL commands to the server. The MySQL Server should have at least version 5.0.2)

In some respects, the information schema database resembles any other database. You can issue a USE command, making it the current database.

    USE information_schema;

You can issue various SHOW commands such as SHOW TABLES, SHOW CREATE TABLE and SHOW COLUMNS FROM, just as if it were a user-defined database:

    SHOW TABLES;
    +---------------------------------------+
    | Tables_in_information_schema          |
    +---------------------------------------+
    | SCHEMATA                              |
    | TABLES                                |
    | COLUMNS                               |
    | CHARACTER_SETS                        |
    | COLLATIONS                            |
    | COLLATION_CHARACTER_SET_APPLICABILITY |
    | ROUTINES                              |
    | STATISTICS                            |
    | VIEWS                                 |
    | USER_PRIVILEGES                       |
    | SCHEMA_PRIVILEGES                     |
    | TABLE_PRIVILEGES                      |
    | COLUMN_PRIVILEGES                     |
    | TABLE_CONSTRAINTS                     |
    | KEY_COLUMN_USAGE                      |
    | TRIGGERS                              |
    +---------------------------------------+
    16 rows in set (0.00 sec)

(The TRIGGERS table was added in version 5.0.10, thus users of a prior version of the server product won't see that one in their listing)

Finally and foremost, you can build SELECT statements referencing any table(s) from the information schema. For example, the following SELECT statement retrieves the table name and column name of all the columns in the database named human_resources that are collated according to the binary latin1 collation:

    SELECT  table_name
    ,       column_name
    FROM    information_schema.columns c
    WHERE   c.table_schema   = 'human_resources'
    AND     c.collation_name = 'latin1_bin'
    ;

The ability to build arbitrary queries functionally distinguishes the information schema from the SHOW commands. Whereas some SHOW commands do support a WHERE clause to filter results, the information schema allows for the full set of SQL contructs to mould the resultset. Complex SELECT expression involving subqueries, JOIN's, GROUP BY's and UNION's can be used to query the information_schema database, just as is the case for any ordinary database. For example, the following statement lists name, type and comment of all tables, views and routines residing in the current database. If the object is either a table or a view, the number of columns is displayed, and if the object is either a view or a routine, the definition is displayed:

    SELECT     t.table_name                  object_name
    ,          t.table_type                  object_type
    ,          count(c.column_name)          column_count
    ,          t.table_comment               object_comment
    ,          v.view_definition             object_definition
    FROM       information_schema.tables     t
    INNER JOIN information_schema.columns    c
    ON         t.table_schema              = c.table_schema
    AND        t.table_name                = c.table_name
    LEFT JOIN  information_schema.views      v
    ON         t.table_schema              = v.table_schema
    AND        t.table_name                = v.table_name
    WHERE      t.table_schema              = schema()
    GROUP BY   t.table_name
    ,          t.table_type
    ,          t.table_comment
    ,          v.view_definition
    UNION ALL
    SELECT     r.routine_name                object_name
    ,          r.routine_type                object_type
    ,          null                          column_count
    ,          r.routine_comment             object_comments
    ,          r.routine_definition          object_definition
    FROM       information_schema.routines   r
    WHERE      r.routine_schema            = schema()
    ;

Is it really just a database?

In spite of what these examples seem to indicate, the information_schema database is not an ordinary physical database like the ones you create by issuing a CREATE DATABASE statement. Rather, the information_schema database is a built-in virtual database. Its sole purpose is to provide information about the database system itself. MySQL automatically populates the tables in the information schema. You should expect any attempt to modify the data using some DML statement like INSERT, UPDATE or DELETE to fail:

    INSERT
    INTO   schemata (
           schema_name
    ,      default_character_set_name
    ,      default_collation_name
    ) VALUES (
           'bogus'
    ,      'latin1'
    ,      'latin1_swedish_ci'
    );
    ERROR 1288 (HY000): The target table schemata of the INSERT is not updatable

The same goes for creating your own database objects in the information_schema database and altering or dropping the built-in objects. Again, it is not surprising that the next statement fails:

    ALTER TABLE schemata DROP COLUMN catalog_name;
    ERROR 1146 (42S02): Table 'information_schema.schemata' doesn't exist

Finally, issuing a SHOW CREATE DATABASE command also generates a an error:

    SHOW CREATE DATABASE information_schema;
    ERROR 1049 (42000): Unknown database 'information_schema'

The latter two error messages suggest that our commands attempt to reference some non-existant object. However, we know that these objects really do exist: we can successfully write a SELECT statement for the SCHEMATA table, just as we can successfully USE the information_schema database.

The explanation probably lies in the details of the implementation that handles references to the information_schema database. We shouldn't worry too much about it though. The important thing is that we are allowed to query the information schema, whereas we are not allowed to change its structure or modify its data.

What about the mysql database?

In some respects, the information_schema database resembles that other built-in database, the mysql database. We already noted that both these databases contain metadata. Usually, the mysql database is also automatically populated by MySQL. You should not attempt to manipulate its structure.

There are differences too. For starters, the mysql database is an actual, physical database. In your MySQL data directory, you can find the database files that contain the actual data that makes up the mysql database.

Secondly, the information schema is intended to be a comprehensible, rich source of metadata. The information_schema database is accessible to every user, and it's populated dynamically to contain only the metadata for those objects that are accessible to the current user. This is not the case with the mysql database. When a user can query a particular table in the mysql database, all rows in this table can be retrieved, not just the ones that are applicable to the current user. The mysql database contains some structures that are readily understandable by the human user, such as the help_% tables. However, most structures in that database are not, or at least not intended to be easily understandable by human users. In that respect, the mysql database is a true system database. It's primarily intended to provide information about the system to the system itself. It is not intended to be queried directly by ordinary users. Querying the mysql database tables could be valuable to system administrators and DBA's however.

Thirdly, there's a difference when it comes to addition, deletion or modification of data. This is possible for the tables in the mysql database, but not for those in the information_schema database. In older versions of MySQL, some options concerning user accounts and privileges could only be achieved by modifying data in a special collection of tables in the mysql database, the grant tables. More recent versions provide specialized SQL statements that define user and user privileges, making it unnecessary to modify the data in the grant tables directly. However, one should be very careful when one does decide to manipulate the data in the grant tables directly. Because the mysql database is a system database, the effect of changing the data directly could have a serious impact on the system. For example, an accidental error in an UPDATE of the mysql.user table could lock a user out of the system. Like querying the mysql database, direct manipulation of the data contained within it is out of bounds for normal users, although it could be useful for DBA's and System Administrators.

Finally, you can safely rely on the structure of the information_schema database to remain quite stable. The definition and functionality of the structures found in the information schema are quite firmly rooted in the SQL standard. We expect it to evolve towards a nearly full implementation in the future. This is not all the case for the mysql database. This database may or may not change in the future, so the information schema is really the preferable way to obtain metadata in MySQL 5.02 and up.

Ok, What's next?

In the next issue of the newsletter, we'll take a closer look at the MySQL information schema. A realistic use case will be discussed, illustrating the construction a SQL code generator driven by the information schema. Meanwhile, We'd like to suggest some further reading for the interested reader:

General info on the information schema from the MySQL reference
http://dev.mysql.com/doc/mysql/en/information-schema.html
MySQL grant tables
http://dev.mysql.com/doc/mysql/en/adding-users.html
MySQL Information Schema Forum
http://forums.mysql.com/list.php?101
A Diagram of the MySQL information schema
http://www.xcdsql.org/Misc/MySQL_INFORMATION_SCHEMA.html
Showing index selectivity
http://www.jpipes.com/article/mysql-5-stored-procedures-and-information_schema
Fuzzy foreign key matching
http://leithal.cool-tools.co.uk/?p=8