mk-log-parser

Langue: en

Version: 2008-12-29 (fedora - 04/07/09)

Section: 1 (Commandes utilisateur)

NAME

mk-log-parser - Analyze and review MySQL logs.

SYNOPSIS

Analyze and report on a slow log:
  mk-log-parser /path/to/slow.log
 
 

Review a slow log, saving results to the test.query_review table in a MySQL server running on host1. See --review for more on reviewing queries:

  mk-log-parser --review h=host1,D=test,t=query_review
 
 

DESCRIPTION

"mk-log-parser" is a framework for much more functionality in the future, but at present it can analyze MySQL slow query logs and report on aggregate statistics by grouping similar queries together and showing the worst ones first in a report. It can also do a ``query review,'' which means to save a sample of each type of query into a MySQL table so you can easily see whether you've reviewed and analyzed a query before. The benefit of this is that you can keep track of changes to your server's queries and avoid repeated work. You can also save other information with the queries, such as comments, issue numbers in your ticketing system, and so on.

By default the tool does slow-query-log analysis, and there are options to make it do reviews.

Note that this is a work in *very* active progress and you should expect incompatible changes in the future.

OUTPUT

The default output is a query analysis report. There is one paragraph for each class of query analyzed. A ``class'' of queries all have the same fingerprint, which is an abstracted version of the query text with literals removed, whitespace collapsed, and so forth. The report is formatted so it's easy to paste into emails without wrapping, and all non-query lines begin with a comment, so you can save it to a .sql file and open it in your favorite syntax-highlighting text editor.

The report begins with one paragraph about the entire analysis run:

    # Queries: 8, Time: 762.1ms
    # CPU times: 260ms user, 30ms system
    # Memory use: 5.63M rss,  8.40M vsz
 
 

This file had 8 queries total, with a total execution time of 762.1ms. The CPU and memory lines are statistics about the analysis tool; we are trying very hard to make the tool fast and lean. If you find a log file that takes an unusually long time or a lot of memory, please submit a bug report.

Each query then appears in a paragraph. Here's a sample, slightly reformatted so 'perldoc' will not wrap lines in a terminal:

  # Query 001 (0 QPS) ID: 0x66825DDC008FFA89 at byte 332 ________________
  #              total   %     min     max     avg     95% stddev  median
  #     Count        1  12
  #      Time  726.1ms  95 726.1ms 726.1ms 726.1ms 726.1ms      0 726.1ms
  #      Lock     91us  29    91us    91us    91us    91us      0    91us
  # Rows sent        0   0       0       0       0       0      0       0
  # Rows exam    63.0k 100   63.0k   63.0k   63.0k   63.0k      0   63.0k
  #       DBs  db1
  #     Users  [SQL_SLAVE]
  # Execution times
  #   1us
  #  10us
  # 100us
  #   1ms
  #  10ms
  # 100ms  ##############################################################
  #    1s
  #  10s+
  # Tables
  #    SHOW TABLE STATUS FROM `db2` LIKE 'table1'\G
  #    SHOW CREATE TABLE `db2`.`table1`\G
  #    SHOW TABLE STATUS FROM `db1` LIKE 'table2'\G
  #    SHOW CREATE TABLE `db1`.`table2`\G
  update db2.table1 n
        inner join db1.table2 a using(table2) 
        set n.column1 = a.column1, n.word3 = a.word3\G
  # Converted for EXPLAIN
  # EXPLAIN
  select  n.column1 = a.column1, n.word3 = a.word3 from db2.table1 n
        inner join db1.table2 a using(table2) \G
 
 

Let's take that line by line. Line 1 is the header:

  # Query 001 (0 QPS) ID: 0x66825DDC008FFA89 at byte 332 ________________
 
 

This line identifies the sequential number of the query in the sort order specified by --sort. Then there's the queries per second, and a query ID. This ID is a hex version of the query's checksum in the database, if you're using --review. You can select the reviewed query's details from the database with a query like "SELECT .... WHERE checksum=0x66825DDC008FFA89".

Finally, in case you want to find a sample of the query in the log file, there's the byte offset where you can look. (This is not always accurate, due to some silly anomalies in the slow-log format, but it's usually right.) The position refers to the --worst sample, which we'll see more about below.

Next is the table of metrics about this class of queries.

  #              total   %     min     max     avg     95% stddev  median
  #     Count        1  12
  #      Time  726.1ms  95 726.1ms 726.1ms 726.1ms 726.1ms      0 726.1ms
  #      Lock     91us  29    91us    91us    91us    91us      0    91us
  # Rows sent        0   0       0       0       0       0      0       0
  # Rows exam    63.0k 100   63.0k   63.0k   63.0k   63.0k      0   63.0k
 
 

The first line is column headers for the table. The total is the actual value of the specified metric, and the percentage is the percent of the total for the whole analysis run. For example, in this case we can see that the query executed 1 time, which is 12% of the total number of queries in the file. The min, max and avg columns are self-explanatory. The 95% column shows the 95th percentile; 95% of the values are less than or equal to this value. The standard deviation shows you how tightly grouped the values are. The standard deviation and median are both calculated from the 95th percentile, discarding the extremely large values.

  #       DBs  db1
  #     Users  [SQL_SLAVE]
 
 

These two lines show you from which databases the queries in this class were executed, and the users who ran them. In this case, the queries were all run by the replication SQL slave thread. This is a feature of Percona's enhanced slow-query-logging functionality. Normal unpatched servers don't log replication activity to the slow query log.

  # Execution times
  #   1us
  #  10us
  # 100us
  #   1ms
  #  10ms
  # 100ms  ##############################################################
  #    1s
  #  10s+
 
 

The execution times show a logarithmic chart of time clustering. Each query goes into one of the ``buckets'' and is counted up. The buckets are powers of ten. The first bucket is all values in the ``single microsecond range'' --- that is, less than 10us. The second is ``tens of microseconds,'' which is from 10us up to (but not including) 100us; and so on.

  # Tables
  #    SHOW TABLE STATUS FROM `db2` LIKE 'table1'\G
  #    SHOW CREATE TABLE `db2`.`table1`\G
  #    SHOW TABLE STATUS FROM `db1` LIKE 'table2'\G
  #    SHOW CREATE TABLE `db1`.`table2`\G
 
 

This section is a convenience: if you're trying to optimize the queries you see in the slow log, you probably want to examine their structure and size. These are copy-and-paste-ready commands to do that. The values are extracted from the query sample, and are usually correct. If you find a query that isn't parsed right, submit a bug report.

  update db2.table1 n
        inner join db1.table2 a using(table2) 
        set n.column1 = a.column1, n.word3 = a.word3\G
 
 

This is a sample of the queries in this class of query. This is not a random sample. It is the query that performed the worst, according to --worst. You would normally see a commented "# EXPLAIN" line just before it, so you can copy-paste the query to examine its EXPLAIN plan. But for non-SELECT queries such as this one, that isn't possible to do, so the tool tries to transform the query into a roughly equivalent SELECT query, and adds that below:

  # Converted for EXPLAIN
  # EXPLAIN
  select  n.column1 = a.column1, n.word3 = a.word3 from db2.table1 n
        inner join db1.table2 a using(table2) \G
 
 

QUERY REVIEWS

A ``query review'' is the process of storing all the query fingerprints you find so you can add meta-data to them, mark them for follow up, etc. But most importantly, you can refer to the stored values on subsequent runs so you'll knowh whether you've seen a query before, and not waste your time on it.

You can do even more with "mk-log-parser"'s query review functionality, though. You can add meta-data to each row by adding more columns. These can be arbitrary columns, such as the ticket ID of a JIRA ticket you open for a given query. And you can add comments and other information.

Next time you run "mk-log-parser" in --review mode, it will do the following:

It won't show you queries you've already reviewed. A query is considered to be already reviewed if you've set a value for the "reviewed_by" column.
Queries that you've reviewed, and don't appear in the output, will cause gaps in the query number sequence in the first line of each paragraph. And the value you've specified for --top will still be honored. So if you've reviewed all queries in the top 10 and you ask for the top 10, you won't see anything in the output.
If you want to see the queries you've already reviewed, you can specify --reportall. Then you'll see the normal analysis output, but you'll also see the information from the review table, just below the execution time graph. For example,
   # Review information
   #      comments: really bad IN() subquery, fix soon!
   #    first_seen: 2008-12-01 11:48:57
   #   jira_ticket: 1933
   #     last_seen: 2008-12-18 11:49:07
   #      priority: high
   #   reviewed_by: xaprb
   #   reviewed_on: 2008-12-18 15:03:11
 
 

You can see how useful this meta-data is --- as you analyze your queries, you get your comments integrated right into the report.

INTERNALS

A few concepts might help you understand what this tool does. First, a query fingerprint. This is the abstracted form of a query, which makes it possible to group similar queries together. Abstracting a query removes literal values, normalizes whitespace, and so on. For example, these two queries:
   SELECT name, password FROM user WHERE id='12823';
   select name,   password from user
      where id=5;
 
 

Both of those queries will fingerprint to

   select name, password from user where id=?
 
 

Once the query's fingerprint is known, we can then talk about a query as though it represents all similar queries.

What "mk-log-parser" does is analogous to a GROUP BY statement in SQL. If your command-line looks like this,

   mk-log-parser /path/to/slow.log --sort sum --worst Query_time --top 10
 
 

The corresponding pseudo-SQL looks like this:

   SELECT WORST(query BY Query_time), SUM(Query_time), ...
   FROM /path/to/slow.log
   GROUP BY FINGERPRINT(query)
   ORDER BY SUM(Query_time) DESC
   LIMIT 10
 
 

This will matter a lot more in the future when we add more capabilities to "mk-log-parser". (At that point we'll dig back in SVN history for the documentation that we've removed from this tool due to unimplemented functionality.)

OPTIONS

Some options are negatable by specifying them in their long form with a --no prefix.
--analyze
short form: -A; default: yes; negatable: yes

Group similar queries and report on them.

This is the standard slow-log analysis functionality. See OUTPUT for the description of what this does and what the results look like. You can disable it for --review if you don't want to see any reports on queries.

--fingerprints
short form: -f

Add query fingerprints to the --analyze report. This is mostly useful for debugging purposes.

--forexplain
negatable: yes; default: yes

Print extra information to make analysis easy.

This option adds code snippets to make it easy to run SHOW CREATE TABLE and SHOW TABLE STATUS for the query's tables. It also rewrites non-SELECT queries into a SELECT that might be helpful for determining the non-SELECT statement's index usage.

--header
default: yes; negatable: yes

Print information about the entire analysis run.

See OUTPUT for more information on the header.

--review
short form: -R; type: DSN

Store a sample of each class of query in this DSN.

The argument specifies a table to store all unique query fingerprints in. The table must have at least the following columns, but you can add any more columns you wish:

   CREATE TABLE query_review (
      checksum     BIGINT UNSIGNED NOT NULL PRIMARY KEY,
      fingerprint  TEXT NOT NULL,
      sample       TEXT NOT NULL,
      first_seen   DATETIME,
      last_seen    DATETIME,
      reviewed_by  VARCHAR(20),
      reviewed_on  DATETIME,
      comments     VARCHAR(100)
   );
 
 

The columns are as follows:

   COLUMN       MEANING
   ===========  ===============
   checksum     A 64-bit checksum of the query fingerprint
   fingerprint  The abstracted version of the query; its primary key
   sample       The query text of a sample of the class of queries
   first_seen   The smallest timestamp of this class of queries
   last_seen    The largest timestamp of this class of queries
   reviewed_by  Initially NULL; if set, query is skipped thereafter
   reviewed_on  Initially NULL; not assigned any special meaning
   comments     Initially NULL; not assigned any special meaning
 
 

Note that the "fingerprint" column is the true primary key for a class of queries. The "checksum" is just a cryptographic hash of this value, which provides a shorter value that is very likely to also be unique.

As the tool parses the log, it keeps track of which query fingerprints it has seen. Each time it sees a new one, it inserts the query into this table. When you're done, your table should contain a row for each fingerprint.

If you enable this option and disable --analyze, the tool will ignore certain analysis-related options, like --top.

--reportall
Include all queries, even if they have already been reviewed.
--rusage
default: yes; negatable: yes

Report CPU times and memory usage in the header of the query analysis report.

--sort
short form: -s; type: string; default: sum

Sort the reported queries by this aggregate value of --worst.

Valid aggregate values are: sum, min, max, avg, cnt. For example, if you want to sort by the sum of Query_time, then specify --worst Query_time --sort sum.

--top
short form: -t; type: string; default: 95%

Limit output to the worst offenders.

If the argument is an integer, report only the top N worst queries. If the argument is an integer followed by the "%" sign, report that percentage of the worst queries.

--worst
short form: -w; type: string; default: Query_time

The attribute that measures a query's badness.

Only basic attributes are valid: Query_time, Lock_time, Rows_read, etc. This option is not fully implemented. It's possible that non-default values will do strange things.

DOWNLOADING

You can download Maatkit from Google Code at <http://code.google.com/p/maatkit/>, or you can get any of the tools easily with a command like the following:
    wget http://www.maatkit.org/get/toolname
    or
    wget http://www.maatkit.org/trunk/toolname
 
 

Where "toolname" can be replaced with the name (or fragment of a name) of any of the Maatkit tools. Once downloaded, they're ready to run; no installation is needed. The first URL gets the latest released version of the tool, and the second gets the latest trunk code from Subversion.

SYSTEM REQUIREMENTS

You need Perl and some core packages that ought to be installed in any reasonably new version of Perl.

ENVIRONMENT

The environment variable "MKDEBUG" enables verbose debugging output in all of the Maatkit tools:
    MKDEBUG=1 mk-....
 
 

BUGS

Please use Google Code Issues and Groups to report bugs or request support: <http://code.google.com/p/maatkit/>.

Please include the complete command-line used to reproduce the problem you are seeing, the version of all MySQL servers involved, the complete output of the tool when run with ``--version'', and if possible, debugging output produced by running with the "MKDEBUG=1" environment variable.

COPYRIGHT, LICENSE AND WARRANTY

This program is copyright 2007-2008 Baron Schwartz. Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic' to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

AUTHOR

Baron Schwartz, Daniel Nichter

VERSION

This manual page documents Ver 0.9.1 Distrib 2725 $Revision: 2720 $.