mk-table-checksum

Langue: en

Version: 2008-12-29 (fedora - 04/07/09)

Section: 1 (Commandes utilisateur)

NAME

mk-table-checksum - Perform an online replication consistency check, or checksum MySQL tables efficiently on one or many servers.

SYNOPSIS

STOP! Are you checksumming a slave(s) against its master? Then be sure to learn what ``--replicate'' does. It is probably the option you want to use.
    mk-table-checksum --replicate=mydb.checksum master-host
    ... time passses, replication catches up ...
    mk-table-checksum --replicate=mydb.checksum --replcheck 2 master-host
 
 

Or,

    mk-table-checksum h=host1,u=user,p=password h=host2 ...
 
 

Or,

    mk-table-checksum host1 host2 ... hostN | mk-checksum-filter
 
 

See ``SPECIFYING HOSTS'' for more on the syntax of the host arguments.

OVERVIEW

mk-table-checksum generates table checksums for MySQL tables, typically useful for verifying your slaves are in sync with the master. The checksums are generated by a query on the server, and there is very little network traffic as a result.

Checksums typically take about twice as long as COUNT(*) on very large InnoDB tables in my tests. For smaller tables, COUNT(*) is a good bit faster than the checksums. See ``--algorithm'' for more details on performance.

If you specify more than one server, mk-table-checksum assumes the first server is the master and others are slaves. Checksums are parallelized for speed, forking off a child process for each table. Duplicate server names are ignored, but if you want to checksum a server against itself you can use two different forms of the hostname (for example, ``localhost 127.0.0.1'', or ``h=localhost,P=3306 h=localhost,P=3307'').

If you want to compare the tables in one database to those in another database on the same server, just checksum both databases:

    mk-table-checksum --databases db1,db2
 
 

You can then use mk-checksum-filter to compare the results in both databases easily.

mk-table-checksum examines table structure only on the first host specified, so if anything differs on the others, it won't notice. It ignores views.

The checksums work on MySQL version 3.23.58 through 6.0-alpha. They will not necessarily produce the same values on all versions. Differences in formatting and/or space-padding between 4.1 and 5.0, for example, will cause the checksums to be different.

SPECIFYING HOSTS

mk-table-checksum connects to a theoretically unlimited number of MySQL servers. You specify a list of one or more host definitions on the command line, such as ``host1 host2''. Each host definition can be just a hostname, or it can be a complex string that specifies connection options as well. You can specify connection options two ways:
Format a host definition in a key=value,key=value form. If an argument on the command line contains the letter '=', mk-table-checksum will parse it into its component parts. Examine the ``--help'' output for details on the allowed keys.

Specifying a list of simple host definitions ``host1 host2'' is equivalent to the more complicated ``h=host1 h=host2'' format.

With the command-line options such as ``--user'' and ``--password''. These options, if given, apply globally to all host definitions.

In addition to specifying connection options this way, mk-table-checksum allows shortcuts. Any options specified for the first host definition on the command line fill in missing values in subsequent ones. Any options that are still missing after this are filled in from the command-line options if possible.

In other words, the places you specify connection options have precedence: highest precedence is the option specified directly in the host definition, next is the option specified in the first host definition, and lowest is the command-line option.

You can mix simple and complex host definitions and/or command-line arguments. For example, if all your servers except one of your slaves uses a non-standard port number:

    mk-table-checksum --port 4500 master h=slave1,P=3306 slave2 slave3
 
 

If you are confused about how mk-table-checksum will connect to your servers, give the ``--explainhosts'' option and it will tell you.

HOW FAST IS IT?

Speed and efficiency are important, because the typical use case is checksumming large amounts of data.

"mk-table-checksum" is designed to do very little work itself, and generates very little network traffic aside from inspecting table structures with "SHOW CREATE TABLE". The results of checksum queries are typically 40-character or shorter strings.

The MySQL server does the bulk of the work, in the form of the checksum queries. The following benchmarks show the checksum query times for various checksum algorithms. The first two results are simply running "COUNT(col8)" and "CHECKSUM TABLE" on the table. "CHECKSUM TABLE" is just "CRC32" under the hood, but it's implemented inside the storage engine layer instead of at the MySQL layer.

  ALGORITHM       HASH FUNCTION  EXTRA      TIME
  ==============  =============  ========  =====
  COUNT(col8)                                2.3
  CHECKSUM TABLE                             5.3
  BIT_XOR         FNV_64                    12.7
  ACCUM           FNV_64                    42.4
  BIT_XOR         MD5            --optxor   80.0
  ACCUM           MD5                       87.4
  BIT_XOR         SHA1           --optxor   90.1
  ACCUM           SHA1                     101.3
  BIT_XOR         MD5                      172.0
  BIT_XOR         SHA1                     197.3
 
 

The tests are entirely CPU-bound. The sample data is an InnoDB table with the following structure:

  CREATE TABLE test (
    col1 int NOT NULL,
    col2 date NOT NULL,
    col3 int NOT NULL,
    col4 int NOT NULL,
    col5 int,
    col6 decimal(3,1),
    col7 smallint unsigned NOT NULL,
    col8 timestamp NOT NULL,
    PRIMARY KEY  (col2, col1),
    KEY (col7),
    KEY (col1)
  ) ENGINE=InnoDB
 
 

The table has 4303585 rows, 365969408 bytes of data and 173457408 bytes of indexes. The server is a Dell PowerEdge 1800 with dual 32-bit Xeon 2.8GHz processors and 2GB of RAM. The tests are fully CPU-bound, and the server is otherwise idle. The results are generally consistent to within a tenth of a second on repeated runs.

"CRC32" is the default checksum function to use, and should be enough for most cases. If you need stronger guarantees that your data is identical, you should use one of the other functions.

DOWNLOADING

You can download Maatkit from Google Code at <http://code.google.com/p/maatkit/>, or you can get any of the tools easily with a command like the following:
    wget http://www.maatkit.org/get/toolname
    or
    wget http://www.maatkit.org/trunk/toolname
 
 

Where "toolname" can be replaced with the name (or fragment of a name) of any of the Maatkit tools. Once downloaded, they're ready to run; no installation is needed. The first URL gets the latest released version of the tool, and the second gets the latest trunk code from Subversion.

OPTIONS

--algorithm
short form: -a; type: string

Checksum algorithm (ACCUM|CHECKSUM|BIT_XOR).

Specifies which checksum algorithm to use. Valid arguments are CHECKSUM, BIT_XOR and ACCUM. The latter two do cryptographic hash checksums.

CHECKSUM is built into MySQL, but has some disadvantages. BIT_XOR and ACCUM are implemented by SQL queries. They use a cryptographic hash of all columns concatenated together with a separator, followed by a bitmap of each nullable column that is NULL (necessary because CONCAT_WS() skips NULL columns).

CHECKSUM is the default. This method uses MySQL's built-in CHECKSUM TABLE command, which is a CRC32 behind the scenes. It cannot be used before MySQL 4.1.1, and various options disable it as well. It does not simultaneously count rows; that requires an extra COUNT(*) query. This is a good option when you are using MyISAM tables with live checksums enabled; in this case both the COUNT(*) and CHECKSUM queries will run very quickly.

The BIT_XOR algorithm is available for MySQL 4.1.1 and newer. It uses BIT_XOR(), which is order-independent, to reduce all the rows to a single checksum.

ACCUM uses a user variable as an accumulator. It reduces each row to a single checksum, which is concatenated with the accumulator and re-checksummed. This technique is order-dependent. If the table has a primary key, it will be used to order the results for consistency; otherwise it's up to chance.

The pathological worst case is where identical rows will cancel each other out in the BIT_XOR. In this case you will not be able to distinguish a table full of one value from a table full of another value. The ACCUM algorithm will distinguish them.

However, the ACCUM algorithm is order-dependent, so if you have two tables with identical data but the rows are out of order, you'll get different checksums with ACCUM.

If a given algorithm won't work for some reason, mk-table-checksum falls back to another. The least common denominator is ACCUM, which works on MySQL 3.23.2 and newer.

--argtable
type: string

The database.table with arguments for each table to checksum.

This table may be named anything you wish. It must contain at least the following columns:

   CREATE TABLE checksum_args (
      db         char(64)     NOT NULL,
      tbl        char(64)     NOT NULL,
      -- other columns as desired
      PRIMARY KEY (db, tbl)
   );
 
 

In addition to the columns shown, it may contain any of the other columns listed here (Note: this list is used for a magical overridable_args hash in the code):

   a C columns r c f k nouseindex o probability s savesince
   singlechunk since sincecolumn sleep sleep-coef trim w W
 
 

Each of these columns corresponds to the short form of a command-line option. Each column should be NULL-able. The data type does not matter, but it's suggested you use a sensible data type to prevent garbage data.

When "mk-table-checksum" checksums a table, it will look for a matching entry in this table. Any column that has a defined value will override the corresponding command-line argument for the table being currently processed. In this way it is possible to specify custom command-line arguments for any table.

If you add columns to the table that aren't in the above list of allowable columns, it's an error. The exceptions are "db", "tbl", and "ts". The "ts" column can be used as a timestamp for easy visibility into the last time the "since" column was updated with ``--savesince''.

This table is assumed to be located on the first server given on the command-line.

--askpass
Prompt for a password when connecting to MySQL.
--checksum
Print checksums and table names in the style of md5sum (disables ``--count'').

Makes the output behave more like the output of "md5sum". The checksum is first on the line, followed by the host, database, table, and chunk number, concatenated with dots.

--check-interval
type: time; default: 1s

How often to check for slave lag if ``--check-slave-lag'' is given.

--check-slave-lag
type: string

Pause checksumming each chunk until the specified DSN's slave lag is less than ``--maxlag''.

--chunksize
short form: -C; type: string

Approximate number of rows or size of data to checksum at a time. Allowable suffixes are k, M, G. Disallows -a CHECKSUM.

If you specify a chunk size, mk-table-checksum will try to find an index that will let it split the table into ranges of approximately ``--chunksize'' rows, based on the table's index statistics. It will checksum each range separately with parameters in the checksum query's WHERE clause.

If mk-table-checksum cannot find a suitable index, it will do the entire table in one chunk as though you had not specified ``--chunksize'' at all. Each table is handled individually, so some tables may be chunked and others not.

The chunks will be approximately sized, and depending on the distribution of values in the indexed column, some chunks may be larger than the value you specify.

If you specify a suffix (one of k, M or G), the parameter is treated as a data size rather than a number of rows. The output of SHOW TABLE STATUS is then used to estimate the amount of data the table contains, and convert that to a number of rows.

--columns
type: array

Checksum only this comma-separated list of columns.

--count
short form: -r; negatable: yes

Count rows in tables. This is built into ACCUM and BIT_XOR, but requires an extra query for CHECKSUM.

This is disabled by default to avoid an extra COUNT(*) query when ``--algorithm'' is CHECKSUM. If you have only MyISAM tables and live checksums are enabled, both CHECKSUM and COUNT will be very fast, but otherwise you may want to use one of the other algorithms.

--crc
short form: -c; negatable: yes; default: yes

Do a CRC (checksum) of tables.

Take the checksum of the rows as well as their count. This is enabled by default. If you disable it, you'll just get COUNT(*) queries.

--createreplicate
Create the replicate table given by ``--replicate'' if it does not exist.

Normally, if the replicate table given by ``--replicate'' does not exist, "mk-table-checksum" will die. With this option, however, "mk-table-checksum" will create the replicate table for you, using the database.table name given to ``--replicate''.

The structure of the replicate table is the same as the suggested table mentioned in ``--replicate''. Note that since ENGINE is not specified, the replicate table will use the server's default storage engine. If you want to use a different engine, you need to create the table yourself.

--databases
short form: -d; type: hash

Only checksum this comma-separated list of databases.

--defaults-file
short form: -F; type: string

Only read mysql options from the given file. You must give an absolute pathname.

--emptyrepltbl
Empty table given by ``--replicate'' before starting.

Issues a DELETE against the table given by ``--replicate'' before beginning work. Ignored if ``--replicate'' is not specified. This can be useful to remove entries related to tables that no longer exist, or just to clean out the results of a previous run.

--engine
short form: -e; type: hash

Do only this comma-separated list of storage engines.

--explain
Show, but do not execute, checksum queries (disables ``--emptyrepltbl'').
--explainhosts
Print connection information and exit.

Print out a list of hosts to which mk-table-checksum will connect, with all the various connection options, and exit. See ``SPECIFYING HOSTS''.

--float-precision
type: int

Precision for "FLOAT" and "DOUBLE" column comparisons.

If you specify this option, FLOAT and DOUBLE columns will be rounded to the specified number of digits after the decimal point for the checksum. This can avoid checksum mismatches due to different floating-point representations of the same values on different MySQL versions and hardware.

--function
short form: -f; type: string

Hash function for checksums (FNV_64, SHA1, MD5, CRC32, etc).

You can use this option to choose the cryptographic hash function used for ``--algorithm''=ACCUM or ``--algorithm''=BIT_XOR. The default is to use "CRC32", but "MD5" and "SHA1" also work, and you can use your own function, such as a compiled UDF, if you wish. Whatever function you specify is run in SQL, not in Perl, so it must be available to MySQL.

The "FNV_64" UDF mentioned in the benchmarks is much faster than "MD5". The C++ source code is distributed with Maatkit. It is very simple to compile and install; look at the header in the source code for instructions. If it is installed, it is preferred over "MD5".

--ignorecols
type: array

Ignore this comma-separated list of columns when calculating the checksum.

This option only affects the checksum when using the ACCUM or BIT_XOR ``--algorithm''.

--ignoredb
short form: -g; type: Hash

Ignore this comma-separated list of databases.

--ignoreengine
short form: -E; type: Hash; default: FEDERATED,MRG_MyISAM

Ignore this comma-separated list of storage engines.

--ignoretbl
short form: -n; type: Hash

Ignore this comma-separated list of tables.

Table names may be qualified with the database name.

--lock
short form: -k

Lock on master until done on slaves (implies ``--slavelag'').

This option can help you to get a consistent read on a master and many slaves. If you specify this option, mk-table-checksum will lock the table on the first server on the command line, which it assumes to be the master. It will keep this lock until the checksums complete on the other servers.

This option isn't very useful by itself, so you probably want to use ``--wait'' instead.

Note: if you're checksumming a slave against its master, you should use --replicate. In that case, there's no need for locking, waiting, or any of that.

--maxlag
type: time; default: 1s

Suspend checksumming if the slave given by ``--check-slave-lag'' lags.

This option causes mk-table-checksum to look at the slave every time it's about to checksum a chunk. If the slave's lag is greater than the option's value, or if the slave isn't running (so its lag is NULL), mk-table-checksum sleeps for ``--check-interval'' seconds and then looks at the lag again. It repeats until the slave is caught up, then proceeds to checksum the chunk.

This option is useful to let you checksum data as fast as the slaves can handle it, assuming the slave you directed mk-table-checksum to monitor is representative of all the slaves that may be replicating from this server. It should eliminate the need for ``--sleep'' or ``--sleep-coef''.

--modulo
short form: -M; type: int

Do only every Nth chunk on chunked tables.

This option lets you checksum only some chunks of the table. This is a useful alternative to ``--probability'' when you want to be sure you get full coverage in some specified number of runs; for example, you can do only every 7th chunk, and then use ``--offset'' to rotate the modulo every day of the week.

Just like with ``--probability'', a table that cannot be chunked is done every time.

--nouseindex
Do not add USE INDEX hint to SQL statements.

By default "mk-table-checksum" adds an USE INDEX hint to each SQL statement to coerce MySQL into using the index for the column by which a table will be chunked. This option causes "mk-table-checksum" to omit the USE INDEX hint.

--offset
short form: -O; type: string; default: 0

Modulo offset expression for use with ``--modulo''.

The argument may be an SQL expression, such as "WEEKDAY(NOW())" (which returns a number from 0 through 6). The argument is evaluated by MySQL. The result is used as follows: if chunk_num % ``--modulo'' == ``--offset'', the chunk will be checksummed.

--optxor
short form: -o; default: yes; negatable: yes

Optimize BIT_XOR with user variables.

This option specifies to use user variables to reduce the number of times each row must be passed through the cryptographic hash function when you are using the BIT_XOR algorithm.

With the optimization, the queries look like this in pseudo-code:

   SELECT CONCAT(
      BIT_XOR(SLICE_OF(@user_variable)),
      BIT_XOR(SLICE_OF(@user_variable)),
      ...
      BIT_XOR(SLICE_OF(@user_variable := HASH(col1, col2... colN))));
 
 

The exact positioning of user variables and calls to the hash function is determined dynamically, and will vary between MySQL versions. Without the optimization, it looks like this:

   SELECT CONCAT(
      BIT_XOR(SLICE_OF(MD5(col1, col2... colN))),
      BIT_XOR(SLICE_OF(MD5(col1, col2... colN))),
      ...
      BIT_XOR(SLICE_OF(MD5(col1, col2... colN))));
 
 

The difference is the number of times all the columns must be mashed together and fed through the hash function. If you are checksumming really large columns, such as BLOB or TEXT columns, this might make a big difference.

--password
short form: -p; type: string

Password to use when connecting.

--port
short form: -P; type: int

Port number to use for connection.

--probability
type: int; default: 100

Checksums will be run with this percent probability.

This is an integer between 1 and 100. If 100, every chunk of every table will certainly be checksummed. If less than that, there is a chance that some chunks of some tables will be skipped. This is useful for routine jobs designed to randomly sample bits of tables without checksumming the whole server. By default, if a table is not chunkable, it will be checksummed every time even when the probability is less than 100. You can override this with ``--singlechunk''.

See also ``--modulo''.

--quiet
short form: -q

Do not print checksum results.

--recheck
Re-checksum chunks that ``--replcheck'' found to be different.
--replcheck
type: int

Check results in ``--replicate'' table, to the specified depth.

Recursively finds differences recorded in the table given by ``--replicate''. Recurses to the depth you specify: 0 is no recursion (check only the server you specify), 1 is check the server and its slaves, 2 is check the slaves of its slaves, and so on.

It finds differences by running the query shown in ``CONSISTENT CHECKSUMS'', and prints results. Exits after printing. This is just a convenient way of running the query so you don't have to do it manually.

The output is one informational line per slave host, followed by the results of the query, if any. If ``--quiet'' is specified, there is no output.

This option makes "mk-table-checksum" looks for slaves by running "SHOW PROCESSLIST". If it finds connections that appear to be from slaves, it derives connection information for each slave the same default-and-override method described in ``SPECIFYING HOSTS''.

If "SHOW PROCESSLIST" doesn't return any rows, "mk-table-checksum" looks at "SHOW SLAVE HOSTS" instead. The host and port, and user and password if available, from "SHOW SLAVE HOSTS" are combined into a DSN and used as the argument. This requires slaves to be configured with "report-host", "report-port" and so on.

This requires the @@SERVER_ID system variable, so it works only on MySQL 3.23.26 or newer.

If any slave has chunks that differ from the master, mk-table-checksum's exit status is 1; otherwise it is 0.

--replicate
short form: -R; type: string

Replicate checksums to slaves (disallows -a CHECKSUM).

This option enables a completely different checksum strategy for a consistent, lock-free checksum across a master and its slaves. This works only with statement-based replication (mk-table-checksum will switch the binlog format to STATEMENT for the duration of the session if your server uses row-based replication). Instead of running the checksum queries on each server, you run it only on the master. You specify a table to insert the results into. The query will insert directly into the table, so it will be replicated through the binlog to the slaves.

The argument to the option is the table (fully qualified: db.table) in which the checksums should be stored. The table must have at least these columns: db, tbl, chunk, boundaries, this_crc, master_crc, this_cnt, master_cnt. The table may be named anything you wish. Here is a suggested table structure:

   CREATE TABLE checksum (
      db         char(64)     NOT NULL,
      tbl        char(64)     NOT NULL,
      chunk      int          NOT NULL,
      boundaries char(100)    NOT NULL,
      this_crc   char(40)     NOT NULL,
      this_cnt   int          NOT NULL,
      master_crc char(40)         NULL,
      master_cnt int              NULL,
      ts         timestamp    NOT NULL,
      PRIMARY KEY (db, tbl, chunk)
   );
 
 

Be sure to choose an appropriate storage engine for the checksum table. If you are checksumming InnoDB tables, for instance, a deadlock will break replication if the checksum table is non-transactional, because the transaction will still be written to the binlog. It will then replay without a deadlock on the slave and break replication with ``different error on master and slave.'' This is not a problem with mk-table-checksum, it's a problem with MySQL replication, and you can read more about it in the MySQL manual.

When the queries are finished replicating, you can run a simple query on each slave to see which tables have differences from the master. See ``CONSISTENT CHECKSUMS'' for details. If you find tables that have differences, you can use the chunk boundaries in a WHERE clause to mk-table-sync to help repair them more efficiently. See mk-table-sync for details.

This option eliminates the need to do complicated locking and unlocking, waiting for master binlog positions, and so on. It disables ``--lock'', ``--wait'', and ``--slavelag''.

The checksum queries actually do a REPLACE into this table, so existing rows need not be removed before running. However, you may wish to do this anyway to remove rows related to tables that don't exist anymore. The ``--emptyrepltbl'' option does this for you.

Since mk-table-checksum uses USE to select the table's database as its default database before executing the checksum query, the checksum queries should replicate to slaves even if --binlog-do-db settings on the master filter out the checksum table's database. For more information on how --binlog-do-db works, see <http://dev.mysql.com/doc/en/binary-log.html>.

If the slaves have any --replicate-do-X or replicate-ignore-X options, you should be careful not to checksum any databases or tables that exist on the master and not the slaves. Changes to such tables may not normally be executed on the slaves because of the --replicate-X options, but the checksum queries change the checksum table, not the tables they checksum. Therefore these queries will be executed on the slave, and if the table or database does not exist, they will cause replication to fail. For more information on replication rules, see <http://dev.mysql.com/doc/en/replication-rules.html>.

The table specified by ``--replicate'' will never be checksummed itself.

--resume
type: string

Resume checksum using given output file from a previously interrupted run.

The given output file should be the literal output from a previous run of "mk-table-checksum". For example:

    mk-table-checksum host1 host2 -C 100 > checksum_results.txt
    mk-table-checksum host1 host2 -C 100 --resume checksum_results.txt
 
 

The command line options given to the first run and the resumed run must be identical (except, of course, for --resume). If they are not, the result will be unpredictible and probably wrong.

--resume does not work with --replicate; for that, use --resume-replicate.

--resume-replicate
Resume --replicate.

This option resumes a previous checksum operation using --replicate. It is like --resume but does not require an output file. Instead, it uses the checksum table given to --replicate to determine where to resume the checksum operation.

--savesince
When ``--argtable'' and ``--since'' are given, save the current ``--since'' value into that table's "since" column after checksumming. In this way you can incrementally checksum tables by starting where the last one finished.

The value to be saved could be the current timestamp, or it could be the maximum existing value of the column given by ``--sincecol''. It depends on what options are in effect. See the description of ``--since'' to see how timestamps are different from ordinary values.

--schema
Checksum "SHOW CREATE TABLE" intead of table data.

This option disables most other options since it does not require access to table data. Only the following options are allowed with --schema: ``--askpass'', ``--checksum'', ``--databases'', ``--defaults-file'', ``--engine'', ``--explain'', ``--explainhosts'', ``--ignoredb'', ``--ignoreengine'', ``--ignoretbl'', ``--password'', ``--port'', ``--quiet'', ``--setvars'', ``--socket'', ``--tab'', ``--tables'', ``--user''.

--separator
short form: -s; type: string; default: #

The separator character used for CONCAT_WS().

This character is used to join the values of columns when checksumming with ``--algorithm'' of BIT_XOR or ACCUM.

--setvars
type: string; default: wait_timeout=10000

Set these MySQL variables.

Specify any variables you want to be set immediately after connecting to MySQL. These will be included in a "SET" command.

--since
type: string

Checksum only data newer than this value.

If the table is chunk-able or nibble-able, this value will apply to the first column of the chunked or nibbled index.

This is not too different to ``--where'', but instead of universally applying a WHERE clause to every table, it selectively finds the right column to use and applies it only if such a column is found. See also ``--sincecolumn''.

The argument may be an expression, which is evaluated by MySQL. For example, you can specify "CURRENT_DATE - INTERVAL 7 DAY" to get the date of one week ago.

A special bit of extra magic: if the value is temporal (looks like a date or datetime), then the table is checksummed only if the create time (or last modified time, for tables that report the last modified time, such as MyISAM tables) is newer than the value. In this sense it's not applied as a WHERE clause at all.

--sincecolumn
type: string

The column name to be used for ``--since''.

The default is for the tool to choose the best one automatically. If you specify a value, that will be used if possible; otherwise the best auto-determined one; otherwise none. If the column doesn't exist in the table, it is just ignored.

--singlechunk
Permit skipping with ``--probability'' if there is only one chunk.

Normally, if a table isn't split into many chunks, it will always be checksummed regardless of ``--probability''. This setting lets the probabilistic behavior apply to tables that aren't divided into chunks.

--slavelag
short form: -l

Report how far slaves lag master.

If this option is enabled, the output will show how many seconds behind the master each slave is. This can be useful when you want a fast, parallel, non-blocking checksum, and you know your slaves might lag the master. You can inspect the results and make an educated guess whether any discrepancies on the slave are due to slave lag instead of corrupt data.

If you're using ``--replicate'', slave lag is a non-issue and you don't need to worry about this.

--sleep
type: int

Sleep time between checksums.

If this option is specified, mk-table-checksum will sleep the specified number of seconds between checksums. That is, it will sleep between every table, and if you specify ``--chunksize'', it will also sleep between chunks.

This is a very crude way to throttle checksumming; see ``--sleep-coef'' and ``--check-slave-lag''.

--sleep-coef
type: float

Calculate ``--sleep'' as a multiple of the last checksum time.

If this option is specified, mk-table-checksum will sleep the amount of time elapsed during the previous checksum, multiplied by the specified coefficient. This option is ignored if ``--sleep'' is specified.

This is a slightly more sophisticated way to throttle checksum speed: sleep a varying amount of time between chunks, depending on how long the chunks are taking. Even better is to use ``--check-slave-lag'' if you're checksumming master/slave replication.

--socket
short form: -S; type: string

Socket file to use for connection.

--tab
short form: -b

Print tab-separated output, not column-aligned output.

--tables
short form: -t; type: hash

Do only this comma-separated list of tables.

Table names may be qualified with the database name.

--trim
Trim "VARCHAR" columns (helps when comparing 4.1 to >= 5.0).

This option adds a "TRIM()" to "VARCHAR" columns in "BIT_XOR" and "ACCUM" modes.

This is useful when you don't care about the trailing space differences between MySQL versions which vary in their handling of trailing spaces. MySQL 5.0 and later all retain trailing spaces in "VARCHAR", while previous versions would remove them.

--user
short form: -u; type: string

User for login if not current user.

--verify
short form: -v; negatable: yes; default: yes

Verify checksum compatibility across servers.

This option runs a trivial checksum on all servers to ensure they have compatible CONCAT_WS() and cryptographic hash functions.

Versions of MySQL before 4.0.14 will skip empty strings and NULLs in CONCAT_WS, and others will only skip NULLs. The two kinds of behavior will produce different results if you have any columns containing the empty string in your table. If you know you don't (for instance, all columns are integers), you can safely disable this check and you will get a reliable checksum even on servers with different behavior.

--wait
short form: -w; type: time

Wait this long for slaves to catch up to their master (implies ``--lock'' ``--slavelag'').

Note: the best way to verify that a slave is in sync with its master is to use ``--replicate'' instead. The ``--wait'' option is really only useful if you're trying to compare masters and slaves without using ``--replicate'', which is possible but complex and less efficient in some ways.

This option helps you get a consistent checksum across a master server and its slaves. It combines locking and waiting to accomplish this. First it locks the table on the master (the first server on the command line). Then it finds the master's binlog position. Checksums on slaves will be deferred until they reach the same binlog position.

The argument to the option is the number of seconds to wait for the slaves to catch up to the master. It is actually the argument to MASTER_POS_WAIT(). If the slaves don't catch up to the master within this time, they will unblock and go ahead with the checksum. You can tell whether this happened by examining the STAT column in the output, which is the return value of MASTER_POS_WAIT().

--where
short form: -W; type: string

Do only rows matching this "WHERE" clause (disallows ``--algorithm'' CHECKSUM).

You can use this option to limit the checksum to only part of the table. This is particularly useful if you have append-only tables and don't want to constantly re-check all rows; you could run a daily job to just check yesterday's rows, for instance.

This option is much like the -w option to mysqldump. Do not specify the WHERE keyword. You may need to quote the value. Here is an example:

   mk-table-checksum --where "foo=bar"
 
 

CONSISTENT CHECKSUMS

If you are using this tool to verify your slaves still have the same data as the master, which is why I wrote it, you should read this section.

The best way to do this with replication is to use the ``--replicate'' option. When the queries are finished running on the master and its slaves, you can go to the slaves and issue SQL queries to see if any tables are different from the master. Try the following:

   SELECT db, tbl, chunk, this_cnt-master_cnt AS cnt_diff,
      this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc)
         AS crc_diff
   FROM checksum
   WHERE master_cnt <> this_cnt OR master_crc <> this_crc
      OR ISNULL(master_crc) <> ISNULL(this_crc);
 
 

The ``--replcheck'' option can do this query for you. If you can't use this method, try the following:

If your servers are not being written to, you can just run the tool with no further ado:
   mk-table-checksum server1 server2 ... serverN
 
 
If the servers are being written to, you need some way to make sure they are consistent at the moment you run the checksums. For situations other than master-slave replication, you will have to figure this out yourself. You may be able to use the ``--where'' option with a date or time column to only checksum data that's not recent.
If you are checksumming a master and slaves, you can do a fast parallel checksum and assume the slaves are caught up to the master. In practice, this tends to work well except for tables which are constantly updated. You can use the ``--slavelag'' option to see how far behind each slave was when it checksummed a given table. This can help you decide whether to investigate further.
The next most disruptive technique is to lock the table on the master, then take checksums. This should prevent changes from propagating to the slaves. You can just lock on the master (with ``--lock''), or you can both lock on the master and wait on the slaves till they reach that point in the master's binlog (``--wait''). Which is better depends on your workload; only you know that.
If you decide to make the checksums on the slaves wait until they're guaranteed to be caught up to the master, the algorithm looks like this:
  For each table,
    Master: lock table
    Master: get pos
    In parallel,
      Master: checksum
      Slave(s): wait for pos, then checksum
    End
    Master: unlock table
  End
 
 

What I typically do when I'm not using the ``--replicate'' option is simply run the tool on all servers with no further options. This runs fast, parallel, non-blocking checksums simultaneously. If there are tables that look different, I re-run with ``--wait''=600 on the tables in question. This makes the tool lock on the master as explained above.

OUTPUT

Output is to STDOUT, one line per server and table, with header lines for each database. I tried to make the output easy to process with awk. For this reason columns are always present. If there's no value, mk-table-checksum prints 'NULL'.

The default is column-aligned output for human readability, but you can change it to tab-separated if you want. Use the ``--tab'' option for this.

Output is unsorted, though all lines for one table should be output together. For speed, all checksums are done in parallel (as much as possible) and may complete out of the order in which they were started. You might want to run them through another script or command-line utility to make sure they are in the order you want. If you pipe the output through mk-checksum-filter, you can sort the output and/or avoid seeing output about tables that have no differences.

The columns in the output are as follows. The database, table, and chunk come first so you can sort by them easily (they are the ``primary key'').

Output from ``--replcheck'' and ``--checksum'' are different.

DATABASE
The database the table is in.
TABLE
The table name.
CHUNK
The chunk (see ``--chunksize''). Zero if you are not doing chunked checksums.
HOST
The server's hostname.
ENGINE
The table's storage engine.
COUNT
The table's row count, unless you specified to skip it.
CHECKSUM
The table's checksum, unless you specifed to skip it or the table has no rows. some types of checksums will be 0 if there are no rows; others will print NULL.
TIME
The time the actual checksum and/or counting took.
WAIT
How long the checksum blocked before beginning.
STAT
The return value of MASTER_POS_WAIT().
LAG
How far the slave lags the master, as reported by SHOW SLAVE STATUS.

EXIT STATUS

A successful exit status is 0. If there is an error checksumming any table, the exit status is 1.

When running ``--replcheck'', if any slave has chunks that differ from the master, the exit status is 1.

QUERIES

If you are using innotop (see <http://innotop.sourceforge.net/>), mytop, or another tool to watch currently running MySQL queries, you may see the checksum queries. They look similar to this:
   REPLACE /*test.test_tbl:'2'/'5'*/ INTO test.checksum(db, ...
 
 

Since mk-table-checksum's queries run for a long time and tend to be textually very long, and thus won't fit on one screen of these monitoring tools, I've been careful to place a comment at the beginning of the query so you can see what it is and what it's doing. The comment contains the name of the table that's being checksummed, the chunk it is currently checksumming, and how many chunks will be checksummed. In the case above, it is checksumming chunk 2 of 5 in table test.test_tbl.

SEE ALSO

See also mk-checksum-filter and mk-table-sync.

ENVIRONMENT

The environment variable "MKDEBUG" enables verbose debugging output in all of the Maatkit tools:
    MKDEBUG=1 mk-....
 
 

BUGS

Please use Google Code Issues and Groups to report bugs or request support: <http://code.google.com/p/maatkit/>.

Please include the complete command-line used to reproduce the problem you are seeing, the version of all MySQL servers involved, the complete output of the tool when run with ``--version'', and if possible, debugging output produced by running with the "MKDEBUG=1" environment variable.

SYSTEM REQUIREMENTS

You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl.

AUTHOR

Baron ``Xaprb'' Schwartz.

ACKNOWLEDGEMENTS

This is an incomplete list. My apologies for omissions or misspellings.

Claus Jeppesen, Francois Saint-Jacques, Giuseppe Maxia, Heikki Tuuri, James Briggs, Martin Friebe, Sergey Zhuravlev,

COPYRIGHT, LICENSE AND WARRANTY

This program is copyright 2007-2008 Baron Schwartz. Feedback and improvements are welcome.

THIS PROGRAM IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic' to read these licenses.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

VERSION

This manual page documents Ver 1.2.2 Distrib 2725 $Revision: 2608 $.