Rechercher une page de manuel
mk-slave-prefetch
Langue: en
Version: 2008-12-29 (fedora - 04/07/09)
Section: 1 (Commandes utilisateur)
Sommaire
NAME
mk-slave-prefetch - Pipeline relay logs on a MySQL slave to pre-warm caches.SYNOPSIS
mk-slave-prefetch mk-slave-prefetch --statistics > /path/to/saved/statistics mk-slave-prefetch /path/to/saved/statistics
DESCRIPTION
mk-slave-prefetch reads the slave's relay log slightly ahead of where the slave's SQL thread is reading, converts statements into "SELECT", and executes them. In theory, this should help alleviate the effects of the slave's single-threaded SQL execution. It will help take advantage of multiple CPUs and disks by pre-reading the data from disk, so the data is already in the cache when the slave SQL thread executes the un-modified version of the statement.Statements that can't be converted into "SELECT" are ignored. However, there is always a chance of bugs. It would be a very good idea to connect as a read-only user. Here is an example of how to grant the necessary privileges:
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'prefetch'@'%' IDENTIFIED BY 'sp33dmeup!';
"mk-slave-prefetch" learns how long it takes statements to execute, and doesn't try to execute those that take a very long time. You can ask it to print what it has learned after it executes. You can also specify a filename on the command line. The file should contain the statistics printed by a previous run. These will be used to pre-populate the statistics so it doesn't have to re-learn.
This program is based on concepts I heard Paul Tuckfield explain at the November 2006 MySQL Camp un-conference. However, the code is my own work. I have not seen any other implementation of Paul's idea.
DOES IT WORK?
Does it work? Does it actually speed up the slave?That depends on your workload, hardware, and other factors. It might work when the following are true:
- •
- The slave's data is much larger than memory, and the workload is mostly randomly scattered small (single-row is ideal) changes.
- •
- There are lots of high-concurrency "UPDATE" and "DELETE" statements on the master.
- •
- The slave SQL thread is I/O-bound, but the slave overall has plenty of spare I/O capacity (definitely more than one disk spindle).
- •
- The slave uses InnoDB or another storage engine with row-level locking.
It does not speed up replication on my slaves, which mostly have large queries like "INSERT .. SELECT .. GROUP BY". In my benchmarks it seemed to make no difference at all, positive or negative.
On the wrong workload or slave configuration, this technique might actually make the slaves slower. Your mileage will vary.
User-contributed benchmarks are welcome.
DOWNLOADING
You can download Maatkit from Google Code at <http://code.google.com/p/maatkit/>, or you can get any of the tools easily with a command like the following:wget http://www.maatkit.org/get/toolname or wget http://www.maatkit.org/trunk/toolname
Where "toolname" can be replaced with the name (or fragment of a name) of any of the Maatkit tools. Once downloaded, they're ready to run; no installation is needed. The first URL gets the latest released version of the tool, and the second gets the latest trunk code from Subversion.
OPTIONS
--print and --daemonize are mutually exclusive.Specify at least one of --print or --execute.
- --askpass
- Prompt for a password when connecting to MySQL.
- --charset
- short form: -A; type: string
Default character set.
Enables character set settings in Perl and MySQL. If the value is "utf8", sets Perl's binmode on STDOUT to utf8, passes the "mysql_enable_utf8" option to DBD::mysql, and runs "SET NAMES UTF8" after connecting to MySQL. Any other value sets binmode on STDOUT without the utf8 layer, and runs "SET NAMES" after connecting to MySQL.
- --checkint
- short form: -i; type: Array; default: 16,1,1024
How often to check the slave: init,min,max.
How many relay log events should pass before checking the output of "SHOW SLAVE STATUS". The syntax is a three-number range: initial, minimum, and maximum. You should be able to leave this at the defaults.
"mk-slave-prefetch" varies the check interval in powers of two, depending on whether it decides the check was necessary.
- --daemonize
- Fork and run in the background; POSIX OSes only.
- --database
- short form: -D; type: string
The database to use for the connection.
Connect to this database. "mk-slave-prefetch" will issue "USE" statements as required by the binary log events.
- --defaults-file
- short form: -F; type: string
Only read mysql options from the given file. You must give an absolute pathname.
- --errors
- culumative: yes; default: 0; type: int
Print queries that caused errors. If specified once, at exit; if twice, in realtime.
If you specify this option once, you will see a report at the end of the script execution, showing the normalized queries and the number of times they were seen. If you specify this option twice, you will see the errors printed out as they occur, but no normalized report at the end of execution.
- --execute
- short form: -x; negatable: yes; default: yes
Execute the transformed queries to warm the caches.
- --host
- short form: -h; type: string
Host to connect to.
- --iolag
- short form: -l; type: size; default: 1k
How many bytes to lag the slave I/O thread.
This helps avoid "mysqlbinlog" reading right off the end of the relay log file.
- --maxquerytime
- short form: -q; type: float; default: 1
Do not run queries longer than this many seconds; fractions allowed.
If "mk-slave-prefetch" predicts the query will take longer to execute, it will skip the query. This is based on the theory that pre-warming the cache is most beneficial for short queries.
"mk-slave-prefetch" learns how long queries require to execute. It keeps an average over the last ``--querysampsize'' samples of each query. The averages are based on an abstracted version of the query, with specific parameters replaced by placeholders. The result is a sort of ``fingerprint'' for the query, not executable SQL. You can see the learned statistics with the ``--statistics'' option.
You can pre-load query fingerprints, and average execution times, from a file. This way you don't have to wait for "mk-slave-prefetch" to learn all over every time you start it. Just specify the file on the command line. The format should be the same as the output from ``--statistics''.
You might also want to filter out some statements completely, or let only some statements through. See the ``--rejectregexp'' and ``--permitregexp'' options.
If "mk-slave-prefetch" hasn't seen a query's fingerprint before, and thus doesn't know how long it will take to execute, it wraps it in a subuery, like this:
SELECT 1 FROM ( <query> ) AS X LIMIT 1;
This helps avoid fetching a lot of data back to the client when a query is very large. It requires a version of MySQL that supports subqueries (version 4.1 and newer). If yours doesn't, the subquery trick can't be used, so the query might fetch a lot of data back to the client.
Once a query's fingerprint has been seen, so it's known that the query isn't enormously slow, "mk-slave-prefetch" just rewrites the "SELECT" list for efficiency. (Avoiding the subquery reduces the query's overhead for short queries). The rewritten query will then look like the following;
SELECT ISNULL(COALESCE(<columns>)) FROM ...
- --numprefix
- Abstract away numeric table name prefixes.
This causes the following two queries to ``fingeprint'' to the same thing:
select from 1_2_users; select from 2_3_users;
- --offset
- short form: -o; type: size; default: 128
How many bytes "mk-slave-prefetch" will try to stay in front of the slave SQL thread.
It will not execute log events it doesn't think are at least this far ahead of the SQL thread. See also ``--window''.
- --password
- short form: -p; type: string
The password to use when connecting.
- --permitregexp
- type: string
Permit queries matching this Perl regexp.
This is a filter for log events. The regular expression is matched against the raw log event, before any transformations are applied. If specified, this option will permit only log events matching the regular expression.
- --pid
- type: string
Create the given PID file when daemonized.
For example, '--daemonize --pid /tmp/mk-slave-prefetch.pid' would cause mk-slave-prefetch to create the PID file /tmp/mk-slave-prefetch.pid.
/var/run/ is usually not writable by non-root users, therefore /tmp/ is a more reliable alternative.
The PID file is removed when the daemonized instance of mk-slave-prefetch exits.
- --port
- short form: -P; type: int
Port number to use for connection.
- Print the transformed relay log events to standard output.
- --printnonrewritten
- Print queries that could not be transformed into "SELECT".
- --progress
- type: int
Print progress information every X events.
The information is the current log file and position, plus a summary of the statistics gathered.
- --querysampsize
- type: int; default: 4
Average query exec time over this many queries.
The last "N" queries with a given fingerprint are averaged together to get the average query execution time (see ``--maxquerytime'').
- --rejectregexp
- type: string
Reject queries matching this Perl regexp.
Similar to ``--permitregexp'', but has the opposite effect: log events must not match the regular expression.
- --sentinel
- type: string; default: /tmp/mk-slave-prefetch-sentinel
Exit if this file exists.
- --setvars
- type: string; default: wait_timeout=10000
Set these MySQL variables.
Specify any variables you want to be set immediately after connecting to MySQL. These will be included in a "SET" command.
- --socket
- short form: -S; type: string
Socket file to use for connection.
- --statistics
- Print execution statistics after exiting.
The statistics are in two sections: counters, and queries. The counters simply count the number of times events occur. You may see the following counters:
NAME MEANING ====================== ======================================= mysqlbinlog Executed mysqlbinlog to read log events. events The total number of relay log events. not_far_enough_ahead An event was not at least --offset bytes ahead of the SQL thread. too_far_ahead An event was more than --offset + --window bytes ahead of the SQL thread. too_close_to_io_thread An event was less than --iolag bytes away from the I/O thread's position. event_not_allowed An event wasn't a SET, USE, INSERT, UPDATE, DELETE or REPLACE query. event_filtered_out An event was filtered out because of --permitregexp or --rejectregexp. same_timestamp A SET TIMESTAMP event was ignored because it had the same timestamp as the last one. do_query A transformed event was executed or printed. query_error An executed query had an error. query_too_long An event was not executed because its average query length exceeded --maxquerytime. query_not_rewritten A query could not be rewritten to a SELECT. master_pos_wait The tool waited for the SQL thread to catch up. show_slave_status The tool queried SHOW SLAVE STATUS. load_data_infile The tool found a LOAD DATA INFILE query and unlinked (deleted) the temp file. could_not_unlink The tool failed to unlink a temp file. sleep The tool slept for a second because the slave's SQL thread was not running, or because it read past the end of the log.
After the counters, "mk-slave-prefetch" prints information about each query fingerprint it has seen, two lines per fingerprint. The first line contains the query's fingerprint. The second line contains the number of times the fingerprint was seen, number of times executed, the sum of the execution times, and the average execution time over the last ``--querysampsize'' samples.
- --stop
- Stop running instances by creating the ``--sentinel'' file.
- --time
- short form: -t; type: time
How long "mk-slave-prefetch" should run before exiting.
The default is to run forever.
- --tmpdir
- type: string; default: /dev/null
Where to create temp files for "LOAD DATA INFILE" queries.
The default will cause "mysqlbinlog" to skip the file and the associated "LOAD DATA INFILE" command entirely.
If "mk-slave-prefetch" sees a "LOAD DATA INFILE" command (which it won't, if this is left at the default), it will try to remove the temporary file, then skip the event.
- --user
- short form: -u; type: string
User for login if not current user.
- --window
- short form: -w; type: size; default: 4k
The max bytes ahead of the slave "mk-slave-prefetch" should get.
Defines the window within which "mk-slave-prefetch" considers a query OK to execute. The window begins at the slave SQL thread's last known position plus ``--offset'' bytes, and extends for the specified number of bytes.
If "mk-slave-prefetch" sees a log event that is too far in the future, it will increment the "too_far_ahead" counter and wait for the slave SQL thread to catch up (which increments the "master_pos_wait" counter). If an event isn't far enough ahead of the SQL thread, it will be discarded and the "not_far_enough_ahead" counter increments.
Watching the mentioned statistics can help you understand how to tune the window. You want "mk-slave-prefetch" to run just ahead of the SQL thread, not throwing out a lot of events for being too far ahead or not far enough ahead.
SYSTEM REQUIREMENTS
You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of Perl.ENVIRONMENT
The environment variable "MKDEBUG" enables verbose debugging output in all of the Maatkit tools:MKDEBUG=1 mk-....
BUGS
Please use Google Code Issues and Groups to report bugs or request support: <http://code.google.com/p/maatkit/>.Please include the complete command-line used to reproduce the problem you are seeing, the version of all MySQL servers involved, the complete output of the tool when run with ``--version'', and if possible, debugging output produced by running with the "MKDEBUG=1" environment variable.
COPYRIGHT, LICENSE AND WARRANTY
This program is copyright 2007-2008 Baron Schwartz. Feedback and improvements are welcome.THIS PROGRAM IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic' to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
AUTHOR
Baron Schwartz.VERSION
This manual page documents Ver 1.0.5 Distrib 2725 $Revision: 2311 $.Contenus ©2006-2024 Benjamin Poulain
Design ©2006-2024 Maxime Vantorre