bmc-watchdog

Language: en

Version: 2009-02-26 (CentOS - 06/07/09)

Section: 8 (Administrator commands)

NAME

bmc-watchdog - BMC watchdog timer daemon and control utility

SYNOPSIS

bmc-watchdog command [OPTIONS] [COMMAND_OPTIONS]...

DESCRIPTION

bmc-watchdog controls a Baseboard Management Controller (BMC) watchdog timer. The bmc-watchdog tool typically runs as a cron job or daemon to manage the watchdog timer. bmc-watchdog must be run as root.

BMC WATCHDOG DETAILS

A BMC watchdog timer is part of the Intelligent Platform Management Interface (IPMI) specification and is only available on BMCs that are compliant with IPMI. When a BMC watchdog timer is started, it begins counting down to zero from some positive number of seconds. When the timer reaches zero, the BMC executes the pre-configured timeout action. If a pre-timeout interrupt is configured, it is raised the pre-timeout interval before the timer reaches zero.

To stop the pre-timeout interrupt or timeout action from being executed, the watchdog timer must be periodically reset to its initial value.

The BMC watchdog timer automatically stops itself when the machine is rebooted. Therefore, when a machine is brought up, the BMC watchdog timer must be set up again before it can be used.

Typically, a BMC watchdog timer is used to automatically reset a machine that has crashed. When the operating system first starts up, the BMC timer is set to its initial countdown value. At periodic intervals, while the operating system is functioning properly, the watchdog timer is reset by the OS or a userspace program. Thus, the timer never counts down to zero. When the system crashes, the timer can no longer be reset by the OS or userspace program. Eventually, the timer counts down to zero and the BMC resets the machine.
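
The timing described above can be sketched with a small arithmetic model. It is an illustration only and does not talk to a BMC. The function name timeout_time and its three arguments are assumptions of this sketch, which also assumes that the timer is started at time 0 and is reset at every multiple of the reset period until the crash.

```shell
#!/bin/sh
# Pure arithmetic model of a watchdog countdown; no IPMI access is involved.
# usage: timeout_time INITIAL_COUNTDOWN RESET_PERIOD CRASH_TIME   (all in seconds)
timeout_time() {
    initial=$1 period=$2 crash=$3
    # The last successful reset happens at the largest multiple of the
    # reset period that is not later than the crash.
    last_reset=$(( (crash / period) * period ))
    # From that reset on, the timer runs for a full initial countdown.
    echo $(( last_reset + initial ))
}

timeout_time 900 60 130
```

With a 900 second countdown that is reset every 60 seconds, a crash 130 seconds after start leaves the last reset at 120 seconds, so the timeout action fires at 1020 seconds.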

See EXAMPLES below for examples of how bmc-watchdog is commonly used.

COMMANDS

The following commands are available to bmc-watchdog.
-s, --set
Set BMC Watchdog Configuration. BMC watchdog timer configuration values can be set using the set command options listed below under SET OPTIONS. If a particular configuration parameter is not specified on the command line, the current configuration of that parameter will not be changed.
-g, --get
Get BMC Watchdog Configuration and State. The current configuration and state is printed to standard output.
-r, --reset
Reset BMC Watchdog Timer.
-t, --start
Start BMC Watchdog Timer. Does nothing if the timer is currently running. When the timer is stopped, this command is identical to --reset, except that it also accepts the start command options listed below under START OPTIONS.
-y, --stop
Stop BMC Watchdog Timer. Stops the current timer.
-c, --clear
Clear BMC Watchdog Configuration. Clears all configuration values for the watchdog timer, except for timer use, which is kept at its current value.
-d, --daemon
Run bmc-watchdog as a daemon. Configurable BMC watchdog timer options are listed below under DAEMON OPTIONS. The configuration values are set once, then the daemon resets the timer at the specified periodic interval. Every time the BMC watchdog timer is reset, a log entry is generated in the bmc-watchdog log. The default log is stored at /var/log/freeipmi/bmc-watchdog.log. The daemon can be stopped using the --stop command, the --clear command, or by setting the stop_timer flag on the --set command.
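
As a rough illustration of how the commands above fit together, the following sketch walks through a manual set, start, get, reset, and stop sequence. The run wrapper and the DRY_RUN variable are conventions of this sketch, not bmc-watchdog features. By default the commands are only printed; running them for real requires root and an IPMI-compliant BMC. The option values are borrowed from the daemon example under EXAMPLES.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

lifecycle() {
    # Configure: timer use SMS/OS, hard reset on timeout, 900 second countdown.
    run bmc-watchdog --set --timer-use=4 --timeout-action=1 --initial-countdown=900
    run bmc-watchdog --start   # begin counting down
    run bmc-watchdog --get     # print current configuration and state
    run bmc-watchdog --reset   # return the countdown to its initial value
    run bmc-watchdog --stop    # stop the timer so that no timeout action fires
}

lifecycle
```

Setting DRY_RUN=0 in the environment makes the same script execute the commands instead of printing them.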

OPTIONS

The following options are generic and can be used by any command.
-?, --help
Output the help menu. If a specific command (--set, --get, --reset, --start, --stop, --clear, or --daemon) is listed on the command line, only the specific options for that command will be listed.
-V, --version
Output the program version and exit.
-D, --driver-type=IPMIDRIVER
Specify the driver type to use instead of doing an auto selection. The currently available in-band drivers are KCS, SSIF, and OPENIPMI.
--no-probing
Do not probe IPMI devices for default settings.
--driver-address=DRIVER-ADDRESS
Specify the in-band driver address to be used instead of the probed value.
--driver-device=DEVICE
Specify the in-band driver device path to be used instead of the probed path.
--register-spacing=REGISTER-SPACING
Specify the in-band driver register spacing instead of the probed value.
-f FILE, --logfile=FILE
Specify an alternate logfile from the default of /var/log/freeipmi/bmc-watchdog.log.
-n, --no-logging
Turns off all logging done by bmc-watchdog.
--debug
Turns on debugging. All data written to and read from the BMC is dumped to stderr.

SET OPTIONS

The following options can be used by the set command to set or clear various BMC watchdog configuration parameters.
-u INT, --timer-use=INT
Set timer use. The timer use value can be set to one of the following: 1 = BIOS FRB2, 2 = BIOS POST, 3 = OS_LOAD, 4 = SMS OS, 5 = OEM.
-m INT, --stop-timer=INT
Set Stop Timer Flag. A flag value of 0 stops the current BMC watchdog timer. A value of 1 leaves the current watchdog timer running.
-l INT, --log=INT
Set Log Flag. A flag value of 0 turns logging on. A value of 1 turns logging off.
-a INT, --timeout-action=INT
Set timeout action. The timeout action can be set to one of the following: 0 = No action, 1 = Hard Reset, 2 = Power Down, 3 = Power Cycle.
-p INT, --pre-timeout-interrupt=INT
Set pre-timeout interrupt. The pre-timeout interrupt can be set to one of the following: 0 = None, 1 = SMI, 2 = NMI, 3 = Messaging Interrupt.
-z SECS, --pre-timeout-interval=SECONDS
Set pre-timeout interval in seconds.
-F, --clear-bios-frb2
Clear BIOS FRB2 Timer Use Flag.
-P, --clear-bios-post
Clear BIOS POST Timer Use Flag.
-L, --clear-os-load
Clear OS Load Timer Use Flag.
-S, --clear-sms-os
Clear SMS/OS Timer Use Flag.
-O, --clear-oem
Clear OEM Timer Use Flag.
-i SECS, --initial-countdown=SECONDS
Set initial countdown in seconds.
-w, --start-after-set
Start timer after set command if timer is stopped. This is typically used when bmc-watchdog is run as a cron job, to automatically start the timer after it has been set the first time.
-x, --reset-after-set
Reset timer after set command if timer is running.
-j, --start-if-stopped
Don't execute set command if timer is stopped, just start timer.
-k, --reset-if-running
Don't execute set command if timer is running, just reset timer. This is typically used when bmc-watchdog is run as a cron job, to reset the timer after it has been initially started.
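
The cron job usage mentioned for --start-after-set and --reset-if-running can be sketched as a single command that is safe to run repeatedly. On the first run the timer is stopped, so the configuration is set and the timer is started. On later runs the timer is running, so it is only reset. The function name watchdog_tick, the run wrapper, and the DRY_RUN variable are assumptions of this sketch, and the cron schedule itself is left to the administrator.

```shell
#!/bin/sh
# Meant to be invoked periodically (for example from cron), at an interval
# shorter than the initial countdown.
# DRY_RUN=1 (the default here) prints the command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

watchdog_tick() {
    run bmc-watchdog --set --timer-use=4 --timeout-action=1 \
        --initial-countdown=900 --start-after-set --reset-if-running
}

watchdog_tick
```

Combining the two options keeps the cron entry to a single line and avoids a separate check of whether the timer is already running.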

START OPTIONS

The following options can be used by the start command.
-G INT, --gratuitous-arp=INT
Suspend or don't suspend gratuitous ARPs while the BMC timer is running. A flag value of 1 suspends gratuitous ARPs. A value of 0 will not suspend gratuitous ARPs. If this option is not specified, gratuitous ARPs will not be suspended.
-A INT, --arp-response=INT
Suspend or don't suspend BMC-generated ARP responses while the BMC timer is running. A flag value of 1 suspends ARP responses. A value of 0 will not suspend ARP responses. If this option is not specified, ARP responses will not be suspended.

DAEMON OPTIONS

The following options can be used by the daemon command to set the initial BMC watchdog configuration parameters.
-u INT, --timer-use=INT
Set timer use. The timer use value can be set to one of the following: 1 = BIOS FRB2, 2 = BIOS POST, 3 = OS_LOAD, 4 = SMS OS, 5 = OEM.
-l INT, --log=INT
Set Log Flag. A flag value of 0 turns logging on. A value of 1 turns logging off.
-a INT, --timeout-action=INT
Set timeout action. The timeout action can be set to one of the following: 0 = No action, 1 = Hard Reset, 2 = Power Down, 3 = Power Cycle.
-p INT, --pre-timeout-interrupt=INT
Set pre-timeout interrupt. The pre-timeout interrupt can be set to one of the following: 0 = None, 1 = SMI, 2 = NMI, 3 = Messaging Interrupt.
-z SECS, --pre-timeout-interval=SECONDS
Set pre-timeout interval in seconds.
-F, --clear-bios-frb2
Clear BIOS FRB2 Timer Use Flag.
-P, --clear-bios-post
Clear BIOS POST Timer Use Flag.
-L, --clear-os-load
Clear OS Load Timer Use Flag.
-S, --clear-sms-os
Clear SMS/OS Timer Use Flag.
-O, --clear-oem
Clear OEM Timer Use Flag.
-i SECS, --initial-countdown=SECONDS
Set initial countdown in seconds.
-G INT, --gratuitous-arp=INT
Suspend or don't suspend gratuitous ARPs while the BMC timer is running. A flag value of 1 suspends gratuitous ARPs. A value of 0 will not suspend gratuitous ARPs. If this option is not specified, gratuitous ARPs will not be suspended.
-A INT, --arp-response=INT
Suspend or don't suspend BMC-generated ARP responses while the BMC timer is running. A flag value of 1 suspends ARP responses. A value of 0 will not suspend ARP responses. If this option is not specified, ARP responses will not be suspended.
-e SECS, --reset-period=SECONDS
Time interval to wait before resetting timer. The default is 60 seconds.
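
The reset period only protects the machine if it is shorter than the initial countdown. Otherwise the timer would expire before the daemon's first reset. The sketch below builds a daemon command line after checking that relationship. The function name daemon_cmd and the check itself are assumptions of this sketch, not behavior of bmc-watchdog, and the sketch assumes that --reset-period takes a value in seconds.

```shell
#!/bin/sh
# usage: daemon_cmd INITIAL_COUNTDOWN RESET_PERIOD   (both in seconds)
# Prints a daemon command line, or fails if the reset period is too long.
daemon_cmd() {
    initial=$1 period=$2
    if [ "$period" -ge "$initial" ]; then
        echo "reset period ($period s) must be shorter than the initial countdown ($initial s)" >&2
        return 1
    fi
    echo "bmc-watchdog --daemon --timer-use=4 --timeout-action=1 --initial-countdown=$initial --reset-period=$period"
}

daemon_cmd 900 60
```

With a 900 second countdown and a 60 second reset period, the daemon gets roughly fifteen chances to reset the timer before a timeout could occur.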

ERRORS

Errors are logged to the bmc-watchdog log.

EXAMPLES

Set up a bmc-watchdog daemon that resets the machine after 15 minutes (900 seconds) if the OS has crashed (see the default bmc-watchdog rc script /etc/init.d/bmc-watchdog for a more complete example):
        bmc-watchdog -d -u 4 -p 0 -a 1 -i 900

KNOWN ISSUES

bmc-watchdog may fail to reset the watchdog timer if it is not scheduled in time. It is always recommended that bmc-watchdog be executed with a high scheduling priority.
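
One possible way to raise the scheduling priority is to start the daemon through the standard nice utility, as sketched below. The niceness value of -10, the function name start_prioritized_daemon, the run wrapper, and the DRY_RUN variable are assumptions of this sketch. Whether a lower niceness is sufficient depends on the system and its load.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints the command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

start_prioritized_daemon() {
    # A negative niceness raises priority and requires root, which
    # bmc-watchdog needs in any case.
    run nice -n -10 bmc-watchdog --daemon --timer-use=4 --timeout-action=1 \
        --initial-countdown=900
}

start_prioritized_daemon
```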

On some machines, the hardware-based SMI handler may disable a processor after a watchdog timer timeout if the timer use is set to something other than SMS/OS.

ORIGIN

Developed by Albert Chu <chu11@llnl.gov> on LLNL's GNU/Linux clusters. This software is open source and distributed under the terms of the GNU GPL.