Monitoring Redundant Power Supplies using IPMI Tool and Nagios

April 22nd, 2013

For Supermicro motherboards that use IPMI, you can set up monitoring of redundant power supplies (RPS) using Nagios and an IPMI plugin.

IPMItool is an open source utility for working with the IPMI management cards in some servers. Depending on your Linux distribution, you can probably run “apt-get install ipmitool” or “yum install ipmitool” to get it. It is basically a command line tool that can be used instead of the IPMI web interface.

First: Get the plugin

The plugin for checking Supermicro power supplies can be found on the tummy.com FTP site. This plugin is written for the X8 class motherboards, and may need changes in the IPMI raw commands to work with other boards.

You can drop this in your Nagios plugins directory, usually /usr/lib/nagios/plugins. As with any script, I suggest at least reading through it to get an idea of how it works. With no arguments it prints the required command line format:

# ./check_ipmi_powersupply 
USAGE: -H host -U ipmi_username -P ipmi_password

Not too complicated, and it looks like almost any other Nagios plugin. Here is some Nagios command glue to help you use it:

define command{
	command_name	check_ipmi_powersupply
	command_line	$USER1$/check_ipmi_powersupply -H $HOSTADDRESS$ -U ADMIN -P $ARG1$
}

And to use it as a service for some host:

define service{
        use                             generic-service
        host_name                       My-Really-Important-Server
        service_description             POWERSUPPLY
        contact_groups                  admin
        check_command                   check_ipmi_powersupply!supersecretpassword
}

As you can see, I pass the password as the first argument, which lets me use the same command definition on multiple hosts. I found that the ADMIN account was the only account with the privilege to send the raw commands needed to check the power supply this way.

IPMI Raw Command

So a Nagios plugin that checks power supplies, no big deal, right? Maybe, but if you want to get the job done right, you have to monitor the server completely, from the health of the power supply all the way up to the status code of the Apache page. The real magic here comes from the raw IPMI command that IPMItool sends. This raw command does a very low-level query on the data bus that the power supply is connected to. Here is the explanation from the Supermicro engineer I worked with to make this check:

# ipmitool -H <IP Address> -U <User ID> -P <User Password> raw 0x06 0x52 0x07 0x78 0x01 0x78
>>
>> NetFn: 0x06
>> Cmd  : 0x52
>> Data : 0x07 // bus 3 for X8 motherboard
>>        0x78 // slave address of PS (it can be 0x78, 0x7a, 0x7c for 3 redundant PS)
>>        0x01 // read 1 byte
>>        0x78 // where 78 is offset of the PS, 0-bad, 1-good
>>
>> If the power supply is installed but failed, it will return value 0.
>> If the power supply totally lose the power, it will reply an error message.
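The reply handling described above could be mapped onto Nagios exit codes roughly like this. It is a minimal sketch, not the tummy.com plugin itself: the function name ps_status_from_reply is made up for illustration, and it assumes ipmitool prints the byte it read as two hex digits (for example " 01"). Treating a failed command as CRITICAL rather than UNKNOWN is also my own choice, since an error can mean the supply lost power but could equally be a network or login problem.

```shell
#!/bin/sh
# Hypothetical helper: translate one "raw 0x06 0x52 ..." reply into a
# Nagios status line and return code (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
#   $1 = exit status of the ipmitool command
#   $2 = text ipmitool printed on stdout (assumed to look like " 01")
ps_status_from_reply() {
    ipmi_rc=$1
    # Strip spaces, tabs, and line endings so " 01" becomes "01".
    ipmi_reply=$(printf '%s' "$2" | tr -d ' \t\r\n')

    if [ "$ipmi_rc" -ne 0 ]; then
        # Per the engineer's note, a supply that has lost power entirely
        # answers with an error instead of a status byte.
        echo "CRITICAL: no reply from power supply (power lost?)"
        return 2
    fi

    case "$ipmi_reply" in
        01) echo "OK: power supply reports good"; return 0 ;;
        00) echo "CRITICAL: power supply failed"; return 2 ;;
        *)  echo "UNKNOWN: unexpected reply '$ipmi_reply'"; return 3 ;;
    esac
}

# Possible use ($host, $user, $pass are placeholders you would set yourself):
#   reply=$(ipmitool -H "$host" -U "$user" -P "$pass" \
#           raw 0x06 0x52 0x07 0x78 0x01 0x78 2>/dev/null)
#   ps_status_from_reply $? "$reply"
```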

And this is the main reason for this blog post: to get this IPMI raw command out in the open. Special thanks go out to the Supermicro engineer who was able to pass down these commands from deep within the bowels of their documentation.

It is worth noting that this particular command will only work on X8 class motherboards. The values for other motherboard types will need to be looked up. If you are deploying this on a Supermicro 4-Node 6026TT, only the blade in the A slot has access to this data bus.
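Since the bus byte is board-specific and each redundant supply has its own slave address, it may help to keep those values in one place. The sketch below only builds the argument strings and does not run ipmitool. The helper names ps_raw_args and ps_all_raw_args are hypothetical, and the only values filled in are the X8 ones quoted in this post; I do not know the right bytes for other boards.

```shell
#!/bin/sh
# Board-specific bus byte. 0x07 is the value given in this post for X8
# boards; values for other boards would have to come from Supermicro.
PS_BUS=0x07
# Slave addresses listed in this post for up to three redundant supplies.
PS_ADDRS="0x78 0x7a 0x7c"

# Hypothetical helper: print the ipmitool raw arguments for one supply.
#   $1 = slave address of the supply
ps_raw_args() {
    # NetFn 0x06, Cmd 0x52, bus, slave address, read 1 byte, offset 0x78.
    echo "raw 0x06 0x52 $PS_BUS $1 0x01 0x78"
}

# Hypothetical helper: print one argument string per supply, one per line.
ps_all_raw_args() {
    for addr in $PS_ADDRS; do
        ps_raw_args "$addr"
    done
}
```

Each printed line could then be appended to an "ipmitool -H ... -U ... -P ..." invocation, one per installed supply.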
