<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Linux System Admins Blog &#187; Nagios</title>
	<atom:link href="http://linuxsysadminblog.com/category/nagios/feed/" rel="self" type="application/rss+xml" />
	<link>http://linuxsysadminblog.com</link>
	<description>System admins of Promet - an e-commerce, high availability Open Source web shop - share their findings</description>
	<lastBuildDate>Sat, 10 Jul 2010 01:33:47 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Setup Nagios User to View Specific Host and Services</title>
		<link>http://linuxsysadminblog.com/2009/05/setup-nagios-user-to-view-specific-host-and-services/</link>
		<comments>http://linuxsysadminblog.com/2009/05/setup-nagios-user-to-view-specific-host-and-services/#comments</comments>
		<pubDate>Fri, 15 May 2009 00:29:00 +0000</pubDate>
		<dc:creator>gerold</dc:creator>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Nagios]]></category>
		<category><![CDATA[Tips and Tricks]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[nagios users]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=654</guid>
		<description><![CDATA[This guide will help you set up a Nagios user with limited access to host and service checks.  It is helpful when you want to allow your customers or clients to view and receive alerts for their own servers and services, for example on dedicated servers.
Procedure:
  Contacts:  Create new contact definitions for your client.
  [...]]]></description>
			<content:encoded><![CDATA[<p>This guide will help you set up a Nagios user with limited access to host and service checks.  It is helpful when you want to allow your customers or clients to view and receive alerts for their own servers and services, for example on dedicated servers.</p>
<p><strong>Procedure:</strong></p>
<p>  <strong>Contacts: </strong> Create new contact definitions for your client.<br />
  <code lang="html">	define contact{<br />
        contact_name                    customer1<br />
        alias                           Customer One Admin<br />
        service_notification_period     24x7<br />
        host_notification_period        24x7<br />
        service_notification_options    c,r<br />
        host_notification_options       d,r<br />
        service_notification_commands   notify-service-by-email<br />
        host_notification_commands      notify-host-by-email<br />
        email                           customer1@domain.tld<br />
    }  </code><br />
<span id="more-654"></span><br />
  <strong>Groups:  </strong>Create a contact group, or add the new contact to an existing group, depending on the checks that you want to allow.<br />
<code lang="html">	define contactgroup {<br />
        contactgroup_name               Dedicated-Server1-Admins<br />
        alias                           Admins for Server 1<br />
        members                         customer1,hostingadmins<br />
	}</code></p>
<p>  <strong>Hosts / Services: </strong>  Assign the new contact group, which contains the customer&#8217;s contact and your main admin, to the host and its services.  Note that I used an existing host group, but you can create a new one if you want.<br />
<code lang="html">	define host {<br />
        use                            generic-host<br />
        host_name                      Server1<br />
        alias                          Server1<br />
        address                        10.0.0.2  ; private or public IP<br />
        hostgroups                     DedicateServers<br />
        check_command                  check-host-alive<br />
        contact_groups                 Dedicated-Server1-Admins<br />
        check_period                   24x7<br />
        max_check_attempts             10<br />
        notification_interval          480<br />
        notification_period            24x7<br />
        notification_options           d,r<br />
        notifications_enabled          1<br />
	}<br />
    define service {<br />
        use                            generic-service<br />
        host_name                      Server1<br />
        service_description            HTTP<br />
        is_volatile                    0<br />
        check_period                   24x7<br />
        max_check_attempts             3<br />
        normal_check_interval          5<br />
        retry_check_interval           3<br />
        contact_groups                 Dedicated-Server1-Admins<br />
        notification_interval          480<br />
        notification_period            24x7<br />
        notification_options           w,u,c,r<br />
        check_command                  check_http<br />
        notifications_enabled          1<br />
	}</code></p>
<p>  In my case, I created a new group, added our admin contacts and the customer to it, and then updated the contact groups for the hosts and services.  You can also create new definitions for hosts, contacts, groups, and services with different names for the clients if you don&#8217;t want to edit your existing definitions.</p>
<p>  <strong>Htaccess: </strong> Lastly, add an htaccess user to your htpasswd file (htpasswd.users).  The username must match the <em>contact_name</em> of your contact; in this sample it is customer1. <strong> [Update]</strong> If you&#8217;ve implemented &#8220;<a href="http://nagios.sourceforge.net/docs/3_0/cgisecurity.html">Digest Authentication</a>&#8221; you need to update your digest file instead of the htpasswd file.</p>
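<p>A minimal sketch of the htpasswd step, assuming the file lives at /usr/local/nagios/etc/htpasswd.users (the path and the helper name are assumptions, so adjust them to your installation).  The helper only checks whether the user already has an entry; the htpasswd command itself is left as a comment because it prompts for a password.</p>

```shell
# Assumed location of the htpasswd file; adjust to your Nagios installation.
HTPASSWD_FILE=/usr/local/nagios/etc/htpasswd.users

# Hypothetical helper: succeeds when the given user already has an entry
# in the given htpasswd file (entries have the form "user:hash").
has_htpasswd_user() {
    grep -q "^$1:" "$2"
}

# Add the customer only when the entry is missing (prompts for a password):
#   if ! has_htpasswd_user customer1 "$HTPASSWD_FILE"; then
#       htpasswd "$HTPASSWD_FILE" customer1
#   fi
```

<p>The username passed to htpasswd is the same customer1 used as contact_name above, which is what lets Nagios map the login to the contact.</p>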
<p>  Don&#8217;t forget to restart Nagios.</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/05/setup-nagios-user-to-view-specific-host-and-services/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Monitoring Drupal Sites With Nagios</title>
		<link>http://linuxsysadminblog.com/2009/04/monitoring-drupal-sites-with-nagios/</link>
		<comments>http://linuxsysadminblog.com/2009/04/monitoring-drupal-sites-with-nagios/#comments</comments>
		<pubDate>Thu, 23 Apr 2009 09:22:31 +0000</pubDate>
		<dc:creator>gerold</dc:creator>
				<category><![CDATA[Centos]]></category>
		<category><![CDATA[Installation]]></category>
		<category><![CDATA[Nagios]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[monitoring]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=549</guid>
		<description><![CDATA[There is a module released for monitoring Drupal sites with Nagios.  Monitoring includes the check if your site is up and running, check for new updates on Drupal core, security, and modules, database updates, write permission on &#8220;files&#8221; directory,  check if cron is running on the specified period, and other sections of your Drupal site.  [...]]]></description>
			<content:encoded><![CDATA[<p>There is a <a href="http://drupal.org/project/nagios"><strong>module</strong></a> released for monitoring <a href="http://drupal.org/">Drupal</a> sites with <a href="http://www.nagios.org/">Nagios</a>.  It checks whether your site is up and running, whether there are new updates for Drupal core, security, and modules, whether database updates are pending, whether the &#8220;files&#8221; directory is writable, whether cron is running within the specified period, and other sections of your Drupal site.  It is intended for, and helpful to, those who maintain a large number of Drupal sites.</p>
<p>At the time of writing, this module is still a development version, and there&#8217;s no guarantee that the installation guide will work out of the box on your system.  This post mainly covers my own installation process: our Nagios monitoring server runs Nagios 3.0 on Debian, and the Drupal 6.x sites run on web servers with CentOS 5.x.</p>
<p><strong>Installation</strong>:<br />
My installation is based on the included README file and with some adjustments to my liking.</p>
<p><strong>Install the Drupal Module:</strong></p>
<ul>
<li> Download the Nagios module from <a href="http://drupal.org/project/nagios">Drupal project page</a>.</li>
<li> Install the module to your Drupal site just like the other modules.  Download tarball, extract to modules directory ex: <strong><em>sites/all/modules/</em></strong>, go to <strong><em>admin-&gt;build-&gt;modules</em></strong> and enable the module.<span id="more-549"></span></li>
<li>Configure your Nagios module and set the site&#8217;s UniqueID and Cron duration.
<p><strong>UniqueID</strong> is your site identifier to be used by the Nagios (<em>check_drupal</em>) to authorize the service check and for security purposes.  The author also suggests the use of MD5 or SHA1 string. Refer to README for more info on this parameter.</p>
<p><strong>Cron Duration</strong> &#8211; you need to supply the interval of your cron job that checks for Drupal updates.  This value should match your cron settings, ex: daily, every 3 hours, etc.</p></li>
</ul>
<p><strong>Configure Nagios checks:</strong></p>
<ul>
<li>Copy the plugin file (<strong><em>check_drupal</em></strong>) found on the <strong><em>nagios-plugin</em></strong> directory of the module, to your Nagios plugins directory where the other Nagios check commands are located &#8211; in my case it&#8217;s on <strong><em>/usr/local/nagios/libexec/</em></strong> (CentOS).
<p>If your Nagios installation is on a different machine than your Drupal server, copy the <em><strong>check_drupal</strong></em> file to the Nagios machine.  You can also put it on the server that hosts the Drupal sites and use NRPE instead.</p>
<p>On my CentOS machine I received the error below from <strong><em>check_drupal</em></strong> because the script expects <em><strong>basename</strong></em> in /usr/bin, while on my system it is at <em><strong>/bin/basename</strong></em>.  You can edit the <em><strong>check_drupal</strong></em> file directly to adjust the path to <em><strong>basename</strong></em>.<br />
<code>./check_drupal: line 14: /usr/bin/basename: No such file or directory.</code></li>
<li> <strong>Add command, host, hostgroup, and service definition:</strong>
<p><strong>Command </strong>(commands.cfg):  I made a small modification to the command given in the README file to match my setup.<br />
<code>define command{<br />
command_name  check_drupal<br />
command_line  $USER1$/check_drupal -H $ARG1$ -U $ARG2$ -t $ARG3$<br />
}<br />
</code><br />
<strong>HostGroup</strong>:  I created a new Host group because we have other service checks on our server such as SSH, HTTP, LOAD, etc and I want to separate my checks for Drupal sites.<br />
<code>define hostgroup {<br />
hostgroup_name  Drupal<br />
alias           Drupal Sites<br />
members         MyWebServer<br />
}<br />
</code><br />
<strong>Host:</strong> I defined a new host for the Drupal sites so I can configure and group them under the host where they belong.<br />
<code>define host {<br />
host_name                      MyWebServer<br />
display_name                   MyWebServer<br />
address                        HOSTNAME/IP ADDRESS HERE<br />
hostgroups                     Drupal<br />
check_command                  check-host-alive<br />
contact_groups                 Admins<br />
check_period                   24x7<br />
max_check_attempts             10<br />
notification_interval          480<br />
notification_period            24x7<br />
notification_options           d,r<br />
notifications_enabled          1<br />
}<br />
</code><br />
<strong>Service:</strong> Below is my service check definition for Drupal sites. For each additional site I only need to copy it and change the parameters for the domain, the unique key, and the timeout.<br />
<code> define service {<br />
service_description            DRUPAL_SITE 1<br />
host_name                      MyWebServer<br />
check_period                   24x7<br />
max_check_attempts             3<br />
normal_check_interval          5<br />
retry_check_interval           3<br />
contact_groups                 Admins<br />
notification_interval          480<br />
notification_period            24x7<br />
notification_options           w,u,c,r<br />
check_command                  check_drupal!mysite.example.com!mykeyhere!5<br />
notifications_enabled          1<br />
}<br />
</code></li>
</ul>
<p>If your installation and configuration are correct, you will get a Nagios service status similar to the one below.  It indicates the number of modules, themes, users, nodes, etc.</p>
<p><code>DRUPAL OK, ADMIN:OK, CRON:OK<br />
SAN=0;SAU=0;NOD=12;USR=7;MOD=23;THM=9</code></p>
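<p>If you want to pull a single counter out of that second line, for example to log it, a small helper like the one below could do it.  This is only a sketch: perf_value is a hypothetical name, and it assumes the counters always come as KEY=VALUE pairs separated by semicolons, as in the sample above.</p>

```shell
# Hypothetical helper: print the value of one KEY from a line such as
# "SAN=0;SAU=0;NOD=12;USR=7;MOD=23;THM=9".
# Usage: perf_value KEY "LINE"
perf_value() {
    # Put every KEY=VALUE pair on its own line, then keep only the
    # value of the pair whose key matches the first argument.
    printf '%s\n' "$2" | tr ';' '\n' | sed -n "s/^$1=//p"
}

# Example:
#   perf_value MOD "SAN=0;SAU=0;NOD=12;USR=7;MOD=23;THM=9"
```

<p>The helper prints nothing when the key is not present, so an empty result can be treated as an unknown value.</p>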
<p>In my initial tests I received the Nagios status below instead of the output above.  It was caused by my Apache configuration, because the server that hosts my Drupal sites already had a default Nagios installation.</p>
<p><code>HTTP returned an error code. HTTP:   HTTP/1.1 301 Moved Permanently</code><br />
So first check the URL of your Nagios module installation, ex:  http://mysamplesite.com/nagios/, which should give you:</p>
<p><code>Nagios status page<br />
nagios=UNKNOWN, DRUPAL:UNKNOWN=Unauthorized</code></p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/04/monitoring-drupal-sites-with-nagios/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nagios:  How to check if remote process is running</title>
		<link>http://linuxsysadminblog.com/2009/02/nagios-how-to-check-if-remote-process-is-running/</link>
		<comments>http://linuxsysadminblog.com/2009/02/nagios-how-to-check-if-remote-process-is-running/#comments</comments>
		<pubDate>Tue, 24 Feb 2009 09:00:33 +0000</pubDate>
		<dc:creator>gerold</dc:creator>
				<category><![CDATA[Nagios]]></category>
		<category><![CDATA[monitoring]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=362</guid>
		<description><![CDATA[We have a monitoring server running Nagios and we needed to add checks for the Nginx process on a new server.  Basically, you only need to install NRPE to monitor services, processes, disk space, load, etc on your remote machine.  Check the NRPE documentation for complete reference and here&#8217;s a quick NRPE installation guide for Debian.
For [...]]]></description>
			<content:encoded><![CDATA[<p>We have a monitoring server running <a href="http://www.nagios.org/" target="_blank">Nagios</a> and we needed to add checks for the Nginx process on a new server.  Basically, you only need to install NRPE to monitor services, processes, disk space, load, etc on your remote machine.  Check the <a href="http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf" target="_blank">NRPE documentation</a> for complete reference and here&#8217;s a quick <a href="http://sysbible.org/x/2008/11/10/how-to-install-nagios-nrpe-under-debian-linux/" target="_blank">NRPE installation guide for Debian</a>.</p>
<p>For my objective I only need to check whether the Nginx process is running, so I will use check_procs.  NRPE and the Nagios plugins were installed, and I can check the Nginx process locally using the following command:</p>
<p><span id="more-362"></span><br />
<code>/usr/local/nagios/libexec/check_procs -c 1:30 -C nginx</code><br />
wherein :<br />
<code>-c 1:30</code> &lt;&#8211; refers to the Critical range for the number of Nginx processes. If the process count is below 1 or above 30, this will send me a Critical notice.  If you want to add a Warning level you can use &#8220;-w 1:25&#8221; &#8211; adjust the number of processes for your needs.<br />
<code> -C nginx</code> &lt;&#8211; this will check for the command name (nginx)</p>
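<p>To make the range rule concrete, here is a rough sketch of how a MIN:MAX range such as 1:30 is evaluated.  The function outside_range is a hypothetical helper, not part of check_procs, and it only handles the plain MIN:MAX form used in this post.</p>

```shell
# Hypothetical helper, not part of check_procs: report whether a process
# count falls outside a MIN:MAX range such as 1:30.
# Usage: outside_range COUNT MIN:MAX
outside_range() {
    count=$1
    min=${2%%:*}    # text before the colon
    max=${2##*:}    # text after the colon
    if [ "$count" -lt "$min" ] || [ "$count" -gt "$max" ]; then
        echo "CRITICAL"
    else
        echo "OK"
    fi
}

# Example: count the nginx processes by exact name and test them
# against the same range as above.
#   outside_range "$(pgrep -x nginx | wc -l)" 1:30
```

<p>With the range 1:30, a count of 0 (Nginx is down) and a count of 31 (too many processes) are both reported as Critical, while anything from 1 to 30 is OK.</p>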
<p><strong>NOTE:</strong> For complete reference on this check and other samples please refer to the <a href="http://nagioswiki.org/wiki/Plugin:check_procs" target="_blank">NagiosWiki</a> page.</p>
<p>Below are my configurations:</p>
<p>NRPE(remote):  <em>/etc/nagios/nrpe_local.cfg</em><br />
<code>command[check_nginx]=/usr/local/nagios/libexec/check_procs -c 1:30 -C nginx</code></p>
<p>Nagios(host):  <em>/usr/local/nagios/etc/objects/localhost.cfg</em><br />
<code>define service {<br />
use                            generic-service         ; Name of service template to use<br />
host_name                      HOST/IPADDRESS<br />
service_description            CHECK_NGINX<br />
check_period                   24x7<br />
max_check_attempts             3<br />
normal_check_interval          5<br />
retry_check_interval           3<br />
contact_groups                 Admins<br />
notification_interval          480<br />
notification_period            24x7<br />
notification_options           w,u,c,r<br />
check_command                  check_nrpe!check_nginx<br />
notifications_enabled          1<br />
}</code></p>
<p>Nagios version is 3.0.  Nagios monitoring and remote server are running Debian Etch.</p>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/02/nagios-how-to-check-if-remote-process-is-running/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Monitoring a Java application from Nagios</title>
		<link>http://linuxsysadminblog.com/2009/01/monitoring-a-java-application-from-nagios/</link>
		<comments>http://linuxsysadminblog.com/2009/01/monitoring-a-java-application-from-nagios/#comments</comments>
		<pubDate>Mon, 05 Jan 2009 21:16:47 +0000</pubDate>
		<dc:creator>Pim</dc:creator>
				<category><![CDATA[Nagios]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[java]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=174</guid>
		<description><![CDATA[This is a slight departure from our regular programming. Instead of just concentrating on the sys admin side of things I want to show how to add a Nagios check to an existing application. In this case we have a Java application for which we want to monitor whether it is running or not. Later [...]]]></description>
			<content:encoded><![CDATA[<p>This is a slight departure from our regular programming. Instead of just concentrating on the sys admin side of things I want to show how to add a Nagios check to an existing application. In this case we have a Java application for which we want to monitor whether it is running or not. Later on we can make this more detailed by monitoring error codes in the application but for the moment let&#8217;s keep it simple.</p>
<p><strong>Configuring Nagios</strong><br />
On the Nagios end of things we need to define a command to perform a check on a specific port of the server where the application is running. Add a line like this to the objects/commands.cfg file of your Nagios installation.<br />
<code><br />
define command{<br />
command_name check_your_application_name<br />
command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ -e "This application is alive and well"<br />
}</code></p>
<p>The -e parameter checks for a specific text that is to be returned by the application. This we can use later on to check for more detailed information.<span id="more-174"></span> Next we need to add a service to Nagios for using this command. We do this by adding the following lines to the objects/localhost.cfg file. To keep this short I left out some lines which configure the frequency of the checks and the types of alerts.<br />
<code><br />
define service {<br />
use                    generic-service<br />
host_name              your_server_name<br />
service_description    your_service_name<br />
check_command          check_your_application_name!2222<br />
}</code></p>
<p><strong>Creating a listener port in Java</strong><br />
In the second part I will show you the actual code to add to your application. Because this is a blog post I left out the package definition, but other than that the class itself is usable. To add the check to the Java app we need to add a listener thread to the application. We do this by creating a class that is derived from Thread. This listener will open a port which is specified by the main application and respond to any incoming connection with a preset text. We really don&#8217;t care about the input on this end, so any input will be ignored:</p>
<pre class="java">import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class NagiosChecker extends Thread {
    // Server socket
    private final ServerSocket srv;

    // Flag that indicates whether the listener is running or not.
    // Volatile because terminate() is called from another thread.
    private volatile boolean isRunning = true;

    // Constructor.
    public NagiosChecker(ServerSocket srv) {
        this.srv = srv;
    }

    // Method for terminating the listener
    public void terminate() {
        this.isRunning = false;
    }

    /**
    * This method runs when the thread is started and performs all the operations.
    */
    public void run() {
        try {
            // Wait for connections from clients.
            while (isRunning) {
                Socket socket = srv.accept();

                // Open a reader to receive (and ignore) the input
                BufferedReader rd = new BufferedReader(new InputStreamReader(socket.getInputStream()));

                // Write the status message to the output stream
                try {
                    BufferedWriter wr = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
                    wr.write("This application is alive and well");
                    wr.flush();
                } catch (IOException e) {
                    System.out.println(e.getMessage());
                }

                // Close the input stream (which also closes the socket) since we really don't care about it
                rd.close();
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}</pre>
<p>In case you&#8217;re still reading this you&#8217;re probably interested in how to call this class. The following code should be executed in the initialization of the application. It creates the actual socket for port 2222 and starts the listener class. After this the listener class will run indefinitely until the application terminates.</p>
<pre class="java">ServerSocket srv = null;
try {
    srv = new ServerSocket(2222);
    NagiosChecker checker = new NagiosChecker(srv);
    checker.start();
} catch (Exception e) {
    System.out.println(e.getMessage());
}</pre>
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2009/01/monitoring-a-java-application-from-nagios/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Homegrown MySQL monitoring</title>
		<link>http://linuxsysadminblog.com/2008/10/homegrown-mysql-monitoring/</link>
		<comments>http://linuxsysadminblog.com/2008/10/homegrown-mysql-monitoring/#comments</comments>
		<pubDate>Thu, 23 Oct 2008 12:46:49 +0000</pubDate>
		<dc:creator>Pim</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Nagios]]></category>
		<category><![CDATA[monitoring]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[connections]]></category>
		<category><![CDATA[shell]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=99</guid>
		<description><![CDATA[If you can&#8217;t do it with a shell script it usually ain&#8217;t worth doin&#8217;, right? Of course the number and quality of monitoring tools available to sys admins has gone up dramatically. Thanks to Nagios and other great tools it&#8217;s pretty easy to keep track of what&#8217;s going on and where and get notified pretty [...]]]></description>
			<content:encoded><![CDATA[<p>If you can&#8217;t do it with a shell script it usually ain&#8217;t worth doin&#8217;, right? Of course the number and quality of monitoring tools available to sys admins has gone up dramatically. Thanks to Nagios and other great tools it&#8217;s pretty easy to keep track of what&#8217;s going on and where and get notified pretty quickly. But sometimes you just want to monitor a couple of things first to see if they are actually worth monitoring. In my case it started out as a temporary thing to keep track of a recurring problem. The number of MySQL connections would max out from time to time and we needed to be alerted very quickly if this happened of course.</p>
<p>So we let crontab run a shell script every minute which just executes one command:<br />
<code>mysql -e "show processlist;" &gt; ${SCRIPTS_DIR}/jobs.tmp</code></p>
<p>This command will let you track all user connections at that moment. Of course a lot can happen in a minute and there&#8217;s lots of stuff you won&#8217;t catch, but problems that are growing will manifest themselves here. To distill the number of running connections:<br />
<code>CONNS=$(cut -f5 "${SCRIPTS_DIR}/jobs.tmp" | grep -c "Query")</code></p>
<p>The number of locked out queries:<br />
<code>LOCKED=$(cut -f7 "${SCRIPTS_DIR}/jobs.tmp" | grep -c "Locked")</code></p>
<p>The longest running query:<br />
<code>LONGRUN=$(grep "Query" "${SCRIPTS_DIR}/jobs.tmp" | cut -f6 | sort -n | tail -1)</code></p>
<p>In all these cases a simple if statement will mail the processlist in case a threshold is passed. No better way to get your attention than your mailbox filling up with an alert per minute. Besides, seeing the processlist at the time things started going wrong can be very useful in identifying the culprit.</p>
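<p>For illustration, that if statement could look roughly like the sketch below.  The threshold MAX_CONNS, the recipient ALERT_TO, and the two function names are assumptions made for this example, and it presumes a working mail command on the server.</p>

```shell
# Assumed threshold and recipient; pick values that fit your server.
MAX_CONNS=100
ALERT_TO="admin@example.com"

# Print an alert subject when VALUE exceeds LIMIT, print nothing otherwise.
# Usage: alert_subject LABEL VALUE LIMIT
alert_subject() {
    label=$1
    value=$2
    limit=$3
    if [ "$value" -gt "$limit" ]; then
        echo "MySQL alert: $label=$value exceeds $limit"
    fi
}

# Mail the saved process list, but only when a subject was produced.
# Usage: send_alert SUBJECT FILE
send_alert() {
    subject=$1
    file=$2
    if [ -n "$subject" ]; then
        cat "$file" | mail -s "$subject" "$ALERT_TO"
    fi
}

# Inside the cron script, after CONNS has been computed:
#   send_alert "$(alert_subject CONNS "$CONNS" "$MAX_CONNS")" "${SCRIPTS_DIR}/jobs.tmp"
```

<p>The same two calls can be repeated for LOCKED and LONGRUN with their own limits.</p>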
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2008/10/homegrown-mysql-monitoring/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nagios Plugin:  check_hparray Error</title>
		<link>http://linuxsysadminblog.com/2008/09/nagios-plugin-check_hparray-error/</link>
		<comments>http://linuxsysadminblog.com/2008/09/nagios-plugin-check_hparray-error/#comments</comments>
		<pubDate>Thu, 11 Sep 2008 23:22:37 +0000</pubDate>
		<dc:creator>gerold</dc:creator>
				<category><![CDATA[Installation]]></category>
		<category><![CDATA[Nagios]]></category>
		<category><![CDATA[monitoring]]></category>

		<guid isPermaLink="false">http://linuxsysadminblog.com/?p=58</guid>
		<description><![CDATA[We have check_hparray plugin installed on all of our HP servers, running CentOS, for monitoring hardware raid via
HP Array Configuration Utility CLI (hpacucli) tool, and we have NRPE installed as well for this check from our remote Nagios server.
Yesterday we set up another HP server and installed NRPE, check_hparray, and hpacucli, same process used on previous [...]]]></description>
			<content:encoded><![CDATA[<p>We have <a href="http://www.nagiosexchange.org/cgi-bin/page.cgi?g=2508.html;d=1">check_hparray plugin</a> installed on all of our HP servers, running CentOS, for monitoring hardware raid via<br />
<a href="ftp://ftp.hp.com/pub/softlib2/software1/pubsw-linux/p414707558/v47111/hpacucli-8.10-2.noarch.txt">HP Array Configuration Utility CLI</a> (hpacucli) tool, and we have <a href="http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf">NRPE</a> installed as well for this check from our remote Nagios server.</p>
<p>Yesterday we set up another HP server and installed NRPE, check_hparray, and hpacucli, using the same process as on previous installations.  NRPE worked fine locally and from the Nagios server, as the local disk space check was configured properly, but when we tried check_hparray we got &#8220;check_hparray Error&#8221;.  This error can have different causes, such as an invalid slot value or permission problems when executing the hpacucli command.</p>
<p><span id="more-58"></span></p>
<p>We reviewed our setup and installation and found the same settings as on our other servers.</p>
<p>  When we ran check_hparray through NRPE we got the error:</p>
<p><em>      [root@web161 nagios]# /usr/local/nagios/libexec/check_nrpe -H localhost -c check_raid<br />
      check_hparray Error.<br />
</em></p>
<p>but it worked fine when we ran the check_hparray command directly:</p>
<p><em>      [root@web161 nagios]# /usr/local/nagios/libexec/check_hparray -s 1<br />
      RAID OK &#8211; (Smart Array P400 in Slot 1 array A logicaldrive 1 (546.8 GB, RAID 1+0, OK))<br />
</em></p>
<p>Both of the commands above were tested using root and nagios users and they have the same results.  Then we enabled NRPE DEBUG option to get details on the problem:</p>
<p>edit: <em>/usr/local/nagios/etc/nrpe.cfg</em></p>
<p><em># DEBUGGING OPTION<br />
# This option determines whether or not debugging messages are logged to the<br />
# syslog facility.<br />
# Values: 0=debugging off, 1=debugging on</p>
<p>debug=1<br />
</em></p>
<p>Looking at the system logs, we saw that the problem was with sudo:</p>
<p><em>Sep  10 04:11:35 hostname sudo:   root : TTY=pts/2 ; PWD=/var/log ; USER=root ; COMMAND=/usr/sbin/hpacucli controller slot=1 ld all show<br />
Sep  10 04:12:55 hostname sudo:   nagios : sorry, you must have a tty to run sudo ; TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/sbin/hpacucli controller slot=1 ld all show<br />
</em></p>
<p>and the solution was to comment out in <em>/etc/sudoers</em> file the line:</p>
<p><em>Defaults    requiretty</em></p>
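<p>A quick way to see whether the setting is still active is sketched below.  The function name requiretty_enabled is hypothetical, and the per-user override mentioned in the comment is an alternative we did not use in this post, so treat it as an assumption and always edit the file with visudo.</p>

```shell
# Hypothetical helper: succeeds when an uncommented global
# "Defaults requiretty" line is present in the given sudoers file.
# Usage: requiretty_enabled FILE
requiretty_enabled() {
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+requiretty' "$1"
}

# Example (run as root):
#   if requiretty_enabled /etc/sudoers; then
#       echo "requiretty is still active"
#   fi
#
# Assumed alternative to commenting the line out: keep requiretty for
# everyone else and relax it only for the nagios user by adding
#   Defaults:nagios !requiretty
```

<p>After the change, the check_nrpe command shown earlier should return the same RAID OK line as the direct check_hparray call.</p>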
]]></content:encoded>
			<wfw:commentRss>http://linuxsysadminblog.com/2008/09/nagios-plugin-check_hparray-error/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
