Nagios Automation
cd ~/myrepos
git clone https://github.com/jonschipp/nagios-plugins.git
cd nagios-plugins
cp ~/new/plugins/check_everything.sh ~/myrepos/nagios-plugins/
git add check_everything.sh
git commit -m "adding new plug-in to try out called check_everything"
git push
Then at the next hour, all the machines will have that plug-in. Plug-in removal
can be done with:
$ git rm check_everything.sh
$ git commit -m "removing check_everything, because it doesn't really check everything"
$ git push
Again, at the next hour, the plug-in will be gone from all those systems. If an
hour is not often enough, adjust the cron job schedule as needed.
Using Git to Manage the Nagios Configuration:
With the /usr/local/nagios/etc directory in version control, it is easy to keep
track of all config changes. One of the big benefits, especially with a hosted
service like GitHub, in my opinion, is being able to make changes to the server
configuration without having to log into the Nagios server. This is helpful when
consulting from home, as I don't even need to VPN into the organization's
network. Another large benefit is being able to spin up a new Nagios server with
the exact same configuration by cloning the repository.
To get our Nagios server's configuration into a repository, create a new
repository on GitHub.com, your own Git server, etc. Then log into the Nagios
server and go to the configuration directory:
$ ssh nagios-server.company.com
$ apt-get install git
$ cd /usr/local/nagios/etc
$ git init
$ git add .
$ git commit -m "initial commit adding everything in the directories"
$ git remote add origin [email protected]:company-X/nagios-config.git
$ git push origin master
Now we have the Nagios server configuration in a repository. Changes can be made
to the configuration and pushed back up e.g.:
$ cd /usr/local/nagios/etc/objects/
$ vim templates.cfg
$ git add templates.cfg
$ git commit -m "increased max_check_attempts to 2 for critical host template"
$ git push
Or, you can set up a cron job on the Nagios server that runs every 5 minutes to
pull down the latest configuration from the remote repository. This way, we can
make our changes from workstations and avoid having to log into the Nagios
server.
# Nagios config
*/5 * * * * root cd /usr/local/nagios/etc && git pull
Note 1: If you do not need to store the configuration on another server, it's
more efficient to make a repository out of the Nagios configuration directory
and push directly to the server over SSH. This way the cron job on the server
can be removed, and the changes will be applied via the hook as soon as the
server merges in the data from the push. One stipulation is that you need to be
able to contact the Nagios server directly, which may require a VPN connection.
Note 2: Choose one method and stick with it. Otherwise it's possible to have
conflicts, for example when you make a change on the server without syncing it
to the repository and then make another change from your workstation to the
remote repository. When the cron job next runs, Git will notice that the two
locations are not in sync and will not apply the changes until someone manually
resolves the conflicts. As mentioned previously, I've been happy not having to
touch the Nagios server directly.
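One way to make the conflict case from Note 2 fail loudly instead of leaving a
half-merged tree is to wrap the pull in a small check. The following is only a
sketch under assumptions: the function name safe_pull is made up for
illustration, and it relies on nothing more than git status --porcelain and
git pull --ff-only.

```bash
#!/bin/bash
# Sketch of a cron wrapper around "git pull" (safe_pull is a hypothetical name).

# safe_pull DIR: pull only when DIR has no uncommitted local changes.
safe_pull() {
    local repo="$1"
    (
        cd "$repo" || exit 1
        # --porcelain prints one line per modified or untracked file
        if [ -n "$(git status --porcelain)" ]; then
            echo "local changes in $repo, skipping pull" >&2
            exit 1
        fi
        # --ff-only refuses to create a merge if the histories have diverged
        git pull --ff-only
    )
}
```

A script run from the cron job could end with safe_pull /usr/local/nagios/etc,
and the non-zero exit status could be used to alert someone that the server
copy needs attention.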
Using Git Hooks to automatically apply a Nagios Configuration:
We can go further by building on the previous section to automatically put the
new changes into effect after they've been pulled down from the remote
repository. For this we will use Git hooks, which are points in the Git workflow
at which Git can execute scripts.
In each Git repository there is a hidden .git directory, which is where Git
stores all its information about the files, their changes, and so on. In this
hidden directory there's another directory called hooks, which is where we place
our script, named after the Git lifecycle point at which it should run. When a
git pull is performed, two steps are done: a git fetch and a git merge. We want
to run our hook script when Git merges the new changes into the repository, so
we must name it post-merge and make it executable.
The following simple shell script first validates the new Nagios configuration
after it's pulled down. If it passes the Nagios configuration check, the new
configuration is applied by restarting Nagios and an e-mail is sent to the admin
team indicating that the new configuration was successful and is now in effect.
If the configuration check does not pass, Nagios is not restarted and an e-mail
is sent to the admin team mentioning an error in the configuration. In both
cases the diff of the latest change is included in the e-mail to give context
for the new configuration.
$ cat .git/hooks/post-merge
#!/bin/bash
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
if [ $? -ne 0 ]; then
mail -s "[nagios] [broken] nagios config failed validation" [email protected]
om <
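Since the listing above is cut off, here is a minimal sketch of what a complete
hook along the lines described could look like. It is not the original script.
The recipient address, the success subject line, the service name nagios, and
the use of git diff ORIG_HEAD HEAD to produce the diff are all assumptions made
for illustration.

```bash
#!/bin/bash
# Sketch of .git/hooks/post-merge. Paths match the text; everything else
# (address, service name, subject for the success case) is an assumption.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
ADMIN_MAIL="${ADMIN_MAIL:-admins@example.com}"   # placeholder address

# Mail the changes brought in by the merge; ORIG_HEAD is the pre-merge HEAD.
notify() {
    local subject="$1"
    git diff ORIG_HEAD HEAD | mail -s "$subject" "$ADMIN_MAIL"
}

# Validate first, restart only when validation passes.
apply_config() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        service nagios restart
        notify "[nagios] [ok] new nagios config applied"
    else
        notify "[nagios] [broken] nagios config failed validation"
        return 1
    fi
}
```

The hook file itself would end with a call to apply_config. Keeping the logic
in functions makes it possible to try the script with stand-ins for mail and
service before installing it on the server.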
Using Git to Manage the NRPE Configuration:
NRPE, the ultimate plug-in for Linux machines, has a configuration file that is
a good candidate for a Git repository. Benefits:
A single NRPE configuration file
Automatic synchronization across machines using NRPE
We can reduce complexity by using a single NRPE config file, since the file is
very simple and in most situations there's not much need to have different
versions of the file for different hosts. As explained in the other sections,
synchronization is done with a simple cron job so that all systems have the same
copy on the hour.
To do this without having to worry about the server_address or allowed_hosts
variables, which are machine dependent, we tell NRPE to include another file,
i.e. our file in the repository, which is where all the checks will be located.
That way we don't touch the machine-specific settings. Ubuntu's nrpe package has
the include option on by default, so I'll be using it as the example for the
rest of this section:
$ grep ^include /etc/nagios/nrpe.cfg
include=/etc/nagios/nrpe_local.cfg
# only snipplets ending in .cfg will get included
include_dir=/etc/nagios/nrpe.d/
Notice that nrpe.cfg includes nrpe_local.cfg and any file in the nrpe.d
directory with a .cfg extension. What we need to do is use the nrpe.d directory
as the repository, or choose another directory and update the include_dir
parameter. I'm choosing to use /etc/nagios/nrpe.d as the repository:
$ apt-get install git
$ rmdir /etc/nagios/nrpe.d
$ git clone [email protected]:company-X/nrpe-config.git /etc/nagios/nrpe.d
There are different ways this could be done, but I'm replacing the nrpe.d folder
with our repository. Do that and then configure a cron job for each system.
# Nagios NRPE config
@hourly root cd /etc/nagios/nrpe.d && git pull
From now on, clone the repository on your workstation, make changes, and then
push them back up to the remote repo; on the hour all your systems will have the
latest file. We can also add a Git hook to each machine, as explained in the
section titled Using Git Hooks to automatically apply a Nagios Configuration,
which can be used to automatically restart the NRPE daemon after the git pull so
the changes take effect immediately on the hour. Something as simple as the
following would work:
$ cat .git/hooks/post-merge
#!/bin/bash
service nrpe restart
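If restarting NRPE on every pull is more than you want, the hook can first check
whether the merge actually touched a .cfg file. This is only a sketch of that
idea: the function names are made up, the service name nrpe is taken from the
hook above and may differ on your distribution, and the check relies on
ORIG_HEAD pointing at the pre-merge commit.

```bash
#!/bin/bash
# Sketch of an NRPE post-merge hook that restarts only when needed.
# Function names are hypothetical; the service name follows the text.
NRPE_SERVICE="${NRPE_SERVICE:-nrpe}"

# True when the merge that just finished changed at least one .cfg file.
cfg_changed() {
    git diff --name-only ORIG_HEAD HEAD | grep -q '\.cfg$'
}

restart_if_changed() {
    if cfg_changed; then
        service "$NRPE_SERVICE" restart
    else
        echo "no .cfg changes, leaving $NRPE_SERVICE alone"
    fi
}
```

As with the Nagios hook, the file in .git/hooks/post-merge would end with a
call to restart_if_changed.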
Note: In the case where it's not possible to pass parameters to NRPE from the
Nagios server and the servers are not all the same, e.g. a different number of
CPUs, there will be multiple commands like check_load with different parameters
to account for the differences, e.g.
# System
command[check_load_8_cores]=/usr/local/nagios/libexec/check_load -w 6 -c 8
command[check_load_16_cores]=/usr/local/nagios/libexec/check_load -w 12 -c 16
command[check_load_host-big-dataserver]=/usr/local/nagios/libexec/check_load -w 50 -c 64
I don't consider this much of an issue, but extra checks will need to be added
so that all systems can use the same configuration file. It's useful to prefix
the hostname to the check name if only one host uses the check, e.g. when the
hostname is asterisk:
command[asterisk_check_voip]=/usr/local/nagios/libexec/check_voip status
This can all be avoided, though, by using Git in combination with Puppet and
ERB, as explained later.
Consolidating Repositories with Git Subtrees:
To build upon the usage of Git, subtrees can be used to combine multiple
repositories under one. Subtrees are a little more complicated than what we've
discussed so far, so I refer you to this excellent article on creating them. A
requirement is that each repository must be in its own directory somewhere under
the parent. For example, creating a new repository called nagios which has two
subtrees, nagios-configs and nagios-plugins:
$ tree nagios
nagios
|-- nagios-configs
`-- nagios-plugins
2 directories, 0 files
The benefit of this is that updating the subtrees pulls in all the changes from
each included repository. An example could be syncing multiple Nagios plug-in
repositories, each with different plug-ins from different authors, under one
main repository, e.g.
$ tree nagios-plugins/
nagios-plugins/
|-- linux-plugins
|   |-- check_load
|   |-- check_procs
|   `-- check_syslog
|-- network-plugins
|   |-- check_bps
|   |-- check_interface
|   `-- check_pps
`-- windows-plugins
    |-- check_iis
    `-- check_windows_update
3 directories, 8 files
You can then create a small script to update the subtrees, which will pull in
all the changes from each of the repositories. In the example of Nagios plugins,
one would follow the same steps as in the section titled Using Git to Automate
the Distribution of Nagios Plug-ins, but clone the new repository with the
subtrees to libexec. libexec will then have three subdirectories, each with
plug-ins.
Macros can then be added to the Nagios resource file which point to each of the
subdirectories under libexec. Using check_by_ssh from the server, the command
definitions would use the macros to find the locations of the plugins on the
client.
$ cat /usr/local/nagios/etc/resource.cfg
$USER1$=/usr/local/nagios/libexec
$USER2$=/usr/local/nagios/libexec/windows-plugins
$USER3$=/usr/local/nagios/libexec/linux-plugins
$USER4$=/usr/local/nagios/libexec/network-plugins
An example: The check_by_ssh plugin is located in the default plugin directory
($USER1$) on the Nagios server, but the plugin to be executed on the client is
in the directory with the Linux plugins ($USER3$).
define command{
    command_name    check_linux_service
    command_line    $USER1$/check_by_ssh -p 22 \
        -H $HOSTADDRESS$ -l nagios -i /home/nagios/.ssh/$HOSTNAME$ \
        -C "sudo $USER3$/check_service.sh -o linux -s $ARG1$"
}
This is useful for organization, or if you have Windows admins writing Windows
plugins and Linux admins writing Linux plugins, each in their own separate
repos, and you combine them for production. Another use case is if you find a
number of plugin repositories on GitHub and want to use them while always
staying current with the authors' changes. Subtrees solve that problem.
You could add a script to the repository to pull in the changes for each
subtree. You would run this periodically to pull down the changes and then push
them up to the parent repository.
#!/bin/bash
if ! git remote | grep -q linux-plugins; then
    echo "adding linux-plugins remote"
    git remote add linux-plugins https://github.com/company-X/linux-plugins.git
fi
if ! git remote | grep -q windows-plugins; then
    echo "adding windows-plugins remote"
    git remote add windows-plugins https://github.com/company-X/windows-plugins.git
fi
if ! git remote | grep -q network-plugins; then
    echo "adding network-plugins remote"
    git remote add network-plugins https://github.com/company-X/network-plugins.git
fi
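The script above only registers the remotes. A possible continuation that
actually pulls each subtree might look like the sketch below. It assumes each
subtree lives in a top-level directory with the same name as its remote and
that upstream uses a master branch; both are guesses based on the example
layout, and update_subtrees is a made-up name.

```bash
#!/bin/bash
# Sketch: pull every subtree from its remote (assumed layout and branch).
BRANCH="${BRANCH:-master}"
SUBTREES="linux-plugins windows-plugins network-plugins"

update_subtrees() {
    local name
    for name in $SUBTREES; do
        echo "updating subtree $name"
        # prefix = directory of the subtree, second argument = remote name
        git subtree pull --prefix="$name" "$name" "$BRANCH" || return 1
    done
}
```

After update_subtrees finishes without errors, a plain git push sends the
merged changes up to the parent repository.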
Place your Nagios repositories as subtrees under a Puppet repository and
configure your Puppet manifests to use them:
$ tree -d
puppet
`-- modules
    `-- nagios
        |-- files                 # Files that get copied to systems
        |   |-- nagios-config     # subtree
        |   |   `-- objects
        |   |       |-- hosts
        |   |       `-- templates
        |   `-- nagios-plugins    # subtree
        |-- manifests             # Puppet configuration files
        `-- templates             # Files that get modified e.g. nrpe.cfg using ERB
A file resource type in Puppet can be used to automatically copy the files under
nagios-plugins to each machine and set the permissions and ownership. When the
subtree is updated, Puppet sees that the files have changed and copies over the
new changes each time.
$nagios_plugins = "/usr/local/nagios/libexec"
file { $nagios_plugins:
  ensure => "directory",
  owner  => "root",
  group  => "nagios",
  mode   => "0550",
}
class nagios::nrpe_install {
  $version = "2.15"
  $install_script = "/usr/local/nagios/install_nrpe.sh"
  package { "openssl-devel":
    ensure => installed,
  }
  file { $install_script:
    source  => "puppet:///modules/nagios/install_nrpe.sh",
    mode    => "0755",
    require => Package["openssl-devel"],
  }
  exec { "$install_script":
    logoutput => true,
    timeout   => 600,
    unless    => "/usr/local/nagios/bin/nrpe -h | /bin/grep -q 'Version: 2.15'",
    require   => [ File[$install_script], Class["nagios::nrpe_configure"] ],
  }
  service { "nrpe":
    name   => "nrpe",
    ensure => true,
  }
}
The dont_blame_nrpe option allows the Nagios server to pass macros (command
arguments) to the NRPE configuration file. This is a security risk that the NRPE
creators have warned about, and it should only be used if necessary. To avoid
the situation where all hosts have it enabled, we can use Puppet, Facter
(hostname), and ERB to significantly minimize the exposure by only enabling
dont_blame_nrpe on hosts that absolutely require it.
# COMMAND ARGUMENT PROCESSING
# This option determines whether or not the NRPE daemon will allow clients
# to specify arguments to commands that are executed. This option only works
# if the daemon was configured with the --enable-command-args configure script
# option.
#
# *** ENABLING THIS OPTION IS A SECURITY RISK! ***
# Read the SECURITY file for information on some of the security implications
# of enabling this variable.
#
# Values: 0=do not allow arguments, 1=allow command arguments
<% if (hostname == 'syslog') || (hostname == 'syslog1') then %>
dont_blame_nrpe=1
<% else %>
dont_blame_nrpe=0
<% end %>
- See more at: http://sickbits.net/nagios-deployment-automation-tips-and-tricks/