Hostgroups In Services

Creating services in Nagios is pretty straightforward. Let's take this one as an example:

define service{
        use local-service
        host_name host1
        service_description Disk Usage - C:
        check_command check_nrpe!CheckDriveSize!ShowAll MinWarn=10G MinCrit=5G Drive=C:
        max_check_attempts 4
        check_interval 1
        retry_interval 1
        }

That's great for one service on one host. But what do you do later on when you add another host (host2) and want it to get identical services? In any monitoring setup you want to be able to add hosts as quickly as possible, while staying confident that all the important items are being monitored AND to the standards you have defined.


Option 1
You could just duplicate the service definition and change host1 to host2. While this is pretty simple, if you had ten services then you would need to do this for all ten. Later on, when you want to change a setting on this service AND have every host with the same service pick up the change, you would have to edit every copy ... so that's becoming an administrative overhead just thinking about it!


Option 2
You could just ADD the new hostname (host2) to the existing host_name directive as a comma-separated list, like this:

        host_name host1,host2

Great, this means you now have one service definition that is applied to multiple hosts. Later on, when you want to change a setting on this service, you only need to make the change in one location and ALL the hosts will get the updated setting.

However, if you had ten common services then you would need to update the host_name directive for all ten. So while there is less to do, there is still a bit of administrative overhead involved. It would also be easy to miss one of the ten services, leaving the new host without an important check ... and you won't really know about it until the day the host's C: drive fills up!


Option 3
Use hostgroups. A service can be assigned to host(s) using the host_name directive and/or the hostgroup_name directive. So when you create your new host definition, you can add the host to a hostgroup. In this example I'm using the name nrpe_hosts_windows, something like this:

define hostgroup {
        hostgroup_name nrpe_hosts_windows
        alias nrpe_hosts_windows
        members host1,host2
        }
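
As an alternative sketch, membership can also be declared from the host side with the hostgroups directive in the host definition, instead of listing every host in members. The template name windows-server and the address here are assumptions for illustration only, so substitute your own:

define host{
        use windows-server ; assumed host template, not part of this example setup
        host_name host2
        alias host2
        address 192.0.2.12 ; placeholder address from the documentation range
        hostgroups nrpe_hosts_windows ; joins the hostgroup from the host side
        }

With this style, adding a new host only touches that host's own definition.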

The service definition now looks like this:

define service{
        use local-service
        hostgroup_name nrpe_hosts_windows
        service_description Disk Usage - C:
        check_command check_nrpe!CheckDriveSize!ShowAll MinWarn=10G MinCrit=5G Drive=C:
        max_check_attempts 4
        check_interval 1
        retry_interval 1
        }

As soon as you add the new host to the hostgroup nrpe_hosts_windows and restart Nagios, this host will automatically get the service Disk Usage - C:. Any other services that use this hostgroup will also be created for the new host. This means you have much less administrative overhead AND consistent monitoring across your environment.
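
To illustrate the point about other services, here is a rough sketch of a second service attached to the same hostgroup. It assumes the agents expose the NSClient++ CheckCPU command and that check_nrpe is defined the same way as in the disk check, with the command name and its arguments as two separate arguments. The thresholds are made up for illustration:

define service{
        use local-service
        hostgroup_name nrpe_hosts_windows
        service_description CPU Load
        ; assumed command and illustrative thresholds, adjust to your agent setup
        check_command check_nrpe!CheckCPU!warn=80 crit=90 time=5m
        max_check_attempts 4
        check_interval 1
        retry_interval 1
        }

Every member of nrpe_hosts_windows would pick up this check as well, with no per-host edits.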

Having said that, this approach also brings limitations. There are always going to be exceptions where you need different warning and critical thresholds for a specific server/disk. When these instances arise you'll find that the host can't use the common service any more, and you'll need to remove that host from this hostgroup and create a dedicated service for it.
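
One possible way to handle such an exception without taking the host out of the hostgroup is sketched below. The Nagios documentation on object definition tricks describes a ! prefix for excluding a host from a service, so the common service can skip host2 while a dedicated service gives it its own thresholds. The thresholds for host2 are made up for illustration, and it is worth confirming the exclusion behaviour on your Nagios version with a configuration check (nagios -v) before relying on it:

define service{
        use local-service
        hostgroup_name nrpe_hosts_windows
        host_name !host2 ; exclude host2 from the common service
        service_description Disk Usage - C:
        check_command check_nrpe!CheckDriveSize!ShowAll MinWarn=10G MinCrit=5G Drive=C:
        max_check_attempts 4
        check_interval 1
        retry_interval 1
        }

define service{
        use local-service
        host_name host2
        service_description Disk Usage - C:
        ; illustrative thresholds for the exception host
        check_command check_nrpe!CheckDriveSize!ShowAll MinWarn=2G MinCrit=1G Drive=C:
        max_check_attempts 4
        check_interval 1
        retry_interval 1
        }

The trade-off is that every exception adds another definition to maintain, so it pays to keep them few and easy to find.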

With this in mind, you should plan how you want to create your definitions and, more to the point, how the exceptions will be catered for.