Windows monitoring with Shinken 2
In this post we are going to monitor a Windows machine. There are three ways to do this: the classic SNMP protocol, the NRPE agent, or Microsoft's Windows Management Instrumentation (WMI).
Here we will configure an NRPE agent. The principle is that the Shinken server initiates a connection to a remote process, the process calls the system command requested by Shinken and returns the result with the return code and standard output.
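To make the principle concrete, here is a minimal Python sketch of what a monitoring server does with a plugin: it runs a command, then reads the exit code and standard output. The mapping 0/1/2/3 to OK/WARNING/CRITICAL/UNKNOWN is the usual Nagios plugin convention; the `run_check` helper and the fake plugin are illustrative names of my own, not part of Shinken.

```python
import subprocess
import sys

# Conventional Nagios-style plugin exit codes
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, timeout=10):
    """Run a plugin command and return (state, output). Hypothetical helper."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The plugin could not be run or did not answer in time
        return "UNKNOWN", str(exc)
    state = STATES.get(proc.returncode, "UNKNOWN")
    return state, proc.stdout.strip()


# Stand-in for a real plugin: prints one status line and exits with code 1
fake_plugin = [
    sys.executable,
    "-c",
    "print('DISK WARNING - 85% used'); raise SystemExit(1)",
]

print(run_check(fake_plugin))
```

In a real setup the argument list would be the check_nt call shown later in this post instead of the fake plugin.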
I. Configuration of the Windows computer
Download and install the NSClient++ client.
During installation, you will be prompted for the agent password and the address of the Shinken server. Keep the password safe: you will need it when configuring the Shinken server. A word of advice: avoid special characters such as “!”, which Shinken uses as the argument separator in check_command definitions.
Check the following boxes:
- Enable common check plugins: activates NRPE base plugins
- Enable nsclient server (check_nt): mandatory for check_nt plugins to work from Shinken
- Enable NRPE server (check_nrpe): activates agent mode, which is needed to run custom monitoring scripts.
- Enable WMI checks: as mentioned earlier, this activates Microsoft-style monitoring through WMI.
Once installed, you can start the service. Open the service manager (services.msc) and check that NSClient++ is in the “Started” state with the “Automatic” startup type.
The service must be restarted after each change to the agent configuration, which is located under C:\Program Files\NSClient++\nsclient.
One last detail to complete the configuration. For added security, Windows restricts remote access and permissions for services by default. To check, right-click the service and open its properties. In the Log On tab, check “Allow service to interact with desktop”.
II. Configuring Shinken
We will first test that the agent is reachable by calling the check_nt plugin directly.
    /usr/lib/nagios/plugins/check_nt -H host.domain.local -p 12489 -v CLIENTVERSION -s password
Explanation of the options:
- -H: Name or IP address of the host to query.
- -p: Port. The default is 12489.
- -s: The password entered when installing the NSClient++ client.
- -v: Variable to query
Here, we ask for the version of the agent installed on the machine. The result should look like this:
    NSClient++ 0,4,1,105 2014-04-28
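If the command hangs or times out instead of printing a version, it can help to check first that the agent port answers at all. The sketch below is a plain TCP connection test in Python; `port_is_open` is a hypothetical helper of mine, and it only shows that something is listening on the port, not that the password is accepted.

```python
import socket


def port_is_open(host, port=12489, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


# Example call, using the placeholder host name from this post:
# port_is_open("host.domain.local")
```

A False result usually points at the Windows firewall or a stopped NSClient++ service rather than at the Shinken side.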
The agent is working. We will now create a Shinken command that uses this plugin. To do so, create the file commands/check_nt.cfg with the following lines:
    define command {
        command_name  check_nt  ; Name of the command to be called
        command_line  $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s password -v $ARG1$ $ARG2$  ; Raw syntax of the command
    }
The command is simply the call we made earlier to test the plugin, written as a template. The only difference is the $ARG1$ and $ARG2$ macros: $ARG1$ supplies the variable for -v, and $ARG2$ carries any additional parameters. Many checks need such a parameter, for example a warning or critical threshold, the name of a particular service or a drive letter.
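To see how a check_command value ends up as a real command line, here is a small Python sketch of the macro substitution. It is a simplified model for illustration, not Shinken's actual code: `expand_check_command` is a hypothetical helper, and the value of $USER1$ is assumed to be the plugin directory used in the test above.

```python
import re


def expand_check_command(check_command, command_line, macros):
    """Split check_command on '!' and substitute $ARGn$ and other macros."""
    name, *args = check_command.split("!")
    values = dict(macros)
    for i, arg in enumerate(args, start=1):
        values[f"$ARG{i}$"] = arg
    expanded = command_line
    for macro, value in values.items():
        expanded = expanded.replace(macro, value)
    # Any $ARGn$ macro that received no value expands to an empty string
    expanded = re.sub(r"\$ARG\d+\$", "", expanded)
    # Collapse the whitespace left behind by empty macros
    return name, " ".join(expanded.split())


COMMAND_LINE = "$USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s password -v $ARG1$ $ARG2$"
MACROS = {
    "$USER1$": "/usr/lib/nagios/plugins",  # assumed plugin directory
    "$HOSTADDRESS$": "serveur_windows.domain.local",
}

print(expand_check_command("check_nt!UPTIME", COMMAND_LINE, MACROS))
print(expand_check_command("check_nt!MEMUSE!-w 80 -c 90", COMMAND_LINE, MACROS))
```

Everything before the first “!” is the command name, and each following field fills the next $ARGn$ macro in order.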
Next we create a host group that will serve as the base for all Windows servers. Create the file hostgroups/windows_nrpe.cfg with the following lines:
    define hostgroup{
        hostgroup_name  windows_nrpe
        alias           Windows servers via NSClient++
        members         serveur_windows
    }
We will now attach services to this group. Every Windows server in the group will therefore be monitored by these services. You can place them after the hostgroup definition in the same file for readability, or create a new configuration file.
Display the version of the NSClient++ agent
    define service {
        service_description  Check version NS Client  ; Description of the service
        hostgroup_name       windows_nrpe             ; Group on which the command will be run
        use                  generic-service          ; Use the generic template
        check_command        check_nt!CLIENTVERSION   ; Command to run
    }
Machine uptime
    define service {
        service_description  Uptime
        hostgroup_name       windows_nrpe
        use                  generic-service
        check_command        check_nt!UPTIME
    }
CPU load
This query uses a calculation close to the load average on a Linux/Unix machine: an average over the last minute, the last 5 minutes and the last 15 minutes. The warning (90) and critical (95) thresholds are specified for each of these intervals.
    define service {
        service_description  CPU load
        hostgroup_name       windows_nrpe
        use                  generic-service
        check_command        check_nt!CPULOAD!-l 1,90,95,5,90,95,15,90,95  ; Like Linux
    }
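The value passed to -l is easier to read once it is split into groups of three: an interval in minutes, a warning threshold and a critical threshold. The Python sketch below does only that splitting; `parse_cpuload_thresholds` is a hypothetical helper for illustration and is not part of check_nt.

```python
def parse_cpuload_thresholds(spec):
    """Split '1,90,95,5,90,95,...' into (minutes, warning, critical) triplets."""
    values = [int(v) for v in spec.split(",")]
    if len(values) % 3 != 0:
        raise ValueError("expected triplets of minutes,warning,critical")
    return [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]


for minutes, warning, critical in parse_cpuload_thresholds("1,90,95,5,90,95,15,90,95"):
    print(f"last {minutes} min: warning at {warning}%, critical at {critical}%")
```

Written this way, it is clear that each interval can carry its own pair of thresholds, even though the example uses 90 and 95 for all three.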
Memory load
A warning at 80% usage and a critical alert at 90%.
    define service {
        service_description  RAM load
        hostgroup_name       windows_nrpe
        use                  generic-service
        check_command        check_nt!MEMUSE!-w 80 -c 90
    }
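As a worked example of how the -w and -c pair turns a measured value into a state, here is a tiny Python sketch. It assumes a value at or above a threshold triggers that state; the plugin's exact behaviour at the boundary may differ, and `evaluate` is a hypothetical name.

```python
def evaluate(value, warning, critical):
    """Map a percentage to a (code, state) pair, checking critical first."""
    if value >= critical:
        return 2, "CRITICAL"
    if value >= warning:
        return 1, "WARNING"
    return 0, "OK"


# Thresholds from the RAM service above: -w 80 -c 90
for used in (75, 85, 95):
    print(used, evaluate(used, warning=80, critical=90))
```

The critical threshold is tested first so that a value above both thresholds is reported as critical, not as a warning.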
Disk usage of drive “C:”
- -l c: the drive to monitor
- -w: warning threshold
- -c: critical threshold
    define service {
        service_description  Disk C usage
        hostgroup_name       windows_nrpe
        use                  generic-service
        check_command        check_nt!USEDDISKSPACE!-l c -w 80 -c 90
    }
The basic services are now configured. Finally, we create a machine that belongs to the windows_nrpe group, in the file /etc/shinken/hosts/serveur_windows_test.cfg:
    define host{
        use        generic-host
        host_name  serveur_windows
        address    serveur_windows.domain.local
    }
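If you have more than a handful of Windows servers, writing these definitions by hand gets tedious. Below is a minimal Python sketch that renders a host block and the matching members line from a list of names. The helper names, the second host name and the assumption that every address is the host name followed by domain.local are mine, for illustration only.

```python
HOST_TEMPLATE = """define host{{
    use        generic-host
    host_name  {name}
    address    {name}.{domain}
}}
"""


def render_host(name, domain="domain.local"):
    """Return a host definition block for one server."""
    return HOST_TEMPLATE.format(name=name, domain=domain)


def render_members(names):
    """Return the comma-separated members line for the hostgroup."""
    return "members " + ",".join(names)


servers = ["serveur_windows", "serveur_windows2"]  # second name is hypothetical
for server in servers:
    print(render_host(server))
print(render_members(servers))
```

The output can be pasted into the hosts and hostgroups files described in this post.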
Edit the hostgroups/windows_nrpe.cfg file to add this machine as a member.
Finally, restart the Shinken Arbiter to take the changes into account.
    /etc/init.d/shinken-arbiter restart
The result is the following