I’ve had some issues with some of my VMWare guests crashing for some odd reason.
The guests were my pfSense routers, so when they crashed, the house lost internet which was causing some issues as you could imagine!
Since I couldn’t work out why the guests kept crashing at first, I configured Zabbix to just reboot the virtual machines each time they went down so that the internet connectivity would be restored automatically. This meant that the emails that told me that my routers had crashed would actually be sent so that I would know that my internet had a hiccup, and for me to check to make sure any downloads I had running either finished properly, or that I had to redownload.
Zabbix has the ability to run remote commands on hosts via SSH, and my VMWare hosts had SSH enabled so I could run commands that I needed to reboot the hosts when the hosts went down.
Finding the VM
First thing I needed to do was to find the ID of the VM that I needed to reboot.
To get those, I needed to SSH onto my VM host and run the command vim-cmd vmsvc/getallvms | grep pfsense-2
which outputs this –
70 pfsense-2 [vmhost-500gb] pfsense-2/pfsense-2.vmx freebsd64Guest vmx-08 pfSense backup node
This command gave me the ID of 70 that I needed to use to reboot the VM from the command line.
Configuring the Zabbix Action
In order to run a command when a host went down, I created a new action to be run when a certain host goes down.
I used the following 3 conditions –
(A) Trigger value = PROBLEM
(B) Trigger = Template ICMP Ping: Template ICMP Ping is unavailable by ICMP
(C) Host = pfsense-2
This would make sure that the actions that run are only for the host that I have the ID for
In the Operations section of the action, I created a new step with the following settings –
Operation Type: Remote Command
Target : vmhost
Type : SSH
Username : root
Password : password
Port : 22
Commands : /bin/vim-cmd vmsvc/power.off 70 && /bin/vim-cmd vmsvc/power.on 70
The commands that I have used instructs VMWare to forcefully power off the VM with an ID of 70 – which in this case is my pfsense-2 guest, and then power it back on.
This was done on an ESX 5.1 host, but should work on anything newer as long as SSH is enabled.