Mass deploy Zabbix agent using Ansible

Zabbix is an enterprise-class open source monitoring tool. I have been working on integrating Zabbix with the IBM Netcool suite for event management and, downstream, with ServiceNow for incident management.

I wanted my Zabbix environment to start monitoring some hosts quickly so that I could fire some triggers. That meant rolling out Zabbix agents to all my lab CentOS/RHEL servers, so I decided to use Ansible. For those who are not aware, it’s a configuration management tool like Puppet or Chef; however, it works over SSH and so requires no agent on the targets, which is aces in my book. Who wants yet another agent? Apart from the Zabbix one, of course…

Install Ansible

To install Ansible on a CentOS/RHEL host (I did this on my Zabbix server host):

yum install ansible -y

Edit the /etc/ansible/hosts file to include the hosts in question:

# This is the default ansible 'hosts' file.

# zabbix-server host definition

zabbix-server ansible_ssh_host=192.168.0.174 ansible_ssh_user=root

[random_group]

apache01 ansible_ssh_host=192.168.0.120 ansible_ssh_user=root

mariadb01 ansible_ssh_host=192.168.0.121 ansible_ssh_user=root

nginx01 ansible_ssh_host=192.168.0.115 ansible_ssh_user=root

redis01 ansible_ssh_host=192.168.0.122 ansible_ssh_user=root

Note the “[random_group]” host group name: groups help you categorize your hosts and apply different playbooks later, so it is worth investing some time in doing this properly.

Generate ssh-keys by running this on the Ansible host:

ssh-keygen -t rsa

and copy the public key over to each of the targets (repeat the last command for every target IP):

cd /root/.ssh

ssh-copy-id -i id_rsa.pub 192.168.0.115
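With more than a couple of targets, the copy step is easier as a loop. The following is a minimal sketch, not a definitive script: the address list mirrors the inventory above (the redis01 address is assumed to be 192.168.0.122), and DRY_RUN is a hypothetical safety switch that only prints the commands until you set it to 0.

```shell
#!/bin/sh
# Sketch: copy the Ansible host's public key to every lab target.
# TARGETS mirrors the inventory; DRY_RUN is a hypothetical safety switch.
set -eu

TARGETS="192.168.0.120 192.168.0.121 192.168.0.115 192.168.0.122"
PUBKEY="${PUBKEY:-/root/.ssh/id_rsa.pub}"
DRY_RUN="${DRY_RUN:-1}"

copy_key() {
    host="$1"
    if [ "$DRY_RUN" = "1" ]; then
        # Only print the command so the loop can be reviewed first.
        echo "ssh-copy-id -i $PUBKEY root@$host"
    else
        ssh-copy-id -i "$PUBKEY" "root@$host"
    fi
}

for host in $TARGETS; do
    copy_key "$host"
done
```

Run it once as-is to review the commands, then again with DRY_RUN=0 to actually copy the key; you will be prompted for each target's root password.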

Do a quick test by running:

ansible random_group -m ping

 

and you should get:

apache01 | SUCCESS => {

    "changed": false,

    "ping": "pong"

}

mariadb01 | SUCCESS => {

    "changed": false,

    "ping": "pong"

}

nginx01 | SUCCESS => {

    "changed": false,

    "ping": "pong"

}

redis01 | SUCCESS => {

    "changed": false,

    "ping": "pong"

}

 

Zabbix agent playbook

Ansible playbooks are YAML files that are easy to read and understand. Mine is deliberately simple, but you can find some really good playbooks online that build an entire application stack.

Soldier on and build the YAML file for Ansible. I called mine zabbix_agent.yml, with the contents below:

zabbix_agent.yml

 

---

- hosts: random_group

  remote_user: root

 


  tasks:

  - name: install epel repo

    yum: name=epel-release state=latest

 


  - name: install zabbix-agent

    yum: name=zabbix22-agent state=latest

 


  - name: copy zabbix configuration file

    copy: src=./conf/zabbix_agentd.conf dest=/etc/zabbix_agentd.conf seuser=system_u

    notify:

      - Start Zabbix-agent

 


  handlers:

  - name: Start Zabbix-agent

    service: name=zabbix-agent state=started enabled=yes

 

I am hoping you know what to put in the zabbix_agentd.conf file; the basics will do.
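If you want a starting point for those basics, here is a minimal sketch that writes a bare-bones conf file next to the playbook. It assumes the Zabbix server is the zabbix-server host from the inventory (192.168.0.174); the ZABBIX_SERVER and CONF_DIR variable names are my own, and your environment may well need more settings than these three.

```shell
#!/bin/sh
# Sketch: write a minimal conf/zabbix_agentd.conf for the playbook to copy.
# ZABBIX_SERVER and CONF_DIR are illustrative names, not Zabbix settings.
set -eu

ZABBIX_SERVER="${ZABBIX_SERVER:-192.168.0.174}"   # zabbix-server from the inventory
CONF_DIR="${CONF_DIR:-./conf}"

mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/zabbix_agentd.conf" <<EOF
# Server allowed to poll this agent (passive checks)
Server=$ZABBIX_SERVER
# Server the agent reports to (active checks, auto-registration)
ServerActive=$ZABBIX_SERVER
# Derive the host name on each target instead of hard-coding it
HostnameItem=system.hostname
EOF

echo "wrote $CONF_DIR/zabbix_agentd.conf"
```

Using HostnameItem rather than a fixed Hostname matters here because the same file is copied to every target.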

The playbook installs the EPEL repo and the Zabbix agent on CentOS, copies the conf file, and then starts the agent (the handler only runs when the copied file has changed).

The file it copies lives in a conf directory next to the YAML file, so the file structure looks like this:

.
├── conf
│   └── zabbix_agentd.conf
└── zabbix_agent.yml

 

Host metadata and agent auto-registration

While on the topic of the Zabbix agent conf file, it is worth highlighting the host metadata setting (HostMetadata). I used it to link my servers to templates automatically when the agent first connects to the Zabbix environment.

This is incredibly powerful: used the right way, most of the churn of manually linking hosts to the correct templates becomes a thing of the past. For example, you could set the host metadata to ‘auto-rhel-nginx-mariadb’; when the agent connects to Zabbix, an auto-registration action matches on the metadata and links the Linux OS, Nginx and MariaDB templates straight away.

Especially in a dynamic cloud environment with auto-scaling, where hosts come and go – this would be incredibly useful.
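To make the metadata idea concrete, here is a rough sketch of a helper that sets HostMetadata in a conf file before the playbook copies it out. The function name set_host_metadata is hypothetical; it replaces any existing (or commented-out) HostMetadata line so the file ends up with exactly one. The matching auto-registration action still has to be created on the Zabbix server side.

```shell
#!/bin/sh
# Sketch: set the HostMetadata value in a zabbix_agentd.conf file.
# set_host_metadata is a hypothetical helper, not part of Zabbix or Ansible.
set -eu

# Usage: set_host_metadata FILE VALUE
set_host_metadata() {
    conf="$1"
    value="$2"
    tmp="$conf.tmp"
    # Drop any existing or commented-out HostMetadata line, keep the rest.
    # grep exits non-zero when nothing is left, hence the "|| true".
    grep -v -E '^#? *HostMetadata=' "$conf" > "$tmp" || true
    echo "HostMetadata=$value" >> "$tmp"
    mv "$tmp" "$conf"
}

# Example call (left commented out so the file is only changed on purpose):
# set_host_metadata conf/zabbix_agentd.conf auto-rhel-nginx-mariadb
```

Because the helper is safe to run repeatedly, you could keep one conf file per host group and stamp each with its own metadata string.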

“But I don’t want to monitor and alert on everything as soon as it connects, that’s too much noise…” That’s where event management tools like Netcool can help. Based on your CMDB (ServiceNow, a CSV file, or anything else we can connect to and read), we can determine the ‘Operational State’ of your CI and build rules so that no incidents are raised for it just yet. Once it is marked as operationally live in your CMDB, you can swim in its alerts.

 

Deploying the agents

You can now run the playbook by entering the following on the Ansible host:

ansible-playbook zabbix_agent.yml

You will see output like this:

PLAY [random_group] ************************************************************

TASK [setup] *******************************************************************

ok: [apache01]

ok: [mariadb01]

ok: [redis01]

TASK [install epel repo] *******************************************************

ok: [apache01]

ok: [mariadb01]

ok: [redis01]

TASK [install zabbix-agent] ****************************************************

changed: [apache01]

changed: [mariadb01]

changed: [redis01]

TASK [copy configuration file] *************************************************

changed: [apache01]

changed: [mariadb01]

changed: [redis01]

RUNNING HANDLER [Start Zabbix-agent] *******************************************

changed: [apache01]

changed: [mariadb01]

changed: [redis01]

PLAY RECAP *********************************************************************

apache01                    : ok=5    changed=3    unreachable=0    failed=0

mariadb01                   : ok=5    changed=3    unreachable=0    failed=0

redis01                     : ok=5    changed=3    unreachable=0    failed=0

 

And if you mess up, just update the files and run the playbook again: it applies only the changes and brings everything up to date.
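Before re-running for real, you can ask Ansible what it would do. The sketch below wraps two standard ansible-playbook flags, --syntax-check and --check --diff, in a small function; the function name preflight and the ANSIBLE_PLAYBOOK override are my own additions. Note that check mode may report errors on a fresh host, since the EPEL repo is not really installed during a dry run.

```shell
#!/bin/sh
# Sketch: validate the playbook and preview changes without applying them.
# preflight and ANSIBLE_PLAYBOOK are illustrative names.
set -eu

PLAYBOOK="${PLAYBOOK:-zabbix_agent.yml}"
ANSIBLE_PLAYBOOK="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

preflight() {
    # Parse the YAML and task structure only; nothing is contacted.
    "$ANSIBLE_PLAYBOOK" "$PLAYBOOK" --syntax-check
    # Dry run against the targets, showing file differences.
    "$ANSIBLE_PLAYBOOK" "$PLAYBOOK" --check --diff
}

# Call "preflight" from the directory that holds the playbook.
```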

 

You can also run ad hoc commands with Ansible. Use the one below to bounce the Zabbix agent:

ansible random_group -a "service zabbix-agent restart" -u root

 

That’s all folks; hopefully you found this useful. Obviously, in a production environment you would use a non-root user, and Ansible can be configured to use sudo for the privileged operations. Enjoy your newfound powers!
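As a rough idea of what the non-root variant could look like, here is the agent restart again using the service module with --become (sudo). This is a sketch under assumptions: "deploy" is a hypothetical unprivileged account with sudo rights on the targets, and the inventory and playbook no longer pin root via ansible_ssh_user or remote_user, since those would take precedence over -u.

```shell
#!/bin/sh
# Sketch: restart the agent as a non-root user, escalating with sudo.
# "deploy" is a hypothetical account; ANSIBLE lets you override the binary.
set -eu

ANSIBLE="${ANSIBLE:-ansible}"
REMOTE_USER="${REMOTE_USER:-deploy}"

restart_agent() {
    group="$1"
    # -u picks the SSH user, --become escalates on the target,
    # and the service module restarts the agent.
    "$ANSIBLE" "$group" -u "$REMOTE_USER" --become \
        -m service -a "name=zabbix-agent state=restarted"
}

# Example: restart_agent random_group
```

If sudo on the targets asks for a password, adding --ask-become-pass to the command makes Ansible prompt for it.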

 
