Continuing from part 1 of this blog, which explored some of the key concepts and options for monitoring Docker, I want to follow up with an update on monitoring Docker using the IBM Tivoli Monitoring (ITM) and IBM Application Performance Management (APM) offerings.
Most of the gritty details were covered in the previous blog, so I am not going to revisit those topics. Instead, I will concentrate on what the agent I have written does and explore what the future possibilities are.
I wrote a custom agent that predominantly uses the Docker API to gather information remotely from the Docker host. The idea is to have a single host querying multiple Docker hosts rather than installing an agent on each host.
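To illustrate the single-poller idea, here is a minimal sketch in Python (it is not the agent's actual code). It loops over several Docker hosts and calls the Engine API's `/containers/json` endpoint on each. The host names, the plain TCP port 2375 and the `fetch_json` and `poll_hosts` helpers are assumptions made for the example; a real deployment would normally protect the API with TLS.

```python
import json
from urllib.request import urlopen

# Hypothetical inventory of remote Docker hosts (assumed to expose the API over TCP).
DOCKER_HOSTS = ["dockerhost1:2375", "dockerhost2:2375"]


def fetch_json(host, path, timeout=5):
    """GET a Docker Engine API path from one host and decode the JSON body."""
    with urlopen(f"http://{host}{path}", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def poll_hosts(hosts, fetch=fetch_json):
    """Return {host: list of containers, or None if the host was unreachable}."""
    results = {}
    for host in hosts:
        try:
            # all=1 includes stopped containers as well as running ones
            results[host] = fetch(host, "/containers/json?all=1")
        except OSError:
            # URLError and socket timeouts are both OSError subclasses
            results[host] = None
    return results
```

Calling `poll_hosts(DOCKER_HOSTS)` from one monitoring server would then cover every host in the list, and an unreachable host shows up as `None` rather than stopping the whole poll.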
IBM Tivoli Monitoring agent
A screenshot of the agent’s default workspace can be seen above… it gives you a quick snapshot of the Docker environment that is monitored.
The agent is divided into the following sections:
- Container Performance
- Container Status
- Docker Events
- Docker Info
- Docker Logs
I will explore each in more detail in the following sections…
Container Performance
Uses the API to retrieve the following for each container running on the Docker host:
CPU Used (%), Memory Used (% and MB), Memory Limit, Network/Received (KB/s), Network/Sent (KB/s), Disk Read I/O, Disk Write I/O
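For anyone curious how percentages like these can be derived, the sketch below works from the JSON returned by the Engine API's `/containers/{id}/stats` endpoint. It follows the calculation described in the Docker API documentation; the function names are my own, and subtracting the `cache` figure assumes a cgroup v1 host, so treat it as an approximation rather than the agent's exact logic.

```python
def cpu_percent(stats):
    """CPU used (%) from one /containers/{id}/stats sample."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu.get("system_cpu_usage", 0) - pre.get("system_cpu_usage", 0)
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Older engines omit online_cpus; fall back to the per-CPU list length.
    cpus = cpu.get("online_cpus") or len(cpu["cpu_usage"].get("percpu_usage") or [1])
    return cpu_delta / system_delta * cpus * 100.0


def memory_usage(stats):
    """Return (memory used in MB, memory used as % of the limit)."""
    mem = stats["memory_stats"]
    # Assumption: cgroup v1, where page cache is reported under stats.cache.
    used = mem["usage"] - mem.get("stats", {}).get("cache", 0)
    return used / (1024 * 1024), used / mem["limit"] * 100.0
```

The CPU figure is a delta between the current and previous sample, which is why the stats payload carries both `cpu_stats` and `precpu_stats`.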
Container Status
Uses the API to retrieve the real-time container state, name, image name, IP address and status.
Here you will also find the exit codes for the containers. Alerts are raised when a container exits with a non-zero exit code. (N.B. this rule will need amending to match the exit codes your own containers use.)
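The exit-code rule can be sketched as follows, using the JSON returned by the Engine API's `/containers/{id}/json` (inspect) endpoint. The `summarise_container` function and its `ok_exit_codes` parameter are hypothetical names for illustration; the parameter is where you would list any non-zero codes that are normal for your containers.

```python
def summarise_container(inspect, ok_exit_codes=(0,)):
    """Flatten an inspect payload and flag unexpected exits."""
    state = inspect["State"]
    exited = state["Status"] == "exited"
    return {
        # The API reports names with a leading slash, e.g. "/web"
        "name": inspect["Name"].lstrip("/"),
        "image": inspect["Config"]["Image"],
        "ip_address": inspect["NetworkSettings"].get("IPAddress", ""),
        "state": state["Status"],
        "exit_code": state["ExitCode"],
        # Alert only when the container has exited with a code we do not accept
        "alert": exited and state["ExitCode"] not in ok_exit_codes,
    }
```

For example, a container that is stopped with SIGTERM often exits with code 143, so passing `ok_exit_codes=(0, 143)` would stop routine shutdowns from raising alerts.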
Docker Events
Using the API, internal Docker events are also retrieved. Here you can see the various actions on the Docker host and the messages from each action.
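The Engine API's `/events` endpoint streams one JSON object per event. As a rough sketch, assuming each event arrives as a single line of JSON, the function below (a hypothetical `parse_event` helper, not the agent's code) reduces an event to the fields a monitoring view would typically show.

```python
import json
from datetime import datetime, timezone


def parse_event(line):
    """Turn one JSON event from /events into a flat record."""
    event = json.loads(line)
    actor = event.get("Actor", {})
    return {
        # "time" is a Unix timestamp in seconds
        "time": datetime.fromtimestamp(event["time"], tz=timezone.utc).isoformat(),
        "type": event.get("Type", ""),      # e.g. container, image, network
        "action": event.get("Action", ""),  # e.g. start, die, destroy
        "id": actor.get("ID", "")[:12],     # short ID, as the docker CLI shows it
        "name": actor.get("Attributes", {}).get("name", ""),
    }
```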
Docker Info
Using the API, the agent retrieves information about the Docker host along with an overview of it. The attributes retrieved are:
Hostname, Total number of containers, Number of running containers, Number of stopped containers, Number of paused containers, Host Operating System, Host Architecture, Host Kernel Version, Host CPU count, Host Memory Allocated, Docker Version
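Most of these attributes line up with fields in the response from the Engine API's `/info` endpoint. The mapping below is my best reading of which field feeds which attribute (in particular, pairing "Host Memory Allocated" with `MemTotal` is an assumption), and `host_overview` is a hypothetical helper name.

```python
# Attribute label -> field name in the /info response (assumed mapping)
INFO_FIELDS = {
    "Hostname": "Name",
    "Total number of containers": "Containers",
    "Number of running containers": "ContainersRunning",
    "Number of stopped containers": "ContainersStopped",
    "Number of paused containers": "ContainersPaused",
    "Host Operating System": "OperatingSystem",
    "Host Architecture": "Architecture",
    "Host Kernel Version": "KernelVersion",
    "Host CPU count": "NCPU",
    "Host Memory Allocated": "MemTotal",  # reported in bytes
    "Docker Version": "ServerVersion",
}


def host_overview(info):
    """Pick the overview attributes out of a decoded /info payload."""
    return {label: info.get(field) for label, field in INFO_FIELDS.items()}
```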
Docker Logs
By default, Docker Engine captures all data sent to /dev/stdout and /dev/stderr and stores it in a file using its default json-file log driver, but many other log drivers are available, including syslog, awslogs and gelf.
Each container’s logs are retrieved and can be monitored.
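When logs are read through the Engine API's `/containers/{id}/logs` endpoint for a container started without a TTY, stdout and stderr come back multiplexed in one stream: each frame starts with an 8-byte header holding the stream type and a big-endian payload length. The sketch below splits such a stream back into its parts; `demux_logs` is a hypothetical helper, and it assumes the whole response body is already in memory.

```python
import struct

STREAMS = {0: "stdin", 1: "stdout", 2: "stderr"}


def demux_logs(raw):
    """Split a multiplexed Docker log stream into (stream, text) frames."""
    frames = []
    offset = 0
    while offset + 8 <= len(raw):
        # Header: 1 byte stream type, 3 padding bytes, 4-byte big-endian size
        stream_type, size = struct.unpack_from(">BxxxI", raw, offset)
        offset += 8
        payload = raw[offset:offset + size]
        offset += size
        text = payload.decode("utf-8", errors="replace")
        frames.append((STREAMS.get(stream_type, "unknown"), text))
    return frames
```

Keeping the stream name with each frame makes it possible to alert on stderr output separately from ordinary stdout messages.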
IBM Application Performance Management agent
The same ITM agent can also be used with the IBM Application Performance Management (APM) server, and the attributes retrieved are identical to those reported to ITM.
It’s also worth mentioning that the IBM APM Linux OS agent is now also capable of monitoring Docker. However, it needs to be installed on each Docker host rather than querying remotely.
Here are some screenshots of the custom agent reporting into the APM server:
CPU usage per container
Memory usage per container
Number of containers (Running / Total / Paused / Stopped)
Disk I/O usage per container
Docker host information and overview
Sample APM alert
To improve on the current level of monitoring, the agent needs to become cluster aware. Today’s production Docker environments typically use a cluster management tool such as:
Google’s Kubernetes, Apache’s Mesos, Docker Swarm
The attributes gathered from these tools would allow us to do service-based monitoring, i.e. monitoring the containers in a cluster that serve an application or resource as a whole rather than as individual containers. This could be achieved by using container tags or labels to aggregate the data by service.
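As a sketch of what label-based aggregation might look like, the function below rolls per-container metrics up by a label value. The input record shape and the `aggregate_by_label` name are assumptions for the example. The default label key is the one Docker Swarm sets on a service's containers; other orchestrators would need their own label keys.

```python
from collections import defaultdict


def aggregate_by_label(containers, label="com.docker.swarm.service.name"):
    """Sum per-container metrics for each value of the given label."""
    totals = defaultdict(
        lambda: {"containers": 0, "cpu_percent": 0.0, "memory_mb": 0.0}
    )
    for container in containers:
        # Containers without the label are grouped together
        service = container.get("labels", {}).get(label, "unlabelled")
        entry = totals[service]
        entry["containers"] += 1
        entry["cpu_percent"] += container["cpu_percent"]
        entry["memory_mb"] += container["memory_mb"]
    return dict(totals)
```

Alerts could then be defined per service, for example on the total memory used by every container behind one application, instead of on each container in isolation.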
Finally, we would also need to monitor the applications hosted by the containers. This is where custom or bespoke requirements will need to be fitted into a monitoring solution, as these can vary heavily and a one-size-fits-all approach would not be feasible.
If you would like to discuss monitoring options for your Docker estate, feel free to contact me via email at: email@example.com