This section contains tips and best practices for writing Zabbix templates.
Passing passwords as user macros may sound convenient, but avoid it as much as you can.
If you need to authenticate in order to gather metrics, prefer creating a user named zbx_monitor with read-only access.
Prefer user parameters, external checks, or modules combined with dependent items and preprocessing over Zabbix trapper where you can, because with Zabbix trapper you have less control over data collection.
Prefer Zabbix trapper over user parameters or external checks if any of the following statements is true:
Removing HTTP headers in web.page.get
As of Zabbix 4.4, web.page.get returns the HTTP headers and body together as the item value. To get valid JSON/XML data from the Zabbix agent key web.page.get, use Regular expression preprocessing to remove the HTTP headers:
Preprocessing step: Regular expression
Parameter (pattern): \n\s?\n(.*)
Output: \1
This returns JSON/XML that you can then parse with JSONPath or XML XPath preprocessing.
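To see what the pattern captures, here is a minimal Python sketch. It uses Python's `re` module rather than the regex engine inside Zabbix, and the sample response is a made-up value, so treat it as an illustration of the pattern rather than of Zabbix itself.

```python
import re

# Same pattern as the preprocessing step above: find the blank line that
# separates headers from the body, then capture the rest of the next line.
HEADER_SPLIT = re.compile(r"\n\s?\n(.*)")


def strip_headers(raw: str) -> str:
    """Return capture group 1 (the body), or raise if no blank line is found."""
    match = HEADER_SPLIT.search(raw)
    if match is None:
        raise ValueError("no header/body separator found")
    return match.group(1)


# Hypothetical item value: HTTP headers, a blank line, then a one-line JSON body.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"status": "ok"}'
)

print(strip_headers(sample))  # prints {"status": "ok"}
```

In this sketch `.` does not match a newline, so only the first line of the body is captured. If your endpoint returns a multi-line body, check the result in the preprocessing test dialog before relying on it.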
Always use value mappings for discrete states passed as integers.
Consider using “Boolean to decimal” preprocessing if an item check result can have only two states, such as YES/NO or TRUE/FALSE, to save DB space, and then apply a simple value mapping.
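The idea can be sketched in Python as follows. The word lists are a small assumed subset chosen for illustration, not the full set of strings Zabbix accepts, and `VALUE_MAP` is a hypothetical value mapping.

```python
# Assumed subset of accepted strings, for illustration only.
TRUE_WORDS = {"true", "yes", "on", "up"}
FALSE_WORDS = {"false", "no", "off", "down"}

# Hypothetical value mapping applied on top of the stored integer.
VALUE_MAP = {0: "No", 1: "Yes"}


def boolean_to_decimal(value: str) -> int:
    """Convert a textual two-state result to 1 or 0 (case-insensitive)."""
    word = value.strip().lower()
    if word in TRUE_WORDS:
        return 1
    if word in FALSE_WORDS:
        return 0
    raise ValueError(f"not a recognised boolean: {value!r}")


stored = boolean_to_decimal("YES")   # the integer that would be kept in history
print(stored, VALUE_MAP[stored])     # what the value mapping would display
```

Storing `1` or `0` as an integer instead of the original text is what saves space; the value mapping restores a readable label for display.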
Consider using “Discard unchanged with heartbeat” preprocessing for discrete states. It lets you poll much more often, and so react to state changes faster, without putting additional load on the Zabbix DB, because a repeated value is stored only once per heartbeat. Start with something like 10s/5m or 1m/30m (update interval/heartbeat). Note, though, that trigger functions such as count() or diff() may behave differently, because discarded values never reach history.
For health check triggers, consider using a simple trigger expression:
If your health check metric returns only integer values and not text statuses, you may also use:
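The effect of this preprocessing step can be simulated with a short Python sketch. It is a simplified model written for illustration (the function name and sample data are hypothetical), not Zabbix's actual implementation: a value is kept if it differs from the last kept value, or if the heartbeat has elapsed since the last kept value.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[int, str]  # (timestamp in seconds, value)


def discard_unchanged_with_heartbeat(
    samples: Iterable[Sample], heartbeat: int
) -> List[Sample]:
    """Keep a sample only if the value changed or the heartbeat has elapsed."""
    stored: List[Sample] = []
    for ts, value in samples:
        if stored:
            last_ts, last_value = stored[-1]
            if value == last_value and ts - last_ts < heartbeat:
                continue  # unchanged and still within the heartbeat: discard
        stored.append((ts, value))
    return stored


# Hypothetical service polled every 10s for 500s, "down" between t=100 and t=130.
polled = [(t, "down" if 100 <= t < 130 else "up") for t in range(0, 500, 10)]
kept = discard_unchanged_with_heartbeat(polled, heartbeat=300)

print(len(polled), len(kept))
print(kept)
```

In this run only 4 of the 50 polled values are kept: the first value, the two state changes, and one heartbeat value. This is also why count() over a time period sees far fewer values than were actually polled.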
for simplicity.
If your health check can return multiple different values, try to map them to the following triggers of different severity (simplified scale):
Level | Suggested Zabbix severity | Trigger name | Trigger dependencies | Sample expressions |
---|---|---|---|---|
Not OK | Information | Service X is not OK | depends on warning and critical level triggers | count(/TEMPLATE_NAME/METRIC,#1,{$SERVICE.STATUS.OK},ne)=1 |
Warning | Warning | Service X is in warning state | depends on critical level trigger | count(/TEMPLATE_NAME/METRIC,#1,{$SERVICE.STATUS.WARN},eq)=1 |
Critical | High or Average | Service X is in critical state | | count(/TEMPLATE_NAME/METRIC,#1,{$SERVICE.STATUS.CRIT},eq)=1 |
Use the 'Not OK' level if there are too many bad statuses or not all of them are known.
Note the 'ne' operator in the Not OK expression.
If there are multiple metric values that all indicate the critical level, put them together in a single expression:
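The way the three levels and their dependencies interact can be sketched in Python. The numeric status codes are hypothetical stand-ins for the {$SERVICE.STATUS.*} macros, and the dependency handling is a simplified model: a trigger is hidden whenever a trigger it depends on is also in the problem state.

```python
from typing import Dict, List

# Hypothetical status codes standing in for the {$SERVICE.STATUS.*} macros.
STATUS_OK, STATUS_WARN, STATUS_CRIT = 0, 1, 2

# Dependencies as listed in the table above.
DEPENDS_ON: Dict[str, List[str]] = {
    "not_ok": ["warning", "critical"],
    "warning": ["critical"],
    "critical": [],
}


def visible_problems(status: int) -> List[str]:
    """Return the triggers in problem state that no dependency suppresses."""
    fired = {
        "not_ok": status != STATUS_OK,      # the 'ne' comparison
        "warning": status == STATUS_WARN,   # 'eq' comparison
        "critical": status == STATUS_CRIT,  # 'eq' comparison
    }
    return [
        name
        for name, in_problem in fired.items()
        if in_problem and not any(fired[dep] for dep in DEPENDS_ON[name])
    ]


for status in (0, 1, 2, 7):  # 7 is an unknown, undocumented bad status
    print(status, visible_problems(status))
```

With these assumptions, a known warning or critical status shows only the matching trigger, while an unknown bad status such as 7 falls through to the 'Not OK' trigger. That catch-all behaviour is the reason the Not OK expression uses 'ne'.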
count(/TEMPLATE_NAME/METRIC,#1,{$SERVICE.STATUS.CRIT:"not_responding"},eq)=1 or
count(/TEMPLATE_NAME/METRIC,#1,{$SERVICE.STATUS.CRIT:"timeout"},eq)=1
Note that you may use macro context to label different statuses.
For noisy items, consider adding a recovery expression:
Consider using “Discard unchanged with heartbeat” preprocessing for inventory and other textual data that rarely changes. It lets you poll much more often, and so detect inventory changes faster, without putting additional load on the Zabbix DB, because a repeated value is stored only once per heartbeat. Start with something like 15m/1d (update interval/heartbeat). Note, though, that trigger functions such as count() or diff() may behave differently, because discarded values never reach history.
Always use this preprocessing step if a rarely changing inventory field is collected from a general master item that is polled frequently.