This section provides item value preprocessing details. Item value preprocessing allows defining and executing transformation rules for received item values.
Preprocessing is managed by the preprocessing manager process, along with preprocessing workers that perform the preprocessing steps. All values (with or without preprocessing) from different data gatherers pass through the preprocessing manager before being added to the history cache. Socket-based IPC communication is used between the data gatherers (pollers, trappers, etc.) and the preprocessing process. The preprocessing steps are performed by Zabbix server, or by Zabbix proxy for items monitored by that proxy.
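As a rough illustration of this architecture (a sketch only, not Zabbix source code; all names here are hypothetical), gatherers hand values to a single manager, which runs a worker step on each value before appending it to the history cache:

```python
# Illustrative sketch of the gatherer -> manager -> worker -> history cache
# flow; queue and function names are invented for this example.
from queue import Queue
from threading import Thread

incoming = Queue()   # gatherers -> preprocessing manager (IPC stand-in)
history_cache = []   # final destination of all values

def worker_step(value):
    # Stand-in for a preprocessing step, e.g. a numeric conversion.
    return float(value)

def manager(n_values):
    for _ in range(n_values):
        item_key, raw = incoming.get()
        # Every value passes through the manager; values that need
        # preprocessing are handed to a worker step first.
        history_cache.append((item_key, worker_step(raw)))

gatherer = Thread(target=lambda: incoming.put(("cpu.load", "0.75")))
gatherer.start()
gatherer.join()
manager(1)
```

The point of the sketch is the funnel: regardless of which gatherer produced a value, it reaches the history cache only via the manager.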
To visualize the data flow from data source to the Zabbix database, we can use the following simplified diagram:
The diagram above shows only processes, objects and actions related to item value processing in a simplified form. The diagram does not show conditional direction changes, error handling or loops. The local data cache of the preprocessing manager is not shown either because it doesn't affect the data flow directly. The aim of this diagram is to show processes involved in the item value processing and the way they interact.
An item can change its state to NOT SUPPORTED while preprocessing is performed if any of the preprocessing steps fails.
An item can change its state to NOT SUPPORTED if data normalization fails (for example, when a textual value cannot be converted to a number).
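A minimal sketch of the normalization failure case (illustrative Python, not Zabbix code; the state names and function are invented for this example):

```python
# Illustrative sketch: a textual value that cannot be normalized to the
# item's numeric value type flips the item to NOT SUPPORTED.
def normalize(raw, value_type):
    if value_type == "UINT":
        return int(raw)  # raises ValueError for non-numeric text
    return raw

try:
    normalize("not-a-number", "UINT")
    state = "NORMAL"
except ValueError:
    state = "NOT SUPPORTED"
```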
Data preprocessing is performed in the following steps:
Note that in the diagram, master item preprocessing is slightly simplified by omitting preprocessing caching.
The preprocessing queue is organized as:
Preprocessing caching was introduced to improve the preprocessing performance for multiple dependent items having similar preprocessing steps (which is a common LLD outcome).
Caching is done by preprocessing one dependent item and reusing some of the internal preprocessing data for the rest of the dependent items. The preprocessing cache is supported only for the first preprocessing step of the following types:
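The reuse that caching enables can be sketched as follows (illustrative Python, not Zabbix internals): the master value is parsed once and the parsed document is reused for every dependent item, instead of being re-parsed per item.

```python
# Illustrative sketch of preprocessing caching: parse the master value
# once, then reuse the parsed document for all dependent items.
import json

master_value = '{"cpu": 0.75, "mem": 512}'

def preprocess_dependents(raw, keys):
    doc = json.loads(raw)            # parsed once ("cached")
    return {k: doc[k] for k in keys}  # reused for each dependent item

results = preprocess_dependents(master_value, ["cpu", "mem"])
```

With many dependent items sharing a similar first step, skipping the repeated parse is where the performance gain comes from.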
The Zabbix server configuration file allows users to set the count of preprocessing worker threads. The StartPreprocessors configuration parameter sets the number of pre-started instances of preprocessing workers, which should at least match the number of available CPU cores. If preprocessing tasks are not CPU-bound and involve frequent network requests, configuring additional workers is recommended. The optimal number of preprocessing workers depends on many factors, including the count of "preprocessable" items (items that require at least one preprocessing step), the count of data gathering processes, the average number of preprocessing steps per item, etc.
However, assuming there are no heavy preprocessing operations, such as parsing large XML/JSON chunks, the number of preprocessing workers can match the total number of data gatherers. This way, there will mostly be at least one unoccupied preprocessing worker available for collected data (except when data from a gatherer arrives in bulk).
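As a rough starting point, the parameter is set in the server configuration file; the value below is illustrative, not a recommendation for every deployment:

```
# zabbix_server.conf
# Illustrative sizing for an 8-core host where some preprocessing
# steps perform network requests (adjust to your workload):
StartPreprocessors=16
```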
Too many data gathering processes (pollers, unreachable pollers, ODBC pollers, HTTP pollers, Java pollers, pingers, trappers, proxy pollers), together with the IPMI manager, SNMP trapper, and preprocessing workers, can exhaust the per-process file descriptor limit of the preprocessing manager.
Exhausting the per-process file descriptor limit will cause the Zabbix server to stop, typically shortly after startup, though sometimes it can take longer. To avoid such issues, review the Zabbix server configuration file to optimize the number of concurrent checks and processes, and, if necessary, ensure that the file descriptor limit is set sufficiently high by checking and adjusting system limits.
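On systemd-based installations, one way to raise the limit is a unit override; the path and value below are illustrative and should be adjusted to the deployment:

```
# /etc/systemd/system/zabbix-server.service.d/override.conf
# (illustrative; after editing, run:
#  systemctl daemon-reload && systemctl restart zabbix-server)
[Service]
LimitNOFILE=65535
```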
Item value processing is executed in multiple steps (or phases) by multiple processes. This can cause:
- A dependent item can receive a value while its master item cannot. For example, with the following configuration: the master item has value type UINT (a trapper item can be used), and the dependent item has value type TEXT. When a textual value arrives, the dependent item receives the value, while the master item changes its state to NOT SUPPORTED, because the value cannot be converted to a number.
- If value type CHAR is used for the master item, the master item value will be truncated at the history synchronization phase, while dependent items will receive their values from the initial (not truncated) value of the master item.
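The truncation behavior can be sketched as follows (illustrative Python, not Zabbix code; the length limit is a stand-in value chosen for this example):

```python
# Illustrative sketch: dependent items are preprocessed from the raw
# master value, while the master's own stored value is truncated only
# later, at the history synchronization phase.
CHAR_MAX_LEN = 255  # stand-in history storage limit for this sketch

def preprocess_dependents(raw_value, dependent_steps):
    # Dependent items see the full, untruncated master value.
    return [step(raw_value) for step in dependent_steps]

def history_sync(raw_value):
    # The master item's stored value is truncated here.
    return raw_value[:CHAR_MAX_LEN]

raw = "x" * 300
deps = preprocess_dependents(raw, [len])  # dependent sees all 300 chars
stored = history_sync(raw)                # master stores only 255
```

This ordering is why dependent items can legitimately hold more data than what ends up stored for the master item.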