Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/server/dell/dell_r720_http
This template monitors DELL PowerEdge R720 servers running iDRAC 8/9 firmware 4.32 (and later) with the Redfish API enabled. It collects data through Zabbix script items and works without any external scripts.
Zabbix version: 7.4 and higher.
This template has been tested on:
Zabbix should be configured according to the instructions in the Templates out of the box section.
1. Enable Redfish API in the Dell iDRAC interface of your server.
2. Create a user for monitoring with read-only permissions in the Dell iDRAC interface.
3. Create a host for Dell server with iDRAC IP as the Zabbix agent interface.
4. Link the template to the host.
5. Customize the values of the {$DELL.HTTP.API.URL}, {$DELL.HTTP.API.USER}, and {$DELL.HTTP.API.PASSWORD} macros.
NOTE! If you are experiencing timeouts on some of the items that are executing requests, adjust the {$DELL.HTTP.REQUEST.TIMEOUT} macro accordingly.
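
Before linking the template, it can help to confirm that the monitoring user is able to read the Redfish API. The following is a minimal Python sketch and not part of the template: it builds a Basic-auth request from the values you would put into the three macros. The resource path `/redfish/v1/Systems/System.Embedded.1`, the helper names, and the default timeout are assumptions made for illustration.

```python
import base64
import json
import ssl
import urllib.request

# Assumed path of the system resource on iDRAC; adjust if your firmware differs.
SYSTEM_PATH = "/redfish/v1/Systems/System.Embedded.1"


def build_request(api_url, user, password, path=SYSTEM_PATH):
    """Build a GET request with HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        api_url.rstrip("/") + path,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


def fetch_system(api_url, user, password, timeout=10):
    """Fetch the system resource and return it as a dict (makes a network call)."""
    # iDRAC often ships with a self-signed certificate; verification is
    # disabled here only to keep the sketch short.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = build_request(api_url, user, password)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return json.load(resp)
```

Calling `fetch_system()` with the same URL, username, and password that you set in the macros should return a JSON document if the Redfish API is enabled and the user has read access.
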

Name | Description | Default |
---|---|---|
{$DELL.HTTP.API.URL} | The Dell iDRAC Redfish API URL in the format | <Put your URL here> |
{$DELL.HTTP.API.USER} | The Dell iDRAC username. | <Put your username here> |
{$DELL.HTTP.API.PASSWORD} | The Dell iDRAC user password. | <Put your password here> |
{$DELL.HTTP.PROXY} | Set an HTTP proxy for Redfish API requests if needed. | |
{$DELL.HTTP.RETURN.CODE.OK} | Set the HTTP return code that represents an OK response from the API. The default is "200", but can vary, for example, if a proxy is used. | 200 |
{$DELL.HTTP.REQUEST.TIMEOUT} | Set the timeout for HTTP requests. | 10s |
{$DELL.HTTP.IFCONTROL} | Link status trigger will be fired only for interfaces that have the context macro equal to "1". | 1 |
{$DELL.HTTP.CPU.UTIL.HIGH} | Sets the percentage threshold for creating a "high" severity event about CPU utilization. | 90 |
{$DELL.HTTP.CPU.UTIL.WARN} | Sets the percentage threshold for creating a "warning" severity event about CPU utilization. | 75 |
{$DELL.HTTP.MEM.UTIL.HIGH} | Sets the percentage threshold for creating a "high" severity event about memory utilization. | 90 |
{$DELL.HTTP.MEM.UTIL.WARN} | Sets the percentage threshold for creating a "warning" severity event about memory utilization. | 75 |
{$DELL.HTTP.IO.UTIL.HIGH} | Sets the percentage threshold for creating a "high" severity event about IO utilization. | 90 |
{$DELL.HTTP.IO.UTIL.WARN} | Sets the percentage threshold for creating a "warning" severity event about IO utilization. | 75 |
{$DELL.HTTP.SYS.UTIL.HIGH} | Sets the percentage threshold for creating a "high" severity event about SYS utilization. | 90 |
{$DELL.HTTP.SYS.UTIL.WARN} | Sets the percentage threshold for creating a "warning" severity event about SYS utilization. | 75 |

Name | Description | Type | Key and additional info |
---|---|---|---|
Get system | Returns system metrics. | Script | dell.server.system.get |
Get sensors | Returns sensors. | Script | dell.server.sensors.get |
Get array controller resources | Returns array controller resources. | Script | dell.server.array.resources.get |
Get disks | Returns storage resources. | Script | dell.server.disks.get |
Get network interfaces | Returns network interfaces. | Script | dell.server.net.iface.get |
CPU utilization, in % | CPU utilization. | Dependent item | dell.server.util.cpu Preprocessing |
Memory utilization, in % | Memory utilization. | Dependent item | dell.server.util.mem Preprocessing |
IO utilization, in % | IO utilization. | Dependent item | dell.server.util.io Preprocessing |
SYS utilization, in % | SYS utilization. | Dependent item | dell.server.util.sys Preprocessing |
Overall system health status | This attribute defines the overall rollup status of all the components in the system monitored by the remote access card. Includes system, storage, IO devices, iDRAC, CPU, memory, etc. | Dependent item | dell.server.status Preprocessing |
Hardware model name | This attribute defines the model name of the system. | Dependent item | dell.server.hw.model Preprocessing |
Hardware serial number | This attribute defines the service tag of the system. | Dependent item | dell.server.hw.serialnumber Preprocessing |
Firmware version | This attribute defines the firmware version of a remote access card. | Dependent item | dell.server.hw.firmware Preprocessing |
Redfish API status | Availability of Redfish API on the server. Possible values: 0 - Unavailable, 1 - Available. | Simple check | net.tcp.service[https] |
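
The Script items above return data that the dependent items then process: each dependent item takes the value of a master item and extracts a single field from it through preprocessing. A minimal Python sketch of that pattern follows. The JSON layout, the field names, and the numeric health mapping (inferred from the trigger expressions, which compare the status with 2 and 3) are assumptions, not the template's actual preprocessing steps.

```python
import json

# Hypothetical master item value; the real script output may be shaped differently.
MASTER_VALUE = json.dumps({
    "model": "PowerEdge R720",
    "serialnumber": "ABC1234",
    "status": "Warning",
    "util": {"cpu": 12, "mem": 34},
})

# Assumed mapping: the triggers treat 2 as warning and 3 as critical;
# 1 for OK is a guess that keeps the scale consistent.
HEALTH_CODES = {"OK": 1, "Warning": 2, "Critical": 3}


def extract(master_value, *path):
    """Walk a parsed JSON document along the given keys, like a simple JSONPath."""
    node = json.loads(master_value)
    for key in path:
        node = node[key]
    return node


def health_code(master_value):
    """Turn the textual health status into the numeric code used by triggers."""
    return HEALTH_CODES[extract(master_value, "status")]
```

With this layout, one request to the API feeds several items, which keeps the number of calls to iDRAC low.
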

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: CPU utilization is too high | Current CPU utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.cpu,5m)>={$DELL.HTTP.CPU.UTIL.HIGH} | High | |
Dell R720: CPU utilization is high | Current CPU utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.cpu,5m)>={$DELL.HTTP.CPU.UTIL.WARN} | Warning | Depends on: |
Dell R720: Memory utilization is too high | Current memory utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.mem,5m)>={$DELL.HTTP.MEM.UTIL.HIGH} | High | |
Dell R720: Memory utilization is high | Current memory utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.mem,5m)>={$DELL.HTTP.MEM.UTIL.WARN} | Warning | Depends on: |
Dell R720: IO utilization is too high | Current IO utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.io,5m)>={$DELL.HTTP.IO.UTIL.HIGH} | High | |
Dell R720: IO utilization is high | Current IO utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.io,5m)>={$DELL.HTTP.IO.UTIL.WARN} | Warning | Depends on: |
Dell R720: SYS utilization is too high | Current SYS utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.sys,5m)>={$DELL.HTTP.SYS.UTIL.HIGH} | High | |
Dell R720: SYS utilization is high | Current SYS utilization has exceeded | min(/DELL PowerEdge R720 by HTTP/dell.server.util.sys,5m)>={$DELL.HTTP.SYS.UTIL.WARN} | Warning | Depends on: |
Dell R720: Server is in a critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.status,)=3 | Average | |
Dell R720: Server is in a warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.status,)=2 | Warning | Depends on: |
Dell R720: Device has been replaced | The device serial number has changed. Acknowledge to close the problem manually. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.serialnumber,#1)<>last(/DELL PowerEdge R720 by HTTP/dell.server.hw.serialnumber,#2) and length(last(/DELL PowerEdge R720 by HTTP/dell.server.hw.serialnumber))>0 | Info | Manual close: Yes |
Dell R720: Firmware has changed | The firmware version has changed. Acknowledge to close the problem manually. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.firmware,#1)<>last(/DELL PowerEdge R720 by HTTP/dell.server.hw.firmware,#2) and length(last(/DELL PowerEdge R720 by HTTP/dell.server.hw.firmware))>0 | Info | Manual close: Yes |
Dell R720: Redfish API service is unavailable | The service is unavailable or does not accept TCP connections. | last(/DELL PowerEdge R720 by HTTP/net.tcp.service[https])=0 | High | |
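
The utilization triggers use `min(...,5m)`, so a problem is raised only when every value collected in the last five minutes is at or above the threshold; a single spike does not fire. The "has been replaced" and "has changed" triggers compare the two most recent values and ignore an empty latest value. A small Python sketch of both checks, with hypothetical function names chosen for illustration:

```python
def sustained_above(values_in_window, threshold):
    """Mirror min(/host/key,5m)>=threshold: every sample must reach the threshold.

    An empty window never fires in this sketch.
    """
    return bool(values_in_window) and min(values_in_window) >= threshold


def has_changed(last_value, previous_value):
    """Mirror last(#1)<>last(#2) and length(last())>0 for text items."""
    return last_value != previous_value and len(last_value) > 0
```

For example, with the default {$DELL.HTTP.CPU.UTIL.HIGH} of 90, the samples 91, 95, 92 satisfy the condition, while 91, 60, 92 do not.
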

Name | Description | Type | Key and additional info |
---|---|---|---|
Temperature discovery | Discovery of temperature sensors. | Dependent item | dell.server.temp.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Probe [{#SENSOR_NAME}]: Get sensor | Returns the metrics of a sensor. | Dependent item | dell.server.sensor.temp.get[{#SENSOR_NAME}] Preprocessing |
Probe [{#SENSOR_NAME}]: Value | Sensor value. | Dependent item | dell.server.sensor.temp.value[{#SENSOR_NAME}] Preprocessing |
Probe [{#SENSOR_NAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.sensor.temp.status[{#SENSOR_NAME}] Preprocessing |
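
Each discovery rule yields a list of objects whose keys are LLD macros such as `{#SENSOR_NAME}`, and Zabbix creates one set of items per object by substituting those macros into the prototypes. A rough Python sketch of that substitution follows; the sensor names are invented sample data, and `expand_prototype` is a hypothetical helper, not a Zabbix API.

```python
# Invented discovery result in the usual LLD shape: a list of macro-to-value maps.
DISCOVERED = [
    {"{#SENSOR_NAME}": "System Board Inlet Temp"},
    {"{#SENSOR_NAME}": "CPU1 Temp"},
]

# Item prototype keys taken from the item prototype table.
PROTOTYPES = [
    "dell.server.sensor.temp.value[{#SENSOR_NAME}]",
    "dell.server.sensor.temp.status[{#SENSOR_NAME}]",
]


def expand_prototype(prototype, entity):
    """Replace every LLD macro in a prototype key with the discovered value."""
    for macro, value in entity.items():
        prototype = prototype.replace(macro, value)
    return prototype


def expand_all(prototypes, entities):
    """Return the concrete item keys created for all discovered entities."""
    return [expand_prototype(p, e) for e in entities for p in prototypes]
```

The same mechanism applies to the other discovery rules in this template, which use macros such as `{#DISK_NAME}` and `{#IFNAME}`.
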

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Probe [{#SENSOR_NAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.temp.status[{#SENSOR_NAME}],)=3 | Average | |
Dell R720: Probe [{#SENSOR_NAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.temp.status[{#SENSOR_NAME}],)=2 | Warning | Depends on: |

Name | Description | Type | Key and additional info |
---|---|---|---|
PSU discovery | Discovery of PSU sensors. | Dependent item | dell.server.psu.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Power supply [{#SENSOR_NAME}]: Get sensor | Returns the metrics of a sensor. | Dependent item | dell.server.sensor.psu.get[{#SENSOR_NAME}] Preprocessing |
Power supply [{#SENSOR_NAME}]: Voltage | Sensor value. | Dependent item | dell.server.sensor.psu.voltage[{#SENSOR_NAME}] Preprocessing |
Power supply [{#SENSOR_NAME}]: Voltage sensor status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.sensor.psu.voltage.status[{#SENSOR_NAME}] Preprocessing |
Power supply [{#SENSOR_NAME}]: Current | Sensor value. | Dependent item | dell.server.sensor.psu.current[{#SENSOR_NAME}] Preprocessing |
Power supply [{#SENSOR_NAME}]: Current sensor status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.sensor.psu.current.status[{#SENSOR_NAME}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Power supply [{#SENSOR_NAME}]: Voltage sensor: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.psu.voltage.status[{#SENSOR_NAME}],)=3 | Average | |
Dell R720: Power supply [{#SENSOR_NAME}]: Voltage sensor: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.psu.voltage.status[{#SENSOR_NAME}],)=2 | Warning | Depends on: |
Dell R720: Power supply [{#SENSOR_NAME}]: Current sensor: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.psu.current.status[{#SENSOR_NAME}],)=3 | Average | |
Dell R720: Power supply [{#SENSOR_NAME}]: Current sensor: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.psu.current.status[{#SENSOR_NAME}],)=2 | Warning | Depends on: |

Name | Description | Type | Key and additional info |
---|---|---|---|
FAN discovery | Discovery of FAN sensors. | Dependent item | dell.server.fan.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Fan [{#SENSOR_NAME}]: Get sensor | Returns the metrics of a sensor. | Dependent item | dell.server.sensor.fan.get[{#SENSOR_NAME}] Preprocessing |
Fan [{#SENSOR_NAME}]: Speed | Sensor value. | Dependent item | dell.server.sensor.fan.speed[{#SENSOR_NAME}] Preprocessing |
Fan [{#SENSOR_NAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.sensor.fan.status[{#SENSOR_NAME}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Fan [{#SENSOR_NAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.fan.status[{#SENSOR_NAME}],)=3 | Average | |
Dell R720: Fan [{#SENSOR_NAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.sensor.fan.status[{#SENSOR_NAME}],)=2 | Warning | Depends on: |

Name | Description | Type | Key and additional info |
---|---|---|---|
Array controller discovery | Discovery of disk array controllers. | Dependent item | dell.server.array.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Controller [{#CNTLR_NAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.array.status[{#ID}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Controller [{#CNTLR_NAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.array.status[{#ID}],)=3 | Average | |
Dell R720: Controller [{#CNTLR_NAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.array.status[{#ID}],)=2 | Warning | Depends on: |

Name | Description | Type | Key and additional info |
---|---|---|---|
Battery discovery | Discovery of battery controllers. | Dependent item | dell.server.controller.battery.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Battery [{#BATTERY_NAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.controller.battery.status[{#ID}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Battery [{#BATTERY_NAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.controller.battery.status[{#ID}],)=3 | Average | |
Dell R720: Battery [{#BATTERY_NAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.controller.battery.status[{#ID}],)=2 | Warning | Depends on: |

Name | Description | Type | Key and additional info |
---|---|---|---|
Physical disk discovery | Discovery of physical disks. | Dependent item | dell.server.physicaldisk.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Physical disk [{#DISK_NAME}]: Get disk | Returns the metrics of a physical disk. | Script | dell.server.hw.physicaldisk.get[{#DISK_NAME}] |
Physical disk [{#DISK_NAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.hw.physicaldisk.status[{#DISK_NAME}] Preprocessing |
Physical disk [{#DISK_NAME}]: Serial number | The serial number of this drive. | Dependent item | dell.server.hw.physicaldisk.serialnumber[{#DISK_NAME}] Preprocessing |
Physical disk [{#DISK_NAME}]: Model name | The model number of the drive. | Dependent item | dell.server.hw.physicaldisk.model[{#DISK_NAME}] Preprocessing |
Physical disk [{#DISK_NAME}]: Media type | The type of media contained in this drive. Possible values: HDD, SSD, SMR, null. | Dependent item | dell.server.hw.physicaldisk.media_type[{#DISK_NAME}] Preprocessing |
Physical disk [{#DISK_NAME}]: Size | The size, in bytes, of this drive. | Dependent item | dell.server.hw.physicaldisk.size[{#DISK_NAME}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Physical disk [{#DISK_NAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.physicaldisk.status[{#DISK_NAME}],)=3 | Average | |
Dell R720: Physical disk [{#DISK_NAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.physicaldisk.status[{#DISK_NAME}],)=2 | Warning | Depends on: |
Dell R720: Physical disk [{#DISK_NAME}]: Has been replaced | [{#DISK_NAME}] serial number has changed. Acknowledge to close the problem manually. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.physicaldisk.serialnumber[{#DISK_NAME}],#1)<>last(/DELL PowerEdge R720 by HTTP/dell.server.hw.physicaldisk.serialnumber[{#DISK_NAME}],#2) and length(last(/DELL PowerEdge R720 by HTTP/dell.server.hw.physicaldisk.serialnumber[{#DISK_NAME}]))>0 | Info | Manual close: Yes |

Name | Description | Type | Key and additional info |
---|---|---|---|
Virtual disk discovery | Discovery of virtual disks. | Dependent item | dell.server.virtualdisk.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Virtual disk [{#DISK_NAME}]: Get disk | Returns the metrics of a virtual disk. | Script | dell.server.hw.virtualdisk.get[{#DISK_NAME}] |
Virtual disk [{#DISK_NAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.hw.virtualdisk.status[{#DISK_NAME}] Preprocessing |
Virtual disk [{#DISK_NAME}]: RAID status | This property represents the RAID specific status. Possible values: Blocked, Degraded, Failed, Foreign, Offline, Online, Ready, Unknown, null. | Dependent item | dell.server.hw.virtualdisk.raid_status[{#DISK_NAME}] Preprocessing |
Virtual disk [{#DISK_NAME}]: Size | The size in bytes of this Volume. | Dependent item | dell.server.hw.virtualdisk.size[{#DISK_NAME}] Preprocessing |
Virtual disk [{#DISK_NAME}]: Current state | The known state of the Resource, for example, Enabled. Possible values: Enabled, Disabled, StandbyOffline, StandbySpare, InTest, Starting, Absent, UnavailableOffline, Deferring, Quiesced, Updating, Qualified. | Dependent item | dell.server.hw.virtualdisk.state[{#DISK_NAME}] Preprocessing |
Virtual disk [{#DISK_NAME}]: Read policy | Indicates the read cache policy setting for the Volume. Possible values: ReadAhead, NoReadAhead, AdaptiveReadAhead. | Dependent item | dell.server.hw.virtualdisk.read_policy[{#DISK_NAME}] Preprocessing |
Virtual disk [{#DISK_NAME}]: Write policy | Indicates the write cache policy setting for the Volume. Possible values: WriteThrough, WriteBack, ProtectedWriteBack, UnprotectedWriteBack. | Dependent item | dell.server.hw.virtualdisk.write_policy[{#DISK_NAME}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Virtual disk [{#DISK_NAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.virtualdisk.status[{#DISK_NAME}],)=3 | Average | |
Dell R720: Virtual disk [{#DISK_NAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.virtualdisk.status[{#DISK_NAME}],)=2 | Warning | Depends on: |
Dell R720: Virtual disk [{#DISK_NAME}]: RAID status not OK | Please check the disk for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.hw.virtualdisk.raid_status[{#DISK_NAME}],)<8 | Average | |

Name | Description | Type | Key and additional info |
---|---|---|---|
Network interface discovery | Discovery of network interfaces. | Dependent item | dell.server.net.if.discovery Preprocessing |

Name | Description | Type | Key and additional info |
---|---|---|---|
Interface [{#IFNAME}]: Get interface | Returns the metrics of a network interface. | Script | dell.server.net.if.get[{#IFNAME}] |
Interface [{#IFNAME}]: Speed | The network port current link speed. | Dependent item | dell.server.net.if.speed[{#IFNAME}] Preprocessing |
Interface [{#IFNAME}]: Link status | The status of the link between this port and its link partner. Possible values: Down, Up, null. | Dependent item | dell.server.net.if.status[{#IFNAME}] Preprocessing |
Interface [{#IFNAME}]: State | The known state of the Resource, for example, Enabled. Possible values: Enabled, Disabled, StandbyOffline, StandbySpare, InTest, Starting, Absent, UnavailableOffline, Deferring, Quiesced, Updating, Qualified. | Dependent item | dell.server.net.if.state[{#IFNAME}] Preprocessing |
Interface [{#IFNAME}]: Status | The status of the job. Possible values: OK, Warning, Critical. | Dependent item | dell.server.net.if.health[{#IFNAME}] Preprocessing |

Name | Description | Expression | Severity | Dependencies and additional info |
---|---|---|---|---|
Dell R720: Interface [{#IFNAME}]: Link down | This trigger expression works as follows: | {$DELL.HTTP.IFCONTROL:"{#IFNAME}"}=1 and (last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.status[{#IFNAME}],)=2 and last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.status[{#IFNAME}],#1)<>last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.status[{#IFNAME}],#2)) | Average | Manual close: Yes |
Dell R720: Interface [{#IFNAME}]: Link status issue | This trigger expression works as follows: | {$DELL.HTTP.IFCONTROL:"{#IFNAME}"}=1 and (last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.status[{#IFNAME}],)<2 and last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.status[{#IFNAME}],#1)<>last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.status[{#IFNAME}],#2)) | Average | Manual close: Yes |
Dell R720: Interface [{#IFNAME}]: Critical state | Please check the device for faults. | last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.health[{#IFNAME}],)=3 | Average | |
Dell R720: Interface [{#IFNAME}]: Warning state | Please check the device for warnings. | last(/DELL PowerEdge R720 by HTTP/dell.server.net.if.health[{#IFNAME}],)=2 | Warning | Depends on: |
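
The two link triggers combine three conditions: the context macro `{$DELL.HTTP.IFCONTROL:"{#IFNAME}"}` must equal 1, the latest link status must match (equal to 2 for "Link down", below 2 for "Link status issue"), and the latest value must differ from the previous one, so the problem is raised on a state change rather than on every poll. The sketch below restates that logic in Python. The function names are hypothetical, and reading the value 2 as "down" is inferred from the trigger name rather than stated in the source.

```python
def link_down(ifcontrol, last, previous):
    """IFCONTROL=1 and last()=2 and last(#1)<>last(#2)."""
    return ifcontrol == 1 and last == 2 and last != previous


def link_status_issue(ifcontrol, last, previous):
    """IFCONTROL=1 and last()<2 and last(#1)<>last(#2)."""
    return ifcontrol == 1 and last < 2 and last != previous
```

Setting the context macro to 0 for a single interface, for example `{$DELL.HTTP.IFCONTROL:"NIC.Integrated.1-1-1"}` (an invented interface name), makes both checks return False for that interface while leaving the others monitored.
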
Please report any issues with the template at https://support.zabbix.com
You can also provide feedback, discuss the template, or ask for help at ZABBIX forums