Monitors and Metrics

Hardware Sentry collects the health metrics of all the hardware components that compose your servers, network switches, or storage systems and exposes them as monitors in your monitoring platform(s). Information specific to each monitor is provided as attributes to help you distinguish monitor instances. For example, device_id, vendor, serial_number, and model are some of the attributes available for physical disks.

The tables below provide detailed information about the metrics scraped by Hardware Sentry for each monitor. The attributes agent.host.name, host.id, host.name, host.type, and os.type are not listed because they apply to all monitors.

Agent

Metric Description Type Unit Attributes
hardware_sentry.agent.info Hardware Sentry Agent information. Gauge build_date, build_number, hc_version, name, otel_version, version

Battery

Metric Description Type Unit Attributes
hw.battery.charge Battery charge ratio. Gauge chemistry, device_id, id, info, model, name, parent, type, vendor
hw.battery.time_left Number of seconds left before recharging the battery when state is discharging. Gauge s chemistry, device_id, id, info, model, name, parent, state, type, vendor
hw.status Operational status of the monitored battery. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter chemistry, device_id, hw.type, id, info, model, name, parent, state, type, vendor
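
The hw.status encoding described above (one data point per state, each valued 1 or 0) can be decoded with a few lines of code. The sketch below is illustrative only: the dictionary layout of a data point and the function name active_states are assumptions made for this example, not part of Hardware Sentry.

```python
from typing import Iterable, List, Mapping


def active_states(points: Iterable[Mapping]) -> List[str]:
    """Return the states whose hw.status data point has the value 1."""
    # Each point is assumed (hypothetically) to look like:
    # {"attributes": {"state": "ok", ...}, "value": 1}
    return sorted(p["attributes"]["state"] for p in points if p["value"] == 1)


# Hypothetical data points for one battery, one per possible state.
battery_status = [
    {"attributes": {"hw.type": "battery", "state": "degraded"}, "value": 0},
    {"attributes": {"hw.type": "battery", "state": "failed"}, "value": 0},
    {"attributes": {"hw.type": "battery", "state": "ok"}, "value": 1},
    {"attributes": {"hw.type": "battery", "state": "present"}, "value": 1},
]

print(active_states(battery_status))
```

The same approach applies to every monitor that reports hw.status with a state attribute.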

Blade

Metric Description Type Unit Attributes
hw.blade.power_state Blade power state. Each of the possible states (off, on and suspended) will either take the value 1 (true) or 0 (false). UpDownCounter blade_name, device_id, id, info, model, name, parent, serial_number, state
hw.status Operational status of the monitored blade. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter blade_name, device_id, hw.type, id, info, model, name, parent, serial_number, state

Connector

Metric Description Type Unit Attributes
hardware_sentry.connector.status Connector operational status. Each of the possible states (degraded, failed and ok) will either take the value 1 (true) or 0 (false). UpDownCounter applies_to_os, connector_id, description, id, name, parent, state

CPU

Metric Description Type Unit Attributes
hw.errors Number of detected and corrected errors. Counter {errors} device_id, hw.type, id, info, model, name, parent, vendor
hw.errors.limit Number of detected and corrected errors that will generate a warning or an alarm when limit_type is degraded or critical. Gauge {errors} device_id, hw.type, id, info, limit_type, model, name, parent, vendor
hw.cpu.speed CPU current speed. Gauge Hz device_id, id, info, model, name, parent, vendor
hw.cpu.speed.limit CPU maximum speed. Gauge Hz device_id, id, info, limit_type, model, name, parent, vendor
hw.energy Energy consumed by the monitored CPU since the start of the Hardware Sentry Agent. Counter J device_id, hw.type, id, info, model, name, parent, vendor
hw.power Power consumed by the monitored CPU. Gauge W device_id, hw.type, id, info, model, name, parent, vendor
hw.status Operational status of the monitored CPU. Each of the possible states (degraded, failed, ok, predicted_failure and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, model, name, parent, state, vendor
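
Because hw.energy is a counter in joules (J) and hw.power is a gauge in watts (W), an average power over an interval can be derived from two hw.energy samples. The following is a minimal sketch of that arithmetic; the function name and the choice to return None on a counter reset are assumptions made for this example.

```python
from typing import Optional


def average_power_watts(
    energy_start_j: float,
    time_start_s: float,
    energy_end_j: float,
    time_end_s: float,
) -> Optional[float]:
    """Average power (W) between two samples of an energy counter (J)."""
    elapsed = time_end_s - time_start_s
    if elapsed <= 0:
        raise ValueError("the second sample must be later than the first")
    if energy_end_j < energy_start_j:
        # The counter restarts with the agent, so a decrease is treated
        # here as a reset and no value is produced (an assumption).
        return None
    return (energy_end_j - energy_start_j) / elapsed


# 600 J consumed over 60 s corresponds to an average of 10 W.
print(average_power_watts(1000.0, 0.0, 1600.0, 60.0))
```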

CPU Core

Metric Description Type Unit Attributes
hw.cpu_core.speed Current speed of the CPU core. Gauge Hz device_id, id, info, name, parent
hw.cpu_core.utilization Ratio of the CPU core usage. Gauge device_id, id, info, name, parent
hw.status Operational status of the monitored CPU Core. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, state

Disk Controller

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored disk controller since the start of the Hardware Sentry Agent. Counter J bios_version, device_id, driver_version, firmware_version, hw.type, id, info, model, name, parent, serial_number, vendor
hw.power Power consumed by the monitored disk controller. Gauge W bios_version, device_id, driver_version, firmware_version, hw.type, id, info, model, name, parent, serial_number, vendor
hw.status Operational status of the monitored disk controller. Each of the possible states (degraded, failed, ok and present) and battery states (ok, degraded and failed) will either take the value 1 (true) or 0 (false). UpDownCounter battery_state, bios_version, device_id, driver_version, firmware_version, hw.type, id, info, model, name, parent, serial_number, state, vendor

Enclosure

Metric Description Type Unit Attributes
hw.enclosure.energy Energy consumed by the enclosure since the start of the Hardware Sentry Agent. Counter J bios_version, device_id, id, info, ip_address, model, name, parent, serial_number, type, vendor
hw.enclosure.power Power consumed by the enclosure. Gauge W bios_version, device_id, id, info, ip_address, model, name, parent, serial_number, type, vendor
hw.status Operational status of the monitored enclosure. Each of the possible states (degraded, failed, ok, open and present) will either take the value 1 (true) or 0 (false). UpDownCounter bios_version, device_id, hw.type, id, info, ip_address, model, name, parent, serial_number, state, type, vendor

Fan

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored fan since the start of the Hardware Sentry Agent. Counter J device_id, hw.type, id, info, name, parent, sensor_location
hw.fan.speed Fan speed. Gauge rpm device_id, id, info, name, parent, sensor_location
hw.fan.speed.limit Speed of the corresponding fan (in revolutions/minute) that will generate a warning or an alarm when limit_type is low.degraded or low.critical. Gauge rpm device_id, id, info, limit_type, name, parent, sensor_location
hw.fan.speed_ratio Fan speed ratio. Gauge device_id, id, info, name, parent, sensor_location
hw.fan.speed_ratio.limit Fan speed ratio that will generate a warning or an alarm when limit_type is low.degraded or low.critical. Gauge device_id, id, info, limit_type, name, parent, sensor_location
hw.power Power consumed by the monitored fan. Gauge W device_id, hw.type, id, info, name, parent, sensor_location
hw.status Operational status of the monitored fan. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, sensor_location, state
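
To illustrate how hw.fan.speed relates to hw.fan.speed.limit, the sketch below compares a speed reading with the low.degraded and low.critical limits. It is an illustration only: the function name, the returned labels, and the use of "less than or equal to" as the comparison are assumptions, and the actual alerting logic depends on your monitoring platform.

```python
from typing import Mapping


def classify_fan_speed(speed_rpm: float, limits_rpm: Mapping[str, float]) -> str:
    """Compare a fan speed (rpm) with limits keyed by limit_type."""
    critical = limits_rpm.get("low.critical")
    degraded = limits_rpm.get("low.degraded")
    # The most severe limit is checked first.
    if critical is not None and speed_rpm <= critical:
        return "alarm"
    if degraded is not None and speed_rpm <= degraded:
        return "warning"
    return "ok"


# Hypothetical limits for one fan.
limits = {"low.degraded": 1200.0, "low.critical": 600.0}
print(classify_fan_speed(1000.0, limits))
```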

GPU

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored GPU since the start of the Hardware Sentry Agent. Counter J device_id, driver_version, firmware_version, hw.type, id, info, model, name, parent, serial_number, vendor
hw.errors Number of errors encountered by the GPU since the start of the Hardware Sentry Agent. Possible error types: all and corrected. Counter {errors} device_id, hw.type, driver_version, firmware_version, id, info, model, name, parent, serial_number, type, vendor
hw.gpu.io Number of bytes transmitted or received through the GPU, depending on whether direction is transmit or receive. Counter By device_id, direction, driver_version, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.gpu.io.receive (Deprecated) Number of bytes received through the GPU. Counter By device_id, driver_version, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.gpu.io.transmit (Deprecated) Number of bytes transmitted through the GPU. Counter By device_id, driver_version, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.gpu.memory.limit GPU memory size. Gauge By device_id, driver_version, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.gpu.memory.utilization GPU memory utilization ratio. Gauge device_id, driver_version, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.gpu.memory.utilization.limit GPU memory utilization ratio that will generate a warning or an alarm when limit_type is degraded or critical. Gauge device_id, driver_version, firmware_version, id, info, limit_type, model, name, parent, serial_number, vendor
hw.gpu.utilization Ratio of time spent by the GPU for each task (decoder, encoder and general). Gauge device_id, driver_version, firmware_version, id, info, model, name, parent, serial_number, task, vendor
hw.gpu.utilization.limit GPU used time ratio that will generate a warning or an alarm when limit_type is degraded or critical. Gauge device_id, driver_version, firmware_version, id, info, limit_type, model, name, parent, serial_number, vendor
hw.power Power consumed by the monitored GPU. Gauge W device_id, driver_version, firmware_version, hw.type, id, info, model, name, parent, serial_number, vendor
hw.status Operational status of the monitored GPU. Each of the possible states (degraded, failed, ok, predicted_failure and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, driver_version, firmware_version, hw.type, id, info, model, name, parent, serial_number, state, vendor

Host

Metric Description Type Unit Attributes
hardware_sentry.host.configured Whether the host is configured or not. UpDownCounter id, location, name, parent
hardware_sentry.host.up Whether the configured protocol (http, ipmi, snmp, ssh, wbem, winrm and wmi) is up (1) or not (0). UpDownCounter id, location, name, parent, protocol
hw.host.ambient_temperature Host's current ambient temperature in degrees Celsius (°C). This metric is only reported if the value is between 5°C and 35°C. Gauge Cel id, location, name, parent
hw.host.configured Whether the host is configured or not. UpDownCounter id, location, name, parent
hw.host.energy Energy consumed by the components discovered for the monitored host since the start of the Hardware Sentry agent. Energy is either measured or estimated. Counter J id, location, name, parent, quality
hw.host.heating_margin Number of degrees Celsius (°C) remaining before the temperature reaches the closest warning threshold. Gauge Cel id, location, name, parent
hw.host.power Power consumed by all the components discovered for the monitored host. Power is either measured or estimated. Gauge W id, location, name, parent, quality

LED

Metric Description Type Unit Attributes
hw.status Operational status of the monitored LED. Each of the possible states (blinking, degraded, failed, off, ok, on and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, state

Logical Disk

Metric Description Type Unit Attributes
hw.errors Number of errors encountered by the logical disk since the start of the Hardware Sentry Agent. Counter {errors} device_id, hw.type, id, info, name, parent, raid_level
hw.errors.limit Number of errors encountered that will generate a warning or an alarm when limit_type is degraded or critical. Gauge {errors} device_id, hw.type, id, info, limit_type, name, parent, raid_level
hw.logical_disk.limit Logical disk size. Gauge By device_id, id, info, name, parent, raid_level
hw.logical_disk.usage Amount of used or unused (free) space in the logical disk. UpDownCounter By device_id, id, info, name, parent, raid_level, state
hw.logical_disk.utilization Ratio of used or unused (free) space in the logical disk. Gauge device_id, id, info, name, parent, raid_level, state
hw.status Operational status of the monitored logical disk. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, raid_level, state
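
The relationship between hw.logical_disk.usage (bytes per state) and hw.logical_disk.utilization (ratio per state) can be sketched as follows. The state values "used" and "free" are inferred from the descriptions above, and the function name is hypothetical.

```python
from typing import Dict, Mapping


def utilization_by_state(usage_bytes: Mapping[str, float]) -> Dict[str, float]:
    """Turn per-state usage in bytes into per-state ratios of the total."""
    total = sum(usage_bytes.values())
    if total <= 0:
        # Without a positive total, no ratio can be computed.
        return {}
    return {state: value / total for state, value in usage_bytes.items()}


# Hypothetical usage of a 1000-byte logical disk.
print(utilization_by_state({"used": 750, "free": 250}))
```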

LUN

Metric Description Type Unit Attributes
hw.lun.paths Number of expected or available paths. Gauge {paths} array_name, device_id, id, info, local_device_name, name, parent, remote_device_name, type, wwn
hw.lun.paths.limit Number of available paths that will generate a warning when limit_type is low.degraded. Gauge {paths} array_name, device_id, id, info, limit_type, local_device_name, name, parent, remote_device_name, wwn
hw.status Operational status of the monitored LUN. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter array_name, device_id, hw.type, id, info, local_device_name, name, parent, remote_device_name, state, wwn

Memory Module

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored memory module since the start of the Hardware Sentry Agent. Counter J device_id, hw.type, id, info, model, name, parent, serial_number, type, vendor
hw.errors Number of errors encountered by the memory since the start of the Hardware Sentry Agent. Counter {errors} device_id, hw.type, id, info, model, name, parent, serial_number, type, vendor
hw.errors.limit Number of errors encountered that will generate a warning or an alarm when limit_type is degraded or critical. Gauge {errors} device_id, hw.type, id, info, limit_type, model, name, parent, serial_number, type, vendor
hw.memory.limit Memory size. Gauge By device_id, id, info, model, name, parent, serial_number, type, vendor
hw.power Power consumed by the monitored memory module. Gauge W device_id, hw.type, id, info, model, name, parent, serial_number, type, vendor
hw.status Operational status of the monitored memory module. Each of the possible states (degraded, failed, ok, predicted_failure and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, model, name, parent, serial_number, state, type, vendor

Network Card

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored network card since the start of the Hardware Sentry Agent. Counter J bandwidth, device_id, hw.type, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.errors Number of errors encountered by the network interface since the start of the Hardware Sentry Agent. Possible error types: all and zero_buffer_credit. Counter {errors} bandwidth, device_id, hw.type, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, type, vendor
hw.network.bandwidth.limit Speed that the network adapter and its remote counterpart currently use to communicate with each other. Gauge By bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.bandwidth.utilization Ratio of the available bandwidth utilization. Gauge bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.error_ratio Ratio of sent and received packets that were in error. Gauge bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.error_ratio.limit Network interface error ratio that will generate a warning or an alarm when limit_type is degraded or critical. Gauge bandwidth, device_id, id, info, limit_type, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.full_duplex Whether the port is configured to operate in full-duplex mode. UpDownCounter bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.io Total number of bytes transmitted or received through the network interface, depending on whether direction is transmit or receive. Counter By bandwidth, device_id, direction, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.io.receive (Deprecated) Total number of bytes received through the network interface. Counter By bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.io.transmit (Deprecated) Total number of bytes transmitted through the network interface. Counter By bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.packets Total number of packets transmitted or received through the network interface, depending on whether direction is transmit or receive. Counter {packets} bandwidth, device_id, direction, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.packets.receive (Deprecated) Total number of packets received through the network interface. Counter {packets} bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.packets.transmit (Deprecated) Total number of packets transmitted through the network interface. Counter {packets} bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.network.up Whether the network interface is plugged into the network or not. UpDownCounter bandwidth, device_id, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.power Power consumed by the monitored network card. Gauge W bandwidth, device_id, hw.type, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, vendor
hw.status Operational status of the monitored network card. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter bandwidth, device_id, hw.type, id, info, logical_address, model, name, parent, physical_address, remote_physical_address, serial_number, state, vendor
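
As an illustration of how a ratio such as hw.network.error_ratio relates to the counters in this table, the sketch below divides the increase of hw.errors over an interval by the increase of hw.network.packets in both directions. This is an assumption about the arithmetic made for this example, not a statement of how the agent computes the metric; the function name is hypothetical.

```python
from typing import Mapping, Optional


def error_ratio(
    errors_delta: float, packets_delta_by_direction: Mapping[str, float]
) -> Optional[float]:
    """Ratio of packets in error over one interval, or None without traffic."""
    total_packets = sum(packets_delta_by_direction.values())
    if total_packets <= 0:
        # No packets were exchanged, so the ratio is undefined.
        return None
    return errors_delta / total_packets


# 5 errors for 1000 packets exchanged during the interval.
print(error_ratio(5, {"transmit": 600, "receive": 400}))
```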

Other Device

Metric Description Type Unit Attributes
hw.other_device.uses Number of times the device has been used. Counter {uses} device_id, device_type, id, info, name, parent
hw.other_device.uses.limit Number of times the device has been used which will generate a warning or an alarm when limit_type is degraded or critical. Gauge {uses} device_id, device_type, id, info, limit_type, name, parent
hw.other_device.value Currently reported value of the device. Gauge device_id, device_type, id, info, name, parent
hw.other_device.value.limit Device reported value that will generate a warning or an alarm when limit_type is degraded or critical. Gauge device_id, device_type, id, info, limit_type, name, parent
hw.status Operational status of the monitored device. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, device_type, hw.type, id, info, name, parent, state

Physical Disk

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored physical disk since the start of the Hardware Sentry Agent. Counter J device_id, firmware_version, hw.type, id, info, model, name, parent, serial_number, vendor
hw.errors Number of errors encountered by the physical disk since the start of the Hardware Sentry Agent. Counter {errors} device_id, hw.type, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.errors.limit Number of errors encountered that will generate a warning or an alarm when limit_type is degraded or critical. Gauge {errors} device_id, hw.type, firmware_version, id, info, limit_type, model, name, parent, serial_number, vendor
hw.physical_disk.endurance_utilization Physical disk remaining endurance ratio. Gauge device_id, firmware_version, id, info, model, name, parent, serial_number, state, vendor
hw.physical_disk.size Physical disk size. Gauge By device_id, firmware_version, id, info, model, name, parent, serial_number, vendor
hw.power Power consumed by the monitored physical disk. Gauge W device_id, firmware_version, hw.type, id, info, model, name, parent, serial_number, vendor
hw.status Operational status of the monitored physical disk. Each of the possible states (degraded, failed, ok, predicted_failure and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, firmware_version, hw.type, id, info, model, name, parent, serial_number, state, vendor

Power Supply

Metric Description Type Unit Attributes
hw.power_supply.limit Maximum power output. Gauge W device_id, id, info, limit_type, name, parent, power_supply_type
hw.power_supply.utilization Ratio of the power supply power currently in use. Gauge device_id, id, info, name, parent, power_supply_type
hw.status Operational status of the monitored power supply. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, power_supply_type, state

Robotics

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored robotic device since the start of the Hardware Sentry Agent. Counter J device_id, hw.type, id, info, model, name, parent, robotic_type, serial_number, vendor
hw.errors Number of errors encountered by the robotic device since the start of the Hardware Sentry Agent. Counter {errors} device_id, hw.type, id, info, model, name, parent, robotic_type, serial_number, vendor
hw.errors.limit Number of errors encountered that will generate a warning or an alarm when limit_type is degraded or critical. Gauge {errors} device_id, hw.type, id, info, limit_type, model, name, parent, robotic_type, serial_number, vendor
hw.power Power consumed by the monitored robotic device. Gauge W device_id, hw.type, id, info, model, name, parent, robotic_type, serial_number, vendor
hw.robotics.moves Number of move operations that occurred during the last collect interval. Counter {moves} device_id, id, info, model, name, parent, robotic_type, serial_number, vendor
hw.status Operational status of the monitored robotic device. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, model, name, parent, robotic_type, serial_number, state, vendor

Tape Drive

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored tape drive since the start of the Hardware Sentry Agent. Counter J device_id, hw.type, id, info, model, name, parent, serial_number, vendor
hw.errors Number of errors encountered by the tape drive since the start of the Hardware Sentry Agent. Counter {errors} device_id, hw.type, id, info, model, name, parent, serial_number, vendor
hw.errors.limit Number of errors encountered that will generate a warning or an alarm when limit_type is degraded or critical. Gauge {errors} device_id, hw.type, id, info, limit_type, model, name, parent, serial_number, vendor
hw.power Power consumed by the monitored tape drive. Gauge W device_id, hw.type, id, info, model, name, parent, serial_number, vendor
hw.status Operational status of the monitored tape drive. Each of the possible states (degraded, failed, needs_cleaning, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, model, name, parent, serial_number, state, vendor
hw.tape_drive.operations Number of mount or unmount operations that occurred during the last collect interval. Counter {operations} device_id, id, info, model, name, parent, serial_number, type, vendor

Temperature

Metric Description Type Unit Attributes
hw.status Operational status of the monitored temperature sensor. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, sensor_location, state
hw.temperature Current temperature reading in degrees Celsius (°C). Gauge Cel device_id, id, info, name, parent, sensor_location
hw.temperature.limit Temperature in degrees Celsius (°C) that will generate a warning or an alarm when limit_type is high.degraded or high.critical. Gauge Cel device_id, id, info, limit_type, name, parent, sensor_location
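
The idea behind a heating margin, such as the one reported by hw.host.heating_margin, can be sketched from hw.temperature and the high.degraded value of hw.temperature.limit. The computation below is an assumption made for illustration and may differ from what the agent does; the function name and the input layout are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def heating_margin(
    readings: Iterable[Tuple[float, Optional[float]]]
) -> Optional[float]:
    """Smallest gap (in °C) between a temperature and its warning limit.

    Each reading is a (temperature, high.degraded limit) pair; sensors
    without a limit (None) are ignored.
    """
    margins = [limit - temp for temp, limit in readings if limit is not None]
    return min(margins) if margins else None


# Three hypothetical sensors; the last one has no warning limit.
print(heating_margin([(45.0, 70.0), (62.0, 75.0), (30.0, None)]))
```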

Virtual Machine

Metric Description Type Unit Attributes
hw.energy Energy consumed by the monitored virtual machine since the start of the Hardware Sentry Agent. Counter J device_id, domain, hw.type, id, info, name, parent, vm.host.name
hw.power Power consumed by the monitored virtual machine. Gauge W device_id, domain, hw.type, id, info, name, parent, vm.host.name
hw.status Operational status of the monitored virtual machine. The possible state (present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, domain, hw.type, id, info, name, parent, state, vm.host.name
hw.vm.power_ratio Ratio of host power consumed by the virtual machine. Gauge device_id, domain, id, info, name, parent, vm.host.name
hw.vm.power_state Virtual machine power state. Each of the possible states (off, on and suspended) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, domain, id, info, name, parent, state, vm.host.name

Voltage

Metric Description Type Unit Attributes
hw.status Operational status of the monitored voltage sensor. Each of the possible states (degraded, failed, ok and present) will either take the value 1 (true) or 0 (false). UpDownCounter device_id, hw.type, id, info, name, parent, sensor_location, state
hw.voltage Voltage output. Gauge V device_id, id, info, name, parent, sensor_location
hw.voltage.limit Upper (high.critical) or lower (low.critical) threshold of the voltage. Gauge V device_id, id, info, limit_type, name, parent, sensor_location