Add ComparisonOperator.NAGIOS ("nagios") that maps Nagios exit codes
directly to alert levels (0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN) without
requiring numeric warning/critical thresholds. Hysteresis is bypassed for
discrete codes. Display template defaults to "{check_name}: {output}".
_format_display() handles None threshold_value gracefully.
Add nagios_runner.status_code as a built-in default threshold config so
nagios checks alert out of the box.
Also: fix alerts.html scrolling (override html,body), make hostname a link
to /plugins#<hostname>, remove overall_status/overall_status_code/plugin_count
from nagios_runner and hbc_mini, replace with computed worst-status in
plugins.html via nagiosWorstStatus() helper.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Nagios Plugin Integration Guide
The Heartbeat monitoring system now supports running existing Nagios-compatible monitoring plugins through the nagios_runner plugin. This allows you to leverage the thousands of existing Nagios plugins without modification.
Quick Start
1. Install Nagios Plugins
Debian/Ubuntu:
sudo apt-get install nagios-plugins
RHEL/CentOS/Fedora:
sudo yum install nagios-plugins-all
# or
sudo dnf install nagios-plugins-all
Arch Linux:
sudo pacman -S monitoring-plugins
2. Configure Heartbeat
Add the nagios_runner section to your ~/.hb.yaml config:
nagios_runner:
  interval: 60  # Run plugins every 60 seconds
  timeout: 30   # Command timeout in seconds
  commands:
    - name: check_disk_root
      command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
    - name: check_load
      command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
    - name: check_procs
      command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400
3. Start Heartbeat Client
hbc -v localhost
The client will now execute the configured Nagios plugins and send their results to the server.
How It Works
Nagios Plugin Standard
Nagios plugins follow a simple interface:
- Exit Codes:
  - 0 = OK
  - 1 = WARNING
  - 2 = CRITICAL
  - 3 = UNKNOWN
- Output Format: STATUS - Message | performance_data
- Performance Data Format: 'label'=value[UOM];[warn];[crit];[min];[max]
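The interface above can be sketched in a few lines of Python. This is an illustrative sketch only, not the nagios_runner implementation: the helper name `split_plugin_output` and the sample output string are assumptions made for the example.

```python
# Exit-code mapping from the Nagios plugin convention listed above.
STATUS_BY_CODE = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def split_plugin_output(output: str, exit_code: int) -> dict:
    """Split plugin output into status, message, and raw performance data.

    Hypothetical helper for illustration; only the first output line is used.
    """
    stripped = output.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    # Everything after the first '|' is performance data.
    message, _, perfdata = first_line.partition("|")
    return {
        # Exit codes outside 0-3 are treated as UNKNOWN (an assumption).
        "status": STATUS_BY_CODE.get(exit_code, "UNKNOWN"),
        "status_code": exit_code,
        "message": message.strip(),
        "perfdata": perfdata.strip(),
    }


# Made-up sample output, for demonstration only.
result = split_plugin_output("LOAD OK - load average: 0.10 | load1=0.10;5;10;0;", 0)
print(result["status"], "/", result["message"], "/", result["perfdata"])
```

The exit code, not the text of the first line, decides the status here, which matches the convention that the exit code is the authoritative result.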
Example Plugin Output
$ /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
DISK OK - free space: / 156 GB (78%); | /=44GB;127;142;0;159
This output includes:
- Status: DISK OK
- Message: free space: / 156 GB (78%)
- Performance Data: /=44GB;127;142;0;159
  - Current value: 44GB
  - Warning threshold: 127GB
  - Critical threshold: 142GB
  - Min: 0GB
  - Max: 159GB
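The breakdown above could be reproduced with a small parser along these lines. It is a minimal sketch that assumes a single performance-data item with a plain numeric value; the function name `parse_perfdata_item` is hypothetical, and the thresholds are kept as strings because Nagios allows range expressions there.

```python
import re

# Numeric value followed by an optional unit of measurement (e.g. "44GB", "78%").
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)([a-zA-Z%]*)$")


def parse_perfdata_item(item: str) -> dict:
    """Parse one 'label'=value[UOM];[warn];[crit];[min];[max] item.

    Hypothetical helper; non-numeric values raise ValueError.
    """
    label, _, rest = item.partition("=")
    label = label.strip().strip("'")  # labels may be single-quoted
    fields = rest.split(";")
    # Pad so that omitted trailing fields become empty strings.
    fields += [""] * (5 - len(fields))
    match = _VALUE_RE.match(fields[0])
    if match is None:
        raise ValueError(f"unparseable perfdata value: {fields[0]!r}")
    value, uom = match.groups()
    return {
        "label": label,
        "value": float(value),
        "uom": uom,
        "warn": fields[1],
        "crit": fields[2],
        "min": fields[3],
        "max": fields[4],
    }


parsed = parse_perfdata_item("/=44GB;127;142;0;159")
print(parsed)
```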
Data Collected
The nagios_runner plugin collects:
For each configured command:
- {name}_status - Status string (OK, WARNING, CRITICAL, UNKNOWN)
- {name}_status_code - Numeric exit code (0-3)
- {name}_output - Status message
- {name}_{metric} - Each performance metric value
- {name}_{metric}_uom - Unit of measurement (if present)
- {name}_{metric}_warn - Warning threshold (if present)
- {name}_{metric}_crit - Critical threshold (if present)
- {name}_{metric}_min - Minimum value (if present)
- {name}_{metric}_max - Maximum value (if present)
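To make the key naming concrete, the following sketch builds such a flat dictionary from one check result. The function `flatten_result` and the shape of its `metrics` argument are assumptions for illustration; in particular, how nagios_runner sanitizes metric labels is not described here, so the sketch uses labels as-is.

```python
def flatten_result(name, status, status_code, output, metrics):
    """Build the flat {name}_... keys listed above (hypothetical helper).

    `metrics` is assumed to be a list of dicts with string-valued optional
    fields, e.g. {"label": "load1", "value": 0.1, "warn": "5"}.
    """
    data = {
        f"{name}_status": status,
        f"{name}_status_code": status_code,
        f"{name}_output": output,
    }
    for metric in metrics:
        key = f"{name}_{metric['label']}"
        data[key] = metric["value"]
        for field in ("uom", "warn", "crit", "min", "max"):
            # Optional fields are emitted only "if present" (non-empty string).
            if metric.get(field):
                data[f"{key}_{field}"] = metric[field]
    return data


flat = flatten_result(
    "check_load", "OK", 0, "load average: 0.10",
    [{"label": "load1", "value": 0.1, "warn": "5", "crit": "10", "min": "0"}],
)
for key in sorted(flat):
    print(key, "=", flat[key])
```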
Configuration Options
nagios_runner:
  # Collection interval in seconds (default: 60)
  interval: 60
  # Command execution timeout in seconds (default: 30)
  timeout: 30
  # Execute commands via shell (default: true)
  # Set to false for direct execution (more secure but less flexible)
  shell: true
  # List of Nagios plugins to run
  commands:
    - name: unique_name                # Required: unique identifier
      command: /path/to/plugin [args]  # Required: full command to execute
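As a rough illustration of what the `timeout` and `shell` options imply, the sketch below runs one command with Python's `subprocess` module. It is not the nagios_runner source: the function name `run_check` and the choice to report timeouts and launch failures as UNKNOWN (3) are assumptions.

```python
import shlex
import subprocess


def run_check(command: str, timeout: float = 30, shell: bool = True):
    """Run one plugin command and return (exit_code, stdout).

    Hypothetical helper; failures to launch are reported as UNKNOWN (3).
    """
    # Without a shell, the command string must be split into an argument list.
    args = command if shell else shlex.split(command)
    try:
        proc = subprocess.run(
            args, shell=shell, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 3, f"Command timed out after {timeout}s"
    except OSError as exc:  # e.g. missing or non-executable binary when shell=False
        return 3, str(exc)
    return proc.returncode, proc.stdout.strip()


code, output = run_check("echo 'OK - example | value=1;80;90;0;100'")
print(code, output)
```

Direct execution (`shell=False`) avoids shell interpretation of the command string, which is why it is the more secure option, but pipes, globbing, and variable expansion in the command then stop working.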
Common Nagios Plugins
System Resources
Disk Space:
- name: check_disk_root
  command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
Load Average:
- name: check_load
  command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
Swap Usage:
- name: check_swap
  command: /usr/lib/nagios/plugins/check_swap -w 20% -c 10%
Process Count:
- name: check_procs
  command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400
Users Logged In:
- name: check_users
  command: /usr/lib/nagios/plugins/check_users -w 5 -c 10
Network Services
SSH:
- name: check_ssh
  command: /usr/lib/nagios/plugins/check_ssh localhost
HTTP:
- name: check_http_local
  command: /usr/lib/nagios/plugins/check_http -H localhost
- name: check_http_ssl
  command: /usr/lib/nagios/plugins/check_http -H example.com --ssl
DNS:
- name: check_dns
  command: /usr/lib/nagios/plugins/check_dns -H google.com
Ping:
- name: check_ping_gateway
  command: /usr/lib/nagios/plugins/check_ping -H 192.168.1.1 -w 100,20% -c 500,60%
Databases
MySQL:
- name: check_mysql
  command: /usr/lib/nagios/plugins/check_mysql -H localhost -u user -p password
PostgreSQL:
- name: check_pgsql
  command: /usr/lib/nagios/plugins/check_pgsql -H localhost -d database
Writing Custom Nagios Plugins
You can write your own Nagios-compatible plugins in any language. Here's a simple example:
Bash:
#!/bin/bash
# /usr/local/bin/check_example.sh

# Get the value to check (some_command is a placeholder for your own check)
value=$(some_command)

# Define thresholds
warn=80
crit=90

# Check and output result; variables are quoted so that an empty value
# does not break the test expression
if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - Value is $value | value=${value};${warn};${crit};0;100"
    exit 2
elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - Value is $value | value=${value};${warn};${crit};0;100"
    exit 1
else
    echo "OK - Value is $value | value=${value};${warn};${crit};0;100"
    exit 0
fi
Python:
#!/usr/bin/env python3
# /usr/local/bin/check_example.py
import sys


def get_value():
    # Placeholder: replace with your own check logic
    return 42


def check_something():
    value = get_value()
    warn = 80
    crit = 90
    # Performance data: label=value;warn;crit;min;max
    perfdata = f"value={value};{warn};{crit};0;100"
    if value >= crit:
        print(f"CRITICAL - Value is {value} | {perfdata}")
        sys.exit(2)
    elif value >= warn:
        print(f"WARNING - Value is {value} | {perfdata}")
        sys.exit(1)
    else:
        print(f"OK - Value is {value} | {perfdata}")
        sys.exit(0)


if __name__ == "__main__":
    check_something()
Then configure in Heartbeat:
nagios_runner:
  commands:
    - name: my_custom_check
      command: /usr/local/bin/check_example.sh
Troubleshooting
Plugin not found
Error: Command not found
Solution: Use the full path to the plugin. Common locations:
- /usr/lib/nagios/plugins/
- /usr/lib64/nagios/plugins/
- /usr/local/nagios/libexec/
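If you are unsure which of these directories your distribution uses, a short script can search them. This is a convenience sketch, not part of Heartbeat; the name `find_plugin` is hypothetical and the directory list simply repeats the locations above.

```python
import os

# The common plugin locations listed above.
COMMON_PLUGIN_DIRS = (
    "/usr/lib/nagios/plugins",
    "/usr/lib64/nagios/plugins",
    "/usr/local/nagios/libexec",
)


def find_plugin(name, search_dirs=COMMON_PLUGIN_DIRS):
    """Return the full path of an executable plugin, or None if not found."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        # Require a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_plugin("check_disk"))
```

Because the check includes the execute permission, a plugin that exists but is not executable is also reported as not found.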
Permission denied
Error: Permission denied
Solution: Ensure the plugin is executable:
chmod +x /path/to/plugin
Timeout errors
Command timed out after 30s
Solution: Increase the timeout in config:
nagios_runner:
  timeout: 60  # Increase timeout
No performance data
If performance data is not being parsed:
- Check that the plugin output includes the | separator
- Verify the performance data format: 'label'=value[UOM];...
- Enable debug logging: hbc -v -x localhost
Benefits
- Massive Plugin Library: Thousands of existing Nagios plugins available
- No Rewriting: Use plugins as-is without modification
- Community Support: Well-documented and maintained plugins
- Flexibility: Mix Nagios plugins with native Heartbeat plugins
- Standard Interface: Consistent exit codes and output format
- Performance Data: Automatic extraction of metrics
Resources
- Nagios Plugin Development Guidelines
- Monitoring Plugins Project
- Nagios Exchange - Plugin repository
- Check_MK Local Checks - Compatible format
Next Steps
- Configure threshold alerts based on Nagios plugin status codes
- View plugin data in the Heartbeat web UI
- Create custom plugins for your specific monitoring needs
- Integrate with existing Nagios/Icinga configurations