# Nagios Plugin Integration Guide

The Heartbeat monitoring system now supports running existing Nagios-compatible monitoring plugins through the `nagios_runner` plugin. This allows you to leverage the thousands of existing Nagios plugins without modification.

## Quick Start

### 1. Install Nagios Plugins

**Debian/Ubuntu:**
```bash
sudo apt-get install nagios-plugins
```

**RHEL/CentOS/Fedora:**
```bash
sudo yum install nagios-plugins-all
# or
sudo dnf install nagios-plugins-all
```

**Arch Linux:**
```bash
sudo pacman -S monitoring-plugins
```

### 2. Configure Heartbeat

Add the `nagios_runner` section to your `~/.hb.yaml` config:

```yaml
nagios_runner:
  interval: 60      # Run plugins every 60 seconds
  timeout: 30       # Command timeout in seconds
  commands:
    - name: check_disk_root
      command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /

    - name: check_load
      command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6

    - name: check_procs
      command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400
```

### 3. Start Heartbeat Client

```bash
hbc -v localhost
```

The client will now execute the configured Nagios plugins and send their results to the server.

## How It Works

### Nagios Plugin Standard

Nagios plugins follow a simple interface:

1. **Exit Codes:**
   - `0` = OK
   - `1` = WARNING
   - `2` = CRITICAL
   - `3` = UNKNOWN

2. **Output Format:**
   ```
   STATUS - Message | performance_data
   ```

3. **Performance Data Format:**
   ```
   'label'=value[UOM];[warn];[crit];[min];[max]
   ```

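As a rough illustration of this interface, the sketch below maps an exit code to a status and splits a plugin's output line at the `|` separator. The names `NagiosStatus`, `status_from_exit_code`, and `split_output` are hypothetical and are not taken from `nagios_runner`; treating exit codes outside 0-3 as UNKNOWN is an assumption of this sketch.

```python
from enum import IntEnum


class NagiosStatus(IntEnum):
    """The four exit codes defined by the Nagios plugin interface."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def status_from_exit_code(code: int) -> NagiosStatus:
    """Map an exit code to a status (assumption: anything outside 0-3 is UNKNOWN)."""
    try:
        return NagiosStatus(code)
    except ValueError:
        return NagiosStatus.UNKNOWN


def split_output(line: str) -> tuple[str, str]:
    """Split 'STATUS - Message | performance_data' into (message, perfdata)."""
    message, _, perfdata = line.partition("|")
    return message.strip(), perfdata.strip()
```

If the line has no `|`, `split_output` returns an empty string for the performance data part.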
### Example Plugin Output

```bash
$ /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
DISK OK - free space: / 156 GB (78%); | /=44GB;127;142;0;159
```

This output includes:
- **Status:** `DISK OK`
- **Message:** `free space: / 156 GB (78%)`
- **Performance Data:** `/=44GB;127;142;0;159`
  - Current value: 44GB
  - Warning threshold: 127GB
  - Critical threshold: 142GB
  - Min: 0GB
  - Max: 159GB

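The breakdown above can be reproduced with a small parser. This is a minimal sketch for a single performance data item, not the parser used by `nagios_runner`; `parse_perfdata_item` is a hypothetical name. It keeps the thresholds as strings because Nagios thresholds may be ranges such as `10:20`, and it does not handle several space-separated items on one line.

```python
import re

# label (optionally single-quoted) = numeric value followed by an optional unit
_PERF_RE = re.compile(
    r"^(?P<label>'[^']+'|[^=\s]+)="
    r"(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[A-Za-z%]*)$"
)


def parse_perfdata_item(item: str) -> dict:
    """Parse one 'label'=value[UOM];[warn];[crit];[min];[max] item."""
    fields = item.strip().split(";")
    match = _PERF_RE.match(fields[0])
    if match is None:
        raise ValueError(f"not a valid performance data item: {item!r}")
    result = {
        "label": match["label"].strip("'"),
        "value": float(match["value"]),
        "uom": match["uom"],
        "warn": None,
        "crit": None,
        "min": None,
        "max": None,
    }
    # Optional fields are positional; an empty field means "not provided".
    for key, raw in zip(("warn", "crit", "min", "max"), fields[1:]):
        result[key] = raw or None
    return result
```

Calling `parse_perfdata_item("/=44GB;127;142;0;159")` yields the label `/`, the value `44.0`, the unit `GB`, and the four threshold and range fields as strings.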
### Data Collected

The `nagios_runner` plugin collects:

**For each configured command:**
- `{name}_status` - Status string (OK, WARNING, CRITICAL, UNKNOWN)
- `{name}_status_code` - Numeric exit code (0-3)
- `{name}_output` - Status message
- `{name}_{metric}` - Each performance metric value
- `{name}_{metric}_uom` - Unit of measurement (if present)
- `{name}_{metric}_warn` - Warning threshold (if present)
- `{name}_{metric}_crit` - Critical threshold (if present)
- `{name}_{metric}_min` - Minimum value (if present)
- `{name}_{metric}_max` - Maximum value (if present)

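To make the key naming concrete, the sketch below flattens one check result into keys of the shape listed above. `flatten_result` and the layout of its `metrics` argument are hypothetical and only illustrate the naming scheme. How `nagios_runner` sanitizes metric labels such as `/` is not described here, so this sketch uses the label unchanged.

```python
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def flatten_result(name: str, status_code: int, output: str, metrics: list) -> dict:
    """Build {name}_status, {name}_{metric}, {name}_{metric}_warn, ... keys."""
    data = {
        f"{name}_status": STATUS_NAMES.get(status_code, "UNKNOWN"),
        f"{name}_status_code": status_code,
        f"{name}_output": output,
    }
    for metric in metrics:
        key = f"{name}_{metric['label']}"
        data[key] = metric["value"]
        # Optional fields are only emitted when the plugin provided them.
        for suffix in ("uom", "warn", "crit", "min", "max"):
            if metric.get(suffix) not in (None, ""):
                data[f"{key}_{suffix}"] = metric[suffix]
    return data
```

For a `check_load` result with a `load1` metric that carries warning and critical thresholds, this produces `check_load_load1`, `check_load_load1_warn`, and `check_load_load1_crit` alongside the three status keys.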
## Configuration Options

```yaml
nagios_runner:
  # Collection interval in seconds (default: 60)
  interval: 60

  # Command execution timeout in seconds (default: 30)
  timeout: 30

  # Execute commands via shell (default: true)
  # Set to false for direct execution (more secure but less flexible)
  shell: true

  # List of Nagios plugins to run
  commands:
    - name: unique_name                   # Required: unique identifier
      command: /path/to/plugin [args]     # Required: full command to execute
```

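The `timeout` and `shell` options map naturally onto Python's `subprocess.run`. The following is a sketch of how a runner could honor them, not the actual `nagios_runner` code; `run_plugin` is a hypothetical helper, and reporting timeouts and launch failures as UNKNOWN (3) is an assumption.

```python
import shlex
import subprocess


def run_plugin(command: str, timeout: float = 30, shell: bool = True) -> tuple[int, str]:
    """Run one plugin command and return (exit_code, first_output_line)."""
    # With shell=False the command string is split into an argument list,
    # so shell features such as pipes and variable expansion are unavailable.
    args = command if shell else shlex.split(command)
    try:
        proc = subprocess.run(
            args, shell=shell, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 3, f"Command timed out after {timeout}s"
    except OSError as exc:  # e.g. missing file or missing execute permission
        return 3, f"Error: {exc}"
    lines = proc.stdout.strip().splitlines()
    return proc.returncode, (lines[0] if lines else "")
```

Direct execution avoids shell injection through the command string, which is why `shell: false` is described as more secure but less flexible.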
## Common Nagios Plugins

### System Resources

**Disk Space:**
```yaml
- name: check_disk_root
  command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
```

**Load Average:**
```yaml
- name: check_load
  command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
```

**Swap Usage:**
```yaml
- name: check_swap
  command: /usr/lib/nagios/plugins/check_swap -w 20% -c 10%
```

**Process Count:**
```yaml
- name: check_procs
  command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400
```

**Users Logged In:**
```yaml
- name: check_users
  command: /usr/lib/nagios/plugins/check_users -w 5 -c 10
```

### Network Services

**SSH:**
```yaml
- name: check_ssh
  command: /usr/lib/nagios/plugins/check_ssh localhost
```

**HTTP:**
```yaml
- name: check_http_local
  command: /usr/lib/nagios/plugins/check_http -H localhost

- name: check_http_ssl
  command: /usr/lib/nagios/plugins/check_http -H example.com --ssl
```

**DNS:**
```yaml
- name: check_dns
  command: /usr/lib/nagios/plugins/check_dns -H google.com
```

**Ping:**
```yaml
- name: check_ping_gateway
  command: /usr/lib/nagios/plugins/check_ping -H 192.168.1.1 -w 100,20% -c 500,60%
```

### Databases

**MySQL:**
```yaml
- name: check_mysql
  command: /usr/lib/nagios/plugins/check_mysql -H localhost -u user -p password
```

**PostgreSQL:**
```yaml
- name: check_pgsql
  command: /usr/lib/nagios/plugins/check_pgsql -H localhost -d database
```

## Writing Custom Nagios Plugins

You can write your own Nagios-compatible plugins in any language. Here's a simple example:

**Bash:**
```bash
#!/bin/bash
# /usr/local/bin/check_example.sh

# Get the value to check (must be an integer for the -ge comparisons below)
value=$(some_command)

# Define thresholds
warn=80
crit=90

# Check and output result (variables are quoted so an empty value fails cleanly)
if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - Value is $value | value=${value};${warn};${crit};0;100"
    exit 2
elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - Value is $value | value=${value};${warn};${crit};0;100"
    exit 1
else
    echo "OK - Value is $value | value=${value};${warn};${crit};0;100"
    exit 0
fi
```

**Python:**
```python
#!/usr/bin/env python3
# /usr/local/bin/check_example.py

import sys


def get_value():
    """Placeholder: replace with your own check logic."""
    return 42


def check_something():
    value = get_value()
    warn = 80
    crit = 90

    # Performance data: label=value;warn;crit;min;max
    perfdata = f"value={value};{warn};{crit};0;100"

    if value >= crit:
        print(f"CRITICAL - Value is {value} | {perfdata}")
        sys.exit(2)
    elif value >= warn:
        print(f"WARNING - Value is {value} | {perfdata}")
        sys.exit(1)
    else:
        print(f"OK - Value is {value} | {perfdata}")
        sys.exit(0)


if __name__ == "__main__":
    check_something()
```

Then configure in Heartbeat:
```yaml
nagios_runner:
  commands:
    - name: my_custom_check
      command: /usr/local/bin/check_example.sh
```

## Troubleshooting

### Plugin not found
```
Error: Command not found
```
**Solution:** Use the full path to the plugin. Common locations:
- `/usr/lib/nagios/plugins/`
- `/usr/lib64/nagios/plugins/`
- `/usr/local/nagios/libexec/`

### Permission denied
```
Error: Permission denied
```
**Solution:** Ensure the plugin is executable:
```bash
chmod +x /path/to/plugin
```

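Both of the failures above (a wrong path and a missing execute bit) can be checked before a command is added to the config. The helper below is a hypothetical sketch, not part of Heartbeat; it searches the common locations listed above and returns the first path that exists and is executable.

```python
import os
from typing import Iterable, Optional

# Common plugin directories, as listed in the troubleshooting notes above.
COMMON_PLUGIN_DIRS = (
    "/usr/lib/nagios/plugins",
    "/usr/lib64/nagios/plugins",
    "/usr/local/nagios/libexec",
)


def find_plugin(name: str, dirs: Iterable[str] = COMMON_PLUGIN_DIRS) -> Optional[str]:
    """Return the full path of an executable plugin, or None if not found."""
    for directory in dirs:
        candidate = os.path.join(directory, name)
        # The file must exist and be executable by the current user.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print(find_plugin("check_disk") or "check_disk not found in common locations")
```

A result of `None` for a plugin you know is installed usually points to a different install directory or a missing execute permission.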
### Timeout errors
```
Command timed out after 30s
```
**Solution:** Increase the timeout in config:
```yaml
nagios_runner:
  timeout: 60  # Increase timeout
```

### No performance data
If performance data is not being parsed:
1. Check plugin output includes `|` separator
2. Verify performance data format: `'label'=value[UOM];...`
3. Enable debug logging: `hbc -v -x localhost`

## Benefits

1. **Massive Plugin Library:** Thousands of existing Nagios plugins available
2. **No Rewriting:** Use plugins as-is without modification
3. **Community Support:** Well-documented and maintained plugins
4. **Flexibility:** Mix Nagios plugins with native Heartbeat plugins
5. **Standard Interface:** Consistent exit codes and output format
6. **Performance Data:** Automatic extraction of metrics

## Resources

- [Nagios Plugin Development Guidelines](https://nagios-plugins.org/doc/guidelines.html)
- [Monitoring Plugins Project](https://www.monitoring-plugins.org/)
- [Nagios Exchange](https://exchange.nagios.org/) - Plugin repository
- [Check_MK Local Checks](https://docs.checkmk.com/latest/en/localchecks.html) - Compatible format

## Next Steps

- Configure threshold alerts based on Nagios plugin status codes
- View plugin data in the Heartbeat web UI
- Create custom plugins for your specific monitoring needs
- Integrate with existing Nagios/Icinga configurations