Major refactoring of the codebase, including restructuring of files and directories, renaming of modules and classes, and improvements to the overall organization and readability of the code. This refactoring aims to enhance maintainability, scalability, and clarity of the codebase while preserving existing functionality. The changes include:
- Restructuring of the project directory into client and server components
- Renaming of modules and classes to better reflect their purpose and functionality
- Moving common utilities and configurations to a shared location
- Updating import statements to reflect the new structure
- Adding new documentation files for better clarity on various aspects of the project
- Removing deprecated or unused code to streamline the codebase
- Ensuring that all existing functionality is preserved and that the codebase remains functional after the refactoring
@@ -0,0 +1,331 @@
# Nagios Plugin Integration Guide

The Heartbeat monitoring system now supports running existing Nagios-compatible monitoring plugins through the `nagios_runner` plugin. This allows you to leverage the thousands of existing Nagios plugins without modification.

## Quick Start

### 1. Install Nagios Plugins

**Debian/Ubuntu:**
```bash
sudo apt-get install nagios-plugins
```

**RHEL/CentOS/Fedora:**
```bash
sudo yum install nagios-plugins-all
# or
sudo dnf install nagios-plugins-all
```

**Arch Linux:**
```bash
sudo pacman -S monitoring-plugins
```

### 2. Configure Heartbeat

Add the `nagios_runner` section to your `~/.hb.yaml` config:

```yaml
nagios_runner:
  interval: 60        # Run plugins every 60 seconds
  timeout: 30         # Command timeout in seconds
  commands:
    - name: check_disk_root
      command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /

    - name: check_load
      command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6

    - name: check_procs
      command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400
```

### 3. Start Heartbeat Client

```bash
hbc -v localhost
```

The client will now execute the configured Nagios plugins and send their results to the server.

## How It Works

### Nagios Plugin Standard

Nagios plugins follow a simple interface:

1. **Exit Codes:**
   - `0` = OK
   - `1` = WARNING
   - `2` = CRITICAL
   - `3` = UNKNOWN

2. **Output Format:**
   ```
   STATUS - Message | performance_data
   ```

3. **Performance Data Format:**
   ```
   'label'=value[UOM];[warn];[crit];[min];[max]
   ```
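
The performance data format above can be parsed with a single regular expression. The following is a minimal Python sketch, not the `nagios_runner` implementation; `PERFDATA_RE` and `parse_perfdata` are hypothetical names. Thresholds are kept as strings because the Nagios guidelines allow range expressions in the warn and crit fields.

```python
import re

# One perfdata item: 'label'=value[UOM];[warn];[crit];[min];[max]
# The label may be single-quoted, which is required when it contains spaces.
PERFDATA_RE = re.compile(
    r"(?:'(?P<quoted>[^']+)'|(?P<bare>[^\s=']+))"  # label
    r"=(?P<value>-?\d+(?:\.\d+)?)"                  # numeric value
    r"(?P<uom>[^\s;\d]*)"                           # optional unit of measurement
    r"(?:;(?P<warn>[^\s;]*))?"
    r"(?:;(?P<crit>[^\s;]*))?"
    r"(?:;(?P<min>[^\s;]*))?"
    r"(?:;(?P<max>[^\s;]*))?"
)


def parse_perfdata(perfdata):
    """Return {label: {"value": float, "uom": ..., "warn": ..., ...}}."""
    metrics = {}
    for match in PERFDATA_RE.finditer(perfdata):
        label = match.group("quoted") or match.group("bare")
        metric = {"value": float(match.group("value"))}
        if match.group("uom"):
            metric["uom"] = match.group("uom")
        # Optional fields are stored only when present and non-empty.
        for field in ("warn", "crit", "min", "max"):
            raw = match.group(field)
            if raw:
                metric[field] = raw
        metrics[label] = metric
    return metrics
```

For the input `/=44GB;127;142;0;159`, this sketch produces one metric labelled `/` with value `44.0`, unit `GB`, and the four threshold and range fields as strings.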

### Example Plugin Output

```bash
$ /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
DISK OK - free space: / 156 GB (78%); | /=44GB;127;142;0;159
```

This output includes:
- **Status:** `DISK OK`
- **Message:** `free space: / 156 GB (78%)`
- **Performance Data:** `/=44GB;127;142;0;159`
  - Current value: 44GB
  - Warning threshold: 127GB
  - Critical threshold: 142GB
  - Min: 0GB
  - Max: 159GB
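
The breakdown above can be reproduced by splitting the line at the first `|` and then at the first ` - `. This is a minimal Python sketch under the assumption that the plugin follows the `STATUS - Message | performance_data` convention; `split_plugin_output` is a hypothetical helper, not part of Heartbeat.

```python
def split_plugin_output(line):
    """Split one plugin output line into (status, message, perfdata)."""
    # Everything after the first '|' is performance data.
    text, _, perfdata = line.partition("|")
    text = text.strip()
    # The status and the message are separated by ' - '.
    status, sep, message = text.partition(" - ")
    if not sep:
        # No separator: treat the whole text as the message.
        status, message = "", text
    return status.strip(), message.strip(), perfdata.strip()


status, message, perfdata = split_plugin_output(
    "DISK OK - free space: / 156 GB (78%); | /=44GB;127;142;0;159"
)
print(status)    # DISK OK
print(perfdata)  # /=44GB;127;142;0;159
```

Note that the sketch keeps the trailing `;` that `check_disk` prints at the end of the message; strip it separately if you do not want it.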

### Data Collected

The `nagios_runner` plugin collects:

**For each configured command:**
- `{name}_status` - Status string (OK, WARNING, CRITICAL, UNKNOWN)
- `{name}_status_code` - Numeric exit code (0-3)
- `{name}_output` - Status message
- `{name}_{metric}` - Each performance metric value
- `{name}_{metric}_uom` - Unit of measurement (if present)
- `{name}_{metric}_warn` - Warning threshold (if present)
- `{name}_{metric}_crit` - Critical threshold (if present)
- `{name}_{metric}_min` - Minimum value (if present)
- `{name}_{metric}_max` - Maximum value (if present)

**Overall:**
- `overall_status` - Worst status from all commands
- `overall_status_code` - Worst status code
- `plugin_count` - Number of Nagios plugins executed
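
To make the key naming concrete, here is a minimal Python sketch that flattens per-command results into the keys listed above. It is an illustration only: `flatten_results` and the shape of the input dictionaries are hypothetical, and it assumes that "worst" means the highest numeric exit code, which may not match how `nagios_runner` ranks UNKNOWN against CRITICAL.

```python
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def flatten_results(results):
    """Flatten a list of per-command results into one flat dict.

    Each result is assumed (hypothetically) to look like:
    {"name": str, "status_code": int, "output": str, "metrics": {label: {...}}}
    """
    data = {}
    for result in results:
        name = result["name"]
        code = result["status_code"]
        data[f"{name}_status"] = STATUS_NAMES.get(code, "UNKNOWN")
        data[f"{name}_status_code"] = code
        data[f"{name}_output"] = result["output"]
        for metric, fields in result.get("metrics", {}).items():
            data[f"{name}_{metric}"] = fields["value"]
            # Optional fields are emitted only when the plugin reported them.
            for key in ("uom", "warn", "crit", "min", "max"):
                if key in fields:
                    data[f"{name}_{metric}_{key}"] = fields[key]

    # Assumption: the worst status is the highest numeric exit code.
    worst = max((r["status_code"] for r in results), default=0)
    data["overall_status"] = STATUS_NAMES.get(worst, "UNKNOWN")
    data["overall_status_code"] = worst
    data["plugin_count"] = len(results)
    return data
```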

## Configuration Options

```yaml
nagios_runner:
  # Collection interval in seconds (default: 60)
  interval: 60

  # Command execution timeout in seconds (default: 30)
  timeout: 30

  # Execute commands via shell (default: true)
  # Set to false for direct execution (more secure but less flexible)
  shell: true

  # List of Nagios plugins to run
  commands:
    - name: unique_name                  # Required: unique identifier
      command: /path/to/plugin [args]    # Required: full command to execute
```
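
The `timeout` and `shell` options map naturally onto Python's standard `subprocess.run`. The sketch below shows one way such a runner could behave; it is not taken from the `nagios_runner` source, `run_check` is a hypothetical name, and reporting timeouts and launch failures as UNKNOWN (3) is an assumption.

```python
import shlex
import subprocess


def run_check(command, timeout=30, shell=True):
    """Run one plugin command and return (exit_code, first_output_line)."""
    # With shell=False the command string is split into an argument list,
    # so shell features such as pipes and variable expansion are unavailable.
    args = command if shell else shlex.split(command)
    try:
        proc = subprocess.run(
            args, shell=shell, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 3, f"Command timed out after {timeout}s"
    except OSError as exc:
        # Raised for a missing or non-executable file when shell=False.
        return 3, f"Error: {exc}"

    # Exit codes outside 0-3 are not defined by the plugin standard.
    code = proc.returncode if proc.returncode in (0, 1, 2, 3) else 3
    lines = proc.stdout.strip().splitlines()
    output = lines[0] if lines else proc.stderr.strip()
    return code, output
```

Direct execution avoids shell interpretation of the command string, which is why the option comment above describes `shell: false` as more secure but less flexible.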

## Common Nagios Plugins

### System Resources

**Disk Space:**
```yaml
- name: check_disk_root
  command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
```

**Load Average:**
```yaml
- name: check_load
  command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
```

**Swap Usage:**
```yaml
- name: check_swap
  command: /usr/lib/nagios/plugins/check_swap -w 20% -c 10%
```

**Process Count:**
```yaml
- name: check_procs
  command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400
```

**Users Logged In:**
```yaml
- name: check_users
  command: /usr/lib/nagios/plugins/check_users -w 5 -c 10
```

### Network Services

**SSH:**
```yaml
- name: check_ssh
  command: /usr/lib/nagios/plugins/check_ssh localhost
```

**HTTP:**
```yaml
- name: check_http_local
  command: /usr/lib/nagios/plugins/check_http -H localhost

- name: check_http_ssl
  command: /usr/lib/nagios/plugins/check_http -H example.com --ssl
```

**DNS:**
```yaml
- name: check_dns
  command: /usr/lib/nagios/plugins/check_dns -H google.com
```

**Ping:**
```yaml
- name: check_ping_gateway
  command: /usr/lib/nagios/plugins/check_ping -H 192.168.1.1 -w 100,20% -c 500,60%
```

### Databases

**MySQL:**
```yaml
- name: check_mysql
  command: /usr/lib/nagios/plugins/check_mysql -H localhost -u user -p password
```

**PostgreSQL:**
```yaml
- name: check_pgsql
  command: /usr/lib/nagios/plugins/check_pgsql -H localhost -d database
```

## Writing Custom Nagios Plugins

You can write your own Nagios-compatible plugins in any language. Here's a simple example:

**Bash:**
```bash
#!/bin/bash
# /usr/local/bin/check_example.sh

# Get the value to check (replace some_command with your own check;
# it must print a single integer)
value=$(some_command)

# Define thresholds
warn=80
crit=90

# Check and output result (variables are quoted so an empty value
# fails the test cleanly instead of causing a syntax error)
if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - Value is $value | value=${value};${warn};${crit};0;100"
    exit 2
elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - Value is $value | value=${value};${warn};${crit};0;100"
    exit 1
else
    echo "OK - Value is $value | value=${value};${warn};${crit};0;100"
    exit 0
fi
```

**Python:**
```python
#!/usr/bin/env python3
# /usr/local/bin/check_example.py

import sys


def get_value():
    # Placeholder: replace with your own check logic
    return 42


def check_something():
    value = get_value()
    warn = 80
    crit = 90

    # Performance data: value;warn;crit;min;max
    perfdata = f"value={value};{warn};{crit};0;100"

    if value >= crit:
        print(f"CRITICAL - Value is {value} | {perfdata}")
        sys.exit(2)
    elif value >= warn:
        print(f"WARNING - Value is {value} | {perfdata}")
        sys.exit(1)
    else:
        print(f"OK - Value is {value} | {perfdata}")
        sys.exit(0)


if __name__ == "__main__":
    check_something()
```

Then configure in Heartbeat:
```yaml
nagios_runner:
  commands:
    - name: my_custom_check
      command: /usr/local/bin/check_example.sh
```
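
One detail worth handling in a custom Python plugin: an uncaught exception makes the interpreter exit with status 1, which a Nagios-style runner would read as WARNING. The sketch below shows one way to report such failures as UNKNOWN (3) instead. `run_plugin` and `check_example` are hypothetical names used only for illustration.

```python
def run_plugin(check):
    """Run check() -> (exit_code, message) and map any crash to UNKNOWN."""
    try:
        code, message = check()
    except Exception as exc:
        # Without this handler Python would print a traceback and exit with 1,
        # which the plugin standard defines as WARNING.
        code, message = 3, f"UNKNOWN - {type(exc).__name__}: {exc}"
    print(message)
    return code


def check_example():
    # Stand-in for real check logic that fails unexpectedly.
    raise FileNotFoundError("sensor file missing")
```

A plugin script built this way would end with `sys.exit(run_plugin(check_example))` so that the returned code becomes the process exit status.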

## Troubleshooting

### Plugin not found
```
Error: Command not found
```
**Solution:** Use the full path to the plugin. Common locations:
- `/usr/lib/nagios/plugins/`
- `/usr/lib64/nagios/plugins/`
- `/usr/local/nagios/libexec/`
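
If you are unsure which of these locations your distribution uses, a short script can search them. This is a minimal sketch; `find_plugin` is a hypothetical helper, and the directory list simply mirrors the locations above.

```python
import os

# Directories listed in this guide; extend the list for your system.
PLUGIN_DIRS = [
    "/usr/lib/nagios/plugins",
    "/usr/lib64/nagios/plugins",
    "/usr/local/nagios/libexec",
]


def find_plugin(name, dirs=PLUGIN_DIRS):
    """Return (full_path, is_executable) for the first match, else (None, False)."""
    for directory in dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            # os.access reports whether the current user may execute the file.
            return path, os.access(path, os.X_OK)
    return None, False


path, executable = find_plugin("check_disk")
if path is None:
    print("check_disk not found in the common locations")
else:
    print(f"{path} (executable: {executable})")
```

The second return value also helps with the permission error described in the next section.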

### Permission denied
```
Error: Permission denied
```
**Solution:** Ensure the plugin is executable:
```bash
chmod +x /path/to/plugin
```

### Timeout errors
```
Command timed out after 30s
```
**Solution:** Increase the timeout in config:
```yaml
nagios_runner:
  timeout: 60  # Increase timeout
```

### No performance data
If performance data is not being parsed:
1. Check that the plugin output includes the `|` separator
2. Verify that the performance data follows the format `'label'=value[UOM];...`
3. Enable debug logging: `hbc -v -x localhost`
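
The first two checks can be automated by running the plugin by hand and feeding its output line to a small validator. The following Python sketch is an assumption-laden illustration, not Heartbeat's parser; `diagnose_output` and `ITEM_RE` are hypothetical names, and the pattern accepts only plain numeric values.

```python
import re

# A single well-formed item: 'label'=value[UOM] followed by up to four ';' fields.
ITEM_RE = re.compile(
    r"^(?:'[^']+'|[^\s=']+)=-?\d+(?:\.\d+)?[^\s;\d]*(?:;[^\s;]*){0,4}$"
)


def diagnose_output(line):
    """Return a list of human-readable problems found in one output line."""
    if "|" not in line:
        return ["no '|' separator: the plugin emitted no performance data"]
    perfdata = line.split("|", 1)[1].strip()
    if not perfdata:
        return ["nothing follows the '|' separator"]
    problems = []
    # Split on whitespace while keeping quoted labels with spaces together.
    for item in re.findall(r"(?:'[^']*'|\S)+", perfdata):
        if not ITEM_RE.match(item):
            problems.append(f"malformed performance data item: {item}")
    return problems
```

An empty list means the line matches the documented format, so the remaining step is the debug logging in item 3.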

## Benefits

1. **Massive Plugin Library:** Thousands of existing Nagios plugins available
2. **No Rewriting:** Use plugins as-is without modification
3. **Community Support:** Well-documented and maintained plugins
4. **Flexibility:** Mix Nagios plugins with native Heartbeat plugins
5. **Standard Interface:** Consistent exit codes and output format
6. **Performance Data:** Automatic extraction of metrics

## Resources

- [Nagios Plugin Development Guidelines](https://nagios-plugins.org/doc/guidelines.html)
- [Monitoring Plugins Project](https://www.monitoring-plugins.org/)
- [Nagios Exchange](https://exchange.nagios.org/) - Plugin repository
- [Check_MK Local Checks](https://docs.checkmk.com/latest/en/localchecks.html) - Compatible format

## Next Steps

- Configure threshold alerts based on Nagios plugin status codes
- View plugin data in the Heartbeat web UI
- Create custom plugins for your specific monitoring needs
- Integrate with existing Nagios/Icinga configurations