heartbeat/hbd/config_nagios_example.yaml
Andreas Wrede 0543266c92 Major refactoring of the codebase, including restructuring of files and directories, renaming of modules and classes, and improvements to the overall organization and readability of the code. This refactoring aims to enhance maintainability, scalability, and clarity of the codebase while preserving existing functionality. The changes include:
- Restructuring of the project directory into client and server components
- Renaming of modules and classes to better reflect their purpose and functionality
- Moving common utilities and configurations to a shared location
- Updating import statements to reflect the new structure
- Adding new documentation files for better clarity on various aspects of the project
- Removing deprecated or unused code to streamline the codebase
- Ensuring that all existing functionality is preserved and that the codebase remains functional after the refactoring.
2026-03-29 11:13:40 -04:00


# Heartbeat Configuration Example with Nagios Plugin Runner
# This example shows how to configure the Nagios Runner plugin
# to execute existing Nagios-compatible monitoring plugins
# Basic server settings (existing config)
hb_port: 50003
hbd_port: 50004
interval: 20
grace: 2
# Plugin configuration
# Each plugin can have its own configuration section
# CPU Monitor Plugin
cpu_monitor:
  interval: 300    # Collect every 5 minutes (default)
  per_core: false  # Set to true to get per-core CPU usage
# Nagios Runner Plugin
nagios_runner:
  interval: 300  # Run Nagios plugins every 5 minutes (default)
  timeout: 30    # Command execution timeout in seconds
  shell: true    # Execute commands via shell

  # List of Nagios plugins to run
  commands:
    # Example 1: Check disk space
    - name: check_disk_root
      command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /

    # Example 2: Check disk space for /home
    - name: check_disk_home
      command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /home

    # Example 3: Check system load
    - name: check_load
      command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6

    # Example 4: Check process count
    - name: check_procs
      command: /usr/lib/nagios/plugins/check_procs -w 250 -c 400

    # Example 5: Check SSH service
    - name: check_ssh
      command: /usr/lib/nagios/plugins/check_ssh localhost

    # Example 6: Check HTTP service
    - name: check_http
      command: /usr/lib/nagios/plugins/check_http -H localhost

    # Example 7: Check swap usage
    - name: check_swap
      command: /usr/lib/nagios/plugins/check_swap -w 20% -c 10%

    # Example 8: Custom script (Nagios plugin format)
    - name: check_custom
      command: /usr/local/bin/my_custom_check.sh

    # Example 9: Check specific log file
    - name: check_logs
      command: /usr/lib/nagios/plugins/check_log -F /var/log/syslog -O /var/tmp/check_log.old -q "ERROR"

# Notes:
#
# 1. Nagios Plugin Output Format:
#    - Single line: STATUS - Message | performance_data
#    - Performance data format: 'label'=value[UOM];[warn];[crit];[min];[max]
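#
#    Illustrative breakdown of one output line. This is a typical check_load
#    line; the exact wording and precision vary by plugin version, so treat
#    the values as an assumption:
#      OK - load average: 0.10, 0.20, 0.30 | load1=0.100;5.000;10.000;0; load5=0.200;4.000;8.000;0; load15=0.300;3.000;6.000;0;
#    Split at the "|":
#      status text:  OK - load average: 0.10, 0.20, 0.30
#      first metric: load1=0.100;5.000;10.000;0;
#                    label=load1, value=0.100, warn=5.000, crit=10.000, min=0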
#
# 2. Exit Codes:
#    - 0 = OK
#    - 1 = WARNING
#    - 2 = CRITICAL
#    - 3 = UNKNOWN
#
# 3. Performance Data:
#    - Automatically parsed and included in heartbeat data
#    - Metrics are stored as: {plugin_name}_{metric_name}
#    - Example: check_disk_root_/ holds the value check_disk reports for the
#      label "/" (used space on that filesystem, not a percentage)
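#
#    For illustration, assuming the check_load command configured above emits
#    its usual perfdata labels (load1, load5, load15), the naming pattern
#    would yield these metric names:
#      check_load_load1
#      check_load_load5
#      check_load_load15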
#
# 4. Overall Status:
#    - The plugin reports the worst status from all commands
#    - Useful for quick health checks
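#
#    A hypothetical run, to illustrate (the exit codes are made up):
#      check_disk_root -> 0 (OK)
#      check_load      -> 1 (WARNING)
#      check_ssh       -> 2 (CRITICAL)
#      overall status  -> CRITICAL
#    How UNKNOWN (3) ranks against CRITICAL (2) is not stated here; check the
#    plugin source if that distinction matters to you.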
#
# 5. Plugin Paths:
#    Common Nagios plugin directories:
#    - Debian/Ubuntu: /usr/lib/nagios/plugins/
#    - RHEL/CentOS: /usr/lib64/nagios/plugins/
#    - Custom installs: /usr/local/nagios/libexec/
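#
#    For example, on RHEL/CentOS the root disk check from the commands list
#    would presumably point at the lib64 directory instead (a sketch; verify
#    the path on your system before using it):
#      - name: check_disk_root
#        command: /usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /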
#
# 6. Installing Nagios Plugins:
#    Debian/Ubuntu: sudo apt-get install nagios-plugins
#    RHEL/CentOS:   sudo yum install nagios-plugins-all
#    Arch Linux:    sudo pacman -S monitoring-plugins
#
# 7. Writing Custom Nagios Plugins:
#    Any script can be a Nagios plugin if it:
#    - Returns appropriate exit codes (0-3)
#    - Prints status message to stdout
#    - Optionally includes performance data after "|"
#
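#    Example custom plugin (save as /usr/local/bin/check_example.sh):
#      #!/bin/bash
#      # Count logged-in sessions; `who` prints one line per session.
#      users=$(who | wc -l | tr -d ' ')
#      if [ "$users" -gt 50 ]; then
#          echo "CRITICAL - Too many users: $users | users=$users;40;50;0"
#          exit 2
#      elif [ "$users" -gt 40 ]; then
#          echo "WARNING - High user count: $users | users=$users;40;50;0"
#          exit 1
#      else
#          echo "OK - Normal user count: $users | users=$users;40;50;0"
#          exit 0
#      fi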
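#
#    A sketch of wiring the script in, assuming it was saved at the path
#    above and made executable (chmod +x /usr/local/bin/check_example.sh).
#    The entry name check_example is illustrative; add it under
#    nagios_runner.commands:
#      - name: check_example
#        command: /usr/local/bin/check_example.sh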