# Heartbeat Daemon (hbd)

A lightweight UDP-based host monitoring system. Monitored hosts run a client (`hbc`) that sends periodic heartbeat packets and system metrics to a central server (`hbd`). The server tracks host reachability, evaluates metric thresholds, sends notifications, and serves a web dashboard.

---

## Architecture

```
 [ host running hbc ]                    [ server running hbd ]
┌────────────────────┐               ┌────────────────────────────┐
│  heartbeat client  │   UDP 50003   │  heartbeat daemon          │
│                    │  ──────────>  │                            │
│  plugins:          │   HTB / PLG   │  host state tracking       │
│  - cpu_monitor     │               │  threshold evaluation      │
│  - memory_monitor  │  <──────────  │  DNS updates (nsupdate)    │
│  - disk_monitor    │  ACK/CMD/UPD  │  notifications             │
│  - nagios_runner   │               │  web dashboard + REST API  │
│  - ...             │               │  WebSocket live updates    │
└────────────────────┘               └────────────────────────────┘
```

**Package:** `hbd` v5.3.10
**Python:** 3.11+

### Subpackages

| Package | Purpose |
|---|---|
| `hbd.common` | Protocol encoding/decoding, shared utilities |
| `hbd.server` | The `hbd` daemon |
| `hbd.client` | The `hbc` client |

---

## Installation

Dependencies are declared in `pyproject.toml`. Install into a virtualenv:

```bash
# Server + client
pip install .

# Using the install script
scripts/hb_install.sh
```

**Entry points:**
- `hbd`: server (`hbd.server.cli:main`)
- `hbc`: client (`hbd.client.main:main`)

**Runtime dependencies:**

| Component | Packages |
|---|---|
| Both | PyYAML ≥6.0 |
| Client | psutil ≥5.9.0 |
| Server | aiohttp ≥3.11, websockets ≥13.2, Jinja2 ≥3.1.6, ruamel.yaml ≥0.18, mattermostdriver ≥7.3.0, matrix-nio ≥0.24 |

---

## Server (`hbd`)

### Starting the server

```bash
# Foreground, verbose, with config file
hbd serve -c /etc/hb.yaml -f -v

# As a module
python -m hbd.server.cli serve -c /etc/hb.yaml
```

### CLI subcommands

| Command | Description |
|---|---|
| `hbd serve` | Start the daemon (default) |
| `hbd passwd <username>` | Generate a password hash for config |
| `hbd notify` | Test notification channels |
| `hbd stop` | Stop a running daemon |
| `hbd reload` | Reload config (send SIGHUP) |
| `hbd restart` | Restart daemon |

### Configuration (`~/.hb.yaml`)

```yaml
# Network
hb_port: 50003          # UDP port for heartbeat messages
hbd_port: 50004         # HTTP API / web UI port
hbd_host: ""            # Bind address (empty = all interfaces)
ws_port: 50005          # WebSocket port (plain)
wss_port: ~             # WebSocket port (TLS; requires cert_path/wss_pem/wss_key)

# Timing
interval: 20            # Expected heartbeat interval (seconds)
grace: 2                # Extra seconds before declaring a host overdue

# Persistence
pickfile: ~/.hb.pick    # Host state persistence
pidfile: ~/.hb.pid
logfile: ~/.hb.log

# Message journal
journal_enabled: true
journal_dir: /var/log/heartbeat
journal_file: messages.journal
journal_max_size: 104857600   # 100 MB
journal_max_backups: 10

# DNS
nsupdate_bin: /usr/bin/nsupdate
dyndomains:
  - example.com

# Threshold alert re-notification interval (seconds)
threshold_renotify_interval: 3600

# Notification channels
notification_channels:
  pushover_ops:
    type: pushover
    token: YOUR_APP_TOKEN
    user: YOUR_USER_KEY
  email_ops:
    type: email
    smtp_server: smtp.example.com
    port: 587
    user: alerts@example.com
    password: secret
    recipients: [ops@example.com]

# Users
users:
  alice:
    full_name: Alice Smith
    password: pbkdf2:sha256:...   # generate with: hbd passwd alice
    admin: true
    notification_channels: [pushover_ops]
  bob:
    password: pbkdf2:sha256:...
    notification_channels: [email_ops]

default_owner: alice

# Hosts
hosts:
  webserver01:
    dyndns: true        # Update DNS when address changes
    owner: alice
    managers: [bob]
    monitors: []
  database01:
    watch: false        # Suppress all notifications for this host
```

Send SIGHUP (or `hbd reload`) to reload configuration without restarting. Changes to ports, certificates, pickle path, and journal path require a full restart.

### Persistence

Host state (reachability, plugin data, alert states) is saved to `pickfile` every 5 minutes and on clean shutdown. The server loads this state on startup.

---

## Client (`hbc`)

### Usage

```bash
# Basic: send heartbeats to a server
hbc your-server.example.com

# Multiple servers
hbc server1.example.com server2.example.com

# With config file, running as a daemon
hbc -d -c /etc/hbc.yaml your-server.example.com

# Send a boot message, then heartbeat normally
hbc -b your-server.example.com

# One-off message
hbc -m "maintenance starting" your-server.example.com

# Force IPv4 or IPv6 only
hbc -4 your-server.example.com
hbc -6 your-server.example.com
```

### Options

| Flag | Description |
|---|---|
| `-b`, `--boot` | Send a boot message at startup |
| `-c`, `--config FILE` | Config file path (default: `~/.hbc.yaml`) |
| `-d`, `--daemon` | Daemonize (logs go to syslog) |
| `-m`, `--message TEXT` | Send a one-off message and exit |
| `-n`, `--name NAME` | Override reported hostname |
| `-v`, `--verbose` | Verbose output |
| `-x`, `--debug` | Debug level (repeatable) |
| `-4` / `-6` | Restrict to IPv4 or IPv6 |

### Configuration (`~/.hbc.yaml`)

```yaml
hb_port: 50003      # Server UDP port
interval: 10        # Heartbeat interval (seconds)
owner: alice        # Optional: claim ownership of this host

plugins:
  cpu_monitor:
    interval: 300   # Override collection interval
    per_core: true  # Report per-core CPU usage
  memory_monitor:
    interval: 300
  disk_monitor:
    interval: 300
  network_monitor:
    interval: 300
  ping_monitor:
    interval: 60
    hosts: [8.8.8.8, 192.168.1.1]
  nagios_runner:
    interval: 300
    commands:
      - name: check_load
        command: /usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6
      - name: check_disk_root
        command: /usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /
  zfs_monitor:
    interval: 300
```

### Connection behaviour

- The client sends heartbeats over UDP to each server address resolved from the hostname (IPv4 and IPv6).
- If a connection cannot be opened at startup, it is retried: an IPv6 connection is dropped after 3 consecutive failures, while IPv4 connections retry indefinitely.
- In daemon mode (`-d`), all log output goes to syslog (`LOG_DAEMON` facility).

---

## UDP Protocol

All messages are zlib-compressed key=value pairs with an ID prefix.

```
!<ID>: <zlib-compressed payload>
```

Payload format: `key=value;key=value;...`

| Message | Direction | Purpose |
|---|---|---|
| `HTB` | client → server | Heartbeat (name, timestamp, RTT, acks, interval) |
| `PLG` | client → server | Plugin data (plugin name + metrics) |
| `ACK` | server → client | Acknowledgment |
| `CMD` | server → client | Execute a shell command on the client |
| `UPD` | server → client | Trigger self-update via `hb_install.sh` |

Value encoding:
- Floats: 5 decimal places
- Lists/dicts: JSON prefixed with `@`
- Booleans: `1` / `0`
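
The framing and value rules above can be illustrated with a small Python sketch. It is not the code in `hbd.common`: the function names are hypothetical, the `!<ID>: ` prefix is taken literally from the format line, and the sketch assumes compact JSON and no escaping of `;` or `=` inside values, since this README does not say how those are handled.

```python
import json
import zlib


def encode_value(value):
    """Encode one value using the rules listed in this section."""
    if isinstance(value, bool):            # bool first: it is a subclass of int
        return "1" if value else "0"
    if isinstance(value, float):
        return f"{value:.5f}"              # floats: 5 decimal places
    if isinstance(value, (list, dict)):
        return "@" + json.dumps(value, separators=(",", ":"))
    return str(value)


def encode_message(msg_id, fields):
    """Build `!<ID>: <zlib-compressed payload>` as bytes."""
    payload = ";".join(f"{k}={encode_value(v)}" for k, v in fields.items())
    return b"!" + msg_id.encode("ascii") + b": " + zlib.compress(payload.encode("utf-8"))


def decode_message(datagram):
    """Split the ID prefix from the payload and parse key=value pairs.

    Scalars are returned as strings; only `@`-prefixed values are parsed as JSON.
    """
    header, sep, body = datagram.partition(b": ")
    if not sep or not header.startswith(b"!"):
        raise ValueError("not a framed message")
    fields = {}
    for pair in zlib.decompress(body).decode("utf-8").split(";"):
        if not pair:
            continue
        key, _, raw = pair.partition("=")
        fields[key] = json.loads(raw[1:]) if raw.startswith("@") else raw
    return header[1:].decode("ascii"), fields
```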

RTT is measured using kernel SO_TIMESTAMP when available (Linux, macOS, FreeBSD), falling back to application-layer timing.

---

## Plugin System

Plugins run on the client and collect system metrics that are sent to the server as `PLG` messages.

### Plugin types

| Type | `interval` | When collected |
|---|---|---|
| `InfoPlugin` | 0 | Once at startup; re-collected on server request |
| `MonitorPlugin` | 30 (default) | Periodically on the configured interval |

### Built-in plugins

| Plugin | Type | Data collected |
|---|---|---|
| `os_info` | Info | OS, kernel, distro, architecture, Python version, hbc version |
| `cpu_monitor` | Monitor | cpu_percent, per-core usage, load averages, process count, frequency |
| `memory_monitor` | Monitor | RAM and swap usage (ZFS ARC-aware) |
| `disk_monitor` | Monitor | Per-partition usage, disk I/O stats |
| `network_monitor` | Monitor | Per-interface byte/packet counts, connection count |
| `ping_monitor` | Monitor | RTT, packet loss, jitter per configured host |
| `filesystem_info` | Info | Mounted filesystems (excludes pseudo filesystems) |
| `nagios_runner` | Monitor | Output of configured Nagios-compatible check commands |
| `zfs_monitor` | Monitor | ZFS pool health, capacity, fragmentation, dedup ratio, I/O |

### Custom plugins

Create a `.py` file in `hbd/client/plugins/`:

```python
from hbd.client.plugin import MonitorPlugin


class MyPlugin(MonitorPlugin):
    name = "my_plugin"
    interval = 60

    async def collect(self):
        return {"my_metric": 42}
```

`initialize()` is called once at load time; return `False` to disable the plugin (e.g., if a required binary is missing).
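
As an illustration of the `initialize()` hook, here is a sketch of a plugin that disables itself when a binary is not on `PATH`. The plugin name, the `binary` attribute, and the choice of `smartctl` are hypothetical, and the sketch assumes `initialize()` is a plain (non-async) method taking no arguments; check the base class in `hbd/client/plugin.py` before relying on that. The fallback class exists only so the snippet runs without `hbd` installed.

```python
import shutil

try:
    from hbd.client.plugin import MonitorPlugin
except ImportError:
    # Stand-in for running the sketch outside the project; not the real base class.
    class MonitorPlugin:
        name = ""
        interval = 30


class BinaryCheckPlugin(MonitorPlugin):
    name = "binary_check"      # hypothetical plugin name
    interval = 600
    binary = "smartctl"        # hypothetical required tool
    path = None

    def initialize(self):
        # Assumption: called once at load time; returning False disables the plugin.
        self.path = shutil.which(self.binary)
        return self.path is not None

    async def collect(self):
        return {"binary_path": self.path}
```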

### Nagios integration

The `nagios_runner` plugin executes any Nagios-compatible check binary:

```yaml
plugins:
  nagios_runner:
    commands:
      - name: check_http
        command: /usr/lib/nagios/plugins/check_http -H example.com
```

- Commands are validated (absolute paths, executable) at startup.
- Exit codes map to OK / WARNING / CRITICAL / UNKNOWN.
- Performance data fields are extracted and stored individually.
- The `nagios` threshold operator maps exit codes directly to alert levels (see Threshold Alerting).

---

## Threshold Alerting

The server evaluates plugin metrics against configurable thresholds and fires notifications on state changes.

### Configuration

```yaml
thresholds:
  cpu_monitor:
    cpu_percent:
      warning: 80.0
      critical: 90.0
      operator: ">"          # >, >=, <, <=, ==, != (default: >)
      hysteresis: 0.1        # 10%: recover at 81 when critical=90
      count: 1               # Require N consecutive breaches before alerting
      display: "CPU {cpu_percent}% (threshold: {op_symbol}{threshold_value})"

  memory_monitor:
    percent:
      warning: 85.0
      critical: 95.0

  disk_monitor:
    partitions:
      /:
        percent:
          warning: 80.0
          critical: 90.0
        free_gb:
          warning: 10.0
          critical: 5.0
          operator: "<"

  nagios_runner:
    status_code:
      operator: "nagios"     # 0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN
      display: "{check_name}: {output}"
```
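
To make the interaction of `operator`, `hysteresis`, and `count` concrete, here is a minimal sketch of one way such an evaluator could work. It is not the server's implementation: the class and function names are hypothetical, the recovery point is derived from the comment above (`critical=90`, `hysteresis=0.1` recovers at 81), the mirrored rule for `<` operators is an assumption, and the `nagios` operator is not covered.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}
RANK = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def recovery_point(threshold, op, hysteresis):
    """Value the metric must clear before an alert level is dropped."""
    if op in (">", ">="):
        return threshold * (1 - hysteresis)   # e.g. 90 * 0.9 = 81
    if op in ("<", "<="):
        return threshold * (1 + hysteresis)   # assumed mirror image
    return threshold


class ThresholdState:
    def __init__(self, warning, critical, op=">", hysteresis=0.0, count=1):
        self.warning, self.critical = warning, critical
        self.op, self.hysteresis, self.count = op, hysteresis, count
        self.level = "OK"
        self.breaches = 0

    def _raw_level(self, value):
        cmp = OPS[self.op]
        if cmp(value, self.critical):
            return "CRITICAL"
        if cmp(value, self.warning):
            return "WARNING"
        return "OK"

    def update(self, value):
        raw = self._raw_level(value)
        if RANK[raw] > RANK[self.level]:
            # Escalate only after `count` consecutive breaching samples.
            self.breaches += 1
            if self.breaches >= self.count:
                self.level, self.breaches = raw, 0
            return self.level
        self.breaches = 0
        if RANK[raw] < RANK[self.level]:
            # De-escalate only once the value clears the recovery point.
            limit = self.critical if self.level == "CRITICAL" else self.warning
            if not OPS[self.op](value, recovery_point(limit, self.op, self.hysteresis)):
                self.level = raw
        return self.level
```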

### Per-host threshold profiles

Named profiles let different hosts use different thresholds. A single name or a list is accepted; lists are applied left-to-right.

```yaml
threshold_configs:
  default:
    thresholds:
      cpu_monitor:
        cpu_percent: {warning: 80, critical: 90}

  tight_cpu:
    thresholds:
      cpu_monitor:
        cpu_percent: {warning: 60, critical: 75}

hosts:
  web-01:
    threshold_config: default
  db-01:
    threshold_config: [default, tight_cpu]
```
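
One plausible reading of "applied left-to-right" is that each profile in the list is merged over the result of the previous ones, so later profiles win on conflicting keys. The sketch below shows that reading; the helper names are hypothetical, and whether the server merges per key (as here) or replaces whole plugin sections is an assumption.

```python
import copy


def deep_merge(base, override):
    """Return a new dict with `override` merged over `base`, key by key."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def resolve_thresholds(threshold_configs, selection):
    """Accept a single profile name or a list; apply profiles left-to-right."""
    names = [selection] if isinstance(selection, str) else list(selection)
    merged = {}
    for name in names:
        merged = deep_merge(merged, threshold_configs[name]["thresholds"])
    return merged
```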

### Alert states

| State | Meaning |
|---|---|
| OK | Metric within normal range |
| WARNING | Metric crossed warning threshold |
| CRITICAL | Metric crossed critical threshold |
| UNKNOWN | Cannot determine (e.g. Nagios exit code 3) |

Notifications are sent on state transitions (OK → WARNING, WARNING → CRITICAL, CRITICAL → OK). De-escalations (CRITICAL → WARNING) do not trigger a notification. Ongoing alerts generate a re-notification every `threshold_renotify_interval` seconds (default: 3600). Alerts can be acknowledged via the web UI or API to suppress re-notifications.
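
The notification rules in the paragraph above can be condensed into a single decision function. This is a sketch under assumptions, not the server's code: the function name and signature are hypothetical, any return to OK is treated as a recovery, and the UNKNOWN state is left out because its handling is not described here.

```python
RANK = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def should_notify(old, new, last_sent, now, renotify_interval=3600, acknowledged=False):
    """Decide whether a notification is due for a metric's alert state.

    old/new are "OK", "WARNING", or "CRITICAL"; last_sent/now are in seconds.
    """
    if new != old:
        if new == "OK":
            return True                    # recovery
        return RANK[new] > RANK[old]       # escalations notify, de-escalations do not
    if new == "OK" or acknowledged:
        return False                       # nothing ongoing, or re-notification suppressed
    return now - last_sent >= renotify_interval
```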

### RTT thresholds

The server measures heartbeat round-trip time and supports RTT thresholds using the same format:

```yaml
thresholds:
  rtt:
    webserver01:
      warning: 100.0    # ms
      critical: 500.0
```

### Generic threshold matching

When a metric has no exact threshold entry, the server strips leading segments and retries. This allows one entry to cover all Nagios checks:

```
nagios_runner.check_disk_root_status_code   → no match
nagios_runner.disk_root_status_code         → no match
nagios_runner.root_status_code              → no match
nagios_runner.status_code                   → matched ✓
```

The stripped prefix (`check_disk_root`) is available as `{check_name}` in the `display` template.
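
The lookup shown in the trace above can be sketched as follows. The function name is hypothetical, and the sketch assumes that "segments" are the underscore-separated parts of the metric name, which is what the example trace suggests.

```python
def match_threshold(plugin_thresholds, metric_name):
    """Find a threshold entry by dropping leading underscore-separated segments.

    Returns (matched_key, check_name, entry), or None when nothing matches.
    """
    parts = metric_name.split("_")
    for i in range(len(parts)):
        candidate = "_".join(parts[i:])
        if candidate in plugin_thresholds:
            check_name = "_".join(parts[:i])   # the stripped prefix
            return candidate, check_name, plugin_thresholds[candidate]
    return None
```

With `{"status_code": {"operator": "nagios"}}` as the `nagios_runner` thresholds, the metric `check_disk_root_status_code` resolves to the `status_code` entry with `check_name` set to `check_disk_root`.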

### Display template variables

| Variable | Description |
|---|---|
| `{value}` | Current metric value |
| `{threshold_value}` | Threshold that was crossed |
| `{op_symbol}` | Comparison operator |
| `{check_name}` | Prefix stripped by generic matching |
| `{metric_name}` | Full field name |
| `{output}` | Nagios check output text |
| `{status}` | Nagios status name (OK/WARNING/CRITICAL/UNKNOWN) |
| any plugin field | Any field present in the plugin's data |

---

## Notification Channels

Notifications are dispatched to the host's owner, managers, and monitors. Each user specifies which channels to use.

### Supported channel types

| Type | Required fields |
|---|---|
| `pushover` | `token`, `user` |
| `email` | `smtp_server`, `recipients`, `sender`, `user`, `password`, `port` |
| `mattermost` | `webhook_url`, `channel` |
| `matrix` | `homeserver`, `user`, `password`, `room_id` |
| `signal` | `phone_number`, `recipient` |
| `sms_voipms` | `api_key`, `recipient` |

Each channel can set a `min_level` (`WARNING` or `CRITICAL`) to filter low-severity alerts.

Recovery notifications are only sent to channels that received the original alert.
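
The `min_level` filter and the recovery rule can be pictured with a small routing sketch. The class and function names are hypothetical, and treating a channel without `min_level` as accepting `WARNING` and above is an assumption; the real dispatcher may differ.

```python
RANK = {"WARNING": 1, "CRITICAL": 2}


def channels_for_alert(channels, level):
    """Names of channels whose min_level admits an alert of the given level."""
    return [name for name, cfg in channels.items()
            if RANK[level] >= RANK[cfg.get("min_level", "WARNING")]]


class AlertRouter:
    def __init__(self, channels):
        self.channels = channels
        self.alerted = {}          # alert key -> set of channel names already notified

    def route(self, alert_key, level):
        """Return the channels to notify for this alert key and level."""
        if level == "OK":
            # Recovery: only channels that received the original alert.
            return sorted(self.alerted.pop(alert_key, set()))
        targets = channels_for_alert(self.channels, level)
        self.alerted.setdefault(alert_key, set()).update(targets)
        return targets
```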

---

## Web Dashboard & HTTP API

The server exposes a web UI and REST API on `hbd_port` (default 50004).

### Web pages

| Path | Description |
|---|---|
| `/login` | Login form (shown automatically when auth is configured) |
| `/live` | Real-time host connectivity, RTT, and message stream |
| `/plugins/<host>` | Per-host plugin metrics |
| `/alerts` | Active alerts with severity filtering |
| `/settings` | Server config, users, notification channels, thresholds |

Live views use WebSocket connections for real-time updates.

Non-admin users see only hosts where they have a role (monitor, manager, or owner). Admins see all hosts.

### REST API

All endpoints are under `/api/0/`. When authentication is configured, include a session token:

```bash
# Log in, get a token
TOKEN=$(curl -s -X POST http://localhost:50004/api/0/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"alice","password":"secret"}' | jq -r .token)

# Use the token
curl -H "Authorization: Bearer $TOKEN" http://localhost:50004/api/0/hosts
```
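
The same two calls can be made from Python using only the standard library. This is a sketch mirroring the `curl` example: the helper names are hypothetical, and the only response field relied on is `token`, as used by the `jq` filter above.

```python
import json
import urllib.request

BASE = "http://localhost:50004/api/0"


def login_request(username, password, base=BASE):
    """Build the POST request that creates a session."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authed_request(path, token, base=BASE):
    """Build a GET request carrying the session token."""
    return urllib.request.Request(
        f"{base}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_json(request, timeout=10):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Usage against a running server:
#   token = fetch_json(login_request("alice", "secret"))["token"]
#   hosts = fetch_json(authed_request("/hosts", token))
```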

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/0/hosts` | All visible hosts |
| GET | `/api/0/alerts` | All active alerts |
| GET | `/api/0/alert_summary` | Count of ok/warning/critical |
| GET | `/api/0/messages` | Last 30 messages |
| GET | `/api/0/hosts/{host}/plugins` | All plugin data for host |
| GET | `/api/0/hosts/{host}/plugins/{plugin}?limit=N` | Plugin samples |
| GET | `/api/0/hosts/{host}/alerts` | Alert states for host |
| GET | `/api/0/hosts/{host}/access` | Access roles |
| PUT | `/api/0/hosts/{host}/access` | Update access roles |
| GET | `/api/0/hosts/{host}/info` | Host info (hbc version, thresholds) |
| POST | `/api/0/alerts/acknowledge` | Acknowledge alert |
| GET | `/api/0/users` | All users (admin only) |
| GET | `/api/0/users/me` | Current user profile |
| PUT | `/api/0/users/me` | Update own profile |
| POST | `/api/0/auth/login` | Create session |
| POST | `/api/0/auth/logout` | Destroy session |
| GET | `/api/0/config` | Server config (secrets redacted) |
| POST | `/api/0/config` | Update config |
| GET | `/api/0/config/backups` | List config backups |
| POST | `/api/0/config/rollback` | Roll back to previous config |
| GET | `/api/0/notification_channels` | List channels |
| POST | `/api/0/notification_channels` | Create channel |
| PUT | `/api/0/notification_channels/{name}` | Update channel |
| DELETE | `/api/0/notification_channels/{name}` | Delete channel |

---

## User Management & Authentication

When the config has no `users:` block, the server runs unauthenticated and all existing behaviour is preserved.

### Roles

| Role | Capabilities |
|---|---|
| monitor | View status, plugin data, alerts |
| manager | monitor + queue commands, trigger DNS, queue upgrades |
| owner | manager + drop host, transfer ownership, update access |
| admin | Owner-level on all hosts + access to server config and users |

### Setup

```yaml
users:
  alice:
    full_name: Alice Smith
    password: pbkdf2:sha256:...   # hbd passwd alice
    admin: true
    notification_channels: [pushover_ops]

default_owner: alice              # Owns any host with no explicit owner

hosts:
  webserver01:
    owner: alice
    managers: [bob]
    monitors: [carol]
```

Password hashing uses PBKDF2-HMAC-SHA256 (260,000 iterations). Sessions expire after 24 hours.
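
For readers unfamiliar with PBKDF2, the derivation itself looks roughly like this with Python's standard library. This is only a sketch of the primitive: the helper names and the 16-byte salt are assumptions, and it does not reproduce how `hbd passwd` serialises the salt and digest into the `pbkdf2:sha256:...` string, so use that command to generate real config values.

```python
import hashlib
import hmac
import os

ITERATIONS = 260_000   # iteration count stated above


def hash_password(password, salt=None):
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest) as bytes."""
    if salt is None:
        salt = os.urandom(16)          # assumed salt length
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```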

OAuth2 login (Gitea) is supported:

```yaml
oauth:
  gitea:
    url: https://git.example.com
    client_id: xxx
    client_secret: yyy
```

---

## Dynamic DNS

When `dyndns: true` is set on a host and `dyndomains` is configured, the server updates DNS via `nsupdate` whenever the host's source address changes.

```yaml
nsupdate_bin: /usr/bin/nsupdate
dyndomains:
  - example.com

hosts:
  webserver01:
    dyndns: true
```

DNS updates run asynchronously in a background worker.

---

## Message Journal

All received messages are logged in JSONL format with automatic size-based rotation.

```yaml
journal_enabled: true
journal_dir: /var/log/heartbeat
journal_file: messages.journal
journal_max_size: 104857600   # 100 MB
journal_max_backups: 10
```

Example entry:

```json
{"timestamp":1774701296.123,"datetime":"2026-03-28T12:34:56","source_ip":"192.168.1.100","source_port":50003,"message":{"ID":"HTB","name":"webserver01","interval":10}}
```
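
Because each journal line is a standalone JSON object, the file can be processed with a few lines of Python. The sketch below counts heartbeats per host; the function names are hypothetical and the field names are taken from the example entry above.

```python
import json
from collections import Counter


def read_journal(lines):
    """Yield one parsed entry per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_host(entries, msg_id="HTB"):
    """Count messages of the given ID per reported host name."""
    return Counter(
        entry["message"].get("name")
        for entry in entries
        if entry["message"].get("ID") == msg_id
    )


# Usage with the default journal location from the config above:
#   with open("/var/log/heartbeat/messages.journal") as fh:
#       print(count_by_host(read_journal(fh)))
```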

---

## `hbc_mini` — Zero-dependency client

`scripts/hbc_mini.py` is a single-file client requiring only Python 3.8+ and no external packages. Copy it to any host and run directly.

```bash
python3 hbc_mini.py your-server.example.com
python3 hbc_mini.py -d your-server.example.com   # daemon mode
python3 hbc_mini.py -b your-server.example.com   # send boot message
```

Config: `~/.hbc.json` (JSON format, same keys as `~/.hbc.yaml`).

**Available plugins:**

| Plugin | Platform |
|---|---|
| `os_info` | All |
| `ping_monitor` | All |
| `nagios_runner` | All (not Windows) |
| `cpu_monitor` | Linux (`/proc/stat`; no per-core, no frequency) |
| `memory_monitor` | Linux (`/proc/meminfo`) |
| `disk_monitor` | Linux, macOS, BSD (`df -P`) |
| `network_monitor` | Linux (`/proc/net/dev`) |

Compared with the full `hbc`, `hbc_mini.py` lacks YAML config, the `filesystem_info` and `zfs_monitor` plugins, and IPv6 early-fail protection.

---

## `hbc_mini.c` — C client

`scripts/c/hbc_mini.c` is a single-file C port of `hbc_mini.py`. It has no runtime dependencies beyond libc, zlib, pthreads, and libm, and runs on Linux, FreeBSD, NetBSD, and DragonFly BSD.

### Build

```bash
cc -O2 -o hbc_mini scripts/c/hbc_mini.c -lz -lpthread -lm
```

### Usage

The CLI is identical to `hbc_mini.py`:

```bash
./hbc_mini your-server.example.com
./hbc_mini -d your-server.example.com          # daemon mode (logs to syslog)
./hbc_mini -b your-server.example.com          # send boot message
./hbc_mini -m "note" your-server.example.com   # send one-shot message
./hbc_mini -4 your-server.example.com          # IPv4 only
./hbc_mini -6 your-server.example.com          # IPv6 only
```

Config: `~/.hbc.json` (JSON, same keys as the Python version).

### Architecture

The C client uses two threads:

- **Main thread**: heartbeat sender loop + `select()`-based receive loop (1 s timeout). Sends `HTB` at the configured interval, receives `ACK`/`CMD` messages, and re-sends `os_info` on server request.
- **Monitor thread**: all periodic plugins in a single thread with a 1-second sleep loop. Each plugin has its own next-run timestamp tracked independently.

SIGHUP causes the process to restart itself via `execv()`. SIGTERM/SIGINT trigger a clean shutdown (sends a shutdown heartbeat if `-b` was used).

### Available plugins

| Plugin | Platform | Data source |
|---|---|---|
| `os_info` | Linux, FreeBSD, NetBSD, DragonFly | `uname(2)`, `/etc/os-release`, `kern.osrelease` sysctl |
| `cpu_monitor` | Linux | `/proc/stat` |
| `cpu_monitor` | FreeBSD, DragonFly, NetBSD | `kern.cp_time` sysctl |
| `memory_monitor` | Linux | `/proc/meminfo` (ZFS ARC-aware) |
| `memory_monitor` | FreeBSD, DragonFly | `vm.stats.vm.*` sysctl |
| `memory_monitor` | NetBSD | `VM_UVMEXP` sysctl |
| `disk_monitor` | All | `df -P` subprocess |
| `network_monitor` | Linux | `/proc/net/dev` |
| `network_monitor` | FreeBSD, NetBSD, DragonFly | `getifaddrs()` + `AF_LINK` |
| `ping_monitor` | All | `ping` subprocess |
| `nagios_runner` | All | `popen()` subprocess |

`cpu_monitor` reports: `cpu_percent`, `cpu_user`, `cpu_system`, `cpu_idle`, `cpu_iowait` (Linux only), load averages, `cpu_core_count`, `uptime_seconds`.

`memory_monitor` reports: `memory_total`, `memory_used`, `memory_available`, `memory_free`, `memory_percent`, and swap fields when swap is present.

`network_monitor` reports per-interface cumulative `bytes_recv`/`bytes_sent` and interval deltas. The loopback interface (`lo`) is skipped by default; this is configurable:

```json
{
  "plugins": {
    "network_monitor": {
      "skip_interfaces": ["lo", "docker0"]
    }
  }
}
```

`disk_monitor` reports per-mount `total`, `used`, `free`, `percent`. An optional mount filter restricts reporting to specific paths:

```json
{
  "plugins": {
    "disk_monitor": {
      "mounts": ["/", "/data"]
    }
  }
}
```

### Differences from `hbc_mini.py`

- No `filesystem_info` or `zfs_monitor` plugins
- `UPD` (self-update) messages are logged but not acted on
- No IPv6 early-fail protection
- Config is JSON only (`~/.hbc.json`), no YAML

---

## Development

### Running tests

```bash
PYTHONPATH=. python -m unittest discover -v
# or
pytest -q
```

### Linting and type checking

```bash
tox -e lint
tox -e mypy
```

### Debugging in VS Code

A `.vscode/launch.json` is included with configurations for running and attaching the debugger. Select the project `.venv` as the Python interpreter, then use F5.

To start with debugpy and wait for attach:

```bash
PYTHONPATH=. python -m debugpy --listen 5678 --wait-for-client -m hbd.server.cli serve -c .hb.yaml -f -v
```

---

## License

MIT. See `LICENSE` for details.