error_received() no longer sets _dead=True; it just closes the transport
so the existing retry loop in heartbeat_sender (hbc) and sendto (hbc_mini)
reopens the connection on the next interval. This allows hbc to recover
when it starts before network connectivity is established.
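
A minimal sketch of the close-and-reopen behaviour, assuming an asyncio
datagram protocol. HeartbeatConnection, open() and send() are illustrative
names, not the actual hbc code:

```python
import asyncio


class HeartbeatConnection(asyncio.DatagramProtocol):
    """Hypothetical stand-in for one hbc server connection."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.transport = None

    def error_received(self, exc):
        # Close the transport instead of marking the connection dead,
        # so the next send attempt can reopen it.
        if self.transport is not None:
            self.transport.close()
            self.transport = None

    async def open(self):
        loop = asyncio.get_running_loop()
        transport, _ = await loop.create_datagram_endpoint(
            lambda: self, remote_addr=(self.host, self.port)
        )
        self.transport = transport

    async def send(self, payload: bytes) -> bool:
        # Reopen lazily; a failed open is retried on the next interval.
        if self.transport is None:
            try:
                await self.open()
            except OSError:
                return False
        self.transport.sendto(payload)
        return True
```

A sender loop that calls send() once per interval then recovers on its own
once the network comes up.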
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- nagios_runner: remove overall_status/overall_status_code/plugin_count fields;
each command still reports its own <name>_status and <name>_status_code
- threshold: expose {output} and {status} aliases in display templates for
nagios_runner generic matches (mapped from <check_name>_output/status)
- alerts.html: fix scrolling by overriding html,body height/overflow (style.css
sets both); make hostname a link to /plugins/<hostname>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
memory_monitor / hbc_mini: the ZFS ARC is reclaimable, but the Linux kernel
does not count it in MemAvailable (it is not part of SReclaimable). Read the
ARC size from /proc/spl/kstat/zfs/arcstats and add it to available memory
before computing memory_percent and memory_used. No-op on systems without ZFS.
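
The adjustment could look roughly like this. The function names and the
clamp to total memory are assumptions made for the sketch; only the
arcstats path and the "size" row come from the kernel interface:

```python
def read_arc_size_bytes(path="/proc/spl/kstat/zfs/arcstats"):
    """Return the ARC size in bytes, or 0 when ZFS is not present."""
    try:
        with open(path) as fh:
            for line in fh:
                fields = line.split()
                # Data rows have the form: <name> <type> <value>
                if len(fields) == 3 and fields[0] == "size":
                    return int(fields[2])
    except (OSError, ValueError):
        pass
    return 0


def memory_usage(total_bytes, available_bytes, arc_bytes):
    """Return (memory_used, memory_percent) with the ARC counted as available."""
    # Assumption: never report more available memory than the total.
    available = min(total_bytes, available_bytes + arc_bytes)
    used = total_bytes - available
    percent = 100.0 * used / total_bytes if total_bytes else 0.0
    return used, percent
```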
cpu_monitor: report uptime_seconds via psutil.boot_time() (full client)
and /proc/uptime (hbc_mini).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace break-after-first-iteration with next(c for c in connections if
c.transport) so the message goes to the first connection that actually
has an open transport. Falls back to connections[0] if none are open
yet (sendto will attempt reopen), avoiding silent message loss when the
leading connection is still connecting.
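
A sketch of the selection, assuming a non-empty connections list whose items
expose a transport attribute (pick_connection is an illustrative name):

```python
def pick_connection(connections):
    """Return the first connection with an open transport.

    Falls back to connections[0] when none is open yet; the caller's
    sendto is then expected to attempt a reopen.
    """
    return next((c for c in connections if c.transport), connections[0])
```

Passing the fallback as the default argument of next() avoids a
StopIteration when no transport is open.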
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Settings page: pass threshold_checker to http.start so the Threshold
Configurations section has data. Use threshold_checker's already-parsed
ThresholdConfig objects instead of re-parsing the raw nested YAML.
Named (non-default) configs now display only their explicit overrides
via threshold_raw_configs, not the full merged set with defaults.
hbc/hbc_mini: send boot and shutdown messages on first connection only
to avoid duplicate packets when multiple servers are configured.
Replace print("Daemonizing...") with logging.info so output goes to
syslog in daemon mode.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
IPv4 connections are retried forever in heartbeat_sender if open() fails,
so a temporary network outage does not terminate the sender.
IPv6 connections that have never opened successfully are dropped after
IPV6_EARLY_FAIL_LIMIT (3) consecutive failures so that a network without
IPv6 support does not keep a dead sender running.
At startup all resolved connections are added to the list regardless of
whether the initial open() succeeds; the heartbeat_sender loop handles
the first real connection attempt.
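
The retry policy can be summarised in a few lines. ConnState and
record_open_result are hypothetical names; only IPV6_EARLY_FAIL_LIMIT and
its value come from this change:

```python
import socket
from dataclasses import dataclass

IPV6_EARLY_FAIL_LIMIT = 3


@dataclass
class ConnState:
    family: int                 # socket.AF_INET or socket.AF_INET6
    ever_opened: bool = False   # True once open() has succeeded at least once
    failures: int = 0           # consecutive open() failures


def record_open_result(conn, ok):
    """Update counters after one open() attempt; return True to keep retrying."""
    if ok:
        conn.ever_opened = True
        conn.failures = 0
        return True
    conn.failures += 1
    give_up = (
        conn.family == socket.AF_INET6
        and not conn.ever_opened
        and conn.failures >= IPV6_EARLY_FAIL_LIMIT
    )
    return not give_up
```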
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add a per-host watch setting (default: true); watch: false suppresses all
notifications for that host in udp.py and threshold.py
- Live Dashboard and Host Overview now show only hosts where the logged-in
user is owner or manager (admins see all); WebSocket broadcasts filtered
per-connection by the same rule
- Add hbd/client/plugins/zfs_monitor.py: collects per-pool health, capacity,
fragmentation, dedup ratio, and cumulative I/O ops/bandwidth via zpool(8)
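
The visibility rule for the dashboard and WebSocket broadcasts, as a sketch.
The dict keys (is_admin, name, owner, managers) are assumptions, not the
actual data model:

```python
def can_see_host(user, host):
    """Admins see every host; other users only hosts they own or manage."""
    if user.get("is_admin"):
        return True
    name = user["name"]
    return name == host.get("owner") or name in host.get("managers", ())


def visible_hosts(user, hosts):
    """Filter a host list down to what this user may see."""
    return [h for h in hosts if can_see_host(user, h)]
```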
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Sets dorestart and triggers a clean shutdown; os.execv re-execs
the process with the original arguments after cleanup.
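
A sketch of the re-exec step, assuming the process was started through the
Python interpreter. build_exec_args and finish are illustrative helpers:

```python
import os
import sys


def build_exec_args(argv=None):
    """Return (path, args) that re-run the interpreter with the original argv."""
    argv = list(sys.argv if argv is None else argv)
    return sys.executable, [sys.executable] + argv


def finish(cleanup, restart):
    """Run cleanup, then replace the process image if a restart was requested."""
    cleanup()
    if restart:
        path, args = build_exec_args()
        os.execv(path, args)  # does not return on success
```

Because os.execv replaces the current process, cleanup has to run before the
call rather than after it.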
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- scripts/hbc_mini.py: self-contained hbc with no external deps; uses
/proc for CPU/memory/network on Linux, df for disk, JSON config
- hbc + hbc_mini: mark connection _dead and stop sending on protocol error
- README: document hbc_mini usage, config, and plugin availability
- pyproject.toml: include hbc_mini.py in script-files
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Server now sends a bare UPD command; client runs hb_install.sh to
reinstall from the package registry, then restarts. hb_install.sh
also copies itself alongside hbc on client installs.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
After daemonize() redirects stderr to /dev/null, the existing StreamHandler
writes to /dev/null. logging.basicConfig() is a no-op when handlers are
already configured, so log messages are silently lost.
Rewrite the daemon block to:
1. Call daemonize() first
2. Explicitly remove the existing handlers (which now point to /dev/null)
3. Add a SysLogHandler on /dev/log, falling back to UDP localhost:514
4. Log a startup message through the new syslog handler
Also remove the syslog.openlog() call, which is no longer needed.
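
The four steps above could be sketched as follows. The function name, its
parameters, and the message format are assumptions; daemonize is passed in
as a callable so the sketch stays self-contained:

```python
import logging
import logging.handlers
import os


def configure_syslog_logging(daemonize, socket_path="/dev/log",
                             udp_address=("localhost", 514)):
    """Daemonize, then route root-logger output to syslog."""
    daemonize()  # 1. detach first; stderr now points to /dev/null

    root = logging.getLogger()
    for old in list(root.handlers):  # 2. drop handlers bound to the old stderr
        root.removeHandler(old)
        old.close()

    # 3. Prefer the local syslog socket; otherwise fall back to UDP.
    if os.path.exists(socket_path):
        handler = logging.handlers.SysLogHandler(address=socket_path)
    else:
        handler = logging.handlers.SysLogHandler(address=udp_address)
    handler.setFormatter(logging.Formatter("hbc: %(levelname)s %(message)s"))
    root.addHandler(handler)
    root.setLevel(logging.INFO)

    logging.info("daemon started")  # 4. first message through the new handler
    return handler
```

The sketch checks for the socket path explicitly because SysLogHandler does
not raise when a Unix socket cannot be connected at construction time.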
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When NagiosRunnerPlugin has no commands configured, set skip_reason before
returning False from initialize(). This lets PluginLoader log the skip at
INFO rather than WARNING.
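
A sketch of the interaction, with simplified stand-ins for both classes.
NagiosRunnerSketch and load_plugin are hypothetical; only skip_reason and
initialize() are taken from this change:

```python
import logging


class NagiosRunnerSketch:
    """Simplified stand-in for NagiosRunnerPlugin."""

    def __init__(self, commands):
        self.commands = commands
        self.skip_reason = None

    def initialize(self):
        if not self.commands:
            # An intentional skip, not a failure: say why before returning.
            self.skip_reason = "no commands configured"
            return False
        return True


def load_plugin(plugin, log):
    """Initialize a plugin; log skips at INFO and real failures at WARNING."""
    if plugin.initialize():
        return True
    reason = getattr(plugin, "skip_reason", None)
    if reason:
        log.info("plugin skipped: %s", reason)
    else:
        log.warning("plugin failed to initialize")
    return False
```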
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CLIENT_DEFAULTS seeds "plugins": {}, so raw_config.get("plugins", raw_config)
always returned the empty subdict instead of falling back to the full config.
Plugins configured at top level (e.g. nagios_runner: ...) were therefore
never found, resulting in "No Nagios commands configured".
The lookup now checks the plugins subdict first, then top-level keys, so both
config layouts work.
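
The lookup order, as a minimal sketch (plugin_config is an illustrative name
and the sample configs are made up):

```python
def plugin_config(raw_config, name):
    """Find a plugin's config under "plugins" first, then at top level."""
    plugins = raw_config.get("plugins") or {}
    if name in plugins:
        return plugins[name]
    return raw_config.get(name, {})


# Both layouts resolve to the same plugin settings.
nested = {"plugins": {"nagios_runner": {"commands": ["check_load"]}}}
top_level = {"plugins": {}, "nagios_runner": {"commands": ["check_load"]}}
assert plugin_config(nested, "nagios_runner") == plugin_config(top_level, "nagios_runner")
```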
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Restructure the project directory into client and server components
- Rename modules and classes to better reflect their purpose
- Move common utilities and configuration to a shared location
- Update import statements to match the new structure
- Add new documentation files
- Remove deprecated and unused code
- Preserve all existing functionality through the refactoring