- fix: matrix/sms_voipms notifications blocked the event loop on timeout;
make send_notification async, dispatch all channel drivers as non-blocking
tasks (asyncio.to_thread for sync drivers, asyncio.wait_for for async);
update all call sites to fire-and-forget via create_task
- feat: add /about page with version, runtime, uptime counter, and repo link
- fix: hbc_mini plugin data format now matches full hbc client so Host
Overview displays memory, disk, and network metrics correctly
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Server now sends a bare UPD command; client runs hb_install.sh to
reinstall from the package registry, then restarts. hb_install.sh
also copies itself alongside hbc on client installs.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Replace pill-tab plugin view with an accordion layout that shows key
metrics (CPU%, MEM%, top disk%, net delta, nagios status) at a glance
in each host card header. Plugin sections expand as structured tables.
- Rename page to "Host Overview" (URL /plugins unchanged)
- Three-wave parallel data loading: glance plugins on host expand,
on-demand fetch for filesystem_info and extras
- Per-plugin table renderers with inline percent bars and threshold
colour coding
- Add escHtml() for XSS-safe rendering of all field values
- Remove stale planning docs (REFACTORING.md, hbd/Plan.md)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Threshold alerts (plugin metrics, RTT) were firing immediately on the
first breach. Now every state transition to WARNING/CRITICAL starts a
grace-period timer (grace_seconds from the 'grace' config key). The
notification is deferred until the next heartbeat after grace_seconds
have elapsed. If the metric recovers within the grace window, both the
alert and the recovery are suppressed — no spurious pages for transient
spikes.
Two helper methods added to ThresholdChecker:
- _apply_grace: handles the state-change path (defer or suppress)
- _check_pending_or_renotify: handles the stable-alert path (fire
deferred notification once grace expires, or fall through to reminders)
The overdue case is unchanged — on_overdue already fires only after
interval+grace seconds of silence, which is equivalent behaviour.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
threshold.py was emitting level="RECOVERED" for metric recoveries, which
failed the is_recover check in send_notification (which only matched "RECOVER"),
bypassing _alerted_channels routing and the min_level bypass added in the
previous commit. Changed to "RECOVER" so all recovery paths are consistent.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- AlertState.update() now resets last_notification when the alert level
changes, so a WARNING→CRITICAL escalation restarts the reminder interval
rather than inheriting a nearly-expired timer.
- _dispatch_to_channel() bypasses min_level for RECOVER, so recovery
notifications are delivered even after a server restart when
_alerted_channels is empty and the fallback dispatch path is used.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Restructuring of the project directory into client and server components
- Renaming of modules and classes to better reflect their purpose and functionality
- Moving common utilities and configurations to a shared location
- Updating import statements to reflect the new structure
- Adding new documentation files for better clarity on various aspects of the project
- Removing deprecated or unused code to streamline the codebase
- Ensuring that all existing functionality is preserved and that the codebase remains functional after the refactoring.