heartbeat

Public Access

Author	SHA1	Message	Date
Andreas Wrede	ae60844a8a	feat: link hostnames in Live Dashboard to Host Overview Hostnames in the live dashboard table are now links to /plugins#hostname, which expands and scrolls to that host's card in the Host Overview page. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 14:37:08 -04:00
Andreas Wrede	49fa310361	feat: add Threshold Configurations section to settings page Reads threshold_configs (or legacy thresholds) from config and renders per-named-config tables showing metric path, operator, warning/critical values, hysteresis, and count. Disabled entries are dimmed. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 14:30:31 -04:00
Andreas Wrede	28e2180f7b	fix: suppress notifications on alert de-escalation (e.g. CRITICAL→WARNING) Only notify on worsening transitions (OK→WARNING, OK→CRITICAL, WARNING→CRITICAL) and recovery (any→OK). De-escalation within alert states no longer sends a duplicate notification since the metric never recovered. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 14:27:18 -04:00
Andreas Wrede	ce0590f015	fix: suppress recover messages for down durations under 4 seconds Transient blips caused by hbc client restarts no longer generate eventlog entries or notifications. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 14:18:58 -04:00
Andreas Wrede	72fc82b91f	feat: add ZFS pool renderer to Host Overview Add renderZfsTables() to plugins.html with health/capacity/frag/dedup table and cumulative I/O table; colour-code health and capacity thresholds; add zfs_monitor to plugin_order and summary/render dispatch. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 13:21:28 -04:00
Andreas Wrede	691f62aa69	feat: host-level watch flag suppresses notifications; filter dashboard/overview by owner/manager; add ZFS monitor plugin - watch: true (default) per host; watch: false suppresses all notifications for that host in udp.py and threshold.py - Live Dashboard and Host Overview now show only hosts where the logged-in user is owner or manager (admins see all); WebSocket broadcasts filtered per-connection by the same rule - Add hbd/client/plugins/zfs_monitor.py: collects per-pool health, capacity, fragmentation, dedup ratio, and cumulative I/O ops/bandwidth via zpool(8) Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 12:42:35 -04:00
Andreas Wrede	cffc9805f9	fix: mask api_password and access_token in settings page; add List to threshold imports Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 11:51:55 -04:00
Andreas Wrede	917d6a401b	feat: composable threshold_config list for per-host threshold layering threshold_config in the hosts section now accepts a list of named configs applied left-to-right on top of the defaults, so focused override profiles can be mixed without duplication. Single-string and legacy host_threshold_mapping forms are unchanged. - Add threshold_raw_configs to store per-config overrides separately - Normalise threshold_config to list on parse (string or list) - get_thresholds_for_host folds the list over the default base - Update README and docs/THRESHOLD_ALERTING.md with examples Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-02 10:35:23 -04:00
Andreas Wrede	c4f09e9ced	version 5.1.8 - fix: matrix/sms_voipms notifications blocked the event loop on timeout; make send_notification async, dispatch all channel drivers as non-blocking tasks (asyncio.to_thread for sync drivers, asyncio.wait_for for async); update all call sites to fire-and-forget via create_task - feat: add /about page with version, runtime, uptime counter, and repo link - fix: hbc_mini plugin data format now matches full hbc client so Host Overview displays memory, disk, and network metrics correctly Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 05:33:27 -04:00
Andreas Wrede	64710fd4cd	tweak h1 margins	2026-05-01 04:51:11 -04:00
Andreas Wrede	1f5e7465a3	fix nav bar position	2026-05-01 04:32:04 -04:00
Andreas Wrede	b6dcce4f35	simplify eventlog usage, fix arguments	2026-04-30 15:38:46 -04:00
Andreas Wrede	c5ce41762e	feat: update hbc via hb_install.sh instead of code patching Server now sends a bare UPD command; client runs hb_install.sh to reinstall from the package registry, then restarts. hb_install.sh also copies itself alongside hbc on client installs. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-30 13:55:15 -04:00
Andreas Wrede	ddf7067d13	feat: redesign Plugin Metrics page as Host Overview Replace pill-tab plugin view with an accordion layout that shows key metrics (CPU%, MEM%, top disk%, net delta, nagios status) at a glance in each host card header. Plugin sections expand as structured tables. - Rename page to "Host Overview" (URL /plugins unchanged) - Three-wave parallel data loading: glance plugins on host expand, on-demand fetch for filesystem_info and extras - Per-plugin table renderers with inline percent bars and threshold colour coding - Add escHtml() for XSS-safe rendering of all field values - Remove stale planning docs (REFACTORING.md, hbd/Plan.md) Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-04-30 08:12:07 -04:00
andreas	990c658e65	Apply grace period to all threshold alerts before logging/notifying Threshold alerts (plugin metrics, RTT) were firing immediately on the first breach. Now every state transition to WARNING/CRITICAL starts a grace-period timer (grace_seconds from the 'grace' config key). The notification is deferred until the next heartbeat after grace_seconds have elapsed. If the metric recovers within the grace window, both the alert and the recovery are suppressed — no spurious pages for transient spikes. Two helper methods added to ThresholdChecker: - _apply_grace: handles the state-change path (defer or suppress) - _check_pending_or_renotify: handles the stable-alert path (fire deferred notification once grace expires, or fall through to reminders) The overdue case is unchanged — on_overdue already fires only after interval+grace seconds of silence, which is equivalent behaviour. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 12:00:40 +02:00
andreas	b78d6ac0fe	Fix RECOVER routing: use consistent level name and route via alerted channel threshold.py was emitting level="RECOVERED" for metric recoveries, which failed the is_recover check in send_notification (which only matched "RECOVER"), bypassing _alerted_channels routing and the min_level bypass added in the previous commit. Changed to "RECOVER" so all recovery paths are consistent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-24 11:29:04 +02:00
andreas	afd5060f59	Fix early reminder notifications and lost recovery notifications - AlertState.update() now resets last_notification when the alert level changes, so a WARNING→CRITICAL escalation restarts the reminder interval rather than inheriting a nearly-expired timer. - _dispatch_to_channel() bypasses min_level for RECOVER, so recovery notifications are delivered even after a server restart when _alerted_channels is empty and the fallback dispatch path is used. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 18:11:22 +02:00
Andreas Wrede	5c382d2b8d	One more nit	2026-04-13 09:31:35 -04:00
Andreas Wrede	35bba451f5	Various formatting nits	2026-04-13 09:27:51 -04:00
Andreas Wrede	80edfba0c0	fix inconsistencies in page layout, add swiss clock	2026-04-13 08:45:50 -04:00
Andreas Wrede	6bc8de192e	fix non-alerting of overdue hosts	2026-04-12 18:44:36 -04:00
Andreas Wrede	d0c8c186f4	Fix typo	2026-04-12 13:04:17 -04:00
Andreas Wrede	19f7c8312e	Make columns sortable again, check hbc version, provide mobile html pages	2026-04-12 12:53:00 -04:00
Andreas Wrede	24b0e362fb	provide cli function stop, restart and reload for hbd	2026-04-12 12:06:07 -04:00
Andreas Wrede	3a030548c0	Fix profile not updating	2026-04-12 11:57:12 -04:00
Andreas Wrede	094cb7ed9d	Merge branch 'master' of git.wrede.ca:andreas/heartbeat	2026-04-12 11:23:28 -04:00
Andreas Wrede	0199ca4693	re-factor notifications, add sms and matrix as channels	2026-04-12 11:21:21 -04:00
Andreas Wrede	75344ebbbd	re-factor notifications, add sms and matrix as channels	2026-04-12 11:04:00 -04:00
Andreas Wrede	7f049a4e26	accept websocket connection on http:.../ws	2026-04-12 06:44:32 -04:00
Andreas Wrede	6217f7a124	fix bogus notification on new clients	2026-04-10 13:39:18 -04:00
Andreas Wrede	2468386f24	adjust default log, pickle and config locations. renotify on critical only, make user sessions persistent	2026-04-10 13:24:57 -04:00
Andreas Wrede	2015195112	Grace interval on restart of hbd, fix SIGHUP processing	2026-04-10 12:58:38 -04:00
Andreas Wrede	3426185383	Set SO_TIMESTAMP correctly for the various platforms	2026-04-10 11:19:47 -04:00
Andreas Wrede	9eedbafe97	Show overdue in alerts instead of null	2026-04-10 09:20:28 -04:00
Andreas Wrede	a5f31c5cb5	update pickled data structures	2026-04-10 09:18:38 -04:00
Andreas Wrede	2f72cf0118	typo	2026-04-10 09:17:57 -04:00
Andreas Wrede	ba27d2e300	Add count to rtt threshold	2026-04-10 08:07:50 -04:00
Andreas Wrede	381e37efce	fix log-section height	2026-04-10 08:01:22 -04:00
Andreas Wrede	97dfc08f4d	fix log level setting	2026-04-10 08:00:51 -04:00
Andreas Wrede	d281ac5a70	provide defaults for threshold_configs	2026-04-10 07:47:39 -04:00
andreas	d77277857f	Add user management and a settings page	2026-04-08 16:21:55 -04:00
Andreas Wrede	8421f472f2	there is only one __version__	2026-04-07 11:00:22 -04:00
Andreas Wrede	51f9bdc2b5	use SO_TIMESTAMP, works on Linux, FreeBSD and macOS	2026-04-07 10:46:54 -04:00
andreas	02bc42fbf0	get rtt time differently	2026-04-07 10:40:12 -04:00
andreas	832a8b0bda	save state to pickle file, restart timers on restart	2026-04-06 17:24:59 -04:00
Andreas Wrede	73aa89f8f4	fix web page issues	2026-04-04 12:43:30 -04:00
Andreas Wrede	941f3ea4b0	display and acknowledge alerts	2026-04-03 06:35:45 -04:00
Andreas Wrede	c5770006f7	hbc proper termination, hbd config reloadable	2026-04-02 07:17:00 -04:00
Andreas Wrede	84c1aef51f	removal of cver	2026-04-01 20:47:29 -04:00
Andreas Wrede	460d2be9e9	Fix rtt, including bug in time compute	2026-04-01 19:41:53 -04:00
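
Commit 28e2180f7b describes the notification rule in words: notify on worsening transitions (OK→WARNING, OK→CRITICAL, WARNING→CRITICAL) and on recovery (any→OK), but not on de-escalation such as CRITICAL→WARNING. A minimal sketch of that rule follows. The names `SEVERITY` and `should_notify` are hypothetical and are not taken from threshold.py; the repository's actual implementation may be structured differently.

```python
# Hypothetical illustration of the transition rule from commit 28e2180f7b.
# SEVERITY and should_notify are assumed names, not the project's API.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def should_notify(old_level: str, new_level: str) -> bool:
    """Return True when a state transition should produce a notification."""
    if old_level == new_level:
        return False  # no transition, nothing to announce
    if new_level == "OK":
        return True  # recovery: any alert state -> OK
    # Worsening only; de-escalation (e.g. CRITICAL -> WARNING) is suppressed
    # because the metric never actually recovered.
    return SEVERITY[new_level] > SEVERITY[old_level]
```

Under this sketch, `should_notify("CRITICAL", "WARNING")` is false while `should_notify("WARNING", "OK")` is true, which matches the behaviour the commit message describes.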

55 Commits
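
Commit 917d6a401b says that threshold_config accepts either a single string or a list of named configs, normalised to a list on parse and applied left-to-right on top of the defaults. The sketch below illustrates that folding idea only. The function names, the dictionary shapes, and the per-metric merge are assumptions made for illustration; the real get_thresholds_for_host may merge at a different granularity.

```python
# Hypothetical illustration of the layering from commit 917d6a401b.
# normalise_threshold_config and fold_thresholds are assumed names, and the
# {metric: {field: value}} shape is an assumption, not the project's schema.
from typing import Dict, List, Optional, Union

Thresholds = Dict[str, Dict[str, float]]


def normalise_threshold_config(value: Optional[Union[str, List[str]]]) -> List[str]:
    """Accept None, a single name, or a list of names; always return a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def fold_thresholds(
    defaults: Thresholds,
    raw_configs: Dict[str, Thresholds],
    threshold_config: Optional[Union[str, List[str]]],
) -> Thresholds:
    """Apply the named override configs left-to-right on top of the defaults."""
    # Copy the defaults so the shared base is never mutated.
    merged: Thresholds = {metric: dict(rule) for metric, rule in defaults.items()}
    for name in normalise_threshold_config(threshold_config):
        for metric, override in raw_configs.get(name, {}).items():
            # Later configs win on a field-by-field basis (assumed granularity).
            merged.setdefault(metric, {}).update(override)
    return merged
```

For example, with defaults of `{"cpu.percent": {"warning": 80, "critical": 95}}` and two override profiles that each change one field, folding `["db", "quiet"]` yields a rule carrying both overrides, while the defaults themselves stay untouched.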