Compare commits

19 Commits

Author SHA1 Message Date
Andreas Wrede e0443293e9 Merge branch 'master' of git.wrede.ca:andreas/heartbeat
Release / release (push) Successful in 44s
2026-06-06 08:31:26 -04:00
Andreas Wrede 39670f4e63 version 5.3.10 2026-06-06 08:28:43 -04:00
Andreas Wrede 2e88ee2269 feat: clear stale plugin data and persist OAuth users to config
- hbdclass: add per-plugin stale timers; clear history and alerts after
  3× heartbeat interval with no PLG data received
- udp: wire stale timer on every PLG message via _make_plugin_stale_callback
- http: persist new OAuth users to config file on first login

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 08:27:20 -04:00
andreas 2ef7d473c3 Merge pull request 'hbc_mini.c: make it compile on NetBSD' (#1) from woods/heartbeat:master into master
Merge pull request: hbc_mini.c: make it compile on NetBSD
2026-06-03 12:05:29 -04:00
woods 862a9cdea0 hbc_mini.c: make it work on NetBSD
This fixes the numbers by using the correct MIB to match the struct.
2026-06-02 13:42:11 -07:00
woods 9351938b15 hbc_mini.c: make it compile on NetBSD
Use the public "struct uvmexp_sysctl" instead of "struct uvmexp".

The numbers from the memory_monitor are wonky, but it builds and runs.
2026-06-02 12:05:42 -07:00
andreas b6ef2fe065 Merge branch 'master' of git.wrede.ca:andreas/heartbeat
sequencing
2026-06-02 08:01:47 -04:00
andreas d5d2f066b3 fix: don't use pushover title 2026-06-02 08:01:32 -04:00
Andreas Wrede d9563392c3 fix: remove bak file in bumpminor.sh 2026-06-01 08:34:07 -04:00
andreas 5f090b9d96 feat: auto-scale CPU history graph Y axis
Y axis now fits the actual data range with 10% padding rather than
fixed 0-100%. Grid lines use nice tick steps (1/2/5/10 × magnitude).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 07:59:54 -04:00
andreas 3cc1d92eb4 Merge branch 'master' of git.wrede.ca:andreas/heartbeat 2026-06-01 07:56:02 -04:00
andreas 2ddba203df feat: add CPU usage history graph to CPU Monitor section
Renders an SVG line chart above the CPU Usage row using all available
history samples (up to 100). Color adapts green/orange/red by load level.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-01 07:55:55 -04:00
Andreas Wrede 8a1f412d1d version 5.3.9
Release / release (push) Successful in 43s
2026-05-31 20:58:58 -04:00
Andreas Wrede 40c44f53f1 feat: auto-update CHANGELOG and README in bumpminor.sh
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 20:58:46 -04:00
andreas a6fe8546a8 Update README.md 2026-05-31 20:38:03 -04:00
Andreas Wrede e56660454d tidy up what was committed 2026-05-30 15:17:36 -04:00
Andreas Wrede 9cbf0ecb13 docs: update CHANGELOG for 5.3.7 and 5.3.8
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 15:15:25 -04:00
Andreas Wrede 313bbd37ac version 5.3.8
Release / release (push) Successful in 42s
2026-05-30 15:06:46 -04:00
Andreas Wrede f7320644f3 fix: avoid SIGPIPE in changelog step by using grep -m 1
Replacing head -1 (and the broken head -2|tail -1 attempt) with grep -m 1
stops grep after the first match, eliminating the SIGPIPE that caused exit 141.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 15:06:19 -04:00
15 changed files with 648 additions and 33 deletions
-20
@@ -1,20 +0,0 @@
{
"permissions": {
"allow": [
"Edit(*)",
"Bash(pytest *)",
"Bash(python *)",
"Bash(python3 *)",
"Bash(.venv/bin/pytest *)",
"Bash(npm *)",
"Bash(git *)",
"Bash(ls *)",
"Bash(cat *)",
"Bash(grep *)",
"Bash(find *)",
"Bash(mkdir *)",
"Bash(touch *)",
"Bash(uv *)"
]
}
}
+1 -1
@@ -33,7 +33,7 @@ jobs:
       - name: Generate changelog
         id: changelog
         run: |
-          PREV_TAG=$(git tag --sort=-version:refname | grep -v "^${GITHUB_REF#refs/tags/}$" | head -1)
+          PREV_TAG=$(git tag --sort=-version:refname | grep -m 1 -v "^${GITHUB_REF#refs/tags/}$")
           if [ -n "$PREV_TAG" ]; then
             CHANGELOG=$(git log --pretty=format:"- %s" "${PREV_TAG}..HEAD")
           else
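The pipeline changed in this hunk picks the newest tag other than the one being released. Here is a minimal Python sketch of that selection, together with the arithmetic behind the exit status named in the commit message: a POSIX shell reports 128 plus the signal number when a process is killed by a signal. The function name `previous_tag` and the sample tag list are illustrative assumptions, not part of the repository.

```python
import signal


def previous_tag(sorted_tags, current_tag):
    """Return the first tag that differs from current_tag, or None.

    Mirrors `grep -m 1 -v "^<current>$"`: stop at the first non-matching line.
    """
    return next((tag for tag in sorted_tags if tag != current_tag), None)


# A shell reports 128 + N for a process killed by signal N.
SIGPIPE_EXIT_STATUS = 128 + int(signal.SIGPIPE)

# Hypothetical tag list, newest first (as --sort=-version:refname would give).
tags = ["v5.3.8", "v5.3.7", "v5.3.6"]
prev = previous_tag(tags, "v5.3.8")
only = previous_tag(["v5.3.8"], "v5.3.8")
```

Because the selection stops at the first hit, nothing downstream closes the pipe early, which is the situation the commit message says produced the SIGPIPE.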
+1
@@ -5,6 +5,7 @@ __pycache__/
*.pyo
.flake8
.venv/
.continue/
test/
build/
dist/
+457
@@ -0,0 +1,457 @@
# Changelog
All notable changes to this project are documented here, organized by release.
## [5.3.10]
### Added
- clear stale plugin data and persist OAuth users to config
- auto-scale CPU history graph Y axis
- add CPU usage history graph to CPU Monitor section
### Fixed
- remove bak file in bumpminor.sh
---
## [5.3.9]
### Added
- auto-update CHANGELOG and README in bumpminor.sh
---
## [5.3.8]
### Added
- Wiki home page with overview and getting started guide
### Fixed
- Release workflow: use `GITHUB_REF`/`GITHUB_OUTPUT` (Gitea Actions uses GitHub-compatible variable names)
- Release workflow: replace `head -1` with `grep -m 1` to avoid SIGPIPE (exit 141) in changelog step
---
## [5.3.7]
### Added
- Dark mode with light/dark/auto theme setting
- UNKNOWN level filter in Log of Events
- Per-metric grace period input in threshold settings
- Replace Dynamic DNS YAML editor with a web form
- Sort hosts, thresholds, and channels alphabetically on settings page
- Suppress alerts for unwatched hosts
### Fixed
- Preserve log message order when replaying history on connect
---
## [5.3.6]
### Added
- MIT license
### Fixed
- Correct ZFS pool status threshold operator and add per-metric grace
- Normalize email and domain fields
- Move dependencies back under `[project]` in pyproject.toml
---
## [5.3.4]
### Fixed
- Run full reload after HTTP config publish, not just `config.reload()`
---
## [5.3.3]
### Added
- Replace YAML threshold editor with a form-based UI
- Replace multi-select fields with dual-panel picker on settings page
- Nav bar button to publish pending config changes
- Host, level, and message filters in Log of Events
### Fixed
- Remove container max-width; stop stretching inputs on settings page
### Removed
- Legacy `dyndnshosts`/`drophosts` config keys
---
## [5.3.2]
### Added
- Retry DNS resolution indefinitely; add `-4`/`-6` address-family flags to `hbc` and `hbc_mini`
- Replace YAML hosts editor with form-based CRUD table
- Replace YAML notification channel editor with form-based UI
### Fixed
- Support list-valued `threshold_config` in hosts table
- Derive hosts threshold config list from config file keys
- Replace channel checkboxes in Users table with multi-select
- Support plugin-level `enabled: false` in threshold config
- Always populate glance strip for all hosts on page load
- Fetch host info on initial page load
---
## [5.3.1]
### Added
- Host info section in Host Overview (fetched and rendered on card expand)
- `GET /api/0/hosts/{hostname}/info` endpoint
- Show suffix-matched metric coverage in host info threshold table
- Move `hbc_version` and `hbc_type` out of `os_info` into the host info section
### Fixed
- Correct `THRESHOLD_DEFAULTS` metric keys and add missing defaults
---
## [5.3.0]
### Added
- Profile page self-service: change identity, password, and notification channels
- Settings page editor with form sections, YAML editors, stage/publish/rollback workflow
- Config read API: `GET /api/0/config`, `/section/{name}`, `/backups`
- Config write API: `POST /api/0/config`, `POST /api/0/config/rollback`
- `configio` module for comment-preserving YAML round-trip writes
- Multi-provider OAuth2 login page and generic provider routes
- Log login/logout events to the event log with auth source
### Fixed
- ZFS monitor alerts dropped on restart with wildcard pool thresholds
- Preserve OAuth users across config reload
- Config API error handling, consistent 403 messages, deduplicated key lists
- Validate password body type; coerce `notification_channels` to strings in profile API
- Preserve OAuth `client_secret` on roundtrip; harden rollback path validation
---
## [5.2.6]
### Added
- Alerts host-filter field with URL query parameter and notify URL
- Optional logo on Gitea OAuth login button
### Fixed
- Show human-readable duration in re-notification messages
---
## [5.2.5]
### Added
- Alert CRITICAL on degraded or suspended ZFS pools (ONLINE=OK, DEGRADED=WARNING, all else=CRITICAL)
- Sign in with Gitea button on login page with OAuth2 redirect/callback routes
- OAuth2 CSRF state management
- Host owner shown in glance strip for admin users
- C port of `hbc_mini` (single-file client in `scripts/c/`)
### Fixed
- Use `base_url` config for OAuth redirect URI to handle reverse proxy deployments
- Preserve OAuth users across config reload
- Escape HTML in login page error display
---
## [5.2.4]
### Added
- `hbc`/`hbc_mini`: `owner` config field included in `os_info`; server applies to host record
- Server requests InfoPlugin refresh when a host has no plugin data
- Event log stores structured dicts; filter by user
### Fixed
- Strip `_status_code` suffix from displayed metric names in threshold alerts
- Use plain URL in Mattermost plugin metrics link
- Fall back to `default_owner` when `os_info` has no owner
---
## [5.2.3]
### Added
- `hbc`/`hbc_mini`: log name and version at startup
- Show metric name inline with hostname in alerts and notifications
### Fixed
- Send shutdown message only if a boot message was previously sent; suppress both on restart
---
## [5.2.2]
### Fixed
- Retry connection on network error instead of permanently dropping it
- Silence `aiohttp.access` log; strip plugin prefix in alerts UI
---
## [5.2.1]
### Fixed
- Threshold and logging improvements
---
## [5.2.0]
### Added
- `nagios` operator for direct exit-code severity mapping
### Fixed
- Always show `THRESHOLD_DEFAULTS` in Settings threshold config
---
## [5.1.21]
### Added
- `nagios_runner` improvements and alerts page fixes
---
## [5.1.20]
### Added
- Generic threshold matching for `nagios_runner` with `{check_name}` display support
### Fixed
- Reduce default hysteresis from 10% to 2%
- Show recovery threshold in alerts UI
---
## [5.1.19]
### Added
- Exclude ZFS ARC from `memory_percent`
- Add `uptime_seconds` to `cpu_monitor`
### Fixed
- Send boot/shutdown message on the first open connection, not blindly on the first in list
---
## [5.1.18]
### Added
- Fetch-based Update/Delete buttons with toast notifications on Host Overview
### Fixed
- Settings thresholds show correct per-config metrics; miscellaneous `hbc` fixes
---
## [5.1.17]
### Added
- Owner Update/Delete buttons on Host Overview; purge stale alerts on reload
- Retry `AsyncConnection.open()` indefinitely; drop IPv6 only on early startup failure
- Alert pie chart in the nav bar
### Fixed
- Make Alerts page scrollable
---
## [5.1.16]
### Added
- Generic `ping_monitor` thresholds; round RTT to nearest ms
---
## [5.1.15]
### Added
- Link hostnames in Live Dashboard to Host Overview
- Threshold Configurations section on settings page
### Fixed
- Suppress notifications on alert de-escalation (e.g. CRITICAL→WARNING)
- Suppress recover messages for down durations under 4 seconds
---
## [5.1.14]
### Added
- ZFS pool renderer in Host Overview
---
## [5.1.13]
### Added
- ZFS monitor plugin
- Host-level watch flag to suppress notifications
- Filter Live Dashboard and Host Overview by owner/manager
- Composable `threshold_config` list for per-host threshold layering
- Restart on SIGHUP in `hbc` and `hbc_mini`
### Fixed
- Mask `api_password` and `access_token` in settings page
---
## [5.1.12]
Internal release — no user-visible changes.
---
## [5.1.11]
### Fixed
- Install under Docker
- Clean up install script
---
## [5.1.10]
### Fixed
- Synchronize version in `hbc_mini`
- Install script no longer overwrites itself
---
## [5.1.9]
### Added
- Install `hbc_mini` via package or install script
---
## [5.1.8]
### Added
- Track `hbc` type and version
### Fixed
- Nav bar position
---
## [5.1.7]
### Added
- `hbc_mini`: single-file heartbeat client
### Fixed
- Drop dead connections on protocol error
---
## [5.1.6]
### Fixed
- Simplify event log usage; fix argument handling
---
## [5.1.5]
### Added
- Update `hbc` via `hb_install.sh` instead of code patching
---
## [5.1.4]
### Added
- Redesign Plugin Metrics page as Host Overview
---
## [5.1.3]
### Added
- Validate absolute command paths at `nagios_runner` init
- Async subprocess in `nagios_runner` with stderr capture and signal handling
- `skip_reason` field on `Plugin`; surface in `PluginLoader` init messaging
### Fixed
- Use `shlex.split()` for `nagios_runner` path validation to handle quoted paths
- Reconfigure logging to syslog after `daemonize()`
---
## [5.1.2]
### Fixed
- Plugin config lookup shadowed by `CLIENT_DEFAULTS` plugins key
- Apply grace period to all threshold alerts before logging/notifying
- RECOVER routing: use consistent level name and route via alerted channel
- Early reminder notifications and lost recovery notifications
- Non-alerting of overdue hosts
### Added
- Swiss clock widget in the UI
---
## [5.1.1]
### Added
- SMS and Matrix notification channels
- CLI commands `stop`, `restart`, and `reload` for `hbd`
- WebSocket endpoint at `http://.../ws`
- Mobile HTML pages
### Fixed
- Profile not updating
- Sortable columns in tables
---
## [5.1.0]
### Added
- Ping monitor plugin
- Persist state to pickle file; restart timers on server restart
- SIGHUP config reload for `hbd`
- Renotify on CRITICAL only; persistent user sessions
- RTT count threshold
### Fixed
- Bogus notification on new clients
- Show "overdue" in alerts instead of null
---
## [5.0.12]
### Added
- User management and settings page
---
## [5.0.10]
### Added
- Publish package to Gitea PyPI registry
---
## [5.0.9]
### Added
- Use `SO_TIMESTAMP` for RTT measurement (Linux, FreeBSD, macOS)
- Persist state to pickle file; restart timers on restart
---
## [5.0.6]
### Added
- Major codebase refactoring: restructured into client/server components
- Per-client threshold configuration
- Display and acknowledge alerts in the UI
- Proper `hbc` termination; `hbd` config reloadable at runtime
+1 -1
@@ -20,7 +20,7 @@ A lightweight UDP-based host monitoring system. Monitored hosts run a client (`h
└────────────────────┘ └────────────────────────────┘
```
-**Package:** `hbd` v5.3.4
+**Package:** `hbd` v5.3.10
**Python:** 3.11+
### Subpackages
+1 -1
@@ -14,4 +14,4 @@ Install options:
"""
__all__ = ["__version__"]
-__version__ = "5.3.7"
+__version__ = "5.3.10"
+32
@@ -297,6 +297,8 @@ class Host:
         self.plugin_retention = 100  # Keep last N samples per plugin
         # Alert state tracking: {metric_path: AlertState}
         self.alert_states = {}
+        # Stale-data timers: {plugin_name: asyncio.TimerHandle}
+        self.plugin_timers = {}
         # User access control
         self.owner: str | None = None  # username of owner
         self.managers: list = []  # usernames with manager role
@@ -483,6 +485,8 @@ class Host:
             self.managers = []
         if not hasattr(self, "monitors"):
             self.monitors = []
+        if not hasattr(self, "plugin_timers"):
+            self.plugin_timers = {}
         pass
@@ -542,6 +546,34 @@ class Host:
"""
         return self.plugin_data
+    def reset_plugin_timer(self, plugin_name, timeout_seconds, callback):
+        """Reset the stale-data timer for a plugin.
+        If no new PLG data arrives within timeout_seconds, callback(host, plugin_name)
+        is called so the caller can clear history and alerts.
+        """
+        import asyncio
+        existing = self.plugin_timers.get(plugin_name)
+        if existing and not existing.cancelled():
+            existing.cancel()
+        async def _fire():
+            await callback(self, plugin_name)
+        try:
+            loop = asyncio.get_event_loop()
+            self.plugin_timers[plugin_name] = loop.call_later(
+                timeout_seconds, lambda: asyncio.create_task(_fire())
+            )
+        except RuntimeError:
+            pass
+    def cancel_plugin_timer(self, plugin_name):
+        """Cancel the stale timer for a plugin, if any."""
+        handle = self.plugin_timers.pop(plugin_name, None)
+        if handle and not handle.cancelled():
+            handle.cancel()
     # ------------------------------------------------------------------
     # User-role helpers
     # ------------------------------------------------------------------
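The reset-and-fire pattern used by `reset_plugin_timer` can be tried in isolation. The sketch below is a standalone approximation, not the project's code: the class `StaleTimers`, the plugin name, and the short timeouts are assumptions chosen for the demo, and it uses `asyncio.get_running_loop()` where the diff uses `asyncio.get_event_loop()`.

```python
import asyncio


class StaleTimers:
    """Hypothetical stand-in for the per-plugin timer bookkeeping on Host."""

    def __init__(self):
        self.handles = {}   # plugin_name -> asyncio.TimerHandle
        self.tasks = set()  # keep references so callback tasks are not collected

    def reset(self, plugin_name, timeout_seconds, callback):
        """Cancel any pending timer for plugin_name and start a fresh one."""
        old = self.handles.pop(plugin_name, None)
        if old is not None:
            old.cancel()
        loop = asyncio.get_running_loop()  # raises RuntimeError outside a loop

        def fire():
            # call_later needs a plain callable, so wrap the coroutine in a task.
            task = loop.create_task(callback(plugin_name))
            self.tasks.add(task)
            task.add_done_callback(self.tasks.discard)

        self.handles[plugin_name] = loop.call_later(timeout_seconds, fire)


async def demo():
    fired = []

    async def on_stale(plugin_name):
        fired.append(plugin_name)

    timers = StaleTimers()
    timers.reset("cpu_monitor", 0.2, on_stale)
    await asyncio.sleep(0.1)
    timers.reset("cpu_monitor", 0.2, on_stale)  # new data arrived: restart
    await asyncio.sleep(0.15)  # past the first deadline, before the second
    fired_early = list(fired)
    await asyncio.sleep(0.2)   # now past the second deadline
    return fired_early, fired


fired_early, fired_late = asyncio.run(demo())
```

Each reset pushes the deadline out, so the callback runs only once data has been absent for the full timeout.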
+17
@@ -1182,6 +1182,23 @@ async def start(
                 profile["full_name"],
                 profile["avatar_url"],
             )
+            # Persist new OAuth users to the config file so they survive restarts.
+            # Only write when the user isn't already in the config's users section.
+            if _config_path and not (config.get("users") or {}).get(user.username):
+                try:
+                    disk_data = configio_mod.read_roundtrip(_config_path)
+                    if not disk_data.get("users"):
+                        disk_data["users"] = {}
+                    disk_data["users"][user.username] = {
+                        k: v for k, v in [
+                            ("full_name", user.full_name),
+                            ("avatar", user.avatar),
+                        ] if v
+                    }
+                    configio_mod.write_config(_config_path, disk_data)
+                    logger.info("Persisted OAuth user %r to config", user.username)
+                except Exception as exc:
+                    logger.warning("Failed to persist OAuth user %r to config: %s", user.username, exc)
             session_token = users_mod.create_session(user.username)
             eventlog("hbd", "INFO", f"Login: {user.username} via {provider.type}")
             resp = web.HTTPFound("/")
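The persistence step in this hunk adds the user only when absent and stores only non-empty profile fields. The sketch below shows that logic on a plain dictionary. The function name `add_user_if_missing` is hypothetical, the file I/O done by `configio` is left out, and the membership test uses `in`, which is slightly stricter than the truthiness check in the diff.

```python
def add_user_if_missing(config_data, username, full_name=None, avatar=None):
    """Add username to config_data["users"] unless it is already present.

    Returns True when the mapping was modified and should be written back.
    """
    users = config_data.get("users") or {}
    if username in users:
        return False
    # Drop empty values so the config records only what the provider supplied.
    fields = (("full_name", full_name), ("avatar", avatar))
    users[username] = {key: value for key, value in fields if value}
    config_data["users"] = users
    return True


# Hypothetical config contents for illustration.
config_data = {"users": {"alice": {"full_name": "Alice"}}}
changed_new = add_user_if_missing(config_data, "bob", full_name="Bob", avatar="")
changed_again = add_user_if_missing(config_data, "bob", full_name="Bobby")
```

Returning a flag lets the caller skip the disk write on every later login by the same user.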
+3 -1
@@ -140,7 +140,9 @@ def _send_pushover(channel_cfg: dict, notif: Notification) -> bool:
     if not token or not user:
         logger.warning("pushover: missing token or user")
         return False
-    params: dict = {"token": token, "user": user, "title": notif.title, "message": notif.body}
+    body = "%s: %s" % (notif.title, notif.body)
+    title = ""
+    params: dict = {"token": token, "user": user, "title": title, "message": body}
     if channel_cfg.get("sound"):
         params["sound"] = channel_cfg["sound"]
     if notif.url:
+93 -3
@@ -914,7 +914,7 @@
let html = '';
switch (pluginName) {
case 'os_info': html = renderOsInfoTable(cached.data); break;
-case 'cpu_monitor': html = renderCpuTable(cached.data); break;
+case 'cpu_monitor': html = renderCpuTable(hostname, cached.data); break;
case 'memory_monitor': html = renderMemoryTable(cached.data); break;
case 'disk_monitor': html = renderDiskTables(cached.data); break;
case 'network_monitor':html = renderNetworkTables(cached.data); break;
@@ -926,6 +926,10 @@
html += `<div class="timestamp">Last updated: ${new Date(cached.timestamp * 1000).toLocaleString()}</div>`;
body.innerHTML = html;
if (pluginName === 'cpu_monitor') {
fetchCpuHistory(hostname).then(samples => renderCpuChart(hostname, samples)).catch(() => {});
}
}
// ── Per-plugin renderers ────────────────────────────────────────────────
@@ -948,7 +952,92 @@
return html;
}
-function renderCpuTable(d) {
async function fetchCpuHistory(hostname) {
const r = await fetch(`/api/0/hosts/${encodeURIComponent(hostname)}/plugins/cpu_monitor?limit=100`);
if (!r.ok) return [];
const json = await r.json();
return json.samples || [];
}
function renderCpuChart(hostname, samples) {
const el = document.getElementById(`cpu-chart-${hostname}`);
if (!el || !samples.length) return;
const pts = samples
.filter(s => s.data.cpu_percent != null)
.map(s => ({ t: s.timestamp, v: s.data.cpu_percent }));
if (pts.length < 2) { el.style.display = 'none'; return; }
const W = 600, H = 80, PAD = { top: 6, right: 8, bottom: 18, left: 28 };
const cW = W - PAD.left - PAD.right;
const cH = H - PAD.top - PAD.bottom;
const tMin = pts[0].t, tMax = pts[pts.length - 1].t;
const tRange = tMax - tMin || 1;
const x = t => PAD.left + ((t - tMin) / tRange) * cW;
// Auto-scale Y axis with 10% padding, clamped to [0, 100]
const vMin = Math.min(...pts.map(p => p.v));
const vMax = Math.max(...pts.map(p => p.v));
const vRange = vMax - vMin || 1;
const vPad = Math.max(vRange * 0.1, 1);
const yLow = Math.max(0, vMin - vPad);
const yHigh = Math.min(100, vMax + vPad);
const yRange = yHigh - yLow || 1;
const y = v => PAD.top + cH - ((v - yLow) / yRange) * cH;
// Build polyline points and filled area path
const linePoints = pts.map(p => `${x(p.t).toFixed(1)},${y(p.v).toFixed(1)}`).join(' ');
const areaPath = `M${x(pts[0].t).toFixed(1)},${(PAD.top + cH).toFixed(1)} ` +
pts.map(p => `L${x(p.t).toFixed(1)},${y(p.v).toFixed(1)}`).join(' ') +
` L${x(pts[pts.length-1].t).toFixed(1)},${(PAD.top + cH).toFixed(1)} Z`;
// Color based on latest absolute CPU %
const latest = pts[pts.length - 1].v;
const strokeColor = latest > 90 ? '#e53935' : latest > 70 ? '#fb8c00' : '#43a047';
const fillColor = latest > 90 ? '#ffcdd2' : latest > 70 ? '#ffe0b2' : '#c8e6c9';
// Compute nice tick step for ~3-5 grid lines
const rawStep = yRange / 4;
const mag = Math.pow(10, Math.floor(Math.log10(rawStep || 1)));
const niceStep = [1, 2, 5, 10].map(f => f * mag).find(s => yRange / s <= 5) || mag * 10;
const tickStart = Math.ceil(yLow / niceStep) * niceStep;
let gridLines = '';
for (let v = tickStart; v <= yHigh + 0.001; v += niceStep) {
const yy = y(v).toFixed(1);
const label = Number.isInteger(v) ? v : v.toFixed(1);
gridLines += `<line x1="${PAD.left}" y1="${yy}" x2="${PAD.left + cW}" y2="${yy}" stroke="#e0e0e0" stroke-width="1"/>`;
gridLines += `<text x="${(PAD.left - 3).toFixed(1)}" y="${yy}" text-anchor="end" dominant-baseline="middle" font-size="8" fill="#999">${label}</text>`;
}
// X-axis time labels
const fmt = ts => {
const d = new Date(ts * 1000);
return d.toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' });
};
const xLabels = `
<text x="${PAD.left}" y="${H - 2}" text-anchor="start" font-size="8" fill="#999">${fmt(pts[0].t)}</text>
<text x="${PAD.left + cW}" y="${H - 2}" text-anchor="end" font-size="8" fill="#999">${fmt(pts[pts.length-1].t)}</text>`;
el.innerHTML = `<svg viewBox="0 0 ${W} ${H}" preserveAspectRatio="none"
style="width:100%;height:${H}px;display:block;">
<defs>
<clipPath id="cpu-clip-${hostname}">
<rect x="${PAD.left}" y="${PAD.top}" width="${cW}" height="${cH}"/>
</clipPath>
</defs>
${gridLines}
<line x1="${PAD.left}" y1="${PAD.top}" x2="${PAD.left}" y2="${PAD.top + cH}" stroke="#ccc" stroke-width="1"/>
<line x1="${PAD.left}" y1="${PAD.top + cH}" x2="${PAD.left + cW}" y2="${PAD.top + cH}" stroke="#ccc" stroke-width="1"/>
<g clip-path="url(#cpu-clip-${hostname})">
<path d="${areaPath}" fill="${fillColor}" opacity="0.6"/>
<polyline points="${linePoints}" fill="none" stroke="${strokeColor}" stroke-width="1.5" stroke-linejoin="round"/>
</g>
${xLabels}
</svg>`;
}
function renderCpuTable(hostname, d) {
const KEYS = [
['cpu_percent', 'CPU Usage', 'bar'],
['load_1min', 'Load (1 min)', 'num'],
@@ -966,7 +1055,8 @@
];
const handled = new Set(KEYS.map(r => r[0]));
-let html = '<table class="data-table"><thead><tr><th>Metric</th><th>Value</th></tr></thead><tbody>';
+let html = `<div id="cpu-chart-${hostname}" style="margin-bottom:8px;"></div>`;
+html += '<table class="data-table"><thead><tr><th>Metric</th><th>Value</th></tr></thead><tbody>';
for (const [k, label, fmt] of KEYS) {
if (!(k in d)) continue;
const v = d[k];
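The Y-axis logic in `renderCpuChart` (10% padding clamped to 0..100, then a 1/2/5/10 × magnitude tick step) is easier to check numerically outside the browser. Below is a Python transcription of that arithmetic as a sketch. The function names and sample values are assumptions, and the tick positions are computed by multiplication rather than the running sum used in the JavaScript loop.

```python
import math


def y_bounds(values, pad_fraction=0.1, floor=0.0, ceiling=100.0):
    """Pad the data range by pad_fraction (at least 1), clamped to [floor, ceiling]."""
    v_min, v_max = min(values), max(values)
    pad = max(((v_max - v_min) or 1) * pad_fraction, 1)
    return max(floor, v_min - pad), min(ceiling, v_max + pad)


def nice_step(span, max_lines=5):
    """Pick the smallest 1/2/5/10 x magnitude step giving at most max_lines intervals."""
    magnitude = 10 ** math.floor(math.log10(span / 4))
    for factor in (1, 2, 5, 10):
        step = factor * magnitude
        if span / step <= max_lines:
            return step
    return magnitude * 10


def ticks(low, high):
    """Return grid-line values between low and high at a nice step."""
    span = (high - low) or 1
    step = nice_step(span)
    first = math.ceil(low / step) * step
    count = int(math.floor((high - first) / step + 1e-9)) + 1
    return [first + i * step for i in range(count)]


# Hypothetical CPU samples in percent.
low, high = y_bounds([12, 18, 25])
grid = ticks(low, high)
flat = y_bounds([50, 50])
```

For these samples the padded range is roughly 10.7 to 26.3, so the step comes out as 5 and the grid lines land on 15, 20 and 25.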
+21
@@ -232,6 +232,23 @@ def _make_timer_callbacks(uname, host, ctx):
     return on_overdue, on_unknown
+def _make_plugin_stale_callback(uname, ctx):
+    """Return an async callback that clears stale plugin data and its alerts."""
+    msg_to_websockets = ctx.get("msg_to_websockets")
+    async def on_plugin_stale(host, plugin_name):
+        host.plugin_data.pop(plugin_name, None)
+        stale_keys = [k for k in host.alert_states if k.startswith(f"{plugin_name}.")]
+        for k in stale_keys:
+            del host.alert_states[k]
+        eventlog(uname, "INFO", f"plugin data stale: {plugin_name}")
+        if msg_to_websockets:
+            msg_to_websockets("plugin_stale", {"host": uname, "plugin": plugin_name})
+            msg_to_websockets("host", host.stateinfo())
+    return on_plugin_stale
 def restore_connection_timers(hbdclass, ctx):
     """Restore overdue timers for all loaded connections after a pickle restore.
@@ -372,6 +389,10 @@ def handle_datagram(msg: dict, addr, transport, ctx: dict):
if k not in ("ID", "plugin", "id", "name")}
# Store plugin data with timestamp
host.add_plugin_data(plugin_name, plugin_data, timestamp=now)
# Reset stale timer — 3× the heartbeat interval (min 60 s)
stale_timeout = max(host.interval * 3, 60)
host.reset_plugin_timer(plugin_name, stale_timeout,
_make_plugin_stale_callback(uname, ctx))
# If os_info reports an owner and none is configured server-side, apply it
if plugin_name == "os_info":
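Two small rules in this file are worth checking by hand: the stale timeout is three heartbeat intervals with a 60-second floor, and alert keys are matched on the `plugin_name.` prefix, including the dot. The sketch below restates both on plain dictionaries. The function names and the plugin names in the sample data are assumptions for illustration.

```python
def stale_timeout(interval, multiplier=3, minimum=60):
    """Seconds without plugin data before it counts as stale."""
    return max(interval * multiplier, minimum)


def clear_stale_plugin(plugin_data, alert_states, plugin_name):
    """Drop a plugin's history and every alert keyed under 'plugin_name.'."""
    plugin_data.pop(plugin_name, None)
    prefix = f"{plugin_name}."  # trailing dot avoids matching longer plugin names
    stale_keys = [key for key in alert_states if key.startswith(prefix)]
    for key in stale_keys:
        del alert_states[key]
    return stale_keys


# Hypothetical state: two plugins whose names share a prefix.
plugin_data = {"zfs": [1], "zfs_extra": [2]}
alert_states = {"zfs.pool.tank": "CRITICAL", "zfs_extra.x": "WARNING"}
removed = clear_stale_plugin(plugin_data, alert_states, "zfs")
```

With a 10-second heartbeat the floor applies and the timeout is 60 seconds. With a 30-second heartbeat it is 90 seconds.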
+1 -1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "hbd"
-version = "5.3.7"
+version = "5.3.10"
description = "Heartbeat monitoring system — client (hbc) and server (hbd)"
readme = "README.md"
requires-python = ">=3.11"
+16 -1
@@ -5,9 +5,23 @@ uv version --bump patch
VER=$(uv version --short)
sed -i".bak" "s/__version__ = \"[0-9.]*\"\(.*\)$/__version__ = \"$VER\"\1/" hbd/__init__.py
sed -i".bak" "s/__version__ = \"[0-9.]*\"\(.*\)$/__version__ = \"$VER\"\1/" scripts/hbc_mini.py
sed -i".bak" "s/\*\*Package:\*\* \`hbd\` v[0-9.]*/\*\*Package:\*\* \`hbd\` v$VER/" README.md
# Update CHANGELOG.md with commits since last tag
LASTTAG=$(git describe --tags --abbrev=0 2>/dev/null || true)
ADDED=$(git log "${LASTTAG:+$LASTTAG..}HEAD" --pretty="%s" | grep "^feat:" | sed 's/^feat: /- /')
FIXED=$(git log "${LASTTAG:+$LASTTAG..}HEAD" --pretty="%s" | grep "^fix:" | sed 's/^fix: /- /')
{
printf "## [%s]\n" "$VER"
[ -n "$ADDED" ] && printf "\n### Added\n%s\n" "$ADDED"
[ -n "$FIXED" ] && printf "\n### Fixed\n%s\n" "$FIXED"
printf "\n---\n\n"
} > /tmp/changelog_entry.txt
sed -i".bak" "4r /tmp/changelog_entry.txt" CHANGELOG.md
rm /tmp/changelog_entry.txt CHANGELOG.md.bak
# commit pyproject.toml
-git commit -m "version $VER" pyproject.toml hbd/__init__.py scripts/hbc_mini.py
+git commit -m "version $VER" pyproject.toml hbd/__init__.py scripts/hbc_mini.py README.md CHANGELOG.md
git push
# tag version
git tag -a v$VER -m "Version $VER"
@@ -15,3 +29,4 @@ git push --tags
rm hbd/__init__.py.bak
rm scripts/hbc_mini.py.bak
rm README.md.bak
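The changelog step added to `bumpminor.sh` groups commit subjects by their `feat:` and `fix:` prefixes and emits one release section. A rough Python equivalent of that formatting is sketched below. The function name `changelog_entry` is hypothetical, and it handles only subjects that start with exactly `feat: ` or `fix: `.

```python
def changelog_entry(version, subjects):
    """Build a CHANGELOG section from commit subject lines."""
    added = [s[len("feat: "):] for s in subjects if s.startswith("feat: ")]
    fixed = [s[len("fix: "):] for s in subjects if s.startswith("fix: ")]
    parts = [f"## [{version}]\n"]
    if added:
        parts.append("\n### Added\n" + "\n".join(f"- {s}" for s in added) + "\n")
    if fixed:
        parts.append("\n### Fixed\n" + "\n".join(f"- {s}" for s in fixed) + "\n")
    parts.append("\n---\n\n")
    return "".join(parts)


entry = changelog_entry(
    "5.3.10",
    [
        "feat: auto-scale CPU history graph Y axis",
        "fix: remove bak file in bumpminor.sh",
        "Merge branch 'master' of git.wrede.ca:andreas/heartbeat",
    ],
)
```

Subjects without a recognized prefix, such as merge commits, are left out of the entry.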
+3 -3
@@ -789,7 +789,7 @@ static void plugin_cpu_monitor(conn_t *c, const config_t *cfg) {
* Plugin: memory_monitor
* Linux: /proc/meminfo
* FreeBSD: sysctl vm.stats.vm.*
- * NetBSD: sysctl vm.uvmexp (struct uvmexp)
+ * NetBSD: sysctl vm.uvmexp (struct uvmexp_sysctl)
* ============================================================ */
/* emit the common kvdict fields and send */
@@ -896,9 +896,9 @@ static void plugin_memory_monitor(conn_t *c, const config_t *cfg) {
static void plugin_memory_monitor(conn_t *c, const config_t *cfg) {
(void)cfg;
-struct uvmexp uvm;
+struct uvmexp_sysctl uvm;
 size_t len = sizeof(uvm);
-int mib[2] = {CTL_VM, VM_UVMEXP};
+int mib[2] = {CTL_VM, VM_UVMEXP2};
if (sysctl(mib, 2, &uvm, &len, NULL, 0) != 0) return;
long long ps = uvm.pagesize;
+1 -1
@@ -41,7 +41,7 @@ from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
# updated by scripts/bumpminor.sh
-__version__ = "5.3.7"
+__version__ = "5.3.10"
# ---------------------------------------------------------------------------
# Protocol (mirrors hbd/common/proto.py)