version 5.1.17

feat: owner Update/Delete buttons on Host Overview; purge stale alerts on reload

Host Overview (plugins.html): show Update and Delete buttons in the host-right zone when the logged-in user is the host owner (or admin / unauthenticated mode). Buttons link to /u?h=<host> and /d?h=<host> with stopPropagation so they don't toggle the accordion; Delete prompts for confirmation first.

ThresholdChecker.purge_stale_alerts(): removes alert states whose metric_path has no matching threshold in the current config. Called after startup pickle restore and after every SIGHUP config reload so alerts orphaned by upgrades or config changes do not persist indefinitely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
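
The purge described in the commit message can be sketched on plain dictionaries. This is a minimal illustration, not the project's implementation: `purge_stale`, the metric names, and the threshold values are hypothetical, and alert states are reduced to strings.

```python
from typing import Dict, List


def purge_stale(alert_states: Dict[str, str], configured: Dict[str, dict]) -> List[str]:
    """Delete alert states whose metric path has no configured threshold.

    Returns the purged metric paths so the caller can log them.
    """
    # Collect the keys first: deleting from a dict while iterating over it
    # raises RuntimeError.
    stale = [path for path in alert_states if path not in configured]
    for path in stale:
        del alert_states[path]
    return stale


# Hypothetical data: one threshold survived a config reload, one was removed.
alert_states = {
    "cpu_monitor.load_1": "WARNING",
    "disk_monitor.used_pct": "CRITICAL",
}
configured = {"cpu_monitor.load_1": {"warning": 4.0}}

purged = purge_stale(alert_states, configured)
print(purged)
print(alert_states)
```

Running the purge at both startup and reload keeps the two code paths symmetric: a state restored from an old pickle is treated the same way as a state orphaned by a live config change.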
2026-05-04 08:04:01 -04:00 · 2026-05-04 08:03:46 -04:00 · 2026-05-04 13:33:08 +02:00 · 2026-05-04 12:46:35 +02:00 · 2026-05-04 12:29:35 +02:00 · 2026-05-03 13:45:15 -04:00
14 changed files with 248 additions and 36 deletions
@@ -27,6 +27,7 @@ A lightweight daemon that listens for UDP heartbeat messages and acts on them: k
  - Configurable retention and backup management
 - **Plugin system for extensible monitoring** ✅
  - Collect system metrics (CPU, memory, disk, network)
+  - Monitor ZFS pool health, capacity, and I/O via `zpool(8)`
  - Execute existing Nagios monitoring plugins
  - Create custom plugins with simple Python classes
 - **Threshold alerting system** ✅
@@ -34,6 +35,8 @@ A lightweight daemon that listens for UDP heartbeat messages and acts on them: k
  - Hysteresis to prevent alert flapping
  - Automatic notifications on state changes
  - Re-notification for ongoing alerts
+- **Per-host watch flag** — set `watch: false` on any host to silence all notifications for that host without removing its configuration ✅
+- **Role-filtered dashboards** — Live Dashboard and Host Overview show only hosts where the logged-in user is owner or manager (admins see all) ✅
 - Modular codebase suitable for unit testing and CI ✅

 ---
@@ -61,12 +64,16 @@ Heartbeat includes a comprehensive plugin architecture that extends monitoring b
 - `network_monitor`: Monitors network interface statistics, bandwidth, and connections
 - `filesystem_info`: Collects mounted filesystem information (physical filesystems only by default)
 - `nagios_runner`: Executes Nagios monitoring plugins (check_disk, check_load, check_http, etc.)
+- `zfs_monitor`: Monitors ZFS pool health, capacity, fragmentation, dedup ratio, and cumulative I/O via `zpool(8)`

 ### Nagios Integration

 The `nagios_runner` plugin provides seamless integration with the vast Nagios plugin ecosystem. You can run any Nagios-compatible plugin and have the results automatically parsed and stored:

- Executes plugins via subprocess with timeout protection
+- Executes plugins asynchronously (non-blocking) with timeout protection
+- Captures both stdout and stderr; if stdout is empty, stderr is used as the status message
+- Handles signal-killed processes (negative exit code → UNKNOWN status)
+- Validates absolute command paths at startup and warns on missing or non-executable files
 - Parses exit codes (OK/WARNING/CRITICAL/UNKNOWN)
 - Extracts performance data with thresholds
 - Reports aggregated status across all configured checks
@@ -147,9 +154,11 @@ Heartbeat includes a sophisticated threshold alerting system that monitors plugi
 - **Multi-level alerts**: WARNING and CRITICAL severity levels
 - **Flexible operators**: Support for >, >=, <, <=, ==, != comparisons
 - **Hysteresis**: Prevents alert flapping with configurable recovery thresholds
- **Smart notifications**: Alerts only on state changes, not every check
+- **Smart notifications**: Alerts only on state changes, not every check; de-escalations (e.g. CRITICAL → WARNING) do not generate a notification
 - **Re-notifications**: Periodic reminders for ongoing alerts
+- **Short-duration suppression**: Recovery notifications are suppressed for down events under 4 seconds (avoids noise from transient blips)
 - **Journal integration**: All threshold events logged for audit trail
+- **`ping_monitor` thresholds**: Latency and packet-loss thresholds use the same format as all other plugin metrics

 ### Configuration

@@ -363,9 +372,10 @@ Heartbeat includes a built-in HTTP/WebSocket server that provides both a REST AP
 ### Web Dashboards

 - **Login** (`/login`): Browser login form (shown automatically when auth is configured)
- **Live View** (`/live`): Real-time host connectivity, latency, and messages
- **Plugin Metrics** (`/plugins`): Browse and visualize metrics from all plugins
- **Alerts Dashboard** (`/alerts`): Monitor active alerts with severity filtering
+- **Live View** (`/live`): Real-time host connectivity, latency, and messages; hostnames link directly to the Host Overview page
+- **Host Overview** (`/plugins/<host>`): Per-host plugin metrics with ZFS pool visualization; filtered to hosts where the logged-in user is owner or manager (admins see all)
+- **Alerts Dashboard** (`/alerts`): Monitor active alerts with severity filtering; alert count pie chart shown in the navigation bar
+- **Settings** (`/settings`): Server configuration, user management, and threshold configuration viewer

 ### API Endpoints

@@ -476,6 +486,10 @@ plugins:

 All monitoring plugins default to 5-minute (300 second) intervals, but can be customized as needed.

+**Connection retry:** If a server is temporarily unreachable, `hbc` retries `open()` indefinitely on every heartbeat interval. IPv6 connections that never succeeded during early startup are dropped after 3 consecutive failures (to handle hosts without IPv6 routing), while IPv4 connections always retry.
+
+**Daemon logging:** When running with `-d`, `hbc` routes all log output to syslog (`LOG_DAEMON` facility) after daemonizing. Without `-d`, logs go to stderr as usual.
+
 ### hbc_mini — single-file client (no external dependencies)

 `scripts/hbc_mini.py` is a self-contained version of the heartbeat client that requires only Python 3.8+ and no external packages. Copy it to any host and run it directly — no virtualenv, no `pip install`.
@@ -531,8 +545,10 @@ python3 hbc_mini.py -m "maintenance starting" your-server.example.com

 - No YAML config (use JSON instead)
 - No `filesystem_info` plugin
+- No `zfs_monitor` plugin (requires `zpool(8)` and the full plugin loader)
 - `cpu_monitor` does not report per-core usage or CPU frequency (no psutil)
 - Plugins cannot be loaded from external `.py` files — all plugins are compiled in
+- No IPv6 early-fail protection — connections that fail to open at startup are silently skipped rather than retried

 Everything else — heartbeat protocol, ACK/CMD/UPD handling, `hb_install.sh`-based self-update, daemonize, syslog — is identical to the full client.
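
The connection-retry rule from the README text in this hunk can be reduced to a single predicate. The sketch below is an illustration only: `should_drop` is a hypothetical helper, and the limit of 3 is the value stated in the README and commit message.

```python
import socket

IPV6_EARLY_FAIL_LIMIT = 3  # value quoted in the README text


def should_drop(family: int, ever_opened: bool, fail_count: int) -> bool:
    """Return True when a connection should be disabled instead of retried.

    Only IPv6 connections that have never opened successfully are dropped,
    and only once the consecutive-failure count reaches the limit.
    """
    return (
        not ever_opened
        and family == socket.AF_INET6
        and fail_count >= IPV6_EARLY_FAIL_LIMIT
    )


# IPv6 that never came up: dropped on the third failure.
print(should_drop(socket.AF_INET6, ever_opened=False, fail_count=3))
# IPv6 that worked at least once: always retried.
print(should_drop(socket.AF_INET6, ever_opened=True, fail_count=10))
# IPv4: always retried, regardless of history.
print(should_drop(socket.AF_INET, ever_opened=False, fail_count=100))
```

Keeping the decision in a pure function makes the policy easy to unit-test without opening sockets.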

@@ -14,4 +14,4 @@ Install options:
 """

 __all__ = ["__version__"]
-__version__ = "5.1.15"
+__version__ = "5.1.17"
@@ -56,6 +56,8 @@ class AsyncConnection:
        self.transport: Optional[asyncio.DatagramTransport] = None
        self.protocol: Optional[asyncio.DatagramProtocol] = None
        self._dead = False
+        self._ever_opened = False
+        self._open_fail_count = 0   # consecutive failures before first success

        self.logger = logging.getLogger(f"hbc.conn.{addr}")

@@ -73,6 +75,7 @@ class AsyncConnection:
                lambda: HeartbeatProtocol(self),
                family=self.af
            )
+            self._ever_opened = True
            self.logger.debug(f"Opened connection to {self.addr}:{self.port}")
            return True
        except Exception as e:
@@ -262,15 +265,51 @@ async def handle_update(conn: AsyncConnection, _msg: dict):  # pyright: ignore[r


 async def heartbeat_sender(conn: AsyncConnection, interval: int):
-    """Send periodic heartbeats.
+    """Send periodic heartbeats, retrying the connection if it is not open.
+
+    IPv6 connections that fail to open before their first successful send are
+    dropped after IPV6_EARLY_FAIL_LIMIT attempts so that a network without IPv6
+    does not keep a dead sender alive.  IPv4 connections are retried indefinitely.

    Args:
        conn: Connection to send on
        interval: Heartbeat interval in seconds
    """
    logger = logging.getLogger("hbc.heartbeat")
+    IPV6_EARLY_FAIL_LIMIT = 3
+
+    while running and not conn._dead:
+        # Ensure transport is open before attempting to send.
+        if not conn.transport:
+            opened = await conn.open()
+            if opened:
+                conn._open_fail_count = 0
+            else:
+                conn._open_fail_count += 1
+                # Drop an IPv6 connection that has never come up within the
+                # first few attempts — it is likely unavailable on this network.
+                if (not conn._ever_opened
+                        and conn.af == socket.AF_INET6
+                        and conn._open_fail_count >= IPV6_EARLY_FAIL_LIMIT):
+                    logger.warning(
+                        f"IPv6 connection to {conn.addr} unreachable after "
+                        f"{conn._open_fail_count} attempts, disabling"
+                    )
+                    conn._dead = True
+                    break
+                # Retry after the normal interval; IPv4 retries forever.
+                try:
+                    if shutdown_event:
+                        await asyncio.wait_for(shutdown_event.wait(), timeout=interval)
+                        break
+                    else:
+                        await asyncio.sleep(interval)
+                except asyncio.TimeoutError:
+                    pass
+                except asyncio.CancelledError:
+                    raise
+                continue

-    while running:
        try:
            msg = {
                "acks": conn.ackcount,
@@ -279,19 +318,16 @@ async def heartbeat_sender(conn: AsyncConnection, interval: int):
            }
            await conn.sendto(msg, "HTB")

-        except Exception as e:
-            logger.error(f"Error sending heartbeat: {e}", exc_info=True)
        except asyncio.CancelledError:
            logger.debug("Heartbeat sender cancelled")
            raise
+        except Exception as e:
+            logger.error(f"Error sending heartbeat: {e}", exc_info=True)

        # Wait for next interval or shutdown event
        try:
            if shutdown_event:
-                await asyncio.wait_for(
-                    shutdown_event.wait(), 
-                    timeout=interval
-                )
+                await asyncio.wait_for(shutdown_event.wait(), timeout=interval)
                break
            else:
                await asyncio.sleep(interval)
@@ -481,12 +517,13 @@ async def async_main(args, config):
            addr = addr_info[4][0]

            conn = AsyncConnection(conn_id, addr, hb_port, af, iam)
-            if await conn.open():
+            if not await conn.open():
+                logger.warning(f"Initial open to {addr} failed, heartbeat sender will retry")
            connections.append(conn)
            conn_id += 1

    if not connections:
-        logger.error("No connections established")
+        logger.error("No connections established (DNS resolution failed for all hosts)")
        return 1
    
    logger.info(f"Created {len(connections)} connections")
@@ -95,7 +95,7 @@ class Connection:
        if not Null:
            d["addr"] = self.addr
            if self.rtts[-1]:
-                d["rtt"] = "%0.1f" % self.rtts[-1]
+                d["rtt"] = "%d" % round(self.rtts[-1])
            elif self.state == Connection.UNKNOWN:
                d["rtt"] = ""
            else:
@@ -154,6 +154,25 @@ async def start(
        lst = [h.jsons() for h in hosts]
        return web.json_response(json.loads("[" + ",".join(lst) + "]"))

+    async def api_alert_summary(request):
+        """GET /api/0/alert_summary — counts of ok/warning/critical hosts visible to caller."""
+        user, err = _require_auth(request)
+        if err:
+            return err
+        from .threshold import AlertLevel
+        critical = warning = ok = 0
+        for host in hbdclass.Host.hosts.values():
+            if not _can_operate_host(user, host):
+                continue
+            levels = {s.level for s in host.alert_states.values()}
+            if AlertLevel.CRITICAL in levels:
+                critical += 1
+            elif AlertLevel.WARNING in levels:
+                warning += 1
+            else:
+                ok += 1
+        return web.json_response({"critical": critical, "warning": warning, "ok": ok})
+
    async def api_messages(request):
        lst = data.msgs[-30:]
        return web.json_response(lst)
@@ -518,6 +537,7 @@ async def start(
                hosts_with_plugins.append({
                    "name": hostname,
                    "plugins": list(host.plugin_data.keys()),
+                    "is_owner": _can_own_host(current_user, host),
                })

        tmpl = env.get_template("plugins.html")
@@ -893,6 +913,7 @@ async def start(
            web.get("/api/0/users/{username}/avatar", api_user_avatar),
            # Hosts
            web.get("/api/0/hosts", api_hosts),
+            web.get("/api/0/alert_summary", api_alert_summary),
            web.get("/api/0/messages", api_messages),
            web.get("/api/0/hosts/{hostname}/plugins", api_host_plugins),
            web.get("/api/0/hosts/{hostname}/plugins/{plugin_name}", api_host_plugin_detail),
@@ -101,9 +101,10 @@ async def reload_configuration(config_obj, config_path, components):
            access = config_mod.get_host_access(new_config, hostname)
            host.apply_access(access["owner"], access["managers"], access["monitors"])

-        # Reload threshold checker
+        # Reload threshold checker and prune alerts orphaned by the new config
        if 'threshold_checker' in components:
            components['threshold_checker'].reload(new_config)
+            components['threshold_checker'].purge_stale_alerts(hbdclass)
        
        # Note: Changes to the following require restart:
        # - hb_port, hbd_port, ws_port (already bound)
@@ -241,6 +242,10 @@ async def _run_async(config, config_path=None):
    )
    udp.restore_connection_timers(hbdclass, restore_ctx)

+    # Drop alert states that no longer have a matching threshold (stale after
+    # upgrade or config change between runs).
+    threshold_checker.purge_stale_alerts(hbdclass)
+
    # HTTP server (asyncio-based via aiohttp)
    try:
        http_task = asyncio.create_task(
@@ -4,6 +4,11 @@

  <style>

+    body {
+      height: auto;
+      overflow-y: auto;
+    }
+
    .container {
      max-width: 1400px;
      margin: 0 auto;
@@ -126,11 +126,17 @@
      }

      /* Swiss railway clock — nav */
-      .nav-clock {
+      .nav-pie {
        flex-shrink: 0;
        line-height: 0;
        margin-left: auto;
        padding: 4px 4px 4px 0;
+      }
+      #alert-pie { display: block; cursor: default; }
+      .nav-clock {
+        flex-shrink: 0;
+        line-height: 0;
+        padding: 4px 4px 4px 0;
        cursor: pointer;
      }
      #swiss-clock { display: block; }
@@ -408,7 +408,7 @@
        );
        if (data.connections[i].state == "up") {
          state = '<span class="state-up">up</span>';
-          latency = Number.parseFloat(data.connections[i].rtts[0]).toFixed(2);
+          latency = String(Math.round(Number.parseFloat(data.connections[i].rtts[0])));
        } else {
          if (data.connections[i].state == "unknown") {
            state = "";
@@ -11,6 +11,9 @@
    {% endif %}
    <a href="/about"{% if active_page == "about" %} class="active"{% endif %}>About</a>
  </div>
+  <div class="nav-pie" title="Host alert status">
+    <canvas id="alert-pie" width="44" height="44"></canvas>
+  </div>
  <div class="nav-clock" title="Click for full-screen clock">
    <canvas id="swiss-clock" width="44" height="44"></canvas>
  </div>
@@ -42,4 +45,52 @@
      });
    }
  })();
+
+  function drawAlertPie(critical, warning, ok) {
+    var canvas = document.getElementById('alert-pie');
+    if (!canvas) return;
+    var ctx = canvas.getContext('2d');
+    var SIZE = canvas.width;
+    var R = SIZE / 2;
+    ctx.clearRect(0, 0, SIZE, SIZE);
+    var total = critical + warning + ok;
+    if (total === 0) {
+      ctx.beginPath();
+      ctx.arc(R, R, R - 1, 0, Math.PI * 2);
+      ctx.fillStyle = '#ccc';
+      ctx.fill();
+      return;
+    }
+    var slices = [
+      { value: critical, color: '#e53935' },
+      { value: warning,  color: '#ffb300' },
+      { value: ok,       color: '#43a047' }
+    ];
+    var start = -Math.PI / 2;
+    slices.forEach(function(s) {
+      if (s.value === 0) return;
+      var sweep = (s.value / total) * Math.PI * 2;
+      ctx.beginPath();
+      ctx.moveTo(R, R);
+      ctx.arc(R, R, R - 1, start, start + sweep);
+      ctx.closePath();
+      ctx.fillStyle = s.color;
+      ctx.fill();
+      start += sweep;
+    });
+  }
+
+  function updateAlertPie() {
+    fetch('/api/0/alert_summary').then(function(r) {
+      if (!r.ok) return;
+      return r.json();
+    }).then(function(d) {
+      if (d) drawAlertPie(d.critical || 0, d.warning || 0, d.ok || 0);
+    }).catch(function() {});
+  }
+
+  document.addEventListener('DOMContentLoaded', function() {
+    updateAlertPie();
+    setInterval(updateAlertPie, 30000);
+  });
 </script>
@@ -131,6 +131,27 @@
      text-overflow: ellipsis;
    }

+    .host-action-btn {
+      font-size: 0.75em;
+      font-weight: bold;
+      padding: 3px 10px;
+      border-radius: 4px;
+      border: none;
+      cursor: pointer;
+      text-decoration: none;
+      white-space: nowrap;
+    }
+    .host-action-btn.update-btn {
+      background: #e3f2fd;
+      color: #1565c0;
+    }
+    .host-action-btn.update-btn:hover { background: #bbdefb; }
+    .host-action-btn.delete-btn {
+      background: #ffebee;
+      color: #c62828;
+    }
+    .host-action-btn.delete-btn:hover { background: #ffcdd2; }
+
    /* ── Host body ──────────────────────────────────────────────── */

    .host-body {
@@ -379,6 +400,14 @@
              <span class="nagios-badge" id="nagios-badge-{{ host.name }}">—</span>
              {% endif %}
              <span class="os-label" id="os-label-{{ host.name }}"></span>
+              {% if host.is_owner %}
+              <a class="host-action-btn update-btn"
+                 href="/u?h={{ host.name }}"
+                 onclick="event.stopPropagation()">Update</a>
+              <a class="host-action-btn delete-btn"
+                 href="/d?h={{ host.name }}"
+                 onclick="event.stopPropagation(); return confirm('Delete host {{ host.name }}?')">Delete</a>
+              {% endif %}
            </div>
          </div>

@@ -803,6 +803,29 @@ class ThresholdChecker:
            self._check_pending_or_renotify(host_name, alert_state, metric_path, value, threshold, None)

        return None
+    def _find_threshold(
+        self, thresholds: Dict[str, "ThresholdConfig"], metric_path: str
+    ) -> Optional["ThresholdConfig"]:
+        """Return the threshold for *metric_path*, falling back to suffix matches.
+
+        Allows generic thresholds like ``ping_monitor.rtt_avg`` to match
+        fully-qualified paths like ``ping_monitor.8_8_8_8_rtt_avg``.
+        The exact match is always tried first; then successive leading
+        underscore-delimited segments are stripped from the field name until
+        a match is found or no segments remain.
+        """
+        if metric_path in thresholds:
+            return thresholds[metric_path]
+        plugin, sep, field = metric_path.partition(".")
+        if not sep:
+            return None
+        parts = field.split("_")
+        for i in range(1, len(parts)):
+            candidate = plugin + "." + "_".join(parts[i:])
+            if candidate in thresholds:
+                return thresholds[candidate]
+        return None
+
    def check_plugin_data(
        self,
        host_name: str,
@@ -831,11 +854,10 @@ class ThresholdChecker:
        for metric_name, value in data.items():
            metric_path = f"{plugin_name}.{metric_name}"
            
-            if metric_path not in thresholds:
+            threshold = self._find_threshold(thresholds, metric_path)
+            if threshold is None:
                continue
            
-            threshold = thresholds[metric_path]
-            
            # Get or create alert state
            if metric_path not in alert_states:
                alert_states[metric_path] = AlertState(metric_path)
@@ -1254,6 +1276,26 @@ class ThresholdChecker:
            alert_state.last_notification = now
            alert_state.notification_count += 1
    
+    def purge_stale_alerts(self, hbdclass) -> None:
+        """Remove alert states that have no matching threshold configuration.
+
+        Called after startup (pickle restore) and after each config reload so
+        that alerts orphaned by configuration changes do not linger forever.
+        Alerts whose metric_path is not present in the current threshold config
+        for that host are silently dropped.
+        """
+        for hostname, host in hbdclass.Host.hosts.items():
+            if not host.alert_states:
+                continue
+            configured = self.get_thresholds_for_host(hostname)
+            stale = [mp for mp in host.alert_states if mp not in configured]
+            for mp in stale:
+                logger.info(
+                    "Purging stale alert state for %s / %s (no threshold configured)",
+                    hostname, mp,
+                )
+                del host.alert_states[mp]
+
    def get_active_alerts(self, alert_states: Dict[str, AlertState]) -> list:
        """
        Get all currently active (non-OK) alerts.
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "hbd"
-version = "5.1.15"
+version = "5.1.17"
 description = "Heartbeat monitoring system — client (hbc) and server (hbd)"
 readme = "README.md"
 requires-python = ">=3.11"
@@ -41,7 +41,7 @@ from pathlib import Path
 from typing import Any, Dict, List, Optional, Tuple

 # updated by scripts/bumpminor.sh
-__version__ = "5.1.15"
+__version__ = "5.1.17"

 # ---------------------------------------------------------------------------
 # Protocol  (mirrors hbd/common/proto.py)
Author	SHA1	Message	Date
andreas	74c89d098c	version 5.1.17	2026-05-04 08:04:01 -04:00
andreas	3301dbfe34	feat: owner Update/Delete buttons on Host Overview; purge stale alerts on reload Host Overview (plugins.html): show Update and Delete buttons in the host-right zone when the logged-in user is the host owner (or admin / unauthenticated mode). Buttons link to /u?h=<host> and /d?h=<host> with stopPropagation so they don't toggle the accordion; Delete prompts for confirmation first. ThresholdChecker.purge_stale_alerts(): removes alert states whose metric_path has no matching threshold in the current config. Called after startup pickle restore and after every SIGHUP config reload so alerts orphaned by upgrades or config changes do not persist indefinitely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 08:03:46 -04:00
andreas	d00d903e7d	fix: make Alerts page scrollable Override the global style.css body height/overflow that locks all pages to the viewport height (a remnant of the old drawer-menu layout). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 13:33:08 +02:00
andreas	babb5d61aa	docs: update README with changes since `917d6a4` - ZFS monitor plugin (zfs_monitor) added to plugin list and features - nagios_runner: async execution, stderr capture, signal handling, path validation - Threshold alerting: de-escalation suppression, short-duration suppression, ping_monitor thresholds - Per-host watch flag and role-filtered dashboards - HTTP API & Web UI: hostname links in Live View, Host Overview with ZFS renderer, alert pie chart in nav bar, Settings threshold viewer - hbc connection retry: indefinite retry for IPv4; IPv6 dropped after 3 early startup failures - hbc daemon mode: logs routed to syslog after daemonizing - hbc_mini: noted zfs_monitor and IPv6 early-fail protection not available Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 12:46:35 +02:00
andreas	11d1c718b3	feat: retry AsyncConnection.open() indefinitely; drop IPv6 only on early startup failure IPv4 connections are retried forever in heartbeat_sender if open() fails, so a temporary network outage does not terminate the sender. IPv6 connections that have never opened successfully are dropped after IPV6_EARLY_FAIL_LIMIT (3) consecutive failures so that a network without IPv6 support does not keep a dead sender running. At startup all resolved connections are added to the list regardless of whether the initial open() succeeds; the heartbeat_sender loop handles the first real connection attempt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 12:29:35 +02:00
Andreas Wrede	a99b6b54c7	feat: add alert pie chart to nav bar Show a colour-coded pie chart (red=critical, yellow=warning, green=ok) to the left of the clock in the nav bar. Backed by a new GET /api/0/alert_summary endpoint that counts hosts per alert level for the current user's visible hosts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-03 13:45:15 -04:00
Andreas Wrede	8da3d550eb	version 5.1.16	2026-05-03 06:08:14 -04:00
Andreas Wrede	a76d0fc840	feat: generic ping_monitor thresholds; round RTT to nearest ms - threshold.py: add _find_threshold() with suffix fallback so thresholds like ping_monitor.rtt_avg match ping_monitor.8_8_8_8_rtt_avg etc.; each pinged host keeps its own alert state - hbdclass.py: format RTT as integer ms (round()) - live.html: JS RTT display rounded to nearest ms (Math.round) Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 06:08:11 -04:00
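
The suffix fallback described in commit a76d0fc840 can be tried in isolation. The function below follows the logic of `_find_threshold` as it appears in the diff, rewritten as a free function over plain dicts; the threshold values and the `packet_loss` metric name are hypothetical.

```python
from typing import Dict, Optional


def find_threshold(thresholds: Dict[str, dict], metric_path: str) -> Optional[dict]:
    """Look up a threshold by exact path, then by progressively shorter suffixes."""
    if metric_path in thresholds:
        return thresholds[metric_path]
    plugin, sep, field = metric_path.partition(".")
    if not sep:
        return None  # no "plugin.field" structure, nothing to strip
    parts = field.split("_")
    # Strip leading underscore-delimited segments one at a time:
    # 8_8_8_8_rtt_avg -> 8_8_8_rtt_avg -> ... -> rtt_avg -> avg
    for i in range(1, len(parts)):
        candidate = plugin + "." + "_".join(parts[i:])
        if candidate in thresholds:
            return thresholds[candidate]
    return None


# Hypothetical generic threshold shared by every pinged target.
thresholds = {"ping_monitor.rtt_avg": {"warning": 100, "critical": 250}}

generic = find_threshold(thresholds, "ping_monitor.8_8_8_8_rtt_avg")
unmatched = find_threshold(thresholds, "ping_monitor.packet_loss")
other_plugin = find_threshold(thresholds, "cpu_monitor.rtt_avg")
print(generic, unmatched, other_plugin)
```

One consequence worth checking: alert states stay keyed by the fully qualified path (for example `ping_monitor.8_8_8_8_rtt_avg`), while the configuration may hold only the generic key. A purge step that compares those keys by exact membership would treat such states as unconfigured. Whether that applies to `purge_stale_alerts()` depends on what `get_thresholds_for_host()` returns, which this page does not show.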