hbc/server: request InfoPlugin refresh when host has no plugin data; update docs
- Server sets request_update=1 in ACK when host.plugin_data is empty
- hbc: AsyncConnection.request_info_event; handle_ack sets it on request_update
- hbc: _info_plugin_refresh_loop clears InfoPlugin caches and resends on demand
- hbc_mini: same via _request_info event and _info_refresh_loop
- docs/USERS.md: document client-declared owner config key
- docs/PLUGIN_DEVELOPMENT.md: document server-initiated InfoPlugin refresh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -8,6 +8,7 @@ This guide explains how to create custom plugins for the Heartbeat monitoring sy
 - [Plugin Types](#plugin-types)
 - [Creating a Plugin](#creating-a-plugin)
 - [Plugin Lifecycle](#plugin-lifecycle)
+- [Server-initiated InfoPlugin refresh](#server-initiated-infoplugin-refresh)
 - [Configuration](#configuration)
 - [Best Practices](#best-practices)
 - [Examples](#examples)
@@ -250,6 +251,28 @@ Understanding the plugin lifecycle helps you implement plugins correctly:
 └─> Plugin releases resources, closes connections
 ```
 
+## Server-initiated InfoPlugin refresh
+
+When a heartbeat packet arrives from a host the server has no plugin data for (e.g. after a server restart), the server sets `request_update = 1` in the ACK reply. The client detects this flag and immediately re-runs all InfoPlugins (clearing their cached results first), then resends the data as PLG messages.
+
+This means InfoPlugin data will always reach the server as soon as possible without requiring a client restart. No action is needed from plugin authors: the framework handles cache invalidation and re-collection automatically.
+
+The lifecycle for this case looks like:
+
+```
+Server restarts, host reconnects
+└─> hbd receives HTB with no existing plugin_data for host
+└─> hbd sets request_update=1 in ACK
+
+Client receives ACK
+└─> Detects request_update flag
+└─> Clears _cache on every registered InfoPlugin
+└─> Calls collect() on each InfoPlugin
+└─> Sends fresh PLG messages to server
+```
+
+If you write an `InfoPlugin` with side effects in `_collect_info()` (opening connections, writing files, etc.), be aware it may be called more than once per client session when this mechanism triggers.
+
 ## Configuration
 
 ### Plugin-Specific Configuration
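The side-effect warning in the documentation hunk above can be illustrated with a minimal sketch. `CachedInfoPlugin` is a hypothetical stand-in, not the project's `InfoPlugin` base class. It assumes only what the added docs state: `collect()` serves a cached result until `_cache` is cleared, and `_collect_info()` does the actual work.

```python
import asyncio
from typing import Any, Dict, Optional


class CachedInfoPlugin:
    """Hypothetical stand-in for an InfoPlugin; not the project's real base class."""

    name = "demo_info"

    def __init__(self) -> None:
        self._cache: Optional[Dict[str, Any]] = None
        self.collect_calls = 0  # counts how often the uncached path runs

    async def _collect_info(self) -> Dict[str, Any]:
        # Keep this idempotent: no file appends, no connections left open.
        self.collect_calls += 1
        return {"hostname": "example-host"}

    async def collect(self) -> Dict[str, Any]:
        # Serve the cached result until something clears _cache.
        if self._cache is None:
            self._cache = await self._collect_info()
        return self._cache


async def demo() -> int:
    plugin = CachedInfoPlugin()
    await plugin.collect()   # startup collection
    await plugin.collect()   # served from cache
    plugin._cache = None     # what a server-requested refresh does
    await plugin.collect()   # _collect_info() runs a second time
    return plugin.collect_calls
```

Running `asyncio.run(demo())` shows `_collect_info()` executing twice in one session, which is why any side effect inside it should be safe to repeat.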
@@ -46,6 +46,24 @@ default_owner: andreas # owns hosts with no explicit owner
 # falls back to the first admin user if omitted
 ```
 
+### Client-declared host ownership
+
+A host can declare its own owner directly in the hbc or hbc_mini client configuration. This is useful for hosts that are not listed in the server config, or during initial setup before a server-side config entry has been created.
+
+**`~/.hbc.yaml`** (hbc):
+```yaml
+owner: andreas
+```
+
+**`~/.hbc.json`** (hbc_mini):
+```json
+{ "owner": "andreas" }
+```
+
+When set, the value is included in the `os_info` plugin data sent to the server. The server applies it as `host.owner` the first time `os_info` arrives, provided no owner has been configured server-side for that host. Server-configured ownership always takes precedence.
+
+---
+
 ### Assigning roles to hosts
 
 ```yaml
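The precedence rule described in the USERS.md hunk above could be sketched as follows. The server-side code that applies the owner is not part of this diff, so `Host`, its fields, and `apply_client_owner` are hypothetical names chosen for illustration. The sketch also leaves out `default_owner`, because the diff does not say how the two interact.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Host:
    """Hypothetical host record; the real server object is not shown in this diff."""
    name: str
    owner: Optional[str] = None
    owner_from_server_config: bool = False


def apply_client_owner(host: Host, os_info: Dict[str, Any]) -> None:
    """Apply a client-declared owner unless the host already has one."""
    declared = os_info.get("owner")
    if not declared:
        return  # client did not declare an owner
    if host.owner_from_server_config or host.owner is not None:
        return  # server config wins; later os_info packets do not overwrite
    host.owner = declared


unlisted = Host("web1")
apply_client_owner(unlisted, {"owner": "andreas"})

configured = Host("db1", owner="root", owner_from_server_config=True)
apply_client_owner(configured, {"owner": "andreas"})
```

After this runs, `unlisted.owner` is `"andreas"` while `configured.owner` stays `"root"`.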
+36
-15
@@ -59,6 +59,7 @@ class AsyncConnection:
         self._dead = False
         self._ever_opened = False
         self._open_fail_count = 0  # consecutive failures before first success
+        self.request_info_event: asyncio.Event = asyncio.Event()
 
         self.logger = logging.getLogger(f"hbc.conn.{addr}")
 
@@ -138,6 +139,9 @@ class AsyncConnection:
 
         self.ackcount += 1
         self.logger.debug(f"ACK received, RTT: {rtt:.1f}ms")
+        if msg.get("request_update"):
+            self.logger.info("server requested plugin info refresh")
+            self.request_info_event.set()
 
 
 class HeartbeatProtocol(asyncio.DatagramProtocol):
@@ -338,6 +342,26 @@ async def heartbeat_sender(conn: AsyncConnection, interval: int):
         raise
 
 
+async def _info_plugin_refresh_loop(conn: AsyncConnection, info_plugins: List):
+    """Wait for server requests to re-send InfoPlugin data."""
+    logger = logging.getLogger("hbc.plugins")
+    while running:
+        await conn.request_info_event.wait()
+        if not running:
+            break
+        conn.request_info_event.clear()
+        logger.info("refreshing InfoPlugins on server request")
+        for plugin in info_plugins:
+            plugin._cache = None
+            try:
+                data = await plugin.collect()
+                if data:
+                    await conn.sendto({"plugin": plugin.name, **data}, "PLG")
+                    logger.info(f"Resent {plugin.name} data")
+            except Exception as e:
+                logger.error(f"Error re-collecting {plugin.name}: {e}", exc_info=True)
+
+
 async def plugin_collector(conn: AsyncConnection, registry: PluginRegistry):
     """Collect and send plugin data.
 
@@ -369,24 +393,21 @@ async def plugin_collector(conn: AsyncConnection, registry: PluginRegistry):
     for plugin in monitor_plugins:
         by_interval[plugin.interval].append(plugin)
 
-    # Create tasks for each interval
-    tasks = []
+    # Create tasks for each interval; always include the info-refresh watcher
+    tasks = [asyncio.create_task(_info_plugin_refresh_loop(conn, info_plugins))]
     for interval, plugins in by_interval.items():
-        task = asyncio.create_task(
+        tasks.append(asyncio.create_task(
             plugin_collector_interval(conn, plugins, interval)
-        )
-        tasks.append(task)
+        ))
 
-    # Wait for all tasks
-    if tasks:
-        try:
-            await asyncio.gather(*tasks, return_exceptions=True)
-        except asyncio.CancelledError:
-            logger.debug("Plugin collector cancelled, cancelling sub-tasks")
-            for task in tasks:
-                if not task.done():
-                    task.cancel()
-            raise
+    try:
+        await asyncio.gather(*tasks, return_exceptions=True)
+    except asyncio.CancelledError:
+        logger.debug("Plugin collector cancelled, cancelling sub-tasks")
+        for task in tasks:
+            if not task.done():
+                task.cancel()
+        raise
 
 
 async def plugin_collector_interval(
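The refresh loop in the hunks above follows a wait, clear, work pattern on an `asyncio.Event`. The sketch below isolates that pattern so it can be run on its own. `FakeConnection`, `refresh_loop`, and the payload contents are hypothetical stand-ins, not the project's classes: the connection records PLG payloads in a list instead of sending UDP datagrams.

```python
import asyncio
from typing import Any, Dict, List


class FakeConnection:
    """Hypothetical connection: records PLG payloads instead of sending UDP."""

    def __init__(self) -> None:
        self.request_info_event = asyncio.Event()
        self.sent: List[Dict[str, Any]] = []

    def handle_ack(self, msg: Dict[str, Any]) -> None:
        # Same idea as the ACK handling in the diff: a truthy flag wakes the loop.
        if msg.get("request_update"):
            self.request_info_event.set()

    async def sendto(self, msg: Dict[str, Any], msg_id: str) -> None:
        self.sent.append({"ID": msg_id, **msg})


async def refresh_loop(conn: FakeConnection) -> None:
    while True:
        await conn.request_info_event.wait()
        conn.request_info_event.clear()  # re-arm before doing the work
        await conn.sendto({"plugin": "os_info", "kernel": "demo"}, "PLG")


async def demo() -> List[Dict[str, Any]]:
    conn = FakeConnection()
    task = asyncio.create_task(refresh_loop(conn))
    conn.handle_ack({"time": 0.0})                       # ordinary ACK: no refresh
    await asyncio.sleep(0)
    conn.handle_ack({"time": 1.0, "request_update": 1})  # server has no plugin data
    await asyncio.sleep(0.01)                            # let the loop run once
    task.cancel()                                        # shutdown path
    await asyncio.gather(task, return_exceptions=True)
    return conn.sent
```

Clearing the event before the work starts means a flag that arrives during a refresh triggers one more pass rather than being lost. Cancelling the task is what ends the loop here, since a task blocked in `wait()` never re-checks a loop condition by itself.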
+3
-1
@@ -350,8 +350,10 @@ def handle_datagram(msg: dict, addr, transport, ctx: dict):
 
     if msg.get("ID") == "HTB":
         host.doesack = msg.get("acks", -1)
-        # send ACK back
+        # send ACK back; ask client to resend plugin info when we have none yet
         rmsg = {"time": time.time()}
+        if not host.plugin_data:
+            rmsg["request_update"] = 1
         opkt = dicttos("ACK", rmsg)
         try:
             transport.sendto(opkt, addr)
+22
-8
@@ -791,7 +791,7 @@ class _HeartbeatProtocol(asyncio.DatagramProtocol):
         msg_id = msg.get("ID")
         now = time.time()
         if msg_id == "ACK":
-            self._conn._handle_ack(now)
+            self._conn._handle_ack(msg, now)
         elif msg_id == "CMD":
             asyncio.create_task(_handle_command(self._conn, msg))
         elif msg_id == "UPD":
@@ -818,6 +818,7 @@ class AsyncConnection:
         self.rtts: List[float] = [0.0]
         self._transport: Optional[asyncio.DatagramTransport] = None
         self._dead = False
+        self._request_info: asyncio.Event = asyncio.Event()
         self._log = logging.getLogger(f"hbc.conn.{addr}")
 
     async def open(self) -> bool:
@@ -836,12 +837,14 @@ class AsyncConnection:
             self._transport.close()
             self._transport = None
 
-    def _handle_ack(self, now: float):
+    def _handle_ack(self, msg: Dict[str, Any], now: float):
         rtt = (now - self.lastsend) * 1000.0
         self.rtts.append(rtt)
         if len(self.rtts) > 10:
             self.rtts.pop(0)
         self.ackcount += 1
+        if msg.get("request_update"):
+            self._request_info.set()
 
     async def sendto(self, msg: Dict[str, Any], msg_id: str = "HTB"):
         if self._dead:
@@ -974,6 +977,19 @@ async def _run_monitor_group(conn: AsyncConnection, plugins: List[Plugin], inter
         await _sleep(interval)
 
 
+async def _info_refresh_loop(conn: AsyncConnection, info: List[Plugin]):
+    log = logging.getLogger("hbc.plugins")
+    while _running:
+        await conn._request_info.wait()
+        if not _running:
+            break
+        conn._request_info.clear()
+        log.info("refreshing InfoPlugins on server request")
+        for plugin in info:
+            plugin._cache = None
+        await _run_info_plugins(conn, info)
+
+
 async def _plugin_collector(conn: AsyncConnection, plugins: List[Plugin]):
     info = [p for p in plugins if isinstance(p, InfoPlugin)]
     monitor = [p for p in plugins if isinstance(p, MonitorPlugin)]
@@ -984,12 +1000,10 @@ async def _plugin_collector(conn: AsyncConnection, plugins: List[Plugin]):
     for p in monitor:
         by_interval[p.interval].append(p)
 
-    if by_interval:
-        await asyncio.gather(
-            *[asyncio.create_task(_run_monitor_group(conn, grp, iv))
-              for iv, grp in by_interval.items()],
-            return_exceptions=True,
-        )
+    tasks = [asyncio.create_task(_info_refresh_loop(conn, info))]
+    tasks += [asyncio.create_task(_run_monitor_group(conn, grp, iv))
+              for iv, grp in by_interval.items()]
+    await asyncio.gather(*tasks, return_exceptions=True)
 
 
 # ---------------------------------------------------------------------------