fix: don't set stale timer until two plugin samples establish real interval
Avoids false-stale firing for slow plugins (e.g. nagios_runner at 300 s) when the heartbeat interval is much shorter. On the first sample, cancel any leftover timer; arm the stale timer, at 3× the observed plugin interval, only after the second sample.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
+11
-4
@@ -389,10 +389,17 @@ def handle_datagram(msg: dict, addr, transport, ctx: dict):
                    if k not in ("ID", "plugin", "id", "name")}
     # Store plugin data with timestamp
     host.add_plugin_data(plugin_name, plugin_data, timestamp=now)
-    # Reset stale timer — 3× the heartbeat interval (min 60 s)
-    stale_timeout = max(host.interval * 3, 60)
-    host.reset_plugin_timer(plugin_name, stale_timeout,
-                            _make_plugin_stale_callback(uname, ctx))
+    # Reset stale timer using the observed send interval for this plugin.
+    # We need two samples to know the real interval; on the first sample
+    # we cancel any leftover timer but don't set a new one, to avoid
+    # false-stale firing for slow plugins (e.g. nagios_runner at 300 s).
+    history = host.plugin_data.get(plugin_name, [])
+    if len(history) >= 2:
+        plugin_interval = max(history[-1][0] - history[-2][0], 1)
+        host.reset_plugin_timer(plugin_name, plugin_interval * 3,
+                                _make_plugin_stale_callback(uname, ctx))
+    else:
+        host.cancel_plugin_timer(plugin_name)
 
     # If os_info reports an owner and none is configured server-side, apply it
     if plugin_name == "os_info":
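
The timeout rule introduced by this hunk can be tried in isolation with a minimal sketch like the one below. It is not the project's code: `stale_timeout` and `STALE_MULTIPLIER` are hypothetical names, and the history is assumed to be a list of `(timestamp, data)` tuples, oldest first, which is inferred only from the way the diff indexes `history[-1][0]`.

```python
from typing import Any, Optional, Sequence, Tuple

# Assumption: mirrors the "* 3" factor used in the diff.
STALE_MULTIPLIER = 3


def stale_timeout(history: Sequence[Tuple[float, Any]]) -> Optional[float]:
    """Return the stale timeout in seconds, or None if it is not known yet.

    `history` is assumed to hold (timestamp, data) tuples, oldest first.
    """
    if len(history) < 2:
        # A single sample says nothing about the send interval,
        # so the caller should cancel any old timer and arm none.
        return None
    # Clamp to 1 s so identical or out-of-order timestamps cannot
    # produce a zero or negative timeout.
    interval = max(history[-1][0] - history[-2][0], 1)
    return interval * STALE_MULTIPLIER


first = [(1000.0, {"status": "ok"})]
second = first + [(1300.0, {"status": "ok"})]

print(stale_timeout(first))   # None
print(stale_timeout(second))  # 900.0
```

With samples 300 s apart, the timer is armed at 900 s. Under the removed rule, `max(host.interval * 3, 60)`, a short heartbeat interval (10 s, say, as an illustrative value) would give 60 s, and the plugin would be flagged stale long before its next sample was due.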