fix: don't set stale timer until two plugin samples establish real interval

Avoids false-stale firing for slow plugins (e.g. nagios_runner at 300 s)
when the heartbeat interval is much shorter. On the first sample cancel
any leftover timer; arm the 3× stale timer only after the second sample.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Andreas Wrede
2026-06-06 09:00:09 -04:00
parent e0443293e9
commit 7bab15ae52
+11 -4
@@ -389,10 +389,17 @@ def handle_datagram(msg: dict, addr, transport, ctx: dict):
if k not in ("ID", "plugin", "id", "name")}
# Store plugin data with timestamp
host.add_plugin_data(plugin_name, plugin_data, timestamp=now)
-# Reset stale timer — 3× the heartbeat interval (min 60 s)
-stale_timeout = max(host.interval * 3, 60)
-host.reset_plugin_timer(plugin_name, stale_timeout,
-                        _make_plugin_stale_callback(uname, ctx))
+# Reset stale timer using the observed send interval for this plugin.
+# We need two samples to know the real interval; on the first sample
+# we cancel any leftover timer but don't set a new one, to avoid
+# false-stale firing for slow plugins (e.g. nagios_runner at 300 s).
+history = host.plugin_data.get(plugin_name, [])
+if len(history) >= 2:
+    plugin_interval = max(history[-1][0] - history[-2][0], 1)
+    host.reset_plugin_timer(plugin_name, plugin_interval * 3,
+                            _make_plugin_stale_callback(uname, ctx))
+else:
+    host.cancel_plugin_timer(plugin_name)
# If os_info reports an owner and none is configured server-side, apply it
if plugin_name == "os_info":
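
The timeout rule in this hunk can be sketched in isolation as a pure function. This is only an illustration, not code from the commit: the name `stale_timeout` and its parameters are hypothetical, and it assumes each history entry is a `(timestamp, data)` tuple, which is what the `history[-1][0]` indexing above suggests.

```python
from typing import Any, Optional, Sequence, Tuple

# Assumed shape of one history entry: (timestamp in seconds, plugin data).
Sample = Tuple[float, Any]


def stale_timeout(history: Sequence[Sample],
                  multiplier: float = 3.0,
                  min_interval: float = 1.0) -> Optional[float]:
    """Return the stale timeout in seconds, or None with fewer than two samples.

    Hypothetical helper mirroring the rule in the diff: the interval is the
    gap between the two most recent samples, clamped to at least
    min_interval, then multiplied by 3.
    """
    if len(history) < 2:
        # Interval not yet known; the caller should cancel any leftover timer.
        return None
    observed = history[-1][0] - history[-2][0]
    return max(observed, min_interval) * multiplier


# First sample only: no timer should be armed.
print(stale_timeout([(1000.0, {})]))                 # None
# Two samples 300 s apart (e.g. a slow plugin): 900 s timeout.
print(stale_timeout([(1000.0, {}), (1300.0, {})]))   # 900.0
```

Returning `None` for the one-sample case keeps the "cancel, don't arm" decision with the caller, which matches the `else` branch in the hunk.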