Commit Graph

50 Commits

Author SHA1 Message Date
andreas f640574e4f version 5.2.2
Release / release (push) Successful in 5s
2026-05-06 09:57:43 -04:00
andreas 9a19424279 fix: retry connection on network error instead of permanently dropping it
error_received() no longer sets _dead=True; it just closes the transport
so the existing retry loop in heartbeat_sender (hbc) and sendto (hbc_mini)
reopens the connection on the next interval. This allows hbc to recover
when it starts before network connectivity is established.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 09:57:32 -04:00
andreas f3d08d1c9e version 5.2.1
Release / release (push) Successful in 5s
2026-05-06 07:07:01 -04:00
andreas e931acb9f5 version 5.2.0
Release / release (push) Successful in 5s
2026-05-05 13:47:46 -04:00
andreas a534c06b26 feat: nagios operator for direct exit-code severity mapping
Add ComparisonOperator.NAGIOS ("nagios") that maps Nagios exit codes
directly to alert levels (0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN) without
requiring numeric warning/critical thresholds. Hysteresis is bypassed for
discrete codes. Display template defaults to "{check_name}: {output}".
_format_display() handles None threshold_value gracefully.

Add nagios_runner.status_code as a built-in default threshold config so
nagios checks alert out of the box.

Also: fix alerts.html scrolling (override html,body), make hostname a link
to /plugins#<hostname>, remove overall_status/overall_status_code/plugin_count
from nagios_runner and hbc_mini, replace with computed worst-status in
plugins.html via nagiosWorstStatus() helper.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 12:26:56 -04:00
andreas d7b5c97a4e version 5.1.21
Release / release (push) Successful in 6s
2026-05-05 11:05:48 -04:00
andreas d44ce3d124 version 5.1.20
Release / release (push) Successful in 6s
2026-05-05 10:48:24 -04:00
andreas d7b368c7c6 version 5.1.19
Release / release (push) Successful in 5s
2026-05-04 12:10:01 -04:00
andreas e790663f9f feat: exclude ZFS ARC from memory_percent; add uptime_seconds to cpu_monitor
memory_monitor / hbc_mini: ZFS ARC is reclaimable but not reflected in
MemAvailable by the Linux kernel (not in SReclaimable). Read ARC size
from /proc/spl/kstat/zfs/arcstats and add it to available memory before
computing memory_percent and memory_used. No-op on systems without ZFS.

cpu_monitor: report uptime_seconds via psutil.boot_time() (full client)
and /proc/uptime (hbc_mini).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 12:09:58 -04:00
andreas 475319e248 fix: send boot/shutdown on first open connection, not blindly first in list
Replace break-after-first-iteration with next(c for c in connections if
c.transport) so the message goes to the first connection that actually
has an open transport. Falls back to connections[0] if none are open
yet (sendto will attempt reopen), avoiding silent message loss when the
leading connection is still connecting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 09:59:30 -04:00
andreas ca5ef384a8 version 5.1.18
Release / release (push) Successful in 5s
2026-05-04 09:13:18 -04:00
andreas c93dbdc0f4 fix: settings thresholds show correct per-config metrics; misc hbc fixes
Settings page: pass threshold_checker to http.start so the Threshold
Configurations section has data. Use threshold_checker's already-parsed
ThresholdConfig objects instead of re-parsing the raw nested YAML.
Named (non-default) configs now display only their explicit overrides
via threshold_raw_configs, not the full merged set with defaults.

hbc/hbc_mini: send boot and shutdown messages on first connection only
to avoid duplicate packets when multiple servers are configured.
Replace print("Daemonizing...") with logging.info so output goes to
syslog in daemon mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 09:12:39 -04:00
andreas 74c89d098c version 5.1.17
Release / release (push) Successful in 5s
2026-05-04 08:04:01 -04:00
Andreas Wrede 8da3d550eb version 5.1.16
Release / release (push) Successful in 5s
2026-05-03 06:08:14 -04:00
Andreas Wrede 94cbb31c48 version 5.1.15
Release / release (push) Successful in 6s
2026-05-02 14:37:11 -04:00
Andreas Wrede f50acca509 version 5.1.14
Release / release (push) Successful in 5s
2026-05-02 13:21:40 -04:00
Andreas Wrede 46f8c32c0b version 5.1.13
Release / release (push) Successful in 5s
2026-05-02 12:43:06 -04:00
Andreas Wrede 2bd3a9beb6 feat: restart on SIGHUP in hbc and hbc_mini
Sets dorestart and triggers a clean shutdown; os.execv re-execs
the process with the original arguments after cleanup.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-05-02 10:06:26 -04:00
Andreas Wrede 5523c60866 version 5.1.12
Release / release (push) Successful in 5s
2026-05-02 08:56:04 -04:00
Andreas Wrede 9fd945a481 fix install under docker 2026-05-02 08:32:14 -04:00
Andreas Wrede 26df08eeff version 5.1.11
Release / release (push) Failing after 5s
2026-05-02 07:55:27 -04:00
Andreas Wrede 5819dd6b25 cleanup install script 2026-05-02 07:55:18 -04:00
Andreas Wrede 6fb67f8615 version 5.1.10
Release / release (push) Successful in 5s
2026-05-01 13:50:15 -04:00
Andreas Wrede e70ae6f176 fix: change version in hbc_mini as well 2026-05-01 13:50:04 -04:00
Andreas Wrede a77f6d380c fix: install script should not copy over itself 2026-05-01 12:48:29 -04:00
Andreas Wrede 85ee0e1040 install hbc_mini via package or script 2026-05-01 11:13:33 -04:00
Andreas Wrede c4f09e9ced version 5.1.8
Release / release (push) Successful in 5s
- fix: matrix/sms_voipms notifications blocked the event loop on timeout;
  make send_notification async, dispatch all channel drivers as non-blocking
  tasks (asyncio.to_thread for sync drivers, asyncio.wait_for for async);
  update all call sites to fire-and-forget via create_task
- feat: add /about page with version, runtime, uptime counter, and repo link
- fix: hbc_mini plugin data format now matches full hbc client so Host
  Overview displays memory, disk, and network metrics correctly

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 05:33:27 -04:00
Andreas Wrede b290b21e23 track hbc type and version 2026-04-30 18:22:35 -04:00
Andreas Wrede 462a445235 feat: add hbc_mini single-file client; drop dead connections on protocol error
- scripts/hbc_mini.py: self-contained hbc with no external deps; uses
  /proc for CPU/memory/network on Linux, df for disk, JSON config
- hbc + hbc_mini: mark connection _dead and stop sending on protocol error
- README: document hbc_mini usage, config, and plugin availability
- pyproject.toml: include hbc_mini.py in script-files

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-30 17:50:19 -04:00
Andreas Wrede b6dcce4f35 simplify eventlog usage, fix arguments 2026-04-30 15:38:46 -04:00
Andreas Wrede c5ce41762e feat: update hbc via hb_install.sh instead of code patching
Server now sends a bare UPD command; client runs hb_install.sh to
reinstall from the package registry, then restarts. hb_install.sh
also copies itself alongside hbc on client installs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-30 13:55:15 -04:00
Andreas Wrede 26ca0c095f install.sh --> hb_innstall.sh 2026-04-30 09:54:48 -04:00
Andreas Wrede 1eecd67594 update docu 2026-04-30 09:19:11 -04:00
Andreas Wrede caf3c2c0ac don't error exit on pip insttalled test 2026-04-30 09:16:22 -04:00
Andreas Wrede 2d8166d04a unse python3 -mpip instead of plain pip 2026-04-12 18:44:11 -04:00
Andreas Wrede 2c0328f36d update install.sh to handle missing venv module 2026-04-12 16:39:14 -04:00
Andreas Wrede fb8e27825d make install.sh work on systems withou pip 2026-04-12 14:16:44 -04:00
Andreas Wrede 6556d35f97 Merge branch 'master' of git.wrede.ca:andreas/heartbeat 2026-04-12 06:32:52 -04:00
Andreas Wrede 8d3de01117 Update install script 2026-04-11 16:36:20 -04:00
Andreas Wrede 5bedf026b1 Update install script 2026-04-11 16:19:41 -04:00
Andreas Wrede 68b1c65384 version 5.0.10 2026-04-07 14:15:46 -04:00
Andreas Wrede 2373b55d8b fix actions host label. cp[e woth debian flavor sed 2026-04-04 14:43:07 -04:00
Andreas Wrede c5770006f7 hbc proper termination, hbd config reloadable 2026-04-02 07:17:00 -04:00
Andreas Wrede 0543266c92 Major refactoring of the codebase, including restructuring of files and directories, renaming of modules and classes, and improvements to the overall organization and readability of the code. This refactoring aims to enhance maintainability, scalability, and clarity of the codebase while preserving existing functionality. The changes include:
- Restructuring of the project directory into client and server components
- Renaming of modules and classes to better reflect their purpose and functionality
- Moving common utilities and configurations to a shared location
- Updating import statements to reflect the new structure
- Adding new documentation files for better clarity on various aspects of the project
- Removing deprecated or unused code to streamline the codebase
- Ensuring that all existing functionality is preserved and that the codebase remains functional after the refactoring.
2026-03-29 11:13:40 -04:00
andreas d339133981 collect correct args for restart 2026-02-09 07:25:42 -05:00
andreas 5e6dfc75ad link and flake cleanup 2026-02-08 16:05:03 -05:00
andreas d9ca0b74e2 fix -i option 2026-02-08 14:42:10 -05:00
andreas 83b7139643 typo 2026-02-06 15:53:01 -05:00
andreas 5dca9369dd fix bumpminor.sh script 2026-02-06 15:51:48 -05:00
andreas 4df700e4ef refactor and rewrite for asyncio 2026-02-06 12:34:59 -05:00