5 Commits

Author SHA1 Message Date
Andreas Wrede 0f90be659e fix: correct ZFS pool status threshold operator and add per-metric grace
The default zfs_monitor.*.status threshold used operator '>' with warning=1,
so a DEGRADED pool (status=1) never alerted (1 > 1 is false) and a FAULTED
pool (status=2) only triggered WARNING instead of CRITICAL.

Fix the operator to '>=' in THRESHOLD_DEFAULTS and the example config.
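
The off-by-one described above can be reproduced with a minimal sketch. The `evaluate` helper and the `critical=2` value are assumptions made for illustration (the message states only `warning=1`, with FAULTED expected to reach CRITICAL at status=2); this is not the project's actual evaluation code.

```python
import operator

# Hypothetical operator table; the real project may resolve operators differently.
OPS = {">": operator.gt, ">=": operator.ge}

def evaluate(value, op, warning, critical):
    """Return the alert level for a value, checking critical before warning."""
    compare = OPS[op]
    if compare(value, critical):
        return "CRITICAL"
    if compare(value, warning):
        return "WARNING"
    return "OK"

# Status codes taken from the commit message: DEGRADED=1, FAULTED=2.
# ONLINE=0 and critical=2 are assumptions.
for status, name in enumerate(("ONLINE", "DEGRADED", "FAULTED")):
    old = evaluate(status, ">", warning=1, critical=2)
    new = evaluate(status, ">=", warning=1, critical=2)
    print(f"{name}: '>' gives {old}, '>=' gives {new}")
```

With `>`, DEGRADED evaluates to OK and FAULTED to WARNING; with `>=`, they evaluate to WARNING and CRITICAL respectively.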

Also add a per-metric grace period override (ThresholdConfig.grace) so
individual thresholds can bypass or shorten the global grace delay. Alerts
with grace=0 fire immediately on state change rather than waiting for a
second collection cycle. Set grace=0 on zfs_monitor.*.status so pool
degradation alerts fire on the first data report after the event.
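
One way the per-metric override could behave is sketched below. Only the name `ThresholdConfig.grace` comes from the commit message; the other fields, the `GLOBAL_GRACE` value, and the `effective_grace` and `should_fire` helpers are hypothetical and may not match the real implementation.

```python
from dataclasses import dataclass
from typing import Optional

GLOBAL_GRACE = 60.0  # seconds; hypothetical global default

@dataclass
class ThresholdConfig:
    warning: float
    critical: float
    operator: str = ">="
    # None means "fall back to the global grace delay" (assumed semantics).
    grace: Optional[float] = None

def effective_grace(cfg: ThresholdConfig, global_grace: float = GLOBAL_GRACE) -> float:
    """Pick the per-metric grace when set, otherwise the global one."""
    return global_grace if cfg.grace is None else cfg.grace

def should_fire(cfg: ThresholdConfig, breached_for: float,
                global_grace: float = GLOBAL_GRACE) -> bool:
    """Fire once the breach has lasted at least the effective grace period."""
    return breached_for >= effective_grace(cfg, global_grace)

zfs_status = ThresholdConfig(warning=1, critical=2, grace=0)
default_metric = ThresholdConfig(warning=80, critical=95)

print(should_fire(zfs_status, breached_for=0.0))      # grace=0: fires on first report
print(should_fire(default_metric, breached_for=0.0))  # waits for the global grace
```

Using `None` rather than `0` as the "not set" marker keeps `grace=0` available as an explicit request to alert immediately.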

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 06:33:06 -04:00
andreas b95f1a5bb7 fix: agree: zpool ONLINE=OK, DEGRADED=WARNING, all else is CRITICAL 2026-05-08 17:18:41 -04:00
andreas 217bba1b76 fix: change health_ok to status 2026-05-08 16:57:45 -04:00
andreas c20245b0ab docs: document ZFS pool health alerting; fix pushover sound+url_title 2026-05-08 16:25:55 -04:00
Andreas Wrede 0543266c92 Major refactoring of the codebase: files and directories restructured, modules and classes renamed, and overall organization and readability improved. The refactoring aims to improve maintainability, scalability, and clarity while preserving existing functionality. The changes include:
- Restructuring of the project directory into client and server components
- Renaming of modules and classes to better reflect their purpose and functionality
- Moving common utilities and configurations to a shared location
- Updating import statements to reflect the new structure
- Adding new documentation files for better clarity on various aspects of the project
- Removing deprecated or unused code to streamline the codebase
- Ensuring that all existing functionality is preserved after the refactoring
2026-03-29 11:13:40 -04:00