per-client threshold config

Andreas Wrede
2026-04-01 15:22:42 -04:00
parent 079e84f729
commit 090d341244
7 changed files with 873 additions and 77 deletions
+217
@@ -56,6 +56,7 @@ thresholds:
critical: 90.0
operator: ">"
hysteresis: 0.1
display: "display format"
enabled: true
```
@@ -82,6 +83,8 @@ Note: At least one of `warning` or `critical` must be specified.
  - Range: 0.0 to 1.0
  - Prevents rapid state transitions when the value hovers near the threshold
- **display**: Format string (f-string-style placeholders) that controls how the threshold is displayed in alert messages
  - Default: `"(threshold: {op_symbol} {threshold_value})"`
- **enabled**: Whether this threshold is active (default: `true`)
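
As a rough illustration of how a `display` template could be expanded, the sketch below fills the two placeholder names from the default value using Python's `str.format`. The helper name `render_threshold_display` is hypothetical, and the project's actual rendering code may work differently (for example, it may expose more placeholders).

```python
DEFAULT_DISPLAY = "(threshold: {op_symbol} {threshold_value})"


def render_threshold_display(op_symbol, threshold_value, display=None):
    """Expand a display template (hypothetical helper, not the project's API)."""
    template = display if display is not None else DEFAULT_DISPLAY
    # str.format fills the named placeholders; an unknown placeholder
    # name in the template raises KeyError.
    return template.format(op_symbol=op_symbol, threshold_value=threshold_value)


print(render_threshold_display(">", 90.0))
# (threshold: > 90.0)
print(render_threshold_display(">", 90.0, display="limit {op_symbol}{threshold_value}%"))
# limit >90.0%
```
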
### Comparison Operators
@@ -740,3 +743,217 @@ Planned features:
- [Message Journal Documentation](MESSAGE_JOURNAL.md)
- Configuration examples: `hbd/config_thresholds_example.yaml`
- Test suite: `test_threshold.py`
## Multi-Threshold Configuration
**New in version 2.0**: Support for multiple named threshold configurations with per-host mapping.
### Overview
The multi-threshold feature allows you to:
- Define multiple sets of threshold configurations
- Map different hosts to different threshold sets
- Use different sensitivity levels for different environments
- Maintain a default configuration for unmapped hosts
### Configuration Structure
```yaml
# Optional: Set the default configuration name (defaults to "default")
default_threshold_config: "default"

# Define multiple named threshold configurations
threshold_configs:
  # Configuration name 1
  default:
    thresholds:
      # Standard threshold definitions
      cpu_monitor:
        cpu_percent:
          warning: 80.0
          critical: 90.0

  # Configuration name 2
  high_sensitivity:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 60.0
          critical: 75.0

  # Configuration name 3
  low_sensitivity:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 90.0
          critical: 95.0

# Map specific hosts to specific configurations
host_threshold_mapping:
  prod-web-01: high_sensitivity
  prod-web-02: high_sensitivity
  dev-server-01: low_sensitivity
  # Unmapped hosts use default_threshold_config
```
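
Because `host_threshold_mapping` refers to configurations by name, a typo in a name is easy to miss. The following sketch, which operates on the configuration after it has been parsed into a plain dictionary, lists mappings that point at a name not defined under `threshold_configs`. The function `find_unknown_mappings` is a hypothetical helper; this document does not say how the daemon itself treats such a mapping.

```python
def find_unknown_mappings(config):
    """Return {host: config_name} for mappings that name an undefined config.

    Hypothetical helper for illustration; not part of the project's API.
    """
    defined = set(config.get("threshold_configs", {}))
    mapping = config.get("host_threshold_mapping", {})
    return {host: name for host, name in mapping.items() if name not in defined}


config = {
    "default_threshold_config": "default",
    "threshold_configs": {
        "default": {"thresholds": {}},
        "high_sensitivity": {"thresholds": {}},
    },
    "host_threshold_mapping": {
        "prod-web-01": "high_sensitivity",
        "dev-server-01": "low_sensitivity",  # not defined above
    },
}

print(find_unknown_mappings(config))
# {'dev-server-01': 'low_sensitivity'}
```
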
### Use Cases
#### 1. Environment-Based Thresholds
Different thresholds for production vs. development:
```yaml
threshold_configs:
  production:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 70.0  # Alert earlier in production
          critical: 85.0

  development:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 90.0  # More relaxed for dev
          critical: 98.0

host_threshold_mapping:
  prod-web-01: production
  prod-web-02: production
  dev-web-01: development
  dev-web-02: development
```
#### 2. Server Role-Based Thresholds
Different thresholds based on server function:
```yaml
threshold_configs:
  webserver:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 80.0
          critical: 90.0

  database:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 70.0
          critical: 85.0
      memory_monitor:
        percent:
          warning: 90.0  # Databases can use high memory
          critical: 97.0
      disk_monitor:
        partitions:
          /var/lib/mysql:
            percent:
              warning: 75.0
              critical: 85.0

  cache:
    thresholds:
      memory_monitor:
        percent:
          warning: 95.0  # Redis/Memcached can use very high memory
          critical: 99.0

host_threshold_mapping:
  web-01: webserver
  web-02: webserver
  db-01: database
  db-02: database
  redis-01: cache
  memcached-01: cache
```
#### 3. Sensitivity Levels
Different sensitivity for critical vs. non-critical systems:
```yaml
threshold_configs:
  critical:
    thresholds:
      disk_monitor:
        partitions:
          /:
            percent:
              warning: 70.0  # Very sensitive
              critical: 80.0
              hysteresis: 0.15

  standard:
    thresholds:
      disk_monitor:
        partitions:
          /:
            percent:
              warning: 85.0
              critical: 95.0
              hysteresis: 0.1

  relaxed:
    thresholds:
      disk_monitor:
        partitions:
          /:
            percent:
              warning: 90.0
              critical: 98.0
              hysteresis: 0.05

host_threshold_mapping:
  payment-gateway: critical
  auth-server: critical
  web-01: standard
  web-02: standard
  test-server: relaxed
```
### Backward Compatibility
The legacy single threshold configuration is fully supported:
```yaml
# Old format - still works
thresholds:
  cpu_monitor:
    cpu_percent:
      warning: 80.0
      critical: 90.0
```
This is equivalent to:
```yaml
# New format
threshold_configs:
  default:
    thresholds:
      cpu_monitor:
        cpu_percent:
          warning: 80.0
          critical: 90.0
```
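
The equivalence between the two formats can be expressed as a small normalization step. The sketch below wraps a legacy top-level `thresholds` key into a `default` entry under `threshold_configs`, working on an already-parsed dictionary. The function name `normalize_threshold_config` is hypothetical, and the project may handle the legacy format in a different way internally.

```python
def normalize_threshold_config(config):
    """Return a copy of config with legacy `thresholds` wrapped as the `default` config.

    Hypothetical helper for illustration; not part of the project's API.
    """
    if "threshold_configs" in config:
        # Already in the new format: return a shallow copy unchanged.
        return dict(config)
    normalized = {key: value for key, value in config.items() if key != "thresholds"}
    normalized["threshold_configs"] = {
        "default": {"thresholds": config.get("thresholds", {})}
    }
    return normalized


legacy = {
    "thresholds": {
        "cpu_monitor": {"cpu_percent": {"warning": 80.0, "critical": 90.0}}
    }
}

print(normalize_threshold_config(legacy))
# {'threshold_configs': {'default': {'thresholds': {'cpu_monitor': {'cpu_percent': {'warning': 80.0, 'critical': 90.0}}}}}}
```
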
### Configuration Priority
1. **Host-specific mapping**: If the host is listed in `host_threshold_mapping`, use the config it maps to
2. **Default config**: Otherwise, use the config named by `default_threshold_config`
3. **First alphabetically**: If that default config is not defined, use the first config in alphabetical order
4. **Legacy fallback**: If `threshold_configs` is not present at all, use the top-level `thresholds`
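
The priority rules above can be sketched as a single lookup function over the parsed configuration. This is a minimal sketch, not the project's implementation: the name `resolve_thresholds` is hypothetical, and two behaviors are assumptions not stated in this document, namely that a host mapped to an undefined config falls through to the default, and that an empty `threshold_configs` is treated like a missing one.

```python
def resolve_thresholds(config, hostname):
    """Pick the thresholds dict for a host (hypothetical helper)."""
    configs = config.get("threshold_configs")
    if not configs:
        # Rule 4: no named configs, fall back to the legacy top-level key.
        return config.get("thresholds", {})

    # Rule 1: host-specific mapping.
    name = config.get("host_threshold_mapping", {}).get(hostname)
    if name not in configs:
        # Rule 2: default config (assumption: also used for bad mappings).
        name = config.get("default_threshold_config", "default")
    if name not in configs:
        # Rule 3: first config name in alphabetical order.
        name = sorted(configs)[0]
    return configs[name].get("thresholds", {})


config = {
    "threshold_configs": {
        "low_sensitivity": {"thresholds": {"cpu_percent": {"warning": 90.0}}},
        "high_sensitivity": {"thresholds": {"cpu_percent": {"warning": 60.0}}},
    },
    "host_threshold_mapping": {"dev-server-01": "low_sensitivity"},
}

print(resolve_thresholds(config, "dev-server-01"))
# {'cpu_percent': {'warning': 90.0}}
print(resolve_thresholds(config, "unmapped-host"))
# {'cpu_percent': {'warning': 60.0}}
```

In the second call there is no config named `default`, so the lookup falls back to `high_sensitivity`, which sorts before `low_sensitivity`.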
### Example: Complete Multi-Threshold Setup
See `hbd/config_multi_threshold_example.yaml` for a complete example with:
- 4 named configurations (default, high_sensitivity, low_sensitivity, database)
- Host-to-config mappings for production, development, and test systems
- Specialized database server thresholds
- Custom display messages with plugin data