# Notification System

## Overview

The Heartbeat Monitoring System includes a flexible notification system that can send alerts through multiple channels, including Email, Pushover, Signal, and Mattermost. The system supports centralized channel definitions with per-host routing, allowing fine-grained control over notification delivery.

## Architecture

### Components

1. **Notification Channels** (`notification_channels` in config)
   - Centralized definitions of notification providers
   - Each channel has a type and type-specific credentials
   - Reusable across multiple hosts

2. **Channel Dispatcher** (`hbd/server/notify.py`)
   - `pushmsg_for_host(hostname, message)`: Main entry point for host-specific notifications
   - `_dispatch_to_channel(channel_name, channel_config, message)`: Routes to specific provider
   - Provider functions: `pushover()`, `pushsignal()`, `pushmattermost()`, `send_email()`

3. **Configuration Utilities** (`hbd/server/config.py`)
   - `get_notification_channels_for_host(config, hostname)`: Retrieves channel names for a host
   - `get_notification_channels_config(config, hostname)`: Retrieves full channel configurations
   - `get_channel_config(config, channel_name)`: Gets configuration for a specific channel

4. **Integration Points**
   - **Threshold alerts**: `threshold.py` calls `notify_mod.pushmsg_for_host()`
   - **Heartbeat events**: `udp.py` calls `notify_mod.pushmsg_for_host()` for boot/shutdown/overdue
   - **Custom alerts**: Any code can call `notify_mod.pushmsg_for_host(hostname, message)`

## Configuration

### Centralized Channel Definitions

Define notification channels once in your configuration file:

```yaml
notification_channels:
  # Signal notifications
  signal_ops:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+1234567890"        # Your Signal number
    recipient: "+1234567890"   # Recipient number

  signal_oncall:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+1234567890"
    recipient: "+0987654321"   # Different recipient

  # Email notifications
  email_ops:
    type: email
    recipients:
      - ops@example.com
      - alerts@example.com
    sender: heartbeat@example.com
    smtp_server: smtp.example.com
    smtp_port: 587
    smtp_user: heartbeat@example.com
    smtp_password: your-smtp-password

  email_devteam:
    type: email
    recipients: [dev-alerts@example.com]
    sender: heartbeat-dev@example.com
    smtp_server: smtp.example.com
    smtp_port: 587
    smtp_user: heartbeat-dev@example.com
    smtp_password: your-smtp-password

  # Pushover notifications
  pushover_urgent:
    type: pushover
    token: your-pushover-app-token
    user: your-pushover-user-key

  pushover_normal:
    type: pushover
    token: your-pushover-app-token
    user: another-user-key

  # Mattermost notifications
  mattermost_devops:
    type: mattermost
    host: mattermost.example.com
    token: your-webhook-token
    channel: devops-alerts
    username: heartbeat-bot
    icon: https://example.com/heartbeat-icon.png
```

### Default Notification Channels

Specify default channels for hosts that don't have specific channel assignments:

```yaml
default_notification_channels:
  - email_ops
  - mattermost_devops
```

Hosts without `notification_channels` defined will use these defaults.

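
The fallback rule can be pictured with a short sketch. This is an illustration only, not the project's `get_notification_channels_for_host()`: `resolve_channels` is a hypothetical helper, the config is assumed to be a plain dict, and the handling of unknown hosts and empty channel lists (both fall back to the defaults here) is an assumption.

```python
from typing import List


def resolve_channels(config: dict, hostname: str) -> List[str]:
    """Return channel names for a host, falling back to the defaults.

    Hypothetical helper: unknown hosts and hosts with an empty or missing
    `notification_channels` list are assumed to use the defaults.
    """
    host = (config.get("hosts") or {}).get(hostname) or {}
    channels = host.get("notification_channels")
    if channels:
        return list(channels)
    return list(config.get("default_notification_channels") or [])


config = {
    "default_notification_channels": ["email_ops", "mattermost_devops"],
    "hosts": {
        "prod-db-01": {"notification_channels": ["signal_ops", "email_ops"]},
        "test-server-01": {"watch": False},  # no channels assigned
    },
}

print(resolve_channels(config, "prod-db-01"))
print(resolve_channels(config, "test-server-01"))
```

The first call returns the host's own list; the second returns the two default channels because `test-server-01` has no assignment.
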
### Per-Host Channel Assignment

Assign specific channels to each host in the `hosts` section:

```yaml
hosts:
  # Critical production web server - multiple channels for redundancy
  prod-web-01:
    threshold_config: high_sensitivity
    watch: true
    notification_channels:
      - signal_oncall      # Immediate mobile notification
      - pushover_urgent    # Secondary mobile notification
      - email_ops          # Email for record keeping
    dyndns: false

  # Database server - ops team notifications only
  prod-db-01:
    threshold_config: database
    watch: true
    notification_channels:
      - signal_ops
      - email_ops
    dyndns: false

  # Development server - email only, no urgent notifications
  dev-server-01:
    threshold_config: low_sensitivity
    watch: false
    notification_channels:
      - email_devteam
    dyndns: false

  # Test server - uses default_notification_channels
  test-server-01:
    threshold_config: default
    watch: false
    dyndns: false
    # No notification_channels specified = uses default_notification_channels
```

## Channel Types

### Email

Sends notifications via SMTP.

**Configuration fields:**

```yaml
type: email
recipients: [email1@example.com, email2@example.com]  # Required: List of recipients
sender: heartbeat@example.com                         # Required: From address
smtp_server: smtp.example.com                         # Required: SMTP server hostname
smtp_port: 587                                        # Optional: Default 587
smtp_user: heartbeat@example.com                      # Optional: For authenticated SMTP
smtp_password: your-password                          # Optional: For authenticated SMTP
```

**Features:**

- Supports multiple recipients
- TLS/STARTTLS support on port 587
- Authenticated and unauthenticated SMTP

**Example:**

```yaml
notification_channels:
  email_critical:
    type: email
    recipients: [admin@example.com, oncall@example.com]
    sender: alerts@example.com
    smtp_server: smtp.fastmail.com
    smtp_port: 587
    smtp_user: alerts@example.com
    smtp_password: app-specific-password
```

### Pushover

Sends push notifications to mobile devices via Pushover API.

**Configuration fields:**

```yaml
type: pushover
token: your-application-token  # Required: Your Pushover app token
user: your-user-key            # Required: Recipient's user key
```

**Features:**

- Instant mobile push notifications
- Works on iOS and Android
- Supports delivery confirmations

**Setup:**

1. Create a Pushover account at https://pushover.net
2. Create an application to get your app token
3. Note your user key from your account dashboard

**Example:**

```yaml
notification_channels:
  pushover_admin:
    type: pushover
    token: azGDORePK8gMaC0QOYAMyEEuzJnyUi
    user: uQiRzpo4DXghDmr9QzzfQu27cmVRsG
```

### Signal

Sends notifications via Signal messenger using signal-cli.

**Configuration fields:**

```yaml
type: signal
cli_path: /usr/local/bin/signal-cli  # Optional: Path to signal-cli binary
user: "+1234567890"                  # Required: Your Signal phone number
recipient: "+0987654321"             # Required: Recipient phone number
```

**Prerequisites:**

1. Install signal-cli: https://github.com/AsamK/signal-cli
2. Register signal-cli with your phone number:
   ```bash
   signal-cli -u +1234567890 register
   signal-cli -u +1234567890 verify CODE
   ```
3. Ensure signal-cli is in PATH or specify the full path in the config

**Features:**

- End-to-end encrypted messaging
- Works without the phone being online
- No API fees or rate limits

**Example:**

```yaml
notification_channels:
  signal_admin:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+12025551234"
    recipient: "+12025559999"
```

### Mattermost

Sends notifications to Mattermost team chat via incoming webhooks.

**Configuration fields:**

```yaml
type: mattermost
host: mattermost.example.com        # Required: Mattermost server hostname
token: your-webhook-token           # Required: Incoming webhook token
channel: channel-name               # Required: Target channel name
username: heartbeat-bot             # Optional: Bot display name
icon: https://example.com/icon.png  # Optional: Bot icon URL
```

**Prerequisites:**

1. Enable incoming webhooks in Mattermost
2. Create an incoming webhook for your team
3. Note the webhook token from the webhook URL

**Features:**

- Team-wide visibility
- Rich formatting support
- Message threading

**Example:**

```yaml
notification_channels:
  mattermost_ops:
    type: mattermost
    host: chat.example.com
    token: abc123def456ghi789
    channel: infrastructure-alerts
    username: heartbeat-monitor
    icon: https://example.com/heartbeat-icon.png
```

## Notification Events

The system sends notifications for the following events:

### Threshold Alerts

Sent when monitored metrics exceed configured thresholds:

- **State changes**: OK → WARNING, WARNING → CRITICAL, CRITICAL → OK
- **Format**: `{LEVEL}: {hostname} - {metric_path} = {value} {threshold_info}`
- **Example**: `CRITICAL: prod-web-01 - cpu_monitor.cpu_percent = 95.2 (threshold: > 90.0)`
- **Re-notifications**: Periodic reminders for ongoing alerts (default: hourly)

### Heartbeat Events

Host lifecycle events:

- **Host boot**: `{hostname} booted`
- **Host shutdown**: `{hostname} {connection_type} shutdown`
- **Host recovery**: `{hostname} {connection_type} is back`
- **Connection issues**: `{hostname} {message}`
- **Host overdue**: `{hostname} {connection_type} overdue`

Only hosts with `watch: true` send heartbeat event notifications.

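
The message formats above can be illustrated with a minimal sketch. The helper names `format_threshold_alert` and `heartbeat_message`, the `udp` connection type, and the choice to treat a missing `watch` key as `false` are assumptions made for illustration; the actual logic in `threshold.py` and `udp.py` may differ.

```python
from typing import Optional


def format_threshold_alert(level: str, hostname: str, metric_path: str,
                           value: float, operator: str, threshold: float) -> str:
    """Build a threshold alert following the documented format string."""
    threshold_info = f"(threshold: {operator} {threshold})"
    return f"{level}: {hostname} - {metric_path} = {value} {threshold_info}"


def heartbeat_message(host_cfg: dict, hostname: str,
                      connection_type: str, event: str) -> Optional[str]:
    """Return the heartbeat event text, or None for unwatched hosts.

    Assumption: a host without a `watch` key is treated as not watched.
    """
    if not host_cfg.get("watch", False):
        return None
    if event == "booted":
        # The boot message carries no connection type.
        return f"{hostname} booted"
    # Covers "shutdown", "is back", and "overdue".
    return f"{hostname} {connection_type} {event}"


print(format_threshold_alert("CRITICAL", "prod-web-01",
                             "cpu_monitor.cpu_percent", 95.2, ">", 90.0))
print(heartbeat_message({"watch": True}, "prod-web-01", "udp", "overdue"))
print(heartbeat_message({"watch": False}, "dev-server-01", "udp", "overdue"))
```

The first call reproduces the example alert shown under Threshold Alerts; the last call returns `None` because the host is not watched.
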
### Custom Alerts

Application code can send custom notifications:

```python
from hbd.server import notify as notify_mod

# Send to host-specific channels
notify_mod.pushmsg_for_host("prod-web-01", "Custom alert message")

# Send using global config
notify_mod.pushmsg_from_config("Global notification")

# Send to specific config
notify_mod.pushmsg(custom_config_dict, "Targeted notification")
```

## Design Principles

The notification system follows these core principles:

- **Centralization**: Define notification providers once, reference them by name
- **Flexibility**: Each host can use different channels for different notification needs
- **Redundancy**: Critical hosts can specify multiple channels for failover
- **Clarity**: Clean separation between channel definition and channel assignment
- **Type Safety**: Provider-specific validation at configuration time

## Best Practices

### Channel Organization

- **Create purpose-specific channels**: `email_ops`, `signal_oncall`, `pushover_urgent`
- **Separate by team/role**: `email_devteam`, `signal_dbateam`, `mattermost_security`
- **Use descriptive names**: Channel names appear in logs and debugging

### Redundancy

For critical hosts, use multiple notification channels:

```yaml
hosts:
  critical-db:
    notification_channels:
      - signal_oncall      # Primary: Mobile alert
      - pushover_urgent    # Backup: Different mobile platform
      - email_ops          # Tertiary: Email for record-keeping
```

### Notification Fatigue Prevention

- **Use `watch: false`** for non-critical hosts
- **Configure appropriate thresholds** to avoid false positives
- **Set different channels for different severities**
- **Use `default_notification_channels`** for baseline, add more for critical systems

### Security

- **Protect credentials**: Use file permissions to protect config files with passwords/tokens
- **Rotate tokens**: Periodically rotate API tokens and passwords
- **Use app-specific passwords**: For email, use app-specific passwords instead of the main account password
- **Separate accounts**: Consider separate notification accounts for different environments (prod vs dev)

### Testing

Test notification channels before relying on them:

```bash
# Test signal-cli directly
signal-cli -u +1234567890 send -m "Test message" +0987654321

# Test SMTP
echo "Test" | mail -s "Test Subject" admin@example.com

# Test through the heartbeat system (Python, fed to the interpreter via a heredoc)
python3 - <<'EOF'
from hbd.server import notify as notify_mod, config as config_mod

cfg = config_mod.load_config(".hb.yaml")
notify_mod.setup(cfg)
notify_mod.pushmsg_for_host("test-host", "Test notification")
EOF
```

## Troubleshooting

### Notifications Not Sending

1. **Check logs**: Look for "Failed to send notification" errors
2. **Verify host is watched**: Ensure `watch: true` in host definition
3. **Check channel configuration**: Verify credentials and settings
4. **Test channel directly**: Use command-line tools to test provider
5. **Check network**: Ensure server can reach notification endpoints

### Signal Issues

- **signal-cli not found**: Specify full path in `cli_path`
- **Not registered**: Run `signal-cli -u +NUMBER register` and verify
- **Trust issues**: Run `signal-cli -u +NUMBER receive` to sync trust store
- **Recipient not found**: Ensure recipient is in your Signal contacts

### Email Issues

- **Authentication failed**: Check SMTP username/password
- **TLS errors**: Verify SMTP port (587 for STARTTLS, 465 for SSL)
- **Relay denied**: Ensure SMTP server allows relay from your IP
- **Timeout**: Check firewall rules for SMTP ports

### Pushover Issues

- **Invalid token/user**: Verify token and user key from Pushover dashboard
- **API rate limits**: Pushover has monthly message limits on free tier
- **HTTP errors**: Check Pushover API status page

### Mattermost Issues

- **Webhook not found**: Verify webhook token and ensure webhook is enabled
- **Channel not found**: Check channel name spelling and permissions
- **Driver import error**: Install mattermostdriver: `pip install mattermostdriver`

## API Reference

### Main Functions

#### `pushmsg_for_host(hostname: str, msg: str, debug: int = 0) -> dict`

Send notification to host-specific channels.

**Parameters:**

- `hostname`: Name of the host (used to look up notification channels)
- `msg`: Message to send
- `debug`: Debug level (0=no debug, 1+=debug output)

**Returns:** Dictionary of results per channel: `{"signal_ops": True, "email_ops": False}`

**Example:**

```python
from hbd.server import notify as notify_mod

notify_mod.pushmsg_for_host("prod-web-01", "Server CPU at 95%")
```

**Behavior:**

1. Looks up notification channels configured for the host
2. If the host has no host-specific channels, uses `default_notification_channels`
3. Dispatches to each channel in parallel
4. Returns dict of results keyed by channel name
5. Logs success/failure for each channel

## Examples

### Complete Configuration Example

```yaml
# Notification channel definitions
notification_channels:
  signal_oncall:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+12025551234"
    recipient: "+12025555678"

  email_ops:
    type: email
    recipients: [ops@example.com, alerts@example.com]
    sender: heartbeat@example.com
    smtp_server: smtp.fastmail.com
    smtp_port: 587
    smtp_user: heartbeat@example.com
    smtp_password: app-password-here

# Default channels
default_notification_channels: [email_ops]

# Host definitions with channel assignments
hosts:
  prod-web-01:
    threshold_config: high_sensitivity
    watch: true
    notification_channels: [signal_oncall, email_ops]
    dyndns: false

  dev-server-01:
    threshold_config: low_sensitivity
    watch: false
    notification_channels: [email_ops]
    dyndns: false
```

### Multiple Environments Example

```yaml
notification_channels:
  # Production channels
  signal_prod_oncall:
    type: signal
    user: "+12025551234"
    recipient: "+12025551111"   # On-call phone

  email_prod_ops:
    type: email
    recipients: [prod-ops@example.com]
    sender: prod-heartbeat@example.com
    smtp_server: smtp.example.com

  # Staging channels
  email_staging:
    type: email
    recipients: [staging-alerts@example.com]
    sender: staging-heartbeat@example.com
    smtp_server: smtp.example.com

  # Development channels
  mattermost_dev:
    type: mattermost
    host: chat.example.com
    token: dev-webhook-token
    channel: dev-alerts

hosts:
  prod-api-01:
    notification_channels: [signal_prod_oncall, email_prod_ops]

  staging-api-01:
    notification_channels: [email_staging]

  dev-api-01:
    notification_channels: [mattermost_dev]
```

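
Before deploying a configuration like the ones above, it can help to check that every referenced channel is defined and carries the fields marked Required in the Channel Types section. The following is a minimal sketch of such a check, not the project's own validation: `validate_notification_config` and `REQUIRED_FIELDS` are hypothetical names, and the config is assumed to be already loaded into a plain dict.

```python
from typing import List

# Required fields per channel type, taken from the Channel Types section.
REQUIRED_FIELDS = {
    "email": {"recipients", "sender", "smtp_server"},
    "pushover": {"token", "user"},
    "signal": {"user", "recipient"},
    "mattermost": {"host", "token", "channel"},
}


def validate_notification_config(config: dict) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    channels = config.get("notification_channels") or {}

    # Check each channel definition for a known type and its required fields.
    for name, channel in channels.items():
        ctype = channel.get("type")
        if ctype not in REQUIRED_FIELDS:
            problems.append(f"channel {name!r}: unknown type {ctype!r}")
            continue
        missing = sorted(REQUIRED_FIELDS[ctype] - set(channel))
        if missing:
            problems.append(f"channel {name!r}: missing {', '.join(missing)}")

    # Collect every place a channel name is referenced.
    referenced = [("default_notification_channels", n)
                  for n in config.get("default_notification_channels") or []]
    for hostname, host in (config.get("hosts") or {}).items():
        for n in (host or {}).get("notification_channels") or []:
            referenced.append((f"host {hostname!r}", n))

    # Report references to channels that were never defined.
    for where, n in referenced:
        if n not in channels:
            problems.append(f"{where}: undefined channel {n!r}")
    return problems


config = {
    "notification_channels": {
        "signal_oncall": {"type": "signal", "user": "+12025551234"},
        "email_ops": {"type": "email", "recipients": ["ops@example.com"],
                      "sender": "heartbeat@example.com",
                      "smtp_server": "smtp.example.com"},
    },
    "default_notification_channels": ["email_ops"],
    "hosts": {"prod-web-01": {"notification_channels": ["signal_oncall", "pager"]}},
}

for problem in validate_notification_config(config):
    print(problem)
```

With this sample input the check reports the missing `recipient` on `signal_oncall` and the undefined `pager` channel referenced by `prod-web-01`.
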