heartbeat/docs/NOTIFICATIONS.md (2026-04-01 19:41:53 -04:00)

Notification System

Overview

The Heartbeat Monitoring System sends alerts through several channel types: email, Pushover, Signal, and Mattermost. Channels are defined once in a central section of the configuration and assigned per host, so each host can deliver its notifications to a different set of recipients.

Architecture

Components

  1. Notification Channels (notification_channels in config)

    • Centralized definitions of notification providers
    • Each channel has a type and type-specific credentials
    • Reusable across multiple hosts
  2. Channel Dispatcher (hbd/server/notify.py)

    • pushmsg_for_host(hostname, msg): Main entry point for host-specific notifications
    • _dispatch_to_channel(channel_name, channel_config, message): Routes to specific provider
    • Provider functions: pushover(), pushsignal(), pushmattermost(), send_email()
  3. Configuration Utilities (hbd/server/config.py)

    • get_notification_channels_for_host(config, hostname): Retrieves channel names for a host
    • get_notification_channels_config(config, hostname): Retrieves full channel configurations
    • get_channel_config(config, channel_name): Gets configuration for a specific channel
  4. Integration Points

    • Threshold alerts: threshold.py calls notify_mod.pushmsg_for_host()
    • Heartbeat events: udp.py calls notify_mod.pushmsg_for_host() for boot/shutdown/overdue
    • Custom alerts: Any code can call notify_mod.pushmsg_for_host(hostname, message)
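
The routing step performed by the dispatcher can be pictured with a short sketch. This is not the code in hbd/server/notify.py; the PROVIDERS table, the stub provider functions, and the name dispatch_to_channel (without the leading underscore) are hypothetical and only illustrate dispatching on a channel's type field.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the real provider functions.
# Each one returns True when the message was "sent".
def _stub_email(channel_config: dict, message: str) -> bool:
    return bool(channel_config.get("recipients"))


def _stub_signal(channel_config: dict, message: str) -> bool:
    return bool(channel_config.get("recipient"))


# Assumed lookup table from channel type to provider function.
PROVIDERS: Dict[str, Callable[[dict, str], bool]] = {
    "email": _stub_email,
    "signal": _stub_signal,
}


def dispatch_to_channel(channel_name: str, channel_config: dict, message: str) -> bool:
    """Route one message to the provider named by the channel's type."""
    provider = PROVIDERS.get(channel_config.get("type", ""))
    if provider is None:
        # Unknown or missing type: report failure instead of raising.
        return False
    try:
        return bool(provider(channel_config, message))
    except Exception:
        # One failing provider should not break delivery to other channels.
        return False
```

Keeping the type-to-provider mapping in one table means a new channel type only needs a new entry, and callers never have to branch on the type themselves.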

Configuration

Centralized Channel Definitions

Define notification channels once in your configuration file:

notification_channels:
  # Signal notifications
  signal_ops:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+1234567890"      # Your Signal number (quoted so YAML keeps it a string)
    recipient: "+1234567890" # Recipient number
  
  signal_oncall:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+1234567890"
    recipient: "+0987654321" # Different recipient
  
  # Email notifications
  email_ops:
    type: email
    recipients:
      - ops@example.com
      - alerts@example.com
    sender: heartbeat@example.com
    smtp_server: smtp.example.com
    smtp_port: 587
    smtp_user: heartbeat@example.com
    smtp_password: your-smtp-password
  
  email_devteam:
    type: email
    recipients: [dev-alerts@example.com]
    sender: heartbeat-dev@example.com
    smtp_server: smtp.example.com
    smtp_port: 587
    smtp_user: heartbeat-dev@example.com
    smtp_password: your-smtp-password
  
  # Pushover notifications
  pushover_urgent:
    type: pushover
    token: your-pushover-app-token
    user: your-pushover-user-key
  
  pushover_normal:
    type: pushover
    token: your-pushover-app-token
    user: another-user-key
  
  # Mattermost notifications
  mattermost_devops:
    type: mattermost
    host: mattermost.example.com
    token: your-webhook-token
    channel: devops-alerts
    username: heartbeat-bot
    icon: https://example.com/heartbeat-icon.png

Default Notification Channels

Specify default channels for hosts that don't have specific channel assignments:

default_notification_channels:
  - email_ops
  - mattermost_devops

Hosts without notification_channels defined will use these defaults.
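
The fallback rule can be sketched in a few lines of Python. The function name channels_for_host is hypothetical (the real helper is get_notification_channels_for_host in hbd/server/config.py, whose implementation is not shown here), and treating an empty list the same as a missing key is an assumption.

```python
def channels_for_host(config: dict, hostname: str) -> list:
    """Return the channel names for a host, falling back to the defaults."""
    host = (config.get("hosts") or {}).get(hostname) or {}
    channels = host.get("notification_channels")
    if channels:
        return list(channels)
    # Assumption: a missing or empty list means "use the defaults".
    return list(config.get("default_notification_channels") or [])
```

With the defaults above, a host that lists no channels would resolve to email_ops and mattermost_devops.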

Per-Host Channel Assignment

Assign specific channels to each host in the hosts section:

hosts:
  # Critical production web server - multiple channels for redundancy
  prod-web-01:
    threshold_config: high_sensitivity
    watch: true
    notification_channels:
      - signal_oncall        # Immediate mobile notification
      - pushover_urgent      # Secondary mobile notification
      - email_ops            # Email for record keeping
    dyndns: false
  
  # Database server - ops team notifications only
  prod-db-01:
    threshold_config: database
    watch: true
    notification_channels:
      - signal_ops
      - email_ops
    dyndns: false
  
  # Development server - email only, no urgent notifications
  dev-server-01:
    threshold_config: low_sensitivity
    watch: false
    notification_channels:
      - email_devteam
    dyndns: false
  
  # Test server - uses default_notification_channels
  test-server-01:
    threshold_config: default
    watch: false
    dyndns: false
    # No notification_channels specified = uses default_notification_channels
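
Because hosts refer to channels by name, a typo in a host's list would point at a channel that does not exist. The check below is a minimal sketch of how such references could be validated after loading the configuration; find_undefined_channels is a hypothetical helper, not part of the documented API.

```python
def find_undefined_channels(config: dict) -> dict:
    """Map each host (or the defaults key) to channel names that are not defined."""
    defined = set(config.get("notification_channels") or {})
    problems = {}

    defaults = config.get("default_notification_channels") or []
    missing = [name for name in defaults if name not in defined]
    if missing:
        problems["default_notification_channels"] = missing

    for hostname, host in (config.get("hosts") or {}).items():
        names = (host or {}).get("notification_channels") or []
        missing = [name for name in names if name not in defined]
        if missing:
            problems[hostname] = missing
    return problems
```

An empty result means every referenced channel name has a matching definition.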

Channel Types

Email

Sends notifications via SMTP.

Configuration fields:

type: email
recipients: [email1@example.com, email2@example.com]  # Required: List of recipients
sender: heartbeat@example.com                         # Required: From address
smtp_server: smtp.example.com                         # Required: SMTP server hostname
smtp_port: 587                                        # Optional: Default 587
smtp_user: heartbeat@example.com                      # Optional: For authenticated SMTP
smtp_password: your-password                          # Optional: For authenticated SMTP

Features:

  • Supports multiple recipients
  • TLS/STARTTLS support on port 587
  • Authenticated and unauthenticated SMTP

Example:

notification_channels:
  email_critical:
    type: email
    recipients: [admin@example.com, oncall@example.com]
    sender: alerts@example.com
    smtp_server: smtp.fastmail.com
    smtp_port: 587
    smtp_user: alerts@example.com
    smtp_password: app-specific-password
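
To show how the email fields fit together, here is a minimal sketch using Python's standard smtplib and email.message modules. It is not the project's send_email() implementation; the function names build_email and send_via_smtp, the subject argument, and the 10-second timeout are assumptions.

```python
import smtplib
from email.message import EmailMessage


def build_email(channel: dict, subject: str, body: str) -> EmailMessage:
    """Build a message from an email channel definition."""
    msg = EmailMessage()
    msg["From"] = channel["sender"]
    msg["To"] = ", ".join(channel["recipients"])
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_smtp(channel: dict, msg: EmailMessage, timeout: float = 10.0) -> None:
    """Send over STARTTLS, logging in only when smtp_user is configured."""
    port = channel.get("smtp_port", 587)  # documented default
    with smtplib.SMTP(channel["smtp_server"], port, timeout=timeout) as smtp:
        smtp.starttls()
        if channel.get("smtp_user"):
            smtp.login(channel["smtp_user"], channel.get("smtp_password", ""))
        smtp.send_message(msg)
```

Separating message construction from delivery makes it possible to inspect the message without contacting an SMTP server.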

Pushover

Sends push notifications to mobile devices via the Pushover API.

Configuration fields:

type: pushover
token: your-application-token    # Required: Your Pushover app token
user: your-user-key              # Required: Recipient's user key

Features:

  • Instant mobile push notifications
  • Works on iOS and Android
  • Supports delivery confirmations

Setup:

  1. Create a Pushover account at https://pushover.net
  2. Create an application to get your app token
  3. Note your user key from your account dashboard

Example:

notification_channels:
  pushover_admin:
    type: pushover
    token: azGDORePK8gMaC0QOYAMyEEuzJnyUi
    user: uQiRzpo4DXghDmr9QzzfQu27cmVRsG
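
For reference, a Pushover message is an HTTP POST of the token, user key, and message to the Pushover messages endpoint. The sketch below uses only the standard library and is not the project's pushover() function; the helper names and the timeout are assumptions.

```python
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(channel: dict, message: str) -> urllib.request.Request:
    """Build the form-encoded POST for a pushover channel definition."""
    data = urllib.parse.urlencode(
        {
            "token": channel["token"],
            "user": channel["user"],
            "message": message,
        }
    ).encode("utf-8")
    return urllib.request.Request(PUSHOVER_URL, data=data, method="POST")


def send_pushover(channel: dict, message: str, timeout: float = 10.0) -> bool:
    """Send the request; any HTTP error raised by urlopen propagates to the caller."""
    request = build_pushover_request(channel, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```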

Signal

Sends notifications via Signal messenger using signal-cli.

Configuration fields:

type: signal
cli_path: /usr/local/bin/signal-cli    # Optional: Path to signal-cli binary
user: "+1234567890"                     # Required: Your Signal phone number
recipient: "+0987654321"                # Required: Recipient phone number

Prerequisites:

  1. Install signal-cli: https://github.com/AsamK/signal-cli
  2. Register signal-cli with your phone number:
    signal-cli -u +1234567890 register
    signal-cli -u +1234567890 verify CODE
    
  3. Ensure signal-cli is in PATH or specify full path in config

Features:

  • End-to-end encrypted messaging
  • Works without phone being online
  • No API fees (Signal's servers may still rate-limit heavy senders)

Example:

notification_channels:
  signal_admin:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+12025551234"
    recipient: "+12025559999"
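
A signal channel maps directly onto a signal-cli send command. The following sketch shows one way to build and run that command with subprocess; it is not the project's pushsignal() function, the helper names are hypothetical, and falling back to signal-cli on PATH when cli_path is omitted is an assumption. Phone numbers are expected as strings.

```python
import subprocess


def build_signal_command(channel: dict, message: str) -> list:
    """Build the argument list for `signal-cli -u USER send -m MESSAGE RECIPIENT`."""
    cli = channel.get("cli_path", "signal-cli")  # assumed fallback: look up on PATH
    return [cli, "-u", channel["user"], "send", "-m", message, channel["recipient"]]


def send_signal(channel: dict, message: str, timeout: float = 30.0) -> bool:
    """Run signal-cli and report success based on its exit status."""
    result = subprocess.run(
        build_signal_command(channel, message),
        capture_output=True,
        timeout=timeout,
    )
    return result.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems when the message contains spaces or quotes.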

Mattermost

Sends notifications to Mattermost team chat via incoming webhooks.

Configuration fields:

type: mattermost
host: mattermost.example.com           # Required: Mattermost server hostname
token: your-webhook-token               # Required: Incoming webhook token
channel: channel-name                   # Required: Target channel name
username: heartbeat-bot                 # Optional: Bot display name
icon: https://example.com/icon.png      # Optional: Bot icon URL

Prerequisites:

  1. Enable incoming webhooks in Mattermost
  2. Create an incoming webhook for your team
  3. Note the webhook token from the webhook URL

Features:

  • Team-wide visibility
  • Rich formatting support
  • Message threading

Example:

notification_channels:
  mattermost_ops:
    type: mattermost
    host: chat.example.com
    token: abc123def456ghi789
    channel: infrastructure-alerts
    username: heartbeat-monitor
    icon: https://example.com/heartbeat-icon.png
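
An incoming webhook accepts a JSON payload posted to the server's /hooks/ path followed by the token. The sketch below builds that request with the standard library; it is not the project's pushmattermost() function (which appears to rely on mattermostdriver), and the helper names and timeout are assumptions.

```python
import json
import urllib.request


def build_mattermost_request(channel: dict, message: str) -> urllib.request.Request:
    """Build the webhook POST for a mattermost channel definition."""
    url = f"https://{channel['host']}/hooks/{channel['token']}"
    payload = {"text": message, "channel": channel["channel"]}
    # Optional fields are only sent when configured.
    if channel.get("username"):
        payload["username"] = channel["username"]
    if channel.get("icon"):
        payload["icon_url"] = channel["icon"]
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_mattermost(channel: dict, message: str, timeout: float = 10.0) -> bool:
    """Post the payload; any HTTP error raised by urlopen propagates to the caller."""
    request = build_mattermost_request(channel, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```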

Notification Events

The system sends notifications for three kinds of events: threshold alerts, heartbeat events, and custom alerts.

Threshold Alerts

When monitored metrics exceed configured thresholds:

  • State changes: OK → WARNING, WARNING → CRITICAL, CRITICAL → OK
  • Format: {LEVEL}: {hostname} - {metric_path} = {value} {threshold_info}
  • Example: CRITICAL: prod-web-01 - cpu_monitor.cpu_percent = 95.2 (threshold: > 90.0)
  • Re-notifications: Periodic reminders for ongoing alerts (default: hourly)
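
The message format and the re-notification rule can be expressed as a small sketch. This is an illustration, not the logic in threshold.py: the function names, the use of timestamps in seconds, and the exact decision rule (notify on every level change, then remind at a fixed interval while the level is not OK) are assumptions based on the description above.

```python
from typing import Optional


def format_alert(level: str, hostname: str, metric_path: str, value, threshold_info: str) -> str:
    """Render an alert using the documented message format."""
    return f"{level}: {hostname} - {metric_path} = {value} {threshold_info}"


def should_notify(
    previous: str,
    current: str,
    last_sent: Optional[float],
    now: float,
    renotify_seconds: float = 3600.0,  # hourly reminders, per the documented default
) -> bool:
    """Notify on any level change, and periodically while an alert is ongoing."""
    if current != previous:
        return True
    if current == "OK":
        return False
    return last_sent is None or now - last_sent >= renotify_seconds
```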

Heartbeat Events

Host lifecycle events:

  • Host boot: {hostname} booted
  • Host shutdown: {hostname} {connection_type} shutdown
  • Host recovery: {hostname} {connection_type} is back
  • Connection issues: {hostname} {message}
  • Host overdue: {hostname} {connection_type} overdue

Only hosts with watch: true send heartbeat event notifications.
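
As a rough sketch of the watch gate and the message templates listed above, the helper below returns the message text for an event or None when the host is not watched. The function name, the event keys, the connection type value used in the usage note, and the choice to treat a missing watch key as false are all assumptions; the actual behavior lives in udp.py.

```python
from typing import Optional

# Templates copied from the event list above; the dictionary keys are hypothetical.
HEARTBEAT_TEMPLATES = {
    "boot": "{hostname} booted",
    "shutdown": "{hostname} {connection_type} shutdown",
    "recovery": "{hostname} {connection_type} is back",
    "overdue": "{hostname} {connection_type} overdue",
}


def heartbeat_message(
    host_config: dict, hostname: str, event: str, connection_type: str = ""
) -> Optional[str]:
    """Return the notification text, or None if the host is not watched."""
    if not host_config.get("watch", False):  # assumption: missing key means not watched
        return None
    template = HEARTBEAT_TEMPLATES[event]
    return template.format(hostname=hostname, connection_type=connection_type)
```

For example, a watched host with a connection type of "udp" (an illustrative value) would produce "prod-web-01 udp overdue" for the overdue event.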

Custom Alerts

Application code can send custom notifications:

from hbd.server import notify as notify_mod

# Send to host-specific channels
notify_mod.pushmsg_for_host("prod-web-01", "Custom alert message")

# Send using global config
notify_mod.pushmsg_from_config("Global notification")

# Send to specific config
notify_mod.pushmsg(custom_config_dict, "Targeted notification")

Design Principles

The notification system follows these core principles:

  • Centralization: Define notification providers once, reference them by name
  • Flexibility: Each host can use different channels for different notification needs
  • Redundancy: Critical hosts can specify multiple channels so an alert still arrives if one provider fails
  • Clarity: Clean separation between channel definition and channel assignment
  • Type Safety: Provider-specific validation at configuration time

Best Practices

Channel Organization

  • Create purpose-specific channels: email_ops, signal_oncall, pushover_urgent
  • Separate by team/role: email_devteam, signal_dbateam, mattermost_security
  • Use descriptive names: Channel names appear in logs and debugging

Redundancy

For critical hosts, use multiple notification channels:

hosts:
  critical-db:
    notification_channels:
      - signal_oncall      # Primary: Mobile alert
      - pushover_urgent    # Backup: Different mobile platform
      - email_ops          # Tertiary: Email for record-keeping

Notification Fatigue Prevention

  • Use watch: false for non-critical hosts
  • Configure appropriate thresholds to avoid false positives
  • Set different channels for different severities
  • Use default_notification_channels for baseline, add more for critical systems

Security

  • Protect credentials: Use file permissions to protect config files with passwords/tokens
  • Rotate tokens: Periodically rotate API tokens and passwords
  • Use app-specific passwords: For email, use app-specific passwords instead of main account password
  • Separate accounts: Consider separate notification accounts for different environments (prod vs dev)
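
The first point can be checked programmatically on POSIX systems. The sketch below reports whether a config file is readable, writable, or executable by anyone other than its owner; config_is_private is a hypothetical helper and is not part of the heartbeat code.

```python
import os
import stat


def config_is_private(path: str) -> bool:
    """Return True when group and other have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A file created with chmod 600 passes this check, while the common default of 644 does not.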

Testing

Test notification channels before relying on them:

# Test signal-cli directly
signal-cli -u +1234567890 send -m "Test message" +0987654321

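# Test mail delivery (the mail command uses the local MTA, not the channel's smtp_server)
echo "Test" | mail -s "Test Subject" admin@example.com

# Test through the heartbeat system (run the following lines in a Python REPL, not the shell)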
from hbd.server import notify as notify_mod, config as config_mod
cfg = config_mod.load_config(".hb.yaml")
notify_mod.setup(cfg)
notify_mod.pushmsg_for_host("test-host", "Test notification")

Troubleshooting

Notifications Not Sending

  1. Check logs: Look for "Failed to send notification" errors
  2. Verify host is watched: Ensure watch: true in host definition (heartbeat event notifications are only sent for watched hosts)
  3. Check channel configuration: Verify credentials and settings
  4. Test channel directly: Use command-line tools to test provider
  5. Check network: Ensure server can reach notification endpoints

Signal Issues

  • signal-cli not found: Specify full path in cli_path
  • Not registered: Run signal-cli -u +NUMBER register and verify
  • Trust issues: Run signal-cli -u +NUMBER receive to sync trust store
  • Recipient not found: Ensure recipient is in your Signal contacts

Email Issues

  • Authentication failed: Check SMTP username/password
  • TLS errors: Verify SMTP port (587 for STARTTLS, 465 for implicit TLS/SSL)
  • Relay denied: Ensure SMTP server allows relay from your IP
  • Timeout: Check firewall rules for SMTP ports

Pushover Issues

  • Invalid token/user: Verify token and user key from Pushover dashboard
  • API rate limits: Pushover has monthly message limits on free tier
  • HTTP errors: Check Pushover API status page

Mattermost Issues

  • Webhook not found: Verify webhook token and ensure webhook is enabled
  • Channel not found: Check channel name spelling and permissions
  • Driver import error: Install mattermostdriver: pip install mattermostdriver

API Reference

Main Functions

pushmsg_for_host(hostname: str, msg: str, debug: int = 0) -> dict

Send notification to host-specific channels.

Parameters:

  • hostname: Name of the host (used to look up notification channels)
  • msg: Message to send
  • debug: Debug level (0=no debug, 1+=debug output)

Returns: Dictionary of results per channel: {"signal_ops": True, "email_ops": False}

Example:

from hbd.server import notify as notify_mod

notify_mod.pushmsg_for_host("prod-web-01", "Server CPU at 95%")

Behavior:

  1. Looks up notification channels configured for the host
  2. If no host-specific channels, uses default_notification_channels
  3. Dispatches to each channel in parallel
  4. Returns dict of results keyed by channel name
  5. Logs success/failure for each channel
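
The behavior above can be approximated with a thread pool. This is a sketch under stated assumptions, not the actual implementation: dispatch_all and its send callback are hypothetical names, and the source does not say which concurrency mechanism the real function uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def dispatch_all(
    channels: Dict[str, dict],
    message: str,
    send: Callable[[str, dict, str], bool],
) -> Dict[str, bool]:
    """Send to every channel concurrently and return per-channel results."""
    if not channels:
        return {}

    def safe_send(name: str, config: dict) -> bool:
        # A provider that raises counts as a failed delivery for that channel only.
        try:
            return bool(send(name, config, message))
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=len(channels)) as pool:
        futures = {
            name: pool.submit(safe_send, name, config)
            for name, config in channels.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

Catching exceptions inside each task keeps one failing provider from hiding the results of the others, which matches the per-channel result dictionary described above.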

Examples

Complete Configuration Example

# Notification channel definitions
notification_channels:
  signal_oncall:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+12025551234"
    recipient: "+12025555678"
  
  email_ops:
    type: email
    recipients: [ops@example.com, alerts@example.com]
    sender: heartbeat@example.com
    smtp_server: smtp.fastmail.com
    smtp_port: 587
    smtp_user: heartbeat@example.com
    smtp_password: app-password-here

# Default channels
default_notification_channels: [email_ops]

# Host definitions with channel assignments
hosts:
  prod-web-01:
    threshold_config: high_sensitivity
    watch: true
    notification_channels: [signal_oncall, email_ops]
    dyndns: false
  
  dev-server-01:
    threshold_config: low_sensitivity
    watch: false
    notification_channels: [email_ops]
    dyndns: false

Multiple Environments Example

notification_channels:
  # Production channels
  signal_prod_oncall:
    type: signal
    user: "+12025551234"
    recipient: "+12025551111"  # On-call phone
  
  email_prod_ops:
    type: email
    recipients: [prod-ops@example.com]
    sender: prod-heartbeat@example.com
    smtp_server: smtp.example.com
  
  # Staging channels
  email_staging:
    type: email
    recipients: [staging-alerts@example.com]
    sender: staging-heartbeat@example.com
    smtp_server: smtp.example.com
  
  # Development channels
  mattermost_dev:
    type: mattermost
    host: chat.example.com
    token: dev-webhook-token
    channel: dev-alerts

hosts:
  prod-api-01:
    notification_channels: [signal_prod_oncall, email_prod_ops]
  
  staging-api-01:
    notification_channels: [email_staging]
  
  dev-api-01:
    notification_channels: [mattermost_dev]