heartbeat/docs/NOTIFICATIONS.md
2026-04-12 11:21:21 -04:00

Notification System

Overview

Notifications are dispatched to the owner and managers of a host, each via their own configured notification channels. Channel definitions are global; users reference them by name. If no users are configured, no notifications are sent.

Architecture

Alert event (udp.py / threshold.py)
  └─ notify.send_notification(host_name, Notification)
       ├─ look up host.owner + host.managers
       ├─ for each user → user.notification_channels
       └─ for each channel → _dispatch_to_channel (filtered by min_level)

Every notification carries:

  • title: [LEVEL] hostname (e.g. [CRITICAL] webserver01)
  • body: detail message (metric value, threshold, duration)
  • url: link to the plugin metrics page ({base_url}/plugins#{hostname})
  • level: RECOVER | WARNING | CRITICAL | INFO
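
As an illustration of how the four fields fit together, here is a minimal sketch. The dataclass is a stand-in, not the real Notification class from hbd.server.notify, and build_notification is a hypothetical helper introduced only for this example.

```python
from dataclasses import dataclass

LEVELS = ("RECOVER", "WARNING", "CRITICAL", "INFO")


@dataclass(frozen=True)
class Notification:
    """Illustrative stand-in; the real class may differ."""
    title: str
    body: str
    level: str
    url: str


def build_notification(base_url: str, hostname: str, level: str, body: str) -> Notification:
    """Hypothetical helper: assemble title and url from their parts."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    return Notification(
        title=f"[{level}] {hostname}",
        body=body,
        level=level,
        # Strip a trailing slash so the link never contains "//plugins".
        url=f"{base_url.rstrip('/')}/plugins#{hostname}",
    )
```

For example, build_notification("https://hbd.example.com", "webserver01", "CRITICAL", "cpu_percent = 95.2 (threshold: > 90.0)") yields the title "[CRITICAL] webserver01" and the URL https://hbd.example.com/plugins#webserver01.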

Configuration

Base URL

Set base_url so notification links point to your hbd instance:

base_url: https://hbd.example.com

Global channel definitions

Define channels once; reference them by name from user configs:

notification_channels:

  pushover_ops:
    type: pushover
    token: your-app-token
    user: your-user-key
    min_level: WARNING        # optional, default: WARNING

  email_ops:
    type: email
    recipients: [ops@example.com]
    sender: hbd@example.com
    smtp_server: smtp.example.com
    smtp_port: 587
    smtp_user: hbd@example.com
    smtp_password: secret
    min_level: WARNING

  matrix_oncall:
    type: matrix
    homeserver: https://matrix.example.org
    access_token: syt_xxx
    room_id: "!abc:matrix.example.org"
    min_level: CRITICAL       # only send critical alerts to this room

  sms_oncall:
    type: sms_voipms
    api_user: me@example.com
    api_password: secret
    did: "5551234567"         # your voip.ms DID number
    dst: "5559876543"         # destination number
    min_level: CRITICAL

  signal_ops:
    type: signal
    cli_path: /usr/local/bin/signal-cli
    user: "+12025551234"      # quoted so YAML keeps the leading + (unquoted, it parses as an integer)
    recipient: "+12025559999"

  mattermost_devops:
    type: mattermost
    host: mattermost.example.com
    token: webhook-token
    channel: devops-alerts
    username: heartbeat-bot

Users with notification channels

Each user lists which global channels they receive notifications on:

users:
  alice:
    full_name: Alice Smith
    password: pbkdf2:sha256:...
    admin: true
    notification_channels: [pushover_ops, email_ops]

  bob:
    full_name: Bob Jones
    password: pbkdf2:sha256:...
    notification_channels: [sms_oncall, matrix_oncall]

Host access — owner and managers

Notifications for a host go to its owner and all managers:

hosts:
  webserver01:
    owner: alice             # receives all notifications for this host
    managers: [bob]          # also receives notifications
    threshold_config: default
    watch: true              # bold in dashboard (cosmetic only)
    dyndns: false

  dbserver01:
    owner: alice
    managers: [bob]
    threshold_config: database
    dyndns: false

watch: true only affects display (bold name in the live dashboard). Notifications are now controlled entirely by owner/managers.
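
The lookup from host to channels can be sketched as follows. This is a simplified model over plain dicts shaped like the YAML above; channels_for_host is a hypothetical name, and the de-duplication of channels shared by several users is an assumption, not confirmed behaviour of hbd.

```python
def channels_for_host(host_name, hosts, users):
    """Return channel names for the host's owner and managers, in order, without duplicates."""
    host = hosts.get(host_name, {})
    recipients = []
    if host.get("owner"):
        recipients.append(host["owner"])
    recipients.extend(host.get("managers", []))

    channels = []
    for user_name in recipients:
        user = users.get(user_name, {})  # unknown users contribute nothing
        for channel in user.get("notification_channels", []):
            if channel not in channels:  # assumed: each channel fires once per event
                channels.append(channel)
    return channels


hosts = {"webserver01": {"owner": "alice", "managers": ["bob"]}}
users = {
    "alice": {"notification_channels": ["pushover_ops", "email_ops"]},
    "bob": {"notification_channels": ["sms_oncall", "matrix_oncall"]},
}
print(channels_for_host("webserver01", hosts, users))
# → ['pushover_ops', 'email_ops', 'sms_oncall', 'matrix_oncall']
```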

Channel Types

min_level filtering

Every channel accepts an optional min_level field:

| Value | Channel receives |
|---|---|
| WARNING (default) | WARNING, CRITICAL, RECOVER |
| CRITICAL | CRITICAL only (and RECOVER) |

RECOVER is always passed through — you don't want to miss a recovery.
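
The filtering rule can be expressed as a small predicate. This is a sketch, not the code behind _dispatch_to_channel; passes_min_level is a hypothetical name, and ranking INFO below WARNING is an assumption, since this document does not say how INFO is filtered.

```python
# Assumed ordering. Where INFO ranks is not stated in this document.
_RANK = {"INFO": 0, "WARNING": 1, "CRITICAL": 2}


def passes_min_level(level: str, min_level: str = "WARNING") -> bool:
    """Return True if a notification of `level` should reach the channel."""
    if level == "RECOVER":
        return True  # recoveries always pass through
    return _RANK[level] >= _RANK[min_level]
```

With min_level: CRITICAL, a WARNING is dropped while CRITICAL and RECOVER are delivered.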

pushover

Sends push notifications via Pushover. Includes title, body, and a clickable URL.

type: pushover
token: your-app-token     # Required: Pushover application token
user: your-user-key       # Required: Recipient's user key
min_level: WARNING
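
For orientation, a request to the Pushover message endpoint could look like the sketch below. It uses only the standard library and the documented token, user, title, message and url parameters; the function names are hypothetical and this is not necessarily how hbd sends the request.

```python
import urllib.error
import urllib.parse
import urllib.request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def pushover_payload(channel, title, body, url):
    """Map a channel definition and notification fields to Pushover form fields."""
    return {
        "token": channel["token"],  # application token
        "user": channel["user"],    # recipient's user key
        "title": title,
        "message": body,
        "url": url,
    }


def send_pushover(channel, title, body, url, timeout=10):
    """POST the form; return True on HTTP 200, False on any HTTP or network error."""
    data = urllib.parse.urlencode(pushover_payload(channel, title, body, url)).encode()
    request = urllib.request.Request(PUSHOVER_URL, data=data)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:  # also covers HTTPError (4xx/5xx)
        return False
```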

email

Sends via SMTP. The subject is the notification title; the body is the message followed by the URL on the final line.

type: email
recipients: [ops@example.com, oncall@example.com]
sender: hbd@example.com
smtp_server: smtp.example.com
smtp_port: 587             # 587 = STARTTLS (default), 465 = SSL
smtp_user: hbd@example.com
smtp_password: secret
min_level: WARNING
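
The message layout and the port handling described above can be sketched with the standard library. build_email and send_email are hypothetical names, and treating every port other than 465 as STARTTLS is an assumption made for this example.

```python
import smtplib
from email.message import EmailMessage


def build_email(channel, title, body, url):
    """Subject is the title; the body is the message with the URL on the final line."""
    msg = EmailMessage()
    msg["Subject"] = title
    msg["From"] = channel["sender"]
    msg["To"] = ", ".join(channel["recipients"])
    msg.set_content(f"{body}\n\n{url}\n")
    return msg


def send_email(channel, msg, timeout=10):
    """Send via implicit SSL on port 465, otherwise upgrade with STARTTLS."""
    port = channel.get("smtp_port", 587)
    if port == 465:
        smtp = smtplib.SMTP_SSL(channel["smtp_server"], port, timeout=timeout)
    else:
        smtp = smtplib.SMTP(channel["smtp_server"], port, timeout=timeout)
    with smtp:
        if port != 465:
            smtp.starttls()
        smtp.login(channel["smtp_user"], channel["smtp_password"])
        smtp.send_message(msg)
```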

matrix

Sends a formatted HTML message to a Matrix room via matrix-nio.

type: matrix
homeserver: https://matrix.example.org
access_token: syt_xxx      # Bot account access token
room_id: "!abc:matrix.example.org"
min_level: WARNING

Setup:

  1. Create a bot Matrix account
  2. Obtain its access token (Element → Settings → Help & About → Access Token)
  3. Invite the bot to the target room and note the room ID
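
Once the bot is in the room, a send with matrix-nio could look roughly like this. The HTML layout and both function names are assumptions for illustration; only AsyncClient, room_send and close are matrix-nio calls. The import sits inside send_matrix so that the content builder works without matrix-nio installed.

```python
import html


def matrix_content(title, body, url):
    """Build an m.room.message content dict with a plain and an HTML body."""
    formatted = (
        f"<strong>{html.escape(title)}</strong><br>"
        f"{html.escape(body)}<br>"
        f'<a href="{html.escape(url)}">{html.escape(url)}</a>'
    )
    return {
        "msgtype": "m.text",
        "body": f"{title}\n{body}\n{url}",  # fallback for clients without HTML
        "format": "org.matrix.custom.html",
        "formatted_body": formatted,
    }


async def send_matrix(channel, content):
    """Send to the configured room using the bot's access token."""
    from nio import AsyncClient  # requires: pip install matrix-nio

    client = AsyncClient(channel["homeserver"])
    client.access_token = channel["access_token"]
    try:
        return await client.room_send(
            room_id=channel["room_id"],
            message_type="m.room.message",
            content=content,
        )
    finally:
        await client.close()
```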

sms_voipms

Sends SMS via the voip.ms REST API. Message is truncated to 160 characters.

type: sms_voipms
api_user: me@example.com   # voip.ms account email
api_password: secret       # voip.ms API password
did: "5551234567"          # Your voip.ms DID (sending number)
dst: "5559876543"          # Destination number
min_level: CRITICAL
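
The 160-character truncation and the request shape can be sketched as below. The endpoint and the parameter names (api_username, api_password, method=sendSMS, did, dst, message) reflect my reading of the voip.ms API and should be checked against the voip.ms documentation; voipms_sms_url is a hypothetical helper.

```python
import urllib.parse

VOIPMS_API = "https://voip.ms/api/v1/rest.php"  # assumed endpoint, verify before use
SMS_LIMIT = 160


def voipms_sms_url(channel, text):
    """Build the request URL, cutting the message to the SMS length limit."""
    params = {
        "api_username": channel["api_user"],
        "api_password": channel["api_password"],
        "method": "sendSMS",
        "did": channel["did"],
        "dst": channel["dst"],
        "message": text[:SMS_LIMIT],
    }
    return f"{VOIPMS_API}?{urllib.parse.urlencode(params)}"
```

Because the credentials travel in the query string, such a URL should never be written to logs.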

signal

Sends via signal-cli.

type: signal
cli_path: /usr/local/bin/signal-cli
user: "+12025551234"       # Your registered Signal number (quoted so YAML keeps it a string)
recipient: "+12025559999"  # Recipient number
min_level: WARNING

Setup:

signal-cli -u +12025551234 register
signal-cli -u +12025551234 verify CODE
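
After registration, a send amounts to running signal-cli with the send subcommand. The sketch below wraps that call with subprocess; the function names and the 30-second timeout are assumptions, not values taken from hbd.

```python
import subprocess


def signal_argv(channel, text):
    """Build the command line: signal-cli -u USER send -m MESSAGE RECIPIENT."""
    return [
        channel.get("cli_path", "signal-cli"),
        "-u", channel["user"],
        "send",
        "-m", text,
        channel["recipient"],
    ]


def send_signal(channel, text, timeout=30):
    """Run signal-cli; return True only if it exits with status 0."""
    try:
        result = subprocess.run(signal_argv(channel, text), capture_output=True, timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False  # binary missing at cli_path, or the call hung
    return result.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems with message text.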

mattermost

Sends via Mattermost incoming webhook. Message is formatted as Markdown.

type: mattermost
host: mattermost.example.com
token: your-webhook-token
channel: devops-alerts
username: heartbeat-bot    # Optional: display name
icon: https://…/icon.png   # Optional: bot icon URL
min_level: WARNING
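
An incoming-webhook request built from these fields might look like the following sketch. The https scheme, the Markdown layout and the "Open metrics" link text are assumptions; text, channel, username and icon_url are standard Mattermost webhook payload fields, and mattermost_request is a hypothetical name.

```python
import json


def mattermost_request(channel, title, body, url):
    """Return (hook_url, json_bytes) for a Mattermost incoming webhook."""
    hook_url = f"https://{channel['host']}/hooks/{channel['token']}"  # scheme assumed
    payload = {
        "text": f"**{title}**\n{body}\n[Open metrics]({url})",
        "channel": channel["channel"],
    }
    # Optional overrides are only sent when configured.
    if channel.get("username"):
        payload["username"] = channel["username"]
    if channel.get("icon"):
        payload["icon_url"] = channel["icon"]
    return hook_url, json.dumps(payload).encode("utf-8")
```

The bytes would be POSTed with a Content-Type of application/json.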

Notification events

| Source | Level | Title example | Body example |
|---|---|---|---|
| Host overdue | CRITICAL | [CRITICAL] webserver01 | IPv4 overdue |
| Host recover | RECOVER | [RECOVER] webserver01 | IPv4 back after being overdue for 5:23 |
| Host boot | INFO | [INFO] webserver01 | webserver01 booted |
| Host shutdown | INFO | [INFO] webserver01 | IPv4 shutdown |
| Threshold breach | WARNING/CRITICAL | [CRITICAL] webserver01 | cpu_percent = 95.2 (threshold: > 90.0) |
| Threshold reminder | CRITICAL | [REMINDER/CRITICAL] webserver01 | REMINDER (CRITICAL): … ongoing for 3600s |
| Connection issue | WARNING | [WARNING] webserver01 | new address detected … |

Reminder notifications (re-notify) are sent only for CRITICAL level alerts.

API reference

send_notification(host_name, notif) -> dict

Main entry point. Dispatches to owner + managers.

from hbd.server.notify import send_notification, Notification

send_notification(
    "webserver01",
    Notification(
        title="[CRITICAL] webserver01",
        body="cpu_percent = 95.2 (threshold: > 90.0)",
        level="CRITICAL",
        url="https://hbd.example.com/plugins#webserver01",
    ),
)

Returns {channel_name: bool} for each channel dispatched.
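
A caller can use the returned mapping to spot channels that failed. The helper below is a hypothetical convenience, not part of hbd.server.notify.

```python
def failed_channels(results):
    """Return the names of channels whose dispatch returned False, sorted."""
    return sorted(name for name, ok in results.items() if not ok)


# Example with a hand-written result dict of the documented shape.
print(failed_channels({"pushover_ops": True, "email_ops": False, "sms_oncall": False}))
# → ['email_ops', 'sms_oncall']
```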

setup(cfg, loop=None)

Called once at startup from main.py. Pass the running asyncio event loop so Matrix sends work correctly.
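
One plausible reason the loop matters: a coroutine-based send has to be scheduled on a running event loop when it is triggered from synchronous code in another thread. The sketch below shows that general pattern with asyncio.run_coroutine_threadsafe. It is an assumption about why setup takes a loop, not a description of hbd's internals, and _fake_matrix_send is a placeholder for a real async send.

```python
import asyncio
import threading


async def _fake_matrix_send(text):
    """Placeholder coroutine standing in for an async Matrix send."""
    await asyncio.sleep(0)
    return True


def send_from_thread(loop, text, timeout=10.0):
    """Schedule the coroutine on `loop` from another thread and wait for its result."""
    future = asyncio.run_coroutine_threadsafe(_fake_matrix_send(text), loop)
    return future.result(timeout=timeout)


def demo():
    """Run a loop in a background thread and submit one send from the calling thread."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        return send_from_thread(loop, "[INFO] webserver01")
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join(timeout=5)
        loop.close()
```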

Troubleshooting

No notifications sent:

  • Check that users are configured (users: section in yaml)
  • Check that the host has an owner or managers set
  • Check that users have notification_channels listed
  • Check that the channel names in user config match keys under notification_channels:
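
The four checks above can be automated against the loaded configuration. This is a sketch over a plain dict shaped like the YAML in this document; check_notification_config is a hypothetical helper, not an hbd command.

```python
def check_notification_config(cfg):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    channels = cfg.get("notification_channels") or {}
    users = cfg.get("users") or {}
    hosts = cfg.get("hosts") or {}

    if not users:
        problems.append("no users configured")
    for name, user in users.items():
        listed = user.get("notification_channels") or []
        if not listed:
            problems.append(f"user {name}: no notification_channels listed")
        for channel in listed:
            if channel not in channels:
                problems.append(f"user {name}: unknown channel {channel}")

    for name, host in hosts.items():
        recipients = [host.get("owner")] + list(host.get("managers") or [])
        recipients = [r for r in recipients if r]
        if not recipients:
            problems.append(f"host {name}: no owner or managers")
        for recipient in recipients:
            if recipient not in users:
                problems.append(f"host {name}: unknown user {recipient}")
    return problems
```

Running it on a config where alice lists pushover_opz instead of pushover_ops would report "user alice: unknown channel pushover_opz".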

min_level filtering too aggressive:

  • The default is WARNING, which delivers WARNING, CRITICAL, and RECOVER
  • If expected warnings are missing, check whether the channel sets min_level: CRITICAL and change it to WARNING

Matrix sends time out:

  • Verify the access token is valid and the bot is in the room
  • matrix-nio must be installed: pip install matrix-nio

voip.ms SMS fails:

  • Enable the API in your voip.ms account (Account → API)
  • Verify the DID is SMS-capable in your voip.ms account

Signal not found:

  • Specify full cli_path
  • Run signal-cli -u +NUMBER receive to sync trust store

Email authentication failed:

  • Use app-specific passwords for Gmail/Fastmail
  • Verify port: 587 for STARTTLS, 465 for SSL

Pushover 400 errors:

  • Double-check token (app) and user (user key) — they are different values