Andreas Wrede 12e8812070 docs: update notification channel and API docs for form-based management
- NOTIFICATIONS.md: document owner/private fields, channel visibility
  rules, and user-created channels; add troubleshooting note for
  private channel visibility
- HTTP_API.md: add notification channel API endpoints table and full
  endpoint reference (GET types, GET/POST/PUT/DELETE channels)
- USERS.md: add missing PUT /api/0/users/me endpoint documentation
  with all three update modes (identity, channels, password)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 07:45:30 -04:00


HTTP API and Web UI Documentation

Overview

The Heartbeat Daemon provides a comprehensive HTTP API and web-based UI for monitoring plugin data and alert states. The API follows RESTful conventions and returns JSON responses.

Base URL

All API endpoints are relative to the server base URL:

http://your-server:50004

The default port is 50004 (configurable via hbd_port in the configuration file).


Authentication

When user accounts are configured, every request must be authenticated.

  • Browser requests to HTML pages are redirected to /login automatically. JavaScript fetch() calls on the dashboards send the session cookie automatically — no JS changes are needed.
  • API / programmatic requests must include the token in an Authorization: Bearer <token> header or an X-Auth-Token header.

Unauthenticated API requests receive 401 Unauthorized. When no users are configured, the server runs in unauthenticated mode and all endpoints are open.

Login

TOKEN=$(curl -s -X POST http://localhost:50004/api/0/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"alice","password":"secret"}' | jq -r .token)

curl -H "Authorization: Bearer $TOKEN" http://localhost:50004/api/0/hosts

See User Management for full authentication documentation.
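The two header forms described in this section can be sketched as a small helper. The function name `auth_headers` and its `use_x_auth_token` parameter are illustrative assumptions, not part of the HBD API; only the header names come from the text above.

```python
def auth_headers(token, use_x_auth_token=False):
    """Build request headers for an authenticated API call.

    Returns an empty dict when no token is available, which matches
    a server running in unauthenticated mode.
    """
    if not token:
        return {}
    if use_x_auth_token:
        # Alternative header form accepted for programmatic requests
        return {"X-Auth-Token": token}
    return {"Authorization": f"Bearer {token}"}


print(auth_headers("abc123"))
print(auth_headers("abc123", use_x_auth_token=True))
```

The returned dict can be passed as the `headers` argument of whichever HTTP client is in use.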


API Endpoints

Authentication

Method  Path                 Description           Auth required
POST    /api/0/auth/login    Obtain session token  No
POST    /api/0/auth/logout   Invalidate session    Token

Users

Method  Path              Description         Role
GET     /api/0/users      List all users      Admin
GET     /api/0/users/me   Own profile         Authenticated
PUT     /api/0/users/me   Update own profile  Authenticated

Notification Channels

Method  Path                                  Description            Role
GET     /api/0/notification_channel_types     Channel type schemas   Authenticated
GET     /api/0/notification_channels          List visible channels  Authenticated
POST    /api/0/notification_channels          Create a channel       Authenticated
PUT     /api/0/notification_channels/{name}   Update a channel       Owner or Admin
DELETE  /api/0/notification_channels/{name}   Delete a channel       Owner or Admin

Host Management

GET /api/0/hosts

Get list of all monitored hosts with their state information. When auth is enabled, only hosts the caller has at least monitor access to are returned.

Response:

[
  {
    "name": "webserver01",
    "dyn": false,
    "owner": "alice",
    "managers": ["bob"],
    "monitors": ["carol"],
    "connections": [...]
  }
]
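As a rough illustration of how a client might use the owner, managers, and monitors fields in this response, the sketch below reports which role a given user holds on each host. The helper name `hosts_for_user` is hypothetical, and the sample data omits the elided `connections` field.

```python
def hosts_for_user(hosts, username):
    """Return (host name, role) pairs for every host where username has a role."""
    result = []
    for host in hosts:
        if host.get("owner") == username:
            role = "owner"
        elif username in (host.get("managers") or []):
            role = "manager"
        elif username in (host.get("monitors") or []):
            role = "monitor"
        else:
            continue  # user has no role on this host
        result.append((host["name"], role))
    return result


sample_hosts = [
    {"name": "webserver01", "dyn": False, "owner": "alice",
     "managers": ["bob"], "monitors": ["carol"]},
]
print(hosts_for_user(sample_hosts, "bob"))
```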

GET /api/0/messages

Get recent heartbeat messages (last 30).

Response:

[
  {
    "time": 1711234567.123,
    "host": "webserver01",
    "msg": "heartbeat received"
  }
]
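The `time` values in this response look like Unix epoch seconds; assuming that is the case, a client could render each message as a readable log line along these lines. The helper name `format_message` is illustrative.

```python
from datetime import datetime, timezone


def format_message(msg):
    """Render one message entry as 'UTC-timestamp host: text'.

    Assumes msg["time"] is a Unix timestamp in seconds.
    """
    ts = datetime.fromtimestamp(msg["time"], tz=timezone.utc)
    stamp = ts.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} {msg['host']}: {msg['msg']}"


sample_message = {"time": 1711234567.123, "host": "webserver01",
                  "msg": "heartbeat received"}
print(format_message(sample_message))
```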

Plugin Data Endpoints

GET /api/0/hosts/{hostname}/plugins

Get all plugin data for a specific host.

Parameters:

  • hostname (path): Name of the host

Response:

{
  "hostname": "webserver01",
  "plugins": {
    "cpu_monitor": {
      "timestamp": 1711234567.123,
      "data": {
        "cpu_percent": 45.2,
        "load_1min": 2.5,
        "load_5min": 2.1,
        "load_15min": 1.8
      },
      "sample_count": 100
    },
    "memory_monitor": {
      "timestamp": 1711234568.456,
      "data": {
        "percent": 65.4,
        "available_mb": 4096,
        "total_mb": 16384
      },
      "sample_count": 100
    }
  }
}

Example:

curl http://localhost:50004/api/0/hosts/webserver01/plugins
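Plugin data can be nested (for example, per-partition disk statistics), so a client may find it convenient to flatten the response into dotted paths. The sketch below does this; the `plugin.metric` path style is chosen to resemble the `metric_path` values shown in the alert responses, which is an assumption about how those paths are formed. The function name is hypothetical.

```python
def flatten_metrics(plugins_response):
    """Flatten {"plugins": {...}} into {"plugin.metric.path": value}."""
    flat = {}

    def walk(prefix, value):
        # Recurse into nested dicts, joining keys with dots
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{prefix}.{key}", child)
        else:
            flat[prefix] = value

    for plugin, info in plugins_response["plugins"].items():
        walk(plugin, info.get("data", {}))
    return flat


sample_plugins = {
    "hostname": "webserver01",
    "plugins": {
        "cpu_monitor": {"timestamp": 1711234567.123,
                        "data": {"cpu_percent": 45.2, "load_1min": 2.5},
                        "sample_count": 100},
        "disk_monitor": {"timestamp": 1711234567.123,
                         "data": {"/": {"percent": 65.0}},
                         "sample_count": 100},
    },
}
for path, value in sorted(flatten_metrics(sample_plugins).items()):
    print(path, value)
```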

GET /api/0/hosts/{hostname}/plugins/{plugin_name}

Get detailed historical data for a specific plugin.

Parameters:

  • hostname (path): Name of the host
  • plugin_name (path): Name of the plugin
  • limit (query, optional): Number of recent samples to return (default: 10)

Response:

{
  "hostname": "webserver01",
  "plugin": "cpu_monitor",
  "samples": [
    {
      "timestamp": 1711234567.123,
      "data": {
        "cpu_percent": 45.2,
        "load_1min": 2.5
      }
    },
    {
      "timestamp": 1711234267.123,
      "data": {
        "cpu_percent": 42.1,
        "load_1min": 2.3
      }
    }
  ],
  "sample_count": 2
}

Examples:

# Get last 1 sample (most recent); quote the URL so the shell does not interpret "?"
curl 'http://localhost:50004/api/0/hosts/webserver01/plugins/cpu_monitor?limit=1'

# Get last 50 samples
curl 'http://localhost:50004/api/0/hosts/webserver01/plugins/memory_monitor?limit=50'

# Get disk monitor data
curl http://localhost:50004/api/0/hosts/database01/plugins/disk_monitor
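Once a history response has been fetched, a client can aggregate over the `samples` array. The following is a minimal sketch that averages one metric across the returned samples; the helper name `metric_average` is an assumption for illustration.

```python
def metric_average(history, metric):
    """Average a metric over all samples that contain it; None if absent."""
    values = [
        sample["data"][metric]
        for sample in history.get("samples", [])
        if metric in sample.get("data", {})
    ]
    if not values:
        return None
    return sum(values) / len(values)


sample_history = {
    "hostname": "webserver01",
    "plugin": "cpu_monitor",
    "samples": [
        {"timestamp": 1711234567.123, "data": {"cpu_percent": 45.2, "load_1min": 2.5}},
        {"timestamp": 1711234267.123, "data": {"cpu_percent": 42.1, "load_1min": 2.3}},
    ],
    "sample_count": 2,
}
print(metric_average(sample_history, "cpu_percent"))
```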

Host Access

GET /api/0/hosts/{hostname}/access

Get owner/managers/monitors for a host. Requires monitor role or higher.

Response:

{
  "owner": "alice",
  "managers": ["bob"],
  "monitors": ["carol"]
}

PUT /api/0/hosts/{hostname}/access

Update owner/managers/monitors. Requires owner role or admin.

Request body (all fields optional):

{ "owner": "bob", "managers": ["carol"], "monitors": [] }

Changes take effect immediately but are not written back to the config file. Update the config file and send SIGHUP to make them permanent.
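Because every field in the request body is optional, a client can send only the fields it wants to change. The sketch below builds such a partial body; it assumes omitted fields are left unchanged by the server, which the text implies but does not state outright. The function name is hypothetical.

```python
def access_update(owner=None, managers=None, monitors=None):
    """Build a PUT body containing only the fields that were supplied.

    An empty list is a real value (clear the list) and is kept;
    None means "do not send this field".
    """
    body = {}
    if owner is not None:
        body["owner"] = owner
    if managers is not None:
        body["managers"] = list(managers)
    if monitors is not None:
        body["monitors"] = list(monitors)
    return body


print(access_update(owner="bob", managers=["carol"], monitors=[]))
print(access_update(monitors=[]))
```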



Notification Channel Endpoints

Channels are visible to all users by default. Channels marked private: true are only visible to their owner. Admins see all channels.

GET /api/0/notification_channel_types

Return the schema for every supported notifier type. Used by the web UI to dynamically render the channel creation form.

Response:

{
  "pushover": {
    "label": "Pushover",
    "fields": [
      {"key": "token",  "label": "App token",  "type": "secret", "required": true},
      {"key": "user",   "label": "User key",   "type": "secret", "required": true},
      {"key": "sound",  "label": "Sound",      "type": "text",   "required": false}
    ]
  },
  "email": { "label": "E-mail", "fields": [ ... ] },
  ...
}

GET /api/0/notification_channels

List channels visible to the current user (public channels + own private channels). Admins receive all channels.

Response:

[
  {
    "name": "pushover_ops",
    "type": "pushover",
    "type_label": "Pushover",
    "owner": null,
    "private": false,
    "min_level": "WARNING",
    "fields": [
      {"key": "token", "label": "App token", "value": "•••", "sensitive": true},
      {"key": "user",  "label": "User key",  "value": "•••", "sensitive": true}
    ]
  }
]

Sensitive fields (type: "secret") are always returned as "•••".


POST /api/0/notification_channels

Create a new channel. The creating user becomes the channel's owner.

Request body:

{
  "name": "my_pushover",
  "type": "pushover",
  "token": "app-token",
  "user": "user-key",
  "min_level": "WARNING",
  "private": true
}

Response: {"ok": true, "name": "my_pushover"}

Status codes: 200 OK, 400 (missing required field or unknown type), 409 (name already exists)
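To avoid a 400 response, a client could check a request body against the schema returned by GET /api/0/notification_channel_types before posting. This is a sketch of such a client-side check, not the server's validation logic; the helper name `missing_required` is an assumption.

```python
def missing_required(type_schemas, body):
    """Return keys of required fields that are absent or empty in body.

    Raises KeyError if body["type"] is not a known channel type.
    """
    schema = type_schemas[body["type"]]
    return [
        field["key"]
        for field in schema["fields"]
        if field.get("required") and not body.get(field["key"])
    ]


sample_schemas = {
    "pushover": {
        "label": "Pushover",
        "fields": [
            {"key": "token", "label": "App token", "type": "secret", "required": True},
            {"key": "user", "label": "User key", "type": "secret", "required": True},
            {"key": "sound", "label": "Sound", "type": "text", "required": False},
        ],
    },
}
incomplete = {"name": "my_pushover", "type": "pushover", "token": "app-token"}
print(missing_required(sample_schemas, incomplete))
```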


PUT /api/0/notification_channels/{name}

Update an existing channel. Only the channel owner or an admin may update it.

Secret fields sent as "•••" are preserved from the existing config (same pattern as OAuth secrets in the admin config editor).

Request body: same shape as POST. Any name in the body is ignored; the channel name is taken from the URL.

Response: {"ok": true}

Status codes: 200 OK, 403 Forbidden, 404 Not Found
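The masked-secret behavior means a client can round-trip a channel from the list response without knowing its secrets. The sketch below builds an update body from a listed channel, leaving masked values in place and overriding only the changed keys. The helper name and the flat body layout (mirroring the POST example) are assumptions.

```python
MASK = "•••"  # placeholder the API returns for secret fields


def build_update_body(channel, changes):
    """Build a PUT body from a listed channel plus a dict of changed values.

    Secret fields keep the mask unless overridden, so the server
    preserves the stored value for them.
    """
    body = {
        "type": channel["type"],
        "min_level": channel["min_level"],
        "private": channel["private"],
    }
    for field in channel["fields"]:
        body[field["key"]] = field["value"]
    body.update(changes)
    return body


listed = {
    "name": "pushover_ops", "type": "pushover", "type_label": "Pushover",
    "owner": None, "private": False, "min_level": "WARNING",
    "fields": [
        {"key": "token", "label": "App token", "value": MASK, "sensitive": True},
        {"key": "user", "label": "User key", "value": MASK, "sensitive": True},
    ],
}
print(build_update_body(listed, {"user": "new-user-key"}))
```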


DELETE /api/0/notification_channels/{name}

Delete a channel. Only the channel owner or an admin may delete it.

Response: {"ok": true}

Status codes: 200 OK, 403 Forbidden, 404 Not Found


Alert Endpoints

GET /api/0/hosts/{hostname}/alerts

Get alert states for a specific host.

Parameters:

  • hostname (path): Name of the host

Response:

{
  "hostname": "webserver01",
  "alerts": [
    {
      "metric_path": "cpu_monitor.cpu_percent",
      "level": "WARNING",
      "since": 1711234000.0,
      "last_value": 85.5,
      "last_check": 1711234567.123,
      "notification_count": 2
    },
    {
      "metric_path": "disk_monitor./.percent",
      "level": "OK",
      "since": 1711230000.0,
      "last_value": 65.0,
      "last_check": 1711234567.123,
      "notification_count": 0
    }
  ],
  "summary": {
    "ok": 15,
    "warning": 1,
    "critical": 0,
    "unknown": 0
  }
}

Example:

curl http://localhost:50004/api/0/hosts/webserver01/alerts
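Assuming `since` and `last_check` are Unix timestamps in seconds (as the sample values suggest), the time an alert has spent in its current level can be derived on the client. The helper names below are illustrative.

```python
def alert_duration(alert):
    """Seconds the alert has been in its current level, as of the last check."""
    return alert["last_check"] - alert["since"]


def format_duration(seconds):
    """Format a duration in seconds as a short human-readable string."""
    seconds = int(seconds)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes}m"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


sample_alert = {"metric_path": "cpu_monitor.cpu_percent", "level": "WARNING",
                "since": 1711234000.0, "last_check": 1711234567.123}
print(format_duration(alert_duration(sample_alert)))
```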

GET /api/0/alerts

Get all active alerts across all monitored hosts.

Response:

{
  "alerts": [
    {
      "hostname": "webserver01",
      "metric_path": "cpu_monitor.cpu_percent",
      "level": "CRITICAL",
      "since": 1711234000.0,
      "last_value": 95.5,
      "last_check": 1711234567.123,
      "notification_count": 3
    },
    {
      "hostname": "database01",
      "metric_path": "memory_monitor.percent",
      "level": "WARNING",
      "since": 1711233000.0,
      "last_value": 88.2,
      "last_check": 1711234567.123,
      "notification_count": 1
    }
  ],
  "summary": {
    "critical": 1,
    "warning": 1,
    "unknown": 0,
    "total": 2
  },
  "host_count": 5
}

Example:

curl http://localhost:50004/api/0/alerts | jq .
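A client that displays this list will often want the most severe and longest-running alerts first. The sketch below sorts by level and then by `since`. The ordering table is an assumption, including the `UNKNOWN` level name, which is inferred from the `unknown` summary key.

```python
# Assumed severity ranking; lower number sorts first
SEVERITY_ORDER = {"CRITICAL": 0, "WARNING": 1, "UNKNOWN": 2, "OK": 3}


def sort_alerts(alerts):
    """Sort alerts by severity, then by how long they have been active."""
    return sorted(
        alerts,
        key=lambda a: (SEVERITY_ORDER.get(a["level"], len(SEVERITY_ORDER)), a["since"]),
    )


sample_alerts = [
    {"hostname": "database01", "metric_path": "memory_monitor.percent",
     "level": "WARNING", "since": 1711233000.0},
    {"hostname": "webserver01", "metric_path": "cpu_monitor.cpu_percent",
     "level": "CRITICAL", "since": 1711234000.0},
]
for alert in sort_alerts(sample_alerts):
    print(alert["level"], alert["hostname"], alert["metric_path"])
```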

Web UI Pages

Login

URL: /login

Shown automatically when a browser request is made without a valid session (when users are configured). After successful login the browser is redirected to the originally requested page.

Logout

URL: /logout

Clears the session cookie and redirects to /login.

Live Dashboard

URL: /live

Real-time dashboard showing:

  • Host connection states
  • IPv4/IPv6 connectivity
  • Latency metrics
  • Recent messages

Features:

  • WebSocket-powered live updates
  • Sortable columns
  • Color-coded status indicators

Plugin Metrics

URL: /plugins

Interactive visualization of plugin metrics:

  • Select host and plugin from dropdown
  • View current metric values
  • Automatic refresh every 30 seconds
  • Support for nested metrics (e.g., per-partition disk stats)

Features:

  • Card-based metric display
  • Unit formatting (%, MB, GB)
  • Nested object visualization
  • Auto-refresh

Available data:

  • CPU usage, load average, frequency
  • Memory usage, available memory, swap
  • Disk usage per partition, I/O statistics
  • Network interface statistics, connection counts
  • Custom plugin data

Alerts Dashboard

URL: /alerts

Comprehensive alert monitoring:

  • Summary cards (Critical, Warning, Total Hosts)
  • Filter by severity (All, Critical, Warning)
  • Alert details with duration
  • Auto-refresh every 15 seconds

Features:

  • Color-coded alert levels
  • Duration tracking
  • Filterable list
  • Real-time updates
  • Summary statistics

Integration Examples

Monitoring Script

#!/bin/bash
# Check for critical alerts and send notification

# Log in first (when auth is configured)
TOKEN=$(curl -s -X POST http://localhost:50004/api/0/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"username":"monitor","password":"secret"}' | jq -r .token)
# Use an array so the header survives word splitting as a single argument
AUTH=(-H "Authorization: Bearer $TOKEN")

RESPONSE=$(curl -s "${AUTH[@]}" http://localhost:50004/api/0/alerts)
CRITICAL_COUNT=$(echo "$RESPONSE" | jq '.summary.critical')

if [ "$CRITICAL_COUNT" -gt 0 ]; then
    echo "CRITICAL: $CRITICAL_COUNT critical alerts detected!"
    echo "$RESPONSE" | jq '.alerts[] | select(.level=="CRITICAL")'
    # Send notification
    # mail -s "Critical Alerts" admin@example.com < alert_details.txt
fi

Python Client

import requests

BASE = 'http://localhost:50004'

# Log in (skip if auth not configured)
resp = requests.post(f'{BASE}/api/0/auth/login',
                     json={"username": "alice", "password": "secret"})
token = resp.json().get("token")
headers = {"Authorization": f"Bearer {token}"} if token else {}

# Get all plugin data for a host
response = requests.get(f'{BASE}/api/0/hosts/webserver01/plugins', headers=headers)
data = response.json()

print(f"Host: {data['hostname']}")
print(f"Plugins: {', '.join(data['plugins'].keys())}")

for plugin, info in data['plugins'].items():
    print(f"\n{plugin}:")
    for metric, value in info['data'].items():
        print(f"  {metric}: {value}")

# Check for alerts
response = requests.get(f'{BASE}/api/0/alerts', headers=headers)
alerts = response.json()

if alerts['summary']['critical'] > 0:
    print(f"\n⚠️  {alerts['summary']['critical']} CRITICAL ALERTS!")
    for alert in alerts['alerts']:
        if alert['level'] == 'CRITICAL':
            print(f"  - {alert['hostname']}: {alert['metric_path']} = {alert['last_value']}")

Grafana Integration

The API endpoints can be used with Grafana's JSON datasource plugin:

  1. Install the SimpleJSON datasource plugin
  2. Configure datasource URL: http://your-server:50004
  3. Create queries:
    • Metrics: /api/0/hosts/webserver01/plugins/cpu_monitor?limit=100
    • Alerts: /api/0/alerts

Prometheus Integration

Native Prometheus export is a planned enhancement. Until then, a small exporter script can poll the API and publish gauges:

# Example Prometheus exporter: polls the HBD API and exposes gauges for scraping
import time

import requests
from prometheus_client import Gauge, start_http_server

BASE = 'http://localhost:50004'

cpu_usage = Gauge('heartbeat_cpu_percent', 'CPU usage percentage', ['hostname'])
memory_usage = Gauge('heartbeat_memory_percent', 'Memory usage percentage', ['hostname'])

def collect_metrics():
    # When auth is configured, pass headers={"Authorization": f"Bearer {token}"} to each request
    hosts = requests.get(f'{BASE}/api/0/hosts').json()
    for host in hosts:
        hostname = host['name']
        plugins = requests.get(f'{BASE}/api/0/hosts/{hostname}/plugins').json()

        if 'cpu_monitor' in plugins['plugins']:
            cpu_data = plugins['plugins']['cpu_monitor']['data']
            cpu_usage.labels(hostname=hostname).set(cpu_data.get('cpu_percent', 0))

        if 'memory_monitor' in plugins['plugins']:
            mem_data = plugins['plugins']['memory_monitor']['data']
            memory_usage.labels(hostname=hostname).set(mem_data.get('percent', 0))

if __name__ == '__main__':
    start_http_server(8000)  # exporter port; 8000 is an arbitrary example value
    while True:
        collect_metrics()
        time.sleep(30)

Response Formats

Success Response

All successful API calls return HTTP 200 with JSON body:

{
  "field": "value",
  ...
}

Error Response

API errors return appropriate HTTP status codes with JSON:

{
  "error": "Host 'unknown-host' not found"
}

Common Status Codes:

  • 200 OK - Success
  • 400 Bad Request - Invalid parameters
  • 401 Unauthorized - Missing or invalid session token
  • 403 Forbidden - Authenticated but insufficient role
  • 404 Not Found - Resource not found
  • 500 Internal Server Error - Server error
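A client can turn the error body and status codes listed above into exceptions in one place. The following is a minimal sketch; `ApiError` and `check_response` are hypothetical names, and the function takes an already-decoded status and body so it works with any HTTP library.

```python
class ApiError(Exception):
    """Raised when the API returns a non-success status code."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def check_response(status, body):
    """Return body for 2xx responses; raise ApiError otherwise."""
    if 200 <= status < 300:
        return body
    # Error bodies are expected to look like {"error": "..."}
    if isinstance(body, dict):
        message = body.get("error", "unknown error")
    else:
        message = str(body)
    raise ApiError(status, message)


try:
    check_response(404, {"error": "Host 'unknown-host' not found"})
except ApiError as exc:
    print(exc.status, exc.message)
```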

WebSocket API

For real-time updates, connect to the WebSocket endpoint:

URL: ws://your-server:50005/hbd (or wss:// for secure)

Messages:

{
  "type": "host",
  "data": {
    "name": "webserver01",
    "state": "UP"
  }
}
{
  "type": "plugin",
  "data": {
    "host": "webserver01",
    "plugin": "cpu_monitor",
    "data": {...},
    "timestamp": 1711234567.123
  }
}
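The two message types shown above can be handled with a small dispatcher that keeps a local view of host and plugin state. This sketch assumes each WebSocket frame carries exactly one JSON object; the function name and the shape of the `state` dict are illustrative, and the WebSocket connection itself is left to the client library in use.

```python
import json


def apply_message(state, raw):
    """Update state in place from one raw JSON message and return it."""
    msg = json.loads(raw)
    data = msg.get("data", {})
    if msg.get("type") == "host":
        state["hosts"][data["name"]] = data.get("state")
    elif msg.get("type") == "plugin":
        # Keep only the latest payload per (host, plugin) pair
        state["plugins"][(data["host"], data["plugin"])] = data
    # Unknown message types are ignored
    return state


state = {"hosts": {}, "plugins": {}}
apply_message(state, '{"type": "host", "data": {"name": "webserver01", "state": "UP"}}')
apply_message(state, json.dumps({
    "type": "plugin",
    "data": {"host": "webserver01", "plugin": "cpu_monitor",
             "data": {"cpu_percent": 45.2}, "timestamp": 1711234567.123},
}))
print(state["hosts"])
```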

Configuration

Enable HTTP Server

# In your hbd configuration file
hbd_host: ""           # Listen on all interfaces
hbd_port: 50004        # HTTP port
ws_port: 50005         # WebSocket port (optional)
# wss_port: 50006      # Secure WebSocket (requires SSL)

SSL/TLS Configuration

For secure WebSocket connections:

wss_port: 50006
cert_path: /etc/heartbeat/certs/
wss_pem: server.pem
wss_key: server.key

Rate Limiting

The API currently does not implement rate limiting. For production use, consider:

  • Placing behind a reverse proxy (nginx, Apache)
  • Using API gateway for rate limiting
  • Implementing caching for frequently accessed endpoints
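As one way to approach the caching suggestion, a client or proxy layer could hold responses for a short time before fetching again. The sketch below is a generic time-to-live cache; the class name, the `fetch` callback, and the TTL value are all assumptions, not part of HBD.

```python
import time


class TTLCache:
    """Cache values per key for ttl_seconds, refetching after expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable for testing
        self._entries = {}   # key -> (stored_at, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() if missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=15)
# The lambda stands in for a real HTTP request to /api/0/alerts
alerts = cache.get("/api/0/alerts", lambda: {"alerts": [], "summary": {}})
print(alerts)
```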

CORS Support

By default, CORS is not enabled. To enable for web applications:

# In http.py, add CORS middleware
import aiohttp_cors
from aiohttp import web

app = web.Application()
# Register all routes on app.router before configuring CORS
cors = aiohttp_cors.setup(app)

# Configure CORS for all routes
for route in list(app.router.routes()):
    cors.add(route, {
        "*": aiohttp_cors.ResourceOptions(
            allow_credentials=True,
            expose_headers="*",
            allow_headers="*",
        )
    })

Performance Considerations

Caching

  • Plugin data is cached in memory (last 100 samples per plugin)
  • No database queries required
  • Responses are fast (<10ms typical)

Scalability

  • Each host stores its own data independently
  • Memory usage: ~1KB per host + ~1KB per plugin sample
  • For 100 hosts with 5 plugins each, at the 100-sample cap per plugin: 100 × 5 × 100 × ~1KB ≈ 50MB memory
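The memory figure follows from the per-host and per-sample estimates combined with the 100-sample cache per plugin mentioned under Caching. A small calculation makes the arithmetic explicit; the function name is illustrative and the 1 KB constants are the rough estimates from this section, not measured values.

```python
def estimate_memory_kb(hosts, plugins_per_host, samples_per_plugin=100,
                       kb_per_host=1, kb_per_sample=1):
    """Rough memory estimate in KB: host overhead plus cached samples."""
    host_overhead = hosts * kb_per_host
    sample_storage = hosts * plugins_per_host * samples_per_plugin * kb_per_sample
    return host_overhead + sample_storage


total_kb = estimate_memory_kb(hosts=100, plugins_per_host=5)
print(f"{total_kb} KB, roughly {total_kb / 1000:.0f} MB")
```

Sample storage dominates the total, so the estimate scales almost linearly with hosts × plugins.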

Best Practices

  1. Use limit parameter to control response size
  2. Cache responses on client side when appropriate
  3. Use WebSocket for real-time updates instead of polling
  4. Consider pagination for large deployments (future enhancement)

Troubleshooting

API Returns 401

  • Auth is configured — include Authorization: Bearer <token> header
  • Token may have expired (24 h TTL) — log in again
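For long-running clients, one way to avoid expiry-related 401 responses is to track when the token was obtained and log in again shortly before the 24 h TTL elapses. This is a sketch under that assumption; `TokenManager`, the `login` callback, and the five-minute safety margin are hypothetical choices.

```python
import time

TOKEN_TTL_SECONDS = 24 * 3600  # TTL stated in the troubleshooting notes


class TokenManager:
    """Hand out a session token, refreshing it before it expires."""

    def __init__(self, login, ttl=TOKEN_TTL_SECONDS, margin=300, clock=time.time):
        self._login = login      # callable that performs the login and returns a token
        self._ttl = ttl
        self._margin = margin    # refresh this many seconds early
        self._clock = clock
        self._token = None
        self._issued_at = 0.0

    def token(self):
        now = self._clock()
        expired = now - self._issued_at >= self._ttl - self._margin
        if self._token is None or expired:
            self._token = self._login()
            self._issued_at = now
        return self._token

    def invalidate(self):
        """Force a fresh login on next use, e.g. after receiving a 401."""
        self._token = None


# The lambda stands in for a real POST to /api/0/auth/login
manager = TokenManager(login=lambda: "example-token")
print(manager.token())
```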

API Returns 403

  • Authenticated user lacks the required role for this host/action
  • Check host's owner, managers, monitors config

API Returns 404

  • Verify hostname in URL matches actual host name
  • Check host is sending heartbeats: curl http://localhost:50004/api/0/hosts

No Plugin Data

  • Verify client is configured with plugins
  • Check client logs for plugin errors
  • Ensure plugins are sending data (check journal logs)

Empty Alerts

  • Verify thresholds are configured
  • Check host is in watchhosts list
  • Ensure plugins are collecting metrics
  • Review server logs for threshold checker errors

See Also