heartbeat/docs/HTTP_API.md
2026-04-02 07:17:00 -04:00

HTTP API and Web UI Documentation

Overview

The Heartbeat Daemon provides an HTTP API and a web-based UI for monitoring plugin data and alert states. The API follows RESTful conventions and returns JSON responses.

Base URL

All API endpoints are relative to the server base URL:

http://your-server:50004

The default port is 50004 (configurable via hbd_port in the configuration file).


API Endpoints

Host Management

GET /api/0/hosts

Get list of all monitored hosts with their state information.

Response:

[
  {
    "name": "webserver01",
    "dyn": false,
    "connections": [...]
  }
]

GET /api/0/messages

Get recent heartbeat messages (last 30).

Response:

[
  {
    "time": 1711234567.123,
    "host": "webserver01",
    "msg": "heartbeat received"
  }
]
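The sample values suggest that time is a Unix timestamp in seconds; that is an assumption, not something the API guarantees here. Under that assumption, a minimal sketch of turning a message into a readable log line (format_message is a hypothetical helper, not part of the daemon):

```python
from datetime import datetime, timezone


def format_message(msg: dict) -> str:
    """Render one /api/0/messages entry as a single log line.

    Assumes "time" is a Unix timestamp in seconds and prints it in UTC.
    """
    when = datetime.fromtimestamp(msg["time"], tz=timezone.utc)
    return f"{when:%Y-%m-%d %H:%M:%S}Z {msg['host']}: {msg['msg']}"


sample = {"time": 1711234567.123, "host": "webserver01", "msg": "heartbeat received"}
print(format_message(sample))
```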

Plugin Data Endpoints

GET /api/0/hosts/{hostname}/plugins

Get all plugin data for a specific host.

Parameters:

  • hostname (path): Name of the host

Response:

{
  "hostname": "webserver01",
  "plugins": {
    "cpu_monitor": {
      "timestamp": 1711234567.123,
      "data": {
        "cpu_percent": 45.2,
        "load_1min": 2.5,
        "load_5min": 2.1,
        "load_15min": 1.8
      },
      "sample_count": 100
    },
    "memory_monitor": {
      "timestamp": 1711234568.456,
      "data": {
        "percent": 65.4,
        "available_mb": 4096,
        "total_mb": 16384
      },
      "sample_count": 100
    }
  }
}

Example:

curl http://localhost:50004/api/0/hosts/webserver01/plugins
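Plugin data can be nested (for example, per-partition disk statistics). The sketch below flattens the plugins object into dotted names such as cpu_monitor.cpu_percent. The dotted naming mirrors the metric_path values shown in the alert responses, but treating the two as identical is an assumption; flatten_metrics is a hypothetical helper.

```python
def flatten_metrics(plugins: dict) -> dict:
    """Flatten {"plugin": {"data": {...}}} into {"plugin.metric": value}.

    Nested dictionaries inside "data" are expanded recursively, so
    {"disk_monitor": {"data": {"/": {"percent": 65.0}}}} becomes
    {"disk_monitor./.percent": 65.0}.
    """
    flat = {}

    def walk(prefix, value):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(f"{prefix}.{key}", child)
        else:
            flat[prefix] = value

    for name, info in plugins.items():
        walk(name, info.get("data", {}))
    return flat


example = {
    "cpu_monitor": {"timestamp": 1711234567.123, "data": {"cpu_percent": 45.2}},
    "disk_monitor": {"timestamp": 1711234567.123, "data": {"/": {"percent": 65.0}}},
}
print(flatten_metrics(example))
```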

GET /api/0/hosts/{hostname}/plugins/{plugin_name}

Get detailed historical data for a specific plugin.

Parameters:

  • hostname (path): Name of the host
  • plugin_name (path): Name of the plugin
  • limit (query, optional): Number of recent samples to return (default: 10)

Response:

{
  "hostname": "webserver01",
  "plugin": "cpu_monitor",
  "samples": [
    {
      "timestamp": 1711234567.123,
      "data": {
        "cpu_percent": 45.2,
        "load_1min": 2.5
      }
    },
    {
      "timestamp": 1711234267.123,
      "data": {
        "cpu_percent": 42.1,
        "load_1min": 2.3
      }
    }
  ],
  "sample_count": 2
}

Examples:

# Get the most recent sample (quote the URL so the shell does not interpret "?")
curl "http://localhost:50004/api/0/hosts/webserver01/plugins/cpu_monitor?limit=1"

# Get last 50 samples
curl "http://localhost:50004/api/0/hosts/webserver01/plugins/memory_monitor?limit=50"

# Get disk monitor data
curl http://localhost:50004/api/0/hosts/database01/plugins/disk_monitor
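A minimal Python sketch of the same requests: one helper builds the URL with an optional limit, and another averages a metric over the returned samples. Both helper names are hypothetical, and the averaging assumes the samples array has the shape shown in the response above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:50004"


def plugin_history_url(hostname, plugin, limit=None, base_url=BASE_URL):
    """Build the URL for GET /api/0/hosts/{hostname}/plugins/{plugin_name}."""
    path = f"/api/0/hosts/{quote(hostname, safe='')}/plugins/{quote(plugin, safe='')}"
    query = f"?{urlencode({'limit': limit})}" if limit is not None else ""
    return f"{base_url}{path}{query}"


def metric_average(samples, metric):
    """Average one metric over a list of samples; None if it never appears."""
    values = [s["data"][metric] for s in samples if metric in s.get("data", {})]
    return sum(values) / len(values) if values else None


samples = [
    {"timestamp": 1711234567.123, "data": {"cpu_percent": 45.2, "load_1min": 2.5}},
    {"timestamp": 1711234267.123, "data": {"cpu_percent": 42.1, "load_1min": 2.3}},
]
print(plugin_history_url("webserver01", "cpu_monitor", limit=50))
print(metric_average(samples, "cpu_percent"))
```

The samples list here would normally come from the "samples" field of the JSON response.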

Alert Endpoints

GET /api/0/hosts/{hostname}/alerts

Get alert states for a specific host.

Parameters:

  • hostname (path): Name of the host

Response:

{
  "hostname": "webserver01",
  "alerts": [
    {
      "metric_path": "cpu_monitor.cpu_percent",
      "level": "WARNING",
      "since": 1711234000.0,
      "last_value": 85.5,
      "last_check": 1711234567.123,
      "notification_count": 2
    },
    {
      "metric_path": "disk_monitor./.percent",
      "level": "OK",
      "since": 1711230000.0,
      "last_value": 65.0,
      "last_check": 1711234567.123,
      "notification_count": 0
    }
  ],
  "summary": {
    "ok": 15,
    "warning": 1,
    "critical": 0,
    "unknown": 0
  }
}

Example:

curl http://localhost:50004/api/0/hosts/webserver01/alerts
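The since and last_check fields make it possible to derive how long an alert has been in its current state. A small sketch, assuming both are Unix timestamps in seconds and that "OK" is the only non-alerting level; active_alerts is a hypothetical helper.

```python
def active_alerts(response: dict) -> list:
    """Return the non-OK alerts, each with a derived "duration_s" field."""
    result = []
    for alert in response.get("alerts", []):
        if alert["level"] == "OK":
            continue
        # Time spent in the current state, as of the last threshold check.
        duration = alert["last_check"] - alert["since"]
        result.append({**alert, "duration_s": round(duration, 3)})
    return result


response = {
    "hostname": "webserver01",
    "alerts": [
        {"metric_path": "cpu_monitor.cpu_percent", "level": "WARNING",
         "since": 1711234000.0, "last_check": 1711234567.123},
        {"metric_path": "disk_monitor./.percent", "level": "OK",
         "since": 1711230000.0, "last_check": 1711234567.123},
    ],
}
for alert in active_alerts(response):
    print(alert["metric_path"], alert["level"], alert["duration_s"])
```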

GET /api/0/alerts

Get all active alerts across all monitored hosts.

Response:

{
  "alerts": [
    {
      "hostname": "webserver01",
      "metric_path": "cpu_monitor.cpu_percent",
      "level": "CRITICAL",
      "since": 1711234000.0,
      "last_value": 95.5,
      "last_check": 1711234567.123,
      "notification_count": 3
    },
    {
      "hostname": "database01",
      "metric_path": "memory_monitor.percent",
      "level": "WARNING",
      "since": 1711233000.0,
      "last_value": 88.2,
      "last_check": 1711234567.123,
      "notification_count": 1
    }
  ],
  "summary": {
    "critical": 1,
    "warning": 1,
    "unknown": 0,
    "total": 2
  },
  "host_count": 5
}

Example:

curl http://localhost:50004/api/0/alerts | jq .
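For reporting, it can help to group the global alert list by host with the most severe entries first. The sketch below does this on an already-parsed response. The severity ordering, and "UNKNOWN" as a level name, are assumptions inferred from the summary keys; alerts_by_host is a hypothetical helper.

```python
from collections import defaultdict

# Assumed ordering; levels not listed here sort last.
SEVERITY_ORDER = {"CRITICAL": 0, "WARNING": 1, "UNKNOWN": 2}


def alerts_by_host(response: dict) -> dict:
    """Group alerts by hostname, most severe first within each host."""
    ordered = sorted(
        response.get("alerts", []),
        key=lambda a: SEVERITY_ORDER.get(a["level"], 99),
    )
    grouped = defaultdict(list)
    for alert in ordered:
        grouped[alert["hostname"]].append(alert)
    return dict(grouped)


response = {
    "alerts": [
        {"hostname": "database01", "metric_path": "memory_monitor.percent", "level": "WARNING"},
        {"hostname": "webserver01", "metric_path": "cpu_monitor.cpu_percent", "level": "CRITICAL"},
        {"hostname": "database01", "metric_path": "disk_monitor./.percent", "level": "CRITICAL"},
    ]
}
for host, alerts in alerts_by_host(response).items():
    print(host, [a["level"] for a in alerts])
```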

Web UI Pages

Live Dashboard

URL: /live

Real-time dashboard showing:

  • Host connection states
  • IPv4/IPv6 connectivity
  • Latency metrics
  • Recent messages

Features:

  • WebSocket-powered live updates
  • Sortable columns
  • Color-coded status indicators

Plugin Metrics

URL: /plugins

Interactive visualization of plugin metrics:

  • Select host and plugin from dropdown
  • View current metric values
  • Automatic refresh every 30 seconds
  • Support for nested metrics (e.g., per-partition disk stats)

Features:

  • Card-based metric display
  • Unit formatting (%, MB, GB)
  • Nested object visualization
  • Auto-refresh

Types of data available:

  • CPU usage, load average, frequency
  • Memory usage, available memory, swap
  • Disk usage per partition, I/O statistics
  • Network interface statistics, connection counts
  • Custom plugin data

Alerts Dashboard

URL: /alerts

Comprehensive alert monitoring:

  • Summary cards (Critical, Warning, Total Hosts)
  • Filter by severity (All, Critical, Warning)
  • Alert details with duration
  • Auto-refresh every 15 seconds

Features:

  • Color-coded alert levels
  • Duration tracking
  • Filterable list
  • Real-time updates
  • Summary statistics

Integration Examples

Monitoring Script

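#!/bin/bash
# Check for critical alerts and send a notification

# -f makes curl fail on HTTP errors instead of handing an error body to jq
RESPONSE=$(curl -sf http://localhost:50004/api/0/alerts) || {
    echo "ERROR: could not reach the Heartbeat API" >&2
    exit 2
}
# Fall back to 0 if the field is missing so the numeric test below cannot fail
CRITICAL_COUNT=$(echo "$RESPONSE" | jq -r '.summary.critical // 0')

if [ "$CRITICAL_COUNT" -gt 0 ]; then
    echo "CRITICAL: $CRITICAL_COUNT critical alerts detected!"
    echo "$RESPONSE" | jq '.alerts[] | select(.level=="CRITICAL")'
    # Send notification, for example:
    # echo "$RESPONSE" | mail -s "Critical Alerts" admin@example.com
fi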

Python Client

import requests

BASE_URL = 'http://localhost:50004'

# Get all plugin data for a host
response = requests.get(f'{BASE_URL}/api/0/hosts/webserver01/plugins', timeout=5)
response.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body
data = response.json()

print(f"Host: {data['hostname']}")
print(f"Plugins: {', '.join(data['plugins'].keys())}")

for plugin, info in data['plugins'].items():
    print(f"\n{plugin}:")
    for metric, value in info['data'].items():
        print(f"  {metric}: {value}")

# Check for alerts
response = requests.get(f'{BASE_URL}/api/0/alerts', timeout=5)
response.raise_for_status()
alerts = response.json()

if alerts['summary']['critical'] > 0:
    print(f"\n⚠️  {alerts['summary']['critical']} CRITICAL ALERTS!")
    for alert in alerts['alerts']:
        if alert['level'] == 'CRITICAL':
            print(f"  - {alert['hostname']}: {alert['metric_path']} = {alert['last_value']}")

Grafana Integration

The API endpoints can be used with Grafana's JSON datasource plugin:

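  1. Install a JSON datasource plugin that can query arbitrary HTTP endpoints. Note that the SimpleJSON plugin expects the backend to implement its own /search and /query routes, which are not among the endpoints documented here.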
  2. Configure datasource URL: http://your-server:50004
  3. Create queries:
    • Metrics: /api/0/hosts/webserver01/plugins/cpu_monitor?limit=100
    • Alerts: /api/0/alerts

Prometheus Integration

Native export in Prometheus format is a future enhancement. Until then, a small exporter can poll the API and republish the values:

# Example Prometheus exporter (sketch)
import time

import requests
from prometheus_client import Gauge, start_http_server

BASE_URL = 'http://localhost:50004'

cpu_usage = Gauge('heartbeat_cpu_percent', 'CPU usage percentage', ['hostname'])
memory_usage = Gauge('heartbeat_memory_percent', 'Memory usage percentage', ['hostname'])

def collect_metrics():
    hosts = requests.get(f'{BASE_URL}/api/0/hosts', timeout=5).json()
    for host in hosts:
        hostname = host['name']
        reply = requests.get(f'{BASE_URL}/api/0/hosts/{hostname}/plugins', timeout=5).json()
        # An error reply has no "plugins" key, so fall back to an empty dict
        plugins = reply.get('plugins', {})

        if 'cpu_monitor' in plugins:
            cpu_data = plugins['cpu_monitor']['data']
            cpu_usage.labels(hostname=hostname).set(cpu_data.get('cpu_percent', 0))

        if 'memory_monitor' in plugins:
            mem_data = plugins['memory_monitor']['data']
            memory_usage.labels(hostname=hostname).set(mem_data.get('percent', 0))

if __name__ == '__main__':
    start_http_server(9101)  # exporter port chosen for illustration only
    while True:
        collect_metrics()
        time.sleep(30)

Response Formats

Success Response

All successful API calls return HTTP 200 with a JSON body (an object or an array, depending on the endpoint):

{
  "field": "value",
  ...
}

Error Response

API errors return appropriate HTTP status codes with JSON:

{
  "error": "Host 'unknown-host' not found"
}

Common Status Codes:

  • 200 OK - Success
  • 400 Bad Request - Invalid parameters
  • 404 Not Found - Resource not found
  • 500 Internal Server Error - Server error
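A client can use the status code together with the error field to produce a readable message. A minimal sketch, assuming error bodies have the {"error": ...} shape shown above and falling back to the raw body otherwise; explain_response is a hypothetical helper.

```python
import json


def explain_response(status: int, body: str) -> str:
    """Summarize an API reply given its HTTP status code and raw body text."""
    if status == 200:
        return "ok"
    try:
        payload = json.loads(body)
        # Use the documented "error" field when the body is a JSON object.
        detail = payload.get("error", body) if isinstance(payload, dict) else body
    except json.JSONDecodeError:
        # Non-JSON body (for example, a proxy error page): report it as-is.
        detail = body
    return f"HTTP {status}: {detail}"


print(explain_response(404, '{"error": "Host \'unknown-host\' not found"}'))
```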

WebSocket API

For real-time updates, connect to the WebSocket endpoint:

URL: ws://your-server:50005/hbd (for secure connections, use wss:// with the configured wss_port)

Messages:

{
  "type": "host",
  "data": {
    "name": "webserver01",
    "state": "UP"
  }
}

{
  "type": "plugin",
  "data": {
    "host": "webserver01",
    "plugin": "cpu_monitor",
    "data": {...},
    "timestamp": 1711234567.123
  }
}
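Each frame carries a type field, so a client can route messages to per-type handlers. The sketch below covers only the parsing and dispatch step and assumes every frame is a single JSON object shaped like the examples above; the raw text can come from any WebSocket client library. dispatch and the handler names are hypothetical.

```python
import json


def dispatch(raw: str, handlers: dict) -> bool:
    """Parse one WebSocket text frame and call the handler for its type.

    Returns True if a handler was found, False for unknown message types.
    """
    message = json.loads(raw)
    handler = handlers.get(message.get("type"))
    if handler is None:
        return False
    handler(message.get("data", {}))
    return True


host_states = {}
handlers = {
    "host": lambda data: host_states.update({data["name"]: data["state"]}),
    "plugin": lambda data: print(f"{data['host']}/{data['plugin']} updated"),
}

dispatch('{"type": "host", "data": {"name": "webserver01", "state": "UP"}}', handlers)
print(host_states)
```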

Configuration

Enable HTTP Server

# In your hbd configuration file
hbd_host: ""           # Listen on all interfaces
hbd_port: 50004        # HTTP port
ws_port: 50005         # WebSocket port (optional)
# wss_port: 50006      # Secure WebSocket (requires SSL)

SSL/TLS Configuration

For secure WebSocket connections:

wss_port: 50006
cert_path: /etc/heartbeat/certs/
wss_pem: server.pem
wss_key: server.key

Rate Limiting

The API currently does not implement rate limiting. For production use, consider:

  • Placing behind a reverse proxy (nginx, Apache)
  • Using API gateway for rate limiting
  • Implementing caching for frequently accessed endpoints

CORS Support

By default, CORS is not enabled. To enable for web applications:

# In http.py, add CORS middleware
import aiohttp_cors
from aiohttp import web

app = web.Application()
# ... register the application's routes here, before configuring CORS ...

# Default CORS options applied to every route added below
cors = aiohttp_cors.setup(app, defaults={
    "*": aiohttp_cors.ResourceOptions(
        allow_credentials=True,
        expose_headers="*",
        allow_headers="*",
    )
})

# Configure CORS for all routes
for route in list(app.router.routes()):
    cors.add(route)

Performance Considerations

Caching

  • Plugin data is cached in memory (last 100 samples per plugin)
  • No database queries required
  • Responses are fast (<10ms typical)

Scalability

  • Each host stores its own data independently
  • Memory usage: ~1KB per host + ~1KB per plugin sample
  • For 100 hosts with 5 plugins and the full 100 cached samples per plugin: 100 × 5 × 100 × ~1KB ≈ 50MB memory
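The estimate above can be reproduced for other deployment sizes. This sketch uses only the rough per-host and per-sample figures quoted in this section and the 100-sample cache limit; estimate_memory_kb is a hypothetical helper, and real usage will vary with plugin payload size.

```python
def estimate_memory_kb(hosts, plugins_per_host, samples_per_plugin=100,
                       kb_per_host=1, kb_per_sample=1):
    """Rough memory estimate in KB: per-host overhead plus cached samples."""
    host_overhead = hosts * kb_per_host
    sample_storage = hosts * plugins_per_host * samples_per_plugin * kb_per_sample
    return host_overhead + sample_storage


total_kb = estimate_memory_kb(hosts=100, plugins_per_host=5)
print(f"{total_kb} KB (about {total_kb / 1000:.0f} MB)")
```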

Best Practices

  1. Use limit parameter to control response size
  2. Cache responses on client side when appropriate
  3. Use WebSocket for real-time updates instead of polling
  4. Consider pagination for large deployments (future enhancement)
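Client-side caching can be as simple as remembering each response for a few seconds. A minimal sketch of a time-based cache that wraps any fetch function; TTLCache is a hypothetical helper, and the 15-second default is only an illustrative choice.

```python
import time


class TTLCache:
    """Cache results of fetch(url) for ttl_seconds per URL."""

    def __init__(self, fetch, ttl_seconds=15.0, clock=time.monotonic):
        self._fetch = fetch      # callable taking a URL and returning the response
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._entries = {}       # url -> (time stored, value)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # still fresh: skip the request
        value = self._fetch(url)
        self._entries[url] = (now, value)
        return value
```

With the requests library, for example, this could be used as cache = TTLCache(lambda url: requests.get(url, timeout=5).json()), followed by cache.get('http://localhost:50004/api/0/alerts').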

Troubleshooting

API Returns 404

  • Verify hostname in URL matches actual host name
  • Check host is sending heartbeats: curl http://localhost:50004/api/0/hosts
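A hostname mismatch is often just a typo or a difference in case. The sketch below compares the requested name against the parsed /api/0/hosts response and suggests close matches; diagnose_hostname is a hypothetical helper, and the similarity cutoff is an arbitrary choice.

```python
from difflib import get_close_matches


def diagnose_hostname(hosts: list, wanted: str) -> str:
    """Explain whether `wanted` appears in a parsed /api/0/hosts response."""
    names = [h["name"] for h in hosts]
    if wanted in names:
        return f"'{wanted}' is known to the daemon"
    close = get_close_matches(wanted, names, n=3, cutoff=0.6)
    if close:
        return f"'{wanted}' not found; did you mean: {', '.join(close)}?"
    return f"'{wanted}' not found; {len(names)} host(s) known"


hosts = [{"name": "webserver01", "dyn": False}, {"name": "database01", "dyn": False}]
print(diagnose_hostname(hosts, "Webserver01"))
```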

No Plugin Data

  • Verify client is configured with plugins
  • Check client logs for plugin errors
  • Ensure plugins are sending data (check journal logs)

Empty Alerts

  • Verify thresholds are configured
  • Check host is in watchhosts list
  • Ensure plugins are collecting metrics
  • Review server logs for threshold checker errors

See Also