docs: add wiki home page with overview and getting started guide
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,210 @@
# Heartbeat

Heartbeat is a lightweight host monitoring system built around a simple idea: each machine you want to monitor runs a small client (`hbc`) that sends a UDP "heartbeat" packet to a central server (`hbd`) at a regular interval. If heartbeats stop arriving, you get notified. Alongside reachability, clients can ship system metrics (CPU, memory, disk, network), and the server alerts you when any of them crosses a threshold.
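
To make the idea concrete, here is a minimal sketch of what sending one heartbeat over UDP could look like. It is not the actual `hbc` wire format: the JSON payload, its field names, and the `build_heartbeat` / `send_heartbeat` helpers are assumptions made purely for illustration. Only the UDP transport comes from this page.

```python
import json
import socket
import time


def build_heartbeat(hostname: str, seq: int) -> bytes:
    """Encode a hypothetical heartbeat payload (NOT the real hbc format)."""
    payload = {"host": hostname, "seq": seq, "ts": time.time()}
    return json.dumps(payload).encode("utf-8")


def send_heartbeat(server: str, port: int, hostname: str, seq: int) -> int:
    """Send one heartbeat datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # UDP is fire-and-forget: sendto() succeeds even if nobody is listening.
        return sock.sendto(build_heartbeat(hostname, seq), (server, port))
```

Calling `send_heartbeat("127.0.0.1", 50003, "web01", 0)` in a loop with a sleep between calls would mimic a client checking in at a fixed interval. Because UDP gives no delivery guarantee, it is the server that has to notice when heartbeats stop.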

## How it works

```
[ monitored host ]                       [ your server ]
  ┌─────────────┐     UDP 50003     ┌────────────────────────┐
  │     hbc     │   ────────────>   │          hbd           │
  │             │                   │  host state tracking   │
  │  plugins:   │   <────────────   │  threshold alerting    │
  │  cpu, mem,  │     ACK / CMD     │  notifications         │
  │  disk, ...  │                   │  web dashboard + API   │
  └─────────────┘                   └────────────────────────┘
```

- **hbd**: the server daemon. Tracks which hosts are alive, evaluates metric thresholds, fires notifications, and serves the web dashboard and REST API.
- **hbc**: the client. Sends heartbeats and plugin data over UDP. Runs on any Linux, BSD, or macOS host.
- **hbc_mini**: a zero-dependency, single-file alternative (`hbc_mini.py` or `hbc_mini.c`) for hosts where you can't install Python packages.

Notifications can go to Pushover, email, Mattermost, Matrix, Signal, or VoIP.ms SMS. The dashboard shows host connectivity, RTT graphs, active alerts, and per-host plugin metrics in real time over a WebSocket connection.

---

## Getting started

This tutorial sets up a server on one machine and a client on a second machine. You'll end up with a working dashboard and your first monitored host.

### 1. Install the server

On the machine that will run `hbd`:

```bash
git clone https://git.wrede.ca/andreas/heartbeat.git
cd heartbeat
python3 -m venv .venv
source .venv/bin/activate
pip install .
```

Verify the install:

```bash
hbd --help
```

### 2. Create a server config

Create `~/.hb.yaml`:

```yaml
hb_port: 50003        # UDP port: clients send heartbeats here
hbd_port: 50004       # HTTP port: web dashboard and API
ws_port: 50005        # WebSocket port: live dashboard updates

interval: 20          # Expected heartbeat interval (seconds)
grace: 2              # Seconds of slack before a host is considered overdue

pickfile: ~/.hb.pick
pidfile: ~/.hb.pid
logfile: ~/.hb.log
```

That's enough to get started. You don't need to define hosts, users, or notifications yet; the server accepts any client that connects.
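
The `interval` and `grace` settings work together to decide when a host counts as overdue. The sketch below shows one plausible way to combine them. The rule "overdue once more than `interval + grace` seconds have passed" is an assumption, as are the `Timing` class and `is_overdue` function. `hbd` may apply a different rule internally.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    interval: float = 20.0  # expected seconds between heartbeats
    grace: float = 2.0      # extra slack before a host counts as overdue


def is_overdue(last_seen: float, now: float, timing: Timing) -> bool:
    """Return True once more than interval + grace seconds have elapsed.

    Assumed rule for illustration only; not taken from the hbd source.
    """
    return (now - last_seen) > (timing.interval + timing.grace)
```

Under this assumed rule and the values above, a host last seen at t=100 is still fine at t=122 and becomes overdue just after that.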

### 3. Start the server

```bash
hbd serve -c ~/.hb.yaml -f -v
```

`-f` keeps it in the foreground so you can watch the log. You should see:

```
Heartbeat daemon starting on UDP :50003, HTTP :50004, WS :50005
```

Open `http://your-server:50004/live` in a browser. The dashboard will be empty until a client checks in.
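
If the page does not load, it can help to first confirm that the HTTP port accepts TCP connections at all. The following is a generic check and not part of Heartbeat. `port_is_open` is a hypothetical helper and the host name is a placeholder. It applies to the HTTP and WebSocket ports only, since the heartbeat port is UDP and cannot be probed this way.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `port_is_open("your-server", 50004)` returning `False` points at a firewall, a wrong host name, or a server that is not running, rather than at the dashboard itself.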

### 4. Install the client on a host to monitor

On the machine you want to monitor (it must be able to reach the server on UDP port 50003):

```bash
pip install hbd   # or: copy scripts/hbc_mini.py if you can't install packages
```

#### Quick start (no config file)

```bash
hbc your-server.example.com
```

Within a few seconds the server log will show the host checking in, and it will appear on the dashboard.

#### With a config file

Create `~/.hbc.yaml` on the client host:

```yaml
hb_port: 50003
interval: 10          # Send a heartbeat every 10 seconds

plugins:
  cpu_monitor:
    interval: 60
  memory_monitor:
    interval: 60
  disk_monitor:
    interval: 60
```
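
The config above combines a 10-second heartbeat with per-plugin intervals of 60. The sketch below shows how such a schedule could play out over one minute. It assumes that plugin intervals are in seconds and that plugin data goes out on the heartbeat tick where its interval elapses. Neither is confirmed here, and `schedule` is a hypothetical helper, not `hbc` code.

```python
from typing import Dict, List, Tuple


def schedule(
    heartbeat_interval: int,
    plugin_intervals: Dict[str, int],
    duration: int,
) -> List[Tuple[int, List[str]]]:
    """List (second, plugins_due) for every heartbeat tick in [0, duration]."""
    ticks = []
    for t in range(0, duration + 1, heartbeat_interval):
        # Assumed rule: a plugin is due whenever t is a multiple of its interval.
        due = sorted(name for name, every in plugin_intervals.items() if t % every == 0)
        ticks.append((t, due))
    return ticks


plugins = {"cpu_monitor": 60, "memory_monitor": 60, "disk_monitor": 60}
for second, due in schedule(10, plugins, 60):
    # Most ticks carry only the heartbeat; all three plugins are due at 0 and 60.
    print(second, due)
```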

Then start the client:

```bash
hbc -c ~/.hbc.yaml your-server.example.com
```

Send a boot message at startup so the server logs when the host came up:

```bash
hbc -b -c ~/.hbc.yaml your-server.example.com
```

Run as a daemon (logs go to syslog):

```bash
hbc -d -b -c ~/.hbc.yaml your-server.example.com
```

### 5. View the dashboard

Open `http://your-server:50004/live`. You'll see the monitored host, its last heartbeat time, and RTT. Click the host name to see plugin metrics.

Navigate to `/plugins/<hostname>` for CPU, memory, and disk graphs.

### 6. Add a notification channel (optional)

Edit `~/.hb.yaml` on the server:

```yaml
notification_channels:
  pushover_ops:
    type: pushover
    token: YOUR_APP_TOKEN
    user: YOUR_USER_KEY

users:
  alice:
    password: pbkdf2:sha256:...   # generate: hbd passwd alice
    admin: true
    notification_channels: [pushover_ops]

default_owner: alice
```

Generate the password hash:

```bash
hbd passwd alice
```

Paste the output into the `password` field of the config, then reload the server:

```bash
hbd reload
```

Test the channel:

```bash
hbd notify
```

### 7. Set a threshold alert (optional)

Add to `~/.hb.yaml`:

```yaml
thresholds:
  cpu_monitor:
    cpu_percent:
      warning: 80.0
      critical: 90.0
  disk_monitor:
    partitions:
      /:
        percent:
          warning: 80.0
          critical: 90.0
```
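
A warning/critical pair like the ones above can be read as two cut-off points on a single metric. The sketch below shows one way such a pair could be evaluated. Treating the limits as inclusive (`>=`) is an assumption, and `classify` is a hypothetical function rather than the server's actual logic.

```python
from typing import Optional


def classify(value: float, warning: float, critical: float) -> Optional[str]:
    """Map a metric value to "critical", "warning", or None (no alert)."""
    # Check the stricter limit first so a critical value is never
    # reported as a mere warning.
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return None


# With warning=80.0 and critical=90.0 from the config above:
for cpu_percent in (75.0, 85.0, 95.0):
    print(cpu_percent, classify(cpu_percent, warning=80.0, critical=90.0))
```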

Reload with `hbd reload`. The server now alerts when a monitored host crosses these values.

---

## What's next

| Topic | Where to look |
|---|---|
| Full server config reference | [README: Server](README.md#server-hbd) |
| Client options and all plugins | [README: Client](README.md#client-hbc) |
| Threshold alerting details | [docs/THRESHOLD_ALERTING.md](docs/THRESHOLD_ALERTING.md) |
| Notification channels | [docs/NOTIFICATIONS.md](docs/NOTIFICATIONS.md) |
| User accounts and roles | [docs/USERS.md](docs/USERS.md) |
| Writing a custom plugin | [docs/PLUGIN_DEVELOPMENT.md](docs/PLUGIN_DEVELOPMENT.md) |
| Nagios check integration | [docs/NAGIOS_INTEGRATION.md](docs/NAGIOS_INTEGRATION.md) |
| REST API | [docs/HTTP_API.md](docs/HTTP_API.md) |
| Zero-dependency client | [README: hbc_mini](README.md#hbc_mini--zero-dependency-client) |