Files
heartbeat/docs/MESSAGE_JOURNAL.md
T
Andreas Wrede 0543266c92 Major refactoring of the codebase, including restructuring of files and directories, renaming of modules and classes, and improvements to the overall organization and readability of the code. This refactoring aims to enhance maintainability, scalability, and clarity of the codebase while preserving existing functionality. The changes include:
- Restructuring of the project directory into client and server components
- Renaming of modules and classes to better reflect their purpose and functionality
- Moving common utilities and configurations to a shared location
- Updating import statements to reflect the new structure
- Adding new documentation files for better clarity on various aspects of the project
- Removing deprecated or unused code to streamline the codebase
- Ensuring that all existing functionality is preserved and that the codebase remains functional after the refactoring.
2026-03-29 11:13:40 -04:00

414 lines
11 KiB
Markdown

# Message Journal
The message journal provides persistent logging of all received heartbeat messages with automatic size-based log rotation.
## Overview
The journal logs every message received by the heartbeat daemon (hbd) in JSON format, making it easy to:
- Audit message history
- Debug connection issues
- Analyze traffic patterns
- Replay messages for testing
- Create historical reports
## Features
- **JSON Format**: Each message is logged as a single JSON line for easy parsing
- **Size-Based Rotation**: Automatically rotates logs when size threshold is reached
- **Automatic Cleanup**: Keeps only a configurable number of backup files
- **Thread-Safe**: Safe for concurrent access from multiple async tasks
- **Configurable**: All settings controllable via configuration file
- **Performance**: Non-blocking async operation with minimal overhead
## Configuration
Add these settings to your hbd configuration file (e.g., `.hb.yaml`):
```yaml
# Message journal configuration
journal_enabled: true # Enable/disable journaling
journal_dir: /var/log/heartbeat # Directory for journal files
journal_file: messages.journal # Base filename
journal_max_size: 104857600 # Max size in bytes (100MB default)
journal_max_backups: 10 # Number of backup files to keep
```
### Configuration Options
| Option | Default | Description |
|--------|---------|-------------|
| `journal_enabled` | `true` | Enable or disable message journaling |
| `journal_dir` | `/var/log/heartbeat` | Directory where journal files are stored |
| `journal_file` | `messages.journal` | Base filename for the journal |
| `journal_max_size` | `104857600` (100MB) | Maximum file size before rotation |
| `journal_max_backups` | `10` | Number of rotated backup files to keep |
## File Format
Messages are logged in JSONL (JSON Lines) format - one JSON object per line:
```json
{"timestamp":1711234567.123,"datetime":"2026-03-28T12:34:56","source_ip":"192.168.1.100","source_port":50003,"message":{"ID":"HTB","name":"webserver1","interval":30,"ver":1}}
{"timestamp":1711234597.456,"datetime":"2026-03-28T12:35:37","source_ip":"192.168.1.101","source_port":50003,"message":{"ID":"PLG","plugin":"cpu_monitor","cpu_percent":45.2,"load_1min":1.5}}
```
### Entry Structure
Each journal entry contains:
| Field | Type | Description |
|-------|------|-------------|
| `timestamp` | float | Unix timestamp (seconds since epoch) |
| `datetime` | string | ISO 8601 formatted datetime |
| `source_ip` | string | Source IP address |
| `source_port` | integer | Source UDP port |
| `message` | object | Complete parsed message dictionary |
## Log Rotation
### How Rotation Works
1. Journal writes messages to the current file
2. When file size exceeds `journal_max_size`, rotation is triggered
3. Current file is renamed with timestamp: `messages.journal.YYYYMMDD-HHMMSS`
4. New empty file is created as the current journal
5. Old backup files exceeding `journal_max_backups` are deleted
### Example File Structure
```
/var/log/heartbeat/
├── messages.journal # Current active journal
├── messages.journal.20260328-120000 # Rotated backup
├── messages.journal.20260328-140000 # Rotated backup
└── messages.journal.20260328-160000 # Rotated backup (oldest)
```
### Rotation Behavior
- Rotation is triggered when the next message would exceed the size limit
- Rotation is automatic and requires no manual intervention
- Old backups are deleted in FIFO order (oldest first)
- Rotation is thread-safe and won't lose messages
## Usage Examples
### Reading Journal Files
#### Using Python
```python
import json
# Read all entries from current journal
with open('/var/log/heartbeat/messages.journal', 'r') as f:
for line in f:
entry = json.loads(line)
print(f"{entry['datetime']} - {entry['source_ip']} - {entry['message']['ID']}")
```
#### Using jq (command line)
```bash
# View all messages
cat /var/log/heartbeat/messages.journal | jq .
# Filter by message type
cat /var/log/heartbeat/messages.journal | jq 'select(.message.ID == "HTB")'
# Filter by hostname
cat /var/log/heartbeat/messages.journal | jq 'select(.message.name == "webserver1")'
# Count messages by type
cat /var/log/heartbeat/messages.journal | jq -r '.message.ID' | sort | uniq -c
# Extract timestamps and source IPs
cat /var/log/heartbeat/messages.journal | jq -r '[.datetime, .source_ip, .message.ID] | @tsv'
```
#### Using shell tools
```bash
# Count total messages
wc -l /var/log/heartbeat/messages.journal
# View recent messages
tail -n 100 /var/log/heartbeat/messages.journal | jq .
# Search for specific host
grep -F '"name":"webserver1"' /var/log/heartbeat/messages.journal
# Check journal file size
du -h /var/log/heartbeat/messages.journal
```
### Analyzing Historical Data
```bash
# Combine all journal files (current + backups)
cat /var/log/heartbeat/messages.journal* | jq . > all_messages.json
# Count messages per host
cat /var/log/heartbeat/messages.journal* | jq -r '.message.name // "unknown"' | sort | uniq -c
# Find all plugin messages
cat /var/log/heartbeat/messages.journal* | jq 'select(.message.ID == "PLG")'
# Extract CPU metrics from plugin messages
cat /var/log/heartbeat/messages.journal* | \
jq 'select(.message.plugin == "cpu_monitor") | {time: .datetime, host: .message.name, cpu: .message.cpu_percent}'
```
## Integration with Log Management
### Logrotate
While the journal has built-in rotation, you can also use logrotate for additional management:
```
/var/log/heartbeat/messages.journal.* {
daily
rotate 30
compress
delaycompress
missingok
notifempty
}
```
### Elasticsearch/OpenSearch
Import journal data into Elasticsearch for advanced analysis:
```python
from elasticsearch import Elasticsearch
import json
es = Elasticsearch(['localhost:9200'])
with open('/var/log/heartbeat/messages.journal', 'r') as f:
for line in f:
entry = json.loads(line)
es.index(index='heartbeat-messages', body=entry)
```
### Splunk
Create a Splunk input for the journal:
```ini
[monitor:///var/log/heartbeat/messages.journal*]
sourcetype = heartbeat_json
index = heartbeat
```
## Performance Considerations
### Overhead
- Journal writing is async and non-blocking
- Typical overhead: < 1ms per message
- Minimal impact on heartbeat processing
### Disk Usage
Calculate expected disk usage:
```
Messages per day = (86400 seconds / interval) * number_of_hosts
Average message size ≈ 200-500 bytes
Daily disk usage = Messages per day * Average message size
Example:
- 100 hosts
- 30 second interval
- 2880 messages/day per host
- 288,000 messages/day total
- ~60-140 MB/day
```
### Recommendations
- **Small deployments** (< 50 hosts): Default settings work well
- **Medium deployments** (50-500 hosts): Increase `journal_max_size` to 500MB, `journal_max_backups` to 20
- **Large deployments** (> 500 hosts): Consider 1GB+ journal files, 30+ backups, or external log aggregation
## Monitoring
### Check Journal Status
The journal exposes statistics that can be queried:
```python
from hbd.journal import get_journal
journal = get_journal()
stats = journal.get_stats()
print(f"Current size: {stats['current_size']:,} bytes")
print(f"Rotation threshold: {stats['rotation_threshold']}")
```
### Log Messages
Journal operations are logged at appropriate levels:
- `INFO`: Initialization, rotation events, cleanup
- `DEBUG`: Individual message logging
- `WARNING`: Non-critical issues
- `ERROR`: Critical failures
Check hbd logs for journal-related messages:
```bash
grep journal /var/log/heartbeat.log
```
## Troubleshooting
### Journal Files Not Created
**Problem**: No journal files appear in the configured directory.
**Solutions**:
- Check `journal_enabled: true` in configuration
- Verify directory exists and hbd has write permissions
- Check hbd logs for initialization errors
- Verify disk space is available
### Rotation Not Working
**Problem**: Journal file grows beyond `journal_max_size`.
**Solutions**:
- Check that `journal_max_size` is properly configured
- Verify hbd has permission to rename/create files
- Check for filesystem issues
- Review hbd logs for rotation errors
### Missing Messages
**Problem**: Some messages don't appear in journal.
**Solutions**:
- Verify `journal_enabled: true`
- Check for write errors in hbd logs
- Verify sufficient disk space
- Check if filesystem is read-only
### Performance Issues
**Problem**: Journal causing slow message processing.
**Solutions**:
- Use faster storage (SSD) for journal directory
- Increase `journal_max_size` to reduce rotation frequency
- Disable journal if not needed: `journal_enabled: false`
- Consider async syslog forwarding instead
## Security Considerations
### File Permissions
Ensure proper permissions on journal files:
```bash
# Journal directory
chmod 750 /var/log/heartbeat
chown hbd:hbd /var/log/heartbeat
# Journal files
chmod 640 /var/log/heartbeat/messages.journal*
```
### Sensitive Data
Journal files may contain:
- Hostnames and IP addresses
- System metrics
- Custom message content
**Recommendations**:
- Restrict read access to authorized users only
- Consider encryption for archived journals
- Implement log retention policies
- Sanitize data if sharing for debugging
## API Reference
### MessageJournal Class
```python
class MessageJournal:
def __init__(self, config: Dict[str, Any])
async def initialize(self) -> bool
async def log_message(self, msg: Dict, addr: tuple, timestamp: float)
async def close(self)
def get_stats(self) -> Dict[str, Any]
```
### Module Functions
```python
def get_journal(config: Dict = None) -> MessageJournal
async def log_message(msg: Dict, addr: tuple, timestamp: float = None)
```
## Example: Custom Message Processing
Process journal messages in real-time:
```python
import asyncio
import json
from pathlib import Path
async def tail_journal(journal_path):
"""Follow journal file and process new messages."""
path = Path(journal_path)
with open(path, 'r') as f:
# Jump to end
f.seek(0, 2)
while True:
line = f.readline()
if line:
entry = json.loads(line)
await process_message(entry)
else:
await asyncio.sleep(0.1)
async def process_message(entry):
"""Process a journal entry."""
msg = entry['message']
# Alert on boot messages
if msg.get('boot'):
print(f"ALERT: {msg['name']} rebooted at {entry['datetime']}")
# Track CPU usage
if msg.get('ID') == 'PLG' and msg.get('plugin') == 'cpu_monitor':
cpu = msg.get('cpu_percent', 0)
if cpu > 90:
print(f"WARNING: {entry['source_ip']} CPU usage: {cpu}%")
```
## Future Enhancements
Potential improvements for future versions:
- Compression of rotated logs (gzip)
- Time-based rotation in addition to size-based
- Filtering to exclude certain message types
- Structured logging output formats (CEF, GELF)
- Remote syslog forwarding
- Message deduplication
- Journal file encryption
- Signed journal entries
## See Also
- [Configuration Guide](../hbd/config.py) - Full configuration options
- [UDP Protocol](../hbd/udp.py) - Message handling
- [Server Architecture](../hbd/server.py) - Server initialization