Data Collector
A lightweight data acquisition tool with a live web dashboard and daily HDF5 logging. It samples one or more measurements at a fixed interval, provides a live plot in the browser, and periodically persists data to compressed HDF5 files. The history tab lets you aggregate and export PNG plots from previous runs.
- Live dashboard: Dash + Bootstrap UI with live multi-series graph
- HDF5 storage: Append-only, per-day files like YYYY-MM-DD_measurements.h5
- Config via env/CLI: Sampling rate, chunk duration, live window, log level
- Docker-friendly: Minimal image, volume for logs, configurable port
What it does
- Runs a background collector that samples configured measurements every CAPTURE_RESOLUTION_MS milliseconds
- Streams the last LIVE_PLOT_S seconds into a live plot
- Flushes append-only chunks to HDF5 roughly every SAVE_TIME_INTERVAL_S
- Offers a History tab to render selected HDF5 files and export a combined PNG
By default, the app generates example measurements (voltage and current) using random
functions. You can extend MeasurementSpec and plug in real sensors.
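For orientation only, here is a minimal sketch of what one of the default random measurements might look like. The import path and the name, unit, and read fields are assumptions for illustration; the actual MeasurementSpec definition in data_collector/core/collector.py may differ.

```python
import random

# Assumed import path; see data_collector/core/collector.py for the actual definition.
from data_collector.core.collector import MeasurementSpec

# Hypothetical default measurement: a noisy "voltage" reading around 12 V.
# The read callable is invoked once per CAPTURE_RESOLUTION_MS tick by the collector.
voltage = MeasurementSpec(
    name="voltage",
    unit="V",
    read=lambda: 12.0 + random.uniform(-0.5, 0.5),
)
```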
Installation (uv)
Prerequisites:
- Python 3.13
- uv package manager
# From repository root
uv sync
This creates a local virtual environment (.venv) and installs dependencies.
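As a quick sanity check, you can invoke the data-collector entry point described in the next section and ask for its help text. This assumes the CLI is built on a standard argument parser that provides --help; if it is not, just use the examples below instead.

```bash
# Assumed flag: most argument parsers provide --help out of the box
uv run data-collector --help
```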
Running (CLI)
The package exposes an entry point named data-collector that starts the data collector
and the dashboard server.
Environment variables (defaults in parentheses):
- PORT (8050): Dashboard port
- LOG_LEVEL (INFO): One of INFO, DEBUG, WARNING, ERROR, CRITICAL
- CAPTURE_RESOLUTION_MS (100): Sampling interval in milliseconds
- SAVE_TIME_INTERVAL_S (60): Chunk interval in seconds
- LIVE_PLOT_S (3600): Live plot window in seconds
Examples:
# Run with defaults (port 8050)
uv run data-collector
# Override via env variables
PORT=9000 LOG_LEVEL=DEBUG LIVE_PLOT_S=600 uv run data-collector
# Override via CLI flags (same names as env vars)
uv run data-collector \
--capture-resolution-ms 50 \
--save-time-interval-s 30 \
--live-plot-s 900 \
--log-level DEBUG \
--port 9000
Logs and outputs:
- HDF5 files are written under logs/ by default, e.g. 2025-09-29_measurements.h5.
- History PNG exports are also saved into the same logs/ directory.
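To peek at a finished file, any HDF5 reader works. The sketch below uses h5py and only lists what it finds, since the exact dataset layout is defined in data_collector/storage/hdf5_storage.py; the file name is the example from above.

```python
import h5py

# Walk a daily measurement file and print every dataset it contains.
with h5py.File("logs/2025-09-29_measurements.h5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")

    f.visititems(show)
```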
Systemd Service Setup
Create a systemd unit to run Data Collector as a system service:
- Create the service file

# Set variables for the service file
THIS_DIR=$(pwd)
SERVICE_USER=$(id -un)
SERVICE_GROUP=$(id -gn)

# Create the systemd service file
sudo tee /etc/systemd/system/data-collector.service > /dev/null <<EOF
[Unit]
Description=Data Collector Service

[Service]
Type=simple
User=$SERVICE_USER
Group=$SERVICE_GROUP
WorkingDirectory=$THIS_DIR
# use underscore _ for the module name in the path
ExecStart=$THIS_DIR/.venv/bin/python -m data_collector
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
EOF

- Enable and start the service

# Reload systemd to recognize the new service
sudo systemctl daemon-reload

# Enable the service to start on boot
sudo systemctl enable data-collector.service

# Start the service
sudo systemctl start data-collector.service

- Verify the service is running

# Check service status
sudo systemctl status data-collector.service

# View service logs
sudo journalctl -u data-collector.service -f
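If you later need to take the collector out of service, the standard systemctl counterparts apply:

```bash
# Stop the service and keep it from starting on boot
sudo systemctl stop data-collector.service
sudo systemctl disable data-collector.service
```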
Running with Docker Compose
A ready-to-use Compose file is provided. It builds a slim image using uv in a builder
stage and ships the virtual environment into the final runtime image.
Key points:
- Mounts ./logs to /app/logs to persist data on the host
- Exposes PORT (default 8050)
- Environment variables mirror the CLI options
- Runs as the current user (UID:GID) by default to avoid permission issues
Quick start:
# Optionally copy and modify example.env
cp example.env .env
# Edit .env to set PORT, LOG_LEVEL, CAPTURE_RESOLUTION_MS, etc.
# Build and run
docker compose up --build
The service is named data-collector. Once running, open the dashboard at http://localhost:8050 (or your chosen PORT).
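As a quick reachability check from the host, you can probe the mapped port; the dashboard should answer with an HTTP response once the container is up.

```bash
# Expect an HTTP response from the dashboard on the mapped port
curl -I http://localhost:8050
```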
Customizing via .env:
# .env
PORT=8050
LOG_LEVEL=INFO
CAPTURE_RESOLUTION_MS=100
SAVE_TIME_INTERVAL_S=60
LIVE_PLOT_S=3600
# Optional: host log directory
LOG_DIR=./logs
# Optional: run container as a specific user
UID=$(id -u)
GID=$(id -g)
Then run:
docker compose up -d --build
Stopping and viewing logs:
docker compose logs -f
docker compose down
Extending measurements
Measurements are defined with MeasurementSpec and added to the collector
configuration. See data_collector/__main__.py and data_collector/core/collector.py.
To add real sensors, implement a function that returns a float and wire it into
measurements=[...] when creating the CollectorConfig.
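A hedged sketch of that wiring, reading a CPU temperature from Linux sysfs as the "real sensor". The import path and the MeasurementSpec/CollectorConfig keyword arguments are assumptions for illustration; match them against the actual definitions in data_collector/core/collector.py and the wiring in data_collector/__main__.py.

```python
# Assumed import path and keyword arguments; check data_collector/core/collector.py
# and data_collector/__main__.py for the real signatures.
from data_collector.core.collector import CollectorConfig, MeasurementSpec


def read_cpu_temperature() -> float:
    """Return one sample as a float; the collector calls this every CAPTURE_RESOLUTION_MS."""
    # Common Linux sysfs path; swap in your own sensor read-out here.
    with open("/sys/class/thermal/thermal_zone0/temp") as f:
        return int(f.read().strip()) / 1000.0  # millidegrees Celsius -> degrees Celsius


config = CollectorConfig(
    measurements=[
        MeasurementSpec(name="cpu_temperature", unit="°C", read=read_cpu_temperature),
    ],
    # ... remaining CollectorConfig fields (sampling interval, save interval, ...) as in __main__.py
)
```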
Development notes
- Code style: Black, isort; type hints throughout
- Storage: see data_collector/storage/hdf5_storage.py
- Web app: see data_collector/server/app.py and data_collector/server/callbacks.py
- Logs directory: logs/ (mounted to /app/logs in Docker)