Data Collector

A lightweight data acquisition tool with a live web dashboard and daily HDF5 logging. It samples one or more measurements at a fixed interval, provides a live plot in the browser, and periodically persists data to compressed HDF5 files. The history tab lets you aggregate and export PNG plots from previous runs.

  • Live dashboard: Dash + Bootstrap UI with live multi-series graph
  • HDF5 storage: Append-only, per-day files like YYYY-MM-DD_measurements.h5
  • Config via env/CLI: Sampling rate, chunk duration, live window, log level
  • Docker-friendly: Minimal image, volume for logs, configurable port

What it does

  • Runs a background collector that samples configured measurements every CAPTURE_RESOLUTION_MS milliseconds
  • Streams the last LIVE_PLOT_S seconds into a live plot
  • Flushes append-only chunks to HDF5 roughly every SAVE_TIME_INTERVAL_S
  • Offers a History tab to render selected HDF5 files and export a combined PNG

By default, the app generates example measurements (voltage and current) from randomly generated values. You can extend MeasurementSpec and plug in real sensors.
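
The sampling-and-flush cadence described above boils down to a loop like the following. This is a conceptual sketch only, not the project's actual collector code; sample_all and the in-memory buffer are illustrative names.

import time

CAPTURE_RESOLUTION_MS = 100   # sampling interval (ms)
SAVE_TIME_INTERVAL_S = 60     # flush interval (s)

def sample_all() -> dict[str, float]:
    """Stand-in for reading every configured measurement once."""
    return {"voltage": 0.0, "current": 0.0}

buffer = []                    # rows waiting to be written
last_flush = time.monotonic()

while True:
    buffer.append((time.time(), sample_all()))
    if time.monotonic() - last_flush >= SAVE_TIME_INTERVAL_S:
        # In the real application this is where an append-only chunk
        # would be written to the daily HDF5 file.
        buffer.clear()
        last_flush = time.monotonic()
    time.sleep(CAPTURE_RESOLUTION_MS / 1000)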

Installation (uv)

Prerequisites:

  • Python 3.13
  • uv package manager

# From repository root
uv sync

This creates a local virtual environment (.venv) and installs dependencies.

Running (CLI)

The package exposes an entry point named data-collector that starts the data collector and the dashboard server.

Environment variables (defaults in parentheses):

  • PORT (8050): Dashboard port
  • LOG_LEVEL (INFO): One of INFO, DEBUG, WARNING, ERROR, CRITICAL
  • CAPTURE_RESOLUTION_MS (100): Sampling interval in milliseconds
  • SAVE_TIME_INTERVAL_S (60): Chunk interval in seconds
  • LIVE_PLOT_S (3600): Live plot window in seconds
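
Reading these variables with their defaults amounts to something like the sketch below (illustrative only; the project may parse and validate them differently):

import os

PORT = int(os.environ.get("PORT", 8050))
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
CAPTURE_RESOLUTION_MS = int(os.environ.get("CAPTURE_RESOLUTION_MS", 100))
SAVE_TIME_INTERVAL_S = int(os.environ.get("SAVE_TIME_INTERVAL_S", 60))
LIVE_PLOT_S = int(os.environ.get("LIVE_PLOT_S", 3600))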

Examples:


# Run with defaults (port 8050)
uv run data-collector

# Override via env variables
PORT=9000 LOG_LEVEL=DEBUG LIVE_PLOT_S=600 uv run data-collector

# Override via CLI flags (same names as env vars)
uv run data-collector \
  --capture-resolution-ms 50 \
  --save-time-interval-s 30 \
  --live-plot-s 900 \
  --log-level DEBUG \
  --port 9000

Logs and outputs:

  • HDF5 files are written under logs/ by default, e.g. 2025-09-29_measurements.h5.
  • History PNG exports are also saved into the same logs/ directory.
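
To inspect a day's file outside the dashboard, h5py can list its contents. A minimal sketch, assuming h5py is available and using one of the daily files named above (the dataset names and layout depend on the configured measurements):

import h5py

# Open one of the daily files written under logs/
with h5py.File("logs/2025-09-29_measurements.h5", "r") as f:
    def show(name, obj):
        # Print each dataset's path, shape, and dtype
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)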

Systemd Service Setup

Create a systemd unit to run Data Collector as a system service:

  1. Create the service file

    # Set variables for the service file
    THIS_DIR=$(pwd)
    SERVICE_USER=$(id -un)
    SERVICE_GROUP=$(id -gn)
    
    # Create the systemd service file
    sudo tee /etc/systemd/system/data-collector.service > /dev/null <<EOF
    [Unit]
    Description=Data Collector Service
    
    [Service]
    Type=simple
    User=$SERVICE_USER
    Group=$SERVICE_GROUP
    WorkingDirectory=$THIS_DIR
    # Note: the module path uses an underscore (data_collector), not a hyphen
    ExecStart=$THIS_DIR/.venv/bin/python -m data_collector
    Restart=always
    RestartSec=10
    StandardOutput=journal
    StandardError=journal
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  2. Enable and start the service

    # Reload systemd to recognize the new service
    sudo systemctl daemon-reload
    
    # Enable the service to start on boot
    sudo systemctl enable data-collector.service
    
    # Start the service
    sudo systemctl start data-collector.service
    
  3. Verify the service is running

    # Check service status
    sudo systemctl status data-collector.service
    
    # View service logs
    sudo journalctl -u data-collector.service -f
    

Running with Docker Compose

A ready-to-use Compose file is provided. It builds a slim image using uv in a builder stage and ships the virtual environment into the final runtime image.

Key points:

  • Mounts ./logs at /app/logs to persist data on the host
  • Exposes PORT (default 8050)
  • Environment variables mirror the CLI options
  • Runs as the current user (UID:GID) by default to avoid permission issues

Quick start:

# Optionally copy and modify example.env
cp example.env .env
# Edit .env to set PORT, LOG_LEVEL, CAPTURE_RESOLUTION_MS, etc.

# Build and run
docker compose up --build

The service is named data-collector. Once running, open the dashboard at:

  • http://localhost:8050 (or your chosen PORT)
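
If you prefer to confirm from a script that the container is serving before opening a browser, a throwaway snippet like this works (not part of the project; adjust the port if you changed it):

from urllib.request import urlopen

# Dash serves its index page at the root URL
with urlopen("http://localhost:8050", timeout=5) as resp:
    print(resp.status)  # 200 once the dashboard is up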

Customizing via .env:

# .env
PORT=8050
LOG_LEVEL=INFO
CAPTURE_RESOLUTION_MS=100
SAVE_TIME_INTERVAL_S=60
LIVE_PLOT_S=3600
# Optional: host log directory
LOG_DIR=./logs
# Optional: run the container as a specific user
# (.env values are not shell-expanded, so use literal numeric IDs)
UID=1000
GID=1000

Then run:

docker compose up -d --build

Stopping and viewing logs:

docker compose logs -f
docker compose down

Extending measurements

Measurements are defined with MeasurementSpec and added to the collector configuration. See data_collector/__main__.py and data_collector/core/collector.py. To add real sensors, implement a function that returns a float and wire it into measurements=[...] when creating the CollectorConfig.
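
A rough sketch of that wiring is shown below. The field names (name, unit, read_fn) and import paths are assumptions for illustration; check MeasurementSpec and CollectorConfig in the source for the actual signatures.

import random

# Import paths are guesses; see data_collector/core/collector.py for the real ones.
from data_collector.core.collector import CollectorConfig, MeasurementSpec

def read_bus_voltage() -> float:
    """Replace this stub with a real sensor driver (I2C, serial, ...)."""
    return 3.3 + random.uniform(-0.05, 0.05)

config = CollectorConfig(
    measurements=[
        # Keyword names here are illustrative; match them to MeasurementSpec.
        MeasurementSpec(name="bus_voltage", unit="V", read_fn=read_bus_voltage),
    ],
)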

Development notes

  • Code style: Black, isort; type hints throughout
  • Storage: see data_collector/storage/hdf5_storage.py
  • Web app: see data_collector/server/app.py and data_collector/server/callbacks.py
  • Logs directory: logs/ (mounted to /app/logs in Docker)