
Research Document: Modbus Relay Control System

Created: 2025-12-28 | Feature: spec.md | Status: Complete

Table of Contents

  1. Executive Summary
  2. Tokio-Modbus Research
  3. WebSocket vs HTTP Polling
  4. Existing Codebase Patterns
  5. Integration Recommendations

Executive Summary

Key Decisions

| Decision Area | Recommendation | Rationale |
| --- | --- | --- |
| Modbus Library | tokio-modbus 0.17.0 | Native async/await, production-ready, good testability |
| Communication Pattern | HTTP Polling (as in spec) | Simpler, reliable, adequate for 10 users @ 2s intervals |
| Connection Management | Arc<Mutex> for MVP | Single device, simple, can upgrade later if needed |
| Retry Strategy | Simple retry-once helper | Matches FR-007 requirement |
| Testing Approach | Trait-based abstraction with mockall | Enables >90% coverage without hardware |

User Input Analysis

User requested: "Use tokio-modbus crate, poem-openapi for REST API, Vue.js with WebSocket for real-time updates"

Findings:

  • tokio-modbus 0.17.0: Excellent choice, validated by research
  • poem-openapi: Already in use, working well
  • ⚠️ WebSocket vs HTTP Polling: Spec says HTTP polling (FR-028). WebSocket adds 43x complexity for negligible benefit at this scale.

RECOMMENDATION: Maintain HTTP polling as specified. WebSocket complexity not justified for 10 concurrent users with 2-second update intervals.

Deployment Architecture

User clarification (2025-12-29): Frontend on Cloudflare Pages, backend on Raspberry Pi behind Traefik with Authelia

Architecture:

  • Frontend: Cloudflare Pages (Vue 3 static build) - global CDN delivery
  • Backend: Raspberry Pi HTTP API (same local network as Modbus device)
  • Reverse Proxy: Traefik on Raspberry Pi
    • HTTPS termination (TLS certificates)
    • Authelia middleware for authentication
    • Routes frontend requests to backend HTTP service
  • Communication Flow:
    • Frontend (CDN) → HTTPS → Traefik (HTTPS termination + auth) → Backend (HTTP) → Modbus TCP → Device

Security:

  • Frontend-Backend: HTTPS via Traefik (encrypted, authenticated)
  • Backend-Device: Modbus TCP on local network (unencrypted, local only)

Tokio-Modbus Research

Primary Recommendation: Use tokio-modbus 0.17.0 with a custom trait-based abstraction layer (RelayController trait) for testability. Implement connection management using Arc<Mutex> for MVP.

Technical Details

Version: tokio-modbus 0.17.0 (latest stable, released 2025-10-22)

Protocol: Modbus TCP (native TCP protocol)

  • Hardware configured to use native Modbus TCP protocol
  • Uses MBAP (Modbus Application Protocol) header
  • No CRC16 validation (TCP/IP handles error detection)
  • Standard Modbus TCP protocol on port 502

Connection Strategy:

  • Shared Arc<Mutex<Context>> for simplicity
  • Single persistent connection (only one device)
  • Can migrate to dedicated async task pattern if reconnection logic needed

Timeout Handling:

  • Wrap all operations with tokio::time::timeout(Duration::from_secs(3), ...)
  • CRITICAL: tokio-modbus has NO built-in timeouts

Retry Logic:

  • Implement a simple retry-once helper per FR-007 (sketched below)
  • Matches specification requirement
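
A minimal sketch of that retry-once helper, assuming the nested-result API described above (the function and error names are illustrative, not existing project code):

use tokio::time::{timeout, Duration};
use tokio_modbus::client::Context;
use tokio_modbus::prelude::*;

// Placeholder error for this sketch; the real RelayError would carry separate
// timeout / transport / exception variants.
#[derive(Debug)]
pub struct RelayWriteError(pub String);

// Retry-once per FR-007: one retry, then give up.
pub async fn write_coil_with_retry(
    ctx: &mut Context,
    addr: u16,
    state: bool,
) -> Result<(), RelayWriteError> {
    for attempt in 1..=2 {
        // Timeout wrapper is mandatory: tokio-modbus has no built-in timeouts
        match timeout(Duration::from_secs(3), ctx.write_single_coil(addr, state)).await {
            Ok(Ok(Ok(()))) => return Ok(()),
            failure if attempt == 1 => {
                tracing::warn!("write to coil {addr} failed ({failure:?}), retrying once");
            }
            failure => return Err(RelayWriteError(format!("{failure:?}"))),
        }
    }
    unreachable!("the second attempt always returns")
}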

Testing:

  • Use mockall crate with async-trait for unit testing
  • Trait abstraction enables testing without hardware
  • Supports >90% test coverage target (NFR-013)

Critical Gotchas

  1. Device Protocol Configuration: Hardware MUST be configured to use Modbus TCP protocol (not RTU over TCP) via VirCom software

    • Set "Transfer Protocol" to "Modbus TCP protocol" in Advanced Settings
    • Device automatically switches to port 502 when TCP protocol is selected
  2. Device Gateway Configuration: Hardware MUST be set to "Multi-host non-storage type" - default storage type sends spurious queries causing failures

  3. No Built-in Timeouts: tokio-modbus has NO automatic timeouts - must wrap every operation with tokio::time::timeout

  4. Address Indexing: Relays labeled 1-8, but Modbus addresses are 0-7 (use newtype pattern with conversion methods; see the sketch after this list)

  5. Nested Result Handling: Returns Result<Result<T, Exception>, std::io::Error> - must handle both layers (use ??? triple-question-mark pattern)

  6. Concurrent Access: Context is not thread-safe - requires Arc<Mutex> or dedicated task serialization
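
A minimal sketch of the newtype from gotcha 4 (the RelayId name matches the plan; the method names here are illustrative):

// User-facing relay identifier (1-8) with conversion to the 0-based Modbus coil address
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RelayId(u8);

#[derive(Debug)]
pub struct InvalidRelayId(pub u8);

impl RelayId {
    /// Accepts only the user-facing range 1-8.
    pub fn new(id: u8) -> Result<Self, InvalidRelayId> {
        if (1..=8).contains(&id) {
            Ok(Self(id))
        } else {
            Err(InvalidRelayId(id))
        }
    }

    /// 0-based coil address expected by the device.
    pub fn coil_address(self) -> u16 {
        u16::from(self.0 - 1)
    }
}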

Code Examples

Basic Connection Setup:

use tokio_modbus::prelude::*;
use tokio::time::{timeout, Duration};

// Inside an async fn returning Result<(), Box<dyn std::error::Error>> (or similar)
// Connect to the device using Modbus TCP on the standard port 502
let socket_addr = "192.168.1.200:502".parse()?;
let mut ctx = tcp::connect(socket_addr).await?;

// Set slave ID (unit identifier)
ctx.set_slave(Slave(0x01));

// Read all 8 relay states with timeout
let states = timeout(
    Duration::from_secs(3),
    ctx.read_coils(0x0000, 8)
).await???; // Triple-? handles timeout + transport + exception errors

Note: Modbus TCP uses the standard MBAP header and does not require CRC16 validation. The protocol is cleaner and more standardized than RTU over TCP.

Toggle Relay with Retry:

async fn toggle_relay(
    ctx: &mut Context,
    relay_id: u8, // 1-8
) -> Result<(), RelayError> {
    let addr = (relay_id - 1) as u16; // Convert to 0-7

    // Read current state
    let states = timeout(Duration::from_secs(3), ctx.read_coils(addr, 1))
        .await???;
    let current = states[0];

    // Write the opposite state; RelayError is assumed to implement From for the
    // timeout, transport, and exception error types so `???` can propagate them.
    let new_state = !current;
    let first_attempt =
        timeout(Duration::from_secs(3), ctx.write_single_coil(addr, new_state)).await;

    // Retry once on failure (FR-007)
    match first_attempt {
        Ok(Ok(Ok(()))) => Ok(()),
        Err(_) | Ok(Err(_)) | Ok(Ok(Err(_))) => {
            tracing::warn!("Write failed, retrying once");
            timeout(Duration::from_secs(3), ctx.write_single_coil(addr, new_state))
                .await???;
            Ok(())
        }
    }
}

Trait-Based Abstraction for Testing:

use std::sync::Arc;

use async_trait::async_trait;
use mockall::mock;
use tokio::sync::Mutex;
use tokio::time::{timeout, Duration};
use tokio_modbus::client::Context;
use tokio_modbus::prelude::*;

// RelayError, RelayId, and RelayState are project domain types defined elsewhere.

#[async_trait]
pub trait RelayController: Send + Sync {
    async fn read_all_states(&mut self) -> Result<Vec<bool>, RelayError>;
    async fn write_state(&mut self, relay_id: RelayId, state: RelayState) -> Result<(), RelayError>;
}

// Real implementation with tokio-modbus
pub struct ModbusRelayController {
    ctx: Arc<Mutex<Context>>,
}

#[async_trait]
impl RelayController for ModbusRelayController {
    async fn read_all_states(&mut self) -> Result<Vec<bool>, RelayError> {
        let mut ctx = self.ctx.lock().await;
        timeout(Duration::from_secs(3), ctx.read_coils(0, 8))
            .await
            .map_err(|_| RelayError::Timeout)?
            .map_err(RelayError::Transport)?
            .map_err(RelayError::Exception)
    }
    // ... other methods
}

// Mock for testing (using mockall)
mock! {
    pub RelayController {}

    #[async_trait]
    impl RelayController for RelayController {
        async fn read_all_states(&mut self) -> Result<Vec<bool>, RelayError>;
        async fn write_state(&mut self, relay_id: RelayId, state: RelayState) -> Result<(), RelayError>;
    }
}
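
A short usage sketch of the generated mock (the test itself is illustrative; MockRelayController is the struct that the mock! block above produces):

#[tokio::test]
async fn read_all_states_can_be_tested_without_hardware() {
    let mut controller = MockRelayController::new();
    controller
        .expect_read_all_states()
        .times(1)
        .returning(|| Ok(vec![false; 8]));

    // The real tests would hand the mock to an application-layer use case;
    // calling the trait method directly keeps the sketch short.
    let states = controller.read_all_states().await.expect("mocked call succeeds");
    assert_eq!(states.len(), 8);
}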

Alternatives Considered

  1. modbus-robust: Provides auto-reconnection but lacks retry logic and timeouts - insufficient for production
  2. bb8 connection pool: Overkill for single-device scenario, adds unnecessary complexity
  3. Synchronous modbus-rs: Would block Tokio threads, poor scalability for concurrent users
  4. Custom Modbus implementation: Reinventing wheel, error-prone, significant development time


WebSocket vs HTTP Polling

Recommendation: HTTP Polling (as specified)

The specification's decision to use HTTP polling is technically sound: at this scale, it is the better choice for this use case.

Performance at Your Scale (10 users, 2-second intervals)

Bandwidth Comparison:

  • HTTP Polling: ~20 Kbps (10 users × 0.5 req/sec × 500 bytes × 8)
  • WebSocket: ~2.4 Kbps sustained
  • Difference: 17.6 Kbps - negligible on any modern network

Server Load:

  • HTTP Polling: 5 requests/second system-wide (trivial)
  • WebSocket: 10 persistent connections (~80-160 KB memory)
  • Verdict: Both are trivial at this scale

Implementation Complexity

HTTP Polling:

  • Backend: 0 lines (reuse existing GET /api/relays)
  • Frontend: ~10 lines (simple setInterval)
  • Total effort: 15 minutes

WebSocket:

  • Backend: ~115 lines (handler + background poller + channel setup)
  • Frontend: ~135 lines (WebSocket manager + reconnection logic)
  • Testing: ~180 lines (connection lifecycle + reconnection tests)
  • Total effort: 2-3 days + ongoing maintenance

Complexity ratio: 43x more code for WebSocket

Reliability & Error Handling

HTTP Polling Advantages:

  • Stateless (automatic recovery on next poll)
  • Standard HTTP error codes
  • Works everywhere (proxies, firewalls, old browsers)
  • No connection state management
  • Simple testing

WebSocket Challenges:

  • Connection lifecycle management
  • Exponential backoff reconnection logic
  • State synchronization on reconnect
  • Thundering herd problem (all clients reconnect after server restart)
  • May fail behind corporate proxies (requires fallback to HTTP polling anyway)

Decision Matrix

| Criterion | HTTP Polling | WebSocket | Weight |
| --- | --- | --- | --- |
| Simplicity | 5 | 2 | 3x |
| Reliability | 5 | 3 | 3x |
| Testing | 5 | 2 | 2x |
| Performance @ 10 users | 4 | 5 | 1x |
| Scalability to 100+ | 3 | 5 | 1x |
| Architecture fit | 5 | 3 | 2x |

Weighted Scores:

  • HTTP Polling: 4.56/5
  • WebSocket: 3.19/5

HTTP Polling scores 43% higher when complexity, reliability, and testing are properly weighted for this project's scale.

When WebSocket Makes Sense

WebSocket advantages manifest at:

  • 100+ concurrent users (4x throughput advantage becomes meaningful)
  • Sub-second update requirements (<1 second intervals)
  • High-frequency updates where latency matters
  • Bidirectional communication (chat, gaming, trading systems)

For relay control with 2-second polling:

  • Latency: 0-4 seconds (avg 2 sec) - acceptable for lights/pumps
  • Not a real-time critical system (not chat, gaming, or trading)

Migration Path (If Needed Later)

Starting with HTTP polling does NOT prevent WebSocket adoption later:

  1. Phase 1: Add /api/ws endpoint (non-breaking change)
  2. Phase 2: Progressive enhancement (detect WebSocket support)
  3. Phase 3: Gradual rollout with monitoring

Key Point: HTTP polling provides a baseline. Adding WebSocket later is straightforward, but removing WebSocket complexity is harder.

Poem WebSocket Support (For Reference)

Poem has excellent WebSocket support through poem::web::websocket:

use futures_util::{SinkExt, StreamExt};
use poem::web::websocket::{Message, WebSocket};
use poem::web::Data;
use poem::{handler, IntoResponse};
use tokio::sync::watch;

// RelayCollection is the project's relay-state DTO (assumed to be Clone + Serialize)

#[handler]
async fn ws_handler(
    ws: WebSocket,
    state_tx: Data<&watch::Sender<RelayCollection>>,
) -> impl IntoResponse {
    // Subscribe before upgrading so only the owned Receiver moves into the task
    let mut rx = state_tx.subscribe();

    ws.on_upgrade(move |socket| async move {
        let (mut sink, _stream) = socket.split();

        // Send initial state snapshot (send/serialization errors ignored in this sketch)
        let initial = rx.borrow().clone();
        if let Ok(text) = serde_json::to_string(&initial) {
            let _ = sink.send(Message::text(text)).await;
        }

        // Stream updates whenever the watch channel changes
        while rx.changed().await.is_ok() {
            let state = rx.borrow().clone();
            if let Ok(text) = serde_json::to_string(&state) {
                let _ = sink.send(Message::text(text)).await;
            }
        }
    })
}

Broadcasting Pattern: Use a tokio::sync::watch channel (sketched below):

  • Maintains only most recent value (perfect for relay state)
  • Deduplication of identical states (e.g. via watch::Sender::send_if_modified)
  • New connections get immediate state snapshot
  • Memory-efficient (single state copy)
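
A minimal sketch of that pattern, assuming a RelayCollection state type and a placeholder poll (both illustrative, not existing code):

use std::time::Duration;
use tokio::sync::watch;

// Illustrative state type; the real project would reuse its relay DTO collection
#[derive(Clone, Debug, PartialEq, Default)]
pub struct RelayCollection {
    pub states: Vec<bool>,
}

pub fn spawn_state_poller() -> watch::Receiver<RelayCollection> {
    let (tx, rx) = watch::channel(RelayCollection::default());

    tokio::spawn(async move {
        let mut ticker = tokio::time::interval(Duration::from_secs(2));
        loop {
            ticker.tick().await;
            let next = poll_relays().await;
            // Only wake subscribers when the state actually changed
            tx.send_if_modified(|current| {
                if *current == next {
                    false
                } else {
                    *current = next;
                    true
                }
            });
        }
    });

    rx
}

// Placeholder; the real implementation would go through the RelayController trait
async fn poll_relays() -> RelayCollection {
    RelayCollection { states: vec![false; 8] }
}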


Existing Codebase Patterns

Architecture Overview

The current codebase is a well-structured Rust backend API built on the Poem framework with OpenAPI support, following clean architecture principles.

Current Structure:

src/
├── lib.rs          - Library entry point, orchestrates application setup
├── main.rs         - Binary entry point, calls lib::run()
├── startup.rs      - Application builder, server configuration, route setup
├── settings.rs     - Configuration from YAML files + environment variables
├── telemetry.rs    - Logging and tracing setup
├── route/          - HTTP endpoint handlers
│   ├── mod.rs      - API aggregation and OpenAPI tags
│   ├── health.rs   - Health check endpoints
│   └── meta.rs     - Application metadata endpoints
└── middleware/     - Custom middleware implementations
    ├── mod.rs
    └── rate_limit.rs - Rate limiting middleware using governor

Key Patterns Discovered

1. Route Registration Pattern

Location: src/startup.rs:95-107

fn setup_app(settings: &Settings) -> poem::Route {
    let api_service = OpenApiService::new(
        Api::from(settings).apis(),
        settings.application.clone().name,
        settings.application.clone().version,
    )
    .url_prefix("/api");
    let ui = api_service.swagger_ui();
    poem::Route::new()
        .nest("/api", api_service.clone())
        .nest("/specs", api_service.spec_endpoint_yaml())
        .nest("/", ui)
}

Key Insights:

  • OpenAPI service created with all API handlers via .apis() tuple
  • URL prefix /api applied to all API routes
  • Swagger UI automatically mounted at root /
  • OpenAPI spec YAML available at /specs

2. API Handler Organization Pattern

Location: src/route/mod.rs:14-37

#[derive(Tags)]
enum ApiCategory {
    Health,
    Meta,
}

pub(crate) struct Api {
    health: health::HealthApi,
    meta: meta::MetaApi,
}

impl From<&Settings> for Api {
    fn from(value: &Settings) -> Self {
        let health = health::HealthApi;
        let meta = meta::MetaApi::from(&value.application);
        Self { health, meta }
    }
}

impl Api {
    pub fn apis(self) -> (health::HealthApi, meta::MetaApi) {
        (self.health, self.meta)
    }
}

Key Insights:

  • Tags enum groups APIs into categories for OpenAPI documentation
  • Aggregator struct (Api) holds all API handler instances
  • Dependency injection via From<&Settings> trait
  • .apis() method returns tuple of all handlers

3. OpenAPI Handler Definition Pattern

Location: src/route/health.rs:7-29

#[derive(ApiResponse)]
enum HealthResponse {
    #[oai(status = 200)]
    Ok,
    #[oai(status = 429)]
    TooManyRequests,
}

#[derive(Default, Clone)]
pub struct HealthApi;

#[OpenApi(tag = "ApiCategory::Health")]
impl HealthApi {
    #[oai(path = "/health", method = "get")]
    async fn ping(&self) -> HealthResponse {
        tracing::event!(target: "backend::health", tracing::Level::DEBUG,
                       "Accessing health-check endpoint");
        HealthResponse::Ok
    }
}

Key Insights:

  • Response types are enums with #[derive(ApiResponse)]
  • Each variant maps to HTTP status code via #[oai(status = N)]
  • Handlers use #[OpenApi(tag = "...")] for categorization
  • Type-safe responses at compile time
  • Tracing at architectural boundaries

4. JSON Response Pattern with DTOs

Location: src/route/meta.rs:9-56

#[derive(Object, Debug, Clone, serde::Serialize, serde::Deserialize)]
struct Meta {
    version: String,
    name: String,
}

#[derive(ApiResponse)]
enum MetaResponse {
    #[oai(status = 200)]
    Meta(Json<Meta>),
    #[oai(status = 429)]
    TooManyRequests,
}

#[OpenApi(tag = "ApiCategory::Meta")]
impl MetaApi {
    #[oai(path = "/meta", method = "get")]
    async fn meta(&self) -> Result<MetaResponse> {
        Ok(MetaResponse::Meta(Json(self.into())))
    }
}

Key Insights:

  • DTOs use #[derive(Object)] for OpenAPI schema generation
  • Response variants can hold Json<T> payloads
  • Handler struct holds state/configuration
  • Returns Result<MetaResponse> for error handling

5. Middleware Composition Pattern

Location: src/startup.rs:59-91

let app = value
    .app
    .with(RateLimit::new(&rate_limit_config))
    .with(Cors::new())
    .data(value.settings);

Key Insights:

  • Middleware applied via .with() method chaining
  • Order matters: RateLimit → CORS → data injection
  • Settings injected as shared data via .data()
  • Configuration drives middleware behavior

6. Configuration Management Pattern

Location: src/settings.rs:40-62

let settings = config::Config::builder()
    .add_source(config::File::from(settings_directory.join("base.yaml")))
    .add_source(config::File::from(
        settings_directory.join(environment_filename),
    ))
    .add_source(
        config::Environment::with_prefix("APP")
            .prefix_separator("__")
            .separator("__"),
    )
    .build()?;

Key Insights:

  • Three-tier configuration: base → environment-specific → env vars
  • Environment detected via APP_ENVIRONMENT variable
  • Environment variables use APP__ prefix with double underscore separators
  • Type-safe deserialization

7. Testing Pattern

Location: src/route/health.rs:31-38

#[tokio::test]
async fn health_check_works() {
    let app = crate::get_test_app();
    let cli = poem::test::TestClient::new(app);
    let resp = cli.get("/api/health").send().await;
    resp.assert_status_is_ok();
}

Key Insights:

  • Test helper creates full application with random port
  • TestClient provides fluent assertion API
  • Tests are async with #[tokio::test]
  • Real application used in tests

Type System Best Practices

Current code demonstrates excellent type-driven development (TyDD):

  • Environment enum instead of strings
  • RateLimitConfig newtype instead of raw numbers
  • ApiResponse enums for type-safe HTTP responses

Architecture Compliance

Current Layers:

  1. Presentation Layer: src/route/* - HTTP adapters
  2. Infrastructure Layer: src/middleware/*, src/startup.rs, src/telemetry.rs


Missing Layers (to be added for Modbus):

  3. Domain Layer: Pure relay logic, no Modbus knowledge
  4. Application Layer: Use cases (get status, toggle)


Integration Recommendations

Following hexagonal architecture principles from constitution:

src/
├── domain/
│   └── relay/
│       ├── mod.rs           - Domain types (RelayId, RelayState, Relay)
│       ├── relay.rs         - Relay entity
│       ├── error.rs         - Domain errors
│       └── repository.rs    - RelayRepository trait
├── application/
│   └── relay/
│       ├── mod.rs           - Use case exports
│       ├── get_status.rs    - GetRelayStatus use case
│       ├── toggle.rs        - ToggleRelay use case
│       └── bulk_control.rs  - BulkControl use case
├── infrastructure/
│   └── modbus/
│       ├── mod.rs           - Modbus exports
│       ├── client.rs        - ModbusRelayRepository implementation
│       ├── config.rs        - Modbus configuration
│       └── error.rs         - Modbus-specific errors
└── route/
    └── relay.rs             - HTTP adapter (presentation layer)
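
A rough sketch of how the domain and application layers could relate (only the names come from the layout above; the bodies are illustrative, and a RelayId newtype would replace the bare u8):

use async_trait::async_trait;

// domain/relay: pure types, no Modbus knowledge
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RelayState {
    On,
    Off,
}

#[derive(Debug)]
pub enum RelayError {
    Timeout,
    DeviceUnavailable,
}

// domain/relay/repository.rs: port implemented by infrastructure/modbus
#[async_trait]
pub trait RelayRepository: Send + Sync {
    async fn read_state(&self, relay: u8) -> Result<RelayState, RelayError>;
    async fn write_state(&self, relay: u8, state: RelayState) -> Result<(), RelayError>;
}

// application/relay/toggle.rs: the use case depends only on the domain port
pub struct ToggleRelay<R: RelayRepository> {
    repository: R,
}

impl<R: RelayRepository> ToggleRelay<R> {
    pub fn new(repository: R) -> Self {
        Self { repository }
    }

    pub async fn execute(&self, relay: u8) -> Result<RelayState, RelayError> {
        let current = self.repository.read_state(relay).await?;
        let next = match current {
            RelayState::On => RelayState::Off,
            RelayState::Off => RelayState::On,
        };
        self.repository.write_state(relay, next).await?;
        Ok(next)
    }
}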

Integration Points

| Component | File | Action |
| --- | --- | --- |
| API Category | src/route/mod.rs | Add Relay to ApiCategory enum |
| API Aggregator | src/route/mod.rs | Add relay: RelayApi field to Api struct |
| API Tuple | src/route/mod.rs | Add RelayApi to .apis() return tuple |
| Settings | src/settings.rs | Add ModbusSettings struct and modbus field |
| Config Files | settings/base.yaml | Add modbus: section |
| Shared State | src/startup.rs | Inject ModbusClient via .data() |
| Dependencies | Cargo.toml | Add tokio-modbus, async-trait, mockall |
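
A sketch of the src/route/mod.rs changes from the table (relay::RelayApi is the handler still to be written; everything else mirrors the existing aggregator):

#[derive(Tags)]
enum ApiCategory {
    Health,
    Meta,
    Relay, // new tag for relay endpoints
}

pub(crate) struct Api {
    health: health::HealthApi,
    meta: meta::MetaApi,
    relay: relay::RelayApi, // new field
}

impl Api {
    pub fn apis(self) -> (health::HealthApi, meta::MetaApi, relay::RelayApi) {
        (self.health, self.meta, self.relay)
    }
}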

Example: New Route Handler

// src/route/relay.rs
use poem::{http::StatusCode, Result};
use poem_openapi::{param::Path, payload::Json, ApiResponse, Object, OpenApi};
use serde::{Deserialize, Serialize};

use crate::application::relay::GetRelayStatus;
use crate::domain::relay::{Relay, RelayId, RelayState};

#[derive(Object, Serialize, Deserialize)]
struct RelayDto {
    id: u8,
    state: String,  // "on" or "off"
    label: Option<String>,
}

#[derive(ApiResponse)]
enum RelayResponse {
    #[oai(status = 200)]
    Status(Json<RelayDto>),
    #[oai(status = 400)]
    BadRequest,
    #[oai(status = 503)]
    ServiceUnavailable,
}

// Handler struct holds the application-layer use case (injected at startup);
// a From<Relay> impl for RelayDto is assumed alongside the DTO.
pub struct RelayApi {
    get_status_use_case: GetRelayStatus,
}

#[OpenApi(tag = "ApiCategory::Relay")]
impl RelayApi {
    #[oai(path = "/relays/:id", method = "get")]
    async fn get_status(&self, id: Path<u8>) -> Result<RelayResponse> {
        let relay_id = RelayId::new(id.0)
            .map_err(|_| poem::Error::from_status(StatusCode::BAD_REQUEST))?;

        // Use application layer use case
        match self.get_status_use_case.execute(relay_id).await {
            Ok(relay) => Ok(RelayResponse::Status(Json(relay.into()))),
            Err(_) => Ok(RelayResponse::ServiceUnavailable),
        }
    }
}

Example: Settings Extension

// src/settings.rs
#[derive(Debug, serde::Deserialize, Clone)]
pub struct ModbusSettings {
    pub host: String,
    pub port: u16,
    pub slave_id: u8,
    pub timeout_seconds: u64,
}

#[derive(Debug, serde::Deserialize, Clone)]
pub struct Settings {
    pub application: ApplicationSettings,
    pub debug: bool,
    pub frontend_url: String,
    pub rate_limit: RateLimitSettings,
    pub modbus: ModbusSettings,  // New field
}

# settings/base.yaml
modbus:
  host: "192.168.1.100"
  port: 502
  slave_id: 1
  timeout_seconds: 3
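
A small follow-on sketch (these helper methods are assumptions, not existing code) showing how the new settings could feed the Modbus client:

impl ModbusSettings {
    /// Socket address for tcp::connect, e.g. "192.168.1.100:502"
    pub fn socket_addr(&self) -> Result<std::net::SocketAddr, std::net::AddrParseError> {
        format!("{}:{}", self.host, self.port).parse()
    }

    /// Timeout to wrap around every tokio-modbus operation
    pub fn timeout(&self) -> std::time::Duration {
        std::time::Duration::from_secs(self.timeout_seconds)
    }
}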

Summary

Key Takeaways

  1. tokio-modbus 0.17.0: Excellent choice, use trait abstraction for testability
  2. HTTP Polling: Maintain spec decision, simpler and adequate for scale
  3. Hexagonal Architecture: Add domain/application layers following existing patterns
  4. Type-Driven Development: Apply newtype pattern (RelayId, RelayState)
  5. Testing: Use mockall with async-trait for >90% coverage without hardware

Next Steps

  1. Clarifying Questions: Resolve ambiguities in requirements
  2. Architecture Design: Create multiple implementation approaches
  3. Final Plan: Select approach and create detailed implementation plan
  4. Implementation: Follow TDD workflow with types-first design

End of Research Document