Research Document: Modbus Relay Control System
Created: 2025-12-28 Feature: spec.md Status: Complete
Table of Contents
- Executive Summary
- Tokio-Modbus Research
- WebSocket vs HTTP Polling
- Existing Codebase Patterns
- Integration Recommendations
Executive Summary
Key Decisions
| Decision Area | Recommendation | Rationale |
|---|---|---|
| Modbus Library | tokio-modbus 0.17.0 | Native async/await, production-ready, good testability |
| Communication Pattern | HTTP Polling (as in spec) | Simpler, reliable, adequate for 10 users @ 2s intervals |
| Connection Management | Arc<Mutex> for MVP | Single device, simple, can upgrade later if needed |
| Retry Strategy | Simple retry-once helper | Matches FR-007 requirement |
| Testing Approach | Trait-based abstraction with mockall | Enables >90% coverage without hardware |
User Input Analysis
User requested: "Use tokio-modbus crate, poem-openapi for REST API, Vue.js with WebSocket for real-time updates"
Findings:
- ✅ tokio-modbus 0.17.0: Excellent choice, validated by research
- ✅ poem-openapi: Already in use, working well
- ⚠️ WebSocket vs HTTP Polling: Spec says HTTP polling (FR-028). WebSocket adds 43x complexity for negligible benefit at this scale.
RECOMMENDATION: Maintain HTTP polling as specified. WebSocket complexity not justified for 10 concurrent users with 2-second update intervals.
Deployment Architecture
User clarification (2025-12-29): Frontend on Cloudflare Pages, backend on Raspberry Pi behind Traefik with Authelia
Architecture:
- Frontend: Cloudflare Pages (Vue 3 static build) - global CDN delivery
- Backend: Raspberry Pi HTTP API (same local network as Modbus device)
- Reverse Proxy: Traefik on Raspberry Pi
  - HTTPS termination (TLS certificates)
  - Authelia middleware for authentication
  - Routes frontend requests to backend HTTP service
- Communication Flow: Frontend (CDN) → HTTPS → Traefik (HTTPS termination + auth) → Backend (HTTP) → Modbus TCP → Device
Security:
- Frontend-Backend: HTTPS via Traefik (encrypted, authenticated)
- Backend-Device: Modbus TCP on local network (unencrypted, local only)
Tokio-Modbus Research
Decision: Recommended Patterns
Primary Recommendation: Use tokio-modbus 0.17.0 with a custom trait-based abstraction layer (RelayController trait) for testability. Implement connection management using Arc<Mutex> for MVP.
Technical Details
Version: tokio-modbus 0.17.0 (latest stable, released 2025-10-22)
Protocol: Modbus TCP (native TCP protocol)
- Hardware configured to use native Modbus TCP protocol
- Uses MBAP (Modbus Application Protocol) header
- No CRC16 validation (TCP/IP handles error detection)
- Standard Modbus TCP protocol on port 502
Connection Strategy:
- Shared Arc<Mutex<Context>> for simplicity (see the sketch below)
- Single persistent connection (only one device)
- Can migrate to dedicated async task pattern if reconnection logic needed
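A minimal sketch of this strategy, assuming a hypothetical SharedModbus wrapper (the type name and connect helper are illustrative, not from the codebase):
use std::net::SocketAddr;
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio_modbus::prelude::*;
// Hypothetical wrapper: one persistent Modbus TCP connection shared behind a mutex.
#[derive(Clone)]
pub struct SharedModbus {
    ctx: Arc<Mutex<Context>>,
}
impl SharedModbus {
    pub async fn connect(addr: SocketAddr, slave_id: u8) -> Result<Self, Box<dyn std::error::Error>> {
        let mut ctx = tcp::connect(addr).await?;
        ctx.set_slave(Slave(slave_id));
        Ok(Self { ctx: Arc::new(Mutex::new(ctx)) })
    }
}
// Usage: clone the handle into each task; lock only for the duration of one operation, e.g.
//   let shared = SharedModbus::connect("192.168.1.200:502".parse()?, 1).await?;
//   let mut ctx = shared.ctx.lock().await;
//   let states = timeout(Duration::from_secs(3), ctx.read_coils(0, 8)).await???;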
Timeout Handling:
- Wrap all operations with tokio::time::timeout(Duration::from_secs(3), ...)
- CRITICAL: tokio-modbus has NO built-in timeouts
Retry Logic:
- Implement simple retry-once helper per FR-007
- Matches specification requirement
Testing:
- Use the mockall crate with async-trait for unit testing
- Trait abstraction enables testing without hardware
- Supports >90% test coverage target (NFR-013)
Critical Gotchas
- Device Protocol Configuration: Hardware MUST be configured to use Modbus TCP protocol (not RTU over TCP) via VirCom software
  - Set "Transfer Protocol" to "Modbus TCP protocol" in Advanced Settings
  - Device automatically switches to port 502 when TCP protocol is selected
- Device Gateway Configuration: Hardware MUST be set to "Multi-host non-storage type" - the default storage type sends spurious queries that cause failures
- No Built-in Timeouts: tokio-modbus has NO automatic timeouts - every operation must be wrapped with tokio::time::timeout
- Address Indexing: Relays are labeled 1-8, but Modbus addresses are 0-7 (use a newtype with conversion methods; see the sketch below)
- Nested Result Handling: Calls return Result<Result<T, Exception>, std::io::Error> - both layers must be handled (the ??? triple-question-mark pattern)
- Concurrent Access: Context is not thread-safe - requires Arc<Mutex> or dedicated task serialization
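A minimal sketch of the newtype conversion for the address-indexing gotcha (the RelayError::InvalidId variant is an assumed domain error, not existing code):
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RelayId(u8); // user-facing label, 1-8
impl RelayId {
    pub fn new(id: u8) -> Result<Self, RelayError> {
        if (1..=8).contains(&id) {
            Ok(Self(id))
        } else {
            Err(RelayError::InvalidId(id)) // assumed error variant
        }
    }
    /// Zero-based Modbus coil address (0-7) used on the wire.
    pub fn coil_address(self) -> u16 {
        u16::from(self.0 - 1)
    }
}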
Code Examples
Basic Connection Setup:
use tokio_modbus::prelude::*;
use tokio::time::{timeout, Duration};
// Connect to device using Modbus TCP on standard port 502
let socket_addr = "192.168.1.200:502".parse()?;
let mut ctx = tcp::connect(socket_addr).await?;
// Set slave ID (unit identifier)
ctx.set_slave(Slave(0x01));
// Read all 8 relay states with timeout
let states = timeout(
Duration::from_secs(3),
ctx.read_coils(0x0000, 8)
).await???; // Triple-? handles timeout + transport + exception errors
Note: Modbus TCP uses the standard MBAP header and does not require CRC16 validation. The protocol is cleaner and more standardized than RTU over TCP.
Toggle Relay with Retry:
async fn toggle_relay(
    ctx: &mut Context,
    relay_id: u8, // 1-8
) -> Result<(), RelayError> {
    let addr = (relay_id - 1) as u16; // Convert to 0-7 coil address
    // Read current state (??? requires From<...> impls on RelayError)
    let states = timeout(Duration::from_secs(3), ctx.read_coils(addr, 1)).await???;
    let new_state = !states[0];
    // First write attempt
    let first = timeout(Duration::from_secs(3), ctx.write_single_coil(addr, new_state)).await;
    if matches!(first, Ok(Ok(Ok(())))) {
        return Ok(());
    }
    // Retry once on failure (FR-007)
    tracing::warn!("Write failed, retrying");
    timeout(Duration::from_secs(3), ctx.write_single_coil(addr, new_state)).await???;
    Ok(())
}
Trait-Based Abstraction for Testing:
use std::sync::Arc;
use async_trait::async_trait;
use mockall::mock;
use tokio::sync::Mutex;
use tokio::time::{timeout, Duration};
use tokio_modbus::prelude::*;
#[async_trait]
pub trait RelayController: Send + Sync {
async fn read_all_states(&mut self) -> Result<Vec<bool>, RelayError>;
async fn write_state(&mut self, relay_id: RelayId, state: RelayState) -> Result<(), RelayError>;
}
// Real implementation with tokio-modbus
pub struct ModbusRelayController {
ctx: Arc<Mutex<Context>>,
}
#[async_trait]
impl RelayController for ModbusRelayController {
async fn read_all_states(&mut self) -> Result<Vec<bool>, RelayError> {
let mut ctx = self.ctx.lock().await;
timeout(Duration::from_secs(3), ctx.read_coils(0, 8))
.await
.map_err(|_| RelayError::Timeout)?
.map_err(RelayError::Transport)?
.map_err(RelayError::Exception)
}
// ... other methods
}
// Mock for testing (using mockall)
mock! {
pub RelayController {}
#[async_trait]
impl RelayController for RelayController {
async fn read_all_states(&mut self) -> Result<Vec<bool>, RelayError>;
async fn write_state(&mut self, relay_id: RelayId, state: RelayState) -> Result<(), RelayError>;
}
}
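A short usage sketch of the generated mock in a unit test (mockall generates MockRelayController with expect_* methods; the assertion target is illustrative):
#[tokio::test]
async fn reads_all_eight_relay_states() {
    let mut controller = MockRelayController::new();
    controller
        .expect_read_all_states()
        .times(1)
        .returning(|| Ok(vec![false; 8]));
    let states = controller.read_all_states().await.unwrap();
    assert_eq!(states.len(), 8);
}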
Alternatives Considered
- modbus-robust: Provides auto-reconnection but lacks retry logic and timeouts - insufficient for production
- bb8 connection pool: Overkill for single-device scenario, adds unnecessary complexity
- Synchronous modbus-rs: Would block Tokio threads, poor scalability for concurrent users
- Custom Modbus implementation: Reinventing wheel, error-prone, significant development time
Resources
- GitHub - slowtec/tokio-modbus
- tokio-modbus on docs.rs
- Context7 MCP: /slowtec/tokio-modbus
- Context7 MCP: /websites/rs_tokio-modbus_0_16_3_tokio_modbus
WebSocket vs HTTP Polling
Recommendation: HTTP Polling (as specified)
The specification's decision to use HTTP polling is technically sound; it is the better choice for this specific use case.
Performance at Your Scale (10 users, 2-second intervals)
Bandwidth Comparison:
- HTTP Polling: ~20 Kbps (10 users × 0.5 req/sec × 500 bytes × 8)
- WebSocket: ~2.4 Kbps sustained
- Difference: 17.6 Kbps - negligible on any modern network
Server Load:
- HTTP Polling: 5 requests/second system-wide (trivial)
- WebSocket: 10 persistent connections (~80-160 KB memory)
- Verdict: Both are trivial at this scale
Implementation Complexity
HTTP Polling:
- Backend: 0 lines (reuse existing GET /api/relays)
- Frontend: ~10 lines (simple setInterval)
- Total effort: 15 minutes
WebSocket:
- Backend: ~115 lines (handler + background poller + channel setup)
- Frontend: ~135 lines (WebSocket manager + reconnection logic)
- Testing: ~180 lines (connection lifecycle + reconnection tests)
- Total effort: 2-3 days + ongoing maintenance
Complexity ratio: 43x more code for WebSocket
Reliability & Error Handling
HTTP Polling Advantages:
- Stateless (automatic recovery on next poll)
- Standard HTTP error codes
- Works everywhere (proxies, firewalls, old browsers)
- No connection state management
- Simple testing
WebSocket Challenges:
- Connection lifecycle management
- Exponential backoff reconnection logic
- State synchronization on reconnect
- Thundering herd problem (all clients reconnect after server restart)
- May fail behind corporate proxies (requires fallback to HTTP polling anyway)
Decision Matrix
| Criterion | HTTP Polling | WebSocket | Weight |
|---|---|---|---|
| Simplicity | 5 | 2 | 3x |
| Reliability | 5 | 3 | 3x |
| Testing | 5 | 2 | 2x |
| Performance @ 10 users | 4 | 5 | 1x |
| Scalability to 100+ | 3 | 5 | 1x |
| Architecture fit | 5 | 3 | 2x |
Weighted Scores:
- HTTP Polling: 4.56/5
- WebSocket: 3.19/5
HTTP Polling scores 43% higher when complexity, reliability, and testing are properly weighted for this project's scale.
When WebSocket Makes Sense
WebSocket advantages manifest at:
- 100+ concurrent users (4x throughput advantage becomes meaningful)
- Sub-second update requirements (<1 second intervals)
- High-frequency updates where latency matters
- Bidirectional communication (chat, gaming, trading systems)
For relay control with 2-second polling:
- Latency: 0-4 seconds (avg 2 sec) - acceptable for lights/pumps
- Not a real-time critical system (not chat, gaming, or trading)
Migration Path (If Needed Later)
Starting with HTTP polling does NOT prevent WebSocket adoption later:
- Phase 1: Add /api/ws endpoint (non-breaking change)
- Phase 2: Progressive enhancement (detect WebSocket support)
- Phase 3: Gradual rollout with monitoring
Key Point: HTTP polling provides a baseline. Adding WebSocket later is straightforward, but removing WebSocket complexity is harder.
Poem WebSocket Support (For Reference)
Poem has excellent WebSocket support through poem::web::websocket:
use futures_util::{SinkExt, StreamExt};
use poem::{handler, web::{websocket::{Message, WebSocket}, Data}, IntoResponse};
use tokio::sync::watch;
#[handler]
async fn ws_handler(
    ws: WebSocket,
    state_tx: Data<&watch::Sender<RelayCollection>>,
) -> impl IntoResponse {
    // Subscribe before the upgrade so the closure owns a 'static receiver
    let mut rx = state_tx.subscribe();
    ws.on_upgrade(move |socket| async move {
        let (mut sink, _stream) = socket.split();
        // Send initial state snapshot
        let initial = rx.borrow().clone();
        if let Ok(json) = serde_json::to_string(&initial) {
            let _ = sink.send(Message::Text(json)).await;
        }
        // Stream updates as the watch channel changes
        while rx.changed().await.is_ok() {
            let state = rx.borrow().clone();
            let json = match serde_json::to_string(&state) {
                Ok(json) => json,
                Err(_) => continue,
            };
            if sink.send(Message::Text(json)).await.is_err() {
                break; // client disconnected
            }
        }
    })
}
Broadcasting Pattern: Use a tokio::sync::watch channel (see the poller sketch below):
- Maintains only the most recent value (perfect for relay state)
- Deduplication of identical states (via send_if_modified)
- New connections get an immediate state snapshot
- Memory-efficient (single state copy)
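A minimal sketch of this pattern, assuming a background poller task that owns the device reads (RelayCollection and poll_device are placeholders, not existing code):
use std::time::Duration;
use tokio::sync::watch;
#[derive(Clone, Default, PartialEq, serde::Serialize)]
pub struct RelayCollection {
    pub states: Vec<bool>,
}
pub fn spawn_state_poller() -> watch::Receiver<RelayCollection> {
    let (tx, rx) = watch::channel(RelayCollection::default());
    tokio::spawn(async move {
        let mut interval = tokio::time::interval(Duration::from_secs(2));
        loop {
            interval.tick().await;
            let snapshot = poll_device().await; // would read coils via the RelayController trait
            // Only notify subscribers when the state actually changed
            tx.send_if_modified(|current| {
                if *current != snapshot {
                    *current = snapshot;
                    true
                } else {
                    false
                }
            });
        }
    });
    rx
}
async fn poll_device() -> RelayCollection {
    // Placeholder for the real Modbus read
    RelayCollection { states: vec![false; 8] }
}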
Resources
Existing Codebase Patterns
Architecture Overview
The current codebase is a well-structured Rust backend API using Poem framework with OpenAPI support, following clean architecture principles.
Current Structure:
src/
├── lib.rs - Library entry point, orchestrates application setup
├── main.rs - Binary entry point, calls lib::run()
├── startup.rs - Application builder, server configuration, route setup
├── settings.rs - Configuration from YAML files + environment variables
├── telemetry.rs - Logging and tracing setup
├── route/ - HTTP endpoint handlers
│   ├── mod.rs - API aggregation and OpenAPI tags
│   ├── health.rs - Health check endpoints
│   └── meta.rs - Application metadata endpoints
└── middleware/ - Custom middleware implementations
    ├── mod.rs
    └── rate_limit.rs - Rate limiting middleware using governor
Key Patterns Discovered
1. Route Registration Pattern
Location: src/startup.rs:95-107
fn setup_app(settings: &Settings) -> poem::Route {
let api_service = OpenApiService::new(
Api::from(settings).apis(),
settings.application.clone().name,
settings.application.clone().version,
)
.url_prefix("/api");
let ui = api_service.swagger_ui();
poem::Route::new()
.nest("/api", api_service.clone())
.nest("/specs", api_service.spec_endpoint_yaml())
.nest("/", ui)
}
Key Insights:
- OpenAPI service created with all API handlers via .apis() tuple
- URL prefix /api applied to all API routes
- Swagger UI automatically mounted at root /
- OpenAPI spec YAML available at /specs
2. API Handler Organization Pattern
Location: src/route/mod.rs:14-37
#[derive(Tags)]
enum ApiCategory {
Health,
Meta,
}
pub(crate) struct Api {
health: health::HealthApi,
meta: meta::MetaApi,
}
impl From<&Settings> for Api {
fn from(value: &Settings) -> Self {
let health = health::HealthApi;
let meta = meta::MetaApi::from(&value.application);
Self { health, meta }
}
}
impl Api {
pub fn apis(self) -> (health::HealthApi, meta::MetaApi) {
(self.health, self.meta)
}
}
Key Insights:
- Tags enum groups APIs into categories for OpenAPI documentation
- Aggregator struct (Api) holds all API handler instances
- Dependency injection via From<&Settings> trait
- .apis() method returns a tuple of all handlers
3. OpenAPI Handler Definition Pattern
Location: src/route/health.rs:7-29
#[derive(ApiResponse)]
enum HealthResponse {
#[oai(status = 200)]
Ok,
#[oai(status = 429)]
TooManyRequests,
}
#[derive(Default, Clone)]
pub struct HealthApi;
#[OpenApi(tag = "ApiCategory::Health")]
impl HealthApi {
#[oai(path = "/health", method = "get")]
async fn ping(&self) -> HealthResponse {
tracing::event!(target: "backend::health", tracing::Level::DEBUG,
"Accessing health-check endpoint");
HealthResponse::Ok
}
}
Key Insights:
- Response types are enums with #[derive(ApiResponse)]
- Each variant maps to an HTTP status code via #[oai(status = N)]
- Handlers use #[OpenApi(tag = "...")] for categorization
- Type-safe responses at compile time
- Tracing at architectural boundaries
4. JSON Response Pattern with DTOs
Location: src/route/meta.rs:9-56
#[derive(Object, Debug, Clone, serde::Serialize, serde::Deserialize)]
struct Meta {
version: String,
name: String,
}
#[derive(ApiResponse)]
enum MetaResponse {
#[oai(status = 200)]
Meta(Json<Meta>),
#[oai(status = 429)]
TooManyRequests,
}
#[OpenApi(tag = "ApiCategory::Meta")]
impl MetaApi {
#[oai(path = "/meta", method = "get")]
async fn meta(&self) -> Result<MetaResponse> {
Ok(MetaResponse::Meta(Json(self.into())))
}
}
Key Insights:
- DTOs use #[derive(Object)] for OpenAPI schema generation
- Response variants can hold Json<T> payloads
- Handler struct holds state/configuration
- Returns Result<MetaResponse> for error handling
5. Middleware Composition Pattern
Location: src/startup.rs:59-91
let app = value
.app
.with(RateLimit::new(&rate_limit_config))
.with(Cors::new())
.data(value.settings);
Key Insights:
- Middleware applied via .with() method chaining
- Order matters: RateLimit → CORS → data injection
- Settings injected as shared data via .data()
- Configuration drives middleware behavior
6. Configuration Management Pattern
Location: src/settings.rs:40-62
let settings = config::Config::builder()
.add_source(config::File::from(settings_directory.join("base.yaml")))
.add_source(config::File::from(
settings_directory.join(environment_filename),
))
.add_source(
config::Environment::with_prefix("APP")
.prefix_separator("__")
.separator("__"),
)
.build()?;
Key Insights:
- Three-tier configuration: base → environment-specific → env vars
- Environment detected via APP_ENVIRONMENT variable
- Environment variables use the APP prefix with double-underscore separators (e.g. APP__APPLICATION__NAME overrides application.name)
- Type-safe deserialization
7. Testing Pattern
Location: src/route/health.rs:31-38
#[tokio::test]
async fn health_check_works() {
let app = crate::get_test_app();
let cli = poem::test::TestClient::new(app);
let resp = cli.get("/api/health").send().await;
resp.assert_status_is_ok();
}
Key Insights:
- Test helper creates full application with random port
- TestClient provides a fluent assertion API
- Tests are async with #[tokio::test]
- Real application used in tests
Type System Best Practices
Current code demonstrates excellent type-driven development (TyDD):
- Environment enum instead of strings
- RateLimitConfig newtype instead of raw numbers
- ApiResponse enums for type-safe HTTP responses
Architecture Compliance
Current Layers:
1. Presentation Layer: src/route/* - HTTP adapters
2. Infrastructure Layer: src/middleware/*, src/startup.rs, src/telemetry.rs
Missing Layers (to be added for Modbus):
3. Domain Layer: Pure relay logic, no Modbus knowledge
4. Application Layer: Use cases (get status, toggle)
Integration Recommendations
Recommended Architecture for Modbus Feature
Following hexagonal architecture principles from constitution:
src/
├── domain/
│ └── relay/
│ ├── mod.rs - Domain types (RelayId, RelayState, Relay)
│ ├── relay.rs - Relay entity
│ ├── error.rs - Domain errors
│ └── repository.rs - RelayRepository trait
├── application/
│ └── relay/
│ ├── mod.rs - Use case exports
│ ├── get_status.rs - GetRelayStatus use case
│ ├── toggle.rs - ToggleRelay use case
│ └── bulk_control.rs - BulkControl use case
├── infrastructure/
│ └── modbus/
│ ├── mod.rs - Modbus exports
│ ├── client.rs - ModbusRelayRepository implementation
│ ├── config.rs - Modbus configuration
│ └── error.rs - Modbus-specific errors
└── route/
└── relay.rs - HTTP adapter (presentation layer)
Integration Points
| Component | File | Action |
|---|---|---|
| API Category | src/route/mod.rs | Add Relay to ApiCategory enum |
| API Aggregator | src/route/mod.rs | Add relay: RelayApi field to Api struct |
| API Tuple | src/route/mod.rs | Add RelayApi to .apis() return tuple |
| Settings | src/settings.rs | Add ModbusSettings struct and modbus field |
| Config Files | settings/base.yaml | Add modbus: section |
| Shared State | src/startup.rs | Inject ModbusClient via .data() |
| Dependencies | Cargo.toml | Add tokio-modbus, async-trait, mockall |
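A sketch of those src/route/mod.rs changes, extending the aggregator pattern shown earlier (module and type names follow the recommendations above):
use poem_openapi::Tags;
#[derive(Tags)]
enum ApiCategory {
    Health,
    Meta,
    Relay, // new category for relay endpoints
}
pub(crate) struct Api {
    health: health::HealthApi,
    meta: meta::MetaApi,
    relay: relay::RelayApi, // new field
}
impl Api {
    pub fn apis(self) -> (health::HealthApi, meta::MetaApi, relay::RelayApi) {
        (self.health, self.meta, self.relay)
    }
}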
Example: New Route Handler
// src/route/relay.rs
use poem::{http::StatusCode, Result};
use poem_openapi::{ApiResponse, Object, OpenApi, payload::Json, param::Path};
use serde::{Deserialize, Serialize};
use crate::application::relay::GetRelayStatus;
use crate::domain::relay::{Relay, RelayId, RelayState};
#[derive(Object, Serialize, Deserialize)]
struct RelayDto {
id: u8,
state: String, // "on" or "off"
label: Option<String>,
}
#[derive(ApiResponse)]
enum RelayResponse {
#[oai(status = 200)]
Status(Json<RelayDto>),
#[oai(status = 400)]
BadRequest,
#[oai(status = 503)]
ServiceUnavailable,
}
// Handler struct holds the application-layer use case
pub struct RelayApi {
    get_status_use_case: GetRelayStatus,
}
#[OpenApi(tag = "ApiCategory::Relay")]
impl RelayApi {
#[oai(path = "/relays/:id", method = "get")]
async fn get_status(&self, id: Path<u8>) -> Result<RelayResponse> {
let relay_id = RelayId::new(id.0)
.map_err(|_| poem::Error::from_status(StatusCode::BAD_REQUEST))?;
// Use application layer use case
match self.get_status_use_case.execute(relay_id).await {
Ok(relay) => Ok(RelayResponse::Status(Json(relay.into()))),
Err(_) => Ok(RelayResponse::ServiceUnavailable),
}
}
}
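A hypothetical integration test for the new endpoint, following the existing TestClient pattern from src/route/health.rs (assumes the relay API is wired into the test application):
#[tokio::test]
async fn relay_status_endpoint_responds() {
    let app = crate::get_test_app();
    let cli = poem::test::TestClient::new(app);
    let resp = cli.get("/api/relays/1").send().await;
    resp.assert_status_is_ok();
}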
Example: Settings Extension
// src/settings.rs
#[derive(Debug, serde::Deserialize, Clone)]
pub struct ModbusSettings {
pub host: String,
pub port: u16,
pub slave_id: u8,
pub timeout_seconds: u64,
}
#[derive(Debug, serde::Deserialize, Clone)]
pub struct Settings {
pub application: ApplicationSettings,
pub debug: bool,
pub frontend_url: String,
pub rate_limit: RateLimitSettings,
pub modbus: ModbusSettings, // New field
}
# settings/base.yaml
modbus:
host: "192.168.1.100"
port: 502
slave_id: 1
timeout_seconds: 3
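A sketch of wiring ModbusSettings into a connection during startup (the helper name and error handling are assumptions):
use std::net::SocketAddr;
use tokio_modbus::prelude::*;
use crate::settings::ModbusSettings;
pub async fn connect_from_settings(
    settings: &ModbusSettings,
) -> Result<Context, Box<dyn std::error::Error>> {
    // Build the socket address from configuration, then open the shared connection
    let addr: SocketAddr = format!("{}:{}", settings.host, settings.port).parse()?;
    let mut ctx = tcp::connect(addr).await?;
    ctx.set_slave(Slave(settings.slave_id));
    Ok(ctx)
}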
Summary
Key Takeaways
- tokio-modbus 0.17.0: Excellent choice, use trait abstraction for testability
- HTTP Polling: Maintain spec decision, simpler and adequate for scale
- Hexagonal Architecture: Add domain/application layers following existing patterns
- Type-Driven Development: Apply newtype pattern (RelayId, RelayState)
- Testing: Use mockall with async-trait for >90% coverage without hardware
Next Steps
- Clarifying Questions: Resolve ambiguities in requirements
- Architecture Design: Create multiple implementation approaches
- Final Plan: Select approach and create detailed implementation plan
- Implementation: Follow TDD workflow with types-first design
End of Research Document