docs: add project specs and documentation for Modbus relay control

Initialize project documentation structure:
- Add CLAUDE.md with development guidelines and architecture principles
- Add project constitution (v1.1.0) with hexagonal architecture and SOLID principles
- Add MCP server configuration for Context7 integration

Feature specification (001-modbus-relay-control):
- Complete feature spec for web-based Modbus relay control system
- Implementation plan with TDD approach using SQLx for persistence
- Type-driven development design for domain types
- Technical decisions document (SQLx over rusqlite, SQLite persistence)
- Detailed task breakdown (94 tasks across 8 phases)
- Specification templates for future features

Documentation:
- Modbus POE ETH Relay hardware documentation
- Modbus Application Protocol specification (PDF)

Project uses SQLx for compile-time verified SQL queries, aligned with
type-driven development principles.
2025-12-21 18:19:21 +01:00
parent d5a2859b64
commit a683810bdc
15 changed files with 7960 additions and 0 deletions


@@ -0,0 +1,315 @@
# Feature Specification: Modbus Relay Control System
**Feature Branch**: `001-modbus-relay-control`
**Created**: 2025-12-28
**Status**: Draft
**Input**: User description: "Modbus relay control system: backend reads and writes relay states via Modbus, exposes REST API, frontend displays relay states and allows toggling."
## Executive Summary
### Problem Statement
Users currently require specialized Modbus software (Modbus Poll, SSCOM) to interact with an 8-channel relay device, creating barriers to adoption and limiting remote access capabilities. The lack of a web-based interface prevents non-technical users from controlling relays and limits integration possibilities.
### Proposed Solution
A web application consisting of:
- **Rust Backend**: Modbus RTU over TCP integration + RESTful HTTP API (deployed on Raspberry Pi)
- **Vue.js Frontend**: Real-time relay status display and control interface (deployed on Cloudflare Pages)
- **Reverse Proxy**: Traefik with Authelia middleware for authentication and HTTPS termination
- **Local Network**: Raspberry Pi on same network as Modbus relay device
### Value Proposition
- **Accessibility**: Control relays from any browser without specialized software
- **Usability**: Intuitive UI eliminates need for Modbus protocol knowledge
- **Foundation**: Enables future automation, scheduling, and integration capabilities
- **Deployment**: Self-contained system with no external dependencies
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Monitor Relay Status (Priority: P1)
As a user, I want to see the current state (on/off) of all 8 relays in real-time so I can verify the physical system state without being physically present.
**Why this priority**: Foundation capability - all other features depend on accurate state visibility. Delivers immediate value by eliminating need for physical inspection or specialized software.
**Independent Test**: Can be fully tested by loading the web interface and verifying displayed states match physical relay states (verified with multimeter or visual indicators). Delivers value even without control capabilities.
**Acceptance Scenarios**:
1. **Given** all relays are OFF, **When** I load the web interface, **Then** I see 8 relays each displaying "OFF" state
2. **Given** relay #3 is ON and others are OFF, **When** I load the interface, **Then** I see relay #3 showing "ON" and others showing "OFF"
3. **Given** the interface is loaded, **When** relay state changes externally (via Modbus Poll), **Then** the interface updates within 2 seconds to reflect the new state
4. **Given** the Modbus device is unreachable, **When** I load the interface, **Then** I see an error message indicating the device is unavailable
---
### User Story 2 - Toggle Individual Relay (Priority: P1)
As a user, I want to toggle any relay on or off with a single click so I can control connected devices remotely.
**Why this priority**: Core use case - enables remote control capability. Combined with Story 1, creates a complete minimal viable product.
**Independent Test**: Can be tested by clicking any relay toggle button and observing both UI update and physical relay click/LED change. Delivers standalone value for remote control.
**Acceptance Scenarios**:
1. **Given** relay #5 is OFF, **When** I click the toggle button for relay #5, **Then** relay #5 turns ON and the UI reflects this within 1 second
2. **Given** relay #2 is ON, **When** I click the toggle button for relay #2, **Then** relay #2 turns OFF and the UI reflects this within 1 second
3. **Given** the Modbus device is unreachable, **When** I attempt to toggle a relay, **Then** I see an error message and the UI does not change
4. **Given** I toggle relay #1, **When** the Modbus command times out, **Then** I see a timeout error and can retry
---
### User Story 3 - Bulk Relay Control (Priority: P2)
As a user, I want to turn all relays ON or OFF simultaneously so I can quickly reset the entire system or enable/disable all connected devices at once.
**Why this priority**: Efficiency improvement for common scenarios (system shutdown, initialization). Not critical for MVP but significantly improves user experience.
**Independent Test**: Can be tested by clicking "All ON" or "All OFF" buttons and verifying all 8 physical relays respond. Delivers value for batch operations without requiring individual story implementations.
**Acceptance Scenarios**:
1. **Given** relays have mixed states (some ON, some OFF), **When** I click "All ON", **Then** all 8 relays turn ON within 2 seconds
2. **Given** all relays are ON, **When** I click "All OFF", **Then** all 8 relays turn OFF within 2 seconds
3. **Given** relay #4 fails to respond, **When** I click "All ON", **Then** I see an error for relay #4 but other relays still turn ON
4. **Given** the Modbus device is unreachable, **When** I click "All ON", **Then** I see an error message and no state changes occur
---
### User Story 4 - System Health Monitoring (Priority: P2)
As a user, I want to see device connectivity status and firmware version so I can diagnose issues and verify device compatibility.
**Why this priority**: Operational value for troubleshooting. Not required for basic control but critical for production reliability and maintenance.
**Independent Test**: Can be tested by viewing the health status section, disconnecting the Modbus device, and observing status change. Delivers standalone diagnostic value.
**Acceptance Scenarios**:
1. **Given** the Modbus device is connected and responsive, **When** I view the health status, **Then** I see "Healthy" status with firmware version displayed
2. **Given** the Modbus device is unreachable, **When** the backend starts, **Then** the backend starts successfully and the frontend displays "Unhealthy - Device Unreachable" status
3. **Given** the Modbus device becomes unreachable during operation, **When** I view the health status, **Then** I see "Unhealthy - Connection Lost" with timestamp of last successful communication
4. **Given** the Modbus device responds but with CRC errors, **When** I view health status, **Then** I see "Degraded - Communication Errors" with error count
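The four scenarios above can be read as a small classification over what the backend knows about the device. The following is a minimal sketch, not the project's implementation: `DeviceSnapshot`, `HealthStatus`, `from_snapshot`, and `label` are hypothetical names, and the rule that "never contacted" separates "Device Unreachable" from "Connection Lost" is an assumption drawn from scenarios 2 and 3.

```rust
use std::time::SystemTime;

/// Hypothetical snapshot of what the backend knows about the device.
#[derive(Debug, Clone)]
struct DeviceSnapshot {
    reachable: bool,
    firmware_version: Option<String>,
    last_contact: Option<SystemTime>,
    crc_error_count: u32,
}

/// Health states named after the messages in the acceptance scenarios.
#[derive(Debug, Clone, PartialEq)]
enum HealthStatus {
    Healthy { firmware_version: Option<String> },
    Degraded { error_count: u32 },
    DeviceUnreachable,
    ConnectionLost { last_contact: SystemTime },
}

impl HealthStatus {
    /// Classifies a snapshot. Assumption: a device that was never contacted
    /// is "unreachable"; one that was contacted before has "lost connection".
    fn from_snapshot(s: &DeviceSnapshot) -> Self {
        match (s.reachable, s.last_contact) {
            (false, None) => HealthStatus::DeviceUnreachable,
            (false, Some(t)) => HealthStatus::ConnectionLost { last_contact: t },
            (true, _) if s.crc_error_count > 0 => HealthStatus::Degraded {
                error_count: s.crc_error_count,
            },
            (true, _) => HealthStatus::Healthy {
                firmware_version: s.firmware_version.clone(),
            },
        }
    }

    /// Text shown in the health section, taken from the scenarios above.
    fn label(&self) -> &'static str {
        match self {
            HealthStatus::Healthy { .. } => "Healthy",
            HealthStatus::Degraded { .. } => "Degraded - Communication Errors",
            HealthStatus::DeviceUnreachable => "Unhealthy - Device Unreachable",
            HealthStatus::ConnectionLost { .. } => "Unhealthy - Connection Lost",
        }
    }
}
```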
---
### User Story 5 - Relay Labeling (Priority: P3)
As a user, I want to assign custom labels to each relay (e.g., "Garage Light", "Water Pump") so I can identify relays by purpose instead of numbers.
**Why this priority**: Usability enhancement - makes system more intuitive for production use. Not required for MVP but improves long-term user experience.
**Independent Test**: Can be tested by assigning a label to relay #1, refreshing the page, and verifying the label persists. Delivers value for multi-relay installations without requiring other stories.
**Acceptance Scenarios**:
1. **Given** I am viewing relay #3, **When** I click "Edit Label" and enter "Office Fan", **Then** relay #3 displays "Office Fan (Relay 3)"
2. **Given** relay #7 has label "Water Pump", **When** I refresh the page, **Then** relay #7 still shows "Water Pump (Relay 7)"
3. **Given** I have labeled multiple relays, **When** I toggle a relay by label, **Then** the correct physical relay responds
4. **Given** two relays have similar labels, **When** I search for a label, **Then** both matching relays are highlighted
---
### Edge Cases
- **Network Partition**: What happens when the Raspberry Pi loses connectivity to the Modbus device mid-operation?
- Backend marks device unhealthy, frontend displays error state, pending operations fail gracefully with clear error messages
- **Concurrent Control**: How does system handle multiple users toggling the same relay simultaneously?
- Last-write-wins semantics; each client receives the updated state via polling within 2 seconds
- **Modbus Timeout**: What happens when a relay command times out?
- Backend retries once automatically; if the retry fails, it returns an error to the frontend with a clear timeout message
- **Partial Bulk Failure**: What happens when "All ON" command succeeds for 7 relays but relay #4 fails?
- Frontend displays partial success with list of failed relays, successful relays remain ON, user can retry failed relays individually
- **Rapid Toggle Requests**: How does system handle user clicking toggle button repeatedly in quick succession?
- Frontend debounces clicks (500ms) and backend queues commands serially to prevent command flooding
- **Device Firmware Mismatch**: What happens if relay device firmware version is incompatible?
- Backend logs firmware version, health check displays warning if version is untested, system attempts normal operation with degraded status
- **State Inconsistency**: What happens if Modbus read shows relay state different from expected state after write?
- Backend logs inconsistency, frontend displays actual state (read value), user sees visual indication of unexpected state
- **Browser Compatibility**: How does frontend handle older browsers without modern JavaScript features?
- Vue.js build targets ES2015+, displays graceful error message on IE11 and older, works on all modern browsers (Chrome, Firefox, Safari, Edge)
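
The "Modbus Timeout" and "Partial Bulk Failure" cases above can be sketched as a retry-once wrapper plus a per-relay outcome report. This is an illustration under assumptions: `with_single_retry`, `BulkOutcome`, and `apply_to_all` are hypothetical names, and the bulk helper writes relays one at a time purely to show how partial failures could be reported; a real bulk operation may instead be a single Write Multiple Coils command.

```rust
/// Runs `op`, and if it fails, runs it exactly one more time.
/// The error from the second attempt is the one returned.
fn with_single_retry<T, E>(mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    match op() {
        Ok(value) => Ok(value),
        Err(_first_error) => op(),
    }
}

/// Hypothetical report for a bulk operation with partial failures.
#[derive(Debug, Default, PartialEq)]
struct BulkOutcome {
    succeeded: Vec<u8>,
    failed: Vec<(u8, String)>,
}

/// Applies `write` to relays 1 through 8 in order (serially), retrying each
/// failed write once and collecting the relays that still failed.
fn apply_to_all(mut write: impl FnMut(u8) -> Result<(), String>) -> BulkOutcome {
    let mut outcome = BulkOutcome::default();
    for id in 1..=8u8 {
        match with_single_retry(|| write(id)) {
            Ok(()) => outcome.succeeded.push(id),
            Err(message) => outcome.failed.push((id, message)),
        }
    }
    outcome
}
```

With this shape, successful relays stay in `succeeded` and the frontend can list `failed` entries for individual retry, matching the partial-success behaviour described above.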
## Requirements *(mandatory)*
### Functional Requirements
#### Backend - Modbus Integration
- **FR-001**: System MUST establish Modbus RTU over TCP connection to relay device on configurable IP and port (default: device IP, port 502)
- **FR-002**: System MUST use Modbus function code 0x01 (Read Coils) to read all 8 relay states (addresses 0-7)
- **FR-003**: System MUST use Modbus function code 0x05 (Write Single Coil) to toggle individual relays
- **FR-004**: System MUST use Modbus function code 0x0F (Write Multiple Coils) for bulk operations (All ON/All OFF)
- **FR-005**: System MUST validate Modbus CRC16 checksums on all received messages
- **FR-006**: System MUST timeout Modbus operations after 3 seconds
- **FR-007**: System MUST retry failed Modbus commands exactly once before returning error
- **FR-008**: System MUST handle Modbus exception codes (0x01-0x04) and map to user-friendly error messages
- **FR-009**: System MUST use tokio-modbus library version 0.17.0 for Modbus protocol implementation
- **FR-010**: System MUST support configurable Modbus device address (default: 0x01)
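
To make FR-002 through FR-008 concrete, here is a hand-rolled sketch of RTU-style framing using only the standard library. It is illustrative only: FR-009 specifies tokio-modbus for the actual protocol work, and every function name below is hypothetical. The CRC is the standard CRC-16/MODBUS (initial value 0xFFFF, reflected polynomial 0xA001, transmitted low byte first), and the exception messages are example wording for the four standard exception codes.

```rust
/// CRC-16/MODBUS over `data`: init 0xFFFF, reflected polynomial 0xA001.
fn crc16_modbus(data: &[u8]) -> u16 {
    let mut crc: u16 = 0xFFFF;
    for &byte in data {
        crc ^= u16::from(byte);
        for _ in 0..8 {
            if crc & 1 != 0 {
                crc = (crc >> 1) ^ 0xA001;
            } else {
                crc >>= 1;
            }
        }
    }
    crc
}

/// Builds a request frame: device address, function code, two big-endian
/// 16-bit fields, then the CRC appended low byte first.
fn build_request(device: u8, function: u8, start: u16, value: u16) -> Vec<u8> {
    let mut frame = vec![device, function];
    frame.extend_from_slice(&start.to_be_bytes());
    frame.extend_from_slice(&value.to_be_bytes());
    let crc = crc16_modbus(&frame);
    frame.extend_from_slice(&crc.to_le_bytes());
    frame
}

/// 0x01 Read Coils for the 8 coils at addresses 0-7.
fn read_coils_request(device: u8) -> Vec<u8> {
    build_request(device, 0x01, 0, 8)
}

/// 0x05 Write Single Coil: 0xFF00 switches the coil on, 0x0000 off.
fn write_single_coil_request(device: u8, coil: u16, on: bool) -> Vec<u8> {
    build_request(device, 0x05, coil, if on { 0xFF00 } else { 0x0000 })
}

/// A frame with its CRC appended (low byte first) has a CRC of zero.
fn has_valid_crc(frame: &[u8]) -> bool {
    frame.len() >= 4 && crc16_modbus(frame) == 0
}

/// Decodes the status byte of a Read Coils response: bit 0 is coil 0.
fn decode_coils(status: u8) -> [bool; 8] {
    let mut states = [false; 8];
    for (i, state) in states.iter_mut().enumerate() {
        *state = (status & (1 << i)) != 0;
    }
    states
}

/// Example user-facing text for Modbus exception codes 0x01-0x04.
fn exception_message(code: u8) -> &'static str {
    match code {
        0x01 => "Illegal function: the device does not support this command",
        0x02 => "Illegal data address: the relay address is not valid",
        0x03 => "Illegal data value: the device rejected the requested value",
        0x04 => "Server device failure: the device could not complete the command",
        _ => "Unknown Modbus exception",
    }
}
```

Checking a received frame with `has_valid_crc` before decoding it is one way to satisfy FR-005; a frame that fails the check would count toward the communication error total rather than being decoded.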
#### Backend - REST API
- **FR-011**: System MUST expose `GET /api/relays` endpoint returning array of all relay states (id, state, label)
- **FR-012**: System MUST expose `POST /api/relays/{id}/toggle` endpoint to toggle relay {id} (id: 1-8)
- **FR-013**: System MUST expose `POST /api/relays/bulk` endpoint accepting `{"operation": "all_on" | "all_off"}`
- **FR-014**: System MUST expose `GET /api/health` endpoint returning device status (healthy/unhealthy, firmware version, last_contact timestamp)
- **FR-015**: System MUST expose `PUT /api/relays/{id}/label` endpoint to update relay label (max 50 characters)
- **FR-016**: System MUST return HTTP 200 for successful operations with JSON response body
- **FR-017**: System MUST return HTTP 500 for Modbus communication failures with error details
- **FR-018**: System MUST return HTTP 400 for invalid request parameters (e.g., relay id out of range)
- **FR-019**: System MUST return HTTP 504 for Modbus timeout errors
- **FR-020**: System MUST include OpenAPI 3.0 specification accessible at `/api/specs`
- **FR-021**: System MUST apply rate limiting middleware (100 requests/minute per IP)
- **FR-022**: System MUST apply CORS middleware allowing all origins (local network deployment)
- **FR-023**: System MUST start successfully even if Modbus device is unreachable at startup, marking device as unhealthy
- **FR-024**: System MUST persist relay labels to configuration file (YAML) for persistence across restarts
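
The status code rules in FR-016 through FR-019 and the bulk payload in FR-013 can be sketched as a plain mapping, independent of any web framework. The names `ApiError`, `status_code`, `BulkOperation`, and `parse_bulk_operation` are hypothetical; the numeric codes and the `all_on` / `all_off` strings come from the requirements above.

```rust
/// Hypothetical error categories matching FR-017, FR-018, and FR-019.
#[derive(Debug, Clone, PartialEq)]
enum ApiError {
    InvalidRequest(String),
    ModbusFailure(String),
    ModbusTimeout,
}

/// Maps an operation result to the HTTP status required by the spec.
fn status_code(result: &Result<(), ApiError>) -> u16 {
    match result {
        Ok(()) => 200,
        Err(ApiError::InvalidRequest(_)) => 400,
        Err(ApiError::ModbusFailure(_)) => 500,
        Err(ApiError::ModbusTimeout) => 504,
    }
}

/// The two operations accepted by `POST /api/relays/bulk`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BulkOperation {
    AllOn,
    AllOff,
}

/// Parses the `operation` field; anything else is an invalid request.
fn parse_bulk_operation(raw: &str) -> Result<BulkOperation, ApiError> {
    match raw {
        "all_on" => Ok(BulkOperation::AllOn),
        "all_off" => Ok(BulkOperation::AllOff),
        other => Err(ApiError::InvalidRequest(format!(
            "unknown bulk operation: {}",
            other
        ))),
    }
}
```

Keeping the mapping in one function means the timeout case (504) cannot silently fall through to the generic failure case (500).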
#### Frontend - User Interface
- **FR-025**: UI MUST display all 8 relays in a grid layout with clear ON/OFF state indication (color-coded)
- **FR-026**: UI MUST provide toggle button for each relay that triggers `POST /api/relays/{id}/toggle`
- **FR-027**: UI MUST provide "All ON" and "All OFF" buttons that trigger `POST /api/relays/bulk`
- **FR-028**: UI MUST poll `GET /api/relays` every 2 seconds to refresh relay states
- **FR-029**: UI MUST display loading indicator while relay operations are in progress
- **FR-030**: UI MUST display error messages when API calls fail, with specific error text from backend
- **FR-031**: UI MUST display health status section showing device connectivity and firmware version
- **FR-032**: UI MUST display "Unhealthy - Device Unreachable" message when backend reports device unreachable
- **FR-033**: UI MUST provide inline label editing for each relay (click to edit, save on blur/enter)
- **FR-034**: UI MUST be responsive and functional on desktop (>1024px), tablet (768-1024px), and mobile (320-767px)
- **FR-035**: UI MUST disable toggle buttons and show error when device is unhealthy
- **FR-036**: UI MUST show timestamp of last successful state update
- **FR-037**: UI MUST debounce toggle button clicks to 500ms to prevent rapid repeated requests
### Non-Functional Requirements
#### Performance
- **NFR-001**: System MUST respond to `GET /api/relays` within 100ms (excluding Modbus communication time)
- **NFR-002**: System MUST complete relay toggle operations within 1 second (including Modbus communication)
- **NFR-003**: System MUST handle 10 concurrent users without performance degradation
- **NFR-004**: Frontend MUST render initial page load within 2 seconds on 10 Mbps connection
#### Reliability
- **NFR-005**: System MUST maintain 95% successful operation rate for Modbus commands
- **NFR-006**: System MUST recover automatically from temporary Modbus connection loss within 5 seconds
- **NFR-007**: System MUST log all Modbus errors with structured logging (timestamp, error code, relay id)
- **NFR-008**: Backend MUST continue serving health and API endpoints even when Modbus device is unreachable
#### Security
- **NFR-009**: Backend MUST run on local network with Modbus device (no direct public internet exposure)
- **NFR-010**: System MUST NOT implement application-level authentication (handled by Traefik middleware with Authelia)
- **NFR-011**: Frontend-to-backend communication MUST use HTTPS via Traefik reverse proxy (backend itself runs HTTP, Traefik handles TLS termination)
- **NFR-012**: System MUST validate all API inputs to prevent injection attacks
- **NFR-013-SEC**: Backend-to-Modbus communication uses unencrypted Modbus RTU over TCP (local network only)
#### Maintainability
- **NFR-014**: Code MUST achieve >90% test coverage for domain logic (relay control, Modbus abstraction)
- **NFR-015**: System MUST follow hexagonal architecture with trait-based Modbus abstraction for testability
- **NFR-016**: System MUST use Type-Driven Development (TyDD) with newtype pattern for RelayId, RelayState, ModbusCommand
- **NFR-017**: All public APIs MUST have OpenAPI documentation
- **NFR-018-MAINT**: Code MUST pass `cargo clippy` with zero warnings on all, pedantic, and nursery lints
#### Observability
- **NFR-019**: System MUST emit structured logs at all architectural boundaries (API, Modbus)
- **NFR-020**: System MUST log relay state changes with timestamp, relay id, old state, new state
- **NFR-021**: System MUST expose Prometheus metrics endpoint at `/metrics` (request count, error rate, Modbus latency)
- **NFR-022**: System MUST log startup configuration (Modbus host/port, relay count) at INFO level
### Key Entities
- **Relay**: Represents a single relay channel (1-8) with properties: id (1-8), state (ON/OFF), label (optional, max 50 chars)
- **RelayState**: Enum representing ON or OFF state
- **RelayId**: Newtype wrapping u8 with validation (1-8 range), implements TyDD pattern
- **ModbusCommand**: Enum representing Modbus operations (ReadCoils, WriteSingleCoil, WriteMultipleCoils)
- **DeviceHealth**: Struct representing Modbus device status (`healthy: bool`, `firmware_version: Option<String>`, `last_contact: Option<DateTime>`)
- **RelayLabel**: Newtype wrapping String with validation (max 50 chars, alphanumeric + spaces)
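
As a rough illustration of the newtype pattern named in NFR-016, the entities above could be modelled so that invalid values cannot be constructed. This is a sketch under assumptions: `ValidationError`, the constructor names, and `coil_address` (relay N mapped to coil N-1, matching coils at addresses 0-7) are hypothetical, and empty labels are accepted because the spec states only the length and character rules.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ValidationError {
    RelayIdOutOfRange(u8),
    LabelTooLong(usize),
    LabelInvalidCharacter(char),
}

/// Relay channel number, guaranteed to be in 1-8 once constructed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct RelayId(u8);

impl RelayId {
    fn new(raw: u8) -> Result<Self, ValidationError> {
        if (1..=8).contains(&raw) {
            Ok(RelayId(raw))
        } else {
            Err(ValidationError::RelayIdOutOfRange(raw))
        }
    }

    fn get(self) -> u8 {
        self.0
    }

    /// Assumption: relay N corresponds to the coil at address N-1.
    fn coil_address(self) -> u16 {
        u16::from(self.0 - 1)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RelayState {
    On,
    Off,
}

impl RelayState {
    fn toggled(self) -> Self {
        match self {
            RelayState::On => RelayState::Off,
            RelayState::Off => RelayState::On,
        }
    }
}

/// Label of at most 50 characters, alphanumeric plus spaces.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RelayLabel(String);

impl RelayLabel {
    const MAX_CHARS: usize = 50;

    fn new(raw: &str) -> Result<Self, ValidationError> {
        let len = raw.chars().count();
        if len > Self::MAX_CHARS {
            return Err(ValidationError::LabelTooLong(len));
        }
        if let Some(bad) = raw.chars().find(|c| !(c.is_alphanumeric() || *c == ' ')) {
            return Err(ValidationError::LabelInvalidCharacter(bad));
        }
        Ok(RelayLabel(raw.to_string()))
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}
```

Because the inner fields are private to the defining module, code elsewhere can only obtain a `RelayId` or `RelayLabel` through `new`, so range and length checks happen once at the boundary.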
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: Users can view all 8 relay states within 2 seconds of loading the web interface
- **SC-002**: Users can toggle any relay with physical relay response within 1 second of button click
- **SC-003**: System achieves 95% successful operation rate for relay toggle commands over 24-hour period
- **SC-004**: Web interface is accessible and functional on Chrome, Firefox, Safari, and Edge browsers
- **SC-005**: Users can successfully use the interface on mobile devices (portrait and landscape)
- **SC-006**: Backend starts successfully and serves health endpoint even when Modbus device is disconnected
- **SC-007**: Frontend displays clear error message within 2 seconds when Modbus device is unhealthy
- **SC-008**: System supports 10 concurrent users performing toggle operations without performance degradation
- **SC-009**: All 8 relays turn ON within 2 seconds when "All ON" button is clicked
- **SC-010**: Domain logic achieves >90% test coverage as measured by `cargo tarpaulin`
### User Experience Goals
- **UX-001**: Non-technical users can control relays without referring to documentation
- **UX-002**: Error messages clearly explain problem and suggest remediation (e.g., "Device unreachable - check network connection")
- **UX-003**: Relay labels make it intuitive to identify relay purpose without memorizing numbers
## Dependencies & Assumptions
### Dependencies
- **Hardware**: 8-channel Modbus POE ETH Relay device (documented in `docs/Modbus_POE_ETH_Relay.md`)
- **Network**: Local network connectivity between Raspberry Pi and relay device
- **Libraries**: tokio-modbus 0.17.0, Poem 3.1, poem-openapi 5.1, Tokio 1.48
- **Frontend**: Vue.js 3.x, TypeScript, Vite build tool
- **Backend Deployment**: Raspberry Pi (or equivalent) running Linux with Docker
- **Frontend Deployment**: Cloudflare Pages (or equivalent static hosting)
- **Reverse Proxy**: Traefik with Authelia middleware for authentication
### Assumptions
- **ASM-001**: Relay device uses Modbus RTU over TCP protocol (per hardware documentation)
- **ASM-002**: Relay device supports standard Modbus function codes 0x01, 0x05, 0x0F
- **ASM-003**: Local network provides reliable connectivity (>95% uptime)
- **ASM-004**: Traefik reverse proxy with Authelia middleware provides adequate authentication
- **ASM-005**: Single user will control relays at a time in most scenarios (concurrent control is edge case)
- **ASM-006**: Relay device exposes 8 coils at Modbus addresses 0-7
- **ASM-007**: Device firmware is compatible with tokio-modbus library
- **ASM-008**: Raspberry Pi has sufficient resources (CPU, memory) to run Rust backend
- **ASM-009**: Cloudflare Pages or equivalent CDN provides fast frontend delivery
- **ASM-010**: Frontend can reach backend via HTTPS through Traefik reverse proxy
## Out of Scope
The following capabilities are explicitly excluded from this specification:
- **Application-Level Authentication**: No user login, role-based access control, or API keys (handled by Traefik/Authelia)
- **Historical Data**: No database, state logging, or historical relay state tracking
- **Scheduling**: No timer-based relay control or automation rules
- **Multiple Devices**: No support for controlling multiple relay devices simultaneously
- **Advanced Modbus Features**: No support for flash modes, timing operations, or device reconfiguration
- **Mobile Native Apps**: Web interface only, no iOS/Android native applications
- **Cloud Backend**: Backend runs on local network (Raspberry Pi), frontend served from Cloudflare Pages
- **Real-time Updates**: HTTP polling only (no WebSocket, Server-Sent Events)
## Risks & Mitigations
| Risk | Impact | Probability | Mitigation |
|--------------------------------------------|--------|-------------|------------------------------------------------------------------------|
| Modbus device firmware incompatibility | High | Low | Test with actual hardware early, document compatible firmware versions |
| Network latency exceeds timeout thresholds | Medium | Medium | Make timeouts configurable, implement adaptive retry logic |
| Concurrent control causes state conflicts | Low | Medium | Implement last-write-wins with clear state refresh in UI |
| Frontend polling overwhelms backend | Low | Low | Rate limit API endpoints, make poll interval configurable |
| Raspberry Pi resource exhaustion | Medium | Low | Benchmark with 10 concurrent users, optimize Modbus connection pooling |
## Revision History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2025-12-28 | Business Analyst Agent | Initial specification based on user input |
| 1.1 | 2025-12-28 | User Clarification | FR-023 clarified: Backend starts successfully even when device unhealthy, frontend displays error (part of Health story) |
| 1.2 | 2025-12-29 | User Clarification | Architecture updated: Frontend on Cloudflare Pages, backend on RPi behind Traefik with Authelia. Updated NFR-009 through NFR-013-SEC to reflect HTTPS via reverse proxy and authentication via Traefik middleware |