Bogdan Timofte authored 2 weeks ago

# Home MQTT Semantic Bus

## Overview

This project defines the architecture and conventions used to build a semantic MQTT bus for a heterogeneous home infrastructure.

The environment includes multiple device ecosystems and protocols such as Zigbee, custom ESP firmware, network telemetry, energy systems, and HomeKit integrations. These systems publish data using incompatible topic structures and payload formats.

The purpose of this repository is to define a canonical internal structure that allows all telemetry, events, and states to be normalized and consumed by multiple systems.

The MQTT bus acts as the central integration layer between devices and higher-level services.

Primary documents:

- `consolidated_spec.md`: consolidated reference linking all specs, decisions, and end-to-end traces
- `mqtt_contract.md`: shared transport, payload, metadata, and historian rules
- `sys_bus.md`: operational namespace for adapters, workers, and infrastructure components
- `home_bus.md`: room-centric semantic bus contract
- `energy_bus.md`: electrical telemetry bus contract
- `addapters.md`: adapter responsibilities and normalization rules
- `adapter_implementation_examples.md`: practical Node-RED adapter patterns, flow integration guidance, and known failure modes
- `historian_worker.md`: historian worker responsibilities for consuming buses and writing to PostgreSQL
- `tdb_ingestion/mqtt_ingestion_api.md`: PostgreSQL historian ingestion API contract for numeric and boolean measurements
- `tdb_ingestion/counter_ingestion_api.md`: counter ingestion API contract (stabilized, not yet implemented)

---

## Architectural Model

The architecture separates five fundamental layers:

Device Layer

Devices publish telemetry using vendor-specific protocols and topic structures.

Examples:

- Zigbee2MQTT
- Tasmota
- ESP firmware
- SNMP
- MikroTik APIs
- Modbus energy meters

Protocol Adapter Layer

Adapters translate vendor-specific topics and payloads into canonical MQTT bus contracts.

Adapters perform only normalization and protocol translation.

They must not implement automation logic, aggregation, or persistence.
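
The translation step described above can be sketched roughly as follows. The example is written in Python for brevity, although the adapters in this project run in Node-RED. The site name `site1`, the mapping tables, the device identifiers, and the segment order `<site>/home/<location>/<capability>/<device>/value` are assumptions made purely for illustration; the authoritative topic grammar lives in `home_bus.md`.

```python
import json

SITE = "site1"  # assumed site identifier

# Hypothetical mapping table: vendor device name -> (location, canonical device id).
DEVICE_MAP = {
    "0x00158d0001a2b3c4": ("bedroom", "sensor1"),
}

# Hypothetical mapping of vendor payload keys to canonical capability names.
CAPABILITY_MAP = {"temperature": "temperature", "humidity": "humidity"}


def normalize_zigbee(vendor_topic, raw_payload):
    """Translate one `zigbee2mqtt/<device>` message into canonical (topic, payload) pairs.

    Only normalization happens here: no automation, aggregation, or persistence.
    """
    _, _, device = vendor_topic.partition("/")
    mapped = DEVICE_MAP.get(device)
    if mapped is None:
        return []  # unknown device: nothing is published on the canonical bus
    location, device_id = mapped
    data = json.loads(raw_payload)
    out = []
    for key, capability in CAPABILITY_MAP.items():
        if key in data:
            topic = f"{SITE}/home/{location}/{capability}/{device_id}/value"
            out.append((topic, data[key]))
    return out
```

Vendor-only fields such as `linkquality` are dropped here because they have no entry in the capability table, which keeps the old topic structure and payload shape from leaking into the canonical contract.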

MQTT Semantic Bus

The canonical model is implemented as multiple semantic buses (for example `home`, `energy`, `network`), each with a strict domain contract.

All higher-level services consume data from this layer.

The bus is intentionally lightweight: canonical publications must remain minimal, MQTT-ready messages rather than rich adapter envelopes.

Historian Worker Layer

Historian persistence is handled by a worker that subscribes to canonical bus topics and writes them into PostgreSQL using the historian ingestion API.

Consumer Layer

Multiple systems consume the bus simultaneously:

- HomeKit integration
- Historian (time-series storage)
- Aggregators
- Automation logic
- Dashboards and monitoring

Pipeline:

Device -> Protocol Adapter -> MQTT Bus -> Historian Worker / Other Consumers

---

## The Standardization Problem

IoT ecosystems lack a common telemetry model.

Different devices publish data using incompatible conventions:

- inconsistent topic hierarchies
- different payload formats (numeric, text, JSON)
- different naming schemes
- missing timestamps
- device-specific semantics

This lack of standardization creates several problems:

- difficult automation
- complex integrations
- duplicated parsing logic
- unreliable historical analysis

The semantic MQTT bus solves this by enforcing strict internal addressing contracts per bus.

Adapters isolate vendor inconsistencies and expose normalized data to the rest of the system.

---

## Shared Contract Baseline (v1)

Each bus defines its own topic grammar, but all buses inherit the same shared contract from `mqtt_contract.md`.

The shared contract defines:

- the common stream taxonomy (`value`, `last`, `set`, `meta`, `availability`)
- payload profiles (`scalar` and `envelope`)
- retained metadata structure
- time semantics (`observed_at`, `published_at`, `ingested_at`)
- quality states
- operational topics under `<site>/sys/...` with detailed rules in `sys_bus.md`
- historian defaults

Semantic categories such as `sample`, `state`, and `event` are carried by `meta.historian.mode`, not by introducing separate live stream names in v1.

This keeps ingestion simple and predictable while allowing low-overhead Node-RED flows.
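
To make the split between live payloads and retained metadata more concrete, here is a minimal sketch in Python. Only `meta.historian.mode`, its three values, the two profile names, and `observed_at` come from this document. The `unit` and `value` field names, the ISO 8601 timestamp format, and the idea that the `scalar` profile is a bare JSON value are assumptions for illustration; `mqtt_contract.md` remains the authoritative definition.

```python
import json
from datetime import datetime, timezone

HISTORIAN_MODES = {"sample", "state", "event"}


def build_meta(mode, unit=None):
    """Build a retained `meta` payload that carries the semantic category."""
    if mode not in HISTORIAN_MODES:
        raise ValueError(f"unknown historian mode: {mode!r}")
    meta = {"historian": {"mode": mode}}
    if unit is not None:
        meta["unit"] = unit  # hypothetical field name
    return json.dumps(meta, separators=(",", ":"))


def scalar_payload(value):
    """Scalar profile (assumed shape): the bare value with no wrapper."""
    return json.dumps(value)


def envelope_payload(value, observed_at):
    """Envelope profile (assumed shape): value plus the source observation time."""
    body = {"value": value, "observed_at": observed_at.isoformat()}
    return json.dumps(body, separators=(",", ":"))


# The category is published once on the retained `meta` topic,
# so each live `value` message can stay as small as a single number.
meta = build_meta("sample", unit="C")
live = scalar_payload(21.5)
stamped = envelope_payload(21.5, datetime(2024, 1, 1, tzinfo=timezone.utc))
```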

---

## Node-RED Translation Constraints

Protocol translation is implemented in Node-RED.

To keep flow cost low and determinism high:

- keep topic shapes stable and predictable
- avoid expensive JSON transforms in high-rate paths
- publish repeated metadata on retained `meta` topics
- publish canonical MQTT-ready messages as early as possible after normalization
- keep hot-path messages minimal at publish time: `topic`, `payload`, and stream-policy QoS/retain only
- do not carry adapter-internal normalization structures on forwarded `msg` objects
- delete temporary adapter fields before MQTT publish
- do not use semantic bus topics as a debugging channel
- use reusable normalization subflows and centralized mapping tables
- avoid broad `#` subscriptions on high-volume paths

These constraints are reflected in each bus specification.
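
The rule about minimal hot-path messages can be illustrated with a small allow-list filter. This is a sketch in Python that models a Node-RED `msg` object as a plain dictionary; in a real flow the same logic would live in a function node. The `_adapter` field and the topic in the usage lines are hypothetical examples of temporary adapter state. Building a new message from an allow-list, rather than deleting known temporary fields one by one, means fields added later by upstream nodes cannot leak onto the bus by accident.

```python
# Fields allowed on a message at publish time: topic, payload,
# and the stream-policy QoS/retain flags.
PUBLISH_FIELDS = ("topic", "payload", "qos", "retain")


def to_publish_msg(msg):
    """Return a new message that holds only the MQTT-ready fields.

    The input message is left untouched; anything not in the allow-list
    (temporary adapter fields, debugging data) is simply not copied.
    """
    missing = [k for k in ("topic", "payload") if k not in msg]
    if missing:
        raise KeyError(f"message is missing required fields: {missing}")
    return {k: msg[k] for k in PUBLISH_FIELDS if k in msg}


# Usage: the adapter-internal structure is dropped before publish.
raw_msg = {
    "topic": "site1/home/bedroom/temperature/sensor1/value",
    "payload": 21.5,
    "qos": 0,
    "retain": False,
    "_adapter": {"vendor_topic": "zigbee2mqtt/0x00158d0001a2b3c4"},
}
clean_msg = to_publish_msg(raw_msg)
```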

---

## Operational Separation

The new broker is treated as a clean semantic boundary.

Production-facing legacy topics may continue to exist temporarily, but adapters should normalize data into the new broker namespace without leaking old topic structures into the canonical contract.

The target split is:

- legacy broker and vendor topics remain compatibility surfaces
- the new broker hosts the semantic buses and adapter operational topics
- historian and future consumers subscribe only to canonical topics

---

## Historian Integration

One of the primary consumers of the bus is the historian.

The historian records time-series measurements for long-term analysis.

Typical use cases include:

- temperature history
- energy production and consumption
- network traffic metrics
- device performance monitoring

The historian does not communicate directly with devices.

Instead, it subscribes to normalized bus topics.

Current ingestion modeling is split in two:

- numeric and boolean measurements or states go through `tdb_ingestion/mqtt_ingestion_api.md`
- cumulative counters such as `energy_total` follow the separate contract in `tdb_ingestion/counter_ingestion_api.md` (stabilized, not yet implemented)

Example subscriptions:

- `+/home/+/+/+/value`
- `+/energy/+/+/+/value`

This architecture ensures that historical data remains consistent even when devices or protocols change.
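
To show which topics the example subscriptions select, the sketch below implements standard MQTT topic filter matching in Python: `+` matches exactly one level, and a trailing `#` matches the parent level and everything beneath it. It is a simplified illustration (it ignores the special handling of topics that begin with `$`), and the concrete topics in the usage lines are hypothetical. In practice the broker performs this matching itself.

```python
def topic_matches(topic_filter, topic):
    """Return True if `topic` matches the MQTT `topic_filter`."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    for i, f in enumerate(f_levels):
        if f == "#":
            # Multi-level wildcard: only valid as the last filter level.
            return i == len(f_levels) - 1
        if i >= len(t_levels):
            return False  # filter is longer than the topic
        if f != "+" and f != t_levels[i]:
            return False  # literal level differs
    return len(f_levels) == len(t_levels)


# Usage with the historian filter from the list above (topics are hypothetical).
HOME_VALUES = "+/home/+/+/+/value"
is_value = topic_matches(HOME_VALUES, "site1/home/bedroom/temperature/sensor1/value")
is_meta = topic_matches(HOME_VALUES, "site1/home/bedroom/temperature/sensor1/meta")
```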

---

## Project Goals

The project aims to achieve the following objectives:

1. Define a stable MQTT semantic architecture for home infrastructure.
2. Decouple device protocols from automation and monitoring systems.
3. Enable multiple independent consumers of telemetry data.
4. Provide consistent topic contracts across heterogeneous systems.
5. Support scalable integration of additional device ecosystems.
6. Enable long-term historical analysis of telemetry.
7. Simplify integration with HomeKit and other user interfaces.
8. Make historian ingestion generic enough to reuse across buses.
9. Keep room for future buses without reworking existing consumers.

---

## Core Concepts

Adapters

Components that translate between systems and canonical bus contracts.

In practice there are two useful classes:

- ingress adapters: vendor/protocol topics -> canonical bus topics
- consumer adapters: canonical bus topics -> downstream consumer models such as HomeKit

Buses

Domain-specific normalized telemetry spaces (for example `home`, `energy`, `network`).

Streams

Named data flows associated with a capability or metric (`value`, `last`, `set`, `meta`, `availability`).

Semantic interpretation such as `sample`, `state`, or `event` is carried by retained `meta`, especially `meta.historian.mode`.
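
The relationship between buses and streams can be sketched as a small topic parser in Python. It assumes, based on the examples in this document, that a semantic bus topic starts with the site, is followed by the bus name, and ends with the stream name. The segments in between are bus-specific and are left uninterpreted; the `CanonicalTopic` type and the sample topic are hypothetical and not part of any contract.

```python
from dataclasses import dataclass

# Stream names from the shared contract.
STREAMS = {"value", "last", "set", "meta", "availability"}


@dataclass(frozen=True)
class CanonicalTopic:
    site: str
    bus: str
    path: tuple  # bus-specific middle segments, not interpreted here
    stream: str


def parse_topic(topic):
    """Split a semantic bus topic into site, bus, bus-specific path, and stream."""
    parts = topic.split("/")
    if len(parts) < 3:
        raise ValueError(f"too few topic levels: {topic!r}")
    site, bus, *path, stream = parts
    if stream not in STREAMS:
        raise ValueError(f"unknown stream {stream!r} in {topic!r}")
    return CanonicalTopic(site, bus, tuple(path), stream)


parsed = parse_topic("site1/energy/main/power/meter1/value")
```

A consumer that only cares about the stream kind can branch on `parsed.stream` and leave the middle segments to the bus-specific contract.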

Consumers

Systems that subscribe to the bus and process the data.

---

## Design Principles

Protocol isolation

Device ecosystems must not leak their internal topic structure into the system.

Contract-driven addressing

All normalized telemetry must follow explicit per-bus topic contracts.

Loose coupling

Consumers must not depend on specific device implementations.

Extensibility

New buses, locations, devices, and metrics must be easy to integrate.

Observability

All telemetry should be recordable by a historian.

Node-RED efficiency

Topic and payload design should minimize transformation overhead in Node-RED.

The MQTT semantic bus is therefore optimized as a low-memory, low-CPU event bus for constrained accessories, SBCs, thin VMs, and high-rate Node-RED flows.

---

## Status

The system is currently being deployed with a new MQTT broker running on `192.168.2.101`.

The legacy broker at `192.168.2.133` will be progressively phased out as Node-RED adapters migrate traffic into the canonical bus.