MQTT Shared Contract

Purpose

This document defines the shared contract baseline for all semantic MQTT buses in this repository.

It exists to keep adapters, historian ingestion, and future buses aligned on the same transport and payload rules.

Bus-specific documents such as home_bus.md and energy_bus.md define their own topic grammar, but they inherit the rules from this shared contract.

The operational namespace under <site>/sys/... is specified in more detail by sys_bus.md.


Namespace Model

Two top-level namespaces are reserved:

  • semantic bus namespace: <site>/<bus>/...
  • operational namespace: <site>/sys/...

Examples:

  • vad/home/bedroom/temperature/bedroom-sensor/value
  • vad/energy/load/living-room-tv/active_power/value
  • vad/sys/adapter/z2m-main/error

Rules:

  • <site> MUST be stable and lowercase kebab-case
  • <bus> MUST be a reserved bus identifier such as home, energy, network, compute, vehicle
  • sys is reserved for operational topics and is not a semantic bus; see sys_bus.md
  • topic path versioning is intentionally not used in v1 to keep wildcard subscriptions simple
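As an illustration of the namespace rules above, the following minimal sketch splits a topic into its site, namespace, and remaining segments. The function name `split_topic`, the `KNOWN_BUSES` set (taken from the example identifiers listed above), and the exact kebab-case regular expression are assumptions made for this example, not part of the contract.

```python
import re
from typing import NamedTuple

# Assumption: kebab-case means lowercase alphanumeric words joined by single hyphens.
KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
# Bus identifiers named as examples in this contract; a real registry may differ.
KNOWN_BUSES = {"home", "energy", "network", "compute", "vehicle"}


class TopicParts(NamedTuple):
    site: str
    namespace: str        # "sys" or a semantic bus identifier
    is_operational: bool  # True for <site>/sys/...
    rest: tuple           # remaining segments, bus-specific grammar


def split_topic(topic: str) -> TopicParts:
    """Split <site>/<bus>/... or <site>/sys/... and check the shared rules."""
    segments = topic.split("/")
    if len(segments) < 3:
        raise ValueError(f"topic too short: {topic!r}")
    site, namespace, *rest = segments
    if not KEBAB.match(site):
        raise ValueError(f"site is not lowercase kebab-case: {site!r}")
    if namespace != "sys" and namespace not in KNOWN_BUSES:
        raise ValueError(f"unknown bus identifier: {namespace!r}")
    return TopicParts(site, namespace, namespace == "sys", tuple(rest))
```

The remaining segments are returned untouched because their grammar belongs to the bus-specific documents.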

Versioning rules:

  • breaking contract changes MUST be expressed via document version and schema_ref
  • existing topic paths MUST remain stable for v1 consumers

Shared Streams

All semantic buses use the same stream taxonomy:

  • value: live hot-path semantic sample, whether it represents a measurement, durable state, or transition notification
  • last: retained last-known timestamped sample used for cold start and freshness evaluation
  • set: command/request topic
  • meta: retained metadata for the sibling topic family
  • availability: online/offline or degraded health signal

Rules:

  • set MUST NOT be retained
  • last SHOULD be retained
  • meta SHOULD be retained
  • availability SHOULD be retained and SHOULD use LWT when supported
  • buses MUST NOT invent ad-hoc stream names for v1
  • adapters SHOULD publish live semantic data on value, not split it across separate state and event streams
  • adapters SHOULD deduplicate hot value publications when consecutive samples are semantically identical
  • adapters SHOULD update retained last whenever the latest observed sample timestamp changes
  • legacy state and event topics SHOULD be treated as compatibility-only during migration and SHOULD NOT be introduced by new adapters
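The interplay between deduplicated value publications and the retained last sample can be sketched as follows. This is one possible reading of the rules above, not a reference implementation: the class name `StreamPublisher` is hypothetical, "semantically identical" is approximated by plain equality, and each publication is represented as a `(topic, payload, retain)` tuple rather than an actual MQTT call.

```python
from typing import Any, Optional


class StreamPublisher:
    """Decides which of value/last to publish for one topic family."""

    def __init__(self, base_topic: str):
        self.base_topic = base_topic
        self._seen = False
        self._last_value: Any = None
        self._last_observed_at: Optional[str] = None

    def on_sample(self, value: Any, observed_at: str) -> list:
        out = []
        # Hot path: skip value when the sample equals the previous one.
        if not self._seen or value != self._last_value:
            out.append((f"{self.base_topic}/value", value, False))
        # Retained last: refresh whenever the observation timestamp changes.
        if not self._seen or observed_at != self._last_observed_at:
            envelope = {"value": value, "observed_at": observed_at}
            out.append((f"{self.base_topic}/last", envelope, True))
        self._seen = True
        self._last_value = value
        self._last_observed_at = observed_at
        return out
```

With this shape, a sensor that keeps reporting an unchanged reading produces no value traffic but still refreshes last, so late joiners can judge freshness.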

If a future need appears for diagnostics, replay, dead-letter handling, or adapter metrics, it MUST be modeled under <site>/sys/..., not by extending semantic bus streams.


Topic Naming Rules

Common naming rules:

  • identifiers representing physical or logical entities SHOULD use kebab-case
  • capability and metric names SHOULD use snake_case
  • topic segments MUST be ASCII lowercase
  • spaces MUST NOT appear in canonical topics
  • vendor-native identifiers MUST NOT leak into semantic topics unless they are the chosen canonical identifier

Identity rules:

  • canonical IDs MUST be stable across adapter rewrites
  • replacing a physical sensor SHOULD NOT force a canonical ID change if the logical endpoint remains the same
  • vendor IDs SHOULD be carried in meta.source_ref, not in the semantic topic path

Lightweight Bus Requirement

The semantic MQTT bus is a high-efficiency event bus.

It is NOT:

  • a debugging interface
  • a transport for rich adapter envelopes
  • a place to repeat internal mapping state on every sample

Normative rules:

  • semantic bus publications MUST remain minimal and MQTT-ready
  • the canonical publication boundary is the MQTT topic plus payload, with QoS and retain determined by stream policy
  • adapters MAY build richer internal normalization objects, but those objects MUST be discarded before publish
  • adapter-specific fields such as mapping tables, vendor payload snapshots, or internal processing context MUST NOT travel on semantic bus hot paths

Canonical publication form:

  • topic: <site>/<bus>/...
  • payload: scalar or small JSON envelope
  • qos: 0 or 1
  • retain: according to stream policy

Discouraged adapter-side publish shape:

{
  "topic": "vad/home/bedroom/temperature/bedroom-sensor/value",
  "payload": 23.6,
  "homeBus": {
    "location": "bedroom"
  },
  "z2mPayload": {
    "temperature": 23.6,
    "battery": 91
  },
  "mapping": {
    "source_field": "temperature"
  }
}

Those structures may exist inside normalization logic, but MUST be stripped before the MQTT publish boundary.
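A minimal sketch of that stripping step is shown below. It reduces a rich internal object to the canonical publication form and derives QoS and retain from the final topic segment, using the shared defaults listed under Delivery Policy. The function name `to_publication` and the dict-based message shape are assumptions made for illustration.

```python
# Shared delivery defaults from this contract: stream -> (qos, retain).
STREAM_POLICY = {
    "value": (1, False),
    "last": (1, True),
    "set": (1, False),
    "meta": (1, True),
    "availability": (1, True),
}


def to_publication(internal: dict) -> dict:
    """Keep only topic and payload; everything else stays inside the adapter."""
    topic = internal["topic"]
    stream = topic.rsplit("/", 1)[-1]
    if stream not in STREAM_POLICY:
        raise ValueError(f"not a shared stream name: {stream!r}")
    qos, retain = STREAM_POLICY[stream]
    return {
        "topic": topic,
        "payload": internal["payload"],
        "qos": qos,
        "retain": retain,
    }
```

Building a fresh dict, rather than deleting keys from the internal object, means an adapter-specific field added later cannot leak onto the bus by accident.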

Reason:

  • the architecture prioritizes low CPU usage, low memory footprint, predictable Node-RED execution, and compatibility with constrained IoT accessories
  • large message objects increase memory pressure
  • large nested objects increase garbage collection cost in Node-RED
  • constrained accessories, SBCs, and thin VMs benefit from structurally simple bus traffic

Payload Profiles

Two payload profiles are supported across all buses.

Profile A: Scalar Payload

This is the default profile for hot paths.

Examples:

  • 23.6
  • 41
  • true
  • on

Profile A requirements:

  • the payload MUST be a scalar: number, boolean, or short string enum
  • units and metadata MUST be published on retained meta
  • Profile A carries no timestamp, so it MUST NOT be used when exact source observation time matters; use it only where broker receive time or ingestion time is an acceptable substitute
  • Profile A SHOULD be the default for high-rate telemetry and hot value streams
  • Profile A is the preferred format for live value streams consumed by lightweight clients

Profile A historian rule:

  • historian workers SHOULD use ingestion time as observed_at unless an equivalent timestamp is provided out of band

Profile B: Envelope JSON

This profile is used when timestamp, quality, or extra annotations must travel with each sample.

Canonical shape:

{
  "value": 23.6,
  "unit": "C",
  "observed_at": "2026-03-08T10:15:12Z",
  "quality": "good"
}

Optional fields:

  • published_at: adapter publish time
  • source_seq: source-side monotonic counter or sequence id
  • annotations: free-form object for low-rate streams

Profile B requirements:

  • value is REQUIRED
  • observed_at SHOULD be included when the source provides a timestamp
  • quality SHOULD be included if the adapter had to estimate or degrade data
  • unit MAY be omitted for unitless, boolean, or enum values

Use Profile B when:

  • source timestamp must be preserved
  • historian ordering must follow source time, not broker receive time
  • per-sample quality matters
  • the stream is low-rate enough that JSON overhead is acceptable
  • a retained last sample is used for startup decisions and consumers must evaluate whether it is still usable

Profile B restrictions:

  • adapters SHOULD avoid envelope JSON on high-rate streams unless there is no acceptable scalar alternative
  • Profile B MUST remain a small canonical envelope and MUST NOT be extended into a general-purpose transport for adapter internals
  • repeated metadata belongs on retained meta, not inside every value sample
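On the consumer side, the two profiles can be told apart from the payload alone. The sketch below is one way to do this, under the assumption that bare enum strings such as on arrive unquoted (and therefore are not valid JSON) and that any JSON object is a Profile B envelope. The function name `decode_payload` is hypothetical.

```python
import json


def decode_payload(raw: bytes) -> tuple:
    """Return (profile, sample) where sample always has a 'value' key."""
    text = raw.decode("utf-8").strip()
    if not text:
        # A zero-byte payload clears a retained topic; it is not a sample.
        raise ValueError("empty payload is not a sample")
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Unquoted enum strings such as on/off are kept as plain text.
        return ("scalar", {"value": text})
    if isinstance(parsed, dict):
        if "value" not in parsed:
            raise ValueError("Profile B envelope requires 'value'")
        return ("envelope", parsed)
    if isinstance(parsed, (bool, int, float, str)):
        return ("scalar", {"value": parsed})
    raise ValueError("payload is neither a scalar nor an envelope")
```

Normalizing both profiles to a dict with a value key lets downstream code treat them uniformly while still knowing which profile was on the wire.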

Meta Contract

Each retained meta topic describes the sibling value and last stream family.

Minimum recommended shape:

{
  "schema_ref": "mqbus.home.v1",
  "payload_profile": "scalar",
  "data_type": "number",
  "unit": "C",
  "adapter_id": "z2m-main",
  "source": "zigbee2mqtt",
  "source_ref": "0x00158d0008aa1111",
  "source_topic": "zigbee2mqtt/bedroom_sensor",
  "precision": 0.1,
  "historian": {
    "enabled": true,
    "mode": "sample"
  }
}

Recommended fields:

  • schema_ref: stable schema identifier such as mqbus.home.v1
  • payload_profile: scalar or envelope
  • data_type: number, boolean, string, or json
  • unit: canonical engineering unit when applicable
  • adapter_id: canonical adapter instance id
  • source: source system such as zigbee2mqtt, modbus, snmp
  • source_ref: vendor or physical device identifier
  • source_topic: original inbound topic or equivalent source path
  • precision: numeric precision hint
  • display_name: human-readable label
  • tags: optional list for analytics and discovery
  • historian: ingestion policy object

Historian metadata contract:

  • historian.enabled: boolean
  • historian.mode: one of sample, state, event, ignore
  • historian.retention_class: optional storage class such as short, default, long
  • historian.sample_period_hint_s: optional expected cadence

Rules:

  • meta SHOULD be published before live value traffic for new streams
  • meta updates MUST remain backward-compatible for existing consumers during v1
  • consumers MUST tolerate missing meta and continue with degraded defaults
  • repeated descriptive metadata MUST be published on retained meta, not repeated on each hot-path value publication

Example:

  • vad/home/bedroom/temperature/bedroom-sensor/meta -> {"unit":"C","precision":0.1,"adapter_id":"z2m-main"}
  • vad/home/bedroom/temperature/bedroom-sensor/value -> 23.6
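Because consumers must tolerate missing meta, a small cache with fallback values is a natural fit. The following sketch assumes a hypothetical `MetaCache` class and a hypothetical `DEFAULT_META` set of degraded defaults; the contract requires tolerance but does not prescribe these particular values.

```python
import json

# Hypothetical degraded defaults used when no retained meta has been seen.
DEFAULT_META = {
    "payload_profile": "scalar",
    "unit": None,
    "historian": {"enabled": True, "mode": "sample"},
}


class MetaCache:
    """Caches retained meta per topic family (topic without its stream segment)."""

    def __init__(self):
        self._by_family = {}

    def on_meta(self, topic: str, raw: bytes) -> None:
        family = topic.rsplit("/", 1)[0]
        if not raw:
            # Zero-byte retained message: the meta topic was cleared.
            self._by_family.pop(family, None)
            return
        self._by_family[family] = json.loads(raw)

    def lookup(self, topic: str) -> dict:
        """Return meta for any sibling topic, falling back to defaults."""
        family = topic.rsplit("/", 1)[0]
        return {**DEFAULT_META, **self._by_family.get(family, {})}
```

The merge is shallow, so a meta message that carries its own historian object replaces the default object as a whole.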

Time Semantics

The following timestamps are distinct:

  • observed_at: when the source system observed or measured the value
  • published_at: when the adapter normalized and published the message
  • ingested_at: when the downstream worker processed the message

Rules:

  • if the source provides a trustworthy timestamp, adapters SHOULD preserve it as observed_at
  • if the source does not provide a timestamp, adapters MAY omit observed_at
  • if observed_at is omitted, historian workers SHOULD use ingested_at
  • adapters MUST NOT fabricate source time and mark it as fully trustworthy
  • if an adapter estimates time, it SHOULD use Profile B with quality=estimated

These rules capture the key tradeoff between low-overhead scalar payloads and strict time fidelity: a scalar payload carries no observed_at, so its time resolves to ingested_at.
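On the adapter side, the time rules could be applied when building a Profile B envelope, roughly as sketched below. The function name `build_envelope` and its parameters are hypothetical, and both datetimes are assumed to be timezone-aware.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def _iso_utc(ts: datetime) -> str:
    # Assumes a timezone-aware datetime; naive values would be read as local time.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_envelope(
    value: Any,
    source_time: Optional[datetime] = None,
    estimated_time: Optional[datetime] = None,
) -> dict:
    """Attach observed_at only when there is a time to report honestly."""
    envelope = {"value": value}
    if source_time is not None:
        # Trustworthy source timestamp: preserve it as observed_at.
        envelope["observed_at"] = _iso_utc(source_time)
    elif estimated_time is not None:
        # Adapter-estimated time must be flagged, never passed off as source time.
        envelope["observed_at"] = _iso_utc(estimated_time)
        envelope["quality"] = "estimated"
    # Otherwise observed_at is omitted and the historian falls back to ingested_at.
    return envelope
```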


Quality Model

The following quality values are recommended:

  • good: trusted value from source
  • estimated: value or timestamp estimated by adapter
  • degraded: source known to be unstable or partially invalid
  • stale: source not updated within expected cadence
  • invalid: malformed or failed validation

Rules:

  • quality MAY be omitted only when good is implied
  • invalid payloads SHOULD NOT be emitted on semantic bus topics
  • invalid or unmappable messages SHOULD be routed to operational error topics under sys
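The routing rule for invalid samples might look like the sketch below. The function name `route_sample` and the structure of the error payload are assumptions for illustration only; the authoritative shape of operational messages is left to sys_bus.md.

```python
# Quality values recommended by this contract.
QUALITIES = {"good", "estimated", "degraded", "stale", "invalid"}


def route_sample(site: str, adapter_id: str, topic: str, envelope: dict) -> tuple:
    """Return (topic, payload), diverting invalid samples to the sys namespace."""
    quality = envelope.get("quality", "good")  # omitted quality implies good
    if quality == "invalid" or quality not in QUALITIES:
        # Hypothetical error payload shape; see sys_bus.md for the real one.
        error = {
            "reason": "invalid_sample",
            "semantic_topic": topic,
            "quality": quality,
        }
        return (f"{site}/sys/adapter/{adapter_id}/error", error)
    return (topic, envelope)
```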

Delivery Policy

Shared defaults:

  • value: QoS 1, retain false unless a bus-specific contract explicitly requires otherwise
  • last: QoS 1, retain true
  • set: QoS 1, retain false
  • meta: QoS 1, retain true
  • availability: QoS 1, retain true

Additional rules:

  • value is the live stream and SHOULD remain lightweight
  • last is the cold-start bootstrap mechanism for latest known measurements on the semantic bus
  • last SHOULD use Profile B and include observed_at
  • consumers MUST treat a retained last sample as the latest known observation, not as proof of freshness
  • if freshness matters, consumers SHOULD evaluate observed_at, availability, and expected cadence from meta
  • adapters MAY deduplicate value publications, but last SHOULD be updated whenever the latest observed sample timestamp changes
  • command acknowledgements SHOULD be emitted on normalized value, not by retaining set
  • late joiners MUST be able to reconstruct stream meaning and last known sample from retained meta, retained last, and retained availability
  • consumers that require deterministic retained bootstrap SHOULD use a dedicated MQTT client session rather than sharing a session with unrelated subscribers on the same broker config
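A consumer-side freshness check over a retained last sample could be sketched as follows. Several details are assumptions rather than contract: the function name `is_fresh`, the availability payload being the literal string `online`, the use of `historian.sample_period_hint_s` as the expected cadence, and the `grace_factor` multiplier.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(
    last_envelope: dict,
    meta: dict,
    availability: str,
    now: datetime,
    grace_factor: float = 2.0,  # hypothetical tolerance on the expected cadence
) -> bool:
    """Treat retained last as usable only if all three signals agree."""
    if availability != "online":
        return False
    observed_raw = last_envelope.get("observed_at")
    period = meta.get("historian", {}).get("sample_period_hint_s")
    if observed_raw is None or period is None:
        # Without a timestamp or cadence, freshness cannot be established.
        return False
    observed = datetime.strptime(observed_raw, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return now - observed <= timedelta(seconds=period * grace_factor)
```

Returning False when information is missing is the conservative choice; a consumer that prefers availability over strictness could use the sample anyway and mark it as stale.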

Command Envelope Guidance

Simple set commands may use scalar payloads:

  • on
  • off
  • 21.5

If correlation or richer semantics are required, a JSON envelope is allowed:

{
  "value": "on",
  "request_id": "01HRN8KZQ2D7P0S4M6B4CJ3M8Y",
  "requested_at": "2026-03-08T10:20:00Z"
}

Rules:

  • command topics MUST remain bus-specific and capability-specific
  • acknowledgements SHOULD be published separately on normalized value
  • adapters SHOULD avoid command-side business logic
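An adapter accepting both command forms might normalize them as in the sketch below. The function name `parse_command` and the returned dict shape are hypothetical; only the value, request_id, and requested_at fields come from the envelope example above.

```python
import json


def parse_command(raw: bytes) -> dict:
    """Normalize a scalar or JSON-envelope set payload to one shape."""
    text = raw.decode("utf-8").strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = text  # unquoted enum such as on/off
    if isinstance(parsed, dict):
        if "value" not in parsed:
            raise ValueError("command envelope requires 'value'")
        return {
            "value": parsed["value"],
            "request_id": parsed.get("request_id"),
            "requested_at": parsed.get("requested_at"),
        }
    # Scalar command: no correlation data is available.
    return {"value": parsed, "request_id": None, "requested_at": None}
```

Carrying request_id through to the acknowledgement published on value is one way to support correlation, though the contract does not mandate it.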

Operational Namespace

Operational topics are for adapter health, replay control, and malformed input handling.

Detailed operational namespace rules are defined in sys_bus.md.

Recommended topics:

  • <site>/sys/adapter/<adapter_id>/availability
  • <site>/sys/adapter/<adapter_id>/stats
  • <site>/sys/adapter/<adapter_id>/error
  • <site>/sys/adapter/<adapter_id>/dlq

Recommended uses:

  • availability: retained adapter liveness
  • stats: low-rate counters such as published points or dropped messages
  • error: structured adapter errors that deserve operator attention
  • dlq: dead-letter payloads for messages that could not be normalized

This keeps operational concerns separate from semantic buses and avoids polluting historian input.

Debugging, replay diagnostics, and adapter-internal observability MUST be published under <site>/sys/..., not embedded into semantic bus payloads.


Retained Message Lifecycle

Retained topics are part of the contract and require explicit lifecycle handling.

Rules:

  • meta, last, and availability SHOULD be retained when they represent current truth
  • when a retained topic must be deleted, publish a zero-byte retained message to the same topic
  • adapters SHOULD clear retained topics when an entity is decommissioned or renamed
  • consumers MUST tolerate retained data arriving before or after live traffic
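Clearing the retained topics of a decommissioned or renamed entity amounts to publishing an empty retained payload on each of them. The sketch below only builds the publications; the function name `clear_retained` is hypothetical, and handing the result to an MQTT client is left to the adapter.

```python
# Streams that the contract expects to be retained.
RETAINED_STREAMS = ("meta", "last", "availability")


def clear_retained(base_topic: str) -> list:
    """Build zero-byte retained publications for one topic family."""
    return [
        {"topic": f"{base_topic}/{stream}", "payload": b"", "qos": 1, "retain": True}
        for stream in RETAINED_STREAMS
    ]
```

For a rename, the adapter would run this against the old base topic after publishing meta and last under the new one, so late joiners never see both identities at once for long.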

Historian Ingestion Defaults

Historian workers SHOULD apply the following defaults:

  • ingest value streams by default
  • interpret meta.historian.mode as the semantic category of the value stream, for example sample, state, or event
  • ignore last, set, meta, and availability as time-series samples

Current PostgreSQL historian compatibility:

  • numeric and boolean samples are directly compatible with tdb_ingestion/mqtt_ingestion_api.md
  • string enum states are valid on the semantic bus, but SHOULD stay out of historian until an explicit encoding policy exists
  • counter-style cumulative metrics such as energy_total, *_bytes_total, and *_packets_total are valid bus metrics, but the current measurement API does not define their storage semantics; see tdb_ingestion/counter_ingestion_api.md
  • if enum state ingestion is needed, the worker MUST map it to an agreed numeric or boolean representation before calling PostgreSQL

Default field mapping:

  • value from payload or envelope
  • observed_at from envelope if present, otherwise ingestion time
  • unit from envelope if present, otherwise cached retained meta.unit
  • quality from envelope if present, otherwise good

This allows historian workers to stay generic while bus contracts remain strict.
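The ingestion defaults and field mapping above can be sketched as a single function. The name `to_historian_row` and the returned row shape are assumptions; the actual ingestion interface is defined in tdb_ingestion/mqtt_ingestion_api.md and is not reproduced here. The `ingested_at` argument is assumed to be timezone-aware.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def to_historian_row(
    topic: str, sample: Any, meta: dict, ingested_at: datetime
) -> Optional[dict]:
    """Map a decoded payload to a historian row, or None if it is skipped."""
    if topic.rsplit("/", 1)[-1] != "value":
        return None  # last, set, meta, availability are not time-series samples
    historian = meta.get("historian", {})
    if not historian.get("enabled", True) or historian.get("mode") == "ignore":
        return None
    envelope = sample if isinstance(sample, dict) else {"value": sample}
    value = envelope["value"]
    if isinstance(value, str):
        return None  # string enums stay out until an encoding policy exists
    fallback_time = ingested_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "topic": topic,
        "value": value,
        "observed_at": envelope.get("observed_at") or fallback_time,
        "unit": envelope.get("unit", meta.get("unit")),
        "quality": envelope.get("quality", "good"),
    }
```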