Sys Operational Namespace

Purpose

This document defines the shared operational MQTT namespace rooted at <site>/sys/.

It is used for adapter, worker, bridge, and infrastructure observability.

sys is not a semantic bus. It exists to expose component health, counters, errors, and rejected messages without polluting domain buses such as home or energy.


Namespace Model

Canonical shape:

<site>/sys/<producer_kind>/<instance_id>/<stream>

Examples:

  • vad/sys/adapter/z2m-zg-204zv/availability
  • vad/sys/adapter/z2m-zg-204zv/stats
  • vad/sys/historian/main/error
  • vad/sys/historian/main/dlq

Rules:

  • <site> MUST be stable lowercase kebab-case
  • <producer_kind> identifies the emitting software component, not a device capability
  • <instance_id> MUST identify a stable logical instance of that component
  • sys MUST NOT be used for room, device, or capability telemetry
  • semantic topics such as <site>/home/... and <site>/energy/... MUST remain separate from sys

Common producer_kind values in v1:

  • adapter
  • historian

Additional producer kinds MAY be introduced later, but they SHOULD follow the same operational topic model.
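
The canonical shape and rules above can be checked mechanically. The following is a minimal sketch, not part of the contract: the helper names (build_sys_topic, parse_sys_topic, SysTopic) are hypothetical, and applying the kebab-case pattern to <producer_kind> and <instance_id> is an assumption, since this document only requires it for <site>.

```python
import re
from typing import NamedTuple, Optional

# Lowercase kebab-case, as required for <site>. Reusing the same pattern for
# <producer_kind> and <instance_id> is an assumption of this sketch.
_KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# The four v1 streams listed in this document.
V1_STREAMS = frozenset({"availability", "stats", "error", "dlq"})


class SysTopic(NamedTuple):
    site: str
    producer_kind: str
    instance_id: str
    stream: str


def build_sys_topic(site: str, producer_kind: str, instance_id: str, stream: str) -> str:
    """Build <site>/sys/<producer_kind>/<instance_id>/<stream>, validating each part."""
    for name, value in (
        ("site", site),
        ("producer_kind", producer_kind),
        ("instance_id", instance_id),
    ):
        if not _KEBAB.fullmatch(value):
            raise ValueError(f"{name} must be lowercase kebab-case: {value!r}")
    if stream not in V1_STREAMS:
        raise ValueError(f"unknown sys stream: {stream!r}")
    return f"{site}/sys/{producer_kind}/{instance_id}/{stream}"


def parse_sys_topic(topic: str) -> Optional[SysTopic]:
    """Return the parsed topic, or None if it is not a v1 sys topic."""
    parts = topic.split("/")
    if len(parts) != 5 or parts[1] != "sys" or parts[4] not in V1_STREAMS:
        return None
    return SysTopic(parts[0], parts[2], parts[3], parts[4])
```

For example, build_sys_topic("vad", "historian", "main", "dlq") yields vad/sys/historian/main/dlq, while parse_sys_topic returns None for a semantic topic such as vad/home/balcon/motion/south/availability.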


Streams

The sys namespace uses operational streams, not the semantic stream taxonomy from domain buses.

Supported v1 streams:

  • availability
  • stats
  • error
  • dlq

availability

Meaning:

  • liveness of the emitting component instance
  • whether the adapter or worker itself is online

Typical payloads:

  • online
  • offline
  • optionally degraded for components that expose a degraded mode

Policy:

  • QoS 1
  • retain true
  • SHOULD use LWT when supported by the MQTT client
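
The availability policy can be captured as plain data before it is handed to an MQTT client. The sketch below is illustrative only: the function names are hypothetical, and no client library is imported. The dictionary keys (topic, payload, qos, retain) were chosen to mirror the keyword arguments of paho-mqtt's Client.will_set and Client.publish, which is an assumption about the client in use.

```python
ALLOWED_STATES = ("online", "offline", "degraded")


def availability_will(site: str, producer_kind: str, instance_id: str) -> dict:
    """Last-will parameters: the broker publishes these if the component
    disconnects without saying goodbye."""
    return {
        "topic": f"{site}/sys/{producer_kind}/{instance_id}/availability",
        "payload": "offline",
        "qos": 1,        # policy: QoS 1
        "retain": True,  # policy: retain true
    }


def availability_message(will: dict, state: str) -> dict:
    """Same topic and policy as the will, with an explicit state payload."""
    if state not in ALLOWED_STATES:
        raise ValueError(f"unsupported availability state: {state!r}")
    return {**will, "payload": state}
```

A component would typically register the will before connecting, publish the "online" message once connected, and publish "offline" explicitly on clean shutdown.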

stats

Meaning:

  • low-rate counters or snapshots describing the current operational state of the component

Typical payload shape:

{
  "processed_inputs": 1824,
  "translated_messages": 9241,
  "errors": 2,
  "dlq": 1
}

Policy:

  • QoS 1
  • retain true when the message represents the latest snapshot
  • publish at a low rate, not on every hot-path sample
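
One way to keep stats off the hot path is to increment counters in memory and emit a snapshot only when an interval has elapsed. This is a sketch under assumptions: the StatsReporter class, its method names, and the 30-second default interval are hypothetical and not defined by this document.

```python
import json
import time
from collections import Counter
from typing import Callable, Optional


class StatsReporter:
    """Accumulates counters cheaply and emits a JSON snapshot at a low rate."""

    def __init__(self, interval_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._counters: Counter = Counter()
        self._interval_s = interval_s
        self._clock = clock
        self._last_emit: Optional[float] = None

    def incr(self, name: str, amount: int = 1) -> None:
        # Hot path: only an in-memory increment, no publish.
        self._counters[name] += amount

    def snapshot_if_due(self) -> Optional[str]:
        """Return a compact JSON snapshot if the interval elapsed, else None."""
        now = self._clock()
        if self._last_emit is not None and now - self._last_emit < self._interval_s:
            return None
        self._last_emit = now
        return json.dumps(dict(self._counters), sort_keys=True, separators=(",", ":"))
```

The returned string would be published to the stats topic with QoS 1 and retain true, since each snapshot supersedes the previous one.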

error

Meaning:

  • structured operator-visible errors
  • faults that deserve attention but do not require embedding diagnostics into semantic bus payloads

Typical payload shape:

{
  "code": "invalid_topic",
  "reason": "Topic must start with zigbee2mqtt/ZG-204ZV",
  "source_topic": "zigbee2mqtt/bad/topic",
  "adapter_id": "z2m-zg-204zv"
}

Policy:

  • QoS 1
  • retain false by default
  • SHOULD remain structured and compact
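
A small helper can keep error payloads structured and compact. The sketch below reuses the field names from the payload shape above; the helper name and the 200-character cap on reason are assumptions made for illustration.

```python
import json

# Hypothetical cap that keeps the payload compact; not defined by this document.
MAX_REASON_CHARS = 200


def error_payload(code: str, reason: str, source_topic: str, adapter_id: str) -> str:
    """Serialize an operator-visible error as compact JSON."""
    body = {
        "code": code,
        "reason": reason[:MAX_REASON_CHARS],
        "source_topic": source_topic,
        "adapter_id": adapter_id,
    }
    # Compact separators avoid padding whitespace in the serialized message.
    return json.dumps(body, separators=(",", ":"))
```

The result would be published to the error stream with QoS 1 and retain false, so a late subscriber does not receive a stale fault as if it were current.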

dlq

Meaning:

  • dead-letter payloads for messages that could not be normalized or safely processed

Typical payload shape:

{
  "code": "payload_not_object",
  "source_topic": "zigbee2mqtt/ZG-204ZV/vad/balcon/south",
  "payload": "offline"
}

Policy:

  • QoS 1
  • retain false
  • SHOULD carry enough context to reproduce or inspect the rejected message
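
The dead-letter path can be sketched as a normalization step that either yields a JSON object or produces a dlq payload carrying the original message. The function names are hypothetical, and using the single code payload_not_object for both unparseable and non-object input is an assumption modeled on the example above.

```python
import json
from typing import Tuple, Union


def dlq_payload(code: str, source_topic: str, raw: bytes) -> str:
    """Wrap a rejected message with enough context to inspect it later."""
    # Undecodable bytes are replaced rather than raising, so the dlq path
    # itself cannot fail on bad input.
    text = raw.decode("utf-8", errors="replace")
    body = {"code": code, "source_topic": source_topic, "payload": text}
    return json.dumps(body, separators=(",", ":"))


def normalize_or_dlq(source_topic: str, raw: bytes) -> Tuple[str, Union[dict, str]]:
    """Return ("ok", parsed_object) or ("dlq", dlq_payload_json)."""
    try:
        obj = json.loads(raw)
    except ValueError:  # covers JSON syntax errors and invalid UTF-8
        return "dlq", dlq_payload("payload_not_object", source_topic, raw)
    if not isinstance(obj, dict):
        return "dlq", dlq_payload("payload_not_object", source_topic, raw)
    return "ok", obj
```

Feeding the bare string offline from zigbee2mqtt/ZG-204ZV/vad/balcon/south through normalize_or_dlq reproduces the payload shape shown above.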

Availability Semantics

Operational availability and semantic availability are different signals.

Examples:

  • vad/sys/adapter/z2m-zg-204zv/availability means the adapter is running
  • vad/home/balcon/motion/south/availability means the canonical endpoint is available to consumers

These topics are complementary, not duplicates.

Rules:

  • sys/.../availability describes component health
  • semantic .../availability describes endpoint availability on a domain bus
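
To illustrate why the two signals are complementary, a consumer might combine them: an endpoint's availability is only trusted while the component that produces it is running. This is one possible consumer policy, not something this document mandates; the function name and the "unknown" result are assumptions.

```python
from typing import Optional

# States in which the component is considered running. Treating "degraded"
# as running is an assumption of this sketch.
RUNNING_STATES = {"online", "degraded"}


def effective_endpoint_state(component_state: Optional[str],
                             endpoint_state: Optional[str]) -> str:
    """Combine sys availability (component) with semantic availability (endpoint)."""
    if component_state not in RUNNING_STATES:
        # The adapter is down or has never reported: a retained endpoint
        # value may be stale, so do not trust it.
        return "unknown"
    return endpoint_state if endpoint_state is not None else "unknown"
```

With the adapter online and the endpoint reporting offline, the result is "offline"; with the adapter offline, the result is "unknown" regardless of what the endpoint topic last retained.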

Consumer Guidance

  • historians SHOULD NOT treat sys topics as normal semantic measurements by default
  • dashboards, alerting, and operator tooling SHOULD subscribe to sys
  • malformed input, normalization failures, and adapter faults SHOULD be routed to sys, not embedded into semantic payloads
  • consumers SHOULD expect stats to be snapshots and error/dlq to be transient diagnostics
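
The guidance above can be sketched as topic-level routing: operator tooling subscribes with wildcard filters over sys, while a historian skips sys topics by default. The matcher below follows the standard MQTT wildcard rules (+ for one level, # for the remainder) in simplified form; it ignores the special handling of topics beginning with $, and the function names are hypothetical.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT filter match: '+' is one level, '#' is the remainder."""
    f_parts = topic_filter.split("/")
    t_parts = topic.split("/")
    for i, level in enumerate(f_parts):
        if level == "#":
            # '#' is only valid as the last level of a filter.
            return i == len(f_parts) - 1
        if i >= len(t_parts):
            return False
        if level != "+" and level != t_parts[i]:
            return False
    return len(f_parts) == len(t_parts)


def is_sys_topic(topic: str) -> bool:
    """True for topics of the form <site>/sys/..."""
    parts = topic.split("/")
    return len(parts) > 1 and parts[1] == "sys"


def historian_should_store(topic: str) -> bool:
    """Default historian policy: sys topics are not semantic measurements."""
    return not is_sys_topic(topic)
```

A dashboard could subscribe to vad/sys/# for everything operational, or to +/sys/+/+/availability to watch component liveness across sites.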

Relationship to Other Documents

  • shared transport and payload rules: mqtt_contract.md
  • adapter operational responsibilities: adapters.md
  • historian worker operational topics: historian_worker.md
  • semantic endpoint contracts: home_bus.md, energy_bus.md