Copilot Instructions for Madagascar Thunderbolts & Backups
Big Picture Architecture
- The codebase manages high-MTU Thunderbolt networking and automated backups for a Proxmox cluster (baobab, ebony, tapia).
- Networking: Early-boot systemd/udev units create and maintain the thunderbridge bridge (MTU 65520), hotplug Thunderbolt NICs, and ensure persistent bridge membership.
- Backups: Autonomous agent scripts (in backups/) discover VMs, run scheduled backups, and log lifecycle events.
- All node, network, and backup config is centralized in cluster/madagascar.json.
Critical Developer Workflows
- Network Deploy:
  - Run deploy/attempt1/deploy_tb.sh from its directory to push configs and services to all nodes.
  - Validate with scripts/check_thunderbridge.sh (checks bridge ports, MTU, and cluster connectivity).
- Backup Deploy:
  - Use backups/scripts/deploy_to_nodes.sh to install backup agents on all nodes.
  - Backup agent lifecycle is managed by systemd timers (backup_agent.timer).
- Issue Tracking:
  - All issues are documented in issues/ using TEMPLATE.md.
  - Every fix/change must be referenced in CHANGELOG.md.
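The validation step can be approximated by hand when the health-check script is unavailable. The sketch below is not the contents of scripts/check_thunderbridge.sh; it is a minimal illustration that reads the bridge MTU and port count from sysfs. The function names and the overridable SYSFS_NET variable are assumptions introduced here so the logic can be exercised without Thunderbolt hardware.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; NOT the contents of scripts/check_thunderbridge.sh.
# SYSFS_NET is overridable so the functions can be tested without real hardware.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"
EXPECTED_MTU=65520   # bridge MTU stated in this document

# Print OK/FAIL for the bridge MTU; return non-zero on mismatch or missing bridge.
check_bridge_mtu() {
  local bridge="$1"
  local mtu_file="$SYSFS_NET/$bridge/mtu"
  local mtu
  if [ ! -r "$mtu_file" ]; then
    echo "FAIL: $bridge not found"
    return 1
  fi
  mtu="$(cat "$mtu_file")"
  if [ "$mtu" -eq "$EXPECTED_MTU" ]; then
    echo "OK: $bridge MTU is $mtu"
  else
    echo "FAIL: $bridge MTU is $mtu, expected $EXPECTED_MTU"
    return 1
  fi
}

# Count the interfaces currently attached to the bridge (entries under brif/).
count_bridge_ports() {
  local bridge="$1"
  local brif="$SYSFS_NET/$bridge/brif"
  [ -d "$brif" ] || { echo 0; return 1; }
  find "$brif" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}
```

On a node, `check_bridge_mtu thunderbridge && count_bridge_ports thunderbridge` gives a quick local answer; it does not cover the cluster connectivity check that the real script performs.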
Project-Specific Conventions
- Network config:
  - Node-specific overlays in deploy/attempt1/<node>/etc/network/interfaces.d/10-thunderbolt.
  - Shared systemd/udev units in deploy/attempt1/common/.
  - Always use post-up hooks for bridge membership and MTU persistence.
- SSH Automation:
  - Scripts use -o LogLevel=ERROR to suppress known-hosts warnings.
  - Management and Thunderbolt IPs are set in the deploy scripts; update the deploy script helpers when adding nodes.
- Versioning:
  - New network designs go in new attemptN folders for reproducibility.
- Backups:
  - All backup config and manifests reference madagascar.json for node/IP discovery.
  - Backup agent logs lifecycle events and changes in madagascar-changelog.json (if present).
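The SSH convention above can be captured in one small wrapper so every script passes the same options. This is a hedged sketch, not code from the repository: only `-o LogLevel=ERROR` comes from this document, while `run_on_node`, `SSH_BIN`, the extra options, and the `root` remote user are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper; everything except LogLevel=ERROR is an assumption.
SSH_BIN="${SSH_BIN:-ssh}"   # overridable so the wrapper can be dry-run
SSH_OPTS=(
  -o LogLevel=ERROR     # from this document: hides known-hosts warnings
  -o BatchMode=yes      # assumption: fail instead of prompting for a password
  -o ConnectTimeout=5   # assumption: do not hang on an unreachable node
)

# Run a command on one node. The root remote user is an assumption.
run_on_node() {
  local node="$1"
  shift
  "$SSH_BIN" "${SSH_OPTS[@]}" "root@${node}" "$@"
}
```

For example, `run_on_node baobab uptime` runs `uptime` on baobab with the shared options. Keeping the options in one array means a change to the convention is made in a single place.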
Integration Points & Data Flows
- Network:
  - Systemd/udev units interact via device events; enlist services attach NICs to the bridge.
  - Deploy script pushes all config and reloads services atomically.
- Backups:
  - Agent scripts SSH into nodes, discover VMs, and run backups using Proxmox CLI.
  - Results and metadata are logged for auditability.
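The backup data flow (discover VMs, then back each one up) can be sketched with the standard Proxmox commands `qm list` and `vzdump`. This is an illustration under assumptions, not the agent's actual code: the function names are hypothetical, the storage name is a placeholder, and the sketch only prints the commands it would run. It also covers QEMU VMs only; containers would need `pct list`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the discover-then-back-up flow; not the agent's real code.

# Extract VMIDs from `qm list` output on stdin (skips the header row).
parse_vmids() {
  awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Print (do not run) one vzdump command per VMID read from stdin.
# The storage name passed as $1 is an assumption supplied by the caller.
plan_backups() {
  local storage="$1"
  local vmid
  while read -r vmid; do
    echo "vzdump $vmid --mode snapshot --storage $storage"
  done
}
```

A dry run against one node might look like `ssh -o LogLevel=ERROR root@baobab qm list | parse_vmids | plan_backups backup-store`, where `backup-store` is a placeholder storage ID. Printing the plan before executing it makes each run easy to log and audit.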
Key Files & Directories
deploy/attempt1/deploy_tb.sh: Main network deploy script
deploy/attempt1/common/: Shared systemd/udev units
deploy/attempt1/<node>/etc/network/interfaces.d/10-thunderbolt: Node overlays
scripts/check_thunderbridge.sh: Cluster network health check
cluster/madagascar.json: Canonical node/network/backup config
backups/: Backup agent, deployment, and documentation
issues/: Issue tracker
CHANGELOG.md: Change log
Example Patterns
- To add a node: copy an existing node directory, update IPs, extend deploy script helpers.
- To troubleshoot: check systemd unit status, bridge membership, and kernel logs.
- To automate: use provided scripts, keep configs in sync with madagascar.json, and document all changes.
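The "add a node" pattern can be sketched as a copy followed by an IP substitution. The helper below is hypothetical (its name and arguments are assumptions, and it relies on GNU `sed -i`); it does not extend the deploy script helpers or update madagascar.json, which still has to be done by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper for the "add a node" pattern; names and arguments are assumptions.

# Copy an existing node overlay and swap the old IP for the new one in every file.
# Usage: clone_node_overlay <attempt_dir> <src_node> <new_node> <old_ip> <new_ip>
clone_node_overlay() {
  local attempt_dir="$1" src_node="$2" new_node="$3" old_ip="$4" new_ip="$5"
  local src="$attempt_dir/$src_node"
  local dst="$attempt_dir/$new_node"
  local old_re file

  [ -d "$src" ] || { echo "no such node directory: $src" >&2; return 1; }
  [ ! -e "$dst" ] || { echo "refusing to overwrite: $dst" >&2; return 1; }
  cp -a "$src" "$dst"

  # Escape dots so the address is matched literally, not as a regex wildcard.
  old_re="$(printf '%s' "$old_ip" | sed 's/\./\\./g')"

  # Rewrite only files that contain the old address.
  # Caveat: a prefix such as 10.0.0.1 also matches inside 10.0.0.10; review the diff.
  grep -rlF -- "$old_ip" "$dst" | while read -r file; do
    sed -i "s/${old_re}/${new_ip}/g" "$file"
  done
}
```

Called as `clone_node_overlay deploy/attempt1 baobab <new-node> <old-ip> <new-ip>`, it leaves the source overlay untouched and refuses to overwrite an existing directory, so a mistaken node name fails loudly instead of clobbering config.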
For questions or unclear conventions, review README.md and issue templates, or ask for clarification in the issue tracker.