# Issue ISSUE-2026-002: Planned reboot stalls on NFS storages over thunderbridge before network shutdown

## Issue ID: ISSUE-2026-002

**Status:** investigating  
**Priority:** high  
**Created:** 2026-03-07  
**Updated:** 2026-03-07  
**Assigned to:** unassigned

---

## Summary

A planned reboot of `baobab` spent ~106 seconds in shutdown because Proxmox NFS storages were still mounted after the Thunderbolt ports had already been detached from `thunderbridge`.

---

## Description

During a controlled reboot validation on `baobab`, guest suspend worked correctly, but the host remained reachable over ICMP for almost two minutes after `systemctl reboot`. Journal analysis showed that the Thunderbolt bridge ports were detached early in shutdown, while Proxmox only attempted to unmount NFS storages later. Because `AutoNAS-1` and `AutoNAS-2` are mounted over `192.168.10.x` through `thunderbridge`, the NFS unmount path lost transport and waited for timeout.

The same investigation exposed a second maintenance risk in `pgs`: preflight cleanup could block in kernel I/O wait when it touched remote NFS-backed storages that were stale or temporarily unavailable. This does not cause the slow reboot itself, but it can block the maintenance preparation step.

Follow-up validation on `ebony` showed a different but related cluster behavior: `AutoNAS-1` is currently exported by `ebony` itself. During reboot, `autonas.service` stops early, which leaves the node's own Proxmox NFS client mount for `AutoNAS-1` stale; unmounting it then waits for the timeout. In the same window, VM `301 is-anjohibe` (PBS `anjothibe`) is intentionally suspended by `pgs`, so PBS availability loss is expected during the maintenance window.

Validation on `tapia` showed the same class of topology problem for `AutoNAS-2`, which is locally exported there and mounted back as a Proxmox NFS storage. The AutoNAS shutdown-ordering patch remained active, but reboot timing still stayed near the pre-fix range because `mnt-pve-AutoNAS-2.mount` waited for timeout during shutdown while PBS `andrafiabe-AutoNAS` had already become unreachable.
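
The self-hosted pattern described for `ebony` and `tapia` can be checked mechanically. The following is a minimal Python sketch, not part of `pgs` or `autonas`: the function name is hypothetical, and the caller is assumed to supply the node's own addresses. The server addresses are the ones recorded in this issue.

```python
from ipaddress import ip_address


def self_hosted_storages(nfs_servers, local_addresses):
    """Return the storage IDs whose NFS server address belongs to this node.

    nfs_servers: mapping of storage ID -> NFS server address (string).
    local_addresses: iterable of addresses configured on this node.
    """
    local = {ip_address(addr) for addr in local_addresses}
    return sorted(
        storage_id
        for storage_id, server in nfs_servers.items()
        if ip_address(server) in local
    )


# Server addresses as recorded in this issue.
NFS_SERVERS = {
    "AutoNAS-1": "192.168.10.21",
    "AutoNAS-2": "192.168.10.22",
}
```

For example, assuming a node owns `192.168.10.22`, `self_hosted_storages(NFS_SERVERS, ["192.168.10.22"])` flags `AutoNAS-2` as mounted back onto its own provider.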

---

## Environment

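- **Affected nodes:** `baobab` confirmed (Thunderbolt transport ordering); `ebony` and `tapia` confirmed for the related self-hosted NFS export case; likely all nodes using Proxmox NFS storages over `thunderbridge`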
- **Component:** network + storage + maintenance workflow
- **Version/software:** Proxmox VE 9.1 / kernel `6.17.13-1-pve`, `tb-enlist@.service`, `pgs`

---

## Steps to Reproduce

1. On a node with Proxmox NFS storages routed over `thunderbridge`, run `/usr/local/sbin/pgs suspend -v`.
2. Trigger `systemctl reboot`.
3. Measure ICMP availability during shutdown and boot.
4. Inspect `journalctl -b -1` around the reboot window.
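
Step 3 can be scripted. The sketch below is one possible way to derive the three timing figures from a series of ping probes; it is not the tool used for this issue. The metric definitions are an assumption inferred from the reported values (downtime is first reply minus stop), and the function names are hypothetical. `ping_once` relies on the `-c` and `-W` flags of Linux iputils `ping`.

```python
import subprocess


def ping_once(host, timeout_s=1):
    """Send one ICMP echo request; True if a reply arrived (iputils ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def summarize_probes(samples):
    """Summarize probes taken after the reboot request.

    samples: list of (seconds_since_reboot_request, replied) in time order.
    Returns (time_to_stop, time_to_first_reply, downtime), or None if the
    host never stopped replying or never came back within the samples.
    """
    # First probe that went unanswered: the host has stopped replying.
    stop = next((t for t, replied in samples if not replied), None)
    if stop is None:
        return None
    # First answered probe after the outage began: the host is back.
    back = next((t for t, replied in samples if replied and t > stop), None)
    if back is None:
        return None
    return stop, back, round(back - stop, 3)
```

The resolution of the result is bounded by the probe interval, so sub-second figures need a correspondingly short interval between calls to `ping_once`.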

---

## Expected Behavior

- NFS storages should unmount while Thunderbolt transport is still available.
- Host should stop replying to ICMP shortly after reboot is requested.
- `pgs suspend` should not hang because a remote NFS mount is stale.

---

## Actual Behavior

- First validation on `baobab`:
  - `TIME_TO_STOP_SECONDS 105.852`
  - `TIME_TO_FIRST_REPLY_SECONDS 130.230`
  - `DOWNTIME_SECONDS 24.377`
- Follow-up validation on `ebony`:
  - `TIME_TO_STOP_SECONDS 120.275`
  - `TIME_TO_FIRST_REPLY_SECONDS 145.840`
  - `DOWNTIME_SECONDS 25.565`
- Follow-up validation on `tapia` after cluster-wide AutoNAS rollout:
  - `TIME_TO_STOP_SECONDS 123.285`
  - `TIME_TO_FIRST_REPLY_SECONDS 149.420`
  - `DOWNTIME_SECONDS 26.135`
- `journalctl -b -1` on `baobab` showed:
  - Thunderbolt bridge ports detached at `08:48:17.989`
  - NFS unmount only started at `08:48:30.540`
  - `mnt-pve-AutoNAS-2.mount` and `mnt-pve-AutoNAS-1.mount` timed out at `08:50:00.604` and `08:50:00.605`
- `journalctl -b -1` on `ebony` showed:
  - `autonas.service` stopped at `11:04:22.326`
  - `mnt-pve-AutoNAS-2.mount` unmounted successfully by `11:04:38.693`
  - `mnt-pve-AutoNAS-1.mount` timed out at `11:06:08.679`
  - only after that did `network.target` stop and `tb-enlist@thunderbolt0.service` detach from `thunderbridge`
- A later maintenance attempt also showed `pgs suspend` blocked in `nfs4_proc_getattr` while scanning storage paths.

---

## Logs/Evidence

```text
Mar 07 08:48:17.989246 baobab NetworkManager[1096]: device (thunderbridge): bridge port thunderbolt0 was detached
Mar 07 08:48:17.993120 baobab NetworkManager[1096]: device (thunderbridge): bridge port thunderbolt1 was detached
Mar 07 08:48:30.540186 baobab systemd[1]: Unmounting mnt-pve-AutoNAS-1.mount - /mnt/pve/AutoNAS-1...
Mar 07 08:48:30.541335 baobab systemd[1]: Unmounting mnt-pve-AutoNAS-2.mount - /mnt/pve/AutoNAS-2...
Mar 07 08:50:00.604036 baobab systemd[1]: mnt-pve-AutoNAS-2.mount: Unmounting timed out. Terminating.
Mar 07 08:50:00.605215 baobab systemd[1]: mnt-pve-AutoNAS-1.mount: Unmounting timed out. Terminating.
```
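
The two intervals that matter in this excerpt can be computed directly from the journal timestamps. A small Python sketch of that arithmetic (the helper name is hypothetical; the timestamps are copied from the log lines above):

```python
from datetime import datetime


def seconds_between(start, end):
    """Difference in seconds between two same-day HH:MM:SS.ffffff timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return round(delta.total_seconds(), 6)


detach = "08:48:17.989246"         # thunderbolt0 detached from thunderbridge
unmount_start = "08:48:30.540186"  # systemd starts unmounting AutoNAS-1
timed_out = "08:50:00.605215"      # AutoNAS-1 unmount times out

gap = seconds_between(detach, unmount_start)      # transport gone before unmount begins
wait = seconds_between(unmount_start, timed_out)  # time spent waiting on the dead transport
```

Here `gap` is about 12.55 s and `wait` is about 90.07 s. The second figure is consistent with a 90-second unit stop timeout, which is systemd's default `DefaultTimeoutStopSec`, assuming it was not overridden on this node.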

Blocked `pgs` stack during stale-NFS preflight:

```text
[<0>] rpc_wait_bit_killable+0x11/0x80 [sunrpc]
[<0>] nfs4_do_call_sync+0x6a/0xc0 [nfsv4]
[<0>] __nfs_revalidate_inode+0xd4/0x320 [nfs]
[<0>] __do_sys_newfstatat+0x43/0x90
```
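
One way to keep a caller from inheriting this kind of block is to run the filesystem probe in a child process with a deadline. The sketch below is an illustration only, not how `pgs` is implemented; the function name and the default timeout are assumptions. It works for the case shown in the trace, where the task sleeps in `rpc_wait_bit_killable` and can be interrupted by a fatal signal. A mount stuck in a non-killable wait would still block the cleanup after the timeout.

```python
import subprocess


def path_responds(path, timeout_s=3.0):
    """Return True if `stat` on path completes successfully within timeout_s."""
    try:
        result = subprocess.run(
            ["stat", "--", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising, so a killable
        # NFS wait ends here instead of stalling the caller.
        return False
    return result.returncode == 0
```

A maintenance script could call `path_responds("/mnt/pve/AutoNAS-1")` and skip the storage when it returns `False`, rather than calling `stat` on the path from its own process.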

Validated timing after fixes on `baobab`:

```text
TIME_TO_STOP_SECONDS 14.599
TIME_TO_FIRST_REPLY_SECONDS 35.651
DOWNTIME_SECONDS 21.053
```

---

## Investigation Notes

- 2026-03-07: Confirmed `AutoNAS-1` and `AutoNAS-2` on `baobab` are Proxmox NFS storages mounted from `192.168.10.21` and `192.168.10.22` over `thunderbridge`.
- 2026-03-07: First reboot validation on `baobab` showed shutdown delay dominated by NFS unmount timeout, not by boot.
- 2026-03-07: `tb-enlist@.service` had no ordering against `network.target`; systemd stopped Thunderbolt bridge membership before Proxmox unmounted remote storages.
- 2026-03-07: Patched shared `tb-enlist@.service` with `Before=network.target` and deployed to `baobab`, then cluster-wide.
- 2026-03-07: Separate maintenance attempt showed `pgs suspend` can block in `nfs4_proc_getattr` while scanning storage paths on stale remote NFS mounts.
- 2026-03-07: Patched `pgs` cleanup to scan only local `dir` storages; remote storages such as NFS are skipped intentionally.
- 2026-03-07: Revalidated on `baobab` after both fixes:
  - NFS unmount started at `10:48:12.354/10:48:12.356`
  - both NFS mounts unmounted successfully by `10:48:12.460`
  - `network.target` stopped later at `10:48:16.152`
  - time from the reboot command to ICMP loss dropped from ~106 s to ~15 s
- 2026-03-07: `pgs resume` completed successfully after reboot on `baobab`; state file survived boot and all 4 VMs + 1 CT were restored.
- 2026-03-07: Validated `ebony` with current `pgs` and cluster-wide `thunderbolts` rollout. `pgs suspend` / `resume` succeeded for VMs `101`, `102`, `301`; state file survived reboot and restore completed.
- 2026-03-07: `ebony` still showed long shutdown because `AutoNAS-1` is currently provided by `ebony` itself through `autonas`. Stopping `autonas.service` made the node's own NFS client mount stale and `mnt-pve-AutoNAS-1.mount` waited for timeout.
- 2026-03-07: On `ebony`, PBS `anjothibe` availability loss during maintenance is expected because VM `301 is-anjohibe` is intentionally suspended by `pgs`, and its datastore dependency is also on `AutoNAS-1`.
- 2026-03-07: Implemented AutoNAS shutdown-ordering experiment on `ebony`: `autonas.service` and `autonas-boot-scan.service` now declare `Before=remote-fs.target` and `Before=umount.target`.
- 2026-03-07: Revalidated `ebony` after AutoNAS patch:
  - previous timing: `TIME_TO_STOP_SECONDS 120.275`, `TIME_TO_FIRST_REPLY_SECONDS 145.840`
  - new timing: `TIME_TO_STOP_SECONDS 27.573`, `TIME_TO_FIRST_REPLY_SECONDS 53.288`
  - `mnt-pve-AutoNAS-2.mount` still unmounted cleanly
  - `AutoNAS-1` no longer waited for the old 90s timeout, though a brief `Stale file handle` was still observed before the provider side stopped
- 2026-03-07: Residual issue on `ebony`: even with later provider shutdown, `pvestatd` briefly logged `storage 'AutoNAS-1' is not online` / `Stale file handle` during the maintenance window, so the self-hosted NFS topology remains fragile but no longer dominates shutdown time.
- 2026-03-07: Deployed the same AutoNAS ordering patch cluster-wide and revalidated `tapia`.
- 2026-03-07: `pgs suspend` / reboot / `pgs resume` succeeded on `tapia` for VMs `104`, `107`, `113`, `302`; state file survived reboot and all four guests were restored.
- 2026-03-07: `tapia` still showed slow shutdown after the AutoNAS patch:
  - `TIME_TO_STOP_SECONDS 123.285`, `TIME_TO_FIRST_REPLY_SECONDS 149.420`
  - `mnt-pve-AutoNAS-1.mount` unmounted immediately at `11:45:01.827`
  - `autonas.service` and `nfs-server.service` stopped around `11:45:01.689/11:45:01.900`
  - `mnt-pve-AutoNAS-2.mount` then waited until timeout at `11:46:31.778`
  - `network.target` stopped only after that, at `11:46:31.781`
- 2026-03-07: On `tapia`, the remaining delay is concentrated on self-hosted `AutoNAS-2` (`server 192.168.10.22`) plus expected maintenance-window loss of PBS `andrafiabe-AutoNAS` (`192.168.10.96`).
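
The `pgs` cleanup change noted above (scan only local `dir` storages) amounts to filtering the storage list by type before touching any path. The following Python sketch shows the idea under stated assumptions: it is not the `pgs` implementation, the function names are hypothetical, the input is assumed to follow the usual `/etc/pve/storage.cfg` layout (a `type: id` header followed by indented `key value` lines), and the `export` path in the sample is made up for illustration.

```python
def parse_storage_cfg(text):
    """Parse storage.cfg-style text into {storage_id: {"type": ..., key: value}}."""
    storages = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        if raw[0].isspace():
            # Indented line: "key value" belonging to the current section.
            if current is not None:
                key, _, value = raw.strip().partition(" ")
                current[key] = value.strip()
        else:
            # Section header: "type: storage_id".
            storage_type, _, storage_id = raw.partition(":")
            current = {"type": storage_type.strip()}
            storages[storage_id.strip()] = current
    return storages


def local_dir_paths(storages):
    """Paths of directory-backed storages only; NFS and other remote types are skipped."""
    return sorted(
        entry["path"]
        for entry in storages.values()
        if entry["type"] == "dir" and "path" in entry
    )


# Hypothetical sample; the export path is an assumption for illustration.
SAMPLE = """\
dir: local
\tpath /var/lib/vz
\tcontent iso,vztmpl,backup

nfs: AutoNAS-1
\tserver 192.168.10.21
\texport /srv/autonas-1
\tpath /mnt/pve/AutoNAS-1
"""
```

With this sample, `local_dir_paths(parse_storage_cfg(SAMPLE))` yields only `/var/lib/vz`, so the NFS mount point is never touched by the scan.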

---

## Proposed Solution

1. Keep Thunderbolt enlist units ordered before `network.target` so storage traffic over `thunderbridge` remains alive until remote filesystems are unmounted.
2. Keep `pgs` cleanup path limited to local directory-backed storages; do not let remote NFS availability gate planned maintenance.
3. Do not mount a node's own AutoNAS export back onto the same node as a Proxmox NFS storage; on `ebony`, exclude `AutoNAS-1` from local use or replace that local dependency with a direct/local storage path.
4. Review colocated service dependencies before planned reboot, especially when the node provides the storage it also consumes (for example `autonas` and PBS on `ebony`).
5. Apply the same self-hosted-storage review on `tapia`, where `AutoNAS-2` remains the dominant shutdown delay even after the AutoNAS ordering patch.
6. Validate the same shutdown path on the remaining nodes after storage-role cleanup.
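
The reasoning behind item 1 is that systemd stops units in the reverse of their start order, so a unit ordered `Before=network.target` is stopped after it. The Python sketch below models only that rule with a topological sort (Python 3.9+ for `graphlib`); it is a simplified illustration, not a reproduction of systemd's scheduler. It assumes the NFS mount units carry systemd's default `After=network.target` and `Before=remote-fs.target` ordering for network file systems.

```python
from graphlib import TopologicalSorter


def stop_order(before_edges):
    """Return a stop order for units given (a, b) pairs meaning `a` is Before= `b`.

    The stop order is the reverse of a valid start order.
    """
    sorter = TopologicalSorter()
    for first, second in before_edges:
        # `second` starts after `first`, so `first` is its predecessor.
        sorter.add(second, first)
    return list(reversed(list(sorter.static_order())))


EDGES = [
    # Patch from this issue: tb-enlist@.service declares Before=network.target.
    ("tb-enlist@thunderbolt0.service", "network.target"),
    # Assumed systemd defaults for network mounts.
    ("network.target", "mnt-pve-AutoNAS-1.mount"),
    ("mnt-pve-AutoNAS-1.mount", "remote-fs.target"),
]
```

For these edges, `stop_order(EDGES)` places `mnt-pve-AutoNAS-1.mount` before `network.target` and `tb-enlist@thunderbolt0.service` last, which matches the order observed on `baobab` after the fix.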

---

## Related Issues

- ISSUE-2026-001

---

## Changelog References

CHANGELOG.md entries that reference this issue:
- `projects/thunderbolts/CHANGELOG.md`: [Unreleased] - `tb-enlist@.service` now stays active until `network.target` stops... [ISSUE-2026-002]
- `projects/pve-guests-state/CHANGELOG.md`: [1.5] - Suspend-artifact cleanup now scans only local `dir` storages... [ISSUE-2026-002]
