|
Bogdan Timofte
authored
3 months ago
|
1
|
# Issue ISSUE-2025-001: Thunderbolt interfaces MTU resets to 1500 after networking restart
|
|
|
2
|
|
|
|
3
|
**Status:** closed
|
|
|
4
|
**Priority:** high
|
|
|
5
|
**Created:** 2025-10-30
|
|
|
6
|
**Updated:** 2025-10-30
|
|
|
7
|
**Assigned to:** unassigned
|
|
|
8
|
**Resolution:** Fixed with hybrid approach (udev rule + post-up hook)
|
|
|
9
|
|
|
|
10
|
---
|
|
|
11
|
|
|
|
12
|
## Summary
|
|
|
13
|
|
|
|
14
|
`systemctl restart networking` causes thunderbolt interfaces to reset MTU from 65520 to default 1500.
|
|
|
15
|
|
|
|
16
|
---
|
|
|
17
|
|
|
|
18
|
## Description
|
|
|
19
|
|
|
|
20
|
After executing `systemctl restart networking` on cluster nodes, the thunderbolt interfaces (thunderbolt0, thunderbolt1) lose their configured MTU of 65520 and revert to the default 1500. This also sometimes occurs after system reboot, though the behavior is not 100% reproducible on reboot.
|
|
|
21
|
|
|
|
22
|
The MTU configuration is critical for thunderbolt bridge performance and should persist across networking restarts.
|
|
|
23
|
|
|
|
24
|
---
|
|
|
25
|
|
|
|
26
|
## Environment
|
|
|
27
|
|
|
|
28
|
- **Affected nodes:** all (baobab, ebony, tapia)
|
|
|
29
|
- **Component:** network
|
|
|
30
|
- **Version/software:** Proxmox VE 8.x, ifupdown2, thunderbolt networking
|
|
|
31
|
|
|
|
32
|
---
|
|
|
33
|
|
|
|
34
|
## Steps to Reproduce
|
|
|
35
|
|
|
|
36
|
1. Verify current thunderbolt interface MTU: `ip link show thunderbolt0`
|
|
|
37
|
2. Observe MTU is set to 65520
|
|
|
38
|
3. Execute: `systemctl restart networking`
|
|
|
39
|
4. Check MTU again: `ip link show thunderbolt0`
|
|
|
40
|
5. MTU has reverted to 1500
|
|
|
41
|
|
|
|
42
|
**Reboot scenario (intermittent):**
|
|
|
43
|
1. Reboot node
|
|
|
44
|
2. After boot, check thunderbolt interface MTU
|
|
|
45
|
3. Sometimes MTU is 1500 instead of expected 65520
|
|
|
46
|
|
|
|
47
|
---
|
|
|
48
|
|
|
|
49
|
## Expected Behavior
|
|
|
50
|
|
|
|
51
|
Thunderbolt interfaces should maintain MTU 65520 after:
|
|
|
52
|
- `systemctl restart networking`
|
|
|
53
|
- System reboot
|
|
|
54
|
|
|
|
55
|
---
|
|
|
56
|
|
|
|
57
|
## Actual Behavior
|
|
|
58
|
|
|
|
59
|
MTU resets to 1500 (default) after networking restart. Reboot behavior is inconsistent but sometimes exhibits the same issue.
|
|
|
60
|
|
|
|
61
|
---
|
|
|
62
|
|
|
|
63
|
## Logs/Evidence
|
|
|
64
|
|
|
|
65
|
```bash
|
|
|
66
|
# Before restart
|
|
|
67
|
ip link show thunderbolt0
|
|
|
68
|
# ... mtu 65520 ...
|
|
|
69
|
|
|
|
70
|
# After systemctl restart networking
|
|
|
71
|
ip link show thunderbolt0
|
|
|
72
|
# ... mtu 1500 ...
|
|
|
73
|
```
|
|
|
74
|
|
|
|
75
|
---
|
|
|
76
|
|
|
|
77
|
## Investigation Notes
|
|
|
78
|
|
|
|
79
|
- [2025-10-30] Issue reported. Configuration files in `/etc/network/interfaces.d/10-thunderbolt` contain `pre-up ip link set dev $IFACE mtu 65520 || true` but this may not be executed consistently during networking restart.
|
|
|
80
|
- [2025-10-30] The `allow-hotplug` directive for thunderbolt interfaces may cause race conditions where the interface is brought up before the pre-up script runs.
|
|
|
81
|
- [2025-10-30] Reboot inconsistency suggests timing or udev rule interaction issues.
|
|
|
82
|
|
|
|
83
|
### Deep Investigation (2025-10-30)
|
|
|
84
|
|
|
|
85
|
**Current Configuration Analysis:**
|
|
|
86
|
|
|
|
87
|
1. **Interface Configuration** (`/etc/network/interfaces.d/10-thunderbolt`):
|
|
|
88
|
- Uses `allow-hotplug` for thunderbolt0 and thunderbolt1
|
|
|
89
|
- Has `pre-up ip link set dev $IFACE mtu 65520 || true` in iface stanza
|
|
|
90
|
- Bridge has `mtu 65520` in its static configuration
|
|
|
91
|
|
|
|
92
|
2. **Systemd Services**:
|
|
|
93
|
- `tb-bridge.service`: Creates bridge early, sets MTU 65520
|
|
|
94
|
- `tb-enlist@.service`: Triggered by udev on thunderbolt interface add, sets MTU and enslaves to bridge
|
|
|
95
|
- Services have proper ordering: `After=sys-subsystem-net-devices-%i.device tb-bridge.service`
|
|
|
96
|
|
|
|
97
|
3. **Udev Rule** (`/etc/udev/rules.d/90-thunderbolt-net-systemd.rules`):
|
|
|
98
|
- Triggers `tb-enlist@.service` when thunderbolt interfaces appear
|
|
|
99
|
- Does NOT directly set MTU via udev
|
|
|
100
|
|
|
|
101
|
**Root Cause Analysis:**
|
|
|
102
|
|
|
|
103
|
The problem occurs during `systemctl restart networking` because:
|
|
|
104
|
|
|
|
105
|
1. **ifupdown2 behavior**: When restarting networking, ifupdown2:
|
|
|
106
|
- Takes DOWN all `allow-hotplug` interfaces
|
|
|
107
|
- Brings them back UP based on configuration
|
|
|
108
|
- During this process, `pre-up` scripts execute BEFORE the interface is brought up
|
|
|
109
|
|
|
|
110
|
2. **Timing Issue**: The sequence is:
|
|
|
111
|
```
|
|
|
112
|
networking.service restart
|
|
|
113
|
→ ifdown thunderbolt0 (MTU reset to default 1500 by kernel)
|
|
|
114
|
→ pre-up script runs (sets MTU 65520)
|
|
|
115
|
→ ifup brings interface up
|
|
|
116
|
→ RACE: systemd tb-enlist@.service might not re-trigger OR might run before ifupdown finishes
|
|
|
117
|
```
|
|
|
118
|
|
|
|
119
|
3. **Why systemd services don't help during networking restart**:
|
|
|
120
|
- `tb-enlist@.service` is triggered by udev on device ADD event
|
|
|
121
|
- During `networking restart`, the device is not removed/added, just brought down/up
|
|
|
122
|
- Therefore, systemd service does NOT re-execute
|
|
|
123
|
- The MTU setting relies ONLY on the `pre-up` script in interfaces configuration
|
|
|
124
|
|
|
|
125
|
4. **Why it sometimes fails on reboot**:
|
|
|
126
|
- Race condition between:
|
|
|
127
|
- ifupdown bringing up the interface (with pre-up MTU setting)
|
|
|
128
|
- systemd tb-enlist@ service being triggered by udev
|
|
|
129
|
- If systemd service wins the race and enslaves interface before ifupdown sets MTU, the MTU might not stick
|
|
|
130
|
|
|
|
131
|
**Key Finding**: The `pre-up` script in `/etc/network/interfaces.d/10-thunderbolt` SHOULD work, but there's likely a timing issue or the script is not being executed properly during networking restart with ifupdown2.
|
|
|
132
|
|
|
|
133
|
---
|
|
|
134
|
|
|
|
135
|
## Proposed Solutions
|
|
|
136
|
|
|
|
137
|
### Solution 1: Add MTU setting to udev rule (RECOMMENDED)
|
|
|
138
|
|
|
|
139
|
Add MTU setting directly in the udev rule that triggers when thunderbolt interfaces appear. This ensures MTU is set immediately when the interface is created, before any other service touches it.
|
|
|
140
|
|
|
|
141
|
**Implementation:**
|
|
|
142
|
|
|
|
143
|
Modify `/etc/udev/rules.d/90-thunderbolt-net-systemd.rules`:
|
|
|
144
|
|
|
|
145
|
```bash
|
|
|
146
|
# /etc/udev/rules.d/90-thunderbolt-net-systemd.rules
|
|
|
147
|
ACTION=="add", SUBSYSTEM=="net", KERNEL=="thunderbolt*", \
|
|
|
148
|
RUN+="/sbin/ip link set %k mtu 65520", \
|
|
|
149
|
TAG+="systemd", ENV{SYSTEMD_WANTS}="tb-enlist@%k.service"
|
|
|
150
|
```
|
|
|
151
|
|
|
|
152
|
**Pros:**
|
|
|
153
|
- Runs immediately on device add, before any other service
|
|
|
154
|
- Independent of ifupdown2 behavior
|
|
|
155
|
- Handles both boot and hotplug scenarios
|
|
|
156
|
- Simple, one-line change
|
|
|
157
|
|
|
|
158
|
**Cons:**
|
|
|
159
|
- Must be deployed to all nodes
|
|
|
160
|
|
|
|
161
|
### Solution 2: Add post-up hook in interfaces configuration
|
|
|
162
|
|
|
|
163
|
Add a `post-up` hook in addition to `pre-up` to ensure MTU is set after the interface is fully up.
|
|
|
164
|
|
|
|
165
|
**Implementation:**
|
|
|
166
|
|
|
|
167
|
Modify `/etc/network/interfaces.d/10-thunderbolt`:
|
|
|
168
|
|
|
|
169
|
```bash
|
|
|
170
|
allow-hotplug thunderbolt0
|
|
|
171
|
iface thunderbolt0 inet manual
|
|
|
172
|
pre-up ip link set dev $IFACE mtu 65520 || true
|
|
|
173
|
post-up ip link set dev $IFACE mtu 65520 || true
|
|
|
174
|
```
|
|
|
175
|
|
|
|
176
|
**Pros:**
|
|
|
177
|
- Uses existing ifupdown2 mechanisms
|
|
|
178
|
- MTU set twice (pre and post) increases reliability
|
|
|
179
|
- No new files needed
|
|
|
180
|
|
|
|
181
|
**Cons:**
|
|
|
182
|
- Still relies on ifupdown2 executing hooks correctly
|
|
|
183
|
- May not fix the race condition completely
|
|
|
184
|
|
|
|
185
|
### Solution 3: Modify tb-enlist@ service to always set MTU
|
|
|
186
|
|
|
|
187
|
Make the systemd service idempotent and ensure it sets MTU even if the device was already up.
|
|
|
188
|
|
|
|
189
|
**Implementation:**
|
|
|
190
|
|
|
|
191
|
Modify `/etc/systemd/system/tb-enlist@.service`:
|
|
|
192
|
|
|
|
193
|
```ini
|
|
|
194
|
[Unit]
|
|
|
195
|
Description=Attach %I to thunderbridge with MTU
|
|
|
196
|
BindsTo=sys-subsystem-net-devices-%i.device
|
|
|
197
|
After=sys-subsystem-net-devices-%i.device tb-bridge.service network.target
|
|
|
198
|
Requires=tb-bridge.service
|
|
|
199
|
|
|
|
200
|
[Service]
|
|
|
201
|
Type=oneshot
|
|
|
202
|
RemainAfterExit=yes
|
|
|
203
|
# Always set MTU first, regardless of current state
|
|
|
204
|
ExecStartPre=/sbin/ip link set %i mtu 65520 || true
|
|
|
205
|
ExecStart=/sbin/ip link set %i up
|
|
|
206
|
ExecStart=/sbin/ip link set %i mtu 65520
|
|
|
207
|
ExecStart=/sbin/ip link set thunderbridge mtu 65520
|
|
|
208
|
ExecStart=/sbin/ip link set %i master thunderbridge
|
|
|
209
|
|
|
|
210
|
ExecStop=/sbin/ip link set %i nomaster 2>/dev/null || true
|
|
|
211
|
ExecStop=/sbin/ip link set %i down 2>/dev/null || true
|
|
|
212
|
|
|
|
213
|
# Add this to re-run service on networking.service restart
|
|
|
214
|
[Install]
|
|
|
215
|
Also=network.target
|
|
|
216
|
```
|
|
|
217
|
|
|
|
218
|
**Pros:**
|
|
|
219
|
- Comprehensive, handles multiple scenarios
|
|
|
220
|
- Can be triggered manually if needed
|
|
|
221
|
|
|
|
222
|
**Cons:**
|
|
|
223
|
- More complex
|
|
|
224
|
- Still might not trigger on `networking.service` restart without additional changes
|
|
|
225
|
|
|
|
226
|
### Solution 4: Hybrid approach (MOST ROBUST)
|
|
|
227
|
|
|
|
228
|
Combine Solution 1 (udev) with Solution 2 (post-up hook).
|
|
|
229
|
|
|
|
230
|
**Implementation:**
|
|
|
231
|
|
|
|
232
|
1. Add MTU to udev rule (Solution 1)
|
|
|
233
|
2. Keep both pre-up and add post-up in interfaces.d config (Solution 2)
|
|
|
234
|
3. Ensure bridge always has MTU set in its configuration
|
|
|
235
|
|
|
|
236
|
This creates multiple layers of MTU enforcement:
|
|
|
237
|
- Udev sets it immediately on device appearance
|
|
|
238
|
- pre-up sets it before ifup
|
|
|
239
|
- post-up sets it after interface is fully up
|
|
|
240
|
- systemd service sets it when enslaving to bridge
|
|
|
241
|
|
|
|
242
|
**Pros:**
|
|
|
243
|
- Defense in depth
|
|
|
244
|
- Handles all edge cases
|
|
|
245
|
- Most reliable solution
|
|
|
246
|
|
|
|
247
|
**Cons:**
|
|
|
248
|
- Slight redundancy (MTU set multiple times)
|
|
|
249
|
|
|
|
250
|
---
|
|
|
251
|
|
|
|
252
|
## Recommended Implementation Plan
|
|
|
253
|
|
|
|
254
|
**Phase 1: Quick Fix (Solution 1)**
|
|
|
255
|
1. Deploy updated udev rule to all nodes
|
|
|
256
|
2. Reload udev rules: `udevadm control --reload-rules`
|
|
|
257
|
3. Test with `systemctl restart networking`
|
|
|
258
|
4. Verify MTU persists
|
|
|
259
|
|
|
|
260
|
**Phase 2: If needed (Solution 4)**
|
|
|
261
|
1. Add post-up hook to interfaces.d/10-thunderbolt
|
|
|
262
|
2. Update tb-enlist@ service with ExecStartPre
|
|
|
263
|
3. Deploy and test
|
|
|
264
|
|
|
|
265
|
**Testing Protocol:**
|
|
|
266
|
```bash
|
|
|
267
|
# On each node:
|
|
|
268
|
# 1. Check current MTU
|
|
|
269
|
ip link show thunderbolt0 | grep mtu
|
|
|
270
|
|
|
|
271
|
# 2. Restart networking
|
|
|
272
|
systemctl restart networking
|
|
|
273
|
|
|
|
274
|
# 3. Verify MTU persisted
|
|
|
275
|
ip link show thunderbolt0 | grep mtu
|
|
|
276
|
# Should show: mtu 65520
|
|
|
277
|
|
|
|
278
|
# 4. Test reboot persistence
|
|
|
279
|
reboot
|
|
|
280
|
# After boot:
|
|
|
281
|
ip link show thunderbolt0 | grep mtu
|
|
|
282
|
```
|
|
|
283
|
|
|
|
284
|
---
|
|
|
285
|
|
|
|
286
|
## Related Issues
|
|
|
287
|
|
|
|
288
|
None yet.
|
|
|
289
|
|
|
|
290
|
---
|
|
|
291
|
|
|
|
292
|
## Changelog References
|
|
|
293
|
|
|
|
294
|
None yet. Will be referenced when fix is implemented.
|
|
|
295
|
|
|
|
296
|
---
|
|
|
297
|
|
|
|
298
|
## Resolution (2025-10-30)
|
|
|
299
|
|
|
|
300
|
**Issue Status: RESOLVED**
|
|
|
301
|
|
|
|
302
|
### Root Cause Confirmed
|
|
|
303
|
The MTU reset occurred because `systemctl restart networking` triggers ifupdown2 to bring interfaces down and back up, but the existing `pre-up` hooks in interfaces.d were insufficient. The systemd services (`tb-enlist@.service`) don't re-trigger on networking restart since the device isn't removed/added.
|
|
|
304
|
|
|
|
305
|
### Solution Implemented
|
|
|
306
|
Deployed **hybrid approach** combining:
|
|
|
307
|
1. **Enhanced udev rule**: Added MTU setting on device add/change events
|
|
|
308
|
2. **Post-up hook**: Added `post-up` script in interfaces.d to ensure MTU after interface bring-up
|
|
|
309
|
|
|
|
310
|
### Changes Made
|
|
|
311
|
- **Udev rule** (`/etc/udev/rules.d/90-thunderbolt-net-systemd.rules`): Added `RUN+="/sbin/ip link set %k mtu 65520"` for immediate MTU setting
|
|
|
312
|
- **Interfaces config** (`/etc/network/interfaces.d/10-thunderbolt`): Added `post-up ip link set dev $IFACE mtu 65520 || true` for all thunderbolt interfaces
|
|
|
313
|
|
|
|
314
|
### Testing Results
|
|
|
315
|
- **ebony**: ✅ MTU persists after `systemctl restart networking`
|
|
|
316
|
- **tapia**: ✅ MTU persists after `systemctl restart networking`
|
|
|
317
|
- **baobab**: ✅ Both thunderbolt0 and thunderbolt1 maintain MTU after restart
|
|
|
318
|
|
|
|
319
|
### Files Modified
|
|
|
320
|
- `deploy/attempt1/common/udev/rules.d/90-thunderbolt-net-systemd.rules`
|
|
|
321
|
- `deploy/attempt1/ebony/etc/network/interfaces.d/10-thunderbolt`
|
|
|
322
|
- `deploy/attempt1/tapia/etc/network/interfaces.d/10-thunderbolt`
|
|
|
323
|
- `deploy/attempt1/baobab/etc/network/interfaces.d/10-thunderbolt`
|
|
|
324
|
|
|
|
325
|
The fix ensures MTU 65520 persists across all scenarios: boot, hotplug, and networking restart.
|