# SSH Infrastructure - Single Source of Truth
Last updated: 2026-05-21
This is the only project documentation file. Keep architecture, key handling,
sync/deploy steps, troubleshooting, and maintenance notes here. Do not add
separate Markdown documents for the same subject unless this README is split by
explicit decision.
## Read This First
This repository manages SSH access from Bogdan's macOS workstation to Next-Gen
company hosts through:
```text
local macOS
-> is-jumper 192.168.2.100
-> J1/J2 10.253.51.50/52:25904
-> final hosts: porta, pbx, radius, voip, network gear
```
The key details agents keep missing:
- The local machine does not hold the company hardware key.
- The physical RSA smartcard is mounted only on `is-jumper`.
- The wrapper logs into `is-jumper`, sets
`SSH_AUTH_SOCK=/run/user/0/gnupg/S.gpg-agent.ssh`, then runs SSH from there to
J1/J2.
- J1/J2 must use user `bogdan.timofte`.
- `is-jumper` itself must use local key `~/.ssh/keys/is-jumper_ed25519`.
- `ssh` on macOS must resolve to `~/.local/bin/ssh`, not `/usr/bin/ssh`, for
company aliases.
Fast health check:
```bash
which ssh
ssh -G is-jumper | grep -E '^(hostname|user|identityfile|identitiesonly) '
ssh -G j1 | grep -E '^(hostname|user|port) '
ssh is-jumper hostname
ssh porta-sip hostname
```
Expected highlights:
```text
/Users/bogdan/.local/bin/ssh
identityfile ~/.ssh/keys/is-jumper_ed25519
user bogdan.timofte
p12.voip.ro
```
## Sources of Truth
There are two separate host tables, with separate ownership:
| Table | File / Location | Owner | What Belongs There |
| --- | --- | --- | --- |
| Local table | `inventory/hosts-local.yaml` in this repo | Us / Bogdan local workstation | Local lab hosts, local defaults, local key paths, and local overrides required for this Mac |
| NextGen table | `nextgen@192.168.2.103:/home/nextgen/projects/ssh-infrastructure/inventory/hosts.yaml` | NextGen / upstream | Company-managed NextGen host list: porta, pbx, radius, voip, network gear, and upstream defaults |
Operational rule:
```text
inventory/hosts-local.yaml is our local source of truth.
inventory/hosts.yaml is a local copy of the NextGen upstream table.
```
Do not put local-only fixes into the upstream table unless they are true for
NextGen as well. Keep Mac/local requirements in `inventory/hosts-local.yaml`.
The effective local config is generated from both files:
```text
inventory/hosts.yaml        <- copied/synced from nextgen upstream
inventory/hosts-local.yaml  <- maintained locally by us
  -> tools/generate-configs.py
  -> generated/client.conf
  -> ~/.ssh/config
```
Critical local overrides currently required:
```yaml
entrypoints:
  is_jumper:
    identity_file: ~/.ssh/keys/is-jumper_ed25519
    identities_only: true
jumps:
  j1:
    user: bogdan.timofte
  j2:
    user: bogdan.timofte
```
The sync script updates only the local copy of the upstream table:
```bash
tools/sync-hosts-from-upstream.sh
```
After every sync, verify the local overlay still produces the right effective
config:
```bash
ssh -G j1 | grep -E '^(hostname|user|port) '
ssh -G is-jumper | grep -E '^(hostname|user|identityfile|identitiesonly) '
```
Expected:
```text
user bogdan.timofte
identityfile ~/.ssh/keys/is-jumper_ed25519
```
## Repository Rules
Project source:
```text
/Users/bogdan/Documents/Workspaces/Bogdan/ssh-infrastructure
```
Runtime OpenSSH state:
```text
~/.ssh/config
~/.ssh/known_hosts
~/.ssh/authorized_keys
~/.ssh/keys/
~/.local/bin/ssh
~/.local/bin/scp
~/.local/bin/sftp
```
Only edit source files in the repository. Do not edit generated runtime files by
hand.
Tracked source files:
```text
README.md                           this file, the only documentation
inventory/hosts.yaml                upstream/company host inventory
inventory/hosts-local.yaml          local overlay and local lab inventory
schema/hosts.schema.json            inventory schema
scripts/ssh-wrapper.sh              installed as ~/.local/bin/ssh
scripts/scp-wrapper.sh              installed as ~/.local/bin/scp
scripts/sftp-wrapper.sh             installed as ~/.local/bin/sftp
tools/generate-configs.py           config generator
tools/deploy-local.sh               local deploy
tools/sync-hosts-from-upstream.sh   upstream inventory sync
tools/migrate-modern-key.sh         legacy local key migration helper
.gitignore
```
Ignored or runtime-only files:
```text
generated/
SSH_SETUP_SUMMARY.md
authorized_keys
known_hosts
known_hosts.old
keys/
agent/
conf.d/
import/
*.pem *.key *.ppk *.der *.csr
```
Git basics:
```bash
git status
git add README.md inventory schema scripts tools .gitignore
git commit -m "Describe change"
```
Known remotes:
```text
nextgen ssh://git@192.168.2.103/home/git/repositories/bogdan/NextGen-Host-List.git
mazeri ssh://git@192.168.2.102/home/git/repositories/bogdan/SSH-Infrastructure.git
```
## Architecture
### Network
```text
192.168.2.0/24 - local office/lab network
  is-jumper 192.168.2.100 - VPN client and hardware-key guardian
  local lab hosts
10.253.51.0/24 - internal company network reached from is-jumper VPN
  J1 10.253.51.50:25904
  J2 10.253.51.52:25904
  final hosts
```
`is-jumper` is not a VPN server. It is a local host that has VPN reachability to
the company network and has the physical smartcard mounted.
### Access Chains
Standard final-host chain:
```text
local wrapper
  -> /usr/bin/ssh is-jumper
    -> SSH_AUTH_SOCK=/run/user/0/gnupg/S.gpg-agent.ssh ssh -A J1
      -> ssh final-host
```
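The nesting of this chain can be sketched in Python. This is an illustration only, not the code in `scripts/ssh-wrapper.sh` (which is bash). The function name `build_chain` and the choice to pass an explicit user, host, and port to the jump hop (rather than the `J1` alias shown above) are assumptions made for the example.

```python
import shlex

AGENT_SOCK = "/run/user/0/gnupg/S.gpg-agent.ssh"


def build_chain(jump_user, jump_host, jump_port, final_alias, remote_cmd):
    """Compose the local argv for: local -> is-jumper -> jump host -> final host."""
    # Innermost hop: this command line runs on the jump host.
    final_hop = shlex.join(["ssh", final_alias, *remote_cmd])
    # Middle hop: runs on is-jumper with the smartcard agent socket exported.
    jump_hop = " ".join([
        f"SSH_AUTH_SOCK={shlex.quote(AGENT_SOCK)}",
        "ssh", "-A", "-p", str(jump_port),
        shlex.quote(f"{jump_user}@{jump_host}"),
        shlex.quote(final_hop),
    ])
    # Outermost hop: the real OpenSSH binary, bypassing the wrapper in PATH.
    return ["/usr/bin/ssh", "is-jumper", jump_hop]


argv = build_chain("bogdan.timofte", "10.253.51.50", 25904, "porta-sip", ["hostname"])
print(argv[2])
```

Each hop is quoted once for the shell that will interpret it, which is why the final-host command ends up as a single quoted argument to the jump-host `ssh`.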
Interactive J1/J2 login:
```text
local wrapper -> is-jumper -> J1/J2
```
Emergency public routes:
```text
local wrapper -> is-jumper -> j1.next-gen.ro or j2.next-gen.ro
```
The wrapper strips custom flags before calling real SSH:
```text
-J1 use J1 VPN route, default
-J2 use J2 VPN route
-j1 use public j1 route
-j2 use public j2 route
```
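A minimal sketch of the flag stripping, written in Python for readability. The real wrapper is a bash script and may differ. The route labels (`j1-vpn` and so on) and the function name `split_route_flags` are hypothetical.

```python
# Wrapper-only flags from the table above; the labels on the right are hypothetical.
ROUTE_FLAGS = {
    "-J1": "j1-vpn",
    "-J2": "j2-vpn",
    "-j1": "j1-public",
    "-j2": "j2-public",
}
DEFAULT_ROUTE = "j1-vpn"


def split_route_flags(argv):
    """Return (route, remaining_args) with wrapper-only flags removed."""
    route = DEFAULT_ROUTE
    remaining = []
    for arg in argv:
        if arg in ROUTE_FLAGS:
            route = ROUTE_FLAGS[arg]  # the last route flag wins
        else:
            remaining.append(arg)
    return route, remaining


print(split_route_flags(["-J2", "porta-sip", "hostname"]))
```

This sketch removes a route flag wherever it appears. A production wrapper would probably stop scanning at the destination host so that remote command arguments are never touched.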
Do not reintroduce local port forwarding, Python relays, `IdentityAgent
/tmp/...`, or helper scripts that bridge the physical-card socket to the local
machine. Those were removed for compliance reasons and to reduce SentinelOne
alert noise.
## Keys
### Key Matrix
| Key | Location | Purpose |
| --- | --- | --- |
| Physical smartcard RSA 4096 | only on `is-jumper` | Auth from `is-jumper` to J1/J2/company network |
| `is-jumper_ed25519` | local `~/.ssh/keys/is-jumper_ed25519` | Auth from macOS to `is-jumper` |
| Modern ED25519 | local `~/.ssh/id_ed25519` or `~/.ssh/keys/id_ed25519` | Local lab and migrated hosts |
| Legacy RSA | local `~/.ssh/keys/id_rsa_old` | Temporary migration fallback for old local hosts |
Critical config values:
```yaml
entrypoints:
  is_jumper:
    hostname: 192.168.2.100
    user: root
    identity_file: ~/.ssh/keys/is-jumper_ed25519
    identities_only: true
jumps:
  j1:
    hostname: 10.253.51.50
    user: bogdan.timofte
    port: 25904
  j2:
    hostname: 10.253.51.52
    user: bogdan.timofte
    port: 25904
```
If J1/J2 use `bogdan` instead of `bogdan.timofte`, final host SSH will fail with
an error like:
```text
bogdan@10.253.51.50: Permission denied (publickey).
Connection to 192.168.2.100 closed.
```
Fix that in `inventory/hosts-local.yaml`, deploy, then verify:
```bash
tools/deploy-local.sh
ssh -G j1 | grep -E '^(hostname|user|port) '
ssh porta-sip hostname
```
## Inventory and Generation
The generator reads:
```text
inventory/hosts.yaml
inventory/hosts-local.yaml if it exists
```
Important: the inventory merge is shallow. Later top-level maps from
`hosts-local.yaml` override upstream maps. This is useful for local lab entries
but dangerous for defaults. If `hosts-local.yaml` changes `defaults.jump.user`,
then local `jumps.j1` and `jumps.j2` must specify `user: bogdan.timofte`
explicitly.
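The effect of a shallow merge can be sketched as follows, assuming the generator behaves like a plain top-level dictionary update. The function name `shallow_merge` and the sample inventory values are hypothetical and are not taken from `tools/generate-configs.py`.

```python
def shallow_merge(upstream: dict, local: dict) -> dict:
    """Merge one level deep: a top-level key present in `local` replaces
    the upstream value wholesale (no recursive merge)."""
    merged = dict(upstream)
    merged.update(local)
    return merged


# Hypothetical sample data, for illustration only.
upstream = {
    "defaults": {"jump": {"user": "bogdan.timofte", "port": 25904}},
    "hosts": {"porta-sip": {}},
}
local = {
    "defaults": {"jump": {"user": "bogdan"}},
}

merged = shallow_merge(upstream, local)
print(merged["defaults"])  # the upstream jump defaults are replaced, not merged
print(merged["hosts"])     # top-level keys absent from the overlay survive
```

Under this assumption, an overlay that touches `defaults` silently drops every upstream default it does not restate, which is why the jump user has to be pinned explicitly.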
Generated files:
```text
generated/client.conf     installed as ~/.ssh/config
generated/is-jumper.conf  server-side helper config
generated/j1.conf         server-side final-host config
generated/j2.conf         server-side final-host config
```
`generated/` is ignored by git. Recreate it any time:
```bash
python3 tools/generate-configs.py
```
Deploy local runtime:
```bash
tools/deploy-local.sh
```
Deploy does:
```text
1. run tools/generate-configs.py
2. install generated/client.conf as ~/.ssh/config
3. install scripts/ssh-wrapper.sh as ~/.local/bin/ssh
4. install scripts/scp-wrapper.sh as ~/.local/bin/scp
5. install scripts/sftp-wrapper.sh as ~/.local/bin/sftp
6. remove obsolete ~/.ssh/scripts wrapper copies
```
It does not touch private keys, `authorized_keys`, or `known_hosts`.
## Local Shell and Wrappers
For company aliases, `ssh` must be the wrapper:
```bash
which ssh
# /Users/bogdan/.local/bin/ssh
```
If it shows `/usr/bin/ssh`, fix shell PATH and reload:
```bash
source ~/.zshrc
which ssh
```
The current shell startup should keep `~/.local/bin` first in both interactive
and login shells. If editing these files, preserve this behavior:
```zsh
path=("$HOME/.local/bin" ${path:#"$HOME/.local/bin"})
export PATH
```
`ssh-wrapper.sh` uses bash 3.2 compatible array expansion under `set -u`.
Do not replace guarded forms like:
```bash
${cmd_args[@]+"${cmd_args[@]}"}
```
with plain:
```bash
"${cmd_args[@]}"
```
On macOS bash 3.2, empty arrays plus `set -u` can fail with:
```text
cmd_args[@]: unbound variable
```
## Sync from Upstream
Pull upstream `hosts.yaml`, apply the local `is-jumper` key override, validate
generation, and deploy if changed:
```bash
tools/sync-hosts-from-upstream.sh
```
Defaults:
```text
UPSTREAM_SSH_TARGET=nextgen@192.168.2.103
UPSTREAM_HOSTS_PATH=/home/nextgen/projects/ssh-infrastructure/inventory/hosts.yaml
LOCAL_IS_JUMPER_IDENTITY_FILE=~/.ssh/keys/is-jumper_ed25519
DEPLOY_AFTER_SYNC=1
FORCE_DEPLOY=0
```
Useful overrides:
```bash
UPSTREAM_HOSTS_FILE=/tmp/hosts.yaml tools/sync-hosts-from-upstream.sh
DEPLOY_AFTER_SYNC=0 tools/sync-hosts-from-upstream.sh
FORCE_DEPLOY=1 tools/sync-hosts-from-upstream.sh
UPSTREAM_SSH_TARGET=user@host tools/sync-hosts-from-upstream.sh
```
After sync, always check J1 user because the local overlay can override jump
defaults:
```bash
ssh -G j1 | grep -E '^(hostname|user|port) '
```
Expected:
```text
user bogdan.timofte
hostname 10.253.51.50
port 25904
```
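The post-sync check can also be automated. The sketch below parses `ssh -G` style output (one lowercase keyword and its value per line) and reports any value that differs from the expected J1 settings. The helper names `parse_ssh_g` and `mismatches` are hypothetical, and the sample text stands in for real command output.

```python
def parse_ssh_g(output: str) -> dict:
    """Parse `ssh -G <alias>` output into {keyword: first value seen}."""
    config = {}
    for line in output.splitlines():
        keyword, _, value = line.strip().partition(" ")
        if keyword and keyword not in config:
            config[keyword] = value
    return config


def mismatches(config: dict, expected: dict) -> dict:
    """Return {keyword: (expected, actual)} for every value that differs."""
    return {
        key: (want, config.get(key))
        for key, want in expected.items()
        if config.get(key) != want
    }


EXPECTED_J1 = {"user": "bogdan.timofte", "hostname": "10.253.51.50", "port": "25904"}

# Sample of a broken config where the jump user fell back to "bogdan".
sample = "user bogdan\nhostname 10.253.51.50\nport 25904\n"
print(mismatches(parse_ssh_g(sample), EXPECTED_J1))
```

To check the live config, the sample string could be replaced with `subprocess.run(["ssh", "-G", "j1"], capture_output=True, text=True).stdout`. An empty result means the effective config matches.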
## Adding or Changing Hosts
For company/Next-Gen hosts:
```text
1. Edit inventory/hosts.yaml or sync it from upstream.
2. Keep local-only corrections in inventory/hosts-local.yaml.
3. Run tools/deploy-local.sh.
4. Verify with ssh -G <alias>.
5. Verify read-only with ssh <alias> hostname.
6. Commit source changes only.
```
For local lab hosts:
```text
1. Edit inventory/hosts-local.yaml.
2. Run tools/deploy-local.sh.
3. Verify with ssh <alias> hostname.
4. Commit the local overlay change.
```
Common inventory defaults:
| Context | User | Port |
| --- | --- | --- |
| J1/J2 company jump | `bogdan.timofte` | `25904` for VPN route |
| Company final hosts | usually `bogdan` | usually `22` |
| Company inherited jump config | `bogdan.timofte` | often `24` |
| Local lab hosts | usually `bogdan` | usually `22` |
| Cisco/OLT interactive devices | inventory-specific | `22` |
For Cisco/OLT/password-interactive devices, set:
```yaml
auth: password_interactive
```
The wrapper then avoids forcing `BatchMode=yes` and disables pubkey auth for
that final hop.
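The option selection described above might look like the following Python sketch. It is illustrative only: the function name `final_hop_options` is hypothetical, and the assumption that every other host gets `BatchMode=yes` is inferred from the sentence above rather than read from the wrapper source.

```python
def final_hop_options(auth=None):
    """Return ssh -o options for the final hop based on the inventory `auth` value."""
    if auth == "password_interactive":
        # Let the device prompt for a password; do not offer public keys.
        return ["-o", "PubkeyAuthentication=no"]
    # Assumed default: non-interactive, key-based login.
    return ["-o", "BatchMode=yes"]


print(final_hop_options("password_interactive"))
print(final_hop_options())
```

`BatchMode` and `PubkeyAuthentication` are standard OpenSSH client options, so the returned list can be passed straight to `ssh`.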
## Key Migration for Local Legacy Hosts
Modern preferred key:
```text
~/.ssh/id_ed25519.pub
```
Legacy fallback key:
```text
~/.ssh/keys/id_rsa_old
```
Migrate all configured local legacy hosts:
```bash
tools/migrate-modern-key.sh
```
Migrate one host:
```bash
tools/migrate-modern-key.sh is-baobab
```
Manual fallback if password access is available:
```bash
ssh -o PubkeyAuthentication=no user@host \
"mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys" \
< ~/.ssh/id_ed25519.pub
```
Keep `id_rsa_old` until all legacy hosts are verified with the modern key.
## Verification Checklist
Run after deploy, sync, wrapper edits, or inventory changes:
```bash
which ssh
ssh -G is-jumper | grep -E '^(hostname|user|identityfile|identitiesonly) '
ssh -G j1 | grep -E '^(hostname|user|port) '
ssh is-jumper hostname
ssh is-jumper 'SSH_AUTH_SOCK=/run/user/0/gnupg/S.gpg-agent.ssh ssh-add -L | sed -n 1p'
ssh porta-sip hostname
ssh pbx-bo hostname
```
Expected signals:
```text
which ssh -> /Users/bogdan/.local/bin/ssh
is-jumper hostname -> is-vpn-gw
j1 user -> bogdan.timofte
physical card check -> ssh-rsa ... cardno:6446168
porta-sip hostname -> p12.voip.ro
pbx-bo hostname -> pbx-bo
```
Interactive smoke test:
```bash
printf "exit\n" | ssh porta-sip
printf "exit\n" | ssh pbx-bo
```
## Troubleshooting
### `bogdan@10.253.51.50: Permission denied (publickey)`
The wrapper reached `is-jumper`, but J1 was attempted with user `bogdan`.
J1/J2 need `bogdan.timofte`.
Check:
```bash
ssh -G j1 | grep -E '^(hostname|user|port) '
```
Fix:
```yaml
# inventory/hosts-local.yaml
jumps:
  j1:
    user: bogdan.timofte
  j2:
    user: bogdan.timofte
```
Deploy:
```bash
tools/deploy-local.sh
ssh porta-sip hostname
```
### `root@192.168.2.100: Permission denied`
The local connection to `is-jumper` is using the wrong key.
Check:
```bash
ssh -G is-jumper | grep -E '^(user|hostname|identityfile|identitiesonly) '
ls -l ~/.ssh/keys/is-jumper_ed25519
```
Expected:
```text
user root
hostname 192.168.2.100
identityfile ~/.ssh/keys/is-jumper_ed25519
identitiesonly yes
```
If generated config is wrong, fix `inventory/hosts-local.yaml` or
`inventory/hosts.yaml`, then deploy.
### `ssh pbx-bo` uses `/usr/bin/ssh`
The wrapper is not first in PATH.
Check:
```bash
which ssh
```
Fix current shell:
```bash
source ~/.zshrc
```
If needed, ensure both `.zprofile` and `.zshrc` move `~/.local/bin` to the front
of the zsh `path` array. A guard that only adds the directory when it is missing
can leave it later in PATH.
### `cmd_args[@]: unbound variable`
This is a bash 3.2 plus `set -u` empty-array issue in `ssh-wrapper.sh`.
Use guarded array expansion:
```bash
${array[@]+"${array[@]}"}
```
Do not simplify it.
### Physical card missing on `is-jumper`
Check:
```bash
ssh is-jumper 'ls -l /run/user/0/gnupg/S.gpg-agent.ssh'
ssh is-jumper 'SSH_AUTH_SOCK=/run/user/0/gnupg/S.gpg-agent.ssh ssh-add -L | sed -n 1p'
```
Expected key output contains:
```text
cardno:6446168
```
If missing, the issue is on `is-jumper`: gpg-agent, card mount, permissions, or
hardware state.
### Direct command works but wrapper fails
Trace the command the wrapper generates and compare it with the direct command:
```bash
bash -x ~/.local/bin/ssh porta-sip hostname
```
Look for:
```text
SSH_AUTH_SOCK=/run/user/0/gnupg/S.gpg-agent.ssh
bogdan.timofte@10.253.51.50
```
If either is wrong, fix inventory/local overlay or wrapper.
### Generated config was edited manually
Discard manual runtime edits by redeploying:
```bash
tools/deploy-local.sh
```
Then verify:
```bash
ssh -G j1 | grep -E '^(hostname|user|port) '
```
## Compatibility and Compliance
Do not reintroduce these removed patterns:
```text
j1-relay.sh
ssh-proxy.sh
ensure-ssh-agent-bridge.sh
ensure-ssh-jump.sh
local socket forwarding for the hardware card
Python/base64 port-forwarding relays
per-host local ProxyCommand bridges
```
Current compliant model:
```text
local wrapper -> ssh is-jumper -> run normal ssh from is-jumper
```
Compatibility options for old final hosts belong in inventory or on jump hosts,
not in ad-hoc local forwarding scripts.
## Maintenance Notes for Agents
Before changing anything:
```bash
git status --short --branch
which ssh
ssh -G j1 | grep -E '^(hostname|user|port) '
```
When fixing auth:
```text
1. Identify which hop failed from the error user@host.
2. is-jumper failures mean local key/config.
3. J1/J2 failures mean hardware card, SSH_AUTH_SOCK, or jump user.
4. final-host failures mean final host user/auth/port.
5. Apply the fix in inventory or wrapper source, not generated config.
6. Run tools/deploy-local.sh.
7. Run read-only SSH verification.
8. Commit the source change.
```
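Steps 1 to 4 can be sketched as a small classifier that reads the `user@host` prefix of a `Permission denied` line. The addresses come from this README. The function name `failed_hop` and the returned hint strings are hypothetical.

```python
import re

IS_JUMPER = "192.168.2.100"
JUMP_HOSTS = {"10.253.51.50", "10.253.51.52", "j1.next-gen.ro", "j2.next-gen.ro"}

# Matches lines such as "bogdan@10.253.51.50: Permission denied (publickey)."
DENIED = re.compile(r"^(?P<user>[^@\s]+)@(?P<host>[^:\s]+): Permission denied")


def failed_hop(error_line):
    """Map an SSH 'Permission denied' line to the hop that rejected the login."""
    match = DENIED.match(error_line.strip())
    if match is None:
        return None  # not an auth failure line
    host = match.group("host")
    if host == IS_JUMPER:
        return "is-jumper: check local key/config"
    if host in JUMP_HOSTS:
        return "J1/J2: check hardware card, SSH_AUTH_SOCK, or jump user"
    return "final host: check final host user/auth/port"


print(failed_hop("bogdan@10.253.51.50: Permission denied (publickey)."))
```

Lines that are not authentication failures, such as `Connection to 192.168.2.100 closed.`, return `None` and can be skipped.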
Do not assume `hosts.yaml` alone is the effective config. Always remember
`inventory/hosts-local.yaml` is merged in by `tools/generate-configs.py`.
Do not trust stale docs, comments, or generated files over these commands:
```bash
ssh -G <alias>
tools/deploy-local.sh
ssh <alias> hostname
git diff
```