* perf: Optimize stray detection to use 1 SSH call per host
Previously, stray detection checked each stack on each host individually,
resulting in (stacks * hosts) SSH calls. For 50 stacks across 4 hosts,
this meant ~200 parallel SSH connections, causing "Connection lost" errors.
Now queries each host once for all running compose projects using:
docker ps --format '{{.Label "com.docker.compose.project"}}' | sort -u
This reduces SSH calls from ~200 to just 4 (one per host).
Changes:
- Add get_running_stacks_on_host() in executor.py
- Add discover_all_stacks_on_all_hosts() in operations.py
- Update _discover_stacks_full() to use the batch approach
* Remove unused function and add tests
- Remove discover_stack_on_all_hosts() which is no longer used
- Add tests for get_running_stacks_on_host()
- Add tests for discover_all_stacks_on_all_hosts()
- Verifies it returns correct StackDiscoveryResult
- Verifies stray detection works
- Verifies it makes only 1 call per host (not per stack)
Remove functions that were replaced by _with_progress variants in cli.py:
- discover_running_services, check_mounts_on_configured_hosts,
check_networks_on_configured_hosts, _check_resources from operations.py
- snapshot_services from logs.py
- get_service_hosts from state.py
Make previously private functions public (remove underscore prefix):
- is_local in executor.py
- isoformat, collect_service_entries, load_existing_entries,
merge_entries, write_toml in logs.py
- load_env, interpolate, parse_ports in compose.py
Update tests to use renamed public functions.
- Extract compose.py from traefik.py for generic compose parsing
(env loading, interpolation, ports, volumes, networks)
- Rename ssh.py to executor.py for clarity
- Extract operations.py from cli.py for business logic
(up_services, discover_running_services, preflight checks)
- Update CLAUDE.md with new architecture diagram
- Add docs/dev/future-improvements.md for low-priority items
CLI is now a thin layer that delegates to operations module.
All 70 tests pass.