* fix: --host filter now limits multi-host stack operations to single host
Previously, when using `-H host` with a multi-host stack like `glances: all`,
the command would find the stack (correct) but then operate on ALL hosts for
that stack (incorrect). For example, `cf down -H nas` with `glances` would
stop glances on all 5 hosts instead of just nas.
Now, when `--host` is specified:
- `cf down -H nas` only stops stacks on nas, including only the nas instance
of multi-host stacks
- `cf up -H nas` only starts stacks on nas (skips migration logic since
host is explicitly specified)
Added tests for the new filter_host behavior in both executor and CLI.
* fix: Apply filter_host to logs and ps commands as well
Same bug as up/down: when using `-H host` with multi-host stacks,
logs and ps would show results from all hosts instead of just the
filtered host.
* fix: Don't remove multi-host stacks from state when host-filtered
When using `-H host` with a multi-host stack, we only stop one instance.
The stack is still running on other hosts, so we shouldn't remove it
from state entirely.
This prevents issues where:
- `cf apply` would try to re-start the stack
- `cf ps` would show incorrect running status
- Orphan detection would be confused
Added tests to verify state is preserved for host-filtered multi-host
operations and removed for full stack operations.
* refactor: Introduce StackSelection dataclass for cleaner context passing
Instead of passing filter_host separately through multiple layers,
bundle the selection context into a StackSelection dataclass:
- stacks: list of selected stack names
- config: the loaded Config
- host_filter: optional host filter from -H flag
This provides:
1. Cleaner APIs - context travels together instead of being scattered
2. is_instance_level() method - encapsulates the check for whether
this is an instance-level operation (host-filtered multi-host stack)
3. Future extensibility - can add more context (dry_run, verbose, etc.)
Updated all callers of get_stacks() to use the new return type.
* Revert "refactor: Introduce StackSelection dataclass for cleaner context passing"
This reverts commit e6e9eed93e.
* feat: Proper per-host state tracking for multi-host stacks
- Add `remove_stack_host()` to remove a single host from a multi-host stack's state
- Add `add_stack_host()` to add a single host to a stack's state
- Update `down` command to use `remove_stack_host` for host-filtered multi-host stacks
- Update `up` command to use `add_stack_host` for host-filtered operations
This ensures the state file accurately reflects which hosts each stack is running on,
rather than just tracking if it's running at all.
* fix: Use set comparisons for host list tests
Host lists may be reordered during YAML save/load, so test for
set equality rather than list equality.
* refactor: Merge remove_stack_host into remove_stack as optional parameter
Instead of a separate function, `remove_stack` now takes an optional
`host` parameter. When specified, it removes only that host from
multi-host stacks. This reduces API surface and follows the existing
pattern.
* fix: Restore deterministic host list sorting and add filter_host test
- Restore sorting of list values in _sorted_dict for consistent YAML output
- Add test for logs --host passing filter_host to run_on_stacks
* perf: Optimize stray detection to use 1 SSH call per host
Previously, stray detection checked each stack on each host individually,
resulting in (stacks * hosts) SSH calls. For 50 stacks across 4 hosts,
this meant ~200 parallel SSH connections, causing "Connection lost" errors.
Now queries each host once for all running compose projects using:
docker ps --format '{{.Label "com.docker.compose.project"}}' | sort -u
This reduces SSH calls from ~200 to just 4 (one per host).
Changes:
- Add get_running_stacks_on_host() in executor.py
- Add discover_all_stacks_on_all_hosts() in operations.py
- Update _discover_stacks_full() to use the batch approach
* Remove unused function and add tests
- Remove discover_stack_on_all_hosts() which is no longer used
- Add tests for get_running_stacks_on_host()
- Add tests for discover_all_stacks_on_all_hosts()
- Verifies it returns correct StackDiscoveryResult
- Verifies stray detection works
- Verifies it makes only 1 call per host (not per stack)
Remove functions that were replaced by _with_progress variants in cli.py:
- discover_running_services, check_mounts_on_configured_hosts,
check_networks_on_configured_hosts, _check_resources from operations.py
- snapshot_services from logs.py
- get_service_hosts from state.py
Make previously private functions public (remove underscore prefix):
- is_local in executor.py
- isoformat, collect_service_entries, load_existing_entries,
merge_entries, write_toml in logs.py
- load_env, interpolate, parse_ports in compose.py
Update tests to use renamed public functions.
- Extract compose.py from traefik.py for generic compose parsing
(env loading, interpolation, ports, volumes, networks)
- Rename ssh.py to executor.py for clarity
- Extract operations.py from cli.py for business logic
(up_services, discover_running_services, preflight checks)
- Update CLAUDE.md with new architecture diagram
- Add docs/dev/future-improvements.md for low-priority items
CLI is now a thin layer that delegates to operations module.
All 70 tests pass.