* Add self-healing: detect and stop rogue containers
Adds the ability to detect and stop "rogue" containers - stacks running
on hosts where, according to config, they should not be.
Changes:
- `cf refresh`: Now scans ALL hosts and warns about rogues/duplicates
- `cf apply`: Stops rogue containers before migrations (new phase)
- New `--no-rogues` flag to skip rogue detection
Implementation:
- Add StackDiscoveryResult for full host scanning results
- Add discover_stack_on_all_hosts() to check all hosts in parallel
- Add stop_rogue_stacks() to stop containers on unauthorized hosts
- Update tests to include new no_rogues parameter
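
A minimal sketch of the parallel scan described above, not the actual implementation. The names `StackDiscoveryResult`, `discover_stack_on_all_hosts`, `rogue_hosts` and `is_rogue` come from this changelog. Their fields and signatures, the `is_running` helper and the `running` map are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class StackDiscoveryResult:
    """Where a stack was found running versus where config assigns it."""

    stack: str
    configured_host: str
    running_hosts: list[str] = field(default_factory=list)

    @property
    def rogue_hosts(self) -> list[str]:
        # Hosts running the stack that config does not assign it to.
        return [h for h in self.running_hosts if h != self.configured_host]

    @property
    def is_rogue(self) -> bool:
        return bool(self.rogue_hosts)


async def is_running(stack: str, host: str, running: dict[str, set[str]]) -> bool:
    # Hypothetical stand-in for a real per-host check (for example over SSH).
    await asyncio.sleep(0)
    return stack in running.get(host, set())


async def discover_stack_on_all_hosts(
    stack: str,
    configured_host: str,
    hosts: list[str],
    running: dict[str, set[str]],
) -> StackDiscoveryResult:
    # Query every host concurrently instead of one at a time.
    flags = await asyncio.gather(*(is_running(stack, h, running) for h in hosts))
    found = [h for h, up in zip(hosts, flags) if up]
    return StackDiscoveryResult(stack, configured_host, found)
```

Because `asyncio.gather` preserves input order, the results can be zipped back against `hosts` without extra bookkeeping.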
* Update README.md
* fix: Update refresh tests for _discover_stacks_full return type
The function now returns a tuple (discovered, rogues, duplicates)
for rogue/duplicate detection. Update test mocks accordingly.
* Rename "rogue" terminology to "stray" for consistency
Terminology update across the codebase:
- rogue_hosts -> stray_hosts
- is_rogue -> is_stray
- stop_rogue_stacks -> stop_stray_stacks
- _discover_rogues -> _discover_strays
- --no-rogues -> --no-strays
- _report_rogue_stacks -> _report_stray_stacks
"Stray" better complements "orphaned" (both evoke lost things)
while clearly indicating the stack is running somewhere it
shouldn't be.
* Update README.md
* Move asyncio import to top level
* Fix remaining rogue -> stray in docstrings and README
* Refactor: Extract shared helpers to reduce duplication
1. Extract _stop_stacks_on_hosts helper in operations.py
   - Shared by stop_orphaned_stacks and stop_stray_stacks
   - Reduces ~50 lines of duplicated code
2. Refactor _discover_strays to reuse _discover_stacks_full
   - Removes duplicate discovery logic from lifecycle.py
   - Calls management._discover_stacks_full and merges duplicates
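
The shape of the shared helper might look roughly like the sketch below. Only the three function names are taken from this changelog. The `targets` mapping, the injected `stop` callable and the return type are assumptions made for illustration, not the signatures in operations.py.

```python
import asyncio
from collections.abc import Awaitable, Callable

# Hypothetical callback type: stops one stack on one host, returns success.
StopFn = Callable[[str, str], Awaitable[bool]]


async def _stop_stacks_on_hosts(
    targets: dict[str, list[str]], stop: StopFn
) -> dict[tuple[str, str], bool]:
    # Flatten {stack: [hosts]} into (stack, host) pairs and stop them concurrently.
    pairs = [(stack, host) for stack, hosts in targets.items() for host in hosts]
    results = await asyncio.gather(*(stop(stack, host) for stack, host in pairs))
    return dict(zip(pairs, results))


async def stop_orphaned_stacks(
    orphans: dict[str, list[str]], stop: StopFn
) -> dict[tuple[str, str], bool]:
    # Orphans: stacks present in state but no longer in config.
    return await _stop_stacks_on_hosts(orphans, stop)


async def stop_stray_stacks(
    strays: dict[str, list[str]], stop: StopFn
) -> dict[tuple[str, str], bool]:
    # Strays: stacks running on hosts config does not assign them to.
    return await _stop_stacks_on_hosts(strays, stop)
```

The two public functions differ only in how their targets are selected, which is why the stop logic can live in one place.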
* Add PR review prompt
* Fix typos in PR review prompt
* Move import to top level (no in-function imports)
* Update README.md
* Remove obvious comments
Add Astral's ty type checker (written in Rust, 10-100x faster than mypy)
as a second type checking layer. Both run in pre-commit and CI.
Fixed type issues caught by ty:
- config.py: explicit Host constructor to avoid dict unpacking issues
- executor.py: wrap subprocess.run in closure for asyncio.to_thread
- api.py: use getattr for Jinja TemplateModule macro access
- test files: fix playwright driver_path tuple handling, pytest rootpath typing
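
For reference, the closure pattern mentioned for executor.py could look like the sketch below. The function name `run_command` and its parameters are hypothetical. The stated reason for the closure (giving the type checker one fixed signature instead of routing `subprocess.run`'s overloads through `asyncio.to_thread`) is an assumption about what ty flagged.

```python
import asyncio
import subprocess
import sys


async def run_command(args: list[str]) -> subprocess.CompletedProcess[str]:
    def _run() -> subprocess.CompletedProcess[str]:
        # Zero-argument closure: all subprocess.run arguments are bound here,
        # so asyncio.to_thread only has to forward a plain callable.
        return subprocess.run(args, capture_output=True, text=True, check=False)

    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(_run)


if __name__ == "__main__":
    done = asyncio.run(run_command([sys.executable, "-c", "print('ok')"]))
    print(done.returncode, done.stdout.strip())
```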
When --full is passed, apply also runs 'docker compose up' on all
services (not just missing/migrating ones) to pick up any config
changes (compose file, .env, etc.).
- cf apply         # Fast: state reconciliation only
- cf apply --full  # Thorough: also refresh all running services
Previously, apply only handled:
1. Stopping orphans (in state, not in config)
2. Migrating services (in state, wrong host)
Now it also handles:
3. Starting missing services (in config, not in state)
This fixes the case where a service was stopped as an orphan, then
re-added to config - apply would say "nothing to do" instead of
starting it.
Added get_services_not_in_state() to state.py and updated tests.
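
As a rough illustration of the new check, the sketch below assumes config and state are both plain mappings from service name to host. That layout and the function signature are assumptions. Only the name `get_services_not_in_state` comes from this changelog.

```python
def get_services_not_in_state(
    config_services: dict[str, str], state: dict[str, str]
) -> list[str]:
    # Services declared in config with no entry in state: these need starting.
    return sorted(name for name in config_services if name not in state)


config = {"plex": "nas", "grafana": "pi"}
state = {"plex": "nas", "old-wiki": "vps"}

# "grafana" is in config but not in state, so apply should start it.
# "old-wiki" is the opposite case (an orphan) and is handled by phase 1.
missing = get_services_not_in_state(config, state)
```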