* Add self-healing: detect and stop rogue containers
Adds the ability to detect and stop "rogue" containers - stacks running
on hosts they shouldn't be according to config.
Changes:
- `cf refresh`: Now scans ALL hosts and warns about rogues/duplicates
- `cf apply`: Stops rogue containers before migrations (new phase)
- New `--no-rogues` flag to skip rogue detection
Implementation:
- Add StackDiscoveryResult for full host scanning results
- Add discover_stack_on_all_hosts() to check all hosts in parallel
- Add stop_rogue_stacks() to stop containers on unauthorized hosts
- Update tests to include new no_rogues parameter
* Update README.md
* fix: Update refresh tests for _discover_stacks_full return type
The function now returns a tuple (discovered, rogues, duplicates)
for rogue/duplicate detection. Update test mocks accordingly.
* Rename "rogue" terminology to "stray" for consistency
Terminology update across the codebase:
- rogue_hosts -> stray_hosts
- is_rogue -> is_stray
- stop_rogue_stacks -> stop_stray_stacks
- _discover_rogues -> _discover_strays
- --no-rogues -> --no-strays
- _report_rogue_stacks -> _report_stray_stacks
"Stray" better complements "orphaned" (both evoke lost things)
while clearly indicating the stack is running somewhere it
shouldn't be.
* Update README.md
* Move asyncio import to top level
* Fix remaining rogue -> stray in docstrings and README
* Refactor: Extract shared helpers to reduce duplication
1. Extract _stop_stacks_on_hosts helper in operations.py
- Shared by stop_orphaned_stacks and stop_stray_stacks
- Reduces ~50 lines of duplicated code
2. Refactor _discover_strays to reuse _discover_stacks_full
- Removes duplicate discovery logic from lifecycle.py
- Calls management._discover_stacks_full and merges duplicates
* Add PR review prompt
* Fix typos in PR review prompt
* Move import to top level (no in-function imports)
* Update README.md
* Remove obvious comments
* docs(readme): position as Dockge for multi-host
- Reference Dockge (which we've used) instead of Portainer
- Move Portainer mention to "Your files" bullet as contrast
- Link to Dockge repo
* docs(readme): add agentless bullet, link Dockge
- Add "Agentless" bullet highlighting SSH-only approach
- Link to Dockge as contrast (they require agents for multi-host)
- Update NOTE to focus on agentless, CLI-first positioning
- Add bullet points highlighting key benefits after NOTE block
- Update NOTE to position as file-based Portainer alternative
- Fix hero image URL from http to https
- Add alt text to hero image for accessibility
* feat(docker): make container user configurable via CF_UID/CF_GID
Add support for running compose-farm containers as a non-root user
to preserve file ownership on mounted volumes. This prevents files
like compose-farm-state.yaml and web UI config edits from being
owned by root on NFS mounts.
Set CF_UID, CF_GID, and CF_HOME environment variables to run as
your user. Defaults to root (0:0) for backwards compatibility.
* docs: document non-root user configuration for Docker
- Add CF_UID/CF_GID/CF_HOME documentation to README and getting-started
- Add XDG config volume mount for backup/log persistence across restarts
- Update SSH volume examples to use CF_HOME variable
* fix(docker): allow non-root user access and add USER env for SSH
- Add `chmod 755 /root` to Dockerfile so non-root users can access
the installed tool at /root/.local/share/uv/tools/compose-farm
- Add USER environment variable to docker-compose.yml for SSH to work
when running as non-root (UID not in /etc/passwd)
- Update docs to include CF_USER in the setup instructions
- Support building from local source with SETUPTOOLS_SCM_PRETEND_VERSION
* fix(docker): revert local build changes, keep only chmod 755 /root
Remove the local source build logic that was added during testing.
The only required change is `chmod 755 /root` to allow non-root users
to access the installed tool.
* docs: add .envrc.example for direnv users
* docs: mention direnv option in README and getting-started
Adds `cf compose <stack> <command> [args...]` to run any docker compose
command on a stack without needing dedicated wrappers. Useful for
commands like top, images, exec, run, config, etc.
Multi-host stacks require --host to specify which host to run on.
Add support for targeting specific services within a stack:
CLI:
- New `stop` command for stopping services without removing containers
- Add `--service` / `-s` flag to: up, pull, restart, update, stop, logs, ps
- Service flag requires exactly one stack to be specified
Web API:
- Add `stop` to allowed stack commands
- New endpoint: POST /api/stack/{name}/service/{service}/{command}
- Supports: logs, pull, restart, up, stop
Web UI:
- Add action buttons to container rows: logs, restart, stop, shell
- Add rotate_ccw and scroll_text icons for new buttons
* refactor: Store SSH keys in subdirectory for cleaner volume mounting
Change SSH key location from ~/.ssh/compose-farm (file) to
~/.ssh/compose-farm/id_ed25519 (file in directory).
This allows docker-compose to mount just the compose-farm directory
to /root/.ssh without exposing all host SSH keys to the container.
Also make host path the default option in docker-compose.yml with
clearer comments about the two options.
* docs: Update README for new SSH key directory structure
* docs: Clarify cf ssh setup must run inside container
When --full is passed, apply also runs 'docker compose up' on all
services (not just missing/migrating ones) to pick up any config
changes (compose file, .env, etc).
- cf apply # Fast: state reconciliation only
- cf apply --full # Thorough: also refresh all running services
- Update intro and NOTE block to lead with cf apply
- Rewrite "How It Works" to show declarative workflow first
- Move apply to top of command table (bolded)
- Reorder examples to show apply as "the main command"
- Update automation bullet to highlight one-command reconciliation
Simplifies CLI by having one clear reconciliation command:
- cf up <service> = start specific services (auto-migrates if needed)
- cf apply = full reconcile (stop orphans + migrate + start missing)
The --migrate flag was redundant with 'apply --no-orphans'.
Separate "read state from reality" from "write config to reality":
- Rename `sync` to `refresh` (updates local state from running services)
- Add `apply` command (makes reality match config: migrate + stop orphans)
- Add `down --orphaned` flag (stops services removed from config)
- Modify `up --migrate` to only handle migrations (not orphans)
The new mental model:
- `refresh` = Reality → State (discover what's running)
- `apply` = Config → Reality (reconcile: migrate services + stop orphans)
Also extract private helper functions for reporting to match codebase style.
When services are removed from config but still tracked in state,
`cf up --migrate` now stops them automatically. This makes the
config truly declarative - comment out a service, run migrate,
and it stops.
Changes:
- Add get_orphaned_services() to state.py for detecting orphans
- Add stop_orphaned_services() to operations.py for cleanup
- Update lifecycle.py to call stop_orphaned_services on --migrate
- Refactor _report_orphaned_services to use shared function
- Rename "missing_from_config" to "unmanaged" for clarity
- Add tests for get_orphaned_services
- Only remove from state on successful down (not on failure)