* fix: --host filter now limits multi-host stack operations to a single host
Previously, when using `-H host` with a multi-host stack like `glances: all`,
the command would find the stack (correct) but then operate on ALL hosts for
that stack (incorrect). For example, `cf down -H nas` with `glances` would
stop glances on all 5 hosts instead of just nas.
Now, when `--host` is specified:
- `cf down -H nas` only stops stacks on nas, including only the nas instance
of multi-host stacks
- `cf up -H nas` only starts stacks on nas (skips migration logic since
host is explicitly specified)
Added tests for the new filter_host behavior in both executor and CLI.
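The filtering rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `hosts_for_stack` and the host names other than `nas` are hypothetical.

```python
from __future__ import annotations


def hosts_for_stack(stack_hosts: list[str], filter_host: str | None) -> list[str]:
    """Return the hosts one stack operation should touch.

    Hypothetical helper: without a filter every host of the stack is used,
    with a filter only the matching host (if the stack runs there at all).
    """
    if filter_host is None:
        return list(stack_hosts)
    return [host for host in stack_hosts if host == filter_host]


# A multi-host stack such as `glances: all`, expanded to hypothetical hosts.
glances_hosts = ["nas", "nuc", "hp", "pi", "mini"]
print(hosts_for_stack(glances_hosts, "nas"))  # only the nas instance
print(hosts_for_stack(glances_hosts, None))   # every host
```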
* fix: Apply filter_host to logs and ps commands as well
Same bug as up/down: when using `-H host` with multi-host stacks,
logs and ps would show results from all hosts instead of just the
filtered host.
* fix: Don't remove multi-host stacks from state when host-filtered
When using `-H host` with a multi-host stack, we only stop one instance.
The stack is still running on other hosts, so we shouldn't remove it
from state entirely.
This prevents issues where:
- `cf apply` would try to re-start the stack
- `cf ps` would show incorrect running status
- Orphan detection would be confused
Added tests to verify state is preserved for host-filtered multi-host
operations and removed for full stack operations.
* refactor: Introduce StackSelection dataclass for cleaner context passing
Instead of passing filter_host separately through multiple layers,
bundle the selection context into a StackSelection dataclass:
- stacks: list of selected stack names
- config: the loaded Config
- host_filter: optional host filter from -H flag
This provides:
1. Cleaner APIs - context travels together instead of being scattered
2. is_instance_level() method - encapsulates the check for whether
this is an instance-level operation (host-filtered multi-host stack)
3. Future extensibility - can add more context (dry_run, verbose, etc.)
Updated all callers of get_stacks() to use the new return type.
* Revert "refactor: Introduce StackSelection dataclass for cleaner context passing"
This reverts commit e6e9eed93e.
* feat: Proper per-host state tracking for multi-host stacks
- Add `remove_stack_host()` to remove a single host from a multi-host stack's state
- Add `add_stack_host()` to add a single host to a stack's state
- Update `down` command to use `remove_stack_host` for host-filtered multi-host stacks
- Update `up` command to use `add_stack_host` for host-filtered operations
This ensures the state file accurately reflects which hosts each stack is running on,
rather than just tracking if it's running at all.
* fix: Use set comparisons for host list tests
Host lists may be reordered during YAML save/load, so test for
set equality rather than list equality.
* refactor: Merge remove_stack_host into remove_stack as optional parameter
Instead of a separate function, `remove_stack` now takes an optional
`host` parameter. When specified, it removes only that host from
multi-host stacks. This reduces API surface and follows the existing
pattern.
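A minimal sketch of what an optional `host` parameter on `remove_stack` could look like, assuming state is a plain mapping from stack name to either one host or a list of hosts. The real function signature and state layout may differ. Dropping a single-host entry entirely when `host` is given is also an assumption.

```python
from __future__ import annotations

from typing import Union

# Assumed state shape: stack name -> single host or list of hosts.
State = dict[str, Union[str, list[str]]]


def remove_stack(state: State, stack: str, host: str | None = None) -> None:
    """Remove a stack from state, or only one host of a multi-host stack."""
    current = state.get(stack)
    if current is None:
        return
    if host is None or not isinstance(current, list):
        # Full removal (assumption: single-host entries are always dropped).
        del state[stack]
        return
    remaining = [h for h in current if h != host]
    if remaining:
        state[stack] = remaining  # still running on the other hosts
    else:
        del state[stack]          # the last host was removed


state: State = {"glances": ["nas", "nuc"], "plex": "nas"}
remove_stack(state, "glances", host="nas")
print(state)
```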
* fix: Restore deterministic host list sorting and add filter_host test
- Restore sorting of list values in _sorted_dict for consistent YAML output
- Add test for logs --host passing filter_host to run_on_stacks
* Sort host lists in state file for consistent output
Multi-host stacks (like glances) now have their host lists sorted
alphabetically when saving state. This makes git diffs cleaner by
avoiding spurious reordering changes.
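One way to get this kind of stable output is to sort both the keys and any list values before writing the state. The sketch below only illustrates the idea. `sorted_state` is a hypothetical stand-in, not the project's `_sorted_dict`.

```python
from __future__ import annotations

from typing import Union


def sorted_state(
    state: dict[str, Union[str, list[str]]],
) -> dict[str, Union[str, list[str]]]:
    """Return a copy with keys sorted and host lists sorted alphabetically."""
    return {
        stack: sorted(hosts) if isinstance(hosts, list) else hosts
        for stack, hosts in sorted(state.items())
    }


print(sorted_state({"plex": "nas", "glances": ["nuc", "nas"]}))
```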
The get_containers_rows_by_host endpoint was missing the local_host
parameter when calling _get_glances_address, causing it to use the
external IP instead of the container name for the local host.
This caused "Connection timed out" errors on the live-stats page
when the web UI container couldn't reach its own host via the
external IP (hairpin NAT issue).
Change the filter input event handler from onkeyup to oninput.
The onkeyup event only fires when a key is physically pressed and
released, which means the clear button never appeared when:
- Pasting text
- Using browser autofill
- Setting value programmatically
The oninput event fires whenever the value changes regardless of
the input method, making the clear button work reliably.
Error rows now include the host name (e.g., "nuc: Connection refused")
and have proper id/data-host attributes so they get replaced in place
instead of accumulating on each refresh interval.
* config: Add local_host and web_stack options
Allow configuring local_host and web_stack in compose-farm.yaml instead
of requiring environment variables. This makes it easier to deploy the
web UI with just a config file mount.
- local_host: specifies which host is "local" for Glances connectivity
- web_stack: identifies the web UI stack for self-update detection
Environment variables (CF_LOCAL_HOST, CF_WEB_STACK) still work as
fallback for backwards compatibility.
Closes #152
* docs: Clarify glances_stack is used by CLI and web UI
* config: Env vars override config, add docs
- Change precedence: environment variables now override config values
(follows 12-factor app pattern)
- Document all CF_* environment variables in configuration.md
- Update example-config.yaml to mention env var overrides
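The precedence rule from this commit (environment variable first, config value as fallback) can be sketched as below. `resolve_option` is a hypothetical helper. Treating an empty environment variable as unset is an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_option(
    env_var: str,
    config_value: str | None,
    environ: Mapping[str, str] | None = None,
) -> str | None:
    """Prefer the environment variable, fall back to the config value."""
    env = os.environ if environ is None else environ
    value = env.get(env_var)
    # Assumption: an empty string counts as "not set".
    return value if value else config_value


print(resolve_option("CF_WEB_STACK", "from-config", {"CF_WEB_STACK": "from-env"}))
print(resolve_option("CF_WEB_STACK", "from-config", {}))
```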
* config: Consolidate env vars, prefer config options
- Update docker-compose.yml to comment out CF_WEB_STACK and CF_LOCAL_HOST
(now prefer setting in compose-farm.yaml)
- Update init-env to comment out CF_LOCAL_HOST (can be set in config)
- Update docker-deployment.md with new "Config option" column
- Simplify troubleshooting to prefer config over env vars
* config: Generate CF_LOCAL_HOST with config alternative note
Instead of commenting out CF_LOCAL_HOST, generate it normally but add
a note in the comment that it can also be set as 'local_host' in config.
* config: Extend local_host to all web UI operations
When running the web UI in a Docker container, is_local() can't detect
which host the container is on due to different network namespaces.
Previously local_host/CF_LOCAL_HOST only affected Glances connectivity.
Now it also affects:
- Container exec/shell (runs locally instead of via SSH)
- File editing (uses local filesystem instead of SSH)
Added is_local_host() helper that checks CF_LOCAL_HOST/config.local_host
first, then falls back to is_local() detection.
* refactor: DRY get_web_stack helper, add tests
- Move get_web_stack to deps.py to avoid duplication in streaming.py
and actions.py
- Add tests for config.local_host and config.web_stack parsing
- Add tests for is_local_host, get_web_stack, and get_local_host helpers
- Tests verify env var precedence over config values
* glances: rely on CF_WEB_STACK for container mode
Restore docker-compose env defaults and document local_host scope.
* web: ignore local_host outside container
Document container-only behavior and adjust tests.
* web: infer local host from web_stack
Drop local_host config option and update docs/tests.
* Remove CF_LOCAL_HOST override
* refactor: move web_stack helpers to Config class
- Add get_web_stack() and get_local_host_from_web_stack() as Config methods
- Remove duplicate _get_local_host_from_web_stack() from glances.py and deps.py
- Update deps.py get_web_stack() to delegate to Config method
- Add comprehensive tests for the new Config methods
* config: remove web_stack config option
The web_stack config option was redundant since:
- In Docker, CF_WEB_STACK env var is always set
- Outside Docker, the container-specific behavior is disabled anyway
Simplify by only using the CF_WEB_STACK environment variable.
* refactor: remove get_web_stack wrapper from deps
Callers now use config.get_web_stack() directly instead of
going through a pointless wrapper function.
* prompts: add rule to identify pointless wrapper functions
* fix: Ignore _version.py in type checkers
The _version.py file is generated at build time by hatchling,
so mypy and ty can't resolve it during development.
* Update README.md
* cli: Respect --host flag in stats summary and add tests
- Fix --host filter to work in non-containers mode (was ignored)
- Filter hosts table, pending migrations, and --live queries by host
- Add tests for stats --containers functionality
* refactor: Remove redundant _format_bytes wrappers
Use format_bytes directly from glances module instead of wrapper
functions that add no value.
* Fix stats --host filtering
* refactor: Move validate_hosts to top-level imports
Previously `cf config init-env` created the .env file next to the
compose-farm.yaml config file. This was unintuitive when working in
stack subdirectories - users expected the file in their current
directory.
Now the default is to create .env in the current working directory,
which matches typical CLI tool behavior. Use `-o /path/to/.env` to
specify a different location.
* examples: Add CoreDNS for *.local domain resolution
Adds a CoreDNS example that resolves *.local to the Traefik host,
making the .local routes in all examples work out of the box.
Also removes the redundant Multi-Container Stacks section from
README since paperless-ngx already demonstrates this pattern.
* examples: Add coredns .env file
* fix: external network name parsing
Compose network definitions may have a "name" field defining the actual network name,
which may differ from the key used in the compose file, e.g. when overriding the default
compose network or using a network name containing special characters that are not valid YAML keys.
Fix: check for a "name" field on the definition and use it if present; otherwise fall back to the key.
* tests: Add test for external network name field parsing
Covers the case where a network definition has a "name" field that
differs from the YAML key (e.g., default key with name: compose-net).
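The lookup described in the fix above can be illustrated with a small sketch that operates on an already-parsed `networks:` mapping. The function name is hypothetical, and only the boolean `external: true` form is handled here.

```python
from __future__ import annotations


def external_network_names(networks: dict[str, dict | None]) -> list[str]:
    """Return the real names of external networks from a parsed compose file."""
    names: list[str] = []
    for key, definition in networks.items():
        definition = definition or {}  # `mynet:` with no body parses as None
        if not definition.get("external"):
            continue
        # Prefer the explicit "name" field, else fall back to the YAML key.
        names.append(definition.get("name") or key)
    return names


networks = {
    "default": {"external": True, "name": "compose-net"},
    "proxy": {"external": True},
    "internal": None,
}
print(external_network_names(networks))
```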
---------
Co-authored-by: Bas Nijholt <bas@nijho.lt>
* cli: Add short command aliases
Add single and two-letter aliases for frequently used commands:
- a → apply
- l → logs
- r → restart
- u → update
- p → pull
- s → stats
- c → compose
- rf → refresh
- ck → check
- tf → traefik-file
Aliases are hidden from --help to keep output clean.
* docs: Document command aliases in README
* docs: Clarify Docker Compose vs Compose Farm commands
Split the Usage section into two tables:
- Docker Compose Commands: wrappers with multi-host additions
- Compose Farm Commands: orchestration Docker Compose can't do
Also update the `update` command docstring to clarify it's
a shorthand for `up --pull --build`.
* chore(docs): update TOC
* docs: Add command type distinction to commands.md
Explain that commands are either Docker Compose wrappers with
multi-host superpowers, or Compose Farm originals for orchestration.
Also update `update` description to clarify it's a shorthand.
* Update README.md
* up: Add --pull and --build flags for Docker Compose parity
Add `--pull` and `--build` options to `cf up` to match Docker Compose
naming conventions. This allows users to pull images or rebuild before
starting without using the separate `update` command.
- `cf up --pull` adds `--pull always` to the compose command
- `cf up --build` adds `--build` to the compose command
- Both flags work together: `cf up --pull --build`
The `update` command remains unchanged as a convenience wrapper.
* Update README.md
* up: Run stacks in parallel when no migration needed
Refactor up_stacks to categorize stacks and run them appropriately:
- Simple stacks (no migration): run in parallel via asyncio.gather
- Multi-host stacks: run in parallel
- Migration stacks: run sequentially for clear output and rollback
This makes `cf up --all` as fast as `cf update --all` for typical use.
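The scheduling split can be sketched as follows, assuming each stack start is an independent coroutine. `start_stack` and this simplified `up_stacks` are placeholders rather than the real implementation, and the multi-host category is folded into the parallel group for brevity.

```python
from __future__ import annotations

import asyncio


async def start_stack(name: str, log: list[str]) -> None:
    """Placeholder for the real per-stack `up` work."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    log.append(name)


async def up_stacks(simple: list[str], migrations: list[str]) -> list[str]:
    log: list[str] = []
    # Stacks that need no migration are started concurrently.
    await asyncio.gather(*(start_stack(name, log) for name in simple))
    # Migrations run one at a time so output and rollback stay readable.
    for name in migrations:
        await start_stack(name, log)
    return log


print(asyncio.run(up_stacks(["plex", "grafana"], ["paperless"])))
```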
* refactor: DRY up command building with build_up_cmd helper
Consolidate all 'up -d' command construction into a single helper
function. Now used by up, update, and operations module.
Added tests for the helper function.
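A rough sketch of such a helper, using the flag mapping given in the `up` commit message (`--pull` becomes `--pull always`, `--build` stays `--build`). The signature and the list return type are assumptions. The real `build_up_cmd` may differ.

```python
def build_up_cmd(*, pull: bool = False, build: bool = False) -> list[str]:
    """Build the argument list for a detached `docker compose up`."""
    cmd = ["up", "-d"]
    if pull:
        cmd += ["--pull", "always"]
    if build:
        cmd.append("--build")
    return cmd


# What an `update`-style call (pull and build) would produce in this sketch.
print(" ".join(build_up_cmd(pull=True, build=True)))
```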
* update: Delegate to up --pull --build
Simplify update command to just call up with pull=True and build=True.
This removes duplication and ensures consistent behavior.
* restart: Match Docker Compose semantics
Change `cf restart` from doing `down + up` to using `docker compose
restart`, matching the Docker Compose command behavior.
This provides command naming parity with Docker Compose. Users who want
the old behavior can use `cf down mystack && cf up mystack`.
- Update restart implementation to use `docker compose restart`
- Remove traefik regeneration from restart (no longer recreates containers)
- Update all documentation and help text
- Remove restart from self-update SSH handling (no longer involves down)
* web: Clarify Update tooltip uses 'recreate' not 'restart'
Avoid confusion now that 'restart' means something different.
* web: Fix Update All tooltip to use 'recreates'
* update: Only restart containers when images change
Use `up -d --pull always --build` instead of separate pull/build/down/up
steps. This avoids unnecessary container restarts when images haven't
changed.
* Update README.md
* docs: Update update command description across all docs
Reflect new behavior: only recreates containers if images changed.
* Update README.md
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Previously, Pydantic validation errors like "Extra inputs are not
permitted" didn't show which field caused the error. Now the error
message includes the field location (e.g., "unknown_key: Extra inputs
are not permitted").
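The formatting can be sketched against the list-of-dicts shape that Pydantic's `ValidationError.errors()` returns, where each entry carries a `loc` tuple and a `msg` string. The helper name is hypothetical, and the example feeds it a hand-written entry rather than a real validation failure.

```python
from __future__ import annotations


def format_validation_errors(errors: list[dict]) -> list[str]:
    """Prefix each message with its dotted field location, if any."""
    lines: list[str] = []
    for err in errors:
        location = ".".join(str(part) for part in err.get("loc", ()))
        message = err["msg"]
        lines.append(f"{location}: {message}" if location else message)
    return lines


sample = [{"loc": ("unknown_key",), "msg": "Extra inputs are not permitted"}]
print("\n".join(format_validation_errors(sample)))
```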
* perf: Optimize stray detection to use 1 SSH call per host
Previously, stray detection checked each stack on each host individually,
resulting in (stacks * hosts) SSH calls. For 50 stacks across 4 hosts,
this meant ~200 parallel SSH connections, causing "Connection lost" errors.
Now queries each host once for all running compose projects using:
docker ps --format '{{.Label "com.docker.compose.project"}}' | sort -u
This reduces SSH calls from ~200 to just 4 (one per host).
Changes:
- Add get_running_stacks_on_host() in executor.py
- Add discover_all_stacks_on_all_hosts() in operations.py
- Update _discover_stacks_full() to use the batch approach
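To illustrate the batch approach, the output of the `docker ps` command above can be turned into a set of project names per host. The parser below is a hypothetical sketch. It skips blank lines, which is what containers without the compose label would produce.

```python
def parse_compose_projects(stdout: str) -> set[str]:
    """Parse one project name per line, ignoring blank lines and duplicates."""
    return {line.strip() for line in stdout.splitlines() if line.strip()}


# Example output from a single host (one SSH call covers all stacks).
output = "glances\n\nplex\nglances\n"
print(sorted(parse_compose_projects(output)))

# Call counts for the example in the commit message: 50 stacks, 4 hosts.
stacks, hosts = 50, 4
print(stacks * hosts, "calls before,", hosts, "calls after")
```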
* Remove unused function and add tests
- Remove discover_stack_on_all_hosts() which is no longer used
- Add tests for get_running_stacks_on_host()
- Add tests for discover_all_stacks_on_all_hosts()
  - Verifies it returns correct StackDiscoveryResult
  - Verifies stray detection works
  - Verifies it makes only 1 call per host (not per stack)
* Add self-healing: detect and stop rogue containers
Adds the ability to detect and stop "rogue" containers - stacks running
on hosts they shouldn't be according to config.
Changes:
- `cf refresh`: Now scans ALL hosts and warns about rogues/duplicates
- `cf apply`: Stops rogue containers before migrations (new phase)
- New `--no-rogues` flag to skip rogue detection
Implementation:
- Add StackDiscoveryResult for full host scanning results
- Add discover_stack_on_all_hosts() to check all hosts in parallel
- Add stop_rogue_stacks() to stop containers on unauthorized hosts
- Update tests to include new no_rogues parameter
* Update README.md
* fix: Update refresh tests for _discover_stacks_full return type
The function now returns a tuple (discovered, rogues, duplicates)
for rogue/duplicate detection. Update test mocks accordingly.
* Rename "rogue" terminology to "stray" for consistency
Terminology update across the codebase:
- rogue_hosts -> stray_hosts
- is_rogue -> is_stray
- stop_rogue_stacks -> stop_stray_stacks
- _discover_rogues -> _discover_strays
- --no-rogues -> --no-strays
- _report_rogue_stacks -> _report_stray_stacks
"Stray" better complements "orphaned" (both evoke lost things)
while clearly indicating the stack is running somewhere it
shouldn't be.
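A minimal sketch of the stray check, assuming the config maps each stack to its allowed hosts and discovery yields the set of running stacks per host. `find_strays` is a hypothetical name. Stacks that are absent from the config are skipped here on the assumption that they are handled separately.

```python
from __future__ import annotations


def find_strays(
    configured: dict[str, list[str]],
    running: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each stack to the hosts where it runs but is not configured."""
    strays: dict[str, list[str]] = {}
    for host, stacks in running.items():
        for stack in stacks:
            allowed = configured.get(stack)
            if allowed is None:
                continue  # not in config at all: out of scope for this sketch
            if host not in allowed:
                strays.setdefault(stack, []).append(host)
    return {stack: sorted(hosts) for stack, hosts in strays.items()}


configured = {"plex": ["nas"], "glances": ["nas", "nuc"]}
running = {"nas": {"plex", "glances"}, "nuc": {"plex", "glances"}}
print(find_strays(configured, running))
```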
* Update README.md
* Move asyncio import to top level
* Fix remaining rogue -> stray in docstrings and README
* Refactor: Extract shared helpers to reduce duplication
1. Extract _stop_stacks_on_hosts helper in operations.py
   - Shared by stop_orphaned_stacks and stop_stray_stacks
   - Reduces ~50 lines of duplicated code
2. Refactor _discover_strays to reuse _discover_stacks_full
   - Removes duplicate discovery logic from lifecycle.py
   - Calls management._discover_stacks_full and merges duplicates
* Add PR review prompt
* Fix typos in PR review prompt
* Move import to top level (no in-function imports)
* Update README.md
* Remove obvious comments
* docs(readme): position as Dockge for multi-host
- Reference Dockge (which we've used) instead of Portainer
- Move Portainer mention to "Your files" bullet as contrast
- Link to Dockge repo
* docs(readme): add agentless bullet, link Dockge
- Add "Agentless" bullet highlighting SSH-only approach
- Link to Dockge as contrast (they require agents for multi-host)
- Update NOTE to focus on agentless, CLI-first positioning
- Add bullet points highlighting key benefits after NOTE block
- Update NOTE to position as file-based Portainer alternative
- Fix hero image URL from http to https
- Add alt text to hero image for accessibility
* feat(docker): make container user configurable via CF_UID/CF_GID
Add support for running compose-farm containers as a non-root user
to preserve file ownership on mounted volumes. This prevents files
like compose-farm-state.yaml and web UI config edits from being
owned by root on NFS mounts.
Set CF_UID, CF_GID, and CF_HOME environment variables to run as
your user. Defaults to root (0:0) for backwards compatibility.
* docs: document non-root user configuration for Docker
- Add CF_UID/CF_GID/CF_HOME documentation to README and getting-started
- Add XDG config volume mount for backup/log persistence across restarts
- Update SSH volume examples to use CF_HOME variable
* fix(docker): allow non-root user access and add USER env for SSH
- Add `chmod 755 /root` to Dockerfile so non-root users can access
the installed tool at /root/.local/share/uv/tools/compose-farm
- Add USER environment variable to docker-compose.yml for SSH to work
when running as non-root (UID not in /etc/passwd)
- Update docs to include CF_USER in the setup instructions
- Support building from local source with SETUPTOOLS_SCM_PRETEND_VERSION
* fix(docker): revert local build changes, keep only chmod 755 /root
Remove the local source build logic that was added during testing.
The only required change is `chmod 755 /root` to allow non-root users
to access the installed tool.
* docs: add .envrc.example for direnv users
* docs: mention direnv option in README and getting-started