9 Commits

Author SHA1 Message Date
Bas Nijholt
fd1b04297e fix: --host filter now limits multi-host stack operations to single host (#175)
Some checks failed
Update README.md / update_readme (push) Has been cancelled
CI / test (macos-latest, 3.11) (push) Has been cancelled
CI / test (macos-latest, 3.12) (push) Has been cancelled
CI / test (macos-latest, 3.13) (push) Has been cancelled
CI / test (ubuntu-latest, 3.11) (push) Has been cancelled
CI / test (ubuntu-latest, 3.12) (push) Has been cancelled
CI / test (ubuntu-latest, 3.13) (push) Has been cancelled
CI / browser-tests (push) Has been cancelled
CI / lint (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
* fix: --host filter now limits multi-host stack operations to single host

Previously, when using `-H host` with a multi-host stack like `glances: all`,
the command would find the stack (correct) but then operate on ALL hosts for
that stack (incorrect). For example, `cf down -H nas` with `glances` would
stop glances on all 5 hosts instead of just nas.

Now, when `--host` is specified:
- `cf down -H nas` only stops stacks on nas, including only the nas instance
  of multi-host stacks
- `cf up -H nas` only starts stacks on nas (skips migration logic since
  host is explicitly specified)

Added tests for the new filter_host behavior in both executor and CLI.

* fix: Apply filter_host to logs and ps commands as well

Same bug as up/down: when using `-H host` with multi-host stacks,
logs and ps would show results from all hosts instead of just the
filtered host.

* fix: Don't remove multi-host stacks from state when host-filtered

When using `-H host` with a multi-host stack, we only stop one instance.
The stack is still running on other hosts, so we shouldn't remove it
from state entirely.

This prevents issues where:
- `cf apply` would try to re-start the stack
- `cf ps` would show incorrect running status
- Orphan detection would be confused

Added tests to verify state is preserved for host-filtered multi-host
operations and removed for full stack operations.

* refactor: Introduce StackSelection dataclass for cleaner context passing

Instead of passing filter_host separately through multiple layers,
bundle the selection context into a StackSelection dataclass:

- stacks: list of selected stack names
- config: the loaded Config
- host_filter: optional host filter from -H flag

This provides:
1. Cleaner APIs - context travels together instead of being scattered
2. is_instance_level() method - encapsulates the check for whether
   this is an instance-level operation (host-filtered multi-host stack)
3. Future extensibility - can add more context (dry_run, verbose, etc.)

Updated all callers of get_stacks() to use the new return type.

* Revert "refactor: Introduce StackSelection dataclass for cleaner context passing"

This reverts commit e6e9eed93e.

* feat: Proper per-host state tracking for multi-host stacks

- Add `remove_stack_host()` to remove a single host from a multi-host stack's state
- Add `add_stack_host()` to add a single host to a stack's state
- Update `down` command to use `remove_stack_host` for host-filtered multi-host stacks
- Update `up` command to use `add_stack_host` for host-filtered operations

This ensures the state file accurately reflects which hosts each stack is running on,
rather than just tracking if it's running at all.

* fix: Use set comparisons for host list tests

Host lists may be reordered during YAML save/load, so test for
set equality rather than list equality.

* refactor: Merge remove_stack_host into remove_stack as optional parameter

Instead of a separate function, `remove_stack` now takes an optional
`host` parameter. When specified, it removes only that host from
multi-host stacks. This reduces API surface and follows the existing
pattern.

* fix: Restore deterministic host list sorting and add filter_host test

- Restore sorting of list values in _sorted_dict for consistent YAML output
- Add test for logs --host passing filter_host to run_on_stacks
2026-02-01 13:43:17 -08:00
Bas Nijholt
5e08f1d712 web: Exclude web stack from Update All button (#142) 2026-01-04 19:56:41 +01:00
Bas Nijholt
eac9338352 Sort web stack last in bulk operations to prevent self-restart interruption (#140) 2025-12-31 10:10:42 +01:00
Bas Nijholt
6fdb43e1e9 Add self-healing: detect and stop stray containers (#128)
* Add self-healing: detect and stop rogue containers

Adds the ability to detect and stop "rogue" containers - stacks running
on hosts they shouldn't be according to config.

Changes:
- `cf refresh`: Now scans ALL hosts and warns about rogues/duplicates
- `cf apply`: Stops rogue containers before migrations (new phase)
- New `--no-rogues` flag to skip rogue detection

Implementation:
- Add StackDiscoveryResult for full host scanning results
- Add discover_stack_on_all_hosts() to check all hosts in parallel
- Add stop_rogue_stacks() to stop containers on unauthorized hosts
- Update tests to include new no_rogues parameter

* Update README.md

* fix: Update refresh tests for _discover_stacks_full return type

The function now returns a tuple (discovered, rogues, duplicates)
for rogue/duplicate detection. Update test mocks accordingly.

* Rename "rogue" terminology to "stray" for consistency

Terminology update across the codebase:
- rogue_hosts -> stray_hosts
- is_rogue -> is_stray
- stop_rogue_stacks -> stop_stray_stacks
- _discover_rogues -> _discover_strays
- --no-rogues -> --no-strays
- _report_rogue_stacks -> _report_stray_stacks

"Stray" better complements "orphaned" (both evoke lost things)
while clearly indicating the stack is running somewhere it
shouldn't be.

* Update README.md

* Move asyncio import to top level

* Fix remaining rogue -> stray in docstrings and README

* Refactor: Extract shared helpers to reduce duplication

1. Extract _stop_stacks_on_hosts helper in operations.py
   - Shared by stop_orphaned_stacks and stop_stray_stacks
   - Reduces ~50 lines of duplicated code

2. Refactor _discover_strays to reuse _discover_stacks_full
   - Removes duplicate discovery logic from lifecycle.py
   - Calls management._discover_stacks_full and merges duplicates

* Add PR review prompt

* Fix typos in PR review prompt

* Move import to top level (no in-function imports)

* Update README.md

* Remove obvious comments
2025-12-22 10:22:09 -08:00
Bas Nijholt
350947ad12 Rename services to stacks terminology (#79) 2025-12-20 16:00:41 -08:00
Bas Nijholt
bb019bcae6 feat: add ty type checker alongside mypy (#77)
Add Astral's ty type checker (written in Rust, 10-100x faster than mypy)
as a second type checking layer. Both run in pre-commit and CI.

Fixed type issues caught by ty:
- config.py: explicit Host constructor to avoid dict unpacking issues
- executor.py: wrap subprocess.run in closure for asyncio.to_thread
- api.py: use getattr for Jinja TemplateModule macro access
- test files: fix playwright driver_path tuple handling, pytest rootpath typing
2025-12-20 15:43:51 -08:00
Bas Nijholt
c720170f26 Add --full flag to apply command
When --full is passed, apply also runs 'docker compose up' on all
services (not just missing/migrating ones) to pick up any config
changes (compose file, .env, etc).

- cf apply          # Fast: state reconciliation only
- cf apply --full   # Thorough: also refresh all running services
2025-12-16 23:54:19 -08:00
Bas Nijholt
af9c760fb8 Add missing service detection to apply command
Previously, apply only handled:
1. Stopping orphans (in state, not in config)
2. Migrating services (in state, wrong host)

Now it also handles:
3. Starting missing services (in config, not in state)

This fixes the case where a service was stopped as an orphan, then
re-added to config - apply would say "nothing to do" instead of
starting it.

Added get_services_not_in_state() to state.py and updated tests.
2025-12-16 23:21:09 -08:00
Bas Nijholt
90656b05e3 Add tests for apply command and down --orphaned flag
Tests cover:
- apply: nothing to do, dry-run preview, migrations, orphan cleanup, --no-orphans
- down --orphaned: no orphans, stops services, error on invalid combinations

Lifecycle.py coverage improved from 20% to 61%.
2025-12-16 23:15:46 -08:00