Compare commits

...

15 Commits

Author SHA1 Message Date
nityanandagohain
437ce412c6 fix: use mustnewuuid 2026-04-24 00:19:02 +05:30
nityanandagohain
040872fa41 fix: address comments 2026-04-23 19:08:42 +05:30
nityanandagohain
c28f6cd1a3 fix: types 2026-04-23 16:51:30 +05:30
nityanandagohain
70e37817b9 fix: remove nullable 2026-04-23 16:41:26 +05:30
nityanandagohain
c5645c38c4 Merge remote-tracking branch 'origin/main' into issue_4360 2026-04-23 16:37:17 +05:30
nityanandagohain
5c7c262d4e fix: address comments 2026-04-23 16:34:45 +05:30
Pandey
93f5df9185 tests: unify integration + e2e under shared pytest project (#11019)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(tests): hoist pytest project to tests/ root for shared fixtures

Lift pyproject.toml, uv.lock, conftest.py, and fixtures/ up from
tests/integration/ so the pytest project becomes shared infrastructure
rather than integration's private property. A sibling tests/e2e/ can
reuse the same fixture graph (containers, auth, seeding) without
duplicating plugins.

Also:
- Merge tests/integration/src/querier/util.py into tests/fixtures/querier.py
  (response assertions and corrupt-metadata generators belong with the
  other querier helpers).
- Use --import-mode=importlib + pythonpath=["."] in pyproject so
  same-basename tests across src/*/ do not collide at the now-wider
  rootdir.
- Broaden python_files to "*/src/**/**.py" so future test trees under
  tests/e2e/src/ get discovered.
- Update Makefile py-* targets and integrationci.yaml to cd into tests/
  and reference integration/src/... paths.

* feat(tests/e2e): import Playwright suite from signoz-e2e

Relocate the standalone signoz-e2e repository into tests/e2e/ as a
sibling of tests/integration/. The suite still points at remote
staging by default; subsequent commits wire it to the shared pytest
fixture graph so the backend can be provisioned locally.

Excluded from the import: .git, .github (CI migration deferred),
.auth, node_modules, test-results, playwright-report.

* feat(tests/e2e): pytest-driven backend bring-up, seeding, and playwright runner

Wire the Playwright suite into the shared pytest fixture graph so the
backend + its seeded state are provisioned locally instead of pointing
at remote staging.

Python side (owns lifecycle):
- tests/fixtures/dashboards.py — generic create/list/upsert_dashboard
  helpers (shared infra; testdata stays per-tree).
- tests/e2e/conftest.py — e2e-scoped pytest fixtures: seed_dashboards
  (idempotent upsert from tests/e2e/testdata/dashboards/*.json),
  seed_alert_rules (from tests/e2e/testdata/alerts/*.json, via existing
  create_alert_rule), seed_e2e_telemetry (fresh traces/logs across a
  few synthetic services so /home and Services pages have data).
- tests/e2e/src/bootstrap/setup.py — test_setup depends on the fixture
  graph and persists backend coordinates to tests/e2e/.signoz-backend.json;
  test_teardown is the --teardown target.
- tests/e2e/src/bootstrap/run.py — test_e2e: one-command entrypoint that
  brings up the backend + seeds, then subprocesses yarn test and asserts
  Playwright exits 0.
- tests/conftest.py — register fixtures.dashboards plugin.

Playwright side (just reads):
- tests/e2e/global.setup.ts — loads .signoz-backend.json and injects
  SIGNOZ_E2E_BASE_URL/USERNAME/PASSWORD. No-op when env is already
  populated (staging mode, or pytest-driven runs where env is pre-set).
- playwright.config.ts registers globalSetup.
- package.json gains test:staging; existing scripts unchanged.

Testdata layout: tests/e2e/testdata/{dashboards,alerts,channels}/*.json
— per-tree (integration has its own tests/integration/testdata/).

* docs(tests): describe pytest-master workflow and shared fixture layout

- tests/README.md (new): top-level map of the shared pytest project,
  fixture-ownership rule (shared vs per-tree), and common commands.
- tests/e2e/README.md: lead with the one-command pytest run and the
  warm-backend dev loop; keep the staging fallback as option 2.
- tests/e2e/CLAUDE.md: updated commands so agent contexts reflect the
  pytest-driven lifecycle.
- tests/e2e/.env.example: drop unused SIGNOZ_E2E_ENV_TYPE; note the file
  is only needed for staging mode.

* fix(tests/fixtures/signoz.py): anchor Docker build context to repo root

Previously used path="../../" which resolved to the repo root only when
pytest's cwd was tests/integration/. After hoisting the pytest project
to tests/, that same relative path pointed one level above the repo
root and the build failed with:

  Cannot locate specified Dockerfile: cmd/enterprise/Dockerfile.with-web.integration

Anchor the build context to an absolute path computed from __file__ so
the fixture works regardless of pytest cwd.

* feat(tests/e2e): alerts-downtime regression suite (platform-pod/issues/2095)

Import the 34-step regression suite originally developed on
platform-pod/issues/2095-frontend. Targets the alerts and planned-downtime
frontend flows after their migration to generated OpenAPI clients and
generated react-query hooks.

- specs/alerts-downtime/: SUITE.md (the stable spec), README.md (scope +
  open observations from the original runs), results-schema.md (legacy
  per-run artifact shape, retained for context).
- tests/alerts-downtime/alerts-downtime.spec.ts: 881-line Playwright spec
  covering 6 flows — alert CRUD/toggle, alert detail 404, planned
  downtime CRUD, notification channel routing, anomaly alerts.

Integration with the shared suite:
- Uses baseURL + storageState from tests/e2e/playwright.config.ts (no
  separate config). page.goto calls use relative paths; SIGNOZ_E2E_*
  env vars from the pytest bootstrap drive auth.
- test.describe.configure({ mode: 'serial' }) at the top of the describe:
  the flows mutate shared tenant state, so parallel runs cause cross-
  flow interference (documented in the original 2095 config).
- Per-run artifacts (network captures + screenshots) land in
  tests/e2e/tests/alerts-downtime/run-spec-<ts>/ by default — gitignored.

Historical per-run artifacts (~7.5MB of screenshots across run-1 through
run-7) are not imported; they lived at e2e/2095/run-*/ on the original
branch and remain there if needed.

* refactor(fixtures/traces): extract insert + truncate helpers

Pull the ClickHouse insert path out of the insert_traces pytest fixture
into a plain module-level function insert_traces_to_clickhouse(conn,
traces), and move the per-table TRUNCATE loop into truncate_traces_tables
(conn, cluster). The fixture becomes a thin wrapper over both — zero
behavioural change.

Lets the HTTP seeder container (tests/fixtures/seeder/) reuse the exact
same insert + truncate code the pytest fixture uses, so the two stay in
sync as the trace schema evolves.

* feat(fixtures/seeder): HTTP seeder container for fine-grained telemetry seeding

Adds a sibling container alongside signoz/clickhouse/postgres that exposes
HTTP endpoints for direct-ClickHouse telemetry seeding, so Playwright
tests can shape per-test data without going through OTel or the SigNoz
ingestion path.

tests/fixtures/seeder/:
- Dockerfile: python:3.13-slim + the shared fixtures/ tree so the
  container can import fixtures.traces and reuse the exact insert path
  used by pytest.
- server.py: FastAPI app with GET /healthz, POST /telemetry/traces
  (accepts a JSON list matching Traces.from_dict input; auto-tags each
  inserted row with resource seeder=true), DELETE /telemetry/traces
  (truncates all traces tables).
- requirements.txt: fastapi, uvicorn, clickhouse-connect, numpy plus
  sqlalchemy/pytest/testcontainers because fixtures/{__init__,types,
  traces}.py import them at module load.

tests/fixtures/seeder/__init__.py: pytest fixture (`seeder`, package-
scoped) that builds the image via docker-py (testcontainers DockerImage
had multi-segment dockerfile issues), starts the container on the
shared network wired to ClickHouse via env vars, and waits for
/healthz. Cache key + restore follow the dev.wrap pattern other
fixtures use for --reuse.

tests/.dockerignore: exclude .venv, caches, e2e node_modules, and test
outputs so the build context is small and deterministic.

tests/conftest.py: register fixtures.seeder as a pytest plugin.

Currently traces-only — logs + metrics follow the same pattern.

* feat(tests/e2e): surface seeder_url to Playwright via globalSetup

- bootstrap/setup.py: test_setup now depends on the seeder fixture and
  writes seeder_url into .signoz-backend.json alongside base_url.
- bootstrap/run.py: test_e2e exports SIGNOZ_E2E_SEEDER_URL to the
  subprocessed yarn test so Playwright specs can reach the seeder
  directly in the one-command path.
- global.setup.ts: if .signoz-backend.json carries seeder_url, populate
  process.env.SIGNOZ_E2E_SEEDER_URL. Remains optional — staging mode
  leaves it unset.

Playwright specs that want per-test telemetry can:
  await fetch(process.env.SIGNOZ_E2E_SEEDER_URL + '/telemetry/traces', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify([...])
  });
and await a truncate via DELETE on teardown.

* fix(alerts-downtime): capture load-time GETs before navigation

Flow 1 registered cap.mark() AFTER page.goto() and then called
page.waitForResponse(/api/v2/rules) — but against a fast local backend
the GET /api/v2/rules response arrived during page.goto, before the
waiter could register, and the test timed out at 30s.

installCapture's page.on('response') listener runs from before the
navigation, so moving mark() above page.goto() and relying on
dumpSince's 500ms drain is enough. No lost precision.

One site only; the same pattern exists in later flows (via per-action
waitForResponse) and may surface similar races — those are left for a
follow-up once the backend-side 2095 migration lands on main (current
frontend still calls PATCH /api/v1/rules/:id which the spec's assertion
doesn't match anyway).

* refactor(fixtures/logs,metrics): extract insert + truncate helpers

Mirror the traces refactor: pull the ClickHouse insert path out of the
insert_logs / insert_metrics pytest fixtures into plain module-level
functions (insert_logs_to_clickhouse, insert_metrics_to_clickhouse) and
move the per-table TRUNCATE loops into truncate_logs_tables /
truncate_metrics_tables. The fixtures become thin wrappers — zero
behavioural change.

Sets up the seeder container to expose POST/DELETE endpoints for logs
and metrics using the exact same code paths as the pytest fixtures.

* feat(fixtures/seeder): add logs and metrics endpoints

Extend the seeder with POST/DELETE endpoints for logs and metrics,
following the same shape as the existing traces endpoints:

- POST /telemetry/logs accepts a JSON list matching Logs.from_dict;
  tags each row's resources with seeder=true.
- POST /telemetry/metrics accepts a JSON list matching Metrics.from_dict;
  tags resource_attrs with seeder=true (Metrics.from_dict unpacks
  resource_attrs rather than a resources dict).
- DELETE /telemetry/logs, DELETE /telemetry/metrics truncate via the
  shared truncate_*_tables helpers.

Requirements gain svix-ksuid because fixtures/logs.py imports KsuidMs
for log id generation.

Verified end-to-end against the warm backend: POST inserted=1 on each
signal, DELETE truncated=true on each.

* refactor(fixtures/seeder): align status codes with HTTP semantics

- POST /telemetry/{traces,logs,metrics}: return 201 Created (kept the
  {inserted: N} body so callers can verify the count landed).
- DELETE /telemetry/{traces,logs,metrics}: return 204 No Content with
  an empty body.

* refactor(tests/seeder): extract from fixtures/ into top-level package

Move the HTTP seeder (Dockerfile, requirements.txt, server.py) out of
tests/fixtures/seeder/ and into its own tests/seeder/ top-level package.
The pytest fixture that builds and runs the image moves to
tests/fixtures/seeder.py so it sits next to the other container fixtures.

Rationale: the seeder is a standalone containerized Python service, not a
pytest fixture. It ships a Dockerfile, its own requirements.txt, and a
server.py entrypoint — none of which belong under a package whose purpose
is shared pytest code.

Image-side changes:
- Dockerfile now copies seeder/ alongside fixtures/ and launches
  seeder.server:app instead of fixtures.seeder.server:app.
- Build context stays tests/ (unchanged), so fixtures.* imports inside
  server.py continue to resolve.

Fixture-side changes:
- _TESTS_ROOT computation drops one parent (parents[1] now that the file
  is at tests/fixtures/seeder.py, not tests/fixtures/seeder/__init__.py).
- The dockerfile= path passed to docker-py becomes seeder/Dockerfile.

No behavior change; every consumer still imports the seeder fixture as
before and gets the same container.

* refactor(fixtures/keycloak): rename from idp.py to name the concrete tech

The container provider at fixtures/idp.py brought up a Keycloak image. Name
it for what it is so we can use fixtures/idp.py later for API-side IdP
helpers (OIDC/SAML admin flows) without an idp-vs-idputils naming collision.

- fixtures/idp.py → fixtures/keycloak.py (git rename).
- fixtures.idputils updates its one internal import to fixtures.keycloak.
- conftest.py pytest_plugins entry points at the new module.

No caller outside fixtures/ imports fixtures.idp directly, so no shim is
needed. The "idp" fixture name (how tests reference it) is unchanged.

* refactor(fixtures/gateway): drop -utils suffix

The module only held helper functions (no fixtures). Rename to match the
domain and leave a shim at the old path so integration/ import sites keep
working until they are swept in a follow-up.

* fix(fixtures/gatewayutils): silence wildcard-import in deprecation shim

The shim intentionally re-exports via `from fixtures.gateway import *`;
pylint flags the wildcard and every unused-wildcard symbol. Suppress both
in the shim only — the live module has no wildcard.

* refactor(fixtures/auth): merge authutils helpers into auth

Pull the pure-helper functions from authutils.py (create_active_user,
find_user_by_email, find_user_with_roles_by_email, assert_user_has_role,
change_user_role) into auth.py next to the fixtures they complement.
Fixtures remain on top; helpers go below. Drop the module docstring.

Replace authutils.py with a deprecation shim that re-exports from
fixtures.auth so integration/ import sites (9 files) keep working until
they are swept in a follow-up. Suppress the wildcard-import warnings in
the shim only.

* refactor(fixtures/alerts): merge alertutils helpers into alerts

Pull the pure-helper functions from alertutils.py
(collect_webhook_firing_alerts, _verify_alerts_labels,
verify_webhook_alert_expectation, update_rule_channel_name) into alerts.py
next to the fixtures they complement. Fixtures stay on top; helpers go
below.

Replace alertutils.py with a deprecation shim that re-exports from
fixtures.alerts so integration/ import sites keep working until they are
swept in a follow-up.

* refactor(fixtures/cloudintegrations): merge cloudintegrationsutils helpers

Pull the pure-helper functions from cloudintegrationsutils.py
(deprecated_simulate_agent_checkin, setup_create_account_mocks,
simulate_agent_checkin) into cloudintegrations.py next to the fixtures
they complement. Fixtures stay on top; helpers go below.

Replace cloudintegrationsutils.py with a deprecation shim that re-exports
from fixtures.cloudintegrations so integration/ import sites keep working
until they are swept in a follow-up.

* refactor(fixtures/idp): rename idputils to idp now that keycloak owns the container

With the Keycloak container provider at fixtures.keycloak, the fixtures.idp
name is free for what idputils always was — API/browser helpers for OIDC
and SAML admin flows against the IdP container.

- fixtures.idputils → fixtures.idp (git rename).
- conftest.py pytest_plugins swaps fixtures.idputils for fixtures.idp so
  the create_saml_client / create_oidc_client fixtures register under the
  canonical path.

Replace fixtures.idputils with a deprecation shim re-exporting from
fixtures.idp so integration/ import sites (callbackauthn) keep working
until they are swept in a follow-up.

* refactor(fixtures/reuse): rename from dev to describe what the module is

The module wraps pytest-cache resource reuse/teardown for container
fixtures; "dev" conveyed nothing about its role. Rename to fixtures.reuse
and update the 12 internal callers that imported `from fixtures import
dev, types` to use `reuse` instead.

Replace fixtures.dev with a deprecation shim so any external caller keeps
working until the follow-up sweep.

* refactor(fixtures/time,fs): split utils by responsibility

fixtures.utils only held two time parsers (parse_timestamp, parse_duration)
and one path helper (get_testdata_file_path) — a "utils" grab bag.

- Time parsers move to fixtures.time (utils.py → time.py via git rename).
- get_testdata_file_path moves into fixtures.fs where other filesystem
  helpers live.
- Internal callers (alerts, logs, metrics, traces) update to the new paths.

Replace fixtures.utils with a deprecation shim that re-exports all three
functions so integration/ import sites keep working until the follow-up
sweep.

* refactor(fixtures/browser): rename from driver to match peer primitives

fixtures.driver was the Selenium WebDriver fixture — rename the module to
fixtures.browser so it sits next to fixtures.http as a named primitive.
The fixture name inside (driver) stays — that's the Selenium-canonical
term and tests reference it directly.

conftest.py pytest_plugins entry points at the new module. A deprecation
shim at fixtures.driver keeps any external caller working until the
follow-up sweep.

* refactor(tests/seeder): install deps via uv from pyproject, drop requirements.txt

The seeder's requirements.txt duplicated 7 of 10 deps from pyproject.toml
with overlapping version pins — a standing drift risk. The comment on top
of the file admitted the real problem: the seeder image already ships
pytest + testcontainers + sqlalchemy because importing fixtures.traces
walks fixtures/__init__.py and fixtures/types.py. "Don't ship test infra"
was already violated.

- Add fastapi, uvicorn[standard], and py to pyproject.toml dependencies
  (the three seeder-only deps that were not yet in pyproject; `py` was a
  latent gap since fixtures/types.py uses py.path.local but pytest only
  pulls it in transitively).
- Switch the Dockerfile to `uv sync --frozen --no-install-project --no-dev`
  so the container env matches local dev exactly (uv.lock is the single
  source of truth for versions).
- Move tests/seeder/Dockerfile → tests/Dockerfile.seeder so it lives
  alongside the pyproject at the root of the build context.
- Delete tests/seeder/requirements.txt.

The seeder image grows by ~40-50MB (selenium, psycopg2, wiremock now come
along from main deps); accepted as a cost of single source of truth since
the seeder is dev-only infra, not a shipped artifact.

* refactor(tests/integration): flatten src/ into bootstrap/ + tests/

Drop the redundant src/ layer in the integration tree. 'src' carries no
information — the directory IS integration test source. After flatten:

  tests/integration/
    bootstrap/setup.py        was src/bootstrap/setup.py
    tests/<suite>/*.py        was src/<suite>/*.py (16 suites)
    testdata/

Updates:
- Makefile: py-test-setup/py-test-teardown/py-test target paths.
- tests/README.md: layout diagram + command examples.
- tests/pyproject.toml: python_files glob now matches basenames
  explicitly — "[0-9][0-9]_*.py" for NN-prefixed suite files plus
  "setup.py" and "run.py" for bootstrap entrypoints. The old "*/src/.."
  glob stopped matching anything here and would have caused pytest to
  try collecting seeder/server.py as a test.

* refactor(tests/e2e): flatten src/ into bootstrap/

Drop the e2e/src/ wrapper — the only Python content under it was
bootstrap/, which is now a direct child of e2e/. Keeps integration and
e2e symmetric (both have bootstrap/, tests/, testdata/ as peers).

Also delete bootstrap/__init__.py on both integration and e2e sides.
With --import-mode=importlib, pytest walks up from each .py file to find
the highest __init__.py-containing dir and uses that as the package root.
Without integration/__init__.py or e2e/__init__.py above bootstrap/, both
setup.py files resolved to the same dotted name `bootstrap.setup`, causing
a sys.modules collision that silently dropped test_telemetry_databases_exist
from integration's bootstrap. With no __init__.py anywhere, pytest treats
each setup.py as a standalone module via spec_from_file_location and both
are collected cleanly.

Updates tests/README.md, tests/e2e/README.md, and tests/e2e/CLAUDE.md path
references from e2e/src/bootstrap/ to e2e/bootstrap/.

* refactor(tests/e2e): drop specs/ + strip // spec: back-pointers

specs/ held markdown test plans that mirrored tests/ 1:1 as pre-code
scratch. Once a test exists, the plan is stale the moment the test
diverges — they're AI-planner output, not source of truth. Keep the
workflow alive by .gitignore-ing specs/ (the planner agent can still
write locally) but stop shipping stale plans in the repo.

Strip the `// spec: specs/...` and `// seed: tests/seed.spec.ts` header
comments from 5 .spec.ts files. The spec pointer is dead; the seed
pointer was convention-only — Playwright collects regardless.

* docs(contributing/tests): move e2e/integration guides out of test dirs

Pull the e2e contributor guide out of tests/e2e/CLAUDE.md (which read
like a full agent-workflow reference doc) and into
docs/contributing/tests/e2e.md alongside the existing development / go
guides.

- Delete tests/e2e/CLAUDE.md; its content (layout, commands, role tags,
  locator priority, Playwright agent workflow) lives in the new e2e.md
  with references to the now-.gitignore'd specs/ dir removed.
- Add docs/contributing/tests/integration.md — short guide covering
  layout, runner commands, filename conventions, and the flow for
  adding a new suite (there was no contributor doc for this before).
- Trim tests/e2e/README.md to quick-start + commands; link out to the
  full guide. Readers who just want to run tests get the 5 commands
  they need; anything deeper is one hop away.

* chore(tests/e2e): drop examples/example-test-plan.md

Init-agents boilerplate. Fresh planner agents don't need a checked-in
template; they can write to the .gitignore'd specs/ scratch dir.

tests/integration/.qodo/ was also removed (untracked, empty; .qodo is
already in the root .gitignore).

* refactor(tests/seeder): use fixtures.logger.setup_logger

Drop the one-off logging.basicConfig + logging.getLogger("seeder") in
favor of the shared setup_logger helper that every fixtures/*.py already
uses. Keeps log format consistent across pytest runs and the seeder
container.

fixtures.logger ships into the image via the existing COPY fixtures step
in Dockerfile.seeder — no build change needed.

* fix(tests/e2e): correct e2e_dir path after src/ flatten

After phase 2 (flatten tests/e2e/src/ into tests/e2e/), the run.py file
sits one level closer to the e2e root. parents[2] now resolves to tests/
instead of tests/e2e/, so yarn test would subprocess from the wrong cwd.

parents[1] is the correct index now.

* fix(tests/e2e): correct endpoint-file path in setup.py after src/ flatten

Same class of stale-path bug as the run.py fix: after the e2e/src/
flatten, setup.py sits one level closer to the e2e root. parents[2] now
lands at tests/ instead of tests/e2e/, so .signoz-backend.json would be
written to tests/.signoz-backend.json and the Playwright global.setup.ts
(which expects tests/e2e/.signoz-backend.json) wouldn't find it.

parents[1] is correct.

* refactor(tests/e2e): drop pre-seed fixtures; each spec owns its data

The seeder (tests/seeder/) was built so specs can POST telemetry
per-test. Global pre-seeding via tests/e2e/conftest.py (seed_dashboards,
seed_alert_rules, seed_e2e_telemetry) is the exact anti-pattern that
setup obsoletes — shared state across specs, order-dependent runs, no
reset between tests.

- Delete tests/e2e/conftest.py (3 fixtures, all pre-seed).
- Delete tests/e2e/testdata/dashboards/apm-metrics.json — its only
  consumer was seed_dashboards. tests/e2e/testdata/ now empty and gone.
- Drop seed_dashboards, seed_alert_rules, seed_e2e_telemetry params
  from bootstrap/setup.py::test_setup and bootstrap/run.py::test_e2e.
  test_teardown never depended on them.
- Refresh the module docstrings on both bootstrap tests to reflect the
  new model (backend + seeder up; specs seed themselves).
- Update tests/README.md and docs/contributing/tests/e2e.md: remove the
  testdata/ + conftest.py references, document the per-spec seeding
  rule (telemetry via seeder endpoints, dashboards/alerts via SigNoz
  REST API from the spec).

Known breakage: tests/e2e/tests/dashboards/dashboards-list.spec.ts
expects at least one dashboard to exist. With seed_dashboards gone, it
will fail until that spec is updated to create its own dashboard via
the SigNoz API in test.beforeAll. Followup.

* refactor(tests/e2e): relocate auth helper into fixtures/; expose authedPage

Rename tests/e2e/utils/login.util.ts → tests/e2e/fixtures/auth.ts and
drop the (now-empty) utils/ dir. "Fixtures" is the unit of per-test
shared setup on both the Python and TS sides of this project — naming
them consistently across trees makes the parallel obvious.

fixtures/auth.ts now exports three things:

- `test` — Playwright test extended with an authedPage fixture. New
  specs can request `authedPage` as a param and skip the
  `beforeEach(() => ensureLoggedIn(page))` boilerplate entirely.
- `expect` — re-exported from @playwright/test so callers have one
  import.
- `ensureLoggedIn(page)` — the underlying helper, still exported for
  specs that want per-call control.

Update the 4 specs that imported from utils/login.util to point at the
new path; no behavior change in those specs (they keep calling
ensureLoggedIn in beforeEach). Refactoring them to use authedPage can
happen spec-by-spec later.

Also update the path example in .cursorrules so AI-generated snippets
reach for the new import path.

* refactor(tests/e2e): emit .env.local instead of .signoz-backend.json

The old flow (pytest writes JSON → global.setup.ts loads it → exports
env vars) was doing what dotenv already does. Collapse to the native
pattern:

- bootstrap/setup.py writes tests/e2e/.env.local with the four coords
  (BASE_URL, USERNAME, PASSWORD, SEEDER_URL). File header marks it as
  generated.
- playwright.config.ts loads .env first, then .env.local with
  override=true. User-provided defaults stay in .env; generated values
  win when present.
- Delete tests/e2e/global.setup.ts (36 lines gone) and its globalSetup
  reference in playwright.config.ts.

Subprocess-injected env (run.py shelling out to yarn test) still wins
because dotenv doesn't overwrite already-set process.env keys.

Rename the test-only override env var SIGNOZ_E2E_ENDPOINT_FILE →
SIGNOZ_E2E_ENV_FILE for accuracy. Update .env.example, .gitignore (drop
.signoz-backend.json, keep .env.local with its explanatory comment),
tests/README.md, docs/contributing/tests/e2e.md.

* refactor(tests/e2e/alerts-downtime): drop custom network + screenshot capture

The spec wrapped every /api/ response in a bespoke installCapture(), wrote
hand-named JSON files per call (01_step1.1_GET_rules.json, ...), and took
step-by-step screenshots — all going into run-spec-<ts>/ next to the spec
(gitignored).

Playwright already records equivalent data via `trace` (network bodies,
screenshots per step, DOM snapshots, console — viewable via
`playwright show-trace`). The capture infra was duplicating that for the
one-shot 2095 regression audit; no downstream consumer reads the JSON or
PNG artifacts now.

- Remove installCapture, shot, RUN_DIR/NET_DIR/SHOT_DIR, fs/path imports.
- Strip cap.mark()/cap.dumpSince()/shot() calls throughout the 7 flows.
- Collapse the block-scopes that only existed to bound mark variables.
- Drop the "Artifacts" paragraph from the file's top-of-file comment.
- Remove the `tests/alerts-downtime/run-spec-*/` entry from .gitignore.

Spec drops from 885 lines to 736 (≈17% smaller). All 7 flows + their
assertions are unchanged. For debug access, rely on
`trace: 'on-first-retry'` (already set in playwright.config.ts) + `yarn
show-trace`.

* refactor(tests/e2e): move alerts-downtime.spec.ts into alerts/

The spec lives mostly in the alerts domain (6 of 7 flows), with the
planned-downtime CRUD (Flow 4) and cascade-delete (Flow 5) as
cross-feature collateral. The standalone alerts-downtime/ dir was
compound-named, breaking the one-feature-per-dir pattern every other
dir under tests/ follows, and duplicating the spec's own filename.

Move to tests/alerts/alerts-downtime.spec.ts. Empty alerts-downtime/
dir removed.

* refactor(tests/e2e): consolidate Playwright output under artifacts/

All Playwright outputs now land under a single tests/e2e/artifacts/ dir
so CI can archive it in one command (tar / zip / upload-artifact). Each
piece was writing to its own sibling of tests/e2e/ before.

playwright.config.ts:
- outputDir: 'artifacts/test-results' — per-test traces, screenshots,
  videos (was default test-results/).
- HTML reporter → 'artifacts/html-report' (was default
  playwright-report/); open: 'never' so CI doesn't spawn a browser on
  report generation.
- JSON reporter → 'artifacts/results.json' (was
  'test-results/results.json').

package.json: `yarn report` now points playwright show-report at the new
HTML folder.

Ignore updates — replace the two old paths with /artifacts/ in
tests/e2e/.gitignore, tests/e2e/.prettierignore, and tests/.dockerignore
(seeder image build context).

.cursorrules: update the `cat test-results/results.json` example to the
new path so AI-generated snippets reach for the right file.

Delete the empty test-results/ and playwright-report/ dirs that prior
runs left behind.

* refactor(tests/e2e): one artifacts/ subdir per reporter

Within artifacts/, give each reporter its own named subdir so the layout
tells you what wrote what:

  artifacts/
    html/              # HTML reporter (was artifacts/html-report)
    json/results.json  # JSON reporter (was artifacts/results.json)
    test-results/      # outputDir — per-test traces/screenshots/videos

`yarn report` and the .cursorrules cat example point at the new paths.

* refactor(tests/e2e): drop SIGNOZ_USER_ROLE env filter and @admin/@editor/@viewer tags

The filter claimed to be role-based but only grep'd by tag — the actual
browser session is always admin (bootstrap creates one admin, auth.setup.ts
saves one storageState, every project uses it). Tagging tests `@viewer`
didn't mean they ran as a viewer; it just meant they'd be in the subset
selected when SIGNOZ_USER_ROLE=Viewer. Superset semantics (admin sees
everything) meant the filter was at best a narrower test selection and
at worst a misleading assertion of role coverage.

Gone:
- getRoleGrepPattern() + grep: line in playwright.config.ts.
- The dedicated setup project's grep override (no filter to override).
- SIGNOZ_USER_ROLE entries in .env.example, README, docs/contributing.
- The "Role-Based Testing" section + all role-tagging guidance and
  example snippets in .cursorrules.
- All `{ tag: '@viewer' | '@editor' | '@admin' }` annotations on the 90
  affected test sites across 5 spec files (single-line and multi-line
  forms). ~90 annotations gone.

For ad-hoc selection, `yarn test --grep <pattern>` still works on
Playwright's normal grep (test titles/paths).

Real role-based coverage (separate users + storageStates per role) is a
different problem — not pretending this was it.

* chore(tests/e2e): drop .cursorrules

* refactor(tests/e2e): move auth from project-level storageState to per-suite fixture

Replaced auth.setup.ts + globally-mounted storageState with a test-scoped
authedPage fixture in tests/e2e/fixtures/auth.ts. Each suite controls its
own identity via `test.use({ user: ... })`; specs that need to run
unauthenticated just request the stock `page` fixture instead.

fixtures/auth.ts:
- Declares `user` as a test option, defaulting to ADMIN (creds from
  .env.local / .env).
- authedPage resolves to a Page whose context has storageState mounted
  for that user. First request per (user, worker) triggers one login
  and writes a per-user storageState file under .auth/; subsequent
  requests reuse it via a Promise-valued cache.
- Exposes `User` type and `ADMIN` constant so future suites can declare
  additional users (EDITOR, VIEWER) as credentials become available.

playwright.config.ts:
- Drop authFile constant, `setup` project, storageState + dependencies
  on each browser project.

tests/auth.setup.ts:
- Deleted. Login logic now lives inside fixtures/auth.ts's login() helper,
  called on demand by the fixture rather than upfront for the whole run.

Spec migration (6 files):
- Import `test, expect` from ../fixtures/auth (or ../../fixtures/auth)
  instead of @playwright/test.
- Drop `ensureLoggedIn` imports and `await ensureLoggedIn(page)` calls.
- Swap `{ page }` → `{ authedPage: page }` in test and beforeEach
  destructures (local var stays `page` via aliasing so test bodies need
  no further changes).

Cost: N logins per run, where N = unique users × workers (= 1 × 2–4
today, vs the old 1 globally). Tradeoff for explicit per-suite control.

Specs that need unauth later just use `async ({ page }) => ...` — the
fixture isn't invoked, so no login fires.

291 tests still list (previously 292: the old auth.setup.ts counted as
one fake "test"; it's gone now).

* refactor(tests/e2e): cache auth storageState in memory, drop .auth/ dir

The fixture was writing each user's storageState to .auth/<user>.json and
then handing Playwright the file path. But Playwright's
browser.newContext({ storageState }) accepts the object form too —
ctx.storageState() without a path arg returns the cookies+origins
inline.

Keeping the cache in memory means no filesystem roundtrip per login, no
.auth/ dir to maintain, no stale JSON persisting across runs, and no
gitignore entry for it. Each worker's Map holds one Promise<StorageState>
per unique user, resolved on first login and reused thereafter.

Drop the .auth/ entry from tests/e2e/.gitignore; delete the (now unused)
on-disk .auth/ dir.

* chore(tests/e2e): drop seed.spec.ts

* chore(tests/e2e): drop unused README.md and .mcp.json

* refactor(tests/e2e): move existing specs to legacy/ pending fresh rewrite

Park the 5 current spec files under tests/e2e/legacy/ while fresh specs
get written in tests/e2e/tests/ against the new conventions (TC-NN
titles, authedPage fixture, minimal direct-fetch). Playwright's testDir
stays pointed at ./tests — `yarn test` now finds 0 tests until the
first fresh spec lands. legacy/ is preserved for reference but not
collected by default.

Add a .gitkeep under tests/ so the empty dir survives in git between
the move and the first new spec.

Running legacy on demand:
  npx playwright test --config tests/e2e/playwright.config.ts \
    --project chromium legacy/<spec>.ts
(or temporarily point testDir at ./legacy in the config). No yarn
script wired — legacy is expected to rot as fresh specs replace it.

* refactor(tests): drop -utils deprecation shims; import from canonical modules

The shims we introduced during the phase-3 merges (authutils, alertutils,
cloudintegrationsutils, idputils, gatewayutils) and the phase-4 primitive
renames (dev, utils, driver) have done their job — integration/ tests can
now import directly from the real modules.

Rewrite every shim-import in tests/integration/tests/:
  fixtures.authutils → fixtures.auth
  fixtures.alertutils → fixtures.alerts
  fixtures.cloudintegrationsutils → fixtures.cloudintegrations
  fixtures.idputils → fixtures.idp
  fixtures.gatewayutils → fixtures.gateway
  fixtures.utils (get_testdata_file_path) → fixtures.fs

Delete all 8 shim files:
  fixtures/{authutils,alertutils,cloudintegrationsutils,idputils,
  gatewayutils,dev,utils,driver}.py

Nothing in active code (integration tests, e2e fixtures, bootstrap, seeder)
imported fixtures.dev or fixtures.driver, so those had no callers to
sweep — just delete.

500 tests still collect.

* fix(tests/seeder): add python3-dev so psycopg2 can compile in the image

Consolidating seeder deps into pyproject.toml pulled in psycopg2, which
needs Python dev headers (Python.h) to build from source. The apt layer
had gcc + libpq-dev but was missing python3-dev, so \`uv sync --frozen
--no-install-project --no-dev\` failed with "gcc failed with exit code 1"
during the seeder image build.

Add python3-dev to the apt install line; image size bump ~50MB for dev
headers. Alternative would have been swapping psycopg2 for
psycopg2-binary in pyproject.toml, but that'd affect the whole test
project for one Dockerfile concern — wrong scope.

* feat(tests/e2e): re-author 2095 alerts + downtime regression

Three fresh specs split by resource replace the 736-line
legacy monolith at tests/e2e/legacy/alerts/alerts-downtime.spec.ts:

- alerts.spec.ts: rule list CRUD, labels round-trip, test-notification
  pre-state, details/history/AlertNotFound, anomaly (EE-gated, skip on
  community)
- downtime.spec.ts: planned-downtime CRUD round-trip
- cascade-delete.spec.ts: 409 paths on rule/downtime delete when linked

UI-first: Playwright traces capture the BE conversations, so direct
page.request calls are reserved for seeding where the query-builder
setup is incidental to the test, API-contract probes, and cleanup.

* refactor(tests/e2e): group alerts specs under tests/alerts/

* feat(tests/fixtures/auth): apply_license fixture + wire into e2e bootstrap

- Adds a package-scoped apply_license fixture that stubs the Zeus
  /v2/licenses/me mock and POSTs /api/v3/licenses so the BE flips to
  ENTERPRISE. The fixture also PUTs org_onboarding=true because the
  license enables the onboarding flag which would otherwise hijack
  every post-login navigation to a questionnaire.
- Wires apply_license into e2e/bootstrap/setup.py::test_setup and
  ::test_teardown alongside create_user_admin.
- Existing add_license helper stays as-is for integration tests.
- Login fixture now waits for the URL to leave /login instead of a
  pre-license "Hello there" welcome string (the post-login landing
  page varies with license state).
- TC-07 anomaly test no longer skips (license enables the flag) and
  drops the legacy test-notification API contract probe that needs
  seeded metric data (covered by the integration suite).

* chore: cleanup

* chore: remove claude files

* chore(tests/fixtures): drop unused dashboards.py

* chore(tests/e2e): rename playwright outputDir to artifacts/results

* chore(tests/e2e): drop legacy specs, trim alerts.spec.ts to one smoke test

Deletes tests/e2e/legacy/ (five old 2095-replay specs) and the two
sibling alerts suite files (downtime, cascade-delete). alerts.spec.ts
is reduced to a single TC-01 smoke test that loads /alerts and asserts
the tabs render — a fresh minimum to build on.

* docs(contributing): new integration.md + e2e.md at top level

Promotes the two test-contributor docs from docs/contributing/tests/
to docs/contributing/ and rewrites them in the long-form Q&A format
of docs/contributing/go/integration.md (prerequisites → setup →
framework → writing → running → configuring → remember).

Reflects the current state: shared fixtures package at tests/fixtures/,
flat integration suites under tests/integration/tests/, e2e specs
grouped by resource under tests/e2e/tests/<feature>/, apply_license
fixture in the bootstrap, authedPage Playwright fixture, and the
artifacts/{html,json,results} output layout.

* docs(contributing): relocate integration.md + e2e.md to tests/

Moves docs/contributing/go/integration.md -> docs/contributing/tests/integration.md
and docs/contributing/e2e.md -> docs/contributing/tests/e2e.md so the test-
contributor docs live under contributing/tests/. The previous top-level
promotion at docs/contributing/integration.md is removed; go/readme.md
drops the dangling integration link.

* docs(contributing/tests): update integration.md to current repo layout

* ci(tests): fix integrationci paths + add e2eci workflow

integrationci:
- Matrix path was integration/src/<suite> (old layout); current layout
  is integration/tests/<suite>. Renames the matrix key src -> suite and
  fixes the pytest path accordingly.
- Adds auditquerier and rawexportdata to the matrix (new suites).
- Drops bootstrap from the matrix — it's no longer a test suite, just
  the pytest lifecycle entry.

e2eci (new, replaces the broken frontend/-based run-e2e.yaml):
- Label-gated trigger mirroring integrationci: requires safe-to-test +
  safe-to-e2e. Runs on pull_request / pull_request_target.
- Installs Python (uv) and Node (yarn), syncs tests/ deps, installs
  Playwright browsers for the matrix project.
- Brings the stack up via e2e/bootstrap/setup.py::test_setup --with-web
  (build signoz-with-web container once), runs playwright against it,
  tears down in an always-run step.
- Uploads the HTML report + per-test traces as artifacts.
- Matrix starts with chromium only (firefox / webkit can follow).

* ci(tests/e2e): upload entire artifacts/ dir, 5-day retention

* fix(tests/fixtures): apply black formatting to truncate helpers

* fix(tests/pyproject): ignore node_modules and py module in pylint

* ci(tests): drop auditquerier from integrationci matrix for now

* refactor(tests): drop __file__.parents[N] path tricks; use pytestconfig.rootpath

The pytest rootdir is already tests/, so anywhere we were computing
_REPO_ROOT / _TESTS_ROOT / e2e-dir from Path(__file__).resolve().parents[N]
can just use pytestconfig.rootpath (or .parent for the repo root).

- fixtures/signoz.py: DockerImage path → pytestconfig.rootpath.parent
- fixtures/seeder.py: docker-py build path → pytestconfig.rootpath
- e2e/bootstrap/setup.py: .env.local path → pytestconfig.rootpath / e2e
- e2e/bootstrap/run.py: yarn-test cwd → pytestconfig.rootpath / e2e

* chore(tests/e2e): drop bootstrap/run.py

The run.py entrypoint was just setup.py + subprocess('yarn test'); CI
splits those steps anyway (separate provision / test / teardown for
clean artifact capture) and locally the two-step flow is equivalent.
Removing the duplicate entrypoint; docs updated accordingly.

* cleanup(tests): simplify review pass

- fixtures/fs.py: testdata path resolved to tests/testdata after the
  fixture move; integration tests with data-driven parametrize (e.g.
  alerts/02_basic_alert_conditions.py) were all failing with
  FileNotFoundError. Walk to tests/integration/testdata now.
- fixtures/auth.py: extract _login helper so apply_license stops
  duplicating the GET /sessions/context + POST /sessions/email_password
  pair. Add a retry loop on POST /api/v3/licenses so a BE that isn't
  quite ready at bring-up time doesn't fail the fixture.
- seeder/server.py: use FastAPI lifespan to open+close the ClickHouse
  client instead of a lazy module-level global; collapse the verbose
  module docstring.
- fixtures/seeder.py + e2e/bootstrap/setup.py: trim docstrings/comments
  that narrated WHAT the code does — per-repo convention keeps only
  non-obvious WHY.
- .github/workflows/integrationci.yaml: gate the Chrome + chromedriver
  install on matrix.suite == 'callbackauthn' (the only suite that uses
  Selenium). Saves ~30s × 50 jobs on every PR run.
2026-04-23 10:05:49 +00:00
Nikhil Mantri
89b755a6b0 feat(infra-monitoring): v2 hosts list api (#10805)
* chore: baseline setup

* chore: endpoint detail update

* chore: added logic for hosts v3 api

* fix: bug fix

* chore: disk usage

* chore: added validate function

* chore: added some unit tests

* chore: return status as a string

* chore: yarn generate api

* chore: removed isSendingK8sAgentsMetricsCode

* chore: moved funcs

* chore: added validation on order by

* chore: updated spec

* chore: nil pointer dereference fix in req.Filter

* chore: added temporalities of metrics

* chore: unified composite key function

* chore: code improvements

* chore: hostStatusNone added for clarity that this field can be left empty as well in payload

* chore: yarn generate api

* chore: return errors from getMetadata and lint fix

* chore: return errors from getMetadata and lint fix

* chore: added hostName logic

* chore: modified getMetadata query

* chore: add type for response and files rearrange

* chore: warnings added passing from queryResponse warning to host lists response struct

* chore: added better metrics existence check

* chore: added a TODO remark

* chore: added required metrics check

* chore: distributed samples table to local table change for get metadata

* chore: frontend fix

* chore: endpoint correction

* chore: endpoint modification openapi

* chore: escape backtick to prevent sql injection

* chore: rearrage

* chore: improvements

* chore: validate order by to validate function

* chore: improved description

* chore: added TODOs and made filterByStatus a part of filter struct

* chore: ignore empty string hosts in get active hosts

* feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956)

* chore: add functionality for showing active and inactive counts in custom group by

* chore: bug fix

* chore: added subquery for active and total count

* chore: ignore empty string hosts in get active hosts

* fix: sinceUnixMilli for determining active hosts compute once per request

* chore: refactor code

* chore: rename HostsList -> ListHosts

* chore: rearrangement

* chore: inframonitoring types renaming

* chore: added types package

* chore: file structure further breakdown for clarity

* chore: comments correction

* chore: removed temporalities

* chore: comments resolve

* chore: added json tag required: true

* chore: added status unauthorized

* chore: remove a defensive nil map check, the function ensure non-nil map when err nil

* chore: make sort stable in case of tiebreaker by comparing composite group by keys

* chore: regen api client for inframonitoring

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:30:33 +00:00
nityanandagohain
07cb56c548 fix: new updates 2026-04-22 22:27:24 +05:30
nityanandagohain
6e382aa363 fix: more changes 2026-04-22 14:02:44 +05:30
nityanandagohain
115ee70a9a fix: minor changes 2026-04-22 00:00:51 +05:30
Nityananda Gohain
a58a3d4a68 Merge branch 'main' into issue_4360 2026-04-20 17:54:38 +05:30
nityanandagohain
6899eb0124 fix: changes 2026-04-20 17:53:42 +05:30
nityanandagohain
de5bec0195 Merge remote-tracking branch 'origin/main' into issue_4360 2026-04-20 16:41:58 +05:30
nityanandagohain
e359b03c25 feat: 1.Types for ai-o11y ricing rules 2026-04-12 17:14:18 +05:30
164 changed files with 8820 additions and 1659 deletions

70
.github/workflows/e2eci.yaml vendored Normal file
View File

@@ -0,0 +1,70 @@
name: e2eci
on:
pull_request:
types:
- labeled
pull_request_target:
types:
- labeled
jobs:
test:
strategy:
fail-fast: false
matrix:
project:
- chromium
if: |
((github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))) && contains(github.event.pull_request.labels.*.name, 'safe-to-e2e')
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: checkout
uses: actions/checkout@v4
- name: python
uses: actions/setup-python@v5
with:
python-version: 3.13
- name: uv
uses: astral-sh/setup-uv@v4
- name: node
uses: actions/setup-node@v4
with:
node-version: lts/*
- name: python-install
run: |
cd tests && uv sync
- name: yarn-install
run: |
cd tests/e2e && yarn install --frozen-lockfile
- name: playwright-browsers
run: |
cd tests/e2e && yarn playwright install --with-deps ${{ matrix.project }}
- name: bring-up-stack
run: |
cd tests && \
uv run pytest \
--basetemp=./tmp/ \
-vv --reuse --with-web \
e2e/bootstrap/setup.py::test_setup
- name: playwright-test
run: |
cd tests/e2e && \
yarn playwright test --project=${{ matrix.project }}
- name: teardown-stack
if: always()
run: |
cd tests && \
uv run pytest \
--basetemp=./tmp/ \
-vv --teardown \
e2e/bootstrap/setup.py::test_teardown
- name: upload-artifacts
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-artifacts-${{ matrix.project }}
path: tests/e2e/artifacts/
retention-days: 5

View File

@@ -25,11 +25,11 @@ jobs:
uses: astral-sh/setup-uv@v4
- name: install
run: |
cd tests/integration && uv sync
cd tests && uv sync
- name: fmt
run: |
make py-fmt
git diff --exit-code -- tests/integration/
git diff --exit-code -- tests/
- name: lint
run: |
make py-lint
@@ -37,21 +37,21 @@ jobs:
strategy:
fail-fast: false
matrix:
src:
- bootstrap
- passwordauthn
suite:
- alerts
- callbackauthn
- cloudintegrations
- dashboard
- ingestionkeys
- logspipelines
- passwordauthn
- preference
- querier
- rawexportdata
- role
- ttl
- alerts
- ingestionkeys
- rootuser
- serviceaccount
- ttl
sqlstore-provider:
- postgres
- sqlite
@@ -79,8 +79,9 @@ jobs:
uses: astral-sh/setup-uv@v4
- name: install
run: |
cd tests/integration && uv sync
cd tests && uv sync
- name: webdriver
if: matrix.suite == 'callbackauthn'
run: |
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
echo "deb http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
@@ -99,10 +100,10 @@ jobs:
google-chrome-stable --version
- name: run
run: |
cd tests/integration && \
cd tests && \
uv run pytest \
--basetemp=./tmp/ \
src/${{matrix.src}} \
integration/tests/${{matrix.suite}} \
--sqlstore-provider ${{matrix.sqlstore-provider}} \
--sqlite-mode ${{matrix.sqlite-mode}} \
--postgres-version ${{matrix.postgres-version}} \

View File

@@ -1,62 +0,0 @@
name: e2eci
on:
workflow_dispatch:
inputs:
userRole:
description: "Role of the user (ADMIN, EDITOR, VIEWER)"
required: true
type: choice
options:
- ADMIN
- EDITOR
- VIEWER
jobs:
test:
name: Run Playwright Tests
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: lts/*
- name: Mask secrets and input
run: |
echo "::add-mask::${{ secrets.BASE_URL }}"
echo "::add-mask::${{ secrets.LOGIN_USERNAME }}"
echo "::add-mask::${{ secrets.LOGIN_PASSWORD }}"
echo "::add-mask::${{ github.event.inputs.userRole }}"
- name: Install dependencies
working-directory: frontend
run: |
npm install -g yarn
yarn
- name: Install Playwright Browsers
working-directory: frontend
run: yarn playwright install --with-deps
- name: Run Playwright Tests
working-directory: frontend
run: |
BASE_URL="${{ secrets.BASE_URL }}" \
LOGIN_USERNAME="${{ secrets.LOGIN_USERNAME }}" \
LOGIN_PASSWORD="${{ secrets.LOGIN_PASSWORD }}" \
USER_ROLE="${{ github.event.inputs.userRole }}" \
yarn playwright test
- name: Upload Playwright Report
uses: actions/upload-artifact@v4
if: always()
with:
name: playwright-report
path: frontend/playwright-report/
retention-days: 30

View File

@@ -201,26 +201,26 @@ docker-buildx-enterprise: go-build-enterprise js-build
# python commands
##############################################################
.PHONY: py-fmt
py-fmt: ## Run black for integration tests
@cd tests/integration && uv run black .
py-fmt: ## Run black across the shared tests project
@cd tests && uv run black .
.PHONY: py-lint
py-lint: ## Run lint for integration tests
@cd tests/integration && uv run isort .
@cd tests/integration && uv run autoflake .
@cd tests/integration && uv run pylint .
py-lint: ## Run lint across the shared tests project
@cd tests && uv run isort .
@cd tests && uv run autoflake .
@cd tests && uv run pylint .
.PHONY: py-test-setup
py-test-setup: ## Runs integration tests
@cd tests/integration && uv run pytest --basetemp=./tmp/ -vv --reuse --capture=no src/bootstrap/setup.py::test_setup
py-test-setup: ## Bring up the shared SigNoz backend used by integration and e2e tests
@cd tests && uv run pytest --basetemp=./tmp/ -vv --reuse --capture=no integration/bootstrap/setup.py::test_setup
.PHONY: py-test-teardown
py-test-teardown: ## Runs integration tests with teardown
@cd tests/integration && uv run pytest --basetemp=./tmp/ -vv --teardown --capture=no src/bootstrap/setup.py::test_teardown
py-test-teardown: ## Tear down the shared SigNoz backend
@cd tests && uv run pytest --basetemp=./tmp/ -vv --teardown --capture=no integration/bootstrap/setup.py::test_teardown
.PHONY: py-test
py-test: ## Runs integration tests
@cd tests/integration && uv run pytest --basetemp=./tmp/ -vv --capture=no src/
@cd tests && uv run pytest --basetemp=./tmp/ -vv --capture=no integration/tests/
.PHONY: py-clean
py-clean: ## Clear all pycache and pytest cache from tests directory recursively

View File

@@ -2287,6 +2287,274 @@ components:
enabled:
type: boolean
type: object
InframonitoringtypesHostFilter:
properties:
expression:
type: string
filterByStatus:
$ref: '#/components/schemas/InframonitoringtypesHostStatus'
type: object
InframonitoringtypesHostRecord:
properties:
activeHostCount:
type: integer
cpu:
format: double
type: number
diskUsage:
format: double
type: number
hostName:
type: string
inactiveHostCount:
type: integer
load15:
format: double
type: number
memory:
format: double
type: number
meta:
additionalProperties: {}
nullable: true
type: object
status:
$ref: '#/components/schemas/InframonitoringtypesHostStatus'
wait:
format: double
type: number
required:
- hostName
- status
- activeHostCount
- inactiveHostCount
- cpu
- memory
- wait
- load15
- diskUsage
- meta
type: object
InframonitoringtypesHostStatus:
enum:
- active
- inactive
- ""
type: string
InframonitoringtypesHosts:
properties:
endTimeBeforeRetention:
type: boolean
records:
items:
$ref: '#/components/schemas/InframonitoringtypesHostRecord'
nullable: true
type: array
requiredMetricsCheck:
$ref: '#/components/schemas/InframonitoringtypesRequiredMetricsCheck'
total:
type: integer
type:
$ref: '#/components/schemas/InframonitoringtypesResponseType'
warning:
$ref: '#/components/schemas/Querybuildertypesv5QueryWarnData'
required:
- type
- records
- total
- requiredMetricsCheck
- endTimeBeforeRetention
type: object
InframonitoringtypesPostableHosts:
properties:
end:
format: int64
type: integer
filter:
$ref: '#/components/schemas/InframonitoringtypesHostFilter'
groupBy:
items:
$ref: '#/components/schemas/Querybuildertypesv5GroupByKey'
nullable: true
type: array
limit:
type: integer
offset:
type: integer
orderBy:
$ref: '#/components/schemas/Querybuildertypesv5OrderBy'
start:
format: int64
type: integer
required:
- start
- end
- limit
type: object
InframonitoringtypesRequiredMetricsCheck:
properties:
missingMetrics:
items:
type: string
nullable: true
type: array
required:
- missingMetrics
type: object
InframonitoringtypesResponseType:
enum:
- list
- grouped_list
type: string
LlmpricingruletypesGettablePricingRules:
properties:
items:
items:
$ref: '#/components/schemas/LlmpricingruletypesLLMPricingRule'
nullable: true
type: array
limit:
type: integer
offset:
type: integer
total:
type: integer
required:
- items
- total
- offset
- limit
type: object
LlmpricingruletypesLLMPricingRule:
properties:
cacheMode:
$ref: '#/components/schemas/LlmpricingruletypesLLMPricingRuleCacheMode'
costCacheRead:
format: double
type: number
costCacheWrite:
format: double
type: number
costInput:
format: double
type: number
costOutput:
format: double
type: number
createdAt:
format: date-time
type: string
createdBy:
type: string
enabled:
type: boolean
id:
type: string
isOverride:
type: boolean
modelName:
type: string
modelPattern:
items:
type: string
nullable: true
type: array
orgId:
type: string
sourceId:
type: string
syncedAt:
format: date-time
nullable: true
type: string
unit:
$ref: '#/components/schemas/LlmpricingruletypesLLMPricingRuleUnit'
updatedAt:
format: date-time
type: string
updatedBy:
type: string
required:
- id
- orgId
- modelName
- modelPattern
- unit
- cacheMode
- costInput
- costOutput
- costCacheRead
- costCacheWrite
- isOverride
- enabled
type: object
LlmpricingruletypesLLMPricingRuleCacheMode:
enum:
- subtract
- additive
- unknown
type: string
LlmpricingruletypesLLMPricingRuleUnit:
enum:
- per_million_tokens
type: string
LlmpricingruletypesUpdatableLLMPricingRule:
properties:
cacheMode:
$ref: '#/components/schemas/LlmpricingruletypesLLMPricingRuleCacheMode'
costCacheRead:
format: double
type: number
costCacheWrite:
format: double
type: number
costInput:
format: double
type: number
costOutput:
format: double
type: number
enabled:
type: boolean
id:
nullable: true
type: string
isOverride:
nullable: true
type: boolean
modelName:
type: string
modelPattern:
items:
type: string
nullable: true
type: array
sourceId:
nullable: true
type: string
unit:
$ref: '#/components/schemas/LlmpricingruletypesLLMPricingRuleUnit'
required:
- modelName
- modelPattern
- unit
- cacheMode
- costInput
- costOutput
- costCacheRead
- costCacheWrite
- enabled
type: object
LlmpricingruletypesUpdatableLLMPricingRules:
properties:
rules:
items:
$ref: '#/components/schemas/LlmpricingruletypesUpdatableLLMPricingRule'
nullable: true
type: array
required:
- rules
type: object
MetricsexplorertypesInspectMetricsRequest:
properties:
end:
@@ -6977,6 +7245,218 @@ paths:
summary: Create bulk invite
tags:
- users
/api/v1/llm_pricing_rules:
get:
deprecated: false
description: Returns all LLM pricing rules for the authenticated org, with pagination.
operationId: ListLLMPricingRules
parameters:
- in: query
name: offset
schema:
type: integer
- in: query
name: limit
schema:
type: integer
responses:
"200":
content:
application/json:
schema:
properties:
data:
$ref: '#/components/schemas/LlmpricingruletypesGettablePricingRules'
status:
type: string
required:
- status
- data
type: object
description: OK
"400":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Bad Request
"401":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Unauthorized
"403":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Forbidden
"500":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Internal Server Error
security:
- api_key:
- VIEWER
- tokenizer:
- VIEWER
summary: List pricing rules
tags:
- llmpricingrules
put:
deprecated: false
description: Single write endpoint used by both the user and the Zeus sync job.
Per-rule match is by id, then sourceId, then insert. Override rows (is_override=true)
are fully preserved when the request does not provide isOverride; only synced_at
is stamped.
operationId: UpdateLLMPricingRules
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/LlmpricingruletypesUpdatableLLMPricingRules'
responses:
"204":
description: No Content
"400":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Bad Request
"401":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Unauthorized
"403":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Forbidden
"500":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Internal Server Error
security:
- api_key:
- ADMIN
- tokenizer:
- ADMIN
summary: Bulk update pricing rules
tags:
- llmpricingrules
/api/v1/llm_pricing_rules/{id}:
delete:
deprecated: false
description: Hard-deletes a pricing rule. If auto-synced, it will be recreated
on the next sync cycle.
operationId: DeleteLLMPricingRule
parameters:
- in: path
name: id
required: true
schema:
type: string
responses:
"204":
description: No Content
"401":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Unauthorized
"403":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Forbidden
"404":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Not Found
"500":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Internal Server Error
security:
- api_key:
- ADMIN
- tokenizer:
- ADMIN
summary: Delete a pricing rule
tags:
- llmpricingrules
get:
deprecated: false
description: Returns a single LLM pricing rule by ID.
operationId: GetLLMPricingRule
parameters:
- in: path
name: id
required: true
schema:
type: string
responses:
"200":
content:
application/json:
schema:
properties:
data:
$ref: '#/components/schemas/LlmpricingruletypesLLMPricingRule'
status:
type: string
required:
- status
- data
type: object
description: OK
"401":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Unauthorized
"403":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Forbidden
"404":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Not Found
"500":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Internal Server Error
security:
- api_key:
- VIEWER
- tokenizer:
- VIEWER
summary: Get a pricing rule
tags:
- llmpricingrules
/api/v1/logs/promote_paths:
get:
deprecated: false
@@ -9853,6 +10333,72 @@ paths:
summary: Health check
tags:
- health
/api/v2/infra_monitoring/hosts:
post:
deprecated: false
description: 'Returns a paginated list of hosts with key infrastructure metrics:
CPU usage (%), memory usage (%), I/O wait (%), disk usage (%), and 15-minute
load average. Each host includes its current status (active/inactive based
on metrics reported in the last 10 minutes) and metadata attributes (e.g.,
os.type). Supports filtering via a filter expression, filtering by host status,
custom groupBy to aggregate hosts by any attribute, ordering by any of the
five metrics, and pagination via offset/limit. The response type is ''list''
for the default host.name grouping or ''grouped_list'' for custom groupBy
keys. Also reports missing required metrics and whether the requested time
range falls before the data retention boundary.'
operationId: ListHosts
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/InframonitoringtypesPostableHosts'
responses:
"200":
content:
application/json:
schema:
properties:
data:
$ref: '#/components/schemas/InframonitoringtypesHosts'
status:
type: string
required:
- status
- data
type: object
description: OK
"400":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Bad Request
"401":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Unauthorized
"403":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Forbidden
"500":
content:
application/json:
schema:
$ref: '#/components/schemas/RenderErrorResponse'
description: Internal Server Error
security:
- api_key:
- VIEWER
- tokenizer:
- VIEWER
summary: List Hosts for Infra Monitoring
tags:
- inframonitoring
/api/v2/livez:
get:
deprecated: false

View File

@@ -1,216 +0,0 @@
# Integration Tests
SigNoz uses integration tests to verify that different components work together correctly in a real environment. These tests run against actual services (ClickHouse, PostgreSQL, etc.) to ensure end-to-end functionality.
## How to set up the integration test environment?
### Prerequisites
Before running integration tests, ensure you have the following installed:
- Python 3.13+
- [uv](https://docs.astral.sh/uv/getting-started/installation/)
- Docker (for containerized services)
### Initial Setup
1. Navigate to the integration tests directory:
```bash
cd tests/integration
```
2. Install dependencies using uv:
```bash
uv sync
```
> **_NOTE:_** the build backend could throw an error while installing `psycopg2`, pleae see https://www.psycopg.org/docs/install.html#build-prerequisites
### Starting the Test Environment
To spin up all the containers necessary for writing integration tests and keep them running:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/bootstrap/setup.py::test_setup
```
This command will:
- Start all required services (ClickHouse, PostgreSQL, Zookeeper, etc.)
- Keep containers running due to the `--reuse` flag
- Verify that the setup is working correctly
### Stopping the Test Environment
When you're done writing integration tests, clean up the environment:
```bash
uv run pytest --basetemp=./tmp/ -vv --teardown -s src/bootstrap/setup.py::test_teardown
```
This will destroy the running integration test setup and clean up resources.
## Understanding the Integration Test Framework
Python and pytest form the foundation of the integration testing framework. Testcontainers are used to spin up disposable integration environments. Wiremock is used to spin up **test doubles** of other services.
- **Why Python/pytest?** It's expressive, low-boilerplate, and has powerful fixture capabilities that make integration testing straightforward. Extensive libraries for HTTP requests, JSON handling, and data analysis (numpy) make it easier to test APIs and verify data
- **Why testcontainers?** They let us spin up isolated dependencies that match our production environment without complex setup.
- **Why wiremock?** Well maintained, documented and extensible.
```
.
├── conftest.py
├── fixtures
│ ├── __init__.py
│ ├── auth.py
│ ├── clickhouse.py
│ ├── fs.py
│ ├── http.py
│ ├── migrator.py
│ ├── network.py
│ ├── postgres.py
│ ├── signoz.py
│ ├── sql.py
│ ├── sqlite.py
│ ├── types.py
│ └── zookeeper.py
├── uv.lock
├── pyproject.toml
└── src
└── bootstrap
├── __init__.py
├── 01_database.py
├── 02_register.py
└── 03_license.py
```
Each test suite follows some important principles:
1. **Organization**: Test suites live under `src/` in self-contained packages. Fixtures (a pytest concept) live inside `fixtures/`.
2. **Execution Order**: Files are prefixed with two-digit numbers (`01_`, `02_`, `03_`) to ensure sequential execution.
3. **Time Constraints**: Each suite should complete in under 10 minutes (setup takes ~4 mins).
### Test Suite Design
Test suites should target functional domains or subsystems within SigNoz. When designing a test suite, consider these principles:
- **Functional Cohesion**: Group tests around a specific capability or service boundary
- **Data Flow**: Follow the path of data through related components
- **Change Patterns**: Components frequently modified together should be tested together
The exact boundaries for modules are intentionally flexible, allowing teams to define logical groupings based on their specific context and knowledge of the system.
Eg: The **bootstrap** integration test suite validates core system functionality:
- Database initialization
- Version check
Other test suites can be **pipelines, auth, querier.**
## How to write an integration test?
Now start writing an integration test. Create a new file `src/bootstrap/05_version.py` and paste the following:
```python
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_version(signoz: types.SigNoz) -> None:
response = requests.get(signoz.self.host_config.get("/api/v1/version"), timeout=2)
logger.info(response)
```
We have written a simple test which calls the `version` endpoint of the container in step 1. In **order to just run this function, run the following command:**
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/bootstrap/05_version.py::test_version
```
> Note: The `--reuse` flag is used to reuse the environment if it is already running. Always use this flag when writing and running integration tests. If you don't use this flag, the environment will be destroyed and recreated every time you run the test.
Here's another example of how to write a more comprehensive integration test:
```python
from http import HTTPStatus
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_user_registration(signoz: types.SigNoz) -> None:
"""Test user registration functionality."""
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/register"),
json={
"name": "testuser",
"orgId": "",
"orgName": "test.org",
"email": "test@example.com",
"password": "password123Z$",
},
timeout=2,
)
assert response.status_code == HTTPStatus.OK
assert response.json()["setupCompleted"] is True
```
## How to run integration tests?
### Running All Tests
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/
```
### Running Specific Test Categories
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/<suite>
# Run querier tests
uv run pytest --basetemp=./tmp/ -vv --reuse src/querier/
# Run auth tests
uv run pytest --basetemp=./tmp/ -vv --reuse src/auth/
```
### Running Individual Tests
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/<suite>/<file>.py::test_name
# Run test_register in file 01_register.py in passwordauthn suite
uv run pytest --basetemp=./tmp/ -vv --reuse src/passwordauthn/01_register.py::test_register
```
## How to configure different options for integration tests?
Tests can be configured using pytest options:
- `--sqlstore-provider` - Choose database provider (default: postgres)
- `--sqlite-mode` - SQLite journal mode: `delete` or `wal` (default: delete). Only relevant when `--sqlstore-provider=sqlite`.
- `--postgres-version` - PostgreSQL version (default: 15)
- `--clickhouse-version` - ClickHouse version (default: 25.5.6)
- `--zookeeper-version` - Zookeeper version (default: 3.7.1)
Example:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse --sqlstore-provider=postgres --postgres-version=14 src/auth/
```
## What should I remember?
- **Always use the `--reuse` flag** when setting up the environment to keep containers running
- **Use the `--teardown` flag** when cleaning up to avoid resource leaks
- **Follow the naming convention** with two-digit numeric prefixes (`01_`, `02_`) for test execution order
- **Use proper timeouts** in HTTP requests to avoid hanging tests
- **Clean up test data** between tests to avoid interference
- **Use descriptive test names** that clearly indicate what is being tested
- **Leverage fixtures** for common setup and authentication
- **Test both success and failure scenarios** to ensure robust functionality
- **`--sqlite-mode=wal` does not work on macOS.** The integration test environment runs SigNoz inside a Linux container with the SQLite database file mounted from the macOS host. WAL mode requires shared memory between connections, and connections crossing the VM boundary (macOS host ↔ Linux container) cannot share the WAL index, resulting in `SQLITE_IOERR_SHORT_READ`. WAL mode is tested in CI on Linux only.

View File

@@ -15,7 +15,6 @@ We **recommend** (almost enforce) reviewing these guides before contributing to
- [Endpoint](endpoint.md) - HTTP endpoint patterns
- [Flagger](flagger.md) - Feature flag patterns
- [Handler](handler.md) - HTTP handler patterns
- [Integration](integration.md) - Integration testing
- [Provider](provider.md) - Dependency injection and provider patterns
- [Packages](packages.md) - Naming, layout, and conventions for `pkg/` packages
- [Service](service.md) - Managed service lifecycle with `factory.Service`

View File

@@ -0,0 +1,261 @@
# E2E Tests
SigNoz uses end-to-end tests to verify the frontend works correctly against a real backend. These tests use Playwright to drive a real browser against a containerized SigNoz stack that pytest brings up — the same fixture graph integration tests use, with an extra HTTP seeder container for per-spec telemetry seeding.
## How to set up the E2E test environment?
### Prerequisites
Before running E2E tests, ensure you have the following installed:
- Python 3.13+
- [uv](https://docs.astral.sh/uv/getting-started/installation/)
- Docker (for containerized services)
- Node 18+ and Yarn
### Initial Setup
1. Install Python deps for the shared tests project:
```bash
cd tests
uv sync
```
2. Install Node deps and Playwright browsers:
```bash
cd e2e
yarn install
yarn install:browsers # one-time Playwright browser install
```
### Starting the Test Environment
To spin up the backend stack (SigNoz, ClickHouse, Postgres, Zookeeper, Zeus mock, gateway mock, seeder, migrator-with-web) and keep it running:
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse --with-web \
e2e/bootstrap/setup.py::test_setup
```
This command will:
- Bring up all containers via pytest fixtures
- Register the admin user (`admin@integration.test` / `password123Z$`)
- Apply the enterprise license (via a WireMock stub of Zeus) and dismiss the org-onboarding prompt so specs can navigate directly to feature pages
- Start the HTTP seeder container (`tests/seeder/` — exposing `/telemetry/{traces,logs,metrics}` POST + DELETE)
- Write backend coordinates to `tests/e2e/.env.local` (loaded by `playwright.config.ts` via dotenv)
- Keep containers running via the `--reuse` flag
The `--with-web` flag builds the frontend into the SigNoz container — required for E2E. The build takes ~4 mins on a cold start.
### Stopping the Test Environment
When you're done writing E2E tests, clean up the environment:
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --teardown \
e2e/bootstrap/setup.py::test_teardown
```
## Understanding the E2E Test Framework
Playwright drives a real browser (Chromium / Firefox / WebKit) against the running SigNoz frontend. The backend is brought up by the same pytest fixture graph integration tests use, so both suites share one source of truth for container lifecycle, license seeding, and test-user accounts.
- **Why Playwright?** First-class TypeScript support, network interception, automatic wait-for-visibility, built-in trace viewer that captures every request/response the UI triggers — so specs rarely need separate API probes alongside UI clicks.
- **Why pytest for lifecycle?** The integration suite already owns container bring-up. Reusing it keeps the E2E stack exactly in sync with the integration stack and avoids a parallel lifecycle framework.
- **Why a separate seeder container?** Per-spec telemetry seeding (traces / logs / metrics) needs a thin HTTP wrapper around the ClickHouse insert helpers so a browser spec can POST from inside the test. The seeder lives at `tests/seeder/`, is built from `tests/Dockerfile.seeder`, and reuses the same `fixtures/{traces,logs,metrics}.py` as integration tests.
```
tests/
├── fixtures/ # shared with integration (see integration.md)
├── integration/ # pytest integration suite
├── seeder/ # standalone HTTP seeder container
│ ├── __init__.py
│ ├── Dockerfile
│ └── server.py # FastAPI app wrapping fixtures.{traces,logs,metrics}
└── e2e/
├── package.json
├── playwright.config.ts # loads .env + .env.local via dotenv
├── .env.example # staging-mode template
├── .env.local # generated by bootstrap/setup.py (gitignored)
├── bootstrap/
│ └── setup.py # test_setup / test_teardown — pytest lifecycle
├── fixtures/
│ └── auth.ts # authedPage Playwright fixture + per-worker storageState cache
├── tests/ # Playwright .spec.ts files, one dir per feature area
│ └── alerts/
│ └── alerts.spec.ts
└── artifacts/ # per-run output (gitignored)
├── html/ # HTML reporter output
├── json/ # JSON reporter output
└── results/ # per-test traces / screenshots / videos on failure
```
Each spec follows these principles:
1. **Directory per feature**: `tests/e2e/tests/<feature>/*.spec.ts`. Cross-resource junction concerns (e.g. cascade-delete) go in their own file, not packed into one giant spec.
2. **Test titles use `TC-NN`**: `test('TC-01 alerts page — tabs render', ...)`. Preserves ordering at a glance and maps to external coverage tracking.
3. **UI-first**: drive flows through the UI. Playwright traces capture every BE request/response the UI triggers, so asserting on UI outcomes implicitly validates BE contracts. Reach for direct `page.request.*` only when the test's *purpose* is asserting a response contract (use `page.waitForResponse` on a UI click) or when a specific UI step is structurally flaky (e.g. Ant DatePicker calendar-cell indices) — and even then try UI first.
4. **Self-contained state**: each spec creates what it needs and cleans up in `try/finally`. No global pre-seeding fixtures.
## How to write an E2E test?
Create a new file `tests/e2e/tests/alerts/smoke.spec.ts`:
```typescript
import { test, expect } from '../../fixtures/auth';
test('TC-01 alerts page — tabs render', async ({ authedPage: page }) => {
await page.goto('/alerts');
await expect(page.getByRole('tab', { name: /alert rules/i })).toBeVisible();
await expect(page.getByRole('tab', { name: /configuration/i })).toBeVisible();
});
```
The `authedPage` fixture (from `tests/e2e/fixtures/auth.ts`) gives you a `Page` whose browser context is already authenticated as the admin user. First use per worker triggers one login; the resulting `storageState` is held in memory and reused for later requests.
To run just this test (assuming the stack is up via `test_setup`):
```bash
cd tests/e2e
npx playwright test tests/alerts/smoke.spec.ts --project=chromium
```
Here's a more comprehensive example that exercises a CRUD flow via the UI:
```typescript
import { test, expect } from '../../fixtures/auth';
test.describe.configure({ mode: 'serial' });
test('TC-02 alerts list — create, toggle, delete', async ({ authedPage: page }) => {
await page.goto('/alerts?tab=AlertRules');
const name = 'smoke-rule';
// Seed via UI — click "New Alert", fill form, save.
await page.getByRole('button', { name: /new alert/i }).click();
await page.getByTestId('alert-name-input').fill(name);
// ... fill metric / threshold / save ...
// Find the row and exercise the action menu.
const row = page.locator('tr', { hasText: name });
await expect(row).toBeVisible();
await row.locator('[data-testid="alert-actions"] button').first().click();
// waitForResponse captures the network call the UI triggers — no parallel fetch needed.
const patchWait = page.waitForResponse(
(r) => r.url().includes('/rules/') && r.request().method() === 'PATCH',
);
await page.getByRole('menuitem').filter({ hasText: /^disable$/i }).click();
await patchWait;
await expect(row).toContainText(/disabled/i);
});
```
### Locator priority
1. `getByRole('button', { name: 'Submit' })`
2. `getByLabel('Email')`
3. `getByPlaceholder('...')`
4. `getByText('...')`
5. `getByTestId('...')`
6. `locator('.ant-select')` — last resort (Ant Design dropdowns often have no semantic alternative)
## How to run E2E tests?
### Running All Tests
With the stack already up, from `tests/e2e/`:
```bash
yarn test # headless, all projects
```
### Running Specific Projects
```bash
yarn test:chromium # chromium only
yarn test:firefox
yarn test:webkit
```
### Running Specific Tests
```bash
cd tests/e2e
# Single feature dir
npx playwright test tests/alerts/ --project=chromium
# Single file
npx playwright test tests/alerts/alerts.spec.ts --project=chromium
# Single test by title grep
npx playwright test --project=chromium -g "TC-01"
```
### Iterative modes
```bash
yarn test:ui # Playwright UI mode — watch + step through
yarn test:headed # headed browser
yarn test:debug # Playwright inspector, pause-on-breakpoint
yarn codegen # record-and-replay locator generation
yarn report # open the last HTML report (artifacts/html)
```
### Staging fallback
Point `SIGNOZ_E2E_BASE_URL` at a remote env via `.env` — no local backend bring-up, no `.env.local` generated, Playwright hits the URL directly:
```bash
cd tests/e2e
cp .env.example .env # fill SIGNOZ_E2E_USERNAME / PASSWORD
yarn test:staging
```
## How to configure different options for E2E tests?
### Environment variables
| Variable | Description |
|---|---|
| `SIGNOZ_E2E_BASE_URL` | Base URL the browser targets. Written by `bootstrap/setup.py` for local mode; set manually for staging. |
| `SIGNOZ_E2E_USERNAME` | Admin email. Bootstrap writes `admin@integration.test`. |
| `SIGNOZ_E2E_PASSWORD` | Admin password. Bootstrap writes the integration-test default. |
| `SIGNOZ_E2E_SEEDER_URL` | Seeder HTTP base URL — hit by specs that need per-test telemetry. |
Loading order in `playwright.config.ts`: `.env` first (user-provided, staging), then `.env.local` with `override: true` (bootstrap-generated, local mode). Anything already set in `process.env` at yarn-test time wins because dotenv doesn't touch vars that are already present.
### Playwright options
The full `playwright.config.ts` is the source of truth. Common things to tweak:
- `projects` — Chromium / Firefox / WebKit are enabled by default. Disable to speed up iteration.
- `retries``2` on CI (`process.env.CI`), `0` locally.
- `fullyParallel: true` — files run in parallel by worker; within a file, use `test.describe.configure({ mode: 'serial' })` if tests share list pages / mutate shared state.
- `trace: 'on-first-retry'`, `screenshot: 'only-on-failure'`, `video: 'retain-on-failure'` — default diagnostic artifacts land in `artifacts/results/<test>/`.
### Pytest options (bootstrap side)
The same pytest flags integration tests expose work here, since E2E reuses the shared fixture graph:
- `--reuse` — keep containers warm between runs (required for all iteration).
- `--teardown` — tear everything down.
- `--with-web` — build the frontend into the SigNoz container. **Required for E2E**; integration tests don't need it.
- `--sqlstore-provider`, `--postgres-version`, `--clickhouse-version`, etc. — see `docs/contributing/integration.md`.
## What should I remember?
- **Always use the `--reuse` flag** when setting up the E2E stack. `--with-web` adds a ~4 min frontend build; you only want to pay that once.
- **Don't teardown before setup.** `--reuse` correctly handles partially-set-up state, so chaining teardown → setup wastes time.
- **Prefer UI-driven flows.** Playwright captures BE requests in the trace; a parallel `fetch` probe is almost always redundant. Drop to `page.request.*` only when the UI can't reach what you need.
- **Use `page.waitForResponse` on UI clicks** to assert BE contracts — it still exercises the UI trigger path.
- **Title every test `TC-NN <short description>`** — keeps the suite navigable and reportable.
- **Split by resource, not by regression suite.** One spec per feature resource; cross-resource junction concerns (cascade-delete, linked-edit) get their own file.
- **Use short descriptive resource names** (`alerts-list-rule`, `labels-rule`, `downtime-once`) — no timestamp disambiguation. Each test owns its resources and cleans up in `try/finally`.
- **Never commit `test.only`** — a pre-commit check or CI runs with `forbidOnly: true`.
- **Prefer explicit waits over `page.waitForTimeout(ms)`.** `await expect(locator).toBeVisible()` is always better than `waitForTimeout(5000)`.
- **Unique test names won't save you from shared-tenant state.** When two tests hit the same list page, either serialize (`describe.configure({ mode: 'serial' })`) or isolate cleanup religiously.
- **Artifacts go to `tests/e2e/artifacts/`** — HTML report at `artifacts/html`, traces at `artifacts/results/<test>/`. All gitignored; archive the dir in CI.

View File

@@ -0,0 +1,251 @@
# Integration Tests
SigNoz uses integration tests to verify that different components work together correctly in a real environment. These tests run against actual services (ClickHouse, PostgreSQL, SigNoz, Zeus mock, Keycloak, etc.) spun up as containers, so suites exercise the same code paths production does.
## How to set up the integration test environment?
### Prerequisites
Before running integration tests, ensure you have the following installed:
- Python 3.13+
- [uv](https://docs.astral.sh/uv/getting-started/installation/)
- Docker (for containerized services)
### Initial Setup
1. Navigate to the shared tests project:
```bash
cd tests
```
2. Install dependencies using uv:
```bash
uv sync
```
> **_NOTE:_** the build backend could throw an error while installing `psycopg2`, please see https://www.psycopg.org/docs/install.html#build-prerequisites
### Starting the Test Environment
To spin up all the containers necessary for writing integration tests and keep them running:
```bash
make py-test-setup
```
Under the hood this runs, from `tests/`:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse integration/bootstrap/setup.py::test_setup
```
This command will:
- Start all required services (ClickHouse, PostgreSQL, Zookeeper, SigNoz, Zeus mock, gateway mock)
- Register an admin user
- Keep containers running via the `--reuse` flag
### Stopping the Test Environment
When you're done writing integration tests, clean up the environment:
```bash
make py-test-teardown
```
Which runs:
```bash
uv run pytest --basetemp=./tmp/ -vv --teardown integration/bootstrap/setup.py::test_teardown
```
This destroys the running integration test setup and cleans up resources.
## Understanding the Integration Test Framework
Python and pytest form the foundation of the integration testing framework. Testcontainers are used to spin up disposable integration environments. WireMock is used to spin up **test doubles** of external services (Zeus cloud API, gateway, etc.).
- **Why Python/pytest?** It's expressive, low-boilerplate, and has powerful fixture capabilities that make integration testing straightforward. Extensive libraries for HTTP requests, JSON handling, and data analysis (numpy) make it easier to test APIs and verify data.
- **Why testcontainers?** They let us spin up isolated dependencies that match our production environment without complex setup.
- **Why WireMock?** Well maintained, documented, and extensible.
```
tests/
├── conftest.py # pytest_plugins registration
├── pyproject.toml
├── uv.lock
├── fixtures/ # shared fixture library (flat package)
│ ├── __init__.py
│ ├── auth.py # admin/editor/viewer users, tokens, license
│ ├── clickhouse.py
│ ├── http.py # WireMock helpers
│ ├── keycloak.py # IdP container
│ ├── postgres.py
│ ├── signoz.py # SigNoz-backend container
│ ├── sql.py
│ ├── types.py
│ └── ... # logs, metrics, traces, alerts, dashboards, ...
├── integration/
│ ├── bootstrap/
│ │ └── setup.py # test_setup / test_teardown
│ ├── testdata/ # JSON / JSONL / YAML inputs per suite
│ └── tests/ # one directory per feature area
│ ├── alerts/
│ │ ├── 01_*.py # numbered suite files
│ │ └── conftest.py # optional suite-local fixtures
│ ├── auditquerier/
│ ├── cloudintegrations/
│ ├── dashboard/
│ ├── passwordauthn/
│ ├── querier/
│ └── ...
└── e2e/ # Playwright suite (see docs/contributing/e2e.md)
```
Each test suite follows these principles:
1. **Organization**: Suites live under `tests/integration/tests/` in self-contained packages. Shared fixtures live in the top-level `tests/fixtures/` package so the e2e tree can reuse them.
2. **Execution Order**: Files are prefixed with two-digit numbers (`01_`, `02_`, `03_`) to ensure sequential execution when tests depend on ordering.
3. **Time Constraints**: Each suite should complete in under 10 minutes (setup takes ~4 mins).
### Test Suite Design
Test suites should target functional domains or subsystems within SigNoz. When designing a test suite, consider these principles:
- **Functional Cohesion**: Group tests around a specific capability or service boundary
- **Data Flow**: Follow the path of data through related components
- **Change Patterns**: Components frequently modified together should be tested together
The exact boundaries for suites are intentionally flexible, allowing contributors to define logical groupings based on their domain knowledge. Current suites cover alerts, audit querier, callback authn, cloud integrations, dashboards, ingestion keys, logs pipelines, password authn, preferences, querier, raw export data, roles, root user, service accounts, and TTL.
## How to write an integration test?
Now start writing an integration test. Create a new file `tests/integration/tests/bootstrap/01_version.py` and paste the following:
```python
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_version(signoz: types.SigNoz) -> None:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v1/version"),
timeout=2,
)
logger.info(response)
```
We have written a simple test which calls the `version` endpoint of the SigNoz backend. **To run just this function, run the following command:**
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse \
integration/tests/bootstrap/01_version.py::test_version
```
> **Note:** The `--reuse` flag is used to reuse the environment if it is already running. Always use this flag when writing and running integration tests. Without it the environment is destroyed and recreated every run.
Here's another example of how to write a more comprehensive integration test:
```python
from http import HTTPStatus
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_user_registration(signoz: types.SigNoz) -> None:
"""Test user registration functionality."""
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/register"),
json={
"name": "testuser",
"orgId": "",
"orgName": "test.org",
"email": "test@example.com",
"password": "password123Z$",
},
timeout=2,
)
assert response.status_code == HTTPStatus.OK
assert response.json()["setupCompleted"] is True
```
Test inputs (JSON fixtures, expected payloads) go under `tests/integration/testdata/<suite>/` and are loaded via `fixtures.fs.get_testdata_file_path`.
## How to run integration tests?
### Running All Tests
```bash
make py-test
```
Which runs:
```bash
uv run pytest --basetemp=./tmp/ -vv integration/tests/
```
### Running Specific Test Categories
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse integration/tests/<suite>/
# Run querier tests
uv run pytest --basetemp=./tmp/ -vv --reuse integration/tests/querier/
# Run passwordauthn tests
uv run pytest --basetemp=./tmp/ -vv --reuse integration/tests/passwordauthn/
```
### Running Individual Tests
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse \
integration/tests/<suite>/<file>.py::test_name
# Run test_register in 01_register.py in the passwordauthn suite
uv run pytest --basetemp=./tmp/ -vv --reuse \
integration/tests/passwordauthn/01_register.py::test_register
```
## How to configure different options for integration tests?
Tests can be configured using pytest options:
- `--sqlstore-provider` — Choose the SQL store provider (default: `postgres`)
- `--sqlite-mode` — SQLite journal mode: `delete` or `wal` (default: `delete`). Only relevant when `--sqlstore-provider=sqlite`.
- `--postgres-version` — PostgreSQL version (default: `15`)
- `--clickhouse-version` — ClickHouse version (default: `25.5.6`)
- `--zookeeper-version` — Zookeeper version (default: `3.7.1`)
- `--schema-migrator-version` — SigNoz schema migrator version (default: `v0.144.2`)
Example:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse \
--sqlstore-provider=postgres --postgres-version=14 \
integration/tests/passwordauthn/
```
## What should I remember?
- **Always use the `--reuse` flag** when setting up the environment or running tests to keep containers warm. Without it every run rebuilds the stack (~4 mins).
- **Use the `--teardown` flag** only when cleaning up — mixing `--teardown` with `--reuse` is a contradiction.
- **Do not pre-emptively teardown before setup.** If the stack is partially up, `--reuse` picks up from wherever it is. `make py-test-teardown` then `make py-test-setup` wastes minutes.
- **Follow the naming convention** with two-digit numeric prefixes (`01_`, `02_`) for ordered test execution within a suite.
- **Use proper timeouts** in HTTP requests to avoid hanging tests (`timeout=5` is typical).
- **Clean up test data** between tests in the same suite to avoid interference — or rely on a fresh SigNoz container if you need full isolation.
- **Use descriptive test names** that clearly indicate what is being tested.
- **Leverage fixtures** for common setup. The shared fixture package is at `tests/fixtures/` — reuse before adding new ones.
- **Test both success and failure scenarios** (4xx / 5xx paths) to ensure robust functionality.
- **Run `make py-fmt` and `make py-lint` before committing** Python changes — black + isort + autoflake + pylint.
- **`--sqlite-mode=wal` does not work on macOS.** The integration test environment runs SigNoz inside a Linux container with the SQLite database file mounted from the macOS host. WAL mode requires shared memory between connections, and connections crossing the VM boundary (macOS host ↔ Linux container) cannot share the WAL index, resulting in `SQLITE_IOERR_SHORT_READ`. WAL mode is tested in CI on Linux only.

View File

@@ -0,0 +1,106 @@
/**
* ! Do not edit manually
* * The file has been auto-generated using Orval for SigNoz
* * regenerate with 'yarn generate:api'
* SigNoz
*/
import { useMutation } from 'react-query';
import type {
MutationFunction,
UseMutationOptions,
UseMutationResult,
} from 'react-query';
import type {
InframonitoringtypesPostableHostsDTO,
ListHosts200,
RenderErrorResponseDTO,
} from '../sigNoz.schemas';
import { GeneratedAPIInstance } from '../../../generatedAPIInstance';
import type { ErrorType, BodyType } from '../../../generatedAPIInstance';
/**
* Returns a paginated list of hosts with key infrastructure metrics: CPU usage (%), memory usage (%), I/O wait (%), disk usage (%), and 15-minute load average. Each host includes its current status (active/inactive based on metrics reported in the last 10 minutes) and metadata attributes (e.g., os.type). Supports filtering via a filter expression, filtering by host status, custom groupBy to aggregate hosts by any attribute, ordering by any of the five metrics, and pagination via offset/limit. The response type is 'list' for the default host.name grouping or 'grouped_list' for custom groupBy keys. Also reports missing required metrics and whether the requested time range falls before the data retention boundary.
* @summary List Hosts for Infra Monitoring
*/
export const listHosts = (
inframonitoringtypesPostableHostsDTO: BodyType<InframonitoringtypesPostableHostsDTO>,
signal?: AbortSignal,
) => {
return GeneratedAPIInstance<ListHosts200>({
url: `/api/v2/infra_monitoring/hosts`,
method: 'POST',
headers: { 'Content-Type': 'application/json' },
data: inframonitoringtypesPostableHostsDTO,
signal,
});
};
export const getListHostsMutationOptions = <
TError = ErrorType<RenderErrorResponseDTO>,
TContext = unknown,
>(options?: {
mutation?: UseMutationOptions<
Awaited<ReturnType<typeof listHosts>>,
TError,
{ data: BodyType<InframonitoringtypesPostableHostsDTO> },
TContext
>;
}): UseMutationOptions<
Awaited<ReturnType<typeof listHosts>>,
TError,
{ data: BodyType<InframonitoringtypesPostableHostsDTO> },
TContext
> => {
const mutationKey = ['listHosts'];
const { mutation: mutationOptions } = options
? options.mutation &&
'mutationKey' in options.mutation &&
options.mutation.mutationKey
? options
: { ...options, mutation: { ...options.mutation, mutationKey } }
: { mutation: { mutationKey } };
const mutationFn: MutationFunction<
Awaited<ReturnType<typeof listHosts>>,
{ data: BodyType<InframonitoringtypesPostableHostsDTO> }
> = (props) => {
const { data } = props ?? {};
return listHosts(data);
};
return { mutationFn, ...mutationOptions };
};
export type ListHostsMutationResult = NonNullable<
Awaited<ReturnType<typeof listHosts>>
>;
export type ListHostsMutationBody =
BodyType<InframonitoringtypesPostableHostsDTO>;
export type ListHostsMutationError = ErrorType<RenderErrorResponseDTO>;
/**
* @summary List Hosts for Infra Monitoring
*/
export const useListHosts = <
TError = ErrorType<RenderErrorResponseDTO>,
TContext = unknown,
>(options?: {
mutation?: UseMutationOptions<
Awaited<ReturnType<typeof listHosts>>,
TError,
{ data: BodyType<InframonitoringtypesPostableHostsDTO> },
TContext
>;
}): UseMutationResult<
Awaited<ReturnType<typeof listHosts>>,
TError,
{ data: BodyType<InframonitoringtypesPostableHostsDTO> },
TContext
> => {
const mutationOptions = getListHostsMutationOptions(options);
return useMutation(mutationOptions);
};

View File

@@ -0,0 +1,398 @@
/**
* ! Do not edit manually
* * The file has been auto-generated using Orval for SigNoz
* * regenerate with 'yarn generate:api'
* SigNoz
*/
import { useMutation, useQuery } from 'react-query';
import type {
InvalidateOptions,
MutationFunction,
QueryClient,
QueryFunction,
QueryKey,
UseMutationOptions,
UseMutationResult,
UseQueryOptions,
UseQueryResult,
} from 'react-query';
import type {
DeleteLLMPricingRulePathParameters,
GetLLMPricingRule200,
GetLLMPricingRulePathParameters,
ListLLMPricingRules200,
ListLLMPricingRulesParams,
LlmpricingruletypesUpdatableLLMPricingRulesDTO,
RenderErrorResponseDTO,
} from '../sigNoz.schemas';
import { GeneratedAPIInstance } from '../../../generatedAPIInstance';
import type { ErrorType, BodyType } from '../../../generatedAPIInstance';
/**
* Returns all LLM pricing rules for the authenticated org, with pagination.
* @summary List pricing rules
*/
export const listLLMPricingRules = (
params?: ListLLMPricingRulesParams,
signal?: AbortSignal,
) => {
return GeneratedAPIInstance<ListLLMPricingRules200>({
url: `/api/v1/llm_pricing_rules`,
method: 'GET',
params,
signal,
});
};
export const getListLLMPricingRulesQueryKey = (
params?: ListLLMPricingRulesParams,
) => {
return [`/api/v1/llm_pricing_rules`, ...(params ? [params] : [])] as const;
};
export const getListLLMPricingRulesQueryOptions = <
TData = Awaited<ReturnType<typeof listLLMPricingRules>>,
TError = ErrorType<RenderErrorResponseDTO>,
>(
params?: ListLLMPricingRulesParams,
options?: {
query?: UseQueryOptions<
Awaited<ReturnType<typeof listLLMPricingRules>>,
TError,
TData
>;
},
) => {
const { query: queryOptions } = options ?? {};
const queryKey =
queryOptions?.queryKey ?? getListLLMPricingRulesQueryKey(params);
const queryFn: QueryFunction<
Awaited<ReturnType<typeof listLLMPricingRules>>
> = ({ signal }) => listLLMPricingRules(params, signal);
return { queryKey, queryFn, ...queryOptions } as UseQueryOptions<
Awaited<ReturnType<typeof listLLMPricingRules>>,
TError,
TData
> & { queryKey: QueryKey };
};
export type ListLLMPricingRulesQueryResult = NonNullable<
Awaited<ReturnType<typeof listLLMPricingRules>>
>;
export type ListLLMPricingRulesQueryError = ErrorType<RenderErrorResponseDTO>;
/**
* @summary List pricing rules
*/
export function useListLLMPricingRules<
TData = Awaited<ReturnType<typeof listLLMPricingRules>>,
TError = ErrorType<RenderErrorResponseDTO>,
>(
params?: ListLLMPricingRulesParams,
options?: {
query?: UseQueryOptions<
Awaited<ReturnType<typeof listLLMPricingRules>>,
TError,
TData
>;
},
): UseQueryResult<TData, TError> & { queryKey: QueryKey } {
const queryOptions = getListLLMPricingRulesQueryOptions(params, options);
const query = useQuery(queryOptions) as UseQueryResult<TData, TError> & {
queryKey: QueryKey;
};
query.queryKey = queryOptions.queryKey;
return query;
}
/**
* @summary List pricing rules
*/
export const invalidateListLLMPricingRules = async (
queryClient: QueryClient,
params?: ListLLMPricingRulesParams,
options?: InvalidateOptions,
): Promise<QueryClient> => {
await queryClient.invalidateQueries(
{ queryKey: getListLLMPricingRulesQueryKey(params) },
options,
);
return queryClient;
};
/**
* Single write endpoint used by both the user and the Zeus sync job. Per-rule match is by id, then sourceId, then insert. Override rows (is_override=true) are fully preserved when the request does not provide isOverride; only synced_at is stamped.
* @summary Bulk update pricing rules
*/
export const updateLLMPricingRules = (
llmpricingruletypesUpdatableLLMPricingRulesDTO: BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO>,
) => {
return GeneratedAPIInstance<void>({
url: `/api/v1/llm_pricing_rules`,
method: 'PUT',
headers: { 'Content-Type': 'application/json' },
data: llmpricingruletypesUpdatableLLMPricingRulesDTO,
});
};
export const getUpdateLLMPricingRulesMutationOptions = <
TError = ErrorType<RenderErrorResponseDTO>,
TContext = unknown,
>(options?: {
mutation?: UseMutationOptions<
Awaited<ReturnType<typeof updateLLMPricingRules>>,
TError,
{ data: BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO> },
TContext
>;
}): UseMutationOptions<
Awaited<ReturnType<typeof updateLLMPricingRules>>,
TError,
{ data: BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO> },
TContext
> => {
const mutationKey = ['updateLLMPricingRules'];
const { mutation: mutationOptions } = options
? options.mutation &&
'mutationKey' in options.mutation &&
options.mutation.mutationKey
? options
: { ...options, mutation: { ...options.mutation, mutationKey } }
: { mutation: { mutationKey } };
const mutationFn: MutationFunction<
Awaited<ReturnType<typeof updateLLMPricingRules>>,
{ data: BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO> }
> = (props) => {
const { data } = props ?? {};
return updateLLMPricingRules(data);
};
return { mutationFn, ...mutationOptions };
};
export type UpdateLLMPricingRulesMutationResult = NonNullable<
Awaited<ReturnType<typeof updateLLMPricingRules>>
>;
export type UpdateLLMPricingRulesMutationBody =
BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO>;
export type UpdateLLMPricingRulesMutationError =
ErrorType<RenderErrorResponseDTO>;
/**
* @summary Bulk update pricing rules
*/
export const useUpdateLLMPricingRules = <
TError = ErrorType<RenderErrorResponseDTO>,
TContext = unknown,
>(options?: {
mutation?: UseMutationOptions<
Awaited<ReturnType<typeof updateLLMPricingRules>>,
TError,
{ data: BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO> },
TContext
>;
}): UseMutationResult<
Awaited<ReturnType<typeof updateLLMPricingRules>>,
TError,
{ data: BodyType<LlmpricingruletypesUpdatableLLMPricingRulesDTO> },
TContext
> => {
const mutationOptions = getUpdateLLMPricingRulesMutationOptions(options);
return useMutation(mutationOptions);
};
/**
* Hard-deletes a pricing rule. If auto-synced, it will be recreated on the next sync cycle.
* @summary Delete a pricing rule
*/
export const deleteLLMPricingRule = ({
id,
}: DeleteLLMPricingRulePathParameters) => {
return GeneratedAPIInstance<void>({
url: `/api/v1/llm_pricing_rules/${id}`,
method: 'DELETE',
});
};
export const getDeleteLLMPricingRuleMutationOptions = <
TError = ErrorType<RenderErrorResponseDTO>,
TContext = unknown,
>(options?: {
mutation?: UseMutationOptions<
Awaited<ReturnType<typeof deleteLLMPricingRule>>,
TError,
{ pathParams: DeleteLLMPricingRulePathParameters },
TContext
>;
}): UseMutationOptions<
Awaited<ReturnType<typeof deleteLLMPricingRule>>,
TError,
{ pathParams: DeleteLLMPricingRulePathParameters },
TContext
> => {
const mutationKey = ['deleteLLMPricingRule'];
const { mutation: mutationOptions } = options
? options.mutation &&
'mutationKey' in options.mutation &&
options.mutation.mutationKey
? options
: { ...options, mutation: { ...options.mutation, mutationKey } }
: { mutation: { mutationKey } };
const mutationFn: MutationFunction<
Awaited<ReturnType<typeof deleteLLMPricingRule>>,
{ pathParams: DeleteLLMPricingRulePathParameters }
> = (props) => {
const { pathParams } = props ?? {};
return deleteLLMPricingRule(pathParams);
};
return { mutationFn, ...mutationOptions };
};
export type DeleteLLMPricingRuleMutationResult = NonNullable<
Awaited<ReturnType<typeof deleteLLMPricingRule>>
>;
export type DeleteLLMPricingRuleMutationError =
ErrorType<RenderErrorResponseDTO>;
/**
* @summary Delete a pricing rule
*/
export const useDeleteLLMPricingRule = <
TError = ErrorType<RenderErrorResponseDTO>,
TContext = unknown,
>(options?: {
mutation?: UseMutationOptions<
Awaited<ReturnType<typeof deleteLLMPricingRule>>,
TError,
{ pathParams: DeleteLLMPricingRulePathParameters },
TContext
>;
}): UseMutationResult<
Awaited<ReturnType<typeof deleteLLMPricingRule>>,
TError,
{ pathParams: DeleteLLMPricingRulePathParameters },
TContext
> => {
const mutationOptions = getDeleteLLMPricingRuleMutationOptions(options);
return useMutation(mutationOptions);
};
/**
* Returns a single LLM pricing rule by ID.
* @summary Get a pricing rule
*/
export const getLLMPricingRule = (
{ id }: GetLLMPricingRulePathParameters,
signal?: AbortSignal,
) => {
return GeneratedAPIInstance<GetLLMPricingRule200>({
url: `/api/v1/llm_pricing_rules/${id}`,
method: 'GET',
signal,
});
};
export const getGetLLMPricingRuleQueryKey = ({
id,
}: GetLLMPricingRulePathParameters) => {
return [`/api/v1/llm_pricing_rules/${id}`] as const;
};
export const getGetLLMPricingRuleQueryOptions = <
TData = Awaited<ReturnType<typeof getLLMPricingRule>>,
TError = ErrorType<RenderErrorResponseDTO>,
>(
{ id }: GetLLMPricingRulePathParameters,
options?: {
query?: UseQueryOptions<
Awaited<ReturnType<typeof getLLMPricingRule>>,
TError,
TData
>;
},
) => {
const { query: queryOptions } = options ?? {};
const queryKey =
queryOptions?.queryKey ?? getGetLLMPricingRuleQueryKey({ id });
const queryFn: QueryFunction<
Awaited<ReturnType<typeof getLLMPricingRule>>
> = ({ signal }) => getLLMPricingRule({ id }, signal);
return {
queryKey,
queryFn,
enabled: !!id,
...queryOptions,
} as UseQueryOptions<
Awaited<ReturnType<typeof getLLMPricingRule>>,
TError,
TData
> & { queryKey: QueryKey };
};
export type GetLLMPricingRuleQueryResult = NonNullable<
Awaited<ReturnType<typeof getLLMPricingRule>>
>;
export type GetLLMPricingRuleQueryError = ErrorType<RenderErrorResponseDTO>;
/**
* @summary Get a pricing rule
*/
export function useGetLLMPricingRule<
TData = Awaited<ReturnType<typeof getLLMPricingRule>>,
TError = ErrorType<RenderErrorResponseDTO>,
>(
{ id }: GetLLMPricingRulePathParameters,
options?: {
query?: UseQueryOptions<
Awaited<ReturnType<typeof getLLMPricingRule>>,
TError,
TData
>;
},
): UseQueryResult<TData, TError> & { queryKey: QueryKey } {
const queryOptions = getGetLLMPricingRuleQueryOptions({ id }, options);
const query = useQuery(queryOptions) as UseQueryResult<TData, TError> & {
queryKey: QueryKey;
};
query.queryKey = queryOptions.queryKey;
return query;
}
/**
* @summary Get a pricing rule
*/
export const invalidateGetLLMPricingRule = async (
queryClient: QueryClient,
{ id }: GetLLMPricingRulePathParameters,
options?: InvalidateOptions,
): Promise<QueryClient> => {
await queryClient.invalidateQueries(
{ queryKey: getGetLLMPricingRuleQueryKey({ id }) },
options,
);
return queryClient;
};

View File

@@ -3053,6 +3053,298 @@ export interface GlobaltypesTokenizerConfigDTO {
enabled?: boolean;
}
export interface InframonitoringtypesHostFilterDTO {
/**
* @type string
*/
expression?: string;
filterByStatus?: InframonitoringtypesHostStatusDTO;
}
/**
* @nullable
*/
export type InframonitoringtypesHostRecordDTOMeta = {
[key: string]: unknown;
} | null;
export interface InframonitoringtypesHostRecordDTO {
/**
* @type integer
*/
activeHostCount: number;
/**
* @type number
* @format double
*/
cpu: number;
/**
* @type number
* @format double
*/
diskUsage: number;
/**
* @type string
*/
hostName: string;
/**
* @type integer
*/
inactiveHostCount: number;
/**
* @type number
* @format double
*/
load15: number;
/**
* @type number
* @format double
*/
memory: number;
/**
* @type object
* @nullable true
*/
meta: InframonitoringtypesHostRecordDTOMeta;
status: InframonitoringtypesHostStatusDTO;
/**
* @type number
* @format double
*/
wait: number;
}
export enum InframonitoringtypesHostStatusDTO {
active = 'active',
inactive = 'inactive',
'' = '',
}
export interface InframonitoringtypesHostsDTO {
/**
* @type boolean
*/
endTimeBeforeRetention: boolean;
/**
* @type array
* @nullable true
*/
records: InframonitoringtypesHostRecordDTO[] | null;
requiredMetricsCheck: InframonitoringtypesRequiredMetricsCheckDTO;
/**
* @type integer
*/
total: number;
type: InframonitoringtypesResponseTypeDTO;
warning?: Querybuildertypesv5QueryWarnDataDTO;
}
export interface InframonitoringtypesPostableHostsDTO {
/**
* @type integer
* @format int64
*/
end: number;
filter?: InframonitoringtypesHostFilterDTO;
/**
* @type array
* @nullable true
*/
groupBy?: Querybuildertypesv5GroupByKeyDTO[] | null;
/**
* @type integer
*/
limit: number;
/**
* @type integer
*/
offset?: number;
orderBy?: Querybuildertypesv5OrderByDTO;
/**
* @type integer
* @format int64
*/
start: number;
}
export interface InframonitoringtypesRequiredMetricsCheckDTO {
/**
* @type array
* @nullable true
*/
missingMetrics: string[] | null;
}
export enum InframonitoringtypesResponseTypeDTO {
list = 'list',
grouped_list = 'grouped_list',
}
export interface LlmpricingruletypesGettablePricingRulesDTO {
/**
* @type array
* @nullable true
*/
items: LlmpricingruletypesLLMPricingRuleDTO[] | null;
/**
* @type integer
*/
limit: number;
/**
* @type integer
*/
offset: number;
/**
* @type integer
*/
total: number;
}
export interface LlmpricingruletypesLLMPricingRuleDTO {
cacheMode: LlmpricingruletypesLLMPricingRuleCacheModeDTO;
/**
* @type number
* @format double
*/
costCacheRead: number;
/**
* @type number
* @format double
*/
costCacheWrite: number;
/**
* @type number
* @format double
*/
costInput: number;
/**
* @type number
* @format double
*/
costOutput: number;
/**
* @type string
* @format date-time
*/
createdAt?: Date;
/**
* @type string
*/
createdBy?: string;
/**
* @type boolean
*/
enabled: boolean;
/**
* @type string
*/
id: string;
/**
* @type boolean
*/
isOverride: boolean;
/**
* @type string
*/
modelName: string;
/**
* @type array
* @nullable true
*/
modelPattern: string[] | null;
/**
* @type string
*/
orgId: string;
/**
* @type string
*/
sourceId?: string;
/**
* @type string
* @format date-time
* @nullable true
*/
syncedAt?: Date | null;
unit: LlmpricingruletypesLLMPricingRuleUnitDTO;
/**
* @type string
* @format date-time
*/
updatedAt?: Date;
/**
* @type string
*/
updatedBy?: string;
}
export enum LlmpricingruletypesLLMPricingRuleCacheModeDTO {
subtract = 'subtract',
additive = 'additive',
unknown = 'unknown',
}
export enum LlmpricingruletypesLLMPricingRuleUnitDTO {
per_million_tokens = 'per_million_tokens',
}
export interface LlmpricingruletypesUpdatableLLMPricingRuleDTO {
cacheMode: LlmpricingruletypesLLMPricingRuleCacheModeDTO;
/**
* @type number
* @format double
*/
costCacheRead: number;
/**
* @type number
* @format double
*/
costCacheWrite: number;
/**
* @type number
* @format double
*/
costInput: number;
/**
* @type number
* @format double
*/
costOutput: number;
/**
* @type boolean
*/
enabled: boolean;
/**
* @type string
* @nullable true
*/
id?: string | null;
/**
* @type boolean
* @nullable true
*/
isOverride?: boolean | null;
/**
* @type string
*/
modelName: string;
/**
* @type array
* @nullable true
*/
modelPattern: string[] | null;
/**
* @type string
* @nullable true
*/
sourceId?: string | null;
unit: LlmpricingruletypesLLMPricingRuleUnitDTO;
}
export interface LlmpricingruletypesUpdatableLLMPricingRulesDTO {
/**
* @type array
* @nullable true
*/
rules: LlmpricingruletypesUpdatableLLMPricingRuleDTO[] | null;
}
export interface MetricsexplorertypesInspectMetricsRequestDTO {
/**
* @type integer
@@ -6198,6 +6490,41 @@ export type CreateInvite201 = {
status: string;
};
export type ListLLMPricingRulesParams = {
/**
* @type integer
* @description undefined
*/
offset?: number;
/**
* @type integer
* @description undefined
*/
limit?: number;
};
export type ListLLMPricingRules200 = {
data: LlmpricingruletypesGettablePricingRulesDTO;
/**
* @type string
*/
status: string;
};
export type DeleteLLMPricingRulePathParameters = {
id: string;
};
export type GetLLMPricingRulePathParameters = {
id: string;
};
export type GetLLMPricingRule200 = {
data: LlmpricingruletypesLLMPricingRuleDTO;
/**
* @type string
*/
status: string;
};
export type ListPromotedAndIndexedPaths200 = {
/**
* @type array
@@ -6638,6 +6965,14 @@ export type Healthz503 = {
status: string;
};
export type ListHosts200 = {
data: InframonitoringtypesHostsDTO;
/**
* @type string
*/
status: string;
};
export type Livez200 = {
data: FactoryResponseDTO;
/**

View File

@@ -0,0 +1,33 @@
package signozapiserver
import (
"net/http"
"github.com/SigNoz/signoz/pkg/http/handler"
"github.com/SigNoz/signoz/pkg/types"
"github.com/SigNoz/signoz/pkg/types/inframonitoringtypes"
"github.com/gorilla/mux"
)
func (provider *provider) addInfraMonitoringRoutes(router *mux.Router) error {
if err := router.Handle("/api/v2/infra_monitoring/hosts", handler.New(
provider.authZ.ViewAccess(provider.infraMonitoringHandler.ListHosts),
handler.OpenAPIDef{
ID: "ListHosts",
Tags: []string{"inframonitoring"},
Summary: "List Hosts for Infra Monitoring",
Description: "Returns a paginated list of hosts with key infrastructure metrics: CPU usage (%), memory usage (%), I/O wait (%), disk usage (%), and 15-minute load average. Each host includes its current status (active/inactive based on metrics reported in the last 10 minutes) and metadata attributes (e.g., os.type). Supports filtering via a filter expression, filtering by host status, custom groupBy to aggregate hosts by any attribute, ordering by any of the five metrics, and pagination via offset/limit. The response type is 'list' for the default host.name grouping or 'grouped_list' for custom groupBy keys. Also reports missing required metrics and whether the requested time range falls before the data retention boundary.",
Request: new(inframonitoringtypes.PostableHosts),
RequestContentType: "application/json",
Response: new(inframonitoringtypes.Hosts),
ResponseContentType: "application/json",
SuccessStatusCode: http.StatusOK,
ErrorStatusCodes: []int{http.StatusBadRequest, http.StatusUnauthorized},
Deprecated: false,
SecuritySchemes: newSecuritySchemes(types.RoleViewer),
})).Methods(http.MethodPost).GetError(); err != nil {
return err
}
return nil
}

View File

@@ -0,0 +1,93 @@
package signozapiserver
import (
"net/http"
"github.com/SigNoz/signoz/pkg/http/handler"
"github.com/SigNoz/signoz/pkg/types"
"github.com/SigNoz/signoz/pkg/types/llmpricingruletypes"
"github.com/gorilla/mux"
)
func (provider *provider) addLLMPricingRuleRoutes(router *mux.Router) error {
if err := router.Handle("/api/v1/llm_pricing_rules", handler.New(
provider.authZ.ViewAccess(provider.llmPricingRuleHandler.List),
handler.OpenAPIDef{
ID: "ListLLMPricingRules",
Tags: []string{"llmpricingrules"},
Summary: "List pricing rules",
Description: "Returns all LLM pricing rules for the authenticated org, with pagination.",
Request: nil,
RequestContentType: "",
RequestQuery: new(llmpricingruletypes.ListPricingRulesQuery),
Response: new(llmpricingruletypes.GettablePricingRules),
ResponseContentType: "application/json",
SuccessStatusCode: http.StatusOK,
ErrorStatusCodes: []int{http.StatusBadRequest},
Deprecated: false,
SecuritySchemes: newSecuritySchemes(types.RoleViewer),
},
)).Methods(http.MethodGet).GetError(); err != nil {
return err
}
if err := router.Handle("/api/v1/llm_pricing_rules", handler.New(
provider.authZ.AdminAccess(provider.llmPricingRuleHandler.Update),
handler.OpenAPIDef{
ID: "UpdateLLMPricingRules",
Tags: []string{"llmpricingrules"},
Summary: "Bulk update pricing rules",
Description: "Single write endpoint used by both the user and the Zeus sync job. Per-rule match is by id, then sourceId, then insert. Override rows (is_override=true) are fully preserved when the request does not provide isOverride; only synced_at is stamped.",
Request: new(llmpricingruletypes.UpdatableLLMPricingRules),
RequestContentType: "application/json",
SuccessStatusCode: http.StatusNoContent,
ErrorStatusCodes: []int{http.StatusBadRequest},
Deprecated: false,
SecuritySchemes: newSecuritySchemes(types.RoleAdmin),
},
)).Methods(http.MethodPut).GetError(); err != nil {
return err
}
if err := router.Handle("/api/v1/llm_pricing_rules/{id}", handler.New(
provider.authZ.ViewAccess(provider.llmPricingRuleHandler.Get),
handler.OpenAPIDef{
ID: "GetLLMPricingRule",
Tags: []string{"llmpricingrules"},
Summary: "Get a pricing rule",
Description: "Returns a single LLM pricing rule by ID.",
Request: nil,
RequestContentType: "",
Response: new(llmpricingruletypes.GettableLLMPricingRule),
ResponseContentType: "application/json",
SuccessStatusCode: http.StatusOK,
ErrorStatusCodes: []int{http.StatusNotFound},
Deprecated: false,
SecuritySchemes: newSecuritySchemes(types.RoleViewer),
},
)).Methods(http.MethodGet).GetError(); err != nil {
return err
}
if err := router.Handle("/api/v1/llm_pricing_rules/{id}", handler.New(
provider.authZ.AdminAccess(provider.llmPricingRuleHandler.Delete),
handler.OpenAPIDef{
ID: "DeleteLLMPricingRule",
Tags: []string{"llmpricingrules"},
Summary: "Delete a pricing rule",
Description: "Hard-deletes a pricing rule. If auto-synced, it will be recreated on the next sync cycle.",
Request: nil,
RequestContentType: "",
Response: nil,
ResponseContentType: "",
SuccessStatusCode: http.StatusNoContent,
ErrorStatusCodes: []int{http.StatusNotFound},
Deprecated: false,
SecuritySchemes: newSecuritySchemes(types.RoleAdmin),
},
)).Methods(http.MethodDelete).GetError(); err != nil {
return err
}
return nil
}

View File

@@ -16,6 +16,8 @@ import (
"github.com/SigNoz/signoz/pkg/modules/cloudintegration"
"github.com/SigNoz/signoz/pkg/modules/dashboard"
"github.com/SigNoz/signoz/pkg/modules/fields"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/modules/llmpricingrule"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/organization"
"github.com/SigNoz/signoz/pkg/modules/preference"
@@ -49,6 +51,7 @@ type provider struct {
dashboardModule dashboard.Module
dashboardHandler dashboard.Handler
metricsExplorerHandler metricsexplorer.Handler
infraMonitoringHandler inframonitoring.Handler
gatewayHandler gateway.Handler
fieldsHandler fields.Handler
authzHandler authz.Handler
@@ -61,6 +64,7 @@ type provider struct {
ruleStateHistoryHandler rulestatehistory.Handler
alertmanagerHandler alertmanager.Handler
rulerHandler ruler.Handler
llmPricingRuleHandler llmpricingrule.Handler
}
func NewFactory(
@@ -77,6 +81,7 @@ func NewFactory(
dashboardModule dashboard.Module,
dashboardHandler dashboard.Handler,
metricsExplorerHandler metricsexplorer.Handler,
infraMonitoringHandler inframonitoring.Handler,
gatewayHandler gateway.Handler,
fieldsHandler fields.Handler,
authzHandler authz.Handler,
@@ -88,6 +93,7 @@ func NewFactory(
cloudIntegrationHandler cloudintegration.Handler,
ruleStateHistoryHandler rulestatehistory.Handler,
alertmanagerHandler alertmanager.Handler,
llmPricingRuleHandler llmpricingrule.Handler,
rulerHandler ruler.Handler,
) factory.ProviderFactory[apiserver.APIServer, apiserver.Config] {
return factory.NewProviderFactory(factory.MustNewName("signoz"), func(ctx context.Context, providerSettings factory.ProviderSettings, config apiserver.Config) (apiserver.APIServer, error) {
@@ -108,6 +114,7 @@ func NewFactory(
dashboardModule,
dashboardHandler,
metricsExplorerHandler,
infraMonitoringHandler,
gatewayHandler,
fieldsHandler,
authzHandler,
@@ -119,6 +126,7 @@ func NewFactory(
cloudIntegrationHandler,
ruleStateHistoryHandler,
alertmanagerHandler,
llmPricingRuleHandler,
rulerHandler,
)
})
@@ -141,6 +149,7 @@ func newProvider(
dashboardModule dashboard.Module,
dashboardHandler dashboard.Handler,
metricsExplorerHandler metricsexplorer.Handler,
infraMonitoringHandler inframonitoring.Handler,
gatewayHandler gateway.Handler,
fieldsHandler fields.Handler,
authzHandler authz.Handler,
@@ -152,6 +161,8 @@ func newProvider(
cloudIntegrationHandler cloudintegration.Handler,
ruleStateHistoryHandler rulestatehistory.Handler,
alertmanagerHandler alertmanager.Handler,
llmPricingRuleHandler llmpricingrule.Handler,
rulerHandler ruler.Handler,
) (apiserver.APIServer, error) {
settings := factory.NewScopedProviderSettings(providerSettings, "github.com/SigNoz/signoz/pkg/apiserver/signozapiserver")
@@ -172,6 +183,7 @@ func newProvider(
dashboardModule: dashboardModule,
dashboardHandler: dashboardHandler,
metricsExplorerHandler: metricsExplorerHandler,
infraMonitoringHandler: infraMonitoringHandler,
gatewayHandler: gatewayHandler,
fieldsHandler: fieldsHandler,
authzHandler: authzHandler,
@@ -184,6 +196,7 @@ func newProvider(
ruleStateHistoryHandler: ruleStateHistoryHandler,
alertmanagerHandler: alertmanagerHandler,
rulerHandler: rulerHandler,
llmPricingRuleHandler: llmPricingRuleHandler,
}
provider.authZ = middleware.NewAuthZ(settings.Logger(), orgGetter, authz)
@@ -240,6 +253,10 @@ func (provider *provider) AddToRouter(router *mux.Router) error {
return err
}
if err := provider.addInfraMonitoringRoutes(router); err != nil {
return err
}
if err := provider.addGatewayRoutes(router); err != nil {
return err
}
@@ -288,6 +305,10 @@ func (provider *provider) AddToRouter(router *mux.Router) error {
return err
}
if err := provider.addLLMPricingRuleRoutes(router); err != nil {
return err
}
if err := provider.addRulerRoutes(router); err != nil {
return err
}

View File

@@ -0,0 +1,33 @@
package inframonitoring
import (
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/factory"
)
type Config struct {
TelemetryStore TelemetryStoreConfig `mapstructure:"telemetrystore"`
}
type TelemetryStoreConfig struct {
Threads int `mapstructure:"threads"`
}
func NewConfigFactory() factory.ConfigFactory {
return factory.NewConfigFactory(factory.MustNewName("inframonitoring"), newConfig)
}
func newConfig() factory.Config {
return Config{
TelemetryStore: TelemetryStoreConfig{
Threads: 8,
},
}
}
func (c Config) Validate() error {
if c.TelemetryStore.Threads <= 0 {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "inframonitoring.telemetrystore.threads must be positive, got %d", c.TelemetryStore.Threads)
}
return nil
}

View File

@@ -0,0 +1,47 @@
package implinframonitoring
import (
"net/http"
"github.com/SigNoz/signoz/pkg/http/binding"
"github.com/SigNoz/signoz/pkg/http/render"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/types/authtypes"
"github.com/SigNoz/signoz/pkg/types/inframonitoringtypes"
"github.com/SigNoz/signoz/pkg/valuer"
)
type handler struct {
module inframonitoring.Module
}
// NewHandler returns an inframonitoring.Handler implementation.
func NewHandler(m inframonitoring.Module) inframonitoring.Handler {
return &handler{
module: m,
}
}
func (h *handler) ListHosts(rw http.ResponseWriter, req *http.Request) {
claims, err := authtypes.ClaimsFromContext(req.Context())
if err != nil {
render.Error(rw, err)
return
}
orgID := valuer.MustNewUUID(claims.OrgID)
var parsedReq inframonitoringtypes.PostableHosts
if err := binding.JSON.BindBody(req.Body, &parsedReq); err != nil {
render.Error(rw, err)
return
}
result, err := h.module.ListHosts(req.Context(), orgID, &parsedReq)
if err != nil {
render.Error(rw, err)
return
}
render.Success(rw, http.StatusOK, result)
}

View File

@@ -0,0 +1,573 @@
package implinframonitoring
import (
"context"
"fmt"
"sort"
"strings"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/querybuilder"
"github.com/SigNoz/signoz/pkg/telemetrymetrics"
"github.com/SigNoz/signoz/pkg/types/metrictypes"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/huandu/go-sqlbuilder"
)
// quoteIdentifier wraps s in backticks for use as a ClickHouse identifier,
// escaping any embedded backticks by doubling them.
func quoteIdentifier(s string) string {
return fmt.Sprintf("`%s`", strings.ReplaceAll(s, "`", "``"))
}
func isKeyInGroupByAttrs(groupByAttrs []qbtypes.GroupByKey, key string) bool {
for _, groupBy := range groupByAttrs {
if groupBy.Name == key {
return true
}
}
return false
}
func mergeFilterExpressions(queryFilterExpr, reqFilterExpr string) string {
queryFilterExpr = strings.TrimSpace(queryFilterExpr)
reqFilterExpr = strings.TrimSpace(reqFilterExpr)
if queryFilterExpr == "" {
return reqFilterExpr
}
if reqFilterExpr == "" {
return queryFilterExpr
}
return fmt.Sprintf("(%s) AND (%s)", queryFilterExpr, reqFilterExpr)
}
// compositeKeyFromList builds a composite key by joining the given parts
// with a null byte separator. This is the canonical way to construct
// composite keys for group identification across the infra monitoring module.
func compositeKeyFromList(parts []string) string {
return strings.Join(parts, "\x00")
}
// compositeKeyFromLabels builds a composite key from a label map by extracting
// the value for each groupBy key in order and joining them via compositeKeyFromList.
func compositeKeyFromLabels(labels map[string]string, groupBy []qbtypes.GroupByKey) string {
parts := make([]string, len(groupBy))
for i, key := range groupBy {
parts[i] = labels[key.Name]
}
return compositeKeyFromList(parts)
}
// parseAndSortGroups extracts group label maps from a ScalarData response and
// sorts them by the ranking query's aggregation value.
func parseAndSortGroups(
resp *qbtypes.QueryRangeResponse,
rankingQueryName string,
groupBy []qbtypes.GroupByKey,
direction qbtypes.OrderDirection,
) []rankedGroup {
if resp == nil || len(resp.Data.Results) == 0 {
return nil
}
// Find the ScalarData that contains the ranking column.
var sd *qbtypes.ScalarData
for _, r := range resp.Data.Results {
candidate, ok := r.(*qbtypes.ScalarData)
if !ok || candidate == nil {
continue
}
for _, col := range candidate.Columns {
if col.Type == qbtypes.ColumnTypeAggregation && col.QueryName == rankingQueryName {
sd = candidate
break
}
}
if sd != nil {
break
}
}
if sd == nil || len(sd.Data) == 0 {
return nil
}
groupColIndices := make(map[string]int)
rankingColIdx := -1
for i, col := range sd.Columns {
if col.Type == qbtypes.ColumnTypeGroup {
groupColIndices[col.Name] = i
}
if col.Type == qbtypes.ColumnTypeAggregation && col.QueryName == rankingQueryName {
rankingColIdx = i
}
}
if rankingColIdx == -1 {
return nil
}
groups := make([]rankedGroup, 0, len(sd.Data))
for _, row := range sd.Data {
labels := make(map[string]string, len(groupBy))
for _, key := range groupBy {
if idx, ok := groupColIndices[key.Name]; ok && idx < len(row) {
labels[key.Name] = fmt.Sprintf("%v", row[idx])
}
}
var value float64
if rankingColIdx < len(row) {
if v, ok := row[rankingColIdx].(float64); ok {
value = v
}
}
groups = append(groups, rankedGroup{
labels: labels,
value: value,
compositeKey: compositeKeyFromLabels(labels, groupBy),
})
}
sort.Slice(groups, func(i, j int) bool {
if groups[i].value != groups[j].value {
if direction == qbtypes.OrderDirectionAsc {
return groups[i].value < groups[j].value
}
return groups[i].value > groups[j].value
}
return groups[i].compositeKey < groups[j].compositeKey
})
return groups
}
// paginateWithBackfill returns the page of groups for [offset, offset+limit).
// The virtual sorted list is: metric-ranked groups first, then metadata-only
// groups (those in metadataMap but not in metric results) sorted alphabetically.
func paginateWithBackfill(
metricGroups []rankedGroup,
metadataMap map[string]map[string]string,
groupBy []qbtypes.GroupByKey,
offset, limit int,
) []map[string]string {
metricKeySet := make(map[string]bool, len(metricGroups))
for _, g := range metricGroups {
metricKeySet[g.compositeKey] = true
}
metadataOnlyKeys := make([]string, 0)
for compositeKey := range metadataMap {
if !metricKeySet[compositeKey] {
metadataOnlyKeys = append(metadataOnlyKeys, compositeKey)
}
}
sort.Strings(metadataOnlyKeys)
totalMetric := len(metricGroups)
totalAll := totalMetric + len(metadataOnlyKeys)
end := offset + limit
if end > totalAll {
end = totalAll
}
if offset >= totalAll {
return nil
}
pageGroups := make([]map[string]string, 0, end-offset)
for i := offset; i < end; i++ {
if i < totalMetric {
pageGroups = append(pageGroups, metricGroups[i].labels)
} else {
compositeKey := metadataOnlyKeys[i-totalMetric]
attrs := metadataMap[compositeKey]
labels := make(map[string]string, len(groupBy))
for _, key := range groupBy {
labels[key.Name] = attrs[key.Name]
}
pageGroups = append(pageGroups, labels)
}
}
return pageGroups
}
// buildPageGroupsFilterExpr builds a filter expression that restricts results
// to the given page of groups via IN clauses.
// Returns e.g. "host.name IN ('h1','h2') AND os.type IN ('linux','windows')".
func buildPageGroupsFilterExpr(pageGroups []map[string]string) string {
groupValues := make(map[string][]string)
for _, labels := range pageGroups {
for k, v := range labels {
groupValues[k] = append(groupValues[k], v)
}
}
inClauses := make([]string, 0, len(groupValues))
for key, values := range groupValues {
quoted := make([]string, len(values))
for i, v := range values {
quoted[i] = fmt.Sprintf("'%s'", v)
}
inClauses = append(inClauses, fmt.Sprintf("%s IN (%s)", key, strings.Join(quoted, ", ")))
}
return strings.Join(inClauses, " AND ")
}
// buildFullQueryRequest creates a QueryRangeRequest for all metrics,
// restricted to the given page of groups via an IN filter.
// Accepts primitive fields so it can be reused across different v2 APIs
// (hosts, pods, etc.).
func buildFullQueryRequest(
start int64,
end int64,
filterExpr string,
groupBy []qbtypes.GroupByKey,
pageGroups []map[string]string,
tableListQuery *qbtypes.QueryRangeRequest,
) *qbtypes.QueryRangeRequest {
inFilterExpr := buildPageGroupsFilterExpr(pageGroups)
fullReq := &qbtypes.QueryRangeRequest{
Start: uint64(start),
End: uint64(end),
RequestType: qbtypes.RequestTypeScalar,
CompositeQuery: qbtypes.CompositeQuery{
Queries: make([]qbtypes.QueryEnvelope, 0, len(tableListQuery.CompositeQuery.Queries)),
},
}
for _, envelope := range tableListQuery.CompositeQuery.Queries {
copied := envelope
if copied.Type == qbtypes.QueryTypeBuilder {
existingExpr := ""
if f := copied.GetFilter(); f != nil {
existingExpr = f.Expression
}
merged := mergeFilterExpressions(existingExpr, filterExpr)
merged = mergeFilterExpressions(merged, inFilterExpr)
copied.SetFilter(&qbtypes.Filter{Expression: merged})
copied.SetGroupBy(groupBy)
}
fullReq.CompositeQuery.Queries = append(fullReq.CompositeQuery.Queries, copied)
}
return fullReq
}
// parseFullQueryResponse extracts per-group metric values from the full
// composite query response. Returns compositeKey -> (queryName -> value).
// Each enabled query/formula produces its own ScalarData entry in Results,
// so we iterate over all of them and merge metrics per composite key.
func parseFullQueryResponse(
resp *qbtypes.QueryRangeResponse,
groupBy []qbtypes.GroupByKey,
) map[string]map[string]float64 {
result := make(map[string]map[string]float64)
if resp == nil || len(resp.Data.Results) == 0 {
return result
}
for _, r := range resp.Data.Results {
sd, ok := r.(*qbtypes.ScalarData)
if !ok || sd == nil {
continue
}
groupColIndices := make(map[string]int)
aggCols := make(map[int]string) // col index -> query name
for i, col := range sd.Columns {
if col.Type == qbtypes.ColumnTypeGroup {
groupColIndices[col.Name] = i
}
if col.Type == qbtypes.ColumnTypeAggregation {
aggCols[i] = col.QueryName
}
}
for _, row := range sd.Data {
labels := make(map[string]string, len(groupBy))
for _, key := range groupBy {
if idx, ok := groupColIndices[key.Name]; ok && idx < len(row) {
labels[key.Name] = fmt.Sprintf("%v", row[idx])
}
}
compositeKey := compositeKeyFromLabels(labels, groupBy)
if result[compositeKey] == nil {
result[compositeKey] = make(map[string]float64)
}
for idx, queryName := range aggCols {
if idx < len(row) {
if v, ok := row[idx].(float64); ok {
result[compositeKey][queryName] = v
}
}
}
}
}
return result
}
// buildSamplesTblFingerprintSubQuery returns a SelectBuilder that selects distinct fingerprints
// from the samples table for the given metric names andtime range.
func (m *module) buildSamplesTblFingerprintSubQuery(metricNames []string, startMs, endMs int64) *sqlbuilder.SelectBuilder {
samplesTableName := telemetrymetrics.WhichSamplesTableToUse(
uint64(startMs), uint64(endMs),
metrictypes.UnspecifiedType,
metrictypes.TimeAggregationUnspecified,
nil,
)
localSamplesTable := strings.TrimPrefix(samplesTableName, "distributed_")
fpSB := sqlbuilder.NewSelectBuilder()
fpSB.Select("DISTINCT fingerprint")
fpSB.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, localSamplesTable))
fpSB.Where(
fpSB.In("metric_name", sqlbuilder.List(metricNames)),
fpSB.GE("unix_milli", startMs),
fpSB.L("unix_milli", endMs),
)
return fpSB
}
func (m *module) buildFilterClause(ctx context.Context, filter *qbtypes.Filter, startMillis, endMillis int64) (*sqlbuilder.WhereClause, error) {
expression := ""
if filter != nil {
expression = strings.TrimSpace(filter.Expression)
}
if expression == "" {
return sqlbuilder.NewWhereClause(), nil
}
whereClauseSelectors := querybuilder.QueryStringToKeysSelectors(expression)
for idx := range whereClauseSelectors {
whereClauseSelectors[idx].Signal = telemetrytypes.SignalMetrics
whereClauseSelectors[idx].SelectorMatchType = telemetrytypes.FieldSelectorMatchTypeExact
}
keys, _, err := m.telemetryMetadataStore.GetKeysMulti(ctx, whereClauseSelectors)
if err != nil {
return nil, err
}
opts := querybuilder.FilterExprVisitorOpts{
Context: ctx,
Logger: m.logger,
FieldMapper: m.fieldMapper,
ConditionBuilder: m.condBuilder,
FullTextColumn: &telemetrytypes.TelemetryFieldKey{Name: "metric_name", FieldContext: telemetrytypes.FieldContextMetric},
FieldKeys: keys,
StartNs: querybuilder.ToNanoSecs(uint64(startMillis)),
EndNs: querybuilder.ToNanoSecs(uint64(endMillis)),
}
whereClause, err := querybuilder.PrepareWhereClause(expression, opts)
if err != nil {
return nil, err
}
if whereClause == nil || whereClause.WhereClause == nil {
return sqlbuilder.NewWhereClause(), nil
}
return whereClause.WhereClause, nil
}
// NOTE: this method is not specific to infra monitoring — it queries attributes_metadata generically.
// Consider moving to telemetryMetaStore when a second use case emerges.
//
// getMetricsExistenceAndEarliestTime checks which of the given metric names have been
// reported. It returns a list of missing metrics (those not found or with zero count)
// and the earliest first-reported timestamp across all present metrics.
// When all metrics are missing, minFirstReportedUnixMilli is 0.
// TODO(nikhilmantri0902, srikanthccv): This method was designed this way because querier errors if any of the metrics
// in the querier list was never sent, the QueryRange call throws not found error. Modify this method, if QueryRange
// behaviour changes towards this.
func (m *module) getMetricsExistenceAndEarliestTime(ctx context.Context, metricNames []string) ([]string, uint64, error) {
if len(metricNames) == 0 {
return nil, 0, nil
}
sb := sqlbuilder.NewSelectBuilder()
sb.Select("metric_name", "count(*) AS cnt", "min(first_reported_unix_milli) AS min_first_reported")
sb.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, telemetrymetrics.AttributesMetadataTableName))
sb.Where(sb.In("metric_name", sqlbuilder.List(metricNames)))
sb.GroupBy("metric_name")
query, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse)
rows, err := m.telemetryStore.ClickhouseDB().Query(ctx, query, args...)
if err != nil {
return nil, 0, err
}
defer rows.Close()
type metricInfo struct {
count uint64
minFirstReported uint64
}
found := make(map[string]metricInfo, len(metricNames))
for rows.Next() {
var name string
var cnt, minFR uint64
if err := rows.Scan(&name, &cnt, &minFR); err != nil {
return nil, 0, err
}
found[name] = metricInfo{count: cnt, minFirstReported: minFR}
}
if err := rows.Err(); err != nil {
return nil, 0, err
}
var missingMetrics []string
var globalMinFirstReported uint64
for _, name := range metricNames {
info, ok := found[name]
if !ok || info.count == 0 {
missingMetrics = append(missingMetrics, name)
continue
}
if globalMinFirstReported == 0 || info.minFirstReported < globalMinFirstReported {
globalMinFirstReported = info.minFirstReported
}
}
return missingMetrics, globalMinFirstReported, nil
}
// getMetadata fetches the latest values of additionalCols for each unique combination of groupBy keys,
// within the given time range and metric names. It uses argMax(tuple(...), unix_milli) to ensure
// we always pick attribute values from the latest timestamp for each group.
// The returned map has a composite key of groupBy column values joined by "\x00" (null byte),
// mapping to a flat map of attr_name -> attr_value (includes both groupBy and additional cols).
func (m *module) getMetadata(
ctx context.Context,
metricNames []string,
groupBy []qbtypes.GroupByKey,
additionalCols []string,
filter *qbtypes.Filter,
startMs, endMs int64,
) (map[string]map[string]string, error) {
if len(metricNames) == 0 {
return nil, errors.NewInvalidInputf(errors.CodeInvalidInput, "metricNames must not be empty")
}
if len(groupBy) == 0 {
return nil, errors.NewInvalidInputf(errors.CodeInvalidInput, "groupBy must not be empty")
}
// Pick the optimal timeseries table based on time range; also get adjusted start.
adjustedStart, adjustedEnd, distributedTableName, _ := telemetrymetrics.WhichTSTableToUse(
uint64(startMs), uint64(endMs), nil,
)
fpSB := m.buildSamplesTblFingerprintSubQuery(metricNames, startMs, endMs)
// Flatten groupBy keys to string names for SQL expressions and result scanning.
groupByCols := make([]string, len(groupBy))
for i, key := range groupBy {
groupByCols[i] = key.Name
}
allCols := append(groupByCols, additionalCols...)
// --- Build inner query ---
innerSB := sqlbuilder.NewSelectBuilder()
// Inner SELECT columns: JSONExtractString for each groupBy col + argMax(tuple(...)) for additional cols
innerSelectCols := make([]string, 0, len(groupByCols)+1)
for _, col := range groupByCols {
innerSelectCols = append(innerSelectCols,
fmt.Sprintf("JSONExtractString(labels, %s) AS %s", innerSB.Var(col), quoteIdentifier(col)),
)
}
// Build the argMax(tuple(...), unix_milli) expression for all additional cols
if len(additionalCols) > 0 {
tupleArgs := make([]string, 0, len(additionalCols))
for _, col := range additionalCols {
tupleArgs = append(tupleArgs, fmt.Sprintf("JSONExtractString(labels, %s)", innerSB.Var(col)))
}
innerSelectCols = append(innerSelectCols,
fmt.Sprintf("argMax(tuple(%s), unix_milli) AS latest_attrs", strings.Join(tupleArgs, ", ")),
)
}
innerSB.Select(innerSelectCols...)
innerSB.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, distributedTableName))
innerSB.Where(
innerSB.In("metric_name", sqlbuilder.List(metricNames)),
innerSB.GE("unix_milli", adjustedStart),
innerSB.L("unix_milli", adjustedEnd),
fmt.Sprintf("fingerprint IN (%s)", innerSB.Var(fpSB)),
)
// Apply optional filter expression
if filter != nil && strings.TrimSpace(filter.Expression) != "" {
filterClause, err := m.buildFilterClause(ctx, filter, startMs, endMs)
if err != nil {
return nil, err
}
if filterClause != nil {
innerSB.AddWhereClause(filterClause)
}
}
groupByAliases := make([]string, 0, len(groupByCols))
for _, col := range groupByCols {
groupByAliases = append(groupByAliases, quoteIdentifier(col))
}
innerSB.GroupBy(groupByAliases...)
innerQuery, innerArgs := innerSB.BuildWithFlavor(sqlbuilder.ClickHouse)
// --- Build outer query ---
// Outer SELECT columns: groupBy cols directly + tupleElement(latest_attrs, N) for each additionalCol
outerSelectCols := make([]string, 0, len(allCols))
for _, col := range groupByCols {
outerSelectCols = append(outerSelectCols, quoteIdentifier(col))
}
for i, col := range additionalCols {
outerSelectCols = append(outerSelectCols,
fmt.Sprintf("tupleElement(latest_attrs, %d) AS %s", i+1, quoteIdentifier(col)),
)
}
outerSB := sqlbuilder.NewSelectBuilder()
outerSB.Select(outerSelectCols...)
outerSB.From(fmt.Sprintf("(%s)", innerQuery))
outerQuery, _ := outerSB.BuildWithFlavor(sqlbuilder.ClickHouse)
// All ? params are in innerArgs; outer query introduces none of its own.
rows, err := m.telemetryStore.ClickhouseDB().Query(ctx, outerQuery, innerArgs...)
if err != nil {
return nil, err
}
defer rows.Close()
result := make(map[string]map[string]string)
for rows.Next() {
row := make([]string, len(allCols))
scanPtrs := make([]any, len(row))
for i := range row {
scanPtrs[i] = &row[i]
}
if err := rows.Scan(scanPtrs...); err != nil {
return nil, err
}
compositeKey := compositeKeyFromList(row[:len(groupByCols)])
attrMap := make(map[string]string, len(allCols))
for i, col := range allCols {
attrMap[col] = row[i]
}
result[compositeKey] = attrMap
}
if err := rows.Err(); err != nil {
return nil, err
}
return result, nil
}

View File

@@ -0,0 +1,283 @@
package implinframonitoring
import (
"testing"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
)
func groupByKey(name string) qbtypes.GroupByKey {
return qbtypes.GroupByKey{
TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{Name: name},
}
}
func TestIsKeyInGroupByAttrs(t *testing.T) {
tests := []struct {
name string
groupByAttrs []qbtypes.GroupByKey
key string
expectedFound bool
}{
{
name: "key present in single-element list",
groupByAttrs: []qbtypes.GroupByKey{groupByKey("host.name")},
key: "host.name",
expectedFound: true,
},
{
name: "key present in multi-element list",
groupByAttrs: []qbtypes.GroupByKey{
groupByKey("host.name"),
groupByKey("os.type"),
groupByKey("k8s.cluster.name"),
},
key: "os.type",
expectedFound: true,
},
{
name: "key at last position",
groupByAttrs: []qbtypes.GroupByKey{
groupByKey("host.name"),
groupByKey("os.type"),
},
key: "os.type",
expectedFound: true,
},
{
name: "key not in list",
groupByAttrs: []qbtypes.GroupByKey{groupByKey("host.name")},
key: "os.type",
expectedFound: false,
},
{
name: "empty group by list",
groupByAttrs: []qbtypes.GroupByKey{},
key: "host.name",
expectedFound: false,
},
{
name: "nil group by list",
groupByAttrs: nil,
key: "host.name",
expectedFound: false,
},
{
name: "empty key string",
groupByAttrs: []qbtypes.GroupByKey{groupByKey("host.name")},
key: "",
expectedFound: false,
},
{
name: "empty key matches empty-named group by key",
groupByAttrs: []qbtypes.GroupByKey{groupByKey("")},
key: "",
expectedFound: true,
},
{
name: "partial match does not count",
groupByAttrs: []qbtypes.GroupByKey{
groupByKey("host"),
},
key: "host.name",
expectedFound: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := isKeyInGroupByAttrs(tt.groupByAttrs, tt.key)
if got != tt.expectedFound {
t.Errorf("isKeyInGroupByAttrs(%v, %q) = %v, want %v",
tt.groupByAttrs, tt.key, got, tt.expectedFound)
}
})
}
}
func TestMergeFilterExpressions(t *testing.T) {
tests := []struct {
name string
queryFilterExpr string
reqFilterExpr string
expected string
}{
{
name: "both non-empty",
queryFilterExpr: "cpu > 50",
reqFilterExpr: "host.name = 'web-1'",
expected: "(cpu > 50) AND (host.name = 'web-1')",
},
{
name: "query empty, req non-empty",
queryFilterExpr: "",
reqFilterExpr: "host.name = 'web-1'",
expected: "host.name = 'web-1'",
},
{
name: "query non-empty, req empty",
queryFilterExpr: "cpu > 50",
reqFilterExpr: "",
expected: "cpu > 50",
},
{
name: "both empty",
queryFilterExpr: "",
reqFilterExpr: "",
expected: "",
},
{
name: "whitespace-only query treated as empty",
queryFilterExpr: " ",
reqFilterExpr: "host.name = 'web-1'",
expected: "host.name = 'web-1'",
},
{
name: "whitespace-only req treated as empty",
queryFilterExpr: "cpu > 50",
reqFilterExpr: " ",
expected: "cpu > 50",
},
{
name: "both whitespace-only",
queryFilterExpr: " ",
reqFilterExpr: " ",
expected: "",
},
{
name: "leading/trailing whitespace trimmed before merge",
queryFilterExpr: " cpu > 50 ",
reqFilterExpr: " mem < 80 ",
expected: "(cpu > 50) AND (mem < 80)",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := mergeFilterExpressions(tt.queryFilterExpr, tt.reqFilterExpr)
if got != tt.expected {
t.Errorf("mergeFilterExpressions(%q, %q) = %q, want %q",
tt.queryFilterExpr, tt.reqFilterExpr, got, tt.expected)
}
})
}
}
func TestCompositeKeyFromList(t *testing.T) {
tests := []struct {
name string
parts []string
expected string
}{
{
name: "single part",
parts: []string{"web-1"},
expected: "web-1",
},
{
name: "multiple parts joined with null separator",
parts: []string{"web-1", "linux", "us-east"},
expected: "web-1\x00linux\x00us-east",
},
{
name: "empty slice returns empty string",
parts: []string{},
expected: "",
},
{
name: "nil slice returns empty string",
parts: nil,
expected: "",
},
{
name: "parts with empty strings",
parts: []string{"web-1", "", "us-east"},
expected: "web-1\x00\x00us-east",
},
{
name: "all empty strings",
parts: []string{"", ""},
expected: "\x00",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := compositeKeyFromList(tt.parts)
if got != tt.expected {
t.Errorf("compositeKeyFromList(%v) = %q, want %q",
tt.parts, got, tt.expected)
}
})
}
}
func TestCompositeKeyFromLabels(t *testing.T) {
tests := []struct {
name string
labels map[string]string
groupBy []qbtypes.GroupByKey
expected string
}{
{
name: "single group-by key",
labels: map[string]string{"host.name": "web-1"},
groupBy: []qbtypes.GroupByKey{groupByKey("host.name")},
expected: "web-1",
},
{
name: "multiple group-by keys joined with null separator",
labels: map[string]string{
"host.name": "web-1",
"os.type": "linux",
},
groupBy: []qbtypes.GroupByKey{groupByKey("host.name"), groupByKey("os.type")},
expected: "web-1\x00linux",
},
{
name: "missing label yields empty segment",
labels: map[string]string{"host.name": "web-1"},
groupBy: []qbtypes.GroupByKey{groupByKey("host.name"), groupByKey("os.type")},
expected: "web-1\x00",
},
{
name: "empty labels map",
labels: map[string]string{},
groupBy: []qbtypes.GroupByKey{groupByKey("host.name")},
expected: "",
},
{
name: "empty group-by slice",
labels: map[string]string{"host.name": "web-1"},
groupBy: []qbtypes.GroupByKey{},
expected: "",
},
{
name: "nil labels map",
labels: nil,
groupBy: []qbtypes.GroupByKey{groupByKey("host.name")},
expected: "",
},
{
name: "order matches group-by order, not map iteration order",
labels: map[string]string{
"z": "last",
"a": "first",
"m": "middle",
},
groupBy: []qbtypes.GroupByKey{groupByKey("a"), groupByKey("m"), groupByKey("z")},
expected: "first\x00middle\x00last",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := compositeKeyFromLabels(tt.labels, tt.groupBy)
if got != tt.expected {
t.Errorf("compositeKeyFromLabels(%v, %v) = %q, want %q",
tt.labels, tt.groupBy, got, tt.expected)
}
})
}
}

View File

@@ -0,0 +1,354 @@
package implinframonitoring
import (
"context"
"fmt"
"slices"
"strings"
"github.com/SigNoz/signoz/pkg/telemetrymetrics"
"github.com/SigNoz/signoz/pkg/types/inframonitoringtypes"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/valuer"
"github.com/huandu/go-sqlbuilder"
)
// getPerGroupHostStatusCounts computes the number of active and inactive hosts per group
// for the current page. It queries the timeseries table using the provided filter
// expression (which includes user filter + status filter + page groups IN clause).
// Uses GLOBAL IN with the active-hosts subquery inside uniqExactIf for active count,
// and a simple uniqExactIf for total count. Inactive = total - active (computed in Go).
func (m *module) getPerGroupHostStatusCounts(
ctx context.Context,
req *inframonitoringtypes.PostableHosts,
metricNames []string,
pageGroups []map[string]string,
sinceUnixMilli int64,
) (map[string]groupHostStatusCounts, error) {
// Build the full filter expression from req (user filter + status filter) and page groups.
reqFilterExpr := ""
if req.Filter != nil {
reqFilterExpr = req.Filter.Expression
}
pageGroupsFilterExpr := buildPageGroupsFilterExpr(pageGroups)
filterExpr := mergeFilterExpressions(reqFilterExpr, pageGroupsFilterExpr)
adjustedStart, adjustedEnd, distributedTimeSeriesTableName, _ := telemetrymetrics.WhichTSTableToUse(
uint64(req.Start), uint64(req.End), nil,
)
hostNameExpr := fmt.Sprintf("JSONExtractString(labels, '%s')", hostNameAttrKey)
sb := sqlbuilder.NewSelectBuilder()
selectCols := make([]string, 0, len(req.GroupBy)+2)
for _, key := range req.GroupBy {
selectCols = append(selectCols,
fmt.Sprintf("JSONExtractString(labels, %s) AS %s", sb.Var(key.Name), quoteIdentifier(key.Name)),
)
}
activeHostsSQ := m.getActiveHostsQuery(metricNames, hostNameAttrKey, sinceUnixMilli)
selectCols = append(selectCols,
fmt.Sprintf("uniqExactIf(%s, %s GLOBAL IN (%s)) AS active_host_count", hostNameExpr, hostNameExpr, sb.Var(activeHostsSQ)),
fmt.Sprintf("uniqExactIf(%s, %s != '') AS total_host_count", hostNameExpr, hostNameExpr),
)
// Build a fingerprint subquery to restrict to fingerprints with actual sample
// data in the original time range (not the wider timeseries table window).
fpSB := m.buildSamplesTblFingerprintSubQuery(metricNames, req.Start, req.End)
sb.Select(selectCols...)
sb.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, distributedTimeSeriesTableName))
sb.Where(
sb.In("metric_name", sqlbuilder.List(metricNames)),
sb.GE("unix_milli", adjustedStart),
sb.L("unix_milli", adjustedEnd),
fmt.Sprintf("fingerprint IN (%s)", sb.Var(fpSB)),
)
// Apply the combined filter expression (user filter + status filter + page groups IN).
if filterExpr != "" {
filterClause, err := m.buildFilterClause(ctx, &qbtypes.Filter{Expression: filterExpr}, req.Start, req.End)
if err != nil {
return nil, err
}
if filterClause != nil {
sb.AddWhereClause(filterClause)
}
}
// GROUP BY
groupByAliases := make([]string, 0, len(req.GroupBy))
for _, key := range req.GroupBy {
groupByAliases = append(groupByAliases, quoteIdentifier(key.Name))
}
sb.GroupBy(groupByAliases...)
query, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse)
rows, err := m.telemetryStore.ClickhouseDB().Query(ctx, query, args...)
if err != nil {
return nil, err
}
defer rows.Close()
result := make(map[string]groupHostStatusCounts)
for rows.Next() {
groupVals := make([]string, len(req.GroupBy))
scanPtrs := make([]any, 0, len(req.GroupBy)+2)
for i := range groupVals {
scanPtrs = append(scanPtrs, &groupVals[i])
}
var activeCount, totalCount uint64
scanPtrs = append(scanPtrs, &activeCount, &totalCount)
if err := rows.Scan(scanPtrs...); err != nil {
return nil, err
}
compositeKey := compositeKeyFromList(groupVals)
result[compositeKey] = groupHostStatusCounts{
Active: int(activeCount),
Inactive: int(totalCount - activeCount),
}
}
if err := rows.Err(); err != nil {
return nil, err
}
return result, nil
}
// buildHostRecords constructs the final list of HostRecords for a page.
// Groups that had no metric data get default values of -1.
//
// hostCounts is nil when host.name is in the groupBy — in that case, counts are
// derived directly from activeHostsMap (1/0 per host). When non-nil (custom groupBy
// without host.name), counts are looked up from the map.
func buildHostRecords(
isHostNameInGroupBy bool,
resp *qbtypes.QueryRangeResponse,
pageGroups []map[string]string,
groupBy []qbtypes.GroupByKey,
metadataMap map[string]map[string]string,
activeHostsMap map[string]bool,
hostCounts map[string]groupHostStatusCounts,
) []inframonitoringtypes.HostRecord {
metricsMap := parseFullQueryResponse(resp, groupBy)
records := make([]inframonitoringtypes.HostRecord, 0, len(pageGroups))
for _, labels := range pageGroups {
compositeKey := compositeKeyFromLabels(labels, groupBy)
hostName := labels[hostNameAttrKey]
activeStatus := inframonitoringtypes.HostStatusNone
activeHostCount := 0
inactiveHostCount := 0
if isHostNameInGroupBy { // derive from activeHostsMap since each row = one host
if hostName != "" {
if activeHostsMap[hostName] {
activeStatus = inframonitoringtypes.HostStatusActive
activeHostCount = 1
} else {
activeStatus = inframonitoringtypes.HostStatusInactive
inactiveHostCount = 1
}
}
} else { // derive from hostCounts since custom groupBy without host.name
if counts, ok := hostCounts[compositeKey]; ok {
activeHostCount = counts.Active
inactiveHostCount = counts.Inactive
}
}
record := inframonitoringtypes.HostRecord{
HostName: hostName,
Status: activeStatus,
ActiveHostCount: activeHostCount,
InactiveHostCount: inactiveHostCount,
CPU: -1,
Memory: -1,
Wait: -1,
Load15: -1,
DiskUsage: -1,
Meta: map[string]any{},
}
if metrics, ok := metricsMap[compositeKey]; ok {
if v, exists := metrics["F1"]; exists {
record.CPU = v
}
if v, exists := metrics["F2"]; exists {
record.Memory = v
}
if v, exists := metrics["F3"]; exists {
record.Wait = v
}
if v, exists := metrics["F4"]; exists {
record.DiskUsage = v
}
if v, exists := metrics["G"]; exists {
record.Load15 = v
}
}
if attrs, ok := metadataMap[compositeKey]; ok {
for k, v := range attrs {
record.Meta[k] = v
}
}
records = append(records, record)
}
return records
}
// getTopHostGroups runs a ranking query for the ordering metric, sorts the
// results, paginates, and backfills from metadataMap when the page extends
// past the metric-ranked groups.
func (m *module) getTopHostGroups(
ctx context.Context,
orgID valuer.UUID,
req *inframonitoringtypes.PostableHosts,
metadataMap map[string]map[string]string,
) ([]map[string]string, error) {
orderByKey := req.OrderBy.Key.Name
queryNamesForOrderBy := orderByToHostsQueryNames[orderByKey]
// The last entry is the formula/query whose value we sort by.
rankingQueryName := queryNamesForOrderBy[len(queryNamesForOrderBy)-1]
topReq := &qbtypes.QueryRangeRequest{
Start: uint64(req.Start),
End: uint64(req.End),
RequestType: qbtypes.RequestTypeScalar,
CompositeQuery: qbtypes.CompositeQuery{
Queries: make([]qbtypes.QueryEnvelope, 0, len(queryNamesForOrderBy)),
},
}
for _, envelope := range m.newListHostsQuery().CompositeQuery.Queries {
if !slices.Contains(queryNamesForOrderBy, envelope.GetQueryName()) {
continue
}
copied := envelope
if copied.Type == qbtypes.QueryTypeBuilder {
existingExpr := ""
if f := copied.GetFilter(); f != nil {
existingExpr = f.Expression
}
reqFilterExpr := ""
if req.Filter != nil {
reqFilterExpr = req.Filter.Expression
}
merged := mergeFilterExpressions(existingExpr, reqFilterExpr)
copied.SetFilter(&qbtypes.Filter{Expression: merged})
copied.SetGroupBy(req.GroupBy)
}
topReq.CompositeQuery.Queries = append(topReq.CompositeQuery.Queries, copied)
}
resp, err := m.querier.QueryRange(ctx, orgID, topReq)
if err != nil {
return nil, err
}
allMetricGroups := parseAndSortGroups(resp, rankingQueryName, req.GroupBy, req.OrderBy.Direction)
return paginateWithBackfill(allMetricGroups, metadataMap, req.GroupBy, req.Offset, req.Limit), nil
}
// applyHostsActiveStatusFilter MODIFIES req.Filter.Expression to include an IN/NOT IN
// clause based on FilterByStatus and the set of active hosts.
// Returns true if the caller should short-circuit with an empty result (eg. ACTIVE
// requested but no hosts are active).
func (m *module) applyHostsActiveStatusFilter(req *inframonitoringtypes.PostableHosts, activeHostsMap map[string]bool) (shouldShortCircuit bool) {
if req.Filter == nil || (req.Filter.FilterByStatus != inframonitoringtypes.HostStatusActive && req.Filter.FilterByStatus != inframonitoringtypes.HostStatusInactive) {
return false
}
activeHosts := make([]string, 0, len(activeHostsMap))
for host := range activeHostsMap {
activeHosts = append(activeHosts, fmt.Sprintf("'%s'", host))
}
if len(activeHosts) == 0 {
return req.Filter.FilterByStatus == inframonitoringtypes.HostStatusActive
}
op := "IN"
if req.Filter.FilterByStatus == inframonitoringtypes.HostStatusInactive {
op = "NOT IN"
}
statusClause := fmt.Sprintf("%s %s (%s)", hostNameAttrKey, op, strings.Join(activeHosts, ", "))
req.Filter.Expression = mergeFilterExpressions(req.Filter.Expression, statusClause)
return false
}
func (m *module) getHostsTableMetadata(ctx context.Context, req *inframonitoringtypes.PostableHosts) (map[string]map[string]string, error) {
var nonGroupByAttrs []string
for _, key := range hostAttrKeysForMetadata {
if !isKeyInGroupByAttrs(req.GroupBy, key) {
nonGroupByAttrs = append(nonGroupByAttrs, key)
}
}
var filter *qbtypes.Filter
if req.Filter != nil {
filter = &req.Filter.Filter
}
metadataMap, err := m.getMetadata(ctx, hostsTableMetricNamesList, req.GroupBy, nonGroupByAttrs, filter, req.Start, req.End)
if err != nil {
return nil, err
}
return metadataMap, nil
}
// getActiveHostsQuery builds a SelectBuilder that returns distinct host names
// with metrics reported in the last 10 minutes. The builder is not executed —
// callers can either execute it (getActiveHosts) or embed it as a subquery
// (getPerGroupActiveInactiveHostCounts).
func (m *module) getActiveHostsQuery(metricNames []string, hostNameAttr string, sinceUnixMilli int64) *sqlbuilder.SelectBuilder {
sb := sqlbuilder.NewSelectBuilder()
sb.Distinct()
sb.Select("attr_string_value")
sb.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, telemetrymetrics.AttributesMetadataTableName))
sb.Where(
sb.In("metric_name", sqlbuilder.List(metricNames)),
sb.E("attr_name", hostNameAttr),
sb.NE("attr_string_value", ""),
sb.GE("last_reported_unix_milli", sinceUnixMilli),
)
return sb
}
// getActiveHosts returns a set of host names that have reported metrics recently.
// It queries distributed_metadata for hosts where last_reported_unix_milli >= sinceUnixMilli.
func (m *module) getActiveHosts(ctx context.Context, metricNames []string, hostNameAttr string, sinceUnixMilli int64) (map[string]bool, error) {
sb := m.getActiveHostsQuery(metricNames, hostNameAttr, sinceUnixMilli)
query, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse)
rows, err := m.telemetryStore.ClickhouseDB().Query(ctx, query, args...)
if err != nil {
return nil, err
}
defer rows.Close()
activeHosts := make(map[string]bool)
for rows.Next() {
var hostName string
if err := rows.Scan(&hostName); err != nil {
return nil, err
}
if hostName != "" {
activeHosts[hostName] = true
}
}
if err := rows.Err(); err != nil {
return nil, err
}
return activeHosts, nil
}

View File

@@ -0,0 +1,270 @@
package implinframonitoring
import (
"github.com/SigNoz/signoz/pkg/types/inframonitoringtypes"
"github.com/SigNoz/signoz/pkg/types/metrictypes"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
)
const (
hostNameAttrKey = "host.name"
)
// Helper group-by key used across all queries.
var hostNameGroupByKey = qbtypes.GroupByKey{
TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{
Name: hostNameAttrKey,
FieldContext: telemetrytypes.FieldContextResource,
FieldDataType: telemetrytypes.FieldDataTypeString,
},
}
var hostsTableMetricNamesList = []string{
"system.cpu.time",
"system.memory.usage",
"system.cpu.load_average.15m",
"system.filesystem.usage",
}
var hostAttrKeysForMetadata = []string{
"os.type",
}
// orderByToHostsQueryNames maps the orderBy column to the query/formula names
// from HostsTableListQuery used for ranking host groups.
var orderByToHostsQueryNames = map[string][]string{
inframonitoringtypes.HostsOrderByCPU: {"A", "B", "F1"},
inframonitoringtypes.HostsOrderByMemory: {"C", "D", "F2"},
inframonitoringtypes.HostsOrderByWait: {"E", "F", "F3"},
inframonitoringtypes.HostsOrderByDiskUsage: {"H", "I", "F4"},
inframonitoringtypes.HostsOrderByLoad15: {"G"},
}
// newListHostsQuery constructs the base QueryRangeRequest with all the queries for the hosts table.
// This is kept in this file because the queries themselves do not change based on the request parameters
// only the filters, group bys, and order bys change, which are applied in buildFullQueryRequest.
func (m *module) newListHostsQuery() *qbtypes.QueryRangeRequest {
return &qbtypes.QueryRangeRequest{
RequestType: qbtypes.RequestTypeScalar,
CompositeQuery: qbtypes.CompositeQuery{
Queries: []qbtypes.QueryEnvelope{
// Query A: CPU usage logic (non-idle)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "A",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.cpu.time",
TimeAggregation: metrictypes.TimeAggregationRate,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
Filter: &qbtypes.Filter{
Expression: "state != 'idle'",
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Query B: CPU usage (all states)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "B",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.cpu.time",
TimeAggregation: metrictypes.TimeAggregationRate,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Formula F1: CPU Usage (%)
{
Type: qbtypes.QueryTypeFormula,
Spec: qbtypes.QueryBuilderFormula{
Name: "F1",
Expression: "A/B",
Legend: "CPU Usage (%)",
Disabled: false,
},
},
// Query C: Memory usage (state = used)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "C",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.memory.usage",
TimeAggregation: metrictypes.TimeAggregationAvg,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
Filter: &qbtypes.Filter{
Expression: "state = 'used'",
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Query D: Memory usage (all states)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "D",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.memory.usage",
TimeAggregation: metrictypes.TimeAggregationAvg,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Formula F2: Memory Usage (%)
{
Type: qbtypes.QueryTypeFormula,
Spec: qbtypes.QueryBuilderFormula{
Name: "F2",
Expression: "C/D",
Legend: "Memory Usage (%)",
Disabled: false,
},
},
// Query E: CPU Wait time (state = wait)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "E",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.cpu.time",
TimeAggregation: metrictypes.TimeAggregationRate,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
Filter: &qbtypes.Filter{
Expression: "state = 'wait'",
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Query F: CPU time (all states)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "F",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.cpu.time",
TimeAggregation: metrictypes.TimeAggregationRate,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Formula F3: CPU Wait Time (%)
{
Type: qbtypes.QueryTypeFormula,
Spec: qbtypes.QueryBuilderFormula{
Name: "F3",
Expression: "E/F",
Legend: "CPU Wait Time (%)",
Disabled: false,
},
},
// Query G: Load15
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "G",
Signal: telemetrytypes.SignalMetrics,
Legend: "CPU Load Average (15m)",
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.cpu.load_average.15m",
TimeAggregation: metrictypes.TimeAggregationAvg,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: false,
},
},
// Query H: Filesystem Usage (state = used)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "H",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.filesystem.usage",
TimeAggregation: metrictypes.TimeAggregationAvg,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
Filter: &qbtypes.Filter{
Expression: "state = 'used'",
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Query I: Filesystem Usage (all states)
{
Type: qbtypes.QueryTypeBuilder,
Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Name: "I",
Signal: telemetrytypes.SignalMetrics,
Aggregations: []qbtypes.MetricAggregation{
{
MetricName: "system.filesystem.usage",
TimeAggregation: metrictypes.TimeAggregationAvg,
SpaceAggregation: metrictypes.SpaceAggregationSum,
ReduceTo: qbtypes.ReduceToAvg,
},
},
GroupBy: []qbtypes.GroupByKey{hostNameGroupByKey},
Disabled: true,
},
},
// Formula F4: Disk Usage (%)
{
Type: qbtypes.QueryTypeFormula,
Spec: qbtypes.QueryBuilderFormula{
Name: "F4",
Expression: "H/I",
Legend: "Disk Usage (%)",
Disabled: false,
},
},
},
},
}
}

View File

@@ -0,0 +1,16 @@
package implinframonitoring
// The types in this file are only used within the implinframonitoring package, and are not exposed outside.
// They are primarily used for internal processing and structuring of data within the module's implementation.
type rankedGroup struct {
labels map[string]string
value float64
compositeKey string
}
// groupHostStatusCounts holds per-group active and inactive host counts.
type groupHostStatusCounts struct {
Active int
Inactive int
}

View File

@@ -0,0 +1,161 @@
package implinframonitoring
import (
"context"
"log/slog"
"time"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/querier"
"github.com/SigNoz/signoz/pkg/telemetrymetrics"
"github.com/SigNoz/signoz/pkg/telemetrystore"
"github.com/SigNoz/signoz/pkg/types/inframonitoringtypes"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/valuer"
)
type module struct {
telemetryStore telemetrystore.TelemetryStore
telemetryMetadataStore telemetrytypes.MetadataStore
querier querier.Querier
fieldMapper qbtypes.FieldMapper
condBuilder qbtypes.ConditionBuilder
logger *slog.Logger
config inframonitoring.Config
}
// NewModule constructs the inframonitoring module with the provided dependencies.
func NewModule(
telemetryStore telemetrystore.TelemetryStore,
telemetryMetadataStore telemetrytypes.MetadataStore,
querier querier.Querier,
providerSettings factory.ProviderSettings,
cfg inframonitoring.Config,
) inframonitoring.Module {
fieldMapper := telemetrymetrics.NewFieldMapper()
condBuilder := telemetrymetrics.NewConditionBuilder(fieldMapper)
return &module{
telemetryStore: telemetryStore,
telemetryMetadataStore: telemetryMetadataStore,
querier: querier,
fieldMapper: fieldMapper,
condBuilder: condBuilder,
logger: providerSettings.Logger,
config: cfg,
}
}
func (m *module) ListHosts(ctx context.Context, orgID valuer.UUID, req *inframonitoringtypes.PostableHosts) (*inframonitoringtypes.Hosts, error) {
if err := req.Validate(); err != nil {
return nil, err
}
resp := &inframonitoringtypes.Hosts{}
// default to cpu order by
if req.OrderBy == nil {
req.OrderBy = &qbtypes.OrderBy{
Key: qbtypes.OrderByKey{
TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{
Name: "cpu",
},
},
Direction: qbtypes.OrderDirectionDesc,
}
}
// default to host name group by
if len(req.GroupBy) == 0 {
req.GroupBy = []qbtypes.GroupByKey{hostNameGroupByKey}
resp.Type = inframonitoringtypes.ResponseTypeList
} else {
resp.Type = inframonitoringtypes.ResponseTypeGroupedList
}
// 1. Check which required metrics exist and get earliest retention time.
// If any required metric is missing, return early with the list of missing metrics.
// 2. If metrics exist but req.End is before the earliest reported time, return early with endTimeBeforeRetention=true.
missingMetrics, minFirstReportedUnixMilli, err := m.getMetricsExistenceAndEarliestTime(ctx, hostsTableMetricNamesList)
if err != nil {
return nil, err
}
if len(missingMetrics) > 0 {
resp.RequiredMetricsCheck = inframonitoringtypes.RequiredMetricsCheck{MissingMetrics: missingMetrics}
resp.Records = []inframonitoringtypes.HostRecord{}
resp.Total = 0
return resp, nil
}
if req.End < int64(minFirstReportedUnixMilli) {
resp.EndTimeBeforeRetention = true
resp.Records = []inframonitoringtypes.HostRecord{}
resp.Total = 0
return resp, nil
}
resp.RequiredMetricsCheck = inframonitoringtypes.RequiredMetricsCheck{MissingMetrics: []string{}}
// TOD(nikhilmantri0902): replace this separate ClickHouse query with a sub-query inside the main query builder query
// once QB supports sub-queries.
// Determine active hosts: those with metrics reported in the last 10 minutes.
// Compute the cutoff once so every downstream query/subquery agrees on what "active" means.
sinceUnixMilli := time.Now().Add(-10 * time.Minute).UTC().UnixMilli()
activeHostsMap, err := m.getActiveHosts(ctx, hostsTableMetricNamesList, hostNameAttrKey, sinceUnixMilli)
if err != nil {
return nil, err
}
// this check below modifies req.Filter by adding `AND active hosts filter` if req.FilterByStatus is set.
if m.applyHostsActiveStatusFilter(req, activeHostsMap) {
resp.Records = []inframonitoringtypes.HostRecord{}
resp.Total = 0
return resp, nil
}
metadataMap, err := m.getHostsTableMetadata(ctx, req)
if err != nil {
return nil, err
}
resp.Total = len(metadataMap)
pageGroups, err := m.getTopHostGroups(ctx, orgID, req, metadataMap)
if err != nil {
return nil, err
}
if len(pageGroups) == 0 {
resp.Records = []inframonitoringtypes.HostRecord{}
return resp, nil
}
hostsFilterExpr := ""
if req.Filter != nil {
hostsFilterExpr = req.Filter.Expression
}
fullQueryReq := buildFullQueryRequest(req.Start, req.End, hostsFilterExpr, req.GroupBy, pageGroups, m.newListHostsQuery())
queryResp, err := m.querier.QueryRange(ctx, orgID, fullQueryReq)
if err != nil {
return nil, err
}
// Compute per-group active/inactive host counts.
// When host.name is in groupBy, each row = one host, so counts are derived
// directly from activeHostsMap in buildHostRecords (no extra query needed).
// When host.name is not in groupBy, we need to run an additional query to get the counts per group for the current page,
// using the same filter expression as the main query (including user filters + page groups IN clause).
hostCounts := make(map[string]groupHostStatusCounts)
isHostNameInGroupBy := isKeyInGroupByAttrs(req.GroupBy, hostNameAttrKey)
if !isHostNameInGroupBy {
hostCounts, err = m.getPerGroupHostStatusCounts(ctx, req, hostsTableMetricNamesList, pageGroups, sinceUnixMilli)
if err != nil {
return nil, err
}
}
resp.Records = buildHostRecords(isHostNameInGroupBy, queryResp, pageGroups, req.GroupBy, metadataMap, activeHostsMap, hostCounts)
resp.Warning = queryResp.Warning
return resp, nil
}

View File

@@ -0,0 +1,17 @@
package inframonitoring
import (
"context"
"net/http"
"github.com/SigNoz/signoz/pkg/types/inframonitoringtypes"
"github.com/SigNoz/signoz/pkg/valuer"
)
type Handler interface {
ListHosts(http.ResponseWriter, *http.Request)
}
type Module interface {
ListHosts(ctx context.Context, orgID valuer.UUID, req *inframonitoringtypes.PostableHosts) (*inframonitoringtypes.Hosts, error)
}

View File

@@ -0,0 +1,158 @@
package impllmpricingrule
import (
"context"
"net/http"
"time"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/http/binding"
"github.com/SigNoz/signoz/pkg/http/render"
"github.com/SigNoz/signoz/pkg/modules/llmpricingrule"
"github.com/SigNoz/signoz/pkg/types/authtypes"
"github.com/SigNoz/signoz/pkg/types/llmpricingruletypes"
"github.com/SigNoz/signoz/pkg/valuer"
"github.com/gorilla/mux"
)
const maxLimit = 100
type handler struct {
module llmpricingrule.Module
providerSettings factory.ProviderSettings
}
func NewHandler(module llmpricingrule.Module, providerSettings factory.ProviderSettings) llmpricingrule.Handler {
return &handler{module: module, providerSettings: providerSettings}
}
// List handles GET /api/v1/llm_pricing_rules.
func (h *handler) List(rw http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 10*time.Second)
defer cancel()
claims, err := authtypes.ClaimsFromContext(ctx)
if err != nil {
render.Error(rw, err)
return
}
orgID := valuer.MustNewUUID(claims.OrgID)
var q llmpricingruletypes.ListPricingRulesQuery
if err := binding.Query.BindQuery(r.URL.Query(), &q); err != nil {
render.Error(rw, err)
return
}
if q.Limit <= 0 {
q.Limit = 20
} else if q.Limit > maxLimit {
q.Limit = maxLimit
}
if q.Offset < 0 {
render.Error(rw, errors.Newf(errors.TypeInvalidInput, llmpricingruletypes.ErrCodePricingRuleInvalidInput, "offset must be a non-negative integer"))
return
}
rules, total, err := h.module.List(ctx, orgID, q.Offset, q.Limit)
if err != nil {
render.Error(rw, err)
return
}
render.Success(rw, http.StatusOK, llmpricingruletypes.NewGettableLLMPricingRulesFromLLMPricingRules(rules, total, q.Offset, q.Limit))
}
// Get handles GET /api/v1/llm_pricing_rules/{id}.
func (h *handler) Get(rw http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 10*time.Second)
defer cancel()
claims, err := authtypes.ClaimsFromContext(ctx)
if err != nil {
render.Error(rw, err)
return
}
orgID := valuer.MustNewUUID(claims.OrgID)
id, err := ruleIDFromPath(r)
if err != nil {
render.Error(rw, err)
return
}
rule, err := h.module.Get(ctx, orgID, id)
if err != nil {
render.Error(rw, err)
return
}
render.Success(rw, http.StatusOK, rule)
}
func (h *handler) Update(rw http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
claims, err := authtypes.ClaimsFromContext(ctx)
if err != nil {
render.Error(rw, err)
return
}
orgID := valuer.MustNewUUID(claims.OrgID)
req := new(llmpricingruletypes.UpdatableLLMPricingRules)
if err := binding.JSON.BindBody(r.Body, req); err != nil {
render.Error(rw, err)
return
}
err = h.module.Update(ctx, orgID, claims.Email, req.Rules)
if err != nil {
render.Error(rw, err)
return
}
render.Success(rw, http.StatusNoContent, nil)
}
// Delete handles DELETE /api/v1/llm_pricing_rules/{id}.
func (h *handler) Delete(rw http.ResponseWriter, r *http.Request) {
ctx, cancel := context.WithTimeout(r.Context(), 10*time.Second)
defer cancel()
claims, err := authtypes.ClaimsFromContext(ctx)
if err != nil {
render.Error(rw, err)
return
}
orgID := valuer.MustNewUUID(claims.OrgID)
id, err := ruleIDFromPath(r)
if err != nil {
render.Error(rw, err)
return
}
if err := h.module.Delete(ctx, orgID, id); err != nil {
render.Error(rw, err)
return
}
render.Success(rw, http.StatusNoContent, nil)
}
// ruleIDFromPath extracts and validates the {id} path variable.
func ruleIDFromPath(r *http.Request) (valuer.UUID, error) {
raw := mux.Vars(r)["id"]
id, err := valuer.NewUUID(raw)
if err != nil {
return valuer.UUID{}, errors.Wrapf(err, errors.TypeInvalidInput, llmpricingruletypes.ErrCodePricingRuleInvalidInput, "id is not a valid uuid")
}
return id, nil
}

View File

@@ -0,0 +1,24 @@
package llmpricingrule
import (
"context"
"net/http"
"github.com/SigNoz/signoz/pkg/types/llmpricingruletypes"
"github.com/SigNoz/signoz/pkg/valuer"
)
type Module interface {
List(ctx context.Context, orgID valuer.UUID, offset, limit int) ([]*llmpricingruletypes.LLMPricingRule, int, error)
Get(ctx context.Context, orgID valuer.UUID, id valuer.UUID) (*llmpricingruletypes.LLMPricingRule, error)
Update(ctx context.Context, orgID valuer.UUID, userEmail string, rules []llmpricingruletypes.UpdatableLLMPricingRule) (err error)
Delete(ctx context.Context, orgID, id valuer.UUID) error
}
// Handler defines the HTTP handler interface for pricing rule endpoints.
type Handler interface {
List(rw http.ResponseWriter, r *http.Request)
Get(rw http.ResponseWriter, r *http.Request)
Update(rw http.ResponseWriter, r *http.Request)
Delete(rw http.ResponseWriter, r *http.Request)
}

View File

@@ -23,6 +23,7 @@ import (
"github.com/SigNoz/signoz/pkg/identn"
"github.com/SigNoz/signoz/pkg/instrumentation"
"github.com/SigNoz/signoz/pkg/modules/cloudintegration"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/serviceaccount"
"github.com/SigNoz/signoz/pkg/modules/user"
@@ -114,6 +115,9 @@ type Config struct {
// MetricsExplorer config
MetricsExplorer metricsexplorer.Config `mapstructure:"metricsexplorer"`
// InfraMonitoring config
InfraMonitoring inframonitoring.Config `mapstructure:"inframonitoring"`
// Flagger config
Flagger flagger.Config `mapstructure:"flagger"`
@@ -157,6 +161,7 @@ func NewConfig(ctx context.Context, logger *slog.Logger, resolverConfig config.R
gateway.NewConfigFactory(),
tokenizer.NewConfigFactory(),
metricsexplorer.NewConfigFactory(),
inframonitoring.NewConfigFactory(),
flagger.NewConfigFactory(),
user.NewConfigFactory(),
identn.NewConfigFactory(),

View File

@@ -3,8 +3,6 @@ package signoz
import (
"github.com/SigNoz/signoz/pkg/alertmanager"
"github.com/SigNoz/signoz/pkg/alertmanager/signozalertmanager"
"github.com/SigNoz/signoz/pkg/ruler"
"github.com/SigNoz/signoz/pkg/ruler/signozruler"
"github.com/SigNoz/signoz/pkg/analytics"
"github.com/SigNoz/signoz/pkg/authz"
"github.com/SigNoz/signoz/pkg/authz/signozauthzapi"
@@ -22,6 +20,10 @@ import (
"github.com/SigNoz/signoz/pkg/modules/dashboard/impldashboard"
"github.com/SigNoz/signoz/pkg/modules/fields"
"github.com/SigNoz/signoz/pkg/modules/fields/implfields"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring/implinframonitoring"
"github.com/SigNoz/signoz/pkg/modules/llmpricingrule"
"github.com/SigNoz/signoz/pkg/modules/llmpricingrule/impllmpricingrule"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer/implmetricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/quickfilter"
@@ -41,6 +43,8 @@ import (
"github.com/SigNoz/signoz/pkg/modules/tracefunnel"
"github.com/SigNoz/signoz/pkg/modules/tracefunnel/impltracefunnel"
"github.com/SigNoz/signoz/pkg/querier"
"github.com/SigNoz/signoz/pkg/ruler"
"github.com/SigNoz/signoz/pkg/ruler/signozruler"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/zeus"
)
@@ -55,6 +59,7 @@ type Handlers struct {
SpanPercentile spanpercentile.Handler
Services services.Handler
MetricsExplorer metricsexplorer.Handler
InfraMonitoring inframonitoring.Handler
Global global.Handler
FlaggerHandler flagger.Handler
GatewayHandler gateway.Handler
@@ -68,6 +73,7 @@ type Handlers struct {
RuleStateHistory rulestatehistory.Handler
AlertmanagerHandler alertmanager.Handler
RulerHandler ruler.Handler
LLMPricingRuleHandler llmpricingrule.Handler
}
func NewHandlers(
@@ -95,6 +101,7 @@ func NewHandlers(
RawDataExport: implrawdataexport.NewHandler(modules.RawDataExport),
Services: implservices.NewHandler(modules.Services),
MetricsExplorer: implmetricsexplorer.NewHandler(modules.MetricsExplorer),
InfraMonitoring: implinframonitoring.NewHandler(modules.InfraMonitoring),
SpanPercentile: implspanpercentile.NewHandler(modules.SpanPercentile),
Global: signozglobal.NewHandler(global),
FlaggerHandler: flagger.NewHandler(flaggerService),
@@ -109,5 +116,6 @@ func NewHandlers(
CloudIntegrationHandler: implcloudintegration.NewHandler(modules.CloudIntegration),
AlertmanagerHandler: signozalertmanager.NewHandler(alertmanagerService),
RulerHandler: signozruler.NewHandler(rulerService),
LLMPricingRuleHandler: impllmpricingrule.NewHandler(nil, providerSettings),
}
}

View File

@@ -14,6 +14,8 @@ import (
"github.com/SigNoz/signoz/pkg/modules/authdomain/implauthdomain"
"github.com/SigNoz/signoz/pkg/modules/cloudintegration"
"github.com/SigNoz/signoz/pkg/modules/dashboard"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring/implinframonitoring"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer/implmetricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/organization"
@@ -69,6 +71,7 @@ type Modules struct {
Services services.Module
SpanPercentile spanpercentile.Module
MetricsExplorer metricsexplorer.Module
InfraMonitoring inframonitoring.Module
Promote promote.Module
ServiceAccount serviceaccount.Module
CloudIntegration cloudintegration.Module
@@ -119,6 +122,7 @@ func NewModules(
SpanPercentile: implspanpercentile.NewModule(querier, providerSettings),
Services: implservices.NewModule(querier, telemetryStore),
MetricsExplorer: implmetricsexplorer.NewModule(telemetryStore, telemetryMetadataStore, cache, ruleStore, dashboard, providerSettings, config.MetricsExplorer),
InfraMonitoring: implinframonitoring.NewModule(telemetryStore, telemetryMetadataStore, querier, providerSettings, config.InfraMonitoring),
Promote: implpromote.NewModule(telemetryMetadataStore, telemetryStore),
ServiceAccount: serviceAccount,
RuleStateHistory: implrulestatehistory.NewModule(implrulestatehistory.NewStore(telemetryStore, telemetryMetadataStore, providerSettings.Logger)),

View File

@@ -17,11 +17,13 @@ import (
"github.com/SigNoz/signoz/pkg/global"
"github.com/SigNoz/signoz/pkg/http/handler"
"github.com/SigNoz/signoz/pkg/instrumentation"
"github.com/SigNoz/signoz/pkg/modules/llmpricingrule"
"github.com/SigNoz/signoz/pkg/modules/authdomain"
"github.com/SigNoz/signoz/pkg/modules/cloudintegration"
"github.com/SigNoz/signoz/pkg/modules/dashboard"
"github.com/SigNoz/signoz/pkg/modules/fields"
"github.com/SigNoz/signoz/pkg/modules/metricsexplorer"
"github.com/SigNoz/signoz/pkg/modules/inframonitoring"
"github.com/SigNoz/signoz/pkg/modules/organization"
"github.com/SigNoz/signoz/pkg/modules/preference"
"github.com/SigNoz/signoz/pkg/modules/promote"
@@ -61,6 +63,7 @@ func NewOpenAPI(ctx context.Context, instrumentation instrumentation.Instrumenta
struct{ dashboard.Module }{},
struct{ dashboard.Handler }{},
struct{ metricsexplorer.Handler }{},
struct{ inframonitoring.Handler }{},
struct{ gateway.Handler }{},
struct{ fields.Handler }{},
struct{ authz.Handler }{},
@@ -72,6 +75,7 @@ func NewOpenAPI(ctx context.Context, instrumentation instrumentation.Instrumenta
struct{ cloudintegration.Handler }{},
struct{ rulestatehistory.Handler }{},
struct{ alertmanager.Handler }{},
struct{ llmpricingrule.Handler }{},
struct{ ruler.Handler }{},
).New(ctx, instrumentation.ToProviderSettings(), apiserver.Config{})
if err != nil {

View File

@@ -3,8 +3,6 @@ package signoz
import (
"github.com/SigNoz/signoz/pkg/alertmanager"
"github.com/SigNoz/signoz/pkg/alertmanager/nfmanager"
"github.com/SigNoz/signoz/pkg/auditor"
"github.com/SigNoz/signoz/pkg/auditor/noopauditor"
"github.com/SigNoz/signoz/pkg/alertmanager/nfmanager/rulebasednotification"
"github.com/SigNoz/signoz/pkg/alertmanager/signozalertmanager"
"github.com/SigNoz/signoz/pkg/analytics"
@@ -12,6 +10,8 @@ import (
"github.com/SigNoz/signoz/pkg/analytics/segmentanalytics"
"github.com/SigNoz/signoz/pkg/apiserver"
"github.com/SigNoz/signoz/pkg/apiserver/signozapiserver"
"github.com/SigNoz/signoz/pkg/auditor"
"github.com/SigNoz/signoz/pkg/auditor/noopauditor"
"github.com/SigNoz/signoz/pkg/authz"
"github.com/SigNoz/signoz/pkg/cache"
"github.com/SigNoz/signoz/pkg/cache/memorycache"
@@ -227,8 +227,6 @@ func NewAlertmanagerProviderFactories(sqlstore sqlstore.SQLStore, orgGetter orga
)
}
func NewEmailingProviderFactories() factory.NamedMap[factory.ProviderFactory[emailing.Emailing, emailing.Config]] {
return factory.MustNewNamedMap(
noopemailing.NewFactory(),
@@ -272,6 +270,7 @@ func NewAPIServerProviderFactories(orgGetter organization.Getter, authz authz.Au
modules.Dashboard,
handlers.Dashboard,
handlers.MetricsExplorer,
handlers.InfraMonitoring,
handlers.GatewayHandler,
handlers.Fields,
handlers.AuthzHandler,
@@ -283,6 +282,7 @@ func NewAPIServerProviderFactories(orgGetter organization.Getter, authz authz.Au
handlers.CloudIntegrationHandler,
handlers.RuleStateHistory,
handlers.AlertmanagerHandler,
handlers.LLMPricingRuleHandler,
handlers.RulerHandler,
),
)

View File

@@ -0,0 +1,21 @@
package inframonitoringtypes
import (
"github.com/SigNoz/signoz/pkg/valuer"
)
type ResponseType struct {
valuer.String
}
var (
ResponseTypeList = ResponseType{valuer.NewString("list")}
ResponseTypeGroupedList = ResponseType{valuer.NewString("grouped_list")}
)
func (ResponseType) Enum() []any {
return []any{
ResponseTypeList,
ResponseTypeGroupedList,
}
}

View File

@@ -0,0 +1,117 @@
package inframonitoringtypes
import (
"encoding/json"
"slices"
"github.com/SigNoz/signoz/pkg/errors"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
)
type Hosts struct {
Type ResponseType `json:"type" required:"true"`
Records []HostRecord `json:"records" required:"true"`
Total int `json:"total" required:"true"`
RequiredMetricsCheck RequiredMetricsCheck `json:"requiredMetricsCheck" required:"true"`
EndTimeBeforeRetention bool `json:"endTimeBeforeRetention" required:"true"`
Warning *qbtypes.QueryWarnData `json:"warning,omitempty"`
}
type HostRecord struct {
HostName string `json:"hostName" required:"true"`
Status HostStatus `json:"status" required:"true"`
ActiveHostCount int `json:"activeHostCount" required:"true"`
InactiveHostCount int `json:"inactiveHostCount" required:"true"`
CPU float64 `json:"cpu" required:"true"`
Memory float64 `json:"memory" required:"true"`
Wait float64 `json:"wait" required:"true"`
Load15 float64 `json:"load15" required:"true"`
DiskUsage float64 `json:"diskUsage" required:"true"`
Meta map[string]interface{} `json:"meta" required:"true"`
}
type RequiredMetricsCheck struct {
MissingMetrics []string `json:"missingMetrics" required:"true"`
}
type PostableHosts struct {
Start int64 `json:"start" required:"true"`
End int64 `json:"end" required:"true"`
Filter *HostFilter `json:"filter"`
GroupBy []qbtypes.GroupByKey `json:"groupBy"`
OrderBy *qbtypes.OrderBy `json:"orderBy"`
Offset int `json:"offset"`
Limit int `json:"limit" required:"true"`
}
type HostFilter struct {
qbtypes.Filter `json:",inline"`
FilterByStatus HostStatus `json:"filterByStatus"`
}
// Validate ensures HostsListRequest contains acceptable values.
func (req *PostableHosts) Validate() error {
if req == nil {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "request is nil")
}
if req.Start <= 0 {
return errors.NewInvalidInputf(
errors.CodeInvalidInput,
"invalid start time %d: start must be greater than 0",
req.Start,
)
}
if req.End <= 0 {
return errors.NewInvalidInputf(
errors.CodeInvalidInput,
"invalid end time %d: end must be greater than 0",
req.End,
)
}
if req.Start >= req.End {
return errors.NewInvalidInputf(
errors.CodeInvalidInput,
"invalid time range: start (%d) must be less than end (%d)",
req.Start,
req.End,
)
}
if req.Limit < 1 || req.Limit > 5000 {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "limit must be between 1 and 5000")
}
if req.Offset < 0 {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "offset cannot be negative")
}
if req.Filter != nil && !req.Filter.FilterByStatus.IsZero() &&
req.Filter.FilterByStatus != HostStatusActive && req.Filter.FilterByStatus != HostStatusInactive {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "invalid filter by status: %s", req.Filter.FilterByStatus)
}
if req.OrderBy != nil {
if !slices.Contains(HostsValidOrderByKeys, req.OrderBy.Key.Name) {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "invalid order by key: %s", req.OrderBy.Key.Name)
}
if req.OrderBy.Direction != qbtypes.OrderDirectionAsc && req.OrderBy.Direction != qbtypes.OrderDirectionDesc {
return errors.NewInvalidInputf(errors.CodeInvalidInput, "invalid order by direction: %s", req.OrderBy.Direction)
}
}
return nil
}
// UnmarshalJSON validates input immediately after decoding.
func (req *PostableHosts) UnmarshalJSON(data []byte) error {
type raw PostableHosts
var decoded raw
if err := json.Unmarshal(data, &decoded); err != nil {
return err
}
*req = PostableHosts(decoded)
return req.Validate()
}

View File

@@ -0,0 +1,37 @@
package inframonitoringtypes
import "github.com/SigNoz/signoz/pkg/valuer"
type HostStatus struct {
valuer.String
}
var (
HostStatusActive = HostStatus{valuer.NewString("active")}
HostStatusInactive = HostStatus{valuer.NewString("inactive")}
HostStatusNone = HostStatus{valuer.NewString("")}
)
func (HostStatus) Enum() []any {
return []any{
HostStatusActive,
HostStatusInactive,
HostStatusNone,
}
}
const (
HostsOrderByCPU = "cpu"
HostsOrderByMemory = "memory"
HostsOrderByWait = "wait"
HostsOrderByDiskUsage = "disk_usage"
HostsOrderByLoad15 = "load15"
)
var HostsValidOrderByKeys = []string{
HostsOrderByCPU,
HostsOrderByMemory,
HostsOrderByWait,
HostsOrderByDiskUsage,
HostsOrderByLoad15,
}

View File

@@ -0,0 +1,244 @@
package inframonitoringtypes
import (
"testing"
"github.com/SigNoz/signoz/pkg/errors"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/valuer"
"github.com/stretchr/testify/require"
)
func TestHostsListRequest_Validate(t *testing.T) {
tests := []struct {
name string
req *PostableHosts
wantErr bool
}{
{
name: "valid request",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
},
wantErr: false,
},
{
name: "nil request",
req: nil,
wantErr: true,
},
{
name: "start time zero",
req: &PostableHosts{
Start: 0,
End: 2000,
Limit: 100,
Offset: 0,
},
wantErr: true,
},
{
name: "start time negative",
req: &PostableHosts{
Start: -1000,
End: 2000,
Limit: 100,
Offset: 0,
},
wantErr: true,
},
{
name: "end time zero",
req: &PostableHosts{
Start: 1000,
End: 0,
Limit: 100,
Offset: 0,
},
wantErr: true,
},
{
name: "start time greater than end time",
req: &PostableHosts{
Start: 2000,
End: 1000,
Limit: 100,
Offset: 0,
},
wantErr: true,
},
{
name: "start time equal to end time",
req: &PostableHosts{
Start: 1000,
End: 1000,
Limit: 100,
Offset: 0,
},
wantErr: true,
},
{
name: "limit zero",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 0,
Offset: 0,
},
wantErr: true,
},
{
name: "limit negative",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: -10,
Offset: 0,
},
wantErr: true,
},
{
name: "limit exceeds max",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 5001,
Offset: 0,
},
wantErr: true,
},
{
name: "offset negative",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: -5,
},
wantErr: true,
},
{
name: "filter by status ACTIVE",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
Filter: &HostFilter{FilterByStatus: HostStatusActive},
},
wantErr: false,
},
{
name: "filter by status INACTIVE",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
Filter: &HostFilter{FilterByStatus: HostStatusInactive},
},
wantErr: false,
},
{
name: "filter by status empty (zero value)",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
},
wantErr: false,
},
{
name: "filter by status invalid value",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
Filter: &HostFilter{FilterByStatus: HostStatus{valuer.NewString("UNKNOWN")}},
},
wantErr: true,
},
{
name: "orderBy nil is valid",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
},
wantErr: false,
},
{
name: "orderBy with valid key cpu and direction asc",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
OrderBy: &qbtypes.OrderBy{
Key: qbtypes.OrderByKey{
TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{
Name: HostsOrderByCPU,
},
},
Direction: qbtypes.OrderDirectionAsc,
},
},
wantErr: false,
},
{
name: "orderBy with invalid key",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
OrderBy: &qbtypes.OrderBy{
Key: qbtypes.OrderByKey{
TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{
Name: "unknown",
},
},
Direction: qbtypes.OrderDirectionDesc,
},
},
wantErr: true,
},
{
name: "orderBy with valid key but invalid direction",
req: &PostableHosts{
Start: 1000,
End: 2000,
Limit: 100,
Offset: 0,
OrderBy: &qbtypes.OrderBy{
Key: qbtypes.OrderByKey{
TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{
Name: HostsOrderByMemory,
},
},
Direction: qbtypes.OrderDirection{String: valuer.NewString("invalid")},
},
},
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := tt.req.Validate()
if tt.wantErr {
require.Error(t, err)
require.True(t, errors.Ast(err, errors.TypeInvalidInput), "expected error to be of type InvalidInput")
} else {
require.NoError(t, err)
}
})
}
}

View File

@@ -0,0 +1,144 @@
package llmpricingruletypes
import (
"time"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/types"
"github.com/SigNoz/signoz/pkg/valuer"
)
var (
ErrCodePricingRuleNotFound = errors.MustNewCode("pricing_rule_not_found")
ErrCodePricingRuleInvalidInput = errors.MustNewCode("pricing_rule_invalid_input")
)
type LLMPricingRuleUnit struct {
valuer.String
}
var (
UnitPerMillionTokens = LLMPricingRuleUnit{valuer.NewString("per_million_tokens")}
)
type LLMPricingRuleCacheMode struct {
valuer.String
}
var (
// LLMPricingRuleCacheModeSubtract: cached tokens are inside input_tokens (OpenAI-style).
LLMPricingRuleCacheModeSubtract = LLMPricingRuleCacheMode{valuer.NewString("subtract")}
// LLMPricingRuleCacheModeAdditive: cached tokens are reported separately (Anthropic-style).
LLMPricingRuleCacheModeAdditive = LLMPricingRuleCacheMode{valuer.NewString("additive")}
// LLMPricingRuleCacheModeUnknown: provider behaviour is unknown; falls back to subtract.
LLMPricingRuleCacheModeUnknown = LLMPricingRuleCacheMode{valuer.NewString("unknown")}
)
// LLMPricingRule is the domain model for an LLM pricing rule.
// It also doubles as the HTTP response shape; see GettablePricingRule.
type LLMPricingRule struct {
types.TimeAuditable
types.UserAuditable
ID valuer.UUID `json:"id" required:"true"`
OrgID valuer.UUID `json:"orgId" required:"true"`
SourceID *valuer.UUID `json:"sourceId,omitempty"`
Model string `json:"modelName" required:"true"`
ModelPattern []string `json:"modelPattern" required:"true"`
Unit LLMPricingRuleUnit `json:"unit" required:"true"`
CacheMode LLMPricingRuleCacheMode `json:"cacheMode" required:"true"`
CostInput float64 `json:"costInput" required:"true"`
CostOutput float64 `json:"costOutput" required:"true"`
CostCacheRead float64 `json:"costCacheRead" required:"true"`
CostCacheWrite float64 `json:"costCacheWrite" required:"true"`
IsOverride bool `json:"isOverride" required:"true"`
SyncedAt *time.Time `json:"syncedAt,omitempty"`
Enabled bool `json:"enabled" required:"true"`
}
// GettablePricingRule is a type alias for PricingRule — the response shape is
// identical to the core type, so per pkg/types conventions we do not mint a
// separate flavor.
type GettableLLMPricingRule = LLMPricingRule
// UpdatablePricingRule is one entry in the bulk upsert batch.
//
// Identification:
// - ID set → match by id (user editing a known row).
// - SourceID set → match by source_id (Zeus sync, or user editing a Zeus-synced row).
// - neither set → insert a new row with source_id = NULL (user-created custom rule).
//
// IsOverride is a pointer so the caller can distinguish "not sent" from "set to false".
// When IsOverride is nil AND the matched row has is_override = true, the row is fully
// preserved — only synced_at is stamped.
type UpdatableLLMPricingRule struct {
ID *valuer.UUID `json:"id,omitempty"`
SourceID *valuer.UUID `json:"sourceId,omitempty"`
Model string `json:"modelName" required:"true"`
ModelPattern []string `json:"modelPattern" required:"true"`
Unit LLMPricingRuleUnit `json:"unit" required:"true"`
CacheMode LLMPricingRuleCacheMode `json:"cacheMode" required:"true"`
CostInput float64 `json:"costInput" required:"true"`
CostOutput float64 `json:"costOutput" required:"true"`
CostCacheRead float64 `json:"costCacheRead" required:"true"`
CostCacheWrite float64 `json:"costCacheWrite" required:"true"`
IsOverride *bool `json:"isOverride,omitempty"`
Enabled bool `json:"enabled" required:"true"`
}
type UpdatableLLMPricingRules struct {
Rules []UpdatableLLMPricingRule `json:"rules" required:"true"`
}
type ListPricingRulesQuery struct {
Offset int `query:"offset" json:"offset"`
Limit int `query:"limit" json:"limit"`
}
type GettablePricingRules struct {
Items []*GettableLLMPricingRule `json:"items" required:"true"`
Total int `json:"total" required:"true"`
Offset int `json:"offset" required:"true"`
Limit int `json:"limit" required:"true"`
}
func (LLMPricingRuleUnit) Enum() []any {
return []any{UnitPerMillionTokens}
}
func (LLMPricingRuleCacheMode) Enum() []any {
return []any{LLMPricingRuleCacheModeSubtract, LLMPricingRuleCacheModeAdditive, LLMPricingRuleCacheModeUnknown}
}
func NewLLMPricingRuleFromStorable(s *StorableLLMPricingRule) *LLMPricingRule {
pattern := make([]string, len(s.ModelPattern))
copy(pattern, s.ModelPattern)
return &LLMPricingRule{
TimeAuditable: s.TimeAuditable,
UserAuditable: s.UserAuditable,
ID: s.ID,
OrgID: s.OrgID,
SourceID: s.SourceID,
Model: s.Model,
ModelPattern: pattern,
Unit: s.Unit,
CacheMode: s.CacheMode,
CostInput: s.CostInput,
CostOutput: s.CostOutput,
CostCacheRead: s.CostCacheRead,
CostCacheWrite: s.CostCacheWrite,
IsOverride: s.IsOverride,
SyncedAt: s.SyncedAt,
Enabled: s.Enabled,
}
}
func NewGettableLLMPricingRulesFromLLMPricingRules(items []*LLMPricingRule, total, offset, limit int) *GettablePricingRules {
return &GettablePricingRules{
Items: items,
Total: total,
Offset: offset,
Limit: limit,
}
}

View File

@@ -0,0 +1,67 @@
package llmpricingruletypes
import (
"database/sql/driver"
"encoding/json"
"time"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/types"
"github.com/SigNoz/signoz/pkg/valuer"
"github.com/uptrace/bun"
)
// StringSlice is a []string that is stored as a JSON text column.
// It is compatible with both SQLite and PostgreSQL.
type StringSlice []string
// StorableLLMPricingRule is the bun/DB representation of an LLM pricing rule.
type StorableLLMPricingRule struct {
bun.BaseModel `bun:"table:llm_pricing_rules,alias:llm_pricing_rules"`
types.Identifiable
types.TimeAuditable
types.UserAuditable
OrgID valuer.UUID `bun:"org_id,type:text,notnull"`
SourceID *valuer.UUID `bun:"source_id,type:text"`
Model string `bun:"model,type:text,notnull"`
ModelPattern StringSlice `bun:"model_pattern,type:text,notnull"`
Unit LLMPricingRuleUnit `bun:"unit,type:text,notnull"`
CacheMode LLMPricingRuleCacheMode `bun:"cache_mode,type:text,notnull"`
CostInput float64 `bun:"cost_input,notnull"`
CostOutput float64 `bun:"cost_output,notnull"`
CostCacheRead float64 `bun:"cost_cache_read,notnull"`
CostCacheWrite float64 `bun:"cost_cache_write,notnull"`
// IsOverride marks the row as user-pinned. When true, Zeus skips it entirely.
IsOverride bool `bun:"is_override,notnull,default:false"`
SyncedAt *time.Time `bun:"synced_at"`
Enabled bool `bun:"enabled,notnull,default:true"`
}
func (s StringSlice) Value() (driver.Value, error) {
if s == nil {
return "[]", nil
}
b, err := json.Marshal(s)
if err != nil {
return nil, err
}
return string(b), nil
}
func (s *StringSlice) Scan(src any) error {
var raw []byte
switch v := src.(type) {
case string:
raw = []byte(v)
case []byte:
raw = v
case nil:
*s = nil
return nil
default:
return errors.NewInternalf(errors.CodeInternal, "llmpricingruletypes: cannot scan %T into StringSlice", src)
}
return json.Unmarshal(raw, s)
}

View File

@@ -0,0 +1,16 @@
package llmpricingruletypes
import (
"context"
"github.com/SigNoz/signoz/pkg/valuer"
)
type Store interface {
List(ctx context.Context, orgID valuer.UUID, offset, limit int) ([]*StorableLLMPricingRule, int, error)
Get(ctx context.Context, orgID, id valuer.UUID) (*StorableLLMPricingRule, error)
GetBySourceID(ctx context.Context, orgID, sourceID valuer.UUID) (*StorableLLMPricingRule, error)
Create(ctx context.Context, rule *StorableLLMPricingRule) error
Update(ctx context.Context, rule *StorableLLMPricingRule) error
Delete(ctx context.Context, orgID, id valuer.UUID) error
}

19
tests/.dockerignore Normal file
View File

@@ -0,0 +1,19 @@
# Build context for tests/Dockerfile.seeder. Keep the context lean — the
# seeder image only needs fixtures/ to be importable alongside seeder/,
# plus pyproject.toml + uv.lock for dep install.
.venv
.pytest_cache
tmp
**/__pycache__
**/*.pyc
# e2e Playwright outputs and deps
e2e/node_modules
e2e/artifacts
e2e/.auth
e2e/.playwright-cli
# Integration-side outputs (if any stale dirs remain)
integration/tmp
integration/testdata

35
tests/Dockerfile.seeder Normal file
View File

@@ -0,0 +1,35 @@
# HTTP seeder for Playwright e2e tests. Wraps the direct-ClickHouse-insert
# helpers in tests/fixtures/{traces,logs,metrics}.py so a browser test can
# seed telemetry with fine-grained control.
#
# Build context is tests/ (this file sits at its root) so `fixtures/` is
# importable inside the image alongside `seeder/`.
FROM python:3.13-slim
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
WORKDIR /app
RUN apt-get update \
&& apt-get install -y --no-install-recommends gcc libpq-dev python3-dev \
&& rm -rf /var/lib/apt/lists/*
# Install project dependencies from the pytest project's pyproject.toml +
# uv.lock so the seeder container's Python env matches local dev exactly
# (single source of truth for versions; no parallel requirements.txt).
# --no-install-project skips building the signoz-tests project itself
# (there is no buildable package here — pyproject is used purely for dep
# management alongside pythonpath = ["."]).
COPY pyproject.toml uv.lock /app/
RUN uv sync --frozen --no-install-project --no-dev
ENV PATH="/app/.venv/bin:$PATH"
# Ship the whole fixtures/ package so server.py can `from fixtures.traces
# import ...` with the same module path the pytest side uses.
COPY fixtures /app/fixtures
COPY seeder /app/seeder
EXPOSE 8080
CMD ["uvicorn", "seeder.server:app", "--host", "0.0.0.0", "--port", "8080"]

View File

@@ -17,12 +17,13 @@ pytest_plugins = [
"fixtures.traces",
"fixtures.metrics",
"fixtures.meter",
"fixtures.driver",
"fixtures.browser",
"fixtures.keycloak",
"fixtures.idp",
"fixtures.idputils",
"fixtures.notification_channel",
"fixtures.alerts",
"fixtures.cloudintegrations",
"fixtures.seeder",
]

15
tests/e2e/.env.example Normal file
View File

@@ -0,0 +1,15 @@
# Copy this to .env and fill in values for staging-mode runs.
#
# This file (.env) holds user-provided defaults — staging credentials, role
# override. It is loaded by playwright.config.ts via dotenv.
#
# Local-mode runs (`cd tests && uv run pytest ... e2e/bootstrap/setup.py::test_setup`)
# bring up a containerized backend and write .env.local, which overrides .env.
# You do NOT need to touch this file for local mode.
# Staging base URL (set to opt out of local backend bring-up)
SIGNOZ_E2E_BASE_URL=https://app.us.staging.signoz.cloud
# Test credentials (required only when SIGNOZ_E2E_BASE_URL is set — i.e. staging mode)
SIGNOZ_E2E_USERNAME=
SIGNOZ_E2E_PASSWORD=

38
tests/e2e/.eslintignore Normal file
View File

@@ -0,0 +1,38 @@
# Dependencies
node_modules/
# Build outputs
dist/
build/
# Test results
test-results/
playwright-report/
coverage/
# Environment files
.env
.env.local
.env.production
# Editor files
.vscode/
.idea/
*.swp
*.swo
# OS files
.DS_Store
Thumbs.db
# Logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# Runtime data
pids
*.pid
*.seed
*.pid.lock

68
tests/e2e/.eslintrc.js Normal file
View File

@@ -0,0 +1,68 @@
module.exports = {
parser: '@typescript-eslint/parser',
parserOptions: {
ecmaVersion: 2022,
sourceType: 'module',
},
extends: [
'eslint:recommended',
'plugin:@typescript-eslint/recommended',
'plugin:playwright/recommended',
],
env: {
node: true,
es2022: true,
},
rules: {
// Code Quality
'@typescript-eslint/no-unused-vars': 'error',
'@typescript-eslint/no-explicit-any': 'warn',
'prefer-const': 'error',
'no-var': 'error',
// Formatting Rules (ESLint handles formatting)
'semi': ['error', 'always'],
'quotes': ['error', 'single', { avoidEscape: true }],
'comma-dangle': ['error', 'always-multiline'],
'indent': ['error', 2, { SwitchCase: 1 }],
'object-curly-spacing': ['error', 'always'],
'array-bracket-spacing': ['error', 'never'],
'space-before-function-paren': ['error', {
anonymous: 'always',
named: 'never',
asyncArrow: 'always',
}],
'keyword-spacing': 'error',
'space-infix-ops': 'error',
'eol-last': 'error',
'no-trailing-spaces': 'error',
'no-multiple-empty-lines': ['error', { max: 2, maxEOF: 1 }],
// Playwright-specific (enhanced)
'playwright/expect-expect': 'error',
'playwright/no-conditional-in-test': 'error',
'playwright/no-page-pause': 'error',
'playwright/no-wait-for-timeout': 'warn',
'playwright/prefer-web-first-assertions': 'error',
// Console usage
'no-console': ['warn', { allow: ['warn', 'error'] }],
},
overrides: [
{
// Config files can use console and have relaxed formatting
files: ['*.config.{js,ts}', 'playwright.config.ts'],
rules: {
'no-console': 'off',
'@typescript-eslint/no-explicit-any': 'off',
},
},
{
// Test files specific rules
files: ['**/*.spec.ts', '**/*.test.ts'],
rules: {
'@typescript-eslint/no-explicit-any': 'off', // Page objects often need any
},
},
],
};

24
tests/e2e/.gitignore vendored Normal file
View File

@@ -0,0 +1,24 @@
node_modules/
# All Playwright output — HTML report, JSON summary, per-test traces /
# screenshots / videos. Set via outputDir + reporter paths in playwright.config.ts.
/artifacts/
/playwright/.cache/
.env
dist/
*.log
yarn-error.log
.yarn/cache
.yarn/install-state.gz
.vscode/
# playwright-cli artifacts (snapshots, screenshots, videos, traces)
.playwright-cli/
# backend coordinates written by the pytest bootstrap (bootstrap/setup.py);
# loaded by playwright.config.ts via dotenv override.
.env.local
# AI test-planner scratch (playwright-test-planner writes markdown plans
# here before the generator turns them into .spec.ts files; the tests are
# the source of truth, plans are regenerable).
specs/

30
tests/e2e/.prettierignore Normal file
View File

@@ -0,0 +1,30 @@
# Dependencies
node_modules/
# Generated test outputs
artifacts/
playwright/.cache/
# Build outputs
dist/
# Environment files
.env
.env.local
.env*.local
# Lock files
yarn.lock
package-lock.json
pnpm-lock.yaml
# Logs
*.log
yarn-error.log
# IDE
.vscode/
.idea/
# Other
.DS_Store

View File

@@ -0,0 +1,6 @@
{
"useTabs": false,
"tabWidth": 2,
"singleQuote": true,
"trailingComma": "all"
}

View File

@@ -0,0 +1,44 @@
import os
from pathlib import Path
import pytest
from fixtures import types
from fixtures.auth import USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD
def _env_file(pytestconfig: pytest.Config) -> Path:
override = os.environ.get("SIGNOZ_E2E_ENV_FILE")
if override:
return Path(override)
return pytestconfig.rootpath / "e2e" / ".env.local"
def test_setup(
signoz: types.SigNoz,
create_user_admin: types.Operation, # pylint: disable=unused-argument
apply_license: types.Operation, # pylint: disable=unused-argument
seeder: types.TestContainerDocker,
pytestconfig: pytest.Config,
) -> None:
"""Bring the backend up and write e2e coordinates to .env.local."""
host_cfg = signoz.self.host_configs["8080"]
seeder_cfg = seeder.host_configs["8080"]
out = _env_file(pytestconfig)
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(
"# Generated by tests/e2e/bootstrap/setup.py — do not edit.\n"
f"SIGNOZ_E2E_BASE_URL={host_cfg.base()}\n"
f"SIGNOZ_E2E_USERNAME={USER_ADMIN_EMAIL}\n"
f"SIGNOZ_E2E_PASSWORD={USER_ADMIN_PASSWORD}\n"
f"SIGNOZ_E2E_SEEDER_URL={seeder_cfg.base()}\n"
)
def test_teardown(
signoz: types.SigNoz, # pylint: disable=unused-argument
create_user_admin: types.Operation, # pylint: disable=unused-argument
apply_license: types.Operation, # pylint: disable=unused-argument
seeder: types.TestContainerDocker, # pylint: disable=unused-argument
) -> None:
"""Fixture dependencies trigger container teardown via --teardown."""

View File

@@ -0,0 +1,85 @@
import {
test as base,
expect,
type Browser,
type BrowserContext,
type Page,
} from '@playwright/test';
export type User = { email: string; password: string };
// Default user — admin from the pytest bootstrap (.env.local) or staging .env.
export const ADMIN: User = {
email: process.env.SIGNOZ_E2E_USERNAME!,
password: process.env.SIGNOZ_E2E_PASSWORD!,
};
// Per-worker storageState cache. One login per unique user per worker.
// Promise-valued so concurrent requests share the same in-flight work.
// Held in memory only — no .auth/ dir, no JSON on disk.
type StorageState = Awaited<ReturnType<BrowserContext['storageState']>>;
const storageByUser = new Map<string, Promise<StorageState>>();
async function storageFor(browser: Browser, user: User): Promise<StorageState> {
const cached = storageByUser.get(user.email);
if (cached) return cached;
const task = (async () => {
const ctx = await browser.newContext();
const page = await ctx.newPage();
await login(page, user);
const state = await ctx.storageState();
await ctx.close();
return state;
})();
storageByUser.set(user.email, task);
return task;
}
async function login(page: Page, user: User): Promise<void> {
if (!user.email || !user.password) {
throw new Error(
'User credentials missing. Set SIGNOZ_E2E_USERNAME / SIGNOZ_E2E_PASSWORD ' +
'(pytest bootstrap writes them to .env.local), or pass a User via test.use({ user: ... }).',
);
}
await page.goto('/login?password=Y');
await page.getByTestId('email').fill(user.email);
await page.getByTestId('initiate_login').click();
await page.getByTestId('password').fill(user.password);
await page.getByRole('button', { name: 'Sign in with Password' }).click();
// Post-login lands somewhere different depending on whether the org is
// licensed (onboarding flow on ENTERPRISE) or not (legacy "Hello there"
// welcome). Wait for URL to move off /login — whichever page follows
// is fine, each spec navigates to the feature under test anyway.
await page.waitForURL((url) => !url.pathname.startsWith('/login'));
}
export const test = base.extend<{
/**
* User identity for this test. Override with `test.use({ user: ... })` at
* the describe or test level to run the suite as a different user.
* Defaults to ADMIN (the pytest-bootstrap-seeded admin).
*/
user: User;
/**
* A Page whose context is already authenticated as `user`. First request
* for a given user triggers one login per worker; the resulting
* storageState is held in memory and reused for all later requests.
*/
authedPage: Page;
}>({
user: [ADMIN, { option: true }],
authedPage: async ({ browser, user }, use) => {
const storageState = await storageFor(browser, user);
const ctx = await browser.newContext({ storageState });
const page = await ctx.newPage();
await use(page);
await ctx.close();
},
});
export { expect };

45
tests/e2e/package.json Normal file
View File

@@ -0,0 +1,45 @@
{
"name": "signoz-frontend-automation",
"version": "1.0.0",
"description": "E2E tests for SigNoz frontend with Playwright",
"main": "index.js",
"scripts": {
"test": "playwright test",
"test:staging": "SIGNOZ_E2E_BASE_URL=https://app.us.staging.signoz.cloud playwright test",
"test:ui": "playwright test --ui",
"test:headed": "playwright test --headed",
"test:debug": "playwright test --debug",
"test:chromium": "playwright test --project=chromium",
"test:firefox": "playwright test --project=firefox",
"test:webkit": "playwright test --project=webkit",
"report": "playwright show-report artifacts/html",
"codegen": "playwright codegen",
"install:browsers": "playwright install",
"install:cli": "npm install -g @playwright/cli@latest && playwright-cli install --skills",
"lint": "eslint . --ext .ts,.js",
"lint:fix": "eslint . --ext .ts,.js --fix",
"typecheck": "tsc --noEmit"
},
"keywords": [
"playwright",
"e2e",
"testing",
"signoz"
],
"author": "",
"license": "MIT",
"devDependencies": {
"@playwright/test": "^1.57.0-alpha-2025-10-09",
"@types/node": "^20.0.0",
"@typescript-eslint/eslint-plugin": "^6.0.0",
"@typescript-eslint/parser": "^6.0.0",
"dotenv": "^16.0.0",
"eslint": "^9.26.0",
"eslint-plugin-playwright": "^0.16.0",
"typescript": "^5.0.0"
},
"engines": {
"node": ">=18.0.0",
"yarn": ">=1.22.0"
}
}

View File

@@ -0,0 +1,11 @@
{
"browser": {
"browserName": "chromium",
"launchOptions": { "headless": true }
},
"timeouts": {
"action": 5000,
"navigation": 30000
},
"outputDir": ".playwright-cli"
}

View File

@@ -0,0 +1,61 @@
import { defineConfig, devices } from '@playwright/test';
import dotenv from 'dotenv';
import path from 'path';
// .env holds user-provided defaults (staging creds).
// .env.local is written by tests/e2e/bootstrap/setup.py when the pytest
// lifecycle brings the backend up locally; override=true so local-backend
// coordinates win over any stale .env values. Subprocess-injected env
// (e.g. when pytest shells out to `yarn test`) still takes priority —
// dotenv doesn't touch vars that are already set in process.env.
dotenv.config({ path: path.resolve(__dirname, '.env') });
dotenv.config({ path: path.resolve(__dirname, '.env.local'), override: true });
export default defineConfig({
testDir: './tests',
// All Playwright output lands under artifacts/. One subdir per reporter
// plus results/ for per-test artifacts (traces/screenshots/videos).
// CI can archive the whole dir with `tar czf artifacts.tgz tests/e2e/artifacts`.
outputDir: 'artifacts/results',
// Run tests in parallel
fullyParallel: true,
// Fail the build on CI if you accidentally left test.only
forbidOnly: !!process.env.CI,
// Retry on CI only
retries: process.env.CI ? 2 : 0,
// Workers
workers: process.env.CI ? 2 : undefined,
// Reporter
reporter: [
['html', { outputFolder: 'artifacts/html', open: 'never' }],
['json', { outputFile: 'artifacts/json/results.json' }],
['list'],
],
// Shared settings
use: {
baseURL:
process.env.SIGNOZ_E2E_BASE_URL || 'https://app.us.staging.signoz.cloud',
trace: 'on-first-retry',
screenshot: 'only-on-failure',
video: 'retain-on-failure',
colorScheme: 'dark',
locale: 'en-US',
viewport: { width: 1280, height: 720 },
},
// Browser projects. No project-level auth — specs opt in via the
// authedPage fixture in tests/e2e/fixtures/auth.ts, which logs a user
// in on first use and caches the resulting storageState per worker.
projects: [
{ name: 'chromium', use: devices['Desktop Chrome'] },
{ name: 'firefox', use: devices['Desktop Firefox'] },
{ name: 'webkit', use: devices['Desktop Safari'] },
],
});

View File

@@ -0,0 +1,7 @@
import { test, expect } from '../../fixtures/auth';
test('TC-01 alerts page — tabs render', async ({ authedPage: page }) => {
await page.goto('/alerts');
await expect(page.getByRole('tab', { name: /alert rules/i })).toBeVisible();
await expect(page.getByRole('tab', { name: /configuration/i })).toBeVisible();
});

23
tests/e2e/tsconfig.json Normal file
View File

@@ -0,0 +1,23 @@
{
"compilerOptions": {
"target": "ES2020",
"module": "commonjs",
"moduleResolution": "bundler",
"lib": ["ES2020"],
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"types": ["node", "@playwright/test"],
"paths": {
"@tests/*": ["./tests/*"],
"@utils/*": ["./utils/*"],
"@specs/*": ["./specs/*"]
},
"outDir": "./dist",
"rootDir": "."
},
"include": ["tests/**/*.ts", "utils/**/*.ts", "playwright.config.ts"],
"exclude": ["node_modules", "dist"]
}

1480
tests/e2e/yarn.lock Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,18 +1,118 @@
import base64
import json
import time
from datetime import datetime, timedelta
from datetime import datetime, timedelta, timezone
from http import HTTPStatus
from typing import List
from typing import Callable, List
import pytest
import requests
from fixtures import types
from fixtures.auth import USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD
from fixtures.fs import get_testdata_file_path
from fixtures.logger import setup_logger
from fixtures.logs import Logs
from fixtures.metrics import Metrics
from fixtures.traces import Traces
logger = setup_logger(__name__)
@pytest.fixture(name="create_alert_rule", scope="function")
def create_alert_rule(
signoz: types.SigNoz, get_token: Callable[[str, str], str]
) -> Callable[[dict], str]:
admin_token = get_token(USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD)
rule_ids = []
def _create_alert_rule(rule_data: dict) -> str:
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/rules"),
json=rule_data,
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert (
response.status_code == HTTPStatus.OK
), f"Failed to create rule, api returned {response.status_code} with response: {response.text}"
rule_id = response.json()["data"]["id"]
rule_ids.append(rule_id)
return rule_id
def _delete_alert_rule(rule_id: str):
logger.info("Deleting rule: %s", {"rule_id": rule_id})
response = requests.delete(
signoz.self.host_configs["8080"].get(f"/api/v1/rules/{rule_id}"),
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
if response.status_code != HTTPStatus.OK:
raise Exception( # pylint: disable=broad-exception-raised
f"Failed to delete rule, api returned {response.status_code} with response: {response.text}"
)
yield _create_alert_rule
# delete the rule on cleanup
for rule_id in rule_ids:
try:
_delete_alert_rule(rule_id)
except Exception as e: # pylint: disable=broad-exception-caught
logger.error("Error deleting rule: %s", {"rule_id": rule_id, "error": e})
@pytest.fixture(name="insert_alert_data", scope="function")
def insert_alert_data(
insert_metrics: Callable[[List[Metrics]], None],
insert_traces: Callable[[List[Traces]], None],
insert_logs: Callable[[List[Logs]], None],
) -> Callable[[List[types.AlertData]], None]:
def _insert_alert_data(
alert_data_items: List[types.AlertData],
base_time: datetime = None,
) -> None:
metrics: List[Metrics] = []
traces: List[Traces] = []
logs: List[Logs] = []
now = base_time or datetime.now(tz=timezone.utc).replace(
second=0, microsecond=0
)
for data_item in alert_data_items:
if data_item.type == "metrics":
_metrics = Metrics.load_from_file(
get_testdata_file_path(data_item.data_path),
base_time=now,
)
metrics.extend(_metrics)
elif data_item.type == "traces":
_traces = Traces.load_from_file(
get_testdata_file_path(data_item.data_path),
base_time=now,
)
traces.extend(_traces)
elif data_item.type == "logs":
_logs = Logs.load_from_file(
get_testdata_file_path(data_item.data_path),
base_time=now,
)
logs.extend(_logs)
# Add data to ClickHouse if any data is present
if len(metrics) > 0:
insert_metrics(metrics)
if len(traces) > 0:
insert_traces(traces)
if len(logs) > 0:
insert_logs(logs)
yield _insert_alert_data
def collect_webhook_firing_alerts(
webhook_test_container: types.TestContainerDocker, notification_channel_name: str
) -> List[types.FiringAlert]:

445
tests/fixtures/auth.py vendored Normal file
View File

@@ -0,0 +1,445 @@
import time
from http import HTTPStatus
from typing import Callable, Dict, List, Tuple
import pytest
import requests
from wiremock.client import Mappings
from wiremock.constants import Config
from wiremock.resources.mappings import (
HttpMethods,
Mapping,
MappingRequest,
MappingResponse,
WireMockMatchers,
)
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
USER_ADMIN_NAME = "admin"
USER_ADMIN_EMAIL = "admin@integration.test"
USER_ADMIN_PASSWORD = "password123Z$"
USER_EDITOR_NAME = "editor"
USER_EDITOR_EMAIL = "editor@integration.test"
USER_EDITOR_PASSWORD = "password123Z$"
USER_VIEWER_NAME = "viewer"
USER_VIEWER_EMAIL = "viewer@integration.test"
USER_VIEWER_PASSWORD = "password123Z$"
USERS_BASE = "/api/v2/users"
def _login(signoz: types.SigNoz, email: str, password: str) -> str:
"""Complete GET /sessions/context + POST /sessions/email_password; return accessToken."""
ctx = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert ctx.status_code == HTTPStatus.OK
org_id = ctx.json()["data"]["orgs"][0]["id"]
login = requests.post(
signoz.self.host_configs["8080"].get("/api/v2/sessions/email_password"),
json={"email": email, "password": password, "orgId": org_id},
timeout=5,
)
assert login.status_code == HTTPStatus.OK
return login.json()["data"]["accessToken"]
@pytest.fixture(name="create_user_admin", scope="package")
def create_user_admin(
signoz: types.SigNoz, request: pytest.FixtureRequest, pytestconfig: pytest.Config
) -> types.Operation:
def create() -> None:
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/register"),
json={
"name": USER_ADMIN_NAME,
"orgName": "",
"email": USER_ADMIN_EMAIL,
"password": USER_ADMIN_PASSWORD,
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
return types.Operation(name="create_user_admin")
def delete(_: types.Operation) -> None:
pass
def restore(cache: dict) -> types.Operation:
return types.Operation(name=cache["name"])
return reuse.wrap(
request,
pytestconfig,
"create_user_admin",
lambda: types.Operation(name=""),
create,
delete,
restore,
)
@pytest.fixture(name="get_session_context", scope="function")
def get_session_context(signoz: types.SigNoz) -> Callable[[str, str], str]:
def _get_session_context(email: str) -> str:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
return response.json()["data"]
return _get_session_context
@pytest.fixture(name="get_token", scope="function")
def get_token(signoz: types.SigNoz) -> Callable[[str, str], str]:
def _get_token(email: str, password: str) -> str:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
org_id = response.json()["data"]["orgs"][0]["id"]
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v2/sessions/email_password"),
json={
"email": email,
"password": password,
"orgId": org_id,
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
return response.json()["data"]["accessToken"]
return _get_token
@pytest.fixture(name="get_tokens", scope="function")
def get_tokens(signoz: types.SigNoz) -> Callable[[str, str], Tuple[str, str]]:
def _get_tokens(email: str, password: str) -> str:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
org_id = response.json()["data"]["orgs"][0]["id"]
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v2/sessions/email_password"),
json={
"email": email,
"password": password,
"orgId": org_id,
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
access_token = response.json()["data"]["accessToken"]
refresh_token = response.json()["data"]["refreshToken"]
return access_token, refresh_token
return _get_tokens
@pytest.fixture(name="apply_license", scope="package")
def apply_license(
signoz: types.SigNoz,
create_user_admin: types.Operation, # pylint: disable=unused-argument,redefined-outer-name
request: pytest.FixtureRequest,
pytestconfig: pytest.Config,
) -> types.Operation:
"""Stub Zeus license-lookup, then POST /api/v3/licenses so the BE flips
to ENTERPRISE. Package-scoped so an e2e bootstrap can pull it in and
every spec inherits the licensed state."""
def create() -> types.Operation:
Config.base_url = signoz.zeus.host_configs["8080"].get("/__admin")
Mappings.create_mapping(
mapping=Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v2/licenses/me",
headers={
"X-Signoz-Cloud-Api-Key": {
WireMockMatchers.EQUAL_TO: "secret-key"
}
},
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"id": "0196360e-90cd-7a74-8313-1aa815ce2a67",
"key": "secret-key",
"valid_from": 1732146923,
"valid_until": -1,
"status": "VALID",
"state": "EVALUATING",
"plan": {"name": "ENTERPRISE"},
"platform": "CLOUD",
"features": [],
"event_queue": {},
},
},
),
persistent=False,
)
)
access_token = _login(signoz, USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD)
# 202 = applied, 409 = already applied. Retry transient failures —
# the BE occasionally 5xxs right after startup before the license
# sync goroutine is ready.
license_url = signoz.self.host_configs["8080"].get("/api/v3/licenses")
auth_header = {"Authorization": f"Bearer {access_token}"}
for attempt in range(10):
resp = requests.post(
license_url,
json={"key": "secret-key"},
headers=auth_header,
timeout=5,
)
if resp.status_code in (HTTPStatus.ACCEPTED, HTTPStatus.CONFLICT):
break
if attempt == 9:
resp.raise_for_status()
time.sleep(1)
# The ENTERPRISE license flips on the `onboarding` feature which
# redirects first-time admins to a questionnaire. Mark the preference
# complete so specs can navigate directly to the feature under test.
pref_resp = requests.put(
signoz.self.host_configs["8080"].get(
"/api/v1/org/preferences/org_onboarding"
),
json={"value": True},
headers=auth_header,
timeout=5,
)
assert pref_resp.status_code in (HTTPStatus.OK, HTTPStatus.NO_CONTENT)
return types.Operation(name="apply_license")
def delete(_: types.Operation) -> None:
pass
def restore(cache: dict) -> types.Operation:
return types.Operation(name=cache["name"])
return reuse.wrap(
request,
pytestconfig,
"apply_license",
lambda: types.Operation(name=""),
create,
delete,
restore,
)
# This is not a fixture purposefully, we just want to add a license to the signoz instance.
# This is also idempotent in nature.
def add_license(
signoz: types.SigNoz,
make_http_mocks: Callable[[types.TestContainerDocker, List[Mapping]], None],
get_token: Callable[[str, str], str], # pylint: disable=redefined-outer-name
) -> None:
make_http_mocks(
signoz.zeus,
[
Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v2/licenses/me",
headers={
"X-Signoz-Cloud-Api-Key": {
WireMockMatchers.EQUAL_TO: "secret-key"
}
},
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"id": "0196360e-90cd-7a74-8313-1aa815ce2a67",
"key": "secret-key",
"valid_from": 1732146923,
"valid_until": -1,
"status": "VALID",
"state": "EVALUATING",
"plan": {
"name": "ENTERPRISE",
},
"platform": "CLOUD",
"features": [],
"event_queue": {},
},
},
),
persistent=False,
)
],
)
access_token = get_token(USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD)
response = requests.post(
url=signoz.self.host_configs["8080"].get("/api/v3/licenses"),
json={"key": "secret-key"},
headers={"Authorization": "Bearer " + access_token},
timeout=5,
)
if response.status_code == HTTPStatus.CONFLICT:
return
assert response.status_code == HTTPStatus.ACCEPTED
response = requests.post(
url=signoz.zeus.host_configs["8080"].get("/__admin/requests/count"),
json={"method": "GET", "url": "/v2/licenses/me"},
timeout=5,
)
assert response.json()["count"] == 1
def create_active_user(
signoz: types.SigNoz,
admin_token: str,
email: str,
role: str,
password: str,
name: str = "",
) -> str:
"""Invite a user and activate via resetPassword. Returns user ID."""
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/invite"),
json={"email": email, "role": role, "name": name},
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.CREATED, response.text
invited_user = response.json()["data"]
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/resetPassword"),
json={"password": password, "token": invited_user["token"]},
timeout=5,
)
assert response.status_code == HTTPStatus.NO_CONTENT, response.text
return invited_user["id"]
def find_user_by_email(signoz: types.SigNoz, token: str, email: str) -> Dict:
"""Find a user by email from the user list. Raises AssertionError if not found."""
response = requests.get(
signoz.self.host_configs["8080"].get(USERS_BASE),
headers={"Authorization": f"Bearer {token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text
user = next((u for u in response.json()["data"] if u["email"] == email), None)
assert user is not None, f"User with email '{email}' not found"
return user
def find_user_with_roles_by_email(signoz: types.SigNoz, token: str, email: str) -> Dict:
"""Find a user by email and return UserWithRoles (user fields + userRoles).
Raises AssertionError if the user is not found.
"""
user = find_user_by_email(signoz, token, email)
response = requests.get(
signoz.self.host_configs["8080"].get(f"{USERS_BASE}/{user['id']}"),
headers={"Authorization": f"Bearer {token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text
return response.json()["data"]
def assert_user_has_role(data: Dict, role_name: str) -> None:
"""Assert that a UserWithRoles response contains the expected managed role."""
role_names = {ur["role"]["name"] for ur in data.get("userRoles", [])}
assert role_name in role_names, f"Expected role '{role_name}' in {role_names}"
def change_user_role(
signoz: types.SigNoz,
admin_token: str,
user_id: str,
old_role: str,
new_role: str,
) -> None:
"""Change a user's role (remove old, assign new).
Role names should be managed role names (e.g. signoz-editor).
"""
# Get current roles to find the old role's ID
response = requests.get(
signoz.self.host_configs["8080"].get(f"{USERS_BASE}/{user_id}/roles"),
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text
roles = response.json()["data"]
old_role_entry = next((r for r in roles if r["name"] == old_role), None)
assert old_role_entry is not None, f"User does not have role '{old_role}'"
# Remove old role
response = requests.delete(
signoz.self.host_configs["8080"].get(
f"{USERS_BASE}/{user_id}/roles/{old_role_entry['id']}"
),
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.NO_CONTENT, response.text
# Assign new role
response = requests.post(
signoz.self.host_configs["8080"].get(f"{USERS_BASE}/{user_id}/roles"),
json={"name": new_role},
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text

View File

@@ -10,7 +10,7 @@ import pytest
from testcontainers.clickhouse import ClickHouseContainer
from testcontainers.core.container import Network
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -256,7 +256,7 @@ def clickhouse(
env=env,
)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"clickhouse",

View File

@@ -5,6 +5,13 @@ from typing import Callable
import pytest
import requests
from wiremock.client import (
HttpMethods,
Mapping,
MappingRequest,
MappingResponse,
WireMockMatchers,
)
from fixtures import types
from fixtures.auth import USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD
@@ -153,3 +160,140 @@ def create_cloud_integration_account(
logger.info("Cleaned up test account: %s", account_id)
except Exception as exc: # pylint: disable=broad-except
logger.info("Post-test delete cleanup failed: %s", exc)
def deprecated_simulate_agent_checkin(
signoz: types.SigNoz,
admin_token: str,
cloud_provider: str,
account_id: str,
cloud_account_id: str,
) -> requests.Response:
endpoint = f"/api/v1/cloud-integrations/{cloud_provider}/agent-check-in"
checkin_payload = {
"account_id": account_id,
"cloud_account_id": cloud_account_id,
"data": {},
}
response = requests.post(
signoz.self.host_configs["8080"].get(endpoint),
headers={"Authorization": f"Bearer {admin_token}"},
json=checkin_payload,
timeout=10,
)
if not response.ok:
logger.error(
"Agent check-in failed: %s, response: %s",
response.status_code,
response.text,
)
return response
def setup_create_account_mocks(
signoz: types.SigNoz,
make_http_mocks: Callable,
) -> None:
"""Set up Zeus and Gateway mocks required by the CreateAccount endpoint."""
make_http_mocks(
signoz.zeus,
[
Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v2/deployments/me",
headers={
"X-Signoz-Cloud-Api-Key": {
WireMockMatchers.EQUAL_TO: "secret-key"
}
},
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"name": "test-deployment",
"cluster": {"region": {"dns": "test.signoz.cloud"}},
},
},
),
persistent=False,
)
],
)
make_http_mocks(
signoz.gateway,
[
Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v1/workspaces/me/keys/search?name=aws-integration&page=1&per_page=10",
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": [],
"_pagination": {"page": 1, "per_page": 10, "total": 0},
},
),
persistent=False,
),
Mapping(
request=MappingRequest(
method=HttpMethods.POST,
url="/v1/workspaces/me/keys",
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"name": "aws-integration",
"value": "test-ingestion-key-123456",
},
"error": "",
},
),
persistent=False,
),
],
)
def simulate_agent_checkin(
signoz: types.SigNoz,
admin_token: str,
cloud_provider: str,
account_id: str,
cloud_account_id: str,
data: dict | None = None,
) -> requests.Response:
endpoint = f"/api/v1/cloud_integrations/{cloud_provider}/accounts/check_in"
checkin_payload = {
"cloudIntegrationId": account_id,
"providerAccountId": cloud_account_id,
"data": data or {},
}
response = requests.post(
signoz.self.host_configs["8080"].get(endpoint),
headers={"Authorization": f"Bearer {admin_token}"},
json=checkin_payload,
timeout=10,
)
if not response.ok:
logger.error(
"Agent check-in failed: %s, response: %s",
response.status_code,
response.text,
)
return response

25
tests/fixtures/fs.py vendored Normal file
View File

@@ -0,0 +1,25 @@
import os
from typing import Any, Generator
import pytest
from fixtures import types
@pytest.fixture(scope="package")
def tmpfs(
tmp_path_factory: pytest.TempPathFactory,
) -> Generator[types.LegacyPath, Any, None]:
def _tmp(basename: str):
return tmp_path_factory.mktemp(basename)
yield _tmp
def get_testdata_file_path(file: str) -> str:
# Integration testdata lives at tests/integration/testdata/. This helper
# resolves from tests/fixtures/fs.py, so walk up to tests/ and across.
testdata_dir = os.path.join(
os.path.dirname(__file__), "..", "integration", "testdata"
)
return os.path.join(testdata_dir, file)

View File

@@ -12,7 +12,7 @@ from wiremock.client import (
from wiremock.constants import Config
from wiremock.testing.testcontainer import WireMockContainer
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -63,7 +63,7 @@ def zeus(
def restore(cache: dict) -> types.TestContainerDocker:
return types.TestContainerDocker.from_cache(cache)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"zeus",
@@ -120,7 +120,7 @@ def gateway(
def restore(cache: dict) -> types.TestContainerDocker:
return types.TestContainerDocker.from_cache(cache)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"gateway",

View File

@@ -11,7 +11,7 @@ from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
from fixtures import types
from fixtures.idp import IDP_ROOT_PASSWORD, IDP_ROOT_USERNAME
from fixtures.keycloak import IDP_ROOT_PASSWORD, IDP_ROOT_USERNAME
@pytest.fixture(name="create_saml_client", scope="function")

View File

@@ -4,7 +4,7 @@ import pytest
from testcontainers.core.container import Network
from testcontainers.keycloak import KeycloakContainer
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -80,7 +80,7 @@ def idp(
container=container,
)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"idp",

View File

@@ -11,7 +11,7 @@ from ksuid import KsuidMs
from fixtures import types
from fixtures.fingerprint import LogsOrTracesFingerprint
from fixtures.utils import parse_timestamp
from fixtures.time import parse_timestamp
class LogsResource(ABC):
@@ -391,112 +391,124 @@ class Logs(ABC):
return logs
def insert_logs_to_clickhouse(conn, logs: List[Logs]) -> None:
"""
Insert logs into ClickHouse tables following the same logic as the Go exporter.
Handles insertion into:
- distributed_logs_v2 (main logs table)
- distributed_logs_v2_resource (resource fingerprints)
- distributed_tag_attributes_v2 (tag attributes)
- distributed_logs_attribute_keys (attribute keys)
- distributed_logs_resource_keys (resource keys)
Pure function so the seeder container can reuse the exact insert path
used by the pytest fixture. `conn` is a clickhouse-connect Client.
"""
resources: List[LogsResource] = []
for log in logs:
resources.extend(log.resource)
if len(resources) > 0:
conn.insert(
database="signoz_logs",
table="distributed_logs_v2_resource",
data=[resource.np_arr() for resource in resources],
column_names=[
"labels",
"fingerprint",
"seen_at_ts_bucket_start",
],
)
tag_attributes: List[LogsTagAttributes] = []
for log in logs:
tag_attributes.extend(log.tag_attributes)
if len(tag_attributes) > 0:
conn.insert(
database="signoz_logs",
table="distributed_tag_attributes_v2",
data=[tag_attribute.np_arr() for tag_attribute in tag_attributes],
)
attribute_keys: List[LogsResourceOrAttributeKeys] = []
for log in logs:
attribute_keys.extend(log.attribute_keys)
if len(attribute_keys) > 0:
conn.insert(
database="signoz_logs",
table="distributed_logs_attribute_keys",
data=[attribute_key.np_arr() for attribute_key in attribute_keys],
)
resource_keys: List[LogsResourceOrAttributeKeys] = []
for log in logs:
resource_keys.extend(log.resource_keys)
if len(resource_keys) > 0:
conn.insert(
database="signoz_logs",
table="distributed_logs_resource_keys",
data=[resource_key.np_arr() for resource_key in resource_keys],
)
conn.insert(
database="signoz_logs",
table="distributed_logs_v2",
data=[log.np_arr() for log in logs],
column_names=[
"ts_bucket_start",
"resource_fingerprint",
"timestamp",
"observed_timestamp",
"id",
"trace_id",
"span_id",
"trace_flags",
"severity_text",
"severity_number",
"body",
"attributes_string",
"attributes_number",
"attributes_bool",
"resources_string",
"scope_name",
"scope_version",
"scope_string",
"resource",
],
)
_LOGS_TABLES_TO_TRUNCATE = [
"logs_v2",
"logs_v2_resource",
"tag_attributes_v2",
"logs_attribute_keys",
"logs_resource_keys",
]
def truncate_logs_tables(conn, cluster: str) -> None:
"""Truncate all logs tables. Used by the pytest fixture teardown and by
the seeder's DELETE /telemetry/logs endpoint."""
for table in _LOGS_TABLES_TO_TRUNCATE:
conn.query(f"TRUNCATE TABLE signoz_logs.{table} ON CLUSTER '{cluster}' SYNC")
@pytest.fixture(name="insert_logs", scope="function")
def insert_logs(
clickhouse: types.TestContainerClickhouse,
) -> Generator[Callable[[List[Logs]], None], Any, None]:
def _insert_logs(logs: List[Logs]) -> None:
"""
Insert logs into ClickHouse tables following the same logic as the Go exporter.
This function handles insertion into multiple tables:
- distributed_logs_v2 (main logs table)
- distributed_logs_v2_resource (resource fingerprints)
- distributed_tag_attributes_v2 (tag attributes)
- distributed_logs_attribute_keys (attribute keys)
- distributed_logs_resource_keys (resource keys)
"""
resources: List[LogsResource] = []
for log in logs:
resources.extend(log.resource)
if len(resources) > 0:
clickhouse.conn.insert(
database="signoz_logs",
table="distributed_logs_v2_resource",
data=[resource.np_arr() for resource in resources],
column_names=[
"labels",
"fingerprint",
"seen_at_ts_bucket_start",
],
)
tag_attributes: List[LogsTagAttributes] = []
for log in logs:
tag_attributes.extend(log.tag_attributes)
if len(tag_attributes) > 0:
clickhouse.conn.insert(
database="signoz_logs",
table="distributed_tag_attributes_v2",
data=[tag_attribute.np_arr() for tag_attribute in tag_attributes],
)
attribute_keys: List[LogsResourceOrAttributeKeys] = []
for log in logs:
attribute_keys.extend(log.attribute_keys)
if len(attribute_keys) > 0:
clickhouse.conn.insert(
database="signoz_logs",
table="distributed_logs_attribute_keys",
data=[attribute_key.np_arr() for attribute_key in attribute_keys],
)
resource_keys: List[LogsResourceOrAttributeKeys] = []
for log in logs:
resource_keys.extend(log.resource_keys)
if len(resource_keys) > 0:
clickhouse.conn.insert(
database="signoz_logs",
table="distributed_logs_resource_keys",
data=[resource_key.np_arr() for resource_key in resource_keys],
)
clickhouse.conn.insert(
database="signoz_logs",
table="distributed_logs_v2",
data=[log.np_arr() for log in logs],
column_names=[
"ts_bucket_start",
"resource_fingerprint",
"timestamp",
"observed_timestamp",
"id",
"trace_id",
"span_id",
"trace_flags",
"severity_text",
"severity_number",
"body",
"attributes_string",
"attributes_number",
"attributes_bool",
"resources_string",
"scope_name",
"scope_version",
"scope_string",
"resource",
],
)
insert_logs_to_clickhouse(clickhouse.conn, logs)
yield _insert_logs
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_logs.logs_v2 ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_logs.logs_v2_resource ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_logs.tag_attributes_v2 ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_logs.logs_attribute_keys ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_logs.logs_resource_keys ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
truncate_logs_tables(
clickhouse.conn,
clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER"],
)

View File

@@ -8,7 +8,7 @@ import numpy as np
import pytest
from fixtures import types
from fixtures.utils import parse_timestamp
from fixtures.time import parse_timestamp
class MetricsTimeSeries(ABC):
@@ -417,151 +417,166 @@ class Metrics(ABC):
return metrics
def insert_metrics_to_clickhouse(conn, metrics: List[Metrics]) -> None:
"""
Insert metrics into ClickHouse tables.
Handles insertion into:
- distributed_time_series_v4 (time series metadata)
- distributed_samples_v4 (actual sample values)
- distributed_metadata (metric attribute metadata)
Pure function so the seeder container can reuse the exact insert path
used by the pytest fixture. `conn` is a clickhouse-connect Client.
"""
time_series_map: dict[int, MetricsTimeSeries] = {}
for metric in metrics:
fp = int(metric.time_series.fingerprint)
if fp not in time_series_map:
time_series_map[fp] = metric.time_series
if len(time_series_map) > 0:
conn.insert(
database="signoz_metrics",
table="distributed_time_series_v4",
column_names=[
"env",
"temporality",
"metric_name",
"description",
"unit",
"type",
"is_monotonic",
"fingerprint",
"unix_milli",
"labels",
"attrs",
"scope_attrs",
"resource_attrs",
"__normalized",
],
data=[ts.to_row() for ts in time_series_map.values()],
)
samples = [metric.sample for metric in metrics]
if len(samples) > 0:
conn.insert(
database="signoz_metrics",
table="distributed_samples_v4",
column_names=[
"env",
"temporality",
"metric_name",
"fingerprint",
"unix_milli",
"value",
"flags",
],
data=[sample.to_row() for sample in samples],
)
# (metric_name, attr_type, attr_name, attr_value) -> MetricsMetadata
metadata_map: dict[tuple, MetricsMetadata] = {}
for metric in metrics:
ts = metric.time_series
for attr_name, attr_value in metric.labels.items():
key = (ts.metric_name, "point", attr_name, str(attr_value))
if key not in metadata_map:
metadata_map[key] = MetricsMetadata(
metric_name=ts.metric_name,
attr_name=attr_name,
attr_type="point",
attr_datatype="String",
attr_string_value=str(attr_value),
timestamp=metric.timestamp,
temporality=ts.temporality,
description=ts.description,
unit=ts.unit,
type_=ts.type,
is_monotonic=ts.is_monotonic,
)
for attr_name, attr_value in ts.resource_attrs.items():
key = (ts.metric_name, "resource", attr_name, str(attr_value))
if key not in metadata_map:
metadata_map[key] = MetricsMetadata(
metric_name=ts.metric_name,
attr_name=attr_name,
attr_type="resource",
attr_datatype="String",
attr_string_value=str(attr_value),
timestamp=metric.timestamp,
temporality=ts.temporality,
description=ts.description,
unit=ts.unit,
type_=ts.type,
is_monotonic=ts.is_monotonic,
)
for attr_name, attr_value in ts.scope_attrs.items():
key = (ts.metric_name, "scope", attr_name, str(attr_value))
if key not in metadata_map:
metadata_map[key] = MetricsMetadata(
metric_name=ts.metric_name,
attr_name=attr_name,
attr_type="scope",
attr_datatype="String",
attr_string_value=str(attr_value),
timestamp=metric.timestamp,
temporality=ts.temporality,
description=ts.description,
unit=ts.unit,
type_=ts.type,
is_monotonic=ts.is_monotonic,
)
if len(metadata_map) > 0:
conn.insert(
database="signoz_metrics",
table="distributed_metadata",
column_names=[
"temporality",
"metric_name",
"description",
"unit",
"type",
"is_monotonic",
"attr_name",
"attr_type",
"attr_datatype",
"attr_string_value",
"first_reported_unix_milli",
"last_reported_unix_milli",
],
data=[m.to_row() for m in metadata_map.values()],
)
_METRICS_TABLES_TO_TRUNCATE = [
"time_series_v4",
"samples_v4",
"exp_hist",
"metadata",
]
def truncate_metrics_tables(conn, cluster: str) -> None:
"""Truncate all metrics tables. Used by the pytest fixture teardown and by
the seeder's DELETE /telemetry/metrics endpoint."""
for table in _METRICS_TABLES_TO_TRUNCATE:
conn.query(f"TRUNCATE TABLE signoz_metrics.{table} ON CLUSTER '{cluster}' SYNC")
@pytest.fixture(name="insert_metrics", scope="function")
def insert_metrics(
clickhouse: types.TestContainerClickhouse,
) -> Generator[Callable[[List[Metrics]], None], Any, None]:
def _insert_metrics(metrics: List[Metrics]) -> None:
"""
Insert metrics into ClickHouse tables.
This function handles insertion into:
- distributed_time_series_v4 (time series metadata)
- distributed_samples_v4 (actual sample values)
- distributed_metadata (metric attribute metadata)
"""
time_series_map: dict[int, MetricsTimeSeries] = {}
for metric in metrics:
fp = int(metric.time_series.fingerprint)
if fp not in time_series_map:
time_series_map[fp] = metric.time_series
if len(time_series_map) > 0:
clickhouse.conn.insert(
database="signoz_metrics",
table="distributed_time_series_v4",
column_names=[
"env",
"temporality",
"metric_name",
"description",
"unit",
"type",
"is_monotonic",
"fingerprint",
"unix_milli",
"labels",
"attrs",
"scope_attrs",
"resource_attrs",
"__normalized",
],
data=[ts.to_row() for ts in time_series_map.values()],
)
samples = [metric.sample for metric in metrics]
if len(samples) > 0:
clickhouse.conn.insert(
database="signoz_metrics",
table="distributed_samples_v4",
column_names=[
"env",
"temporality",
"metric_name",
"fingerprint",
"unix_milli",
"value",
"flags",
],
data=[sample.to_row() for sample in samples],
)
# (metric_name, attr_type, attr_name, attr_value) -> MetricsMetadata
metadata_map: dict[tuple, MetricsMetadata] = {}
for metric in metrics:
ts = metric.time_series
for attr_name, attr_value in metric.labels.items():
key = (ts.metric_name, "point", attr_name, str(attr_value))
if key not in metadata_map:
metadata_map[key] = MetricsMetadata(
metric_name=ts.metric_name,
attr_name=attr_name,
attr_type="point",
attr_datatype="String",
attr_string_value=str(attr_value),
timestamp=metric.timestamp,
temporality=ts.temporality,
description=ts.description,
unit=ts.unit,
type_=ts.type,
is_monotonic=ts.is_monotonic,
)
for attr_name, attr_value in ts.resource_attrs.items():
key = (ts.metric_name, "resource", attr_name, str(attr_value))
if key not in metadata_map:
metadata_map[key] = MetricsMetadata(
metric_name=ts.metric_name,
attr_name=attr_name,
attr_type="resource",
attr_datatype="String",
attr_string_value=str(attr_value),
timestamp=metric.timestamp,
temporality=ts.temporality,
description=ts.description,
unit=ts.unit,
type_=ts.type,
is_monotonic=ts.is_monotonic,
)
for attr_name, attr_value in ts.scope_attrs.items():
key = (ts.metric_name, "scope", attr_name, str(attr_value))
if key not in metadata_map:
metadata_map[key] = MetricsMetadata(
metric_name=ts.metric_name,
attr_name=attr_name,
attr_type="scope",
attr_datatype="String",
attr_string_value=str(attr_value),
timestamp=metric.timestamp,
temporality=ts.temporality,
description=ts.description,
unit=ts.unit,
type_=ts.type,
is_monotonic=ts.is_monotonic,
)
if len(metadata_map) > 0:
clickhouse.conn.insert(
database="signoz_metrics",
table="distributed_metadata",
column_names=[
"temporality",
"metric_name",
"description",
"unit",
"type",
"is_monotonic",
"attr_name",
"attr_type",
"attr_datatype",
"attr_string_value",
"first_reported_unix_milli",
"last_reported_unix_milli",
],
data=[m.to_row() for m in metadata_map.values()],
)
insert_metrics_to_clickhouse(clickhouse.conn, metrics)
yield _insert_metrics
cluster = clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER"]
tables_to_truncate = [
"time_series_v4",
"samples_v4",
"exp_hist",
"metadata",
]
for table in tables_to_truncate:
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_metrics.{table} ON CLUSTER '{cluster}' SYNC"
)
truncate_metrics_tables(
clickhouse.conn,
clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER"],
)
@pytest.fixture(name="remove_metrics_ttl_and_storage_settings", scope="function")

View File

@@ -2,7 +2,7 @@ import docker
import pytest
from testcontainers.core.container import Network
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -67,7 +67,7 @@ def migrator(
def restore(cache: dict) -> types.Operation:
return types.Operation(name=cache["name"])
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"migrator",

View File

@@ -3,7 +3,7 @@ import docker.errors
import pytest
from testcontainers.core.network import Network
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -37,7 +37,7 @@ def network(
nw = client.networks.get(network_id=existing.get("id"))
return types.Network(id=nw.id, name=nw.name)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"network",

View File

@@ -8,7 +8,7 @@ import requests
from testcontainers.core.container import Network
from wiremock.testing.testcontainer import WireMockContainer
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.auth import USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD
from fixtures.logger import setup_logger
@@ -60,7 +60,7 @@ def notification_channel(
def restore(cache: dict) -> types.TestContainerDocker:
return types.TestContainerDocker.from_cache(cache)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"notification_channel",

View File

@@ -5,7 +5,7 @@ from sqlalchemy import create_engine, sql
from testcontainers.core.container import Network
from testcontainers.postgres import PostgresContainer
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -97,7 +97,7 @@ def postgres(
env=env,
)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"postgres",

View File

@@ -1,10 +1,13 @@
from dataclasses import dataclass
from datetime import datetime, timedelta
from datetime import datetime, timedelta, timezone
from http import HTTPStatus
from typing import Any, Dict, List, Optional, Union
import requests
from fixtures import types
from fixtures.logs import Logs
from fixtures.traces import TraceIdGenerator, Traces, TracesKind, TracesStatusCode
DEFAULT_STEP_INTERVAL = 60 # seconds
DEFAULT_TOLERANCE = 1e-9
@@ -583,3 +586,251 @@ def assert_scalar_column_order(
f"{context}: Column {column_index} order mismatch. "
f"Expected {expected_values}, got {actual_values}"
)
def format_timestamp(dt: datetime) -> str:
"""
Format a datetime object to match the API's timestamp format.
The API returns timestamps with minimal fractional seconds precision.
Example: 2026-02-03T20:54:56.5Z for 500000 microseconds
"""
base_str = dt.strftime("%Y-%m-%dT%H:%M:%S")
if dt.microsecond:
# Convert microseconds to fractional seconds and strip trailing zeros
fractional = f"{dt.microsecond / 1000000:.6f}"[2:].rstrip("0")
return f"{base_str}.{fractional}Z"
return f"{base_str}Z"
def assert_identical_query_response(
response1: requests.Response, response2: requests.Response
) -> None:
"""
Assert that two query responses are identical in status and data.
"""
assert response1.status_code == response2.status_code, "Status codes do not match"
if response1.status_code == HTTPStatus.OK:
assert (
response1.json()["status"] == response2.json()["status"]
), "Response statuses do not match"
assert (
response1.json()["data"]["data"]["results"]
== response2.json()["data"]["data"]["results"]
), "Response data do not match"
def generate_logs_with_corrupt_metadata() -> List[Logs]:
"""
Specifically, entries with 'id', 'timestamp', 'severity_text', 'severity_number' and 'body' fields in metadata
"""
now = datetime.now(tz=timezone.utc).replace(second=0, microsecond=0)
return [
Logs(
timestamp=now - timedelta(seconds=4),
body="POST /integration request received",
severity_text="INFO",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"timestamp": "corrupt_data",
},
attributes={
"net.transport": "IP.TCP",
"http.scheme": "http",
"http.user_agent": "Integration Test",
"http.request.method": "POST",
"http.response.status_code": "200",
"severity_text": "corrupt_data",
"timestamp": "corrupt_data",
},
trace_id="1",
),
Logs(
timestamp=now - timedelta(seconds=3),
body="SELECT query executed",
severity_text="DEBUG",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"severity_number": "corrupt_data",
"id": "corrupt_data",
},
attributes={
"db.name": "integration",
"db.operation": "SELECT",
"db.statement": "SELECT * FROM integration",
"trace_id": "2",
},
),
Logs(
timestamp=now - timedelta(seconds=2),
body="HTTP PATCH failed with 404",
severity_text="WARN",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"body": "corrupt_data",
"trace_id": "3",
},
attributes={
"http.request.method": "PATCH",
"http.status_code": "404",
"id": "1",
},
),
Logs(
timestamp=now - timedelta(seconds=1),
body="{'trace_id': '4'}",
severity_text="ERROR",
resources={
"deployment.environment": "production",
"service.name": "topic-service",
"os.type": "linux",
"host.name": "linux-001",
"cloud.provider": "integration",
"cloud.account.id": "001",
},
attributes={
"message.type": "SENT",
"messaging.operation": "publish",
"messaging.message.id": "001",
"body": "corrupt_data",
"timestamp": "corrupt_data",
},
),
]
def generate_traces_with_corrupt_metadata() -> List[Traces]:
"""
Specifically, entries with 'id', 'timestamp', 'trace_id' and 'duration_nano' fields in metadata
"""
http_service_trace_id = TraceIdGenerator.trace_id()
http_service_span_id = TraceIdGenerator.span_id()
http_service_db_span_id = TraceIdGenerator.span_id()
http_service_patch_span_id = TraceIdGenerator.span_id()
topic_service_trace_id = TraceIdGenerator.trace_id()
topic_service_span_id = TraceIdGenerator.span_id()
now = datetime.now(tz=timezone.utc).replace(second=0, microsecond=0)
return [
Traces(
timestamp=now - timedelta(seconds=4),
duration=timedelta(seconds=3),
trace_id=http_service_trace_id,
span_id=http_service_span_id,
parent_span_id="",
name="POST /integration",
kind=TracesKind.SPAN_KIND_SERVER,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"trace_id": "corrupt_data",
},
attributes={
"net.transport": "IP.TCP",
"http.scheme": "http",
"http.user_agent": "Integration Test",
"http.request.method": "POST",
"http.response.status_code": "200",
"timestamp": "corrupt_data",
},
),
Traces(
timestamp=now - timedelta(seconds=3.5),
duration=timedelta(seconds=5),
trace_id=http_service_trace_id,
span_id=http_service_db_span_id,
parent_span_id=http_service_span_id,
name="SELECT",
kind=TracesKind.SPAN_KIND_CLIENT,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"timestamp": "corrupt_data",
},
attributes={
"db.name": "integration",
"db.operation": "SELECT",
"db.statement": "SELECT * FROM integration",
"trace_d": "corrupt_data",
},
),
Traces(
timestamp=now - timedelta(seconds=3),
duration=timedelta(seconds=1),
trace_id=http_service_trace_id,
span_id=http_service_patch_span_id,
parent_span_id=http_service_span_id,
name="HTTP PATCH",
kind=TracesKind.SPAN_KIND_CLIENT,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"duration_nano": "corrupt_data",
},
attributes={
"http.request.method": "PATCH",
"http.status_code": "404",
"id": "1",
},
),
Traces(
timestamp=now - timedelta(seconds=1),
duration=timedelta(seconds=4),
trace_id=topic_service_trace_id,
span_id=topic_service_span_id,
parent_span_id="",
name="topic publish",
kind=TracesKind.SPAN_KIND_PRODUCER,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "topic-service",
"os.type": "linux",
"host.name": "linux-001",
"cloud.provider": "integration",
"cloud.account.id": "001",
},
attributes={
"message.type": "SENT",
"messaging.operation": "publish",
"messaging.message.id": "001",
"duration_nano": "corrupt_data",
"id": 1,
},
),
]

101
tests/fixtures/seeder.py vendored Normal file
View File

@@ -0,0 +1,101 @@
import time
from http import HTTPStatus
import docker
import docker.errors
import pytest
import requests
from testcontainers.core.container import DockerContainer, Network
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@pytest.fixture(name="seeder", scope="package")
def seeder(
network: Network,
clickhouse: types.TestContainerClickhouse,
request: pytest.FixtureRequest,
pytestconfig: pytest.Config,
) -> types.TestContainerDocker:
"""HTTP seeder container exposing POST/DELETE endpoints for per-test telemetry."""
def create() -> types.TestContainerDocker:
docker_client = docker.from_env()
docker_client.images.build(
path=str(pytestconfig.rootpath),
dockerfile="Dockerfile.seeder",
tag="signoz-tests-seeder:latest",
rm=True,
)
container = DockerContainer("signoz-tests-seeder:latest")
container.with_env(
"CH_HOST", clickhouse.container.container_configs["8123"].address
)
container.with_env(
"CH_PORT", str(clickhouse.container.container_configs["8123"].port)
)
container.with_env(
"CH_USER", clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_USERNAME"]
)
container.with_env(
"CH_PASSWORD", clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_PASSWORD"]
)
container.with_env(
"CH_CLUSTER", clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER"]
)
container.with_exposed_ports(8080)
container.with_network(network=network)
container.start()
host = container.get_container_host_ip()
host_port = container.get_exposed_port(8080)
for attempt in range(20):
try:
response = requests.get(f"http://{host}:{host_port}/healthz", timeout=2)
if response.status_code == HTTPStatus.OK:
break
except Exception as e: # pylint: disable=broad-exception-caught
logger.info("seeder attempt %d: %s", attempt + 1, e)
time.sleep(1)
else:
raise TimeoutError("seeder container did not become ready")
return types.TestContainerDocker(
id=container.get_wrapped_container().id,
host_configs={
"8080": types.TestContainerUrlConfig("http", host, host_port),
},
container_configs={
"8080": types.TestContainerUrlConfig(
"http", container.get_wrapped_container().name, 8080
),
},
)
def delete(container: types.TestContainerDocker) -> None:
client = docker.from_env()
try:
client.containers.get(container_id=container.id).stop()
client.containers.get(container_id=container.id).remove(v=True)
except docker.errors.NotFound:
logger.info("Seeder container %s already gone", container.id)
def restore(cache: dict) -> types.TestContainerDocker:
return types.TestContainerDocker.from_cache(cache)
return reuse.wrap(
request,
pytestconfig,
"seeder",
empty=lambda: types.TestContainerDocker(
id="", host_configs={}, container_configs={}
),
create=create,
delete=delete,
restore=restore,
)

View File

@@ -11,7 +11,7 @@ import requests
from testcontainers.core.container import DockerContainer, Network
from testcontainers.core.image import DockerImage
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -49,8 +49,10 @@ def create_signoz(
if with_web:
dockerfile_path = "cmd/enterprise/Dockerfile.with-web.integration"
# Docker build context is the repo root — one up from pytest's
# rootdir (tests/).
self = DockerImage(
path="../../",
path=str(pytestconfig.rootpath.parent),
dockerfile_path=dockerfile_path,
tag="signoz:integration",
buildargs={
@@ -181,7 +183,7 @@ def create_signoz(
gateway=gateway,
)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
cache_key,

View File

@@ -4,7 +4,7 @@ from typing import Any, Generator
import pytest
from sqlalchemy import create_engine, sql
from fixtures import dev, types
from fixtures import reuse, types
ConnectionTuple = namedtuple("ConnectionTuple", "connection config")
@@ -64,7 +64,7 @@ def sqlite(
env=cache["env"],
)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"sqlite",

View File

@@ -1,5 +1,4 @@
import datetime
import os
from typing import Any
import isodate
@@ -26,8 +25,3 @@ def parse_duration(duration: Any) -> datetime.timedelta:
if isinstance(duration, datetime.timedelta):
return duration
return datetime.timedelta(seconds=duration)
def get_testdata_file_path(file: str) -> str:
testdata_dir = os.path.join(os.path.dirname(__file__), "..", "testdata")
return os.path.join(testdata_dir, file)

View File

@@ -13,7 +13,7 @@ import pytest
from fixtures import types
from fixtures.fingerprint import LogsOrTracesFingerprint
from fixtures.utils import parse_duration, parse_timestamp
from fixtures.time import parse_duration, parse_timestamp
class TracesKind(Enum):
@@ -689,131 +689,142 @@ class Traces(ABC):
return traces
def insert_traces_to_clickhouse(conn, traces: List[Traces]) -> None:
"""
Insert traces into ClickHouse tables following the same logic as the Go exporter.
Handles insertion into:
- distributed_signoz_index_v3 (main traces table)
- distributed_traces_v3_resource (resource fingerprints)
- distributed_tag_attributes_v2 (tag attributes)
- distributed_span_attributes_keys (attribute keys)
- distributed_signoz_error_index_v2 (error events)
Pure function so the seeder container (tests/seeder/) can reuse the
exact insert path used by the pytest fixtures. `conn` is a
clickhouse-connect Client.
"""
resources: List[TracesResource] = []
for trace in traces:
resources.extend(trace.resource)
if len(resources) > 0:
conn.insert(
database="signoz_traces",
table="distributed_traces_v3_resource",
data=[resource.np_arr() for resource in resources],
)
tag_attributes: List[TracesTagAttributes] = []
for trace in traces:
tag_attributes.extend(trace.tag_attributes)
if len(tag_attributes) > 0:
conn.insert(
database="signoz_traces",
table="distributed_tag_attributes_v2",
data=[tag_attribute.np_arr() for tag_attribute in tag_attributes],
)
attribute_keys: List[TracesResourceOrAttributeKeys] = []
resource_keys: List[TracesResourceOrAttributeKeys] = []
for trace in traces:
attribute_keys.extend(trace.attribute_keys)
resource_keys.extend(trace.resource_keys)
if len(attribute_keys) > 0:
conn.insert(
database="signoz_traces",
table="distributed_span_attributes_keys",
data=[attribute_key.np_arr() for attribute_key in attribute_keys],
)
if len(resource_keys) > 0:
conn.insert(
database="signoz_traces",
table="distributed_span_attributes_keys",
data=[resource_key.np_arr() for resource_key in resource_keys],
)
conn.insert(
database="signoz_traces",
table="distributed_signoz_index_v3",
column_names=[
"ts_bucket_start",
"resource_fingerprint",
"timestamp",
"trace_id",
"span_id",
"trace_state",
"parent_span_id",
"flags",
"name",
"kind",
"kind_string",
"duration_nano",
"status_code",
"status_message",
"status_code_string",
"attributes_string",
"attributes_number",
"attributes_bool",
"resources_string",
"events",
"links",
"response_status_code",
"external_http_url",
"http_url",
"external_http_method",
"http_method",
"http_host",
"db_name",
"db_operation",
"has_error",
"is_remote",
"resource",
],
data=[trace.np_arr() for trace in traces],
)
error_events: List[TracesErrorEvent] = []
for trace in traces:
error_events.extend(trace.error_events)
if len(error_events) > 0:
conn.insert(
database="signoz_traces",
table="distributed_signoz_error_index_v2",
data=[error_event.np_arr() for error_event in error_events],
)
_TRACES_TABLES_TO_TRUNCATE = [
"signoz_index_v3",
"traces_v3_resource",
"tag_attributes_v2",
"span_attributes_keys",
"signoz_error_index_v2",
]
def truncate_traces_tables(conn, cluster: str) -> None:
"""Truncate all traces tables. Used by the pytest fixture teardown and by
the seeder's DELETE /telemetry/traces endpoint."""
for table in _TRACES_TABLES_TO_TRUNCATE:
conn.query(f"TRUNCATE TABLE signoz_traces.{table} ON CLUSTER '{cluster}' SYNC")
@pytest.fixture(name="insert_traces", scope="function")
def insert_traces(
clickhouse: types.TestContainerClickhouse,
) -> Generator[Callable[[List[Traces]], None], Any, None]:
def _insert_traces(traces: List[Traces]) -> None:
"""
Insert traces into ClickHouse tables following the same logic as the Go exporter.
This function handles insertion into multiple tables:
- distributed_signoz_index_v3 (main traces table)
- distributed_traces_v3_resource (resource fingerprints)
- distributed_tag_attributes_v2 (tag attributes)
- distributed_span_attributes_keys (attribute keys)
- distributed_signoz_error_index_v2 (error events)
"""
resources: List[TracesResource] = []
for trace in traces:
resources.extend(trace.resource)
if len(resources) > 0:
clickhouse.conn.insert(
database="signoz_traces",
table="distributed_traces_v3_resource",
data=[resource.np_arr() for resource in resources],
)
tag_attributes: List[TracesTagAttributes] = []
for trace in traces:
tag_attributes.extend(trace.tag_attributes)
if len(tag_attributes) > 0:
clickhouse.conn.insert(
database="signoz_traces",
table="distributed_tag_attributes_v2",
data=[tag_attribute.np_arr() for tag_attribute in tag_attributes],
)
attribute_keys: List[TracesResourceOrAttributeKeys] = []
resource_keys: List[TracesResourceOrAttributeKeys] = []
for trace in traces:
attribute_keys.extend(trace.attribute_keys)
resource_keys.extend(trace.resource_keys)
if len(attribute_keys) > 0:
clickhouse.conn.insert(
database="signoz_traces",
table="distributed_span_attributes_keys",
data=[attribute_key.np_arr() for attribute_key in attribute_keys],
)
if len(resource_keys) > 0:
clickhouse.conn.insert(
database="signoz_traces",
table="distributed_span_attributes_keys",
data=[resource_key.np_arr() for resource_key in resource_keys],
)
# Insert main traces
clickhouse.conn.insert(
database="signoz_traces",
table="distributed_signoz_index_v3",
column_names=[
"ts_bucket_start",
"resource_fingerprint",
"timestamp",
"trace_id",
"span_id",
"trace_state",
"parent_span_id",
"flags",
"name",
"kind",
"kind_string",
"duration_nano",
"status_code",
"status_message",
"status_code_string",
"attributes_string",
"attributes_number",
"attributes_bool",
"resources_string",
"events",
"links",
"response_status_code",
"external_http_url",
"http_url",
"external_http_method",
"http_method",
"http_host",
"db_name",
"db_operation",
"has_error",
"is_remote",
"resource",
],
data=[trace.np_arr() for trace in traces],
)
# Insert error events
error_events: List[TracesErrorEvent] = []
for trace in traces:
error_events.extend(trace.error_events)
if len(error_events) > 0:
clickhouse.conn.insert(
database="signoz_traces",
table="distributed_signoz_error_index_v2",
data=[error_event.np_arr() for error_event in error_events],
)
insert_traces_to_clickhouse(clickhouse.conn, traces)
yield _insert_traces
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_traces.signoz_index_v3 ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_traces.traces_v3_resource ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_traces.tag_attributes_v2 ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_traces.span_attributes_keys ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
)
clickhouse.conn.query(
f"TRUNCATE TABLE signoz_traces.signoz_error_index_v2 ON CLUSTER '{clickhouse.env['SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER']}' SYNC"
truncate_traces_tables(
clickhouse.conn,
clickhouse.env["SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_CLUSTER"],
)

View File

@@ -3,7 +3,7 @@ import docker.errors
import pytest
from testcontainers.core.container import DockerContainer, Network
from fixtures import dev, types
from fixtures import reuse, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
@@ -58,7 +58,7 @@ def zookeeper(
def restore(cache: dict) -> types.TestContainerDocker:
return types.TestContainerDocker.from_cache(cache)
return dev.wrap(
return reuse.wrap(
request,
pytestconfig,
"zookeeper",

View File

@@ -1,110 +0,0 @@
from datetime import datetime, timezone
from http import HTTPStatus
from typing import Callable, List
import pytest
import requests
from fixtures import types
from fixtures.auth import USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD
from fixtures.logger import setup_logger
from fixtures.logs import Logs
from fixtures.metrics import Metrics
from fixtures.traces import Traces
from fixtures.utils import get_testdata_file_path
logger = setup_logger(__name__)
@pytest.fixture(name="create_alert_rule", scope="function")
def create_alert_rule(
signoz: types.SigNoz, get_token: Callable[[str, str], str]
) -> Callable[[dict], str]:
admin_token = get_token(USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD)
rule_ids = []
def _create_alert_rule(rule_data: dict) -> str:
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/rules"),
json=rule_data,
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert (
response.status_code == HTTPStatus.OK
), f"Failed to create rule, api returned {response.status_code} with response: {response.text}"
rule_id = response.json()["data"]["id"]
rule_ids.append(rule_id)
return rule_id
def _delete_alert_rule(rule_id: str):
logger.info("Deleting rule: %s", {"rule_id": rule_id})
response = requests.delete(
signoz.self.host_configs["8080"].get(f"/api/v1/rules/{rule_id}"),
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
if response.status_code != HTTPStatus.OK:
raise Exception( # pylint: disable=broad-exception-raised
f"Failed to delete rule, api returned {response.status_code} with response: {response.text}"
)
yield _create_alert_rule
# delete the rule on cleanup
for rule_id in rule_ids:
try:
_delete_alert_rule(rule_id)
except Exception as e: # pylint: disable=broad-exception-caught
logger.error("Error deleting rule: %s", {"rule_id": rule_id, "error": e})
@pytest.fixture(name="insert_alert_data", scope="function")
def insert_alert_data(
insert_metrics: Callable[[List[Metrics]], None],
insert_traces: Callable[[List[Traces]], None],
insert_logs: Callable[[List[Logs]], None],
) -> Callable[[List[types.AlertData]], None]:
def _insert_alert_data(
alert_data_items: List[types.AlertData],
base_time: datetime = None,
) -> None:
metrics: List[Metrics] = []
traces: List[Traces] = []
logs: List[Logs] = []
now = base_time or datetime.now(tz=timezone.utc).replace(
second=0, microsecond=0
)
for data_item in alert_data_items:
if data_item.type == "metrics":
_metrics = Metrics.load_from_file(
get_testdata_file_path(data_item.data_path),
base_time=now,
)
metrics.extend(_metrics)
elif data_item.type == "traces":
_traces = Traces.load_from_file(
get_testdata_file_path(data_item.data_path),
base_time=now,
)
traces.extend(_traces)
elif data_item.type == "logs":
_logs = Logs.load_from_file(
get_testdata_file_path(data_item.data_path),
base_time=now,
)
logs.extend(_logs)
# Add data to ClickHouse if any data is present
if len(metrics) > 0:
insert_metrics(metrics)
if len(traces) > 0:
insert_traces(traces)
if len(logs) > 0:
insert_logs(logs)
yield _insert_alert_data

View File

@@ -1,216 +0,0 @@
from http import HTTPStatus
from typing import Callable, List, Tuple
import pytest
import requests
from wiremock.resources.mappings import (
HttpMethods,
Mapping,
MappingRequest,
MappingResponse,
WireMockMatchers,
)
from fixtures import dev, types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
USER_ADMIN_NAME = "admin"
USER_ADMIN_EMAIL = "admin@integration.test"
USER_ADMIN_PASSWORD = "password123Z$"
USER_EDITOR_NAME = "editor"
USER_EDITOR_EMAIL = "editor@integration.test"
USER_EDITOR_PASSWORD = "password123Z$"
USER_VIEWER_NAME = "viewer"
USER_VIEWER_EMAIL = "viewer@integration.test"
USER_VIEWER_PASSWORD = "password123Z$"
@pytest.fixture(name="create_user_admin", scope="package")
def create_user_admin(
signoz: types.SigNoz, request: pytest.FixtureRequest, pytestconfig: pytest.Config
) -> types.Operation:
def create() -> None:
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/register"),
json={
"name": USER_ADMIN_NAME,
"orgName": "",
"email": USER_ADMIN_EMAIL,
"password": USER_ADMIN_PASSWORD,
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
return types.Operation(name="create_user_admin")
def delete(_: types.Operation) -> None:
pass
def restore(cache: dict) -> types.Operation:
return types.Operation(name=cache["name"])
return dev.wrap(
request,
pytestconfig,
"create_user_admin",
lambda: types.Operation(name=""),
create,
delete,
restore,
)
@pytest.fixture(name="get_session_context", scope="function")
def get_session_context(signoz: types.SigNoz) -> Callable[[str, str], str]:
def _get_session_context(email: str) -> str:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
return response.json()["data"]
return _get_session_context
@pytest.fixture(name="get_token", scope="function")
def get_token(signoz: types.SigNoz) -> Callable[[str, str], str]:
def _get_token(email: str, password: str) -> str:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
org_id = response.json()["data"]["orgs"][0]["id"]
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v2/sessions/email_password"),
json={
"email": email,
"password": password,
"orgId": org_id,
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
return response.json()["data"]["accessToken"]
return _get_token
@pytest.fixture(name="get_tokens", scope="function")
def get_tokens(signoz: types.SigNoz) -> Callable[[str, str], Tuple[str, str]]:
def _get_tokens(email: str, password: str) -> str:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v2/sessions/context"),
params={
"email": email,
"ref": f"{signoz.self.host_configs['8080'].base()}",
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
org_id = response.json()["data"]["orgs"][0]["id"]
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v2/sessions/email_password"),
json={
"email": email,
"password": password,
"orgId": org_id,
},
timeout=5,
)
assert response.status_code == HTTPStatus.OK
access_token = response.json()["data"]["accessToken"]
refresh_token = response.json()["data"]["refreshToken"]
return access_token, refresh_token
return _get_tokens
# This is not a fixture purposefully, we just want to add a license to the signoz instance.
# This is also idempotent in nature.
def add_license(
signoz: types.SigNoz,
make_http_mocks: Callable[[types.TestContainerDocker, List[Mapping]], None],
get_token: Callable[[str, str], str], # pylint: disable=redefined-outer-name
) -> None:
make_http_mocks(
signoz.zeus,
[
Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v2/licenses/me",
headers={
"X-Signoz-Cloud-Api-Key": {
WireMockMatchers.EQUAL_TO: "secret-key"
}
},
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"id": "0196360e-90cd-7a74-8313-1aa815ce2a67",
"key": "secret-key",
"valid_from": 1732146923,
"valid_until": -1,
"status": "VALID",
"state": "EVALUATING",
"plan": {
"name": "ENTERPRISE",
},
"platform": "CLOUD",
"features": [],
"event_queue": {},
},
},
),
persistent=False,
)
],
)
access_token = get_token(USER_ADMIN_EMAIL, USER_ADMIN_PASSWORD)
response = requests.post(
url=signoz.self.host_configs["8080"].get("/api/v3/licenses"),
json={"key": "secret-key"},
headers={"Authorization": "Bearer " + access_token},
timeout=5,
)
if response.status_code == HTTPStatus.CONFLICT:
return
assert response.status_code == HTTPStatus.ACCEPTED
response = requests.post(
url=signoz.zeus.host_configs["8080"].get("/__admin/requests/count"),
json={"method": "GET", "url": "/v2/licenses/me"},
timeout=5,
)
assert response.json()["count"] == 1

View File

@@ -1,115 +0,0 @@
"""Reusable helpers for user API tests."""
from http import HTTPStatus
from typing import Dict
import requests
from fixtures import types
USERS_BASE = "/api/v2/users"
def create_active_user(
signoz: types.SigNoz,
admin_token: str,
email: str,
role: str,
password: str,
name: str = "",
) -> str:
"""Invite a user and activate via resetPassword. Returns user ID."""
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/invite"),
json={"email": email, "role": role, "name": name},
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.CREATED, response.text
invited_user = response.json()["data"]
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/resetPassword"),
json={"password": password, "token": invited_user["token"]},
timeout=5,
)
assert response.status_code == HTTPStatus.NO_CONTENT, response.text
return invited_user["id"]
def find_user_by_email(signoz: types.SigNoz, token: str, email: str) -> Dict:
"""Find a user by email from the user list. Raises AssertionError if not found."""
response = requests.get(
signoz.self.host_configs["8080"].get(USERS_BASE),
headers={"Authorization": f"Bearer {token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text
user = next((u for u in response.json()["data"] if u["email"] == email), None)
assert user is not None, f"User with email '{email}' not found"
return user
def find_user_with_roles_by_email(signoz: types.SigNoz, token: str, email: str) -> Dict:
"""Find a user by email and return UserWithRoles (user fields + userRoles).
Raises AssertionError if the user is not found.
"""
user = find_user_by_email(signoz, token, email)
response = requests.get(
signoz.self.host_configs["8080"].get(f"{USERS_BASE}/{user['id']}"),
headers={"Authorization": f"Bearer {token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text
return response.json()["data"]
def assert_user_has_role(data: Dict, role_name: str) -> None:
"""Assert that a UserWithRoles response contains the expected managed role."""
role_names = {ur["role"]["name"] for ur in data.get("userRoles", [])}
assert role_name in role_names, f"Expected role '{role_name}' in {role_names}"
def change_user_role(
signoz: types.SigNoz,
admin_token: str,
user_id: str,
old_role: str,
new_role: str,
) -> None:
"""Change a user's role (remove old, assign new).
Role names should be managed role names (e.g. signoz-editor).
"""
# Get current roles to find the old role's ID
response = requests.get(
signoz.self.host_configs["8080"].get(f"{USERS_BASE}/{user_id}/roles"),
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text
roles = response.json()["data"]
old_role_entry = next((r for r in roles if r["name"] == old_role), None)
assert old_role_entry is not None, f"User does not have role '{old_role}'"
# Remove old role
response = requests.delete(
signoz.self.host_configs["8080"].get(
f"{USERS_BASE}/{user_id}/roles/{old_role_entry['id']}"
),
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.NO_CONTENT, response.text
# Assign new role
response = requests.post(
signoz.self.host_configs["8080"].get(f"{USERS_BASE}/{user_id}/roles"),
json={"name": new_role},
headers={"Authorization": f"Bearer {admin_token}"},
timeout=5,
)
assert response.status_code == HTTPStatus.OK, response.text

View File

@@ -1,154 +0,0 @@
"""Fixtures for cloud integration tests."""
from typing import Callable
import requests
from wiremock.client import (
HttpMethods,
Mapping,
MappingRequest,
MappingResponse,
WireMockMatchers,
)
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def deprecated_simulate_agent_checkin(
signoz: types.SigNoz,
admin_token: str,
cloud_provider: str,
account_id: str,
cloud_account_id: str,
) -> requests.Response:
endpoint = f"/api/v1/cloud-integrations/{cloud_provider}/agent-check-in"
checkin_payload = {
"account_id": account_id,
"cloud_account_id": cloud_account_id,
"data": {},
}
response = requests.post(
signoz.self.host_configs["8080"].get(endpoint),
headers={"Authorization": f"Bearer {admin_token}"},
json=checkin_payload,
timeout=10,
)
if not response.ok:
logger.error(
"Agent check-in failed: %s, response: %s",
response.status_code,
response.text,
)
return response
def setup_create_account_mocks(
signoz: types.SigNoz,
make_http_mocks: Callable,
) -> None:
"""Set up Zeus and Gateway mocks required by the CreateAccount endpoint."""
make_http_mocks(
signoz.zeus,
[
Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v2/deployments/me",
headers={
"X-Signoz-Cloud-Api-Key": {
WireMockMatchers.EQUAL_TO: "secret-key"
}
},
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"name": "test-deployment",
"cluster": {"region": {"dns": "test.signoz.cloud"}},
},
},
),
persistent=False,
)
],
)
make_http_mocks(
signoz.gateway,
[
Mapping(
request=MappingRequest(
method=HttpMethods.GET,
url="/v1/workspaces/me/keys/search?name=aws-integration&page=1&per_page=10",
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": [],
"_pagination": {"page": 1, "per_page": 10, "total": 0},
},
),
persistent=False,
),
Mapping(
request=MappingRequest(
method=HttpMethods.POST,
url="/v1/workspaces/me/keys",
),
response=MappingResponse(
status=200,
json_body={
"status": "success",
"data": {
"name": "aws-integration",
"value": "test-ingestion-key-123456",
},
"error": "",
},
),
persistent=False,
),
],
)
def simulate_agent_checkin(
signoz: types.SigNoz,
admin_token: str,
cloud_provider: str,
account_id: str,
cloud_account_id: str,
data: dict | None = None,
) -> requests.Response:
endpoint = f"/api/v1/cloud_integrations/{cloud_provider}/accounts/check_in"
checkin_payload = {
"cloudIntegrationId": account_id,
"providerAccountId": cloud_account_id,
"data": data or {},
}
response = requests.post(
signoz.self.host_configs["8080"].get(endpoint),
headers={"Authorization": f"Bearer {admin_token}"},
json=checkin_payload,
timeout=10,
)
if not response.ok:
logger.error(
"Agent check-in failed: %s, response: %s",
response.status_code,
response.text,
)
return response

View File

@@ -1,15 +0,0 @@
from typing import Any, Generator
import pytest
from fixtures import types
@pytest.fixture(scope="package")
def tmpfs(
tmp_path_factory: pytest.TempPathFactory,
) -> Generator[types.LegacyPath, Any, None]:
def _tmp(basename: str):
return tmp_path_factory.mktemp(basename)
yield _tmp

View File

@@ -1,256 +0,0 @@
from datetime import datetime, timedelta, timezone
from http import HTTPStatus
from typing import List
import requests
from fixtures.logs import Logs
from fixtures.traces import TraceIdGenerator, Traces, TracesKind, TracesStatusCode
def format_timestamp(dt: datetime) -> str:
"""
Format a datetime object to match the API's timestamp format.
The API returns timestamps with minimal fractional seconds precision.
Example: 2026-02-03T20:54:56.5Z for 500000 microseconds
"""
base_str = dt.strftime("%Y-%m-%dT%H:%M:%S")
if dt.microsecond:
# Convert microseconds to fractional seconds and strip trailing zeros
fractional = f"{dt.microsecond / 1000000:.6f}"[2:].rstrip("0")
return f"{base_str}.{fractional}Z"
return f"{base_str}Z"
def assert_identical_query_response(
response1: requests.Response, response2: requests.Response
) -> None:
"""
Assert that two query responses are identical in status and data.
"""
assert response1.status_code == response2.status_code, "Status codes do not match"
if response1.status_code == HTTPStatus.OK:
assert (
response1.json()["status"] == response2.json()["status"]
), "Response statuses do not match"
assert (
response1.json()["data"]["data"]["results"]
== response2.json()["data"]["data"]["results"]
), "Response data do not match"
def generate_logs_with_corrupt_metadata() -> List[Logs]:
"""
Specifically, entries with 'id', 'timestamp', 'severity_text', 'severity_number' and 'body' fields in metadata
"""
now = datetime.now(tz=timezone.utc).replace(second=0, microsecond=0)
return [
Logs(
timestamp=now - timedelta(seconds=4),
body="POST /integration request received",
severity_text="INFO",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"timestamp": "corrupt_data",
},
attributes={
"net.transport": "IP.TCP",
"http.scheme": "http",
"http.user_agent": "Integration Test",
"http.request.method": "POST",
"http.response.status_code": "200",
"severity_text": "corrupt_data",
"timestamp": "corrupt_data",
},
trace_id="1",
),
Logs(
timestamp=now - timedelta(seconds=3),
body="SELECT query executed",
severity_text="DEBUG",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"severity_number": "corrupt_data",
"id": "corrupt_data",
},
attributes={
"db.name": "integration",
"db.operation": "SELECT",
"db.statement": "SELECT * FROM integration",
"trace_id": "2",
},
),
Logs(
timestamp=now - timedelta(seconds=2),
body="HTTP PATCH failed with 404",
severity_text="WARN",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"body": "corrupt_data",
"trace_id": "3",
},
attributes={
"http.request.method": "PATCH",
"http.status_code": "404",
"id": "1",
},
),
Logs(
timestamp=now - timedelta(seconds=1),
body="{'trace_id': '4'}",
severity_text="ERROR",
resources={
"deployment.environment": "production",
"service.name": "topic-service",
"os.type": "linux",
"host.name": "linux-001",
"cloud.provider": "integration",
"cloud.account.id": "001",
},
attributes={
"message.type": "SENT",
"messaging.operation": "publish",
"messaging.message.id": "001",
"body": "corrupt_data",
"timestamp": "corrupt_data",
},
),
]
def generate_traces_with_corrupt_metadata() -> List[Traces]:
"""
Specifically, entries with 'id', 'timestamp', 'trace_id' and 'duration_nano' fields in metadata
"""
http_service_trace_id = TraceIdGenerator.trace_id()
http_service_span_id = TraceIdGenerator.span_id()
http_service_db_span_id = TraceIdGenerator.span_id()
http_service_patch_span_id = TraceIdGenerator.span_id()
topic_service_trace_id = TraceIdGenerator.trace_id()
topic_service_span_id = TraceIdGenerator.span_id()
now = datetime.now(tz=timezone.utc).replace(second=0, microsecond=0)
return [
Traces(
timestamp=now - timedelta(seconds=4),
duration=timedelta(seconds=3),
trace_id=http_service_trace_id,
span_id=http_service_span_id,
parent_span_id="",
name="POST /integration",
kind=TracesKind.SPAN_KIND_SERVER,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"trace_id": "corrupt_data",
},
attributes={
"net.transport": "IP.TCP",
"http.scheme": "http",
"http.user_agent": "Integration Test",
"http.request.method": "POST",
"http.response.status_code": "200",
"timestamp": "corrupt_data",
},
),
Traces(
timestamp=now - timedelta(seconds=3.5),
duration=timedelta(seconds=5),
trace_id=http_service_trace_id,
span_id=http_service_db_span_id,
parent_span_id=http_service_span_id,
name="SELECT",
kind=TracesKind.SPAN_KIND_CLIENT,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"timestamp": "corrupt_data",
},
attributes={
"db.name": "integration",
"db.operation": "SELECT",
"db.statement": "SELECT * FROM integration",
"trace_d": "corrupt_data",
},
),
Traces(
timestamp=now - timedelta(seconds=3),
duration=timedelta(seconds=1),
trace_id=http_service_trace_id,
span_id=http_service_patch_span_id,
parent_span_id=http_service_span_id,
name="HTTP PATCH",
kind=TracesKind.SPAN_KIND_CLIENT,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "http-service",
"os.type": "linux",
"host.name": "linux-000",
"cloud.provider": "integration",
"cloud.account.id": "000",
"duration_nano": "corrupt_data",
},
attributes={
"http.request.method": "PATCH",
"http.status_code": "404",
"id": "1",
},
),
Traces(
timestamp=now - timedelta(seconds=1),
duration=timedelta(seconds=4),
trace_id=topic_service_trace_id,
span_id=topic_service_span_id,
parent_span_id="",
name="topic publish",
kind=TracesKind.SPAN_KIND_PRODUCER,
status_code=TracesStatusCode.STATUS_CODE_OK,
status_message="",
resources={
"deployment.environment": "production",
"service.name": "topic-service",
"os.type": "linux",
"host.name": "linux-001",
"cloud.provider": "integration",
"cloud.account.id": "001",
},
attributes={
"message.type": "SENT",
"messaging.operation": "publish",
"messaging.message.id": "001",
"duration_nano": "corrupt_data",
"id": 1,
},
),
]

View File

@@ -7,12 +7,12 @@ import pytest
from wiremock.client import HttpMethods, Mapping, MappingRequest, MappingResponse
from fixtures import types
from fixtures.alertutils import (
from fixtures.alerts import (
update_rule_channel_name,
verify_webhook_alert_expectation,
)
from fixtures.fs import get_testdata_file_path
from fixtures.logger import setup_logger
from fixtures.utils import get_testdata_file_path
# Alert test cases use a 30-second wait time to verify expected alert firing.
# Alert data is set up to trigger on the first rule manager evaluation.

Some files were not shown because too many files have changed in this diff Show More