signoz/docs/contributing/tests/e2e.md
Ashwin Bhatkal ae2127afe8
2026-05-07 14:55:38 +00:00

E2E Tests

SigNoz uses end-to-end tests to verify the frontend works correctly against a real backend. These tests use Playwright to drive a real browser against a containerized SigNoz stack that pytest brings up — the same fixture graph integration tests use, with an extra HTTP seeder container for per-spec telemetry seeding.

How to set up the E2E test environment?

Prerequisites

Before running E2E tests, ensure you have the following installed:

  • Python 3.13+
  • uv
  • Docker (for containerized services)
  • Node 18+ and Yarn

Initial Setup

  1. Install Python deps for the shared tests project:
cd tests
uv sync
  2. Install Node deps and Playwright browsers:
cd e2e
yarn install
yarn install:browsers   # one-time Playwright browser install

Starting the Test Environment

To spin up the backend stack (SigNoz, ClickHouse, Postgres, Zookeeper, Zeus mock, gateway mock, seeder, migrator-with-web) and keep it running:

cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse --with-web \
  e2e/bootstrap/setup.py::test_setup

This command will:

  • Bring up all containers via pytest fixtures
  • Register the admin user (admin@integration.test / password123Z$)
  • Apply the enterprise license (via a WireMock stub of Zeus) and dismiss the org-onboarding prompt so specs can navigate directly to feature pages
  • Start the HTTP seeder container (tests/seeder/ — exposing /telemetry/{traces,logs,metrics} POST + DELETE)
  • Write backend coordinates to tests/e2e/.env.local (loaded by playwright.config.ts via dotenv)
  • Keep containers running via the --reuse flag

The --with-web flag builds the frontend into the SigNoz container — required for E2E. The build takes ~4 mins on a cold start.

Stopping the Test Environment

When you're done writing E2E tests, clean up the environment:

cd tests
uv run pytest --basetemp=./tmp/ -vv --teardown \
  e2e/bootstrap/setup.py::test_teardown

Understanding the E2E Test Framework

Playwright drives a real browser (Chromium / Firefox / WebKit) against the running SigNoz frontend. The backend is brought up by the same pytest fixture graph integration tests use, so both suites share one source of truth for container lifecycle, license seeding, and test-user accounts.

  • Why Playwright? First-class TypeScript support, network interception, automatic wait-for-visibility, built-in trace viewer that captures every request/response the UI triggers — so specs rarely need separate API probes alongside UI clicks.
  • Why pytest for lifecycle? The integration suite already owns container bring-up. Reusing it keeps the E2E stack exactly in sync with the integration stack and avoids a parallel lifecycle framework.
  • Why a separate seeder container? Per-spec telemetry seeding (traces / logs / metrics) needs a thin HTTP wrapper around the ClickHouse insert helpers so a browser spec can POST from inside the test. The seeder lives at tests/seeder/, is built from tests/Dockerfile.seeder, and reuses the same fixtures/{traces,logs,metrics}.py as integration tests.

The shared tests/ project is laid out as follows:

tests/
├── fixtures/                  # shared with integration (see integration.md)
├── integration/               # pytest integration suite
├── seeder/                    # standalone HTTP seeder container
│   ├── __init__.py
│   ├── Dockerfile
│   └── server.py              # FastAPI app wrapping fixtures.{traces,logs,metrics}
└── e2e/
    ├── package.json
    ├── playwright.config.ts   # loads .env + .env.local via dotenv
    ├── .env.example           # staging-mode template
    ├── .env.local             # generated by bootstrap/setup.py (gitignored)
    ├── bootstrap/
    │   └── setup.py           # test_setup / test_teardown — pytest lifecycle
    ├── fixtures/              # Playwright test fixtures (test.extend) only
    │   └── auth.ts
    ├── helpers/               # function helpers + the constants they share with tests
    │   ├── auth.ts
    │   └── dashboards.ts
    ├── testdata/              # static data files (JSON) used by helpers and tests
    │   └── apm-metrics.json   # (example)
    ├── tests/                 # Playwright .spec.ts files, one dir per feature area
    │   └── alerts/
    │       └── alerts.spec.ts # (example)
    └── artifacts/             # per-run output (gitignored)
        ├── html/              # HTML reporter output
        ├── json/              # JSON reporter output
        └── results/           # per-test traces / screenshots / videos on failure
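
As a rough illustration of how a spec might address the seeder container, the sketch below builds the endpoint URL for each signal from a base URL such as the one stored in SIGNOZ_E2E_SEEDER_URL. The helper name seederEndpoint and the example base URL are assumptions made for illustration. The request body shape is defined by tests/seeder/server.py and is deliberately not shown here.

```typescript
// Hypothetical helper, not part of the repo: maps a telemetry signal to the
// seeder route described above (/telemetry/{traces,logs,metrics}).
type TelemetrySignal = 'traces' | 'logs' | 'metrics';

function seederEndpoint(baseUrl: string, signal: TelemetrySignal): string {
  // new URL resolves the absolute path against the origin of the base URL.
  return new URL(`/telemetry/${signal}`, baseUrl).toString();
}

// The base URL here is an assumed example; in local mode the real value is
// written to tests/e2e/.env.local as SIGNOZ_E2E_SEEDER_URL.
const tracesUrl = seederEndpoint('http://localhost:8080', 'traces');
```

A spec could then POST to this URL to seed data and send DELETE to the same URL to clean up, for example through page.request.post(tracesUrl, { data }).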

fixtures/ vs helpers/ — what goes where

These two folders look similar but mean different things:

  • fixtures/ holds Playwright test fixtures (created via test.extend({...})). By the canonical definition, a fixture is "a consistent, predefined set of data, objects, or environmental conditions used to ensure tests run in a stable state" — i.e. setup/teardown that runs automatically around each test or worker. auth.ts matches: it extends Playwright's test with an authedPage that's logged-in before every test runs and torn down after. If the only thing in this folder ever is auth.ts, that's fine — fixtures are a deliberately small surface.
  • helpers/ holds plain function helpers that you call explicitly from a test or hook — they don't extend Playwright's test. This covers both behaviour helpers (e.g. gotoDashboardsList(page)) and the constants those helpers and the tests both refer to (e.g. SEARCH_PLACEHOLDER). Constants live next to the helpers that use them so a single import line in a test covers both.
  • testdata/ holds static data files (typically JSON / YAML) consumed by the helpers — for example, apm-metrics.json, a real dashboard payload uploaded through the UI by an importer helper.

Rule of thumb: if it's a test.extend fixture, put it in fixtures/. If it's a function you call explicitly (or a constant the function uses), put it in helpers/. If it's a static file the helpers read, put it in testdata/.
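
To make the helpers/ side of the rule concrete, here is a minimal sketch of what such a module might look like: constants next to the plain function that uses them. The constant values and the PageLike interface are assumptions for illustration, not the contents of helpers/dashboards.ts. A real helper would accept Playwright's Page type and export its members.

```typescript
// Minimal stand-in for Playwright's Page so the sketch has no dependencies.
// A real helper would import Page from '@playwright/test'.
interface PageLike {
  goto(url: string): Promise<unknown>;
}

// Assumed values for illustration only; check helpers/dashboards.ts for the real ones.
const DASHBOARDS_PATH = '/dashboard';
const SEARCH_PLACEHOLDER = 'Search by name...';

// A helper is called explicitly from a test or hook; nothing runs automatically,
// which is what distinguishes it from a test.extend fixture.
async function gotoDashboardsList(page: PageLike): Promise<void> {
  await page.goto(DASHBOARDS_PATH);
}
```

Because the constant and the function live in one module, a spec would need only one import line to use both.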

Each spec follows these principles:

  1. Directory per feature: tests/e2e/tests/<feature>/*.spec.ts. Cross-resource junction concerns (e.g. cascade-delete) go in their own file, not packed into one giant spec.
  2. Test titles use TC-NN: test('TC-01 alerts page — tabs render', ...). Preserves ordering at a glance and maps to external coverage tracking.
  3. UI-first: drive flows through the UI. Playwright traces capture every BE request/response the UI triggers, so asserting on UI outcomes implicitly validates BE contracts. Reach for direct page.request.* only when the test's purpose is asserting a response contract (use page.waitForResponse on a UI click) or when a specific UI step is structurally flaky (e.g. Ant DatePicker calendar-cell indices) — and even then try UI first.
  4. Self-contained state: each spec seeds its own data and cleans up at suite teardown. The pytest harness creates a fresh stack with zero dashboards / alerts / etc. — never assume pre-existing data. Two patterns work:
    • Per-test seed + cleanup in try / finally — small specs where each test owns its data.
    • Suite-level seed + afterAll teardown — preferred for larger specs. Each createDashboard(...) call adds the resulting ID to a module-level Set<string>, and one test.afterAll(...) deletes everything in the set. See tests/e2e/tests/dashboards/list.spec.ts for the full pattern. test.beforeAll / test.afterAll cannot use authedPage directly (it's test-scoped); use newAdminContext(browser) from helpers/auth.ts instead — it performs one fresh login per suite hook.
  5. Seed via API when the UI flow is multi-step or brittle. The frontend stores its JWT in localStorage under AUTH_TOKEN; page.request.* inherits the auth fixture's storage state. A typical pattern:
    const token = await page.evaluate(
      () => (globalThis as any).localStorage.getItem('AUTH_TOKEN') || '',
    );
    await page.request.post('/api/v1/dashboards', {
      data: { title: 'my-name', uploadedGrafana: false },
      headers: { Authorization: `Bearer ${token}` },
    });
    
    This is faster and more reliable than a multi-step UI seed. Reach for the UI flow only when the test's purpose is asserting that flow.
  6. Reusable static data lives in tests/e2e/testdata/. For example, apm-metrics.json is a real dashboard payload that importApmMetricsDashboardViaUI (in helpers/dashboards.ts) uploads through the actual Import JSON UI flow to seed a richly-tagged dashboard for search/list tests.
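
The suite-level seed and afterAll teardown pattern from principle 4 can be sketched without any Playwright dependency. The names trackCreated and cleanupCreated are hypothetical and are not taken from tests/e2e/tests/dashboards/list.spec.ts. The create and remove callbacks stand in for whatever API or UI calls a spec actually uses.

```typescript
// Module-level registry of resource IDs created while the suite runs.
const createdIds = new Set<string>();

// Wraps any create call that resolves to the new resource's ID and records it.
async function trackCreated(create: () => Promise<string>): Promise<string> {
  const id = await create();
  createdIds.add(id);
  return id;
}

// Deletes everything recorded so far, in creation order, then empties the registry.
// Intended to be called once from a suite-level teardown hook.
async function cleanupCreated(remove: (id: string) => Promise<void>): Promise<void> {
  for (const id of Array.from(createdIds)) {
    await remove(id);
  }
  createdIds.clear();
}
```

In a spec, the create callback would typically wrap a call such as createDashboard(...), and cleanupCreated would run inside test.afterAll with a context obtained from newAdminContext(browser). The exact signatures of those helpers are not shown here.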

How to write an E2E test?

Create a new file tests/e2e/tests/alerts/smoke.spec.ts:

import { test, expect } from '../../fixtures/auth';

test('TC-01 alerts page — tabs render', async ({ authedPage: page }) => {
  await page.goto('/alerts');
  await expect(page.getByRole('tab', { name: /alert rules/i })).toBeVisible();
  await expect(page.getByRole('tab', { name: /configuration/i })).toBeVisible();
});

The authedPage fixture (from tests/e2e/fixtures/auth.ts) gives you a Page whose browser context is already authenticated as the admin user. First use per worker triggers one login; the resulting storageState is held in memory and reused for later requests.
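
The "one login per worker, then reuse" behaviour described above amounts to memoizing a promise. The sketch below shows that idea only. It is not the code in tests/e2e/fixtures/auth.ts, and the names getStorageState and StorageStateLike are assumptions.

```typescript
// Stands in for the object Playwright returns from context.storageState();
// only the shape needed for the sketch is modelled.
interface StorageStateLike {
  cookies: unknown[];
  origins: unknown[];
}

// Each Playwright worker is a separate process, so module-level state like
// this is naturally scoped to one worker.
let cachedState: Promise<StorageStateLike> | undefined;

function getStorageState(
  login: () => Promise<StorageStateLike>,
): Promise<StorageStateLike> {
  if (!cachedState) {
    // The first caller triggers the login; concurrent and later callers
    // share the same promise instead of logging in again.
    cachedState = login();
  }
  return cachedState;
}
```

Caching the promise rather than the resolved value means that two tests starting at the same moment still trigger only one login.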

To run just this test (assuming the stack is up via test_setup):

cd tests/e2e
npx playwright test tests/alerts/smoke.spec.ts --project=chromium

Here's a more comprehensive example that exercises a CRUD flow via the UI:

import { test, expect } from '../../fixtures/auth';

test.describe.configure({ mode: 'serial' });

test('TC-02 alerts list — create, toggle, delete', async ({ authedPage: page }) => {
  await page.goto('/alerts?tab=AlertRules');
  const name = 'smoke-rule';

  // Seed via UI — click "New Alert", fill form, save.
  await page.getByRole('button', { name: /new alert/i }).click();
  await page.getByTestId('alert-name-input').fill(name);
  // ... fill metric / threshold / save ...

  // Find the row and exercise the action menu.
  const row = page.locator('tr', { hasText: name });
  await expect(row).toBeVisible();
  await row.locator('[data-testid="alert-actions"] button').first().click();

  // waitForResponse captures the network call the UI triggers — no parallel fetch needed.
  const patchWait = page.waitForResponse(
    (r) => r.url().includes('/rules/') && r.request().method() === 'PATCH',
  );
  await page.getByRole('menuitem').filter({ hasText: /^disable$/i }).click();
  await patchWait;
  await expect(row).toContainText(/disabled/i);
});

Locator priority

  1. getByTestId('...') — preferred when the source exposes one. Stable, app-author-provided handle that survives copy-edits.
  2. getByRole('button', { name: 'Submit' })
  3. getByLabel('Email')
  4. getByPlaceholder('...')
  5. getByText('...')
  6. locator('.ant-select') — last resort (Ant Design dropdowns often have no semantic alternative)

Agents

Three Claude agents in .claude/agents/ accelerate writing and maintaining E2E specs:

  • playwright-test-planner — explores a feature in a real browser plus the local frontend source and writes a test plan as a scratch markdown file (under tests/e2e/specs/, which is gitignored — plans are working artifacts for the generator, not committed docs).
  • playwright-test-generator — converts a test plan into Playwright spec files under tests/e2e/tests/<feature>/. Drives each scenario through MCP browser tools and emits TC-NN-titled tests using the authedPage fixture and the API-seed pattern.
  • playwright-test-healer — runs failing specs, debugs them with snapshots / console / network introspection, and edits the spec to fix selector drift, timing, or state-leak issues.

The agents rely on the Playwright-test MCP server (mcp__playwright-test__* tools). Configure it in your Claude MCP settings; the permission allowlist lives in .claude/settings.local.json.

How to run E2E tests?

Running All Tests

With the stack already up, from tests/e2e/:

yarn test                 # headless, all projects

Running Specific Projects

yarn test:chromium        # chromium only
yarn test:firefox
yarn test:webkit

Running Specific Tests

cd tests/e2e

# Single feature dir
npx playwright test tests/alerts/ --project=chromium

# Single file
npx playwright test tests/alerts/alerts.spec.ts --project=chromium

# Single test by title grep
npx playwright test --project=chromium -g "TC-01"

Iterative modes

yarn test:ui              # Playwright UI mode — watch + step through
yarn test:headed          # headed browser
yarn test:debug           # Playwright inspector, pause-on-breakpoint
yarn codegen              # record-and-replay locator generation
yarn report               # open the last HTML report (artifacts/html)

Staging fallback

Point SIGNOZ_E2E_BASE_URL at a remote env via .env — no local backend bring-up, no .env.local generated, Playwright hits the URL directly:

cd tests/e2e
cp .env.example .env      # fill SIGNOZ_E2E_USERNAME / PASSWORD
yarn test:staging

How to configure different options for E2E tests?

Environment variables

| Variable | Description |
|---|---|
| SIGNOZ_E2E_BASE_URL | Base URL the browser targets. Written by bootstrap/setup.py for local mode; set manually for staging. |
| SIGNOZ_E2E_USERNAME | Admin email. Bootstrap writes admin@integration.test. |
| SIGNOZ_E2E_PASSWORD | Admin password. Bootstrap writes the integration-test default. |
| SIGNOZ_E2E_SEEDER_URL | Seeder HTTP base URL, hit by specs that need per-test telemetry. |

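Loading order in playwright.config.ts: .env first (user-provided, staging), then .env.local with override: true (bootstrap-generated, local mode). Variables already set in process.env at yarn-test time take precedence over .env, because dotenv leaves existing variables untouched by default. Values in .env.local are loaded with override: true, so they replace anything set earlier, including shell exports.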
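
The precedence between the two files can be modelled with a small pure function. This is a sketch of dotenv's documented rule (keys already present are kept unless override is set), not the code in playwright.config.ts. The function name applyEnvFile and all values below are hypothetical.

```typescript
type Env = Record<string, string>;

// Models one dotenv load: keys already present are kept unless override is true.
function applyEnvFile(current: Env, fileValues: Env, override: boolean): Env {
  const next: Env = { ...current };
  for (const [key, value] of Object.entries(fileValues)) {
    if (override || !(key in next)) {
      next[key] = value;
    }
  }
  return next;
}

// Hypothetical values for illustration.
const shellEnv: Env = { SIGNOZ_E2E_USERNAME: 'me@example.com' };
const dotEnv: Env = {
  SIGNOZ_E2E_BASE_URL: 'https://staging.example.com',
  SIGNOZ_E2E_USERNAME: 'staging@example.com',
};
const dotEnvLocal: Env = { SIGNOZ_E2E_BASE_URL: 'http://localhost:8080' };

// .env is applied first without override, then .env.local with override: true.
const afterEnv = applyEnvFile(shellEnv, dotEnv, false);
const resolved = applyEnvFile(afterEnv, dotEnvLocal, true);
// resolved.SIGNOZ_E2E_BASE_URL is the .env.local value, and
// resolved.SIGNOZ_E2E_USERNAME is still the value exported in the shell.
```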

Playwright options

The full playwright.config.ts is the source of truth. Common things to tweak:

  • projects — Chromium / Firefox / WebKit are enabled by default. Disable to speed up iteration.
  • retries: 2 on CI (process.env.CI), 0 locally.
  • fullyParallel: true — files run in parallel by worker; within a file, use test.describe.configure({ mode: 'serial' }) if tests share list pages / mutate shared state.
  • trace: 'on-first-retry', screenshot: 'only-on-failure', video: 'retain-on-failure' — default diagnostic artifacts land in artifacts/results/<test>/.
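
For orientation, the options listed above could be assembled roughly as follows. This is a sketch under assumptions, not a copy of playwright.config.ts: the function name sketchOptions is hypothetical, and the outputDir value is inferred from the artifacts path mentioned above.

```typescript
// Builds the option values described above. In a real config, isCI would be
// derived from process.env.CI and the result passed to defineConfig from
// '@playwright/test' together with the projects list.
function sketchOptions(isCI: boolean) {
  return {
    fullyParallel: true, // spec files run in parallel across workers
    retries: isCI ? 2 : 0, // retry only on CI
    outputDir: 'artifacts/results', // assumed relative path for per-test artifacts
    use: {
      trace: 'on-first-retry' as const,
      screenshot: 'only-on-failure' as const,
      video: 'retain-on-failure' as const,
    },
  };
}

const localOptions = sketchOptions(false);
const ciOptions = sketchOptions(true);
```

Because trace is 'on-first-retry' and retries is 0 locally, a failing local run produces screenshots and video but no trace unless retries are raised or the trace mode is changed.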

Pytest options (bootstrap side)

The same pytest flags integration tests expose work here, since E2E reuses the shared fixture graph:

  • --reuse — keep containers warm between runs (required for all iteration).
  • --teardown — tear everything down.
  • --with-web — build the frontend into the SigNoz container. Required for E2E; integration tests don't need it.
  • --sqlstore-provider, --postgres-version, --clickhouse-version, etc. — see docs/contributing/integration.md.

What should I remember?

  • Always use the --reuse flag when setting up the E2E stack. --with-web adds a ~4 min frontend build; you only want to pay that once.
  • Don't tear down before setup. --reuse correctly handles partially-set-up state, so chaining teardown → setup wastes time.
  • Prefer UI-driven flows. Playwright captures BE requests in the trace; a parallel fetch probe is almost always redundant. Drop to page.request.* only when the UI can't reach what you need.
  • Use page.waitForResponse on UI clicks to assert BE contracts — it still exercises the UI trigger path.
  • Title every test TC-NN <short description> — keeps the suite navigable and reportable.
  • Split by resource, not by regression suite. One spec per feature resource; cross-resource junction concerns (cascade-delete, linked-edit) get their own file.
  • Use short descriptive resource names (alerts-list-rule, labels-rule, downtime-once) — no timestamp disambiguation. Each test owns its resources and cleans up in try/finally.
  • Never commit test.only. A pre-commit check or CI runs with forbidOnly: true, which fails the run if a test.only is present.
  • Prefer explicit waits over page.waitForTimeout(ms). await expect(locator).toBeVisible() is always better than waitForTimeout(5000).
  • Unique test names won't save you from shared-tenant state. When two tests hit the same list page, either serialize (describe.configure({ mode: 'serial' })) or isolate cleanup religiously.
  • Artifacts go to tests/e2e/artifacts/ — HTML report at artifacts/html, traces at artifacts/results/<test>/. All gitignored; archive the dir in CI.
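
The TC-NN title convention above is easy to enforce mechanically. The two helpers below are hypothetical and are not part of the repo. They sketch one way to build and check such titles, assuming at least two digits followed by a space and a description.

```typescript
// Matches "TC-" + two or more digits + a space + a non-empty description.
const TC_TITLE = /^TC-\d{2,} \S.*$/;

// Returns true when a test title follows the TC-NN <short description> shape.
function isTcTitle(title: string): boolean {
  return TC_TITLE.test(title);
}

// Builds a title with a zero-padded test-case number.
function tcTitle(n: number, description: string): string {
  return `TC-${String(n).padStart(2, '0')} ${description}`;
}

const exampleTitle = tcTitle(1, 'alerts page tabs render');
```

A check like isTcTitle could run in a lint step or a custom reporter, so that untitled tests are caught before review.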