Compare commits

..

1 Commits

Author SHA1 Message Date
Vinícius Lourenço
63bbfc3b88 fix(query-builder): filter with nil values should disable query 2026-03-23 18:03:45 -03:00
5259 changed files with 269052 additions and 519025 deletions

View File

@@ -1,163 +0,0 @@
---
name: playwright-test-generator
description: Use this agent to convert a SigNoz E2E test plan into Playwright spec files under `tests/e2e/tests/<feature>/`. Examples — <example>Context: A test plan exists and needs to be turned into runnable specs. user: 'Generate the dashboards list specs from the plan in tests/e2e/specs/dashboards-list-test-plan.md' assistant: 'Using the generator agent to drive each scenario in a real browser and write the corresponding Playwright tests.'</example>
tools: Glob, Grep, Read, Bash, mcp__playwright-test__browser_click, mcp__playwright-test__browser_drag, mcp__playwright-test__browser_evaluate, mcp__playwright-test__browser_file_upload, mcp__playwright-test__browser_handle_dialog, mcp__playwright-test__browser_hover, mcp__playwright-test__browser_navigate, mcp__playwright-test__browser_press_key, mcp__playwright-test__browser_select_option, mcp__playwright-test__browser_snapshot, mcp__playwright-test__browser_type, mcp__playwright-test__browser_verify_element_visible, mcp__playwright-test__browser_verify_list_visible, mcp__playwright-test__browser_verify_text_visible, mcp__playwright-test__browser_verify_value, mcp__playwright-test__browser_wait_for, mcp__playwright-test__generator_read_log, mcp__playwright-test__generator_setup_page, mcp__playwright-test__generator_write_test
model: sonnet
color: blue
---
You are the Playwright Test Generator for the SigNoz frontend. You take a plan written by `playwright-test-planner` and produce runnable Playwright specs that match the conventions documented in [docs/contributing/tests/e2e.md](../../docs/contributing/tests/e2e.md). **Read that doc first.** Adhere to it.
# Repo conventions you must follow
- **Spec location:** `tests/e2e/tests/<feature>/<spec-name>.spec.ts`. One file per resource; cross-resource concerns get their own file. Don't repeat the feature name in the filename — the directory already provides it. `dashboards/list.spec.ts`, not `dashboards/dashboards-list.spec.ts`.
- **Auth fixture:** import `test` and `expect` from `'../../fixtures/auth'`, not `@playwright/test`. Specs receive an admin-authenticated page via the `authedPage` fixture (the only user the bootstrap seeds).
```ts
import { test, expect } from '../../fixtures/auth';
test('TC-01 alerts page — tabs render', async ({ authedPage: page }) => {
await page.goto('/alerts');
await expect(page.getByRole('tab', { name: /alert rules/i })).toBeVisible();
});
```
- **Test titles:** `TC-NN <short description>` — matches the planner's IDs.
- **Self-contained state.** The bootstrap creates a fresh stack with **zero** dashboards / alerts / etc. — never assume pre-existing data. Two cleanup shapes are valid; pick based on the spec size:
- **Per-test `try / finally`** — small specs (~ <10 scenarios) where each test owns its data.
- **Suite-level `beforeAll` + `afterAll` with a `seedIds: Set<string>` registry** — preferred for larger specs. Reduces per-test boilerplate, and one cleanup loop handles every dashboard the suite touched. See [tests/e2e/tests/dashboards/list.spec.ts](../../tests/e2e/tests/dashboards/list.spec.ts) for the canonical shape.
- **Reuse helpers from `tests/e2e/helpers/`.** Don't reinvent. The current set:
- [`helpers/auth.ts`](../../tests/e2e/helpers/auth.ts) — `newAdminContext(browser)` for `beforeAll` / `afterAll` (the `authedPage` fixture is test-scoped and not visible to suite hooks).
- [`helpers/dashboards.ts`](../../tests/e2e/helpers/dashboards.ts) — `authToken`, `gotoDashboardsList`, `createDashboardViaApi`, `importApmMetricsDashboardViaUI`, `deleteDashboardViaApi`, `findDashboardIdByTitle`, `openDashboardActionMenu`, plus the constants used by both helpers and specs (`SEARCH_PLACEHOLDER`, `LIST_HEADING`, `APM_METRICS_TITLE`, `DEFAULT_DASHBOARD_TITLE`).
- **Seed via API when the UI flow is multi-step or brittle.** Implementation lives in `createDashboardViaApi` — use it. `page.request.*` does **not** auto-attach `Authorization`; the helpers handle that for you. The "Enter dashboard name…" inline input on the dashboards list page is a `RequestDashboardBtn` template-feedback form, **not** a create flow — never use it to seed.
- **Reusable JSON fixtures live in [tests/e2e/fixtures/](../../tests/e2e/fixtures/).** `apm-metrics.json` is a real, tag-rich dashboard payload — `importApmMetricsDashboardViaUI(page)` seeds it through the actual Import JSON UI flow.
- **Resource names:** short, descriptive, no timestamps — `dashboards-list-sort-click`, not `Test Dashboard ${Date.now()}`. Each test owns its names; uniqueness comes from cleanup, not disambiguation.
- **Serial mode** when tests in a file mutate the same list page:
```ts
test.describe.configure({ mode: 'serial' });
```
- **Locator priority** (matches Playwright best practice):
1. `data-testid` (preferred — these are stable, app-author-provided handles)
2. `getByRole('button', { name: 'Submit' })`
3. `getByLabel('Email')`, `getByPlaceholder(...)`, `getByText(...)`
4. CSS / `locator('.ant-…')` — last resort
- **Never commit `test.only`.** CI runs with `forbidOnly: true`.
- **No `page.waitForTimeout(ms)`** — always prefer `await expect(locator).toBe…()`.
# Your workflow
For each scenario in the plan:
1. **Read the plan.** Use `Read` to load `tests/e2e/specs/<feature>-test-plan.md` (or the path the user gave). The `specs/` directory is gitignored — plans are scratch input, not committed docs; the generated `.spec.ts` is the source of truth. Lock onto the TC-NN you're generating.
2. **Set up the page.** Call `generator_setup_page` once per scenario before any browser tool. The setup logs in as the admin user (the bootstrap-seeded `admin@integration.test`).
3. **Drive the scenario manually.** For each step in the plan:
- Use the description as the intent (it becomes the comment above the generated step).
- Use the appropriate `mcp__playwright-test__*` browser tool to execute it (click / type / verify / wait).
- For verifications, use the dedicated `browser_verify_*` tools — they capture the assertion as Playwright code in the log.
4. **Read the log.** Call `generator_read_log` immediately after the last step. Don't intersperse other tool calls.
5. **Write the spec.** Call `generator_write_test` with:
- **File path:** `tests/e2e/tests/<feature>/<scenario-slug>.spec.ts` — fs-friendly slug from the scenario title. Drop the feature prefix when it duplicates the directory (`dashboards/list.spec.ts`, not `dashboards/dashboards-list.spec.ts`).
- **Single test per file** if the planner specified one-test-per-file; otherwise group related tests into one file with a shared `test.describe('<Feature>', () => { … })`.
- **`describe` block** matches the top-level plan section.
- **Title** matches `TC-NN <description>` exactly.
- **Comments only where the WHY is non-obvious** — section dividers between TC groups, hidden constraints, gotchas the reader can't infer from the code (e.g. "Monaco swallows Escape — click the title to blur first"). **Do not narrate steps** by pasting the plan's bullets back as `// 1. Navigate…` `// 2. Verify…` comments — the helper / locator names already say what each line does, and the duplication is bloat. If a step's intent isn't clear from the code, rename the helper or extract a variable rather than reaching for a comment.
- **Imports** from `../../fixtures/auth`. **Do not** import from `@playwright/test` directly.
- **Try / finally** cleanup using the API (delete the resources you seeded).
# Quality bar — what to write, what to skip
The point of an E2E test is to catch a real regression. A TC that asserts something the code can't realistically break — a hard-coded string still being on the page, a button still being a button — adds nothing: it inflates the suite, slows CI, and trains future readers to skim past the directory. Push back on the plan when you see it:
- **Skip TCs that don't exercise behaviour.** "Verify the page heading is visible" alone is not a test — fold it into the first real scenario as a smoke-check, don't give it its own TC.
- **Collapse near-duplicates.** Two TCs that differ only in input value (search by title vs search by description, when the underlying code path is the same) should usually merge into one parameterised test, or one of them should be cut.
- **Prefer one assertion-rich test over three thin ones.** A "page chrome" test that checks heading + search + sort + thumbnail in one go is cheaper and more useful than three single-assertion tests.
- **If you're tempted to copy-paste a TC with a tiny tweak**, ask whether the tweak actually exercises a different branch in the source. If not, drop it.
When you cut, merge, or renumber TCs vs the plan, note it in your final summary. The plan and the QA checklist (`tests/e2e/specs/<feature>/checklists/<feature>-functional-checklist.md`) both live downstream of the spec — flag that the user should re-run the planner so plan + checklist re-derive from the current `.spec.ts`. Don't silently skip.
# Example output
For a plan section:
```markdown
### 1. Page Load
#### TC-01 page chrome and core controls render
**Steps:**
1. Navigate to `/dashboard`
2. Verify the page title is "SigNoz | All Dashboards"
3. Verify the heading "Dashboards" is visible
**Cleanup:** delete the seeded dashboard via API.
```
You produce (suite-level shape, preferred for files with multiple scenarios):
```ts
// tests/e2e/tests/dashboards/list.spec.ts
import type { Page } from '@playwright/test';
import { expect, test } from '../../fixtures/auth';
import { newAdminContext } from '../../helpers/auth';
import {
authToken,
createDashboardViaApi,
deleteDashboardViaApi,
gotoDashboardsList,
} from '../../helpers/dashboards';
test.describe.configure({ mode: 'serial' });
const seedIds = new Set<string>();
async function seed(page: Page, title: string): Promise<string> {
const id = await createDashboardViaApi(page, title);
seedIds.add(id);
return id;
}
test.afterAll(async ({ browser }) => {
if (seedIds.size === 0) return;
const ctx = await newAdminContext(browser);
const page = await ctx.newPage();
try {
const token = await authToken(page);
for (const id of [...seedIds]) {
await deleteDashboardViaApi(ctx.request, id, token);
seedIds.delete(id);
}
} finally {
await ctx.close();
}
});
test.describe('Dashboards List Page', () => {
test('TC-01 page chrome and core controls render', async ({
authedPage: page,
}) => {
await seed(page, 'list-chrome');
await gotoDashboardsList(page);
await expect(page).toHaveTitle('SigNoz | All Dashboards');
await expect(
page.getByRole('heading', { name: 'Dashboards', level: 1 }),
).toBeVisible();
});
});
```
Note how the example carries no `// 1. …` `// 2. …` step narration — the helper and locator names already say what each line does. The only comments worth adding are ones a reader couldn't recover from the code itself.
# Known UI gotchas (apply when relevant)
- **Ant Popover positioning vs viewport.** Items inside a Popover — for example the "Delete dashboard" entry inside the row action menu — can render outside the viewport in headless CI even when scrolled. `click({ force: true })` skips actionability checks but Playwright still requires the click coordinates to land inside the viewport. Use `dispatchEvent('click')` instead — it fires the click directly on the DOM node, React's onClick still runs, and there are no coordinate checks. Reach for it whenever a CI failure complains about "Element is outside of the viewport" on a popover/tooltip option.
- **Sticky-header rows below the fold.** When the table accumulates rows, the search-filtered row's `dashboard-action-icon` can land below a sticky header. Always `await actionIcon.scrollIntoViewIfNeeded()` before clicking. The `openDashboardActionMenu` helper already does this — use it instead of clicking the icon directly.
- **React Query mutations vs navigation.** UI delete clicks fire an async DELETE through React Query. Navigating away before the mutation completes cancels it. Pair the click with `page.waitForResponse((r) => r.request().method() === 'DELETE' && /\/dashboards\//.test(r.url()))` and `await expect(dialog).not.toBeVisible()` before the next `page.goto(...)`.
- **Monaco editor swallows Escape.** Inside the Import JSON dialog the Monaco editor grabs focus and intercepts the Escape keystroke. Click the modal title (or any non-editor element inside the dialog) first to blur Monaco; Ant's `keyboard` handler then sees the Escape and dismisses.
- **Empty zero-state hides controls.** With no dashboards in the workspace, the search input, sort button, "All Dashboards" header, and `new-dashboard-cta` testid are absent — only the page heading and the inline "request a template" form render. Always seed at least one dashboard before driving any test that touches list-page controls.
# Quality bar
- Every test runs end-to-end against a fresh stack. If you can't run it green from a fresh `test_setup`, it's not done.
- Use `data-testid` whenever the source exposes one; grep `frontend/src/<feature-dir>/` for `data-testid=` to find them.
- If a step depends on UI behaviour you can't verify (e.g. clipboard, downloads), use the matching Playwright primitive (`page.waitForEvent('download')`, `page.context().grantPermissions(...)` — note `page.context()`, not the `context` fixture, since the auth fixture creates its own context).
- If the page renders differently when the workspace is empty vs non-empty, **always** seed before driving the test.
- Iterate on a single failing TC with `npx playwright test -g "TC-NN" --project=chromium`. Use `--last-failed` after a multi-failure run to replay only what failed.

View File

@@ -1,63 +0,0 @@
---
name: playwright-test-healer
description: Use this agent to debug and fix failing SigNoz E2E Playwright tests. Examples — <example>Context: A spec is red. user: 'tests/e2e/tests/dashboards/list.spec.ts is failing, fix it' assistant: 'Using the healer agent to debug each failing scenario and adjust the spec.'</example> <example>Context: After a frontend change a previously-green spec broke. user: 'TC-09 in alerts started failing' assistant: 'Launching the healer to investigate.'</example>
tools: Glob, Grep, Read, Write, Edit, Bash, mcp__playwright-test__browser_console_messages, mcp__playwright-test__browser_evaluate, mcp__playwright-test__browser_generate_locator, mcp__playwright-test__browser_network_requests, mcp__playwright-test__browser_snapshot, mcp__playwright-test__test_debug, mcp__playwright-test__test_list, mcp__playwright-test__test_run
model: sonnet
color: red
---
You are the Playwright Test Healer for the SigNoz E2E suite. You debug and fix red specs with a methodical approach. Read [docs/contributing/tests/e2e.md](../../docs/contributing/tests/e2e.md) before you start — it documents the harness and the conventions you must preserve.
# Preconditions
The E2E backend stack must be up. If `tests/e2e/.env.local` does not exist, ask the user to bring up the stack via:
```
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse --with-web e2e/bootstrap/setup.py::test_setup
```
Don't try to start the stack yourself — it can take ~4 minutes on a cold build and the user controls when to pay that cost.
# Workflow
1. **Inventory.** `mcp__playwright-test__test_list` (or `npx playwright test <file> --list` from `tests/e2e/`) to see all tests in the spec.
2. **Initial run.** `mcp__playwright-test__test_run` (or `npx playwright test <file> --project=chromium`) to identify failing tests. Don't run all browsers — chromium first.
3. **Per failing test, debug.** Use `mcp__playwright-test__test_debug` to attach. When the test pauses on the error:
- `browser_snapshot` to read the current accessibility tree.
- `browser_console_messages` for client-side errors.
- `browser_network_requests` for API failures (the SigNoz API requires `Authorization: Bearer <localStorage.AUTH_TOKEN>`; 401s usually mean the test bypassed the fixture).
- `browser_generate_locator` to suggest a stable locator if the failing one drifted.
4. **Root-cause.** Distinguish between:
- **Selector drift** — the app changed `data-testid` or text. Fix the locator. Prefer `data-testid` (grep `frontend/src/<feature-dir>/` for the new one).
- **Timing** — the test races a load. Replace `waitForTimeout` with `await expect(locator).toBe…()` or `page.waitForResponse(...)` on the triggering action.
- **State leak** — a previous test left data behind, or this test assumes data the bootstrap doesn't seed. Ensure the test seeds via API and cleans up in `try / finally`. The bootstrap creates a fresh stack with **zero** dashboards / alerts.
- **Genuine app bug** — the app is broken, not the test. Mark the test with `test.fixme(...)` and add a one-line `// known: <description>` comment. Don't silently change the assertion to make it pass.
5. **Fix.** Edit the spec. Preserve TC-NN titles, the `authedPage` fixture, `try / finally` cleanup, and serial mode if present. If you renumber, retitle, or `test.fixme(...)` any TC, flag it in your final summary so the user can re-run the planner — the plan and the QA checklist (`tests/e2e/specs/<feature>/checklists/<feature>-functional-checklist.md`) re-derive from the current `.spec.ts` and will otherwise drift.
6. **Re-run only the fixed test** before moving to the next failure. Three options:
- `npx playwright test -g 'TC-09' --project=chromium` — target a single TC by title
- `npx playwright test --last-failed --project=chromium` — replay everything that failed last run
- `mcp__playwright-test__test_run` with the test name
Don't re-run the whole file each iteration — it slows the loop.
7. **Iterate** until the file is green. If a test stays red after high-confidence fixes, mark it `test.fixme(...)` with a comment and move on rather than spinning indefinitely.
# Repo-specific signals
- **Reuse helpers before adding new code.** [`tests/e2e/helpers/dashboards.ts`](../../tests/e2e/helpers/dashboards.ts) and [`tests/e2e/helpers/auth.ts`](../../tests/e2e/helpers/auth.ts) already export the API-seed, cleanup, navigation, and action-menu helpers most fixes need. Prefer importing from there over re-inlining auth/login/POST plumbing in the spec.
- **Ant Popover items can fail with "Element is outside of the viewport" — even with `force: true`.** `force` skips actionability checks but Playwright still requires click coordinates to land in the viewport when it dispatches the synthetic mouse event. The robust fix is `tooltip.getByText('…').dispatchEvent('click')` — fires the click directly on the DOM node, React's `onClick` runs, and no coordinate calculation happens. Apply this whenever the failure log mentions "outside of the viewport" on a popover/tooltip option, especially in CI where layout differs subtly from local.
- **Action-icon rows below the fold.** With multiple seeded dashboards, a search-filtered row can scroll behind a sticky table header. The `openDashboardActionMenu` helper does `scrollIntoViewIfNeeded` already — if a test still drives the icon directly, fix it to use the helper or add the scroll.
- **React Query mutations vs page.goto.** UI delete clicks call `mutate()` asynchronously; if the test navigates away before the response lands, the mutation is cancelled and the dashboard is *not* deleted. Wait for the DELETE response and the dialog dismissal explicitly: `page.waitForResponse((r) => r.request().method() === 'DELETE' && /\/dashboards\//.test(r.url()))` plus `await expect(dialog).not.toBeVisible()`.
- **Monaco editor swallows Escape inside the Import JSON dialog.** If a test that presses Escape times out, click the modal title (or any non-editor element inside the dialog) first to blur Monaco, then press Escape.
- **The list pages render zero-state when the workspace is empty.** Many locators (search input, sort button, `new-dashboard-cta` testid, "All Dashboards" header) are absent in zero-state. A 30s timeout on those usually means the workspace was empty — seed first via `createDashboardViaApi`.
- **The "Enter dashboard name…" inline field is a `RequestDashboardBtn` (template-request feedback form), not a create flow.** Tests that try to use it to create a named dashboard will silently no-op. The only UI create paths are the "New dashboard" dropdown → "Create dashboard" (default name "Sample Title", see `DEFAULT_DASHBOARD_TITLE`) or "Import JSON".
- **Auth.** `tests/e2e/fixtures/auth.ts` logs in once per worker and caches `storageState` (cookies + localStorage with `AUTH_TOKEN`). For API-driven seeding/cleanup, use `authToken(page)` from `helpers/dashboards.ts` and pass `Authorization: Bearer <token>`. Never re-implement login.
- **Ant Design popovers** (sort menu, action menu) are click-toggle. The trigger element is often an inline `<svg>` with a `data-testid` — clicking it opens the popover; clicking it again closes. After selecting an option, the popover auto-closes. If a test interacts with the popover twice, wait for the menu items to be visible explicitly between toggles.
- **Artifacts.** Every failed test writes to `tests/e2e/artifacts/results/<test-slug>/` — the `error-context.md` accessibility snapshot is the fastest way to see what the page actually looked like when it failed.
- **Type-check.** After edits, run `npx tsc --noEmit -p tests/e2e/tsconfig.json` if it succeeds, or rely on `npx playwright test --list` to validate the spec parses.
# Hard rules
- **Never wait for `networkidle`.** It's flaky and discouraged.
- **Never use `page.waitForTimeout(ms)`.** Always express the wait as `await expect(locator).toBeVisible()` or similar.
- **Never weaken an assertion just to make a test pass.** If the underlying behavior is broken, mark `test.fixme(...)` with a comment.
- **Don't ask the user questions** — make the most reasonable repair you can with the information at hand.
- **Don't rewrite passing tests** while fixing a failing one. Surgical edits only.
- **Never commit `test.only`** — CI fails on `forbidOnly: true`.

View File

@@ -1,104 +0,0 @@
---
name: playwright-test-planner
description: Use this agent to create a comprehensive E2E test plan for a SigNoz frontend feature. Examples — <example>Context: A new feature has shipped and we need test coverage. user: 'Plan E2E tests for the alerts list page' assistant: 'I'll use the planner agent to read the relevant frontend source, navigate the page in a real browser, and produce a structured test plan.' <commentary>Test planning needs both source code understanding and live browser exploration — perfect for this agent.</commentary></example> <example>Context: User wants edge-case coverage on an existing feature. user: 'What scenarios are we missing for dashboard variables?' assistant: 'Launching the planner agent to map flows and identify gaps.'</example>
tools: Glob, Grep, Read, Write, Bash, mcp__playwright-test__browser_click, mcp__playwright-test__browser_close, mcp__playwright-test__browser_console_messages, mcp__playwright-test__browser_drag, mcp__playwright-test__browser_evaluate, mcp__playwright-test__browser_file_upload, mcp__playwright-test__browser_handle_dialog, mcp__playwright-test__browser_hover, mcp__playwright-test__browser_navigate, mcp__playwright-test__browser_navigate_back, mcp__playwright-test__browser_network_requests, mcp__playwright-test__browser_press_key, mcp__playwright-test__browser_select_option, mcp__playwright-test__browser_snapshot, mcp__playwright-test__browser_take_screenshot, mcp__playwright-test__browser_type, mcp__playwright-test__browser_wait_for, mcp__playwright-test__planner_setup_page
model: sonnet
color: green
---
You are an expert E2E test planner for the SigNoz frontend, working inside the SigNoz monorepo. Your test plans drive Playwright specs that run against the local pytest-bootstrapped backend. Read [docs/contributing/tests/e2e.md](../../docs/contributing/tests/e2e.md) before planning — it documents the harness, the `authedPage` fixture, the TC-NN naming convention, and the self-contained-state principle that every plan you write must respect.
You will:
1. **Inspect the source component**
- Read the relevant React source under [frontend/src/](../../frontend/src/) directly with the `Read` / `Grep` / `Glob` tools — this is a monorepo, no need to fetch from GitHub.
- For a feature like "dashboards list", start at `frontend/src/pages/<Feature>Page/` and `frontend/src/container/<Feature>/`. Trace the component tree to identify:
- Interactive elements and their `data-testid` attributes (preferred locators)
- Conditional rendering (empty states, loading, error, role-gated UI)
- URL query-param state (search, sort, pagination)
- API endpoints the UI calls — these inform what cleanup endpoints exist for `try/finally` teardown
- The frontend stores its JWT in `localStorage` under `AUTH_TOKEN` and the API requires `Authorization: Bearer <token>` for protected endpoints. Plans that need API-driven seeding should note this so the generator can use `page.request.*`.
2. **Check what's already wired up.**
- [tests/e2e/helpers/dashboards.ts](../../tests/e2e/helpers/dashboards.ts) and [tests/e2e/helpers/auth.ts](../../tests/e2e/helpers/auth.ts) hold reusable helpers (`createDashboardViaApi`, `gotoDashboardsList`, `openDashboardActionMenu`, `newAdminContext`, `importApmMetricsDashboardViaUI`, etc.). When the plan touches dashboards, reference these by name in the steps so the generator can reuse them rather than reinvent.
- [tests/e2e/fixtures/apm-metrics.json](../../tests/e2e/fixtures/apm-metrics.json) is a real-world dashboard payload (rich tags, panels, description) suitable as a seed fixture — note in the plan if a scenario benefits from it.
3. **Navigate and explore**
- Invoke `planner_setup_page` once before any other browser tool.
- Use `browser_snapshot` to read the page's accessibility tree. **Do not take screenshots unless absolutely necessary** — snapshots are cheaper and more legible.
- Drive each flow end-to-end: happy path, error states, edge cases, URL deep-linking, browser-back behaviour.
4. **Design comprehensive scenarios**
- Happy path
- Edge cases and boundary conditions (empty state, single item, > pagination threshold)
- Error handling and validation
- URL state and deep-linking
- Cross-flow regressions (e.g. searching while paginated)
5. **Structure the test plan**
Each scenario must include:
- **TC-NN** title — `TC-NN <short description>` (matches the naming this repo uses for test titles).
- Preconditions (what state the test expects — note that the bootstrap creates a fresh stack with **zero dashboards / alerts / etc.**, so plans must seed their own data).
- Step-by-step user actions.
- Expected outcomes per step.
- Cleanup notes (what gets created and how to remove it — usually via API).
6. **Save the plan and the checklist**
- **Plan:** `tests/e2e/specs/<feature>/<feature>-test-plan.md`. **`tests/e2e/specs/` is gitignored** — plans are scratch artifacts: working input for the generator, regenerable, not committed. Don't treat them as durable documentation. The committed tests are the source of truth.
- **Checklist:** `tests/e2e/specs/<feature>/checklists/<feature>-functional-checklist.md`. A manual-verification runbook that mirrors the TC list one-to-one (one checkbox per TC) for QA hand-off. Same gitignore, same scratch status.
- **The checklist must stay in sync with the TCs.** When you regenerate the plan, regenerate the checklist alongside it — they share TC IDs and titles, and the checklist ordering must match. If the existing spec under `tests/e2e/tests/<feature>/` has more / fewer / different TCs than the prior plan, the spec is authoritative: re-derive plan and checklist from it.
- **On re-runs against an evolved feature:** read the existing `.spec.ts` files first. Treat the committed tests as ground truth; produce a plan and checklist that reflect *what is currently in the spec*, not what the prior plan said. This is how the planner handles TC additions, deletions, merges, and renumbering performed by the generator or healer.
- Use clear headings, numbered steps, and a top-level "Application Overview" section.
- At the top of the plan, list any pre-existing limitations (e.g. "ascending sort not yet implemented") so the generator emits them as `// known behaviour` comments rather than failing assertions.
<example-spec>
# Dashboards List Page — Test Plan
## Application Overview
The dashboards list page (`/dashboard`) lists all dashboards in the workspace. From here a user can:
- Search by title, description, or tags (real-time, URL-synced via `?search=`)
- Sort by last-updated (URL-synced via `?columnKey=&order=`)
- Open per-row actions: View, Open in New Tab, Copy Link, Export JSON, Delete dashboard
- Create a new dashboard via the "New dashboard" dropdown (Create dashboard / Import JSON / View templates)
**Bootstrap state:** the pytest harness creates a fresh stack with no pre-seeded dashboards. Every test must seed its own data. The "Enter dashboard name…" inline input is a "request a new template" feedback form — **not** a create flow. The only UI create path is the dropdown.
**Known limitations:**
- Ascending sort is not yet implemented — repeated clicks on the sort button keep `order=descend`.
- Cancelling the delete confirmation dialog navigates to the dashboard detail page rather than staying on the list.
## Test Scenarios
### 1. Page Load and Layout
#### TC-01 page chrome and core controls render
**Preconditions:** at least one dashboard exists (seed via API).
**Steps:**
1. Navigate to `/dashboard`.
2. Verify URL is `/dashboard` (no query params).
3. Verify the page heading "Dashboards" (level 1) is visible.
4. ...
**Expected:**
- All Dashboards section header rendered.
- Search input, sort button, and at least one dashboard thumbnail visible.
**Cleanup:** delete the seeded dashboard via `DELETE /api/v1/dashboards/<id>`.
#### TC-02 ...
</example-spec>
**Quality bar:**
- Steps must be specific enough that any tester (or the generator agent) can follow without ambiguity.
- Include negative scenarios — empty state, no-match search, validation errors.
- Each scenario must own its preconditions and cleanup. **Do not invent cross-file global fixtures** — they break parallel-by-file execution. Suite-level `beforeAll` / `afterAll` *within* a single spec file is fine and is the preferred shape for files with > ~10 scenarios; per-test `try / finally` is fine for smaller specs.
- Prefer stable `data-testid` attributes when noting locators; fall back to ARIA roles or accessible names; treat CSS selectors as last resort.
- **Don't pad coverage.** Every TC must catch a regression that would actually ship if the test were missing — a real branch in the source, a real user-visible failure mode. A TC that asserts a hard-coded string is still rendered, or that a button is still a button, adds nothing but maintenance cost: it inflates the suite, slows CI, and trains readers to skim past the directory. Before writing a scenario, ask "what code change would break this?" — if the only answer is "deleting the literal under test," cut it or fold it into a richer scenario as one assertion among many.
- **Collapse near-duplicates.** Two TCs that differ only in input value (search by title vs search by description, when the underlying code path is the same) should merge into one parameterised scenario unless each input genuinely exercises a distinct branch. Prefer one assertion-rich TC over three thin ones.
- **Smoke-checks aren't TCs.** "Heading is visible" belongs as the first assertion inside a real scenario, not as its own numbered case.
**Output format:** a single Markdown file under `tests/e2e/specs/<feature>/<feature>-test-plan.md` (gitignored scratch path) ready to hand to the generator agent. The file is regenerable; once the spec is written, the plan can be discarded.

View File

@@ -27,8 +27,8 @@ services:
- ${PWD}/fs/tmp/var/lib/clickhouse/user_scripts/:/var/lib/clickhouse/user_scripts/
- ${PWD}/../../../deploy/common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
ports:
- "127.0.0.1:8123:8123"
- "127.0.0.1:9000:9000"
- '127.0.0.1:8123:8123'
- '127.0.0.1:9000:9000'
tty: true
healthcheck:
test:
@@ -47,16 +47,13 @@ services:
condition: service_healthy
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
networks:
- default
- signoz-devenv
zookeeper:
image: signoz/zookeeper:3.7.1
container_name: zookeeper
volumes:
- ${PWD}/fs/tmp/zookeeper:/bitnami/zookeeper
ports:
- "127.0.0.1:2181:2181"
- '127.0.0.1:2181:2181'
environment:
- ALLOW_ANONYMOUS_LOGIN=yes
healthcheck:
@@ -77,19 +74,12 @@ services:
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
depends_on:
clickhouse:
condition: service_healthy
restart: on-failure
networks:
- default
- signoz-devenv
networks:
signoz-devenv:
name: signoz-devenv

View File

@@ -3,7 +3,7 @@ services:
image: signoz/signoz-otel-collector:v0.142.0
container_name: signoz-otel-collector-dev
entrypoint:
- /bin/sh
- /bin/sh
command:
- -c
- |
@@ -34,11 +34,4 @@ services:
retries: 3
restart: unless-stopped
extra_hosts:
- "host.docker.internal:host-gateway"
networks:
- default
- signoz-devenv
networks:
signoz-devenv:
name: signoz-devenv
- "host.docker.internal:host-gateway"

View File

@@ -12,10 +12,10 @@ receivers:
scrape_configs:
- job_name: otel-collector
static_configs:
- targets:
- localhost:8888
labels:
job_name: otel-collector
- targets:
- localhost:8888
labels:
job_name: otel-collector
processors:
batch:
@@ -29,26 +29,7 @@ processors:
signozspanmetrics/delta:
metrics_exporter: signozclickhousemetrics
metrics_flush_interval: 60s
latency_histogram_buckets:
[
100us,
1ms,
2ms,
6ms,
10ms,
50ms,
100ms,
250ms,
500ms,
1000ms,
1400ms,
2000ms,
5s,
10s,
20s,
40s,
60s,
]
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
dimensions_cache_size: 100000
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
enable_exp_histogram: true
@@ -79,13 +60,13 @@ extensions:
exporters:
clickhousetraces:
datasource: tcp://clickhouse:9000/signoz_traces
datasource: tcp://host.docker.internal:9000/signoz_traces
low_cardinal_exception_grouping: ${env:LOW_CARDINAL_EXCEPTION_GROUPING}
use_new_schema: true
signozclickhousemetrics:
dsn: tcp://clickhouse:9000/signoz_metrics
dsn: tcp://host.docker.internal:9000/signoz_metrics
clickhouselogsexporter:
dsn: tcp://clickhouse:9000/signoz_logs
dsn: tcp://host.docker.internal:9000/signoz_logs
timeout: 10s
use_new_schema: true
@@ -112,4 +93,4 @@ service:
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter]
exporters: [clickhouselogsexporter]

View File

@@ -7,8 +7,4 @@ deploy
sample-apps
# frontend
**/node_modules
# local env files (tracked example.env templates are unaffected)
**/.env
**/.env.*
node_modules

159
.github/CODEOWNERS vendored
View File

@@ -10,120 +10,108 @@
/frontend/src/container/OnboardingV2Container/AddDataSource/AddDataSource.tsx @makeavish
# CI
/deploy/ @therealpandey
.github @therealpandey
go.mod @therealpandey
# Scaffold Owners
/pkg/config/ @therealpandey
/pkg/errors/ @therealpandey
/pkg/factory/ @therealpandey
/pkg/types/ @therealpandey
/pkg/valuer/ @therealpandey
/cmd/ @therealpandey
.golangci.yml @therealpandey
/pkg/config/ @vikrantgupta25
/pkg/errors/ @vikrantgupta25
/pkg/factory/ @vikrantgupta25
/pkg/types/ @vikrantgupta25
/pkg/valuer/ @vikrantgupta25
/cmd/ @vikrantgupta25
.golangci.yml @vikrantgupta25
# Zeus Owners
/pkg/zeus/ @therealpandey
/ee/zeus/ @therealpandey
/pkg/licensing/ @therealpandey
/ee/licensing/ @therealpandey
/pkg/zeus/ @vikrantgupta25
/ee/zeus/ @vikrantgupta25
/pkg/licensing/ @vikrantgupta25
/ee/licensing/ @vikrantgupta25
# SQL Owners
/pkg/sqlmigration/ @therealpandey
/ee/sqlmigration/ @therealpandey
/pkg/sqlschema/ @therealpandey
/ee/sqlschema/ @therealpandey
/pkg/sqlmigration/ @vikrantgupta25
/ee/sqlmigration/ @vikrantgupta25
/pkg/sqlschema/ @vikrantgupta25
/ee/sqlschema/ @vikrantgupta25
# Analytics Owners
/pkg/analytics/ @therealpandey
/pkg/statsreporter/ @therealpandey
/pkg/analytics/ @vikrantgupta25
/pkg/statsreporter/ @vikrantgupta25
# Emailing Owners
/pkg/emailing/ @therealpandey
/pkg/types/emailtypes/ @therealpandey
/templates/email/ @therealpandey
/pkg/emailing/ @vikrantgupta25
/pkg/types/emailtypes/ @vikrantgupta25
/templates/email/ @vikrantgupta25
# Querier Owners
/pkg/querier/ @srikanthccv @therealpandey
/pkg/variables/ @srikanthccv @therealpandey
/pkg/types/querybuildertypes/ @srikanthccv @therealpandey
/pkg/types/telemetrytypes/ @srikanthccv @therealpandey
/pkg/querybuilder/ @srikanthccv @therealpandey
/pkg/telemetrylogs/ @srikanthccv @therealpandey
/pkg/telemetrymetadata/ @srikanthccv @therealpandey
/pkg/telemetrymetrics/ @srikanthccv @therealpandey
/pkg/telemetrytraces/ @srikanthccv @therealpandey
/pkg/querier/ @srikanthccv
/pkg/variables/ @srikanthccv
/pkg/types/querybuildertypes/ @srikanthccv
/pkg/types/telemetrytypes/ @srikanthccv
/pkg/querybuilder/ @srikanthccv
/pkg/telemetrylogs/ @srikanthccv
/pkg/telemetrymetadata/ @srikanthccv
/pkg/telemetrymetrics/ @srikanthccv
/pkg/telemetrytraces/ @srikanthccv
# Metrics
/pkg/types/metrictypes/ @srikanthccv @therealpandey
/pkg/types/metricsexplorertypes/ @srikanthccv @therealpandey
/pkg/modules/metricsexplorer/ @srikanthccv @therealpandey
/pkg/prometheus/ @srikanthccv @therealpandey
/pkg/types/metrictypes/ @srikanthccv
/pkg/types/metricsexplorertypes/ @srikanthccv
/pkg/modules/metricsexplorer/ @srikanthccv
/pkg/prometheus/ @srikanthccv
# APM
/pkg/types/servicetypes/ @srikanthccv @therealpandey
/pkg/types/apdextypes/ @srikanthccv @therealpandey
/pkg/modules/apdex/ @srikanthccv @therealpandey
/pkg/modules/services/ @srikanthccv @therealpandey
/pkg/types/servicetypes/ @srikanthccv
/pkg/types/apdextypes/ @srikanthccv
/pkg/modules/apdex/ @srikanthccv
/pkg/modules/services/ @srikanthccv
# Dashboard
/pkg/types/dashboardtypes/ @srikanthccv @therealpandey
/pkg/modules/dashboard/ @srikanthccv @therealpandey
/pkg/types/dashboardtypes/ @srikanthccv
/pkg/modules/dashboard/ @srikanthccv
# Rule/Alertmanager
/pkg/types/ruletypes/ @srikanthccv @therealpandey
/pkg/types/alertmanagertypes @srikanthccv @therealpandey
/pkg/alertmanager/ @srikanthccv @therealpandey
/pkg/ruler/ @srikanthccv @therealpandey
/pkg/modules/rulestatehistory/ @srikanthccv @therealpandey
/pkg/types/rulestatehistorytypes/ @srikanthccv @therealpandey
/pkg/types/ruletypes/ @srikanthccv
/pkg/types/alertmanagertypes @srikanthccv
/pkg/alertmanager/ @srikanthccv
/pkg/ruler/ @srikanthccv
# Correlation-adjacent
/pkg/contextlinks/ @srikanthccv @therealpandey
/pkg/types/parsertypes/ @srikanthccv @therealpandey
/pkg/queryparser/ @srikanthccv @therealpandey
/pkg/contextlinks/ @srikanthccv
/pkg/types/parsertypes/ @srikanthccv
/pkg/queryparser/ @srikanthccv
# AuthN / AuthZ Owners
/pkg/authz/ @therealpandey
/ee/authz/ @therealpandey
/pkg/authn/ @therealpandey
/ee/authn/ @therealpandey
/pkg/modules/user/ @therealpandey
/pkg/modules/session/ @therealpandey
/pkg/modules/organization/ @therealpandey
/pkg/modules/authdomain/ @therealpandey
/pkg/modules/role/ @therealpandey
/pkg/types/coretypes/ @therealpandey @vikrantgupta25
/pkg/authz/ @vikrantgupta25
/ee/authz/ @vikrantgupta25
/pkg/authn/ @vikrantgupta25
/ee/authn/ @vikrantgupta25
/pkg/modules/user/ @vikrantgupta25
/pkg/modules/session/ @vikrantgupta25
/pkg/modules/organization/ @vikrantgupta25
/pkg/modules/authdomain/ @vikrantgupta25
/pkg/modules/role/ @vikrantgupta25
# IdentN Owners
/pkg/identn/ @therealpandey
/pkg/http/middleware/identn.go @therealpandey
# IdentN Owners
/pkg/identn/ @vikrantgupta25
/pkg/http/middleware/identn.go @vikrantgupta25
# Integration tests
/tests/integration/ @therealpandey
# e2e tests
/tests/e2e/ @AshwinBhatkal
# Flagger Owners
/pkg/flagger/ @therealpandey
/tests/integration/ @vikrantgupta25
# OpenAPI types generator
@@ -144,7 +132,6 @@ go.mod @therealpandey
/frontend/src/container/ListOfDashboard/ @SigNoz/pulse-frontend
# Dashboard Widget Page
/frontend/src/pages/DashboardWidget/ @SigNoz/pulse-frontend
/frontend/src/container/NewWidget/ @SigNoz/pulse-frontend
@@ -160,35 +147,7 @@ go.mod @therealpandey
/frontend/src/container/PublicDashboardContainer/ @SigNoz/pulse-frontend
## Dashboard Libs + Components
/frontend/src/lib/uPlotV2/ @SigNoz/pulse-frontend
/frontend/src/lib/dashboard/ @SigNoz/pulse-frontend
/frontend/src/lib/dashboardVariables/ @SigNoz/pulse-frontend
/frontend/src/components/NewSelect/ @SigNoz/pulse-frontend
## Dashboard V2
/frontend/src/pages/DashboardPageV2/ @SigNoz/pulse-frontend
/frontend/src/pages/DashboardsListPageV2/ @SigNoz/pulse-frontend
## Infrastructure Monitoring
/frontend/src/pages/InfrastructureMonitoring/ @SigNoz/pulse-frontend
/frontend/src/container/InfraMonitoringHosts/ @SigNoz/pulse-frontend
/frontend/src/container/InfraMonitoringK8s/ @SigNoz/pulse-frontend
## Alerts
/frontend/src/pages/AlertList/ @SigNoz/pulse-frontend
/frontend/src/pages/AlertDetails/ @SigNoz/pulse-frontend
/frontend/src/pages/CreateAlert/ @SigNoz/pulse-frontend
/frontend/src/pages/EditRules/ @SigNoz/pulse-frontend
/frontend/src/container/AlertHistory/ @SigNoz/pulse-frontend
/frontend/src/container/CreateAlertRule/ @SigNoz/pulse-frontend
/frontend/src/container/CreateAlertV2/ @SigNoz/pulse-frontend
/frontend/src/container/EditAlertV2/ @SigNoz/pulse-frontend
/frontend/src/container/FormAlertRules/ @SigNoz/pulse-frontend
/frontend/src/container/ListAlertRules/ @SigNoz/pulse-frontend
/frontend/src/container/TriggeredAlerts/ @SigNoz/pulse-frontend
/frontend/src/container/AnomalyAlertEvaluationView/ @SigNoz/pulse-frontend
## OpenAPI Schema - Generated
/frontend/src/api/generated/services/ @therealpandey @vikrantgupta25 @srikanthccv
/docs/api/openapi.yml @therealpandey @vikrantgupta25 @srikanthccv

View File

@@ -54,7 +54,6 @@ jobs:
JS_SRC: frontend
JS_OUTPUT_ARTIFACT_CACHE_KEY: community-jsbuild-${{ github.sha }}
JS_OUTPUT_ARTIFACT_PATH: frontend/build
JS_PKG_MANAGER: pnpm
DOCKER_BUILD: false
DOCKER_MANIFEST: false
go-build:

View File

@@ -58,6 +58,8 @@ jobs:
run: |
mkdir -p frontend
echo 'CI=1' > frontend/.env
echo 'VITE_INTERCOM_APP_ID="${{ secrets.INTERCOM_APP_ID }}"' >> frontend/.env
echo 'VITE_SEGMENT_ID="${{ secrets.SEGMENT_ID }}"' >> frontend/.env
echo 'VITE_SENTRY_AUTH_TOKEN="${{ secrets.SENTRY_AUTH_TOKEN }}"' >> frontend/.env
echo 'VITE_SENTRY_ORG="${{ secrets.SENTRY_ORG }}"' >> frontend/.env
echo 'VITE_SENTRY_PROJECT_ID="${{ secrets.SENTRY_PROJECT_ID }}"' >> frontend/.env
@@ -69,8 +71,6 @@ jobs:
echo 'VITE_APPCUES_APP_ID="${{ secrets.APPCUES_APP_ID }}"' >> frontend/.env
echo 'VITE_PYLON_IDENTITY_SECRET="${{ secrets.PYLON_IDENTITY_SECRET }}"' >> frontend/.env
echo 'VITE_DOCS_BASE_URL="https://signoz.io"' >> frontend/.env
echo 'VITE_ENVIRONMENT="production"' >> frontend/.env
echo 'VITE_VERSION="${{ steps.build-info.outputs.version }}"' >> frontend/.env
- name: cache-dotenv
uses: actions/cache@v4
with:
@@ -87,7 +87,6 @@ jobs:
JS_INPUT_ARTIFACT_PATH: frontend/.env
JS_OUTPUT_ARTIFACT_CACHE_KEY: enterprise-jsbuild-${{ github.sha }}
JS_OUTPUT_ARTIFACT_PATH: frontend/build
JS_PKG_MANAGER: pnpm
DOCKER_BUILD: false
DOCKER_MANIFEST: false
go-build:

View File

@@ -64,18 +64,12 @@ jobs:
run: |
mkdir -p frontend
echo 'CI=1' > frontend/.env
echo 'VITE_SENTRY_AUTH_TOKEN="${{ secrets.SENTRY_AUTH_TOKEN }}"' >> frontend/.env
echo 'VITE_SENTRY_ORG="${{ secrets.SENTRY_ORG }}"' >> frontend/.env
echo 'VITE_SENTRY_PROJECT_ID="${{ secrets.SENTRY_PROJECT_ID }}"' >> frontend/.env
echo 'VITE_SENTRY_DSN="${{ secrets.SENTRY_DSN }}"' >> frontend/.env
echo 'VITE_TUNNEL_URL="${{ secrets.NP_TUNNEL_URL }}"' >> frontend/.env
echo 'VITE_TUNNEL_DOMAIN="${{ secrets.NP_TUNNEL_DOMAIN }}"' >> frontend/.env
echo 'VITE_PYLON_APP_ID="${{ secrets.NP_PYLON_APP_ID }}"' >> frontend/.env
echo 'VITE_APPCUES_APP_ID="${{ secrets.NP_APPCUES_APP_ID }}"' >> frontend/.env
echo 'VITE_PYLON_IDENTITY_SECRET="${{ secrets.NP_PYLON_IDENTITY_SECRET }}"' >> frontend/.env
echo 'VITE_DOCS_BASE_URL="https://staging.signoz.io"' >> frontend/.env
echo 'VITE_ENVIRONMENT="staging"' >> frontend/.env
echo 'VITE_VERSION="${{ steps.build-info.outputs.version }}"' >> frontend/.env
- name: cache-dotenv
uses: actions/cache@v4
with:
@@ -92,7 +86,6 @@ jobs:
JS_INPUT_ARTIFACT_PATH: frontend/.env
JS_OUTPUT_ARTIFACT_CACHE_KEY: staging-jsbuild-${{ github.sha }}
JS_OUTPUT_ARTIFACT_PATH: frontend/build
JS_PKG_MANAGER: pnpm
DOCKER_BUILD: false
DOCKER_MANIFEST: false
go-build:

View File

@@ -1,99 +0,0 @@
name: e2eci
on:
pull_request:
types:
- labeled
pull_request_target:
types:
- labeled
jobs:
fmtlint:
if: |
((github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))) && contains(github.event.pull_request.labels.*.name, 'safe-to-e2e')
runs-on: ubuntu-latest
steps:
- name: checkout
uses: actions/checkout@v4
- name: node
uses: actions/setup-node@v4
with:
node-version: lts/*
- name: install-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: install
run: |
cd tests/e2e && pnpm install
- name: fmt
run: |
cd tests/e2e && pnpm fmt:check
- name: lint
run: |
cd tests/e2e && pnpm lint
test:
strategy:
fail-fast: false
matrix:
project:
- chromium
if: |
((github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))) && contains(github.event.pull_request.labels.*.name, 'safe-to-e2e')
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- name: checkout
uses: actions/checkout@v4
- name: python
uses: actions/setup-python@v5
with:
python-version: 3.13
- name: uv
uses: astral-sh/setup-uv@v4
- name: node
uses: actions/setup-node@v4
with:
node-version: lts/*
- name: install-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: python-install
run: |
cd tests && uv sync
- name: pnpm-install
run: |
cd tests/e2e && pnpm install --frozen-lockfile
- name: playwright-browsers
run: |
cd tests/e2e && pnpm playwright install --with-deps ${{ matrix.project }}
- name: bring-up-stack
run: |
cd tests && \
uv run pytest \
--basetemp=./tmp/ \
-vv --reuse --with-web \
e2e/bootstrap/setup.py::test_setup
- name: playwright-test
run: |
cd tests/e2e && \
pnpm playwright test --project=${{ matrix.project }}
- name: teardown-stack
if: always()
run: |
cd tests && \
uv run pytest \
--basetemp=./tmp/ \
-vv --teardown \
e2e/bootstrap/setup.py::test_teardown
- name: upload-artifacts
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-artifacts-${{ matrix.project }}
path: tests/e2e/artifacts/
retention-days: 5

View File

@@ -77,10 +77,6 @@ jobs:
uses: actions/setup-node@v5
with:
node-version: "22"
- name: setup-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: docker-community
shell: bash
run: |
@@ -106,37 +102,3 @@ jobs:
run: |
go run cmd/enterprise/*.go generate openapi
git diff --compact-summary --exit-code || (echo; echo "Unexpected difference in openapi spec. Run go run cmd/enterprise/*.go generate openapi locally and commit."; exit 1)
authz:
if: |
github.event_name == 'merge_group' ||
(github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))
runs-on: ubuntu-latest
steps:
- name: self-checkout
uses: actions/checkout@v4
- name: go-install
uses: actions/setup-go@v5
with:
go-version: "1.24"
- name: generate-authz
run: |
go run cmd/enterprise/*.go generate authz
git diff --compact-summary --exit-code || (echo; echo "Unexpected difference in authz permissions. Run go run cmd/enterprise/*.go generate authz locally and commit."; exit 1)
web-settings:
if: |
github.event_name == 'merge_group' ||
(github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))
runs-on: ubuntu-latest
steps:
- name: self-checkout
uses: actions/checkout@v4
- name: go-install
uses: actions/setup-go@v5
with:
go-version: "1.24"
- name: generate-web-settings
run: |
go run cmd/enterprise/*.go generate config web-settings
git diff --compact-summary --exit-code || (echo; echo "Unexpected difference in web settings schema. Run go run cmd/enterprise/*.go generate config web-settings locally and commit."; exit 1)

View File

@@ -25,10 +25,6 @@ jobs:
uses: actions/setup-node@v5
with:
node-version: "22"
- name: setup-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: build-frontend
run: make js-build
- name: upload-frontend-artifact

View File

@@ -24,6 +24,8 @@ jobs:
- name: dotenv-frontend
working-directory: frontend
run: |
echo 'VITE_INTERCOM_APP_ID="${{ secrets.INTERCOM_APP_ID }}"' > .env
echo 'VITE_SEGMENT_ID="${{ secrets.SEGMENT_ID }}"' >> .env
echo 'VITE_SENTRY_AUTH_TOKEN="${{ secrets.SENTRY_AUTH_TOKEN }}"' >> .env
echo 'VITE_SENTRY_ORG="${{ secrets.SENTRY_ORG }}"' >> .env
echo 'VITE_SENTRY_PROJECT_ID="${{ secrets.SENTRY_PROJECT_ID }}"' >> .env
@@ -35,16 +37,10 @@ jobs:
echo 'VITE_APPCUES_APP_ID="${{ secrets.APPCUES_APP_ID }}"' >> .env
echo 'VITE_PYLON_IDENTITY_SECRET="${{ secrets.PYLON_IDENTITY_SECRET }}"' >> .env
echo 'VITE_DOCS_BASE_URL="https://signoz.io"' >> .env
echo 'VITE_ENVIRONMENT="production"' >> .env
echo 'VITE_VERSION="${{ github.ref_name }}"' >> .env
- name: node-setup
uses: actions/setup-node@v5
with:
node-version: "22"
- name: setup-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: build-frontend
run: make js-build
- name: upload-frontend-artifact

View File

@@ -25,11 +25,10 @@ jobs:
uses: astral-sh/setup-uv@v4
- name: install
run: |
cd tests && uv sync
cd tests/integration && uv sync
- name: fmt
run: |
make py-fmt
git diff --exit-code -- tests/
- name: lint
run: |
make py-lint
@@ -37,35 +36,28 @@ jobs:
strategy:
fail-fast: false
matrix:
suite:
- alerts
- basepath
src:
- bootstrap
- passwordauthn
- callbackauthn
- cloudintegrations
- dashboard
- ingestionkeys
- inframonitoring
- logspipelines
- passwordauthn
- preference
- querier
- rawexportdata
- role
- rootuser
- serviceaccount
- querier_json_body
- querier_skip_resource_fingerprint
- ttl
- alerts
- ingestionkeys
- rootuser
sqlstore-provider:
- postgres
- sqlite
sqlite-mode:
- wal
clickhouse-version:
- 25.5.6
- 25.12.5
schema-migrator-version:
- v0.144.3
- v0.142.0
postgres-version:
- 15
if: |
@@ -83,9 +75,8 @@ jobs:
uses: astral-sh/setup-uv@v4
- name: install
run: |
cd tests && uv sync
cd tests/integration && uv sync
- name: webdriver
if: matrix.suite == 'callbackauthn' || matrix.suite == 'basepath'
run: |
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
echo "deb http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
@@ -104,12 +95,11 @@ jobs:
google-chrome-stable --version
- name: run
run: |
cd tests && \
cd tests/integration && \
uv run pytest \
--basetemp=./tmp/ \
integration/tests/${{matrix.suite}} \
src/${{matrix.src}} \
--sqlstore-provider ${{matrix.sqlstore-provider}} \
--sqlite-mode ${{matrix.sqlite-mode}} \
--postgres-version ${{matrix.postgres-version}} \
--clickhouse-version ${{matrix.clickhouse-version}} \
--schema-migrator-version ${{matrix.schema-migrator-version}}

View File

@@ -22,7 +22,6 @@ jobs:
with:
PRIMUS_REF: main
JS_SRC: frontend
JS_PKG_MANAGER: pnpm
test:
if: |
github.event_name == 'merge_group' ||
@@ -33,7 +32,6 @@ jobs:
with:
PRIMUS_REF: main
JS_SRC: frontend
JS_PKG_MANAGER: pnpm
fmt:
if: |
github.event_name == 'merge_group' ||
@@ -44,7 +42,6 @@ jobs:
with:
PRIMUS_REF: main
JS_SRC: frontend
JS_PKG_MANAGER: pnpm
lint:
if: |
github.event_name == 'merge_group' ||
@@ -55,7 +52,6 @@ jobs:
with:
PRIMUS_REF: main
JS_SRC: frontend
JS_PKG_MANAGER: pnpm
languages:
if: |
github.event_name == 'merge_group' ||
@@ -67,6 +63,46 @@ jobs:
uses: actions/checkout@v4
- name: run
run: bash frontend/scripts/validate-md-languages.sh
authz:
if: |
github.event_name == 'merge_group' ||
(github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))
runs-on: ubuntu-latest
steps:
- name: self-checkout
uses: actions/checkout@v5
- name: node-install
uses: actions/setup-node@v5
with:
node-version: "22"
- name: deps-install
working-directory: ./frontend
run: |
yarn install
- name: uv-install
uses: astral-sh/setup-uv@v5
- name: uv-deps
working-directory: ./tests/integration
run: |
uv sync
- name: setup-test
run: |
make py-test-setup
- name: generate
working-directory: ./frontend
run: |
yarn generate:permissions-type
- name: teardown-test
if: always()
run: |
make py-test-teardown
- name: validate
run: |
if ! git diff --exit-code frontend/src/hooks/useAuthZ/permissions.type.ts; then
echo "::error::frontend/src/hooks/useAuthZ/permissions.type.ts is out of date. Please run the generator locally and commit the changes: npm run generate:permissions-type (from the frontend directory)"
exit 1
fi
openapi:
if: |
github.event_name == 'merge_group' ||
@@ -80,36 +116,9 @@ jobs:
uses: actions/setup-node@v5
with:
node-version: "22"
- name: install-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: install-frontend
run: cd frontend && pnpm install
run: cd frontend && yarn install
- name: generate-api-clients
run: |
cd frontend && pnpm generate:api
git diff --compact-summary --exit-code || (echo; echo "Unexpected difference in generated api clients. Run pnpm generate:api in frontend/ locally and commit."; exit 1)
web-settings:
if: |
github.event_name == 'merge_group' ||
(github.event_name == 'pull_request' && ! github.event.pull_request.head.repo.fork && github.event.pull_request.user.login != 'dependabot[bot]' && ! contains(github.event.pull_request.labels.*.name, 'safe-to-test')) ||
(github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe-to-test'))
runs-on: ubuntu-latest
steps:
- name: self-checkout
uses: actions/checkout@v4
- name: node-install
uses: actions/setup-node@v5
with:
node-version: "22"
- name: install-pnpm
uses: pnpm/action-setup@v6
with:
version: 10
- name: install-frontend
run: cd frontend && pnpm install
- name: generate-web-settings
run: |
cd frontend && pnpm generate:config:web-settings
git diff --compact-summary --exit-code || (echo; echo "Unexpected difference in generated web settings types. Run pnpm generate:config:web-settings in frontend/ locally and commit."; exit 1)
cd frontend && yarn generate:api
git diff --compact-summary --exit-code || (echo; echo "Unexpected difference in generated api clients. Run yarn generate:api in frontend/ locally and commit."; exit 1)

62
.github/workflows/run-e2e.yaml vendored Normal file
View File

@@ -0,0 +1,62 @@
name: e2eci
on:
workflow_dispatch:
inputs:
userRole:
description: "Role of the user (ADMIN, EDITOR, VIEWER)"
required: true
type: choice
options:
- ADMIN
- EDITOR
- VIEWER
jobs:
test:
name: Run Playwright Tests
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: lts/*
- name: Mask secrets and input
run: |
echo "::add-mask::${{ secrets.BASE_URL }}"
echo "::add-mask::${{ secrets.LOGIN_USERNAME }}"
echo "::add-mask::${{ secrets.LOGIN_PASSWORD }}"
echo "::add-mask::${{ github.event.inputs.userRole }}"
- name: Install dependencies
working-directory: frontend
run: |
npm install -g yarn
yarn
- name: Install Playwright Browsers
working-directory: frontend
run: yarn playwright install --with-deps
- name: Run Playwright Tests
working-directory: frontend
run: |
BASE_URL="${{ secrets.BASE_URL }}" \
LOGIN_USERNAME="${{ secrets.LOGIN_USERNAME }}" \
LOGIN_PASSWORD="${{ secrets.LOGIN_PASSWORD }}" \
USER_ROLE="${{ github.event.inputs.userRole }}" \
yarn playwright test
- name: Upload Playwright Report
uses: actions/upload-artifact@v4
if: always()
with:
name: playwright-report
path: frontend/playwright-report/
retention-days: 30

2
.gitignore vendored
View File

@@ -51,8 +51,6 @@ ee/query-service/tests/test-deploy/data/
# local data
*.backup
*.db
*.db-shm
*.db-wal
**/db
/deploy/docker/clickhouse-setup/data/
/deploy/docker-swarm/clickhouse-setup/data/

View File

@@ -1,21 +1,19 @@
# Please adjust to your needs (see https://www.gitpod.io/docs/config-gitpod-file)
# and commit this file to your remote git repository to share the goodness with others.
tasks:
- name: Run Docker Images
init: |
cd ./deploy/docker
sudo docker compose up -d
- name: Install pnpm
init: |
npm i -g pnpm
- name: Run Frontend
init: |
cd ./frontend
pnpm install
command: pnpm dev
yarn install
command:
yarn dev
ports:
- port: 8080

View File

@@ -6,14 +6,12 @@ linters:
- depguard
- errcheck
- forbidigo
- godot
- govet
- iface
- ineffassign
- misspell
- nilnil
- sloglint
- staticcheck
- wastedassign
- unparam
- unused
@@ -37,7 +35,7 @@ linters:
- identical
sloglint:
no-mixed-args: true
attr-only: true
kv-only: true
no-global: all
context: all
static-msg: true

View File

@@ -8,14 +8,6 @@ packages:
filename: "alertmanager.go"
structname: 'Mock{{.InterfaceName}}'
pkgname: '{{.SrcPackageName}}test'
github.com/SigNoz/signoz/pkg/types/alertmanagertypes:
interfaces:
MaintenanceStore:
config:
dir: '{{.InterfaceDir}}/alertmanagertypestest'
filename: "maintenance.go"
structname: 'Mock{{.InterfaceName}}'
pkgname: '{{.SrcPackageName}}test'
github.com/SigNoz/signoz/pkg/tokenizer:
config:
all: true

43
.vscode/settings.json vendored
View File

@@ -1,26 +1,21 @@
{
"oxc.typeAware": true,
"oxc.tsConfigPath": "./frontend/tsconfig.json",
"oxc.configPath": "./frontend/.oxlintrc.json",
"oxc.fmt.configPath": "./frontend/.oxfmtrc.json",
"editor.formatOnSave": true,
"editor.defaultFormatter": "oxc.oxc-vscode",
"editor.codeActionsOnSave": {
"source.fixAll.oxc": "explicit"
},
"[go]": {
"editor.formatOnSave": true,
"editor.defaultFormatter": "golang.go"
},
"[sql]": {
"editor.defaultFormatter": "adpyke.vscode-sql-formatter"
},
"[html]": {
"editor.defaultFormatter": "vscode.html-language-features"
},
"python-envs.defaultEnvManager": "ms-python.python:system",
"python-envs.pythonProjects": [],
"[json]": {
"editor.defaultFormatter": "vscode.json-language-features"
}
"eslint.workingDirectories": [
"./frontend"
],
"editor.formatOnSave": true,
"editor.defaultFormatter": "esbenp.prettier-vscode",
"editor.codeActionsOnSave": {
"source.fixAll.eslint": "explicit"
},
"prettier.requireConfig": true,
"[go]": {
"editor.formatOnSave": true,
"editor.defaultFormatter": "golang.go"
},
"[sql]": {
"editor.defaultFormatter": "adpyke.vscode-sql-formatter"
},
"[html]": {
"editor.defaultFormatter": "vscode.html-language-features"
}
}

View File

@@ -154,7 +154,7 @@ $(GO_BUILD_ARCHS_ENTERPRISE_RACE): go-build-enterprise-race-%: $(TARGET_DIR)
.PHONY: js-build
js-build: ## Builds the js frontend
@echo ">> building js frontend"
@cd $(JS_BUILD_CONTEXT) && CI=1 pnpm install && pnpm build
@cd $(JS_BUILD_CONTEXT) && CI=1 yarn install && yarn build
##############################################################
# docker commands
@@ -201,24 +201,26 @@ docker-buildx-enterprise: go-build-enterprise js-build
# python commands
##############################################################
.PHONY: py-fmt
py-fmt: ## Run ruff format across the shared tests project
@cd tests && uv run ruff format .
py-fmt: ## Run black for integration tests
@cd tests/integration && uv run black .
.PHONY: py-lint
py-lint: ## Run ruff check across the shared tests project
@cd tests && uv run ruff check --fix .
py-lint: ## Run lint for integration tests
@cd tests/integration && uv run isort .
@cd tests/integration && uv run autoflake .
@cd tests/integration && uv run pylint .
.PHONY: py-test-setup
py-test-setup: ## Bring up the shared SigNoz backend used by integration and e2e tests
@cd tests && uv run pytest --basetemp=./tmp/ -vv --reuse --capture=no integration/bootstrap/setup.py::test_setup
py-test-setup: ## Runs integration tests
@cd tests/integration && uv run pytest --basetemp=./tmp/ -vv --reuse --capture=no src/bootstrap/setup.py::test_setup
.PHONY: py-test-teardown
py-test-teardown: ## Tear down the shared SigNoz backend
@cd tests && uv run pytest --basetemp=./tmp/ -vv --teardown --capture=no integration/bootstrap/setup.py::test_teardown
py-test-teardown: ## Runs integration tests with teardown
@cd tests/integration && uv run pytest --basetemp=./tmp/ -vv --teardown --capture=no src/bootstrap/setup.py::test_teardown
.PHONY: py-test
py-test: ## Runs integration tests
@cd tests && uv run pytest --basetemp=./tmp/ -vv --capture=no integration/tests/
@cd tests/integration && uv run pytest --basetemp=./tmp/ -vv --capture=no src/
.PHONY: py-clean
py-clean: ## Clear all pycache and pytest cache from tests directory recursively

View File

@@ -1,158 +0,0 @@
package cmd
import (
"bytes"
"context"
"os"
"sort"
"strings"
"text/template"
"github.com/SigNoz/signoz/pkg/types/coretypes"
"github.com/spf13/cobra"
)
const permissionsTypePath = "frontend/src/hooks/useAuthZ/permissions.type.ts"
var permissionsTypeTemplate = template.Must(template.New("permissions").Parse(
`// AUTO GENERATED FILE - DO NOT EDIT - GENERATED BY cmd/enterprise/*.go generate authz
export default {
status: 'success',
data: {
resources: [
{{- range .Resources }}
{
kind: '{{ .Kind }}',
type: '{{ .Type }}',
{{ .FormattedAllowedVerbs }}
},
{{- end }}
],
relations: {
{{- range .Relations }}
{{ .Verb }}: [{{ range $i, $t := .Types }}{{ if $i }}, {{ end }}'{{ $t }}'{{ end }}],
{{- end }}
},
},
} as const;
`))
type permissionsTypeRelation struct {
Verb string
Types []string
}
type permissionsTypeResource struct {
Kind string
Type string
FormattedAllowedVerbs string
}
type permissionsTypeData struct {
Resources []permissionsTypeResource
Relations []permissionsTypeRelation
}
// formatAllowedVerbs returns a prettier-compatible formatted allowedVerbs line.
// indentLevel is the number of tabs for the property (matching kind/type indent).
// printWidth is prettier's printWidth; tabWidth is assumed to be 1 (each \t = 1 char).
func formatAllowedVerbs(verbs []string, indentLevel int, printWidth int) string {
quoted := make([]string, len(verbs))
for i, v := range verbs {
quoted[i] = "'" + v + "'"
}
indent := strings.Repeat("\t", indentLevel)
oneLine := indent + "allowedVerbs: [" + strings.Join(quoted, ", ") + "],"
if len(oneLine) <= printWidth {
return oneLine
}
var b strings.Builder
b.WriteString(indent + "allowedVerbs: [\n")
for _, q := range quoted {
b.WriteString(indent + "\t" + q + ",\n")
}
b.WriteString(indent + "],")
return b.String()
}
func registerGenerateAuthz(parentCmd *cobra.Command) {
authzCmd := &cobra.Command{
Use: "authz",
Short: "Generate authz permissions for the frontend",
RunE: func(currCmd *cobra.Command, args []string) error {
return runGenerateAuthz(currCmd.Context())
},
}
parentCmd.AddCommand(authzCmd)
}
func runGenerateAuthz(_ context.Context) error {
registry := coretypes.NewRegistry()
allowedResources := map[string]bool{
coretypes.NewResourceRef(coretypes.ResourceServiceAccount).String(): true,
coretypes.NewResourceRef(coretypes.ResourceRole).String(): true,
coretypes.NewResourceRef(coretypes.ResourceMetaResourceFactorAPIKey).String(): true,
}
allowedTypes := map[string]bool{}
refs := registry.ResourceRefs()
resources := make([]permissionsTypeResource, 0, len(refs))
for _, ref := range refs {
if !allowedResources[ref.String()] {
continue
}
allowedTypes[ref.Type.StringValue()] = true
resource, err := coretypes.NewResourceFromTypeAndKind(ref.Type, ref.Kind)
if err != nil {
return err
}
verbs := resource.AllowedVerbs()
allowedVerbStrings := make([]string, 0, len(verbs))
for _, verb := range verbs {
allowedVerbStrings = append(allowedVerbStrings, verb.StringValue())
}
sort.Strings(allowedVerbStrings)
resources = append(resources, permissionsTypeResource{
Kind: ref.Kind.String(),
Type: ref.Type.StringValue(),
FormattedAllowedVerbs: formatAllowedVerbs(allowedVerbStrings, 4, 80),
})
}
typesByVerb := registry.TypesByVerb()
verbs := make([]coretypes.Verb, 0, len(typesByVerb))
for verb := range typesByVerb {
verbs = append(verbs, verb)
}
sort.Slice(verbs, func(i, j int) bool { return verbs[i].StringValue() < verbs[j].StringValue() })
relations := make([]permissionsTypeRelation, 0, len(verbs))
for _, verb := range verbs {
types := make([]string, 0, len(typesByVerb[verb]))
for _, t := range typesByVerb[verb] {
if !allowedTypes[t.StringValue()] {
continue
}
types = append(types, t.StringValue())
}
relations = append(relations, permissionsTypeRelation{
Verb: verb.StringValue(),
Types: types,
})
}
var buf bytes.Buffer
if err := permissionsTypeTemplate.Execute(&buf, permissionsTypeData{Resources: resources, Relations: relations}); err != nil {
return err
}
return os.WriteFile(permissionsTypePath, buf.Bytes(), 0o600)
}

View File

@@ -11,7 +11,7 @@ RUN apk update && \
COPY ./target/${OS}-${TARGETARCH}/signoz-community /root/signoz
COPY ./templates /root/templates
COPY ./templates/email /root/templates
COPY frontend/build/ /etc/signoz/web/
RUN chmod 755 /root /root/signoz

View File

@@ -12,7 +12,7 @@ RUN apk update && \
rm -rf /var/cache/apk/*
COPY ./target/${OS}-${ARCH}/signoz-community /root/signoz-community
COPY ./templates /root/templates
COPY ./templates/email /root/templates
COPY frontend/build/ /etc/signoz/web/
RUN chmod 755 /root /root/signoz-community

View File

@@ -14,7 +14,6 @@ func main() {
// register a list of commands to the root command
registerServer(cmd.RootCmd, logger)
cmd.RegisterGenerate(cmd.RootCmd, logger)
cmd.RegisterMetastore(cmd.RootCmd, logger, sqlstoreProviderFactories, sqlschemaProviderFactories)
cmd.Execute(logger)
}

View File

@@ -1,16 +0,0 @@
package main
import (
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/signoz"
"github.com/SigNoz/signoz/pkg/sqlschema"
"github.com/SigNoz/signoz/pkg/sqlstore"
)
func sqlstoreProviderFactories() factory.NamedMap[factory.ProviderFactory[sqlstore.SQLStore, sqlstore.Config]] {
return signoz.NewSQLStoreProviderFactories()
}
func sqlschemaProviderFactories(sqlstore sqlstore.SQLStore) factory.NamedMap[factory.ProviderFactory[sqlschema.SQLSchema, sqlschema.Config]] {
return signoz.NewSQLSchemaProviderFactories(sqlstore)
}

View File

@@ -4,61 +4,42 @@ import (
"context"
"log/slog"
"github.com/spf13/cobra"
"github.com/SigNoz/signoz/cmd"
"github.com/SigNoz/signoz/pkg/alertmanager"
"github.com/SigNoz/signoz/pkg/analytics"
"github.com/SigNoz/signoz/pkg/auditor"
"github.com/SigNoz/signoz/pkg/authn"
"github.com/SigNoz/signoz/pkg/authz"
"github.com/SigNoz/signoz/pkg/authz/openfgaauthz"
"github.com/SigNoz/signoz/pkg/authz/openfgaschema"
"github.com/SigNoz/signoz/pkg/authz/openfgaserver"
"github.com/SigNoz/signoz/pkg/cache"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/flagger"
"github.com/SigNoz/signoz/pkg/gateway"
"github.com/SigNoz/signoz/pkg/gateway/noopgateway"
"github.com/SigNoz/signoz/pkg/global"
"github.com/SigNoz/signoz/pkg/licensing"
"github.com/SigNoz/signoz/pkg/licensing/nooplicensing"
"github.com/SigNoz/signoz/pkg/meterreporter"
"github.com/SigNoz/signoz/pkg/modules/cloudintegration"
"github.com/SigNoz/signoz/pkg/modules/cloudintegration/implcloudintegration"
"github.com/SigNoz/signoz/pkg/modules/dashboard"
"github.com/SigNoz/signoz/pkg/modules/dashboard/impldashboard"
"github.com/SigNoz/signoz/pkg/modules/organization"
"github.com/SigNoz/signoz/pkg/modules/retention"
"github.com/SigNoz/signoz/pkg/modules/rulestatehistory"
"github.com/SigNoz/signoz/pkg/modules/serviceaccount"
"github.com/SigNoz/signoz/pkg/modules/tag"
"github.com/SigNoz/signoz/pkg/prometheus"
"github.com/SigNoz/signoz/pkg/querier"
"github.com/SigNoz/signoz/pkg/query-service/app"
"github.com/SigNoz/signoz/pkg/queryparser"
"github.com/SigNoz/signoz/pkg/ruler"
"github.com/SigNoz/signoz/pkg/ruler/signozruler"
"github.com/SigNoz/signoz/pkg/signoz"
"github.com/SigNoz/signoz/pkg/sqlschema"
"github.com/SigNoz/signoz/pkg/sqlstore"
"github.com/SigNoz/signoz/pkg/telemetrystore"
"github.com/SigNoz/signoz/pkg/types/authtypes"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/version"
"github.com/SigNoz/signoz/pkg/zeus"
"github.com/SigNoz/signoz/pkg/zeus/noopzeus"
"github.com/spf13/cobra"
)
func registerServer(parentCmd *cobra.Command, logger *slog.Logger) {
var configFiles []string
var flags signoz.DeprecatedFlags
serverCmd := &cobra.Command{
Use: "server",
Short: "Run the SigNoz server",
FParseErrWhitelist: cobra.FParseErrWhitelist{UnknownFlags: true},
RunE: func(currCmd *cobra.Command, args []string) error {
config, err := cmd.NewSigNozConfig(currCmd.Context(), logger, configFiles)
config, err := cmd.NewSigNozConfig(currCmd.Context(), logger, flags)
if err != nil {
return err
}
@@ -67,7 +48,7 @@ func registerServer(parentCmd *cobra.Command, logger *slog.Logger) {
},
}
serverCmd.Flags().StringArrayVar(&configFiles, "config", nil, "path to a YAML configuration file (can be specified multiple times, later files override earlier ones)")
flags.RegisterFlags(serverCmd)
parentCmd.AddCommand(serverCmd)
}
@@ -86,75 +67,60 @@ func runServer(ctx context.Context, config signoz.Config, logger *slog.Logger) e
},
signoz.NewEmailingProviderFactories(),
signoz.NewCacheProviderFactories(),
signoz.NewWebProviderFactories(config.Global),
sqlschemaProviderFactories,
sqlstoreProviderFactories(),
signoz.NewWebProviderFactories(),
func(sqlstore sqlstore.SQLStore) factory.NamedMap[factory.ProviderFactory[sqlschema.SQLSchema, sqlschema.Config]] {
return signoz.NewSQLSchemaProviderFactories(sqlstore)
},
signoz.NewSQLStoreProviderFactories(),
signoz.NewTelemetryStoreProviderFactories(),
func(ctx context.Context, providerSettings factory.ProviderSettings, store authtypes.AuthNStore, licensing licensing.Licensing) (map[authtypes.AuthNProvider]authn.AuthN, error) {
return signoz.NewAuthNs(ctx, providerSettings, store, licensing, config.Global)
return signoz.NewAuthNs(ctx, providerSettings, store, licensing)
},
func(ctx context.Context, sqlstore sqlstore.SQLStore, config authz.Config, _ licensing.Licensing, _ []authz.OnBeforeRoleDelete) (factory.ProviderFactory[authz.AuthZ, authz.Config], error) {
openfgaDataStore, err := openfgaserver.NewSQLStore(sqlstore, config)
if err != nil {
return nil, err
}
return openfgaauthz.NewProviderFactory(sqlstore, openfgaschema.NewSchema().Get(ctx), openfgaDataStore, authtypes.NewRegistry()), nil
func(ctx context.Context, sqlstore sqlstore.SQLStore, _ licensing.Licensing, _ dashboard.Module) factory.ProviderFactory[authz.AuthZ, authz.Config] {
return openfgaauthz.NewProviderFactory(sqlstore, openfgaschema.NewSchema().Get(ctx))
},
func(store sqlstore.SQLStore, settings factory.ProviderSettings, analytics analytics.Analytics, orgGetter organization.Getter, queryParser queryparser.QueryParser, _ querier.Querier, _ licensing.Licensing, tagModule tag.Module) dashboard.Module {
return impldashboard.NewModule(impldashboard.NewStore(store), settings, analytics, orgGetter, queryParser, tagModule)
func(store sqlstore.SQLStore, settings factory.ProviderSettings, analytics analytics.Analytics, orgGetter organization.Getter, queryParser queryparser.QueryParser, _ querier.Querier, _ licensing.Licensing) dashboard.Module {
return impldashboard.NewModule(impldashboard.NewStore(store), settings, analytics, orgGetter, queryParser)
},
func(_ licensing.Licensing) factory.ProviderFactory[gateway.Gateway, gateway.Config] {
return noopgateway.NewProviderFactory()
},
func(_ licensing.Licensing) factory.NamedMap[factory.ProviderFactory[auditor.Auditor, auditor.Config]] {
return signoz.NewAuditorProviderFactories()
},
func(_ context.Context, _ factory.ProviderSettings, _ flagger.Flagger, _ licensing.Licensing, _ telemetrystore.TelemetryStore, _ retention.Getter, _ organization.Getter, _ zeus.Zeus) (factory.NamedMap[factory.ProviderFactory[meterreporter.Reporter, meterreporter.Config]], string) {
return signoz.NewMeterReporterProviderFactories(), "noop"
},
func(ps factory.ProviderSettings, q querier.Querier, a analytics.Analytics) querier.Handler {
return querier.NewHandler(ps, q, a)
},
func(_ sqlstore.SQLStore, _ dashboard.Module, _ global.Global, _ zeus.Zeus, _ gateway.Gateway, _ licensing.Licensing, _ serviceaccount.Module, _ cloudintegration.Config) (cloudintegration.Module, error) {
return implcloudintegration.NewModule(), nil
},
func(c cache.Cache, am alertmanager.Alertmanager, ss sqlstore.SQLStore, ts telemetrystore.TelemetryStore, ms telemetrytypes.MetadataStore, p prometheus.Prometheus, og organization.Getter, rsh rulestatehistory.Module, q querier.Querier, qp queryparser.QueryParser) factory.NamedMap[factory.ProviderFactory[ruler.Ruler, ruler.Config]] {
return factory.MustNewNamedMap(signozruler.NewFactory(c, am, ss, ts, ms, p, og, rsh, q, qp, nil, nil))
},
)
if err != nil {
logger.ErrorContext(ctx, "failed to create signoz", errors.Attr(err))
logger.ErrorContext(ctx, "failed to create signoz", "error", err)
return err
}
server, err := app.NewServer(config, signoz)
if err != nil {
logger.ErrorContext(ctx, "failed to create server", errors.Attr(err))
logger.ErrorContext(ctx, "failed to create server", "error", err)
return err
}
if err := server.Start(ctx); err != nil {
logger.ErrorContext(ctx, "failed to start server", errors.Attr(err))
logger.ErrorContext(ctx, "failed to start server", "error", err)
return err
}
signoz.Start(ctx)
if err := signoz.Wait(ctx); err != nil {
logger.ErrorContext(ctx, "failed to start signoz", errors.Attr(err))
logger.ErrorContext(ctx, "failed to start signoz", "error", err)
return err
}
err = server.Stop(ctx)
if err != nil {
logger.ErrorContext(ctx, "failed to stop server", errors.Attr(err))
logger.ErrorContext(ctx, "failed to stop server", "error", err)
return err
}
err = signoz.Stop(ctx)
if err != nil {
logger.ErrorContext(ctx, "failed to stop signoz", errors.Attr(err))
logger.ErrorContext(ctx, "failed to stop signoz", "error", err)
return err
}

View File

@@ -10,23 +10,18 @@ import (
"github.com/SigNoz/signoz/pkg/signoz"
)
func NewSigNozConfig(ctx context.Context, logger *slog.Logger, configFiles []string) (signoz.Config, error) {
uris := make([]string, 0, len(configFiles)+1)
for _, f := range configFiles {
uris = append(uris, "file:"+f)
}
uris = append(uris, "env:")
func NewSigNozConfig(ctx context.Context, logger *slog.Logger, flags signoz.DeprecatedFlags) (signoz.Config, error) {
config, err := signoz.NewConfig(
ctx,
logger,
config.ResolverConfig{
Uris: uris,
Uris: []string{"env:"},
ProviderFactories: []config.ProviderFactory{
envprovider.NewFactory(),
fileprovider.NewFactory(),
},
},
flags,
)
if err != nil {
return signoz.Config{}, err

View File

@@ -1,86 +0,0 @@
package cmd
import (
"context"
"log/slog"
"os"
"path/filepath"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestNewSigNozConfig_NoConfigFiles(t *testing.T) {
logger := slog.New(slog.DiscardHandler)
config, err := NewSigNozConfig(context.Background(), logger, nil)
require.NoError(t, err)
assert.NotZero(t, config)
}
func TestNewSigNozConfig_SingleConfigFile(t *testing.T) {
dir := t.TempDir()
configPath := filepath.Join(dir, "config.yaml")
err := os.WriteFile(configPath, []byte(`
cache:
provider: "redis"
`), 0644)
require.NoError(t, err)
logger := slog.New(slog.DiscardHandler)
config, err := NewSigNozConfig(context.Background(), logger, []string{configPath})
require.NoError(t, err)
assert.Equal(t, "redis", config.Cache.Provider)
}
func TestNewSigNozConfig_MultipleConfigFiles_LaterOverridesEarlier(t *testing.T) {
dir := t.TempDir()
basePath := filepath.Join(dir, "base.yaml")
err := os.WriteFile(basePath, []byte(`
cache:
provider: "memory"
sqlstore:
provider: "sqlite"
`), 0644)
require.NoError(t, err)
overridePath := filepath.Join(dir, "override.yaml")
err = os.WriteFile(overridePath, []byte(`
cache:
provider: "redis"
`), 0644)
require.NoError(t, err)
logger := slog.New(slog.DiscardHandler)
config, err := NewSigNozConfig(context.Background(), logger, []string{basePath, overridePath})
require.NoError(t, err)
// Later file overrides earlier
assert.Equal(t, "redis", config.Cache.Provider)
// Value from base file that wasn't overridden persists
assert.Equal(t, "sqlite", config.SQLStore.Provider)
}
func TestNewSigNozConfig_EnvOverridesConfigFile(t *testing.T) {
dir := t.TempDir()
configPath := filepath.Join(dir, "config.yaml")
err := os.WriteFile(configPath, []byte(`
cache:
provider: "fromfile"
`), 0644)
require.NoError(t, err)
t.Setenv("SIGNOZ_CACHE_PROVIDER", "fromenv")
logger := slog.New(slog.DiscardHandler)
config, err := NewSigNozConfig(context.Background(), logger, []string{configPath})
require.NoError(t, err)
// Env should override file
assert.Equal(t, "fromenv", config.Cache.Provider)
}
func TestNewSigNozConfig_NonexistentFile(t *testing.T) {
logger := slog.New(slog.DiscardHandler)
_, err := NewSigNozConfig(context.Background(), logger, []string{"/nonexistent/config.yaml"})
assert.Error(t, err)
}

View File

@@ -11,7 +11,7 @@ RUN apk update && \
COPY ./target/${OS}-${TARGETARCH}/signoz /root/signoz
COPY ./templates /root/templates
COPY ./templates/email /root/templates
COPY frontend/build/ /etc/signoz/web/
RUN chmod 755 /root /root/signoz

View File

@@ -26,7 +26,7 @@ RUN go mod download
COPY ./cmd/ ./cmd/
COPY ./ee/ ./ee/
COPY ./pkg/ ./pkg/
COPY ./templates /root/templates
COPY ./templates/email /root/templates
COPY Makefile Makefile
RUN TARGET_DIR=/root ARCHS=${TARGETARCH} ZEUS_URL=${ZEUSURL} LICENSE_URL=${ZEUSURL}/api/v1 make go-build-enterprise-race

View File

@@ -12,7 +12,7 @@ RUN apk update && \
rm -rf /var/cache/apk/*
COPY ./target/${OS}-${ARCH}/signoz /root/signoz
COPY ./templates /root/templates
COPY ./templates/email /root/templates
COPY frontend/build/ /etc/signoz/web/
RUN chmod 755 /root /root/signoz

View File

@@ -3,9 +3,8 @@ FROM node:22-bookworm AS build
WORKDIR /opt/
COPY ./frontend/ ./
ENV NODE_OPTIONS=--max-old-space-size=8192
RUN CI=1 npm i -g pnpm@10
RUN CI=1 pnpm install
RUN CI=1 pnpm build
RUN CI=1 yarn install
RUN CI=1 yarn build
FROM golang:1.25-bookworm
@@ -35,7 +34,7 @@ RUN go mod download
COPY ./cmd/ ./cmd/
COPY ./ee/ ./ee/
COPY ./pkg/ ./pkg/
COPY ./templates /root/templates
COPY ./templates/email /root/templates
COPY Makefile Makefile
RUN TARGET_DIR=/root ARCHS=${TARGETARCH} ZEUS_URL=${ZEUSURL} LICENSE_URL=${ZEUSURL}/api/v1 make go-build-enterprise-race

View File

@@ -14,7 +14,6 @@ func main() {
// register a list of commands to the root command
registerServer(cmd.RootCmd, logger)
cmd.RegisterGenerate(cmd.RootCmd, logger)
cmd.RegisterMetastore(cmd.RootCmd, logger, sqlstoreProviderFactories, sqlschemaProviderFactories)
cmd.Execute(logger)
}

View File

@@ -1,29 +0,0 @@
package main
import (
"github.com/SigNoz/signoz/ee/sqlschema/postgressqlschema"
"github.com/SigNoz/signoz/ee/sqlstore/postgressqlstore"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/signoz"
"github.com/SigNoz/signoz/pkg/sqlschema"
"github.com/SigNoz/signoz/pkg/sqlstore"
"github.com/SigNoz/signoz/pkg/sqlstore/sqlstorehook"
)
func sqlstoreProviderFactories() factory.NamedMap[factory.ProviderFactory[sqlstore.SQLStore, sqlstore.Config]] {
existingFactories := signoz.NewSQLStoreProviderFactories()
if err := existingFactories.Add(postgressqlstore.NewFactory(sqlstorehook.NewLoggingFactory(), sqlstorehook.NewInstrumentationFactory())); err != nil {
panic(err)
}
return existingFactories
}
func sqlschemaProviderFactories(sqlstore sqlstore.SQLStore) factory.NamedMap[factory.ProviderFactory[sqlschema.SQLSchema, sqlschema.Config]] {
existingFactories := signoz.NewSQLSchemaProviderFactories(sqlstore)
if err := existingFactories.Add(postgressqlschema.NewFactory(sqlstore)); err != nil {
panic(err)
}
return existingFactories
}

View File

@@ -1,55 +0,0 @@
package main
import (
"github.com/SigNoz/signoz/pkg/metercollector"
"github.com/SigNoz/signoz/pkg/telemetrylogs"
"github.com/SigNoz/signoz/pkg/telemetrymetrics"
"github.com/SigNoz/signoz/pkg/telemetrytraces"
"github.com/SigNoz/signoz/pkg/types/retentiontypes"
"github.com/SigNoz/signoz/pkg/types/zeustypes"
)
var meterConfigs = []metercollector.Config{
{
Provider: metercollector.ProviderStatic,
Static: metercollector.StaticConfig{
Name: zeustypes.MeterPlatformActive,
Unit: zeustypes.MeterUnitCount,
Aggregation: zeustypes.MeterAggregationMax,
Value: 1,
},
},
{
Provider: metercollector.ProviderTelemetry,
Telemetry: metercollector.TelemetryConfig{
Name: zeustypes.MeterLogSize,
Unit: zeustypes.MeterUnitBytes,
Aggregation: zeustypes.MeterAggregationSum,
DBName: telemetrylogs.DBName,
TableName: telemetrylogs.LogsV2LocalTableName,
DefaultRetentionDays: retentiontypes.DefaultLogsRetentionDays,
},
},
{
Provider: metercollector.ProviderTelemetry,
Telemetry: metercollector.TelemetryConfig{
Name: zeustypes.MeterSpanSize,
Unit: zeustypes.MeterUnitBytes,
Aggregation: zeustypes.MeterAggregationSum,
DBName: telemetrytraces.DBName,
TableName: telemetrytraces.SpanIndexV3LocalTableName,
DefaultRetentionDays: retentiontypes.DefaultTracesRetentionDays,
},
},
{
Provider: metercollector.ProviderTelemetry,
Telemetry: metercollector.TelemetryConfig{
Name: zeustypes.MeterDatapointCount,
Unit: zeustypes.MeterUnitCount,
Aggregation: zeustypes.MeterAggregationSum,
DBName: telemetrymetrics.DBName,
TableName: telemetrymetrics.SamplesV4LocalTableName,
DefaultRetentionDays: retentiontypes.DefaultMetricsRetentionDays,
},
},
}

View File

@@ -5,76 +5,51 @@ import (
"log/slog"
"time"
"github.com/spf13/cobra"
"github.com/SigNoz/signoz/cmd"
"github.com/SigNoz/signoz/ee/auditor/fileauditor"
"github.com/SigNoz/signoz/ee/auditor/otlphttpauditor"
"github.com/SigNoz/signoz/ee/authn/callbackauthn/oidccallbackauthn"
"github.com/SigNoz/signoz/ee/authn/callbackauthn/samlcallbackauthn"
"github.com/SigNoz/signoz/ee/authz/openfgaauthz"
eequerier "github.com/SigNoz/signoz/ee/querier"
"github.com/SigNoz/signoz/ee/authz/openfgaschema"
"github.com/SigNoz/signoz/ee/authz/openfgaserver"
"github.com/SigNoz/signoz/ee/gateway/httpgateway"
enterpriselicensing "github.com/SigNoz/signoz/ee/licensing"
"github.com/SigNoz/signoz/ee/licensing/httplicensing"
"github.com/SigNoz/signoz/ee/metercollector/staticmetercollector"
"github.com/SigNoz/signoz/ee/metercollector/telemetrymetercollector"
"github.com/SigNoz/signoz/ee/meterreporter/httpmeterreporter"
"github.com/SigNoz/signoz/ee/modules/cloudintegration/implcloudintegration"
"github.com/SigNoz/signoz/ee/modules/cloudintegration/implcloudintegration/implcloudprovider"
"github.com/SigNoz/signoz/ee/modules/dashboard/impldashboard"
eequerier "github.com/SigNoz/signoz/ee/querier"
enterpriseapp "github.com/SigNoz/signoz/ee/query-service/app"
eerules "github.com/SigNoz/signoz/ee/query-service/rules"
"github.com/SigNoz/signoz/ee/sqlschema/postgressqlschema"
"github.com/SigNoz/signoz/ee/sqlstore/postgressqlstore"
enterprisezeus "github.com/SigNoz/signoz/ee/zeus"
"github.com/SigNoz/signoz/ee/zeus/httpzeus"
"github.com/SigNoz/signoz/pkg/alertmanager"
"github.com/SigNoz/signoz/pkg/analytics"
"github.com/SigNoz/signoz/pkg/auditor"
"github.com/SigNoz/signoz/pkg/authn"
"github.com/SigNoz/signoz/pkg/authz"
"github.com/SigNoz/signoz/pkg/cache"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/factory"
pkgflagger "github.com/SigNoz/signoz/pkg/flagger"
"github.com/SigNoz/signoz/pkg/gateway"
"github.com/SigNoz/signoz/pkg/global"
"github.com/SigNoz/signoz/pkg/licensing"
"github.com/SigNoz/signoz/pkg/meterreporter"
"github.com/SigNoz/signoz/pkg/modules/cloudintegration"
pkgcloudintegration "github.com/SigNoz/signoz/pkg/modules/cloudintegration/implcloudintegration"
"github.com/SigNoz/signoz/pkg/modules/dashboard"
pkgimpldashboard "github.com/SigNoz/signoz/pkg/modules/dashboard/impldashboard"
"github.com/SigNoz/signoz/pkg/modules/organization"
"github.com/SigNoz/signoz/pkg/modules/retention"
"github.com/SigNoz/signoz/pkg/modules/rulestatehistory"
"github.com/SigNoz/signoz/pkg/modules/serviceaccount"
"github.com/SigNoz/signoz/pkg/modules/tag"
"github.com/SigNoz/signoz/pkg/prometheus"
"github.com/SigNoz/signoz/pkg/querier"
"github.com/SigNoz/signoz/pkg/queryparser"
"github.com/SigNoz/signoz/pkg/ruler"
"github.com/SigNoz/signoz/pkg/ruler/signozruler"
"github.com/SigNoz/signoz/pkg/signoz"
"github.com/SigNoz/signoz/pkg/sqlschema"
"github.com/SigNoz/signoz/pkg/sqlstore"
"github.com/SigNoz/signoz/pkg/telemetrystore"
"github.com/SigNoz/signoz/pkg/sqlstore/sqlstorehook"
"github.com/SigNoz/signoz/pkg/types/authtypes"
"github.com/SigNoz/signoz/pkg/types/cloudintegrationtypes"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/version"
"github.com/SigNoz/signoz/pkg/zeus"
"github.com/spf13/cobra"
)
func registerServer(parentCmd *cobra.Command, logger *slog.Logger) {
var configFiles []string
var flags signoz.DeprecatedFlags
serverCmd := &cobra.Command{
Use: "server",
Short: "Run the SigNoz server",
FParseErrWhitelist: cobra.FParseErrWhitelist{UnknownFlags: true},
RunE: func(currCmd *cobra.Command, args []string) error {
config, err := cmd.NewSigNozConfig(currCmd.Context(), logger, configFiles)
config, err := cmd.NewSigNozConfig(currCmd.Context(), logger, flags)
if err != nil {
return err
}
@@ -83,7 +58,7 @@ func registerServer(parentCmd *cobra.Command, logger *slog.Logger) {
},
}
serverCmd.Flags().StringArrayVar(&configFiles, "config", nil, "path to a YAML configuration file (can be specified multiple times, later files override earlier ones)")
flags.RegisterFlags(serverCmd)
parentCmd.AddCommand(serverCmd)
}
@@ -91,6 +66,13 @@ func runServer(ctx context.Context, config signoz.Config, logger *slog.Logger) e
// print the version
version.Info.PrettyPrint(config.Version)
// add enterprise sqlstore factories to the community sqlstore factories
sqlstoreFactories := signoz.NewSQLStoreProviderFactories()
if err := sqlstoreFactories.Add(postgressqlstore.NewFactory(sqlstorehook.NewLoggingFactory(), sqlstorehook.NewInstrumentationFactory())); err != nil {
logger.ErrorContext(ctx, "failed to add postgressqlstore factory", "error", err)
return err
}
signoz, err := signoz.New(
ctx,
config,
@@ -102,22 +84,29 @@ func runServer(ctx context.Context, config signoz.Config, logger *slog.Logger) e
},
signoz.NewEmailingProviderFactories(),
signoz.NewCacheProviderFactories(),
signoz.NewWebProviderFactories(config.Global),
sqlschemaProviderFactories,
sqlstoreProviderFactories(),
signoz.NewWebProviderFactories(),
func(sqlstore sqlstore.SQLStore) factory.NamedMap[factory.ProviderFactory[sqlschema.SQLSchema, sqlschema.Config]] {
existingFactories := signoz.NewSQLSchemaProviderFactories(sqlstore)
if err := existingFactories.Add(postgressqlschema.NewFactory(sqlstore)); err != nil {
panic(err)
}
return existingFactories
},
sqlstoreFactories,
signoz.NewTelemetryStoreProviderFactories(),
func(ctx context.Context, providerSettings factory.ProviderSettings, store authtypes.AuthNStore, licensing licensing.Licensing) (map[authtypes.AuthNProvider]authn.AuthN, error) {
samlCallbackAuthN, err := samlcallbackauthn.New(ctx, store, licensing, config.Global)
samlCallbackAuthN, err := samlcallbackauthn.New(ctx, store, licensing)
if err != nil {
return nil, err
}
oidcCallbackAuthN, err := oidccallbackauthn.New(store, licensing, providerSettings, config.Global)
oidcCallbackAuthN, err := oidccallbackauthn.New(store, licensing, providerSettings)
if err != nil {
return nil, err
}
authNs, err := signoz.NewAuthNs(ctx, providerSettings, store, licensing, config.Global)
authNs, err := signoz.NewAuthNs(ctx, providerSettings, store, licensing)
if err != nil {
return nil, err
}
@@ -127,97 +116,53 @@ func runServer(ctx context.Context, config signoz.Config, logger *slog.Logger) e
return authNs, nil
},
func(ctx context.Context, sqlstore sqlstore.SQLStore, config authz.Config, licensing licensing.Licensing, onBeforeRoleDelete []authz.OnBeforeRoleDelete) (factory.ProviderFactory[authz.AuthZ, authz.Config], error) {
openfgaDataStore, err := openfgaserver.NewSQLStore(sqlstore, config)
if err != nil {
return nil, err
}
return openfgaauthz.NewProviderFactory(sqlstore, openfgaschema.NewSchema().Get(ctx), openfgaDataStore, licensing, onBeforeRoleDelete, authtypes.NewRegistry()), nil
func(ctx context.Context, sqlstore sqlstore.SQLStore, licensing licensing.Licensing, dashboardModule dashboard.Module) factory.ProviderFactory[authz.AuthZ, authz.Config] {
return openfgaauthz.NewProviderFactory(sqlstore, openfgaschema.NewSchema().Get(ctx), licensing, dashboardModule)
},
func(store sqlstore.SQLStore, settings factory.ProviderSettings, analytics analytics.Analytics, orgGetter organization.Getter, queryParser queryparser.QueryParser, querier querier.Querier, licensing licensing.Licensing, tagModule tag.Module) dashboard.Module {
return impldashboard.NewModule(pkgimpldashboard.NewStore(store), settings, analytics, orgGetter, queryParser, querier, licensing, tagModule)
func(store sqlstore.SQLStore, settings factory.ProviderSettings, analytics analytics.Analytics, orgGetter organization.Getter, queryParser queryparser.QueryParser, querier querier.Querier, licensing licensing.Licensing) dashboard.Module {
return impldashboard.NewModule(pkgimpldashboard.NewStore(store), settings, analytics, orgGetter, queryParser, querier, licensing)
},
func(licensing licensing.Licensing) factory.ProviderFactory[gateway.Gateway, gateway.Config] {
return httpgateway.NewProviderFactory(licensing)
},
func(licensing licensing.Licensing) factory.NamedMap[factory.ProviderFactory[auditor.Auditor, auditor.Config]] {
factories := signoz.NewAuditorProviderFactories()
if err := factories.Add(otlphttpauditor.NewFactory(licensing, version.Info)); err != nil {
panic(err)
}
if err := factories.Add(fileauditor.NewFactory(licensing, version.Info)); err != nil {
panic(err)
}
return factories
},
func(ctx context.Context, providerSettings factory.ProviderSettings, flagger pkgflagger.Flagger, licensing licensing.Licensing, telemetryStore telemetrystore.TelemetryStore, retentionGetter retention.Getter, orgGetter organization.Getter, zeus zeus.Zeus) (factory.NamedMap[factory.ProviderFactory[meterreporter.Reporter, meterreporter.Config]], string) {
factories := signoz.NewMeterReporterProviderFactories()
collectorFactories := factory.MustNewNamedMap(
staticmetercollector.NewFactory(),
telemetrymetercollector.NewFactory(telemetryStore, retentionGetter),
)
if err := factories.Add(httpmeterreporter.NewFactory(collectorFactories, meterConfigs, flagger, licensing, orgGetter, zeus)); err != nil {
panic(err)
}
return factories, "http"
},
func(ps factory.ProviderSettings, q querier.Querier, a analytics.Analytics) querier.Handler {
communityHandler := querier.NewHandler(ps, q, a)
return eequerier.NewHandler(ps, q, communityHandler)
},
func(sqlStore sqlstore.SQLStore, dashboardModule dashboard.Module, global global.Global, zeus zeus.Zeus, gateway gateway.Gateway, licensing licensing.Licensing, serviceAccount serviceaccount.Module, config cloudintegration.Config) (cloudintegration.Module, error) {
defStore := pkgcloudintegration.NewServiceDefinitionStore()
awsCloudProviderModule, err := implcloudprovider.NewAWSCloudProvider(defStore)
if err != nil {
return nil, err
}
azureCloudProviderModule := implcloudprovider.NewAzureCloudProvider(defStore)
cloudProvidersMap := map[cloudintegrationtypes.CloudProviderType]cloudintegration.CloudProviderModule{
cloudintegrationtypes.CloudProviderTypeAWS: awsCloudProviderModule,
cloudintegrationtypes.CloudProviderTypeAzure: azureCloudProviderModule,
}
return implcloudintegration.NewModule(pkgcloudintegration.NewStore(sqlStore), dashboardModule, global, zeus, gateway, licensing, serviceAccount, cloudProvidersMap, config)
},
func(c cache.Cache, am alertmanager.Alertmanager, ss sqlstore.SQLStore, ts telemetrystore.TelemetryStore, ms telemetrytypes.MetadataStore, p prometheus.Prometheus, og organization.Getter, rsh rulestatehistory.Module, q querier.Querier, qp queryparser.QueryParser) factory.NamedMap[factory.ProviderFactory[ruler.Ruler, ruler.Config]] {
return factory.MustNewNamedMap(signozruler.NewFactory(c, am, ss, ts, ms, p, og, rsh, q, qp, eerules.PrepareTaskFunc, eerules.TestNotification))
},
)
if err != nil {
logger.ErrorContext(ctx, "failed to create signoz", errors.Attr(err))
logger.ErrorContext(ctx, "failed to create signoz", "error", err)
return err
}
server, err := enterpriseapp.NewServer(config, signoz)
if err != nil {
logger.ErrorContext(ctx, "failed to create server", errors.Attr(err))
logger.ErrorContext(ctx, "failed to create server", "error", err)
return err
}
if err := server.Start(ctx); err != nil {
logger.ErrorContext(ctx, "failed to start server", errors.Attr(err))
logger.ErrorContext(ctx, "failed to start server", "error", err)
return err
}
signoz.Start(ctx)
if err := signoz.Wait(ctx); err != nil {
logger.ErrorContext(ctx, "failed to start signoz", errors.Attr(err))
logger.ErrorContext(ctx, "failed to start signoz", "error", err)
return err
}
err = server.Stop(ctx)
if err != nil {
logger.ErrorContext(ctx, "failed to stop server", errors.Attr(err))
logger.ErrorContext(ctx, "failed to stop server", "error", err)
return err
}
err = signoz.Stop(ctx)
if err != nil {
logger.ErrorContext(ctx, "failed to stop signoz", errors.Attr(err))
logger.ErrorContext(ctx, "failed to stop signoz", "error", err)
return err
}

View File

@@ -1,61 +0,0 @@
package cmd
import (
"encoding/json"
"os"
"reflect"
"strings"
"github.com/SigNoz/signoz/pkg/web"
"github.com/spf13/cobra"
"github.com/swaggest/jsonschema-go"
)
const webSettingsSchemaPath = "docs/config/web-settings.json"
func registerGenerateConfig(parentCmd *cobra.Command) {
configCmd := &cobra.Command{
Use: "config",
Short: "Generate JSON Schema for config",
}
configCmd.AddCommand(&cobra.Command{
Use: "web-settings",
Short: "Generate JSON Schema for web settings",
RunE: func(currCmd *cobra.Command, args []string) error {
return generateWebSettings()
},
})
parentCmd.AddCommand(configCmd)
}
func generateWebSettings() error {
falseVal := false
noAdditional := jsonschema.SchemaOrBool{TypeBoolean: &falseVal}
reflector := jsonschema.Reflector{}
reflector.DefaultOptions = append(reflector.DefaultOptions,
jsonschema.InterceptSchema(func(params jsonschema.InterceptSchemaParams) (bool, error) {
if params.Value.Kind() == reflect.Struct {
params.Schema.AdditionalProperties = &noAdditional
}
return false, nil
}),
jsonschema.InterceptDefName(func(t reflect.Type, defaultDefName string) string {
return strings.TrimPrefix(defaultDefName, "Web")
}),
)
schema, err := reflector.Reflect(web.Settings{})
if err != nil {
return err
}
data, err := json.MarshalIndent(schema, "", " ")
if err != nil {
return err
}
return os.WriteFile(webSettingsSchemaPath, append(data, '\n'), 0o600)
}

View File

@@ -16,8 +16,6 @@ func RegisterGenerate(parentCmd *cobra.Command, logger *slog.Logger) {
}
registerGenerateOpenAPI(generateCmd)
registerGenerateAuthz(generateCmd)
registerGenerateConfig(generateCmd)
parentCmd.AddCommand(generateCmd)
}

View File

@@ -1,154 +0,0 @@
package cmd
import (
"context"
"log/slog"
"github.com/spf13/cobra"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/instrumentation"
"github.com/SigNoz/signoz/pkg/signoz"
"github.com/SigNoz/signoz/pkg/sqlmigration"
"github.com/SigNoz/signoz/pkg/sqlmigrator"
"github.com/SigNoz/signoz/pkg/sqlschema"
"github.com/SigNoz/signoz/pkg/sqlstore"
"github.com/SigNoz/signoz/pkg/version"
)
type SQLStoreProviderFactories func() factory.NamedMap[factory.ProviderFactory[sqlstore.SQLStore, sqlstore.Config]]
type SQLSchemaProviderFactories func(sqlstore.SQLStore) factory.NamedMap[factory.ProviderFactory[sqlschema.SQLSchema, sqlschema.Config]]
func RegisterMetastore(parentCmd *cobra.Command, logger *slog.Logger, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) {
metastoreCmd := &cobra.Command{
Use: "metastore",
Short: "Run commands to interact with the Metastore",
SilenceUsage: true,
SilenceErrors: true,
CompletionOptions: cobra.CompletionOptions{DisableDefaultCmd: true},
}
registerMigrate(metastoreCmd, logger, sqlstoreProviderFactories, sqlschemaProviderFactories)
parentCmd.AddCommand(metastoreCmd)
}
func registerMigrate(parentCmd *cobra.Command, logger *slog.Logger, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) {
migrateCmd := &cobra.Command{
Use: "migrate",
Short: "Run migrations for the Metastore",
SilenceUsage: true,
SilenceErrors: true,
CompletionOptions: cobra.CompletionOptions{DisableDefaultCmd: true},
}
registerSync(migrateCmd, logger, sqlstoreProviderFactories, sqlschemaProviderFactories)
parentCmd.AddCommand(migrateCmd)
}
func registerSync(parentCmd *cobra.Command, logger *slog.Logger, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) {
syncCmd := &cobra.Command{
Use: "sync",
Short: "Runs 'sync' migrations for the metastore. Sync migrations are used to mutate schemas of the metastore. These migrations need to be successfully applied before bringing up the application.",
SilenceUsage: true,
SilenceErrors: true,
CompletionOptions: cobra.CompletionOptions{DisableDefaultCmd: true},
}
registerSyncUp(syncCmd, logger, sqlstoreProviderFactories, sqlschemaProviderFactories)
registerSyncCheck(syncCmd, logger, sqlstoreProviderFactories, sqlschemaProviderFactories)
parentCmd.AddCommand(syncCmd)
}
func registerSyncUp(parentCmd *cobra.Command, logger *slog.Logger, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) {
var configFiles []string
syncUpCmd := &cobra.Command{
Use: "up",
Short: "Runs 'up' migrations for the metastore. Up migrations are used to apply new migrations to the metastore",
FParseErrWhitelist: cobra.FParseErrWhitelist{UnknownFlags: true},
RunE: func(currCmd *cobra.Command, args []string) error {
config, err := NewSigNozConfig(currCmd.Context(), logger, configFiles)
if err != nil {
return err
}
return runSyncUp(currCmd.Context(), config, sqlstoreProviderFactories, sqlschemaProviderFactories)
},
}
syncUpCmd.Flags().StringArrayVar(&configFiles, "config", nil, "path to a YAML configuration file (can be specified multiple times, later files override earlier ones)")
parentCmd.AddCommand(syncUpCmd)
}
func registerSyncCheck(parentCmd *cobra.Command, logger *slog.Logger, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) {
var configFiles []string
syncCheckCmd := &cobra.Command{
Use: "check",
Short: "Runs a check for 'sync' migrations on the metastore. Returns a non-zero exit code if any migrations are pending.",
FParseErrWhitelist: cobra.FParseErrWhitelist{UnknownFlags: true},
RunE: func(currCmd *cobra.Command, args []string) error {
config, err := NewSigNozConfig(currCmd.Context(), logger, configFiles)
if err != nil {
return err
}
return runSyncCheck(currCmd.Context(), config, sqlstoreProviderFactories, sqlschemaProviderFactories)
},
}
syncCheckCmd.Flags().StringArrayVar(&configFiles, "config", nil, "path to a YAML configuration file (can be specified multiple times, later files override earlier ones)")
parentCmd.AddCommand(syncCheckCmd)
}
func runSyncUp(ctx context.Context, config signoz.Config, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) error {
migrator, err := newSyncMigrator(ctx, config, sqlstoreProviderFactories, sqlschemaProviderFactories)
if err != nil {
return err
}
return migrator.Migrate(ctx)
}
func runSyncCheck(ctx context.Context, config signoz.Config, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) error {
migrator, err := newSyncMigrator(ctx, config, sqlstoreProviderFactories, sqlschemaProviderFactories)
if err != nil {
return err
}
return migrator.Check(ctx)
}
func newSyncMigrator(ctx context.Context, config signoz.Config, sqlstoreProviderFactories SQLStoreProviderFactories, sqlschemaProviderFactories SQLSchemaProviderFactories) (sqlmigrator.SQLMigrator, error) {
instrumentation, err := instrumentation.New(ctx, config.Instrumentation, version.Info, "signoz")
if err != nil {
return nil, err
}
providerSettings := instrumentation.ToProviderSettings()
sqlstore, err := factory.NewProviderFromNamedMap(ctx, providerSettings, config.SQLStore, sqlstoreProviderFactories(), config.SQLStore.Provider)
if err != nil {
return nil, err
}
sqlschema, err := factory.NewProviderFromNamedMap(ctx, providerSettings, config.SQLSchema, sqlschemaProviderFactories(sqlstore), config.SQLStore.Provider)
if err != nil {
return nil, err
}
telemetrystore, err := factory.NewProviderFromNamedMap(ctx, providerSettings, config.TelemetryStore, signoz.NewTelemetryStoreProviderFactories(), config.TelemetryStore.Provider)
if err != nil {
return nil, err
}
sqlmigrations, err := sqlmigration.New(ctx, providerSettings, config.SQLMigration, signoz.NewSQLMigrationProviderFactories(sqlstore, sqlschema, telemetrystore, providerSettings))
if err != nil {
return nil, err
}
return sqlmigrator.New(ctx, providerSettings, sqlstore, sqlmigrations, config.SQLMigrator), nil
}

View File

@@ -4,10 +4,8 @@ import (
"log/slog"
"os"
"github.com/spf13/cobra"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/version"
"github.com/spf13/cobra"
)
var RootCmd = &cobra.Command{
@@ -22,7 +20,7 @@ var RootCmd = &cobra.Command{
func Execute(logger *slog.Logger) {
err := RootCmd.Execute()
if err != nil {
logger.ErrorContext(RootCmd.Context(), "error running command", errors.Attr(err))
logger.ErrorContext(RootCmd.Context(), "error running command", "error", err)
os.Exit(1)
}
}

4
conf/cache-config.yml Normal file
View File

@@ -0,0 +1,4 @@
provider: "inmemory"
inmemory:
ttl: 60m
cleanupInterval: 10m

View File

@@ -6,15 +6,9 @@
##################### Global #####################
global:
# the url under which the signoz apiserver is externally reachable.
# the path component (e.g. /signoz in https://example.com/signoz) is used
# as the base path for all HTTP routes (both API and web frontend).
external_url: <unset>
# the url where the SigNoz backend receives telemetry data (traces, metrics, logs) from instrumented applications.
ingestion_url: <unset>
# the url of the SigNoz MCP server. when unset, the MCP settings page is hidden in the frontend.
# mcp_url: <unset>
# the url of the SigNoz AI Assistant server. when unset, the AI Assistant is hidden in the frontend.
# ai_assistant_url: <unset>
##################### Version #####################
version:
@@ -45,35 +39,14 @@ instrumentation:
host: "0.0.0.0"
port: 9090
##################### PProf #####################
pprof:
# Whether to enable the pprof server.
enabled: true
# The address on which the pprof server listens.
address: 0.0.0.0:6060
##################### Web #####################
web:
# Whether to enable the web frontend
enabled: true
# The index file to use as the SPA entrypoint.
index: index.html
# The prefix to serve web on
prefix: /
# The directory containing the static build files.
directory: /etc/signoz/web
# Settings exposed to the web.
settings:
posthog:
# Whether to enable PostHog in web.
enabled: false
appcues:
# Whether to enable Appcues in web.
enabled: false
sentry:
# Whether to enable Sentry in web.
enabled: false
pylon:
# Whether to enable Pylon in web.
enabled: false
##################### Cache #####################
cache:
@@ -102,18 +75,13 @@ sqlstore:
provider: sqlite
# The maximum number of open connections to the database.
max_open_conns: 100
# The maximum amount of time a connection may be reused.
# If max_conn_lifetime == 0, connections are not closed due to a connection's age.
max_conn_lifetime: 0
sqlite:
# The path to the SQLite database file.
path: /var/lib/signoz/signoz.db
# The journal mode for the sqlite database. Supported values: delete, wal.
mode: wal
# The timeout for the sqlite database to wait for a lock.
# Mode is the mode to use for the sqlite database.
mode: delete
# BusyTimeout is the timeout for the sqlite database to wait for a lock.
busy_timeout: 10s
# The default transaction locking behavior. Supported values: deferred, immediate, exclusive.
transaction_mode: deferred
##################### APIServer #####################
apiserver:
@@ -169,8 +137,6 @@ telemetrystore:
##################### Prometheus #####################
prometheus:
# The maximum time a PromQL query is allowed to run before being aborted.
timeout: 2m
active_query_tracker:
# Whether to enable the active query tracker.
enabled: true
@@ -188,11 +154,6 @@ alertmanager:
poll_interval: 1m
# The URL under which Alertmanager is externally reachable (for example, if Alertmanager is served via a reverse proxy). Used for generating relative and absolute links back to Alertmanager itself.
external_url: http://localhost:8080
# The list of globs from which SigNoz's alertmanager notification templates are loaded (e.g. the email.signoz.html layout).
# This mirrors the upstream alertmanager `templates` config option. The upstream default templates (default.tmpl, email.tmpl)
# are always loaded from the embedded alertmanager assets, so only SigNoz's own templates need to be listed here.
templates:
- /opt/signoz/conf/templates/alertmanager/*.gotmpl
# The global configuration for the alertmanager. All the exahustive fields can be found in the upstream: https://github.com/prometheus/alertmanager/blob/efa05feffd644ba4accb526e98a8c6545d26a783/config/config.go#L833
global:
# ResolveTimeout is the time after which an alert is declared resolved if it has not been updated.
@@ -382,92 +343,3 @@ identn:
impersonation:
# toggle impersonation identN, when enabled, all requests will impersonate the root user
enabled: false
##################### Service Account #####################
serviceaccount:
email:
# email domain for the service account principal
domain: signozserviceaccount.com
analytics:
# toggle service account analytics
enabled: true
##################### Auditor #####################
auditor:
# Specifies the auditor provider to use.
# noop: discards all audit events (community default).
# otlphttp: exports audit events via OTLP HTTP (enterprise).
provider: noop
# The async channel capacity for audit events. Events are dropped when full (fail-open).
buffer_size: 1000
# The maximum number of events per export batch.
batch_size: 100
# The maximum time between export flushes.
flush_interval: 1s
otlphttp:
# The target scheme://host:port/path of the OTLP HTTP endpoint.
endpoint: http://localhost:4318/v1/logs
# Whether to use HTTP instead of HTTPS.
insecure: false
# The maximum duration for an export attempt.
timeout: 10s
# Additional HTTP headers sent with every export request.
headers: {}
retry:
# Whether to retry on transient failures.
enabled: true
# The initial wait time before the first retry.
initial_interval: 5s
# The upper bound on backoff interval.
max_interval: 30s
# The total maximum time spent retrying.
max_elapsed_time: 60s
##################### Cloud Integration #####################
cloudintegration:
# cloud integration agent configuration
agent:
# The version of the cloud integration agent.
version: v0.0.8
##################### Trace Detail #####################
traces:
waterfall:
# Number of spans returned per request when the trace is too large to show all at once.
span_page_size: 500
# Maximum depth of descendents to auto-expand for the selected span.
max_depth_to_auto_expand: 5
# Threshold below which all spans are returned without windowing.
max_limit_to_select_all_spans: 10000
flamegraph:
# Maximum number of BFS depth levels included in a windowed response.
max_selected_levels: 50
# Maximum spans per level before sampling is applied.
max_spans_per_level: 100
# Number of highest-latency spans always included when sampling a level.
sampling_top_latency_count: 5
# Number of timestamp buckets used for uniform sampling within a level.
sampling_bucket_count: 50
# Threshold below which all spans are returned without windowing or sampling.
select_all_spans_limit: 100000
##################### Authz #################################
authz:
# Specifies the authz provider to use.
provider: openfga
openfga:
# maximum tuples allowed per openfga write operation.
max_tuples_per_write: 300
##################### Meter Reporter #####################
meterreporter:
# The interval between collection ticks. Minimum 10m, maximum 24h.
interval: 6h
# Whether to backfill sealed days from the license creation day.
backfill: true
# Random jitter applied to the first collect and to every subsequent cycle.
# The first collect fires at a random time in [0, jitter); each cycle then takes
# interval - random(0, jitter). Must be between 10m and interval. Defaults to
# min(interval, 2h) when unset.
jitter: 2h

25
conf/prometheus.yml Normal file
View File

@@ -0,0 +1,25 @@
# my global config
global:
scrape_interval: 5s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
- 127.0.0.1:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
- 'alerts.yml'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs: []
remote_read:
- url: tcp://localhost:9000/signoz_metrics

View File

@@ -1,119 +0,0 @@
# Migrating from Docker Compose to Foundry
This guide walks you through migrating an existing SigNoz deployment running via
Docker Compose to [Foundry](https://signoz.io/docs/install/docker/).
## Overview
SigNoz has developed a CLI to make installation and deployment much easier. Foundry is the evolution of how SigNoz should be installed and managed going forward.
The install script is now deprecated and will no longer receive updates.
To stay up to date on new installation platforms and patterns, please refer to [Foundry](https://github.com/SigNoz/foundry).
Two `foundryctl` commands are used throughout this guide:
- **`forge`** — generates deployment manifests from your `casting.yaml`. It does not touch running containers, so it is safe to re-run while you iterate.
- **`cast`** — applies the generated manifests: it creates and starts the containers (and pulls new images).
## Prerequisites
- [ ] Install Foundry - `curl -fsSL https://signoz.io/foundry.sh | bash`
## Migration Steps
> [!WARNING]
> **Before proceeding, back up both:**
> - **Your docker volumes** — these hold your data.
> - **Your existing `docker-compose.yaml` (and any config it references)** — keep a copy somewhere safe. The compose manifests are no longer distributed by SigNoz, so this backup is your only way to roll back to your previous setup.
1. Make a note of the volume names used by your existing deployment for the following components:
- ClickHouse
- SigNoz
- ZooKeeper
> If you used the docker compose file we provided, the volumes will be `signoz-clickhouse`, `signoz-sqlite`, and `signoz-zookeeper-1`.
2. Generate your `casting.yaml`. Based on internal testing, the following casting should generate the manifests that mimic the legacy docker compose setup (compare against your backed-up `docker-compose.yaml`). Once created, run `foundryctl forge -f casting.yaml`.
> [!NOTE]
> The casting contains `patches` and `config` overrides to ensure that the newly generated manifests use the existing volumes and configurations.
> [!WARNING]
> If your deployment had more than 1 shard or replica, you will need to adjust your manifest volumes accordingly. Additionally, if you had specific container images, you need to include them in your casting.
> [!IMPORTANT]
> The `replica` and `shard` macros below are placeholders. Replace them with the values from your existing ClickHouse configuration (check the `macros` section of your current ClickHouse config, e.g. `config.xml`/`metrika.xml`), otherwise the generated manifests will not match your existing data.
```yaml
apiVersion: v1alpha1
kind: Installation
metadata:
name: signoz
spec:
deployment:
flavor: compose
mode: docker
metastore:
kind: sqlite
telemetrykeeper:
kind: zookeeper
telemetrystore:
spec:
config:
data:
config-0-0.yaml: |
macros:
replica: "example01-01-1" # replace with your existing ClickHouse replica macro (see legacy configuration files for reference)
shard: "01" # replace with your existing ClickHouse shard macro (see legacy configuration files for reference)
patches:
- target: "deployment/compose.yaml"
operations:
- op: replace
path: /volumes/dev-telemetrykeeper-0-data/name
value: signoz-zookeeper-1
- op: replace
path: /volumes/dev-telemetrystore-0-0-data/name
value: signoz-clickhouse
- op: replace
path: /volumes/dev-metastore-sqlite-0-data/name
value: signoz-sqlite
- op: add
path: /services/dev-telemetrykeeper-zookeeper-0/user
value: root
```
> [!NOTE]
> The `user: root` patch on the ZooKeeper service is required so the container can read/write the data in your reused ZooKeeper volume, which was created with `root`-owned files by the legacy compose setup. Without it, ZooKeeper may fail to start with permission errors.
3. Validate the manifests in `pours/deployment`. Pay special attention to `compose.yaml` — it should mimic the legacy manifest and the configuration files needed for `clickhouse`. **Do note that these are now in YAML instead of XML.**
If you had custom settings for features like SMTP or ingestion processors/receivers, you will need to include those in your casting file.
4. Execute a `docker compose down`. **Do not** include parameters such as `--volumes` (or `-v`), as it will wipe the volumes we need to maintain and reuse to avoid data loss. **NOTE: this will generate downtime so please plan accordingly**.
5. Validate the SigNoz containers are down with `docker ps -a`. Multiple containers cannot bind the same volume.
6. Run `foundryctl cast -f casting.yaml`. This will recreate the containers based on the spec. This process will download new container images.
> [!NOTE]
> When `cast` is run, the migration container will execute its migrations.
## Verifying the Migration
- SigNoz containers will be up and running.
- Log in to the SigNoz UI and verify that data is present.
- Validate that your data ingestion is receiving data.
- Review the logs from both ClickHouse and ZooKeeper; no errors should be present.
## Rolling Back
Because step 4 brought the legacy stack down *without* `-v`, your original volumes
are untouched and still hold your data. To roll back:
- Stop and remove the containers created by Foundry (`docker compose down`, again without `-v`).
- Confirm the containers are gone with `docker ps -a` so nothing else is bound to the volumes.
- Reapply your original docker compose file (`docker compose up -d`). It will reattach to the
existing volumes and restore your prior state.
## Troubleshooting
- Please reach out to our community on [Slack](https://signoz.io/slack).
## References
- [SigNoz Docker installation docs](https://signoz.io/docs/install/docker/)
- [SigNoz documentation](https://signoz.io/docs)
- [Foundry](https://github.com/SigNoz/foundry)

View File

@@ -3,18 +3,77 @@
Check that you have cloned [signoz/signoz](https://github.com/signoz/signoz)
and currently are in `signoz/deploy` folder.
## Installation
## Docker
> **Note:** The `install.sh` script and the `docker-compose` manifests have been deprecated.
If you don't have docker set up, please follow [this guide](https://docs.docker.com/engine/install/)
to set up docker before proceeding with the next steps.
SigNoz now installs and runs through [Foundry](https://signoz.io/docs/install/docker/).
### Using Install Script
> **Already running SigNoz via Docker Compose?** See the [Migration Guide](./MIGRATION.md) to transition your existing deployment to Foundry.
Now run the following command to install:
Please follow the latest installation instructions at [signoz.io/docs/install/docker](https://signoz.io/docs/install/docker/).
Foundry has support for **different platforms and architectures**, please review the project documentation for more details.
```sh
./install.sh
```
> You will need Docker installed first. If you don't have it set up, follow [this guide](https://docs.docker.com/engine/install/).
### Using Docker Compose
If you don't have docker compose set up, please follow [this guide](https://docs.docker.com/compose/install/)
to set up docker compose before proceeding with the next steps.
```sh
cd deploy/docker
docker compose up -d
```
Open http://localhost:8080 in your favourite browser.
To start collecting logs and metrics from your infrastructure, run the following command:
```sh
cd generator/infra
docker compose up -d
```
To start generating sample traces, run the following command:
```sh
cd generator/hotrod
docker compose up -d
```
In a couple of minutes, you should see the data generated from hotrod in SigNoz UI.
For more details, please refer to the [SigNoz documentation](https://signoz.io/docs/install/docker/).
## Docker Swarm
To install SigNoz using Docker Swarm, run the following command:
```sh
cd deploy/docker-swarm
docker stack deploy -c docker-compose.yaml signoz
```
Open http://localhost:8080 in your favourite browser.
To start collecting logs and metrics from your infrastructure, run the following command:
```sh
cd generator/infra
docker stack deploy -c docker-compose.yaml infra
```
To start generating sample traces, run the following command:
```sh
cd generator/hotrod
docker stack deploy -c docker-compose.yaml hotrod
```
In a couple of minutes, you should see the data generated from hotrod in SigNoz UI.
For more details, please refer to the [SigNoz documentation](https://signoz.io/docs/install/docker-swarm/).
## Uninstall/Troubleshoot?

View File

@@ -0,0 +1,75 @@
<?xml version="1.0"?>
<clickhouse>
<!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
Optional. If you don't use replicated tables, you could omit that.
See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
-->
<zookeeper>
<node index="1">
<host>zookeeper-1</host>
<port>2181</port>
</node>
<node index="2">
<host>zookeeper-2</host>
<port>2181</port>
</node>
<node index="3">
<host>zookeeper-3</host>
<port>2181</port>
</node>
</zookeeper>
<!-- Configuration of clusters that could be used in Distributed tables.
https://clickhouse.com/docs/en/operations/table_engines/distributed/
-->
<remote_servers>
<cluster>
<!-- Inter-server per-cluster secret for Distributed queries
default: no secret (no authentication will be performed)
If set, then Distributed queries will be validated on shards, so at least:
- such cluster should exist on the shard,
- such cluster should have the same secret.
And also (and which is more important), the initial_user will
be used as current user for the query.
Right now the protocol is pretty simple and it only takes into account:
- cluster name
- query
Also it will be nice if the following will be implemented:
- source hostname (see interserver_http_host), but then it will depends from DNS,
it can use IP address instead, but then the you need to get correct on the initiator node.
- target hostname / ip address (same notes as for source hostname)
- time-based security tokens
-->
<!-- <secret></secret> -->
<shard>
<!-- Optional. Whether to write data to just one of the replicas. Default: false (write data to all replicas). -->
<!-- <internal_replication>false</internal_replication> -->
<!-- Optional. Shard weight when writing data. Default: 1. -->
<!-- <weight>1</weight> -->
<replica>
<host>clickhouse</host>
<port>9000</port>
<!-- Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority). -->
<!-- <priority>1</priority> -->
</replica>
</shard>
<shard>
<replica>
<host>clickhouse-2</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>clickhouse-3</host>
<port>9000</port>
</replica>
</shard>
</cluster>
</remote_servers>
</clickhouse>

View File

@@ -0,0 +1,75 @@
<?xml version="1.0"?>
<clickhouse>
<!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
Optional. If you don't use replicated tables, you could omit that.
See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
-->
<zookeeper>
<node index="1">
<host>zookeeper-1</host>
<port>2181</port>
</node>
<!-- <node index="2">
<host>zookeeper-2</host>
<port>2181</port>
</node>
<node index="3">
<host>zookeeper-3</host>
<port>2181</port>
</node> -->
</zookeeper>
<!-- Configuration of clusters that could be used in Distributed tables.
https://clickhouse.com/docs/en/operations/table_engines/distributed/
-->
<remote_servers>
<cluster>
<!-- Inter-server per-cluster secret for Distributed queries
default: no secret (no authentication will be performed)
If set, then Distributed queries will be validated on shards, so at least:
- such cluster should exist on the shard,
- such cluster should have the same secret.
And also (and which is more important), the initial_user will
be used as current user for the query.
Right now the protocol is pretty simple and it only takes into account:
- cluster name
- query
Also it will be nice if the following will be implemented:
- source hostname (see interserver_http_host), but then it will depends from DNS,
it can use IP address instead, but then the you need to get correct on the initiator node.
- target hostname / ip address (same notes as for source hostname)
- time-based security tokens
-->
<!-- <secret></secret> -->
<shard>
<!-- Optional. Whether to write data to just one of the replicas. Default: false (write data to all replicas). -->
<!-- <internal_replication>false</internal_replication> -->
<!-- Optional. Shard weight when writing data. Default: 1. -->
<!-- <weight>1</weight> -->
<replica>
<host>clickhouse</host>
<port>9000</port>
<!-- Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority). -->
<!-- <priority>1</priority> -->
</replica>
</shard>
<!-- <shard>
<replica>
<host>clickhouse-2</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>clickhouse-3</host>
<port>9000</port>
</replica>
</shard> -->
</cluster>
</remote_servers>
</clickhouse>

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,21 @@
<functions>
<function>
<type>executable</type>
<name>histogramQuantile</name>
<return_type>Float64</return_type>
<argument>
<type>Array(Float64)</type>
<name>buckets</name>
</argument>
<argument>
<type>Array(Float64)</type>
<name>counts</name>
</argument>
<argument>
<type>Float64</type>
<name>quantile</name>
</argument>
<format>CSV</format>
<command>./histogramQuantile</command>
</function>
</functions>

View File

@@ -0,0 +1,41 @@
<?xml version="1.0"?>
<clickhouse>
<storage_configuration>
<disks>
<default>
<keep_free_space_bytes>10485760</keep_free_space_bytes>
</default>
<s3>
<type>s3</type>
<!-- For S3 cold storage,
if region is us-east-1, endpoint can be https://<bucket-name>.s3.amazonaws.com
if region is not us-east-1, endpoint should be https://<bucket-name>.s3-<region>.amazonaws.com
For GCS cold storage,
endpoint should be https://storage.googleapis.com/<bucket-name>/data/
-->
<endpoint>https://BUCKET-NAME.s3-REGION-NAME.amazonaws.com/data/</endpoint>
<access_key_id>ACCESS-KEY-ID</access_key_id>
<secret_access_key>SECRET-ACCESS-KEY</secret_access_key>
<!-- In case of S3, uncomment the below configuration in case you want to read
AWS credentials from the Environment variables if they exist. -->
<!-- <use_environment_credentials>true</use_environment_credentials> -->
<!-- In case of GCS, uncomment the below configuration, since GCS does
not support batch deletion and result in error messages in logs. -->
<!-- <support_batch_delete>false</support_batch_delete> -->
</s3>
</disks>
<policies>
<tiered>
<volumes>
<default>
<disk>default</disk>
</default>
<s3>
<disk>s3</disk>
<perform_ttl_move_on_insert>0</perform_ttl_move_on_insert>
</s3>
</volumes>
</tiered>
</policies>
</storage_configuration>
</clickhouse>

View File

@@ -0,0 +1,123 @@
<?xml version="1.0"?>
<clickhouse>
<!-- See also the files in users.d directory where the settings can be overridden. -->
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
<max_memory_usage>10000000000</max_memory_usage>
<!-- How to choose between replicas during distributed query processing.
random - choose random replica from set of replicas with minimum number of errors
nearest_hostname - from set of replicas with minimum number of errors, choose replica
with minimum number of different symbols between replica's hostname and local hostname
(Hamming distance).
in_order - first live replica is chosen in specified order.
first_or_random - if first replica one has higher number of errors, pick a random one from replicas with minimum number of errors.
-->
<load_balancing>random</load_balancing>
</default>
<!-- Profile that allows only read queries. -->
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
<!-- Users and ACL. -->
<users>
<!-- If user name was not specified, 'default' user is used. -->
<default>
<!-- See also the files in users.d directory where the password can be overridden.
Password could be specified in plaintext or in SHA256 (in hex format).
If you want to specify password in plaintext (not recommended), place it in 'password' element.
Example: <password>qwerty</password>.
Password could be empty.
If you want to specify SHA256, place it in 'password_sha256_hex' element.
Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
Restrictions of SHA256: impossibility to connect to ClickHouse using MySQL JS client (as of July 2019).
If you want to specify double SHA1, place it in 'password_double_sha1_hex' element.
Example: <password_double_sha1_hex>e395796d6546b1b65db9d665cd43f0e858dd4303</password_double_sha1_hex>
If you want to specify a previously defined LDAP server (see 'ldap_servers' in the main config) for authentication,
place its name in 'server' element inside 'ldap' element.
Example: <ldap><server>my_ldap_server</server></ldap>
If you want to authenticate the user via Kerberos (assuming Kerberos is enabled, see 'kerberos' in the main config),
place 'kerberos' element instead of 'password' (and similar) elements.
The name part of the canonical principal name of the initiator must match the user name for authentication to succeed.
You can also place 'realm' element inside 'kerberos' element to further restrict authentication to only those requests
whose initiator's realm matches it.
Example: <kerberos />
Example: <kerberos><realm>EXAMPLE.COM</realm></kerberos>
How to generate decent password:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
In first line will be password and in second - corresponding SHA256.
How to generate double SHA1:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha1sum | tr -d '-' | xxd -r -p | sha1sum | tr -d '-'
In first line will be password and in second - corresponding double SHA1.
-->
<password></password>
<!-- List of networks with open access.
To open access from everywhere, specify:
<ip>::/0</ip>
To open access only from localhost, specify:
<ip>::1</ip>
<ip>127.0.0.1</ip>
Each element of list has one of the following forms:
<ip> IP-address or network mask. Examples: 213.180.204.3 or 10.0.0.1/8 or 10.0.0.1/255.255.255.0
2a02:6b8::3 or 2a02:6b8::3/64 or 2a02:6b8::3/ffff:ffff:ffff:ffff::.
<host> Hostname. Example: server01.clickhouse.com.
To check access, DNS query is performed, and all received addresses compared to peer address.
<host_regexp> Regular expression for host names. Example, ^server\d\d-\d\d-\d\.clickhouse\.com$
To check access, DNS PTR query is performed for peer address and then regexp is applied.
Then, for result of PTR query, another DNS query is performed and all received addresses compared to peer address.
Strongly recommended that regexp is ends with $
All results of DNS requests are cached till server restart.
-->
<networks>
<ip>::/0</ip>
</networks>
<!-- Settings profile for user. -->
<profile>default</profile>
<!-- Quota for user. -->
<quota>default</quota>
<!-- User can create other users and grant rights to them. -->
<!-- <access_management>1</access_management> -->
</default>
</users>
<!-- Quotas. -->
<quotas>
<!-- Name of quota. -->
<default>
<!-- Limits for time interval. You could specify many intervals with different limits. -->
<interval>
<!-- Length of interval. -->
<duration>3600</duration>
<!-- No limits. Just calculate resource usage for time interval. -->
<queries>0</queries>
<errors>0</errors>
<result_rows>0</result_rows>
<read_rows>0</read_rows>
<execution_time>0</execution_time>
</interval>
</default>
</quotas>
</clickhouse>

View File

@@ -0,0 +1,16 @@
from locust import HttpUser, task, between
class UserTasks(HttpUser):
wait_time = between(5, 15)
@task
def rachel(self):
self.client.get("/dispatch?customer=123&nonse=0.6308392664170006")
@task
def trom(self):
self.client.get("/dispatch?customer=392&nonse=0.015296363321630757")
@task
def japanese(self):
self.client.get("/dispatch?customer=731&nonse=0.8022286220408668")
@task
def coffee(self):
self.client.get("/dispatch?customer=567&nonse=0.0022220379420636593")

View File

@@ -0,0 +1 @@
server_endpoint: ws://signoz:4320/v1/opamp

View File

@@ -0,0 +1,25 @@
# my global config
global:
scrape_interval: 5s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
- alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files: []
# - "first_rules.yml"
# - "second_rules.yml"
# - 'alerts.yml'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs: []
remote_read:
- url: tcp://clickhouse:9000/signoz_metrics

View File

@@ -0,0 +1,288 @@
version: "3"
x-common: &common
networks:
- signoz-net
deploy:
restart_policy:
condition: on-failure
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
tty: true
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
- zookeeper-1
- zookeeper-2
- zookeeper-3
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
- clickhouse
- clickhouse-2
- clickhouse-3
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
deploy:
restart_policy:
condition: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- ./clickhouse-setup/data/zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ZOO_SERVERS=0.0.0.0:2888:3888,zookeeper-2:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-2:
!!merge <<: *zookeeper-defaults
# ports:
# - "2182:2181"
# - "2889:2888"
# - "3889:3888"
volumes:
- ./clickhouse-setup/data/zookeeper-2:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=2
- ZOO_SERVERS=zookeeper-1:2888:3888,0.0.0.0:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-3:
!!merge <<: *zookeeper-defaults
# ports:
# - "2183:2181"
# - "2890:2888"
# - "3890:3888"
volumes:
- ./clickhouse-setup/data/zookeeper-3:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=3
- ZOO_SERVERS=zookeeper-1:2888:3888,zookeeper-2:2888:3888,0.0.0.0:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
# TODO: needed for schema-migrator to work, remove this redundancy once we have a better solution
hostname: clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.ha.xml
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ./clickhouse-setup/data/clickhouse/:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-2:
!!merge <<: *clickhouse-defaults
hostname: clickhouse-2
# ports:
# - "9001:9000"
# - "8124:8123"
# - "9182:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.ha.xml
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ./clickhouse-setup/data/clickhouse-2/:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-3:
!!merge <<: *clickhouse-defaults
hostname: clickhouse-3
# ports:
# - "9002:9000"
# - "8125:8123"
# - "9183:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.ha.xml
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ./clickhouse-setup/data/clickhouse-3/:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:v0.116.1
ports:
- "8080:8080" # signoz port
# - "6060:6060" # pprof port
volumes:
- ./clickhouse-setup/data/signoz/:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.2
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
configs:
- source: otel-collector-config
target: /etc/otel-collector-config.yaml
- source: otel-manager-config
target: /etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name={{.Node.Hostname}},os.type={{.Node.Platform.OS}}
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
deploy:
replicas: 3
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.2
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
clickhouse-2:
name: signoz-clickhouse-2
clickhouse-3:
name: signoz-clickhouse-3
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1
zookeeper-2:
name: signoz-zookeeper-2
zookeeper-3:
name: signoz-zookeeper-3
configs:
clickhouse-config:
file: ../common/clickhouse/config.xml
clickhouse-users:
file: ../common/clickhouse/users.xml
clickhouse-custom-function:
file: ../common/clickhouse/custom-function.xml
clickhouse-cluster:
file: ../common/clickhouse/cluster.ha.xml
otel-collector-config:
file: ./otel-collector-config.yaml
otel-manager-config:
file: ../common/signoz/otel-collector-opamp-config.yaml

View File

@@ -0,0 +1,206 @@
version: "3"
x-common: &common
networks:
- signoz-net
deploy:
restart_policy:
condition: on-failure
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
tty: true
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
- init-clickhouse
- zookeeper-1
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
- clickhouse
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
deploy:
restart_policy:
condition: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
# TODO: needed for clickhouse TCP connectio
hostname: clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.xml
volumes:
- clickhouse:/var/lib/clickhouse/
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:v0.116.1
ports:
- "8080:8080" # signoz port
volumes:
- sqlite:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.2
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
configs:
- source: otel-collector-config
target: /etc/otel-collector-config.yaml
- source: otel-manager-config
target: /etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name={{.Node.Hostname}},os.type={{.Node.Platform.OS}}
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
deploy:
replicas: 3
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.2
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1
configs:
clickhouse-config:
file: ../common/clickhouse/config.xml
clickhouse-users:
file: ../common/clickhouse/users.xml
clickhouse-custom-function:
file: ../common/clickhouse/custom-function.xml
clickhouse-cluster:
file: ../common/clickhouse/cluster.xml
otel-collector-config:
file: ./otel-collector-config.yaml
otel-manager-config:
file: ../common/signoz/otel-collector-opamp-config.yaml

View File

@@ -0,0 +1,118 @@
connectors:
signozmeter:
metrics_flush_interval: 1h
dimensions:
- name: service.name
- name: deployment.environment
- name: host.name
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
prometheus:
config:
global:
scrape_interval: 60s
scrape_configs:
- job_name: otel-collector
static_configs:
- targets:
- localhost:8888
labels:
job_name: otel-collector
processors:
batch:
send_batch_size: 10000
send_batch_max_size: 11000
timeout: 10s
batch/meter:
send_batch_max_size: 25000
send_batch_size: 20000
timeout: 1s
resourcedetection:
# Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
detectors: [env, system]
timeout: 2s
signozspanmetrics/delta:
metrics_exporter: signozclickhousemetrics
metrics_flush_interval: 60s
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
dimensions_cache_size: 100000
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
enable_exp_histogram: true
dimensions:
- name: service.namespace
default: default
- name: deployment.environment
default: default
# This is added to ensure the uniqueness of the timeseries
# Otherwise, identical timeseries produced by multiple replicas of
# collectors result in incorrect APM metrics
- name: signoz.collector.id
- name: service.version
- name: browser.platform
- name: browser.mobile
- name: k8s.cluster.name
- name: k8s.node.name
- name: k8s.namespace.name
- name: host.name
- name: host.type
- name: container.name
extensions:
health_check:
endpoint: 0.0.0.0:13133
pprof:
endpoint: 0.0.0.0:1777
exporters:
clickhousetraces:
datasource: tcp://clickhouse:9000/signoz_traces
low_cardinal_exception_grouping: ${env:LOW_CARDINAL_EXCEPTION_GROUPING}
use_new_schema: true
signozclickhousemetrics:
dsn: tcp://clickhouse:9000/signoz_metrics
clickhouselogsexporter:
dsn: tcp://clickhouse:9000/signoz_logs
timeout: 10s
use_new_schema: true
signozclickhousemeter:
dsn: tcp://clickhouse:9000/signoz_meter
timeout: 45s
sending_queue:
enabled: false
metadataexporter:
cache:
provider: in_memory
dsn: tcp://clickhouse:9000/signoz_metadata
enabled: true
timeout: 45s
service:
telemetry:
logs:
encoding: json
extensions:
- health_check
- pprof
pipelines:
traces:
receivers: [otlp]
processors: [signozspanmetrics/delta, batch]
exporters: [clickhousetraces, metadataexporter, signozmeter]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
metrics/prometheus:
receivers: [prometheus]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter, metadataexporter, signozmeter]
metrics/meter:
receivers: [signozmeter]
processors: [batch/meter]
exporters: [signozclickhousemeter]

1
deploy/docker/.env Normal file
View File

@@ -0,0 +1 @@
COMPOSE_PROJECT_NAME=signoz

View File

@@ -0,0 +1,3 @@
This data directory is deprecated and will be removed in the future.
Please use the migration script under `scripts/volume-migration` to migrate data from bind mounts to Docker volumes.
The script also renames the project name to `signoz` and the network name to `signoz-net` (if not already in place).

View File

View File

@@ -0,0 +1,2 @@
This directory is deprecated and will be removed in the future.
Please use the new directory for Clickhouse setup scripts: `scripts/clickhouse` instead.

View File

@@ -0,0 +1,265 @@
version: "3"
x-common: &common
networks:
- signoz-net
restart: unless-stopped
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
# addding non LTS version due to this fix https://github.com/ClickHouse/ClickHouse/commit/32caf8716352f45c1b617274c7508c86b7d1afab
image: clickhouse/clickhouse-server:25.5.6
tty: true
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
init-clickhouse:
condition: service_completed_successfully
zookeeper-1:
condition: service_healthy
zookeeper-2:
condition: service_healthy
zookeeper-3:
condition: service_healthy
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
clickhouse:
condition: service_healthy
clickhouse-2:
condition: service_healthy
clickhouse-3:
condition: service_healthy
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
container_name: signoz-init-clickhouse
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
restart: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-1
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ZOO_SERVERS=0.0.0.0:2888:3888,zookeeper-2:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-2:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-2
# ports:
# - "2182:2181"
# - "2889:2888"
# - "3889:3888"
volumes:
- zookeeper-2:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=2
- ZOO_SERVERS=zookeeper-1:2888:3888,0.0.0.0:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-3:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-3
# ports:
# - "2183:2181"
# - "2890:2888"
# - "3890:3888"
volumes:
- zookeeper-3:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=3
- ZOO_SERVERS=zookeeper-1:2888:3888,zookeeper-2:2888:3888,0.0.0.0:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.ha.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-2:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse-2
# ports:
# - "9001:9000"
# - "8124:8123"
# - "9182:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.ha.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse-2:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-3:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse-3
# ports:
# - "9002:9000"
# - "8125:8123"
# - "9183:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.ha.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse-3:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:${VERSION:-v0.116.1}
container_name: signoz
ports:
- "8080:8080" # signoz port
volumes:
- sqlite:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.2}
container_name: signoz-otel-collector
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
- ../common/signoz/otel-collector-opamp-config.yaml:/etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.2}
container_name: signoz-telemetrystore-migrator
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
restart: on-failure
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
clickhouse-2:
name: signoz-clickhouse-2
clickhouse-3:
name: signoz-clickhouse-3
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1
zookeeper-2:
name: signoz-zookeeper-2
zookeeper-3:
name: signoz-zookeeper-3

View File

@@ -0,0 +1,185 @@
version: "3"
x-common: &common
networks:
- signoz-net
restart: unless-stopped
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
tty: true
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
init-clickhouse:
condition: service_completed_successfully
zookeeper-1:
condition: service_healthy
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
clickhouse:
condition: service_healthy
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
container_name: signoz-init-clickhouse
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
restart: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-1
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:${VERSION:-v0.116.1}
container_name: signoz
ports:
- "8080:8080" # signoz port
volumes:
- sqlite:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.2}
container_name: signoz-otel-collector
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
- ../common/signoz/otel-collector-opamp-config.yaml:/etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.2}
container_name: signoz-telemetrystore-migrator
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
restart: on-failure
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1

View File

@@ -0,0 +1,118 @@
connectors:
signozmeter:
metrics_flush_interval: 1h
dimensions:
- name: service.name
- name: deployment.environment
- name: host.name
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
prometheus:
config:
global:
scrape_interval: 60s
scrape_configs:
- job_name: otel-collector
static_configs:
- targets:
- localhost:8888
labels:
job_name: otel-collector
processors:
batch:
send_batch_size: 10000
send_batch_max_size: 11000
timeout: 10s
batch/meter:
send_batch_max_size: 25000
send_batch_size: 20000
timeout: 1s
resourcedetection:
# Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
detectors: [env, system]
timeout: 2s
signozspanmetrics/delta:
metrics_exporter: signozclickhousemetrics
metrics_flush_interval: 60s
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
dimensions_cache_size: 100000
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
enable_exp_histogram: true
dimensions:
- name: service.namespace
default: default
- name: deployment.environment
default: default
# This is added to ensure the uniqueness of the timeseries
# Otherwise, identical timeseries produced by multiple replicas of
# collectors result in incorrect APM metrics
- name: signoz.collector.id
- name: service.version
- name: browser.platform
- name: browser.mobile
- name: k8s.cluster.name
- name: k8s.node.name
- name: k8s.namespace.name
- name: host.name
- name: host.type
- name: container.name
extensions:
health_check:
endpoint: 0.0.0.0:13133
pprof:
endpoint: 0.0.0.0:1777
exporters:
clickhousetraces:
datasource: tcp://clickhouse:9000/signoz_traces
low_cardinal_exception_grouping: ${env:LOW_CARDINAL_EXCEPTION_GROUPING}
use_new_schema: true
signozclickhousemetrics:
dsn: tcp://clickhouse:9000/signoz_metrics
clickhouselogsexporter:
dsn: tcp://clickhouse:9000/signoz_logs
timeout: 10s
use_new_schema: true
signozclickhousemeter:
dsn: tcp://clickhouse:9000/signoz_meter
timeout: 45s
sending_queue:
enabled: false
metadataexporter:
cache:
provider: in_memory
dsn: tcp://clickhouse:9000/signoz_metadata
enabled: true
timeout: 45s
service:
telemetry:
logs:
encoding: json
extensions:
- health_check
- pprof
pipelines:
traces:
receivers: [otlp]
processors: [signozspanmetrics/delta, batch]
exporters: [clickhousetraces, metadataexporter, signozmeter]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
metrics/prometheus:
receivers: [prometheus]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter, metadataexporter, signozmeter]
metrics/meter:
receivers: [signozmeter]
processors: [batch/meter]
exporters: [signozclickhousemeter]

563
deploy/install.sh Executable file
View File

@@ -0,0 +1,563 @@
#!/bin/bash
set -o errexit
# Variables
BASE_DIR="$(dirname "$(readlink -f "$0")")"
DOCKER_STANDALONE_DIR="docker"
DOCKER_SWARM_DIR="docker-swarm" # TODO: Add docker swarm support
# Regular Colors
Black='\033[0;30m' # Black
Red='\[\e[0;31m\]' # Red
Green='\033[0;32m' # Green
Yellow='\033[0;33m' # Yellow
Blue='\033[0;34m' # Blue
Purple='\033[0;35m' # Purple
Cyan='\033[0;36m' # Cyan
White='\033[0;37m' # White
NC='\033[0m' # No Color
is_command_present() {
type "$1" >/dev/null 2>&1
}
# Check whether 'wget' command exists.
has_wget() {
has_cmd wget
}
# Check whether 'curl' command exists.
has_curl() {
has_cmd curl
}
# Check whether the given command exists.
has_cmd() {
command -v "$1" > /dev/null 2>&1
}
# Check if docker compose plugin is present
has_docker_compose_plugin() {
docker compose version > /dev/null 2>&1
}
is_mac() {
[[ $OSTYPE == darwin* ]]
}
is_arm64(){
[[ `uname -m` == 'arm64' || `uname -m` == 'aarch64' ]]
}
check_os() {
if is_mac; then
package_manager="brew"
desired_os=1
os="Mac"
return
fi
if is_arm64; then
arch="arm64"
arch_official="aarch64"
else
arch="amd64"
arch_official="x86_64"
fi
platform=$(uname -s | tr '[:upper:]' '[:lower:]')
os_name="$(cat /etc/*-release | awk -F= '$1 == "NAME" { gsub(/"/, ""); print $2; exit }')"
case "$os_name" in
Ubuntu*|Pop!_OS)
desired_os=1
os="ubuntu"
package_manager="apt-get"
;;
Amazon\ Linux*)
desired_os=1
os="amazon linux"
package_manager="yum"
;;
Debian*)
desired_os=1
os="debian"
package_manager="apt-get"
;;
Linux\ Mint*)
desired_os=1
os="linux mint"
package_manager="apt-get"
;;
Red\ Hat*)
desired_os=1
os="rhel"
package_manager="yum"
;;
CentOS*)
desired_os=1
os="centos"
package_manager="yum"
;;
Rocky*)
desired_os=1
os="centos"
package_manager="yum"
;;
SLES*)
desired_os=1
os="sles"
package_manager="zypper"
;;
openSUSE*)
desired_os=1
os="opensuse"
package_manager="zypper"
;;
*)
desired_os=0
os="Not Found: $os_name"
esac
}
# This function checks if the relevant ports required by SigNoz are available or not
# The script should error out in case they aren't available
check_ports_occupied() {
local port_check_output
local ports_pattern="8080|4317"
if is_mac; then
port_check_output="$(netstat -anp tcp | awk '$6 == "LISTEN" && $4 ~ /^.*\.('"$ports_pattern"')$/')"
elif is_command_present ss; then
# The `ss` command seems to be a better/faster version of `netstat`, but is not available on all Linux
# distributions by default. Other distributions have `ss` but no `netstat`. So, we try for `ss` first, then
# fallback to `netstat`.
port_check_output="$(ss --all --numeric --tcp | awk '$1 == "LISTEN" && $4 ~ /^.*:('"$ports_pattern"')$/')"
elif is_command_present netstat; then
port_check_output="$(netstat --all --numeric --tcp | awk '$6 == "LISTEN" && $4 ~ /^.*:('"$ports_pattern"')$/')"
fi
if [[ -n $port_check_output ]]; then
send_event "port_not_available"
echo "+++++++++++ ERROR ++++++++++++++++++++++"
echo "SigNoz requires ports 8080 & 4317 to be open. Please shut down any other service(s) that may be running on these ports."
echo "You can run SigNoz on another port following this guide https://signoz.io/docs/install/troubleshooting/"
echo "++++++++++++++++++++++++++++++++++++++++"
echo ""
exit 1
fi
}
install_docker() {
echo "++++++++++++++++++++++++"
echo "Setting up docker repos"
if [[ $package_manager == apt-get ]]; then
apt_cmd="$sudo_cmd apt-get --yes --quiet"
$apt_cmd update
$apt_cmd install software-properties-common gnupg-agent
curl -fsSL "https://download.docker.com/linux/$os/gpg" | $sudo_cmd apt-key add -
$sudo_cmd add-apt-repository \
"deb [arch=$arch] https://download.docker.com/linux/$os $(lsb_release -cs) stable"
$apt_cmd update
echo "Installing docker"
$apt_cmd install docker-ce docker-ce-cli containerd.io
elif [[ $package_manager == zypper ]]; then
zypper_cmd="$sudo_cmd zypper --quiet --no-gpg-checks --non-interactive"
echo "Installing docker"
if [[ $os == sles ]]; then
os_sp="$(cat /etc/*-release | awk -F= '$1 == "VERSION_ID" { gsub(/"/, ""); print $2; exit }')"
os_arch="$(uname -i)"
SUSEConnect -p sle-module-containers/$os_sp/$os_arch -r ''
fi
$zypper_cmd install docker docker-runc containerd
$sudo_cmd systemctl enable docker.service
elif [[ $package_manager == yum && $os == 'amazon linux' ]]; then
echo
echo "Amazon Linux detected ... "
echo
# yum install docker
# service docker start
$sudo_cmd yum install -y amazon-linux-extras
$sudo_cmd amazon-linux-extras enable docker
$sudo_cmd yum install -y docker
else
yum_cmd="$sudo_cmd yum --assumeyes --quiet"
$yum_cmd install yum-utils
$sudo_cmd yum-config-manager --add-repo https://download.docker.com/linux/$os/docker-ce.repo
echo "Installing docker"
$yum_cmd install docker-ce docker-ce-cli containerd.io
fi
}
compose_version () {
local compose_version
compose_version="$(curl -s https://api.github.com/repos/docker/compose/releases/latest | grep 'tag_name' | cut -d\" -f4)"
echo "${compose_version:-v2.18.1}"
}
install_docker_compose() {
if [[ $package_manager == "apt-get" || $package_manager == "zypper" || $package_manager == "yum" ]]; then
if [[ ! -f /usr/bin/docker-compose ]];then
echo "++++++++++++++++++++++++"
echo "Installing docker-compose"
compose_url="https://github.com/docker/compose/releases/download/$(compose_version)/docker-compose-$platform-$arch_official"
echo "Downloading docker-compose from $compose_url"
$sudo_cmd curl -L "$compose_url" -o /usr/local/bin/docker-compose
$sudo_cmd chmod +x /usr/local/bin/docker-compose
$sudo_cmd ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
echo "docker-compose installed!"
echo ""
fi
else
send_event "docker_compose_not_found"
echo "+++++++++++ IMPORTANT READ ++++++++++++++++++++++"
echo "docker-compose not found! Please install docker-compose first and then continue with this installation."
echo "Refer https://docs.docker.com/compose/install/ for installing docker-compose."
echo "+++++++++++++++++++++++++++++++++++++++++++++++++"
exit 1
fi
}
start_docker() {
echo -e "🐳 Starting Docker ...\n"
if [[ $os == "Mac" ]]; then
open --background -a Docker && while ! docker system info > /dev/null 2>&1; do sleep 1; done
else
if ! $sudo_cmd systemctl is-active docker.service > /dev/null; then
echo "Starting docker service"
$sudo_cmd systemctl start docker.service
fi
if [[ -z $sudo_cmd ]]; then
if ! docker ps > /dev/null && true; then
request_sudo
fi
fi
fi
}
wait_for_containers_start() {
local timeout=$1
# The while loop is important because for-loops don't work for dynamic values
while [[ $timeout -gt 0 ]]; do
status_code="$(curl -s -o /dev/null -w "%{http_code}" "http://localhost:8080/api/v1/health?live=1" || true)"
if [[ status_code -eq 200 ]]; then
break
else
echo -ne "Waiting for all containers to start. This check will timeout in $timeout seconds ...\r\c"
fi
((timeout--))
sleep 1
done
echo ""
}
bye() { # Prints a friendly good bye message and exits the script.
# Switch back to the original directory
popd > /dev/null 2>&1
if [[ "$?" -ne 0 ]]; then
set +o errexit
echo "🔴 The containers didn't seem to start correctly. Please run the following command to check containers that may have errored out:"
echo ""
echo -e "cd ${DOCKER_STANDALONE_DIR}"
echo -e "$sudo_cmd $docker_compose_cmd ps -a"
echo "Please read our troubleshooting guide https://signoz.io/docs/install/troubleshooting/"
echo "or reach us for support in #help channel in our Slack Community https://signoz.io/slack"
echo "++++++++++++++++++++++++++++++++++++++++"
if [[ $email == "" ]]; then
echo -e "\n📨 Please share your email to receive support with the installation"
read -rp 'Email: ' email
while [[ $email == "" ]]
do
read -rp 'Email: ' email
done
fi
send_event "installation_support"
echo ""
echo -e "\nWe will reach out to you at the email provided shortly, Exiting for now. Bye! 👋 \n"
exit 0
fi
}
request_sudo() {
if hash sudo 2>/dev/null; then
echo -e "\n\n🙇 We will need sudo access to complete the installation."
if (( $EUID != 0 )); then
sudo_cmd="sudo"
echo -e "Please enter your sudo password, if prompted."
if ! $sudo_cmd -l | grep -e "NOPASSWD: ALL" > /dev/null && ! $sudo_cmd -v; then
echo "Need sudo privileges to proceed with the installation."
exit 1;
fi
echo -e "Got it! Thanks!! 🙏\n"
echo -e "Okay! We will bring up the SigNoz cluster from here 🚀\n"
fi
fi
}
echo ""
echo -e "👋 Thank you for trying out SigNoz! "
echo ""
sudo_cmd=""
docker_compose_cmd=""
# Check sudo permissions
if (( $EUID != 0 )); then
echo "🟡 Running installer with non-sudo permissions."
echo " In case of any failure or prompt, please consider running the script with sudo privileges."
echo ""
else
sudo_cmd="sudo"
fi
# Checking OS and assigning package manager
desired_os=0
os=""
email=""
echo -e "🌏 Detecting your OS ...\n"
check_os
# Obtain unique installation id
# sysinfo="$(uname -a)"
# if [[ $? -ne 0 ]]; then
# uuid="$(uuidgen)"
# uuid="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
# sysinfo="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
# fi
if ! sysinfo="$(uname -a)"; then
uuid="$(uuidgen)"
uuid="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
sysinfo="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
fi
digest_cmd=""
if hash shasum 2>/dev/null; then
digest_cmd="shasum -a 256"
elif hash sha256sum 2>/dev/null; then
digest_cmd="sha256sum"
elif hash openssl 2>/dev/null; then
digest_cmd="openssl dgst -sha256"
fi
if [[ -z $digest_cmd ]]; then
SIGNOZ_INSTALLATION_ID="$sysinfo"
else
SIGNOZ_INSTALLATION_ID=$(echo "$sysinfo" | $digest_cmd | grep -E -o '[a-zA-Z0-9]{64}')
fi
setup_type='clickhouse'
# Run bye if failure happens
trap bye EXIT
URL="https://api.segment.io/v1/track"
HEADER_1="Content-Type: application/json"
HEADER_2="Authorization: Basic OWtScko3b1BDR1BFSkxGNlFqTVBMdDVibGpGaFJRQnI="
send_event() {
error=""
case "$1" in
'install_started')
event="Installation Started"
;;
'os_not_supported')
event="Installation Error"
error="OS Not Supported"
;;
'docker_not_installed')
event="Installation Error"
error="Docker not installed"
;;
'docker_compose_not_found')
event="Installation Error"
event="Docker Compose not found"
;;
'port_not_available')
event="Installation Error"
error="port not available"
;;
'installation_error_checks')
event="Installation Error - Checks"
error="Containers not started"
others='"data": "some_checks",'
;;
'installation_support')
event="Installation Support"
others='"email": "'"$email"'",'
;;
'installation_success')
event="Installation Success"
;;
'identify_successful_installation')
event="Identify Successful Installation"
others='"email": "'"$email"'",'
;;
*)
print_error "unknown event type: $1"
exit 1
;;
esac
if [[ "$error" != "" ]]; then
error='"error": "'"$error"'", '
fi
DATA='{ "anonymousId": "'"$SIGNOZ_INSTALLATION_ID"'", "event": "'"$event"'", "properties": { "os": "'"$os"'", '"$error $others"' "setup_type": "'"$setup_type"'" } }'
if has_curl; then
curl -sfL -d "$DATA" --header "$HEADER_1" --header "$HEADER_2" "$URL" > /dev/null 2>&1
elif has_wget; then
wget -q --post-data="$DATA" --header "$HEADER_1" --header "$HEADER_2" "$URL" > /dev/null 2>&1
fi
}
send_event "install_started"
if [[ $desired_os -eq 0 ]]; then
send_event "os_not_supported"
fi
# Check is Docker daemon is installed and available. If not, the install & start Docker for Linux machines. We cannot automatically install Docker Desktop on Mac OS
if ! is_command_present docker; then
if [[ $package_manager == "apt-get" || $package_manager == "zypper" || $package_manager == "yum" ]]; then
request_sudo
install_docker
# enable docker without sudo from next reboot
sudo usermod -aG docker "${USER}"
elif is_mac; then
echo ""
echo "+++++++++++ IMPORTANT READ ++++++++++++++++++++++"
echo "Docker Desktop must be installed manually on Mac OS to proceed. Docker can only be installed automatically on Ubuntu / openSUSE / SLES / Redhat / Cent OS"
echo "https://docs.docker.com/docker-for-mac/install/"
echo "++++++++++++++++++++++++++++++++++++++++++++++++"
send_event "docker_not_installed"
exit 1
else
echo ""
echo "+++++++++++ IMPORTANT READ ++++++++++++++++++++++"
echo "Docker must be installed manually on your machine to proceed. Docker can only be installed automatically on Ubuntu / openSUSE / SLES / Redhat / Cent OS"
echo "https://docs.docker.com/get-docker/"
echo "++++++++++++++++++++++++++++++++++++++++++++++++"
send_event "docker_not_installed"
exit 1
fi
fi
if has_docker_compose_plugin; then
echo "docker compose plugin is present, using it"
docker_compose_cmd="docker compose"
# Install docker-compose
else
docker_compose_cmd="docker-compose"
if ! is_command_present docker-compose; then
request_sudo
install_docker_compose
fi
fi
start_docker
# Switch to the Docker Standalone directory
pushd "${BASE_DIR}/${DOCKER_STANDALONE_DIR}" > /dev/null 2>&1
# check for open ports, if signoz is not installed
if is_command_present docker-compose; then
if $sudo_cmd $docker_compose_cmd ps | grep "signoz" | grep -q "healthy" > /dev/null 2>&1; then
echo "SigNoz already installed, skipping the occupied ports check"
else
check_ports_occupied
fi
fi
echo ""
echo -e "\n🟡 Pulling the latest container images for SigNoz.\n"
$sudo_cmd $docker_compose_cmd pull
echo ""
echo "🟡 Starting the SigNoz containers. It may take a few minutes ..."
echo
# The $docker_compose_cmd command does some nasty stuff for the `--detach` functionality. So we add a `|| true` so that the
# script doesn't exit because this command looks like it failed to do it's thing.
$sudo_cmd $docker_compose_cmd up --detach --remove-orphans || true
wait_for_containers_start 60
echo ""
if [[ $status_code -ne 200 ]]; then
echo "+++++++++++ ERROR ++++++++++++++++++++++"
echo "🔴 The containers didn't seem to start correctly. Please run the following command to check containers that may have errored out:"
echo ""
echo "cd ${DOCKER_STANDALONE_DIR}"
echo "$sudo_cmd $docker_compose_cmd ps -a"
echo ""
echo "Try bringing down the containers and retrying the installation"
echo "cd ${DOCKER_STANDALONE_DIR}"
echo "$sudo_cmd $docker_compose_cmd down -v"
echo ""
echo "Please read our troubleshooting guide https://signoz.io/docs/install/troubleshooting/"
echo "or reach us on SigNoz for support https://signoz.io/slack"
echo "++++++++++++++++++++++++++++++++++++++++"
send_event "installation_error_checks"
exit 1
else
send_event "installation_success"
echo "++++++++++++++++++ SUCCESS ++++++++++++++++++++++"
echo ""
echo "🟢 Your installation is complete!"
echo ""
echo -e "🟢 SigNoz is running on http://localhost:8080"
echo ""
echo " By default, retention period is set to 15 days for logs and traces, and 30 days for metrics."
echo -e "To change this, navigate to the General tab on the Settings page of SigNoz UI. For more details, refer to https://signoz.io/docs/userguide/retention-period \n"
echo " To bring down SigNoz and clean volumes:"
echo ""
echo "cd ${DOCKER_STANDALONE_DIR}"
echo "$sudo_cmd $docker_compose_cmd down -v"
echo ""
echo "+++++++++++++++++++++++++++++++++++++++++++++++++"
echo ""
echo "👉 Need help in Getting Started?"
echo -e "Join us on Slack https://signoz.io/slack"
echo ""
echo -e "\n📨 Please share your email to receive support & updates about SigNoz!"
read -rp 'Email: ' email
while [[ $email == "" ]]
do
read -rp 'Email: ' email
done
send_event "identify_successful_installation"
fi
echo -e "\n🙏 Thank you!\n"

File diff suppressed because it is too large Load Diff

View File

@@ -1,74 +0,0 @@
{
"required": [
"posthog",
"appcues",
"sentry",
"pylon"
],
"additionalProperties": false,
"definitions": {
"Appcues": {
"required": [
"enabled"
],
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean"
}
},
"type": "object"
},
"Posthog": {
"required": [
"enabled"
],
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean"
}
},
"type": "object"
},
"Pylon": {
"required": [
"enabled"
],
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean"
}
},
"type": "object"
},
"Sentry": {
"required": [
"enabled"
],
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean"
}
},
"type": "object"
}
},
"properties": {
"appcues": {
"$ref": "#/definitions/Appcues"
},
"posthog": {
"$ref": "#/definitions/Posthog"
},
"pylon": {
"$ref": "#/definitions/Pylon"
},
"sentry": {
"$ref": "#/definitions/Sentry"
}
},
"type": "object"
}

View File

@@ -13,12 +13,13 @@ Before diving in, make sure you have these tools installed:
- Download from [go.dev/dl](https://go.dev/dl/)
- Check [go.mod](../../go.mod#L3) for the minimum version
- **Node** - Powers our frontend
- Download from [nodejs.org](https://nodejs.org)
- Check [.nvmrc](../../frontend/.nvmrc) for the version
- **Pnpm** - Our frontend package manager
- Follow the [installation guide](https://pnpm.io/installation)
- **Yarn** - Our frontend package manager
- Follow the [installation guide](https://yarnpkg.com/getting-started/install)
- **Docker** - For running Clickhouse and Postgres locally
- Get it from [docs.docker.com/get-docker](https://docs.docker.com/get-docker/)
@@ -94,7 +95,7 @@ This command:
2. Install dependencies:
```bash
pnpm install
yarn install
```
3. Create a `.env` file in this directory:
@@ -104,10 +105,10 @@ This command:
4. Start the development server:
```bash
pnpm dev
yarn dev
```
> 💡 **Tip**: `pnpm dev` will automatically rebuild when you make changes to the code
> 💡 **Tip**: `yarn dev` will automatically rebuild when you make changes to the code
Now you're all set to start developing! Happy coding! 🎉

View File

@@ -109,20 +109,6 @@ func (h *handler) CreateThing(rw http.ResponseWriter, req *http.Request) {
}
```
When you need an ID from `claims` as a `valuer.UUID` (for example to pass it to a module), derive it with the `Must*` constructor instead of `NewUUID` plus an error check. Claims are validated by the auth middleware, so the conversion cannot fail and the error branch would be dead code:
```go
// Good — claims are pre-validated, the conversion cannot fail.
orgID := valuer.MustNewUUID(claims.OrgID)
// Avoid — the error path is unreachable.
orgID, err := valuer.NewUUID(claims.OrgID)
if err != nil {
render.Error(rw, err)
return
}
```
### 3. Register the handler in `signozapiserver`
In `pkg/apiserver/signozapiserver`, add a route in the appropriate `add*Routes` function (`addUserRoutes`, `addSessionRoutes`, `addOrgRoutes`, etc.). The pattern is:
@@ -137,7 +123,6 @@ if err := router.Handle("/api/v1/things", handler.New(
Description: "This endpoint creates a thing",
Request: new(types.PostableThing),
RequestContentType: "application/json",
RequestQuery: new(types.QueryableThing),
Response: new(types.GettableThing),
ResponseContentType: "application/json",
SuccessStatusCode: http.StatusCreated,
@@ -170,8 +155,6 @@ The `handler.New` function ties the HTTP handler to OpenAPI metadata via `OpenAP
- **Request / RequestContentType**:
- `Request` is a Go type that describes the request body or form.
- `RequestContentType` is usually `"application/json"` or `"application/x-www-form-urlencoded"` (for callbacks like SAML).
- **RequestQuery**:
- `RequestQuery` is a Go type that descirbes query url params.
- **RequestExamples**: An array of `handler.OpenAPIExample` that provide concrete request payloads in the generated spec. See [Adding request examples](#adding-request-examples) below.
- **Response / ResponseContentType**:
- `Response` is the Go type for the successful response payload.
@@ -347,50 +330,6 @@ func (Step) JSONSchema() (jsonschema.Schema, error) {
}
```
### `oneOf` with a discriminator
For a sum type whose variants are keyed by a property (e.g. `kind`), expose the variants via `JSONSchemaOneOf()` and add a discriminator. Without it, code generators intersect the variants (`A & B & C`) instead of producing a clean discriminated union (`A | B | C`).
The parent keeps its `JSONSchemaOneOf()` (the `oneOf` itself) and *additionally* tags it via `PrepareJSONSchema` with the `x-signoz-discriminator` extension; `signoz.attachDiscriminators` then promotes that marker to a real OpenAPI 3 `discriminator` (and strips the duplicate parent properties) after reflection.
```go
// On the parent: expose the oneOf variants...
func (Plugin) JSONSchemaOneOf() []any {
return []any{FooVariant{}}
}
// ...and tag that same oneOf with the discriminator marker.
func (Plugin) PrepareJSONSchema(s *jsonschema.Schema) error {
if s.ExtraProperties == nil {
s.ExtraProperties = map[string]any{}
}
s.ExtraProperties["x-signoz-discriminator"] = map[string]any{
"propertyName": "kind",
"mapping": map[string]string{
"signoz/Foo": "#/components/schemas/FooVariant",
},
}
return nil
}
```
Each variant must declare the discriminator property (`kind`) and mark it `required`.
This produces the following in the generated OpenAPI spec:
```yaml
Plugin:
discriminator:
propertyName: kind
mapping:
signoz/Foo: '#/components/schemas/FooVariant'
oneOf:
- $ref: '#/components/schemas/FooVariant'
type: object
```
Note the discriminator property lives in the variants, not on the parent — the parent is only the union.
## What should I remember?
@@ -401,4 +340,3 @@ Note the discriminator property lives in the variants, not on the parent — the
- **Add `nullable:"true"`** on fields that can be `null`. Pay special attention to slices and maps -- in Go these default to `nil` which serializes to `null`. If the field should always be an array, initialize it and do not mark it nullable.
- **Implement `Enum()`** on every type that has a fixed set of acceptable values so the JSON schema generates proper `enum` constraints.
- **Add request examples** via `RequestExamples` in `OpenAPIDef` for any non-trivial endpoint. See `pkg/apiserver/signozapiserver/querier.go` for reference.
- **Derive IDs from `claims` with `valuer.MustNewUUID`** (e.g. `claims.OrgID`, `claims.UserID`). Claims are pre-validated by the auth middleware, so use the `Must*` constructor — don't write `NewUUID` followed by an `if err != nil { render.Error(...); return }` block.

View File

@@ -0,0 +1,215 @@
# Integration Tests
SigNoz uses integration tests to verify that different components work together correctly in a real environment. These tests run against actual services (ClickHouse, PostgreSQL, etc.) to ensure end-to-end functionality.
## How to set up the integration test environment?
### Prerequisites
Before running integration tests, ensure you have the following installed:
- Python 3.13+
- [uv](https://docs.astral.sh/uv/getting-started/installation/)
- Docker (for containerized services)
### Initial Setup
1. Navigate to the integration tests directory:
```bash
cd tests/integration
```
2. Install dependencies using uv:
```bash
uv sync
```
> **_NOTE:_** the build backend could throw an error while installing `psycopg2`, pleae see https://www.psycopg.org/docs/install.html#build-prerequisites
### Starting the Test Environment
To spin up all the containers necessary for writing integration tests and keep them running:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/bootstrap/setup.py::test_setup
```
This command will:
- Start all required services (ClickHouse, PostgreSQL, Zookeeper, etc.)
- Keep containers running due to the `--reuse` flag
- Verify that the setup is working correctly
### Stopping the Test Environment
When you're done writing integration tests, clean up the environment:
```bash
uv run pytest --basetemp=./tmp/ -vv --teardown -s src/bootstrap/setup.py::test_teardown
```
This will destroy the running integration test setup and clean up resources.
## Understanding the Integration Test Framework
Python and pytest form the foundation of the integration testing framework. Testcontainers are used to spin up disposable integration environments. Wiremock is used to spin up **test doubles** of other services.
- **Why Python/pytest?** It's expressive, low-boilerplate, and has powerful fixture capabilities that make integration testing straightforward. Extensive libraries for HTTP requests, JSON handling, and data analysis (numpy) make it easier to test APIs and verify data
- **Why testcontainers?** They let us spin up isolated dependencies that match our production environment without complex setup.
- **Why wiremock?** Well maintained, documented and extensible.
```
.
├── conftest.py
├── fixtures
│ ├── __init__.py
│ ├── auth.py
│ ├── clickhouse.py
│ ├── fs.py
│ ├── http.py
│ ├── migrator.py
│ ├── network.py
│ ├── postgres.py
│ ├── signoz.py
│ ├── sql.py
│ ├── sqlite.py
│ ├── types.py
│ └── zookeeper.py
├── uv.lock
├── pyproject.toml
└── src
└── bootstrap
├── __init__.py
├── 01_database.py
├── 02_register.py
└── 03_license.py
```
Each test suite follows some important principles:
1. **Organization**: Test suites live under `src/` in self-contained packages. Fixtures (a pytest concept) live inside `fixtures/`.
2. **Execution Order**: Files are prefixed with two-digit numbers (`01_`, `02_`, `03_`) to ensure sequential execution.
3. **Time Constraints**: Each suite should complete in under 10 minutes (setup takes ~4 mins).
### Test Suite Design
Test suites should target functional domains or subsystems within SigNoz. When designing a test suite, consider these principles:
- **Functional Cohesion**: Group tests around a specific capability or service boundary
- **Data Flow**: Follow the path of data through related components
- **Change Patterns**: Components frequently modified together should be tested together
The exact boundaries for modules are intentionally flexible, allowing teams to define logical groupings based on their specific context and knowledge of the system.
Eg: The **bootstrap** integration test suite validates core system functionality:
- Database initialization
- Version check
Other test suites can be **pipelines, auth, querier.**
## How to write an integration test?
Now start writing an integration test. Create a new file `src/bootstrap/05_version.py` and paste the following:
```python
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_version(signoz: types.SigNoz) -> None:
response = requests.get(signoz.self.host_config.get("/api/v1/version"), timeout=2)
logger.info(response)
```
We have written a simple test which calls the `version` endpoint of the container in step 1. In **order to just run this function, run the following command:**
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/bootstrap/05_version.py::test_version
```
> Note: The `--reuse` flag is used to reuse the environment if it is already running. Always use this flag when writing and running integration tests. If you don't use this flag, the environment will be destroyed and recreated every time you run the test.
Here's another example of how to write a more comprehensive integration test:
```python
from http import HTTPStatus
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_user_registration(signoz: types.SigNoz) -> None:
"""Test user registration functionality."""
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/register"),
json={
"name": "testuser",
"orgId": "",
"orgName": "test.org",
"email": "test@example.com",
"password": "password123Z$",
},
timeout=2,
)
assert response.status_code == HTTPStatus.OK
assert response.json()["setupCompleted"] is True
```
## How to run integration tests?
### Running All Tests
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/
```
### Running Specific Test Categories
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/<suite>
# Run querier tests
uv run pytest --basetemp=./tmp/ -vv --reuse src/querier/
# Run auth tests
uv run pytest --basetemp=./tmp/ -vv --reuse src/auth/
```
### Running Individual Tests
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse src/<suite>/<file>.py::test_name
# Run test_register in file 01_register.py in passwordauthn suite
uv run pytest --basetemp=./tmp/ -vv --reuse src/passwordauthn/01_register.py::test_register
```
## How to configure different options for integration tests?
Tests can be configured using pytest options:
- `--sqlstore-provider` - Choose database provider (default: postgres)
- `--postgres-version` - PostgreSQL version (default: 15)
- `--clickhouse-version` - ClickHouse version (default: 25.5.6)
- `--zookeeper-version` - Zookeeper version (default: 3.7.1)
Example:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse --sqlstore-provider=postgres --postgres-version=14 src/auth/
```
## What should I remember?
- **Always use the `--reuse` flag** when setting up the environment to keep containers running
- **Use the `--teardown` flag** when cleaning up to avoid resource leaks
- **Follow the naming convention** with two-digit numeric prefixes (`01_`, `02_`) for test execution order
- **Use proper timeouts** in HTTP requests to avoid hanging tests
- **Clean up test data** between tests to avoid interference
- **Use descriptive test names** that clearly indicate what is being tested
- **Leverage fixtures** for common setup and authentication
- **Test both success and failure scenarios** to ensure robust functionality

View File

@@ -15,8 +15,8 @@ We **recommend** (almost enforce) reviewing these guides before contributing to
- [Endpoint](endpoint.md) - HTTP endpoint patterns
- [Flagger](flagger.md) - Feature flag patterns
- [Handler](handler.md) - HTTP handler patterns
- [Integration](integration.md) - Integration testing
- [Provider](provider.md) - Dependency injection and provider patterns
- [Packages](packages.md) - Naming, layout, and conventions for `pkg/` packages
- [Service](service.md) - Managed service lifecycle with `factory.Service`
- [SQL](sql.md) - Database and SQL patterns
- [Types](types.md) - Domain types, request/response bodies, and storage rows in `pkg/types/`

View File

@@ -1,152 +0,0 @@
# Types
Domain types in `pkg/types/<domain>/` live on three serialization boundaries — inbound HTTP, outbound HTTP, and SQL — on top of an in-memory domain representation. SigNoz's convention is **core-type-first**: every domain defines a single canonical type `X`, and specialized flavors (`PostableX`, `GettableX`, `UpdatableX`, `StorableX`) are introduced **only when they actually differ from `X`**. This guide spells out when each flavor is warranted and how they relate to each other.
Before reading, make sure you have read [abstractions.md](abstractions.md) — the rules here build on its guidance that every new type must earn its place.
## The core type is required
Every domain package in `pkg/types/<domain>/` defines exactly one core type `X`: `AuthDomain`, `Channel`, `Rule`, `Dashboard`, `Role`, `PlannedMaintenance`. This is the canonical in-memory representation of the domain object. Domain methods, validation invariants, and business logic hang off `X` — not off the flavor types.
Two rules shape how the core type behaves:
- **Conversions can be either `New<Output>From<Input>` or a receiver-style `(x *X) ToY()` method.** Either form is fine; pick whichever reads best at the call site:
```go
// Constructor form
func NewGettableAuthDomainFromAuthDomain(d *AuthDomain, info *AuthNProviderInfo) *GettableAuthDomain
// Receiver form
func (m *PlannedMaintenanceWithRules) ToPlannedMaintenance() *PlannedMaintenance
```
- **`X` can double as the storage row** when the DB shape would be identical. `Channel` embeds `bun.BaseModel` directly, and there is no `StorableChannel`. This is the preferred shape when it works.
Domain packages under `pkg/types/` must not import from other `pkg/` packages. Keep the core type's methods lightweight and push orchestration out to the module layer.
## Add a flavor only when it differs
For each of the four flavors, create it only if its shape diverges from `X`. If a flavor would have the same fields and tags as `X`, reuse `X` directly, or declare a type alias. Every flavor must earn its place per [abstractions.md](abstractions.md) rule 6 ("Wrappers must add semantics, not just rename").
| Flavor | Create it when it differs in… |
| ------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `PostableX` | JSON shape differs from `X` — typically no `Id`, no audit fields, no server-computed fields. Often owns input validation via `Validate()` or a custom `UnmarshalJSON`. |
| `GettableX` | Response shape adds server-computed fields that are not persisted — e.g., `GettableAuthDomain` adds `AuthNProviderInfo`, which is resolved at read time. |
| `UpdatableX` | Only a strict subset of `PostableX` is replaceable on PUT. If the updatable shape equals `PostableX`, reuse `PostableX`. |
| `StorableX` | DB row shape differs from `X` — usually `X` carries nested typed config while `StorableX` carries a flat `Data string` JSON column, plus bun tags, audit mixins, and an `OrgID`. If `X` already has those, skip the flavor. |
The failure mode this rule exists to prevent: minting all four flavors on reflex for every new resource, even when two or three are structurally identical. Each unnecessary flavor is another type contributors must understand and another conversion that can drift.
## Worked examples
### Channel — core type only
```go
type Channels = []*Channel
type GettableChannels = []*Channel
type Channel struct {
bun.BaseModel `bun:"table:notification_channel"`
types.Identifiable
types.TimeAuditable
Name string `json:"name" required:"true" bun:"name"`
Type string `json:"type" required:"true" bun:"type"`
Data string `json:"data" required:"true" bun:"data"`
OrgID string `json:"orgId" required:"true" bun:"org_id"`
}
```
`Channel` is both the domain type and the bun row. `GettableChannels` is a **type alias** because `*Channel` already serializes correctly as a response. There is no `StorableChannel`, `PostableChannel`, or `UpdatableChannel` — those would be identical to `Channel` and so do not exist. Prefer this shape when it works.
### AuthDomain — all four flavors
```go
type AuthDomain struct {
storableAuthDomain *StorableAuthDomain
authDomainConfig *AuthDomainConfig
}
type StorableAuthDomain struct {
bun.BaseModel `bun:"table:auth_domain"`
types.Identifiable
Name string `bun:"name"`
Data string `bun:"data"` // AuthDomainConfig serialized as JSON
OrgID valuer.UUID `bun:"org_id"`
types.TimeAuditable
}
type PostableAuthDomain struct {
Config AuthDomainConfig `json:"config"`
Name string `json:"name"`
}
type UpdateableAuthDomain struct {
Config AuthDomainConfig `json:"config"` // Name intentionally absent
}
type GettableAuthDomain struct {
*StorableAuthDomain
*AuthDomainConfig
AuthNProviderInfo *AuthNProviderInfo `json:"authNProviderInfo"`
}
```
Each flavor exists for a concrete reason:
- `StorableAuthDomain` stores the typed config as an opaque `Data string` column, so the schema does not need to migrate every time a config field is added.
- `PostableAuthDomain` carries the config as a structured object (not a string) for the request.
- `UpdateableAuthDomain` excludes `Name` because a domain's name cannot change after creation.
- `GettableAuthDomain` adds `AuthNProviderInfo`, which is derived at read time and never persisted.
The core `AuthDomain` holds the two live halves — `storableAuthDomain` and `authDomainConfig` — and owns business methods such as `Update(config)`. Conversions use the `New<Output>From<Input>` form: `NewAuthDomainFromConfig`, `NewAuthDomainFromStorableAuthDomain`, `NewGettableAuthDomainFromAuthDomain`.
## Conventions that tie the flavors together
- **Conversions** use either a `New<Output>From<Input>` constructor — e.g. `NewChannelFromReceiver`, `NewGettableAuthDomainFromAuthDomain` — or a receiver-style `ToY()` method. Both forms coexist in the codebase; use whichever fits the call site.
- **Validation belongs on the core type `X`.** Putting it on `X` means every write path — HTTP create, HTTP update, in-process migration, replay — runs the same checks. `Validate()` on `PostableX` is reserved for checks that are specific to the request shape and do not apply to `X`. `UnmarshalJSON` on `PostableX` is a separate tool that lives there because decoding only happens at the HTTP boundary — `PostableAuthDomain.UnmarshalJSON` rejecting a malformed domain name at decode time is the canonical example.
```go
// Domain invariants: every write path re-runs these.
func (x *X) Validate() error { ... }
// Request-shape-only: checks that do not apply once the value is persisted.
func (p *PostableX) Validate() error { ... }
```
- **Type aliases, not wrappers**, when two shapes are identical. `type GettableChannels = []*Channel` is correct because it adds no semantics beyond the underlying type.
- **Serialization tags** follow [handler.md](handler.md): `required:"true"` means the JSON key must be present, `nullable:"true"` is required on any slice or map that may serialize as `null`, and types with a fixed value set must implement `Enum() []any`.
## A note on `UpdatableX` and `PatchableX`
- `UpdatableX` — the body for PUT (full replace) when the shape is a strict subset of `PostableX`. If the updatable shape equals `PostableX`, reuse `PostableX`.
- `PatchableX` — the body for PATCH (partial update); only the fields a client is allowed to patch. For example, `PatchableRole` carries a single `Description` field even though `Role` has many — clients may patch the description but not anything else.
```go
type PatchableRole struct {
Description string `json:"description"`
}
```
Both are optional. Do not introduce them if `PostableX` already covers the case.
## What to avoid
- **Do not mint a flavor that mirrors the core type.** If `StorableX` would have the same fields as `X`, use `X` directly with `bun.BaseModel` embedded. `Channel` is the canonical example.
- **Do not bolt domain methods onto `StorableX`.** Storage types are data carriers. Domain methods live on `X`.
- **Do not invent new suffixes** (`Creatable`, `Fetchable`, `Savable`). The core type plus `Postable` / `Gettable` / `Updatable` / `Patchable` / `Storable` covers every case that exists today.
- **Spelling — `Updatable`, not `Updateable`.** `Updateable` is a common typo. Prefer the shorter form when introducing new types, and rename any stragglers you come across.
- **Spelling — `Storable`, not `Storeable`.** `Storeable` is a common typo. Prefer the shorter form when introducing new types, and rename any stragglers you come across.
## What should I remember?
- Every domain package defines the core type `X`. Only `X` is mandatory.
- Add `PostableX` / `GettableX` / `UpdatableX` / `StorableX` one at a time, only when the shape actually diverges from `X`.
- Domain logic lives on `X`, not on the flavor types.
- Conversions can be a `New<Output>From<Input>` constructor or a receiver-style `ToY()` method — pick whichever reads best at the call site.
- Use a type alias when two shapes are truly identical.
- `pkg/types/<domain>/` must not import from other `pkg/` packages.
## Further reading
- [abstractions.md](abstractions.md) — when to introduce a new type at all.
- [handler.md](handler.md) — struct tag rules at the HTTP boundary.
- [packages.md](packages.md) — where types live under `pkg/types/`.
- [sql.md](sql.md) — star-schema requirements for `StorableX`.

View File

@@ -7,12 +7,12 @@ This guide explains how to add new data sources to the SigNoz onboarding flow. T
The configuration is located at:
```
frontend/src/container/OnboardingV2Container/onboarding-configs/onboarding-config-with-links.ts
frontend/src/container/OnboardingV2Container/onboarding-configs/onboarding-config-with-links.json
```
## Structure Overview
## JSON Structure Overview
The configuration file exports a TypeScript array (`onboardingConfigWithLinks`) containing data source objects. Each object represents a selectable option in the onboarding flow. SVG logos are imported as ES modules at the top of the file.
The configuration file is a JSON array containing data source objects. Each object represents a selectable option in the onboarding flow.
## Data Source Object Keys
@@ -24,7 +24,7 @@ The configuration file exports a TypeScript array (`onboardingConfigWithLinks`)
| `label` | `string` | Display name shown to users (e.g., `"AWS EC2"`) |
| `tags` | `string[]` | Array of category tags for grouping (e.g., `["AWS"]`, `["database"]`) |
| `module` | `string` | Destination module after onboarding completion |
| `imgUrl` | `string` | Imported SVG URL **(SVG required)** (e.g., `import ec2Url from '@/assets/Logos/ec2.svg'`, then use `ec2Url`) |
| `imgUrl` | `string` | Path to the logo/icon **(SVG required)** (e.g., `"/Logos/ec2.svg"`) |
### Optional Keys
@@ -57,34 +57,36 @@ The `module` key determines where users are redirected after completing onboardi
The `question` object enables multi-step selection flows:
```ts
question: {
desc: 'What would you like to monitor?',
type: 'select',
helpText: 'Choose the telemetry type you want to collect.',
helpLink: '/docs/azure-monitoring/overview/',
helpLinkText: 'Read the guide →',
options: [
{
key: 'logging',
label: 'Logs',
imgUrl: azureVmUrl,
link: '/docs/azure-monitoring/app-service/logging/',
},
{
key: 'metrics',
label: 'Metrics',
imgUrl: azureVmUrl,
link: '/docs/azure-monitoring/app-service/metrics/',
},
{
key: 'tracing',
label: 'Traces',
imgUrl: azureVmUrl,
link: '/docs/azure-monitoring/app-service/tracing/',
},
],
},
```json
{
"question": {
"desc": "What would you like to monitor?",
"type": "select",
"helpText": "Choose the telemetry type you want to collect.",
"helpLink": "/docs/azure-monitoring/overview/",
"helpLinkText": "Read the guide →",
"options": [
{
"key": "logging",
"label": "Logs",
"imgUrl": "/Logos/azure-vm.svg",
"link": "/docs/azure-monitoring/app-service/logging/"
},
{
"key": "metrics",
"label": "Metrics",
"imgUrl": "/Logos/azure-vm.svg",
"link": "/docs/azure-monitoring/app-service/metrics/"
},
{
"key": "tracing",
"label": "Traces",
"imgUrl": "/Logos/azure-vm.svg",
"link": "/docs/azure-monitoring/app-service/tracing/"
}
]
}
}
```
### Question Keys
@@ -104,161 +106,152 @@ Options can be simple (direct link) or nested (with another question):
### Simple Option (Direct Link)
```ts
```json
{
key: 'aws-ec2-logs',
label: 'Logs',
imgUrl: ec2Url,
link: '/docs/userguide/collect_logs_from_file/',
},
"key": "aws-ec2-logs",
"label": "Logs",
"imgUrl": "/Logos/ec2.svg",
"link": "/docs/userguide/collect_logs_from_file/"
}
```
### Option with Internal Redirect
```ts
```json
{
key: 'aws-ec2-metrics-one-click',
label: 'One Click AWS',
imgUrl: ec2Url,
link: '/integrations?integration=aws-integration&service=ec2',
internalRedirect: true,
},
"key": "aws-ec2-metrics-one-click",
"label": "One Click AWS",
"imgUrl": "/Logos/ec2.svg",
"link": "/integrations?integration=aws-integration&service=ec2",
"internalRedirect": true
}
```
> **Important**: Set `internalRedirect: true` only for internal app routes (like `/integrations?...`). Docs links should NOT have this flag.
### Nested Option (Multi-step Flow)
```ts
```json
{
key: 'aws-ec2-metrics',
label: 'Metrics',
imgUrl: ec2Url,
question: {
desc: 'How would you like to set up monitoring?',
helpText: 'Choose your setup method.',
options: [...],
},
},
"key": "aws-ec2-metrics",
"label": "Metrics",
"imgUrl": "/Logos/ec2.svg",
"question": {
"desc": "How would you like to set up monitoring?",
"helpText": "Choose your setup method.",
"options": [...]
}
}
```
## Examples
### Simple Data Source (Direct Link)
```ts
import elbUrl from '@/assets/Logos/elb.svg';
// inside the onboardingConfigWithLinks array:
```json
{
dataSource: 'aws-elb',
label: 'AWS ELB',
tags: ['AWS'],
module: 'logs',
relatedSearchKeywords: [
'aws',
'aws elb',
'elb logs',
'elastic load balancer',
"dataSource": "aws-elb",
"label": "AWS ELB",
"tags": ["AWS"],
"module": "logs",
"relatedSearchKeywords": [
"aws",
"aws elb",
"elb logs",
"elastic load balancer"
],
imgUrl: elbUrl,
link: '/docs/aws-monitoring/elb/',
},
"imgUrl": "/Logos/elb.svg",
"link": "/docs/aws-monitoring/elb/"
}
```
### Data Source with Single Question Level
```ts
import azureVmUrl from '@/assets/Logos/azure-vm.svg';
// inside the onboardingConfigWithLinks array:
```json
{
dataSource: 'app-service',
label: 'App Service',
imgUrl: azureVmUrl,
tags: ['Azure'],
module: 'apm',
relatedSearchKeywords: ['azure', 'app service'],
question: {
desc: 'What telemetry data do you want to visualise?',
type: 'select',
options: [
"dataSource": "app-service",
"label": "App Service",
"imgUrl": "/Logos/azure-vm.svg",
"tags": ["Azure"],
"module": "apm",
"relatedSearchKeywords": ["azure", "app service"],
"question": {
"desc": "What telemetry data do you want to visualise?",
"type": "select",
"options": [
{
key: 'logging',
label: 'Logs',
imgUrl: azureVmUrl,
link: '/docs/azure-monitoring/app-service/logging/',
"key": "logging",
"label": "Logs",
"imgUrl": "/Logos/azure-vm.svg",
"link": "/docs/azure-monitoring/app-service/logging/"
},
{
key: 'metrics',
label: 'Metrics',
imgUrl: azureVmUrl,
link: '/docs/azure-monitoring/app-service/metrics/',
"key": "metrics",
"label": "Metrics",
"imgUrl": "/Logos/azure-vm.svg",
"link": "/docs/azure-monitoring/app-service/metrics/"
},
{
key: 'tracing',
label: 'Traces',
imgUrl: azureVmUrl,
link: '/docs/azure-monitoring/app-service/tracing/',
},
],
},
},
"key": "tracing",
"label": "Traces",
"imgUrl": "/Logos/azure-vm.svg",
"link": "/docs/azure-monitoring/app-service/tracing/"
}
]
}
}
```
### Data Source with Nested Questions (2-3 Levels)
```ts
import ec2Url from '@/assets/Logos/ec2.svg';
// inside the onboardingConfigWithLinks array:
```json
{
dataSource: 'aws-ec2',
label: 'AWS EC2',
tags: ['AWS'],
module: 'logs',
relatedSearchKeywords: ['aws', 'aws ec2', 'ec2 logs', 'ec2 metrics'],
imgUrl: ec2Url,
question: {
desc: 'What would you like to monitor for AWS EC2?',
type: 'select',
helpText: 'Choose the type of telemetry data you want to collect.',
options: [
"dataSource": "aws-ec2",
"label": "AWS EC2",
"tags": ["AWS"],
"module": "logs",
"relatedSearchKeywords": ["aws", "aws ec2", "ec2 logs", "ec2 metrics"],
"imgUrl": "/Logos/ec2.svg",
"question": {
"desc": "What would you like to monitor for AWS EC2?",
"type": "select",
"helpText": "Choose the type of telemetry data you want to collect.",
"options": [
{
key: 'aws-ec2-logs',
label: 'Logs',
imgUrl: ec2Url,
link: '/docs/userguide/collect_logs_from_file/',
"key": "aws-ec2-logs",
"label": "Logs",
"imgUrl": "/Logos/ec2.svg",
"link": "/docs/userguide/collect_logs_from_file/"
},
{
key: 'aws-ec2-metrics',
label: 'Metrics',
imgUrl: ec2Url,
question: {
desc: 'How would you like to set up EC2 Metrics monitoring?',
helpText: 'One Click uses AWS CloudWatch integration. Manual setup uses OpenTelemetry.',
helpLink: '/docs/aws-monitoring/one-click-vs-manual/',
helpLinkText: 'Read the comparison guide →',
options: [
"key": "aws-ec2-metrics",
"label": "Metrics",
"imgUrl": "/Logos/ec2.svg",
"question": {
"desc": "How would you like to set up EC2 Metrics monitoring?",
"helpText": "One Click uses AWS CloudWatch integration. Manual setup uses OpenTelemetry.",
"helpLink": "/docs/aws-monitoring/one-click-vs-manual/",
"helpLinkText": "Read the comparison guide →",
"options": [
{
key: 'aws-ec2-metrics-one-click',
label: 'One Click AWS',
imgUrl: ec2Url,
link: '/integrations?integration=aws-integration&service=ec2',
internalRedirect: true,
"key": "aws-ec2-metrics-one-click",
"label": "One Click AWS",
"imgUrl": "/Logos/ec2.svg",
"link": "/integrations?integration=aws-integration&service=ec2",
"internalRedirect": true
},
{
key: 'aws-ec2-metrics-manual',
label: 'Manual Setup',
imgUrl: ec2Url,
link: '/docs/tutorial/opentelemetry-binary-usage-in-virtual-machine/',
},
],
},
},
],
},
},
"key": "aws-ec2-metrics-manual",
"label": "Manual Setup",
"imgUrl": "/Logos/ec2.svg",
"link": "/docs/tutorial/opentelemetry-binary-usage-in-virtual-machine/"
}
]
}
}
]
}
}
```
## Best Practices
@@ -277,16 +270,10 @@ import ec2Url from '@/assets/Logos/ec2.svg';
### 3. Logos
- Place logo files in `src/assets/Logos/`
- Place logo files in `public/Logos/`
- Use SVG format
- Import the SVG at the top of the file and reference the imported variable:
```ts
import myServiceUrl from '@/assets/Logos/my-service.svg';
// then in the config object:
imgUrl: myServiceUrl,
```
- **Fetching Icons**: New icons can be easily fetched from [OpenBrand](https://openbrand.sh/). Use the pattern `https://openbrand.sh/?url=<TARGET_URL>`, where `<TARGET_URL>` is the URL-encoded link to the service's website. For example, to get Render's logo, use [https://openbrand.sh/?url=https%3A%2F%2Frender.com](https://openbrand.sh/?url=https%3A%2F%2Frender.com).
- **Optimize new SVGs**: Run any newly downloaded SVGs through an optimizer like [SVGOMG (svgo)](https://svgomg.net/) or use `npx svgo src/assets/Logos/your-logo.svg` to minimise their size before committing.
- Reference as `"/Logos/your-logo.svg"`
- **Optimize new SVGs**: Run any newly downloaded SVGs through an optimizer like [SVGOMG (svgo)](https://svgomg.net/) or use `npx svgo public/Logos/your-logo.svg` to minimise their size before committing.
### 4. Links
@@ -302,9 +289,9 @@ import ec2Url from '@/assets/Logos/ec2.svg';
## Adding a New Data Source
1. Add the logo SVG to `src/assets/Logos/` and add a top-level import in the config file (e.g., `import myServiceUrl from '@/assets/Logos/my-service.svg'`)
2. Add your data source object to the `onboardingConfigWithLinks` array, referencing the imported variable for `imgUrl`
3. Test the flow locally with `pnpm dev`
1. Add your data source object to the JSON array
2. Ensure the logo exists in `public/Logos/`
3. Test the flow locally with `yarn dev`
4. Validation:
- Navigate to the [onboarding page](http://localhost:3301/get-started-with-signoz-cloud) on your local machine
- Data source appears in the list

View File

@@ -1,300 +0,0 @@
# E2E Tests
SigNoz uses end-to-end tests to verify the frontend works correctly against a real backend. These tests use Playwright to drive a real browser against a containerized SigNoz stack that pytest brings up — the same fixture graph integration tests use, with an extra HTTP seeder container for per-spec telemetry seeding.
## How to set up the E2E test environment?
### Prerequisites
Before running E2E tests, ensure you have the following installed:
- Python 3.13+
- [uv](https://docs.astral.sh/uv/getting-started/installation/)
- Docker (for containerized services)
- Node 18+ and Yarn
### Initial Setup
1. Install Python deps for the shared tests project:
```bash
cd tests
uv sync
```
2. Install Node deps and Playwright browsers:
```bash
cd e2e
yarn install
yarn install:browsers # one-time Playwright browser install
```
### Starting the Test Environment
To spin up the backend stack (SigNoz, ClickHouse, Postgres, Zookeeper, Zeus mock, gateway mock, seeder, migrator-with-web) and keep it running:
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse --with-web \
e2e/bootstrap/setup.py::test_setup
```
This command will:
- Bring up all containers via pytest fixtures
- Register the admin user (`admin@integration.test` / `password123Z$`)
- Apply the enterprise license (via a WireMock stub of Zeus) and dismiss the org-onboarding prompt so specs can navigate directly to feature pages
- Start the HTTP seeder container (`tests/seeder/` — exposing `/telemetry/{traces,logs,metrics}` POST + DELETE)
- Write backend coordinates to `tests/e2e/.env.local` (loaded by `playwright.config.ts` via dotenv)
- Keep containers running via the `--reuse` flag
The `--with-web` flag builds the frontend into the SigNoz container — required for E2E. The build takes ~4 mins on a cold start.
### Stopping the Test Environment
When you're done writing E2E tests, clean up the environment:
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --teardown \
e2e/bootstrap/setup.py::test_teardown
```
## Understanding the E2E Test Framework
Playwright drives a real browser (Chromium / Firefox / WebKit) against the running SigNoz frontend. The backend is brought up by the same pytest fixture graph integration tests use, so both suites share one source of truth for container lifecycle, license seeding, and test-user accounts.
- **Why Playwright?** First-class TypeScript support, network interception, automatic wait-for-visibility, built-in trace viewer that captures every request/response the UI triggers — so specs rarely need separate API probes alongside UI clicks.
- **Why pytest for lifecycle?** The integration suite already owns container bring-up. Reusing it keeps the E2E stack exactly in sync with the integration stack and avoids a parallel lifecycle framework.
- **Why a separate seeder container?** Per-spec telemetry seeding (traces / logs / metrics) needs a thin HTTP wrapper around the ClickHouse insert helpers so a browser spec can POST from inside the test. The seeder lives at `tests/seeder/`, is built from `tests/Dockerfile.seeder`, and reuses the same `fixtures/{traces,logs,metrics}.py` as integration tests.
```
tests/
├── fixtures/ # shared with integration (see integration.md)
├── integration/ # pytest integration suite
├── seeder/ # standalone HTTP seeder container
│ ├── __init__.py
│ ├── Dockerfile
│ └── server.py # FastAPI app wrapping fixtures.{traces,logs,metrics}
└── e2e/
├── package.json
├── playwright.config.ts # loads .env + .env.local via dotenv
├── .env.example # staging-mode template
├── .env.local # generated by bootstrap/setup.py (gitignored)
├── bootstrap/
│ └── setup.py # test_setup / test_teardown — pytest lifecycle
├── fixtures/ # Playwright test fixtures (test.extend) only
│ └── auth.ts
├── helpers/ # function helpers + the constants they share with tests
│ ├── auth.ts
│ └── dashboards.ts
├── testdata/ # static data files (JSON) used by helpers and tests
│ └── apm-metrics.json # (example)
├── tests/ # Playwright .spec.ts files, one dir per feature area
│ └── alerts/
│ └── alerts.spec.ts # (example)
└── artifacts/ # per-run output (gitignored)
├── html/ # HTML reporter output
├── json/ # JSON reporter output
└── results/ # per-test traces / screenshots / videos on failure
```
### `fixtures/` vs `helpers/` — what goes where
These two folders look similar but mean different things:
- **`fixtures/`** holds *Playwright test fixtures* (created via `test.extend({...})`). By the canonical definition, a fixture is "a consistent, predefined set of data, objects, or environmental conditions used to ensure tests run in a stable state" — i.e. setup/teardown that runs *automatically* around each test or worker. `auth.ts` matches: it extends Playwright's `test` with an `authedPage` that's logged-in before every test runs and torn down after. If the only thing in this folder ever is `auth.ts`, that's fine — fixtures are a deliberately small surface.
- **`helpers/`** holds plain function helpers that you call *explicitly* from a test or hook — they don't extend Playwright's `test`. This covers both behaviour helpers (e.g. `gotoDashboardsList(page)`) and the constants those helpers and the tests both refer to (e.g. `SEARCH_PLACEHOLDER`). Constants live next to the helpers that use them so a single import line in a test covers both.
- **`testdata/`** holds static data files (typically JSON / YAML) consumed by the helpers — for example, `apm-metrics.json`, a real dashboard payload uploaded through the UI by an importer helper.
Rule of thumb: if it's a `test.extend` fixture, put it in `fixtures/`. If it's a function you call explicitly (or a constant the function uses), put it in `helpers/`. If it's a static file the helpers read, put it in `testdata/`.
Each spec follows these principles:
1. **Directory per feature**: `tests/e2e/tests/<feature>/*.spec.ts`. Cross-resource junction concerns (e.g. cascade-delete) go in their own file, not packed into one giant spec.
2. **Test titles use `TC-NN`**: `test('TC-01 alerts page — tabs render', ...)`. Preserves ordering at a glance and maps to external coverage tracking.
3. **UI-first**: drive flows through the UI. Playwright traces capture every BE request/response the UI triggers, so asserting on UI outcomes implicitly validates BE contracts. Reach for direct `page.request.*` only when the test's *purpose* is asserting a response contract (use `page.waitForResponse` on a UI click) or when a specific UI step is structurally flaky (e.g. Ant DatePicker calendar-cell indices) — and even then try UI first.
4. **Self-contained state**: each spec seeds its own data and cleans up at suite teardown. The pytest harness creates a fresh stack with **zero** dashboards / alerts / etc. — never assume pre-existing data. Two patterns work:
- **Per-test seed + cleanup in `try / finally`** — small specs where each test owns its data.
- **Suite-level seed + `afterAll` teardown** — preferred for larger specs. Each `createDashboard(...)` call adds the resulting ID to a module-level `Set<string>`, and one `test.afterAll(...)` deletes everything in the set. See `tests/e2e/tests/dashboards/list.spec.ts` for the full pattern. `test.beforeAll` / `test.afterAll` cannot use `authedPage` directly (it's test-scoped); use `newAdminContext(browser)` from `helpers/auth.ts` instead — it performs one fresh login per suite hook.
5. **Seed via API when the UI flow is multi-step or brittle.** The frontend stores its JWT in `localStorage` under `AUTH_TOKEN`; `page.request.*` inherits the auth fixture's storage state. A typical pattern:
```ts
const token = await page.evaluate(
() => (globalThis as any).localStorage.getItem('AUTH_TOKEN') || '',
);
await page.request.post('/api/v1/dashboards', {
data: { title: 'my-name', uploadedGrafana: false },
headers: { Authorization: `Bearer ${token}` },
});
```
This is faster and more reliable than a multi-step UI seed. Reach for the UI flow only when the test's *purpose* is asserting that flow.
6. **Reusable static data lives in `tests/e2e/testdata/`.** For example, `apm-metrics.json` is a real dashboard payload that `importApmMetricsDashboardViaUI` (in `helpers/dashboards.ts`) uploads through the actual Import JSON UI flow to seed a richly-tagged dashboard for search/list tests.
## How to write an E2E test?
Create a new file `tests/e2e/tests/alerts/smoke.spec.ts`:
```typescript
import { test, expect } from '../../fixtures/auth';
test('TC-01 alerts page — tabs render', async ({ authedPage: page }) => {
await page.goto('/alerts');
await expect(page.getByRole('tab', { name: /alert rules/i })).toBeVisible();
await expect(page.getByRole('tab', { name: /configuration/i })).toBeVisible();
});
```
The `authedPage` fixture (from `tests/e2e/fixtures/auth.ts`) gives you a `Page` whose browser context is already authenticated as the admin user. First use per worker triggers one login; the resulting `storageState` is held in memory and reused for later requests.
To run just this test (assuming the stack is up via `test_setup`):
```bash
cd tests/e2e
npx playwright test tests/alerts/smoke.spec.ts --project=chromium
```
Here's a more comprehensive example that exercises a CRUD flow via the UI:
```typescript
import { test, expect } from '../../fixtures/auth';
test.describe.configure({ mode: 'serial' });
test('TC-02 alerts list — create, toggle, delete', async ({ authedPage: page }) => {
await page.goto('/alerts?tab=AlertRules');
const name = 'smoke-rule';
// Seed via UI — click "New Alert", fill form, save.
await page.getByRole('button', { name: /new alert/i }).click();
await page.getByTestId('alert-name-input').fill(name);
// ... fill metric / threshold / save ...
// Find the row and exercise the action menu.
const row = page.locator('tr', { hasText: name });
await expect(row).toBeVisible();
await row.locator('[data-testid="alert-actions"] button').first().click();
// waitForResponse captures the network call the UI triggers — no parallel fetch needed.
const patchWait = page.waitForResponse(
(r) => r.url().includes('/rules/') && r.request().method() === 'PATCH',
);
await page.getByRole('menuitem').filter({ hasText: /^disable$/i }).click();
await patchWait;
await expect(row).toContainText(/disabled/i);
});
```
### Locator priority
1. `getByTestId('...')` — preferred when the source exposes one. Stable, app-author-provided handle that survives copy-edits.
2. `getByRole('button', { name: 'Submit' })`
3. `getByLabel('Email')`
4. `getByPlaceholder('...')`
5. `getByText('...')`
6. `locator('.ant-select')` — last resort (Ant Design dropdowns often have no semantic alternative)
## Agents
Three Claude agents in `.claude/agents/` accelerate writing and maintaining E2E specs:
- **`playwright-test-planner`** — explores a feature in a real browser plus the local frontend source and writes a test plan as a scratch markdown file (under `tests/e2e/specs/`, which is gitignored — plans are working artifacts for the generator, not committed docs).
- **`playwright-test-generator`** — converts a test plan into Playwright spec files under `tests/e2e/tests/<feature>/`. Drives each scenario through MCP browser tools and emits TC-NN-titled tests using the `authedPage` fixture and the API-seed pattern.
- **`playwright-test-healer`** — runs failing specs, debugs them with snapshots / console / network introspection, and edits the spec to fix selector drift, timing, or state-leak issues.
The agents rely on the Playwright-test MCP server (`mcp__playwright-test__*` tools). Configure it in your Claude MCP settings; the permission allowlist lives in [.claude/settings.local.json](../../../.claude/settings.local.json).
## How to run E2E tests?
### Running All Tests
With the stack already up, from `tests/e2e/`:
```bash
yarn test # headless, all projects
```
### Running Specific Projects
```bash
yarn test:chromium # chromium only
yarn test:firefox
yarn test:webkit
```
### Running Specific Tests
```bash
cd tests/e2e
# Single feature dir
npx playwright test tests/alerts/ --project=chromium
# Single file
npx playwright test tests/alerts/alerts.spec.ts --project=chromium
# Single test by title grep
npx playwright test --project=chromium -g "TC-01"
```
### Iterative modes
```bash
yarn test:ui # Playwright UI mode — watch + step through
yarn test:headed # headed browser
yarn test:debug # Playwright inspector, pause-on-breakpoint
yarn codegen # record-and-replay locator generation
yarn report # open the last HTML report (artifacts/html)
```
### Staging fallback
Point `SIGNOZ_E2E_BASE_URL` at a remote env via `.env` — no local backend bring-up, no `.env.local` generated, Playwright hits the URL directly:
```bash
cd tests/e2e
cp .env.example .env # fill SIGNOZ_E2E_USERNAME / PASSWORD
yarn test:staging
```
## How to configure different options for E2E tests?
### Environment variables
| Variable | Description |
|---|---|
| `SIGNOZ_E2E_BASE_URL` | Base URL the browser targets. Written by `bootstrap/setup.py` for local mode; set manually for staging. |
| `SIGNOZ_E2E_USERNAME` | Admin email. Bootstrap writes `admin@integration.test`. |
| `SIGNOZ_E2E_PASSWORD` | Admin password. Bootstrap writes the integration-test default. |
| `SIGNOZ_E2E_SEEDER_URL` | Seeder HTTP base URL — hit by specs that need per-test telemetry. |
Loading order in `playwright.config.ts`: `.env` first (user-provided, staging), then `.env.local` with `override: true` (bootstrap-generated, local mode). Anything already set in `process.env` at yarn-test time wins because dotenv doesn't touch vars that are already present.
### Playwright options
The full `playwright.config.ts` is the source of truth. Common things to tweak:
- `projects` — Chromium / Firefox / WebKit are enabled by default. Disable to speed up iteration.
- `retries` — `2` on CI (`process.env.CI`), `0` locally.
- `fullyParallel: true` — files run in parallel by worker; within a file, use `test.describe.configure({ mode: 'serial' })` if tests share list pages / mutate shared state.
- `trace: 'on-first-retry'`, `screenshot: 'only-on-failure'`, `video: 'retain-on-failure'` — default diagnostic artifacts land in `artifacts/results/<test>/`.
### Pytest options (bootstrap side)
The same pytest flags integration tests expose work here, since E2E reuses the shared fixture graph:
- `--reuse` — keep containers warm between runs (required for all iteration).
- `--teardown` — tear everything down.
- `--with-web` — build the frontend into the SigNoz container. **Required for E2E**; integration tests don't need it.
- `--sqlstore-provider`, `--postgres-version`, `--clickhouse-version`, etc. — see `docs/contributing/integration.md`.
## What should I remember?
- **Always use the `--reuse` flag** when setting up the E2E stack. `--with-web` adds a ~4 min frontend build; you only want to pay that once.
- **Don't teardown before setup.** `--reuse` correctly handles partially-set-up state, so chaining teardown → setup wastes time.
- **Prefer UI-driven flows.** Playwright captures BE requests in the trace; a parallel `fetch` probe is almost always redundant. Drop to `page.request.*` only when the UI can't reach what you need.
- **Use `page.waitForResponse` on UI clicks** to assert BE contracts — it still exercises the UI trigger path.
- **Title every test `TC-NN <short description>`** — keeps the suite navigable and reportable.
- **Split by resource, not by regression suite.** One spec per feature resource; cross-resource junction concerns (cascade-delete, linked-edit) get their own file.
- **Use short descriptive resource names** (`alerts-list-rule`, `labels-rule`, `downtime-once`) — no timestamp disambiguation. Each test owns its resources and cleans up in `try/finally`.
- **Never commit `test.only`** — a pre-commit check or CI runs with `forbidOnly: true`.
- **Prefer explicit waits over `page.waitForTimeout(ms)`.** `await expect(locator).toBeVisible()` is always better than `waitForTimeout(5000)`.
- **Unique test names won't save you from shared-tenant state.** When two tests hit the same list page, either serialize (`describe.configure({ mode: 'serial' })`) or isolate cleanup religiously.
- **Artifacts go to `tests/e2e/artifacts/`** — HTML report at `artifacts/html`, traces at `artifacts/results/<test>/`. All gitignored; archive the dir in CI.

View File

@@ -1,251 +0,0 @@
# Integration Tests
SigNoz uses integration tests to verify that different components work together correctly in a real environment. These tests run against actual services (ClickHouse, PostgreSQL, SigNoz, Zeus mock, Keycloak, etc.) spun up as containers, so suites exercise the same code paths production does.
## How to set up the integration test environment?
### Prerequisites
Before running integration tests, ensure you have the following installed:
- Python 3.13+
- [uv](https://docs.astral.sh/uv/getting-started/installation/)
- Docker (for containerized services)
### Initial Setup
1. Navigate to the shared tests project:
```bash
cd tests
```
2. Install dependencies using uv:
```bash
uv sync
```
> **_NOTE:_** the build backend could throw an error while installing `psycopg2`, please see https://www.psycopg.org/docs/install.html#build-prerequisites
### Starting the Test Environment
To spin up all the containers necessary for writing integration tests and keep them running:
```bash
make py-test-setup
```
Under the hood this runs, from `tests/`:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse integration/bootstrap/setup.py::test_setup
```
This command will:
- Start all required services (ClickHouse, PostgreSQL, Zookeeper, SigNoz, Zeus mock, gateway mock)
- Register an admin user
- Keep containers running via the `--reuse` flag
### Stopping the Test Environment
When you're done writing integration tests, clean up the environment:
```bash
make py-test-teardown
```
Which runs:
```bash
uv run pytest --basetemp=./tmp/ -vv --teardown integration/bootstrap/setup.py::test_teardown
```
This destroys the running integration test setup and cleans up resources.
## Understanding the Integration Test Framework
Python and pytest form the foundation of the integration testing framework. Testcontainers are used to spin up disposable integration environments. WireMock is used to spin up **test doubles** of external services (Zeus cloud API, gateway, etc.).
- **Why Python/pytest?** It's expressive, low-boilerplate, and has powerful fixture capabilities that make integration testing straightforward. Extensive libraries for HTTP requests, JSON handling, and data analysis (numpy) make it easier to test APIs and verify data.
- **Why testcontainers?** They let us spin up isolated dependencies that match our production environment without complex setup.
- **Why WireMock?** Well maintained, documented, and extensible.
```
tests/
├── conftest.py # pytest_plugins registration
├── pyproject.toml
├── uv.lock
├── fixtures/ # shared fixture library (flat package)
│ ├── __init__.py
│ ├── auth.py # admin/editor/viewer users, tokens, license
│ ├── clickhouse.py
│ ├── http.py # WireMock helpers
│ ├── keycloak.py # IdP container
│ ├── postgres.py
│ ├── signoz.py # SigNoz-backend container
│ ├── sql.py
│ ├── types.py
│ └── ... # logs, metrics, traces, alerts, dashboards, ...
├── integration/
│ ├── bootstrap/
│ │ └── setup.py # test_setup / test_teardown
│ ├── testdata/ # JSON / JSONL / YAML inputs per suite
│ └── tests/ # one directory per feature area
│ ├── alerts/
│ │ ├── 01_*.py # numbered suite files
│ │ └── conftest.py # optional suite-local fixtures
│ ├── auditquerier/
│ ├── cloudintegrations/
│ ├── dashboard/
│ ├── passwordauthn/
│ ├── querier/
│ └── ...
└── e2e/ # Playwright suite (see docs/contributing/e2e.md)
```
Each test suite follows these principles:
1. **Organization**: Suites live under `tests/integration/tests/` in self-contained packages. Shared fixtures live in the top-level `tests/fixtures/` package so the e2e tree can reuse them.
2. **Execution Order**: Files are prefixed with two-digit numbers (`01_`, `02_`, `03_`) to ensure sequential execution when tests depend on ordering.
3. **Time Constraints**: Each suite should complete in under 10 minutes (setup takes ~4 mins).
### Test Suite Design
Test suites should target functional domains or subsystems within SigNoz. When designing a test suite, consider these principles:
- **Functional Cohesion**: Group tests around a specific capability or service boundary
- **Data Flow**: Follow the path of data through related components
- **Change Patterns**: Components frequently modified together should be tested together
The exact boundaries for suites are intentionally flexible, allowing contributors to define logical groupings based on their domain knowledge. Current suites cover alerts, audit querier, callback authn, cloud integrations, dashboards, ingestion keys, logs pipelines, password authn, preferences, querier, raw export data, roles, root user, service accounts, and TTL.
## How to write an integration test?
Now start writing an integration test. Create a new file `tests/integration/tests/bootstrap/01_version.py` and paste the following:
```python
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_version(signoz: types.SigNoz) -> None:
response = requests.get(
signoz.self.host_configs["8080"].get("/api/v1/version"),
timeout=2,
)
logger.info(response)
```
We have written a simple test which calls the `version` endpoint of the SigNoz backend. **To run just this function, run the following command:**
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse \
integration/tests/bootstrap/01_version.py::test_version
```
> **Note:** The `--reuse` flag is used to reuse the environment if it is already running. Always use this flag when writing and running integration tests. Without it the environment is destroyed and recreated every run.
Here's another example of how to write a more comprehensive integration test:
```python
from http import HTTPStatus
import requests
from fixtures import types
from fixtures.logger import setup_logger
logger = setup_logger(__name__)
def test_user_registration(signoz: types.SigNoz) -> None:
"""Test user registration functionality."""
response = requests.post(
signoz.self.host_configs["8080"].get("/api/v1/register"),
json={
"name": "testuser",
"orgId": "",
"orgName": "test.org",
"email": "test@example.com",
"password": "password123Z$",
},
timeout=2,
)
assert response.status_code == HTTPStatus.OK
assert response.json()["setupCompleted"] is True
```
Test inputs (JSON fixtures, expected payloads) go under `tests/integration/testdata/<suite>/` and are loaded via `fixtures.fs.get_testdata_file_path`.
## How to run integration tests?
### Running All Tests
```bash
make py-test
```
Which runs:
```bash
uv run pytest --basetemp=./tmp/ -vv integration/tests/
```
### Running Specific Test Categories
```bash
cd tests
uv run pytest --basetemp=./tmp/ -vv --reuse integration/tests/<suite>/
# Run querier tests
uv run pytest --basetemp=./tmp/ -vv --reuse integration/tests/querier/
# Run passwordauthn tests
uv run pytest --basetemp=./tmp/ -vv --reuse integration/tests/passwordauthn/
```
### Running Individual Tests
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse \
integration/tests/<suite>/<file>.py::test_name
# Run test_register in 01_register.py in the passwordauthn suite
uv run pytest --basetemp=./tmp/ -vv --reuse \
integration/tests/passwordauthn/01_register.py::test_register
```
## How to configure different options for integration tests?
Tests can be configured using pytest options:
- `--sqlstore-provider` — Choose the SQL store provider (default: `postgres`)
- `--sqlite-mode` — SQLite journal mode: `delete` or `wal` (default: `delete`). Only relevant when `--sqlstore-provider=sqlite`.
- `--postgres-version` — PostgreSQL version (default: `15`)
- `--clickhouse-version` — ClickHouse version (default: `25.5.6`)
- `--zookeeper-version` — Zookeeper version (default: `3.7.1`)
- `--schema-migrator-version` — SigNoz schema migrator version (default: `v0.144.2`)
Example:
```bash
uv run pytest --basetemp=./tmp/ -vv --reuse \
--sqlstore-provider=postgres --postgres-version=14 \
integration/tests/passwordauthn/
```
## What should I remember?
- **Always use the `--reuse` flag** when setting up the environment or running tests to keep containers warm. Without it every run rebuilds the stack (~4 mins).
- **Use the `--teardown` flag** only when cleaning up — mixing `--teardown` with `--reuse` is a contradiction.
- **Do not pre-emptively teardown before setup.** If the stack is partially up, `--reuse` picks up from wherever it is. `make py-test-teardown` then `make py-test-setup` wastes minutes.
- **Follow the naming convention** with two-digit numeric prefixes (`01_`, `02_`) for ordered test execution within a suite.
- **Use proper timeouts** in HTTP requests to avoid hanging tests (`timeout=5` is typical).
- **Clean up test data** between tests in the same suite to avoid interference — or rely on a fresh SigNoz container if you need full isolation.
- **Use descriptive test names** that clearly indicate what is being tested.
- **Leverage fixtures** for common setup. The shared fixture package is at `tests/fixtures/` — reuse before adding new ones.
- **Test both success and failure scenarios** (4xx / 5xx paths) to ensure robust functionality.
- **Run `make py-fmt` and `make py-lint` before committing** Python changes — black + isort + autoflake + pylint.
- **`--sqlite-mode=wal` does not work on macOS.** The integration test environment runs SigNoz inside a Linux container with the SQLite database file mounted from the macOS host. WAL mode requires shared memory between connections, and connections crossing the VM boundary (macOS host ↔ Linux container) cannot share the WAL index, resulting in `SQLITE_IOERR_SHORT_READ`. WAL mode is tested in CI on Linux only.

View File

@@ -16,7 +16,7 @@ func (hp *HourlyProvider) GetBaseSeasonalProvider() *BaseSeasonalProvider {
return &hp.BaseSeasonalProvider
}
// NewHourlyProvider now uses the generic option type.
// NewHourlyProvider now uses the generic option type
func NewHourlyProvider(opts ...GenericProviderOption[*HourlyProvider]) *HourlyProvider {
hp := &HourlyProvider{
BaseSeasonalProvider: BaseSeasonalProvider{},

Some files were not shown because too many files have changed in this diff Show More