Mirror of https://github.com/SigNoz/signoz.git, synced 2026-02-18 06:52:34 +00:00.

Compare commits: 5 commits on branch `platform-p...ns/claude-`
| Author | SHA1 | Date |
|---|---|---|
| | c295ef386d | |
| | bf0394cc28 | |
| | fa08ca2fac | |
| | 08c53fe7e8 | |
| | c1fac00d2e | |
.claude/CLAUDE.md (new file, 136 lines)
@@ -0,0 +1,136 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

SigNoz is an open-source observability platform (APM, logs, metrics, traces) built on OpenTelemetry and ClickHouse. It provides a unified solution for monitoring applications with features including distributed tracing, log management, metrics dashboards, and alerting.

## Build and Development Commands

### Development Environment Setup
```bash
make devenv-up                     # Start ClickHouse and OTel Collector for local dev
make devenv-clickhouse             # Start only ClickHouse
make devenv-signoz-otel-collector  # Start only OTel Collector
make devenv-clickhouse-clean       # Clean ClickHouse data
```

### Backend (Go)
```bash
make go-run-community            # Run community backend server
make go-run-enterprise           # Run enterprise backend server
make go-test                     # Run all Go unit tests
go test -race ./pkg/...          # Run tests for a specific package tree
go test -race ./pkg/querier/...  # Example: run querier tests
```

### Integration Tests (Python)
```bash
cd tests/integration
uv sync                # Install dependencies
make py-test-setup     # Start test environment (keep running with --reuse)
make py-test           # Run all integration tests
make py-test-teardown  # Stop test environment

# Run a specific test
uv run pytest --basetemp=./tmp/ -vv --reuse src/<suite>/<file>.py::test_name
```

### Code Quality
```bash
# Go linting (golangci-lint)
golangci-lint run

# Python formatting/linting
make py-fmt   # Format with black
make py-lint  # Run isort, autoflake, pylint
```

### OpenAPI Generation
```bash
go run cmd/enterprise/*.go generate openapi
```

## Architecture Overview

### Backend Structure

The Go backend follows a **provider pattern** for dependency injection:

- **`pkg/signoz/`** - IoC container that wires all providers together
- **`pkg/modules/`** - Business logic modules (user, organization, dashboard, etc.)
- **`pkg/<provider>/`** - Provider implementations following a consistent structure:
  - `<name>.go` - Interface definition
  - `config.go` - Configuration (implements `factory.Config`)
  - `<implname><name>/provider.go` - Implementation
  - `<name>test/` - Mock implementations for testing
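For orientation, a hypothetical provider following this layout might look like the sketch below. The `cache` example and every name in it are invented for illustration; they are not actual SigNoz code.

```go
// Hypothetical sketch of the provider layout described above, using an
// imaginary "cache" provider. The file split is indicated in comments.
package cache

import (
	"context"
	"time"
)

// cache.go - the provider's interface definition.
type Cache interface {
	Get(ctx context.Context, key string) ([]byte, error)
	Set(ctx context.Context, key string, value []byte, ttl time.Duration) error
}

// config.go - configuration; in the real pattern this would implement
// factory.Config.
type Config struct {
	DefaultTTL time.Duration `mapstructure:"default_ttl"`
}

// memorycache/provider.go would hold an in-memory implementation returned by
// a constructor, and cachetest/ would hold mocks for tests.
```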
### Key Packages
- **`pkg/querier/`** - Query engine for telemetry data (logs, traces, metrics)
- **`pkg/telemetrystore/`** - ClickHouse telemetry storage interface
- **`pkg/sqlstore/`** - Relational database (SQLite/PostgreSQL) for metadata
- **`pkg/apiserver/`** - HTTP API server with OpenAPI integration
- **`pkg/alertmanager/`** - Alert management
- **`pkg/authn/`, `pkg/authz/`** - Authentication and authorization
- **`pkg/flagger/`** - Feature flags (OpenFeature-based)
- **`pkg/errors/`** - Structured error handling

### Enterprise vs Community
- **`cmd/community/`** - Community edition entry point
- **`cmd/enterprise/`** - Enterprise edition entry point
- **`ee/`** - Enterprise-only features

## Code Conventions

### Error Handling
Use the custom `pkg/errors` package instead of the standard library:
```go
errors.New(typ, code, message)           // Instead of errors.New()
errors.Newf(typ, code, message, args...) // Instead of fmt.Errorf()
errors.Wrapf(err, typ, code, msg)        // Wrap with context
```

Define domain-specific error codes:
```go
var CodeThingNotFound = errors.MustNewCode("thing_not_found")
```
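Putting the two conventions together, a lookup function might read as follows. This is a sketch: the `errors.TypeNotFound` constant and the store call are assumptions for illustration, while the `MustNewCode`/`Wrapf` signatures are the ones listed above.

```go
var CodeUserNotFound = errors.MustNewCode("user_not_found")

func (m *Module) GetUser(ctx context.Context, id string) (*User, error) {
	user, err := m.store.Get(ctx, id)
	if err != nil {
		// Wrap the underlying error with a type, a domain code, and context.
		// errors.TypeNotFound is an assumed type constant.
		return nil, errors.Wrapf(err, errors.TypeNotFound, CodeUserNotFound,
			"user with id %s does not exist", id)
	}
	return user, nil
}
```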
### HTTP Handlers
Handlers are thin adapters in modules that:
1. Extract the auth context from the request
2. Decode the request body using the `binding` package
3. Call module functions
4. Return responses using the `render` package

Register routes in `pkg/apiserver/signozapiserver/` with `handler.New()` and `OpenAPIDef`.
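The shape of such a handler is sketched below using only the standard library; in SigNoz the decode and respond steps go through the `binding` and `render` packages instead, so treat all names here as illustrative.

```go
package dashboard

import (
	"context"
	"encoding/json"
	"net/http"
)

// Illustrative request/response types and module interface.
type PostableDashboard struct {
	Name string `json:"name"`
}

type Dashboard struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

type Module interface {
	Create(ctx context.Context, req *PostableDashboard) (*Dashboard, error)
}

type Handler struct {
	module Module
}

func (h *Handler) Create(rw http.ResponseWriter, r *http.Request) {
	// 1. Extract the auth context from the request (elided here; SigNoz
	// reads claims from the request context).
	ctx := r.Context()

	// 2. Decode the request body (the binding package in SigNoz).
	var req PostableDashboard
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(rw, err.Error(), http.StatusBadRequest)
		return
	}

	// 3. Call the module function; all business logic lives there.
	dashboard, err := h.module.Create(ctx, &req)
	if err != nil {
		http.Error(rw, err.Error(), http.StatusInternalServerError)
		return
	}

	// 4. Return the response (the render package in SigNoz).
	rw.Header().Set("Content-Type", "application/json")
	rw.WriteHeader(http.StatusCreated)
	_ = json.NewEncoder(rw).Encode(dashboard)
}
```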
### SQL/Database
- Use Bun ORM via `sqlstore.BunDBCtx(ctx)` (see the sketch below)
- Star schema with `organizations` as the central entity
- All tables have `id`, `created_at`, `updated_at`, `org_id` columns
- Write idempotent migrations in `pkg/sqlmigration/`
- No `ON CASCADE` deletes - handle deletion in application logic
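For example, a module function might read rows like this. The `Thing` model and the surrounding method are made up for illustration; `NewSelect().Model(...).Where(...).Scan(ctx)` is standard Bun usage.

```go
// Illustrative only: list all rows belonging to an org via the Bun handle.
func (m *Module) ListThings(ctx context.Context, orgID string) ([]*Thing, error) {
	things := make([]*Thing, 0)
	err := m.sqlstore.
		BunDBCtx(ctx).
		NewSelect().
		Model(&things).
		Where("org_id = ?", orgID).
		Scan(ctx)
	if err != nil {
		return nil, err
	}
	return things, nil
}
```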
### REST Endpoints
- Use plural resource names: `/v1/organizations`, `/v1/users`
- Use `me` for the current user/org: `/v1/organizations/me/users`
- Follow RESTful conventions for CRUD operations

### Linting Rules (from .golangci.yml)
- Don't use the standard `errors` package - use `pkg/errors`
- Don't use the `zap` logger - use `slog`
- Don't use `fmt.Errorf` or `fmt.Print*`

## Testing

### Unit Tests
- Run with the race detector: `go test -race ./...`
- Provider mocks are in `<provider>test/` packages

### Integration Tests
- Located in `tests/integration/`
- Use pytest with testcontainers
- Files are prefixed with numbers for execution order (e.g., `01_database.py`)
- Always use the `--reuse` flag during development
- Fixtures are in `tests/integration/fixtures/`
.claude/settings.json (new file, 15 lines)
@@ -0,0 +1,15 @@

{
  "permissions": {
    "allow": [
      "Read",
      "Glob",
      "Grep",
      "Bash(git *)",
      "Bash(make *)",
      "Bash(cd *)",
      "Bash(ls *)",
      "Bash(go run *)",
      "Bash(yarn run *)"
    ]
  }
}
.claude/skills/clickhouse-query/SKILL.md (new file, 21 lines)
@@ -0,0 +1,21 @@

---
description: Write optimised ClickHouse queries for SigNoz dashboards (traces, errors, logs)
user_invocable: true
---

# Writing ClickHouse Queries for SigNoz Dashboards

Read [clickhouse-traces-reference.md](./clickhouse-traces-reference.md) for the full schema and query reference before writing any query. It covers:

- All table schemas (`distributed_signoz_index_v3`, `distributed_traces_v3_resource`, `distributed_signoz_error_index_v2`, etc.)
- The mandatory resource filter CTE pattern and timestamp bucketing
- Attribute access syntax (standard, indexed, resource)
- Dashboard panel query templates (timeseries, value, table)
- Real-world query examples (span counts, error rates, latency, event extraction)

## Workflow

1. **Understand the ask**: What metric/data does the user want? (e.g., error rate, latency, span count)
2. **Pick the panel type**: Timeseries (time-series chart), Value (single number), or Table (rows).
3. **Build the query** following the mandatory patterns from the reference doc.
4. **Validate** that the query uses all required optimizations (resource CTE, ts_bucket_start, indexed columns).
.claude/skills/clickhouse-query/clickhouse-traces-reference.md (new file, 460 lines)
@@ -0,0 +1,460 @@

# ClickHouse Traces Query Reference for SigNoz

Source: https://signoz.io/docs/userguide/writing-clickhouse-traces-query/

All tables live in the `signoz_traces` database.

---

## Table Schemas

### distributed_signoz_index_v3 (Primary Spans Table)

The main table for querying span data. 30+ columns following OpenTelemetry conventions.

```sql
(
    `ts_bucket_start` UInt64 CODEC(DoubleDelta, LZ4),
    `resource_fingerprint` String CODEC(ZSTD(1)),
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `trace_id` FixedString(32) CODEC(ZSTD(1)),
    `span_id` String CODEC(ZSTD(1)),
    `trace_state` String CODEC(ZSTD(1)),
    `parent_span_id` String CODEC(ZSTD(1)),
    `flags` UInt32 CODEC(T64, ZSTD(1)),
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `kind` Int8 CODEC(T64, ZSTD(1)),
    `kind_string` String CODEC(ZSTD(1)),
    `duration_nano` UInt64 CODEC(T64, ZSTD(1)),
    `status_code` Int16 CODEC(T64, ZSTD(1)),
    `status_message` String CODEC(ZSTD(1)),
    `status_code_string` String CODEC(ZSTD(1)),
    `attributes_string` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `attributes_number` Map(LowCardinality(String), Float64) CODEC(ZSTD(1)),
    `attributes_bool` Map(LowCardinality(String), Bool) CODEC(ZSTD(1)),
    `resources_string` Map(LowCardinality(String), String) CODEC(ZSTD(1)), -- deprecated
    `resource` JSON(max_dynamic_paths = 100) CODEC(ZSTD(1)),
    `events` Array(String) CODEC(ZSTD(2)),
    `links` String CODEC(ZSTD(1)),
    `response_status_code` LowCardinality(String) CODEC(ZSTD(1)),
    `external_http_url` LowCardinality(String) CODEC(ZSTD(1)),
    `http_url` LowCardinality(String) CODEC(ZSTD(1)),
    `external_http_method` LowCardinality(String) CODEC(ZSTD(1)),
    `http_method` LowCardinality(String) CODEC(ZSTD(1)),
    `http_host` LowCardinality(String) CODEC(ZSTD(1)),
    `db_name` LowCardinality(String) CODEC(ZSTD(1)),
    `db_operation` LowCardinality(String) CODEC(ZSTD(1)),
    `has_error` Bool CODEC(T64, ZSTD(1)),
    `is_remote` LowCardinality(String) CODEC(ZSTD(1)),
    -- Pre-indexed "selected" columns (use these instead of map access when available):
    `resource_string_service$$name` LowCardinality(String) DEFAULT resources_string['service.name'] CODEC(ZSTD(1)),
    `attribute_string_http$$route` LowCardinality(String) DEFAULT attributes_string['http.route'] CODEC(ZSTD(1)),
    `attribute_string_messaging$$system` LowCardinality(String) DEFAULT attributes_string['messaging.system'] CODEC(ZSTD(1)),
    `attribute_string_messaging$$operation` LowCardinality(String) DEFAULT attributes_string['messaging.operation'] CODEC(ZSTD(1)),
    `attribute_string_db$$system` LowCardinality(String) DEFAULT attributes_string['db.system'] CODEC(ZSTD(1)),
    `attribute_string_rpc$$system` LowCardinality(String) DEFAULT attributes_string['rpc.system'] CODEC(ZSTD(1)),
    `attribute_string_rpc$$service` LowCardinality(String) DEFAULT attributes_string['rpc.service'] CODEC(ZSTD(1)),
    `attribute_string_rpc$$method` LowCardinality(String) DEFAULT attributes_string['rpc.method'] CODEC(ZSTD(1)),
    `attribute_string_peer$$service` LowCardinality(String) DEFAULT attributes_string['peer.service'] CODEC(ZSTD(1))
)
ORDER BY (ts_bucket_start, resource_fingerprint, has_error, name, timestamp)
```

### distributed_traces_v3_resource (Resource Lookup Table)

Used in the resource filter CTE pattern for efficient filtering by resource attributes.

```sql
(
    `labels` String CODEC(ZSTD(5)),
    `fingerprint` String CODEC(ZSTD(1)),
    `seen_at_ts_bucket_start` Int64 CODEC(Delta(8), ZSTD(1))
)
```

### distributed_signoz_error_index_v2 (Error Events)

```sql
(
    `timestamp` DateTime64(9) CODEC(DoubleDelta, LZ4),
    `errorID` FixedString(32) CODEC(ZSTD(1)),
    `groupID` FixedString(32) CODEC(ZSTD(1)),
    `traceID` FixedString(32) CODEC(ZSTD(1)),
    `spanID` String CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1)),
    `exceptionType` LowCardinality(String) CODEC(ZSTD(1)),
    `exceptionMessage` String CODEC(ZSTD(1)),
    `exceptionStacktrace` String CODEC(ZSTD(1)),
    `exceptionEscaped` Bool CODEC(T64, ZSTD(1)),
    `resourceTagsMap` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    INDEX idx_error_id errorID TYPE bloom_filter GRANULARITY 4,
    INDEX idx_resourceTagsMapKeys mapKeys(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64,
    INDEX idx_resourceTagsMapValues mapValues(resourceTagsMap) TYPE bloom_filter(0.01) GRANULARITY 64
)
```

### distributed_top_level_operations

```sql
(
    `name` LowCardinality(String) CODEC(ZSTD(1)),
    `serviceName` LowCardinality(String) CODEC(ZSTD(1))
)
```

### distributed_span_attributes_keys

```sql
(
    `tagKey` LowCardinality(String) CODEC(ZSTD(1)),
    `tagType` Enum8('tag' = 1, 'resource' = 2) CODEC(ZSTD(1)),
    `dataType` Enum8('string' = 1, 'bool' = 2, 'float64' = 3) CODEC(ZSTD(1)),
    `isColumn` Bool CODEC(ZSTD(1))
)
```

### distributed_span_attributes

```sql
(
    `timestamp` DateTime CODEC(DoubleDelta, ZSTD(1)),
    `tagKey` LowCardinality(String) CODEC(ZSTD(1)),
    `tagType` Enum8('tag' = 1, 'resource' = 2) CODEC(ZSTD(1)),
    `dataType` Enum8('string' = 1, 'bool' = 2, 'float64' = 3) CODEC(ZSTD(1)),
    `stringTagValue` String CODEC(ZSTD(1)),
    `float64TagValue` Nullable(Float64) CODEC(ZSTD(1)),
    `isColumn` Bool CODEC(ZSTD(1))
)
```

---

## Mandatory Optimization Patterns

### 1. Resource Filter CTE

**Always** use a CTE to pre-filter resource fingerprints when filtering by resource attributes (service.name, environment, etc.). This is the single most impactful optimization.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE (simpleJSONExtractString(labels, 'service.name') = 'myservice')
      AND seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT ...
FROM signoz_traces.distributed_signoz_index_v3
WHERE resource_fingerprint GLOBAL IN __resource_filter
  AND ...
```

- Multiple resource filters: chain with AND in the CTE WHERE clause.
- Use `simpleJSONExtractString(labels, '<key>')` to extract resource attribute values.

### 2. Timestamp Bucketing

**Always** include `ts_bucket_start` filter alongside `timestamp` filter. Data is bucketed in 30-minute (1800-second) intervals.

```sql
WHERE timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}}
  AND ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
```

The `- 1800` on the start ensures spans at bucket boundaries are not missed.

### 3. Use Indexed Columns Over Map Access

When a pre-indexed ("selected") column exists, use it instead of map access:

| Instead of | Use |
|---|---|
| `attributes_string['http.route']` | `attribute_string_http$$route` |
| `attributes_string['db.system']` | `attribute_string_db$$system` |
| `attributes_string['rpc.method']` | `attribute_string_rpc$$method` |
| `attributes_string['peer.service']` | `attribute_string_peer$$service` |
| `resources_string['service.name']` | `resource_string_service$$name` |

The naming convention: replace `.` with `$$` in the attribute name and prefix with `attribute_string_`, `attribute_number_`, or `attribute_bool_`.

### 4. Use Pre-extracted Columns

These top-level columns are faster than map access:
- `http_method`, `http_url`, `http_host`
- `db_name`, `db_operation`
- `has_error`, `duration_nano`, `name`, `kind`
- `response_status_code`

---

## Attribute Access Syntax

### Standard (non-indexed) attributes
```sql
attributes_string['http.status_code']
attributes_number['response_time']
attributes_bool['is_error']
```

### Selected (indexed) attributes — direct column names
```sql
attribute_string_http$$route     -- for http.route
attribute_number_response$$time  -- for response.time
attribute_bool_is$$error         -- for is.error
```

### Resource attributes in SELECT / GROUP BY
```sql
resource.service.name::String
resource.environment::String
```

### Resource attributes in WHERE (via CTE)
```sql
simpleJSONExtractString(labels, 'service.name') = 'myservice'
```

### Checking attribute existence
```sql
mapContains(attributes_string, 'http.method')
```

---

## Dashboard Panel Query Templates

### Timeseries Panel

Aggregates data over time intervals for chart visualization.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE (simpleJSONExtractString(labels, 'service.name') = '{{service}}')
      AND seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT
    toStartOfInterval(timestamp, INTERVAL 1 MINUTE) AS ts,
    toFloat64(count()) AS value
FROM signoz_traces.distributed_signoz_index_v3
WHERE
    resource_fingerprint GLOBAL IN __resource_filter AND
    timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
    ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
GROUP BY ts
ORDER BY ts ASC;
```

### Value Panel

Returns a single aggregated number. Wrap the timeseries query and reduce with `avg()`, `sum()`, `min()`, `max()`, or `any()`.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE (simpleJSONExtractString(labels, 'service.name') = '{{service}}')
      AND seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT
    avg(value) AS value,
    any(ts) AS ts
FROM (
    SELECT
        toStartOfInterval(timestamp, INTERVAL 1 MINUTE) AS ts,
        toFloat64(count()) AS value
    FROM signoz_traces.distributed_signoz_index_v3
    WHERE
        resource_fingerprint GLOBAL IN __resource_filter AND
        timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
        ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
    GROUP BY ts
    ORDER BY ts ASC
)
```

### Table Panel

Rows grouped by dimensions. Use `now() as ts` instead of a time interval column.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT
    now() AS ts,
    resource.service.name::String AS `service.name`,
    toFloat64(count()) AS value
FROM signoz_traces.distributed_signoz_index_v3
WHERE
    resource_fingerprint GLOBAL IN __resource_filter AND
    timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
    ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp AND
    `service.name` IS NOT NULL
GROUP BY `service.name`, ts
ORDER BY value DESC;
```

---

## Query Examples

### Timeseries — Error spans per service per minute

Shows `has_error` filtering, resource attribute in SELECT, and multi-series grouping.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT
    toStartOfInterval(timestamp, INTERVAL 1 MINUTE) AS ts,
    resource.service.name::String AS `service.name`,
    toFloat64(count()) AS value
FROM signoz_traces.distributed_signoz_index_v3
WHERE
    resource_fingerprint GLOBAL IN __resource_filter AND
    timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
    has_error = true AND
    `service.name` IS NOT NULL AND
    ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
GROUP BY `service.name`, ts
ORDER BY ts ASC;
```

### Value — Average duration of GET requests

Shows the value-panel wrapping pattern (`avg(value)` / `any(ts)`) with a service resource filter.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE (simpleJSONExtractString(labels, 'service.name') = 'api-service')
      AND seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT
    avg(value) AS value,
    any(ts) AS ts
FROM (
    SELECT
        toStartOfInterval(timestamp, INTERVAL 1 MINUTE) AS ts,
        toFloat64(avg(duration_nano)) AS value
    FROM signoz_traces.distributed_signoz_index_v3
    WHERE
        resource_fingerprint GLOBAL IN __resource_filter AND
        timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
        ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp AND
        http_method = 'GET'
    GROUP BY ts
    ORDER BY ts ASC
)
```

### Table — Average duration by HTTP method

Shows `now() as ts` pattern, pre-extracted column usage, and non-null filtering.

```sql
WITH __resource_filter AS (
    SELECT fingerprint
    FROM signoz_traces.distributed_traces_v3_resource
    WHERE (simpleJSONExtractString(labels, 'service.name') = 'api-gateway')
      AND seen_at_ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp
)
SELECT
    now() AS ts,
    http_method,
    toFloat64(avg(duration_nano)) AS avg_duration_nano
FROM signoz_traces.distributed_signoz_index_v3
WHERE
    resource_fingerprint GLOBAL IN __resource_filter AND
    timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}} AND
    ts_bucket_start BETWEEN $start_timestamp - 1800 AND $end_timestamp AND
    http_method IS NOT NULL AND http_method != ''
GROUP BY http_method, ts
ORDER BY avg_duration_nano DESC;
```

### Advanced — Extract values from span events

Shows `arrayFilter`/`arrayMap` pattern for querying the `events` JSON array.

```sql
WITH arrayFilter(x -> JSONExtractString(x, 'name') = 'Getting customer', events) AS filteredEvents
SELECT
    toStartOfInterval(timestamp, INTERVAL 1 MINUTE) AS interval,
    toFloat64(count()) AS count,
    arrayJoin(arrayMap(x -> JSONExtractString(JSONExtractString(x, 'attributeMap'), 'customer_id'), filteredEvents)) AS resultArray
FROM signoz_traces.distributed_signoz_index_v3
WHERE not empty(filteredEvents)
  AND timestamp > toUnixTimestamp(now() - INTERVAL 30 MINUTE)
  AND ts_bucket_start >= toUInt64(toUnixTimestamp(now() - toIntervalMinute(30))) - 1800
GROUP BY (resultArray, interval)
ORDER BY (resultArray, interval) ASC;
```

### Advanced — Average latency between two specific spans

Shows cross-span latency calculation using `minIf()` and indexed service columns.

```sql
SELECT
    interval,
    round(avg(time_diff), 2) AS result
FROM
(
    SELECT
        interval,
        traceID,
        if(startTime1 != 0, if(startTime2 != 0, (toUnixTimestamp64Nano(startTime2) - toUnixTimestamp64Nano(startTime1)) / 1000000, nan), nan) AS time_diff
    FROM
    (
        SELECT
            toStartOfInterval(timestamp, toIntervalMinute(1)) AS interval,
            traceID,
            minIf(timestamp, if(resource_string_service$$name = 'driver', if(name = '/driver.DriverService/FindNearest', if((resources_string['component']) = 'gRPC', true, false), false), false)) AS startTime1,
            minIf(timestamp, if(resource_string_service$$name = 'route', if(name = 'HTTP GET /route', true, false), false)) AS startTime2
        FROM signoz_traces.distributed_signoz_index_v3
        WHERE (timestamp BETWEEN {{.start_datetime}} AND {{.end_datetime}})
          AND (ts_bucket_start BETWEEN {{.start_timestamp}} - 1800 AND {{.end_timestamp}})
          AND (resource_string_service$$name IN ('driver', 'route'))
        GROUP BY (interval, traceID)
        ORDER BY (interval, traceID) ASC
    )
)
WHERE isNaN(time_diff) = 0
GROUP BY interval
ORDER BY interval ASC;
```

---

## SigNoz Dashboard Variables

These template variables are automatically replaced by SigNoz when the query runs:

| Variable | Description |
|---|---|
| `{{.start_datetime}}` | Start of selected time range (DateTime64) |
| `{{.end_datetime}}` | End of selected time range (DateTime64) |
| `$start_timestamp` | Start as Unix timestamp (seconds) |
| `$end_timestamp` | End as Unix timestamp (seconds) |

---

## Query Optimization Checklist

Before finalizing any query, verify:

- [ ] **Resource filter CTE** is present when filtering by resource attributes (service.name, environment, etc.)
- [ ] **`ts_bucket_start`** filter is included alongside `timestamp` filter, with `- 1800` on start
- [ ] **`GLOBAL IN`** is used (not just `IN`) for the resource fingerprint subquery
- [ ] **Indexed columns** are used over map access where available (e.g., `http_method` over `attributes_string['http.method']`)
- [ ] **Pre-extracted columns** are used where available (`has_error`, `duration_nano`, `http_method`, `db_name`, etc.)
- [ ] **`seen_at_ts_bucket_start`** filter is included in the resource CTE
- [ ] Aggregation results are cast with `toFloat64()` for dashboard compatibility
- [ ] For timeseries: results are ordered by time column ASC
- [ ] For table panels: `now() as ts` is used instead of time intervals
- [ ] For value panels: outer query uses `avg(value)` / `any(ts)` pattern
.claude/skills/commit/SKILL.md (new file, 37 lines)
@@ -0,0 +1,37 @@

---
name: commit
description: Create a conventional commit with staged changes
allowed-tools: Bash(git:*)
---

# Create Conventional Commit

Commit staged changes using the conventional commit format: `type(scope): description`

## Types

- `feat:` - New feature
- `fix:` - Bug fix
- `chore:` - Maintenance/refactor/tooling
- `test:` - Tests only
- `docs:` - Documentation

## Process

1. Review staged changes: `git diff --cached`
2. Determine the type, optional scope, and description (imperative, <70 chars)
3. Commit using a HEREDOC:
   ```bash
   git commit -m "$(cat <<'EOF'
   type(scope): description
   EOF
   )"
   ```
4. Verify: `git log -1`

## Notes

- Description: imperative mood, lowercase, no period
- Body: explain WHY, not WHAT (the code shows what). Keep it concise.
- Do not include a co-authored-by-Claude trailer in the commit message; ownership and accountability should remain with the human contributor.
- Do not stage files automatically unless asked to.
.claude/skills/dev-server/SKILL.md (new file, 22 lines)
@@ -0,0 +1,22 @@

---
description: How to start SigNoz frontend and backend dev servers
---

# Dev Server Setup

Full guide: [development.md](../../docs/contributing/development.md)

## Start Order

1. **Infra**: Ensure the ClickHouse container is running: `docker ps | grep clickhouse`
2. **Backend**: `make go-run-community` (serves at `localhost:8080`)
3. **Frontend**: `cd frontend && yarn install && yarn dev` (serves at `localhost:3301`)
   - Requires `frontend/.env` with `FRONTEND_API_ENDPOINT=http://localhost:8080`
   - For git worktrees, create `frontend/.env` with: `cp frontend/example.env frontend/.env`

## Verify

- ClickHouse: `curl http://localhost:8123/ping` → "Ok."
- OTel Collector: `curl http://localhost:13133`
- Backend: `curl http://localhost:8080/api/v1/health` → `{"status":"ok"}`
- Frontend: `http://localhost:3301`
.claude/skills/raise-pr/SKILL.md (new file, 55 lines)
@@ -0,0 +1,55 @@

---
name: raise-pr
description: Create a pull request with auto-filled template. Pass 'commit' to commit staged changes first.
allowed-tools: Bash(gh:*, git:*), Read
argument-hint: [commit?]
---

# Raise Pull Request

Create a PR with a template auto-filled from the commits after origin/main.

## Arguments

- No argument: Create a PR with existing commits
- `commit`: Commit staged changes first, then create the PR

## Process

1. **If `$ARGUMENTS` is "commit"**: Review staged changes and commit with a descriptive message
   - Check for staged changes: `git diff --cached --stat`
   - If changes exist:
     - Review the changes: `git diff --cached`
     - Use the commit skill for making the commit, i.e., follow conventional commit practices
     - Commit command: `git commit -m "message"`

2. **Analyze commits since origin/main**:
   - `git log origin/main..HEAD --pretty=format:"%s%n%b"` - get commit messages
   - `git diff origin/main...HEAD --stat` - see changes

3. **Read the template**: `.github/pull_request_template.md`

4. **Generate the PR**:
   - **Title**: Short (<70 chars), from the commit messages or the main change
   - **Body**: Fill template sections based on commits/changes:
     - Summary (why/what/approach) - end with "Closes #<issue_number>" if an issue number is available from the branch name (`git branch --show-current`)
     - Change Type checkboxes
     - Bug Context (if applicable)
     - Testing Strategy
     - Risk Assessment
     - Changelog (if user-facing)
     - Checklist

5. **Create the PR**:
   ```bash
   git push -u origin $(git branch --show-current)
   gh pr create --base main --title "..." --body "..."
   gh pr view
   ```

## Notes

- Analyze ALL commit messages from origin/main to HEAD
- Fill template sections based on code analysis
- Leave template sections as they are if you can't determine the content
- Don't add changes to the git stage; only commit or push what the user has already staged
.claude/skills/review/SKILL.md (new file, 254 lines)
@@ -0,0 +1,254 @@

---
name: review
description: Review code changes for bugs, performance issues, and SigNoz convention compliance
allowed-tools: Bash(git:*, gh:*), Read, Glob, Grep
---

# Review Command

Perform a thorough code review against SigNoz's coding conventions and contributing guidelines, and flag any potential bugs the change introduces.

## Usage

Invoke this command to review code changes, files, or pull requests with actionable and concise feedback.

## Process

1. **Determine scope**:
   - Ask the user what to review if not specified:
     - Specific files or directories
     - Current git diff (staged or unstaged)
     - Specific PR number or commit range
     - All changes since origin/main

2. **Gather context**:
   ```bash
   # For current changes
   git diff --cached            # Staged changes
   git diff                     # Unstaged changes

   # For a commit range
   git diff origin/main...HEAD  # All changes since main

   # For the last commit only
   git diff HEAD~1..HEAD

   # For a specific PR
   gh pr view <number> --json files,additions,deletions
   gh pr diff <number>
   ```

3. **Read all relevant files thoroughly**:
   - Use the Read tool for modified files
   - Understand the context and purpose of changes
   - Check surrounding code for context

4. **Review against SigNoz guidelines**:
   - **Frontend**: Check [Frontend Guidelines](../../frontend/CONTRIBUTIONS.md)
   - **Backend/Architecture**: Check [CLAUDE.md](../CLAUDE.md) for provider pattern, error handling, SQL, REST, and linting conventions
   - **General**: Check [Contributing Guidelines](../../CONTRIBUTING.md)
   - **Commits**: Verify [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)

5. **Verify feature intent**:
   - Read the PR description, commit message, or linked issue to understand *what* the change claims to do
   - Trace the code path end-to-end to confirm the change actually achieves its stated goal
   - Check that the happy path works as described
   - Identify any scenarios where the feature silently does nothing or produces wrong results

6. **Review for bug introduction**:
   - **Regressions**: Does the change break existing behavior? Check callers of modified functions/interfaces
   - **Edge cases**: Empty inputs, nil/undefined values, boundary conditions, concurrent access
   - **Error paths**: Are all error cases handled? Can errors be swallowed silently?
   - **State management**: Are state transitions correct? Can state become inconsistent?
   - **Race conditions**: Shared mutable state, async operations, missing locks or guards
   - **Type mismatches**: Unsafe casts, implicit conversions, `any` usage hiding real types

7. **Review for performance implications**:
   - **Backend**: N+1 queries, missing indexes, unbounded result sets, large allocations in hot paths, unnecessary DB round-trips
   - **Frontend**: Unnecessary re-renders from inline objects/functions as props, missing memoization on expensive computations, large bundle imports that should be lazy-loaded, unthrottled event handlers
   - **General**: O(n²) or worse algorithms on potentially large datasets, unnecessary network calls, missing pagination or limits

8. **Provide actionable, concise feedback** in the structured format below

## Review Checklist

For coding conventions and style, refer to the linked guideline docs. This checklist focuses on **review-specific concerns** that guidelines alone don't catch.

### Correctness & Intent
- [ ] Change achieves what the PR/commit/issue describes
- [ ] Happy path works end-to-end
- [ ] Edge cases handled (empty, nil, boundary, concurrent)
- [ ] Error paths don't swallow failures silently
- [ ] No regressions to existing callers of modified code

### Security
- [ ] No exposed secrets, API keys, credentials
- [ ] No sensitive data in logs
- [ ] Input validation at system boundaries
- [ ] Authentication/authorization checked for new endpoints
- [ ] No SQL injection or XSS risks

### Performance
- [ ] No N+1 queries or unbounded result sets
- [ ] No unnecessary re-renders (inline objects/functions as props, missing memoization)
- [ ] No large imports that should be lazy-loaded
- [ ] No O(n²) on potentially large datasets
- [ ] Pagination/limits present where needed

### Testing
- [ ] New functionality has tests
- [ ] Edge cases and error paths tested
- [ ] Tests are deterministic (no flakiness)

### Git/Commits
- [ ] Commit messages follow `type(scope): description` ([Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/))
- [ ] Commits are atomic and logical

## Output Format

Provide feedback in this structured format:

```markdown
## Code Review

**Scope**: [What was reviewed]
**Overall**: [1-2 sentence summary and general sentiment]

---

### 🚨 Critical Issues (Must Fix)

1. **[Category]** `file:line`
   **Problem**: [What's wrong]
   **Why**: [Why it matters]
   **Fix**: [Specific solution]
   ```[language]
   // Example fix if helpful
   ```

### ⚠️ Suggestions (Should Consider)

1. **[Category]** `file:line`
   **Issue**: [What could be improved]
   **Suggestion**: [Concrete improvement]

### ✅ Positive Highlights

- [Good practice observed]
- [Well-implemented feature]

---

**References**:
- [Relevant guideline links]
```

## Review Categories

Use these categories for issues:

- **Bug / Regression**: Logic errors, edge cases, race conditions, broken existing behavior
- **Feature Gap**: Change doesn't fully achieve its stated intent
- **Security Risk**: Authentication, authorization, data exposure, injection
- **Performance Issue**: Inefficient queries, unnecessary re-renders, memory leaks, unbounded data
- **Convention Violation**: Style, patterns, architectural guidelines (link to the relevant guideline doc)
- **Code Quality**: Complexity, duplication, naming, type safety
- **Testing**: Missing tests, inadequate coverage, flaky tests

## Example Review

```markdown
## Code Review

**Scope**: Changes in `frontend/src/pages/TraceDetail/` (3 files, 245 additions)
**Overall**: Good implementation of the pagination feature. Found 2 critical issues and 3 suggestions.

---

### 🚨 Critical Issues (Must Fix)

1. **Security Risk** `TraceList.tsx:45`
   **Problem**: API token exposed in client-side code
   **Why**: Security vulnerability - tokens should never be in the frontend
   **Fix**: Move authentication to the backend, use session-based auth

2. **Performance Issue** `TraceList.tsx:89`
   **Problem**: Inline function passed as prop causes unnecessary re-renders
   **Why**: Violates frontend guideline, degrades performance with large lists
   **Fix**:
   ```typescript
   const handleTraceClick = useCallback((traceId: string) => {
     navigate(`/trace/${traceId}`);
   }, [navigate]);
   ```

### ⚠️ Suggestions (Should Consider)

1. **Code Quality** `TraceList.tsx:120-180`
   **Issue**: Function exceeds the 40-line guideline
   **Suggestion**: Extract into smaller functions:
   - `filterTracesByTimeRange()`
   - `aggregateMetrics()`
   - `renderChartData()`

2. **Type Safety** `types.ts:23`
   **Issue**: Using `any` for trace attributes
   **Suggestion**: Define a proper interface for TraceAttributes

3. **Convention** `TraceList.tsx:12`
   **Issue**: File imports not organized
   **Suggestion**: Let simple-import-sort auto-organize (happens on save)

### ✅ Positive Highlights

- Excellent use of virtualization for large trace lists
- Good error boundary implementation
- Well-structured component hierarchy
- Comprehensive unit tests included

---

**References**:
- [Frontend Guidelines](../../frontend/CONTRIBUTIONS.md)
- [useCallback best practices](https://kentcdodds.com/blog/usememo-and-usecallback)
```

## Tone Guidelines

- **Be respectful**: Focus on the code, not the person
- **Be specific**: Always reference exact file:line locations
- **Be concise**: Get to the point, avoid verbosity
- **Be actionable**: Every comment should have a clear resolution path
- **Be balanced**: Acknowledge good work alongside issues
- **Be educational**: Explain why something is an issue, link to guidelines

## Priority Levels

1. **Critical (🚨)**: Security, bugs, data corruption, crashes
2. **Important (⚠️)**: Performance, maintainability, convention violations
3. **Nice to have (💡)**: Style preferences, micro-optimizations

## Important Notes

- **Reference specific guidelines** from the docs when applicable
- **Provide code examples** for fixes when helpful
- **Ask questions** if code intent is unclear
- **Link to external resources** for educational value
- **Distinguish** must-fix from should-consider
- **Be concise** - reviewers value their time

## Critical Rules

- **NEVER** be vague - always specify the file and line number
- **NEVER** just point out problems - suggest solutions
- **NEVER** review without reading the actual code
- **ALWAYS** check against SigNoz's specific guidelines
- **ALWAYS** provide a rationale for each comment
- **ALWAYS** be constructive and respectful

## Reference Documents

- [Frontend Guidelines](../../frontend/CONTRIBUTIONS.md) - React, TypeScript, styling
- [Contributing Guidelines](../../CONTRIBUTING.md) - Workflow, commit conventions
- [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) - Commit format
- [CLAUDE.md](../CLAUDE.md) - Project architecture and conventions
.claude/skills/traces/SKILL.md (new file, 14 lines)
@@ -0,0 +1,14 @@

---
description: Architecture context for the traces module (query building, waterfall, flamegraph)
---

# Traces Module

Read [traces-module.md](./traces-module.md) for full context before working on this module. It covers:

- Storage schema (`signoz_index_v3`, `trace_summary`) and gotchas
- API endpoints (Query Range V5, waterfall, flamegraph, funnels)
- Query building system (statement builder, field mapper, trace operators)
- Backend processing pipelines and caching
- Frontend component map, state flow, and API hooks
- Key file index for backend and frontend
.claude/skills/traces/traces-module.md (new file, 191 lines)
@@ -0,0 +1,191 @@

# SigNoz Traces Module — Developer Guide

## Overview

```
App → OTel SDK → OTLP Receiver → [signozspanmetrics, batch] →
ClickHouse Exporter → signoz_traces DB → Query Service (Go) → Frontend (React)
```

**Query Service layers**: HTTP Handlers (`http_handler.go`) → Querier (`querier.go`, orchestration/cache) → Statement Builders (`pkg/telemetrytraces/`) → ClickHouse

---

## Storage Schema

All tables are in the `signoz_traces` database. Schema DDL: `signoz-otel-collector/cmd/signozschemamigrator/schema_migrator/traces_migrations.go`.

### `distributed_signoz_index_v3` — Primary span storage

- **Engine**: MergeTree (plain — **no deduplication**, use `DISTINCT ON (span_id)`)
- **Key columns**: `ts_bucket_start` (UInt64), `timestamp` (DateTime64(9)), `trace_id` (FixedString(32)), `span_id`, `duration_nano`, `has_error`, `name`, `resource_string_service$$name`, `attributes_string`, `events`, `links`
- **ORDER BY**: `(ts_bucket_start, resource_fingerprint, has_error, name, timestamp)`
- **Partition**: `toDate(timestamp)`

### `distributed_trace_summary` — Pre-aggregated trace metadata

- **Engine**: AggregatingMergeTree. Columns: `trace_id`, `start` (min), `end` (max), `num_spans` (sum)
- **Populated by** `trace_summary_mv` — a materialized view on `signoz_index_v3` that triggers per batch, inserting partial aggregates. ClickHouse merges them asynchronously.
- **CRITICAL**: Always query with `GROUP BY trace_id` (never raw `SELECT *`)

### Other tables

`distributed_tag_attributes_v2` (attribute keys for autocomplete), `distributed_span_attributes_keys` (which attributes exist)

---

## API Endpoints

### 1. Query Range V5 — `POST /api/v5/query_range`

Primary query endpoint for traces (also logs/metrics). Supports query builder queries, trace operators, aggregations, filters, group by. See [QUERY_RANGE_API.md](../../docs/modules/QUERY_RANGE_API.md).

Key files: `pkg/telemetrytraces/statement_builder.go`, `trace_operator_statement_builder.go`, `pkg/querier/trace_operator_query.go`

### 2. Waterfall — `POST /api/v2/traces/waterfall/{traceId}`

Handler: `http_handler.go:1748` → Reader: `clickhouseReader/reader.go:873`

**Request**: `{ "selectedSpanId", "isSelectedSpanIDUnCollapsed", "uncollapsedSpans[]" }`
**Response**: `{ startTimestampMillis, endTimestampMillis, totalSpansCount, totalErrorSpansCount, rootServiceName, rootServiceEntryPoint, serviceNameToTotalDurationMap, spans[], hasMissingSpans, uncollapsedSpans[] }`

**Pipeline**:
1. Query `trace_summary` for the time range → query `signoz_index_v3` with `DISTINCT ON (span_id)` and `ts_bucket_start >= start - 1800`
2. Build the span tree: map spanID→Span, link parents via CHILD_OF refs, create Missing Span nodes for absent parents
3. Cache (key: `getWaterfallSpansForTraceWithMetadata-{traceID}`, TTL: 5 min, skipped if the trace end is within the flux interval of 2 min from now)
4. `GetSelectedSpans` (`tracedetail/waterfall.go:159`): find the path to selectedSpanID, DFS into uncollapsed nodes, compute SubTreeNodeCount, return a sliding window of **500 spans** (40% before, 60% after the selected span; see the sketch after this list)
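The window selection is easy to picture; an illustrative version of the 40/60 split (not the actual implementation, which also handles tree traversal) is:

```go
// Illustrative only: pick a windowSize-span window around the selected span,
// aiming for roughly 40% of the window before it and 60% after it.
func spanWindow[T any](spans []T, selectedIdx int) []T {
	const windowSize = 500
	if len(spans) <= windowSize {
		return spans
	}
	start := selectedIdx - windowSize*40/100 // ~200 spans before the selection
	if start < 0 {
		start = 0
	}
	end := start + windowSize
	if end > len(spans) {
		end = len(spans)
		start = end - windowSize // keep the window full at the tail
	}
	return spans[start:end]
}
```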
### 3. Flamegraph — `POST /api/v2/traces/flamegraph/{traceId}`

Handler: `http_handler.go:1781` → Reader: `reader.go:1091`

**Request**: `{ "selectedSpanId" }` **Response**: `{ startTimestampMillis, endTimestampMillis, durationNano, spans[][] }`

Same DB query as the waterfall, but uses **BFS** (not DFS) to organize spans by level. Returns `[][]*FlamegraphSpan` (a lighter model, no tagMap). Level sampling when > 100 spans/level: top 5 by latency + 50 timestamp buckets (2 each). Window: **50 levels**.

### 4. Other APIs

- **Trace Fields**: `GET/POST /api/v2/traces/fields` (handlers at `http_handler.go:4912-4921`)
- **Trace Funnels**: CRUD at `/api/v1/trace-funnels/*`, analytics at `/{funnel_id}/analytics/*` (`pkg/modules/tracefunnel/`)

---

## Query Building System

### Query Structure

```go
QueryBuilderQuery[TraceAggregation]{
    Signal:       SignalTraces,
    Filter:       &Filter{Expression: "service.name = 'api' AND duration_nano > 1000000"},
    Aggregations: []TraceAggregation{{Expression: "count()", Alias: "total"}},
    GroupBy:      []GroupByKey{{TelemetryFieldKey: TelemetryFieldKey{Name: "service.name"}}},
}
```

### SQL Generation (`statement_builder.go`)

1. **Field resolution** via `field_mapper.go` — maps intrinsic (`trace_id`, `duration_nano`), calculated (`http_method`, `has_error`), and attribute fields (`attributes_string[...]`) to CH columns. Example: `"service.name"` → `"resource_string_service$$name"`
2. **Time optimization** — if `trace_id` is in the filter, queries `trace_summary` first to narrow the range
3. **Filter building** via `condition_builder.go` — supports `=`, `!=`, `IN`, `LIKE`, `ILIKE`, `EXISTS`, `CONTAINS`, comparisons
4. **Build SQL** by request type: `buildListQuery()`, `buildTimeSeriesQuery()`, `buildScalarQuery()`, `buildTraceQuery()`

### Trace Operators (`trace_operator_statement_builder.go`)

Combines multiple trace queries with set operations. Parses the expression (e.g., `"A AND B"`) → builds a CTE per query via `trace_operator_cte_builder.go` → combines with INTERSECT (AND), UNION (OR), EXCEPT (NOT).

---

## Frontend (Trace Detail)

### State Flow
```
TraceDetailsV2 (pages/TraceDetailV2/TraceDetailV2.tsx)
├── uncollapsedNodes, interestedSpanId, selectedSpan
├── useGetTraceV2 → waterfall API
├── TraceMetadata (totalSpans, errors, duration)
├── TraceFlamegraph (separate API via useGetTraceFlamegraph)
└── TraceWaterfall → Success → TableV3 (virtualized)
```

### Components

| Component | File |
|-----------|------|
| TraceDetailsV2 | `pages/TraceDetailV2/TraceDetailV2.tsx` |
| TraceMetadata | `container/TraceMetadata/TraceMetadata.tsx` |
| TraceWaterfall | `container/TraceWaterfall/TraceWaterfall.tsx` |
| Success (waterfall table) | `container/TraceWaterfall/.../Success/Success.tsx` |
| Filters | `container/TraceWaterfall/.../Filters/Filters.tsx` |
| TraceFlamegraph | `container/PaginatedTraceFlamegraph/PaginatedTraceFlamegraph.tsx` |
| SpanDetailsDrawer | `container/SpanDetailsDrawer/SpanDetailsDrawer.tsx` |

### API Hooks

| Hook | API |
|------|-----|
| `useGetTraceV2` (`hooks/trace/useGetTraceV2.tsx`) | POST waterfall |
| `useGetTraceFlamegraph` (`hooks/trace/useGetTraceFlamegraph.tsx`) | POST flamegraph |

Adapter: `api/trace/getTraceV2.tsx`. Types: `types/api/trace/getTraceV2.ts`.

---

## Known Gotchas

1. **trace_summary**: Always `GROUP BY trace_id` — raw reads return partial unmerged rows
2. **signoz_index_v3 dedup**: Plain MergeTree. Waterfall uses `DISTINCT ON (span_id)`. Flamegraph relies on map-key dedup (keeps last-seen)
3. **Flux interval**: Traces ending within 2 min of now bypass the cache → fresh DB query on every interaction
4. **SubTreeNodeCount**: Self-inclusive (root count = total tree nodes)
5. **Waterfall pagination**: Max 500 spans per response (sliding window). The frontend virtual-scrolls and re-fetches at the edges

---

## Extending the Module

- **New calculated field**: Define in `telemetrytraces/const.go` → map in `field_mapper.go` → optionally update `condition_builder.go`
- **New API endpoint**: Handler in `http_handler.go` → register the route → implement in ClickHouseReader or Querier
- **New aggregation**: Update `querybuilder/agg_expr_rewriter.go`
- **New trace operator**: Update `grammar/TraceOperatorGrammar.g4` + `trace_operator_cte_builder.go`

---

## Key File Index

### Backend
| File | Purpose |
|------|---------|
| `pkg/telemetrytraces/statement_builder.go` | Trace SQL generation |
| `pkg/telemetrytraces/field_mapper.go` | Field → CH column mapping |
| `pkg/telemetrytraces/condition_builder.go` | WHERE clause building |
| `pkg/telemetrytraces/trace_operator_statement_builder.go` | Trace operator SQL |
| `pkg/telemetrytraces/trace_operator_cte_builder.go` | Trace operator CTEs |
| `pkg/querier/trace_operator_query.go` | Trace operator execution |
| `pkg/query-service/app/http_handler.go:1748` | Waterfall handler |
| `pkg/query-service/app/http_handler.go:1781` | Flamegraph handler |
| `pkg/query-service/app/clickhouseReader/reader.go:831` | GetSpansForTrace |
| `pkg/query-service/app/clickhouseReader/reader.go:873` | Waterfall logic |
| `pkg/query-service/app/clickhouseReader/reader.go:1091` | Flamegraph logic |
| `pkg/query-service/app/traces/tracedetail/waterfall.go` | DFS traversal, span selection |
| `pkg/query-service/app/traces/tracedetail/flamegraph.go` | BFS traversal, level sampling |
| `pkg/query-service/model/response.go:279` | Span model (waterfall) |
| `pkg/query-service/model/response.go:305` | FlamegraphSpan model |
| `pkg/query-service/model/trace.go` | SpanItemV2, TraceSummary |
| `pkg/query-service/model/cacheable.go` | Cache structures |

### Frontend
| File | Purpose |
|------|---------|
| `pages/TraceDetailV2/TraceDetailV2.tsx` | Page container |
| `container/TraceWaterfall/.../Success/Success.tsx` | Waterfall table |
| `container/PaginatedTraceFlamegraph/PaginatedTraceFlamegraph.tsx` | Flamegraph |
| `hooks/trace/useGetTraceV2.tsx` | Waterfall API hook |
| `hooks/trace/useGetTraceFlamegraph.tsx` | Flamegraph API hook |
| `api/trace/getTraceV2.tsx` | API adapter |
| `types/api/trace/getTraceV2.ts` | TypeScript types |

### Schema DDL
| File | Purpose |
|------|---------|
| `signozschemamigrator/.../traces_migrations.go:10-134` | signoz_index_v3 |
| `signozschemamigrator/.../traces_migrations.go:271-348` | trace_summary + MV |
980
docs/modules/QUERY_RANGE_API.md
Normal file
980
docs/modules/QUERY_RANGE_API.md
Normal file
@@ -0,0 +1,980 @@
|
||||
# Query Range API (V5) - Developer Guide
|
||||
|
||||
This document provides a comprehensive guide to the Query Range API (V5), which is the primary query endpoint for traces, logs, and metrics in SigNoz. It covers architecture, request/response models, code flows, and implementation details.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [API Endpoint](#api-endpoint)
|
||||
3. [Request/Response Models](#requestresponse-models)
|
||||
4. [Query Types](#query-types)
|
||||
5. [Request Types](#request-types)
|
||||
6. [Code Flow](#code-flow)
|
||||
7. [Key Components](#key-components)
|
||||
8. [Query Execution](#query-execution)
|
||||
9. [Caching](#caching)
|
||||
10. [Result Processing](#result-processing)
|
||||
11. [Performance Considerations](#performance-considerations)
|
||||
12. [Extending the API](#extending-the-api)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
The Query Range API (V5) is the unified query endpoint for all telemetry signals (traces, logs, metrics) in SigNoz. It provides:
|
||||
|
||||
- **Unified Interface**: Single endpoint for all signal types
|
||||
- **Query Builder**: Visual query builder support
|
||||
- **Multiple Query Types**: Builder queries, PromQL, ClickHouse SQL, Formulas, Trace Operators
|
||||
- **Flexible Response Types**: Time series, scalar, raw data, trace-specific
|
||||
- **Advanced Features**: Aggregations, filters, group by, ordering, pagination
|
||||
- **Caching**: Intelligent caching for performance
|
||||
|
||||
### Key Technologies
|
||||
|
||||
- **Backend**: Go (Golang)
|
||||
- **Storage**: ClickHouse (columnar database)
|
||||
- **Query Language**: Custom query builder + PromQL + ClickHouse SQL
|
||||
- **Protocol**: HTTP/REST API
|
||||
|
||||
---
|
||||
|
||||
## API Endpoint
|
||||
|
||||
### Endpoint Details
|
||||
|
||||
**URL**: `POST /api/v5/query_range`
|
||||
|
||||
**Handler**: `QuerierAPI.QueryRange` → `querier.QueryRange`
|
||||
|
||||
**Location**:
|
||||
- Handler: `pkg/querier/querier.go:122`
|
||||
- Route Registration: `pkg/query-service/app/http_handler.go:480`
|
||||
|
||||
**Authentication**: Requires ViewAccess permission
|
||||
|
||||
**Content-Type**: `application/json`
|
||||
|
||||
### Request Flow
|
||||
|
||||
```
|
||||
HTTP Request (POST /api/v5/query_range)
|
||||
↓
|
||||
HTTP Handler (QuerierAPI.QueryRange)
|
||||
↓
|
||||
Querier.QueryRange (pkg/querier/querier.go)
|
||||
↓
|
||||
Query Execution (Statement Builders → ClickHouse)
|
||||
↓
|
||||
Result Processing & Merging
|
||||
↓
|
||||
HTTP Response (QueryRangeResponse)
|
||||
```
|
||||
|
||||
---

## Request/Response Models

### Request Model

**Location**: `pkg/types/querybuildertypes/querybuildertypesv5/req.go`

```go
type QueryRangeRequest struct {
    Start          uint64                  // Start timestamp (milliseconds)
    End            uint64                  // End timestamp (milliseconds)
    RequestType    RequestType             // Response type (TimeSeries, Scalar, Raw, Trace)
    Variables      map[string]VariableItem // Template variables
    CompositeQuery CompositeQuery          // Container for queries
    NoCache        bool                    // Skip cache flag
}
```

### Composite Query

```go
type CompositeQuery struct {
    Queries []QueryEnvelope // Array of queries to execute
}
```

### Query Envelope

```go
type QueryEnvelope struct {
    Type QueryType // Query type (Builder, PromQL, ClickHouseSQL, Formula, TraceOperator)
    Spec any       // Query specification (type-specific)
}
```

### Response Model

**Location**: `pkg/types/querybuildertypes/querybuildertypesv5/req.go`

```go
type QueryRangeResponse struct {
    Type    RequestType    // Response type
    Data    QueryData      // Query results
    Meta    ExecStats      // Execution statistics
    Warning *QueryWarnData // Warnings (if any)
    QBEvent *QBEvent       // Query builder event metadata
}

type QueryData struct {
    Results []any // Array of result objects (type depends on RequestType)
}

type ExecStats struct {
    RowsScanned   uint64            // Total rows scanned
    BytesScanned  uint64            // Total bytes scanned
    DurationMS    uint64            // Query duration in milliseconds
    StepIntervals map[string]uint64 // Step intervals per query
}
```

---

## Query Types

The API supports multiple query types, each with its own specification format.

### 1. Builder Query (`QueryTypeBuilder`)

Visual query builder queries. Supports traces, logs, and metrics.

**Spec Type**: `QueryBuilderQuery[T]` where T is:
- `TraceAggregation` for traces
- `LogAggregation` for logs
- `MetricAggregation` for metrics

**Example**:
```go
QueryBuilderQuery[TraceAggregation]{
    Name:   "query_name",
    Signal: SignalTraces,
    Filter: &Filter{
        Expression: "service.name = 'api' AND duration_nano > 1000000",
    },
    Aggregations: []TraceAggregation{
        {Expression: "count()", Alias: "total"},
        {Expression: "avg(duration_nano)", Alias: "avg_duration"},
    },
    GroupBy: []GroupByKey{...},
    Order:   []OrderBy{...},
    Limit:   100,
}
```

**Key Files**:
- Traces: `pkg/telemetrytraces/statement_builder.go`
- Logs: `pkg/telemetrylogs/statement_builder.go`
- Metrics: `pkg/telemetrymetrics/statement_builder.go`

### 2. PromQL Query (`QueryTypePromQL`)

Prometheus Query Language queries for metrics.

**Spec Type**: `PromQuery`

**Example**:
```go
PromQuery{
    Query: "rate(http_requests_total[5m])",
    Step:  Step{Duration: time.Minute},
}
```

**Key Files**: `pkg/querier/promql_query.go`

### 3. ClickHouse SQL Query (`QueryTypeClickHouseSQL`)

Direct ClickHouse SQL queries.

**Spec Type**: `ClickHouseQuery`

**Example**:
```go
ClickHouseQuery{
    Query: "SELECT count() FROM signoz_traces.distributed_signoz_index_v3 WHERE ...",
}
```

**Key Files**: `pkg/querier/ch_sql_query.go`

### 4. Formula Query (`QueryTypeFormula`)

Mathematical formulas combining other queries.

**Spec Type**: `QueryBuilderFormula`

**Example**:
```go
QueryBuilderFormula{
    Expression: "A / B * 100", // A and B are query names
}
```

**Key Files**: `pkg/querier/formula_query.go`

### 5. Trace Operator Query (`QueryTypeTraceOperator`)

Set operations on trace queries (AND, OR, NOT).

**Spec Type**: `QueryBuilderTraceOperator`

**Example**:
```go
QueryBuilderTraceOperator{
    Expression: "A AND B", // A and B are query names
    Filter:     &Filter{...},
}
```

**Key Files**:
- `pkg/telemetrytraces/trace_operator_statement_builder.go`
- `pkg/querier/trace_operator_query.go`

---

## Request Types

The `RequestType` determines the format of the response data.

### 1. `RequestTypeTimeSeries`

Returns time series data for charts.

**Response Format**: `TimeSeriesData`

```go
type TimeSeriesData struct {
    QueryName    string
    Aggregations []AggregationBucket
}

type AggregationBucket struct {
    Index  int
    Series []TimeSeries
    Alias  string
    Meta   AggregationMeta
}

type TimeSeries struct {
    Labels map[string]string
    Values []TimeSeriesValue
}

type TimeSeriesValue struct {
    Timestamp int64
    Value     float64
}
```

**Use Case**: Line charts, bar charts, area charts

### 2. `RequestTypeScalar`

Returns scalar values (a single value per query or group).

**Response Format**: `ScalarData`

```go
type ScalarData struct {
    QueryName string
    Data      []ScalarValue
}

type ScalarValue struct {
    Timestamp int64
    Value     float64
}
```

**Use Case**: Single value displays, stat panels

### 3. `RequestTypeRaw`

Returns raw data rows.

**Response Format**: `RawData`

```go
type RawData struct {
    QueryName string
    Columns   []string
    Rows      []RawDataRow
}

type RawDataRow struct {
    Timestamp time.Time
    Data      map[string]any
}
```

**Use Case**: Tables, logs viewer, trace lists

### 4. `RequestTypeTrace`

Returns a trace-specific data structure.

**Response Format**: Trace-specific format (see traces documentation)

**Use Case**: Trace-specific visualizations

---

## Code Flow

### Complete Request Flow

```
1. HTTP Request
   POST /api/v5/query_range
   Body: QueryRangeRequest JSON
       ↓
2. HTTP Handler
   QuerierAPI.QueryRange (pkg/querier/querier.go)
   - Validates request
   - Extracts organization ID from auth context
       ↓
3. Querier.QueryRange (pkg/querier/querier.go:122)
   - Validates QueryRangeRequest
   - Processes each query in CompositeQuery.Queries
   - Identifies dependencies (e.g., trace operators, formulas)
   - Calculates step intervals
   - Fetches metric temporality if needed
       ↓
4. Query Creation
   For each QueryEnvelope:

   a. Builder Query:
      - newBuilderQuery() creates builderQuery instance
      - Selects appropriate statement builder based on signal:
        * Traces  → traceStmtBuilder
        * Logs    → logStmtBuilder
        * Metrics → metricStmtBuilder or meterStmtBuilder

   b. PromQL Query:
      - newPromqlQuery() creates promqlQuery instance
      - Uses Prometheus engine

   c. ClickHouse SQL Query:
      - newchSQLQuery() creates chSQLQuery instance
      - Direct SQL execution

   d. Formula Query:
      - newFormulaQuery() creates formulaQuery instance
      - References other queries by name

   e. Trace Operator Query:
      - newTraceOperatorQuery() creates traceOperatorQuery instance
      - Uses traceOperatorStmtBuilder
       ↓
5. Statement Building (for Builder queries)
   StatementBuilder.Build()
   - Resolves field keys from metadata store
   - Builds SQL based on request type:
     * RequestTypeRaw        → buildListQuery()
     * RequestTypeTimeSeries → buildTimeSeriesQuery()
     * RequestTypeScalar     → buildScalarQuery()
     * RequestTypeTrace      → buildTraceQuery()
   - Returns SQL statement with arguments
       ↓
6. Query Execution
   Query.Execute()
   - Executes SQL/query against ClickHouse or Prometheus
   - Processes results into response format
   - Returns Result with data and statistics
       ↓
7. Caching (if applicable)
   - Checks bucket cache for time series queries
   - Executes queries for missing time ranges
   - Merges cached and fresh results
       ↓
8. Result Processing
   querier.run()
   - Executes all queries (with dependency resolution)
   - Collects results and warnings
   - Merges results from multiple queries
       ↓
9. Post-Processing
   postProcessResults()
   - Applies formulas if present
   - Handles variable substitution
   - Formats results for response
       ↓
10. HTTP Response
    - Returns QueryRangeResponse with results
    - Includes execution statistics
    - Includes warnings if any
```

### Key Decision Points

1. **Query Type Selection**: Based on `QueryEnvelope.Type`
2. **Signal Selection**: For builder queries, based on `Signal` field
3. **Request Type Handling**: Different SQL generation for different request types
4. **Caching Strategy**: Only for time series queries with valid fingerprints
5. **Dependency Resolution**: Trace operators and formulas resolve dependencies first

---

## Key Components

### 1. Querier

**Location**: `pkg/querier/querier.go`

**Purpose**: Orchestrates query execution, caching, and result merging

**Key Methods**:
- `QueryRange()`: Main entry point for query execution
- `run()`: Executes queries and merges results
- `executeWithCache()`: Handles caching logic
- `mergeResults()`: Merges cached and fresh results
- `postProcessResults()`: Applies formulas and variable substitution

**Key Features**:
- Query orchestration across multiple query types
- Intelligent caching with bucket-based strategy
- Result merging from multiple queries
- Formula evaluation
- Time range optimization
- Step interval calculation and validation

### 2. Statement Builder Interface

**Location**: `pkg/types/querybuildertypes/querybuildertypesv5/`

**Purpose**: Converts query builder specifications into executable queries

**Interface**:
```go
type StatementBuilder[T any] interface {
    Build(
        ctx context.Context,
        start uint64,
        end uint64,
        requestType RequestType,
        query QueryBuilderQuery[T],
        variables map[string]VariableItem,
    ) (*Statement, error)
}
```

**Implementations**:
- `traceQueryStatementBuilder` - Traces (`pkg/telemetrytraces/statement_builder.go`)
- `logQueryStatementBuilder` - Logs (`pkg/telemetrylogs/statement_builder.go`)
- `metricQueryStatementBuilder` - Metrics (`pkg/telemetrymetrics/statement_builder.go`)

**Key Features**:
- Field resolution via metadata store
- SQL generation for different request types
- Filter, aggregation, group by, ordering support
- Time range optimization
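
A minimal sketch of how a caller might drive a statement builder and hand the result to the telemetry store. The `Statement` field names (`Query`, `Args`) and the `telemetryStore.Query` signature are assumptions for illustration; check the actual types in `querybuildertypesv5` and `pkg/telemetrystore/`.

```go
// Illustrative only: build a SQL statement for a traces time series
// query, then execute it against ClickHouse.
stmt, err := traceStmtBuilder.Build(
    ctx,
    startMs, endMs,
    qbtypes.RequestTypeTimeSeries,
    query,     // QueryBuilderQuery[TraceAggregation]
    variables, // map[string]qbtypes.VariableItem
)
if err != nil {
    return err
}
rows, err := telemetryStore.Query(ctx, stmt.Query, stmt.Args...) // assumed fields
```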

### 3. Query Interface

**Location**: `pkg/types/querybuildertypes/querybuildertypesv5/`

**Purpose**: Represents an executable query

**Interface**:
```go
type Query interface {
    Execute(ctx context.Context) (*Result, error)
    Fingerprint() string      // For caching
    Window() (uint64, uint64) // Time range
}
```

**Implementations**:
- `builderQuery[T]` - Builder queries (`pkg/querier/builder_query.go`)
- `promqlQuery` - PromQL queries (`pkg/querier/promql_query.go`)
- `chSQLQuery` - ClickHouse SQL queries (`pkg/querier/ch_sql_query.go`)
- `formulaQuery` - Formula queries (`pkg/querier/formula_query.go`)
- `traceOperatorQuery` - Trace operator queries (`pkg/querier/trace_operator_query.go`)

### 4. Telemetry Store

**Location**: `pkg/telemetrystore/`

**Purpose**: Abstraction layer for ClickHouse database access

**Key Methods**:
- `Query()`: Execute SQL query
- `QueryRow()`: Execute query returning a single row
- `Select()`: Execute query returning multiple rows

**Implementation**: `clickhouseTelemetryStore` (`pkg/telemetrystore/clickhousetelemetrystore/`)

### 5. Metadata Store

**Location**: `pkg/types/telemetrytypes/`

**Purpose**: Provides metadata about available fields, keys, and attributes

**Key Methods**:
- `GetKeysMulti()`: Get field keys for multiple selectors
- `FetchTemporalityMulti()`: Get metric temporality information

**Implementation**: `telemetryMetadataStore` (`pkg/telemetrymetadata/`)

### 6. Bucket Cache

**Location**: `pkg/querier/`

**Purpose**: Caches query results by time buckets for performance

**Key Methods**:
- `GetMissRanges()`: Get time ranges not in cache
- `Put()`: Store query result in cache

**Features**:
- Bucket-based caching (aligned to step intervals)
- Automatic cache invalidation
- Parallel query execution for missing ranges
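
To make the bucket idea concrete, here is a small, self-contained sketch (not the actual implementation) of computing which sub-ranges of a requested window are missing from a set of cached buckets:

```go
// Illustrative only: given cached buckets and a requested [start, end)
// window in milliseconds, return the sub-ranges that still need to be
// queried. Assumes buckets are sorted and non-overlapping.
type timeRange struct{ Start, End uint64 }

func missingRanges(start, end uint64, cached []timeRange) []timeRange {
    var miss []timeRange
    cursor := start
    for _, b := range cached {
        if b.End <= cursor || b.Start >= end {
            continue // bucket entirely outside the requested window
        }
        if b.Start > cursor {
            miss = append(miss, timeRange{cursor, b.Start}) // gap before bucket
        }
        if b.End > cursor {
            cursor = b.End // advance past the cached bucket
        }
    }
    if cursor < end {
        miss = append(miss, timeRange{cursor, end}) // trailing gap
    }
    return miss
}
```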

---

## Query Execution

### Builder Query Execution

**Location**: `pkg/querier/builder_query.go`

**Process**:
1. Statement builder generates SQL
2. SQL executed against ClickHouse via TelemetryStore
3. Results processed based on RequestType:
   - TimeSeries: Grouped by time buckets and labels
   - Scalar: Single value extraction
   - Raw: Row-by-row processing
4. Statistics collected (rows scanned, bytes scanned, duration)

### PromQL Query Execution

**Location**: `pkg/querier/promql_query.go`

**Process**:
1. Query parsed by Prometheus engine
2. Executed against Prometheus-compatible data
3. Results converted to QueryRangeResponse format

### ClickHouse SQL Query Execution

**Location**: `pkg/querier/ch_sql_query.go`

**Process**:
1. SQL query executed directly
2. Results processed based on RequestType
3. Variable substitution applied

### Formula Query Execution

**Location**: `pkg/querier/formula_query.go`

**Process**:
1. Referenced queries executed first
2. Formula expression evaluated using govaluate
3. Results computed from query results

### Trace Operator Query Execution

**Location**: `pkg/querier/trace_operator_query.go`

**Process**:
1. Expression parsed to find dependencies
2. Referenced queries executed
3. Set operations applied (INTERSECT, UNION, EXCEPT)
4. Results combined
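
Conceptually, `A AND B` narrows the result to traces matched by both sub-queries. A toy sketch of the set semantics (the real implementation pushes these operations into SQL as INTERSECT/UNION/EXCEPT):

```go
// Illustrative only: set semantics of the AND trace operator over
// trace IDs returned by two sub-queries.
func intersect(a, b map[string]struct{}) map[string]struct{} {
    out := make(map[string]struct{})
    for id := range a {
        if _, ok := b[id]; ok {
            out[id] = struct{}{} // trace matched by both A and B
        }
    }
    return out
}
```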

---

## Caching

### Caching Strategy

**Location**: `pkg/querier/querier.go:642`

**When Caching Applies**:
- Time series queries only
- Queries with valid fingerprints
- `NoCache` flag not set

**How It Works**:
1. Query fingerprint generated from the query structure, aggregations, and filters (see Cache Key Generation below)
2. Cache checked for existing results
3. Missing time ranges identified
4. Queries executed only for missing ranges (parallel execution)
5. Fresh results merged with cached results
6. Merged result stored in cache

### Cache Key Generation

**Location**: `pkg/querier/builder_query.go:52`

The fingerprint includes:
- Signal type
- Source type
- Step interval
- Aggregations
- Filters
- Group by fields

The time range is part of the cache key but not the fingerprint, so the same fingerprint can serve multiple time windows.
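
A compressed sketch of the idea (not the real code): serialize the semantically significant parts of the query and hash them. Field names follow the structs shown earlier; the serialization format here is purely illustrative.

```go
// Illustrative only: derive a stable fingerprint from the parts of a
// builder query that affect its results (time range excluded).
// Uses crypto/sha256, encoding/hex, and fmt.
func fingerprint(q qbtypes.QueryBuilderQuery[qbtypes.TraceAggregation]) string {
    h := sha256.New()
    fmt.Fprintf(h, "signal=%v;step=%v;", q.Signal, q.StepInterval)
    if q.Filter != nil {
        fmt.Fprintf(h, "filter=%s;", q.Filter.Expression)
    }
    for _, agg := range q.Aggregations {
        fmt.Fprintf(h, "agg=%s as %s;", agg.Expression, agg.Alias)
    }
    for _, g := range q.GroupBy {
        fmt.Fprintf(h, "group=%s;", g.Name)
    }
    return hex.EncodeToString(h.Sum(nil))
}
```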

### Cache Benefits

- **Performance**: Avoids re-executing identical queries
- **Efficiency**: Only queries missing time ranges
- **Parallelism**: Multiple missing ranges queried in parallel

---

## Result Processing

### Result Merging

**Location**: `pkg/querier/querier.go:795`

**Process**:
1. Results from multiple queries collected
2. For time series: series merged by labels
3. For raw data: rows combined
4. Statistics aggregated (rows scanned, bytes scanned, duration)
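
For intuition, merging two result sets by label set might look like this simplified sketch (the real merge lives in `mergeResults`):

```go
// Illustrative only: merge two slices of TimeSeries, concatenating
// values for series that share the same label set. Uses sort and strings.
func mergeSeries(a, b []qbtypes.TimeSeries) []qbtypes.TimeSeries {
    // Build a canonical key from a label map: sorted "k=v;" pairs.
    key := func(labels map[string]string) string {
        keys := make([]string, 0, len(labels))
        for k := range labels {
            keys = append(keys, k)
        }
        sort.Strings(keys)
        var sb strings.Builder
        for _, k := range keys {
            sb.WriteString(k + "=" + labels[k] + ";")
        }
        return sb.String()
    }
    byLabels := map[string]*qbtypes.TimeSeries{}
    for _, s := range append(append([]qbtypes.TimeSeries{}, a...), b...) {
        s := s
        k := key(s.Labels)
        if existing, ok := byLabels[k]; ok {
            existing.Values = append(existing.Values, s.Values...)
        } else {
            byLabels[k] = &s
        }
    }
    out := make([]qbtypes.TimeSeries, 0, len(byLabels))
    for _, s := range byLabels {
        out = append(out, *s)
    }
    return out
}
```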

### Formula Evaluation

**Location**: `pkg/querier/formula_query.go`

**Process**:
1. Formula expression parsed
2. Referenced query results retrieved
3. Expression evaluated using the govaluate library
4. Result computed and formatted
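
For reference, evaluating a formula like `A / B * 100` with govaluate looks roughly like this standalone sketch; the querier wires actual query results into the parameter map per time bucket:

```go
// Illustrative only: evaluate "A / B * 100" for one data point
// using github.com/Knetic/govaluate.
expr, err := govaluate.NewEvaluableExpression("A / B * 100")
if err != nil {
    return err
}
result, err := expr.Evaluate(map[string]interface{}{
    "A": 42.0, // value of query A at this timestamp
    "B": 84.0, // value of query B at this timestamp
})
// result holds 50.0 as an interface{} (float64 underneath)
```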

### Variable Substitution

**Location**: `pkg/querier/querier.go`

**Process**:
1. Variables extracted from the request
2. Variable values substituted in queries
3. Applied to filters, aggregations, and other query parts

---

## Performance Considerations

### Query Optimization

1. **Time Range Optimization**:
   - For trace queries with a `trace_id` filter, query `trace_summary` first to narrow the time range
   - Use appropriate time ranges to limit data scanned

2. **Step Interval Calculation** (a simplified sketch follows this list):
   - Automatic step interval calculation based on time range
   - Minimum step interval enforcement
   - Warnings for suboptimal intervals

3. **Index Usage**:
   - Queries use time bucket columns (`ts_bucket_start`) for efficient filtering
   - Proper filter placement for index utilization

4. **Limit Enforcement**:
   - Raw data queries should include limits
   - Pagination support via offset/cursor
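
The exact heuristic lives in the querier; the sketch below only illustrates the common shape of such a calculation (target a bounded number of points, enforce a minimum step). The constants `maxPoints` and `minStep` are assumptions, not the real values.

```go
// Illustrative only: pick a step interval that keeps the number of
// points per series bounded. Uses the time package.
func stepInterval(startMs, endMs uint64) time.Duration {
    const (
        maxPoints = 300              // assumed target resolution
        minStep   = 60 * time.Second // assumed minimum step
    )
    window := time.Duration(endMs-startMs) * time.Millisecond
    step := window / maxPoints
    if step < minStep {
        step = minStep
    }
    // Round to whole seconds for bucket alignment.
    return step.Round(time.Second)
}
```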

### Best Practices

1. **Use Query Builder**: Prefer the query builder over raw SQL for better optimization
2. **Limit Time Ranges**: Always specify reasonable time ranges
3. **Use Aggregations**: For large datasets, use aggregations instead of raw data
4. **Cache Awareness**: Be mindful of cache TTLs when testing
5. **Parallel Queries**: Multiple independent queries execute in parallel
6. **Step Intervals**: Let the system calculate optimal step intervals

### Monitoring

Execution statistics are included in the response:
- `RowsScanned`: Total rows scanned
- `BytesScanned`: Total bytes scanned
- `DurationMS`: Query execution time
- `StepIntervals`: Step intervals per query
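
Clients can use these fields to flag expensive queries; a trivial sketch (the thresholds are arbitrary assumptions):

```go
// Illustrative only: log when a query_range call scanned a lot of data.
if resp.Meta.BytesScanned > 1<<30 || resp.Meta.DurationMS > 5000 {
    logger.WarnContext(ctx, "expensive query_range call",
        "rows", resp.Meta.RowsScanned,
        "bytes", resp.Meta.BytesScanned,
        "duration_ms", resp.Meta.DurationMS)
}
```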

---

## Extending the API

### Adding a New Query Type

1. **Define Query Type** (`pkg/types/querybuildertypes/querybuildertypesv5/query.go`):
```go
const (
    QueryTypeMyNewType QueryType = "my_new_type"
)
```

2. **Define Query Spec**:
```go
type MyNewQuerySpec struct {
    Name string
    // ... your fields
}
```

3. **Update QueryEnvelope Unmarshaling** (`pkg/types/querybuildertypes/querybuildertypesv5/query.go`):
```go
case QueryTypeMyNewType:
    var spec MyNewQuerySpec
    if err := UnmarshalJSONWithContext(shadow.Spec, &spec, "my new query spec"); err != nil {
        return wrapUnmarshalError(err, "invalid my new query spec: %v", err)
    }
    q.Spec = spec
```

4. **Implement Query Interface** (`pkg/querier/my_new_query.go`):
```go
type myNewQuery struct {
    spec MyNewQuerySpec
    // ... other fields
}

func (q *myNewQuery) Execute(ctx context.Context) (*qbtypes.Result, error) {
    // Implementation
}

func (q *myNewQuery) Fingerprint() string {
    // Generate fingerprint for caching
}

func (q *myNewQuery) Window() (uint64, uint64) {
    // Return time range
}
```

5. **Update Querier** (`pkg/querier/querier.go`):
```go
case QueryTypeMyNewType:
    myQuery, ok := query.Spec.(MyNewQuerySpec)
    if !ok {
        return nil, errors.NewInvalidInputf(...)
    }
    queries[myQuery.Name] = newMyNewQuery(myQuery, ...)
```

### Adding a New Request Type

1. **Define Request Type** (`pkg/types/querybuildertypes/querybuildertypesv5/req.go`):
```go
const (
    RequestTypeMyNewType RequestType = "my_new_type"
)
```

2. **Update Statement Builders**: Add handling in the `Build()` method
3. **Update Query Execution**: Add result processing for the new type
4. **Update Response Models**: Add a response data structure

### Adding a New Aggregation Function

1. **Update the Aggregation Rewriter** (`pkg/querybuilder/agg_expr_rewriter.go`):
```go
func (r *aggExprRewriter) RewriteAggregation(expr string) (string, error) {
    if strings.HasPrefix(expr, "my_function(") {
        // Parse arguments
        // Return ClickHouse SQL expression
        return "myClickHouseFunction(...)", nil
    }
    // ... existing functions
}
```

2. **Update Documentation**: Document the new function

---

## Common Patterns

### Pattern 1: Simple Time Series Query

```go
req := qbtypes.QueryRangeRequest{
    Start:       startMs,
    End:         endMs,
    RequestType: qbtypes.RequestTypeTimeSeries,
    CompositeQuery: qbtypes.CompositeQuery{
        Queries: []qbtypes.QueryEnvelope{
            {
                Type: qbtypes.QueryTypeBuilder,
                Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
                    Name:   "A",
                    Signal: telemetrytypes.SignalMetrics,
                    Aggregations: []qbtypes.MetricAggregation{
                        {Expression: "sum(rate)", Alias: "total"},
                    },
                    StepInterval: qbtypes.Step{Duration: time.Minute},
                },
            },
        },
    },
}
```

### Pattern 2: Query with Filter and Group By

```go
req := qbtypes.QueryRangeRequest{
    Start:       startMs,
    End:         endMs,
    RequestType: qbtypes.RequestTypeTimeSeries,
    CompositeQuery: qbtypes.CompositeQuery{
        Queries: []qbtypes.QueryEnvelope{
            {
                Type: qbtypes.QueryTypeBuilder,
                Spec: qbtypes.QueryBuilderQuery[qbtypes.TraceAggregation]{
                    Name:   "A",
                    Signal: telemetrytypes.SignalTraces,
                    Filter: &qbtypes.Filter{
                        Expression: "service.name = 'api' AND duration_nano > 1000000",
                    },
                    Aggregations: []qbtypes.TraceAggregation{
                        {Expression: "count()", Alias: "total"},
                    },
                    GroupBy: []qbtypes.GroupByKey{
                        {TelemetryFieldKey: telemetrytypes.TelemetryFieldKey{
                            Name:         "service.name",
                            FieldContext: telemetrytypes.FieldContextResource,
                        }},
                    },
                },
            },
        },
    },
}
```

### Pattern 3: Formula Query

```go
req := qbtypes.QueryRangeRequest{
    Start:       startMs,
    End:         endMs,
    RequestType: qbtypes.RequestTypeTimeSeries,
    CompositeQuery: qbtypes.CompositeQuery{
        Queries: []qbtypes.QueryEnvelope{
            {
                Type: qbtypes.QueryTypeBuilder,
                Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
                    Name: "A",
                    // ... query A definition
                },
            },
            {
                Type: qbtypes.QueryTypeBuilder,
                Spec: qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
                    Name: "B",
                    // ... query B definition
                },
            },
            {
                Type: qbtypes.QueryTypeFormula,
                Spec: qbtypes.QueryBuilderFormula{
                    Name:       "C",
                    Expression: "A / B * 100",
                },
            },
        },
    },
}
```

---

## Testing

### Unit Tests

- `pkg/querier/querier_test.go` - Querier tests
- `pkg/querier/builder_query_test.go` - Builder query tests
- `pkg/querier/formula_query_test.go` - Formula query tests

### Integration Tests

- `tests/integration/` - End-to-end API tests

### Running Tests

```bash
# Run all querier tests (with the race detector, per repo convention)
go test -race ./pkg/querier/...

# Run with verbose output
go test -race -v ./pkg/querier/...

# Run a specific test
go test -race -v ./pkg/querier/ -run TestQueryRange
```

---

## Debugging

### Enable Debug Logging

```go
// In querier.go
q.logger.DebugContext(ctx, "Executing query",
    "query", queryName,
    "start", start,
    "end", end)
```

### Common Issues

1. **Query Not Found**: Check query name matches in CompositeQuery
2. **SQL Errors**: Check generated SQL in logs, verify ClickHouse syntax
3. **Performance**: Check execution statistics, optimize time ranges
4. **Cache Issues**: Set `NoCache: true` to bypass cache
5. **Formula Errors**: Check formula expression syntax and referenced query names

## References

### Key Files

- `pkg/querier/querier.go` - Main query orchestration
- `pkg/querier/builder_query.go` - Builder query execution
- `pkg/types/querybuildertypes/querybuildertypesv5/` - Request/response models
- `pkg/telemetrystore/` - ClickHouse interface
- `pkg/telemetrymetadata/` - Metadata store

### Signal-Specific Documentation

- [Traces Module](./TRACES_MODULE.md) - Trace-specific details
- Logs module documentation (when available)
- Metrics module documentation (when available)

### Related Documentation

- [ClickHouse Documentation](https://clickhouse.com/docs)
- [PromQL Documentation](https://prometheus.io/docs/prometheus/latest/querying/basics/)

---

## Contributing

When contributing to the Query Range API:

1. **Follow Existing Patterns**: Match the style of existing query types
2. **Add Tests**: Include unit tests for new functionality
3. **Update Documentation**: Update this doc for significant changes
4. **Consider Performance**: Optimize queries and use caching appropriately
5. **Handle Errors**: Provide meaningful error messages

For questions or help, reach out to the maintainers or open an issue.