Commit Graph

80 Commits

Author SHA1 Message Date
Nityananda Gohain
ad243b88aa feat: send all data for trace list api (#10583)
* chore: send all data for trace list api

* chore: add integration tests

* chore: fix tests

* chore: add comment

Co-authored-by: Tushar Vats <tushar@signoz.io>

* fix: retain existing behaviour

* fix: add changes

* fix: linting issues

* fix: py-fmt

* fix: restrict merging to only span data

* fix: address comments

* fix: address comments

* fix: send parsed events and links

* fix: remove unnecessary tests

* fix: lint issues

* fix: send all data for trace operators as well

* fix: lint issues

* fix: move tests to the same file

* fix: tests

* fix: lint

* fix: comment

* fix: lint

---------

Co-authored-by: Tushar Vats <tushar@signoz.io>
2026-06-23 08:25:59 +00:00
Tushar Vats
4f51ee37ba fix: modularize query range function (#11774) 2026-06-22 11:35:33 +00:00
Srikanth Chekuri
2cf7ef93ea chore: send warning instead of error for unseen metrics and missing (… (#11754)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: send warning instead of error for unseen metrics and missing (metric, key)

* chore: update integration test

* chore: fix integration test

* chore: fix test

* chore: add unit test for missing key
2026-06-18 10:42:09 +00:00
Tushar Vats
629ea3b8be feat: extend error responses with new error struct (#11542)
Some checks failed
build-staging / js-build (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat: extend error responses with new error struct

* fix: enriched error for dashboard api

* fix: merge issues

* fix: reverted dashboards changes and add for cloud integrations

* fix: delete file

* fix: add back file

* fix: added a helper

* fix: removed invlaid referencess

* fix: generate openapi

* fix: keeping additional along with suggestion

* Revert "fix: keeping additional along with suggestion"

This reverts commit be30e2ffd2.

* fix: added suggestions per additonal error

* fix: generate openapi

* fix: remove valid references

* fix: removeg valid references for select and group by and only did you mean is kept

* fix: unit test

* fix: use binding for deconding for both ee and community

* fix: trim down suggestions methods

* fix: added renamed methods and moved stuff around

* fix: typo

* fix: removed json decoder

* fix: added empty check

* fix: retain addtional

* fix: reverted re-structing of file
2026-06-15 11:58:12 +00:00
Pandey
287b60cbe6 feat(statsreporter): expose collected stats via GET /api/v1/stats (#11698)
* feat(statsreporter): expose collected stats via GET /api/v1/stats

Extract per-org stats collection out of the analytics reporter into an
always-on Aggregator (collector fan-out + telemetry-store counts) shared
by the reporter and a new HTTP handler. The GET /api/v1/stats endpoint
returns the caller's org stats regardless of whether scheduled reporting
is enabled.

* refactor(statsreporter): collect telemetry stats via the querier

Move the trace/log/metric row-count and last-observed queries out of the
stats aggregator and into the querier, which now implements
statsreporter.StatsCollector. The aggregator becomes a pure collector
fan-out and no longer depends on telemetrystore; the querier is wired in
as one of the stats collectors.

* chore: regenerate openapi spec and frontend client

Backend docs/api/openapi.yml gains the GET /api/v1/stats (GetStats)
operation; the Orval client gains a useGetStats hook and GetStats200
type.

* chore: remove comment from querier Collect

* fix(statsreporter): use MustNewUUID for org from claims

Claims come from validated auth context, so the org UUID is guaranteed
valid; drop the dead NewUUID error branch.

* fix(flagger): use MustNewUUID for org from claims

Claims come from validated auth context, so the org UUID is guaranteed
valid; drop the dead NewUUID error branch.

* docs(contributing): note MustNewUUID for IDs from claims

* perf(querier): combine count and last-observed into one query per signal

Each signal's COUNT(*) and max(timestamp) scan the same table, so fetch
both in a single query — 3 queries instead of 6. Same emitted keys and
empty-table guard.
2026-06-15 11:27:17 +00:00
Tushar Vats
080aae9567 fix: mark numeric columns as aggregation in scalar query response (#11593)
* fix: mark numeric columns as aggregation in scalar query response

* fix: make py-fmt

* fix: comments
2026-06-12 12:34:16 +00:00
Naman Verma
4d3d1ef423 fix: add check for percentile aggregation for non-histogram metrics (#11387)
* fix: add check for percentile aggregation for non-histogram metrics

* test: correct errors pkg in test file

* fix: catch type related errors in querier

* fix: remove comparison related tests

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-06-10 17:17:26 +00:00
Naman Verma
a0b14e0835 fix: do not show errors for non-existent cost meter metrics (#10843)
* fix: show warning for non-existent cost meter metrics

* chore: lint fix by removing unused list

* chore: py fmt add new line

* chore: missing newline between tests

* fix: no warnings or errors for internal metrics

* fix: pylint fix by adding new line

* fix: lint fix in test
2026-06-03 10:09:08 +00:00
Nityananda Gohain
dc6a1d0a4e fix: skip resource filter if it exceeds a threshold (#11524)
* fix: skip resource filter if it exceeds a threshold

* fix: test cleanup

* fix: test cleanup

* fix: more cleanup

* fix: tests
2026-06-03 05:08:40 +00:00
Tushar Vats
4b08ba1330 fix: qb warnings (#11518) 2026-06-01 09:43:22 +00:00
Pandey
53b2b2f017 fix(telemetrystore): fix clickhouse connection-pool slot leak (acquire conn timeout) (#11506)
* fix(telemetrystore): upgrade clickhouse-go to v2.44.0 to fix connection-pool slot leak

clickhouse-go v2.43.0 introduced connection-pool slot leaks triggered by context
cancellation: acquire() failed to release the pool slot when idle.Get returned a
cancellation error (ClickHouse/clickhouse-go#1759), and batch.Close() never released
the connection when closeQuery() failed on a cancelled context
(ClickHouse/clickhouse-go#1795). Both leak slots until the pool is exhausted and every
query fails with 'acquire conn timeout'. Both are fixed in v2.44.0.

v2.44.0 adds HasData() to the driver.Rows interface, which the test mock did not
implement. Swap the mock to the SigNoz fork github.com/SigNoz/clickhouse-go-mock
v0.14.0, which implements HasData() and tracks v2.44.0.

* feat(telemetrystore): emit clickhouse connection-pool metrics

Register OTel observable gauges that report the clickhouse connection-pool stats
from driver.Stats() on each collection cycle:
signoz.telemetrystore.connection.{open,idle,max_open,max_idle}. Plotting open against
max_open makes pool saturation (and leaks like the one fixed in the previous commit)
directly observable in Prometheus.
2026-05-30 01:06:16 +05:30
Nityananda Gohain
2f102ee3f5 fix: ensure timestamp is always in ms (#11483)
Co-authored-by: Tushar Vats <tushar@signoz.io>
2026-05-28 10:12:55 +00:00
Tushar Vats
8da9535c80 chore: breakdown query range function (#11211)
* chore: breakdown query range function

* fix: move unexported helper functions to end of file

---------

Co-authored-by: Yunus M <myounis.ar@live.com>
2026-05-27 09:38:45 +00:00
Nityananda Gohain
ceb1b4871b feat: trace based filters for logs, supporting aggregations as well (#11394)
* feat: trace based filters for logs, supporting aggregations as well

* fix: update comments

* fix: cleanup query from tests

* fix: address comments

* fix: address comments

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-05-26 09:57:18 +00:00
Piyush Singariya
847bc71f4e fix: postprocess json logs message key (#11189)
* fix: backend changes for message key postprocessing

* fix: message postprocessing

* chore: update in e2e tests

* fix: table view

* chore: separate frontend from backend changes

* fix: integration tests
2026-05-20 11:46:27 +00:00
Tushar Vats
76b35b9d8f fix: order by ignored in formula query (#10950)
* fix: order by ignored in formula query

* fix: order by ignored in formula query

* fix: added intergation test

* fix: revert integarion test changes

* fix: added an independent integration test

* fix: make py-fmt

* fix: removed comment

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
Co-authored-by: Pandey <vibhupandey28@gmail.com>
2026-05-18 11:38:40 +00:00
Vikrant Gupta
fd2e526f7c fix(deps): resolve all high/critical Dependabot security alerts (#11323)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(deps): upgrade dependencies to resolve high/critical security alerts

Upgrade pgx/v5 (v5.8.0→v5.9.2), prometheus (v0.310.0→v0.311.3),
gosaml2 (v0.9.0→v0.11.0), goxmldsig (v1.2.0→v1.6.0), and
urllib3 (2.6.3→2.7.0) to fix all open high/critical Dependabot alerts.

Adapt parser.ParseExpr calls to use the new Parser interface introduced
in prometheus v0.311.x.

* refactor: reuse a single PromQL parser instance instead of creating per call

Add Parser() to the prometheus.Prometheus interface so a single
parser.Parser is created at startup and shared across all consumers.
For the legacy v2 querier and PromQLFilterExtractor (which don't have
access to the Prometheus interface), store a parser instance on the
struct, created once during construction.

* refactor: centralize PromQL parser creation via prometheus.NewParser()

Add pkg/prometheus/parser.go with a Parser type alias and NewParser()
factory function, mirroring the existing Engine/NewEngine pattern.
All consumers now create parsers through this single entry point
instead of calling parser.NewParser(parser.Options{}) directly.
2026-05-15 20:02:39 +00:00
Nityananda Gohain
b445b22bd6 fix: handle series with different types of labels (#11112) 2026-05-12 10:35:43 +00:00
Nityananda Gohain
a7fde606ca fix: use ms in prepareFillZeroArgsWithStep (#11196)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
2026-05-05 18:27:19 +00:00
Pandey
3d8cddf84e refactor: split typeable infrastructure into pkg/types/coretypes (#11105)
* refactor: move authtypes to coretypes

* refactor: migrate downstream consumers to coretypes Kind/Type/Relation

Wire all consumers of the typeable infrastructure through coretypes:
- Replace authtypes.Name/Type/Relation references with coretypes equivalents
- Switch Typeable singletons to constructor calls (authtypes.NewTypeableUser
  etc.), with the embedded coretypes.Typeable populated so Kind/Type/Prefix/
  Scope dispatch correctly through the embed
- Update dashboardtypes meta-resource declarations to use authtypes
  constructors so they expose Tuples (authz callers need it)
- Rename Resource.Name field accesses to Resource.Kind to match the field
  rename in authtypes.Resource
- Fix typeable_metaresource.go calling the plural NewTypeableMetaResources
  helper — should be the singular NewTypeableMetaResource

go build ./... and go vet ./... clean (parser-generated unreachable-code
warnings are pre-existing). Authz unit tests pass.

* refactor(audittypes): unify Action with coretypes.Relation

Drop the duplicate Action enum from audittypes — the verbs (create/update/
delete) match coretypes.Relation exactly. Move PastTense onto Relation so
audit EventName derivation continues to work without a parallel hierarchy.

Also retypes AuditDef.ResourceKind from string to coretypes.Kind so audit
declarations get the same regex validation that authz already enforces.

* refactor(retentiontypes): extract TTLSetting into its own package

TTLSetting is the bun model for ClickHouse TTL settings — has nothing to do
with the Organization domain it was previously co-located with in
pkg/types/organization.go. Moved to pkg/types/retentiontypes/ alongside the
ClickHouse reader that's its sole consumer.

No schema change; the bun table tag (table:ttl_setting) is unchanged.

* chore(openapi): regenerate spec for coretypes.Relation and Resource.Kind

* chore(frontend): regenerate API client and migrate Resource.name → Resource.kind

Regenerated TypeScript API types after the AuthtypesResource field rename
and the new CoretypesRelation enum. Updated:

- frontend/scripts/generate-permissions-type.cjs to read `r.kind` from the
  /api/v1/authz/resources response and emit `kind:` in the static
  permissions.type.ts file.
- frontend/src/hooks/useAuthZ/{permissions.type,types,utils,useAuthZ}.tsx:
  Resource.name → Resource.kind throughout.
- frontend/src/container/RolesSettings/{utils.tsx,__tests__/utils.test.ts}:
  same field migration.
- frontend/src/components/createGuardedRoute/createGuardedRoute.test.tsx:
  same.
- useAuthZ/utils.ts: cast string relations to CoretypesRelationDTO at the
  AuthtypesTransactionDTO boundary now that relation is an enum, not a raw
  string.

yarn generate:api passes (orval generation + lint + typecheck).

* refactor: migrate downstream consumers to Resource/Verb rename

* chore(openapi): regenerate spec for Resource/Verb rename

* feat(coretypes): add ListResources accessor with stable sort

* feat(cmd): add 'generate authz' subcommand for permissions type

* refactor(authz): drop runtime authz/resources endpoint

* refactor(frontend): consume static permissions.type.ts directly

* chore(frontend): regenerate Orval client without authz/resources

* ci: move authz schema check from jsci to goci

* refactor(coretypes): move Selector/Object/Transaction from authtypes

* feat(coretypes): add managed role names and permission policy

* feat(coretypes): add Registry assembling resources, types, and managed-role transactions

* refactor(authz): wire *coretypes.Registry; drop RegisterTypeable

* refactor(cmd): wire coretypes.NewRegistry into server bootstraps

* chore: regenerate openapi spec for authtypes -> coretypes type moves

* chore(frontend): regenerate API client for Authtypes -> Coretypes type moves

* refactor(coretypes): rename GettableResource to ResourceRef

* refactor(authz): collapse Registry around static data; bridge once at construction

* refactor(coretypes): tighten Registry, restore anonymous public-dashboard grant

Drops passthrough fields from coretypes.Registry; adds an O(1) lookup map
for NewResourceFromTypeAndKind; replaces stringly-typed Type compares with
Type.Equals; removes the now-redundant getUniqueTypes helper. Restores the
signoz-anonymous read grant on metaresource/public-dashboard that was
silently dropped, and removes the invalid signoz-admin/VerbCreate/TypeUser
entry that panicked at startup.

* chore: regenerate openapi spec for coretypes -> authtypes type moves

* chore(frontend): regenerate API client for Coretypes -> Authtypes type moves

* fix(authz): disambiguate kind→type by relation, preserve multi-part selectors

permissions.type.ts now lists the same kind (dashboard, role,
public-dashboard) under both metaresource and metaresources, so the prior
kind→type map silently overwrote one with the other. Resolve the type
using the requesting relation's allowed types, and slice the selector at
the first colon so multi-part selectors (e.g. id:version) round-trip
correctly. Updates useAuthZ.test.tsx to use the regenerated kind field.

* refactor(authtypes): introduce Relation wrapper over coretypes.Verb

The authz layer modeled relations as raw coretypes.Verb everywhere, which
forced authz-level concepts (action, role-binding) to share a type with
schema-level enumerations. Introduce authtypes.Relation as a thin wrapper
over coretypes.Verb so the authz APIs (CheckWithTupleCreation, ListObjects,
GetObjects, PatchObjects, NewTuples, Transaction.Relation, etc.) can grow
authz-specific affordances without leaking back into coretypes.

Also reshuffles the static coretypes data into dedicated registry_*.go files
(types, kinds, verbs, resources, managed roles) to keep the schema declarations
isolated from the value types they configure.

* refactor(authtypes): expose Relation.Enum() and regenerate openapi spec

Without an Enum() method on Relation the openapi generator emitted an
empty AuthtypesRelation schema (no allowed values). Forward the enum
from the embedded coretypes.Verb so the wire contract is faithful.

* refactor(ee/authz): drop always-nil error returns from managed-role tuple helpers

getManagedRoleGrantTuples and getManagedRoleTransactionTuples never
returned a non-nil error, which the linter (unparam) had flagged. Drop
the unused error return; callers no longer need the err check either.

* chore(frontend): regenerate API client for authtypes.Relation

* fix(authz): satisfy go-lint — keyed Relation literal, drop redundant Verb selector

* refactor(coretypes): sync Kinds slice with full registry_kind declarations

* feat(coretypes): register metaresource and metaresources for all new kinds

Adds 21 metaresource and 21 metaresources entries (covering notification-channel,
route-policy, apdex-setting, auth-domain, session, cloud-integration,
cloud-integration-service, ingestion-key, ingestion-limit, pipeline,
user-preference, org-preference, quick-filter, ttl-setting, rule,
planned-maintenance, saved-view, trace-funnel, factor-password, factor-api-key,
license) so the authz schema covers every resource Kind declared in
registry_kind. Regenerates the static frontend permissions.type.ts to match.

* feat(coretypes): populate ManagedRoleToTransactions from signozapiserver routes

Enumerates every (verb, resource) tuple each managed role holds, derived
from the AdminAccess/EditAccess/ViewAccess middleware on routes in
pkg/apiserver/signozapiserver and the legacy http_handler in
pkg/query-service/app. Admin gets 123 transactions, editor 53, viewer 25,
anonymous keeps the single public-dashboard read.

* feat(coretypes): add integration kind with full CRUD for viewer/editor/admin

Install/uninstall/list integration routes (legacy /api/v1/integrations) all
sit behind ViewAccess, so every authenticated role gets the full CRUD
surface on (metaresource, integration) and (metaresources, integration).
Regenerates the static frontend permissions.type.ts to match.

* feat(coretypes): add subscription kind alongside license, document LCRUD shape

License covers the in-product license resource (Activate/Refresh/GetActive).
Subscription is the billing lifecycle (checkout/portal/billing) served by
ee/query-service routes. Both are admin-only and modeled with a uniform
LCRUD shape; comments call out which verbs actually map to routes versus
which are placeholders for shape parity (e.g. cancellation flows through
Stripe's portal, not an in-process delete).

* feat(coretypes): model telemetryresource for logs, traces, metrics

Mirrors the telemetryresource type from ee/authz/openfgaschema/base.fga
into coretypes: a read-only Type with three Kinds (logs, traces, metrics)
matching telemetrytypes.Signal. Selector is wildcard-only for v1; future
work can narrow per-service or per-environment when the use case lands.
Every managed role (admin/editor/viewer) gets read on each signal,
matching the schema's role#assignee grant. Anonymous stays unchanged.
Regenerates the static frontend permissions.type.ts.

* feat(coretypes): add audit-logs and meter-metrics kinds under telemetryresource

Audit logs (signal=logs, source=audit) and meter metrics (signal=metrics,
source=meter) are sensitive source-qualified telemetry streams that don't
belong under the broad read-grant every role gets on regular logs/traces/
metrics. Modeled as distinct Kinds so they can be permissioned
independently. Admin-only read for now; widen on explicit ask (e.g. an
auditor flow that needs viewer access to audit-logs). Regenerates the
static frontend permissions.type.ts.

* feat(coretypes): add logs-field and traces-field kinds for stored field config

GET/POST /logs/fields and /api/v2/traces/fields manage stored, mutable
field metadata (indexed/promoted columns) over each signal. They're
configuration, not telemetry data, so they sit under metaresource rather
than telemetryresource. Viewer reads, editor/admin update; no
create/delete since POST overwrites. Plural prefix (logs-field /
traces-field) matches the signal naming.

* chore(frontend): regenerate permissions.type.ts to match generate authz output

* feat(authz): add attach permissions to fga model

* fix(tests): use role permissions instead of dashboards

* fix(authz): couple of issues with register flow

* fix(authz): public dashboard read should be anomymous

* fix(tests): integration test for public dashboard access

---------

Co-authored-by: vikrantgupta25 <vikrant@signoz.io>
2026-05-05 19:13:09 +05:30
Piyush Singariya
584bf2fe74 chore: add json enabled as feature flag for FE (#11050)
* chore: add json enabled as feature flag for FE

* fix: still using global bool

* feat: flagger integration in flow

* fix: flagger threaded into tests

* test: removed nil checks

* fix: minor changes

* chore: rename field

* chore: remove querybuilder helper

* fix: unit tests

* fix: correct env var

* fix: lint fix

* fix: lint

* chore: replace flag
2026-04-28 15:29:46 +00:00
Pandey
0648cd4e18 feat(audit): add telemetry audit query infrastructure (#10811)
* feat(audit): add telemetry audit query infrastructure

Add pkg/telemetryaudit/ with tables, field mapper, condition builder,
and statement builder for querying audit logs from signoz_audit database.
Add SourceAudit to source enum and integrate audit key resolution
into the metadata store.

* chore: address review comments

Comment out SourceAudit from Enum() until frontend is ready.
Use actual audit table constants in metadata test helpers.

* fix(audit): align field mapper with actual audit DDL schema

Remove resources_string (not in audit table DDL).
Add event_name as intrinsic column.
Resource context resolves only through the resource JSON column.

* feat(audit): add audit field value autocomplete support

Wire distributed_tag_attributes_v2 for signoz_audit into the
metadata store. Add getAuditFieldValues() and route SignalLogs +
SourceAudit to it in GetFieldValues().

* test(audit): add statement builder tests

Cover all three request types (list, time series, scalar) with
audit-specific query patterns: materialized column filters, AND/OR
conditions, limit CTEs, and group-by expressions.

* refactor(audit): inline field key map into test file

Remove test_data.go and inline the audit field key map directly
into statement_builder_test.go with a compact helper function.

* style(audit): move column map to const.go, use sqlbuilder.As in metadata

Move logsV2Columns from field_mapper.go to const.go to colocate all
column definitions. Switch getAuditKeys() to use sb.As() instead of
raw string formatting. Fix FieldContext alignment.

* fix(audit): align table names with schema migration

Migration uses logs/distributed_logs (not logs_v2/distributed_logs_v2).
Rename LogsV2TableName to LogsTableName and LogsV2LocalTableName to
LogsLocalTableName to match the actual signoz_audit DDL.

* feat(audit): add integration test fixture for audit logs

AuditLog fixture inserts into all 5 signoz_audit tables matching
the schema migration DDL: distributed_logs (no resources_string,
has event_name), distributed_logs_resource, distributed_tag_attributes_v2,
distributed_logs_attribute_keys, distributed_logs_resource_keys.

* fix(audit): rename tag_attributes_v2 to tag_attributes

Migration uses tag_attributes/distributed_tag_attributes (no _v2
suffix). Rename constants and update all references including the
integration test fixture.

* feat(audit): wire audit statement builder into querier

Add auditStmtBuilder to querier struct and route LogAggregation
queries with source=audit to it in all three dispatch locations
(main query, live tail, shiftedQuery). Create and wire the full
audit query stack in signozquerier provider.

* test(audit): add integration tests for audit log querying

Cover the documented query patterns: list all events, filter by
principal ID, filter by outcome, filter by resource name+ID,
filter by principal type, scalar count for alerting, and
isolation test ensuring audit data doesn't leak into regular logs.

* fix(audit): revert sb.As in getAuditKeys, fix fixture column_names

Revert getAuditKeys to use raw SQL strings instead of sb.As() which
incorrectly treated string literals as column references. Add explicit
column_names to all ClickHouse insert calls in the audit fixture.

* fix(audit): remove debug assertion from integration test

* feat(audit): internalize resource filter in audit statement builder

Build the resource filter internally pointing at
signoz_audit.distributed_logs_resource. Add LogsResourceTableName
constant. Remove resourceFilterStmtBuilder from constructor params.
Update test expectations to use the audit resource table.

* fix(audit): rename resource.name to resource.kind, move to resource attributes

Align with schema change from SigNoz/signoz#10826:
- signoz.audit.resource.name renamed to signoz.audit.resource.kind
- resource.kind and resource.id moved from event attributes to OTel
  Resource attributes (resource JSON column)
- Materialized columns reduced from 7 to 5 (resource.kind and
  resource.id no longer materialized)

* refactor(audit): use pytest.mark.parametrize for filter integration tests

Consolidate filter test functions into a single parametrized test.
6/8 tests passing; resource kind+ID filter and scalar count need
further investigation (resource filter JSON key extraction with
dotted keys, scalar response format).

* fix(audit): add source to resource filter for correct metadata routing

Add source param to telemetryresourcefilter.New so the resource
filter's key selectors include Source when calling GetKeysMulti.
Without this, audit resource keys route to signoz_logs metadata
tables instead of signoz_audit. Fix scalar test to use table
response format (columns+data, not rows).

* refactor(audit): reuse querier fixtures in integration tests

Add source param to BuilderQuery and build_scalar_query in the
querier fixture. Replace custom _build_audit_query and
_build_audit_ts_query helpers with BuilderQuery and
build_scalar_query from the shared fixtures.

* refactor(audit): remove wrapper helpers, inline make_query_request calls

Remove _query_audit_raw and _query_audit_scalar helpers. Use
make_query_request, BuilderQuery, and build_scalar_query directly.
Compute time window at test execution time via _time_window() to
avoid stale module-level timestamps.

* refactor(audit): inline _time_window into test functions

* style(audit): use snake_case for pytest parametrize IDs

* refactor(audit): inline DEFAULT_ORDER using build_order_by

Use build_order_by from querier fixtures instead of OrderBy/
TelemetryFieldKey dataclasses. Allow BuilderQuery.order to accept
plain dicts alongside OrderBy objects.

* refactor(audit): inline all data setup, use distinct scenarios per test

Remove _insert_standard_audit_events helper. Each test now owns its
data: list_all uses alert-rule/saved-view/user resource types,
scalar_count uses multiple failures from different principals (count=2),
leak test uses a single organization event. Parametrized filter tests
keep the original 5-event dataset.

* fix(audit): remove silent empty-string guards in metadata store

Remove guards that silently returned nil/empty when audit DB params
were empty. All call sites now pass real constants, so misconfiguration
should fail loudly rather than produce silent empty results.

* style(audit): remove module docstring from integration test

* style: formatting fix in tables file

* style: formatting fix in tables file

* fix: add auditStmtBuilder nil param to querier_test.go

* fix: fix fmt
2026-04-09 08:12:32 +00:00
Naman Verma
349b98c073 fix: check on type instead of temporality for non existent metrics (#10871)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix: show warning for non-existent cost meter metrics

* chore: lint fix by removing unused list

* chore: py fmt add new line

* fix: missing metric check on type instead of temporality

* test: fix unit tests by mocking type data

* test: unit tests

* revert: revert changes from meter branch

* revert: revert changes from meter branch

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-08 02:28:39 +00:00
Srikanth Chekuri
b5b89bb678 fix: several panic errors (#10817)
* fix: several panic errors

* fix: add recover for dashboard sql
2026-04-04 10:38:42 +00:00
Pandey
b5eab118dd refactor: move and internalize resource filter statement builder (#10824)
* refactor: move resourcefilter to pkg/telemetryresourcefilter

Move pkg/querybuilder/resourcefilter to pkg/telemetryresourcefilter
to align with the existing telemetry package naming convention
(telemetrylogs, telemetrytraces, telemetrymetrics, telemetrymeter).
The resource filter is a statement builder, not a query builder utility.

* refactor: internalize resource filter construction in statement builders

Each telemetry statement builder (logs, traces) now creates its own
resource filter internally instead of receiving it as an injected
dependency. This makes it impossible to wire the wrong resource table
and simplifies the provider.

Delete telemetryresourcefilter/tables.go — each telemetry package now
owns its resource table constant (LogsResourceV2TableName in
telemetrylogs, TracesResourceV3TableName in telemetrytraces).

* refactor: create field mapper and condition builder inside resource filter New

Remove fieldMapper and conditionBuilder params from
telemetryresourcefilter.New — they are always the same
(NewFieldMapper + NewConditionBuilder) so create them internally.
2026-04-04 10:27:46 +00:00
Naman Verma
65402ca367 fix: warning instead of error for dormant metrics in query range API (#10737)
* fix: warning instead of error for dormant metrics in query range API

* fix: add missing else

* fix: keep track of present aggregations

* fix: note present aggregation after type is set

* test: integration test fix and new test

* chore: lint errors

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-02 07:25:29 +00:00
Vikrant Gupta
bad80399a6 feat(serviceaccount): integrate service account (#10681)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat(serviceaccount): integrate service account

* feat(serviceaccount): integrate service account with better types

* feat(serviceaccount): fix lint and testing changes

* feat(serviceaccount): update integration tests

* feat(serviceaccount): fix formatting

* feat(serviceaccount): fix openapi spec

* feat(serviceaccount): update txlock to immediate to avoid busy snapshot errors

* feat(serviceaccount): add restrictions for factor_api_key

* feat(serviceaccount): add restrictions for factor_api_key

* feat: enabled service account and deprecated API Keys (#10715)

* feat: enabled service account and deprecated API Keys

* feat: deprecated API Keys

* feat: service account spec updates and role management changes

* feat: updated the error component for roles management

* feat: updated test case

* feat: updated the error component and added retries

* feat: refactored code and added retry to happend 3 times total

* feat: fixed feedbacks and added test case

* feat: refactored code and removed retry

* feat: updated the test cases

---------

Co-authored-by: SagarRajput-7 <162284829+SagarRajput-7@users.noreply.github.com>
2026-04-01 07:20:59 +00:00
Vikrant Gupta
2163e1ce41 chore(lint): enable godot and staticcheck (#10775)
* chore(lint): enable godot and staticcheck

* chore(lint): merge main and fix new lint issues in main
2026-03-31 09:11:49 +00:00
Nityananda Gohain
e4f0d026c1 feat: time aware dynamic field mapper (#9669)
Some checks failed
Release Drafter / update_release_draft (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
* feat: time aware dynamic field mapper

* fix: tests

* fix: update fetch code

* fix: update comment

* fix: remove goroutine

* fix: update tests

* fix: minor changes

* fix: minor changes

* fix: use proper cache

* fix: use orgId properly

* fix: aggregation

* fix: comments

* fix: test

* fix: lint issues

* fix: minor changes

* fix: address changes and add tests

* fix: use name from evolutions

* fix: make the evolution code reusable and propagte context properly

* fix: tests

* fix: update the evolution metadata table

* fix: lint issues

* fix: address cursor comments

* fix: tests

* fix: tests

* fix: changes

* fix: refactor code

* fix: more changes

* fix: clean up evolution logic

* fix: fetch keys at the start

* fix: more changes

* fix: structural changes

* fix: add api

* fix: minor cleanup

* fix: update query in adjust keys

* fix: edgecases

* fix: update conditionfor

* fix: address comments

* fix: revert commented test

* fix: minor refactoring and addressing comments

* fix: evolution metadata

* fix: more fixes

* fix: add support for version in the evolutions

* fix: final cleanup

* fix: copy slice

* fix: address comments

* fix: address comments

* fix: field_mapper

* chore: add integration tests

* fix: lint

* fix: lint

* chore: integration test for materialized

* chore: add remaining changes

* chore: fix lint in integration tests

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-03-28 05:12:49 +00:00
Pandey
234716df53 fix(querier): return proper HTTP status for PromQL timeout errors (#10689)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(querier): return proper HTTP status for PromQL timeout errors

PromQL queries hitting the context deadline were incorrectly returning
400 Bad Request with "invalid_input" because enhancePromQLError
unconditionally wrapped all errors as TypeInvalidInput. Extract
tryEnhancePromQLExecError to properly classify timeout, cancellation,
and storage errors before falling through to parse error handling.

Also make the PromQL engine timeout configurable via prometheus.timeout
config (default 2m) instead of hardcoding it.

* chore: refactor files

* fix(prometheus): validate timeout config and fix test setups

Add validation in prometheus.Config to reject zero timeout. Update all
test files to explicitly set Timeout: 2 * time.Minute in prometheus.Config
literals to avoid immediate query timeouts.
2026-03-24 13:32:45 +00:00
Nityananda Gohain
dde7c79b4d fix: prevent duplicate and incorrect results from trace_summary timerange override in list view (#10637)
* fix: updated implementation for using trace summary in list view

* chore: move trace optimisation outside of statement builder

* fix: lint issues

* chore: update comments in integration tests

* chore: remove unnecessary test

* fix: py-fmt
2026-03-24 08:04:28 +00:00
Pandey
b811991f9d feat(middleware): add panic recovery middleware (#10666)
* feat(middleware): add panic recovery middleware with TypeFatal error type

Add a global HTTP recovery middleware that catches panics, logs them
with OTel exception semantic conventions via errors.Attr, and returns
a safe user-facing error response. Introduce TypeFatal/CodeFatal for
unrecoverable failures and WithStacktrace to attach pre-formatted
stack traces to errors. Remove redundant per-handler panic recovery
blocks in querier APIs.

* style(errors): keep WithStacktrace call on same line in test

* fix(middleware): replace fmt.Errorf with errors.New in recovery test

* feat(middleware): add request context to panic recovery logs

Capture request body before handler runs and include method, path, and
body in panic recovery logs using OTel semconv attributes. Improve error
message to direct users to GitHub issues or support.
2026-03-23 06:25:26 +00:00
Pandey
95ed125bd9 feat(instrumentation): add OTel exception semantic convention log handler (#10665)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat(instrumentation): add OTel exception semantic convention log handler

Add a loghandler.Wrapper that enriches error log records with OpenTelemetry
exception semantic convention attributes (exception.type, exception.code,
exception.message, exception.stacktrace).

- Add errors.Attr() helper for standardized error logging under "exception" key
- Add exception log handler that replaces raw error attrs with structured group
- Wire exception handler into the instrumentation SDK logger chain
- Remove LogValue() from errors.base as the handler now owns structuring

* refactor: replace "error", err with errors.Attr(err) across codebase

Migrate all slog error logging from ad-hoc "error", err key-value pairs
to the standardized errors.Attr(err) helper, enabling the exception log
handler to enrich these logs with OTel semantic convention attributes.

* refactor: enforce attr-only slog style across codebase

Change sloglint from kv-only to attr-only, requiring all slog calls to
use typed attributes (slog.String, slog.Any, etc.) instead of key-value
pairs. Convert all existing kv-style slog calls in non-excluded paths.

* refactor: tighten slog.Any to specific types and standardize error attrs

- Replace slog.Any with slog.String for string values (action, key, where_clause)
- Replace slog.Any with slog.Uint64 for uint64 values (start, end, step, etc.)
- Replace slog.Any("err", err) with errors.Attr(err) in dispatcher and segment analytics
- Replace slog.Any("error", ctx.Err()) with errors.Attr in factory registry

* fix(instrumentation): use Unwrapb message for exception.message

Use the explicit error message (m) from Unwrapb instead of
foundErr.Error(), which resolves to the inner cause's message
for wrapped errors.

* feat(errors): capture stacktrace at error creation time

Store program counters ([]uintptr) in base errors at creation time
using runtime.Callers, inspired by thanos-io/thanos/pkg/errors. The
exception log handler reads the stacktrace from the error instead of
capturing at log time, showing where the error originated.

* fix(instrumentation): apply default log wrappers uniformly in NewLogger

Move correlation, filtering, and exception wrappers into NewLogger so
all call sites (including CLI loggers in cmd/) get them automatically.

* refactor(instrumentation): remove variadic wrappers from NewLogger

NewLogger no longer accepts arbitrary wrappers. The core wrappers
(correlation, filtering, exception) are hardcoded, preventing callers
from accidentally duplicating behavior.

* refactor: migrate remaining "error", <var> to errors.Attr across legacy paths

Replace all remaining "error", <variable> key-value pairs with
errors.Attr(<variable>) in pkg/query-service/ and ee/query-service/
paths that were missed in the initial migration due to non-standard
variable names (res.Err, filterErr, apiErrorObj.Err, etc).

* refactor(instrumentation): use flat exception.* keys instead of nested group

Use flat keys (exception.type, exception.code, exception.message,
exception.stacktrace) instead of a nested slog.Group in the exception
log handler.
2026-03-22 04:06:31 +00:00
Piyush Singariya
0e5a128325 refactor: consolidate body column for JSON logs (#10325)
* feat: has JSON QB

* fix: tests expected queries and values

* fix: ignored .vscode in gitignore

* fix: tests GroupBy

* revert: gitignore change

* fix: build json plans in metadata

* fix: empty filteredArrays condition

* fix: tests

* fix: tests

* fix: json qb test fix

* fix: review based on tushar

* fix: changes based on review from Srikanth

* fix: remove unnecessary bool checking

* fix: removed comment

* fix: merge json body columns together

* chore: var renamed

* fix: merge conflict

* test: fix

* fix: tests

* fix: go test flakiness

* chore: merge json fields

* fix: handle datatype collision

* revert: few unrelated changes

* revert: more unrelated change

* test: blocked on pr #10153

* feat: mapping body_v2.message:string map to body

* fix: go.mod required changes

* fix: remove unused function

* fix: test fixed

* fix: go mod changes

* fix: tests

* fix: go lint

* revert: remvoing unused function

* revert: change ReadMultiple is needed

* fix: body.message not being mapped correctly

* fix: append warnings from fieldkeys

* fix: change warning to a const to fix tests

* chore: addressing comments from Nitya

* chore: remove unnecessary change

* fix: shift warning attachment to getKeySelectors

* fix: lint error

* feat: update message as typehint in JSON Column (#10545)

* fix: cursor comments

* chore: minor changes based on review

* fix: message field key search in JSON Logs (#10577)

* feat: work in progress

* fix: test run success

* fix: in progress

* fix: excluding message from metadata fetch

* test: cleared

* fix: key name in metadata

* fix: uncomment tests

* chore: change to method for staticfields

* fix: remove confusing comments; remove usage of logical keyword

* chore: shift method above business logic

* chore: changes based on review

* fix: comments in metadata_store.go

* fix: fallback expr switch case

* revert: remove unused JSON Field datatype

* fix: remove the exception checking

* chore: keep message contained to field mapper

* chore: text search tests

* fix: package test fixed

* fix: redundant code block removal

* fix: retain staticfield implementation and spell fix

* fix: nil param lint

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
Co-authored-by: Nityananda Gohain <nityanandagohain@gmail.com>
2026-03-18 10:48:17 +00:00
Naman Verma
24b72084ac fix: return not-found error with diagnostic info for absent metrics (#10560)
* fix: check for metric type without query range constraint

* revert: revert check for metric type without query range constraint

* chore: move temporality+type fetcher to the case where it is actually used

* fix: don't send absent metrics to query builder

* chore: better package import name

* test: unit test add mock for metadata call (which is expected in the test's scenario)

* revert: revert seeding of absent metrics

* fix: throw a not found err if metric data is missing

* test: unit test add mock for metadata call (which is expected in the test's scenario)

* revert: no need for special err handling in threshold rule

* chore: add last seen info in err message

* test: fix broken dashboard test

* test: integration test for short time range query

* chore: python lint issue
2026-03-17 16:15:32 +00:00
Naman Verma
51967c527f Upgrade prometheus/common and prometheus/prometheus to latest available version (#10467)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: upgrade prometheus/common to latest available version

* chore: upgrade prometheus/prometheus to latest available version

* chore: easy changes first

* chore: slightly unsure changes

* fix: correct imported version of semconv in sdk.go

* test: ut fix, just matched expected and actual nothing else

* test: ut fix, just matched expected and actual nothing else

* test: ut fix, just matched expected and actual nothing else

* test: ut fix, just matched expected and actual nothing else

* test: ut fix, pass no nil prometheus registry

* chore: upgrade go version in dockerfile to 1.25

* chore: no need for our own alert store callback

* chore: 1.25 bullseye is still an rc so shifting to bookworm

* fix: parallel calls for each query in readmultiple method

* chore: remove unused var

* Sync PagerDuty frontend defaults with Alertmanager v0.31

Applied via @cursor push command

* chore: make ctx the first param

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2026-03-10 05:09:05 +00:00
Nityananda Gohain
3a82085eb4 chore: enrich clickhouse log_comment (#10446)
* chore: enchance clickhouse log_comment

* chore: enrich all clickhouse queries

* fix: remove is_meatadata

* fix: use the new func

* fix: use semconv keys

* fix: add more details

* fix: minor changes

* fix: address comments

* fix: address cursor comments

* fix: minor changes

* fix: addressed comments

* fix: address comments and move to module functions where possible

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-03-06 12:57:15 +00:00
Naman Verma
0e1bb5fd91 fix: exclude internal attributes from promQL results (#10465)
* fix: exclude internal attributes from promQL results

* fix: __name__ can stay
2026-03-05 13:07:06 +05:30
Naman Verma
396cf3194e feat: add support for count based aggregation in histogram metrics (#10355)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
2026-02-25 17:03:06 +00:00
Pandey
92b07d15ea chore: register querier routes in apiserver (#10370) 2026-02-20 07:08:48 +00:00
Srikanth Chekuri
5c86b80682 chore: add OpenAPI spec for /v5/query_range (#10239)
Some checks failed
build-staging / staging (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
2026-02-18 20:21:37 +00:00
Naman Verma
3726c0aac1 feat: add support for simultaneous delta and cumulative temporality (#10202) 2026-02-12 06:36:07 +00:00
Vikrant Gupta
e699ad8122 fix(meter): custom step intervals for meter aggregations (#10255)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(meter): custom step interval support

* fix(meter): custom step interval support

* fix(meter): custom step interval support

* fix(meter): custom step interval support

* fix(meter): remove frontend harcoding for step interval

* fix(meter): remove frontend harcoding for step interval
2026-02-10 16:54:14 +05:30
Piyush Singariya
7274d51236 feat: has function support New JSON QB (#10050)
* feat: has JSON QB

* fix: tests expected queries and values

* fix: ignored .vscode in gitignore

* fix: tests GroupBy

* revert: gitignore change

* fix: build json plans in metadata

* fix: empty filteredArrays condition

* fix: tests

* fix: tests

* fix: json qb test fix

* fix: review based on tushar

* fix: changes based on review from Srikanth

* fix: remove unnecessary bool checking

* fix: removed comment

* chore: var renamed

* fix: merge conflict

* test: fix

* fix: tests

* fix: go test flakiness

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-01-29 14:53:54 +05:30
Pandey
1c4dfc931f chore: move to clone instead of json marshal (#10076) 2026-01-23 16:34:30 +00:00
Srikanth Chekuri
14011bc277 fix: do not sort in descending locally if the other is explicitly spe… (#10033) 2026-01-20 11:17:08 +00:00
Srikanth Chekuri
ee734cf78c chore: return original error message with hints for invalid promql query (#10034)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
one step towards better experience for https://github.com/SigNoz/signoz/issues/9764
2026-01-20 03:04:59 +05:30
Srikanth Chekuri
0f62a04f92 chore: add query step intervals in response meta (#9954)
* chore: add query step intervals in response meta

* chore: regenerate api spec
2026-01-08 15:04:55 +05:30
Piyush Singariya
b4706743ba chore: JSON Logs Query Experience (#9381) 2026-01-07 20:28:03 +00:00
Karan Balani
f8b3cac191 feat: use flagger package to power feature flags (#9925)
Use the new `flagger` package to power the following features flags in the codebase:
- [x] `use_span_metrics`
- [x] `kafka_span_eval`
- [x] `interpolation_enabled`
2026-01-06 21:33:34 +00:00