* feat: meter reporter for new billing infra
* feat(meterreporter): simplify code, add metric meters, dry-run zeus call
* feat(meterreporter): add traces meters
* chore: update interval validation to allow a minimum interval of 5 minutes for testing
* feat: add telemetry for collect and ship durations & improve comments
* feat(meterreporter): sealed-range catch-up and today-partial ticks
* chore: intermediate commit
* feat: improve retention period queries based on workspace IDs (logs only for now)
* chore: skip meter checkpoint call temporarily
* feat(meterreporter): bootstrap from data floor, emit sentinel zero-readings
* chore: lower HistoricalBackfillDays
* fix(meterreporter): pin retention type
* refactor(meterreporter): remove unused retry config
* refactor: add retentiontypes
* chore: intermediate commit
* feat(meterreporter): add metric and trace meters
* refactor: cleanup comments
* refactor: remove HistoricalBackfillDays
* refactor: move a few things to ee package
* refactor: simplify some sections of tick
* refactor: push meters in batch for each day
* chore: add tracing and logging
* feat: make retention buckets generic
* feat(metercollector): add MeterCollector interface and split type packages
* feat(metercollector/retention): add narrow retention slice loader and SQL helpers
* refactor(meterreporter): wire http collectors
* chore(meterreporter): trim comments
* test(metercollector): add collector coverage
* chore(meterreporter): increase catchup window
* fix: ci lint and flag default value
* refactor(meters): align retention and zeus
* refactor(retention): move ttl types
* refactor(meters): rename platform fee collector
* refactor(meters): add meter constructor
* refactor(meters): add window constructor
* refactor(meters): consolidate zeus meter types
* refactor(meters): centralize meter metadata
* refactor(retention): add getter module
* refactor(retention): consolidate ttl types
* chore: use int64 instead of float64 as meter value
* chore: int64 conversion in clickhouse query too
* chore: error log - make failed meter collection louder
* chore: start sending data to zeus
* chore: add debug statement for logging meter data
* chore: simplify meter query to use only org ID and retention duration
* chore: remove unused functions from retention module and move sqlbuilder function too
* chore: remove unused code
* chore: switch to info context log for testing
* refactor(meterreporter): consolidate collectors and push origin into source
Replaces six near-duplicate collector packages with two parametrized,
factory-shaped ones: telemetrymetercollector for the ClickHouse-backed
meters (log size, span size, datapoint count) and staticmetercollector
for fixed-value meters (base platform fee). Each meter is now a Config
entry in cmd/enterprise/meter.go, materialized by iterating the factory.
Pushes the catchup floor concept out of the reporter and into each
collector via a new Origin method. Telemetry collectors return per-meter
min(unix_milli) FROM signoz_meter.samples; static collectors return
todayStart. The reporter now computes per-meter next-day-to-report and
only invokes a collector for days at/after its own next, eliminating
the over-emit + dropCheckpointed dance.
Other tightening: typed Meter.MeterName with JSON marshalers; Meter
dimensions built via attribute.Key-based zeustypes.NewDimensions;
license flows into Collect from the reporter (collectors stop fetching
it themselves); providerSettings plumbed into the meterreporter
factory closure for harness-style provider construction.
* refactor(meterreporter): per-collector Origin, simpler tick, semconv metrics
Pushes the catchup-floor concept out of the reporter and into each
collector via MeterCollector.Origin. Telemetry collectors return per-
meter min(unix_milli) FROM signoz_meter.samples; static collectors
return today. The reporter computes per-meter next-reportable-day,
iterates the day-loop globally, and only invokes a collector for days
at/after its own next — eliminating the over-emit + dropCheckpointed
dance entirely.
collectOrg is split into three named helpers: provider.checkpoints
(Zeus call + index), provider.nextDays (per-meter origin + checkpoint
max), and pure backfillRange (start/end clamped to yesterday + cap).
collectOrg itself reads as a five-step recipe.
Provider stores collectors as map[MeterName]MeterCollector keyed by
name; the slice + sort.Slice scaffolding is gone, validation moves
into newProvider. eligibleCollectors and report take the map directly.
Start matches the opaquetokenizer pattern: synchronous select+ticker,
sharder + per-org loop with license check (skipping orgs with no
active license), per-tick span scoped via an IIFE so defer span.End()
fires once per tick. goroutinesWg removed.
Config drops Timeout. CatchupMaxDaysPerTick renamed to MaxBackfillDays.
runPhase renamed to report. telemetryStore injection removed (no
longer used after dataFloor moved into the telemetry collector).
Metrics rebuilt around OTel semconv: signoz.meterreporter.checkpoints,
.reports, .collections, .meters — each bumped on success and failure,
with error.type set on failure via a new errors.TypeAttr helper in
pkg/errors. collections also carries signoz.meter.name.
* refactor(meterreporter): rename base platform fee meter, add metric units
Renames signoz.meter.base.platform.fee to signoz.meter.platform.active.
The new name matches the per-service template signoz.meter.<service>
.active that scales for future per-service billing meters; "active"
fits the billing-eligibility semantic (org's platform subscription
is active for the period) without conflating with operational
liveness conventions like Prometheus's `up`.
Adds UCUM annotated-count units to each reporter counter:
- signoz.meterreporter.checkpoints -> {checkpoint}
- signoz.meterreporter.reports -> {report}
- signoz.meterreporter.collections -> {collection}
- signoz.meterreporter.meters -> {meter}
* chore: stop leaking collectors if flag is false and address comments
* fix(meterreporter): correct startup and retention metadata
* fix(meterreporter): recover static meter backfill
* chore: address review comments
* chore: move flag evaluation into reporter
* refactor: fix retention origin for staticmeter collectors
* fix(meterreporter): gate backfill by license day
Replace max_backfill_days with a backfill switch.
Clamp sealed-day catch-up to the license creation day.
Send retention duration dimensions in seconds.
* fix(meterreporter): anchor backfill to license day
* chore: address review comments
* chore: drop unrelated authz schema diff
---------
Co-authored-by: Karan Balani <29383381+balanikaran@users.noreply.github.com>
Co-authored-by: grandwizard28 <vibhupandey28@gmail.com>
* feat(middleware): add panic recovery middleware with TypeFatal error type
Add a global HTTP recovery middleware that catches panics, logs them
with OTel exception semantic conventions via errors.Attr, and returns
a safe user-facing error response. Introduce TypeFatal/CodeFatal for
unrecoverable failures and WithStacktrace to attach pre-formatted
stack traces to errors. Remove redundant per-handler panic recovery
blocks in querier APIs.
* style(errors): keep WithStacktrace call on same line in test
* fix(middleware): replace fmt.Errorf with errors.New in recovery test
* feat(middleware): add request context to panic recovery logs
Capture request body before handler runs and include method, path, and
body in panic recovery logs using OTel semconv attributes. Improve error
message to direct users to GitHub issues or support.
* feat(instrumentation): add OTel exception semantic convention log handler
Add a loghandler.Wrapper that enriches error log records with OpenTelemetry
exception semantic convention attributes (exception.type, exception.code,
exception.message, exception.stacktrace).
- Add errors.Attr() helper for standardized error logging under "exception" key
- Add exception log handler that replaces raw error attrs with structured group
- Wire exception handler into the instrumentation SDK logger chain
- Remove LogValue() from errors.base as the handler now owns structuring
* refactor: replace "error", err with errors.Attr(err) across codebase
Migrate all slog error logging from ad-hoc "error", err key-value pairs
to the standardized errors.Attr(err) helper, enabling the exception log
handler to enrich these logs with OTel semantic convention attributes.
* refactor: enforce attr-only slog style across codebase
Change sloglint from kv-only to attr-only, requiring all slog calls to
use typed attributes (slog.String, slog.Any, etc.) instead of key-value
pairs. Convert all existing kv-style slog calls in non-excluded paths.
* refactor: tighten slog.Any to specific types and standardize error attrs
- Replace slog.Any with slog.String for string values (action, key, where_clause)
- Replace slog.Any with slog.Uint64 for uint64 values (start, end, step, etc.)
- Replace slog.Any("err", err) with errors.Attr(err) in dispatcher and segment analytics
- Replace slog.Any("error", ctx.Err()) with errors.Attr in factory registry
* fix(instrumentation): use Unwrapb message for exception.message
Use the explicit error message (m) from Unwrapb instead of
foundErr.Error(), which resolves to the inner cause's message
for wrapped errors.
* feat(errors): capture stacktrace at error creation time
Store program counters ([]uintptr) in base errors at creation time
using runtime.Callers, inspired by thanos-io/thanos/pkg/errors. The
exception log handler reads the stacktrace from the error instead of
capturing at log time, showing where the error originated.
* fix(instrumentation): apply default log wrappers uniformly in NewLogger
Move correlation, filtering, and exception wrappers into NewLogger so
all call sites (including CLI loggers in cmd/) get them automatically.
* refactor(instrumentation): remove variadic wrappers from NewLogger
NewLogger no longer accepts arbitrary wrappers. The core wrappers
(correlation, filtering, exception) are hardcoded, preventing callers
from accidentally duplicating behavior.
* refactor: migrate remaining "error", <var> to errors.Attr across legacy paths
Replace all remaining "error", <variable> key-value pairs with
errors.Attr(<variable>) in pkg/query-service/ and ee/query-service/
paths that were missed in the initial migration due to non-standard
variable names (res.Err, filterErr, apiErrorObj.Err, etc).
* refactor(instrumentation): use flat exception.* keys instead of nested group
Use flat keys (exception.type, exception.code, exception.message,
exception.stacktrace) instead of a nested slog.Group in the exception
log handler.
* feat: json Body Keys
* feat: telemetry types
* feat: change ExtractBodyPaths
* chore: minor comment change
* chore: func rename, file rename
* chore: change table names
* chore: reflect changes from the overhaul
* test: fixing test 1
* fix: test TestQueryToKeys
* fix: test TestPrepareLogsQuery
* chore: remove db
* chore: go mod
* chore: changes based on review
* chore: changes based on review
* fix: in LIKE operation
* chore: addressed few changes
* revert: test file
* fix: comparison fix
* test: add TestBuildListLogsJSONIndexesQuery
* fix: in test TestBuildListLogsJSONIndexesQuery
* fix: pull promoted paths in single db call
* fix: reducing db calls
* test: fix TestBuildListLogsJSONIndexesQuery
* fix: test TestConditionForJSONBodySearch
* fix: lint try 1
* chore: review changes based on cursor
* fix: use enums only
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
Co-authored-by: Nityananda Gohain <nityanandagohain@gmail.com>
This PR fulfills the requirements of #9069 by:
- Adding a golangci-lint directive (forbidigo) to disallow all fmt.Errorf usages.
- Replacing existing fmt.Errorf instances with structured errors from github.com/SigNoz/signoz/pkg/errors for consistent error classification and lint compliance.
- Verifying lint and build integrity.
## 📄 Summary
- Instead of relying only on JWTs for session management, we are adding a second token system: opaque tokens. This gives the benefits of expiration and revocation.
- Email addresses are now validated with a regex throughout the backend.
- Support has been added for the OIDC protocol.
* feat(access-control): embed openfga in signoz
* feat(authz): rename access control to authz
* feat(authz): fix codeowners and go mod tidy
* feat(authz): fix lint
* feat(authz): update go version and move convertor to instrumentation
* feat(authz): some more lint issues
* feat(authz): some more lint issues
* feat(authz): some more lint issues
* feat(authz): fix more lint issues
* feat(authz): make logger converter interface