signoz/pkg/errors
Karan Balani 0766ab31c0
feat: meter reporter for new billing infra (#11016)
* feat: meter reporter for new billing infra

* feat(meterreporter): simplify code, add metric meters, dry-run zeus call

* feat(meterreporter): add traces meters

* chore: update interval validation to allow min 5 mins interval for testing

* feat: add telemetry for collect and ship durations & improve comments

* feat(meterreporter): sealed-range catch-up and today-partial ticks

* chore: intermediate commit

* feat: improve retention period queries based on workspace ids for logs only for now

* chore: skip meter checkpoint call temporarily

* feat(meterreporter): bootstrap from data floor, emit sentinel zero-readings

* chore: lower HistoricalBackfillDays

* fix(meterreporter): pin retention type

* refactor(meterreporter): remove unused retry config

* refactor: add retentiontypes

* chore: intermediate commit

* feat(meterreporter): add metric and trace meters

* refactor: cleanup comments

* refactor: remove HistoricalBackfillDays

* refactor: move a few things to ee package

* refactor: simplify some sections of tick

* refactor: push meters in batch for each day

* chore: add tracing and logging

* feat: make retention buckets generic

* feat(metercollector): add MeterCollector interface and split type packages

* feat(metercollector/retention): add narrow retention slice loader and SQL helpers

* refactor(meterreporter): wire http collectors

* chore(meterreporter): trim comments

* test(metercollector): add collector coverage

* chore(meterreporter): increase catchup window

* fix: ci lint and flag default value

* refactor(meters): align retention and zeus

* refactor(retention): move ttl types

* refactor(meters): rename platform fee collector

* refactor(meters): add meter constructor

* refactor(meters): add window constructor

* refactor(meters): consolidate zeus meter types

* refactor(meters): centralize meter metadata

* refactor(retention): add getter module

* refactor(retention): consolidate ttl types

* chore: use int64 instead of float64 as meter value

* chore: int64 conversion in clickhouse query too

* chore: error log - make failed meter collection louder

* chore: start sending data to zeus

* chore: add debug statement for logging meter data

* chore: simplify meter query only use org id and retention duration

* chore: remove unused functions from retention module and move sqlbuilder function too

* chore: remove unused code

* chore: switch to info context log for testing

* refactor(meterreporter): consolidate collectors and push origin into source

Replaces six near-duplicate collector packages with two parametrized,
factory-shaped ones: telemetrymetercollector for the ClickHouse-backed
meters (log size, span size, datapoint count) and staticmetercollector
for fixed-value meters (base platform fee). Each meter is now a Config
entry in cmd/enterprise/meter.go, materialized by iterating the factory.

Pushes the catchup floor concept out of the reporter and into each
collector via a new Origin method. Telemetry collectors return per-meter
min(unix_milli) FROM signoz_meter.samples; static collectors return
todayStart. The reporter now computes per-meter next-day-to-report and
only invokes a collector for days at/after its own next, eliminating
the over-emit + dropCheckpointed dance.

Other tightening: typed Meter.MeterName with JSON marshalers; Meter
dimensions built via attribute.Key-based zeustypes.NewDimensions;
license flows into Collect from the reporter (collectors stop fetching
it themselves); providerSettings plumbed into the meterreporter
factory closure for harness-style provider construction.

* refactor(meterreporter): per-collector Origin, simpler tick, semconv metrics

Pushes the catchup-floor concept out of the reporter and into each
collector via MeterCollector.Origin. Telemetry collectors return per-
meter min(unix_milli) FROM signoz_meter.samples; static collectors
return today. The reporter computes per-meter next-reportable-day,
iterates the day-loop globally, and only invokes a collector for days
at/after its own next — eliminating the over-emit + dropCheckpointed
dance entirely.

collectOrg is split into three named helpers: provider.checkpoints
(Zeus call + index), provider.nextDays (per-meter origin + checkpoint
max), and pure backfillRange (start/end clamped to yesterday + cap).
collectOrg itself reads as a five-step recipe.

Provider stores collectors as map[MeterName]MeterCollector keyed by
name; the slice + sort.Slice scaffolding is gone, validation moves
into newProvider. eligibleCollectors and report take the map directly.

Start matches the opaquetokenizer pattern: synchronous select+ticker,
sharder + per-org loop with license check (skipping orgs with no
active license), per-tick span scoped via an IIFE so defer span.End()
fires once per tick. goroutinesWg removed.

Config drops Timeout. CatchupMaxDaysPerTick renamed to MaxBackfillDays.
runPhase renamed to report. telemetryStore injection removed (no
longer used after dataFloor moved into the telemetry collector).

Metrics rebuilt around OTel semconv: signoz.meterreporter.checkpoints,
.reports, .collections, .meters — each bumped on success and failure,
with error.type set on failure via a new errors.TypeAttr helper in
pkg/errors. collections also carries signoz.meter.name.

* refactor(meterreporter): rename base platform fee meter, add metric units

Renames signoz.meter.base.platform.fee to signoz.meter.platform.active.
The new name matches the per-service template signoz.meter.<service>
.active that scales for future per-service billing meters; "active"
fits the billing-eligibility semantic (org's platform subscription
is active for the period) without conflating with operational
liveness conventions like Prometheus's `up`.

Adds UCUM annotated-count units to each reporter counter:
  - signoz.meterreporter.checkpoints  -> {checkpoint}
  - signoz.meterreporter.reports      -> {report}
  - signoz.meterreporter.collections  -> {collection}
  - signoz.meterreporter.meters       -> {meter}

* chore: stop leaking collectors if flag is false and address comments

* fix(meterreporter): correct startup and retention metadata

* fix(meterreporter): recover static meter backfill

* chore: address review comments

* chore: move flag evaluation into reporter

* refactor: fix retention origin for staticmeter collectors

* fix(meterreporter): gate backfill by license day

Replace max_backfill_days with a backfill switch.
Clamp sealed-day catch-up to the license creation day.

Send retention duration dimensions in seconds.

* fix(meterreporter): anchor backfill to license day

* chore: address review comments

* chore: drop unrelated authz schema diff

---------

Co-authored-by: Karan Balani <29383381+balanikaran@users.noreply.github.com>
Co-authored-by: grandwizard28 <vibhupandey28@gmail.com>
2026-05-11 17:47:29 +00:00