Commit Graph

304 Commits

Author SHA1 Message Date
Pandey
7e7d7ab570 feat(global): add mcp_url to global config (#11085)
* feat(global): add mcp_url to global config

Adds an optional mcp_url field to the global config so the frontend can
gate the MCP settings page on its presence. When unset the API returns
"mcp_url": null (pointer + nullable:"true"); when set it emits the
parsed URL as a string.

* feat(global): surface mcp_url in frontend types

Adds mcp_url to the manual GlobalConfigData type and refreshes the
generated OpenAPI client so consumers can read the new field.

* docs(global): use <unset> placeholder for mcp_url example

Matches the style of external_url and ingestion_url above it.

* style(global): separate mcp_url prep from return in GetConfig

Adds a blank line between the nullable-conversion block and the return
statement so the two logical phases read as distinct blocks.

* feat(global): mark endpoint fields as required in the API schema

The backend always emits external_url, ingestion_url and mcp_url on
GET /api/v1/global/config (mcp_url as literal null when unset), so the
JSON keys are always present. Add required:"true" to all three and
regenerate the OpenAPI + frontend client so consumers get non-optional
types.

* revert(global): drop mcp_url from legacy GlobalConfigData type

The legacy hand-written type for the non-Orval getGlobalConfig client
should be left alone; consumers that need mcp_url go through the
generated Orval client.
2026-04-24 10:24:21 +00:00
Nikhil Soni
156459e414 refactor: move waterfall api to module structure with updated response (#10797)
* feat: setup types and interface for waterfall v3

v3 is required for udpating the response json of
the waterfall api. There wont' be any logical change.
Using this requirement as an opportunity to move
waterfall api to provider codebase architecture from
older query-service

* refactor: move type conversion logic to types pkg

* chore: add reason for using snake case in response

* fix: update span.attributes to map of string to any

To support otel format of diffrent types of attributes

* fix: remove unused fields and rename span type

To avoid confusing with otel span

* refactor: convert waterfall api to modules format

* chore: add same test cases as for old waterfall api

* chore: avoid sorting on every traversal

* fix: remove unused fields and rename span type

To avoid confusing with otel span

* fix: rename timestamp to milli for readability

* fix: add timeout to module context

* fix: use typed paramter field in logs

* chore: generate openapi spec for v3 waterfall

* fix: remove timeout since waterfall take longer

* fix: use int16 for status code as per db schema

* fix: update openapi specs

* refactor: break down GetWaterfall method for readability

* chore: avoid returning nil, nil

* refactor: move type creation and constants to types package

- Move DB/table/cache/windowing constants to tracedetailtypes package
- Add NewWaterfallTrace and NewWaterfallResponse constructors in types
- Use constructors in module.go instead of inline struct literals
- Reorder waterfall.go so public functions precede private ones

* refactor: extract ClickHouse queries into a store abstraction

Move GetTraceSummary and GetTraceSpans out of module.go into a
traceStore interface backed by clickhouseTraceStore in store.go.
The module struct now holds a traceStore instead of a raw
telemetrystore.TelemetryStore, keeping DB access separate from
business logic.

* refactor: move error to types as well

* refactor: separate out store calls and computations

* refactor: breakdown GetSelectedSpans for readability

* refactor: return 404 on missing trace and other cleanup

* refactor: use same method for cache key creation

* chore: remove unused duration nano field

* chore: use sqlbuilder in clickhouse store where possible

* refactor: move waterfall traverse logic to types

and extract out auto expanded span calculation

* chore: convert all timestamp to nano for consitancy

* chore: rename waterfall response to gettableX format

* chore: fix method calls in test after refactoring

* refactor: remove unused methods

* chore: fix openapi spec

* chore: better names for methods and vars

* chore: remove caching to match from v2

* chore: update openapi client

* refactor: move selection decision to types

* chore: move types to the top

* refactor: avoid passing the whole telementry store in a module

* refactor: move waterfall constants to module config

* chore: update openapi specs

* chore: update openapi clints
2026-04-24 05:01:30 +00:00
Vikrant Gupta
07e7fcac4b feat(authz): add check API for community build (#11056)
* feat(authz): add check API for community build

* feat(authz): move to types

* feat(authz): fix the role corelations

* feat(authz): fix the role corelations

* fix(authz): single line returns
2026-04-23 17:59:46 +00:00
swapnil-signoz
21ce7a663a feat: adding cloud integration azure types and openapi spec (#11005)
* refactor: moving types to cloud provider specific namespace/pkg

* refactor: separating cloud provider types

* refactor: using upper case key for AWS

* feat: adding cloud integration azure types

* feat: adding azure services

* refactor: updating omitempty tags

* refactor: updating azure integration config

* feat: completing azure types

* refactor: lint issues

* refactor: updating command key

* fix: handle optional connection URL in AWS integration

* refactor: updating strategy struct

* refactor: updating azure blob storage service name

* refactor: update Azure service identifiers

* refactor: updating types
2026-04-23 14:48:04 +00:00
Vikrant Gupta
afe85c48f9 feat(authz): add support for delete role (#11044)
* feat(authz): add support for delete role

* feat(authz): register config and return error on cleanup failure

* feat(authz): take user and serviceaccount DI for assignee checks

* feat(authz): add the example yaml

* feat(authz): move to callbacks instead of DI
2026-04-23 13:25:19 +00:00
Vikrant Gupta
6996d41b01 fix(serviceaccount): status code for deleted service accounts (#11075)
* fix(serviceaccount): status code for deleted service accounts

* fix(authz): plural relation endpoints
2026-04-23 12:53:52 +00:00
Nikhil Mantri
89b755a6b0 feat(infra-monitoring): v2 hosts list api (#10805)
* chore: baseline setup

* chore: endpoint detail update

* chore: added logic for hosts v3 api

* fix: bug fix

* chore: disk usage

* chore: added validate function

* chore: added some unit tests

* chore: return status as a string

* chore: yarn generate api

* chore: removed isSendingK8sAgentsMetricsCode

* chore: moved funcs

* chore: added validation on order by

* chore: updated spec

* chore: nil pointer dereference fix in req.Filter

* chore: added temporalities of metrics

* chore: unified composite key function

* chore: code improvements

* chore: hostStatusNone added for clarity that this field can be left empty as well in payload

* chore: yarn generate api

* chore: return errors from getMetadata and lint fix

* chore: return errors from getMetadata and lint fix

* chore: added hostName logic

* chore: modified getMetadata query

* chore: add type for response and files rearrange

* chore: warnings added passing from queryResponse warning to host lists response struct

* chore: added better metrics existence check

* chore: added a TODO remark

* chore: added required metrics check

* chore: distributed samples table to local table change for get metadata

* chore: frontend fix

* chore: endpoint correction

* chore: endpoint modification openapi

* chore: escape backtick to prevent sql injection

* chore: rearrage

* chore: improvements

* chore: validate order by to validate function

* chore: improved description

* chore: added TODOs and made filterByStatus a part of filter struct

* chore: ignore empty string hosts in get active hosts

* feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956)

* chore: add functionality for showing active and inactive counts in custom group by

* chore: bug fix

* chore: added subquery for active and total count

* chore: ignore empty string hosts in get active hosts

* fix: sinceUnixMilli for determining active hosts compute once per request

* chore: refactor code

* chore: rename HostsList -> ListHosts

* chore: rearrangement

* chore: inframonitoring types renaming

* chore: added types package

* chore: file structure further breakdown for clarity

* chore: comments correction

* chore: removed temporalities

* chore: comments resolve

* chore: added json tag required: true

* chore: added status unauthorized

* chore: remove a defensive nil map check, the function ensure non-nil map when err nil

* chore: make sort stable in case of tiebreaker by comparing composite group by keys

* chore: regen api client for inframonitoring

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:30:33 +00:00
Abhishek Kumar Singh
30d3f754b5 feat: markdown renderer (#10682)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: custom notifiers in alert manager

* chore: lint fixs

* chore: fix email linter

* chore: added tracing to msteamsv2 notifier

* feat: alert manager template to template title and notification body

* chore: updated test name + code for timeout errors

* chore: added utils for using variables with $ notation

* chore: exposed templates for alertmanager types

* feat: added preprocessor for alert templater

* chore: hooked preProcess function in expandTitle and body, added labels and annotations in alertdata

* chore: fix lint issues

* chore: added handling for missing variable used in template

* feat: converted alerttemplater to interface and updated tests

* refactor: added extractCommonKV instead of 2 different functions

* test: fix preprocessor test case

* feat: added support for  and  in templating

* chore: lint fix

* chore: renamed the interface

* chore: added test for missing function

* refactor: test case and sb related changed

* refactor: comments and test improvements

* chore: lint fix

* chore: updated comments

* feat: added basic html markdown templater

* chore: updated newline to markdown format

* feat: slack blockkit renderer using goldmark

* test: added test for html rendering

* feat: integrated slack blockit in markdownrenderer package and removed plaintext format

* chore: updated br with new line in test and logs added

* refactor: review comments

* refactor: lint fixes

* chore: updated licenses for notifiers

* chore: updated email notifier from upstream

* feat: return single templating result from  with flag for template type

* fix: variables with symbols in template

* feat: slack mrkdwn renderer

* feat: custom raw html renderer to escape <no value>

* chore: integrated slack mrkdwn renderer and added NoOp formatter

* chore: removed notifier test files

* fix: concurrent rendering in markdown renderer

* refactor: changes as per internal review

* chore: lint issue

* chore: removed special handling for softline break

* refactor: removed logger as markdown renderer dependency

* refactor: changed markdown renderer from interface to package-level functions

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-22 21:22:45 +00:00
Srikanth Chekuri
e015310b5b chore: add request examples for alert rules (#11023)
* chore: add request examples for alert rules

* chore: address ci
2026-04-22 13:10:52 +00:00
swapnil-signoz
a60f8551dd feat: adding missing AWS regions (#11039)
* feat: adding missing AWS regions

* feat: fe add new AWS regions for Mexico and Asia Pacific

* refactor: adding missing region in migration as well
2026-04-21 19:28:01 +00:00
Piyush Singariya
50b452080f chore: Removal of JSONDataType field (#10925)
* fix: tons of changes

* chore: remove redundent comparison

* ci: tests fixed

* fix: upgraded collector version

* fix: qbtoexpr tests

* fix: go sum

* chore: upgrade collector version v0.144.3-rc.4

* fix: tests

* ci: test fix

* revert: remove db binaries

* test: selectField tests added

* fix: added safeguards in plan generation

* fix: name changed to field_map

* fix: json access plan remval of AvailableTypes

* fix: invalid index usage on terminal condition

* fix: branches should tell missing array types

* fix: comment removed

* fix: issue with FuzzyMatching and API failing

* fix: int64 mapping

* ci: test and lint fix

* fix: test VisitKey

* test: running test for sku

* fix: buildFieldForJSON works

* fix: few minor changes

* fix: refactor tag vs field_key table

* fix: minor changes based on review

* revert: minor variable change

* fix: added more membership testcases

* revert: minor var names reverted

* ci: tests aligned

* fix: indexed expressions
2026-04-21 05:11:50 +00:00
swapnil-signoz
a9b458f1f6 refactor: moving types to cloud provider specific namespace/pkg (#10976)
* refactor: moving types to cloud provider specific namespace/pkg

* refactor: separating cloud provider types

* refactor: using upper case key for AWS
2026-04-20 10:42:13 +00:00
Naman Verma
ac26299c3d docs: perses schema for dashboards (#10609)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* docs: perses schema for dashboards

* chore: no need for Signal type in commons, only used once

* chore: no need for PageSize type in commons, only used once

* chore: rm comment

* chore: remove stub for time series chart

* chore: remove manually written manifest and package

* chore: remove validate file

* chore: no config folder

* chore: no config folder

* chore: no commons (for now)

* feat: validation script

* fix: remove fields from variable specs that are there in ListVariable

* chore: test file with way more examples

* chore: test file with way more examples

* chore: checkpoint for half correct setup

* chore: rearrange specs in package.json

* chore: py script not needed

* chore: rename

* chore: folders in schemas for arranging

* chore: folders in schemas for arranging

* fix: proper composite query schema

* feat: custom time series schema

* chore: comment explaining when to use composite query and when not

* feat: promql example

* chore: remove upstream import

* fix: promql fix

* docs: time series panel schema without upstream ref

* chore: object for visualization section

* docs: bar chart panel schema without upstream ref

* docs: number panel schema without upstream ref

* docs: number panel schema without upstream ref

* docs: pie chart panel schema without upstream ref

* docs: table chart panel schema without upstream ref

* docs: histogram chart panel schema without upstream ref

* docs: list panel schema without upstream ref

* chore: a more complex example

* chore: examples for panel types

* chore: remaining fields file

* fix: no more online validation

* chore: replace yAxisUnit by unit

* chore: no need for threshold prefix inside threshold obj

* chore: remove unimplemented join query schema

* fix: no nesting in context links

* fix: less verbose field names in dynamic var

* chore: actually name every panel as a panel

* chore: common package for panels' repeated definitions

* chore: common package for queries' repeated definitions

* chore: common package for variables' repeated definitions

* fix: functions in formula

* fix: only allow one of metric or expr aggregation in builder query

* fix: datasource in perses.json

* fix: promql step duration schema

* fix: proper type for selectFields

* chore: single version for all schemas

* fix: normalise enum defs

* chore: change attr name to name

* chore: common threshold type

* chore: doc for how to add a panel spec

* feat: textbox variable

* feat: go struct based schema for dashboardv2 with validations and some tests

* fix: go mod fix

* chore: perses folder not needed anymore

* chore: use perses updated/createdat

* fix: builder query validation (might need to revisit, 3 types seems bad)

* chore: go lint fixes

* chore: define constants for enum values

* chore: nil factory case not needed

* chore: nil factory case not needed

* chore: slight rearrange for builder spec readability

* feat: add TimeSeriesChartAppearance

* chore: no omit empty

* chore: span gaps in schema

* chore: context link not needed in plugins

* chore: remove format from threshold with label, rearrange structs

* test: fix unit tests

* chore: refer to common struct

* feat: query type and panel type matching

* test: unit tests improvement first pass

* test: unit tests improvement second pass

* test: unit tests improvement third pass

* test: unit tests improvement fourth pass

* test: unit test for dashboard with sections

* test: unit test for dashboard with sections

* fix: add missing dashboard metadata fields

* chore: go lint fixes

* chore: go lint fixes

* chore: changes for create v2 api

* chore: more info in StorableDashboardDataV2

* chore: diff check in update method

* chore: add required true tag to required fields

* feat: update metadata methods

* chore: go mod tidy

* chore: put id in metadata.name, authtypes for v2

* revert: only the schema for now in this PR

* chore: comment for why v1.DashboardSpec is chosen

* chore: change source to signal in DynamicVariableSpec

* fix: string values for precision option

* feat: literal options for comparison operator

* fix: missing required tag in threshold fields

* chore: use valuer.string for plugin kind enums

* chore: use only TelemetryFieldKey in ListPanelSpec

* chore: simplify variable plugin validation

* fix: do not allow nil panels

* fix: do not allow nil plugin spec

* fix: signal should be an enum not a string

* chore: rearrange enums to separate those with default values

* test: unit tests for invalid enum values

* fix: all enums should have a default value

* refactor: extract UnmarshalBuilderQueryBySignal to deduplicate signal dispatch

* refactor: proper struct for span gaps

* chore: back to normal strings for kind enums

* chore: ticks in err messages

* chore: ticks in err messages

* chore: remove unused struct

* chore: snake case for non-kind enum values

* chore: proper error wrapping

* chore: accept int values in PrecisionOption as fallback

* fix: actually update the plugin from map to custom struct

* feat: disallow unknown fields in plugins

* chore: make enums valuer.string

* chore: proper enum types in constants

* chore: rename value to avoid overriding valuer.string method

* test: db cycle test

* fix: lint fix in some other file

* test: remove collapse info from sections

* test: use testify package

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-20 09:29:03 +00:00
Pandey
61ae49d4ab refactor(ruler): introduce v2 Rule read type and validate uuid on delete (#10997)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(ruler): add Rule v2 read type and rename storage Rule to StorableRule

* refactor(ruler): map GettableRule to Rule before responding on v2 routes

* docs(openapi): regenerate spec with RuletypesRule on v2 rules routes

* docs(frontend): regenerate API clients with RuletypesRuleDTO

* refactor(ruler): validate uuid-v7 on delete rule handler
2026-04-18 12:39:50 +00:00
Pandey
a5e1f71cf6 refactor(ruler): tighten rule and downtime OpenAPI schema (#10995)
* refactor(ruler): add Enum() on AlertType

* refactor(ruler): convert RepeatType and RepeatOn to valuer.String with Enum()

* refactor(ruler): mark required fields on Recurrence

* refactor(ruler): mark required tag on CumulativeSchedule.Type

* refactor(ruler): rename GettablePlannedMaintenance to PlannedMaintenance

* docs: regenerate OpenAPI spec and frontend clients with tightened schema

* refactor(ruler): add PostablePlannedMaintenance input type with Validate

* refactor(ruler): rename EditPlannedMaintenance to Update and GetAll to List

* refactor(ruler): switch Create/Update to *PostablePlannedMaintenance

* refactor(ruler): convert PlannedMaintenance.Id string to ID valuer.UUID

* refactor(ruler): return *PlannedMaintenance from CreatePlannedMaintenance

* docs: regenerate OpenAPI spec and frontend clients for Postable/ID changes

* refactor(ruler): type PlannedMaintenance.Status as MaintenanceStatus enum

* refactor(ruler): type PlannedMaintenance.Kind as MaintenanceKind enum

* refactor(ruler): mark GettableRule.Id required

* refactor(ruler): mark GettableRule.State required

* refactor(ruler): make GettableRule timestamps non-pointer and users nullable

* refactor(ruler): return bare array from v2 ListRules instead of wrapped object

* docs: regenerate OpenAPI spec and frontend clients for schema pass
2026-04-18 09:46:07 +00:00
Pandey
c9610df66d refactor(ruler): move rules and planned maintenance handlers to signozapiserver (#10957)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(ruler): define Ruler and Handler interfaces with signozruler implementation

Expand the Ruler interface with rule management and planned maintenance
methods matching rules.Manager signatures. Add Handler interface for
HTTP endpoints. Implement handler in signozruler wrapping ruler.Ruler,
and update provider to embed *rules.Manager for interface satisfaction.

* refactor(ruler): move eval_delay from query-service constants to ruler config

Replace constants.GetEvalDelay() with config.EvalDelay on ruler.Config,
defaulting to 2m. This removes the signozruler dependency on
pkg/query-service/constants.

* refactor(ruler): use time.Duration for eval_delay config

Match the convention used by all other configs in the codebase.
TextDuration is for preserving human-readable text through JSON
round-trips in user-facing rule definitions, not for internal config.

* refactor(ruler): add godoc comments and spacing to Ruler interface

* refactor(ruler): wire ruler handler through signoz.New and signozapiserver

- Add Start/Stop to Ruler interface for lifecycle management
- Add rulerCallback to signoz.New() for EE customization
- Wire ruler.Handler through Handlers, signozapiserver provider
- Register 12 routes in signozapiserver/ruler.go (7 rules, 5 downtime)
- Update cmd/community and cmd/enterprise to pass rulerCallback
- Move rules.Manager creation from server.go to signoz.New via callback
- Change APIHandler.ruleManager type from *rules.Manager to ruler.Ruler
- Remove makeRulesManager from both OSS and EE server.go

* refactor(ruler): remove old rules and downtime_schedules routes from http_handler

Remove 7 rules CRUD routes and 5 downtime_schedules routes plus their
handler methods from http_handler.go. These are now served by
signozapiserver/ruler.go via handler.New() with OpenAPIDef.

The 4 v1 history routes (stats, timeline, top_contributors,
overall_status) remain in http_handler.go as they depend on
interfaces.Reader and have v2 equivalents already in signozapiserver.

* refactor(ruler): use ProviderFactory pattern and register in factory.Registry

Replace the rulerCallback with rulerProviderFactories following the
standard ProviderFactory pattern (like auditorProviderFactories). The
ruler is now created via factory.NewProviderFromNamedMap and registered
in factory.Registry for lifecycle management. Start/Stop are no longer
called manually in server.go.

- Ruler interface embeds factory.Service (Start/Stop return error)
- signozruler.NewFactory accepts all deps including EE task funcs
- provider uses named field (not embedding) with explicit delegation
- cmd/community passes nil task funcs, cmd/enterprise passes EE funcs
- Remove NewRulerProviderFactories (replaced by callback from cmd/)
- Remove manual Start/Stop from both OSS and EE server.go

* fix(ruler): make Start block on stopC per factory.Service contract

rules.Manager.Start is non-blocking (run() just closes a channel).
Add stopC to provider so Start blocks until Stop closes it, matching
the factory.Service contract used by the Registry.

* refactor(ruler): remove unused RM() accessor from EE APIHandler

* refactor(ruler): remove RuleManager from APIHandlerOpts

Use Signoz.Ruler directly instead of passing it through opts.

* refactor(ruler): add /api/v1/rules/test and mark /api/v1/testRule as deprecated

* refactor(ruler): use binding.JSON.BindBody for downtime schedule decode

* refactor(ruler): add TODOs for raw string params on Ruler interface

Mark CreateRule, EditRule, PatchRule, TestNotification, and DeleteRule
with TODOs to accept typed params instead of raw JSON strings. Requires
changing the storage model since the manager stores raw JSON as Data.

* refactor(ruler): add TODO on MaintenanceStore to not expose store directly

* docs: regenerate OpenAPI spec and frontend API clients with ruler routes

* refactor(ruler): rename downtime_schedules tag to downtimeschedules

* refactor(ruler): add query params to ListDowntimeSchedules OpenAPIDef

Add ListPlannedMaintenanceParams struct with active/recurring fields.
Use binding.Query.BindQuery in the handler instead of raw URL parsing.
Add RequestQuery to the OpenAPIDef so params appear in the OpenAPI spec
and generated frontend client.

* refactor(ruler): add GettableTestRule response type to TestRule endpoint

Define GettableTestRule struct with AlertCount and Message fields.
Use it as the Response in TestRule OpenAPIDef so the generated frontend
client has a proper response type instead of string.

* refactor(ruler): tighten schema with oneOf unions and required fields

Surface the polymorphism in RuleThresholdData and EvaluationEnvelope via
JSONSchemaOneOf (the same pattern as QueryEnvelope), so the generated
TS types are discriminated unions with typed `spec` instead of unknown.
Also mark `alert`, `ruleType`, and `condition` required on PostableRule
so the generated TS types are non-optional for callers.

* refactor(ruler): add Enum() on EvaluationKind, ScheduleType, ThresholdKind

Surface the fixed set of accepted values for these valuer-wrapped kind
types so OpenAPI emits proper string-enum schemas and the generated TS
types become string-literal unions instead of plain string.

* refactor(ruler): mark required fields on nested rule and maintenance types

Surface fields already enforced by Validate()/UnmarshalJSON as required
in the OpenAPI schema so the generated TS types match runtime behavior.

Touches RuleCondition (compositeQuery, op, matchType), RuleThresholdData
(kind, spec), BasicRuleThreshold (name, target, op, matchType),
RollingWindow (evalWindow, frequency), CumulativeWindow (schedule,
frequency, timezone), EvaluationEnvelope (kind, spec), Schedule
(timezone), GettablePlannedMaintenance (name, schedule).

Does not mark server-populated fields (id, createdAt, updatedAt, status,
kind) on GettablePlannedMaintenance required, since the same struct is
reused for request bodies in MaintenanceStore.CreatePlannedMaintenance.

* refactor(ruler): tighten AlertCompositeQuery, QueryType, PanelType schema

Missed in the earlier tightening pass. AlertCompositeQuery.queries,
panelType, queryType are all required for a valid composite query;
QueryType and PanelType are valuer-wrapped with fixed value sets, so
expose them as enums in the OpenAPI schema.

* refactor(ruler): wrap sql.ErrNoRows as TypeNotFound in by-ID lookups

GetStoredRule and GetPlannedMaintenanceByID previously returned bun's
raw Scan error, so a missing ID leaked "sql: no rows in result set" to
the HTTP response with a 500 status. WrapNotFoundErrf converts
sql.ErrNoRows into TypeNotFound so render.Error emits 404 with a stable
`not_found` code, and passes other errors through unchanged.

* refactor(ruler): move migrated rules routes to /api/v2/rules

The 7 rules routes now live at /api/v2/rules, /api/v2/rules/{id}, and
/api/v2/rules/test — served via handler.New with render.Success and
render.Error. The legacy /api/v1/rules paths will be restored in the
query-service http handler in a follow-up so existing clients keep
receiving the SuccessResponse envelope unchanged.

Drop the /api/v1/testRule deprecated alias from signozapiserver; the
original lives on main's http_handler.go and is restored alongside the
other v1 paths.

Downtime schedule routes stay at /api/v1/downtime_schedules — single
track, no legacy restore planned.

* refactor(ruler): restore /api/v1/rules legacy handlers for back-compat

Bring the 7 rule CRUD/test handlers and their router.HandleFunc lines
back to http_handler.go so /api/v1/rules, /api/v1/rules/{id}, and
/api/v1/testRule continue to emit the legacy SuccessResponse envelope.
The v2 versions under signozapiserver are the new home for the render
envelope used by generated clients.

Delegation uses aH.ruleManager (populated from opts.Signoz.Ruler in
NewAPIHandler), so a single ruler.Ruler instance serves both paths — no
second rules.Manager is instantiated.

Downtime schedules stay single-track under signozapiserver; the 5
downtime handlers are not restored.

* docs: regenerate OpenAPI spec and frontend clients for /api/v2/rules

* refactor(ruler): return 201 Created on POST /api/v2/rules

A successful create now responds with 201 Created and the full
GettableRule body, matching REST convention for resource creation.
Regenerates the OpenAPI spec and frontend clients to reflect the new
status code.

* refactor(ruler): restore dropped sorter TODO in legacy listRules

The legacy listRules handler was copied verbatim from main during the
v1 back-compat restore, but an inner blank line and the load-bearing
`// todo(amol): need to add sorter` comment were stripped. Put them
back so the legacy block round-trips cleanly against main.

* refactor(ruler): return 201 Created on POST /api/v1/downtime_schedules

Match the REST convention already applied to POST /api/v2/rules:
successful creates respond with 201 Created. Response body remains
empty (nil); the generated frontend client surface is unchanged since
no response type was declared.

A richer "return the created resource" response body is a separate
follow-up — holding off until the ruletypes naming cleanup lands.

* fix(ruler): signal Healthy only after manager.Start closes m.block

The ruler provider didn't implement factory.Healthy, so the registry
fell back to factory.closedC and marked the service StateRunning the
instant its Start goroutine spawned — before rules.Manager.Start had
closed m.block. /api/v2/healthz therefore returned 200 while rule
evaluation was still gated, and integration tests that POSTed a rule
immediately after the readiness check saw their task goroutines stuck
on <-m.block until the next frequency tick.

Add a healthyC channel and close it inside Start only after
manager.Start returns; implement factory.Healthy so the registry and
/api/v2/healthz wait on the real readiness signal.

* fix: add the withhealthy interface

* fix(ruler): alias legacy RULES_EVAL_DELAY env var in backward-compat

The eval_delay config was moved from query-service constants (read from
RULES_EVAL_DELAY) onto ruler.Config (read via mapstructure from
SIGNOZ_RULER_EVAL__DELAY). That silently broke the legacy env var for
any existing deployment — notably the alerts integration-test fixture
which sets RULES_EVAL_DELAY=0s to let rules evaluate against just-
inserted data. The resulting default 2m delay pushed the query window
far enough back that the fixture's rate spike fell outside it, causing
8 of 24 parametrize cases in 02_basic_alert_conditions.py to fail with
"Expected N alerts to be fired but got 0 alerts".

Add RULES_EVAL_DELAY to mergeAndEnsureBackwardCompatibility alongside
the ~10 other aliased legacy env vars. Emits the standard deprecation
warning and overrides config.Ruler.EvalDelay.
2026-04-18 08:25:16 +00:00
Abhishek Kumar Singh
dfbcd1e0ec feat: AlertManager templater (#10581)
* chore: custom notifiers in alert manager

* chore: lint fixs

* chore: fix email linter

* chore: added tracing to msteamsv2 notifier

* feat: alert manager template to template title and notification body

* chore: updated test name + code for timeout errors

* chore: added utils for using variables with $ notation

* chore: exposed templates for alertmanager types

* feat: added preprocessor for alert templater

* chore: hooked preProcess function in expandTitle and body, added labels and annotations in alertdata

* chore: fix lint issues

* chore: added handling for missing variable used in template

* feat: converted alerttemplater to interface and updated tests

* refactor: added extractCommonKV instead of 2 different functions

* test: fix preprocessor test case

* feat: added support for  and  in templating

* chore: lint fix

* chore: renamed the interface

* chore: added test for missing function

* refactor: test case and sb related changed

* refactor: comments and test improvements

* chore: lint fix

* chore: updated comments

* chore: updated newline to markdown format

* chore: updated br with new line in test and logs added

* refactor: review comments

* refactor: lint fixes

* chore: updated licenses for notifiers

* chore: updated email notifier from upstream

* feat: return single templating result from  with flag for template type

* fix: variables with symbols in template

* chore: removed notifier test files

* refactor: changes as per internal review

* chore: lint issue

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-17 13:00:15 +00:00
Vikrant Gupta
c8099a88c3 feat(member): add v2 reset password token endpoints and show invite expiry status (#10954)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(member): better UX for pending invite users

* fix(member): add integration tests and reuse timezone util

* fix(member): rename deprecated and remove dead files

* fix(member): do not use hypened endpoints

* fix(member): user friendly button text

* fix(member): update the API endpoints and integration tests

* fix(member): simplify handler naming convention

* fix(member): added v2 API for update my password

* fix(member): remove more dead code

* fix(member): fix integration tests

* fix(member): fix integration tests
2026-04-16 19:40:51 +00:00
Pandey
b3da6fb251 refactor(alertmanager): move API handlers to signozapiserver (#10941)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(alertmanager): move API handlers to signozapiserver

Extract Handler interface in pkg/alertmanager/handler.go and move
the implementation from api.go to signozalertmanager/handler.go.

Register all alertmanager routes (channels, route policies, alerts)
in signozapiserver via handler.New() with OpenAPIDef. Remove
AlertmanagerAPI injection from http_handler.go.

This enables future AuditDef instrumentation on these routes.

* fix(review): rename param, add /api/v1/channels/test endpoint

- Rename `am` to `alertmanagerService` in NewHandlers
- Add /api/v1/channels/test as the canonical test endpoint
- Mark /api/v1/testChannel as deprecated
- Regenerate OpenAPI spec

* fix(review): use camelCase for channel orgId json tag

* fix(review): remove section comments from alertmanager routes

* fix(review): use routepolicies tag without hyphen

* chore: regenerate frontend API clients for alertmanager routes

* fix: add required/nullable/enum tags to alertmanager OpenAPI types

- PostableRoutePolicy: mark expression, name, channels as required
- GettableRoutePolicy: change CreatedAt/UpdatedAt from pointer to value
- Channel: mark name, type, data, orgId as required
- ExpressionKind: add Enum() for rule/policy values
- Regenerate OpenAPI spec and frontend clients

* fix: use typed response for GetAlerts endpoint

* fix: add Receiver request type to channel mutation endpoints

CreateChannel, UpdateChannelByID, TestChannel, and TestChannelDeprecated
all read req.Body as a Receiver config. The OpenAPIDef must declare
the request type so the generated SDK includes the body parameter.

* fix: change CreateChannel access from EditAccess to AdminAccess

Aligns CreateChannel endpoint with the rest of the channel mutation
endpoints (update/delete) which all require admin access. This is
consistent with the frontend where notifications are not accessible
to editors.
2026-04-15 17:31:43 +00:00
Nikhil Mantri
6ad2711c7a feat(/fields/*) : introduce a new param metricNamespace in the APIs for prefix match (#10779)
* chore: initial commit

* chore: added metricNamespace as a new param

* chore: go generate openapi, update spec

* chore: frontend yarn generate:api

* chore: added metricnamespace support in /fields/values as well as added integration tests

* chore: corrected comment

* chore: added unit tests for getMetricsKeys and getMeterSourceMetricKeys

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-15 07:26:42 +00:00
swapnil-signoz
dce496d099 refactor: cloud integration modules implementation (#10718)
* feat: adding cloud integration type for refactor

* refactor: store interfaces to use local types and error

* feat: adding sql store implementation

* refactor: removing interface check

* feat: adding updated types for cloud integration

* refactor: using struct for map

* refactor: update cloud integration types and module interface

* fix: correct GetService signature and remove shadowed Data field

* feat: implement cloud integration store

* refactor: adding comments and removed wrong code

* refactor: streamlining types

* refactor: add comments for backward compatibility in PostableAgentCheckInRequest

* refactor: update Dashboard struct comments and remove unused fields

* refactor: split upsert store method

* feat: adding integration test

* refactor: clean up types

* refactor: renaming service type to service id

* refactor: using serviceID type

* feat: adding method for service id creation

* refactor: updating store methods

* refactor: clean up

* refactor: clean up

* refactor: review comments

* refactor: clean up

* feat: adding handlers

* fix: lint and ci issues

* fix: lint issues

* fix: update error code for service not found

* feat: adding handler skeleton

* chore: removing todo comment

* feat: adding frontend openapi schema

* feat: adding module implementation for create account

* fix: returning valid error instead of panic

* fix: module test

* refactor: simplify ingestion key retrieval logic

* feat: adding module implementation for AWS

* refactor: ci lint changes

* refactor: python formatting change

* fix: new storable account func was unsetting provider account id

* refactor: python lint changes

* refactor: adding validation on update account request

* refactor: reverting older tests and adding new tests

* chore: lint changes

* feat: using service account for API key

* refactor: renaming tests and cleanup

* refactor: removing dashboard overview images

* feat: adding service definition store

* chore: adding TODO comments

* feat: adding API for getting connection credentials

* feat: adding openapi spec for the endpoint

* feat: adding tests for credential API

* feat: adding cloud integration config

* refactor: updating test with new env variable for config

* refactor: moving few cloud provider interface methods to types

* refactor: review comments resolution

* refactor: lint changes

* refactor: code clean up

* refactor: removing email domain function

* refactor: review comments and clean up

* refactor: lint fixes

* refactor: review changes

- Added get connected account module method
- Fixed integration tests
- Removed cloud integration store as callback function's param

* refactor: changing wrong dashboard id for EKS definition
2026-04-13 15:16:26 +00:00
Vikrant Gupta
9ec6da5e76 fix(authz): unauthenticated error code for deleted service account (#10878)
* fix(authz): populate correct error for deleted service account

* chore(authz): reduce the regex restrictions on service accounts

* chore(authz): reduce the regex restrictions on service accounts

* fix(authz): populate correct error for deleted service account

* fix(authz): populate correct error for deleted service account
2026-04-13 13:19:25 +00:00
swapnil-signoz
7279c5f770 feat: adding query params in cloud integration APIs (#10900)
Some checks failed
Release Drafter / update_release_draft (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
* feat: adding query params in cloud integration APIs

* refactor: create account HTTP status change from OK to CREATED
2026-04-10 09:20:35 +00:00
Pandey
0648cd4e18 feat(audit): add telemetry audit query infrastructure (#10811)
* feat(audit): add telemetry audit query infrastructure

Add pkg/telemetryaudit/ with tables, field mapper, condition builder,
and statement builder for querying audit logs from signoz_audit database.
Add SourceAudit to source enum and integrate audit key resolution
into the metadata store.

* chore: address review comments

Comment out SourceAudit from Enum() until frontend is ready.
Use actual audit table constants in metadata test helpers.

* fix(audit): align field mapper with actual audit DDL schema

Remove resources_string (not in audit table DDL).
Add event_name as intrinsic column.
Resource context resolves only through the resource JSON column.

* feat(audit): add audit field value autocomplete support

Wire distributed_tag_attributes_v2 for signoz_audit into the
metadata store. Add getAuditFieldValues() and route SignalLogs +
SourceAudit to it in GetFieldValues().

* test(audit): add statement builder tests

Cover all three request types (list, time series, scalar) with
audit-specific query patterns: materialized column filters, AND/OR
conditions, limit CTEs, and group-by expressions.

* refactor(audit): inline field key map into test file

Remove test_data.go and inline the audit field key map directly
into statement_builder_test.go with a compact helper function.

* style(audit): move column map to const.go, use sqlbuilder.As in metadata

Move logsV2Columns from field_mapper.go to const.go to colocate all
column definitions. Switch getAuditKeys() to use sb.As() instead of
raw string formatting. Fix FieldContext alignment.

* fix(audit): align table names with schema migration

Migration uses logs/distributed_logs (not logs_v2/distributed_logs_v2).
Rename LogsV2TableName to LogsTableName and LogsV2LocalTableName to
LogsLocalTableName to match the actual signoz_audit DDL.

* feat(audit): add integration test fixture for audit logs

AuditLog fixture inserts into all 5 signoz_audit tables matching
the schema migration DDL: distributed_logs (no resources_string,
has event_name), distributed_logs_resource, distributed_tag_attributes_v2,
distributed_logs_attribute_keys, distributed_logs_resource_keys.

* fix(audit): rename tag_attributes_v2 to tag_attributes

Migration uses tag_attributes/distributed_tag_attributes (no _v2
suffix). Rename constants and update all references including the
integration test fixture.

* feat(audit): wire audit statement builder into querier

Add auditStmtBuilder to querier struct and route LogAggregation
queries with source=audit to it in all three dispatch locations
(main query, live tail, shiftedQuery). Create and wire the full
audit query stack in signozquerier provider.

* test(audit): add integration tests for audit log querying

Cover the documented query patterns: list all events, filter by
principal ID, filter by outcome, filter by resource name+ID,
filter by principal type, scalar count for alerting, and
isolation test ensuring audit data doesn't leak into regular logs.

* fix(audit): revert sb.As in getAuditKeys, fix fixture column_names

Revert getAuditKeys to use raw SQL strings instead of sb.As() which
incorrectly treated string literals as column references. Add explicit
column_names to all ClickHouse insert calls in the audit fixture.

* fix(audit): remove debug assertion from integration test

* feat(audit): internalize resource filter in audit statement builder

Build the resource filter internally pointing at
signoz_audit.distributed_logs_resource. Add LogsResourceTableName
constant. Remove resourceFilterStmtBuilder from constructor params.
Update test expectations to use the audit resource table.

* fix(audit): rename resource.name to resource.kind, move to resource attributes

Align with schema change from SigNoz/signoz#10826:
- signoz.audit.resource.name renamed to signoz.audit.resource.kind
- resource.kind and resource.id moved from event attributes to OTel
  Resource attributes (resource JSON column)
- Materialized columns reduced from 7 to 5 (resource.kind and
  resource.id no longer materialized)

* refactor(audit): use pytest.mark.parametrize for filter integration tests

Consolidate filter test functions into a single parametrized test.
6/8 tests passing; resource kind+ID filter and scalar count need
further investigation (resource filter JSON key extraction with
dotted keys, scalar response format).

* fix(audit): add source to resource filter for correct metadata routing

Add source param to telemetryresourcefilter.New so the resource
filter's key selectors include Source when calling GetKeysMulti.
Without this, audit resource keys route to signoz_logs metadata
tables instead of signoz_audit. Fix scalar test to use table
response format (columns+data, not rows).

* refactor(audit): reuse querier fixtures in integration tests

Add source param to BuilderQuery and build_scalar_query in the
querier fixture. Replace custom _build_audit_query and
_build_audit_ts_query helpers with BuilderQuery and
build_scalar_query from the shared fixtures.

* refactor(audit): remove wrapper helpers, inline make_query_request calls

Remove _query_audit_raw and _query_audit_scalar helpers. Use
make_query_request, BuilderQuery, and build_scalar_query directly.
Compute time window at test execution time via _time_window() to
avoid stale module-level timestamps.

* refactor(audit): inline _time_window into test functions

* style(audit): use snake_case for pytest parametrize IDs

* refactor(audit): inline DEFAULT_ORDER using build_order_by

Use build_order_by from querier fixtures instead of OrderBy/
TelemetryFieldKey dataclasses. Allow BuilderQuery.order to accept
plain dicts alongside OrderBy objects.

* refactor(audit): inline all data setup, use distinct scenarios per test

Remove _insert_standard_audit_events helper. Each test now owns its
data: list_all uses alert-rule/saved-view/user resource types,
scalar_count uses multiple failures from different principals (count=2),
leak test uses a single organization event. Parametrized filter tests
keep the original 5-event dataset.

* fix(audit): remove silent empty-string guards in metadata store

Remove guards that silently returned nil/empty when audit DB params
were empty. All call sites now pass real constants, so misconfiguration
should fail loudly rather than produce silent empty results.

* style(audit): remove module docstring from integration test

* style: formatting fix in tables file

* style: formatting fix in tables file

* fix: add auditStmtBuilder nil param to querier_test.go

* fix: fix fmt
2026-04-09 08:12:32 +00:00
Nikhil Soni
6d1d028d4c refactor: setup types and interface for waterfall v3 (#10794)
* feat: setup types and interface for waterfall v3

v3 is required for udpating the response json of
the waterfall api. There wont' be any logical change.
Using this requirement as an opportunity to move
waterfall api to provider codebase architecture from
older query-service

* refactor: move type conversion logic to types pkg

* chore: add reason for using snake case in response

* fix: update span.attributes to map of string to any

To support otel format of diffrent types of attributes

* fix: remove unused fields and rename span type

To avoid confusing with otel span

* chore: rename resources field to follow otel

---------

Co-authored-by: Nityananda Gohain <nityanandagohain@gmail.com>
2026-04-09 05:21:33 +00:00
swapnil-signoz
92660b457d feat: adding types changes and openapi spec (#10866)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat: adding types changes and openapi spec

* refactor: review changes

* feat: generating OpenAPI spec

* refactor: updating create account types

* refactor: removing email domain function
2026-04-08 20:11:49 +00:00
Piyush Singariya
72ce8768b3 chore: align usage of negative operators in JSON Body (#10758)
* feat: enable JSON Path index

* fix: contextual path index usage

* test: fix unit tests

* feat: align negative operators to include other logs

* fix: primitive conditions working

* fix: indexed tests passing

* fix: array type filtering from dynamic arrays

* fix: unit tests

* fix: remove not used paths from testdata

* fix: unit tests

* fix: comment

* fix: indexed unit tests

* fix: array json element comparison

* feat: change filtering of dynamic arrays

* fix: dynamic array tests

* fix: stringified integer value input

* fix: better review for test file

* fix: negative operator check

* ci: lint changes

* ci: lint fix
2026-04-07 15:51:26 +00:00
Vikrant Gupta
918cf4dfe5 fix(authz): improve error messages (#10856) 2026-04-07 08:34:53 +00:00
Srikanth Chekuri
1eb27d0bc4 chore: add schema version specific validations (#10808)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: add schema version specific validations

* chore: address reivew comments

* chore: additional checks

* chore: adrress lint

* chore: more lint
2026-04-06 10:45:41 +00:00
Vikrant Gupta
55cce3c708 feat(authz): accept singular roles for user and service accounts (#10827)
* feat(authz): accept singular roles for user and service accounts

* feat(authz): update integration tests

* feat(authz): update integration tests

* feat: move role management to a single select flow on members and service account pages(temporarily)

* feat(authz): enable stats reporter for service accounts

* feat(authz): identity call for activating/deleting user

---------

Co-authored-by: SagarRajput-7 <sagar@signoz.io>
2026-04-06 10:41:56 +00:00
Srikanth Chekuri
b5b89bb678 fix: several panic errors (#10817)
* fix: several panic errors

* fix: add recover for dashboard sql
2026-04-04 10:38:42 +00:00
Vikrant Gupta
a71006b662 chore(user): add partial index for email,org_id (#10828)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore(user): add partial index for email,org_id

* chore(user): fix integration test fmt

* chore(user): fix integration test lint
2026-04-03 20:27:04 +00:00
SagarRajput-7
363734054f feat: updated user api to v2 and accordingly update members page and role management (#10799)
* feat: updated user api to v2 and accordingly update members page and role management

* feat: updated members page to use new role management and v2 user api

* feat: updated test cases

* feat: code refactor

* feat: refactored code and addressed feedbacks

* feat: refactored code and addressed feedbacks

* feat: refactored code and addressed feedbacks

* fix(user): fix openapi spec

* feat: handle isRoot user and self user cases and added test cases

---------

Co-authored-by: vikrantgupta25 <vikrant@signoz.io>
2026-04-03 18:41:27 +00:00
Pandey
726948db9e feat(audittypes): align types with revised schema doc (#10826)
* feat(audittypes): align types with revised schema doc

Rename resource.name → resource.kind to match Typeable.Kind() rename.
Move resource attributes (kind, id) from event attributes to OTel
Resource, grouping events by target resource in NewPLogsFromAuditEvents.
Add network.protocol.name, network.protocol.version, url.scheme to
transport attributes for complete OTel semconv coverage.

* refactor(audittypes): inline resourceKey struct into function scope

* test(audittypes): add tests for NewPLogsFromAuditEvents

Cover resource grouping: empty input, single event, same resource
batched into one ResourceLogs, different resources split, same kind
with different IDs split, and interleaved events grouped correctly.
Verify resource attrs live on Resource (not event attributes).
2026-04-03 18:18:08 +00:00
Tushar Vats
f71d5bf8f1 fix: added validations for having expression (#10286)
* fix: added validations for having expression

* fix: added extra validation and unit tests

* fix: added antlr based parsing for validation

* fix: added more unit tests

* fix: removed validation on having in range request validations

* fix: generated lexer files and added more unit tests

* fix: edge cases

* fix: added cmnd to scripts for generating lexer

* fix: use std libg sorting instead of selection sort

* fix: support implicit and

* fix: allow bare not in expression

* fix: added suggestion for having expression

* fix: typo

* fix: added more unit tests, handle white space difference in aggregation exp and having exp

* fix: added support for in and not, updated errors

* fix: added support for brackets list

* fix: lint error

* fix: handle non spaced expression

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-04-02 03:52:11 +00:00
Srikanth Chekuri
5abfd0732a chore: remove deprecated v3/v4 support in rules (#10760)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: remove deprecated v3/v4 support in rules

* chore: fix test

* chore: fix logs

* chore: fix logging

* chore: fix ci

* chore: address review comments
2026-04-01 19:48:37 +00:00
swapnil-signoz
3dc0a7c8ce feat: adding aws service definitions in types (#10798)
* feat: adding aws service definitions in types

* refactor: moving definitions to module fs
2026-04-01 14:08:46 +00:00
Pandey
42415e0873 feat(audit): handler-level AuditDef, audit middleware, and response capture (#10791)
* feat(audit): handler-level AuditDef and response-capturing wrapper

Add declarative audit instrumentation to the handler package. Routes
declare an AuditDef alongside OpenAPIDef; the handler automatically
captures the response status/body and emits an audit event via
auditor.Audit() after every request.

* refactor(audit): move audit logic to middleware, merge with logging

Move audit event emission from handler to middleware layer. The handler
package keeps only the AuditDef struct and AuditDefProvider interface.
The logging middleware now handles both request logging and audit event
emission using a single response capture, avoiding double-wrapping.

Rename badResponseLoggingWriter to responseCapture with body capture
on all 4xx/5xx responses (previously only 400 and 5xx).

* refactor(audit): rename Logging middleware to Audit, merge into single file

Delete logging.go and merge its contents into audit.go. Rename
Logging/NewLogging to Audit/NewAudit. The response.go file with
responseCapture is unchanged.

* refactor(audit): extract NewAuditEventFromHTTPRequest factory into audittypes

Move event construction to audittypes.NewAuditEventFromHTTPRequest with
an AuditEventContext struct for caller-provided fields. The audittypes
layer reads only transport fields from *http.Request and has no mux,
authtypes, or context dependencies. The middleware pre-extracts
principal, trace, error, and route fields before calling the factory.

* refactor(audit): move error parsing to render.ErrorFromBody and render.ErrorTypeFromStatusCode

Add render.ErrorFromBody to extract errors.JSON from a JSON-encoded
ErrorResponse body, and render.ErrorTypeFromStatusCode to reverse-map
HTTP status codes to error type strings. The middleware now uses these
instead of local duplicates.

* refactor(audit): move AuditDef onto Handler interface, consolidate files

Move AuditDef() onto the Handler interface directly. All Handler
implementations now carry it: handler returns the configured def,
healthOpenAPIHandler returns nil. Delete the separate AuditDefProvider
interface and audit.go handler file. Move excludedRoutes check before
audit emission so excluded routes skip both logging and audit.

* feat(audit): add option.go with AuditDef, Option, and WithAuditDef

* refactor(audit): decompose AuditEvent into attribute sub-structs, add tests

Decompose flat AuditEvent fields into typed sub-structs
(AuditEventAuditAttributes, PrincipalAttributes, ResourceAttributes,
ErrorAttributes, TransportAttributes) each with a constructor and
Put(pcommon.Map) method. Simplify NewAuditEventFromHTTPRequest to
accept authtypes.Claims and oteltrace IDs directly. Simplify the
middleware caller accordingly.

Add unit tests for the factory, outcome boundary, and principal type
derivation.

* refactor(audit): shorten attribute struct names, drop error message

Rename AuditEventAuditAttributes to AuditAttributes,
AuditEventPrincipalAttributes to PrincipalAttributes, and likewise
for Resource, Error, and Transport. The package prefix already
disambiguates.

Remove ErrorMessage from ErrorAttributes to avoid leaking sensitive
or PII data into audit logs. Error type and code are sufficient for
filtering; investigators can correlate via trace ID.

* fix(audit): update auditorserver test and otlphttp provider for new struct layout

Update newTestEvent in server_test.go to use nested AuditAttributes
and ResourceAttributes. Update otlphttpauditor provider to access
PrincipalOrgID via PrincipalAttributes. Fix godot lint on attribute
section comments.

* fix(audit): fix gjson path in ErrorCodeFromBody, add tests

Fix ErrorCodeFromBody gjson path from "errors.code" to "error.code"
to match the ErrorResponse JSON structure. Add unit tests for valid
error response and invalid JSON cases.

* fix(audit): add CodeUnset, use ErrorCodeFromBody in middleware

Add errors.CodeUnset for responses missing an error code. Update the
audit middleware to use render.ErrorCodeFromBody instead of the removed
render.ErrorFromBody.

* test(audit): add unit tests for responseCapture

Test the four meaningful behaviors: success responses don't capture
body, error responses capture body, large error bodies truncate at
4096 bytes, and 204 No Content suppresses writes entirely.

* fix(audit): check rw.Write return values in response_test.go

* style(audit): rename want prefix to expected in test fields

* refactor(audit): replace Sprintf with strings.Builder in newBody

Handle edge cases where principal email, ID, or resource ID may be
empty. The builder conditionally includes each segment, avoiding
empty parentheses or leading spaces in the audit body.

Add test cases covering all meaningful combinations: success/failure
with full/partial/empty principal, resource ID, and error details.

* chore: fix formatting

* chore: remove json tags

* fix: rebase with main
2026-04-01 10:10:52 +00:00
Vikrant Gupta
bad80399a6 feat(serviceaccount): integrate service account (#10681)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat(serviceaccount): integrate service account

* feat(serviceaccount): integrate service account with better types

* feat(serviceaccount): fix lint and testing changes

* feat(serviceaccount): update integration tests

* feat(serviceaccount): fix formatting

* feat(serviceaccount): fix openapi spec

* feat(serviceaccount): update txlock to immediate to avoid busy snapshot errors

* feat(serviceaccount): add restrictions for factor_api_key

* feat(serviceaccount): add restrictions for factor_api_key

* feat: enabled service account and deprecated API Keys (#10715)

* feat: enabled service account and deprecated API Keys

* feat: deprecated API Keys

* feat: service account spec updates and role management changes

* feat: updated the error component for roles management

* feat: updated test case

* feat: updated the error component and added retries

* feat: refactored code and added retry to happend 3 times total

* feat: fixed feedbacks and added test case

* feat: refactored code and removed retry

* feat: updated the test cases

---------

Co-authored-by: SagarRajput-7 <162284829+SagarRajput-7@users.noreply.github.com>
2026-04-01 07:20:59 +00:00
Pandey
71a13b4818 feat(audit): enterprise auditor with licensing gate and OTLP HTTP export (#10776)
* feat(audit): add enterprise auditor with licensing gate and OTLP HTTP export

Implements the enterprise auditor at ee/auditor/otlphttpauditor/.
Composes auditorserver.Server for batching with licensing gate,
OTel SDK LoggerProvider for InstrumentationScope, and otlploghttp
exporter for OTLP HTTP transport.

* fix(audit): address PR review — inline factory, move body+logrecord to audittypes

- Inline NewFactory closure, remove newProviderFunc
- Move buildBody to audittypes.BuildBody
- Move eventToLogRecord to audittypes.ToLogRecord
- go mod tidy for newly direct otel/log deps

* fix(audit): address PR review round 2

- Make ToLogRecord a method on AuditEvent returning sdklog.Record
- Fold buildBody into ToLogRecord as unexported helper
- Remove accumulatingProcessor and LoggerProvider — export directly
  via otlploghttp.Exporter
- Delete body.go and processor.go

* fix(audit): address PR review round 3

- Merge export.go into provider.go
- Add severity/severityText fields to Outcome struct
- Rename buildBody to setBody method on AuditEvent
- Add appendStringIfNotEmpty helper to reduce duplication

* feat(audit): switch to plog + direct HTTP POST for OTLP export

Replace otlploghttp.Exporter + sdklog.Record with plog data model,
ProtoMarshaler, and direct HTTP POST. This properly sets
InstrumentationScope.Name = "signoz.audit" and Resource attributes
on the OTLP payload.

* fix(audit): adopt collector otlphttpexporter pattern for HTTP export

Model the send function after the OTel Collector's otlphttpexporter:
- Bounded response body reads (64KB max)
- Protobuf-encoded Status parsing from error responses
- Proper response body draining on defer
- Detailed error messages with endpoint URL and status details

* refactor(audit): split export logic into export.go, add throttle retry

- Move export, send, and HTTP response helpers to export.go
- Add exporterhelper.NewThrottleRetry for 429/503 with Retry-After
- Parse Retry-After as delay-seconds or HTTP-date per RFC 7231
- Keep provider.go focused on Auditor interface and lifecycle

* feat(audit): add partial success handler and internal retry with backoff

- Parse ExportLogsServiceResponse on 2xx for partial success, log
  warning if log records were rejected
- Internal retry loop with exponential backoff for retryable status
  codes (429, 502, 503, 504) using RetryConfig from auditor config
- Honour Retry-After header (delay-seconds and HTTP-date)
- Store full auditor.Config on provider struct
- Replace exporterhelper.NewThrottleRetry with local retryableError
  type consumed by the internal retry loop

* fix(audit): fix lint — use pkg/errors, remove stdlib errors and fmt.Errorf

* refactor(audit): use provider as receiver name instead of p

* refactor(audit): clean up enterprise auditor implementation

- Extract retry logic into retry.go
- Move NewPLogsFromAuditEvents and ToLogRecord into event.go
- Add ErrCodeAuditExportFailed to auditor package
- Add version.Build to provider for service.version attribute
- Simplify sendOnce, split response handling into onSuccess/onErr
- Use PrincipalOrgID as valuer.UUID directly
- Use OTLPHTTP.Endpoint as URL type
- Remove gzip compression handling
- Delete logrecord.go

* refactor(audit): use pkg/http/client instead of bare http.Client

Use the standard SigNoz HTTP client with OTel instrumentation.
Disable heimdall retries (count=0) since we have our own
OTLP-aware retry loop that understands Retry-After headers.

* refactor(audit): use heimdall Retriable for retry instead of manual loop

- Implement retrier with exponential backoff from auditor RetryConfig
- Compute retry count from MaxElapsedTime and backoff intervals
- Pass retrier and retry count to pkg/http/client via WithRetriable
- Remove manual retry loop, retryableError type, and Retry-After parsing
- Heimdall handles retries on >= 500 status codes automatically

* refactor(audit): rename retry.go to retrier.go
2026-03-31 13:37:20 +00:00
Vikrant Gupta
2163e1ce41 chore(lint): enable godot and staticcheck (#10775)
* chore(lint): enable godot and staticcheck

* chore(lint): merge main and fix new lint issues in main
2026-03-31 09:11:49 +00:00
Srikanth Chekuri
b9eecacab7 chore: remove v1 metrics explorer code (#10764)
* chore: remove v1 metrics explorer code

* chore: fix ci

* chore: fix tests

* chore: address review comments
2026-03-31 08:46:27 +00:00
Pandey
e41d400aa0 feat(audit): add Auditor interface and rename auditortypes to audittypes (#10766)
Some checks failed
build-staging / prepare (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
* feat(audit): add Auditor interface and rename auditortypes to audittypes

Add the Auditor interface in pkg/auditor/ as the contract between the
HTTP handler layer and audit implementations (noop for community, OTLP
for enterprise). Includes Config with buffer, batch, and flush settings
following the provider pattern.

Rename pkg/types/auditortypes/ to pkg/types/audittypes/ for consistency.

closes SigNoz/platform-pod#1930

* refactor(audit): move endpoint config to OTLPHTTPConfig provider struct

Move Endpoint out of top-level Config into a provider-specific
OTLPHTTPConfig struct with full OTLP HTTP options (url_path, insecure,
compression, timeout, headers, retry). Keep BufferSize, BatchSize,
FlushInterval at top level as common settings across providers.

closes SigNoz/platform-pod#1930

* feat(audit): add auditorbatcher for buffered event batching

Add pkg/auditor/auditorbatcher/ with channel-based batching for audit
events. Flushes when either batch size is reached or flush interval
elapses (whichever comes first). Events are dropped when the buffer is
full (fail-open). Follows the alertmanagerbatcher pattern.

* refactor(audit): replace public channel with Receive method on batcher

Make batchC private and expose Receive() <-chan []AuditEvent as the
read-side API. Clearer contract: Add() to write, Receive() to read.

* refactor(audit): rename batcher config fields to BufferSize and BatchSize

Capacity → BufferSize, Size → BatchSize for clarity.

* fix(audit): single-line slog call and fix log key to buffer_size

* feat(audit): add OTel metrics to auditorbatcher

Add telemetry via OTel MeterProvider with 4 instruments:
- signoz.audit.events.emitted (counter)
- signoz.audit.store.write_errors (counter, via RecordWriteError)
- signoz.audit.events.dropped (counter)
- signoz.audit.events.buffer_size (observable gauge)

Batcher.New() now accepts metric.Meter and returns error.

* refactor(audit): inject factory.ScopedProviderSettings into batcher

Replace separate logger and meter params with ScopedProviderSettings,
giving the batcher access to logger, meter, and tracer from one source.

* feat(audit): add OTel tracing to batcher Add path

Span auditorbatcher.Add with event_name attribute set at creation
and audit.dropped set dynamically on buffer-full drop.

* feat(audit): add export span to batcher Receive path

Introduce Batch struct carrying context, events, and a trace span.
Each flush starts an auditorbatcher.Export span with batch_size
attribute. The consumer ends the span after export completes.

* refactor(audit): replace Batch/Receive with ExportFunc callback

Batcher now takes an ExportFunc at construction and manages spans
internally. Removes Batch struct, Receive(), and RecordWriteError()
from the public API. Span.End() is always called via defer, write
errors and span status are recorded automatically on export failure.
Uses errors.Attr for error logging, prefixes log keys with audit.

* refactor(audit): rename auditorbatcher to auditorserver

Rename package, file (batcher.go → server.go), type (Batcher → Server),
and receiver (b → server) to reflect the full service role: buffering,
batching, metrics, tracing, and export lifecycle management.

* refactor(audit): rename telemetry to serverMetrics and document struct fields

Rename type telemetry → serverMetrics and constructor
newTelemetry → newServerMetrics. Add comments to all Server
struct fields.

* feat(audit): implement ServiceWithHealthy, fix race, add unit tests

- Implement factory.ServiceWithHealthy on Server via healthyC channel
- Fix data race: Start closes healthyC after goroutinesWg.Add(1),
  Stop waits on healthyC before closing stopC
- Add 8 unit tests covering construction, start/stop lifecycle,
  batch-size flush, interval flush, buffer-full drop, drain on stop,
  export failure handling, and concurrent safety

* fix(audit): fix lint issues in auditorserver and tests

Use snake_case for slog keys, errors.New instead of fmt.Errorf,
and check all Start/Stop return values in tests.

* fix(audit): address PR review comments

Use auditor:: prefix in validation error messages. Move fire-and-forget
comment to the Audit method, remove interface-level comment.
2026-03-30 20:46:38 +00:00
Pandey
37d202de92 feat(audit): add auditortypes package (#10761)
* feat(audit): add auditortypes package with event struct, store interface, and config

Foundational type package for the audit log system with zero dependencies
on other audit packages. Includes AuditEvent struct, Store interface (single
Emit method), Config with defaults, constants for action/outcome/principal/category,
and EventName derivation helper.

* fix(audit): address PR review feedback on auditortypes

- Remove non-CUD actions (login/logout/lock/unlock/revoke), keep only create/update/delete
- Replace pastTenseMap with pastTense field on Action struct
- Make EventName a typed struct with NewEventName() constructor
- Rename enum vars to ActionCategory* prefix (ActionCategoryAccessControl, etc.)
- Add IEC 62443 reference link to ActionCategory comment
- Add TODO on PrincipalType to use coretypes once available
- Rename types.go to event.go, merge Store interface into it
- Reorder AuditEvent fields to follow OTel LogRecord structure
- Delete config.go (belongs with auditor service, not types)
- Delete store.go (merged into event.go)

* fix(audit): remove redundant test files per review feedback
2026-03-30 14:46:50 +00:00
Srikanth Chekuri
bb4e7df68b chore: add rule state history module (#10488)
* chore: add rule state history module

* chore: run generate

* chore: generate

* Fix timeline default limit and escape exists key (#10490)

Co-authored-by: Cursor Agent <cursoragent@cursor.com>

* chore: remove unused AddRuleStateHistory and add comments

* chore: regenerate

* chore: update names and move functions

* chore: remove return

* chore: update .github/CODEOWNERS for history

* chore: update condition builder

* chore: lint

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2026-03-30 10:38:52 +00:00
Nityananda Gohain
e4f0d026c1 feat: time aware dynamic field mapper (#9669)
Some checks failed
Release Drafter / update_release_draft (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
* feat: time aware dynamic field mapper

* fix: tests

* fix: update fetch code

* fix: update comment

* fix: remove goroutine

* fix: update tests

* fix: minor changes

* fix: minor changes

* fix: use proper cache

* fix: use orgId properly

* fix: aggregation

* fix: comments

* fix: test

* fix: lint issues

* fix: minor changes

* fix: address changes and add tests

* fix: use name from evolutions

* fix: make the evolution code reusable and propagte context properly

* fix: tests

* fix: update the evolution metadata table

* fix: lint issues

* fix: address cursor comments

* fix: tests

* fix: tests

* fix: changes

* fix: refactor code

* fix: more changes

* fix: clean up evolution logic

* fix: fetch keys at the start

* fix: more changes

* fix: structural changes

* fix: add api

* fix: minor cleanup

* fix: update query in adjust keys

* fix: edgecases

* fix: update conditionfor

* fix: address comments

* fix: revert commented test

* fix: minor refactoring and addressing comments

* fix: evolution metadata

* fix: more fixes

* fix: add support for version in the evolutions

* fix: final cleanup

* fix: copy slice

* fix: address comments

* fix: address comments

* fix: field_mapper

* chore: add integration tests

* fix: lint

* fix: lint

* chore: integration test for materialized

* chore: add remaining changes

* chore: fix lint in integration tests

---------

Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-03-28 05:12:49 +00:00
Nityananda Gohain
23a4960e74 chore: don't run functions if the series is empty (#10725)
Some checks failed
build-staging / js-build (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: funcRunningDiff don't panic when series is empty

* chore: don't run functions if the series is empty
2026-03-27 04:41:00 +00:00
Srikanth Chekuri
028c134ea9 chore: reject empty aggregations in payload regardless of disabled st… (#10720)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: reject empty aggregations in payload regardless of disabled status

* chore: update tests

* chore: count -> count()
2026-03-26 05:14:21 +00:00
Karan Balani
8609f43fe0 feat(user): v2 apis for user and user_roles (#10688)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat: user v2 apis

* fix: openapi specs

* chore: address review comments

* fix: proper handling if invalid roles are passed

* chore: address review comments

* refactor: frontend to use deprecated apis after id rename

* feat: separate apis for adding and deleting user role

* fix: invalidate token when roles are updated

* fix: openapi specs and frontend test

* fix: openapi schema

* fix: openapi spec and move to snakecasing for json
2026-03-25 10:53:21 +00:00
swapnil-signoz
4b09f057b9 feat: adding handlers with OpenAPI specs (#10643)
* feat: adding cloud integration type for refactor

* refactor: store interfaces to use local types and error

* feat: adding sql store implementation

* refactor: removing interface check

* feat: adding updated types for cloud integration

* refactor: using struct for map

* refactor: update cloud integration types and module interface

* fix: correct GetService signature and remove shadowed Data field

* feat: implement cloud integration store

* refactor: adding comments and removed wrong code

* refactor: streamlining types

* refactor: add comments for backward compatibility in PostableAgentCheckInRequest

* refactor: update Dashboard struct comments and remove unused fields

* refactor: split upsert store method

* feat: adding integration test

* refactor: clean up types

* refactor: renaming service type to service id

* refactor: using serviceID type

* feat: adding method for service id creation

* refactor: updating store methods

* refactor: clean up

* refactor: clean up

* refactor: review comments

* refactor: clean up

* feat: adding handlers

* fix: lint and ci issues

* fix: lint issues

* fix: update error code for service not found

* feat: adding handler skeleton

* chore: removing todo comment

* feat: adding frontend openapi schema

* refactor: making review changes

* feat: regenerating openapi specs
2026-03-24 10:24:38 +00:00