* feat(global): add mcp_url to global config
Adds an optional mcp_url field to the global config so the frontend can
gate the MCP settings page on its presence. When unset the API returns
"mcp_url": null (pointer + nullable:"true"); when set it emits the
parsed URL as a string.
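A minimal Go sketch of the null-versus-string behaviour described above. The `globalConfig` struct and `render` helper are hypothetical stand-ins; the real type, its `nullable:"true"` tag handling, and the URL parsing are not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// globalConfig is a hypothetical stand-in for the real response type.
// A pointer field without omitempty marshals as null when nil,
// so the JSON key is always present.
type globalConfig struct {
	MCPURL *string `json:"mcp_url"`
}

// render marshals the config; an empty raw value is treated as "unset".
func render(raw string) string {
	cfg := globalConfig{}
	if raw != "" {
		cfg.MCPURL = &raw
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	fmt.Println(render(""))                        // {"mcp_url":null}
	fmt.Println(render("https://example.com/mcp")) // {"mcp_url":"https://example.com/mcp"}
}
```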
* feat(global): surface mcp_url in frontend types
Adds mcp_url to the manual GlobalConfigData type and refreshes the
generated OpenAPI client so consumers can read the new field.
* docs(global): use <unset> placeholder for mcp_url example
Matches the style of external_url and ingestion_url above it.
* style(global): separate mcp_url prep from return in GetConfig
Adds a blank line between the nullable-conversion block and the return
statement so the two logical phases read as distinct blocks.
* feat(global): mark endpoint fields as required in the API schema
The backend always emits external_url, ingestion_url and mcp_url on
GET /api/v1/global/config (mcp_url as literal null when unset), so the
JSON keys are always present. Add required:"true" to all three and
regenerate the OpenAPI + frontend client so consumers get non-optional
types.
* revert(global): drop mcp_url from legacy GlobalConfigData type
The legacy hand-written type for the non-Orval getGlobalConfig client
should be left alone; consumers that need mcp_url go through the
generated Orval client.
* feat: setup types and interface for waterfall v3
v3 is required for updating the response JSON of
the waterfall API. There won't be any logical change.
Using this requirement as an opportunity to move
the waterfall API to the provider codebase architecture from
the older query-service.
* refactor: move type conversion logic to types pkg
* chore: add reason for using snake case in response
* fix: update span.attributes to map of string to any
To support the OTel format for different types of attributes
* fix: remove unused fields and rename span type
To avoid confusion with the OTel span
* refactor: convert waterfall api to modules format
* chore: add same test cases as for old waterfall api
* chore: avoid sorting on every traversal
* fix: remove unused fields and rename span type
To avoid confusion with the OTel span
* fix: rename timestamp to milli for readability
* fix: add timeout to module context
* fix: use typed parameter field in logs
* chore: generate openapi spec for v3 waterfall
* fix: remove timeout since waterfall takes longer
* fix: use int16 for status code as per db schema
* fix: update openapi specs
* refactor: break down GetWaterfall method for readability
* chore: avoid returning nil, nil
* refactor: move type creation and constants to types package
- Move DB/table/cache/windowing constants to tracedetailtypes package
- Add NewWaterfallTrace and NewWaterfallResponse constructors in types
- Use constructors in module.go instead of inline struct literals
- Reorder waterfall.go so public functions precede private ones
* refactor: extract ClickHouse queries into a store abstraction
Move GetTraceSummary and GetTraceSpans out of module.go into a
traceStore interface backed by clickhouseTraceStore in store.go.
The module struct now holds a traceStore instead of a raw
telemetrystore.TelemetryStore, keeping DB access separate from
business logic.
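The separation described above could be sketched roughly as follows. Only the names `traceStore`, `clickhouseTraceStore`, and `GetTraceSpans` come from the commit message; the method signature, the `span` type, the in-memory store, and `spanCount` are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// span is a hypothetical, trimmed-down row type.
type span struct {
	SpanID string
	Name   string
}

// traceStore isolates DB access. The signature is an assumption.
type traceStore interface {
	GetTraceSpans(ctx context.Context, traceID string) ([]span, error)
}

// memoryTraceStore is an in-memory stand-in for clickhouseTraceStore.
type memoryTraceStore struct {
	spans map[string][]span
}

func (s memoryTraceStore) GetTraceSpans(_ context.Context, traceID string) ([]span, error) {
	spans, ok := s.spans[traceID]
	if !ok {
		return nil, errors.New("trace not found")
	}
	return spans, nil
}

// module holds only the narrow store interface, not a raw telemetry store.
type module struct {
	store traceStore
}

// spanCount is business logic that never touches the database directly.
func (m module) spanCount(ctx context.Context, traceID string) (int, error) {
	spans, err := m.store.GetTraceSpans(ctx, traceID)
	if err != nil {
		return 0, err
	}
	return len(spans), nil
}

// demoSpanCount wires the module to the in-memory store.
func demoSpanCount(traceID string) (int, bool) {
	m := module{store: memoryTraceStore{spans: map[string][]span{
		"t1": {{SpanID: "a", Name: "GET /"}, {SpanID: "b", Name: "db.query"}},
	}}}
	n, err := m.spanCount(context.Background(), traceID)
	return n, err == nil
}

func main() {
	fmt.Println(demoSpanCount("t1"))
	fmt.Println(demoSpanCount("missing"))
}
```

Because the module depends only on the interface, tests can swap in a fake store without a running ClickHouse instance.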
* refactor: move error to types as well
* refactor: separate out store calls and computations
* refactor: breakdown GetSelectedSpans for readability
* refactor: return 404 on missing trace and other cleanup
* refactor: use same method for cache key creation
* chore: remove unused duration nano field
* chore: use sqlbuilder in clickhouse store where possible
* refactor: move waterfall traverse logic to types
and extract out auto expanded span calculation
* chore: convert all timestamps to nano for consistency
* chore: rename waterfall response to gettableX format
* chore: fix method calls in test after refactoring
* refactor: remove unused methods
* chore: fix openapi spec
* chore: better names for methods and vars
* chore: remove caching to match v2
* chore: update openapi client
* refactor: move selection decision to types
* chore: move types to the top
* refactor: avoid passing the whole telemetry store in a module
* refactor: move waterfall constants to module config
* chore: update openapi specs
* chore: update openapi clients
* feat(authz): add check API for community build
* feat(authz): move to types
* feat(authz): fix the role correlations
* feat(authz): fix the role correlations
* fix(authz): single line returns
* feat(authz): add support for delete role
* feat(authz): register config and return error on cleanup failure
* feat(authz): take user and serviceaccount DI for assignee checks
* feat(authz): add the example yaml
* feat(authz): move to callbacks instead of DI
* chore: baseline setup
* chore: endpoint detail update
* chore: added logic for hosts v3 api
* fix: bug fix
* chore: disk usage
* chore: added validate function
* chore: added some unit tests
* chore: return status as a string
* chore: yarn generate api
* chore: removed isSendingK8sAgentsMetricsCode
* chore: moved funcs
* chore: added validation on order by
* chore: updated spec
* chore: nil pointer dereference fix in req.Filter
* chore: added temporalities of metrics
* chore: unified composite key function
* chore: code improvements
* chore: hostStatusNone added for clarity that this field can be left empty as well in payload
* chore: yarn generate api
* chore: return errors from getMetadata and lint fix
* chore: return errors from getMetadata and lint fix
* chore: added hostName logic
* chore: modified getMetadata query
* chore: add type for response and files rearrange
* chore: pass warnings from queryResponse through to the host list response struct
* chore: added better metrics existence check
* chore: added a TODO remark
* chore: added required metrics check
* chore: distributed samples table to local table change for get metadata
* chore: frontend fix
* chore: endpoint correction
* chore: endpoint modification openapi
* chore: escape backtick to prevent sql injection
* chore: rearrange
* chore: improvements
* chore: move order by validation to validate function
* chore: improved description
* chore: added TODOs and made filterByStatus a part of filter struct
* chore: ignore empty string hosts in get active hosts
* feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956)
* chore: add functionality for showing active and inactive counts in custom group by
* chore: bug fix
* chore: added subquery for active and total count
* chore: ignore empty string hosts in get active hosts
* fix: sinceUnixMilli for determining active hosts compute once per request
* chore: refactor code
* chore: rename HostsList -> ListHosts
* chore: rearrangement
* chore: inframonitoring types renaming
* chore: added types package
* chore: file structure further breakdown for clarity
* chore: comments correction
* chore: removed temporalities
* chore: comments resolve
* chore: added json tag required: true
* chore: added status unauthorized
* chore: remove a defensive nil map check, the function ensures a non-nil map when err is nil
* chore: make sort stable in case of tiebreaker by comparing composite group by keys
* chore: regen api client for inframonitoring
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: custom notifiers in alert manager
* chore: lint fixes
* chore: fix email linter
* chore: added tracing to msteamsv2 notifier
* feat: alert manager template to template title and notification body
* chore: updated test name + code for timeout errors
* chore: added utils for using variables with $ notation
* chore: exposed templates for alertmanager types
* feat: added preprocessor for alert templater
* chore: hooked preProcess function in expandTitle and body, added labels and annotations in alertdata
* chore: fix lint issues
* chore: added handling for missing variable used in template
* feat: converted alerttemplater to interface and updated tests
* refactor: added extractCommonKV instead of 2 different functions
* test: fix preprocessor test case
* feat: added support for and in templating
* chore: lint fix
* chore: renamed the interface
* chore: added test for missing function
* refactor: test case and sb related changed
* refactor: comments and test improvements
* chore: lint fix
* chore: updated comments
* feat: added basic html markdown templater
* chore: updated newline to markdown format
* feat: slack blockkit renderer using goldmark
* test: added test for html rendering
* feat: integrated slack blockkit in markdownrenderer package and removed plaintext format
* chore: updated br with new line in test and logs added
* refactor: review comments
* refactor: lint fixes
* chore: updated licenses for notifiers
* chore: updated email notifier from upstream
* feat: return single templating result with flag for template type
* fix: variables with symbols in template
* feat: slack mrkdwn renderer
* feat: custom raw html renderer to escape <no value>
* chore: integrated slack mrkdwn renderer and added NoOp formatter
* chore: removed notifier test files
* fix: concurrent rendering in markdown renderer
* refactor: changes as per internal review
* chore: lint issue
* chore: removed special handling for softline break
* refactor: removed logger as markdown renderer dependency
* refactor: changed markdown renderer from interface to package-level functions
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* feat: adding migration AWS cloud integration regions config
* refactor: removing raw queries
* refactor: using table expr for table name
* refactor: using updated AWS regions declaration
* refactor: cleanup
* refactor: update AWS region migration logic to use new configuration method
* refactor: adding aws regions in migration
---------
Co-authored-by: Vikrant Gupta <vikrant@signoz.io>
* fix: tons of changes
* chore: remove redundant comparison
* ci: tests fixed
* fix: upgraded collector version
* fix: qbtoexpr tests
* fix: go sum
* chore: upgrade collector version v0.144.3-rc.4
* fix: tests
* ci: test fix
* revert: remove db binaries
* test: selectField tests added
* fix: added safeguards in plan generation
* fix: name changed to field_map
* fix: json access plan removal of AvailableTypes
* fix: invalid index usage on terminal condition
* fix: branches should tell missing array types
* fix: comment removed
* fix: issue with FuzzyMatching and API failing
* fix: int64 mapping
* ci: test and lint fix
* fix: test VisitKey
* test: running test for sku
* fix: buildFieldForJSON works
* fix: few minor changes
* fix: refactor tag vs field_key table
* fix: minor changes based on review
* revert: minor variable change
* fix: added more membership testcases
* revert: minor var names reverted
* ci: tests aligned
* fix: indexed expressions
* chore: remove caching spans since v2 was not using it
So we can directly introduce redis instead of relying
on in-memory cache
* chore: remove unnecessary logs
* docs: perses schema for dashboards
* chore: no need for Signal type in commons, only used once
* chore: no need for PageSize type in commons, only used once
* chore: rm comment
* chore: remove stub for time series chart
* chore: remove manually written manifest and package
* chore: remove validate file
* chore: no config folder
* chore: no config folder
* chore: no commons (for now)
* feat: validation script
* fix: remove fields from variable specs that are there in ListVariable
* chore: test file with way more examples
* chore: test file with way more examples
* chore: checkpoint for half correct setup
* chore: rearrange specs in package.json
* chore: py script not needed
* chore: rename
* chore: folders in schemas for arranging
* chore: folders in schemas for arranging
* fix: proper composite query schema
* feat: custom time series schema
* chore: comment explaining when to use composite query and when not
* feat: promql example
* chore: remove upstream import
* fix: promql fix
* docs: time series panel schema without upstream ref
* chore: object for visualization section
* docs: bar chart panel schema without upstream ref
* docs: number panel schema without upstream ref
* docs: number panel schema without upstream ref
* docs: pie chart panel schema without upstream ref
* docs: table chart panel schema without upstream ref
* docs: histogram chart panel schema without upstream ref
* docs: list panel schema without upstream ref
* chore: a more complex example
* chore: examples for panel types
* chore: remaining fields file
* fix: no more online validation
* chore: replace yAxisUnit by unit
* chore: no need for threshold prefix inside threshold obj
* chore: remove unimplemented join query schema
* fix: no nesting in context links
* fix: less verbose field names in dynamic var
* chore: actually name every panel as a panel
* chore: common package for panels' repeated definitions
* chore: common package for queries' repeated definitions
* chore: common package for variables' repeated definitions
* fix: functions in formula
* fix: only allow one of metric or expr aggregation in builder query
* fix: datasource in perses.json
* fix: promql step duration schema
* fix: proper type for selectFields
* chore: single version for all schemas
* fix: normalise enum defs
* chore: change attr name to name
* chore: common threshold type
* chore: doc for how to add a panel spec
* feat: textbox variable
* feat: go struct based schema for dashboardv2 with validations and some tests
* fix: go mod fix
* chore: perses folder not needed anymore
* chore: use perses updated/createdat
* fix: builder query validation (might need to revisit, 3 types seems bad)
* chore: go lint fixes
* chore: define constants for enum values
* chore: nil factory case not needed
* chore: nil factory case not needed
* chore: slight rearrange for builder spec readability
* feat: add TimeSeriesChartAppearance
* chore: no omit empty
* chore: span gaps in schema
* chore: context link not needed in plugins
* chore: remove format from threshold with label, rearrange structs
* test: fix unit tests
* chore: refer to common struct
* feat: query type and panel type matching
* test: unit tests improvement first pass
* test: unit tests improvement second pass
* test: unit tests improvement third pass
* test: unit tests improvement fourth pass
* test: unit test for dashboard with sections
* test: unit test for dashboard with sections
* fix: add missing dashboard metadata fields
* chore: go lint fixes
* chore: go lint fixes
* chore: changes for create v2 api
* chore: more info in StorableDashboardDataV2
* chore: diff check in update method
* chore: add required true tag to required fields
* feat: update metadata methods
* chore: go mod tidy
* chore: put id in metadata.name, authtypes for v2
* revert: only the schema for now in this PR
* chore: comment for why v1.DashboardSpec is chosen
* chore: change source to signal in DynamicVariableSpec
* fix: string values for precision option
* feat: literal options for comparison operator
* fix: missing required tag in threshold fields
* chore: use valuer.string for plugin kind enums
* chore: use only TelemetryFieldKey in ListPanelSpec
* chore: simplify variable plugin validation
* fix: do not allow nil panels
* fix: do not allow nil plugin spec
* fix: signal should be an enum not a string
* chore: rearrange enums to separate those with default values
* test: unit tests for invalid enum values
* fix: all enums should have a default value
* refactor: extract UnmarshalBuilderQueryBySignal to deduplicate signal dispatch
* refactor: proper struct for span gaps
* chore: back to normal strings for kind enums
* chore: ticks in err messages
* chore: ticks in err messages
* chore: remove unused struct
* chore: snake case for non-kind enum values
* chore: proper error wrapping
* chore: accept int values in PrecisionOption as fallback
* fix: actually update the plugin from map to custom struct
* feat: disallow unknown fields in plugins
* chore: make enums valuer.string
* chore: proper enum types in constants
* chore: rename value to avoid overriding valuer.string method
* test: db cycle test
* fix: lint fix in some other file
* test: remove collapse info from sections
* test: use testify package
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* refactor(ruler): add Rule v2 read type and rename storage Rule to StorableRule
* refactor(ruler): map GettableRule to Rule before responding on v2 routes
* docs(openapi): regenerate spec with RuletypesRule on v2 rules routes
* docs(frontend): regenerate API clients with RuletypesRuleDTO
* refactor(ruler): validate uuid-v7 on delete rule handler
* refactor(ruler): add Enum() on AlertType
* refactor(ruler): convert RepeatType and RepeatOn to valuer.String with Enum()
* refactor(ruler): mark required fields on Recurrence
* refactor(ruler): mark required tag on CumulativeSchedule.Type
* refactor(ruler): rename GettablePlannedMaintenance to PlannedMaintenance
* docs: regenerate OpenAPI spec and frontend clients with tightened schema
* refactor(ruler): add PostablePlannedMaintenance input type with Validate
* refactor(ruler): rename EditPlannedMaintenance to Update and GetAll to List
* refactor(ruler): switch Create/Update to *PostablePlannedMaintenance
* refactor(ruler): convert PlannedMaintenance.Id string to ID valuer.UUID
* refactor(ruler): return *PlannedMaintenance from CreatePlannedMaintenance
* docs: regenerate OpenAPI spec and frontend clients for Postable/ID changes
* refactor(ruler): type PlannedMaintenance.Status as MaintenanceStatus enum
* refactor(ruler): type PlannedMaintenance.Kind as MaintenanceKind enum
* refactor(ruler): mark GettableRule.Id required
* refactor(ruler): mark GettableRule.State required
* refactor(ruler): make GettableRule timestamps non-pointer and users nullable
* refactor(ruler): return bare array from v2 ListRules instead of wrapped object
* docs: regenerate OpenAPI spec and frontend clients for schema pass
* refactor(ruler): define Ruler and Handler interfaces with signozruler implementation
Expand the Ruler interface with rule management and planned maintenance
methods matching rules.Manager signatures. Add Handler interface for
HTTP endpoints. Implement handler in signozruler wrapping ruler.Ruler,
and update provider to embed *rules.Manager for interface satisfaction.
* refactor(ruler): move eval_delay from query-service constants to ruler config
Replace constants.GetEvalDelay() with config.EvalDelay on ruler.Config,
defaulting to 2m. This removes the signozruler dependency on
pkg/query-service/constants.
* refactor(ruler): use time.Duration for eval_delay config
Match the convention used by all other configs in the codebase.
TextDuration is for preserving human-readable text through JSON
round-trips in user-facing rule definitions, not for internal config.
* refactor(ruler): add godoc comments and spacing to Ruler interface
* refactor(ruler): wire ruler handler through signoz.New and signozapiserver
- Add Start/Stop to Ruler interface for lifecycle management
- Add rulerCallback to signoz.New() for EE customization
- Wire ruler.Handler through Handlers, signozapiserver provider
- Register 12 routes in signozapiserver/ruler.go (7 rules, 5 downtime)
- Update cmd/community and cmd/enterprise to pass rulerCallback
- Move rules.Manager creation from server.go to signoz.New via callback
- Change APIHandler.ruleManager type from *rules.Manager to ruler.Ruler
- Remove makeRulesManager from both OSS and EE server.go
* refactor(ruler): remove old rules and downtime_schedules routes from http_handler
Remove 7 rules CRUD routes and 5 downtime_schedules routes plus their
handler methods from http_handler.go. These are now served by
signozapiserver/ruler.go via handler.New() with OpenAPIDef.
The 4 v1 history routes (stats, timeline, top_contributors,
overall_status) remain in http_handler.go as they depend on
interfaces.Reader and have v2 equivalents already in signozapiserver.
* refactor(ruler): use ProviderFactory pattern and register in factory.Registry
Replace the rulerCallback with rulerProviderFactories following the
standard ProviderFactory pattern (like auditorProviderFactories). The
ruler is now created via factory.NewProviderFromNamedMap and registered
in factory.Registry for lifecycle management. Start/Stop are no longer
called manually in server.go.
- Ruler interface embeds factory.Service (Start/Stop return error)
- signozruler.NewFactory accepts all deps including EE task funcs
- provider uses named field (not embedding) with explicit delegation
- cmd/community passes nil task funcs, cmd/enterprise passes EE funcs
- Remove NewRulerProviderFactories (replaced by callback from cmd/)
- Remove manual Start/Stop from both OSS and EE server.go
* fix(ruler): make Start block on stopC per factory.Service contract
rules.Manager.Start is non-blocking (run() just closes a channel).
Add stopC to provider so Start blocks until Stop closes it, matching
the factory.Service contract used by the Registry.
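A rough Go sketch of the blocking-Start pattern described above. The `provider` type here is a hypothetical stand-in, the call to the real manager is elided, and the `sync.Once` guard is an added assumption rather than something the commit states.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// provider is a hypothetical stand-in for the ruler provider.
type provider struct {
	stopC    chan struct{}
	stopOnce sync.Once
}

func newProvider() *provider {
	return &provider{stopC: make(chan struct{})}
}

// Start would kick off the non-blocking manager, then block until Stop.
func (p *provider) Start(_ context.Context) error {
	// The real code would call the manager's non-blocking Start here.
	<-p.stopC
	return nil
}

// Stop closes stopC exactly once, which unblocks Start.
func (p *provider) Stop(_ context.Context) error {
	p.stopOnce.Do(func() { close(p.stopC) })
	return nil
}

// demoStartStop reports whether Start returned before Stop was called
// (within waitMillis) and whether it returned after Stop.
func demoStartStop(waitMillis int) (early, afterStop bool) {
	p := newProvider()
	done := make(chan struct{})
	go func() {
		_ = p.Start(context.Background())
		close(done)
	}()

	select {
	case <-done:
		early = true
	case <-time.After(time.Duration(waitMillis) * time.Millisecond):
	}

	_ = p.Stop(context.Background())
	select {
	case <-done:
		afterStop = true
	case <-time.After(time.Second):
	}
	return early, afterStop
}

func main() {
	fmt.Println(demoStartStop(50))
}
```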
* refactor(ruler): remove unused RM() accessor from EE APIHandler
* refactor(ruler): remove RuleManager from APIHandlerOpts
Use Signoz.Ruler directly instead of passing it through opts.
* refactor(ruler): add /api/v1/rules/test and mark /api/v1/testRule as deprecated
* refactor(ruler): use binding.JSON.BindBody for downtime schedule decode
* refactor(ruler): add TODOs for raw string params on Ruler interface
Mark CreateRule, EditRule, PatchRule, TestNotification, and DeleteRule
with TODOs to accept typed params instead of raw JSON strings. Requires
changing the storage model since the manager stores raw JSON as Data.
* refactor(ruler): add TODO on MaintenanceStore to not expose store directly
* docs: regenerate OpenAPI spec and frontend API clients with ruler routes
* refactor(ruler): rename downtime_schedules tag to downtimeschedules
* refactor(ruler): add query params to ListDowntimeSchedules OpenAPIDef
Add ListPlannedMaintenanceParams struct with active/recurring fields.
Use binding.Query.BindQuery in the handler instead of raw URL parsing.
Add RequestQuery to the OpenAPIDef so params appear in the OpenAPI spec
and generated frontend client.
* refactor(ruler): add GettableTestRule response type to TestRule endpoint
Define GettableTestRule struct with AlertCount and Message fields.
Use it as the Response in TestRule OpenAPIDef so the generated frontend
client has a proper response type instead of string.
* refactor(ruler): tighten schema with oneOf unions and required fields
Surface the polymorphism in RuleThresholdData and EvaluationEnvelope via
JSONSchemaOneOf (the same pattern as QueryEnvelope), so the generated
TS types are discriminated unions with typed `spec` instead of unknown.
Also mark `alert`, `ruleType`, and `condition` required on PostableRule
so the generated TS types are non-optional for callers.
* refactor(ruler): add Enum() on EvaluationKind, ScheduleType, ThresholdKind
Surface the fixed set of accepted values for these valuer-wrapped kind
types so OpenAPI emits proper string-enum schemas and the generated TS
types become string-literal unions instead of plain string.
* refactor(ruler): mark required fields on nested rule and maintenance types
Surface fields already enforced by Validate()/UnmarshalJSON as required
in the OpenAPI schema so the generated TS types match runtime behavior.
Touches RuleCondition (compositeQuery, op, matchType), RuleThresholdData
(kind, spec), BasicRuleThreshold (name, target, op, matchType),
RollingWindow (evalWindow, frequency), CumulativeWindow (schedule,
frequency, timezone), EvaluationEnvelope (kind, spec), Schedule
(timezone), GettablePlannedMaintenance (name, schedule).
Does not mark server-populated fields (id, createdAt, updatedAt, status,
kind) on GettablePlannedMaintenance required, since the same struct is
reused for request bodies in MaintenanceStore.CreatePlannedMaintenance.
* refactor(ruler): tighten AlertCompositeQuery, QueryType, PanelType schema
Missed in the earlier tightening pass. AlertCompositeQuery.queries,
panelType, queryType are all required for a valid composite query;
QueryType and PanelType are valuer-wrapped with fixed value sets, so
expose them as enums in the OpenAPI schema.
* refactor(ruler): wrap sql.ErrNoRows as TypeNotFound in by-ID lookups
GetStoredRule and GetPlannedMaintenanceByID previously returned bun's
raw Scan error, so a missing ID leaked "sql: no rows in result set" to
the HTTP response with a 500 status. WrapNotFoundErrf converts
sql.ErrNoRows into TypeNotFound so render.Error emits 404 with a stable
`not_found` code, and passes other errors through unchanged.
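The conversion described above might look roughly like this in Go. `notFoundError`, `wrapNotFound`, and `statusFor` are hypothetical names used for illustration; they are not the project's `WrapNotFoundErrf` or `render.Error`.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"net/http"
)

// notFoundError is a hypothetical typed error standing in for TypeNotFound.
type notFoundError struct {
	msg   string
	cause error
}

func (e *notFoundError) Error() string { return e.msg }
func (e *notFoundError) Unwrap() error { return e.cause }

// wrapNotFound converts sql.ErrNoRows into a not-found error and
// passes every other error through unchanged.
func wrapNotFound(err error, format string, args ...any) error {
	if err == nil {
		return nil
	}
	if errors.Is(err, sql.ErrNoRows) {
		return &notFoundError{msg: fmt.Sprintf(format, args...), cause: err}
	}
	return err
}

// statusFor maps an error to an HTTP status the way a renderer might.
func statusFor(err error) int {
	var nf *notFoundError
	switch {
	case err == nil:
		return http.StatusOK
	case errors.As(err, &nf):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

// demoStatuses returns the status for a missing row and for an unrelated error.
func demoStatuses() (missing, other int) {
	missing = statusFor(wrapNotFound(fmt.Errorf("scan rule: %w", sql.ErrNoRows), "rule %q not found", "abc"))
	other = statusFor(wrapNotFound(errors.New("connection refused"), "rule %q not found", "abc"))
	return missing, other
}

func main() {
	fmt.Println(demoStatuses())
}
```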
* refactor(ruler): move migrated rules routes to /api/v2/rules
The 7 rules routes now live at /api/v2/rules, /api/v2/rules/{id}, and
/api/v2/rules/test — served via handler.New with render.Success and
render.Error. The legacy /api/v1/rules paths will be restored in the
query-service http handler in a follow-up so existing clients keep
receiving the SuccessResponse envelope unchanged.
Drop the /api/v1/testRule deprecated alias from signozapiserver; the
original lives on main's http_handler.go and is restored alongside the
other v1 paths.
Downtime schedule routes stay at /api/v1/downtime_schedules — single
track, no legacy restore planned.
* refactor(ruler): restore /api/v1/rules legacy handlers for back-compat
Bring the 7 rule CRUD/test handlers and their router.HandleFunc lines
back to http_handler.go so /api/v1/rules, /api/v1/rules/{id}, and
/api/v1/testRule continue to emit the legacy SuccessResponse envelope.
The v2 versions under signozapiserver are the new home for the render
envelope used by generated clients.
Delegation uses aH.ruleManager (populated from opts.Signoz.Ruler in
NewAPIHandler), so a single ruler.Ruler instance serves both paths — no
second rules.Manager is instantiated.
Downtime schedules stay single-track under signozapiserver; the 5
downtime handlers are not restored.
* docs: regenerate OpenAPI spec and frontend clients for /api/v2/rules
* refactor(ruler): return 201 Created on POST /api/v2/rules
A successful create now responds with 201 Created and the full
GettableRule body, matching REST convention for resource creation.
Regenerates the OpenAPI spec and frontend clients to reflect the new
status code.
* refactor(ruler): restore dropped sorter TODO in legacy listRules
The legacy listRules handler was copied verbatim from main during the
v1 back-compat restore, but an inner blank line and the load-bearing
`// todo(amol): need to add sorter` comment were stripped. Put them
back so the legacy block round-trips cleanly against main.
* refactor(ruler): return 201 Created on POST /api/v1/downtime_schedules
Match the REST convention already applied to POST /api/v2/rules:
successful creates respond with 201 Created. Response body remains
empty (nil); the generated frontend client surface is unchanged since
no response type was declared.
A richer "return the created resource" response body is a separate
follow-up — holding off until the ruletypes naming cleanup lands.
* fix(ruler): signal Healthy only after manager.Start closes m.block
The ruler provider didn't implement factory.Healthy, so the registry
fell back to factory.closedC and marked the service StateRunning the
instant its Start goroutine spawned — before rules.Manager.Start had
closed m.block. /api/v2/healthz therefore returned 200 while rule
evaluation was still gated, and integration tests that POSTed a rule
immediately after the readiness check saw their task goroutines stuck
on <-m.block until the next frequency tick.
Add a healthyC channel and close it inside Start only after
manager.Start returns; implement factory.Healthy so the registry and
/api/v2/healthz wait on the real readiness signal.
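A minimal Go sketch of the readiness signal described above. The `readyProvider` type, the `startManager` hook, and the `Healthy() <-chan struct{}` signature are assumptions; the actual factory.Healthy interface may differ.

```go
package main

import (
	"context"
	"fmt"
)

// readyProvider is a hypothetical stand-in for the ruler provider.
type readyProvider struct {
	startManager func() // stands in for the manager's Start call
	healthyC     chan struct{}
	stopC        chan struct{}
}

func newReadyProvider(startManager func()) *readyProvider {
	return &readyProvider{
		startManager: startManager,
		healthyC:     make(chan struct{}),
		stopC:        make(chan struct{}),
	}
}

// Start signals readiness only after the manager has finished starting,
// then blocks until Stop.
func (p *readyProvider) Start(_ context.Context) error {
	p.startManager()
	close(p.healthyC)
	<-p.stopC
	return nil
}

// Healthy exposes the readiness channel (assumed signature).
func (p *readyProvider) Healthy() <-chan struct{} { return p.healthyC }

func (p *readyProvider) Stop(_ context.Context) error {
	close(p.stopC)
	return nil
}

// isClosed reports whether c has been closed, without blocking.
func isClosed(c <-chan struct{}) bool {
	select {
	case <-c:
		return true
	default:
		return false
	}
}

// demoHealthy checks readiness before and after the manager start completes.
func demoHealthy() (before, after bool) {
	gate := make(chan struct{})
	p := newReadyProvider(func() { <-gate }) // manager start blocks on gate
	go func() { _ = p.Start(context.Background()) }()

	before = isClosed(p.Healthy()) // still gated, so not healthy yet
	close(gate)
	<-p.Healthy() // wait for the real readiness signal
	after = isClosed(p.Healthy())
	_ = p.Stop(context.Background())
	return before, after
}

func main() {
	fmt.Println(demoHealthy())
}
```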
* fix: add the withhealthy interface
* fix(ruler): alias legacy RULES_EVAL_DELAY env var in backward-compat
The eval_delay config was moved from query-service constants (read from
RULES_EVAL_DELAY) onto ruler.Config (read via mapstructure from
SIGNOZ_RULER_EVAL__DELAY). That silently broke the legacy env var for
any existing deployment — notably the alerts integration-test fixture
which sets RULES_EVAL_DELAY=0s to let rules evaluate against just-
inserted data. The resulting default 2m delay pushed the query window
far enough back that the fixture's rate spike fell outside it, causing
8 of 24 parametrize cases in 02_basic_alert_conditions.py to fail with
"Expected N alerts to be fired but got 0 alerts".
Add RULES_EVAL_DELAY to mergeAndEnsureBackwardCompatibility alongside
the ~10 other aliased legacy env vars. Emits the standard deprecation
warning and overrides config.Ruler.EvalDelay.
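The aliasing described above could be sketched as follows. `rulerConfig`, `applyLegacyEvalDelay`, and the warning text are hypothetical; this is not the body of mergeAndEnsureBackwardCompatibility, only an illustration of a legacy env var overriding a config default.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// rulerConfig is a hypothetical stand-in for ruler.Config.
type rulerConfig struct {
	EvalDelay time.Duration
}

// applyLegacyEvalDelay overrides cfg.EvalDelay when the legacy variable is
// set and returns a deprecation warning (illustrative wording) to log.
func applyLegacyEvalDelay(cfg *rulerConfig, lookup func(string) (string, bool)) (string, error) {
	raw, ok := lookup("RULES_EVAL_DELAY")
	if !ok || raw == "" {
		return "", nil
	}
	delay, err := time.ParseDuration(raw)
	if err != nil {
		return "", fmt.Errorf("invalid RULES_EVAL_DELAY %q: %w", raw, err)
	}
	cfg.EvalDelay = delay
	return "RULES_EVAL_DELAY is deprecated, use SIGNOZ_RULER_EVAL__DELAY instead", nil
}

// evalDelayWith applies a fake environment on top of the 2m default.
func evalDelayWith(value string, set bool) string {
	cfg := rulerConfig{EvalDelay: 2 * time.Minute}
	_, err := applyLegacyEvalDelay(&cfg, func(string) (string, bool) { return value, set })
	if err != nil {
		return "error"
	}
	return cfg.EvalDelay.String()
}

func main() {
	cfg := rulerConfig{EvalDelay: 2 * time.Minute}
	warning, err := applyLegacyEvalDelay(&cfg, os.LookupEnv)
	fmt.Println(cfg.EvalDelay, warning, err)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the override logic testable without mutating the process environment.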
* feat(apiserver): derive HTTP route prefix from global.external_url
The path component of global.external_url is now used as the base path
for all HTTP routes (API and web frontend), enabling SigNoz to be served
behind a reverse proxy at a sub-path (e.g. https://example.com/signoz/).
The prefix is applied via http.StripPrefix at the outermost handler
level, requiring zero changes to route registration code. Health
endpoints (/api/v1/health, /api/v2/healthz, /api/v2/readyz,
/api/v2/livez) remain accessible without the prefix for container
healthchecks.
Removes web.prefix config in favor of the unified global.external_url
approach, avoiding the desync bugs seen in projects with separate
API/UI prefix configs (ArgoCD, Prometheus).
closes SigNoz/platform-pod#1775
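A minimal Go sketch of applying the prefix at the outermost handler while keeping health endpoints reachable without it. `withPrefix`, `newRouter`, and the `/signoz` prefix are illustrative assumptions; only http.StripPrefix and the health paths come from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// unprefixed lists the health endpoints that stay reachable without the prefix.
var unprefixed = map[string]bool{
	"/api/v1/health":  true,
	"/api/v2/healthz": true,
	"/api/v2/readyz":  true,
	"/api/v2/livez":   true,
}

// withPrefix wraps router so every route is served under prefix,
// without touching route registration.
func withPrefix(prefix string, router http.Handler) http.Handler {
	if prefix == "" {
		return router
	}
	stripped := http.StripPrefix(prefix, router)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if unprefixed[r.URL.Path] {
			router.ServeHTTP(w, r)
			return
		}
		stripped.ServeHTTP(w, r)
	})
}

// newRouter registers routes with no knowledge of the prefix.
func newRouter() http.Handler {
	mux := http.NewServeMux()
	ok := func(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(http.StatusOK) }
	mux.HandleFunc("/api/v1/health", ok)
	mux.HandleFunc("/api/v1/rules", ok)
	return mux
}

// demoStatus returns the status code for a GET on path under the "/signoz" prefix.
func demoStatus(path string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	withPrefix("/signoz", newRouter()).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(demoStatus("/signoz/api/v1/rules")) // 200
	fmt.Println(demoStatus("/api/v1/health"))       // 200
	fmt.Println(demoStatus("/api/v1/rules"))        // 404
}
```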
* feat(web): template index.html with dynamic base href from global.external_url
Read index.html at startup, parse as Go template with [[ ]] delimiters,
execute with BasePath derived from global.external_url, and cache the
rendered bytes in memory. This injects <base href="/signoz/" /> (or
whatever the route prefix is) so the browser resolves relative URLs
correctly when SigNoz is served at a sub-path.
Inject global.Config into the routerweb provider via the factory closure
pattern. Static files (JS, CSS, images) are still served from disk
unchanged.
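The templating step above can be sketched as follows. This version uses html/template (which a later commit in this log adopts); `renderIndex` and the sample markup are assumptions, not the provider's actual code.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// renderIndex parses raw with [[ ]] delimiters, so any {{ }} sequences in the
// built frontend are left alone, and executes it with the given base path.
func renderIndex(raw string, basePath string) (string, error) {
	tmpl, err := template.New("index").Delims("[[", "]]").Parse(raw)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := struct{ BasePath string }{BasePath: basePath}
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// demoRender renders a tiny index document for the "/signoz/" sub-path.
func demoRender() string {
	out, err := renderIndex(`<head><base href="[[ .BasePath ]]" /></head>`, "/signoz/")
	if err != nil {
		return "error: " + err.Error()
	}
	return out
}

func main() {
	fmt.Println(demoRender())
}
```

Rendering once at startup and caching the result, as the commit describes, avoids re-executing the template on every request.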
* refactor(web): extract index.html templating into web.NewIndex
Move the template parsing and execution logic from routerweb provider
into pkg/web/template.go. NewIndex logs and returns raw bytes on
template failure; NewIndexE returns the error for callers that need it.
Rename BasePath to BaseHref to match the HTML attribute it populates.
Inject global.Config into routerweb via the factory closure pattern.
* refactor(global): rename RoutePrefix to ExternalPath, add ExternalPathTrailing
Rename RoutePrefix() to ExternalPath() to accurately reflect what it
returns: the path component of the external URL. Add
ExternalPathTrailing() which returns the path with a trailing slash,
used for HTML base href injection.
* refactor(web): make index filename configurable via web.index
Move the hardcoded indexFileName const from routerweb/provider.go to
web.Config.Index with default "index.html". This allows overriding the
SPA entrypoint file via configuration.
* refactor(web): collapse testdata_basepath into testdata
Use a single testdata directory with a templated index.html for all
routerweb tests. Remove the redundant testdata_basepath directory.
* test(web): add no-template and invalid-template index test cases
Add three distinct index fixtures in testdata:
- index.html: correct [[ ]] template with BaseHref
- index_no_template.html: plain HTML, no placeholders
- index_invalid_template.html: malformed template syntax
Tests verify: template substitution works, plain files pass through
unchanged, and invalid templates fall back to serving raw bytes.
Consolidate test helpers into startServer/get.
* refactor(web): rename test fixtures to no_template, valid_template, invalid_template
Drop the index_ prefix from test fixtures. Use web instead of w for
the variable name in test helpers.
* test(web): add SPA fallback paths to no_template and invalid_template tests
Test /, /does-not-exist, and /assets in all three template test cases
to verify SPA fallback behavior (non-existent paths and directories
serve the index) regardless of template type.
* test(web): use exact match instead of contains in template tests
Match the full expected response body in TestServeTemplatedIndex
instead of using assert.Contains.
* style(web): use raw string literals for expected test values
* refactor(web): rename get test helper to httpGet
* refactor(web): use table-driven tests with named path cases
Replace for-loop path iteration with explicit table-driven test cases
for each path. Each path (root, non-existent, directory) is a named
subtest case in all three template tests.
* chore: remove redundant comments from added code
* style: add blank lines between logical blocks
* fix(web): resolve lint errors in provider and template
Fix errcheck on rw.Write in serveIndex, use ErrorContext instead of
Error in NewIndex for sloglint compliance. Move serveIndex below
ServeHTTP to order public methods before private ones.
* style: formatting and test cleanup from review
Restructure Validate nil check, rename expectErr to fail with
early-return, trim trailing newlines in test assertions, remove
t.Parallel from subtests, inline short config literals, restore
struct field comments in web.Config.
* fix: remove unused files
* fix: remove unused files
* perf(web): cache http.FileServer on provider instead of creating per-request
* refactor(web): use html/template for context-aware escaping in index rendering
---------
Co-authored-by: SagarRajput-7 <162284829+SagarRajput-7@users.noreply.github.com>
* chore: custom notifiers in alert manager
* chore: lint fixes
* chore: fix email linter
* chore: added tracing to msteamsv2 notifier
* feat: alert manager template to template title and notification body
* chore: updated test name + code for timeout errors
* chore: added utils for using variables with $ notation
* chore: exposed templates for alertmanager types
* feat: added preprocessor for alert templater
* chore: hooked preProcess function in expandTitle and body, added labels and annotations in alertdata
* chore: fix lint issues
* chore: added handling for missing variable used in template
* feat: converted alerttemplater to interface and updated tests
* refactor: added extractCommonKV instead of 2 different functions
* test: fix preprocessor test case
* feat: added support for and in templating
* chore: lint fix
* chore: renamed the interface
* chore: added test for missing function
* refactor: test case and sb related changes
* refactor: comments and test improvements
* chore: lint fix
* chore: updated comments
* chore: updated newline to markdown format
* chore: updated br with new line in test and logs added
* refactor: review comments
* refactor: lint fixes
* chore: updated licenses for notifiers
* chore: updated email notifier from upstream
* feat: return single templating result with flag for template type
* fix: variables with symbols in template
* chore: removed notifier test files
* refactor: changes as per internal review
* chore: lint issue
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* fix(member): better UX for pending invite users
* fix(member): add integration tests and reuse timezone util
* fix(member): rename deprecated and remove dead files
* fix(member): do not use hyphenated endpoints
* fix(member): user friendly button text
* fix(member): update the API endpoints and integration tests
* fix(member): simplify handler naming convention
* fix(member): added v2 API for update my password
* fix(member): remove more dead code
* fix(member): fix integration tests
* fix(member): fix integration tests
* refactor(alertmanager): move API handlers to signozapiserver
Extract Handler interface in pkg/alertmanager/handler.go and move
the implementation from api.go to signozalertmanager/handler.go.
Register all alertmanager routes (channels, route policies, alerts)
in signozapiserver via handler.New() with OpenAPIDef. Remove
AlertmanagerAPI injection from http_handler.go.
This enables future AuditDef instrumentation on these routes.
* fix(review): rename param, add /api/v1/channels/test endpoint
- Rename `am` to `alertmanagerService` in NewHandlers
- Add /api/v1/channels/test as the canonical test endpoint
- Mark /api/v1/testChannel as deprecated
- Regenerate OpenAPI spec
* fix(review): use camelCase for channel orgId json tag
* fix(review): remove section comments from alertmanager routes
* fix(review): use routepolicies tag without hyphen
* chore: regenerate frontend API clients for alertmanager routes
* fix: add required/nullable/enum tags to alertmanager OpenAPI types
- PostableRoutePolicy: mark expression, name, channels as required
- GettableRoutePolicy: change CreatedAt/UpdatedAt from pointer to value
- Channel: mark name, type, data, orgId as required
- ExpressionKind: add Enum() for rule/policy values
- Regenerate OpenAPI spec and frontend clients
* fix: use typed response for GetAlerts endpoint
* fix: add Receiver request type to channel mutation endpoints
CreateChannel, UpdateChannelByID, TestChannel, and TestChannelDeprecated
all read req.Body as a Receiver config. The OpenAPIDef must declare
the request type so the generated SDK includes the body parameter.
* fix: change CreateChannel access from EditAccess to AdminAccess
Aligns CreateChannel endpoint with the rest of the channel mutation
endpoints (update/delete) which all require admin access. This is
consistent with the frontend where notifications are not accessible
to editors.
* chore: initial commit
* chore: added metricNamespace as a new param
* chore: go generate openapi, update spec
* chore: frontend yarn generate:api
* chore: added metricnamespace support in /fields/values as well as added integration tests
* chore: corrected comment
* chore: added unit tests for getMetricsKeys and getMeterSourceMetricKeys
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* fix(authz): populate correct error for deleted service account
* chore(authz): reduce the regex restrictions on service accounts
* chore(authz): reduce the regex restrictions on service accounts
* fix(authz): populate correct error for deleted service account
* fix(authz): populate correct error for deleted service account
* chore(sqlstore): added connection max lifetime param support
* chore(sqlstore): fix test
* chore(sqlstore): default to 0 for conn_max_lifetime
* chore(sqlstore): rename the config
* chore(sqlstore): rename the config
* feat(audit): wire auditor into DI graph and service lifecycle
Register the auditor in the factory service registry so it participates
in application lifecycle (start/stop/health). Community uses noopauditor,
enterprise uses otlphttpauditor with licensing gate. Pass the auditor
instance to the audit middleware instead of nil.
* feat(audit): use NamedMap provider pattern with config-driven selection
Switch from single-factory callback to NamedMap + factory.NewProviderFromNamedMap
so the config's Provider field selects the auditor implementation. Add
NewAuditorProviderFactories() with noop as the community default. Enterprise
extends the map with otlphttpauditor. Add auditor section to conf/example.yaml
and set default provider to "noop" in config.
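The config-driven selection can be sketched as a name-to-factory map. Everything below is a simplified, hypothetical illustration: the Auditor interface, the Factory type, the "otlphttp" key, and newFromConfig are assumptions and do not reproduce the real factory.NewProviderFromNamedMap signature.

```go
package main

import "fmt"

// Auditor is a hypothetical minimal interface for this sketch.
type Auditor interface {
	Name() string
}

type noopAuditor struct{}

func (noopAuditor) Name() string { return "noop" }

type otlpHTTPAuditor struct{}

func (otlpHTTPAuditor) Name() string { return "otlphttp" }

// Factory builds an Auditor; Config.Provider picks which factory runs.
type Factory func() (Auditor, error)

type Config struct {
	Provider string
}

// communityFactories registers only the noop auditor.
func communityFactories() map[string]Factory {
	return map[string]Factory{
		"noop": func() (Auditor, error) { return noopAuditor{}, nil },
	}
}

// enterpriseFactories extends the community map with an OTLP/HTTP auditor.
func enterpriseFactories() map[string]Factory {
	factories := communityFactories()
	factories["otlphttp"] = func() (Auditor, error) { return otlpHTTPAuditor{}, nil }
	return factories
}

// newFromConfig looks up the factory named by cfg.Provider and fails
// loudly when the name is not registered.
func newFromConfig(cfg Config, factories map[string]Factory) (Auditor, error) {
	factory, ok := factories[cfg.Provider]
	if !ok {
		return nil, fmt.Errorf("unknown auditor provider %q", cfg.Provider)
	}
	return factory()
}

func main() {
	auditor, err := newFromConfig(Config{Provider: "noop"}, communityFactories())
	fmt.Println(auditor.Name(), err)

	_, err = newFromConfig(Config{Provider: "otlphttp"}, communityFactories())
	fmt.Println(err)
}
```

With this shape, adding an implementation is a map entry rather than a new wiring callback, and an unregistered provider name surfaces as an error at startup.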
* chore: move auditor config to end of example.yaml
* feat(audit): add telemetry audit query infrastructure
Add pkg/telemetryaudit/ with tables, field mapper, condition builder,
and statement builder for querying audit logs from signoz_audit database.
Add SourceAudit to source enum and integrate audit key resolution
into the metadata store.
* chore: address review comments
Comment out SourceAudit from Enum() until frontend is ready.
Use actual audit table constants in metadata test helpers.
* fix(audit): align field mapper with actual audit DDL schema
Remove resources_string (not in audit table DDL).
Add event_name as intrinsic column.
Resource context resolves only through the resource JSON column.
* feat(audit): add audit field value autocomplete support
Wire distributed_tag_attributes_v2 for signoz_audit into the
metadata store. Add getAuditFieldValues() and route SignalLogs +
SourceAudit to it in GetFieldValues().
* test(audit): add statement builder tests
Cover all three request types (list, time series, scalar) with
audit-specific query patterns: materialized column filters, AND/OR
conditions, limit CTEs, and group-by expressions.
* refactor(audit): inline field key map into test file
Remove test_data.go and inline the audit field key map directly
into statement_builder_test.go with a compact helper function.
* style(audit): move column map to const.go, use sqlbuilder.As in metadata
Move logsV2Columns from field_mapper.go to const.go to colocate all
column definitions. Switch getAuditKeys() to use sb.As() instead of
raw string formatting. Fix FieldContext alignment.
* fix(audit): align table names with schema migration
Migration uses logs/distributed_logs (not logs_v2/distributed_logs_v2).
Rename LogsV2TableName to LogsTableName and LogsV2LocalTableName to
LogsLocalTableName to match the actual signoz_audit DDL.
* feat(audit): add integration test fixture for audit logs
AuditLog fixture inserts into all 5 signoz_audit tables matching
the schema migration DDL: distributed_logs (no resources_string,
has event_name), distributed_logs_resource, distributed_tag_attributes_v2,
distributed_logs_attribute_keys, distributed_logs_resource_keys.
* fix(audit): rename tag_attributes_v2 to tag_attributes
Migration uses tag_attributes/distributed_tag_attributes (no _v2
suffix). Rename constants and update all references including the
integration test fixture.
* feat(audit): wire audit statement builder into querier
Add auditStmtBuilder to querier struct and route LogAggregation
queries with source=audit to it in all three dispatch locations
(main query, live tail, shiftedQuery). Create and wire the full
audit query stack in signozquerier provider.
* test(audit): add integration tests for audit log querying
Cover the documented query patterns: list all events, filter by
principal ID, filter by outcome, filter by resource name+ID,
filter by principal type, scalar count for alerting, and
isolation test ensuring audit data doesn't leak into regular logs.
* fix(audit): revert sb.As in getAuditKeys, fix fixture column_names
Revert getAuditKeys to use raw SQL strings instead of sb.As() which
incorrectly treated string literals as column references. Add explicit
column_names to all ClickHouse insert calls in the audit fixture.
* fix(audit): remove debug assertion from integration test
* feat(audit): internalize resource filter in audit statement builder
Build the resource filter internally pointing at
signoz_audit.distributed_logs_resource. Add LogsResourceTableName
constant. Remove resourceFilterStmtBuilder from constructor params.
Update test expectations to use the audit resource table.
* fix(audit): rename resource.name to resource.kind, move to resource attributes
Align with schema change from SigNoz/signoz#10826:
- signoz.audit.resource.name renamed to signoz.audit.resource.kind
- resource.kind and resource.id moved from event attributes to OTel
Resource attributes (resource JSON column)
- Materialized columns reduced from 7 to 5 (resource.kind and
resource.id no longer materialized)
* refactor(audit): use pytest.mark.parametrize for filter integration tests
Consolidate filter test functions into a single parametrized test.
6/8 tests passing; resource kind+ID filter and scalar count need
further investigation (resource filter JSON key extraction with
dotted keys, scalar response format).
* fix(audit): add source to resource filter for correct metadata routing
Add source param to telemetryresourcefilter.New so the resource
filter's key selectors include Source when calling GetKeysMulti.
Without this, audit resource keys route to signoz_logs metadata
tables instead of signoz_audit. Fix scalar test to use table
response format (columns+data, not rows).
* refactor(audit): reuse querier fixtures in integration tests
Add source param to BuilderQuery and build_scalar_query in the
querier fixture. Replace custom _build_audit_query and
_build_audit_ts_query helpers with BuilderQuery and
build_scalar_query from the shared fixtures.
* refactor(audit): remove wrapper helpers, inline make_query_request calls
Remove _query_audit_raw and _query_audit_scalar helpers. Use
make_query_request, BuilderQuery, and build_scalar_query directly.
Compute time window at test execution time via _time_window() to
avoid stale module-level timestamps.
* refactor(audit): inline _time_window into test functions
* style(audit): use snake_case for pytest parametrize IDs
* refactor(audit): inline DEFAULT_ORDER using build_order_by
Use build_order_by from querier fixtures instead of OrderBy/
TelemetryFieldKey dataclasses. Allow BuilderQuery.order to accept
plain dicts alongside OrderBy objects.
* refactor(audit): inline all data setup, use distinct scenarios per test
Remove _insert_standard_audit_events helper. Each test now owns its
data: list_all uses alert-rule/saved-view/user resource types,
scalar_count uses multiple failures from different principals (count=2),
leak test uses a single organization event. Parametrized filter tests
keep the original 5-event dataset.
* fix(audit): remove silent empty-string guards in metadata store
Remove guards that silently returned nil/empty when audit DB params
were empty. All call sites now pass real constants, so misconfiguration
should fail loudly rather than produce silent empty results.
* style(audit): remove module docstring from integration test
* style: formatting fix in tables file
* style: formatting fix in tables file
* fix: add auditStmtBuilder nil param to querier_test.go
* fix: fix fmt
* feat: setup types and interface for waterfall v3
v3 is required for updating the response JSON of
the waterfall API. There won't be any logical change.
This requirement is also an opportunity to move the
waterfall API from the older query-service to the
provider codebase architecture.
* refactor: move type conversion logic to types pkg
* chore: add reason for using snake case in response
* fix: update span.attributes to map of string to any
To support the OTel format for different types of attributes
* fix: remove unused fields and rename span type
To avoid confusion with the OTel span
* chore: rename resources field to follow otel
---------
Co-authored-by: Nityananda Gohain <nityanandagohain@gmail.com>
* fix: show warning for non-existent cost meter metrics
* chore: lint fix by removing unused list
* chore: py fmt add new line
* fix: missing metric check on type instead of temporality
* test: fix unit tests by mocking type data
* test: unit tests
* revert: revert changes from meter branch
* revert: revert changes from meter branch
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* chore(authz): add error logger for write requests
* chore(authz): send too many requests for authz write error
* chore(authz): send too many requests for authz write error
* feat(authz): accept singular roles for user and service accounts
* feat(authz): update integration tests
* feat(authz): update integration tests
* feat: move role management to a single select flow on members and service account pages (temporarily)
* feat(authz): enable stats reporter for service accounts
* feat(authz): identity call for activating/deleting user
---------
Co-authored-by: SagarRajput-7 <sagar@signoz.io>
* refactor: move resourcefilter to pkg/telemetryresourcefilter
Move pkg/querybuilder/resourcefilter to pkg/telemetryresourcefilter
to align with the existing telemetry package naming convention
(telemetrylogs, telemetrytraces, telemetrymetrics, telemetrymeter).
The resource filter is a statement builder, not a query builder utility.
* refactor: internalize resource filter construction in statement builders
Each telemetry statement builder (logs, traces) now creates its own
resource filter internally instead of receiving it as an injected
dependency. This makes it impossible to wire the wrong resource table
and simplifies the provider.
Delete telemetryresourcefilter/tables.go — each telemetry package now
owns its resource table constant (LogsResourceV2TableName in
telemetrylogs, TracesResourceV3TableName in telemetrytraces).
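A minimal sketch of the internalized construction, assuming simplified types. The table names, the SQL shape, and every identifier below are hypothetical placeholders, not the real telemetrylogs or telemetryresourcefilter code.

```go
package main

import "fmt"

// logsResourceTableName is a placeholder value; the real constant lives in
// the telemetry package that owns the table.
const logsResourceTableName = "logs_resource_table"

// resourceFilter builds a subquery against one resource table.
type resourceFilter struct {
	table string
}

func newResourceFilter(table string) *resourceFilter {
	return &resourceFilter{table: table}
}

func (r *resourceFilter) Subquery(cond string) string {
	return fmt.Sprintf("SELECT fingerprint FROM %s WHERE %s", r.table, cond)
}

// logsStatementBuilder creates its own resource filter, so callers cannot
// pass in a filter that points at the wrong table.
type logsStatementBuilder struct {
	resource *resourceFilter
}

func newLogsStatementBuilder() *logsStatementBuilder {
	return &logsStatementBuilder{resource: newResourceFilter(logsResourceTableName)}
}

func (b *logsStatementBuilder) Build(cond string) string {
	return fmt.Sprintf(
		"SELECT body FROM logs_table WHERE resource_fingerprint IN (%s)",
		b.resource.Subquery(cond),
	)
}

func main() {
	fmt.Println(newLogsStatementBuilder().Build("service = 'api'"))
}
```

Because the constructor takes no resource filter argument, the pairing between a statement builder and its resource table is fixed at compile time rather than at wiring time.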
* refactor: create field mapper and condition builder inside resource filter New
Remove fieldMapper and conditionBuilder params from
telemetryresourcefilter.New — they are always the same
(NewFieldMapper + NewConditionBuilder) so create them internally.