* feat(global): add mcp_url to global config
Adds an optional mcp_url field to the global config so the frontend can
gate the MCP settings page on its presence. When unset the API returns
"mcp_url": null (pointer + nullable:"true"); when set it emits the
parsed URL as a string.
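The null-versus-string behaviour described above can be sketched as follows. This is a minimal illustration, not the actual handler: the `globalConfig` type, `newGlobalConfig`, and `encode` are hypothetical names, and the example URLs are placeholders. The only point it shows is that a pointer field without `omitempty` always emits its key, as `null` when unset.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// globalConfig is a hypothetical stand-in for the API response type.
// MCPURL is a pointer without omitempty, so the key is always emitted:
// as null when the pointer is nil, as a string otherwise.
type globalConfig struct {
	ExternalURL string  `json:"external_url"`
	MCPURL      *string `json:"mcp_url"`
}

// newGlobalConfig converts an optional parsed URL into a nullable string.
func newGlobalConfig(externalURL string, mcpURL *url.URL) globalConfig {
	cfg := globalConfig{ExternalURL: externalURL}
	if mcpURL != nil {
		s := mcpURL.String()
		cfg.MCPURL = &s
	}

	return cfg
}

// encode builds the config from raw strings and returns its JSON form.
// An empty rawMCPURL models the "unset" case.
func encode(externalURL, rawMCPURL string) string {
	var mcpURL *url.URL
	if rawMCPURL != "" {
		parsed, err := url.Parse(rawMCPURL)
		if err != nil {
			panic(err)
		}
		mcpURL = parsed
	}

	b, err := json.Marshal(newGlobalConfig(externalURL, mcpURL))
	if err != nil {
		panic(err)
	}

	return string(b)
}

func main() {
	fmt.Println(encode("https://signoz.example.com", ""))
	fmt.Println(encode("https://signoz.example.com", "https://mcp.example.com"))
}
```

A frontend can then gate on `mcp_url !== null` rather than on the key's presence.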
* feat(global): surface mcp_url in frontend types
Adds mcp_url to the manual GlobalConfigData type and refreshes the
generated OpenAPI client so consumers can read the new field.
* docs(global): use <unset> placeholder for mcp_url example
Matches the style of external_url and ingestion_url above it.
* style(global): separate mcp_url prep from return in GetConfig
Adds a blank line between the nullable-conversion block and the return
statement so the two logical phases read as distinct blocks.
* feat(global): mark endpoint fields as required in the API schema
The backend always emits external_url, ingestion_url and mcp_url on
GET /api/v1/global/config (mcp_url as literal null when unset), so the
JSON keys are always present. Add required:"true" to all three and
regenerate the OpenAPI + frontend client so consumers get non-optional
types.
* revert(global): drop mcp_url from legacy GlobalConfigData type
The legacy hand-written type for the non-Orval getGlobalConfig client
should be left alone; consumers that need mcp_url go through the
generated Orval client.
* feat: setup types and interface for waterfall v3
v3 is required for updating the response JSON of
the waterfall API. There won't be any logical change.
Using this requirement as an opportunity to move
waterfall api to provider codebase architecture from
older query-service
* refactor: move type conversion logic to types pkg
* chore: add reason for using snake case in response
* fix: update span.attributes to map of string to any
To support the OTel format, where attributes can have different types
* fix: remove unused fields and rename span type
To avoid confusion with the OTel span
* refactor: convert waterfall api to modules format
* chore: add same test cases as for old waterfall api
* chore: avoid sorting on every traversal
* fix: remove unused fields and rename span type
To avoid confusion with the OTel span
* fix: rename timestamp to milli for readability
* fix: add timeout to module context
* fix: use typed parameter field in logs
* chore: generate openapi spec for v3 waterfall
* fix: remove timeout since waterfall takes longer
* fix: use int16 for status code as per db schema
* fix: update openapi specs
* refactor: break down GetWaterfall method for readability
* chore: avoid returning nil, nil
* refactor: move type creation and constants to types package
- Move DB/table/cache/windowing constants to tracedetailtypes package
- Add NewWaterfallTrace and NewWaterfallResponse constructors in types
- Use constructors in module.go instead of inline struct literals
- Reorder waterfall.go so public functions precede private ones
* refactor: extract ClickHouse queries into a store abstraction
Move GetTraceSummary and GetTraceSpans out of module.go into a
traceStore interface backed by clickhouseTraceStore in store.go.
The module struct now holds a traceStore instead of a raw
telemetrystore.TelemetryStore, keeping DB access separate from
business logic.
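The separation described above might look roughly like the sketch below. The method signature, the `span` type, `memoryStore`, and `SpanCount` are assumptions made for illustration; only the names `traceStore` and `GetTraceSpans` come from the commit. The module depends on the interface, so a test can swap in an in-memory fake instead of ClickHouse.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// span is a hypothetical, heavily reduced span record.
type span struct {
	ID string
}

// traceStore is the narrow DB-facing interface the module depends on.
// The real signatures may differ; this one is assumed for illustration.
type traceStore interface {
	GetTraceSpans(ctx context.Context, traceID string) ([]span, error)
}

var errTraceNotFound = errors.New("trace not found")

// module holds only the store interface, not a raw telemetry store.
type module struct {
	store traceStore
}

// SpanCount is a stand-in for business logic: it never touches SQL.
func (m *module) SpanCount(ctx context.Context, traceID string) (int, error) {
	spans, err := m.store.GetTraceSpans(ctx, traceID)
	if err != nil {
		return 0, err
	}
	if len(spans) == 0 {
		return 0, errTraceNotFound
	}

	return len(spans), nil
}

// memoryStore is an in-memory fake that satisfies traceStore in tests.
type memoryStore map[string][]span

func (s memoryStore) GetTraceSpans(_ context.Context, traceID string) ([]span, error) {
	return s[traceID], nil
}

// countFor wires the module to the fake store and runs one lookup.
func countFor(traceID string) (int, error) {
	m := &module{store: memoryStore{"t1": {{ID: "a"}, {ID: "b"}}}}
	return m.SpanCount(context.Background(), traceID)
}

func main() {
	fmt.Println(countFor("t1"))
	fmt.Println(countFor("missing"))
}
```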
* refactor: move error to types as well
* refactor: separate out store calls and computations
* refactor: breakdown GetSelectedSpans for readability
* refactor: return 404 on missing trace and other cleanup
* refactor: use same method for cache key creation
* chore: remove unused duration nano field
* chore: use sqlbuilder in clickhouse store where possible
* refactor: move waterfall traverse logic to types
and extract out auto expanded span calculation
* chore: convert all timestamps to nano for consistency
* chore: rename waterfall response to gettableX format
* chore: fix method calls in test after refactoring
* refactor: remove unused methods
* chore: fix openapi spec
* chore: better names for methods and vars
* chore: remove caching to match v2
* chore: update openapi client
* refactor: move selection decision to types
* chore: move types to the top
* refactor: avoid passing the whole telemetry store in a module
* refactor: move waterfall constants to module config
* chore: update openapi specs
* chore: update openapi clients
* feat(authz): add check API for community build
* feat(authz): move to types
* feat(authz): fix the role correlations
* feat(authz): fix the role correlations
* fix(authz): single line returns
* feat(authz): add support for delete role
* feat(authz): register config and return error on cleanup failure
* feat(authz): take user and serviceaccount DI for assignee checks
* feat(authz): add the example yaml
* feat(authz): move to callbacks instead of DI
* chore: baseline setup
* chore: endpoint detail update
* chore: added logic for hosts v3 api
* fix: bug fix
* chore: disk usage
* chore: added validate function
* chore: added some unit tests
* chore: return status as a string
* chore: yarn generate api
* chore: removed isSendingK8sAgentsMetricsCode
* chore: moved funcs
* chore: added validation on order by
* chore: updated spec
* chore: nil pointer dereference fix in req.Filter
* chore: added temporalities of metrics
* chore: unified composite key function
* chore: code improvements
* chore: hostStatusNone added for clarity that this field can be left empty as well in payload
* chore: yarn generate api
* chore: return errors from getMetadata and lint fix
* chore: return errors from getMetadata and lint fix
* chore: added hostName logic
* chore: modified getMetadata query
* chore: add type for response and files rearrange
* chore: pass warnings from queryResponse through to the hosts list response struct
* chore: added better metrics existence check
* chore: added a TODO remark
* chore: added required metrics check
* chore: switch getMetadata from the distributed samples table to the local table
* chore: frontend fix
* chore: endpoint correction
* chore: endpoint modification openapi
* chore: escape backtick to prevent sql injection
* chore: rearrange
* chore: improvements
* chore: move order by validation to validate function
* chore: improved description
* chore: added TODOs and made filterByStatus a part of filter struct
* chore: ignore empty string hosts in get active hosts
* feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956)
* chore: add functionality for showing active and inactive counts in custom group by
* chore: bug fix
* chore: added subquery for active and total count
* chore: ignore empty string hosts in get active hosts
* fix: sinceUnixMilli for determining active hosts compute once per request
* chore: refactor code
* chore: rename HostsList -> ListHosts
* chore: rearrangement
* chore: inframonitoring types renaming
* chore: added types package
* chore: file structure further breakdown for clarity
* chore: comments correction
* chore: removed temporalities
* chore: comments resolve
* chore: added json tag required: true
* chore: added status unauthorized
* chore: remove a defensive nil map check; the function ensures a non-nil map when err is nil
* chore: make sort stable in case of tiebreaker by comparing composite group by keys
* chore: regen api client for inframonitoring
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: custom notifiers in alert manager
* chore: lint fixes
* chore: fix email linter
* chore: added tracing to msteamsv2 notifier
* feat: alert manager template to template title and notification body
* chore: updated test name + code for timeout errors
* chore: added utils for using variables with $ notation
* chore: exposed templates for alertmanager types
* feat: added preprocessor for alert templater
* chore: hooked preProcess function in expandTitle and body, added labels and annotations in alertdata
* chore: fix lint issues
* chore: added handling for missing variable used in template
* feat: converted alerttemplater to interface and updated tests
* refactor: added extractCommonKV instead of 2 different functions
* test: fix preprocessor test case
* feat: added support for and in templating
* chore: lint fix
* chore: renamed the interface
* chore: added test for missing function
* refactor: test case and sb related changes
* refactor: comments and test improvements
* chore: lint fix
* chore: updated comments
* feat: added basic html markdown templater
* chore: updated newline to markdown format
* feat: slack blockkit renderer using goldmark
* test: added test for html rendering
* feat: integrated slack blockkit in markdownrenderer package and removed plaintext format
* chore: updated br with new line in test and logs added
* refactor: review comments
* refactor: lint fixes
* chore: updated licenses for notifiers
* chore: updated email notifier from upstream
* feat: return single templating result from with flag for template type
* fix: variables with symbols in template
* feat: slack mrkdwn renderer
* feat: custom raw html renderer to escape <no value>
* chore: integrated slack mrkdwn renderer and added NoOp formatter
* chore: removed notifier test files
* fix: concurrent rendering in markdown renderer
* refactor: changes as per internal review
* chore: lint issue
* chore: removed special handling for softline break
* refactor: removed logger as markdown renderer dependency
* refactor: changed markdown renderer from interface to package-level functions
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* fix: tons of changes
* chore: remove redundant comparison
* ci: tests fixed
* fix: upgraded collector version
* fix: qbtoexpr tests
* fix: go sum
* chore: upgrade collector version v0.144.3-rc.4
* fix: tests
* ci: test fix
* revert: remove db binaries
* test: selectField tests added
* fix: added safeguards in plan generation
* fix: name changed to field_map
* fix: json access plan removal of AvailableTypes
* fix: invalid index usage on terminal condition
* fix: branches should tell missing array types
* fix: comment removed
* fix: issue with FuzzyMatching and API failing
* fix: int64 mapping
* ci: test and lint fix
* fix: test VisitKey
* test: running test for sku
* fix: buildFieldForJSON works
* fix: few minor changes
* fix: refactor tag vs field_key table
* fix: minor changes based on review
* revert: minor variable change
* fix: added more membership testcases
* revert: minor var names reverted
* ci: tests aligned
* fix: indexed expressions
* docs: perses schema for dashboards
* chore: no need for Signal type in commons, only used once
* chore: no need for PageSize type in commons, only used once
* chore: rm comment
* chore: remove stub for time series chart
* chore: remove manually written manifest and package
* chore: remove validate file
* chore: no config folder
* chore: no config folder
* chore: no commons (for now)
* feat: validation script
* fix: remove fields from variable specs that are there in ListVariable
* chore: test file with way more examples
* chore: test file with way more examples
* chore: checkpoint for half correct setup
* chore: rearrange specs in package.json
* chore: py script not needed
* chore: rename
* chore: folders in schemas for arranging
* chore: folders in schemas for arranging
* fix: proper composite query schema
* feat: custom time series schema
* chore: comment explaining when to use composite query and when not
* feat: promql example
* chore: remove upstream import
* fix: promql fix
* docs: time series panel schema without upstream ref
* chore: object for visualization section
* docs: bar chart panel schema without upstream ref
* docs: number panel schema without upstream ref
* docs: number panel schema without upstream ref
* docs: pie chart panel schema without upstream ref
* docs: table chart panel schema without upstream ref
* docs: histogram chart panel schema without upstream ref
* docs: list panel schema without upstream ref
* chore: a more complex example
* chore: examples for panel types
* chore: remaining fields file
* fix: no more online validation
* chore: replace yAxisUnit by unit
* chore: no need for threshold prefix inside threshold obj
* chore: remove unimplemented join query schema
* fix: no nesting in context links
* fix: less verbose field names in dynamic var
* chore: actually name every panel as a panel
* chore: common package for panels' repeated definitions
* chore: common package for queries' repeated definitions
* chore: common package for variables' repeated definitions
* fix: functions in formula
* fix: only allow one of metric or expr aggregation in builder query
* fix: datasource in perses.json
* fix: promql step duration schema
* fix: proper type for selectFields
* chore: single version for all schemas
* fix: normalise enum defs
* chore: change attr name to name
* chore: common threshold type
* chore: doc for how to add a panel spec
* feat: textbox variable
* feat: go struct based schema for dashboardv2 with validations and some tests
* fix: go mod fix
* chore: perses folder not needed anymore
* chore: use perses updated/createdat
* fix: builder query validation (might need to revisit, 3 types seems bad)
* chore: go lint fixes
* chore: define constants for enum values
* chore: nil factory case not needed
* chore: nil factory case not needed
* chore: slight rearrange for builder spec readability
* feat: add TimeSeriesChartAppearance
* chore: no omit empty
* chore: span gaps in schema
* chore: context link not needed in plugins
* chore: remove format from threshold with label, rearrange structs
* test: fix unit tests
* chore: refer to common struct
* feat: query type and panel type matching
* test: unit tests improvement first pass
* test: unit tests improvement second pass
* test: unit tests improvement third pass
* test: unit tests improvement fourth pass
* test: unit test for dashboard with sections
* test: unit test for dashboard with sections
* fix: add missing dashboard metadata fields
* chore: go lint fixes
* chore: go lint fixes
* chore: changes for create v2 api
* chore: more info in StorableDashboardDataV2
* chore: diff check in update method
* chore: add required true tag to required fields
* feat: update metadata methods
* chore: go mod tidy
* chore: put id in metadata.name, authtypes for v2
* revert: only the schema for now in this PR
* chore: comment for why v1.DashboardSpec is chosen
* chore: change source to signal in DynamicVariableSpec
* fix: string values for precision option
* feat: literal options for comparison operator
* fix: missing required tag in threshold fields
* chore: use valuer.string for plugin kind enums
* chore: use only TelemetryFieldKey in ListPanelSpec
* chore: simplify variable plugin validation
* fix: do not allow nil panels
* fix: do not allow nil plugin spec
* fix: signal should be an enum not a string
* chore: rearrange enums to separate those with default values
* test: unit tests for invalid enum values
* fix: all enums should have a default value
* refactor: extract UnmarshalBuilderQueryBySignal to deduplicate signal dispatch
* refactor: proper struct for span gaps
* chore: back to normal strings for kind enums
* chore: ticks in err messages
* chore: ticks in err messages
* chore: remove unused struct
* chore: snake case for non-kind enum values
* chore: proper error wrapping
* chore: accept int values in PrecisionOption as fallback
* fix: actually update the plugin from map to custom struct
* feat: disallow unknown fields in plugins
* chore: make enums valuer.string
* chore: proper enum types in constants
* chore: rename value to avoid overriding valuer.string method
* test: db cycle test
* fix: lint fix in some other file
* test: remove collapse info from sections
* test: use testify package
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* refactor(ruler): add Rule v2 read type and rename storage Rule to StorableRule
* refactor(ruler): map GettableRule to Rule before responding on v2 routes
* docs(openapi): regenerate spec with RuletypesRule on v2 rules routes
* docs(frontend): regenerate API clients with RuletypesRuleDTO
* refactor(ruler): validate uuid-v7 on delete rule handler
* refactor(ruler): add Enum() on AlertType
* refactor(ruler): convert RepeatType and RepeatOn to valuer.String with Enum()
* refactor(ruler): mark required fields on Recurrence
* refactor(ruler): mark required tag on CumulativeSchedule.Type
* refactor(ruler): rename GettablePlannedMaintenance to PlannedMaintenance
* docs: regenerate OpenAPI spec and frontend clients with tightened schema
* refactor(ruler): add PostablePlannedMaintenance input type with Validate
* refactor(ruler): rename EditPlannedMaintenance to Update and GetAll to List
* refactor(ruler): switch Create/Update to *PostablePlannedMaintenance
* refactor(ruler): convert PlannedMaintenance.Id string to ID valuer.UUID
* refactor(ruler): return *PlannedMaintenance from CreatePlannedMaintenance
* docs: regenerate OpenAPI spec and frontend clients for Postable/ID changes
* refactor(ruler): type PlannedMaintenance.Status as MaintenanceStatus enum
* refactor(ruler): type PlannedMaintenance.Kind as MaintenanceKind enum
* refactor(ruler): mark GettableRule.Id required
* refactor(ruler): mark GettableRule.State required
* refactor(ruler): make GettableRule timestamps non-pointer and users nullable
* refactor(ruler): return bare array from v2 ListRules instead of wrapped object
* docs: regenerate OpenAPI spec and frontend clients for schema pass
* refactor(ruler): define Ruler and Handler interfaces with signozruler implementation
Expand the Ruler interface with rule management and planned maintenance
methods matching rules.Manager signatures. Add Handler interface for
HTTP endpoints. Implement handler in signozruler wrapping ruler.Ruler,
and update provider to embed *rules.Manager for interface satisfaction.
* refactor(ruler): move eval_delay from query-service constants to ruler config
Replace constants.GetEvalDelay() with config.EvalDelay on ruler.Config,
defaulting to 2m. This removes the signozruler dependency on
pkg/query-service/constants.
* refactor(ruler): use time.Duration for eval_delay config
Match the convention used by all other configs in the codebase.
TextDuration is for preserving human-readable text through JSON
round-trips in user-facing rule definitions, not for internal config.
* refactor(ruler): add godoc comments and spacing to Ruler interface
* refactor(ruler): wire ruler handler through signoz.New and signozapiserver
- Add Start/Stop to Ruler interface for lifecycle management
- Add rulerCallback to signoz.New() for EE customization
- Wire ruler.Handler through Handlers, signozapiserver provider
- Register 12 routes in signozapiserver/ruler.go (7 rules, 5 downtime)
- Update cmd/community and cmd/enterprise to pass rulerCallback
- Move rules.Manager creation from server.go to signoz.New via callback
- Change APIHandler.ruleManager type from *rules.Manager to ruler.Ruler
- Remove makeRulesManager from both OSS and EE server.go
* refactor(ruler): remove old rules and downtime_schedules routes from http_handler
Remove 7 rules CRUD routes and 5 downtime_schedules routes plus their
handler methods from http_handler.go. These are now served by
signozapiserver/ruler.go via handler.New() with OpenAPIDef.
The 4 v1 history routes (stats, timeline, top_contributors,
overall_status) remain in http_handler.go as they depend on
interfaces.Reader and have v2 equivalents already in signozapiserver.
* refactor(ruler): use ProviderFactory pattern and register in factory.Registry
Replace the rulerCallback with rulerProviderFactories following the
standard ProviderFactory pattern (like auditorProviderFactories). The
ruler is now created via factory.NewProviderFromNamedMap and registered
in factory.Registry for lifecycle management. Start/Stop are no longer
called manually in server.go.
- Ruler interface embeds factory.Service (Start/Stop return error)
- signozruler.NewFactory accepts all deps including EE task funcs
- provider uses named field (not embedding) with explicit delegation
- cmd/community passes nil task funcs, cmd/enterprise passes EE funcs
- Remove NewRulerProviderFactories (replaced by callback from cmd/)
- Remove manual Start/Stop from both OSS and EE server.go
* fix(ruler): make Start block on stopC per factory.Service contract
rules.Manager.Start is non-blocking (run() just closes a channel).
Add stopC to provider so Start blocks until Stop closes it, matching
the factory.Service contract used by the Registry.
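A minimal sketch of the blocking-Start pattern described above, assuming a wrapped manager whose own start call returns immediately. The `provider` type, its fields, and `demo` are hypothetical; the real provider wraps rules.Manager and satisfies factory.Service.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// provider wraps a manager whose start function is non-blocking.
type provider struct {
	startManager func() // stand-in for the wrapped manager's Start
	stopC        chan struct{}
	stopOnce     sync.Once
}

func newProvider(startManager func()) *provider {
	return &provider{startManager: startManager, stopC: make(chan struct{})}
}

// Start launches the manager and then blocks until Stop closes stopC.
func (p *provider) Start(ctx context.Context) error {
	p.startManager()
	<-p.stopC
	return nil
}

// Stop unblocks Start; sync.Once makes repeated calls safe.
func (p *provider) Stop(ctx context.Context) error {
	p.stopOnce.Do(func() { close(p.stopC) })
	return nil
}

// isClosed reports whether c has been closed, without blocking.
func isClosed(c <-chan struct{}) bool {
	select {
	case <-c:
		return true
	default:
		return false
	}
}

// demo reports whether Start had returned before and after Stop was called.
func demo() (returnedBeforeStop, returnedAfterStop bool) {
	p := newProvider(func() {})
	done := make(chan struct{})
	go func() {
		_ = p.Start(context.Background())
		close(done)
	}()

	returnedBeforeStop = isClosed(done) // Start cannot return until stopC closes
	_ = p.Stop(context.Background())
	<-done
	returnedAfterStop = isClosed(done)

	return returnedBeforeStop, returnedAfterStop
}

func main() {
	before, after := demo()
	fmt.Println("returned before Stop:", before)
	fmt.Println("returned after Stop:", after)
}
```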
* refactor(ruler): remove unused RM() accessor from EE APIHandler
* refactor(ruler): remove RuleManager from APIHandlerOpts
Use Signoz.Ruler directly instead of passing it through opts.
* refactor(ruler): add /api/v1/rules/test and mark /api/v1/testRule as deprecated
* refactor(ruler): use binding.JSON.BindBody for downtime schedule decode
* refactor(ruler): add TODOs for raw string params on Ruler interface
Mark CreateRule, EditRule, PatchRule, TestNotification, and DeleteRule
with TODOs to accept typed params instead of raw JSON strings. Requires
changing the storage model since the manager stores raw JSON as Data.
* refactor(ruler): add TODO on MaintenanceStore to not expose store directly
* docs: regenerate OpenAPI spec and frontend API clients with ruler routes
* refactor(ruler): rename downtime_schedules tag to downtimeschedules
* refactor(ruler): add query params to ListDowntimeSchedules OpenAPIDef
Add ListPlannedMaintenanceParams struct with active/recurring fields.
Use binding.Query.BindQuery in the handler instead of raw URL parsing.
Add RequestQuery to the OpenAPIDef so params appear in the OpenAPI spec
and generated frontend client.
* refactor(ruler): add GettableTestRule response type to TestRule endpoint
Define GettableTestRule struct with AlertCount and Message fields.
Use it as the Response in TestRule OpenAPIDef so the generated frontend
client has a proper response type instead of string.
* refactor(ruler): tighten schema with oneOf unions and required fields
Surface the polymorphism in RuleThresholdData and EvaluationEnvelope via
JSONSchemaOneOf (the same pattern as QueryEnvelope), so the generated
TS types are discriminated unions with typed `spec` instead of unknown.
Also mark `alert`, `ruleType`, and `condition` required on PostableRule
so the generated TS types are non-optional for callers.
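On the Go side, the polymorphism that the oneOf schema surfaces can be sketched as a kind-discriminated envelope. This is an illustration only: the kind values "rolling" and "cumulative", the spec struct shapes, and the `decode` helper are assumptions, and the sketch does not use JSONSchemaOneOf itself.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical spec types; field names are chosen for illustration.
type rollingSpec struct {
	EvalWindow string `json:"evalWindow"`
	Frequency  string `json:"frequency"`
}

type cumulativeSpec struct {
	Frequency string `json:"frequency"`
	Timezone  string `json:"timezone"`
}

// evaluationEnvelope carries a discriminator and a kind-specific spec.
type evaluationEnvelope struct {
	Kind string `json:"kind"`
	Spec any    `json:"spec"`
}

// UnmarshalJSON decodes spec into the concrete type selected by kind.
func (e *evaluationEnvelope) UnmarshalJSON(data []byte) error {
	var raw struct {
		Kind string          `json:"kind"`
		Spec json.RawMessage `json:"spec"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}

	switch raw.Kind {
	case "rolling": // assumed kind value
		var s rollingSpec
		if err := json.Unmarshal(raw.Spec, &s); err != nil {
			return err
		}
		e.Spec = s
	case "cumulative": // assumed kind value
		var s cumulativeSpec
		if err := json.Unmarshal(raw.Spec, &s); err != nil {
			return err
		}
		e.Spec = s
	default:
		return fmt.Errorf("unknown evaluation kind %q", raw.Kind)
	}
	e.Kind = raw.Kind

	return nil
}

// decode parses one JSON payload into an envelope.
func decode(payload string) (evaluationEnvelope, error) {
	var env evaluationEnvelope
	err := json.Unmarshal([]byte(payload), &env)
	return env, err
}

func main() {
	env, err := decode(`{"kind":"rolling","spec":{"evalWindow":"5m","frequency":"1m"}}`)
	fmt.Printf("%+v %v\n", env, err)

	_, err = decode(`{"kind":"bogus","spec":{}}`)
	fmt.Println(err)
}
```

A discriminated union in the generated TS types mirrors this switch: the `kind` literal narrows the type of `spec`.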
* refactor(ruler): add Enum() on EvaluationKind, ScheduleType, ThresholdKind
Surface the fixed set of accepted values for these valuer-wrapped kind
types so OpenAPI emits proper string-enum schemas and the generated TS
types become string-literal unions instead of plain string.
* refactor(ruler): mark required fields on nested rule and maintenance types
Surface fields already enforced by Validate()/UnmarshalJSON as required
in the OpenAPI schema so the generated TS types match runtime behavior.
Touches RuleCondition (compositeQuery, op, matchType), RuleThresholdData
(kind, spec), BasicRuleThreshold (name, target, op, matchType),
RollingWindow (evalWindow, frequency), CumulativeWindow (schedule,
frequency, timezone), EvaluationEnvelope (kind, spec), Schedule
(timezone), GettablePlannedMaintenance (name, schedule).
Does not mark server-populated fields (id, createdAt, updatedAt, status,
kind) on GettablePlannedMaintenance required, since the same struct is
reused for request bodies in MaintenanceStore.CreatePlannedMaintenance.
* refactor(ruler): tighten AlertCompositeQuery, QueryType, PanelType schema
Missed in the earlier tightening pass. AlertCompositeQuery.queries,
panelType, queryType are all required for a valid composite query;
QueryType and PanelType are valuer-wrapped with fixed value sets, so
expose them as enums in the OpenAPI schema.
* refactor(ruler): wrap sql.ErrNoRows as TypeNotFound in by-ID lookups
GetStoredRule and GetPlannedMaintenanceByID previously returned bun's
raw Scan error, so a missing ID leaked "sql: no rows in result set" to
the HTTP response with a 500 status. WrapNotFoundErrf converts
sql.ErrNoRows into TypeNotFound so render.Error emits 404 with a stable
`not_found` code, and passes other errors through unchanged.
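The wrapping behaviour described above can be approximated with the standard library alone. In this sketch `wrapNotFound` is a hypothetical stand-in for WrapNotFoundErrf, `errNotFound` stands in for TypeNotFound, and `getRuleByID` fakes the store lookup; none of these reflect the real signatures.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"net/http"
)

// errNotFound stands in for the typed not-found error.
var errNotFound = errors.New("not_found")

// wrapNotFound converts sql.ErrNoRows into errNotFound with a message and
// passes every other error (including nil) through unchanged.
func wrapNotFound(err error, format string, args ...any) error {
	if errors.Is(err, sql.ErrNoRows) {
		return fmt.Errorf("%w: %s", errNotFound, fmt.Sprintf(format, args...))
	}

	return err
}

// statusFor maps an error to the HTTP status a renderer might emit.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

// getRuleByID fakes a by-ID lookup: one known row, one failing connection,
// and sql.ErrNoRows for everything else.
func getRuleByID(id string) error {
	var scanErr error
	switch id {
	case "rule-1":
		scanErr = nil
	case "broken":
		scanErr = errors.New("connection reset")
	default:
		scanErr = sql.ErrNoRows
	}

	return wrapNotFound(scanErr, "rule with id %s does not exist", id)
}

func main() {
	for _, id := range []string{"rule-1", "missing", "broken"} {
		err := getRuleByID(id)
		fmt.Println(id, statusFor(err), err)
	}
}
```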
* refactor(ruler): move migrated rules routes to /api/v2/rules
The 7 rules routes now live at /api/v2/rules, /api/v2/rules/{id}, and
/api/v2/rules/test — served via handler.New with render.Success and
render.Error. The legacy /api/v1/rules paths will be restored in the
query-service http handler in a follow-up so existing clients keep
receiving the SuccessResponse envelope unchanged.
Drop the /api/v1/testRule deprecated alias from signozapiserver; the
original lives on main's http_handler.go and is restored alongside the
other v1 paths.
Downtime schedule routes stay at /api/v1/downtime_schedules — single
track, no legacy restore planned.
* refactor(ruler): restore /api/v1/rules legacy handlers for back-compat
Bring the 7 rule CRUD/test handlers and their router.HandleFunc lines
back to http_handler.go so /api/v1/rules, /api/v1/rules/{id}, and
/api/v1/testRule continue to emit the legacy SuccessResponse envelope.
The v2 versions under signozapiserver are the new home for the render
envelope used by generated clients.
Delegation uses aH.ruleManager (populated from opts.Signoz.Ruler in
NewAPIHandler), so a single ruler.Ruler instance serves both paths — no
second rules.Manager is instantiated.
Downtime schedules stay single-track under signozapiserver; the 5
downtime handlers are not restored.
* docs: regenerate OpenAPI spec and frontend clients for /api/v2/rules
* refactor(ruler): return 201 Created on POST /api/v2/rules
A successful create now responds with 201 Created and the full
GettableRule body, matching REST convention for resource creation.
Regenerates the OpenAPI spec and frontend clients to reflect the new
status code.
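A rough sketch of the create-returns-201 convention using plain net/http. The real route goes through handler.New and render.Success, which are not reproduced here; the `rule` type, the placeholder ID, and the `post` helper are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// rule is a hypothetical, heavily reduced rule body.
type rule struct {
	ID    string `json:"id"`
	Alert string `json:"alert"`
}

// createRule decodes the request, assigns a placeholder ID, and responds
// with 201 Created plus the full created resource.
func createRule(w http.ResponseWriter, r *http.Request) {
	var in rule
	if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
		http.Error(w, "invalid body", http.StatusBadRequest)
		return
	}
	in.ID = "generated-id" // placeholder; a real store would assign this

	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusCreated)
	_ = json.NewEncoder(w).Encode(in)
}

// post exercises the handler in-process and returns status and body.
func post(body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/api/v2/rules", strings.NewReader(body))
	rec := httptest.NewRecorder()
	createRule(rec, req)

	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	fmt.Println(post(`{"alert":"cpu high"}`))
	fmt.Println(post(`not json`))
}
```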
* refactor(ruler): restore dropped sorter TODO in legacy listRules
The legacy listRules handler was copied verbatim from main during the
v1 back-compat restore, but an inner blank line and the load-bearing
`// todo(amol): need to add sorter` comment were stripped. Put them
back so the legacy block round-trips cleanly against main.
* refactor(ruler): return 201 Created on POST /api/v1/downtime_schedules
Match the REST convention already applied to POST /api/v2/rules:
successful creates respond with 201 Created. Response body remains
empty (nil); the generated frontend client surface is unchanged since
no response type was declared.
A richer "return the created resource" response body is a separate
follow-up — holding off until the ruletypes naming cleanup lands.
* fix(ruler): signal Healthy only after manager.Start closes m.block
The ruler provider didn't implement factory.Healthy, so the registry
fell back to factory.closedC and marked the service StateRunning the
instant its Start goroutine spawned — before rules.Manager.Start had
closed m.block. /api/v2/healthz therefore returned 200 while rule
evaluation was still gated, and integration tests that POSTed a rule
immediately after the readiness check saw their task goroutines stuck
on <-m.block until the next frequency tick.
Add a healthyC channel and close it inside Start only after
manager.Start returns; implement factory.Healthy so the registry and
/api/v2/healthz wait on the real readiness signal.
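The readiness fix described above can be sketched like this. It assumes the wrapped manager's start call returns only once evaluation is unblocked; the `provider` type, the gate channel, and `demo` are hypothetical, and the real factory.Healthy interface may have a different shape.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// provider exposes a readiness channel next to the usual stop channel.
type provider struct {
	startManager func() // stand-in: returns once the manager is ready
	healthyC     chan struct{}
	stopC        chan struct{}
	stopOnce     sync.Once
}

func newProvider(startManager func()) *provider {
	return &provider{
		startManager: startManager,
		healthyC:     make(chan struct{}),
		stopC:        make(chan struct{}),
	}
}

// Start closes healthyC only after the manager has finished starting,
// then blocks until Stop is called.
func (p *provider) Start(ctx context.Context) error {
	p.startManager()
	close(p.healthyC)
	<-p.stopC
	return nil
}

// Healthy is what a registry or health endpoint would wait on.
func (p *provider) Healthy() <-chan struct{} { return p.healthyC }

func (p *provider) Stop(ctx context.Context) error {
	p.stopOnce.Do(func() { close(p.stopC) })
	return nil
}

// isClosed reports whether c has been closed, without blocking.
func isClosed(c <-chan struct{}) bool {
	select {
	case <-c:
		return true
	default:
		return false
	}
}

// demo holds the manager start behind a gate and samples Healthy
// before and after releasing it.
func demo() (healthyBefore, healthyAfter bool) {
	gate := make(chan struct{})
	p := newProvider(func() { <-gate })
	go func() { _ = p.Start(context.Background()) }()

	healthyBefore = isClosed(p.Healthy()) // manager start is still gated
	close(gate)
	<-p.Healthy()
	healthyAfter = isClosed(p.Healthy())
	_ = p.Stop(context.Background())

	return healthyBefore, healthyAfter
}

func main() {
	before, after := demo()
	fmt.Println("healthy before manager start finished:", before)
	fmt.Println("healthy after manager start finished:", after)
}
```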
* fix: add the withhealthy interface
* fix(ruler): alias legacy RULES_EVAL_DELAY env var in backward-compat
The eval_delay config was moved from query-service constants (read from
RULES_EVAL_DELAY) onto ruler.Config (read via mapstructure from
SIGNOZ_RULER_EVAL__DELAY). That silently broke the legacy env var for
any existing deployment — notably the alerts integration-test fixture
which sets RULES_EVAL_DELAY=0s to let rules evaluate against just-
inserted data. The resulting default 2m delay pushed the query window
far enough back that the fixture's rate spike fell outside it, causing
8 of 24 parametrize cases in 02_basic_alert_conditions.py to fail with
"Expected N alerts to be fired but got 0 alerts".
Add RULES_EVAL_DELAY to mergeAndEnsureBackwardCompatibility alongside
the ~10 other aliased legacy env vars. Emits the standard deprecation
warning and overrides config.Ruler.EvalDelay.
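A minimal sketch of the legacy-variable override described above. The `rulerConfig` type, `applyLegacyEvalDelay`, and `resolve` are hypothetical; the real logic lives in mergeAndEnsureBackwardCompatibility and also logs the deprecation warning, which is reduced to a boolean here. The lookup parameter has the same shape as os.LookupEnv so the sketch can run against a map.

```go
package main

import (
	"fmt"
	"time"
)

// rulerConfig is a hypothetical, reduced ruler config.
type rulerConfig struct {
	EvalDelay time.Duration
}

// applyLegacyEvalDelay overrides cfg.EvalDelay when the legacy
// RULES_EVAL_DELAY variable is set. It reports whether a deprecation
// warning should be emitted. lookup has the signature of os.LookupEnv.
func applyLegacyEvalDelay(cfg *rulerConfig, lookup func(string) (string, bool)) (bool, error) {
	raw, ok := lookup("RULES_EVAL_DELAY")
	if !ok || raw == "" {
		return false, nil
	}

	d, err := time.ParseDuration(raw)
	if err != nil {
		return false, fmt.Errorf("invalid RULES_EVAL_DELAY %q: %w", raw, err)
	}
	cfg.EvalDelay = d

	return true, nil
}

// resolve starts from the 2m default and applies the legacy override
// from a map-backed environment.
func resolve(env map[string]string) (time.Duration, bool, error) {
	cfg := rulerConfig{EvalDelay: 2 * time.Minute}
	warned, err := applyLegacyEvalDelay(&cfg, func(key string) (string, bool) {
		v, ok := env[key]
		return v, ok
	})

	return cfg.EvalDelay, warned, err
}

func main() {
	fmt.Println(resolve(map[string]string{}))
	fmt.Println(resolve(map[string]string{"RULES_EVAL_DELAY": "0s"}))
	fmt.Println(resolve(map[string]string{"RULES_EVAL_DELAY": "soon"}))
}
```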
* chore: custom notifiers in alert manager
* chore: lint fixes
* chore: fix email linter
* chore: added tracing to msteamsv2 notifier
* feat: alert manager template to template title and notification body
* chore: updated test name + code for timeout errors
* chore: added utils for using variables with $ notation
* chore: exposed templates for alertmanager types
* feat: added preprocessor for alert templater
* chore: hooked preProcess function in expandTitle and body, added labels and annotations in alertdata
* chore: fix lint issues
* chore: added handling for missing variable used in template
* feat: converted alerttemplater to interface and updated tests
* refactor: added extractCommonKV instead of 2 different functions
* test: fix preprocessor test case
* feat: added support for and in templating
* chore: lint fix
* chore: renamed the interface
* chore: added test for missing function
* refactor: test case and sb related changes
* refactor: comments and test improvements
* chore: lint fix
* chore: updated comments
* chore: updated newline to markdown format
* chore: updated br with new line in test and logs added
* refactor: review comments
* refactor: lint fixes
* chore: updated licenses for notifiers
* chore: updated email notifier from upstream
* feat: return single templating result from with flag for template type
* fix: variables with symbols in template
* chore: removed notifier test files
* refactor: changes as per internal review
* chore: lint issue
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* fix(member): better UX for pending invite users
* fix(member): add integration tests and reuse timezone util
* fix(member): rename deprecated and remove dead files
* fix(member): do not use hyphenated endpoints
* fix(member): user friendly button text
* fix(member): update the API endpoints and integration tests
* fix(member): simplify handler naming convention
* fix(member): added v2 API for update my password
* fix(member): remove more dead code
* fix(member): fix integration tests
* fix(member): fix integration tests
* refactor(alertmanager): move API handlers to signozapiserver
Extract Handler interface in pkg/alertmanager/handler.go and move
the implementation from api.go to signozalertmanager/handler.go.
Register all alertmanager routes (channels, route policies, alerts)
in signozapiserver via handler.New() with OpenAPIDef. Remove
AlertmanagerAPI injection from http_handler.go.
This enables future AuditDef instrumentation on these routes.
* fix(review): rename param, add /api/v1/channels/test endpoint
- Rename `am` to `alertmanagerService` in NewHandlers
- Add /api/v1/channels/test as the canonical test endpoint
- Mark /api/v1/testChannel as deprecated
- Regenerate OpenAPI spec
* fix(review): use camelCase for channel orgId json tag
* fix(review): remove section comments from alertmanager routes
* fix(review): use routepolicies tag without hyphen
* chore: regenerate frontend API clients for alertmanager routes
* fix: add required/nullable/enum tags to alertmanager OpenAPI types
- PostableRoutePolicy: mark expression, name, channels as required
- GettableRoutePolicy: change CreatedAt/UpdatedAt from pointer to value
- Channel: mark name, type, data, orgId as required
- ExpressionKind: add Enum() for rule/policy values
- Regenerate OpenAPI spec and frontend clients
* fix: use typed response for GetAlerts endpoint
* fix: add Receiver request type to channel mutation endpoints
CreateChannel, UpdateChannelByID, TestChannel, and TestChannelDeprecated
all read req.Body as a Receiver config. The OpenAPIDef must declare
the request type so the generated SDK includes the body parameter.
* fix: change CreateChannel access from EditAccess to AdminAccess
Aligns CreateChannel endpoint with the rest of the channel mutation
endpoints (update/delete) which all require admin access. This is
consistent with the frontend where notifications are not accessible
to editors.
* chore: initial commit
* chore: added metricNamespace as a new param
* chore: go generate openapi, update spec
* chore: frontend yarn generate:api
* chore: added metricnamespace support in /fields/values as well as added integration tests
* chore: corrected comment
* chore: added unit tests for getMetricsKeys and getMeterSourceMetricKeys
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* fix(authz): populate correct error for deleted service account
* chore(authz): reduce the regex restrictions on service accounts
* chore(authz): reduce the regex restrictions on service accounts
* fix(authz): populate correct error for deleted service account
* fix(authz): populate correct error for deleted service account
* feat(audit): add telemetry audit query infrastructure
Add pkg/telemetryaudit/ with tables, field mapper, condition builder,
and statement builder for querying audit logs from signoz_audit database.
Add SourceAudit to source enum and integrate audit key resolution
into the metadata store.
* chore: address review comments
Comment out SourceAudit from Enum() until frontend is ready.
Use actual audit table constants in metadata test helpers.
* fix(audit): align field mapper with actual audit DDL schema
Remove resources_string (not in audit table DDL).
Add event_name as intrinsic column.
Resource context resolves only through the resource JSON column.
* feat(audit): add audit field value autocomplete support
Wire distributed_tag_attributes_v2 for signoz_audit into the
metadata store. Add getAuditFieldValues() and route SignalLogs +
SourceAudit to it in GetFieldValues().
* test(audit): add statement builder tests
Cover all three request types (list, time series, scalar) with
audit-specific query patterns: materialized column filters, AND/OR
conditions, limit CTEs, and group-by expressions.
* refactor(audit): inline field key map into test file
Remove test_data.go and inline the audit field key map directly
into statement_builder_test.go with a compact helper function.
* style(audit): move column map to const.go, use sqlbuilder.As in metadata
Move logsV2Columns from field_mapper.go to const.go to colocate all
column definitions. Switch getAuditKeys() to use sb.As() instead of
raw string formatting. Fix FieldContext alignment.
* fix(audit): align table names with schema migration
Migration uses logs/distributed_logs (not logs_v2/distributed_logs_v2).
Rename LogsV2TableName to LogsTableName and LogsV2LocalTableName to
LogsLocalTableName to match the actual signoz_audit DDL.
* feat(audit): add integration test fixture for audit logs
AuditLog fixture inserts into all 5 signoz_audit tables matching
the schema migration DDL: distributed_logs (no resources_string,
has event_name), distributed_logs_resource, distributed_tag_attributes_v2,
distributed_logs_attribute_keys, distributed_logs_resource_keys.
* fix(audit): rename tag_attributes_v2 to tag_attributes
Migration uses tag_attributes/distributed_tag_attributes (no _v2
suffix). Rename constants and update all references including the
integration test fixture.
* feat(audit): wire audit statement builder into querier
Add auditStmtBuilder to querier struct and route LogAggregation
queries with source=audit to it in all three dispatch locations
(main query, live tail, shiftedQuery). Create and wire the full
audit query stack in signozquerier provider.
* test(audit): add integration tests for audit log querying
Cover the documented query patterns: list all events, filter by
principal ID, filter by outcome, filter by resource name+ID,
filter by principal type, scalar count for alerting, and
isolation test ensuring audit data doesn't leak into regular logs.
* fix(audit): revert sb.As in getAuditKeys, fix fixture column_names
Revert getAuditKeys to use raw SQL strings instead of sb.As() which
incorrectly treated string literals as column references. Add explicit
column_names to all ClickHouse insert calls in the audit fixture.
* fix(audit): remove debug assertion from integration test
* feat(audit): internalize resource filter in audit statement builder
Build the resource filter internally pointing at
signoz_audit.distributed_logs_resource. Add LogsResourceTableName
constant. Remove resourceFilterStmtBuilder from constructor params.
Update test expectations to use the audit resource table.
* fix(audit): rename resource.name to resource.kind, move to resource attributes
Align with schema change from SigNoz/signoz#10826:
- signoz.audit.resource.name renamed to signoz.audit.resource.kind
- resource.kind and resource.id moved from event attributes to OTel
Resource attributes (resource JSON column)
- Materialized columns reduced from 7 to 5 (resource.kind and
resource.id no longer materialized)
* refactor(audit): use pytest.mark.parametrize for filter integration tests
Consolidate filter test functions into a single parametrized test.
6/8 tests passing; resource kind+ID filter and scalar count need
further investigation (resource filter JSON key extraction with
dotted keys, scalar response format).
* fix(audit): add source to resource filter for correct metadata routing
Add source param to telemetryresourcefilter.New so the resource
filter's key selectors include Source when calling GetKeysMulti.
Without this, audit resource keys route to signoz_logs metadata
tables instead of signoz_audit. Fix scalar test to use table
response format (columns+data, not rows).
* refactor(audit): reuse querier fixtures in integration tests
Add source param to BuilderQuery and build_scalar_query in the
querier fixture. Replace custom _build_audit_query and
_build_audit_ts_query helpers with BuilderQuery and
build_scalar_query from the shared fixtures.
* refactor(audit): remove wrapper helpers, inline make_query_request calls
Remove _query_audit_raw and _query_audit_scalar helpers. Use
make_query_request, BuilderQuery, and build_scalar_query directly.
Compute time window at test execution time via _time_window() to
avoid stale module-level timestamps.
* refactor(audit): inline _time_window into test functions
* style(audit): use snake_case for pytest parametrize IDs
* refactor(audit): inline DEFAULT_ORDER using build_order_by
Use build_order_by from querier fixtures instead of OrderBy/
TelemetryFieldKey dataclasses. Allow BuilderQuery.order to accept
plain dicts alongside OrderBy objects.
* refactor(audit): inline all data setup, use distinct scenarios per test
Remove _insert_standard_audit_events helper. Each test now owns its
data: list_all uses alert-rule/saved-view/user resource types,
scalar_count uses multiple failures from different principals (count=2),
leak test uses a single organization event. Parametrized filter tests
keep the original 5-event dataset.
* fix(audit): remove silent empty-string guards in metadata store
Remove guards that silently returned nil/empty when audit DB params
were empty. All call sites now pass real constants, so misconfiguration
should fail loudly rather than produce silent empty results.
* style(audit): remove module docstring from integration test
* style: formatting fix in tables file
* style: formatting fix in tables file
* fix: add auditStmtBuilder nil param to querier_test.go
* fix: fix fmt
* feat: setup types and interface for waterfall v3
v3 is required for updating the response JSON of
the waterfall API. There won't be any logical change.
Using this requirement as an opportunity to move the
waterfall API to the provider codebase architecture from
the older query-service.
* refactor: move type conversion logic to types pkg
* chore: add reason for using snake case in response
* fix: update span.attributes to map of string to any
To support the OTel format for different attribute types
* fix: remove unused fields and rename span type
To avoid confusion with the OTel span
* chore: rename resources field to follow otel
---------
Co-authored-by: Nityananda Gohain <nityanandagohain@gmail.com>
* feat(authz): accept singular roles for user and service accounts
* feat(authz): update integration tests
* feat(authz): update integration tests
* feat: move role management to a single select flow on members and service account pages (temporarily)
* feat(authz): enable stats reporter for service accounts
* feat(authz): identity call for activating/deleting user
---------
Co-authored-by: SagarRajput-7 <sagar@signoz.io>
* feat: updated user api to v2 and accordingly update members page and role management
* feat: updated members page to use new role management and v2 user api
* feat: updated test cases
* feat: code refactor
* feat: refactored code and addressed feedbacks
* feat: refactored code and addressed feedbacks
* feat: refactored code and addressed feedbacks
* fix(user): fix openapi spec
* feat: handle isRoot user and self user cases and added test cases
---------
Co-authored-by: vikrantgupta25 <vikrant@signoz.io>
* feat(audittypes): align types with revised schema doc
Rename resource.name → resource.kind to match Typeable.Kind() rename.
Move resource attributes (kind, id) from event attributes to OTel
Resource, grouping events by target resource in NewPLogsFromAuditEvents.
Add network.protocol.name, network.protocol.version, url.scheme to
transport attributes for complete OTel semconv coverage.
* refactor(audittypes): inline resourceKey struct into function scope
* test(audittypes): add tests for NewPLogsFromAuditEvents
Cover resource grouping: empty input, single event, same resource
batched into one ResourceLogs, different resources split, same kind
with different IDs split, and interleaved events grouped correctly.
Verify resource attrs live on Resource (not event attributes).
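The grouping behaviour these tests cover can be sketched roughly as follows. This is a minimal illustration, not the actual NewPLogsFromAuditEvents code: the AuditEvent fields, the group type, and the groupByResource name are hypothetical, and keeping groups in first-seen order is an assumption.

```go
package main

// AuditEvent is a hypothetical, trimmed-down stand-in for the real event type.
type AuditEvent struct {
	ResourceKind string
	ResourceID   string
	EventName    string
}

// resourceKey identifies the target resource of an event (kind + ID).
type resourceKey struct {
	kind string
	id   string
}

// group holds all events that share one target resource.
type group struct {
	Key    resourceKey
	Events []AuditEvent
}

// groupByResource batches events by (kind, id), so each distinct resource
// would map to one ResourceLogs entry. Groups keep first-seen order
// (an assumption made for this sketch).
func groupByResource(events []AuditEvent) []group {
	index := make(map[resourceKey]int)
	var groups []group
	for _, e := range events {
		key := resourceKey{kind: e.ResourceKind, id: e.ResourceID}
		i, ok := index[key]
		if !ok {
			i = len(groups)
			index[key] = i
			groups = append(groups, group{Key: key})
		}
		groups[i].Events = append(groups[i].Events, e)
	}
	return groups
}
```

With this shape, interleaved events for the same resource land in one group, while the same kind with different IDs splits into separate groups.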
* fix: added validations for having expression
* fix: added extra validation and unit tests
* fix: added antlr based parsing for validation
* fix: added more unit tests
* fix: removed validation on having in range request validations
* fix: generated lexer files and added more unit tests
* fix: edge cases
* fix: added command to scripts for generating lexer
* fix: use std lib sorting instead of selection sort
* fix: support implicit and
* fix: allow bare not in expression
* fix: added suggestion for having expression
* fix: typo
* fix: added more unit tests, handle whitespace differences between aggregation and having expressions
* fix: added support for in and not, updated errors
* fix: added support for brackets list
* fix: lint error
* fix: handle non spaced expression
---------
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
* feat(audit): handler-level AuditDef and response-capturing wrapper
Add declarative audit instrumentation to the handler package. Routes
declare an AuditDef alongside OpenAPIDef; the handler automatically
captures the response status/body and emits an audit event via
auditor.Audit() after every request.
* refactor(audit): move audit logic to middleware, merge with logging
Move audit event emission from handler to middleware layer. The handler
package keeps only the AuditDef struct and AuditDefProvider interface.
The logging middleware now handles both request logging and audit event
emission using a single response capture, avoiding double-wrapping.
Rename badResponseLoggingWriter to responseCapture with body capture
on all 4xx/5xx responses (previously only 400 and 5xx).
* refactor(audit): rename Logging middleware to Audit, merge into single file
Delete logging.go and merge its contents into audit.go. Rename
Logging/NewLogging to Audit/NewAudit. The response.go file with
responseCapture is unchanged.
* refactor(audit): extract NewAuditEventFromHTTPRequest factory into audittypes
Move event construction to audittypes.NewAuditEventFromHTTPRequest with
an AuditEventContext struct for caller-provided fields. The audittypes
layer reads only transport fields from *http.Request and has no mux,
authtypes, or context dependencies. The middleware pre-extracts
principal, trace, error, and route fields before calling the factory.
* refactor(audit): move error parsing to render.ErrorFromBody and render.ErrorTypeFromStatusCode
Add render.ErrorFromBody to extract errors.JSON from a JSON-encoded
ErrorResponse body, and render.ErrorTypeFromStatusCode to reverse-map
HTTP status codes to error type strings. The middleware now uses these
instead of local duplicates.
* refactor(audit): move AuditDef onto Handler interface, consolidate files
Move AuditDef() onto the Handler interface directly. All Handler
implementations now carry it: handler returns the configured def,
healthOpenAPIHandler returns nil. Delete the separate AuditDefProvider
interface and audit.go handler file. Move excludedRoutes check before
audit emission so excluded routes skip both logging and audit.
* feat(audit): add option.go with AuditDef, Option, and WithAuditDef
* refactor(audit): decompose AuditEvent into attribute sub-structs, add tests
Decompose flat AuditEvent fields into typed sub-structs
(AuditEventAuditAttributes, PrincipalAttributes, ResourceAttributes,
ErrorAttributes, TransportAttributes) each with a constructor and
Put(pcommon.Map) method. Simplify NewAuditEventFromHTTPRequest to
accept authtypes.Claims and oteltrace IDs directly. Simplify the
middleware caller accordingly.
Add unit tests for the factory, outcome boundary, and principal type
derivation.
* refactor(audit): shorten attribute struct names, drop error message
Rename AuditEventAuditAttributes to AuditAttributes,
AuditEventPrincipalAttributes to PrincipalAttributes, and likewise
for Resource, Error, and Transport. The package prefix already
disambiguates.
Remove ErrorMessage from ErrorAttributes to avoid leaking sensitive
or PII data into audit logs. Error type and code are sufficient for
filtering; investigators can correlate via trace ID.
* fix(audit): update auditorserver test and otlphttp provider for new struct layout
Update newTestEvent in server_test.go to use nested AuditAttributes
and ResourceAttributes. Update otlphttpauditor provider to access
PrincipalOrgID via PrincipalAttributes. Fix godot lint on attribute
section comments.
* fix(audit): fix gjson path in ErrorCodeFromBody, add tests
Fix ErrorCodeFromBody gjson path from "errors.code" to "error.code"
to match the ErrorResponse JSON structure. Add unit tests for valid
error response and invalid JSON cases.
* fix(audit): add CodeUnset, use ErrorCodeFromBody in middleware
Add errors.CodeUnset for responses missing an error code. Update the
audit middleware to use render.ErrorCodeFromBody instead of the removed
render.ErrorFromBody.
* test(audit): add unit tests for responseCapture
Test the four meaningful behaviors: success responses don't capture
body, error responses capture body, large error bodies truncate at
4096 bytes, and 204 No Content suppresses writes entirely.
* fix(audit): check rw.Write return values in response_test.go
* style(audit): rename want prefix to expected in test fields
* refactor(audit): replace Sprintf with strings.Builder in newBody
Handle edge cases where principal email, ID, or resource ID may be
empty. The builder conditionally includes each segment, avoiding
empty parentheses or leading spaces in the audit body.
Add test cases covering all meaningful combinations: success/failure
with full/partial/empty principal, resource ID, and error details.
* chore: fix formatting
* chore: remove json tags
* fix: rebase with main
* feat(serviceaccount): integrate service account
* feat(serviceaccount): integrate service account with better types
* feat(serviceaccount): fix lint and testing changes
* feat(serviceaccount): update integration tests
* feat(serviceaccount): fix formatting
* feat(serviceaccount): fix openapi spec
* feat(serviceaccount): update txlock to immediate to avoid busy snapshot errors
* feat(serviceaccount): add restrictions for factor_api_key
* feat(serviceaccount): add restrictions for factor_api_key
* feat: enabled service account and deprecated API Keys (#10715)
* feat: enabled service account and deprecated API Keys
* feat: deprecated API Keys
* feat: service account spec updates and role management changes
* feat: updated the error component for roles management
* feat: updated test case
* feat: updated the error component and added retries
* feat: refactored code and added retry to happen 3 times total
* feat: fixed feedbacks and added test case
* feat: refactored code and removed retry
* feat: updated the test cases
---------
Co-authored-by: SagarRajput-7 <162284829+SagarRajput-7@users.noreply.github.com>
* feat(audit): add enterprise auditor with licensing gate and OTLP HTTP export
Implements the enterprise auditor at ee/auditor/otlphttpauditor/.
Composes auditorserver.Server for batching with licensing gate,
OTel SDK LoggerProvider for InstrumentationScope, and otlploghttp
exporter for OTLP HTTP transport.
* fix(audit): address PR review — inline factory, move body+logrecord to audittypes
- Inline NewFactory closure, remove newProviderFunc
- Move buildBody to audittypes.BuildBody
- Move eventToLogRecord to audittypes.ToLogRecord
- go mod tidy for newly direct otel/log deps
* fix(audit): address PR review round 2
- Make ToLogRecord a method on AuditEvent returning sdklog.Record
- Fold buildBody into ToLogRecord as unexported helper
- Remove accumulatingProcessor and LoggerProvider — export directly
via otlploghttp.Exporter
- Delete body.go and processor.go
* fix(audit): address PR review round 3
- Merge export.go into provider.go
- Add severity/severityText fields to Outcome struct
- Rename buildBody to setBody method on AuditEvent
- Add appendStringIfNotEmpty helper to reduce duplication
* feat(audit): switch to plog + direct HTTP POST for OTLP export
Replace otlploghttp.Exporter + sdklog.Record with plog data model,
ProtoMarshaler, and direct HTTP POST. This properly sets
InstrumentationScope.Name = "signoz.audit" and Resource attributes
on the OTLP payload.
* fix(audit): adopt collector otlphttpexporter pattern for HTTP export
Model the send function after the OTel Collector's otlphttpexporter:
- Bounded response body reads (64KB max)
- Protobuf-encoded Status parsing from error responses
- Proper response body draining on defer
- Detailed error messages with endpoint URL and status details
* refactor(audit): split export logic into export.go, add throttle retry
- Move export, send, and HTTP response helpers to export.go
- Add exporterhelper.NewThrottleRetry for 429/503 with Retry-After
- Parse Retry-After as delay-seconds or HTTP-date per RFC 7231
- Keep provider.go focused on Auditor interface and lifecycle
* feat(audit): add partial success handler and internal retry with backoff
- Parse ExportLogsServiceResponse on 2xx for partial success, log
warning if log records were rejected
- Internal retry loop with exponential backoff for retryable status
codes (429, 502, 503, 504) using RetryConfig from auditor config
- Honour Retry-After header (delay-seconds and HTTP-date)
- Store full auditor.Config on provider struct
- Replace exporterhelper.NewThrottleRetry with local retryableError
type consumed by the internal retry loop
* fix(audit): fix lint — use pkg/errors, remove stdlib errors and fmt.Errorf
* refactor(audit): use provider as receiver name instead of p
* refactor(audit): clean up enterprise auditor implementation
- Extract retry logic into retry.go
- Move NewPLogsFromAuditEvents and ToLogRecord into event.go
- Add ErrCodeAuditExportFailed to auditor package
- Add version.Build to provider for service.version attribute
- Simplify sendOnce, split response handling into onSuccess/onErr
- Use PrincipalOrgID as valuer.UUID directly
- Use OTLPHTTP.Endpoint as URL type
- Remove gzip compression handling
- Delete logrecord.go
* refactor(audit): use pkg/http/client instead of bare http.Client
Use the standard SigNoz HTTP client with OTel instrumentation.
Disable heimdall retries (count=0) since we have our own
OTLP-aware retry loop that understands Retry-After headers.
* refactor(audit): use heimdall Retriable for retry instead of manual loop
- Implement retrier with exponential backoff from auditor RetryConfig
- Compute retry count from MaxElapsedTime and backoff intervals
- Pass retrier and retry count to pkg/http/client via WithRetriable
- Remove manual retry loop, retryableError type, and Retry-After parsing
- Heimdall handles retries on >= 500 status codes automatically
* refactor(audit): rename retry.go to retrier.go
* feat(audit): add Auditor interface and rename auditortypes to audittypes
Add the Auditor interface in pkg/auditor/ as the contract between the
HTTP handler layer and audit implementations (noop for community, OTLP
for enterprise). Includes Config with buffer, batch, and flush settings
following the provider pattern.
Rename pkg/types/auditortypes/ to pkg/types/audittypes/ for consistency.
closes SigNoz/platform-pod#1930
* refactor(audit): move endpoint config to OTLPHTTPConfig provider struct
Move Endpoint out of top-level Config into a provider-specific
OTLPHTTPConfig struct with full OTLP HTTP options (url_path, insecure,
compression, timeout, headers, retry). Keep BufferSize, BatchSize,
FlushInterval at top level as common settings across providers.
closes SigNoz/platform-pod#1930
* feat(audit): add auditorbatcher for buffered event batching
Add pkg/auditor/auditorbatcher/ with channel-based batching for audit
events. Flushes when either batch size is reached or flush interval
elapses (whichever comes first). Events are dropped when the buffer is
full (fail-open). Follows the alertmanagerbatcher pattern.
* refactor(audit): replace public channel with Receive method on batcher
Make batchC private and expose Receive() <-chan []AuditEvent as the
read-side API. Clearer contract: Add() to write, Receive() to read.
* refactor(audit): rename batcher config fields to BufferSize and BatchSize
Capacity → BufferSize, Size → BatchSize for clarity.
* fix(audit): single-line slog call and fix log key to buffer_size
* feat(audit): add OTel metrics to auditorbatcher
Add telemetry via OTel MeterProvider with 4 instruments:
- signoz.audit.events.emitted (counter)
- signoz.audit.store.write_errors (counter, via RecordWriteError)
- signoz.audit.events.dropped (counter)
- signoz.audit.events.buffer_size (observable gauge)
Batcher.New() now accepts metric.Meter and returns error.
* refactor(audit): inject factory.ScopedProviderSettings into batcher
Replace separate logger and meter params with ScopedProviderSettings,
giving the batcher access to logger, meter, and tracer from one source.
* feat(audit): add OTel tracing to batcher Add path
Span auditorbatcher.Add with event_name attribute set at creation
and audit.dropped set dynamically on buffer-full drop.
* feat(audit): add export span to batcher Receive path
Introduce Batch struct carrying context, events, and a trace span.
Each flush starts an auditorbatcher.Export span with batch_size
attribute. The consumer ends the span after export completes.
* refactor(audit): replace Batch/Receive with ExportFunc callback
Batcher now takes an ExportFunc at construction and manages spans
internally. Removes Batch struct, Receive(), and RecordWriteError()
from the public API. Span.End() is always called via defer, write
errors and span status are recorded automatically on export failure.
Uses errors.Attr for error logging, prefixes log keys with audit.
* refactor(audit): rename auditorbatcher to auditorserver
Rename package, file (batcher.go → server.go), type (Batcher → Server),
and receiver (b → server) to reflect the full service role: buffering,
batching, metrics, tracing, and export lifecycle management.
* refactor(audit): rename telemetry to serverMetrics and document struct fields
Rename type telemetry → serverMetrics and constructor
newTelemetry → newServerMetrics. Add comments to all Server
struct fields.
* feat(audit): implement ServiceWithHealthy, fix race, add unit tests
- Implement factory.ServiceWithHealthy on Server via healthyC channel
- Fix data race: Start closes healthyC after goroutinesWg.Add(1),
Stop waits on healthyC before closing stopC
- Add 8 unit tests covering construction, start/stop lifecycle,
batch-size flush, interval flush, buffer-full drop, drain on stop,
export failure handling, and concurrent safety
* fix(audit): fix lint issues in auditorserver and tests
Use snake_case for slog keys, errors.New instead of fmt.Errorf,
and check all Start/Stop return values in tests.
* fix(audit): address PR review comments
Use auditor:: prefix in validation error messages. Move fire-and-forget
comment to the Audit method, remove interface-level comment.
* feat(audit): add auditortypes package with event struct, store interface, and config
Foundational type package for the audit log system with zero dependencies
on other audit packages. Includes AuditEvent struct, Store interface (single
Emit method), Config with defaults, constants for action/outcome/principal/category,
and EventName derivation helper.
* fix(audit): address PR review feedback on auditortypes
- Remove non-CUD actions (login/logout/lock/unlock/revoke), keep only create/update/delete
- Replace pastTenseMap with pastTense field on Action struct
- Make EventName a typed struct with NewEventName() constructor
- Rename enum vars to ActionCategory* prefix (ActionCategoryAccessControl, etc.)
- Add IEC 62443 reference link to ActionCategory comment
- Add TODO on PrincipalType to use coretypes once available
- Rename types.go to event.go, merge Store interface into it
- Reorder AuditEvent fields to follow OTel LogRecord structure
- Delete config.go (belongs with auditor service, not types)
- Delete store.go (merged into event.go)
* fix(audit): remove redundant test files per review feedback
* feat: user v2 apis
* fix: openapi specs
* chore: address review comments
* fix: proper handling if invalid roles are passed
* chore: address review comments
* refactor: frontend to use deprecated apis after id rename
* feat: separate apis for adding and deleting user role
* fix: invalidate token when roles are updated
* fix: openapi specs and frontend test
* fix: openapi schema
* fix: openapi spec and move to snakecasing for json
* feat: adding cloud integration type for refactor
* refactor: store interfaces to use local types and error
* feat: adding sql store implementation
* refactor: removing interface check
* feat: adding updated types for cloud integration
* refactor: using struct for map
* refactor: update cloud integration types and module interface
* fix: correct GetService signature and remove shadowed Data field
* feat: implement cloud integration store
* refactor: adding comments and removed wrong code
* refactor: streamlining types
* refactor: add comments for backward compatibility in PostableAgentCheckInRequest
* refactor: update Dashboard struct comments and remove unused fields
* refactor: split upsert store method
* feat: adding integration test
* refactor: clean up types
* refactor: renaming service type to service id
* refactor: using serviceID type
* feat: adding method for service id creation
* refactor: updating store methods
* refactor: clean up
* refactor: clean up
* refactor: review comments
* refactor: clean up
* feat: adding handlers
* fix: lint and ci issues
* fix: lint issues
* fix: update error code for service not found
* feat: adding handler skeleton
* chore: removing todo comment
* feat: adding frontend openapi schema
* refactor: making review changes
* feat: regenerating openapi specs