mirror of
https://github.com/SigNoz/signoz.git
synced 2026-05-30 22:00:33 +01:00
406800a2e5c45d4bf00a63ab97ecc339b94d030b
133 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6cf22e98dd |
feat(planned-downtime): scope maintenance windows to label expressions (#11186)
* add maintenanceMuteStage to move planned maintenance to alertmanager Rules previously skipped rule.Eval() entirely during maintenance windows. This change moves suppression to MaintenanceMuter, injected as a Stage in the alertmanager notification pipeline. Now rules always evaluate and everys suppression is handled by alertmanager. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: wrap routing pipeline once instead of per-route injection Replace the per-route-entry loop with a single MultiStage wrap so maintenance suppression runs once per dispatch group before routing. * refactor: move maintenance mute stage into custom pipelineBuilder Copy notify.PipelineBuilder locally so we can inject mms between the silence stage and the receiver stage (GossipSettle → Inhibit → TimeActive → TimeMute → Silence → mms → Receiver), matching the correct suppression order the team requires. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: add license header to pipeline_builder.go Copied code originates from Apache-2.0 licensed Prometheus Alertmanager; add dual copyright + SPDX identifier following the repo's convention. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: replace SPDX tag with full Apache 2.0 license boilerplate The full license text is unambiguously compliant with Apache 2.0 Section 4(a), which requires giving recipients "a copy of this License". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: pass MaintenanceMuter directly to pipelineBuilder Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: remove dead orgID param from task constructors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * rename buildReceiverStage -> createReceiverStage * refactor: replace maintenanceMuteStage with notify.NewMuteStage MaintenanceMuter already satisfies types.Muter, and pipelineBuilder has its own pb.metrics, so the hand-rolled maintenanceMuteStage wrapper is redundant. Use notify.NewMuteStage(pb.muter, pb.metrics) directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: hoist MuteStage construction out of the receiver loop MuteStage holds no per-receiver state, so one instance shared across all receivers is sufficient — matching how is/ss are handled upstream. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: always initialize maintenanceStore; remove nil guards Tests now use a real sqlrulestore-backed MaintenanceMuter instead of passing nil. With nil no longer a valid input, remove the nil guards in server.go and pipeline_builder.go. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: move MaintenanceMuter to Server and pass it to pipelineBuilder.New - Remove muter from pipelineBuilder struct and newPipelineBuilder(); pass it as a parameter to New() instead, consistent with inhibitor/silencer - Store muter on Server so GetAlerts can call Mutes() alongside the inhibitor and silencer, ensuring maintenance-suppressed alerts show the correct muted status in API responses Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * remove redundant MemMarker wrapper * feat: surface maintenance-suppressed alerts via mutedBy in GetAlerts Alerts suppressed by an active maintenance window were being correctly muted in the notification pipeline but appeared as state=active in the v2 GetAlerts response, since MaintenanceMuter.Mutes had no marker side-effect (unlike inhibitor/silencer). Add MaintenanceMuter.MutedBy returning the matching window IDs, and plumb a mutedByFunc callback through NewGettableAlertsFromAlertProvider into AlertToOpenAPIAlert. The upstream v2 API forces state=suppressed when mutedBy is non-empty, so the frontend's existing state-based rendering picks it up without further changes. Use the dedicated mutedBy field rather than SilencedBy to avoid violating the "complete set of silence IDs" contract that anything querying silences by ID would rely on. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * code cleanup * refactor: move maintenance (planned downtime) to alertmanager packages Types move from pkg/types/ruletypes/ to pkg/types/alertmanagertypes/: - maintenance.go, recurrence.go, schedule.go (+ tests) Store impl moves from pkg/ruler/rulestore/sqlrulestore/ to pkg/alertmanager/alertmanagerstore/sqlalertmanagerstore/. Maintenance windows mute alerts, so they belong with alertmanager rather than the rule types. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: add unit tests for MaintenanceMuter Covers Mutes/MutedBy semantics (empty label, rule match, empty-RuleIDs matches-all, future windows, multi-window) and the result cache (single-fetch within TTL, stale-cache fallback on store error, re-fetch after expiry, concurrency safety). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Update schema changes * Re-add marker * fix NewMaintenanceStore in tests * Go lint fixes * test: use mockery-generated mock for MaintenanceStore in muter tests Replace hand-written fakeMaintenanceStore with a mockery-generated MockMaintenanceStore, consistent with the alertmanagertest pattern. Also adds MaintenanceStore to .mockery.yml so the mock stays in sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: regenerate mocks via make gen-mocks Picks up new MockHandler for the Handler interface in pkg/alertmanager and regenerates MockMaintenanceStore with canonical mockery formatting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * cleanup test * test: add e2e muting tests for maintenance window behaviour * Add label expression support to planned downtime Alert instances can now be scoped by label expression (e.g. env == "prod"), scoping suppression below the rule level. A window with no rule IDs and a label expression silences any alert whose labels match, regardless of which rule fired it; when rule IDs are also present, the expression is evaluated only within the matched rules. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Remove redundant || undefined from labelExpression assignment * Move label expression evaluation into ShouldSkip ShouldSkip now owns all three suppression checks in sequence: rule ID match → schedule active → label expression. IsActive passes nil labels so the expression check is skipped (no instance labels available for UI status). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * remove redundant LabelSet->map conversion * implement Down migration to drop label_expression column * fix lset type and update openapi spec * fix(tests): resolve envprovider env isolation, factory name length, and ShouldSkip signature Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * remove unused function `evaluateLabelExpression` * chore: rename label expression to scope * test(maintenance): add tests for scope label expression filtering Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(e2e): verify scope-based maintenance muting in alertmanager flow Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Use `AND` instead of `&&` Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com> * fix(maintenance): consolidate label-set-to-env conversion to avoid expr panic Move ConvertLabelSetToEnv to alertmanagertypes so both the maintenance scope evaluator and the route-policy evaluator share one implementation. Dotted label keys (e.g. kubernetes.node) are expanded into nested maps, preventing the expr-lang panic that occurs when one key is a prefix of another. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore(planned-downtime): use clickable learn more link in scope tooltip * chore(sqlmigration): renumber add_scope_to_planned_maintenance to 086 078 collided with add_sa_managed_role_txn; bumped to the next free number and reordered registration after add_source_to_dashboard (085). * refactor(maintenance): extract scope expression eval and surface errors - Move ConvertLabelSetToEnv and EvalScopeExpression into expression.go with companion tests in expression_test.go. - EvalScopeExpression now returns (bool, error) instead of swallowing compile/run failures and non-bool outputs; ShouldSkip logs the error via slog.Default() and falls back to not suppressing (safety-first). - Update test fixtures to the SQL-style operator form (`=`, AND, OR) matching the placeholder and reviewer suggestions. * chore: use `=` instead of `==` in expressions * fix(maintenance): satisfy forbidigo/sloglint in scope eval - Replace fmt.Errorf with pkg/errors Wrapf/Newf using a new ErrCodeInvalidScopeExpression code. - Use slog ErrorContext (with context.Background()) instead of Error to satisfy sloglint. * perf(maintenance): fold prefix-conflict detection into ConvertLabelSetToEnv ConvertLabelSetToEnv now returns (env, conflict). The rulebased provider drops its O(n^2) pre-scan and logs based on the flag, restoring the previous O(n*d) cost while keeping the shared helper. * chore: add docs URL for invalid scope * refactor: don't log in types package * remove down migration --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com> |
||
|
|
d1f143f675 |
feat(web): add support for generating web settings types (#11445)
* feat(web): add support for generating settings type * feat(web): add support for generating settings type * feat(web): add support for generating settings type * refactor: rename generate settings to generate config web-settings - Rename cmd/settings.go to cmd/genconfig.go - Restructure command as `generate config web-settings` - Move schema output to docs/config/web-settings.json - Update frontend script to generate:config:web-settings - Update CI checks to match new command names - Strip Web prefix from generated JSON Schema definitions |
||
|
|
f47f1ad92b |
Remove unused field from waterfall response (part 1 of memory opt) (#11337)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: remove unused field from waterfall v3 * chore: update openapi specs * chore: remove debug statements |
||
|
|
9ba57d323d |
refactor: merge tracedetail typse with spantypes (#11417)
* refactor: merge tracedetail typse with spantypes * chore: update openapi specs |
||
|
|
5c54a2537c | chore: arrays non-nullable (#11388) | ||
|
|
9f60bdf54a |
chore: create source field in dashboards (#11367)
* chore: create source field in dashboards * chore: consolidate checks to module * chore: run generate * chore: address review comments * chore: separate test file * chore: address review comments |
||
|
|
a27b7d3d8e |
move planned maintenance to alertmanager pipeline (#11130)
* add maintenanceMuteStage to move planned maintenance to alertmanager Rules previously skipped rule.Eval() entirely during maintenance windows. This change moves suppression to MaintenanceMuter, injected as a Stage in the alertmanager notification pipeline. Now rules always evaluate and everys suppression is handled by alertmanager. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: wrap routing pipeline once instead of per-route injection Replace the per-route-entry loop with a single MultiStage wrap so maintenance suppression runs once per dispatch group before routing. * refactor: move maintenance mute stage into custom pipelineBuilder Copy notify.PipelineBuilder locally so we can inject mms between the silence stage and the receiver stage (GossipSettle → Inhibit → TimeActive → TimeMute → Silence → mms → Receiver), matching the correct suppression order the team requires. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: add license header to pipeline_builder.go Copied code originates from Apache-2.0 licensed Prometheus Alertmanager; add dual copyright + SPDX identifier following the repo's convention. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: replace SPDX tag with full Apache 2.0 license boilerplate The full license text is unambiguously compliant with Apache 2.0 Section 4(a), which requires giving recipients "a copy of this License". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: pass MaintenanceMuter directly to pipelineBuilder Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: remove dead orgID param from task constructors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * rename buildReceiverStage -> createReceiverStage * refactor: replace maintenanceMuteStage with notify.NewMuteStage MaintenanceMuter already satisfies types.Muter, and pipelineBuilder has its own pb.metrics, so the hand-rolled maintenanceMuteStage wrapper is redundant. Use notify.NewMuteStage(pb.muter, pb.metrics) directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: hoist MuteStage construction out of the receiver loop MuteStage holds no per-receiver state, so one instance shared across all receivers is sufficient — matching how is/ss are handled upstream. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: always initialize maintenanceStore; remove nil guards Tests now use a real sqlrulestore-backed MaintenanceMuter instead of passing nil. With nil no longer a valid input, remove the nil guards in server.go and pipeline_builder.go. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor: move MaintenanceMuter to Server and pass it to pipelineBuilder.New - Remove muter from pipelineBuilder struct and newPipelineBuilder(); pass it as a parameter to New() instead, consistent with inhibitor/silencer - Store muter on Server so GetAlerts can call Mutes() alongside the inhibitor and silencer, ensuring maintenance-suppressed alerts show the correct muted status in API responses Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * remove redundant MemMarker wrapper * feat: surface maintenance-suppressed alerts via mutedBy in GetAlerts Alerts suppressed by an active maintenance window were being correctly muted in the notification pipeline but appeared as state=active in the v2 GetAlerts response, since MaintenanceMuter.Mutes had no marker side-effect (unlike inhibitor/silencer). Add MaintenanceMuter.MutedBy returning the matching window IDs, and plumb a mutedByFunc callback through NewGettableAlertsFromAlertProvider into AlertToOpenAPIAlert. The upstream v2 API forces state=suppressed when mutedBy is non-empty, so the frontend's existing state-based rendering picks it up without further changes. Use the dedicated mutedBy field rather than SilencedBy to avoid violating the "complete set of silence IDs" contract that anything querying silences by ID would rely on. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * code cleanup * refactor: move maintenance (planned downtime) to alertmanager packages Types move from pkg/types/ruletypes/ to pkg/types/alertmanagertypes/: - maintenance.go, recurrence.go, schedule.go (+ tests) Store impl moves from pkg/ruler/rulestore/sqlrulestore/ to pkg/alertmanager/alertmanagerstore/sqlalertmanagerstore/. Maintenance windows mute alerts, so they belong with alertmanager rather than the rule types. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test: add unit tests for MaintenanceMuter Covers Mutes/MutedBy semantics (empty label, rule match, empty-RuleIDs matches-all, future windows, multi-window) and the result cache (single-fetch within TTL, stale-cache fallback on store error, re-fetch after expiry, concurrency safety). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Update schema changes * Re-add marker * fix NewMaintenanceStore in tests * Go lint fixes * test: use mockery-generated mock for MaintenanceStore in muter tests Replace hand-written fakeMaintenanceStore with a mockery-generated MockMaintenanceStore, consistent with the alertmanagertest pattern. Also adds MaintenanceStore to .mockery.yml so the mock stays in sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: regenerate mocks via make gen-mocks Picks up new MockHandler for the Handler interface in pkg/alertmanager and regenerates MockMaintenanceStore with canonical mockery formatting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * cleanup test * test: add e2e muting tests for maintenance window behaviour * fix updates: omit empty endTime from serialization --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
72a58c634b |
feat(infra-monitoring): v2 daemonsets list api (#11149)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: namespaces code * chore: v2 nodes api * chore: rename * chore: v2 clusters list api * chore: namespaces code * chore: rename * chore: review clusters PR * chore: pvcs code added * chore: updated endpoint and spec * chore: pvcs todo * chore: added condition * chore: added filter * chore: added code for deployments * chore: query nit * chore: statefulsets code added * chore: base filter added * chore: added base deployments change * chore: added base condition * chore: v2 jobs list api added * chore: added daemonsets api * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: cluster record uses PodCountsByPhase, NodeCountsByReadiness * chore: deployment record uses PodCountsByPhase * chore: statefulset record uses PodCountsByPhase * chore: job record uses PodCountsByPhase * chore: daemonset record uses PodCountsByPhase * chore: added remaining metrics to check * chore: metrics existence check * chore: statefulset metrics added * chore: added jobs metrics * chore: added metrics * chore: updated PR things * chore: changes to generated files --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
edb30f29c1 |
feat(authz): introduce detach relationship (#11298)
* feat(authz): introduce detach relationship * feat(authz): attach and detach for parent child heirarchy * feat(authz): fix the openapi spec generated schemas * feat(authz): add integration tests * feat(authz): add telemetry metaresource * feat(authz): fix the http response and integration tests * feat(authz): generate frontend openapi schema * feat(authz): remove unwanted tuples |
||
|
|
3b9ee4901e |
feat(infra-monitoring): v2 jobs list api (#11148)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: namespaces code * chore: v2 nodes api * chore: rename * chore: v2 clusters list api * chore: namespaces code * chore: rename * chore: review clusters PR * chore: pvcs code added * chore: updated endpoint and spec * chore: pvcs todo * chore: added condition * chore: added filter * chore: added code for deployments * chore: query nit * chore: statefulsets code added * chore: base filter added * chore: added base deployments change * chore: added base condition * chore: v2 jobs list api added * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: cluster record uses PodCountsByPhase, NodeCountsByReadiness * chore: deployment record uses PodCountsByPhase * chore: statefulset record uses PodCountsByPhase * chore: job record uses PodCountsByPhase * chore: metrics existence check * chore: statefulset metrics added * chore: added jobs metrics * chore: added metrics --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
83fa73c3e8 |
feat(infra-monitoring): v2 statefulsets list api (#11146)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: namespaces code * chore: v2 nodes api * chore: rename * chore: v2 clusters list api * chore: namespaces code * chore: rename * chore: review clusters PR * chore: pvcs code added * chore: updated endpoint and spec * chore: pvcs todo * chore: added condition * chore: added filter * chore: added code for deployments * chore: query nit * chore: statefulsets code added * chore: base filter added * chore: added base deployments change * chore: added base condition * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: cluster record uses PodCountsByPhase, NodeCountsByReadiness * chore: deployment record uses PodCountsByPhase * chore: statefulset record uses PodCountsByPhase * chore: metrics existence check * chore: statefulset metrics added * chore: availablePods -> renamed to currentPods * chore: restored to main --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
951f55b062 |
feat(infra-monitoring): v2 deployments list api (#11140)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: namespaces code * chore: v2 nodes api * chore: rename * chore: v2 clusters list api * chore: namespaces code * chore: rename * chore: review clusters PR * chore: pvcs code added * chore: updated endpoint and spec * chore: pvcs todo * chore: added condition * chore: added filter * chore: added code for deployments * chore: query nit * chore: added base deployments change * chore: added base condition * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: cluster record uses PodCountsByPhase, NodeCountsByReadiness * chore: deployment record uses PodCountsByPhase * chore: metrics existence check --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
a913e67acf |
feat(infra-monitoring): v2 volumes list api (#11137)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: namespaces code * chore: v2 nodes api * chore: rename * chore: v2 clusters list api * chore: namespaces code * chore: rename * chore: review clusters PR * chore: pvcs code added * chore: updated endpoint and spec * chore: pvcs todo * chore: added condition * chore: added filter * chore: added base condition * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: cluster record uses PodCountsByPhase, NodeCountsByReadiness --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
c86df3adcb |
feat(pnpm): migrate away from yarn (#11158)
* feat(pnpm): migrate away from yarn * fix(lodash): using uninstall dependency * fix(workflows): use pnpm as package manager * fix(pnpm-lock): keep it updated * fix(test): issue with lodash-es and our pnpm store * fix(jest): more esm conflicts * fix(pipeline-page): update snapshot test * fix(pnpm-lock): out of sync * fix(json-view): issue with typing * chore(pnpm): upgrade pnpm * chore(yarn): remove yarn |
||
|
|
1f5134b3ba |
feat: module and store for span mapper (#11127)
* feat: module and store for span mapper * fix: remove nullable from fieldcontext * fix: address comments * fix: cleanup migration file as well * fix: rename field_context to fieldContext * fix: decompose update function in module |
||
|
|
ca8cc30c34 |
feat(infra-monitoring): v2 clusters list api (#11132)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: namespaces code * chore: v2 nodes api * chore: rename * chore: v2 clusters list api * chore: namespaces code * chore: rename * chore: review clusters PR * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: cluster record uses PodCountsByPhase, NodeCountsByReadiness --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
f0b533b198 |
feat(infra-monitoring): v2 namespaces list api (#11131)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: v2 nodes api * chore: namespaces code * chore: rename * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: namespace record uses PodCountsByPhase * chore: merge error resolved --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
912c6073c5 |
feat(authz): add resource-level FGA for service accounts (#11065)
Some checks failed
build-staging / prepare (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
* feat(authz): add resource-level FGA and attach permissions for service accounts - Add CheckAll middleware (AND of OR groups) for multi-resource authz checks - Switch SA role routes (SetRole, DeleteRole) to VerbAttach on ResourceServiceAccount - Add RoleAttachSelectors on SA module for role-level VerbAttach resolution - DeleteRole uses CheckAll (both checks at middleware from URL params) - SetRole uses Check (entity) at middleware + module-level role attach check - Add migration 078 to backfill FGA tuples for existing organizations - Add authz contributing guide (docs/contributing/go/authz.md) - Regenerate OpenAPI spec with scoped security schemes * feat(authz): fix openapi spec * feat(authz): add attach permissions to migration * feat(authz): role details page fixes * fix(openapi): openapi changes for attach * fix(openapi): openapi changes for attach * fix(types): move types to middleware to remove http import from types * test(integration): add integration tests * test(integration): fix test lint and remove contributing guide * feat(authz): revert role details changes * feat(authz): move selectors to handler * feat(authz): better naming for authz service and authz middleware * feat(authz): better naming for authz service and authz middleware |
||
|
|
9e118ded1f |
feat(ruletypes): publish OpenAPI 3 discriminator on RuleThresholdData and EvaluationEnvelope (#11180)
* feat(ruletypes): publish OpenAPI 3 discriminator on RuleThresholdData and EvaluationEnvelope
Both types model `{kind, spec}` discriminated unions on the wire but
the generated OpenAPI lacked the `discriminator:` keyword, so
codegen tools (oapi-codegen, terraform-plugin-codegen-openapi) fell
back to opaque `Spec: any` and consumers had to hand-write
JSON-bridges instead of typed Expand/Flatten.
Add the discriminator declaration via a marker convention:
- Each parent type implements `jsonschema.Preparer` to set an
`x-signoz-discriminator` extra property carrying `propertyName` and
the per-kind `mapping`.
- A small `attachDiscriminators` pass in pkg/signoz/openapi.go runs
after spec reflection, walks every component schema, promotes the
marker into a real openapi3.Discriminator, and removes the marker
so it doesn't leak into the rendered YAML.
The two-step is required because jsonschema-go.Schema has no
Discriminator field of its own and openapi-go only carries through
`x-`-prefixed extras unchanged. The wire shape is unchanged —
`{kind: "<value>", spec: <variant>}` is still what's sent and
received.
Adding a new variant: append to JSONSchemaOneOf and add a
mapping entry on PrepareJSONSchema.
* refactor(openapi): inline discriminator constant and tighten attachDiscriminators
* chore(ruletypes): trim discriminator comments
* fix(ruletypes): mark evaluation variant kind/spec required
* fix(ruletypes): keep envelope kind/spec out of the parent schema via json:"-"
* refactor(ruletypes): strip envelope parent properties in attachDiscriminators
Revert the json:"-" + custom MarshalJSON dance on the envelope
structs. Restore the original tags (json:"kind" / json:"spec"),
keep the discriminator marker, and clear the parent's redundant
properties / required block in attachDiscriminators after the
discriminator is promoted.
* style(openapi): add blank lines between logical blocks in attachDiscriminators
* style(ruletypes): add blank lines in PrepareJSONSchema
* fix(ruletypes): mark threshold variant kind/spec required
|
||
|
|
ae2127afe8 |
test: dashboards list spec with new e2e framework (#11190)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* test: dashboards list spec with new e2e framework * chore: update docker ignore * test: dashboards list spec with new e2e framework * test: fix skipped ones * test: fix scroll * test: fix flaky clicks * test: fix formatting * chore: doc update + ignore file changes * chore: update fixtures vs helpers * chore: resolve comments * chore: resolve comments |
||
|
|
9e94ee30b9 |
feat(infra-monitoring): v2 nodes list api (#11128)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: nodes list v2 full blown * chore: metadata fix * chore: updated comment * chore: v2 nodes api * chore: added pod phase counts * chore: for pods and nodes, replace none with no_data * chore: node and pod counts structs added * chore: strongly type meta --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
3d8cddf84e |
refactor: split typeable infrastructure into pkg/types/coretypes (#11105)
* refactor: move authtypes to coretypes
* refactor: migrate downstream consumers to coretypes Kind/Type/Relation
Wire all consumers of the typeable infrastructure through coretypes:
- Replace authtypes.Name/Type/Relation references with coretypes equivalents
- Switch Typeable singletons to constructor calls (authtypes.NewTypeableUser
etc.), with the embedded coretypes.Typeable populated so Kind/Type/Prefix/
Scope dispatch correctly through the embed
- Update dashboardtypes meta-resource declarations to use authtypes
constructors so they expose Tuples (authz callers need it)
- Rename Resource.Name field accesses to Resource.Kind to match the field
rename in authtypes.Resource
- Fix typeable_metaresource.go calling the plural NewTypeableMetaResources
helper — should be the singular NewTypeableMetaResource
go build ./... and go vet ./... clean (parser-generated unreachable-code
warnings are pre-existing). Authz unit tests pass.
* refactor(audittypes): unify Action with coretypes.Relation
Drop the duplicate Action enum from audittypes — the verbs (create/update/
delete) match coretypes.Relation exactly. Move PastTense onto Relation so
audit EventName derivation continues to work without a parallel hierarchy.
Also retypes AuditDef.ResourceKind from string to coretypes.Kind so audit
declarations get the same regex validation that authz already enforces.
* refactor(retentiontypes): extract TTLSetting into its own package
TTLSetting is the bun model for ClickHouse TTL settings — has nothing to do
with the Organization domain it was previously co-located with in
pkg/types/organization.go. Moved to pkg/types/retentiontypes/ alongside the
ClickHouse reader that's its sole consumer.
No schema change; the bun table tag (table:ttl_setting) is unchanged.
* chore(openapi): regenerate spec for coretypes.Relation and Resource.Kind
* chore(frontend): regenerate API client and migrate Resource.name → Resource.kind
Regenerated TypeScript API types after the AuthtypesResource field rename
and the new CoretypesRelation enum. Updated:
- frontend/scripts/generate-permissions-type.cjs to read `r.kind` from the
/api/v1/authz/resources response and emit `kind:` in the static
permissions.type.ts file.
- frontend/src/hooks/useAuthZ/{permissions.type,types,utils,useAuthZ}.tsx:
Resource.name → Resource.kind throughout.
- frontend/src/container/RolesSettings/{utils.tsx,__tests__/utils.test.ts}:
same field migration.
- frontend/src/components/createGuardedRoute/createGuardedRoute.test.tsx:
same.
- useAuthZ/utils.ts: cast string relations to CoretypesRelationDTO at the
AuthtypesTransactionDTO boundary now that relation is an enum, not a raw
string.
yarn generate:api passes (orval generation + lint + typecheck).
* refactor: migrate downstream consumers to Resource/Verb rename
* chore(openapi): regenerate spec for Resource/Verb rename
* feat(coretypes): add ListResources accessor with stable sort
* feat(cmd): add 'generate authz' subcommand for permissions type
* refactor(authz): drop runtime authz/resources endpoint
* refactor(frontend): consume static permissions.type.ts directly
* chore(frontend): regenerate Orval client without authz/resources
* ci: move authz schema check from jsci to goci
* refactor(coretypes): move Selector/Object/Transaction from authtypes
* feat(coretypes): add managed role names and permission policy
* feat(coretypes): add Registry assembling resources, types, and managed-role transactions
* refactor(authz): wire *coretypes.Registry; drop RegisterTypeable
* refactor(cmd): wire coretypes.NewRegistry into server bootstraps
* chore: regenerate openapi spec for authtypes -> coretypes type moves
* chore(frontend): regenerate API client for Authtypes -> Coretypes type moves
* refactor(coretypes): rename GettableResource to ResourceRef
* refactor(authz): collapse Registry around static data; bridge once at construction
* refactor(coretypes): tighten Registry, restore anonymous public-dashboard grant
Drops passthrough fields from coretypes.Registry; adds an O(1) lookup map
for NewResourceFromTypeAndKind; replaces stringly-typed Type compares with
Type.Equals; removes the now-redundant getUniqueTypes helper. Restores the
signoz-anonymous read grant on metaresource/public-dashboard that was
silently dropped, and removes the invalid signoz-admin/VerbCreate/TypeUser
entry that panicked at startup.
* chore: regenerate openapi spec for coretypes -> authtypes type moves
* chore(frontend): regenerate API client for Coretypes -> Authtypes type moves
* fix(authz): disambiguate kind→type by relation, preserve multi-part selectors
permissions.type.ts now lists the same kind (dashboard, role,
public-dashboard) under both metaresource and metaresources, so the prior
kind→type map silently overwrote one with the other. Resolve the type
using the requesting relation's allowed types, and slice the selector at
the first colon so multi-part selectors (e.g. id:version) round-trip
correctly. Updates useAuthZ.test.tsx to use the regenerated kind field.
* refactor(authtypes): introduce Relation wrapper over coretypes.Verb
The authz layer modeled relations as raw coretypes.Verb everywhere, which
forced authz-level concepts (action, role-binding) to share a type with
schema-level enumerations. Introduce authtypes.Relation as a thin wrapper
over coretypes.Verb so the authz APIs (CheckWithTupleCreation, ListObjects,
GetObjects, PatchObjects, NewTuples, Transaction.Relation, etc.) can grow
authz-specific affordances without leaking back into coretypes.
Also reshuffles the static coretypes data into dedicated registry_*.go files
(types, kinds, verbs, resources, managed roles) to keep the schema declarations
isolated from the value types they configure.
* refactor(authtypes): expose Relation.Enum() and regenerate openapi spec
Without an Enum() method on Relation the openapi generator emitted an
empty AuthtypesRelation schema (no allowed values). Forward the enum
from the embedded coretypes.Verb so the wire contract is faithful.
* refactor(ee/authz): drop always-nil error returns from managed-role tuple helpers
getManagedRoleGrantTuples and getManagedRoleTransactionTuples never
returned a non-nil error, which the linter (unparam) had flagged. Drop
the unused error return; callers no longer need the err check either.
* chore(frontend): regenerate API client for authtypes.Relation
* fix(authz): satisfy go-lint — keyed Relation literal, drop redundant Verb selector
* refactor(coretypes): sync Kinds slice with full registry_kind declarations
* feat(coretypes): register metaresource and metaresources for all new kinds
Adds 21 metaresource and 21 metaresources entries (covering notification-channel,
route-policy, apdex-setting, auth-domain, session, cloud-integration,
cloud-integration-service, ingestion-key, ingestion-limit, pipeline,
user-preference, org-preference, quick-filter, ttl-setting, rule,
planned-maintenance, saved-view, trace-funnel, factor-password, factor-api-key,
license) so the authz schema covers every resource Kind declared in
registry_kind. Regenerates the static frontend permissions.type.ts to match.
* feat(coretypes): populate ManagedRoleToTransactions from signozapiserver routes
Enumerates every (verb, resource) tuple each managed role holds, derived
from the AdminAccess/EditAccess/ViewAccess middleware on routes in
pkg/apiserver/signozapiserver and the legacy http_handler in
pkg/query-service/app. Admin gets 123 transactions, editor 53, viewer 25,
anonymous keeps the single public-dashboard read.
* feat(coretypes): add integration kind with full CRUD for viewer/editor/admin
Install/uninstall/list integration routes (legacy /api/v1/integrations) all
sit behind ViewAccess, so every authenticated role gets the full CRUD
surface on (metaresource, integration) and (metaresources, integration).
Regenerates the static frontend permissions.type.ts to match.
* feat(coretypes): add subscription kind alongside license, document LCRUD shape
License covers the in-product license resource (Activate/Refresh/GetActive).
Subscription is the billing lifecycle (checkout/portal/billing) served by
ee/query-service routes. Both are admin-only and modeled with a uniform
LCRUD shape; comments call out which verbs actually map to routes versus
which are placeholders for shape parity (e.g. cancellation flows through
Stripe's portal, not an in-process delete).
* feat(coretypes): model telemetryresource for logs, traces, metrics
Mirrors the telemetryresource type from ee/authz/openfgaschema/base.fga
into coretypes: a read-only Type with three Kinds (logs, traces, metrics)
matching telemetrytypes.Signal. Selector is wildcard-only for v1; future
work can narrow per-service or per-environment when the use case lands.
Every managed role (admin/editor/viewer) gets read on each signal,
matching the schema's role#assignee grant. Anonymous stays unchanged.
Regenerates the static frontend permissions.type.ts.
* feat(coretypes): add audit-logs and meter-metrics kinds under telemetryresource
Audit logs (signal=logs, source=audit) and meter metrics (signal=metrics,
source=meter) are sensitive source-qualified telemetry streams that don't
belong under the broad read-grant every role gets on regular logs/traces/
metrics. Modeled as distinct Kinds so they can be permissioned
independently. Admin-only read for now; widen on explicit ask (e.g. an
auditor flow that needs viewer access to audit-logs). Regenerates the
static frontend permissions.type.ts.
* feat(coretypes): add logs-field and traces-field kinds for stored field config
GET/POST /logs/fields and /api/v2/traces/fields manage stored, mutable
field metadata (indexed/promoted columns) over each signal. They're
configuration, not telemetry data, so they sit under metaresource rather
than telemetryresource. Viewer reads, editor/admin update; no
create/delete since POST overwrites. Plural prefix (logs-field /
traces-field) matches the signal naming.
* chore(frontend): regenerate permissions.type.ts to match generate authz output
* feat(authz): add attach permissions to fga model
* fix(tests): use role permissions instead of dashboards
* fix(authz): couple of issues with register flow
* fix(authz): public dashboard read should be anomymous
* fix(tests): integration test for public dashboard access
---------
Co-authored-by: vikrantgupta25 <vikrant@signoz.io>
|
||
|
|
ac46cd8e80 |
fix: return span start time similar to waterfall v2 (#11183)
* fix: return span start time similar to waterfall v2 * chore: update openapi specs * chore: rename timestamp field to match style of other fields * chore: rename the struct field to keep json and field same |
||
|
|
8409a9798d |
fix(authdomain): nest config response, rename Updateable→Updatable, return Identifiable on create (#11176)
Some checks failed
build-staging / staging (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(authdomain): nest config response, rename Updateable→Updatable, return Identifiable on create
Three small API-shape corrections on auth_domain:
- GettableAuthDomain previously embedded AuthDomainConfig, which
flattened sso_enabled / saml_config / oidc_config / google_auth_config /
role_mapping at the response root and made the response shape
diverge from the request shape (PostableAuthDomain has them under
`config`). Move it under a named `Config` field with a `config`
json tag so request and response carry the same nested object.
- UpdateableAuthDomain → UpdatableAuthDomain (typo fix; aligns with
UpdatableUser already in the codebase).
- CreateAuthDomain previously returned the full GettableAuthDomain;
the only field clients actually need from the create response is
the new ID. Switch to Identifiable so the contract states what the
endpoint guarantees and clients re-Read for the full domain when
needed.
Frontend schema and OpenAPI spec regenerated.
* fix(authdomain-frontend): adapt to nested config + Identifiable create response
Regenerate the orval client (`yarn generate:api`) and update the
auth-domain UI for the API shape changes from the previous commit:
- `record.ssoType`, `.ssoEnabled`, `.googleAuthConfig`, `.oidcConfig`,
`.samlConfig`, `.roleMapping` are now nested under `record.config.*`
in `AuthtypesGettableAuthDomainDTO` — update SSOEnforcementToggle,
CreateEdit form initial-values, the list page's Configure button,
and the auth-domain test mocks.
- `mockCreateSuccessResponse` now returns `{ id }` (Identifiable)
instead of the full domain.
`yarn generate:api` ran clean: lint OK, tsgo OK.
* fix(authdomain): align CreateAuthDomain success code with handler + adjust integration test
The Create handler returns http.StatusCreated but the OpenAPI
annotation said StatusOK. Sync the annotation to 201, regenerate the
spec + frontend client.
The callbackauthn integration test (01_domain.py) still read
`domain["ssoType"]` off the GET response — now nested under
`domain["config"]["ssoType"]` after the previous shape change. Update
the assertion.
|
||
|
|
8a7793794d | feat(global): add ai assistant url to global config (#11171) | ||
|
|
680bcd08c3 |
fix(types): correct OpenAPI schema for AuthDomainConfig and PostableChannel (#11164)
* fix(authtypes): embed values and expose AuthDomainConfig oneOf
GettableAuthDomain now embeds StorableAuthDomain and AuthDomainConfig
by value so the response flattens correctly. AuthDomainConfig also
implements jsonschema.OneOfExposer over the SAML/Google/OIDC variants.
* fix(alertmanagertypes): expose PostableChannel JSONSchema
PostableChannel now implements jsonschema.Exposer, requiring name
and a oneOf branch per *_configs field so the OpenAPI request body
for POST /channels matches the runtime contract enforced in
NewChannelFromReceiver. Switched the route's Request type from
Receiver to PostableChannel and regenerated the OpenAPI spec.
* fix(alertmanagertypes): use components/schemas prefix in PostableChannel refs
The standalone reflector inside JSONSchema defaulted to #/definitions/
prefix, producing dangling refs to ConfigDiscordConfig etc. that broke
the generated frontend client. Pass DefinitionsPrefix("#/components/schemas/")
so refs point to existing OpenAPI components, and regenerate the frontend
Orval client.
* feat(authdomain): add GET /api/v1/domains/{id} endpoint
Returns a single GettableAuthDomain scoped to the caller's organization,
backed by the existing module.GetByOrgIDAndID. Adds Get to the Handler
interface, wires the route under AdminAccess, and regenerates the
OpenAPI spec and frontend Orval client.
* feat(authtypes): expose AuthNProvider enum in OpenAPI schema
AuthNProvider now implements jsonschema.Enum, narrowing the generated
TypeScript type from string to a typed enum. Updated callers in the
auth-domain settings UI and mocks to use AuthtypesAuthNProviderDTO,
and added an early-return guard in the create/edit submit handler so
TS can narrow the union before passing it as ssoType.
* chore(types): document oneOf/discriminator mismatch on PostableChannel and AuthDomainConfig
Both types emit a oneOf in the OpenAPI spec but neither shape supports an
OpenAPI discriminator: PostableChannel implies the variant by which *_configs
field is present, and AuthDomainConfig keeps the variant payload in a
sibling field instead of being the payload itself. Leave a TODO pointing at
ruletypes.RuleThresholdData as the envelope pattern to migrate to.
* fix(ruletypes): handle string driver values in Schedule.Scan and Recurrence.Scan
The Scan methods only handled []byte and silently no-op'd on anything
else. SQLite's TEXT columns come back as string from the driver, so
every GET of a planned_maintenance returned a zero-valued Schedule
(empty timezone, 0001-01-01 startTime/endTime, no recurrence) — even
though Create + Update wrote the values correctly.
Switch on src type, accept []byte, string, and nil; error on anything
else. Aligns Schedule with the existing pattern; in Recurrence fixes
the receiver — Unmarshal was being passed src (the interface{} arg)
rather than r.
|
||
|
|
ad8f3328e0 |
fix(mcp-page): added acitve host url instead of current url on mcp page (#11141)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(mcp-page): added acitve host url instead of current url on mcp page * fix(mcp-page): configure access and role control * chore: move get hosts api access to viewers (#11145) * chore: move get hosts api access to viewers * chore: update openapi spec --------- Co-authored-by: SagarRajput-7 <162284829+SagarRajput-7@users.noreply.github.com> * fix: allowed hosts api to run on all the cloud users * fix: updated test cases --------- Co-authored-by: Karan Balani <29383381+balanikaran@users.noreply.github.com> |
||
|
|
755390c4b5 |
feat: types and handler for llm pricing rules (#10908)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat: 1.Types for ai-o11y ricing rules * fix: changes * fix: minor changes * fix: more changes * fix: new updates * fix: address comments * fix: remove nullable * fix: types * fix: address comments * fix: use mustnewuuid * fix: correct table name * fix: address comments and move pricing to a single struct * fix: linting issues |
||
|
|
9a3e79fb54 |
Add support for custom field selection for color by feature. (#11092)
* feat: add customer aggregation support in waterfall * chore: add tests for aggregation logic * chore: rename analytics to aggregations * chore: update openapi specs * feat: add support to request telemetry fields in flamegraph * chore: simplify getting attribute value for span * fix: remove overlapping time from duration aggregation * fix: use valuer.String for enums * feat: add preferences for preview and color by |
||
|
|
738ae6f21b |
docs: add servers block to the openapi spec (#11129)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* docs: add servers block to the openapi spec * docs: add info block |
||
|
|
ffd493617f |
feat(infra-monitoring): v2 pods list API (#10833)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: added pods list logic * chore: updated openapi yml * chore: updated spec * chore: pods api meta start time * chore: nil pointer check * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: added pods metrics temporality * chore: unified composite key function * chore: code improvements * chore: added pods list api updates * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: pods code restructuring * chore: comments resolve * chore: added json tag required: true * chore: removed pod metric temporalities * chore: removed internal server error * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: cleanup and rename * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: added required tags * chore: added support for pod phase unknown * chore: removed pods - order by phase * chore: improved api description to document -1 as no data in numeric fields * fix: rebase fixes * feat(infra-monitoring): v2 pods list apis - phase counts when custom grouping (#11088) * chore: added phase counts feature * chore: added queries for pod phase counts in custom group by * chore: added unknown phase count * fix: isPodUIDInGroupBy in buildPodRecords * chore: 3 cte --> 2 cte * chore: pod phase with local table of time series as counts * chore: comment correction * chore: corrected comment * chore: value column for samples table added * chore: removed query G for phase counts * chore: rename variable * chore: added PodPhaseNum constants to types * chore: updated comment * chore: formatted file --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ashwin Bhatkal <ashwin96@gmail.com> |
||
|
|
64c356b484 |
feat: types and handler for span attribute mapping (ai-o11y) (#10909)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat: 1.types and handler for ai-o11y attribute mapping * feat: 1.types and handler for ai-o11y attribute mapping * fix: cleanup * fix: minor changes * fix: remove stutters * fix: remove lint issues * fix: lint issues * fix: address comments * fix: address comments * fix: address more comments * fix: more changes * fix: address comments * fix: address comments * fix: address comments * fix: use mustnewuuid |
||
|
|
78e3916aea |
chore: replace JSONDataTypeIndex and bump otel-collector to v0.144.3 (#11110)
* refactor(telemetrytypes): replace JSONDataTypeIndex with TelemetryFieldKeySkipIndex * revert: mcp related changes * fix: openapi fails --------- Co-authored-by: Ankit Nayan <ankit@signoz.io> |
||
|
|
7e7d7ab570 |
feat(global): add mcp_url to global config (#11085)
* feat(global): add mcp_url to global config Adds an optional mcp_url field to the global config so the frontend can gate the MCP settings page on its presence. When unset the API returns "mcp_url": null (pointer + nullable:"true"); when set it emits the parsed URL as a string. * feat(global): surface mcp_url in frontend types Adds mcp_url to the manual GlobalConfigData type and refreshes the generated OpenAPI client so consumers can read the new field. * docs(global): use <unset> placeholder for mcp_url example Matches the style of external_url and ingestion_url above it. * style(global): separate mcp_url prep from return in GetConfig Adds a blank line between the nullable-conversion block and the return statement so the two logical phases read as distinct blocks. * feat(global): mark endpoint fields as required in the API schema The backend always emits external_url, ingestion_url and mcp_url on GET /api/v1/global/config (mcp_url as literal null when unset), so the JSON keys are always present. Add required:"true" to all three and regenerate the OpenAPI + frontend client so consumers get non-optional types. * revert(global): drop mcp_url from legacy GlobalConfigData type The legacy hand-written type for the non-Orval getGlobalConfig client should be left alone; consumers that need mcp_url go through the generated Orval client. |
||
|
|
156459e414 |
refactor: move waterfall api to module structure with updated response (#10797)
* feat: setup types and interface for waterfall v3 v3 is required for udpating the response json of the waterfall api. There wont' be any logical change. Using this requirement as an opportunity to move waterfall api to provider codebase architecture from older query-service * refactor: move type conversion logic to types pkg * chore: add reason for using snake case in response * fix: update span.attributes to map of string to any To support otel format of diffrent types of attributes * fix: remove unused fields and rename span type To avoid confusing with otel span * refactor: convert waterfall api to modules format * chore: add same test cases as for old waterfall api * chore: avoid sorting on every traversal * fix: remove unused fields and rename span type To avoid confusing with otel span * fix: rename timestamp to milli for readability * fix: add timeout to module context * fix: use typed paramter field in logs * chore: generate openapi spec for v3 waterfall * fix: remove timeout since waterfall take longer * fix: use int16 for status code as per db schema * fix: update openapi specs * refactor: break down GetWaterfall method for readability * chore: avoid returning nil, nil * refactor: move type creation and constants to types package - Move DB/table/cache/windowing constants to tracedetailtypes package - Add NewWaterfallTrace and NewWaterfallResponse constructors in types - Use constructors in module.go instead of inline struct literals - Reorder waterfall.go so public functions precede private ones * refactor: extract ClickHouse queries into a store abstraction Move GetTraceSummary and GetTraceSpans out of module.go into a traceStore interface backed by clickhouseTraceStore in store.go. The module struct now holds a traceStore instead of a raw telemetrystore.TelemetryStore, keeping DB access separate from business logic. * refactor: move error to types as well * refactor: separate out store calls and computations * refactor: breakdown GetSelectedSpans for readability * refactor: return 404 on missing trace and other cleanup * refactor: use same method for cache key creation * chore: remove unused duration nano field * chore: use sqlbuilder in clickhouse store where possible * refactor: move waterfall traverse logic to types and extract out auto expanded span calculation * chore: convert all timestamp to nano for consitancy * chore: rename waterfall response to gettableX format * chore: fix method calls in test after refactoring * refactor: remove unused methods * chore: fix openapi spec * chore: better names for methods and vars * chore: remove caching to match from v2 * chore: update openapi client * refactor: move selection decision to types * chore: move types to the top * refactor: avoid passing the whole telementry store in a module * refactor: move waterfall constants to module config * chore: update openapi specs * chore: update openapi clints |
||
|
|
21ce7a663a |
feat: adding cloud integration azure types and openapi spec (#11005)
* refactor: moving types to cloud provider specific namespace/pkg * refactor: separating cloud provider types * refactor: using upper case key for AWS * feat: adding cloud integration azure types * feat: adding azure services * refactor: updating omitempty tags * refactor: updating azure integration config * feat: completing azure types * refactor: lint issues * refactor: updating command key * fix: handle optional connection URL in AWS integration * refactor: updating strategy struct * refactor: updating azure blob storage service name * refactor: update Azure service identifiers * refactor: updating types |
||
|
|
6996d41b01 |
fix(serviceaccount): status code for deleted service accounts (#11075)
* fix(serviceaccount): status code for deleted service accounts * fix(authz): plural relation endpoints |
||
|
|
93f5df9185 |
tests: unify integration + e2e under shared pytest project (#11019)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(tests): hoist pytest project to tests/ root for shared fixtures
Lift pyproject.toml, uv.lock, conftest.py, and fixtures/ up from
tests/integration/ so the pytest project becomes shared infrastructure
rather than integration's private property. A sibling tests/e2e/ can
reuse the same fixture graph (containers, auth, seeding) without
duplicating plugins.
Also:
- Merge tests/integration/src/querier/util.py into tests/fixtures/querier.py
(response assertions and corrupt-metadata generators belong with the
other querier helpers).
- Use --import-mode=importlib + pythonpath=["."] in pyproject so
same-basename tests across src/*/ do not collide at the now-wider
rootdir.
- Broaden python_files to "*/src/**/**.py" so future test trees under
tests/e2e/src/ get discovered.
- Update Makefile py-* targets and integrationci.yaml to cd into tests/
and reference integration/src/... paths.
* feat(tests/e2e): import Playwright suite from signoz-e2e
Relocate the standalone signoz-e2e repository into tests/e2e/ as a
sibling of tests/integration/. The suite still points at remote
staging by default; subsequent commits wire it to the shared pytest
fixture graph so the backend can be provisioned locally.
Excluded from the import: .git, .github (CI migration deferred),
.auth, node_modules, test-results, playwright-report.
* feat(tests/e2e): pytest-driven backend bring-up, seeding, and playwright runner
Wire the Playwright suite into the shared pytest fixture graph so the
backend + its seeded state are provisioned locally instead of pointing
at remote staging.
Python side (owns lifecycle):
- tests/fixtures/dashboards.py — generic create/list/upsert_dashboard
helpers (shared infra; testdata stays per-tree).
- tests/e2e/conftest.py — e2e-scoped pytest fixtures: seed_dashboards
(idempotent upsert from tests/e2e/testdata/dashboards/*.json),
seed_alert_rules (from tests/e2e/testdata/alerts/*.json, via existing
create_alert_rule), seed_e2e_telemetry (fresh traces/logs across a
few synthetic services so /home and Services pages have data).
- tests/e2e/src/bootstrap/setup.py — test_setup depends on the fixture
graph and persists backend coordinates to tests/e2e/.signoz-backend.json;
test_teardown is the --teardown target.
- tests/e2e/src/bootstrap/run.py — test_e2e: one-command entrypoint that
brings up the backend + seeds, then subprocesses yarn test and asserts
Playwright exits 0.
- tests/conftest.py — register fixtures.dashboards plugin.
Playwright side (just reads):
- tests/e2e/global.setup.ts — loads .signoz-backend.json and injects
SIGNOZ_E2E_BASE_URL/USERNAME/PASSWORD. No-op when env is already
populated (staging mode, or pytest-driven runs where env is pre-set).
- playwright.config.ts registers globalSetup.
- package.json gains test:staging; existing scripts unchanged.
Testdata layout: tests/e2e/testdata/{dashboards,alerts,channels}/*.json
— per-tree (integration has its own tests/integration/testdata/).
* docs(tests): describe pytest-master workflow and shared fixture layout
- tests/README.md (new): top-level map of the shared pytest project,
fixture-ownership rule (shared vs per-tree), and common commands.
- tests/e2e/README.md: lead with the one-command pytest run and the
warm-backend dev loop; keep the staging fallback as option 2.
- tests/e2e/CLAUDE.md: updated commands so agent contexts reflect the
pytest-driven lifecycle.
- tests/e2e/.env.example: drop unused SIGNOZ_E2E_ENV_TYPE; note the file
is only needed for staging mode.
* fix(tests/fixtures/signoz.py): anchor Docker build context to repo root
Previously used path="../../" which resolved to the repo root only when
pytest's cwd was tests/integration/. After hoisting the pytest project
to tests/, that same relative path pointed one level above the repo
root and the build failed with:
Cannot locate specified Dockerfile: cmd/enterprise/Dockerfile.with-web.integration
Anchor the build context to an absolute path computed from __file__ so
the fixture works regardless of pytest cwd.
* feat(tests/e2e): alerts-downtime regression suite (platform-pod/issues/2095)
Import the 34-step regression suite originally developed on
platform-pod/issues/2095-frontend. Targets the alerts and planned-downtime
frontend flows after their migration to generated OpenAPI clients and
generated react-query hooks.
- specs/alerts-downtime/: SUITE.md (the stable spec), README.md (scope +
open observations from the original runs), results-schema.md (legacy
per-run artifact shape, retained for context).
- tests/alerts-downtime/alerts-downtime.spec.ts: 881-line Playwright spec
covering 6 flows — alert CRUD/toggle, alert detail 404, planned
downtime CRUD, notification channel routing, anomaly alerts.
Integration with the shared suite:
- Uses baseURL + storageState from tests/e2e/playwright.config.ts (no
separate config). page.goto calls use relative paths; SIGNOZ_E2E_*
env vars from the pytest bootstrap drive auth.
- test.describe.configure({ mode: 'serial' }) at the top of the describe:
the flows mutate shared tenant state, so parallel runs cause cross-
flow interference (documented in the original 2095 config).
- Per-run artifacts (network captures + screenshots) land in
tests/e2e/tests/alerts-downtime/run-spec-<ts>/ by default — gitignored.
Historical per-run artifacts (~7.5MB of screenshots across run-1 through
run-7) are not imported; they lived at e2e/2095/run-*/ on the original
branch and remain there if needed.
* refactor(fixtures/traces): extract insert + truncate helpers
Pull the ClickHouse insert path out of the insert_traces pytest fixture
into a plain module-level function insert_traces_to_clickhouse(conn,
traces), and move the per-table TRUNCATE loop into truncate_traces_tables
(conn, cluster). The fixture becomes a thin wrapper over both — zero
behavioural change.
Lets the HTTP seeder container (tests/fixtures/seeder/) reuse the exact
same insert + truncate code the pytest fixture uses, so the two stay in
sync as the trace schema evolves.
* feat(fixtures/seeder): HTTP seeder container for fine-grained telemetry seeding
Adds a sibling container alongside signoz/clickhouse/postgres that exposes
HTTP endpoints for direct-ClickHouse telemetry seeding, so Playwright
tests can shape per-test data without going through OTel or the SigNoz
ingestion path.
tests/fixtures/seeder/:
- Dockerfile: python:3.13-slim + the shared fixtures/ tree so the
container can import fixtures.traces and reuse the exact insert path
used by pytest.
- server.py: FastAPI app with GET /healthz, POST /telemetry/traces
(accepts a JSON list matching Traces.from_dict input; auto-tags each
inserted row with resource seeder=true), DELETE /telemetry/traces
(truncates all traces tables).
- requirements.txt: fastapi, uvicorn, clickhouse-connect, numpy plus
sqlalchemy/pytest/testcontainers because fixtures/{__init__,types,
traces}.py import them at module load.
tests/fixtures/seeder/__init__.py: pytest fixture (`seeder`, package-
scoped) that builds the image via docker-py (testcontainers DockerImage
had multi-segment dockerfile issues), starts the container on the
shared network wired to ClickHouse via env vars, and waits for
/healthz. Cache key + restore follow the dev.wrap pattern other
fixtures use for --reuse.
tests/.dockerignore: exclude .venv, caches, e2e node_modules, and test
outputs so the build context is small and deterministic.
tests/conftest.py: register fixtures.seeder as a pytest plugin.
Currently traces-only — logs + metrics follow the same pattern.
* feat(tests/e2e): surface seeder_url to Playwright via globalSetup
- bootstrap/setup.py: test_setup now depends on the seeder fixture and
writes seeder_url into .signoz-backend.json alongside base_url.
- bootstrap/run.py: test_e2e exports SIGNOZ_E2E_SEEDER_URL to the
subprocessed yarn test so Playwright specs can reach the seeder
directly in the one-command path.
- global.setup.ts: if .signoz-backend.json carries seeder_url, populate
process.env.SIGNOZ_E2E_SEEDER_URL. Remains optional — staging mode
leaves it unset.
Playwright specs that want per-test telemetry can:
await fetch(process.env.SIGNOZ_E2E_SEEDER_URL + '/telemetry/traces', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify([...])
});
and await a truncate via DELETE on teardown.
* fix(alerts-downtime): capture load-time GETs before navigation
Flow 1 registered cap.mark() AFTER page.goto() and then called
page.waitForResponse(/api/v2/rules) — but against a fast local backend
the GET /api/v2/rules response arrived during page.goto, before the
waiter could register, and the test timed out at 30s.
installCapture's page.on('response') listener runs from before the
navigation, so moving mark() above page.goto() and relying on
dumpSince's 500ms drain is enough. No lost precision.
One site only; the same pattern exists in later flows (via per-action
waitForResponse) and may surface similar races — those are left for a
follow-up once the backend-side 2095 migration lands on main (current
frontend still calls PATCH /api/v1/rules/:id which the spec's assertion
doesn't match anyway).
* refactor(fixtures/logs,metrics): extract insert + truncate helpers
Mirror the traces refactor: pull the ClickHouse insert path out of the
insert_logs / insert_metrics pytest fixtures into plain module-level
functions (insert_logs_to_clickhouse, insert_metrics_to_clickhouse) and
move the per-table TRUNCATE loops into truncate_logs_tables /
truncate_metrics_tables. The fixtures become thin wrappers — zero
behavioural change.
Sets up the seeder container to expose POST/DELETE endpoints for logs
and metrics using the exact same code paths as the pytest fixtures.
* feat(fixtures/seeder): add logs and metrics endpoints
Extend the seeder with POST/DELETE endpoints for logs and metrics,
following the same shape as the existing traces endpoints:
- POST /telemetry/logs accepts a JSON list matching Logs.from_dict;
tags each row's resources with seeder=true.
- POST /telemetry/metrics accepts a JSON list matching Metrics.from_dict;
tags resource_attrs with seeder=true (Metrics.from_dict unpacks
resource_attrs rather than a resources dict).
- DELETE /telemetry/logs, DELETE /telemetry/metrics truncate via the
shared truncate_*_tables helpers.
Requirements gain svix-ksuid because fixtures/logs.py imports KsuidMs
for log id generation.
Verified end-to-end against the warm backend: POST inserted=1 on each
signal, DELETE truncated=true on each.
* refactor(fixtures/seeder): align status codes with HTTP semantics
- POST /telemetry/{traces,logs,metrics}: return 201 Created (kept the
{inserted: N} body so callers can verify the count landed).
- DELETE /telemetry/{traces,logs,metrics}: return 204 No Content with
an empty body.
* refactor(tests/seeder): extract from fixtures/ into top-level package
Move the HTTP seeder (Dockerfile, requirements.txt, server.py) out of
tests/fixtures/seeder/ and into its own tests/seeder/ top-level package.
The pytest fixture that builds and runs the image moves to
tests/fixtures/seeder.py so it sits next to the other container fixtures.
Rationale: the seeder is a standalone containerized Python service, not a
pytest fixture. It ships a Dockerfile, its own requirements.txt, and a
server.py entrypoint — none of which belong under a package whose purpose
is shared pytest code.
Image-side changes:
- Dockerfile now copies seeder/ alongside fixtures/ and launches
seeder.server:app instead of fixtures.seeder.server:app.
- Build context stays tests/ (unchanged), so fixtures.* imports inside
server.py continue to resolve.
Fixture-side changes:
- _TESTS_ROOT computation drops one parent (parents[1] now that the file
is at tests/fixtures/seeder.py, not tests/fixtures/seeder/__init__.py).
- The dockerfile= path passed to docker-py becomes seeder/Dockerfile.
No behavior change; every consumer still imports the seeder fixture as
before and gets the same container.
* refactor(fixtures/keycloak): rename from idp.py to name the concrete tech
The container provider at fixtures/idp.py brought up a Keycloak image. Name
it for what it is so we can use fixtures/idp.py later for API-side IdP
helpers (OIDC/SAML admin flows) without an idp-vs-idputils naming collision.
- fixtures/idp.py → fixtures/keycloak.py (git rename).
- fixtures.idputils updates its one internal import to fixtures.keycloak.
- conftest.py pytest_plugins entry points at the new module.
No caller outside fixtures/ imports fixtures.idp directly, so no shim is
needed. The "idp" fixture name (how tests reference it) is unchanged.
* refactor(fixtures/gateway): drop -utils suffix
The module only held helper functions (no fixtures). Rename to match the
domain and leave a shim at the old path so integration/ import sites keep
working until they are swept in a follow-up.
* fix(fixtures/gatewayutils): silence wildcard-import in deprecation shim
The shim intentionally re-exports via `from fixtures.gateway import *`;
pylint flags the wildcard and every unused-wildcard symbol. Suppress both
in the shim only — the live module has no wildcard.
* refactor(fixtures/auth): merge authutils helpers into auth
Pull the pure-helper functions from authutils.py (create_active_user,
find_user_by_email, find_user_with_roles_by_email, assert_user_has_role,
change_user_role) into auth.py next to the fixtures they complement.
Fixtures remain on top; helpers go below. Drop the module docstring.
Replace authutils.py with a deprecation shim that re-exports from
fixtures.auth so integration/ import sites (9 files) keep working until
they are swept in a follow-up. Suppress the wildcard-import warnings in
the shim only.
* refactor(fixtures/alerts): merge alertutils helpers into alerts
Pull the pure-helper functions from alertutils.py
(collect_webhook_firing_alerts, _verify_alerts_labels,
verify_webhook_alert_expectation, update_rule_channel_name) into alerts.py
next to the fixtures they complement. Fixtures stay on top; helpers go
below.
Replace alertutils.py with a deprecation shim that re-exports from
fixtures.alerts so integration/ import sites keep working until they are
swept in a follow-up.
* refactor(fixtures/cloudintegrations): merge cloudintegrationsutils helpers
Pull the pure-helper functions from cloudintegrationsutils.py
(deprecated_simulate_agent_checkin, setup_create_account_mocks,
simulate_agent_checkin) into cloudintegrations.py next to the fixtures
they complement. Fixtures stay on top; helpers go below.
Replace cloudintegrationsutils.py with a deprecation shim that re-exports
from fixtures.cloudintegrations so integration/ import sites keep working
until they are swept in a follow-up.
* refactor(fixtures/idp): rename idputils to idp now that keycloak owns the container
With the Keycloak container provider at fixtures.keycloak, the fixtures.idp
name is free for what idputils always was — API/browser helpers for OIDC
and SAML admin flows against the IdP container.
- fixtures.idputils → fixtures.idp (git rename).
- conftest.py pytest_plugins swaps fixtures.idputils for fixtures.idp so
the create_saml_client / create_oidc_client fixtures register under the
canonical path.
Replace fixtures.idputils with a deprecation shim re-exporting from
fixtures.idp so integration/ import sites (callbackauthn) keep working
until they are swept in a follow-up.
* refactor(fixtures/reuse): rename from dev to describe what the module is
The module wraps pytest-cache resource reuse/teardown for container
fixtures; "dev" conveyed nothing about its role. Rename to fixtures.reuse
and update the 12 internal callers that imported `from fixtures import
dev, types` to use `reuse` instead.
Replace fixtures.dev with a deprecation shim so any external caller keeps
working until the follow-up sweep.
* refactor(fixtures/time,fs): split utils by responsibility
fixtures.utils only held two time parsers (parse_timestamp, parse_duration)
and one path helper (get_testdata_file_path) — a "utils" grab bag.
- Time parsers move to fixtures.time (utils.py → time.py via git rename).
- get_testdata_file_path moves into fixtures.fs where other filesystem
helpers live.
- Internal callers (alerts, logs, metrics, traces) update to the new paths.
Replace fixtures.utils with a deprecation shim that re-exports all three
functions so integration/ import sites keep working until the follow-up
sweep.
* refactor(fixtures/browser): rename from driver to match peer primitives
fixtures.driver was the Selenium WebDriver fixture — rename the module to
fixtures.browser so it sits next to fixtures.http as a named primitive.
The fixture name inside (driver) stays — that's the Selenium-canonical
term and tests reference it directly.
conftest.py pytest_plugins entry points at the new module. A deprecation
shim at fixtures.driver keeps any external caller working until the
follow-up sweep.
* refactor(tests/seeder): install deps via uv from pyproject, drop requirements.txt
The seeder's requirements.txt duplicated 7 of 10 deps from pyproject.toml
with overlapping version pins — a standing drift risk. The comment on top
of the file admitted the real problem: the seeder image already ships
pytest + testcontainers + sqlalchemy because importing fixtures.traces
walks fixtures/__init__.py and fixtures/types.py. "Don't ship test infra"
was already violated.
- Add fastapi, uvicorn[standard], and py to pyproject.toml dependencies
(the three seeder-only deps that were not yet in pyproject; `py` was a
latent gap since fixtures/types.py uses py.path.local but pytest only
pulls it in transitively).
- Switch the Dockerfile to `uv sync --frozen --no-install-project --no-dev`
so the container env matches local dev exactly (uv.lock is the single
source of truth for versions).
- Move tests/seeder/Dockerfile → tests/Dockerfile.seeder so it lives
alongside the pyproject at the root of the build context.
- Delete tests/seeder/requirements.txt.
The seeder image grows by ~40-50MB (selenium, psycopg2, wiremock now come
along from main deps); accepted as a cost of single source of truth since
the seeder is dev-only infra, not a shipped artifact.
* refactor(tests/integration): flatten src/ into bootstrap/ + tests/
Drop the redundant src/ layer in the integration tree. 'src' carries no
information — the directory IS integration test source. After flatten:
tests/integration/
bootstrap/setup.py was src/bootstrap/setup.py
tests/<suite>/*.py was src/<suite>/*.py (16 suites)
testdata/
Updates:
- Makefile: py-test-setup/py-test-teardown/py-test target paths.
- tests/README.md: layout diagram + command examples.
- tests/pyproject.toml: python_files glob now matches basenames
explicitly — "[0-9][0-9]_*.py" for NN-prefixed suite files plus
"setup.py" and "run.py" for bootstrap entrypoints. The old "*/src/.."
glob stopped matching anything here and would have caused pytest to
try collecting seeder/server.py as a test.
* refactor(tests/e2e): flatten src/ into bootstrap/
Drop the e2e/src/ wrapper — the only Python content under it was
bootstrap/, which is now a direct child of e2e/. Keeps integration and
e2e symmetric (both have bootstrap/, tests/, testdata/ as peers).
Also delete bootstrap/__init__.py on both integration and e2e sides.
With --import-mode=importlib, pytest walks up from each .py file to find
the highest __init__.py-containing dir and uses that as the package root.
Without integration/__init__.py or e2e/__init__.py above bootstrap/, both
setup.py files resolved to the same dotted name `bootstrap.setup`, causing
a sys.modules collision that silently dropped test_telemetry_databases_exist
from integration's bootstrap. With no __init__.py anywhere, pytest treats
each setup.py as a standalone module via spec_from_file_location and both
are collected cleanly.
Updates tests/README.md, tests/e2e/README.md, and tests/e2e/CLAUDE.md path
references from e2e/src/bootstrap/ to e2e/bootstrap/.
* refactor(tests/e2e): drop specs/ + strip // spec: back-pointers
specs/ held markdown test plans that mirrored tests/ 1:1 as pre-code
scratch. Once a test exists, the plan is stale the moment the test
diverges — they're AI-planner output, not source of truth. Keep the
workflow alive by .gitignore-ing specs/ (the planner agent can still
write locally) but stop shipping stale plans in the repo.
Strip the `// spec: specs/...` and `// seed: tests/seed.spec.ts` header
comments from 5 .spec.ts files. The spec pointer is dead; the seed
pointer was convention-only — Playwright collects regardless.
* docs(contributing/tests): move e2e/integration guides out of test dirs
Pull the e2e contributor guide out of tests/e2e/CLAUDE.md (which read
like a full agent-workflow reference doc) and into
docs/contributing/tests/e2e.md alongside the existing development / go
guides.
- Delete tests/e2e/CLAUDE.md; its content (layout, commands, role tags,
locator priority, Playwright agent workflow) lives in the new e2e.md
with references to the now-.gitignore'd specs/ dir removed.
- Add docs/contributing/tests/integration.md — short guide covering
layout, runner commands, filename conventions, and the flow for
adding a new suite (there was no contributor doc for this before).
- Trim tests/e2e/README.md to quick-start + commands; link out to the
full guide. Readers who just want to run tests get the 5 commands
they need; anything deeper is one hop away.
* chore(tests/e2e): drop examples/example-test-plan.md
Init-agents boilerplate. Fresh planner agents don't need a checked-in
template; they can write to the .gitignore'd specs/ scratch dir.
tests/integration/.qodo/ was also removed (untracked, empty; .qodo is
already in the root .gitignore).
* refactor(tests/seeder): use fixtures.logger.setup_logger
Drop the one-off logging.basicConfig + logging.getLogger("seeder") in
favor of the shared setup_logger helper that every fixtures/*.py already
uses. Keeps log format consistent across pytest runs and the seeder
container.
fixtures.logger ships into the image via the existing COPY fixtures step
in Dockerfile.seeder — no build change needed.
* fix(tests/e2e): correct e2e_dir path after src/ flatten
After phase 2 (flatten tests/e2e/src/ into tests/e2e/), the run.py file
sits one level closer to the e2e root. parents[2] now resolves to tests/
instead of tests/e2e/, so yarn test would subprocess from the wrong cwd.
parents[1] is the correct index now.
* fix(tests/e2e): correct endpoint-file path in setup.py after src/ flatten
Same class of stale-path bug as the run.py fix: after the e2e/src/
flatten, setup.py sits one level closer to the e2e root. parents[2] now
lands at tests/ instead of tests/e2e/, so .signoz-backend.json would be
written to tests/.signoz-backend.json and the Playwright global.setup.ts
(which expects tests/e2e/.signoz-backend.json) wouldn't find it.
parents[1] is correct.
* refactor(tests/e2e): drop pre-seed fixtures; each spec owns its data
The seeder (tests/seeder/) was built so specs can POST telemetry
per-test. Global pre-seeding via tests/e2e/conftest.py (seed_dashboards,
seed_alert_rules, seed_e2e_telemetry) is the exact anti-pattern that
setup obsoletes — shared state across specs, order-dependent runs, no
reset between tests.
- Delete tests/e2e/conftest.py (3 fixtures, all pre-seed).
- Delete tests/e2e/testdata/dashboards/apm-metrics.json — its only
consumer was seed_dashboards. tests/e2e/testdata/ now empty and gone.
- Drop seed_dashboards, seed_alert_rules, seed_e2e_telemetry params
from bootstrap/setup.py::test_setup and bootstrap/run.py::test_e2e.
test_teardown never depended on them.
- Refresh the module docstrings on both bootstrap tests to reflect the
new model (backend + seeder up; specs seed themselves).
- Update tests/README.md and docs/contributing/tests/e2e.md: remove the
testdata/ + conftest.py references, document the per-spec seeding
rule (telemetry via seeder endpoints, dashboards/alerts via SigNoz
REST API from the spec).
Known breakage: tests/e2e/tests/dashboards/dashboards-list.spec.ts
expects at least one dashboard to exist. With seed_dashboards gone, it
will fail until that spec is updated to create its own dashboard via
the SigNoz API in test.beforeAll. Followup.
* refactor(tests/e2e): relocate auth helper into fixtures/; expose authedPage
Rename tests/e2e/utils/login.util.ts → tests/e2e/fixtures/auth.ts and
drop the (now-empty) utils/ dir. "Fixtures" is the unit of per-test
shared setup on both the Python and TS sides of this project — naming
them consistently across trees makes the parallel obvious.
fixtures/auth.ts now exports three things:
- `test` — Playwright test extended with an authedPage fixture. New
specs can request `authedPage` as a param and skip the
`beforeEach(() => ensureLoggedIn(page))` boilerplate entirely.
- `expect` — re-exported from @playwright/test so callers have one
import.
- `ensureLoggedIn(page)` — the underlying helper, still exported for
specs that want per-call control.
Update the 4 specs that imported from utils/login.util to point at the
new path; no behavior change in those specs (they keep calling
ensureLoggedIn in beforeEach). Refactoring them to use authedPage can
happen spec-by-spec later.
Also update the path example in .cursorrules so AI-generated snippets
reach for the new import path.
* refactor(tests/e2e): emit .env.local instead of .signoz-backend.json
The old flow (pytest writes JSON → global.setup.ts loads it → exports
env vars) was doing what dotenv already does. Collapse to the native
pattern:
- bootstrap/setup.py writes tests/e2e/.env.local with the four coords
(BASE_URL, USERNAME, PASSWORD, SEEDER_URL). File header marks it as
generated.
- playwright.config.ts loads .env first, then .env.local with
override=true. User-provided defaults stay in .env; generated values
win when present.
- Delete tests/e2e/global.setup.ts (36 lines gone) and its globalSetup
reference in playwright.config.ts.
Subprocess-injected env (run.py shelling out to yarn test) still wins
because dotenv doesn't overwrite already-set process.env keys.
Rename the test-only override env var SIGNOZ_E2E_ENDPOINT_FILE →
SIGNOZ_E2E_ENV_FILE for accuracy. Update .env.example, .gitignore (drop
.signoz-backend.json, keep .env.local with its explanatory comment),
tests/README.md, docs/contributing/tests/e2e.md.
* refactor(tests/e2e/alerts-downtime): drop custom network + screenshot capture
The spec wrapped every /api/ response in a bespoke installCapture(), wrote
hand-named JSON files per call (01_step1.1_GET_rules.json, ...), and took
step-by-step screenshots — all going into run-spec-<ts>/ next to the spec
(gitignored).
Playwright already records equivalent data via `trace` (network bodies,
screenshots per step, DOM snapshots, console — viewable via
`playwright show-trace`). The capture infra was duplicating that for the
one-shot 2095 regression audit; no downstream consumer reads the JSON or
PNG artifacts now.
- Remove installCapture, shot, RUN_DIR/NET_DIR/SHOT_DIR, fs/path imports.
- Strip cap.mark()/cap.dumpSince()/shot() calls throughout the 7 flows.
- Collapse the block-scopes that only existed to bound mark variables.
- Drop the "Artifacts" paragraph from the file's top-of-file comment.
- Remove the `tests/alerts-downtime/run-spec-*/` entry from .gitignore.
Spec drops from 885 lines to 736 (≈17% smaller). All 7 flows + their
assertions are unchanged. For debug access, rely on
`trace: 'on-first-retry'` (already set in playwright.config.ts) + `yarn
show-trace`.
* refactor(tests/e2e): move alerts-downtime.spec.ts into alerts/
The spec lives mostly in the alerts domain (6 of 7 flows), with the
planned-downtime CRUD (Flow 4) and cascade-delete (Flow 5) as
cross-feature collateral. The standalone alerts-downtime/ dir was
compound-named, breaking the one-feature-per-dir pattern every other
dir under tests/ follows, and duplicating the spec's own filename.
Move to tests/alerts/alerts-downtime.spec.ts. Empty alerts-downtime/
dir removed.
* refactor(tests/e2e): consolidate Playwright output under artifacts/
All Playwright outputs now land under a single tests/e2e/artifacts/ dir
so CI can archive it in one command (tar / zip / upload-artifact). Each
piece was writing to its own sibling of tests/e2e/ before.
playwright.config.ts:
- outputDir: 'artifacts/test-results' — per-test traces, screenshots,
videos (was default test-results/).
- HTML reporter → 'artifacts/html-report' (was default
playwright-report/); open: 'never' so CI doesn't spawn a browser on
report generation.
- JSON reporter → 'artifacts/results.json' (was
'test-results/results.json').
package.json: `yarn report` now points playwright show-report at the new
HTML folder.
Ignore updates — replace the two old paths with /artifacts/ in
tests/e2e/.gitignore, tests/e2e/.prettierignore, and tests/.dockerignore
(seeder image build context).
.cursorrules: update the `cat test-results/results.json` example to the
new path so AI-generated snippets reach for the right file.
Delete the empty test-results/ and playwright-report/ dirs that prior
runs left behind.
* refactor(tests/e2e): one artifacts/ subdir per reporter
Within artifacts/, give each reporter its own named subdir so the layout
tells you what wrote what:
artifacts/
html/ # HTML reporter (was artifacts/html-report)
json/results.json # JSON reporter (was artifacts/results.json)
test-results/ # outputDir — per-test traces/screenshots/videos
`yarn report` and the .cursorrules cat example point at the new paths.
* refactor(tests/e2e): drop SIGNOZ_USER_ROLE env filter and @admin/@editor/@viewer tags
The filter claimed to be role-based but only grep'd by tag — the actual
browser session is always admin (bootstrap creates one admin, auth.setup.ts
saves one storageState, every project uses it). Tagging tests `@viewer`
didn't mean they ran as a viewer; it just meant they'd be in the subset
selected when SIGNOZ_USER_ROLE=Viewer. Superset semantics (admin sees
everything) meant the filter was at best a narrower test selection and
at worst a misleading assertion of role coverage.
Gone:
- getRoleGrepPattern() + grep: line in playwright.config.ts.
- The dedicated setup project's grep override (no filter to override).
- SIGNOZ_USER_ROLE entries in .env.example, README, docs/contributing.
- The "Role-Based Testing" section + all role-tagging guidance and
example snippets in .cursorrules.
- All `{ tag: '@viewer' | '@editor' | '@admin' }` annotations on the 90
affected test sites across 5 spec files (single-line and multi-line
forms). ~90 annotations gone.
For ad-hoc selection, `yarn test --grep <pattern>` still works on
Playwright's normal grep (test titles/paths).
Real role-based coverage (separate users + storageStates per role) is a
different problem — not pretending this was it.
* chore(tests/e2e): drop .cursorrules
* refactor(tests/e2e): move auth from project-level storageState to per-suite fixture
Replaced auth.setup.ts + globally-mounted storageState with a test-scoped
authedPage fixture in tests/e2e/fixtures/auth.ts. Each suite controls its
own identity via `test.use({ user: ... })`; specs that need to run
unauthenticated just request the stock `page` fixture instead.
fixtures/auth.ts:
- Declares `user` as a test option, defaulting to ADMIN (creds from
.env.local / .env).
- authedPage resolves to a Page whose context has storageState mounted
for that user. First request per (user, worker) triggers one login
and writes a per-user storageState file under .auth/; subsequent
requests reuse it via a Promise-valued cache.
- Exposes `User` type and `ADMIN` constant so future suites can declare
additional users (EDITOR, VIEWER) as credentials become available.
playwright.config.ts:
- Drop authFile constant, `setup` project, storageState + dependencies
on each browser project.
tests/auth.setup.ts:
- Deleted. Login logic now lives inside fixtures/auth.ts's login() helper,
called on demand by the fixture rather than upfront for the whole run.
Spec migration (6 files):
- Import `test, expect` from ../fixtures/auth (or ../../fixtures/auth)
instead of @playwright/test.
- Drop `ensureLoggedIn` imports and `await ensureLoggedIn(page)` calls.
- Swap `{ page }` → `{ authedPage: page }` in test and beforeEach
destructures (local var stays `page` via aliasing so test bodies need
no further changes).
Cost: N logins per run, where N = unique users × workers (= 1 × 2–4
today, vs the old 1 globally). Tradeoff for explicit per-suite control.
Specs that need unauth later just use `async ({ page }) => ...` — the
fixture isn't invoked, so no login fires.
291 tests still list (previously 292: the old auth.setup.ts counted as
one fake "test"; it's gone now).
* refactor(tests/e2e): cache auth storageState in memory, drop .auth/ dir
The fixture was writing each user's storageState to .auth/<user>.json and
then handing Playwright the file path. But Playwright's
browser.newContext({ storageState }) accepts the object form too —
ctx.storageState() without a path arg returns the cookies+origins
inline.
Keeping the cache in memory means no filesystem roundtrip per login, no
.auth/ dir to maintain, no stale JSON persisting across runs, and no
gitignore entry for it. Each worker's Map holds one Promise<StorageState>
per unique user, resolved on first login and reused thereafter.
Drop the .auth/ entry from tests/e2e/.gitignore; delete the (now unused)
on-disk .auth/ dir.
* chore(tests/e2e): drop seed.spec.ts
* chore(tests/e2e): drop unused README.md and .mcp.json
* refactor(tests/e2e): move existing specs to legacy/ pending fresh rewrite
Park the 5 current spec files under tests/e2e/legacy/ while fresh specs
get written in tests/e2e/tests/ against the new conventions (TC-NN
titles, authedPage fixture, minimal direct-fetch). Playwright's testDir
stays pointed at ./tests — `yarn test` now finds 0 tests until the
first fresh spec lands. legacy/ is preserved for reference but not
collected by default.
Add a .gitkeep under tests/ so the empty dir survives in git between
the move and the first new spec.
Running legacy on demand:
npx playwright test --config tests/e2e/playwright.config.ts \
--project chromium legacy/<spec>.ts
(or temporarily point testDir at ./legacy in the config). No yarn
script wired — legacy is expected to rot as fresh specs replace it.
* refactor(tests): drop -utils deprecation shims; import from canonical modules
The shims we introduced during the phase-3 merges (authutils, alertutils,
cloudintegrationsutils, idputils, gatewayutils) and the phase-4 primitive
renames (dev, utils, driver) have done their job — integration/ tests can
now import directly from the real modules.
Rewrite every shim-import in tests/integration/tests/:
fixtures.authutils → fixtures.auth
fixtures.alertutils → fixtures.alerts
fixtures.cloudintegrationsutils → fixtures.cloudintegrations
fixtures.idputils → fixtures.idp
fixtures.gatewayutils → fixtures.gateway
fixtures.utils (get_testdata_file_path) → fixtures.fs
Delete all 8 shim files:
fixtures/{authutils,alertutils,cloudintegrationsutils,idputils,
gatewayutils,dev,utils,driver}.py
Nothing in active code (integration tests, e2e fixtures, bootstrap, seeder)
imported fixtures.dev or fixtures.driver, so those had no callers to
sweep — just delete.
500 tests still collect.
* fix(tests/seeder): add python3-dev so psycopg2 can compile in the image
Consolidating seeder deps into pyproject.toml pulled in psycopg2, which
needs Python dev headers (Python.h) to build from source. The apt layer
had gcc + libpq-dev but was missing python3-dev, so \`uv sync --frozen
--no-install-project --no-dev\` failed with "gcc failed with exit code 1"
during the seeder image build.
Add python3-dev to the apt install line; image size bump ~50MB for dev
headers. Alternative would have been swapping psycopg2 for
psycopg2-binary in pyproject.toml, but that'd affect the whole test
project for one Dockerfile concern — wrong scope.
* feat(tests/e2e): re-author 2095 alerts + downtime regression
Three fresh specs split by resource replace the 736-line
legacy monolith at tests/e2e/legacy/alerts/alerts-downtime.spec.ts:
- alerts.spec.ts: rule list CRUD, labels round-trip, test-notification
pre-state, details/history/AlertNotFound, anomaly (EE-gated, skip on
community)
- downtime.spec.ts: planned-downtime CRUD round-trip
- cascade-delete.spec.ts: 409 paths on rule/downtime delete when linked
UI-first: Playwright traces capture the BE conversations, so direct
page.request calls are reserved for seeding where the query-builder
setup is incidental to the test, API-contract probes, and cleanup.
* refactor(tests/e2e): group alerts specs under tests/alerts/
* feat(tests/fixtures/auth): apply_license fixture + wire into e2e bootstrap
- Adds a package-scoped apply_license fixture that stubs the Zeus
/v2/licenses/me mock and POSTs /api/v3/licenses so the BE flips to
ENTERPRISE. The fixture also PUTs org_onboarding=true because the
license enables the onboarding flag which would otherwise hijack
every post-login navigation to a questionnaire.
- Wires apply_license into e2e/bootstrap/setup.py::test_setup and
::test_teardown alongside create_user_admin.
- Existing add_license helper stays as-is for integration tests.
- Login fixture now waits for the URL to leave /login instead of a
pre-license "Hello there" welcome string (the post-login landing
page varies with license state).
- TC-07 anomaly test no longer skips (license enables the flag) and
drops the legacy test-notification API contract probe that needs
seeded metric data (covered by the integration suite).
* chore: cleanup
* chore: remove claude files
* chore(tests/fixtures): drop unused dashboards.py
* chore(tests/e2e): rename playwright outputDir to artifacts/results
* chore(tests/e2e): drop legacy specs, trim alerts.spec.ts to one smoke test
Deletes tests/e2e/legacy/ (five old 2095-replay specs) and the two
sibling alerts suite files (downtime, cascade-delete). alerts.spec.ts
is reduced to a single TC-01 smoke test that loads /alerts and asserts
the tabs render — a fresh minimum to build on.
* docs(contributing): new integration.md + e2e.md at top level
Promotes the two test-contributor docs from docs/contributing/tests/
to docs/contributing/ and rewrites them in the long-form Q&A format
of docs/contributing/go/integration.md (prerequisites → setup →
framework → writing → running → configuring → remember).
Reflects the current state: shared fixtures package at tests/fixtures/,
flat integration suites under tests/integration/tests/, e2e specs
grouped by resource under tests/e2e/tests/<feature>/, apply_license
fixture in the bootstrap, authedPage Playwright fixture, and the
artifacts/{html,json,results} output layout.
* docs(contributing): relocate integration.md + e2e.md to tests/
Moves docs/contributing/go/integration.md -> docs/contributing/tests/integration.md
and docs/contributing/e2e.md -> docs/contributing/tests/e2e.md so the test-
contributor docs live under contributing/tests/. The previous top-level
promotion at docs/contributing/integration.md is removed; go/readme.md
drops the dangling integration link.
* docs(contributing/tests): update integration.md to current repo layout
* ci(tests): fix integrationci paths + add e2eci workflow
integrationci:
- Matrix path was integration/src/<suite> (old layout); current layout
is integration/tests/<suite>. Renames the matrix key src -> suite and
fixes the pytest path accordingly.
- Adds auditquerier and rawexportdata to the matrix (new suites).
- Drops bootstrap from the matrix — it's no longer a test suite, just
the pytest lifecycle entry.
e2eci (new, replaces the broken frontend/-based run-e2e.yaml):
- Label-gated trigger mirroring integrationci: requires safe-to-test +
safe-to-e2e. Runs on pull_request / pull_request_target.
- Installs Python (uv) and Node (yarn), syncs tests/ deps, installs
Playwright browsers for the matrix project.
- Brings the stack up via e2e/bootstrap/setup.py::test_setup --with-web
(build signoz-with-web container once), runs playwright against it,
tears down in an always-run step.
- Uploads the HTML report + per-test traces as artifacts.
- Matrix starts with chromium only (firefox / webkit can follow).
* ci(tests/e2e): upload entire artifacts/ dir, 5-day retention
* fix(tests/fixtures): apply black formatting to truncate helpers
* fix(tests/pyproject): ignore node_modules and py module in pylint
* ci(tests): drop auditquerier from integrationci matrix for now
* refactor(tests): drop __file__.parents[N] path tricks; use pytestconfig.rootpath
The pytest rootdir is already tests/, so anywhere we were computing
_REPO_ROOT / _TESTS_ROOT / e2e-dir from Path(__file__).resolve().parents[N]
can just use pytestconfig.rootpath (or .parent for the repo root).
- fixtures/signoz.py: DockerImage path → pytestconfig.rootpath.parent
- fixtures/seeder.py: docker-py build path → pytestconfig.rootpath
- e2e/bootstrap/setup.py: .env.local path → pytestconfig.rootpath / e2e
- e2e/bootstrap/run.py: yarn-test cwd → pytestconfig.rootpath / e2e
* chore(tests/e2e): drop bootstrap/run.py
The run.py entrypoint was just setup.py + subprocess('yarn test'); CI
splits those steps anyway (separate provision / test / teardown for
clean artifact capture) and locally the two-step flow is equivalent.
Removing the duplicate entrypoint; docs updated accordingly.
* cleanup(tests): simplify review pass
- fixtures/fs.py: testdata path resolved to tests/testdata after the
fixture move; integration tests with data-driven parametrize (e.g.
alerts/02_basic_alert_conditions.py) were all failing with
FileNotFoundError. Walk to tests/integration/testdata now.
- fixtures/auth.py: extract _login helper so apply_license stops
duplicating the GET /sessions/context + POST /sessions/email_password
pair. Add a retry loop on POST /api/v3/licenses so a BE that isn't
quite ready at bring-up time doesn't fail the fixture.
- seeder/server.py: use FastAPI lifespan to open+close the ClickHouse
client instead of a lazy module-level global; collapse the verbose
module docstring.
- fixtures/seeder.py + e2e/bootstrap/setup.py: trim docstrings/comments
that narrated WHAT the code does — per-repo convention keeps only
non-obvious WHY.
- .github/workflows/integrationci.yaml: gate the Chrome + chromedriver
install on matrix.suite == 'callbackauthn' (the only suite that uses
Selenium). Saves ~30s × 50 jobs on every PR run.
|
||
|
|
89b755a6b0 |
feat(infra-monitoring): v2 hosts list api (#10805)
* chore: baseline setup * chore: endpoint detail update * chore: added logic for hosts v3 api * fix: bug fix * chore: disk usage * chore: added validate function * chore: added some unit tests * chore: return status as a string * chore: yarn generate api * chore: removed isSendingK8sAgentsMetricsCode * chore: moved funcs * chore: added validation on order by * chore: updated spec * chore: nil pointer dereference fix in req.Filter * chore: added temporalities of metrics * chore: unified composite key function * chore: code improvements * chore: hostStatusNone added for clarity that this field can be left empty as well in payload * chore: yarn generate api * chore: return errors from getMetadata and lint fix * chore: return errors from getMetadata and lint fix * chore: added hostName logic * chore: modified getMetadata query * chore: add type for response and files rearrange * chore: warnings added passing from queryResponse warning to host lists response struct * chore: added better metrics existence check * chore: added a TODO remark * chore: added required metrics check * chore: distributed samples table to local table change for get metadata * chore: frontend fix * chore: endpoint correction * chore: endpoint modification openapi * chore: escape backtick to prevent sql injection * chore: rearrage * chore: improvements * chore: validate order by to validate function * chore: improved description * chore: added TODOs and made filterByStatus a part of filter struct * chore: ignore empty string hosts in get active hosts * feat(infra-monitoring): v2 hosts list - return counts of active & inactive hosts for custom group by attributes (#10956) * chore: add functionality for showing active and inactive counts in custom group by * chore: bug fix * chore: added subquery for active and total count * chore: ignore empty string hosts in get active hosts * fix: sinceUnixMilli for determining active hosts compute once per request * chore: refactor code * chore: rename HostsList -> ListHosts * chore: rearrangement * chore: inframonitoring types renaming * chore: added types package * chore: file structure further breakdown for clarity * chore: comments correction * chore: removed temporalities * chore: comments resolve * chore: added json tag required: true * chore: added status unauthorized * chore: remove a defensive nil map check, the function ensure non-nil map when err nil * chore: make sort stable in case of tiebreaker by comparing composite group by keys * chore: regen api client for inframonitoring Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e015310b5b |
chore: add request examples for alert rules (#11023)
* chore: add request examples for alert rules * chore: address ci |
||
|
|
f52d89c338 |
docs(go): add types.md covering core-type and Postable/Gettable/Storable conventions (#10998)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* docs(go): add types.md covering core-type and Postable/Gettable/Storable conventions Document the core-type-first pattern used across pkg/types/: always define the domain type X, and introduce Postable/Gettable/Updatable/Storable flavors only when their shape actually differs from X. Walk through Channel (core only), AuthDomain (all four flavors), and Rule (the in-progress v2 split) as worked examples. * docs(go): drop Rule migration example from types.md * docs(go): allow both NewFromX and ToX conversion forms in types.md * docs(go): move validation guidance to core type X in types.md * docs(go): drop exact file:line refs in types.md, use inline examples * docs(go): split spelling guidance into Updatable and Storable bullets * docs(go): trim conversion examples to Channel and AuthDomain * docs(go): swap remaining examples to AuthDomain/Channel where possible |
||
|
|
61ae49d4ab |
refactor(ruler): introduce v2 Rule read type and validate uuid on delete (#10997)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(ruler): add Rule v2 read type and rename storage Rule to StorableRule * refactor(ruler): map GettableRule to Rule before responding on v2 routes * docs(openapi): regenerate spec with RuletypesRule on v2 rules routes * docs(frontend): regenerate API clients with RuletypesRuleDTO * refactor(ruler): validate uuid-v7 on delete rule handler |
||
|
|
a5e1f71cf6 |
refactor(ruler): tighten rule and downtime OpenAPI schema (#10995)
* refactor(ruler): add Enum() on AlertType * refactor(ruler): convert RepeatType and RepeatOn to valuer.String with Enum() * refactor(ruler): mark required fields on Recurrence * refactor(ruler): mark required tag on CumulativeSchedule.Type * refactor(ruler): rename GettablePlannedMaintenance to PlannedMaintenance * docs: regenerate OpenAPI spec and frontend clients with tightened schema * refactor(ruler): add PostablePlannedMaintenance input type with Validate * refactor(ruler): rename EditPlannedMaintenance to Update and GetAll to List * refactor(ruler): switch Create/Update to *PostablePlannedMaintenance * refactor(ruler): convert PlannedMaintenance.Id string to ID valuer.UUID * refactor(ruler): return *PlannedMaintenance from CreatePlannedMaintenance * docs: regenerate OpenAPI spec and frontend clients for Postable/ID changes * refactor(ruler): type PlannedMaintenance.Status as MaintenanceStatus enum * refactor(ruler): type PlannedMaintenance.Kind as MaintenanceKind enum * refactor(ruler): mark GettableRule.Id required * refactor(ruler): mark GettableRule.State required * refactor(ruler): make GettableRule timestamps non-pointer and users nullable * refactor(ruler): return bare array from v2 ListRules instead of wrapped object * docs: regenerate OpenAPI spec and frontend clients for schema pass |
||
|
|
c9610df66d |
refactor(ruler): move rules and planned maintenance handlers to signozapiserver (#10957)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(ruler): define Ruler and Handler interfaces with signozruler implementation
Expand the Ruler interface with rule management and planned maintenance
methods matching rules.Manager signatures. Add Handler interface for
HTTP endpoints. Implement handler in signozruler wrapping ruler.Ruler,
and update provider to embed *rules.Manager for interface satisfaction.
* refactor(ruler): move eval_delay from query-service constants to ruler config
Replace constants.GetEvalDelay() with config.EvalDelay on ruler.Config,
defaulting to 2m. This removes the signozruler dependency on
pkg/query-service/constants.
* refactor(ruler): use time.Duration for eval_delay config
Match the convention used by all other configs in the codebase.
TextDuration is for preserving human-readable text through JSON
round-trips in user-facing rule definitions, not for internal config.
* refactor(ruler): add godoc comments and spacing to Ruler interface
* refactor(ruler): wire ruler handler through signoz.New and signozapiserver
- Add Start/Stop to Ruler interface for lifecycle management
- Add rulerCallback to signoz.New() for EE customization
- Wire ruler.Handler through Handlers, signozapiserver provider
- Register 12 routes in signozapiserver/ruler.go (7 rules, 5 downtime)
- Update cmd/community and cmd/enterprise to pass rulerCallback
- Move rules.Manager creation from server.go to signoz.New via callback
- Change APIHandler.ruleManager type from *rules.Manager to ruler.Ruler
- Remove makeRulesManager from both OSS and EE server.go
* refactor(ruler): remove old rules and downtime_schedules routes from http_handler
Remove 7 rules CRUD routes and 5 downtime_schedules routes plus their
handler methods from http_handler.go. These are now served by
signozapiserver/ruler.go via handler.New() with OpenAPIDef.
The 4 v1 history routes (stats, timeline, top_contributors,
overall_status) remain in http_handler.go as they depend on
interfaces.Reader and have v2 equivalents already in signozapiserver.
* refactor(ruler): use ProviderFactory pattern and register in factory.Registry
Replace the rulerCallback with rulerProviderFactories following the
standard ProviderFactory pattern (like auditorProviderFactories). The
ruler is now created via factory.NewProviderFromNamedMap and registered
in factory.Registry for lifecycle management. Start/Stop are no longer
called manually in server.go.
- Ruler interface embeds factory.Service (Start/Stop return error)
- signozruler.NewFactory accepts all deps including EE task funcs
- provider uses named field (not embedding) with explicit delegation
- cmd/community passes nil task funcs, cmd/enterprise passes EE funcs
- Remove NewRulerProviderFactories (replaced by callback from cmd/)
- Remove manual Start/Stop from both OSS and EE server.go
* fix(ruler): make Start block on stopC per factory.Service contract
rules.Manager.Start is non-blocking (run() just closes a channel).
Add stopC to provider so Start blocks until Stop closes it, matching
the factory.Service contract used by the Registry.
* refactor(ruler): remove unused RM() accessor from EE APIHandler
* refactor(ruler): remove RuleManager from APIHandlerOpts
Use Signoz.Ruler directly instead of passing it through opts.
* refactor(ruler): add /api/v1/rules/test and mark /api/v1/testRule as deprecated
* refactor(ruler): use binding.JSON.BindBody for downtime schedule decode
* refactor(ruler): add TODOs for raw string params on Ruler interface
Mark CreateRule, EditRule, PatchRule, TestNotification, and DeleteRule
with TODOs to accept typed params instead of raw JSON strings. Requires
changing the storage model since the manager stores raw JSON as Data.
* refactor(ruler): add TODO on MaintenanceStore to not expose store directly
* docs: regenerate OpenAPI spec and frontend API clients with ruler routes
* refactor(ruler): rename downtime_schedules tag to downtimeschedules
* refactor(ruler): add query params to ListDowntimeSchedules OpenAPIDef
Add ListPlannedMaintenanceParams struct with active/recurring fields.
Use binding.Query.BindQuery in the handler instead of raw URL parsing.
Add RequestQuery to the OpenAPIDef so params appear in the OpenAPI spec
and generated frontend client.
* refactor(ruler): add GettableTestRule response type to TestRule endpoint
Define GettableTestRule struct with AlertCount and Message fields.
Use it as the Response in TestRule OpenAPIDef so the generated frontend
client has a proper response type instead of string.
* refactor(ruler): tighten schema with oneOf unions and required fields
Surface the polymorphism in RuleThresholdData and EvaluationEnvelope via
JSONSchemaOneOf (the same pattern as QueryEnvelope), so the generated
TS types are discriminated unions with typed `spec` instead of unknown.
Also mark `alert`, `ruleType`, and `condition` required on PostableRule
so the generated TS types are non-optional for callers.
* refactor(ruler): add Enum() on EvaluationKind, ScheduleType, ThresholdKind
Surface the fixed set of accepted values for these valuer-wrapped kind
types so OpenAPI emits proper string-enum schemas and the generated TS
types become string-literal unions instead of plain string.
* refactor(ruler): mark required fields on nested rule and maintenance types
Surface fields already enforced by Validate()/UnmarshalJSON as required
in the OpenAPI schema so the generated TS types match runtime behavior.
Touches RuleCondition (compositeQuery, op, matchType), RuleThresholdData
(kind, spec), BasicRuleThreshold (name, target, op, matchType),
RollingWindow (evalWindow, frequency), CumulativeWindow (schedule,
frequency, timezone), EvaluationEnvelope (kind, spec), Schedule
(timezone), GettablePlannedMaintenance (name, schedule).
Does not mark server-populated fields (id, createdAt, updatedAt, status,
kind) on GettablePlannedMaintenance required, since the same struct is
reused for request bodies in MaintenanceStore.CreatePlannedMaintenance.
* refactor(ruler): tighten AlertCompositeQuery, QueryType, PanelType schema
Missed in the earlier tightening pass. AlertCompositeQuery.queries,
panelType, queryType are all required for a valid composite query;
QueryType and PanelType are valuer-wrapped with fixed value sets, so
expose them as enums in the OpenAPI schema.
* refactor(ruler): wrap sql.ErrNoRows as TypeNotFound in by-ID lookups
GetStoredRule and GetPlannedMaintenanceByID previously returned bun's
raw Scan error, so a missing ID leaked "sql: no rows in result set" to
the HTTP response with a 500 status. WrapNotFoundErrf converts
sql.ErrNoRows into TypeNotFound so render.Error emits 404 with a stable
`not_found` code, and passes other errors through unchanged.
* refactor(ruler): move migrated rules routes to /api/v2/rules
The 7 rules routes now live at /api/v2/rules, /api/v2/rules/{id}, and
/api/v2/rules/test — served via handler.New with render.Success and
render.Error. The legacy /api/v1/rules paths will be restored in the
query-service http handler in a follow-up so existing clients keep
receiving the SuccessResponse envelope unchanged.
Drop the /api/v1/testRule deprecated alias from signozapiserver; the
original lives on main's http_handler.go and is restored alongside the
other v1 paths.
Downtime schedule routes stay at /api/v1/downtime_schedules — single
track, no legacy restore planned.
* refactor(ruler): restore /api/v1/rules legacy handlers for back-compat
Bring the 7 rule CRUD/test handlers and their router.HandleFunc lines
back to http_handler.go so /api/v1/rules, /api/v1/rules/{id}, and
/api/v1/testRule continue to emit the legacy SuccessResponse envelope.
The v2 versions under signozapiserver are the new home for the render
envelope used by generated clients.
Delegation uses aH.ruleManager (populated from opts.Signoz.Ruler in
NewAPIHandler), so a single ruler.Ruler instance serves both paths — no
second rules.Manager is instantiated.
Downtime schedules stay single-track under signozapiserver; the 5
downtime handlers are not restored.
* docs: regenerate OpenAPI spec and frontend clients for /api/v2/rules
* refactor(ruler): return 201 Created on POST /api/v2/rules
A successful create now responds with 201 Created and the full
GettableRule body, matching REST convention for resource creation.
Regenerates the OpenAPI spec and frontend clients to reflect the new
status code.
* refactor(ruler): restore dropped sorter TODO in legacy listRules
The legacy listRules handler was copied verbatim from main during the
v1 back-compat restore, but an inner blank line and the load-bearing
`// todo(amol): need to add sorter` comment were stripped. Put them
back so the legacy block round-trips cleanly against main.
* refactor(ruler): return 201 Created on POST /api/v1/downtime_schedules
Match the REST convention already applied to POST /api/v2/rules:
successful creates respond with 201 Created. Response body remains
empty (nil); the generated frontend client surface is unchanged since
no response type was declared.
A richer "return the created resource" response body is a separate
follow-up — holding off until the ruletypes naming cleanup lands.
* fix(ruler): signal Healthy only after manager.Start closes m.block
The ruler provider didn't implement factory.Healthy, so the registry
fell back to factory.closedC and marked the service StateRunning the
instant its Start goroutine spawned — before rules.Manager.Start had
closed m.block. /api/v2/healthz therefore returned 200 while rule
evaluation was still gated, and integration tests that POSTed a rule
immediately after the readiness check saw their task goroutines stuck
on <-m.block until the next frequency tick.
Add a healthyC channel and close it inside Start only after
manager.Start returns; implement factory.Healthy so the registry and
/api/v2/healthz wait on the real readiness signal.
* fix: add the withhealthy interface
* fix(ruler): alias legacy RULES_EVAL_DELAY env var in backward-compat
The eval_delay config was moved from query-service constants (read from
RULES_EVAL_DELAY) onto ruler.Config (read via mapstructure from
SIGNOZ_RULER_EVAL__DELAY). That silently broke the legacy env var for
any existing deployment — notably the alerts integration-test fixture
which sets RULES_EVAL_DELAY=0s to let rules evaluate against just-
inserted data. The resulting default 2m delay pushed the query window
far enough back that the fixture's rate spike fell outside it, causing
8 of 24 parametrize cases in 02_basic_alert_conditions.py to fail with
"Expected N alerts to be fired but got 0 alerts".
Add RULES_EVAL_DELAY to mergeAndEnsureBackwardCompatibility alongside
the ~10 other aliased legacy env vars. Emits the standard deprecation
warning and overrides config.Ruler.EvalDelay.
|
||
|
|
c8099a88c3 |
feat(member): add v2 reset password token endpoints and show invite expiry status (#10954)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* fix(member): better UX for pending invite users * fix(member): add integration tests and reuse timezone util * fix(member): rename deprecated and remove dead files * fix(member): do not use hypened endpoints * fix(member): user friendly button text * fix(member): update the API endpoints and integration tests * fix(member): simplify handler naming convention * fix(member): added v2 API for update my password * fix(member): remove more dead code * fix(member): fix integration tests * fix(member): fix integration tests |
||
|
|
c24579be12 | chore: updated the onboarding contributor doc as per the new enhancement in the asset management (#10955) | ||
|
|
b3da6fb251 |
refactor(alertmanager): move API handlers to signozapiserver (#10941)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* refactor(alertmanager): move API handlers to signozapiserver Extract Handler interface in pkg/alertmanager/handler.go and move the implementation from api.go to signozalertmanager/handler.go. Register all alertmanager routes (channels, route policies, alerts) in signozapiserver via handler.New() with OpenAPIDef. Remove AlertmanagerAPI injection from http_handler.go. This enables future AuditDef instrumentation on these routes. * fix(review): rename param, add /api/v1/channels/test endpoint - Rename `am` to `alertmanagerService` in NewHandlers - Add /api/v1/channels/test as the canonical test endpoint - Mark /api/v1/testChannel as deprecated - Regenerate OpenAPI spec * fix(review): use camelCase for channel orgId json tag * fix(review): remove section comments from alertmanager routes * fix(review): use routepolicies tag without hyphen * chore: regenerate frontend API clients for alertmanager routes * fix: add required/nullable/enum tags to alertmanager OpenAPI types - PostableRoutePolicy: mark expression, name, channels as required - GettableRoutePolicy: change CreatedAt/UpdatedAt from pointer to value - Channel: mark name, type, data, orgId as required - ExpressionKind: add Enum() for rule/policy values - Regenerate OpenAPI spec and frontend clients * fix: use typed response for GetAlerts endpoint * fix: add Receiver request type to channel mutation endpoints CreateChannel, UpdateChannelByID, TestChannel, and TestChannelDeprecated all read req.Body as a Receiver config. The OpenAPIDef must declare the request type so the generated SDK includes the body parameter. * fix: change CreateChannel access from EditAccess to AdminAccess Aligns CreateChannel endpoint with the rest of the channel mutation endpoints (update/delete) which all require admin access. This is consistent with the frontend where notifications are not accessible to editors. |
||
|
|
6ad2711c7a |
feat(/fields/*) : introduce a new param metricNamespace in the APIs for prefix match (#10779)
* chore: initial commit * chore: added metricNamespace as a new param * chore: go generate openapi, update spec * chore: frontend yarn generate:api * chore: added metricnamespace support in /fields/values as well as added integration tests * chore: corrected comment * chore: added unit tests for getMetricsKeys and getMeterSourceMetricKeys --------- Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com> |
||
|
|
7279c5f770 |
feat: adding query params in cloud integration APIs (#10900)
Some checks failed
Release Drafter / update_release_draft (push) Has been cancelled
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
* feat: adding query params in cloud integration APIs * refactor: create account HTTP status change from OK to CREATED |
||
|
|
92660b457d |
feat: adding types changes and openapi spec (#10866)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
* feat: adding types changes and openapi spec * refactor: review changes * feat: generating OpenAPI spec * refactor: updating create account types * refactor: removing email domain function |