Compare commits

...

40 Commits

Author SHA1 Message Date
Jatinderjit Singh
c958b132bd chore(planned-downtime): use clickable learn more link in scope tooltip 2026-05-20 17:23:35 +05:30
Jatinderjit Singh
82c0517ef1 fix(maintenance): consolidate label-set-to-env conversion to avoid expr panic
Move ConvertLabelSetToEnv to alertmanagertypes so both the maintenance
scope evaluator and the route-policy evaluator share one implementation.
Dotted label keys (e.g. kubernetes.node) are expanded into nested maps,
preventing the expr-lang panic that occurs when one key is a prefix of
another.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 15:39:07 +05:30
Jatinderjit Singh
44ec2bc044 Use AND instead of &&
Co-authored-by: Srikanth Chekuri <srikanth.chekuri92@gmail.com>
2026-05-20 14:14:09 +05:30
Jatinderjit Singh
928c5ea1e9 test(e2e): verify scope-based maintenance muting in alertmanager flow
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 14:10:25 +05:30
Jatinderjit Singh
8aa19a104c test(maintenance): add tests for scope label expression filtering
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 13:47:41 +05:30
Jatinderjit Singh
f1f1a6a670 chore: rename label expression to scope 2026-05-20 12:09:36 +05:30
Jatinderjit Singh
76fce84e97 Merge branch 'main' into feat/maintenance-label-expression 2026-05-20 11:45:30 +05:30
Jatinderjit Singh
494552c530 remove unused function evaluateLabelExpression 2026-05-19 17:41:36 +05:30
Jatinderjit Singh
f7e7485f97 fix(tests): resolve envprovider env isolation, factory name length, and ShouldSkip signature
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 17:41:36 +05:30
Jatinderjit Singh
493b337494 fix lset type and update openapi spec 2026-05-19 17:41:36 +05:30
Jatinderjit Singh
8dc458e6ba implement Down migration to drop label_expression column 2026-05-19 17:41:36 +05:30
Jatinderjit Singh
e7176de589 remove redundant LabelSet->map conversion 2026-05-19 17:41:36 +05:30
Jatinderjit Singh
0500388d4c Move label expression evaluation into ShouldSkip
ShouldSkip now owns all three suppression checks in sequence:
rule ID match → schedule active → label expression.
IsActive passes nil labels so the expression check is skipped
(no instance labels available for UI status).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 17:41:36 +05:30
Jatinderjit Singh
d2010b01ab Remove redundant || undefined from labelExpression assignment 2026-05-19 17:41:36 +05:30
Jatinderjit Singh
3929138c87 Add label expression support to planned downtime
Alert instances can now be scoped by label expression
(e.g. env == "prod"), scoping suppression below the rule level.
A window with no rule IDs and a label expression silences any
alert whose labels match, regardless of which rule fired it;
when rule IDs are also present, the expression is evaluated only
within the matched rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 17:41:36 +05:30
Jatinderjit Singh
f08782adeb test: add e2e muting tests for maintenance window behaviour 2026-05-19 17:37:06 +05:30
Jatinderjit Singh
b81f0dc8e5 cleanup test 2026-05-19 16:38:21 +05:30
Jatinderjit Singh
a97feaceb0 chore: regenerate mocks via make gen-mocks
Picks up new MockHandler for the Handler interface in pkg/alertmanager
and regenerates MockMaintenanceStore with canonical mockery formatting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 16:25:17 +05:30
Jatinderjit Singh
165218dedb test: use mockery-generated mock for MaintenanceStore in muter tests
Replace hand-written fakeMaintenanceStore with a mockery-generated
MockMaintenanceStore, consistent with the alertmanagertest pattern.
Also adds MaintenanceStore to .mockery.yml so the mock stays in sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 15:11:48 +05:30
Jatinderjit Singh
6ee3f75bf6 Go lint fixes 2026-05-19 15:02:47 +05:30
Jatinderjit Singh
32e5bf2f17 fix NewMaintenanceStore in tests 2026-05-19 14:51:18 +05:30
Jatinderjit Singh
dbebb76bda Re-add marker 2026-05-19 14:29:26 +05:30
Jatinderjit Singh
13547f29e4 Update schema changes 2026-05-19 14:10:03 +05:30
Jatinderjit Singh
cd3e4bcb87 test: add unit tests for MaintenanceMuter
Covers Mutes/MutedBy semantics (empty label, rule match, empty-RuleIDs
matches-all, future windows, multi-window) and the result cache
(single-fetch within TTL, stale-cache fallback on store error,
re-fetch after expiry, concurrency safety).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 13:57:04 +05:30
Jatinderjit Singh
b5ae86c3f7 refactor: move maintenance (planned downtime) to alertmanager packages
Types move from pkg/types/ruletypes/ to pkg/types/alertmanagertypes/:
- maintenance.go, recurrence.go, schedule.go (+ tests)

Store impl moves from pkg/ruler/rulestore/sqlrulestore/ to
pkg/alertmanager/alertmanagerstore/sqlalertmanagerstore/.

Maintenance windows mute alerts, so they belong with alertmanager
rather than the rule types.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 13:57:04 +05:30
Jatinderjit Singh
6143b9bac2 code cleanup 2026-05-19 13:53:29 +05:30
Jatinderjit Singh
09ac78b42e feat: surface maintenance-suppressed alerts via mutedBy in GetAlerts
Alerts suppressed by an active maintenance window were being correctly
muted in the notification pipeline but appeared as state=active in the
v2 GetAlerts response, since MaintenanceMuter.Mutes had no marker
side-effect (unlike inhibitor/silencer).

Add MaintenanceMuter.MutedBy returning the matching window IDs, and
plumb a mutedByFunc callback through NewGettableAlertsFromAlertProvider
into AlertToOpenAPIAlert. The upstream v2 API forces state=suppressed
when mutedBy is non-empty, so the frontend's existing state-based
rendering picks it up without further changes.

Use the dedicated mutedBy field rather than SilencedBy to avoid
violating the "complete set of silence IDs" contract that anything
querying silences by ID would rely on.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
84ca7c0bd9 remove redundant MemMarker wrapper 2026-05-19 13:53:29 +05:30
Jatinderjit Singh
08c763ba0a refactor: move MaintenanceMuter to Server and pass it to pipelineBuilder.New
- Remove muter from pipelineBuilder struct and newPipelineBuilder();
  pass it as a parameter to New() instead, consistent with inhibitor/silencer
- Store muter on Server so GetAlerts can call Mutes() alongside the
  inhibitor and silencer, ensuring maintenance-suppressed alerts show
  the correct muted status in API responses

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
27182a0275 refactor: always initialize maintenanceStore; remove nil guards
Tests now use a real sqlrulestore-backed MaintenanceMuter instead of
passing nil. With nil no longer a valid input, remove the nil guards
in server.go and pipeline_builder.go.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
2a59ec62ca refactor: hoist MuteStage construction out of the receiver loop
MuteStage holds no per-receiver state, so one instance shared across
all receivers is sufficient — matching how is/ss are handled upstream.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
ec07c80e70 refactor: replace maintenanceMuteStage with notify.NewMuteStage
MaintenanceMuter already satisfies types.Muter, and pipelineBuilder has
its own pb.metrics, so the hand-rolled maintenanceMuteStage wrapper is
redundant. Use notify.NewMuteStage(pb.muter, pb.metrics) directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
32747dcb52 rename buildReceiverStage -> createReceiverStage 2026-05-19 13:53:29 +05:30
Jatinderjit Singh
b517e97612 refactor: remove dead orgID param from task constructors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
7b99f4475c refactor: pass MaintenanceMuter directly to pipelineBuilder
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
7a8826531e chore: replace SPDX tag with full Apache 2.0 license boilerplate
The full license text is unambiguously compliant with Apache 2.0 Section 4(a),
which requires giving recipients "a copy of this License".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
6ca316df01 chore: add license header to pipeline_builder.go
Copied code originates from Apache-2.0 licensed Prometheus Alertmanager;
add dual copyright + SPDX identifier following the repo's convention.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
e5ba44f257 refactor: move maintenance mute stage into custom pipelineBuilder
Copy notify.PipelineBuilder locally so we can inject mms between the
silence stage and the receiver stage (GossipSettle → Inhibit →
TimeActive → TimeMute → Silence → mms → Receiver), matching the
correct suppression order the team requires.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
49a6f879a2 refactor: wrap routing pipeline once instead of per-route injection
Replace the per-route-entry loop with a single MultiStage wrap so
maintenance suppression runs once per dispatch group before routing.
2026-05-19 13:53:29 +05:30
Jatinderjit Singh
4127705a0c add maintenanceMuteStage to move planned maintenance to alertmanager
Rules previously skipped rule.Eval() entirely during maintenance windows.
This change moves suppression to MaintenanceMuter, injected as a Stage
in the alertmanager notification pipeline. Now rules always evaluate and
everys suppression is handled by alertmanager.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:53:29 +05:30
14 changed files with 643 additions and 156 deletions

View File

@@ -129,6 +129,8 @@ components:
type: string
schedule:
$ref: '#/components/schemas/AlertmanagertypesSchedule'
scope:
type: string
status:
$ref: '#/components/schemas/AlertmanagertypesMaintenanceStatus'
updatedAt:
@@ -272,6 +274,8 @@ components:
type: string
schedule:
$ref: '#/components/schemas/AlertmanagertypesSchedule'
scope:
type: string
required:
- name
- schedule

View File

@@ -225,6 +225,10 @@ export interface AlertmanagertypesPlannedMaintenanceDTO {
*/
name: string;
schedule: AlertmanagertypesScheduleDTO;
/**
* @type string
*/
scope?: string;
status: AlertmanagertypesMaintenanceStatusDTO;
/**
* @type string
@@ -1714,6 +1718,10 @@ export interface AlertmanagertypesPostablePlannedMaintenanceDTO {
*/
name: string;
schedule: AlertmanagertypesScheduleDTO;
/**
* @type string
*/
scope?: string;
}
export interface AlertmanagertypesPostableRoutePolicyDTO {

View File

@@ -1,5 +1,5 @@
import React, { useCallback, useEffect, useMemo, useState } from 'react';
import { Check } from '@signozhq/icons';
import { Check, Info } from '@signozhq/icons';
import {
Button,
DatePicker,
@@ -11,6 +11,7 @@ import {
Select,
SelectProps,
Spin,
Tooltip,
} from 'antd';
import { Typography } from '@signozhq/ui/typography';
import type { DefaultOptionType } from 'antd/es/select';
@@ -78,6 +79,7 @@ interface PlannedDowntimeFormData {
alertRules: DefaultOptionType[];
recurrenceSelect?: AlertmanagertypesRecurrenceDTO;
timezone?: string;
scope?: string;
}
const customFormat = DATE_TIME_FORMATS.ORDINAL_DATETIME;
@@ -144,6 +146,7 @@ export function PlannedDowntimeForm(
.map((alert) => alert.value)
.filter((alert) => alert !== undefined) as string[],
name: values.name,
scope: values.scope,
schedule: {
startTime: values.startTime?.format(),
endTime: values.endTime?.format(),
@@ -278,6 +281,7 @@ export function PlannedDowntimeForm(
duration: getDurationInfo(schedule?.recurrence?.duration)?.value ?? '',
} as AlertmanagertypesRecurrenceDTO,
timezone: schedule?.timezone as string,
scope: initialValues.scope || '',
};
}, [initialValues, alertOptions]);
@@ -311,7 +315,7 @@ export function PlannedDowntimeForm(
default:
return `Scheduled for ${formattedStartDate} starting at ${formattedStartTime}.`;
}
}, [formData, recurrenceType, timezone]);
}, [formData, recurrenceType]);
const endTimeText = useMemo((): string => {
const endTime = formData.endTime;
@@ -322,7 +326,7 @@ export function PlannedDowntimeForm(
const formattedEndTime = endTime.format(TIME_FORMAT);
const formattedEndDate = endTime.format(DATE_FORMAT);
return `Scheduled to end maintenance on ${formattedEndDate} at ${formattedEndTime}.`;
}, [formData, recurrenceType, timezone]);
}, [formData, recurrenceType]);
return (
<Modal
@@ -488,6 +492,36 @@ export function PlannedDowntimeForm(
</Select>
</Form.Item>
</div>
<Form.Item
label={
<span>
Scope&nbsp;
<Tooltip
mouseLeaveDelay={0.3}
title={
<span>
Scope the planned downtime by alert labels.{' '}
<a
href="https://signoz.io/docs/alerts-management/planned-maintenance/#scoping-with-label-expressions"
target="_blank"
rel="noopener noreferrer"
>
Learn more
</a>
</span>
}
>
<Info size={13} />
</Tooltip>
</span>
}
name="scope"
>
<Input.TextArea
placeholder='e.g. env = "prod" AND region = "us-east-1"'
autoSize={{ minRows: 2, maxRows: 4 }}
/>
</Form.Item>
<Form.Item style={{ marginBottom: 0 }}>
<ModalButtonWrapper>
<Button

View File

@@ -42,7 +42,7 @@ func (m *MaintenanceMuter) Mutes(ctx context.Context, lset model.LabelSet) bool
}
now := time.Now()
for _, mw := range m.getMaintenances(ctx) {
if mw.ShouldSkip(ruleID, now) {
if mw.ShouldSkip(ruleID, now, lset) {
return true
}
}
@@ -61,7 +61,7 @@ func (m *MaintenanceMuter) MutedBy(ctx context.Context, lset model.LabelSet) []s
var ids []string
now := time.Now()
for _, mw := range m.getMaintenances(ctx) {
if mw.ShouldSkip(ruleID, now) {
if mw.ShouldSkip(ruleID, now, lset) {
ids = append(ids, mw.ID.String())
}
}

View File

@@ -87,18 +87,25 @@ func TestEndToEndAlertManagerFlow(t *testing.T) {
err = notificationManager.SetNotificationConfig(orgID, "high-cpu-usage", &notifConfig)
require.NoError(t, err)
mwID := valuer.GenerateUUID()
activeSchedule := &alertmanagertypes.Schedule{
Timezone: "UTC",
StartTime: time.Now().Add(-time.Hour),
EndTime: time.Now().Add(time.Hour),
}
// mwRuleIDAndScope: only critical high-cpu-usage alerts.
mwRuleIDAndScope := valuer.GenerateUUID()
// mwRuleIDOnly: all high-cpu-usage alerts regardless of severity.
mwRuleIDOnly := valuer.GenerateUUID()
// mwScopeOnly: all critical alerts regardless of rule ID.
mwScopeOnly := valuer.GenerateUUID()
maintenanceStore := alertmanagertypestest.NewMockMaintenanceStore(t)
maintenanceStore.On("ListPlannedMaintenance", mock.Anything, orgID).Return(
[]*alertmanagertypes.PlannedMaintenance{{
ID: mwID,
Schedule: &alertmanagertypes.Schedule{
Timezone: "UTC",
StartTime: time.Now().Add(-time.Hour),
EndTime: time.Now().Add(time.Hour),
},
RuleIDs: []string{"high-cpu-usage"},
}}, nil,
[]*alertmanagertypes.PlannedMaintenance{
{ID: mwRuleIDAndScope, Schedule: activeSchedule, RuleIDs: []string{"high-cpu-usage"}, Scope: `severity == "critical"`},
{ID: mwRuleIDOnly, Schedule: activeSchedule, RuleIDs: []string{"high-cpu-usage"}},
{ID: mwScopeOnly, Schedule: activeSchedule, Scope: `severity == "critical"`},
}, nil,
)
srvCfg := NewConfig()
@@ -249,18 +256,42 @@ func TestEndToEndAlertManagerFlow(t *testing.T) {
require.Equal(t, "{__receiver__=\"webhook\"}:{cluster=\"prod-cluster\", instance=\"server-03\", ruleId=\"high-cpu-usage\"}", alertGroups[2].GroupKey)
})
t.Run("verify_muting", func(t *testing.T) {
req, err := http.NewRequest(http.MethodGet, "/alerts", nil)
require.NoError(t, err)
params, err := alertmanagertypes.NewGettableAlertsParams(req)
require.NoError(t, err)
alerts, err := server.GetAlerts(ctx, params)
require.NoError(t, err)
req, err := http.NewRequest(http.MethodGet, "/alerts", nil)
require.NoError(t, err)
params, err := alertmanagertypes.NewGettableAlertsParams(req)
require.NoError(t, err)
alerts, err := server.GetAlerts(ctx, params)
require.NoError(t, err)
t.Run("verify_muting_ruleid_and_scope", func(t *testing.T) {
// Window with ruleID + scope mutes only alerts matching both.
for _, alert := range alerts {
if alert.Labels["ruleId"] == "high-cpu-usage" && alert.Labels["severity"] == "critical" {
require.Contains(t, alert.Status.MutedBy, mwRuleIDAndScope.String())
} else {
require.NotContains(t, alert.Status.MutedBy, mwRuleIDAndScope.String())
}
}
})
t.Run("verify_muting_ruleid_only", func(t *testing.T) {
// Window with ruleID but no scope mutes all severities for that rule.
for _, alert := range alerts {
if alert.Labels["ruleId"] == "high-cpu-usage" {
require.Equal(t, []string{mwID.String()}, alert.Status.MutedBy)
require.Contains(t, alert.Status.MutedBy, mwRuleIDOnly.String())
} else {
require.Empty(t, alert.Status.MutedBy)
require.NotContains(t, alert.Status.MutedBy, mwRuleIDOnly.String())
}
}
})
t.Run("verify_muting_scope_only", func(t *testing.T) {
// Window with scope but no ruleIDs mutes all critical alerts regardless of rule.
for _, alert := range alerts {
if alert.Labels["severity"] == "critical" {
require.Contains(t, alert.Status.MutedBy, mwScopeOnly.String())
} else {
require.NotContains(t, alert.Status.MutedBy, mwScopeOnly.String())
}
}
})

View File

@@ -89,6 +89,7 @@ func (r *maintenance) CreatePlannedMaintenance(ctx context.Context, maintenance
Description: maintenance.Description,
Schedule: maintenance.Schedule,
OrgID: claims.OrgID,
Scope: maintenance.Scope,
}
maintenanceRules := make([]*alertmanagertypes.StorablePlannedMaintenanceRule, 0)
@@ -123,7 +124,6 @@ func (r *maintenance) CreatePlannedMaintenance(ctx context.Context, maintenance
NewInsert().
Model(&maintenanceRules).
Exec(ctx)
if err != nil {
return err
}
@@ -141,6 +141,7 @@ func (r *maintenance) CreatePlannedMaintenance(ctx context.Context, maintenance
Description: storablePlannedMaintenance.Description,
Schedule: storablePlannedMaintenance.Schedule,
RuleIDs: maintenance.AlertIds,
Scope: maintenance.Scope,
CreatedAt: storablePlannedMaintenance.CreatedAt,
CreatedBy: storablePlannedMaintenance.CreatedBy,
UpdatedAt: storablePlannedMaintenance.UpdatedAt,
@@ -189,6 +190,7 @@ func (r *maintenance) UpdatePlannedMaintenance(ctx context.Context, maintenance
Description: maintenance.Description,
Schedule: maintenance.Schedule,
OrgID: claims.OrgID,
Scope: maintenance.Scope,
}
storablePlannedMaintenanceRules := make([]*alertmanagertypes.StorablePlannedMaintenanceRule, 0)
@@ -224,7 +226,6 @@ func (r *maintenance) UpdatePlannedMaintenance(ctx context.Context, maintenance
Model(new(alertmanagertypes.StorablePlannedMaintenanceRule)).
Where("planned_maintenance_id = ?", storablePlannedMaintenance.ID.StringValue()).
Exec(ctx)
if err != nil {
return err
}
@@ -241,7 +242,6 @@ func (r *maintenance) UpdatePlannedMaintenance(ctx context.Context, maintenance
}
return nil
})
if err != nil {
return err

View File

@@ -235,66 +235,20 @@ func (r *provider) Match(ctx context.Context, orgID string, ruleID string, set m
return matchedChannels, nil
}
// convertLabelSetToEnv converts a flat label set with dotted keys into a nested map structure for expr env.
// when both a leaf and a deeper nested path exist (e.g. "foo" and "foo.bar"),
// the nested structure takes precedence. That means we will replace an existing leaf at any
// intermediate path with a map so we can materialize the deeper structure.
// TODO(srikanthccv): we need a better solution to handle this, remove the following
// when we update the expr to support dotted keys.
// convertLabelSetToEnv delegates to alertmanagertypes.ConvertLabelSetToEnv and
// logs when a key is a prefix of another (e.g. "foo" alongside "foo.bar").
func (r *provider) convertLabelSetToEnv(ctx context.Context, labelSet model.LabelSet) map[string]interface{} {
env := make(map[string]interface{})
logForReview := false
for lk, lv := range labelSet {
key := strings.TrimSpace(string(lk))
value := string(lv)
if strings.Contains(key, ".") {
parts := strings.Split(key, ".")
current := env
for i, raw := range parts {
part := strings.TrimSpace(raw)
last := i == len(parts)-1
if last {
if _, isMap := current[part].(map[string]interface{}); isMap {
logForReview = true
// deeper structure already exists; do not overwrite.
break
}
current[part] = value
break
}
// ensure a map so we can keep descending.
if nextMap, ok := current[part].(map[string]interface{}); ok {
current = nextMap
continue
}
// if absent or a leaf, replace it with a map.
newMap := make(map[string]interface{})
current[part] = newMap
current = newMap
outer:
for lk := range labelSet {
prefix := string(lk) + "."
for lk2 := range labelSet {
if strings.HasPrefix(string(lk2), prefix) {
r.settings.Logger().InfoContext(ctx, "found label set with conflicting prefix dotted keys", slog.Any("labels", labelSet))
break outer
}
continue
}
// if a map already sits here (due to nested keys), keep the map (nested wins).
if _, isMap := env[key].(map[string]interface{}); isMap {
logForReview = true
continue
}
env[key] = value
}
if logForReview {
r.settings.Logger().InfoContext(ctx, "found label set with conflicting prefix dotted keys", slog.Any("labels", labelSet))
}
return env
return alertmanagertypes.ConvertLabelSetToEnv(labelSet)
}
func (r *provider) evaluateExpr(ctx context.Context, expression string, labelSet model.LabelSet) (bool, error) {

View File

@@ -925,72 +925,3 @@ func TestProvider_CreateRoutes(t *testing.T) {
})
}
}
func TestConvertLabelSetToEnv(t *testing.T) {
tests := []struct {
name string
labelSet model.LabelSet
expected map[string]interface{}
}{
{
name: "simple keys",
labelSet: model.LabelSet{
"key1": "value1",
"key2": "value2",
},
expected: map[string]interface{}{
"key1": "value1",
"key2": "value2",
},
},
{
name: "nested keys",
labelSet: model.LabelSet{
"foo.bar": "value1",
"foo.baz": "value2",
},
expected: map[string]interface{}{
"foo": map[string]interface{}{
"bar": "value1",
"baz": "value2",
},
},
},
{
name: "conflict - nested structure wins",
labelSet: model.LabelSet{
"foo.bar.baz": "deep",
"foo.bar": "shallow",
},
expected: map[string]interface{}{
"foo": map[string]interface{}{
"bar": map[string]interface{}{
"baz": "deep",
},
},
},
},
{
name: "conflict - leaf value vs nested",
labelSet: model.LabelSet{
"foo.bar": "value",
"foo": "should_be_ignored",
},
expected: map[string]interface{}{
"foo": map[string]interface{}{
"bar": "value",
},
},
},
}
provider := &provider{
settings: factory.NewScopedProviderSettings(createTestProviderSettings(), "provider_test"),
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := provider.convertLabelSetToEnv(context.Background(), tt.labelSet)
assert.Equal(t, tt.expected, result)
})
}
}

View File

@@ -2,6 +2,8 @@ package envprovider
import (
"context"
"os"
"strings"
"testing"
"github.com/SigNoz/signoz/pkg/config"
@@ -9,7 +11,21 @@ import (
"github.com/stretchr/testify/require"
)
// clearSignozEnv unsets all existing SIGNOZ_* env vars for the duration of the test.
func clearSignozEnv(t *testing.T) {
t.Helper()
for _, kv := range os.Environ() {
if strings.HasPrefix(kv, prefix) {
key := strings.SplitN(kv, "=", 2)[0]
orig, _ := os.LookupEnv(key)
os.Unsetenv(key)
t.Cleanup(func() { os.Setenv(key, orig) })
}
}
}
func TestGetWithStrings(t *testing.T) {
clearSignozEnv(t)
t.Setenv("SIGNOZ_K1_K2", "string")
t.Setenv("SIGNOZ_K3__K4", "string")
t.Setenv("SIGNOZ_K5__K6_K7__K8", "string")
@@ -31,6 +47,7 @@ func TestGetWithStrings(t *testing.T) {
}
func TestGetWithNoPrefix(t *testing.T) {
clearSignozEnv(t)
t.Setenv("K1_K2", "string")
t.Setenv("K3_K4", "string")
expected := map[string]any{}
@@ -43,6 +60,7 @@ func TestGetWithNoPrefix(t *testing.T) {
}
func TestGetWithGoTypes(t *testing.T) {
clearSignozEnv(t)
t.Setenv("SIGNOZ_BOOL", "true")
t.Setenv("SIGNOZ_STRING", "string")
t.Setenv("SIGNOZ_INT", "1")

View File

@@ -204,6 +204,7 @@ func NewSQLMigrationProviderFactories(
sqlmigration.NewAddTagsFactory(sqlstore, sqlschema),
sqlmigration.NewAddRoleCRUDTuplesFactory(sqlstore),
sqlmigration.NewAddIntegrationDashboardFactory(sqlstore, sqlschema),
sqlmigration.NewAddScopeToPlannedMaintenanceFactory(sqlstore, sqlschema),
)
}

View File

@@ -0,0 +1,97 @@
package sqlmigration
import (
"context"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/sqlschema"
"github.com/SigNoz/signoz/pkg/sqlstore"
"github.com/uptrace/bun"
"github.com/uptrace/bun/migrate"
)
type addScopeToPlannedMaintenance struct {
sqlstore sqlstore.SQLStore
sqlschema sqlschema.SQLSchema
}
func NewAddScopeToPlannedMaintenanceFactory(sqlstore sqlstore.SQLStore, sqlschema sqlschema.SQLSchema) factory.ProviderFactory[SQLMigration, Config] {
return factory.NewProviderFactory(
factory.MustNewName("add_scope_to_planned"),
func(ctx context.Context, ps factory.ProviderSettings, c Config) (SQLMigration, error) {
return &addScopeToPlannedMaintenance{
sqlstore: sqlstore,
sqlschema: sqlschema,
}, nil
},
)
}
func (migration *addScopeToPlannedMaintenance) Register(migrations *migrate.Migrations) error {
if err := migrations.Register(migration.Up, migration.Down); err != nil {
return err
}
return nil
}
func (migration *addScopeToPlannedMaintenance) Up(ctx context.Context, db *bun.DB) error {
tx, err := db.BeginTx(ctx, nil)
if err != nil {
return err
}
defer func() {
_ = tx.Rollback()
}()
table, _, err := migration.sqlschema.GetTable(ctx, "planned_maintenance")
if err != nil {
return err
}
column := &sqlschema.Column{
Name: sqlschema.ColumnName("scope"),
DataType: sqlschema.DataTypeText,
Nullable: true,
}
sqls := migration.sqlschema.Operator().AddColumn(table, nil, column, nil)
for _, sql := range sqls {
if _, err := tx.ExecContext(ctx, string(sql)); err != nil {
return err
}
}
return tx.Commit()
}
func (migration *addScopeToPlannedMaintenance) Down(ctx context.Context, db *bun.DB) error {
tx, err := db.BeginTx(ctx, nil)
if err != nil {
return err
}
defer func() {
_ = tx.Rollback()
}()
table, _, err := migration.sqlschema.GetTable(ctx, "planned_maintenance")
if err != nil {
return err
}
column := &sqlschema.Column{
Name: sqlschema.ColumnName("scope"),
DataType: sqlschema.DataTypeText,
Nullable: true,
}
sqls := migration.sqlschema.Operator().DropColumn(table, column)
for _, sql := range sqls {
if _, err := tx.ExecContext(ctx, string(sql)); err != nil {
return err
}
}
return tx.Commit()
}

View File

@@ -3,12 +3,16 @@ package alertmanagertypes
import (
"context"
"encoding/json"
"strings"
"time"
"github.com/expr-lang/expr"
"github.com/prometheus/common/model"
"github.com/uptrace/bun"
"github.com/SigNoz/signoz/pkg/errors"
"github.com/SigNoz/signoz/pkg/types"
"github.com/SigNoz/signoz/pkg/valuer"
"github.com/uptrace/bun"
)
var ErrCodeInvalidPlannedMaintenancePayload = errors.MustNewCode("invalid_planned_maintenance_payload")
@@ -58,6 +62,7 @@ type StorablePlannedMaintenance struct {
Description string `bun:"description,type:text"`
Schedule *Schedule `bun:"schedule,type:text,notnull"`
OrgID string `bun:"org_id,type:text"`
Scope string `bun:"scope,type:text"`
}
type PlannedMaintenance struct {
@@ -66,6 +71,7 @@ type PlannedMaintenance struct {
Description string `json:"description"`
Schedule *Schedule `json:"schedule" required:"true"`
RuleIDs []string `json:"alertIds"`
Scope string `json:"scope,omitempty"`
CreatedAt time.Time `json:"createdAt"`
CreatedBy string `json:"createdBy"`
UpdatedAt time.Time `json:"updatedAt"`
@@ -82,6 +88,7 @@ type PostablePlannedMaintenance struct {
Description string `json:"description"`
Schedule *Schedule `json:"schedule" required:"true"`
AlertIds []string `json:"alertIds"`
Scope string `json:"scope"`
}
func (p *PostablePlannedMaintenance) Validate() error {
@@ -116,6 +123,11 @@ func (p *PostablePlannedMaintenance) Validate() error {
return errors.Newf(errors.TypeInvalidInput, ErrCodeInvalidPlannedMaintenancePayload, "end time cannot be before start time")
}
}
if p.Scope != "" {
if _, err := expr.Compile(p.Scope, expr.AllowUndefinedVariables(), expr.AsBool()); err != nil {
return errors.Newf(errors.TypeInvalidInput, ErrCodeInvalidPlannedMaintenancePayload, "invalid scope: %v", err)
}
}
return nil
}
@@ -151,7 +163,7 @@ func (m *PlannedMaintenance) HasScheduleRecurrenceBoundsMismatch() bool {
(recurrence.EndTime != nil && !recurrence.EndTime.Equal(m.Schedule.EndTime))
}
func (m *PlannedMaintenance) ShouldSkip(ruleID string, now time.Time) bool {
func (m *PlannedMaintenance) ShouldSkip(ruleID string, now time.Time, lset model.LabelSet) bool {
// Check if the alert ID is in the maintenance window
found := false
if len(m.RuleIDs) > 0 {
@@ -171,6 +183,23 @@ func (m *PlannedMaintenance) ShouldSkip(ruleID string, now time.Time) bool {
return false
}
if !m.isScheduleActive(now) {
return false
}
// lset is empty when called from IsActive (no instance labels available);
// skip expression filtering in that case.
if m.Scope != "" && len(lset) != 0 {
if !evalScopeExpression(m.Scope, lset) {
return false
}
}
return true
}
// isScheduleActive reports whether now falls inside the maintenance window's schedule.
func (m *PlannedMaintenance) isScheduleActive(now time.Time) bool {
// If alert is found, we check if it should be skipped based on the schedule
loc, err := time.LoadLocation(m.Schedule.Timezone)
if err != nil {
@@ -220,6 +249,59 @@ func (m *PlannedMaintenance) ShouldSkip(ruleID string, now time.Time) bool {
return false
}
// ConvertLabelSetToEnv converts a label set into a map suitable for use as an
// expr environment. Dotted keys (e.g. "kubernetes.node") are expanded into
// nested maps so that expr can resolve them without panicking. When a dotted
// path conflicts with a plain key, the nested structure takes precedence.
func ConvertLabelSetToEnv(lset model.LabelSet) map[string]interface{} {
env := make(map[string]interface{})
for lk, lv := range lset {
key := strings.TrimSpace(string(lk))
value := string(lv)
if strings.Contains(key, ".") {
parts := strings.Split(key, ".")
current := env
for i, raw := range parts {
part := strings.TrimSpace(raw)
if i == len(parts)-1 {
if _, isMap := current[part].(map[string]interface{}); !isMap {
current[part] = value
}
break
}
if nextMap, ok := current[part].(map[string]interface{}); ok {
current = nextMap
} else {
newMap := make(map[string]interface{})
current[part] = newMap
current = newMap
}
}
continue
}
if _, isMap := env[key].(map[string]interface{}); !isMap {
env[key] = value
}
}
return env
}
// evalScopeExpression compiles and runs the expression against the provided labels.
// Returns false on any error (safety-first: don't suppress on a bad expression).
func evalScopeExpression(expression string, lset model.LabelSet) bool {
env := ConvertLabelSetToEnv(lset)
program, err := expr.Compile(expression, expr.Env(env), expr.AllowUndefinedVariables())
if err != nil {
return false
}
output, err := expr.Run(program, env)
if err != nil {
return false
}
result, ok := output.(bool)
return ok && result
}
// checkDaily rebases the recurrence start to today (or yesterday if needed)
// and returns true if currentTime is within [candidate, candidate+Duration].
func (m *PlannedMaintenance) checkDaily(currentTime time.Time, rec *Recurrence, loc *time.Location) bool {
@@ -306,7 +388,7 @@ func (m *PlannedMaintenance) IsActive(now time.Time) bool {
if len(m.RuleIDs) > 0 {
ruleID = (m.RuleIDs)[0]
}
return m.ShouldSkip(ruleID, now)
return m.ShouldSkip(ruleID, now, nil)
}
func (m *PlannedMaintenance) IsUpcoming() bool {
@@ -389,6 +471,7 @@ func (m PlannedMaintenance) MarshalJSON() ([]byte, error) {
Description string `json:"description" db:"description"`
Schedule *Schedule `json:"schedule" db:"schedule"`
AlertIds []string `json:"alertIds" db:"alert_ids"`
Scope string `json:"scope,omitempty" db:"scope"`
CreatedAt time.Time `json:"createdAt" db:"created_at"`
CreatedBy string `json:"createdBy" db:"created_by"`
UpdatedAt time.Time `json:"updatedAt" db:"updated_at"`
@@ -401,6 +484,7 @@ func (m PlannedMaintenance) MarshalJSON() ([]byte, error) {
Description: m.Description,
Schedule: m.Schedule,
AlertIds: m.RuleIDs,
Scope: m.Scope,
CreatedAt: m.CreatedAt,
CreatedBy: m.CreatedBy,
UpdatedAt: m.UpdatedAt,
@@ -424,6 +508,7 @@ func (m *PlannedMaintenanceWithRules) ToPlannedMaintenance() *PlannedMaintenance
Description: m.Description,
Schedule: m.Schedule,
RuleIDs: ruleIDs,
Scope: m.Scope,
CreatedAt: m.CreatedAt,
UpdatedAt: m.UpdatedAt,
CreatedBy: m.CreatedBy,

View File

@@ -1,10 +1,12 @@
package alertmanagertypes
import (
"reflect"
"testing"
"time"
"github.com/SigNoz/signoz/pkg/valuer"
"github.com/prometheus/common/model"
)
// Helper function to create a time pointer.
@@ -668,9 +670,330 @@ func TestShouldSkipMaintenance(t *testing.T) {
}
for idx, c := range cases {
result := c.maintenance.ShouldSkip(c.name, c.ts)
result := c.maintenance.ShouldSkip(c.name, c.ts, model.LabelSet{})
if result != c.skip {
t.Errorf("skip %v, got %v, case:%d - %s", c.skip, result, idx, c.name)
}
}
}
func TestShouldSkip_Scope(t *testing.T) {
activeSchedule := func() *Schedule {
return &Schedule{
Timezone: "UTC",
StartTime: time.Now().UTC().Add(-time.Hour),
EndTime: time.Now().UTC().Add(time.Hour),
}
}
now := time.Now().UTC()
cases := []struct {
name string
maintenance *PlannedMaintenance
ruleID string
ts time.Time
lset model.LabelSet
skip bool
}{
{
name: "empty scope - no label filtering applied",
maintenance: &PlannedMaintenance{Schedule: activeSchedule()},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production"},
skip: true,
},
{
name: "scope matches labels",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production"},
skip: true,
},
{
name: "scope does not match labels",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "staging"},
skip: false,
},
{
name: "AND expression - both conditions match",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production" && service == "api"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production", "service": "api"},
skip: true,
},
{
name: "AND expression - one condition does not match",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production" && service == "api"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production", "service": "worker"},
skip: false,
},
{
name: "OR expression - first alternative matches",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production" || env == "staging"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production"},
skip: true,
},
{
name: "OR expression - second alternative matches",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production" || env == "staging"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "staging"},
skip: true,
},
{
name: "OR expression - neither alternative matches",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production" || env == "staging"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "development"},
skip: false,
},
{
name: "scope references label absent from lset",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env == "production"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"service": "api"},
skip: false,
},
{
name: "in expression - value is in list",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env in ["production", "staging"]`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "staging"},
skip: true,
},
{
name: "in expression - value not in list",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), Scope: `env in ["production", "staging"]`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "development"},
skip: false,
},
{
name: "ruleID in list and scope matches - should skip",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), RuleIDs: []string{"rule-1", "rule-2"}, Scope: `env == "production"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production"},
skip: true,
},
{
name: "ruleID not in list and scope matches - ruleID gate prevents skip",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), RuleIDs: []string{"rule-2"}, Scope: `env == "production"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "production"},
skip: false,
},
{
name: "ruleID in list but scope does not match - should not skip",
maintenance: &PlannedMaintenance{Schedule: activeSchedule(), RuleIDs: []string{"rule-1"}, Scope: `env == "production"`},
ruleID: "rule-1",
ts: now,
lset: model.LabelSet{"env": "staging"},
skip: false,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
got := c.maintenance.ShouldSkip(c.ruleID, c.ts, c.lset)
if got != c.skip {
t.Errorf("ShouldSkip() = %v, want %v", got, c.skip)
}
})
}
}
func TestEvalScopeExpression(t *testing.T) {
cases := []struct {
name string
expression string
lset model.LabelSet
want bool
}{
{
name: "equality match",
expression: `env == "production"`,
lset: model.LabelSet{"env": "production"},
want: true,
},
{
name: "equality no match",
expression: `env == "production"`,
lset: model.LabelSet{"env": "staging"},
want: false,
},
{
name: "inequality match",
expression: `env != "production"`,
lset: model.LabelSet{"env": "staging"},
want: true,
},
{
name: "AND - both match",
expression: `env == "production" && service == "api"`,
lset: model.LabelSet{"env": "production", "service": "api"},
want: true,
},
{
name: "AND - partial match",
expression: `env == "production" && service == "api"`,
lset: model.LabelSet{"env": "production", "service": "worker"},
want: false,
},
{
name: "OR - first matches",
expression: `env == "production" || env == "staging"`,
lset: model.LabelSet{"env": "production"},
want: true,
},
{
name: "OR - second matches",
expression: `env == "production" || env == "staging"`,
lset: model.LabelSet{"env": "staging"},
want: true,
},
{
name: "OR - none match",
expression: `env == "production" || env == "staging"`,
lset: model.LabelSet{"env": "development"},
want: false,
},
{
name: "undefined label returns false",
expression: `env == "production"`,
lset: model.LabelSet{"service": "api"},
want: false,
},
{
name: "in list - present",
expression: `env in ["production", "staging"]`,
lset: model.LabelSet{"env": "production"},
want: true,
},
{
name: "in list - absent",
expression: `env in ["production", "staging"]`,
lset: model.LabelSet{"env": "development"},
want: false,
},
{
name: "invalid expression returns false",
expression: `env ==`,
lset: model.LabelSet{"env": "production"},
want: false,
},
{
name: "non-bool expression returns false",
expression: `env`,
lset: model.LabelSet{"env": "production"},
want: false,
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
got := evalScopeExpression(c.expression, c.lset)
if got != c.want {
t.Errorf("evalScopeExpression(%q, %v) = %v, want %v", c.expression, c.lset, got, c.want)
}
})
}
}
func TestPostablePlannedMaintenance_ValidateScope(t *testing.T) {
validSchedule := &Schedule{
Timezone: "UTC",
StartTime: time.Now().UTC(),
EndTime: time.Now().UTC().Add(time.Hour),
}
cases := []struct {
name string
scope string
wantErr bool
}{
{name: "empty scope", scope: "", wantErr: false},
{name: "simple equality", scope: `env == "production"`, wantErr: false},
{name: "AND expression", scope: `env == "production" && service == "api"`, wantErr: false},
{name: "OR expression", scope: `env == "production" || env == "staging"`, wantErr: false},
{name: "in expression", scope: `env in ["production", "staging"]`, wantErr: false},
{name: "incomplete expression", scope: `env ==`, wantErr: true},
{name: "non-bool expression", scope: `"just a string"`, wantErr: true},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
p := &PostablePlannedMaintenance{
Name: "test",
Schedule: validSchedule,
Scope: c.scope,
}
err := p.Validate()
if (err != nil) != c.wantErr {
t.Errorf("Validate() error = %v, wantErr %v", err, c.wantErr)
}
})
}
}
func TestConvertLabelSetToEnv(t *testing.T) {
cases := []struct {
name string
lset model.LabelSet
expected map[string]interface{}
}{
{
name: "simple keys",
lset: model.LabelSet{"key1": "value1", "key2": "value2"},
expected: map[string]interface{}{"key1": "value1", "key2": "value2"},
},
{
name: "dotted keys become nested maps",
lset: model.LabelSet{"foo.bar": "value1", "foo.baz": "value2"},
expected: map[string]interface{}{
"foo": map[string]interface{}{"bar": "value1", "baz": "value2"},
},
},
{
name: "deeper dotted key wins over shallow dotted key",
lset: model.LabelSet{"foo.bar.baz": "deep", "foo.bar": "shallow"},
expected: map[string]interface{}{
"foo": map[string]interface{}{
"bar": map[string]interface{}{"baz": "deep"},
},
},
},
{
name: "nested structure wins over plain key",
lset: model.LabelSet{"foo.bar": "value", "foo": "ignored"},
expected: map[string]interface{}{
"foo": map[string]interface{}{"bar": "value"},
},
},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
got := ConvertLabelSetToEnv(c.lset)
if !reflect.DeepEqual(got, c.expected) {
t.Errorf("ConvertLabelSetToEnv() = %v, want %v", got, c.expected)
}
})
}
}

View File

@@ -108,6 +108,7 @@ func (s *Schedule) UnmarshalJSON(data []byte) error {
if err != nil {
return err
}
// TODO(jatinderjit): if endTime.IsZero() then we should not set the endTime
s.EndTime = time.Date(endTime.Year(), endTime.Month(), endTime.Day(), endTime.Hour(), endTime.Minute(), endTime.Second(), endTime.Nanosecond(), loc)
}