Compare commits

..

5 Commits

Author SHA1 Message Date
srikanthccv
f05c574606 chore: add initial version of query range design principles doc 2026-02-25 11:23:20 +05:30
Nageshbansal
4e4c9ce5af chore: enable metadataexporter in docker (#10409)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
2026-02-25 03:13:27 +05:30
Srikanth Chekuri
7605775a38 chore: remove support for non v5 version in rules (#10406) 2026-02-24 23:16:21 +05:30
Vinicius Lourenço
cb1a2a8a13 perf(bundle-size): lazy load pages to reduce main bundle size (#10230)
Some checks failed
build-staging / prepare (push) Has been cancelled
build-staging / js-build (push) Has been cancelled
build-staging / go-build (push) Has been cancelled
build-staging / staging (push) Has been cancelled
Release Drafter / update_release_draft (push) Has been cancelled
2026-02-24 10:41:40 +00:00
Nikhil Soni
1a5d37b25a fix: add missing filtering for ip address for scalar data (#10264)
* fix: add missing filtering for ip address for scalar data

In domain listing api for external api monitoring,
we have option to filter out the IP address but
it only handles timeseries and raw type data while
domain list handler returns scalar data.

* fix: switch to new derived attributes for ip filtering

---------

Co-authored-by: Nityananda Gohain <nityanandagohain@gmail.com>
2026-02-24 10:26:10 +00:00
28 changed files with 528 additions and 6419 deletions

View File

@@ -82,6 +82,12 @@ exporters:
timeout: 45s
sending_queue:
enabled: false
metadataexporter:
cache:
provider: in_memory
dsn: tcp://clickhouse:9000/signoz_metadata
enabled: true
timeout: 45s
service:
telemetry:
logs:
@@ -93,19 +99,19 @@ service:
traces:
receivers: [otlp]
processors: [signozspanmetrics/delta, batch]
exporters: [clickhousetraces, signozmeter]
exporters: [clickhousetraces, metadataexporter, signozmeter]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [signozclickhousemetrics, signozmeter]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
metrics/prometheus:
receivers: [prometheus]
processors: [batch]
exporters: [signozclickhousemetrics, signozmeter]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter, signozmeter]
exporters: [clickhouselogsexporter, metadataexporter, signozmeter]
metrics/meter:
receivers: [signozmeter]
processors: [batch/meter]

View File

@@ -82,6 +82,12 @@ exporters:
timeout: 45s
sending_queue:
enabled: false
metadataexporter:
cache:
provider: in_memory
dsn: tcp://clickhouse:9000/signoz_metadata
enabled: true
timeout: 45s
service:
telemetry:
logs:
@@ -93,19 +99,19 @@ service:
traces:
receivers: [otlp]
processors: [signozspanmetrics/delta, batch]
exporters: [clickhousetraces, signozmeter]
exporters: [clickhousetraces, metadataexporter, signozmeter]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [signozclickhousemetrics, signozmeter]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
metrics/prometheus:
receivers: [prometheus]
processors: [batch]
exporters: [signozclickhousemetrics, signozmeter]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter, signozmeter]
exporters: [clickhouselogsexporter, metadataexporter, signozmeter]
metrics/meter:
receivers: [signozmeter]
processors: [batch/meter]

View File

@@ -0,0 +1,216 @@
# Query Range v5 — Design Principles & Architectural Contracts
## Purpose of This Document
This document defines the design principles, invariants, and architectural contracts of the Query Range v5 system. It is intended for the authors working on the querier and querier related parts codebase. Any change to the system must align with the principles described here. If a change would violate a principle, it must be flagged and discussed.
---
## Core Architectural Principle
**The user speaks OpenTelemetry. The storage speaks ClickHouse. The system translates between them. These two worlds must never leak into each other.**
Every design choice in Query Range flows from this separation. The user-facing API surface deals exclusively in `TelemetryFieldKey`: a representation of fields as they exist in the OpenTelemetry data model. The storage layer deals in ClickHouse column expressions, table names, and SQL fragments. The translation between them is mediated by a small set of composable abstractions with strict boundaries.
---
## The Central Type: `TelemetryFieldKey`
`TelemetryFieldKey` is the atomic unit of the entire query system. Every filter, aggregation, group-by, order-by, and select operation is expressed in terms of field keys. Understanding its design contracts is non-negotiable.
### Identity
A field key is identified by three dimensions:
- **Name** — the field name as the user knows it (`service.name`, `http.method`, `trace_id`)
- **FieldContext** — where the field lives in the OTel model (`resource`, `attribute`, `span`, `log`, `body`, `scope`, `event`, `metric`)
- **FieldDataType** — the data type (`string`, `bool`, `number`/`float64`/`int64`, array variants)
### Invariant: Same name does not mean same field
`status` as an attribute, `status` as a body JSON key, and `status` as a span-level field are **three different fields**. The context disambiguates. Code that resolves or compares field keys must always consider all three dimensions, never just the name.
### Invariant: Normalization happens once, at the boundary
`TelemetryFieldKey.Normalize()` is called during JSON unmarshaling. After normalization, the text representation `resource.service.name:string` and the programmatic construction `{Name: "service.name", FieldContext: Resource, FieldDataType: String}` are identical.
**Consequence:** Downstream code must never re-parse or re-normalize field keys. If you find yourself splitting on `.` or `:` deep in the pipeline, something is wrong — the normalization should have already happened.
### Invariant: The text format is `context.name:datatype`
Parsing rules (implemented in `GetFieldKeyFromKeyText` and `Normalize`):
1. Data type is extracted from the right, after the last `:`.
2. Field context is extracted from the left, before the first `.`, if it matches a known context prefix.
3. Everything remaining is the name.
Special case: `log.body.X` normalizes to `{FieldContext: body, Name: X}` — the `log.body.` prefix collapses because body fields under log are a nested context.
### Invariant: Historical aliases must be preserved
The `fieldContexts` map includes aliases (`tag` -> `attribute`, `spanfield` -> `span`, `logfield` -> `log`). These exist because older database entries use these names. Removing or changing these aliases will break existing saved queries and dashboard configurations.
---
## The Abstraction Stack
The query pipeline is built from four interfaces that compose vertically. Each layer has a single responsibility. Each layer depends only on the layers below it. This layering is intentional and must be preserved.
```
StatementBuilder <- Orchestrates everything into executable SQL
├── AggExprRewriter <- Rewrites aggregation expressions (maps field refs to columns)
├── ConditionBuilder <- Builds WHERE predicates (field + operator + value -> SQL)
└── FieldMapper <- Maps TelemetryFieldKey -> ClickHouse column expression
```
### FieldMapper
**Contract:** Given a `TelemetryFieldKey`, return a ClickHouse column expression that yields the value for that field when used in a SELECT.
**Principle:** This is the *only* place where field-to-column translation happens. No other layer should contain knowledge of how fields map to storage. If you need a column expression, go through the FieldMapper.
**Why:** The user says `http.request.method`. ClickHouse might store it as `attributes_string['http.request.method']`, or as a materialized column `` `attribute_string_http$$request$$method` ``, or via a JSON access path in a body column. This variation is entirely contained within the FieldMapper. Everything above it is storage-agnostic.
### ConditionBuilder
**Contract:** Given a field key, an operator, and a value, produce a valid SQL predicate for a WHERE clause.
**Dependency:** Uses FieldMapper for the left-hand side of the condition.
**Principle:** The ConditionBuilder owns all the complexity of operator semantics, i.e type casting, array operators (`hasAny`/`hasAll` vs `=`), existence checks, and negative operator behavior. This complexity must not leak upward into the StatementBuilder.
### AggExprRewriter
**Contract:** Given a user-facing aggregation expression like `sum(duration_nano)`, resolve field references within it and produce valid ClickHouse SQL.
**Dependency:** Uses FieldMapper to resolve field names within expressions.
**Principle:** Aggregation expressions are user-authored strings that contain field references. The rewriter parses them, identifies field references, resolves each through the FieldMapper, and reassembles the expression.
### StatementBuilder
**Contract:** Given a complete `QueryBuilderQuery`, a time range, and a request type, produces an executable SQL statement.
**Dependency:** Uses all three abstractions above.
**Principle:** This is the composition layer. It does not contain field mapping logic, condition building logic, or expression rewriting logic. It orchestrates the other abstractions. If you find storage-specific logic creeping into the StatementBuilder, push it down into the appropriate abstraction.
### Invariant: No layer skipping
The StatementBuilder must not call FieldMapper directly to build conditions, it goes through the ConditionBuilder. The AggExprRewriter must not hardcode column names, it goes through the FieldMapper. Skipping layers creates hidden coupling and makes the system fragile to storage changes.
---
## Design Decisions as Constraints
### Constraint: Formula evaluation happens in Go, not in ClickHouse
Formulas (`A + B`, `A / B`, `sqrt(A*A + B*B)`) are evaluated application-side by `FormulaEvaluator`, not via ClickHouse JOINs.
**Why this is a constraint, not just an implementation choice:** The original JOIN-based approach was abandoned because ClickHouse evaluates joins right-to-left, serializing execution unnecessarily. Running queries independently allows parallelism and caching of intermediate results. Any future optimization must not reintroduce the JOIN pattern without solving the serialization problem.
**Consequence:** Individual query results must be independently cacheable. Formula evaluation must handle label matching, timestamp alignment, and missing values without requiring the queries to coordinate at the SQL level.
### Constraint: Zero-defaulting is aggregation-dependent
Only additive/counting aggregations (`count`, `count_distinct`, `sum`, `rate`) default missing values to zero. Statistical aggregations (`avg`, `min`, `max`, percentiles) must show gaps.
**Why:** Absence of data has different meanings. No error requests in a time bucket means error count = 0. No requests at all means average latency is *unknown*, not 0. Conflating these is a correctness bug, not a display preference.
**Enforcement:** `GetQueriesSupportingZeroDefault` determines which queries can default to zero. The `FormulaEvaluator` consumes this via `canDefaultZero`. Changes to aggregation handling must preserve this distinction.
### Constraint: Existence semantics differ for positive vs negative operators
- **Positive operators** (`=`, `>`, `LIKE`, `IN`, etc.) implicitly assert field existence. `http.method = GET` means "the field exists AND equals GET".
- **Negative operators** (`!=`, `NOT IN`, `NOT LIKE`, etc.) do **not** add an existence check. `http.method != GET` includes records where the field doesn't exist at all.
**Why:** The user's intent with negative operators is ambiguous. Rather than guess, we take the broader interpretation. Users can add an explicit `EXISTS` filter if they want the narrower one. This is documented in `AddDefaultExistsFilter`.
**Consequence:** Any new operator must declare its existence behavior in `AddDefaultExistsFilter`. Do not add operators without considering this.
### Constraint: Post-processing functions operate on result sets, not in SQL
Functions like `cutOffMin`, `ewma`, `median`, `timeShift`, `fillZero`, `runningDiff`, and `cumulativeSum` are applied in Go on the returned time series, not pushed into ClickHouse SQL.
**Why:** These are sequential time-series transformations that require complete, ordered result sets. Pushing them into SQL would complicate query generation, prevent caching of raw results, and make the functions harder to test. They are applied via `ApplyFunctions` after query execution.
**Consequence:** New time-series transformation functions should follow this pattern i.e implement them as Go functions on `*TimeSeries`, not as SQL modifications.
### Constraint: The API surface rejects unknown fields with suggestions
All request types use custom `UnmarshalJSON` that calls `DisallowUnknownFields`. Unknown fields trigger error messages with Levenshtein-based suggestions ("did you mean: 'groupBy'?").
**Why:** Silent acceptance of unknown fields causes subtle bugs. A misspelled `groupBy` results in ungrouped data with no indication of what went wrong. Failing fast with suggestions turns errors into actionable feedback.
**Consequence:** Any new request type or query spec struct must implement custom unmarshaling with `UnmarshalJSONWithContext`. Do not use default `json.Unmarshal` for user-facing types.
### Constraint: Validation is context-sensitive to request type
What's valid depends on the `RequestType`. For aggregation requests (`time_series`, `scalar`, `distribution`), fields like `groupBy`, `aggregations`, `having`, and aggregation-referenced `orderBy` are validated. For non-aggregation requests (`raw`, `raw_stream`, `trace`), these fields are ignored.
**Why:** A raw log query doesn't have aggregations, so requiring `aggregations` would be wrong. But a time-series query without aggregations is meaningless. The validation rules are request-type-aware to avoid both false positives and false negatives.
**Consequence:** When adding new fields to query specs, consider which request types they apply to and gate validation accordingly.
---
## The Composite Query Model
### Structure
A `QueryRangeRequest` contains a `CompositeQuery` which holds `[]QueryEnvelope`. Each envelope is a discriminated union: a `Type` field determines how `Spec` is decoded.
### Invariant: Query names are unique within a composite query
Builder queries must have unique names. Formulas reference queries by name (`A`, `B`, `A.0`, `A.my_alias`). Duplicate names would make formula evaluation ambiguous.
### Invariant: Multi-aggregation uses indexed or aliased references
A single builder query can have multiple aggregations. They are accessed in formulas via:
- Index: `A.0`, `A.1` (zero-based)
- Alias: `A.total`, `A.error_count`
The default (just `A`) resolves to index 0. This is the formula evaluation contract and must be preserved.
### Invariant: Type-specific decoding through signal detection
Builder queries are decoded by first peeking at the `signal` field in the raw JSON, then unmarshaling into the appropriate generic type (`QueryBuilderQuery[TraceAggregation]`, `QueryBuilderQuery[LogAggregation]`, `QueryBuilderQuery[MetricAggregation]`). This two-pass decoding is intentional — it allows each signal to have its own aggregation schema while sharing the query structure.
---
## The Metadata Layer
### MetadataStore
The `MetadataStore` interface provides runtime field discovery and type resolution. It answers questions like "what fields exist for this signal?" and "what are the data types of field X?".
### Principle: Fields can be ambiguous until resolved
The same name can map to multiple `TelemetryFieldKey` variants (different contexts, different types). The metadata store returns *all* variants. Resolution to a single field happens during query building, using the query's signal and any explicit context/type hints from the user.
**Consequence:** Code that calls `GetKey` or `GetKeys` must handle multiple results. Do not assume a name maps to a single field.
### Principle: Materialized fields are a performance optimization, not a semantic distinction
A materialized field and its non-materialized equivalent represent the same logical field. The `Materialized` flag tells the FieldMapper to generate a simpler column expression. The user should never need to know whether a field is materialized.
### Principle: JSON body fields require access plans
Fields inside JSON body columns (`body.response.errors[].code`) need pre-computed `JSONAccessPlan` trees that encode the traversal path, including branching at array boundaries between `Array(JSON)` and `Array(Dynamic)` representations. These plans are computed during metadata resolution, not during query execution.
---
## Summary of Inviolable Rules
1. **User-facing types never contain ClickHouse column names or SQL fragments.**
2. **Field-to-column translation only happens in FieldMapper.**
3. **Normalization happens once at the API boundary, never deeper.**
4. **Historical aliases in fieldContexts and fieldDataTypes must not be removed.**
5. **Formula evaluation stays in Go — do not push it into ClickHouse JOINs.**
6. **Zero-defaulting is aggregation-type-dependent — do not universally default to zero.**
7. **Positive operators imply existence, negative operators do not.**
8. **Post-processing functions operate on Go result sets, not in SQL.**
9. **All user-facing types reject unknown JSON fields with suggestions.**
10. **Validation rules are gated by request type.**
11. **Query names must be unique within a composite query.**
12. **The four-layer abstraction stack (FieldMapper -> ConditionBuilder -> AggExprRewriter -> StatementBuilder) must not be bypassed or flattened.**

View File

@@ -308,3 +308,15 @@ export const PublicDashboardPage = Loadable(
/* webpackChunkName: "Public Dashboard Page" */ 'pages/PublicDashboard'
),
);
export const AlertTypeSelectionPage = Loadable(
() =>
import(
/* webpackChunkName: "Alert Type Selection Page" */ 'pages/AlertTypeSelection'
),
);
export const MeterExplorerPage = Loadable(
() =>
import(/* webpackChunkName: "Meter Explorer Page" */ 'pages/MeterExplorer'),
);

View File

@@ -1,12 +1,10 @@
import { RouteProps } from 'react-router-dom';
import ROUTES from 'constants/routes';
import AlertTypeSelectionPage from 'pages/AlertTypeSelection';
import MessagingQueues from 'pages/MessagingQueues';
import MeterExplorer from 'pages/MeterExplorer';
import {
AlertHistory,
AlertOverview,
AlertTypeSelectionPage,
AllAlertChannels,
AllErrors,
ApiMonitoring,
@@ -29,6 +27,8 @@ import {
LogsExplorer,
LogsIndexToFields,
LogsSaveViews,
MessagingQueuesMainPage,
MeterExplorerPage,
MetricsExplorer,
OldLogsExplorer,
Onboarding,
@@ -399,28 +399,28 @@ const routes: AppRoutes[] = [
{
path: ROUTES.MESSAGING_QUEUES_KAFKA,
exact: true,
component: MessagingQueues,
component: MessagingQueuesMainPage,
key: 'MESSAGING_QUEUES_KAFKA',
isPrivate: true,
},
{
path: ROUTES.MESSAGING_QUEUES_CELERY_TASK,
exact: true,
component: MessagingQueues,
component: MessagingQueuesMainPage,
key: 'MESSAGING_QUEUES_CELERY_TASK',
isPrivate: true,
},
{
path: ROUTES.MESSAGING_QUEUES_OVERVIEW,
exact: true,
component: MessagingQueues,
component: MessagingQueuesMainPage,
key: 'MESSAGING_QUEUES_OVERVIEW',
isPrivate: true,
},
{
path: ROUTES.MESSAGING_QUEUES_KAFKA_DETAIL,
exact: true,
component: MessagingQueues,
component: MessagingQueuesMainPage,
key: 'MESSAGING_QUEUES_KAFKA_DETAIL',
isPrivate: true,
},
@@ -463,21 +463,21 @@ const routes: AppRoutes[] = [
{
path: ROUTES.METER,
exact: true,
component: MeterExplorer,
component: MeterExplorerPage,
key: 'METER',
isPrivate: true,
},
{
path: ROUTES.METER_EXPLORER,
exact: true,
component: MeterExplorer,
component: MeterExplorerPage,
key: 'METER_EXPLORER',
isPrivate: true,
},
{
path: ROUTES.METER_EXPLORER_VIEWS,
exact: true,
component: MeterExplorer,
component: MeterExplorerPage,
key: 'METER_EXPLORER_VIEWS',
isPrivate: true,
},

View File

@@ -14,16 +14,10 @@ interface ITimelineV2Props {
startTimestamp: number;
endTimestamp: number;
timelineHeight: number;
offsetTimestamp: number;
}
function TimelineV2(props: ITimelineV2Props): JSX.Element {
const {
startTimestamp,
endTimestamp,
timelineHeight,
offsetTimestamp,
} = props;
const { startTimestamp, endTimestamp, timelineHeight } = props;
const [intervals, setIntervals] = useState<Interval[]>([]);
const [ref, { width }] = useMeasure<HTMLDivElement>();
const isDarkMode = useIsDarkMode();
@@ -36,10 +30,8 @@ function TimelineV2(props: ITimelineV2Props): JSX.Element {
const minIntervals = getMinimumIntervalsBasedOnWidth(width);
const intervalisedSpread = (spread / minIntervals) * 1.0;
const intervals = getIntervals(intervalisedSpread, spread, offsetTimestamp);
setIntervals(intervals);
}, [startTimestamp, endTimestamp, width, offsetTimestamp]);
setIntervals(getIntervals(intervalisedSpread, spread));
}, [startTimestamp, endTimestamp, width]);
if (endTimestamp < startTimestamp) {
console.error(

View File

@@ -64,71 +64,6 @@ export const resolveTimeFromInterval = (
export function getIntervals(
intervalSpread: number,
baseSpread: number,
offsetTimestamp: number, // ms offset from trace start (e.g. viewStart - traceStart)
): Interval[] {
const integerPartString = intervalSpread.toString().split('.')[0];
const integerPartLength = integerPartString.length;
const intervalSpreadNormalized =
intervalSpread < 1.0
? intervalSpread
: Math.floor(Number(integerPartString) / 10 ** (integerPartLength - 1)) *
10 ** (integerPartLength - 1);
let intervalUnit = INTERVAL_UNITS[0];
for (let idx = INTERVAL_UNITS.length - 1; idx >= 0; idx -= 1) {
const standardInterval = INTERVAL_UNITS[idx];
if (intervalSpread * standardInterval.multiplier >= 1) {
intervalUnit = INTERVAL_UNITS[idx];
break;
}
}
const intervals: Interval[] = [
{
// ✅ start label should reflect window start offset (relative to trace start)
label: `${toFixed(
resolveTimeFromInterval(offsetTimestamp, intervalUnit),
2,
)}${intervalUnit.name}`,
percentage: 0,
},
];
let tempBaseSpread = baseSpread;
let elapsedIntervals = 0;
while (tempBaseSpread && intervals.length < 20) {
let intervalTime: number;
if (tempBaseSpread <= 1.5 * intervalSpreadNormalized) {
intervalTime = elapsedIntervals + tempBaseSpread;
tempBaseSpread = 0;
} else {
intervalTime = elapsedIntervals + intervalSpreadNormalized;
tempBaseSpread -= intervalSpreadNormalized;
}
elapsedIntervals = intervalTime;
// ✅ label time = window offset + elapsed time inside window
const labelTime = offsetTimestamp + intervalTime;
intervals.push({
label: `${toFixed(resolveTimeFromInterval(labelTime, intervalUnit), 2)}${
intervalUnit.name
}`,
percentage: (intervalTime / baseSpread) * 100,
});
}
return intervals;
}
export function getIntervalsOld(
intervalSpread: number,
baseSpread: number,
offsetTimestamp: number,
): Interval[] {
const integerPartString = intervalSpread.toString().split('.')[0];
const integerPartLength = integerPartString.length;
@@ -171,10 +106,9 @@ export function getIntervalsOld(
}
elapsedIntervals = intervalTime;
const interval: Interval = {
label: `${toFixed(
resolveTimeFromInterval(intervalTime + offsetTimestamp, intervalUnit),
2,
)}${intervalUnit.name}`,
label: `${toFixed(resolveTimeFromInterval(intervalTime, intervalUnit), 2)}${
intervalUnit.name
}`,
percentage: (intervalTime / baseSpread) * 100,
};
intervals.push(interval);

View File

@@ -1,88 +0,0 @@
# Flamegraph Canvas POC Notes
## Overview
This document tracks the proof-of-concept (POC) implementation of a canvas-based flamegraph rendering system, replacing the previous DOM-based approach.
## Implementation Status
### ✅ Completed Features
1. **Canvas-based Rendering**
- Replaced DOM elements (`react-virtuoso` and `div.span-item`) with a single HTML5 Canvas
- Implemented `drawFlamegraph` function to render all spans as rectangles on canvas
- Added device pixel ratio (DPR) support for crisp rendering
2. **Time-window-based Zoom**
- Replaced pixel-based zoom/pan with time-window-based approach (`viewStartTs`, `viewEndTs`)
- Prevents pixelation by redrawing from data with new time bounds
- Zoom anchors to cursor position
- Horizontal zoom works correctly (min: 1/100th of trace, max: full trace)
3. **Drag to Pan**
- Implemented drag-to-pan functionality for navigating the canvas
- Differentiates between click (span selection) and drag (panning) based on distance moved
- Prevents unwanted window zoom
4. **Minimap with 2D Navigation**
- Canvas-based minimap showing density histogram (time × levels)
- 2D brush overlay for both horizontal (time) and vertical (levels) navigation
- Draggable brush to pan both dimensions
- Bidirectional synchronization between main canvas and minimap
5. **Timeline Synchronization**
- `TimelineV2` component synchronized with visible time window
- Updates correctly during zoom and pan operations
6. **Hit Testing**
- Implemented span rectangle tracking for click detection
- Tooltip on hover
- Span selection via click
### ❌ Known Issues / Not Working
1. **Vertical Zoom - NOT WORKING**
- **Status**: Attempted implementation but not functioning correctly
- **Issue**: When horizontal zoom reaches maximum (full trace width), vertical zoom cannot continue to zoom out further
- **Attempted Solution**: Added `rowHeightScale` state to control vertical row spacing, but the implementation does not work as expected
- **Impact**: Users cannot fully zoom out vertically to see all levels when horizontal zoom is at maximum
- **Next Steps**: Needs further investigation and alternative approach
2. **Timeline Scale Alignment - NOT WORKING PROPERLY**
- **Status**: Issue identified but not fully resolved
- **Issue**: The timeline scale does not align properly when dragging/panning the canvas. The timeline aligns correctly during zoom operations, but not during drag/pan operations.
- **Impact**: Timeline may show incorrect time values while dragging the canvas
- **Attempted Solution**: Used refs (`viewStartTsRef`, `viewEndTsRef`) to track current time window and incremental delta calculation, but issue persists
- **Next Steps**: Needs further investigation to ensure timeline stays synchronized during all interaction types
### 🔄 Pending / Future Work
1. **Performance Optimization**
- Consider adding an interaction layer (separate canvas on top) for better performance
- Optimize rendering for large traces
2. **Code Quality**
- Reduce cognitive complexity of `drawFlamegraph` function (currently 26, target: 15)
- Reduce cognitive complexity of `drawMinimap` function (currently 30, target: 15)
3. **Additional Features**
- Keyboard shortcuts for navigation
- Better zoom controls
- Export functionality
## Technical Details
### Key Files Modified
- `frontend/src/container/PaginatedTraceFlamegraph/TraceFlamegraphStates/Success/Success.tsx` - Main rendering component
- `frontend/src/container/PaginatedTraceFlamegraph/TraceFlamegraphStates/Success/Success.styles.scss` - Styles for canvas and minimap
- `frontend/src/components/TimelineV2/TimelineV2.tsx` - Timeline synchronization
### Key Concepts
- **Time-window-based zoom**: Instead of scaling canvas bitmap, redraw from data with new time bounds
- **Device Pixel Ratio**: Render at DPR resolution for crisp display on high-DPI screens
- **2D Minimap**: Shows density heatmap across both time (horizontal) and levels (vertical) dimensions
- **Brush Navigation**: Draggable rectangle overlay for panning both dimensions
## Notes
- This is a POC implementation - code quality and optimization can be improved after validation
- Some linting warnings (cognitive complexity) are acceptable for POC phase
- All changes should be validated before production use

View File

@@ -1,5 +1,3 @@
// @ts-nocheck
/* eslint-disable */
import { useEffect, useMemo, useState } from 'react';
import { useParams } from 'react-router-dom';
import { Progress, Skeleton, Tooltip, Typography } from 'antd';
@@ -15,13 +13,8 @@ import { Span } from 'types/api/trace/getTraceV2';
import { TraceFlamegraphStates } from './constants';
import Error from './TraceFlamegraphStates/Error/Error';
import NoData from './TraceFlamegraphStates/NoData/NoData';
// import Success from './TraceFlamegraphStates/Success/SuccessV2';
// import Success from './TraceFlamegraphStates/Success/SuccessV3_without_minimap_best';
import Success from './TraceFlamegraphStates/Success/Success_zoom';
import Success from './TraceFlamegraphStates/Success/Success';
// import Success from './TraceFlamegraphStates/Success/Success_zoom_api';
// import Success from './TraceFlamegraphStates/Success/SuccessCursor';
// import Success from './TraceFlamegraphStates/Success/Success';
import './PaginatedTraceFlamegraph.styles.scss';
interface ITraceFlamegraphProps {
@@ -45,6 +38,7 @@ function TraceFlamegraph(props: ITraceFlamegraphProps): JSX.Element {
const [firstSpanAtFetchLevel, setFirstSpanAtFetchLevel] = useState<string>(
urlQuery.get('spanId') || '',
);
useEffect(() => {
setFirstSpanAtFetchLevel(urlQuery.get('spanId') || '');
}, [urlQuery]);
@@ -52,9 +46,6 @@ function TraceFlamegraph(props: ITraceFlamegraphProps): JSX.Element {
const { data, isFetching, error } = useGetTraceFlamegraph({
traceId,
selectedSpanId: firstSpanAtFetchLevel,
limit: 100001,
// boundaryStartTsMilli: 0,
// boundarEndTsMilli: 10000,
});
// get the current state of trace flamegraph based on the API lifecycle

View File

@@ -3,11 +3,6 @@
overflow-x: hidden;
overflow-y: auto;
&.trace-flamegraph-canvas {
overflow: hidden;
position: relative;
}
.trace-flamegraph-virtuoso {
overflow-x: hidden;

View File

@@ -1,6 +1,3 @@
// @ts-nocheck
/* eslint-disable */
/* eslint-disable jsx-a11y/no-static-element-interactions */
/* eslint-disable jsx-a11y/click-events-have-key-events */
import {
@@ -26,12 +23,6 @@ import { toFixed } from 'utils/toFixed';
import './Success.styles.scss';
// Constants for rendering
const ROW_HEIGHT = 24; // 18px height + 6px padding
const SPAN_BAR_HEIGHT = 12;
const EVENT_DOT_SIZE = 6;
const SPAN_BAR_Y_OFFSET = 3; //
interface ITraceMetadata {
startTime: number;
endTime: number;
@@ -44,14 +35,6 @@ interface ISuccessProps {
traceMetadata: ITraceMetadata;
selectedSpan: Span | undefined;
}
interface SpanRect {
span: FlamegraphSpan;
x: number;
y: number;
width: number;
height: number;
level: number;
}
function Success(props: ISuccessProps): JSX.Element {
const {
@@ -65,28 +48,6 @@ function Success(props: ISuccessProps): JSX.Element {
const history = useHistory();
const isDarkMode = useIsDarkMode();
const virtuosoRef = useRef<VirtuosoHandle>(null);
const baseCanvasRef = useRef<HTMLCanvasElement>(null);
const interactionCanvasRef = useRef<HTMLCanvasElement>(null);
const minimapRef = useRef<HTMLCanvasElement>(null);
const containerRef = useRef<HTMLDivElement>(null);
console.log('spans.length', spans.length);
// Calculate total canvas height. this is coming less
const totalHeight = spans.length * ROW_HEIGHT;
// Build a flat array of span rectangles for hit testing.
// consider per level buckets to improve hit testing
const spanRects = useRef<SpanRect[]>([]);
// Time window state (instead of zoom/pan in pixel space)
const [viewStartTs, setViewStartTs] = useState<number>(
traceMetadata.startTime,
);
const [viewEndTs, setViewEndTs] = useState<number>(traceMetadata.endTime);
const [scrollTop, setScrollTop] = useState<number>(0);
const [hoveredSpanId, setHoveredSpanId] = useState<string>('');
const renderSpanLevel = useCallback(
(_: number, spans: FlamegraphSpan[]): JSX.Element => (
@@ -196,305 +157,16 @@ function Success(props: ISuccessProps): JSX.Element {
});
}, [firstSpanAtFetchLevel, spans]);
// Draw a single event dot
const drawEventDot = useCallback(
(
ctx: CanvasRenderingContext2D,
x: number,
y: number,
isError: boolean,
): void => {
// could be optimized:
// ctx.beginPath();
// ctx.moveTo(x, y - size/2);
// ctx.lineTo(x + size/2, y);
// ctx.lineTo(x, y + size/2);
// ctx.lineTo(x - size/2, y);
// ctx.closePath();
// ctx.fill();
// ctx.stroke();
ctx.save();
ctx.translate(x, y);
ctx.rotate(Math.PI / 4); // 45 degrees
if (isError) {
ctx.fillStyle = isDarkMode ? 'rgb(239, 68, 68)' : 'rgb(220, 38, 38)';
ctx.strokeStyle = isDarkMode ? 'rgb(185, 28, 28)' : 'rgb(153, 27, 27)';
} else {
ctx.fillStyle = isDarkMode ? 'rgb(14, 165, 233)' : 'rgb(6, 182, 212)';
ctx.strokeStyle = isDarkMode ? 'rgb(2, 132, 199)' : 'rgb(8, 145, 178)';
}
ctx.lineWidth = 1;
ctx.fillRect(
-EVENT_DOT_SIZE / 2,
-EVENT_DOT_SIZE / 2,
EVENT_DOT_SIZE,
EVENT_DOT_SIZE,
);
ctx.strokeRect(
-EVENT_DOT_SIZE / 2,
-EVENT_DOT_SIZE / 2,
EVENT_DOT_SIZE,
EVENT_DOT_SIZE,
);
ctx.restore();
},
[isDarkMode],
);
// Get CSS color value from color string or CSS variable
// const getColorValue = useCallback((color: string): string => {
// // if (color.startsWith('var(')) {
// // // For CSS variables, we need to get computed value
// // const tempDiv = document.createElement('div');
// // tempDiv.style.color = color;
// // document.body.appendChild(tempDiv);
// // const computedColor = window.getComputedStyle(tempDiv).color;
// // document.body.removeChild(tempDiv);
// // return computedColor;
// // }
// return color;
// }, []);
// Get span color based on service, error state, and selection
// separate this when introducing interaction canvas
const getSpanColor = useCallback(
(span: FlamegraphSpan): string => {
let color = generateColor(span.serviceName, themeColors.traceDetailColors);
if (span.hasError) {
color = isDarkMode ? 'rgb(239, 68, 68)' : 'rgb(220, 38, 38)';
}
// else {
// color = getColorValue(color);
// }
// Apply selection/hover highlight
//hover/selection highlight in getSpanColor forces base redraw. clipping necessary.
if (selectedSpan?.spanId === span.spanId || hoveredSpanId === span.spanId) {
const colorObj = Color(color);
color = isDarkMode
? colorObj.lighten(0.7).hex()
: colorObj.darken(0.7).hex();
}
return color;
},
[isDarkMode, selectedSpan, hoveredSpanId],
);
// Draw a single span and its events
const drawSpan = useCallback(
(
ctx: CanvasRenderingContext2D,
span: FlamegraphSpan,
x: number,
y: number,
width: number,
levelIndex: number,
spanRectsArray: SpanRect[],
): void => {
const color = getSpanColor(span); // do not depend on hover/clicks
const spanY = y + SPAN_BAR_Y_OFFSET;
// Draw span rectangle
ctx.fillStyle = color;
ctx.beginPath();
// see if we can avoid roundRect as it is performance intensive
ctx.roundRect(x, spanY, width, SPAN_BAR_HEIGHT, 6);
ctx.fill();
// Store rect for hit testing
// consider per level buckets to improve hit testing
// So hover can:
// compute level from y
// search only within that row
spanRectsArray.push({
span,
x,
y: spanY,
width,
height: SPAN_BAR_HEIGHT,
level: levelIndex,
});
// Draw events
// think about optimizing this.
// if span is too small to draw events, skip drawing events???
span.event?.forEach((event) => {
const eventTimeMs = event.timeUnixNano / 1e6;
const eventOffsetPercent =
((eventTimeMs - span.timestamp) / (span.durationNano / 1e6)) * 100;
const clampedOffset = Math.max(1, Math.min(eventOffsetPercent, 99));
const eventX = x + (clampedOffset / 100) * width;
const eventY = spanY + SPAN_BAR_HEIGHT / 2;
// LOD guard: skip events if span too narrow
// if (width < EVENT_DOT_SIZE) {
// return;
// }
drawEventDot(ctx, eventX, eventY, event.isError);
});
},
[getSpanColor, drawEventDot],
);
const drawFlamegraph = useCallback(() => {
const canvas = baseCanvasRef.current;
const container = containerRef.current;
if (!canvas || !container) {
return;
}
const ctx = canvas.getContext('2d');
if (!ctx) {
return;
}
const dpr = window.devicePixelRatio || 1;
ctx.setTransform(dpr, 0, 0, dpr, 0, 0);
const timeSpan = viewEndTs - viewStartTs;
if (timeSpan <= 0) {
return;
}
const cssWidth = canvas.width / dpr;
// ---- Vertical clipping window ----
const viewportHeight = container.clientHeight;
const overscan = 4;
const firstLevel = Math.max(0, Math.floor(scrollTop / ROW_HEIGHT) - overscan);
const visibleLevelCount =
Math.ceil(viewportHeight / ROW_HEIGHT) + 2 * overscan;
const lastLevel = Math.min(spans.length - 1, firstLevel + visibleLevelCount);
// ---- Clear only visible region (recommended) ----
const clearTop = firstLevel * ROW_HEIGHT;
const clearHeight = (lastLevel - firstLevel + 1) * ROW_HEIGHT;
ctx.clearRect(0, clearTop, cssWidth, clearHeight);
const spanRectsArray: SpanRect[] = [];
// ---- Draw only visible levels ----
for (let levelIndex = firstLevel; levelIndex <= lastLevel; levelIndex++) {
const levelSpans = spans[levelIndex];
if (!levelSpans) {
continue;
}
const y = levelIndex * ROW_HEIGHT;
for (let i = 0; i < levelSpans.length; i++) {
const span = levelSpans[i];
const spanStartMs = span.timestamp;
const spanEndMs = span.timestamp + span.durationNano / 1e6;
// Time culling (already correct)
if (spanEndMs < viewStartTs || spanStartMs > viewEndTs) {
continue;
}
const leftOffset = ((spanStartMs - viewStartTs) / timeSpan) * cssWidth;
const rightEdge = ((spanEndMs - viewStartTs) / timeSpan) * cssWidth;
let width = rightEdge - leftOffset;
// Clamp to visible x-range
if (leftOffset < 0) {
width += leftOffset;
if (width <= 0) {
continue;
}
}
if (rightEdge > cssWidth) {
width = cssWidth - Math.max(0, leftOffset);
if (width <= 0) {
continue;
}
}
// Optional: minimum 1px width
if (width > 0 && width < 1) {
width = 1;
}
drawSpan(
ctx,
span,
Math.max(0, leftOffset),
y,
width,
levelIndex,
spanRectsArray,
);
}
}
spanRects.current = spanRectsArray;
}, [spans, viewStartTs, viewEndTs, scrollTop, drawSpan]);
// Handle canvas resize with device pixel ratio
useEffect(() => {
const canvas = baseCanvasRef.current;
const container = containerRef.current;
if (!canvas || !container) {
return;
}
const updateCanvasSize = (): void => {
const rect = container.getBoundingClientRect();
const dpr = window.devicePixelRatio || 1;
// Set CSS size
canvas.style.width = `${rect.width}px`;
canvas.style.height = `${totalHeight}px`;
// Set actual pixel size (accounting for DPR)
// Only update if size actually changed to prevent unnecessary redraws
const newWidth = Math.floor(rect.width * dpr);
const newHeight = Math.floor(totalHeight * dpr);
if (canvas.width !== newWidth || canvas.height !== newHeight) {
canvas.width = newWidth;
canvas.height = newHeight;
// Redraw with current time window (preserves zoom/pan)
drawFlamegraph();
}
};
const resizeObserver = new ResizeObserver(updateCanvasSize);
resizeObserver.observe(container);
// Initial size
updateCanvasSize();
// Handle DPR changes (e.g., moving window between screens)
const handleDPRChange = (): void => {
updateCanvasSize();
};
window
.matchMedia('(resolution: 1dppx)')
.addEventListener('change', handleDPRChange);
return (): void => {
resizeObserver.disconnect();
window
.matchMedia('(resolution: 1dppx)')
.removeEventListener('change', handleDPRChange);
};
}, [drawFlamegraph, totalHeight]);
// Re-draw when data changes
useEffect(() => {
drawFlamegraph();
}, [drawFlamegraph]);
return (
<>
<div ref={containerRef} className="trace-flamegraph trace-flamegraph-canvas">
<canvas ref={baseCanvasRef}></canvas>
<div className="trace-flamegraph">
<Virtuoso
ref={virtuosoRef}
className="trace-flamegraph-virtuoso"
data={spans}
itemContent={renderSpanLevel}
rangeChanged={handleRangeChanged}
/>
</div>
<TimelineV2
startTimestamp={traceMetadata.startTime}

View File

@@ -1,890 +0,0 @@
// @ts-nocheck
/* eslint-disable */
/* eslint-disable jsx-a11y/no-static-element-interactions */
/* eslint-disable jsx-a11y/click-events-have-key-events */
import {
Dispatch,
SetStateAction,
useCallback,
useEffect,
useRef,
useState,
} from 'react';
import { useHistory, useLocation } from 'react-router-dom';
import { Button } from 'antd';
import Color from 'color';
import TimelineV2 from 'components/TimelineV2/TimelineV2';
import { themeColors } from 'constants/theme';
import { useIsDarkMode } from 'hooks/useDarkMode';
import { generateColor } from 'lib/uPlotLib/utils/generateColor';
import { FlamegraphSpan } from 'types/api/trace/getTraceFlamegraph';
import { Span } from 'types/api/trace/getTraceV2';
import './Success.styles.scss';
interface ITraceMetadata {
startTime: number;
endTime: number;
}
interface ISuccessProps {
spans: FlamegraphSpan[][];
firstSpanAtFetchLevel: string;
setFirstSpanAtFetchLevel: Dispatch<SetStateAction<string>>;
traceMetadata: ITraceMetadata;
selectedSpan: Span | undefined;
}
// Constants for rendering
const ROW_HEIGHT = 24; // 18px height + 6px padding
const SPAN_BAR_HEIGHT = 12;
const EVENT_DOT_SIZE = 6;
const SPAN_BAR_Y_OFFSET = 3; // Center the 12px bar in 18px row
interface SpanRect {
span: FlamegraphSpan;
x: number;
y: number;
width: number;
height: number;
level: number;
}
function Success(props: ISuccessProps): JSX.Element {
const {
spans,
setFirstSpanAtFetchLevel,
traceMetadata,
firstSpanAtFetchLevel,
selectedSpan,
} = props;
const { search } = useLocation();
const history = useHistory();
const isDarkMode = useIsDarkMode();
const canvasRef = useRef<HTMLCanvasElement>(null);
const containerRef = useRef<HTMLDivElement>(null);
const [hoveredSpanId, setHoveredSpanId] = useState<string>('');
const [tooltipContent, setTooltipContent] = useState<{
content: string;
x: number;
y: number;
} | null>(null);
const [scrollTop, setScrollTop] = useState<number>(0);
// Time window state (instead of zoom/pan in pixel space)
const [viewStartTs, setViewStartTs] = useState<number>(
traceMetadata.startTime,
);
const [viewEndTs, setViewEndTs] = useState<number>(traceMetadata.endTime);
const [isSpacePressed, setIsSpacePressed] = useState<boolean>(false);
// Refs to avoid stale state during rapid wheel events and dragging
const viewStartRef = useRef(viewStartTs);
const viewEndRef = useRef(viewEndTs);
useEffect(() => {
viewStartRef.current = viewStartTs;
viewEndRef.current = viewEndTs;
}, [viewStartTs, viewEndTs]);
// Drag state in refs to avoid re-renders during drag
const isDraggingRef = useRef(false);
const dragStartRef = useRef<{ x: number; y: number } | null>(null);
const dragDistanceRef = useRef(0);
const suppressClickRef = useRef(false);
// Scroll ref to avoid recreating getCanvasPointer on every scroll
const scrollTopRef = useRef(0);
useEffect(() => {
scrollTopRef.current = scrollTop;
}, [scrollTop]);
// Build a flat array of span rectangles for hit testing
const spanRects = useRef<SpanRect[]>([]);
// Get span color based on service, error state, and selection
const getSpanColor = useCallback(
(span: FlamegraphSpan): string => {
let color = generateColor(span.serviceName, themeColors.traceDetailColors);
if (span.hasError) {
color = isDarkMode ? 'rgb(239, 68, 68)' : 'rgb(220, 38, 38)';
}
// Apply selection/hover highlight
if (selectedSpan?.spanId === span.spanId || hoveredSpanId === span.spanId) {
const colorObj = Color(color);
color = isDarkMode
? colorObj.lighten(0.7).hex()
: colorObj.darken(0.7).hex();
}
return color;
},
[isDarkMode, selectedSpan, hoveredSpanId],
);
// Draw a single event dot
const drawEventDot = useCallback(
(
ctx: CanvasRenderingContext2D,
x: number,
y: number,
isError: boolean,
): void => {
// could be optimized:
// ctx.beginPath();
// ctx.moveTo(x, y - size/2);
// ctx.lineTo(x + size/2, y);
// ctx.lineTo(x, y + size/2);
// ctx.lineTo(x - size/2, y);
// ctx.closePath();
// ctx.fill();
// ctx.stroke();
ctx.save();
ctx.translate(x, y);
ctx.rotate(Math.PI / 4); // 45 degrees
if (isError) {
ctx.fillStyle = isDarkMode ? 'rgb(239, 68, 68)' : 'rgb(220, 38, 38)';
ctx.strokeStyle = isDarkMode ? 'rgb(185, 28, 28)' : 'rgb(153, 27, 27)';
} else {
ctx.fillStyle = isDarkMode ? 'rgb(14, 165, 233)' : 'rgb(6, 182, 212)';
ctx.strokeStyle = isDarkMode ? 'rgb(2, 132, 199)' : 'rgb(8, 145, 178)';
}
ctx.lineWidth = 1;
ctx.fillRect(
-EVENT_DOT_SIZE / 2,
-EVENT_DOT_SIZE / 2,
EVENT_DOT_SIZE,
EVENT_DOT_SIZE,
);
ctx.strokeRect(
-EVENT_DOT_SIZE / 2,
-EVENT_DOT_SIZE / 2,
EVENT_DOT_SIZE,
EVENT_DOT_SIZE,
);
ctx.restore();
},
[isDarkMode],
);
// Draw a single span and its events
const drawSpan = useCallback(
(
ctx: CanvasRenderingContext2D,
span: FlamegraphSpan,
x: number,
y: number,
width: number,
levelIndex: number,
spanRectsArray: SpanRect[],
): void => {
const color = getSpanColor(span); // do not depend on hover/clicks
const spanY = y + SPAN_BAR_Y_OFFSET;
// Draw span rectangle
ctx.fillStyle = color;
ctx.beginPath();
ctx.roundRect(x, spanY, width, SPAN_BAR_HEIGHT, 6);
ctx.fill();
// Store rect for hit testing
// consider per level buckets to improve hit testing
// So hover can:
// compute level from y
// search only within that row
spanRectsArray.push({
span,
x,
y: spanY,
width,
height: SPAN_BAR_HEIGHT,
level: levelIndex,
});
// Draw events
// think about optimizing this.
// if span is too small to draw events, skip drawing events???
span.event?.forEach((event) => {
const eventTimeMs = event.timeUnixNano / 1e6;
const eventOffsetPercent =
((eventTimeMs - span.timestamp) / (span.durationNano / 1e6)) * 100;
const clampedOffset = Math.max(1, Math.min(eventOffsetPercent, 99));
const eventX = x + (clampedOffset / 100) * width;
const eventY = spanY + SPAN_BAR_HEIGHT / 2;
drawEventDot(ctx, eventX, eventY, event.isError);
});
},
[getSpanColor, drawEventDot],
);
// Draw the flamegraph on canvas
const drawFlamegraph = useCallback(() => {
const canvas = canvasRef.current;
const container = containerRef.current;
if (!canvas || !container) {
return;
}
const ctx = canvas.getContext('2d');
if (!ctx) {
return;
}
const dpr = window.devicePixelRatio || 1;
ctx.setTransform(dpr, 0, 0, dpr, 0, 0);
const timeSpan = viewEndTs - viewStartTs;
if (timeSpan <= 0) {
return;
}
const cssWidth = canvas.width / dpr;
// ---- Vertical clipping window ----
const viewportHeight = container.clientHeight;
const overscan = 4;
const firstLevel = Math.max(0, Math.floor(scrollTop / ROW_HEIGHT) - overscan);
const visibleLevelCount =
Math.ceil(viewportHeight / ROW_HEIGHT) + 2 * overscan;
const lastLevel = Math.min(spans.length - 1, firstLevel + visibleLevelCount);
// ---- Clear only visible region (recommended) ----
const clearTop = firstLevel * ROW_HEIGHT;
const clearHeight = (lastLevel - firstLevel + 1) * ROW_HEIGHT;
ctx.clearRect(0, clearTop, cssWidth, clearHeight);
const spanRectsArray: SpanRect[] = [];
// ---- Draw only visible levels ----
for (let levelIndex = firstLevel; levelIndex <= lastLevel; levelIndex++) {
const levelSpans = spans[levelIndex];
if (!levelSpans) {
continue;
}
const y = levelIndex * ROW_HEIGHT;
for (let i = 0; i < levelSpans.length; i++) {
const span = levelSpans[i];
const spanStartMs = span.timestamp;
const spanEndMs = span.timestamp + span.durationNano / 1e6;
// Time culling (already correct)
if (spanEndMs < viewStartTs || spanStartMs > viewEndTs) {
continue;
}
const leftOffset = ((spanStartMs - viewStartTs) / timeSpan) * cssWidth;
const rightEdge = ((spanEndMs - viewStartTs) / timeSpan) * cssWidth;
let width = rightEdge - leftOffset;
// Clamp to visible x-range
if (leftOffset < 0) {
width += leftOffset;
if (width <= 0) {
continue;
}
}
if (rightEdge > cssWidth) {
width = cssWidth - Math.max(0, leftOffset);
if (width <= 0) {
continue;
}
}
// Optional: minimum 1px width
if (width > 0 && width < 1) {
width = 1;
}
drawSpan(
ctx,
span,
Math.max(0, leftOffset),
y,
width,
levelIndex,
spanRectsArray,
);
}
}
spanRects.current = spanRectsArray;
}, [spans, viewStartTs, viewEndTs, scrollTop, drawSpan]);
// Calculate total canvas height
const totalHeight = spans.length * ROW_HEIGHT;
console.log('time: ', {
start: traceMetadata.startTime,
end: traceMetadata.endTime,
});
// Initialize time window when trace metadata changes (only if not already set)
useEffect(() => {
// Only reset if we're at the default view (full trace)
// This prevents resetting zoom/pan when metadata updates
if (
viewStartTs === traceMetadata.startTime &&
viewEndTs === traceMetadata.endTime
) {
// Already at default, no need to update
return;
}
// Only reset if the trace bounds have actually changed significantly
const currentSpan = viewEndTs - viewStartTs;
const fullSpan = traceMetadata.endTime - traceMetadata.startTime;
// If we're zoomed in, preserve the zoom level relative to new bounds
if (currentSpan < fullSpan * 0.99) {
// We're zoomed in, adjust the window proportionally
const ratio = currentSpan / fullSpan;
const newSpan = (traceMetadata.endTime - traceMetadata.startTime) * ratio;
const center = (viewStartTs + viewEndTs) / 2;
const newStart = Math.max(
traceMetadata.startTime,
Math.min(center - newSpan / 2, traceMetadata.endTime - newSpan),
);
setViewStartTs(newStart);
setViewEndTs(newStart + newSpan);
} else {
// We're at full view, reset to new full view
setViewStartTs(traceMetadata.startTime);
setViewEndTs(traceMetadata.endTime);
}
}, [traceMetadata.startTime, traceMetadata.endTime, viewStartTs, viewEndTs]);
// Handle canvas resize with device pixel ratio
useEffect(() => {
const canvas = canvasRef.current;
const container = containerRef.current;
if (!canvas || !container) {
return;
}
const updateCanvasSize = (): void => {
const rect = container.getBoundingClientRect();
const dpr = window.devicePixelRatio || 1;
// Set CSS size
canvas.style.width = `${rect.width}px`;
canvas.style.height = `${totalHeight}px`;
// Set actual pixel size (accounting for DPR)
// Only update if size actually changed to prevent unnecessary redraws
const newWidth = Math.floor(rect.width * dpr);
const newHeight = Math.floor(totalHeight * dpr);
if (canvas.width !== newWidth || canvas.height !== newHeight) {
canvas.width = newWidth;
canvas.height = newHeight;
// Redraw with current time window (preserves zoom/pan)
drawFlamegraph();
}
};
const resizeObserver = new ResizeObserver(updateCanvasSize);
resizeObserver.observe(container);
// Initial size
updateCanvasSize();
// Handle DPR changes (e.g., moving window between screens)
const handleDPRChange = (): void => {
updateCanvasSize();
};
window
.matchMedia('(resolution: 1dppx)')
.addEventListener('change', handleDPRChange);
return (): void => {
resizeObserver.disconnect();
window
.matchMedia('(resolution: 1dppx)')
.removeEventListener('change', handleDPRChange);
};
}, [drawFlamegraph, totalHeight]);
// Re-draw when data changes
useEffect(() => {
drawFlamegraph();
}, [drawFlamegraph]);
// Find span at given canvas coordinates
const findSpanAtPosition = useCallback((x: number, y: number):
| SpanRect
| undefined => {
return spanRects.current.find(
(spanRect) =>
x >= spanRect.x &&
x <= spanRect.x + spanRect.width &&
y >= spanRect.y &&
y <= spanRect.y + spanRect.height,
);
}, []);
// Utility to convert client coordinates to CSS canvas coordinates
const getCanvasPointer = useCallback((clientX: number, clientY: number): {
cssX: number;
cssY: number;
cssWidth: number;
} | null => {
const canvas = canvasRef.current;
if (!canvas) {
return null;
}
const rect = canvas.getBoundingClientRect();
const dpr = window.devicePixelRatio || 1;
const cssWidth = canvas.width / dpr;
const cssX = (clientX - rect.left) * (cssWidth / rect.width);
const cssY = clientY - rect.top + scrollTopRef.current;
return { cssX, cssY, cssWidth };
}, []);
// Handle mouse move for hover and dragging
const handleMouseMove = useCallback(
(event: React.MouseEvent<HTMLCanvasElement>) => {
const canvas = canvasRef.current;
if (!canvas) {
return;
}
const rect = canvas.getBoundingClientRect();
console.log('event', { clientX: event.clientX, clientY: event.clientY });
// ---- Dragging (pan in time space) ----
if (isDraggingRef.current && dragStartRef.current) {
const deltaX = event.clientX - dragStartRef.current.x;
const deltaY = event.clientY - dragStartRef.current.y;
console.log('delta', { deltaY, deltaX });
const totalDistance = Math.sqrt(deltaX * deltaX + deltaY * deltaY);
dragDistanceRef.current = totalDistance;
const timeSpan = viewEndRef.current - viewStartRef.current;
const deltaTime = (deltaX / rect.width) * timeSpan;
const newStart = viewStartRef.current - deltaTime;
const clampedStart = Math.max(
traceMetadata.startTime,
Math.min(newStart, traceMetadata.endTime - timeSpan),
);
const clampedEnd = clampedStart + timeSpan;
setViewStartTs(clampedStart);
setViewEndTs(clampedEnd);
dragStartRef.current = {
x: event.clientX,
y: event.clientY,
};
return;
}
// ---- Hover ----
const pointer = getCanvasPointer(event.clientX, event.clientY);
if (!pointer) {
return;
}
const { cssX, cssY } = pointer;
const hoveredSpan = findSpanAtPosition(cssX, cssY);
if (hoveredSpan) {
setHoveredSpanId(hoveredSpan.span.spanId);
setTooltipContent({
content: hoveredSpan.span.name,
x: event.clientX,
y: event.clientY,
});
canvas.style.cursor = 'pointer';
} else {
setHoveredSpanId('');
setTooltipContent(null);
// Set cursor based on space key state when not hovering
canvas.style.cursor = isSpacePressed ? 'grab' : 'default';
}
},
[findSpanAtPosition, traceMetadata, getCanvasPointer, isSpacePressed],
);
// Handle key down for space key
useEffect(() => {
const handleKeyDown = (event: KeyboardEvent): void => {
if (event.code === 'Space') {
event.preventDefault();
setIsSpacePressed(true);
}
};
const handleKeyUp = (event: KeyboardEvent): void => {
if (event.code === 'Space') {
event.preventDefault();
setIsSpacePressed(false);
}
};
window.addEventListener('keydown', handleKeyDown);
window.addEventListener('keyup', handleKeyUp);
return (): void => {
window.removeEventListener('keydown', handleKeyDown);
window.removeEventListener('keyup', handleKeyUp);
};
}, []);
const handleMouseDown = useCallback(
(event: React.MouseEvent<HTMLCanvasElement>) => {
if (event.button !== 0) {
return;
} // left click only
event.preventDefault();
isDraggingRef.current = true;
dragStartRef.current = {
x: event.clientX,
y: event.clientY,
};
dragDistanceRef.current = 0;
const canvas = canvasRef.current;
if (canvas) {
canvas.style.cursor = 'grabbing';
}
},
[],
);
const handleMouseUp = useCallback(() => {
const wasDrag = dragDistanceRef.current > 5;
suppressClickRef.current = wasDrag; // 👈 key fix: suppress click after drag
isDraggingRef.current = false;
dragStartRef.current = null;
dragDistanceRef.current = 0;
const canvas = canvasRef.current;
if (canvas) {
canvas.style.cursor = 'grab';
}
return wasDrag;
}, []);
const handleMouseLeave = useCallback(() => {
isOverFlamegraphRef.current = false;
setHoveredSpanId('');
setTooltipContent(null);
isDraggingRef.current = false;
dragStartRef.current = null;
dragDistanceRef.current = 0;
const canvas = canvasRef.current;
if (canvas) {
canvas.style.cursor = 'grab';
}
}, []);
const handleClick = useCallback(
(event: React.MouseEvent<HTMLCanvasElement>) => {
// Prevent click after drag
if (suppressClickRef.current) {
suppressClickRef.current = false; // reset after suppressing once
return;
}
const pointer = getCanvasPointer(event.clientX, event.clientY);
if (!pointer) {
return;
}
const { cssX, cssY } = pointer;
const clickedSpan = findSpanAtPosition(cssX, cssY);
if (!clickedSpan) {
return;
}
const searchParams = new URLSearchParams(search);
const currentSpanId = searchParams.get('spanId');
if (currentSpanId !== clickedSpan.span.spanId) {
searchParams.set('spanId', clickedSpan.span.spanId);
history.replace({ search: searchParams.toString() });
}
},
[search, history, findSpanAtPosition, getCanvasPointer],
);
const isOverFlamegraphRef = useRef(false);
useEffect(() => {
const onWheel = (e: WheelEvent) => {
// Pinch zoom on trackpads often comes as ctrl+wheel
if (isOverFlamegraphRef.current && e.ctrlKey) {
e.preventDefault(); // stops browser zoom
}
};
// capture:true ensures we intercept early
window.addEventListener('wheel', onWheel, { passive: false, capture: true });
return () => {
window.removeEventListener(
'wheel',
onWheel as any,
{ capture: true } as any,
);
};
}, []);
const wheelDeltaRef = useRef(0);
const rafRef = useRef<number | null>(null);
const lastCursorXRef = useRef(0);
const lastCssWidthRef = useRef(1);
const lastIsPinchRef = useRef(false);
const applyWheelZoom = useCallback(() => {
rafRef.current = null;
const cssWidth = lastCssWidthRef.current || 1;
const cursorX = lastCursorXRef.current;
const fullSpan = traceMetadata.endTime - traceMetadata.startTime;
const oldSpan = viewEndRef.current - viewStartRef.current;
// ✅ Different intensity for pinch vs scroll
const zoomIntensityScroll = 0.0015;
const zoomIntensityPinch = 0.01; // pinch needs stronger response
const zoomIntensity = lastIsPinchRef.current
? zoomIntensityPinch
: zoomIntensityScroll;
const deltaY = wheelDeltaRef.current;
wheelDeltaRef.current = 0;
// ✅ Smooth zoom using delta magnitude
const zoomFactor = Math.exp(deltaY * zoomIntensity);
const newSpan = oldSpan * zoomFactor;
console.log('newSpan', { cssWidth, newSpan, zoomFactor, oldSpan });
// ✅ Better minSpan clamp (absolute + pixel-based)
const absoluteMinSpan = 5; // ms
const pixelMinSpan = fullSpan / cssWidth; // ~1px of time
const minSpan = Math.max(absoluteMinSpan, pixelMinSpan);
const maxSpan = fullSpan;
const clampedSpan = Math.max(minSpan, Math.min(maxSpan, newSpan));
// ✅ Anchor preserving zoom (same as your original logic)
const cursorRatio = Math.max(0, Math.min(cursorX / cssWidth, 1));
const anchorTs = viewStartRef.current + cursorRatio * oldSpan;
const newViewStart = anchorTs - cursorRatio * clampedSpan;
const finalStart = Math.max(
traceMetadata.startTime,
Math.min(newViewStart, traceMetadata.endTime - clampedSpan),
);
const finalEnd = finalStart + clampedSpan;
console.log('finalStart', finalStart);
console.log('finalEnd', finalEnd);
setViewStartTs(finalStart);
setViewEndTs(finalEnd);
}, [traceMetadata]);
const handleWheel = useCallback(
(event: React.WheelEvent<HTMLCanvasElement>) => {
event.preventDefault();
const pointer = getCanvasPointer(event.clientX, event.clientY);
if (!pointer) {
return;
}
console.log('pointer', pointer);
const { cssX: cursorX, cssWidth } = pointer;
// ✅ Detect pinch on Chrome/Edge: ctrlKey true for trackpad pinch
lastIsPinchRef.current = event.ctrlKey;
lastCssWidthRef.current = cssWidth;
lastCursorXRef.current = cursorX;
// ✅ Accumulate deltas; apply once per frame
wheelDeltaRef.current += event.deltaY;
if (rafRef.current == null) {
rafRef.current = requestAnimationFrame(applyWheelZoom);
}
},
[applyWheelZoom, getCanvasPointer],
);
// Reset zoom and pan
const handleResetZoom = useCallback(() => {
setViewStartTs(traceMetadata.startTime);
setViewEndTs(traceMetadata.endTime);
}, [traceMetadata]);
// Handle scroll for pagination
const handleScroll = useCallback(
(event: React.UIEvent<HTMLDivElement>): void => {
const target = event.currentTarget;
setScrollTop(target.scrollTop);
// Pagination logic
if (spans.length < 50) {
return;
}
const scrollPercentage = target.scrollTop / target.scrollHeight;
const totalLevels = spans.length;
if (scrollPercentage === 0 && spans[0]?.[0]?.level !== 0) {
setFirstSpanAtFetchLevel(spans[0][0].spanId);
}
if (scrollPercentage >= 0.95 && spans[totalLevels - 1]?.[0]?.spanId) {
setFirstSpanAtFetchLevel(spans[totalLevels - 1][0].spanId);
}
},
[spans, setFirstSpanAtFetchLevel],
);
// Auto-scroll to selected span
useEffect(() => {
if (!firstSpanAtFetchLevel || !containerRef.current) {
return;
}
const levelIndex = spans.findIndex(
(level) => level[0]?.spanId === firstSpanAtFetchLevel,
);
if (levelIndex !== -1) {
const targetScroll = levelIndex * ROW_HEIGHT;
containerRef.current.scrollTop = targetScroll;
setScrollTop(targetScroll);
}
}, [firstSpanAtFetchLevel, spans]);
return (
<>
<div
ref={containerRef}
className="trace-flamegraph trace-flamegraph-canvas"
onScroll={handleScroll}
>
{(viewStartTs !== traceMetadata.startTime ||
viewEndTs !== traceMetadata.endTime) && (
<Button
className="flamegraph-reset-zoom"
size="small"
onClick={handleResetZoom}
title="Reset zoom and pan"
>
Reset View
</Button>
)}
<canvas
ref={canvasRef}
style={{
display: 'block',
width: '100%',
height: `${totalHeight}px`,
}}
onMouseMove={handleMouseMove}
onMouseDown={handleMouseDown}
onMouseUp={handleMouseUp}
onMouseEnter={() => (isOverFlamegraphRef.current = true)}
onMouseLeave={handleMouseLeave}
onClick={handleClick}
onWheel={handleWheel}
onContextMenu={(e): void => e.preventDefault()}
/>
{tooltipContent && (
<div
style={{
position: 'fixed',
left: `${tooltipContent.x + 10}px`,
top: `${tooltipContent.y - 10}px`,
pointerEvents: 'none',
zIndex: 1000,
backgroundColor: isDarkMode ? '#1f2937' : '#ffffff',
color: isDarkMode ? '#ffffff' : '#000000',
padding: '4px 8px',
borderRadius: '4px',
fontSize: '12px',
boxShadow: '0 2px 8px rgba(0,0,0,0.15)',
}}
>
{tooltipContent.content}
</div>
)}
</div>
<TimelineV2
startTimestamp={viewStartTs}
endTimestamp={viewEndTs}
offsetTimestamp={viewStartTs - traceMetadata.startTime}
timelineHeight={22}
/>
</>
);
}
export default Success;
// on drag on click is getting registered as a click
// zoom and scale not matching
// check minimap logic
// use
// const scrollTopRef = useRef(scrollTop);
// useEffect(() => {
// scrollTopRef.current = scrollTop;
// }, [scrollTop]);
// fix clicks in interaction canvas
// Auto-scroll to selected span else on top(based on default span)
// time bar line vertical
// zoom handle vertical and horizontal scroll with proper defined thresholds
// timeline should be in sync with the flamegraph. test with vertical line of time on event etc.
// proper working interaction layer for clicks and hovers
// hit testing should be efficient and accurate without flat spanRect
// Final Priority Order (Clean Summary)
// ✅ Zoom (Horizontal + Vertical thresholds)
// ✅ Timeline sync + vertical time dashed line
// ✅ Minimap brush correctness
// ✅ Auto-scroll behavior
// ✅ Interaction layer separation
// ✅ Efficient hit testing

View File

@@ -17,9 +17,6 @@ const useGetTraceFlamegraph = (
REACT_QUERY_KEY.GET_TRACE_V2_FLAMEGRAPH,
props.traceId,
props.selectedSpanId,
props.limit,
props.boundaryStartTsMilli,
props.boundarEndTsMilli,
],
enabled: !!props.traceId,
keepPreviousData: true,

View File

@@ -5,9 +5,6 @@ export interface TraceDetailFlamegraphURLProps {
export interface GetTraceFlamegraphPayloadProps {
traceId: string;
selectedSpanId: string;
limit: number;
boundaryStartTsMilli: number;
boundarEndTsMilli: number;
}
export interface Event {
@@ -34,6 +31,4 @@ export interface GetTraceFlamegraphSuccessResponse {
spans: FlamegraphSpan[][];
startTimestampMillis: number;
endTimestampMillis: number;
durationNano: number;
hasMore: boolean;
}

View File

@@ -1046,19 +1046,7 @@ func (r *ClickHouseReader) GetWaterfallSpansForTraceWithMetadata(ctx context.Con
}
processingPostCache := time.Now()
limit := min(req.Limit, tracedetail.MaxLimitToSelectAllSpans)
selectAllSpans := totalSpans <= uint64(limit)
var (
selectedSpans []*model.Span
uncollapsedSpans []string
rootServiceName, rootServiceEntryPoint string
)
if selectAllSpans {
selectedSpans, rootServiceName, rootServiceEntryPoint = tracedetail.GetAllSpans(traceRoots)
} else {
selectedSpans, uncollapsedSpans, rootServiceName, rootServiceEntryPoint = tracedetail.GetSelectedSpans(req.UncollapsedSpans, req.SelectedSpanID, traceRoots, spanIdToSpanNodeMap, req.IsSelectedSpanIDUnCollapsed)
}
selectedSpans, uncollapsedSpans, rootServiceName, rootServiceEntryPoint := tracedetail.GetSelectedSpans(req.UncollapsedSpans, req.SelectedSpanID, traceRoots, spanIdToSpanNodeMap, req.IsSelectedSpanIDUnCollapsed)
zap.L().Info("getWaterfallSpansForTraceWithMetadata: processing post cache", zap.Duration("duration", time.Since(processingPostCache)), zap.String("traceID", traceID))
// convert start timestamp to millis because right now frontend is expecting it in millis
@@ -1071,7 +1059,7 @@ func (r *ClickHouseReader) GetWaterfallSpansForTraceWithMetadata(ctx context.Con
}
response.Spans = selectedSpans
response.UncollapsedSpans = uncollapsedSpans // ignoring if all spans are returning
response.UncollapsedSpans = uncollapsedSpans
response.StartTimestampMillis = startTime / 1000000
response.EndTimestampMillis = endTime / 1000000
response.TotalSpansCount = totalSpans
@@ -1080,7 +1068,6 @@ func (r *ClickHouseReader) GetWaterfallSpansForTraceWithMetadata(ctx context.Con
response.RootServiceEntryPoint = rootServiceEntryPoint
response.ServiceNameToTotalDurationMap = serviceNameToTotalDurationMap
response.HasMissingSpans = hasMissingSpans
response.HasMore = !selectAllSpans
return response, nil
}
@@ -1212,7 +1199,7 @@ func (r *ClickHouseReader) GetFlamegraphSpansForTrace(ctx context.Context, orgID
}
}
selectedSpans = tracedetail.GetAllSpansForFlamegraph(traceRoots, spanIdToSpanNodeMap)
selectedSpans = tracedetail.GetSelectedSpansForFlamegraph(traceRoots, spanIdToSpanNodeMap)
traceCache := model.GetFlamegraphSpansForTraceCache{
StartTime: startTime,
EndTime: endTime,
@@ -1229,20 +1216,12 @@ func (r *ClickHouseReader) GetFlamegraphSpansForTrace(ctx context.Context, orgID
}
processingPostCache := time.Now()
selectedSpansForRequest := selectedSpans
limit := min(req.Limit, tracedetail.MaxLimitWithoutSampling)
totalSpanCount := tracedetail.GetTotalSpanCount(selectedSpans)
if totalSpanCount > uint64(limit) {
boundaryStart, boundaryEnd := utils.MilliToNano(req.BoundaryStartTS), utils.MilliToNano(req.BoundaryEndTS)
selectedSpansForRequest = tracedetail.GetSelectedSpansForFlamegraphForRequest(req.SelectedSpanID, selectedSpans, boundaryStart, boundaryEnd)
}
zap.L().Info("getFlamegraphSpansForTrace: processing post cache", zap.Duration("duration", time.Since(processingPostCache)), zap.String("traceID", traceID),
zap.Uint64("totalSpanCount", totalSpanCount))
selectedSpansForRequest := tracedetail.GetSelectedSpansForFlamegraphForRequest(req.SelectedSpanID, selectedSpans, startTime, endTime)
zap.L().Info("getFlamegraphSpansForTrace: processing post cache", zap.Duration("duration", time.Since(processingPostCache)), zap.String("traceID", traceID))
trace.Spans = selectedSpansForRequest
trace.StartTimestampMillis = startTime / 1000000
trace.EndTimestampMillis = endTime / 1000000
trace.HasMore = totalSpanCount > uint64(limit)
return trace, nil
}

View File

@@ -7,11 +7,9 @@ import (
)
var (
flamegraphSpanLevelLimit float64 = 50
flamegraphSpanLimitPerLevel int = 1000
flamegraphSamplingBucketCount int = 500
MaxLimitWithoutSampling uint = 120_000
SPAN_LIMIT_PER_REQUEST_FOR_FLAMEGRAPH float64 = 50
SPAN_LIMIT_PER_LEVEL int = 100
TIMESTAMP_SAMPLING_BUCKET_COUNT int = 50
)
func ContainsFlamegraphSpan(slice []*model.FlamegraphSpan, item *model.FlamegraphSpan) bool {
@@ -54,8 +52,7 @@ func FindIndexForSelectedSpan(spans [][]*model.FlamegraphSpan, selectedSpanId st
return selectedSpanLevel
}
// GetAllSpansForFlamegraph groups all spans as per their level
func GetAllSpansForFlamegraph(traceRoots []*model.FlamegraphSpan, spanIdToSpanNodeMap map[string]*model.FlamegraphSpan) [][]*model.FlamegraphSpan {
func GetSelectedSpansForFlamegraph(traceRoots []*model.FlamegraphSpan, spanIdToSpanNodeMap map[string]*model.FlamegraphSpan) [][]*model.FlamegraphSpan {
var traceIdLevelledFlamegraph = map[string]map[int64][]*model.FlamegraphSpan{}
selectedSpans := [][]*model.FlamegraphSpan{}
@@ -103,7 +100,7 @@ func getLatencyAndTimestampBucketedSpans(spans []*model.FlamegraphSpan, selected
})
// pick the top 5 latency spans
for idx := range 100 {
for idx := range 5 {
sampledSpans = append(sampledSpans, spans[idx])
}
@@ -113,7 +110,6 @@ func getLatencyAndTimestampBucketedSpans(spans []*model.FlamegraphSpan, selected
for _idx, span := range spans {
if span.SpanID == selectedSpanID {
idx = _idx
break
}
}
if idx != -1 {
@@ -121,17 +117,17 @@ func getLatencyAndTimestampBucketedSpans(spans []*model.FlamegraphSpan, selected
}
}
bucketSize := (endTime - startTime) / uint64(flamegraphSamplingBucketCount)
bucketSize := (endTime - startTime) / uint64(TIMESTAMP_SAMPLING_BUCKET_COUNT)
if bucketSize == 0 {
bucketSize = 1
}
bucketedSpans := make([][]*model.FlamegraphSpan, flamegraphSamplingBucketCount)
bucketedSpans := make([][]*model.FlamegraphSpan, 50)
for _, span := range spans {
if span.TimeUnixNano >= startTime && span.TimeUnixNano <= endTime {
bucketIndex := int((span.TimeUnixNano - startTime) / bucketSize)
if bucketIndex >= 0 && bucketIndex < flamegraphSamplingBucketCount {
if bucketIndex >= 0 && bucketIndex < 50 {
bucketedSpans[bucketIndex] = append(bucketedSpans[bucketIndex], span)
}
}
@@ -160,8 +156,8 @@ func GetSelectedSpansForFlamegraphForRequest(selectedSpanID string, selectedSpan
selectedIndex = FindIndexForSelectedSpan(selectedSpans, selectedSpanID)
}
lowerLimit := selectedIndex - int(flamegraphSpanLevelLimit*0.4)
upperLimit := selectedIndex + int(flamegraphSpanLevelLimit*0.6)
lowerLimit := selectedIndex - int(SPAN_LIMIT_PER_REQUEST_FOR_FLAMEGRAPH*0.4)
upperLimit := selectedIndex + int(SPAN_LIMIT_PER_REQUEST_FOR_FLAMEGRAPH*0.6)
if lowerLimit < 0 {
upperLimit = upperLimit - lowerLimit
@@ -178,7 +174,7 @@ func GetSelectedSpansForFlamegraphForRequest(selectedSpanID string, selectedSpan
}
for i := lowerLimit; i < upperLimit; i++ {
if len(selectedSpans[i]) > flamegraphSpanLimitPerLevel {
if len(selectedSpans[i]) > SPAN_LIMIT_PER_LEVEL {
_spans := getLatencyAndTimestampBucketedSpans(selectedSpans[i], selectedSpanID, i == selectedIndex, startTime, endTime)
selectedSpansForRequest = append(selectedSpansForRequest, _spans)
} else {
@@ -188,12 +184,3 @@ func GetSelectedSpansForFlamegraphForRequest(selectedSpanID string, selectedSpan
return selectedSpansForRequest
}
func GetTotalSpanCount(spans [][]*model.FlamegraphSpan) uint64 {
levelCount := len(spans)
spanCount := uint64(0)
for i := range levelCount {
spanCount += uint64(len(spans[i]))
}
return spanCount
}

View File

@@ -9,9 +9,6 @@ import (
var (
SPAN_LIMIT_PER_REQUEST_FOR_WATERFALL float64 = 500
maxDepthForSelectedSpanChildren int = 5
MaxLimitToSelectAllSpans uint = 10_000
)
type Interval struct {
@@ -91,11 +88,8 @@ func getPathFromRootToSelectedSpanId(node *model.Span, selectedSpanId string, un
return isPresentInSubtreeForTheNode, spansFromRootToNode
}
func traverseTrace(span *model.Span, uncollapsedSpans []string, level uint64, isPartOfPreOrder bool, hasSibling bool, selectedSpanId string,
depthFromSelectedSpan int, isSelectedSpanIDUnCollapsed bool, selectAllSpan bool) ([]*model.Span, []string) {
func traverseTrace(span *model.Span, uncollapsedSpans []string, level uint64, isPartOfPreOrder bool, hasSibling bool, selectedSpanId string) []*model.Span {
preOrderTraversal := []*model.Span{}
autoExpandedSpans := []string{}
// sort the children to maintain the order across requests
sort.Slice(span.Children, func(i, j int) bool {
@@ -132,40 +126,15 @@ func traverseTrace(span *model.Span, uncollapsedSpans []string, level uint64, is
preOrderTraversal = append(preOrderTraversal, &nodeWithoutChildren)
}
nextDepthFromSelectedSpan := -1
if span.SpanID == selectedSpanId && isSelectedSpanIDUnCollapsed {
nextDepthFromSelectedSpan = 1
} else if depthFromSelectedSpan >= 1 && depthFromSelectedSpan < maxDepthForSelectedSpanChildren {
nextDepthFromSelectedSpan = depthFromSelectedSpan + 1
}
for index, child := range span.Children {
// A child is included in the pre-order output if its parent is uncollapsed
// OR if the child falls within MAX_DEPTH_FOR_SELECTED_SPAN_CHILDREN levels
// below the selected span.
isChildWithinMaxDepth := nextDepthFromSelectedSpan >= 1
isAlreadyUncollapsed := slices.Contains(uncollapsedSpans, span.SpanID)
childIsPartOfPreOrder := isPartOfPreOrder && (isAlreadyUncollapsed || isChildWithinMaxDepth)
if selectAllSpan {
childIsPartOfPreOrder = true
}
if isPartOfPreOrder && isChildWithinMaxDepth && !isAlreadyUncollapsed {
if !slices.Contains(autoExpandedSpans, span.SpanID) {
autoExpandedSpans = append(autoExpandedSpans, span.SpanID)
}
}
_childTraversal, _autoExpanded := traverseTrace(child, uncollapsedSpans, level+1, childIsPartOfPreOrder, index != (len(span.Children)-1), selectedSpanId,
nextDepthFromSelectedSpan, isSelectedSpanIDUnCollapsed, selectAllSpan)
_childTraversal := traverseTrace(child, uncollapsedSpans, level+1, isPartOfPreOrder && slices.Contains(uncollapsedSpans, span.SpanID), index != (len(span.Children)-1), selectedSpanId)
preOrderTraversal = append(preOrderTraversal, _childTraversal...)
autoExpandedSpans = append(autoExpandedSpans, _autoExpanded...)
nodeWithoutChildren.SubTreeNodeCount += child.SubTreeNodeCount + 1
span.SubTreeNodeCount += child.SubTreeNodeCount + 1
}
nodeWithoutChildren.SubTreeNodeCount += 1
return preOrderTraversal, autoExpandedSpans
return preOrderTraversal
}
@@ -199,13 +168,7 @@ func GetSelectedSpans(uncollapsedSpans []string, selectedSpanID string, traceRoo
_, spansFromRootToNode := getPathFromRootToSelectedSpanId(rootNode, selectedSpanID, updatedUncollapsedSpans, isSelectedSpanIDUnCollapsed)
updatedUncollapsedSpans = append(updatedUncollapsedSpans, spansFromRootToNode...)
_preOrderTraversal, _autoExpanded := traverseTrace(rootNode, updatedUncollapsedSpans, 0, true, false, selectedSpanID, -1, isSelectedSpanIDUnCollapsed, false)
// Merge auto-expanded spans into updatedUncollapsedSpans for returning in response
for _, spanID := range _autoExpanded {
if !slices.Contains(updatedUncollapsedSpans, spanID) {
updatedUncollapsedSpans = append(updatedUncollapsedSpans, spanID)
}
}
_preOrderTraversal := traverseTrace(rootNode, updatedUncollapsedSpans, 0, true, false, selectedSpanID)
_selectedSpanIndex := findIndexForSelectedSpanFromPreOrder(_preOrderTraversal, selectedSpanID)
if _selectedSpanIndex != -1 {
@@ -249,17 +212,3 @@ func GetSelectedSpans(uncollapsedSpans []string, selectedSpanID string, traceRoo
return preOrderTraversal[startIndex:endIndex], updatedUncollapsedSpans, rootServiceName, rootServiceEntryPoint
}
func GetAllSpans(traceRoots []*model.Span) (spans []*model.Span, rootServiceName, rootEntryPoint string) {
for _, root := range traceRoots {
childSpans, _ := traverseTrace(root, nil, 0, true, false, "", -1, false, true)
spans = append(spans, childSpans...)
if rootServiceName == "" {
rootServiceName = root.ServiceName
}
if rootEntryPoint == "" {
rootEntryPoint = root.Name
}
}
return
}

View File

@@ -333,14 +333,10 @@ type GetWaterfallSpansForTraceWithMetadataParams struct {
SelectedSpanID string `json:"selectedSpanId"`
IsSelectedSpanIDUnCollapsed bool `json:"isSelectedSpanIDUnCollapsed"`
UncollapsedSpans []string `json:"uncollapsedSpans"`
Limit uint `json:"limit"`
}
type GetFlamegraphSpansForTraceParams struct {
SelectedSpanID string `json:"selectedSpanId"`
Limit uint `json:"limit"`
BoundaryStartTS uint64 `json:"boundaryStartTsMilli"`
BoundaryEndTS uint64 `json:"boundarEndTsMilli"`
SelectedSpanID string `json:"selectedSpanId"`
}
type SpanFilterParams struct {

View File

@@ -329,7 +329,6 @@ type GetWaterfallSpansForTraceWithMetadataResponse struct {
HasMissingSpans bool `json:"hasMissingSpans"`
// this is needed for frontend and query service sync
UncollapsedSpans []string `json:"uncollapsedSpans"`
HasMore bool `json:"hasMore"`
}
type GetFlamegraphSpansForTraceResponse struct {
@@ -337,7 +336,6 @@ type GetFlamegraphSpansForTraceResponse struct {
EndTimestampMillis uint64 `json:"endTimestampMillis"`
DurationNano uint64 `json:"durationNano"`
Spans [][]*FlamegraphSpan `json:"spans"`
HasMore bool `json:"hasMore"`
}
type OtelSpanRef struct {

View File

@@ -17,7 +17,3 @@ func Elapsed(funcName string, args map[string]interface{}) func() {
zap.L().Info("Elapsed time", zapFields...)
}
}
func MilliToNano(milliTS uint64) uint64 {
return milliTS * 1000_000
}

View File

@@ -169,6 +169,7 @@ func NewSQLMigrationProviderFactories(
sqlmigration.NewAddAnonymousPublicDashboardTransactionFactory(sqlstore),
sqlmigration.NewAddRootUserFactory(sqlstore, sqlschema),
sqlmigration.NewAddUserEmailOrgIDIndexFactory(sqlstore, sqlschema),
sqlmigration.NewMigrateRulesV4ToV5Factory(sqlstore, telemetryStore),
)
}

View File

@@ -0,0 +1,209 @@
package sqlmigration
import (
"context"
"database/sql"
"encoding/json"
"log/slog"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/sqlstore"
"github.com/SigNoz/signoz/pkg/telemetrystore"
"github.com/SigNoz/signoz/pkg/transition"
"github.com/uptrace/bun"
"github.com/uptrace/bun/migrate"
)
type migrateRulesV4ToV5 struct {
store sqlstore.SQLStore
telemetryStore telemetrystore.TelemetryStore
logger *slog.Logger
}
func NewMigrateRulesV4ToV5Factory(
store sqlstore.SQLStore,
telemetryStore telemetrystore.TelemetryStore,
) factory.ProviderFactory[SQLMigration, Config] {
return factory.NewProviderFactory(
factory.MustNewName("migrate_rules_post_deprecation"),
func(ctx context.Context, ps factory.ProviderSettings, c Config) (SQLMigration, error) {
return &migrateRulesV4ToV5{
store: store,
telemetryStore: telemetryStore,
logger: ps.Logger,
}, nil
})
}
func (migration *migrateRulesV4ToV5) Register(migrations *migrate.Migrations) error {
if err := migrations.Register(migration.Up, migration.Down); err != nil {
return err
}
return nil
}
func (migration *migrateRulesV4ToV5) getLogDuplicateKeys(ctx context.Context) ([]string, error) {
query := `
SELECT name
FROM (
SELECT DISTINCT name FROM signoz_logs.distributed_logs_attribute_keys
INTERSECT
SELECT DISTINCT name FROM signoz_logs.distributed_logs_resource_keys
)
ORDER BY name
`
rows, err := migration.telemetryStore.ClickhouseDB().Query(ctx, query)
if err != nil {
migration.logger.WarnContext(ctx, "failed to query log duplicate keys", "error", err)
return nil, nil
}
defer rows.Close()
var keys []string
for rows.Next() {
var key string
if err := rows.Scan(&key); err != nil {
migration.logger.WarnContext(ctx, "failed to scan log duplicate key", "error", err)
continue
}
keys = append(keys, key)
}
return keys, nil
}
func (migration *migrateRulesV4ToV5) getTraceDuplicateKeys(ctx context.Context) ([]string, error) {
query := `
SELECT tagKey
FROM signoz_traces.distributed_span_attributes_keys
WHERE tagType IN ('tag', 'resource')
GROUP BY tagKey
HAVING COUNT(DISTINCT tagType) > 1
ORDER BY tagKey
`
rows, err := migration.telemetryStore.ClickhouseDB().Query(ctx, query)
if err != nil {
migration.logger.WarnContext(ctx, "failed to query trace duplicate keys", "error", err)
return nil, nil
}
defer rows.Close()
var keys []string
for rows.Next() {
var key string
if err := rows.Scan(&key); err != nil {
migration.logger.WarnContext(ctx, "failed to scan trace duplicate key", "error", err)
continue
}
keys = append(keys, key)
}
return keys, nil
}
func (migration *migrateRulesV4ToV5) Up(ctx context.Context, db *bun.DB) error {
logsKeys, err := migration.getLogDuplicateKeys(ctx)
if err != nil {
return err
}
tracesKeys, err := migration.getTraceDuplicateKeys(ctx)
if err != nil {
return err
}
tx, err := db.BeginTx(ctx, nil)
if err != nil {
return err
}
defer func() {
_ = tx.Rollback()
}()
var rules []struct {
ID string `bun:"id"`
Data map[string]any `bun:"data"`
}
err = tx.NewSelect().
Table("rule").
Column("id", "data").
Scan(ctx, &rules)
if err != nil {
if err == sql.ErrNoRows {
return nil
}
return err
}
alertsMigrator := transition.NewAlertMigrateV5(migration.logger, logsKeys, tracesKeys)
count := 0
for _, rule := range rules {
version, _ := rule.Data["version"].(string)
if version == "v5" {
continue
}
if version == "" {
migration.logger.WarnContext(ctx, "unexpected empty version for rule", "rule_id", rule.ID)
}
migration.logger.InfoContext(ctx, "migrating rule v4 to v5", "rule_id", rule.ID, "current_version", version)
// Check if the queries envelope already exists and is non-empty
hasQueriesEnvelope := false
if condition, ok := rule.Data["condition"].(map[string]any); ok {
if compositeQuery, ok := condition["compositeQuery"].(map[string]any); ok {
if queries, ok := compositeQuery["queries"].([]any); ok && len(queries) > 0 {
hasQueriesEnvelope = true
}
}
}
if hasQueriesEnvelope {
// already has queries envelope, just bump version
// this is because user made a mistake of choosing version
migration.logger.InfoContext(ctx, "rule already has queries envelope, bumping version", "rule_id", rule.ID)
rule.Data["version"] = "v5"
} else {
// old format, run full migration
migration.logger.InfoContext(ctx, "rule has old format, running full migration", "rule_id", rule.ID)
updated := alertsMigrator.Migrate(ctx, rule.Data)
if !updated {
migration.logger.WarnContext(ctx, "expected updated to be true but got false", "rule_id", rule.ID)
continue
}
rule.Data["version"] = "v5"
}
dataJSON, err := json.Marshal(rule.Data)
if err != nil {
return err
}
_, err = tx.NewUpdate().
Table("rule").
Set("data = ?", string(dataJSON)).
Where("id = ?", rule.ID).
Exec(ctx)
if err != nil {
return err
}
count++
}
if count != 0 {
migration.logger.InfoContext(ctx, "migrate v4 alerts", "count", count)
}
return tx.Commit()
}
func (migration *migrateRulesV4ToV5) Down(ctx context.Context, db *bun.DB) error {
return nil
}

View File

@@ -355,6 +355,10 @@ func (r *PostableRule) validate() error {
errs = append(errs, signozError.NewInvalidInputf(signozError.CodeInvalidInput, "composite query is required"))
}
if r.Version != "v5" {
errs = append(errs, signozError.NewInvalidInputf(signozError.CodeInvalidInput, "only version v5 is supported, got %q", r.Version))
}
if isAllQueriesDisabled(r.RuleCondition.CompositeQuery) {
errs = append(errs, signozError.NewInvalidInputf(signozError.CodeInvalidInput, "all queries are disabled in rule condition"))
}

View File

@@ -108,6 +108,7 @@ func TestParseIntoRule(t *testing.T) {
"ruleType": "threshold_rule",
"evalWindow": "5m",
"frequency": "1m",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -150,6 +151,7 @@ func TestParseIntoRule(t *testing.T) {
content: []byte(`{
"alert": "DefaultsRule",
"ruleType": "threshold_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -187,6 +189,7 @@ func TestParseIntoRule(t *testing.T) {
initRule: PostableRule{},
content: []byte(`{
"alert": "PromQLRule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "promql",
@@ -256,6 +259,7 @@ func TestParseIntoRuleSchemaVersioning(t *testing.T) {
content: []byte(`{
"alert": "SeverityLabelTest",
"schemaVersion": "v1",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -344,6 +348,7 @@ func TestParseIntoRuleSchemaVersioning(t *testing.T) {
content: []byte(`{
"alert": "NoLabelsTest",
"schemaVersion": "v1",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -384,6 +389,7 @@ func TestParseIntoRuleSchemaVersioning(t *testing.T) {
content: []byte(`{
"alert": "OverwriteTest",
"schemaVersion": "v1",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -474,6 +480,7 @@ func TestParseIntoRuleSchemaVersioning(t *testing.T) {
content: []byte(`{
"alert": "V2Test",
"schemaVersion": "v2",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -517,6 +524,7 @@ func TestParseIntoRuleSchemaVersioning(t *testing.T) {
initRule: PostableRule{},
content: []byte(`{
"alert": "DefaultSchemaTest",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -569,6 +577,7 @@ func TestParseIntoRuleSchemaVersioning(t *testing.T) {
func TestParseIntoRuleThresholdGeneration(t *testing.T) {
content := []byte(`{
"alert": "TestThresholds",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -639,6 +648,7 @@ func TestParseIntoRuleMultipleThresholds(t *testing.T) {
"schemaVersion": "v2",
"alert": "MultiThresholdAlert",
"ruleType": "threshold_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -732,6 +742,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyBelowTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -766,6 +777,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyBelowTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -799,6 +811,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyAboveTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -833,6 +846,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyAboveTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -866,6 +880,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyBelowAllTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -901,6 +916,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyBelowAllTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -935,6 +951,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "AnomalyOutOfBoundsTest",
"ruleType": "anomaly_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -969,6 +986,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "ThresholdTest",
"ruleType": "threshold_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",
@@ -1003,6 +1021,7 @@ func TestAnomalyNegationEval(t *testing.T) {
ruleJSON: []byte(`{
"alert": "ThresholdTest",
"ruleType": "threshold_rule",
"version": "v5",
"condition": {
"compositeQuery": {
"queryType": "builder",