fix: handle url.full and http.url in third-party API endpoint count

Use two queries combined via a formula to count distinct endpoints across both semconv versions (url.full and http.url). This avoids missing the count for endpoints whose spans set only url.full. Alternatively, a simple coalesce inside count_distinct would have fixed this, but it would require substantial query builder refactoring because it is not currently supported.
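
As a rough illustration of the two-query approach described above, here is a minimal in-memory sketch in Go. It is not the query builder code: the `span` type, `countDistinct`, and `countEndpoints` are hypothetical names invented for this example, and an empty string stands in for a missing attribute.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for a trace span.
// An empty string means the attribute is absent.
type span struct {
	urlFull string // url.full (current semconv)
	httpURL string // http.url (legacy semconv)
}

// countDistinct counts the distinct non-empty values of pick(s)
// over the spans for which keep(s) is true.
func countDistinct(spans []span, keep func(span) bool, pick func(span) string) int {
	seen := map[string]struct{}{}
	for _, s := range spans {
		if !keep(s) {
			continue
		}
		if v := pick(s); v != "" {
			seen[v] = struct{}{}
		}
	}
	return len(seen)
}

// countEndpoints mirrors the formula endpoints_current + endpoints_legacy:
// the two filters are mutually exclusive, so each span feeds exactly one count.
func countEndpoints(spans []span) int {
	current := countDistinct(spans,
		func(s span) bool { return s.urlFull != "" }, // url.full EXISTS
		func(s span) string { return s.urlFull })
	legacy := countDistinct(spans,
		func(s span) bool { return s.urlFull == "" }, // url.full NOT EXISTS
		func(s span) string { return s.httpURL })
	return current + legacy
}

func main() {
	spans := []span{
		{urlFull: "https://api.example.com/a"},
		{urlFull: "https://api.example.com/a", httpURL: "https://api.example.com/a"},
		{httpURL: "https://api.example.com/b"},
		{httpURL: "https://api.example.com/c"},
	}
	fmt.Println(countEndpoints(spans)) // prints 3
}
```

In this sketch a span that carries both attributes is counted only through url.full, so it is not double counted. A URL that appears as url.full on some spans and as http.url on others would still be counted once in each bucket, which is the kind of case a coalesce inside a single count_distinct would merge.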
2026-05-06 18:40:32 +01:00 · 2026-01-27 15:29:44 +05:30
5 changed files with 50 additions and 466 deletions
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -1,136 +0,0 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Project Overview
-
-SigNoz is an open-source observability platform (APM, logs, metrics, traces) built on OpenTelemetry and ClickHouse. It provides a unified solution for monitoring applications with features including distributed tracing, log management, metrics dashboards, and alerting.
-
-## Build and Development Commands
-
-### Development Environment Setup
-```bash
-make devenv-up              # Start ClickHouse and OTel Collector for local dev
-make devenv-clickhouse      # Start only ClickHouse
-make devenv-signoz-otel-collector  # Start only OTel Collector
-make devenv-clickhouse-clean       # Clean ClickHouse data
-```
-
-### Backend (Go)
-```bash
-make go-run-community       # Run community backend server
-make go-run-enterprise      # Run enterprise backend server
-make go-test                # Run all Go unit tests
-go test -race ./pkg/...     # Run tests for specific package
-go test -race ./pkg/querier/...  # Example: run querier tests
-```
-
-### Integration Tests (Python)
-```bash
-cd tests/integration
-uv sync                     # Install dependencies
-make py-test-setup          # Start test environment (keep running with --reuse)
-make py-test                # Run all integration tests
-make py-test-teardown       # Stop test environment
-
-# Run specific test
-uv run pytest --basetemp=./tmp/ -vv --reuse src/<suite>/<file>.py::test_name
-```
-
-### Code Quality
-```bash
-# Go linting (golangci-lint)
-golangci-lint run
-
-# Python formatting/linting
-make py-fmt                 # Format with black
-make py-lint                # Run isort, autoflake, pylint
-```
-
-### OpenAPI Generation
-```bash
-go run cmd/enterprise/*.go generate openapi
-```
-
-## Architecture Overview
-
-### Backend Structure
-
-The Go backend follows a **provider pattern** for dependency injection:
-
- **`pkg/signoz/`** - IoC container that wires all providers together
- **`pkg/modules/`** - Business logic modules (user, organization, dashboard, etc.)
- **`pkg/<provider>/`** - Provider implementations following consistent structure:
-  - `<name>.go` - Interface definition
-  - `config.go` - Configuration (implements `factory.Config`)
-  - `<implname><name>/provider.go` - Implementation
-  - `<name>test/` - Mock implementations for testing
-
-### Key Packages
- **`pkg/querier/`** - Query engine for telemetry data (logs, traces, metrics)
- **`pkg/telemetrystore/`** - ClickHouse telemetry storage interface
- **`pkg/sqlstore/`** - Relational database (SQLite/PostgreSQL) for metadata
- **`pkg/apiserver/`** - HTTP API server with OpenAPI integration
- **`pkg/alertmanager/`** - Alert management
- **`pkg/authn/`, `pkg/authz/`** - Authentication and authorization
- **`pkg/flagger/`** - Feature flags (OpenFeature-based)
- **`pkg/errors/`** - Structured error handling
-
-### Enterprise vs Community
- **`cmd/community/`** - Community edition entry point
- **`cmd/enterprise/`** - Enterprise edition entry point
- **`ee/`** - Enterprise-only features
-
-## Code Conventions
-
-### Error Handling
-Use the custom `pkg/errors` package instead of standard library:
-```go
-errors.New(typ, code, message)           // Instead of errors.New()
-errors.Newf(typ, code, message, args...) // Instead of fmt.Errorf()
-errors.Wrapf(err, typ, code, msg)        // Wrap with context
-```
-
-Define domain-specific error codes:
-```go
-var CodeThingNotFound = errors.MustNewCode("thing_not_found")
-```
-
-### HTTP Handlers
-Handlers are thin adapters in modules that:
-1. Extract auth context from request
-2. Decode request body using `binding` package
-3. Call module functions
-4. Return responses using `render` package
-
-Register routes in `pkg/apiserver/signozapiserver/` with `handler.New()` and `OpenAPIDef`.
-
-### SQL/Database
- Use Bun ORM via `sqlstore.BunDBCtx(ctx)`
- Star schema with `organizations` as central entity
- All tables have `id`, `created_at`, `updated_at`, `org_id` columns
- Write idempotent migrations in `pkg/sqlmigration/`
- No `ON CASCADE` deletes - handle in application logic
-
-### REST Endpoints
- Use plural resource names: `/v1/organizations`, `/v1/users`
- Use `me` for current user/org: `/v1/organizations/me/users`
- Follow RESTful conventions for CRUD operations
-
-### Linting Rules (from .golangci.yml)
- Don't use `errors` package - use `pkg/errors`
- Don't use `zap` logger - use `slog`
- Don't use `fmt.Errorf` or `fmt.Print*`
-
-## Testing
-
-### Unit Tests
- Run with race detector: `go test -race ./...`
- Provider mocks are in `<provider>test/` packages
-
-### Integration Tests
- Located in `tests/integration/`
- Use pytest with testcontainers
- Files prefixed with numbers for execution order (e.g., `01_database.py`)
- Always use `--reuse` flag during development
- Fixtures in `tests/integration/fixtures/`
--- a/.claude/skills/commit/SKILL.md
+++ b/.claude/skills/commit/SKILL.md
@@ -1,37 +0,0 @@
---
-name: commit
-description: Create a conventional commit with staged changes
-allowed-tools: Bash(git commit:*)
---
-
-# Create Conventional Commit
-
-Commit staged changes using conventional commit format: `type(scope): description`
-
-## Types
-
- `feat:` - New feature
- `fix:` - Bug fix
- `chore:` - Maintenance/refactor/tooling
- `test:` - Tests only
- `docs:` - Documentation
-
-## Process
-
-1. Review staged changes: `git diff --cached`
-2. Determine type, optional scope, and description (imperative, <70 chars)
-3. Commit using HEREDOC:
-   ```bash
-   git commit -m "$(cat <<'EOF'
-   type(scope): description
-   EOF
-   )"
-   ```
-4. Verify: `git log -1`
-
-## Notes
-
- Description: imperative mood, lowercase, no period
- Body: explain WHY, not WHAT (code shows what). Keep it concise and brief.
- Do not include co-authored by claude in commit message, we want ownership and accountability to remain with the human contributor.
- Do not automatically add files to stage unless asked to.
--- a/.claude/skills/raise-pr/SKILL.md
+++ b/.claude/skills/raise-pr/SKILL.md
@@ -1,55 +0,0 @@
---
-name: raise-pr
-description: Create a pull request with auto-filled template. Pass 'commit' to commit staged changes first.
-allowed-tools: Bash(gh:*, git:*), Read
-argument-hint: [commit?]
---
-
-# Raise Pull Request
-
-Create a PR with auto-filled template from commits after origin/main.
-
-## Arguments
-
- No argument: Create PR with existing commits
- `commit`: Commit staged changes first, then create PR
-
-## Process
-
-1. **If `$ARGUMENTS` is "commit"**: Review staged changes and commit with descriptive message
-    - Check for staged changes: `git diff --cached --stat`
-    - If changes exist:
-        - Review the changes: `git diff --cached`
-        - Use commit skill for making the commit, i.e. follow conventional commit practices
-        - Commit command: `git commit -m "message"`
-
-2. **Analyze commits since origin/main**:
-   - `git log origin/main..HEAD --pretty=format:"%s%n%b"` - get commit messages
-   - `git diff origin/main...HEAD --stat` - see changes
-
-3. **Read template**: `.github/pull_request_template.md`
-
-4. **Generate PR**:
-   - **Title**: Short (<70 chars), from commit messages or main change
-   - **Body**: Fill template sections based on commits/changes, keep these minimal and to the point:
-     - Summary (why/what/approach) - end with "Closes #<issue_number>" if issue number is available from branch name (git branch --show-current)
-     - Change Type checkboxes
-     - Bug Context (if applicable)
-     - Testing Strategy
-     - Risk Assessment
-     - Changelog (if user-facing)
-     - Checklist
-
-5. **Create PR**:
-   ```bash
-   git push -u origin $(git branch --show-current)
-   gh pr create --base main --title "..." --body "..."
-   gh pr view
-   ```
-
-## Notes
-
- Analyze ALL commits messages from origin/main to HEAD
- Fill template sections based on commit messages, look into code changes if messages doesn't have all the context.
- Leave template sections as they are if you can't determine the content
- Don't add the changes to git stage, only commit or push whatever user has already staged
--- a/.claude/skills/review/SKILL.md
+++ b/.claude/skills/review/SKILL.md
@@ -1,228 +0,0 @@
---
-name: review
-description: Review code changes for bugs, performance issues, and SigNoz convention compliance
-allowed-tools: Bash(git:*, gh:*), Read, Glob, Grep
---
-
-# Review Command
-
-Perform a thorough code review following SigNoz's coding conventions and contributing guidelines and for feature intent completion.
-
-## Usage
-
-Invoke this command to review code changes with actionable and concise feedback.
-
-## Process
-
-1. **Determine scope**:
-   - Ask user what to review if not specified:
-     - Current git diff (staged or unstaged)
-     - All changes since origin/main or a commit range
-
-2. **Gather context**:
-   ```bash
-   # For current changes
-   git diff --cached           # Staged changes
-   git diff                    # Unstaged changes
-
-   # For commit range
-   git diff origin/main...HEAD # All changes since main
-   
-   # for last commit only
-   git diff HEAD~1..HEAD
-   ```
-
-3. **Read all relevant files thoroughly**:
-   - Understand the context and purpose of changes
-
-4. **Review against SigNoz guidelines**:
-   - **Frontend**: Check [Frontend Guidelines](../../frontend/CONTRIBUTIONS.md)
-   - **Backend/Architecture**: Check [CLAUDE.md](../CLAUDE.md) for provider pattern, error handling, SQL, REST, and linting conventions
-   - **General**: Check [Contributing Guidelines](../../CONTRIBUTING.md)
-   - **Commits**: Verify [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)
-
-5. **Verify feature intent**:
-   - Read the PR description, commit message, or linked issue to understand *what* the change claims to do
-   - Trace the code path end-to-end to confirm the change actually achieves its stated goal
-   - Check that the happy path works as described
-   - Identify any scenarios where the feature silently does nothing or produces wrong results
-
-6. **Review for bug introduction**:
-   - **Regressions**: Does the change break existing behavior? Check callers of modified functions/interfaces
-   - **Edge cases**: Empty inputs, nil/undefined values, boundary conditions, concurrent access
-   - **Error paths**: Are all error cases handled? Can errors be swallowed silently?
-   - **State management**: Are state transitions correct? Can state become inconsistent?
-   - **Race conditions**: Shared mutable state, async operations, missing locks or guards
-   - **Type mismatches**: Unsafe casts, implicit conversions, `any` usage hiding real types
-
-7. **Review for performance implications**:
-   - **Backend**: N+1 queries, missing indexes, unbounded result sets, large allocations in hot paths, unnecessary DB round-trips
-   - **Frontend**: Unnecessary re-renders from inline objects/functions as props, missing memoization on expensive computations, large bundle imports that should be lazy-loaded, unthrottled event handlers
-   - **General**: O(n²) or worse algorithms on potentially large datasets, unnecessary network calls, missing pagination or limits
-
-8. **Provide actionable, concise feedback** in structured format
-
-## Review Checklist
-
-For coding conventions and style, refer to the linked guideline docs. This checklist focuses on **review-specific concerns** that guidelines alone don't catch.
-
-### Correctness & Intent
- [ ] Change achieves what the PR/commit/issue describes
- [ ] Happy path works end-to-end
- [ ] Edge cases handled (empty, nil, boundary, concurrent)
- [ ] Error paths don't swallow failures silently
- [ ] No regressions to existing callers of modified code
-
-### Security
- [ ] No exposed secrets, API keys, credentials
- [ ] No sensitive data in logs
- [ ] Input validation at system boundaries
- [ ] Authentication/authorization checked for new endpoints
- [ ] No SQL injection or XSS risks
-
-### Performance
- [ ] No N+1 queries or unbounded result sets
- [ ] No unnecessary re-renders (inline objects/functions as props, missing memoization)
- [ ] No large imports that should be lazy-loaded
- [ ] No O(n²) on potentially large datasets
- [ ] Pagination/limits present where needed
-
-### Testing
- [ ] Edge cases and error paths tested
- [ ] Tests are deterministic (no flakiness)
-
-## Output Format
-
-Provide feedback in this structured format:
-
-```markdown
-## Code Review
-
-**Scope**: [What was reviewed]
-**Overall**: [1-2 sentence summary and general sentiment]
-
---
-
-### 🚨 Critical Issues (Must Fix)
-
-1. **[Category]** `file:line`
-   **Problem**: [What's wrong]
-   **Why**: [Why it matters]
-   **Fix**: [Specific solution]
-   ```[language]
-   // Example fix if helpful
-   ```
-
-### ⚠️ Suggestions (Should Consider)
-
-1. **[Category]** `file:line`
-   **Issue**: [What could be improved]
-   **Suggestion**: [Concrete improvement]
-
---
-
-**References**:
- [Relevant guideline links]
-```
-
-## Review Categories
-
-Use these categories for issues:
-
- **Bug / Regression**: Logic errors, edge cases, race conditions, broken existing behavior
- **Feature Gap**: Change doesn't fully achieve its stated intent
- **Security Risk**: Authentication, authorization, data exposure, injection
- **Performance Issue**: Inefficient queries, unnecessary re-renders, memory leaks, unbounded data
- **Convention Violation**: Style, patterns, architectural guidelines (link to relevant guideline doc)
- **Code Quality**: Complexity, duplication, naming, type safety
- **Testing**: Missing tests, inadequate coverage, flaky tests
-
-## Example Review
-
-```markdown
-## Code Review
-
-**Scope**: Changes in `frontend/src/pages/TraceDetail/` (3 files, 245 additions)
-**Overall**: Good implementation of pagination feature. Found 2 critical issues and 3 suggestions.
-
---
-
-### 🚨 Critical Issues (Must Fix)
-
-1. **Security Risk** `TraceList.tsx:45`
-   **Problem**: API token exposed in client-side code
-   **Why**: Security vulnerability - tokens should never be in frontend
-   **Fix**: Move authentication to backend, use session-based auth
-
-2. **Performance Issue** `TraceList.tsx:89`
-   **Problem**: Inline function passed as prop causes unnecessary re-renders
-   **Why**: Violates frontend guideline, degrades performance with large lists
-   **Fix**:
-   ```typescript
-   const handleTraceClick = useCallback((traceId: string) => {
-     navigate(`/trace/${traceId}`);
-   }, [navigate]);
-   ```
-
-### ⚠️ Suggestions (Should Consider)
-
-1. **Code Quality** `TraceList.tsx:120-180`
-   **Issue**: Function exceeds 40-line guideline
-   **Suggestion**: Extract into smaller functions:
-   - `filterTracesByTimeRange()`
-   - `aggregateMetrics()`
-   - `renderChartData()`
-
-2. **Type Safety** `types.ts:23`
-   **Issue**: Using `any` for trace attributes
-   **Suggestion**: Define proper interface for TraceAttributes
-
-3. **Convention** `TraceList.tsx:12`
-   **Issue**: File imports not organized
-   **Suggestion**: Let simple-import-sort auto-organize (will happen on save)
-
---
-
-**References**:
- [Frontend Guidelines](../../frontend/CONTRIBUTIONS.md)
- [useCallback best practices](https://kentcdodds.com/blog/usememo-and-usecallback)
-```
-
-## Tone Guidelines
-
- **Be respectful**: Focus on code, not the person
- **Be specific**: Always reference exact file:line locations
- **Be concise**: Get to the point, avoid verbosity
- **Be actionable**: Every comment should have clear resolution path
- **Be educational**: Explain why something is an issue, link to guidelines
-
-## Priority Levels
-
-1. **Critical (🚨)**: Security, bugs, data corruption, crashes
-2. **Important (⚠️)**: Performance, maintainability, convention violations
-3. **Nice to have (💡)**: Style preferences, micro-optimizations
-
-## Important Notes
-
- **Reference specific guidelines** from docs when applicable
- **Provide code examples** for fixes when helpful
- **Ask questions** if code intent is unclear
- **Link to external resources** for educational value
- **Distinguish** must-fix from should-consider
- **Be concise** - reviewers value their time
-
-## Critical Rules
-
- **NEVER** be vague - always specify file and line number
- **NEVER** just point out problems - suggest solutions
- **NEVER** review without reading the actual code
- **ALWAYS** check against SigNoz's specific guidelines
- **ALWAYS** provide rationale for each comment
- **ALWAYS** be constructive and respectful
-
-## Reference Documents
-
- [Frontend Guidelines](../../frontend/CONTRIBUTIONS.md) - React, TypeScript, styling
- [Contributing Guidelines](../../CONTRIBUTING.md) - Workflow, commit conventions
- [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) - Commit format
- [CLAUDE.md](../CLAUDE.md) - Project architecture and conventions
--- a/pkg/modules/thirdpartyapi/translator.go
+++ b/pkg/modules/thirdpartyapi/translator.go
@@ -81,8 +81,10 @@ func FilterIntermediateColumns(result *qbtypes.QueryRangeResponse) *qbtypes.Quer
 		// Filter out columns for intermediate queries used only in formulas
 		filteredColumns := make([]*qbtypes.ColumnDescriptor, 0)
 		intermediateQueryNames := map[string]bool{
-			"error":      true,
-			"total_span": true,
+			"error":             true,
+			"total_span":        true,
+			"endpoints_current": true,
+			"endpoints_legacy":  true,
 		}

 		columnIndices := make([]int, 0)
@@ -296,15 +298,15 @@ func BuildDomainList(req *thirdpartyapitypes.ThirdPartyApiRequest) (*qbtypes.Que
 		return nil, err
 	}

-	queries := []qbtypes.QueryEnvelope{
-		buildEndpointsQuery(req),
+	queries := buildEndpointsQueries(req)
+	queries = append(queries,
 		buildLastSeenQuery(req),
 		buildRpsQuery(req),
 		buildErrorQuery(req),
 		buildTotalSpanQuery(req),
 		buildP99Query(req),
 		buildErrorRateFormula(),
-	}
+	)

 	return &qbtypes.QueryRangeRequest{
 		SchemaVersion: "v5",
@@ -346,20 +348,58 @@ func BuildDomainInfo(req *thirdpartyapitypes.ThirdPartyApiRequest) (*qbtypes.Que
 	}, nil
 }

-func buildEndpointsQuery(req *thirdpartyapitypes.ThirdPartyApiRequest) qbtypes.QueryEnvelope {
-	return qbtypes.QueryEnvelope{
+// buildEndpointsQueries returns queries for counting distinct URLs with semconv fallback.
+// It uses two queries with mutually exclusive filters:
+// - endpoints_current: count_distinct(url.full) WHERE url.full EXISTS
+// - endpoints_legacy: count_distinct(http.url) WHERE url.full NOT EXISTS
+// And a formula to combine them: endpoints_current + endpoints_legacy
+func buildEndpointsQueries(req *thirdpartyapitypes.ThirdPartyApiRequest) []qbtypes.QueryEnvelope {
+	// Query for current semconv (url.full)
+	currentFilter := buildBaseFilter(req.Filter)
+	currentFilter.Expression = fmt.Sprintf("(%s) AND %s EXISTS", currentFilter.Expression, urlPathKey)
+
+	endpointsCurrent := qbtypes.QueryEnvelope{
 		Type: qbtypes.QueryTypeBuilder,
 		Spec: qbtypes.QueryBuilderQuery[qbtypes.TraceAggregation]{
-			Name:         "endpoints",
+			Name:         "endpoints_current",
 			Signal:       telemetrytypes.SignalTraces,
 			StepInterval: qbtypes.Step{Duration: defaultStepInterval},
 			Aggregations: []qbtypes.TraceAggregation{
-				{Expression: "count_distinct(http.url)"},
+				{Expression: fmt.Sprintf("count_distinct(%s)", urlPathKey)},
 			},
-			Filter:  buildBaseFilter(req.Filter),
+			Filter:  currentFilter,
 			GroupBy: mergeGroupBy(dualSemconvGroupByKeys["server"], req.GroupBy),
 		},
 	}
+
+	// Query for legacy semconv (http.url) - only when url.full doesn't exist
+	legacyFilter := buildBaseFilter(req.Filter)
+	legacyFilter.Expression = fmt.Sprintf("(%s) AND %s NOT EXISTS", legacyFilter.Expression, urlPathKey)
+
+	endpointsLegacy := qbtypes.QueryEnvelope{
+		Type: qbtypes.QueryTypeBuilder,
+		Spec: qbtypes.QueryBuilderQuery[qbtypes.TraceAggregation]{
+			Name:         "endpoints_legacy",
+			Signal:       telemetrytypes.SignalTraces,
+			StepInterval: qbtypes.Step{Duration: defaultStepInterval},
+			Aggregations: []qbtypes.TraceAggregation{
+				{Expression: fmt.Sprintf("count_distinct(%s)", urlPathKeyLegacy)},
+			},
+			Filter:  legacyFilter,
+			GroupBy: mergeGroupBy(dualSemconvGroupByKeys["server"], req.GroupBy),
+		},
+	}
+
+	// Formula to combine both counts
+	endpointsFormula := qbtypes.QueryEnvelope{
+		Type: qbtypes.QueryTypeFormula,
+		Spec: qbtypes.QueryBuilderFormula{
+			Name:       "endpoints",
+			Expression: "endpoints_current + endpoints_legacy",
+		},
+	}
+
+	return []qbtypes.QueryEnvelope{endpointsCurrent, endpointsLegacy, endpointsFormula}
 }

 func buildLastSeenQuery(req *thirdpartyapitypes.ThirdPartyApiRequest) qbtypes.QueryEnvelope {