Compare commits

..

5 Commits

Author SHA1 Message Date
Miguel Valdes
0c597edf8f feat(deprecation): improved instructions 2026-06-22 13:38:58 -05:00
Miguel Valdes
b205587fdf feat(deprecation): update readme, no docker compose reference 2026-06-22 13:27:27 -05:00
Miguel Valdes
3043f24ea7 feat(deprecation): remove legacy files and update migration guide 2026-06-22 12:15:19 -05:00
Miguel Valdes
9bc86339f5 feat(signoz): include kind in migration doc 2026-06-18 12:43:22 -05:00
Miguel Valdes
d96cbed139 feat(signoz): deprecate install script 2026-06-18 12:08:07 -05:00
54 changed files with 223 additions and 3901 deletions

119
deploy/MIGRATION.md Normal file
View File

@@ -0,0 +1,119 @@
# Migrating from Docker Compose to Foundry
This guide walks you through migrating an existing SigNoz deployment running via
Docker Compose to [Foundry](https://signoz.io/docs/install/docker/).
## Overview
SigNoz has developed a CLI to make installation and deployment much easier. Foundry is the evolution of how SigNoz should be installed and managed going forward.
The install script is now deprecated and will no longer receive updates.
To stay up to date on new installation platforms and patterns, please refer to [Foundry](https://github.com/SigNoz/foundry).
Two `foundryctl` commands are used throughout this guide:
- **`forge`** — generates deployment manifests from your `casting.yaml`. It does not touch running containers, so it is safe to re-run while you iterate.
- **`cast`** — applies the generated manifests: it creates and starts the containers (and pulls new images).
## Prerequisites
- [ ] Install Foundry - `curl -fsSL https://signoz.io/foundry.sh | bash`
## Migration Steps
> [!WARNING]
> **Before proceeding, back up both:**
> - **Your docker volumes** — these hold your data.
> - **Your existing `docker-compose.yaml` (and any config it references)** — keep a copy somewhere safe. The compose manifests are no longer distributed by SigNoz, so this backup is your only way to roll back to your previous setup.
1. Make a note of the volume names used by your existing deployment for the following components:
- ClickHouse
- SigNoz
- ZooKeeper
> If you used the docker compose file we provided, the volumes will be `signoz-clickhouse`, `signoz-sqlite`, and `signoz-zookeeper-1`.
2. Generate your `casting.yaml`. Based on internal testing, the following casting should generate the manifests that mimic the legacy docker compose setup (compare against your backed-up `docker-compose.yaml`). Once created, run `foundryctl forge -f casting.yaml`.
> [!NOTE]
> The casting contains `patches` and `config` overrides to ensure that the newly generated manifests use the existing volumes and configurations.
> [!WARNING]
> If your deployment had more than 1 shard or replica, you will need to adjust your manifest volumes accordingly. Additionally, if you had specific container images, you need to include them in your casting.
> [!IMPORTANT]
> The `replica` and `shard` macros below are placeholders. Replace them with the values from your existing ClickHouse configuration (check the `macros` section of your current ClickHouse config, e.g. `config.xml`/`metrika.xml`), otherwise the generated manifests will not match your existing data.
```yaml
apiVersion: v1alpha1
kind: Installation
metadata:
name: signoz
spec:
deployment:
flavor: compose
mode: docker
metastore:
kind: sqlite
telemetrykeeper:
kind: zookeeper
telemetrystore:
spec:
config:
data:
config-0-0.yaml: |
macros:
replica: "example01-01-1" # replace with your existing ClickHouse replica macro (see legacy configuration files for reference)
shard: "01" # replace with your existing ClickHouse shard macro (see legacy configuration files for reference)
patches:
- target: "deployment/compose.yaml"
operations:
- op: replace
path: /volumes/dev-telemetrykeeper-0-data/name
value: signoz-zookeeper-1
- op: replace
path: /volumes/dev-telemetrystore-0-0-data/name
value: signoz-clickhouse
- op: replace
path: /volumes/dev-metastore-sqlite-0-data/name
value: signoz-sqlite
- op: add
path: /services/dev-telemetrykeeper-zookeeper-0/user
value: root
```
> [!NOTE]
> The `user: root` patch on the ZooKeeper service is required so the container can read/write the data in your reused ZooKeeper volume, which was created with `root`-owned files by the legacy compose setup. Without it, ZooKeeper may fail to start with permission errors.
3. Validate the manifests in `pours/deployment`. Pay special attention to `compose.yaml` — it should mimic the legacy manifest and the configuration files needed for `clickhouse`. **Do note that these are now in YAML instead of XML.**
If you had custom settings for features like SMTP or ingestion processors/receivers, you will need to include those in your casting file.
4. Execute a `docker compose down`. **Do not** include parameters such as `--volumes` (or `-v`), as it will wipe the volumes we need to maintain and reuse to avoid data loss. **NOTE: this will generate downtime so please plan accordingly**.
5. Validate the SigNoz containers are down with `docker ps -a`. Multiple containers cannot bind the same volume.
6. Run `foundryctl cast -f casting.yaml`. This will recreate the containers based on the spec. This process will download new container images.
> [!NOTE]
> When `cast` is run, the migration container will execute its migrations.
## Verifying the Migration
- SigNoz containers will be up and running.
- Log in to the SigNoz UI and verify that data is present.
- Validate that your data ingestion is receiving data.
- Review the logs from both ClickHouse and ZooKeeper; no errors should be present.
## Rolling Back
Because step 4 brought the legacy stack down *without* `-v`, your original volumes
are untouched and still hold your data. To roll back:
- Stop and remove the containers created by Foundry (`docker compose down`, again without `-v`).
- Confirm the containers are gone with `docker ps -a` so nothing else is bound to the volumes.
- Reapply your original docker compose file (`docker compose up -d`). It will reattach to the
existing volumes and restore your prior state.
## Troubleshooting
- Please reach out to our community on [Slack](https://signoz.io/slack).
## References
- [SigNoz Docker installation docs](https://signoz.io/docs/install/docker/)
- [SigNoz documentation](https://signoz.io/docs)
- [Foundry](https://github.com/SigNoz/foundry)

View File

@@ -3,77 +3,18 @@
Check that you have cloned [signoz/signoz](https://github.com/signoz/signoz)
and currently are in `signoz/deploy` folder.
## Docker
## Installation
If you don't have docker set up, please follow [this guide](https://docs.docker.com/engine/install/)
to set up docker before proceeding with the next steps.
> **Note:** The `install.sh` script and the `docker-compose` manifests have been deprecated.
### Using Install Script
SigNoz now installs and runs through [Foundry](https://signoz.io/docs/install/docker/).
Now run the following command to install:
> **Already running SigNoz via Docker Compose?** See the [Migration Guide](./MIGRATION.md) to transition your existing deployment to Foundry.
```sh
./install.sh
```
Please follow the latest installation instructions at [signoz.io/docs/install/docker](https://signoz.io/docs/install/docker/).
Foundry has support for **different platforms and architectures**, please review the project documentation for more details.
### Using Docker Compose
If you don't have docker compose set up, please follow [this guide](https://docs.docker.com/compose/install/)
to set up docker compose before proceeding with the next steps.
```sh
cd deploy/docker
docker compose up -d
```
Open http://localhost:8080 in your favourite browser.
To start collecting logs and metrics from your infrastructure, run the following command:
```sh
cd generator/infra
docker compose up -d
```
To start generating sample traces, run the following command:
```sh
cd generator/hotrod
docker compose up -d
```
In a couple of minutes, you should see the data generated from hotrod in SigNoz UI.
For more details, please refer to the [SigNoz documentation](https://signoz.io/docs/install/docker/).
## Docker Swarm
To install SigNoz using Docker Swarm, run the following command:
```sh
cd deploy/docker-swarm
docker stack deploy -c docker-compose.yaml signoz
```
Open http://localhost:8080 in your favourite browser.
To start collecting logs and metrics from your infrastructure, run the following command:
```sh
cd generator/infra
docker stack deploy -c docker-compose.yaml infra
```
To start generating sample traces, run the following command:
```sh
cd generator/hotrod
docker stack deploy -c docker-compose.yaml hotrod
```
In a couple of minutes, you should see the data generated from hotrod in SigNoz UI.
For more details, please refer to the [SigNoz documentation](https://signoz.io/docs/install/docker-swarm/).
> You will need Docker installed first. If you don't have it set up, follow [this guide](https://docs.docker.com/engine/install/).
## Uninstall/Troubleshoot?

View File

@@ -1,75 +0,0 @@
<?xml version="1.0"?>
<clickhouse>
<!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
Optional. If you don't use replicated tables, you could omit that.
See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
-->
<zookeeper>
<node index="1">
<host>zookeeper-1</host>
<port>2181</port>
</node>
<node index="2">
<host>zookeeper-2</host>
<port>2181</port>
</node>
<node index="3">
<host>zookeeper-3</host>
<port>2181</port>
</node>
</zookeeper>
<!-- Configuration of clusters that could be used in Distributed tables.
https://clickhouse.com/docs/en/operations/table_engines/distributed/
-->
<remote_servers>
<cluster>
<!-- Inter-server per-cluster secret for Distributed queries
default: no secret (no authentication will be performed)
If set, then Distributed queries will be validated on shards, so at least:
- such cluster should exist on the shard,
- such cluster should have the same secret.
And also (and which is more important), the initial_user will
be used as current user for the query.
Right now the protocol is pretty simple and it only takes into account:
- cluster name
- query
Also it will be nice if the following will be implemented:
- source hostname (see interserver_http_host), but then it will depends from DNS,
it can use IP address instead, but then the you need to get correct on the initiator node.
- target hostname / ip address (same notes as for source hostname)
- time-based security tokens
-->
<!-- <secret></secret> -->
<shard>
<!-- Optional. Whether to write data to just one of the replicas. Default: false (write data to all replicas). -->
<!-- <internal_replication>false</internal_replication> -->
<!-- Optional. Shard weight when writing data. Default: 1. -->
<!-- <weight>1</weight> -->
<replica>
<host>clickhouse</host>
<port>9000</port>
<!-- Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority). -->
<!-- <priority>1</priority> -->
</replica>
</shard>
<shard>
<replica>
<host>clickhouse-2</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>clickhouse-3</host>
<port>9000</port>
</replica>
</shard>
</cluster>
</remote_servers>
</clickhouse>

View File

@@ -1,75 +0,0 @@
<?xml version="1.0"?>
<clickhouse>
<!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
Optional. If you don't use replicated tables, you could omit that.
See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
-->
<zookeeper>
<node index="1">
<host>zookeeper-1</host>
<port>2181</port>
</node>
<!-- <node index="2">
<host>zookeeper-2</host>
<port>2181</port>
</node>
<node index="3">
<host>zookeeper-3</host>
<port>2181</port>
</node> -->
</zookeeper>
<!-- Configuration of clusters that could be used in Distributed tables.
https://clickhouse.com/docs/en/operations/table_engines/distributed/
-->
<remote_servers>
<cluster>
<!-- Inter-server per-cluster secret for Distributed queries
default: no secret (no authentication will be performed)
If set, then Distributed queries will be validated on shards, so at least:
- such cluster should exist on the shard,
- such cluster should have the same secret.
And also (and which is more important), the initial_user will
be used as current user for the query.
Right now the protocol is pretty simple and it only takes into account:
- cluster name
- query
Also it will be nice if the following will be implemented:
- source hostname (see interserver_http_host), but then it will depends from DNS,
it can use IP address instead, but then the you need to get correct on the initiator node.
- target hostname / ip address (same notes as for source hostname)
- time-based security tokens
-->
<!-- <secret></secret> -->
<shard>
<!-- Optional. Whether to write data to just one of the replicas. Default: false (write data to all replicas). -->
<!-- <internal_replication>false</internal_replication> -->
<!-- Optional. Shard weight when writing data. Default: 1. -->
<!-- <weight>1</weight> -->
<replica>
<host>clickhouse</host>
<port>9000</port>
<!-- Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority). -->
<!-- <priority>1</priority> -->
</replica>
</shard>
<!-- <shard>
<replica>
<host>clickhouse-2</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>clickhouse-3</host>
<port>9000</port>
</replica>
</shard> -->
</cluster>
</remote_servers>
</clickhouse>

File diff suppressed because it is too large Load Diff

View File

@@ -1,21 +0,0 @@
<functions>
<function>
<type>executable</type>
<name>histogramQuantile</name>
<return_type>Float64</return_type>
<argument>
<type>Array(Float64)</type>
<name>buckets</name>
</argument>
<argument>
<type>Array(Float64)</type>
<name>counts</name>
</argument>
<argument>
<type>Float64</type>
<name>quantile</name>
</argument>
<format>CSV</format>
<command>./histogramQuantile</command>
</function>
</functions>

View File

@@ -1,41 +0,0 @@
<?xml version="1.0"?>
<clickhouse>
<storage_configuration>
<disks>
<default>
<keep_free_space_bytes>10485760</keep_free_space_bytes>
</default>
<s3>
<type>s3</type>
<!-- For S3 cold storage,
if region is us-east-1, endpoint can be https://<bucket-name>.s3.amazonaws.com
if region is not us-east-1, endpoint should be https://<bucket-name>.s3-<region>.amazonaws.com
For GCS cold storage,
endpoint should be https://storage.googleapis.com/<bucket-name>/data/
-->
<endpoint>https://BUCKET-NAME.s3-REGION-NAME.amazonaws.com/data/</endpoint>
<access_key_id>ACCESS-KEY-ID</access_key_id>
<secret_access_key>SECRET-ACCESS-KEY</secret_access_key>
<!-- In case of S3, uncomment the below configuration in case you want to read
AWS credentials from the Environment variables if they exist. -->
<!-- <use_environment_credentials>true</use_environment_credentials> -->
<!-- In case of GCS, uncomment the below configuration, since GCS does
not support batch deletion and result in error messages in logs. -->
<!-- <support_batch_delete>false</support_batch_delete> -->
</s3>
</disks>
<policies>
<tiered>
<volumes>
<default>
<disk>default</disk>
</default>
<s3>
<disk>s3</disk>
<perform_ttl_move_on_insert>0</perform_ttl_move_on_insert>
</s3>
</volumes>
</tiered>
</policies>
</storage_configuration>
</clickhouse>

View File

@@ -1,123 +0,0 @@
<?xml version="1.0"?>
<clickhouse>
<!-- See also the files in users.d directory where the settings can be overridden. -->
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
<max_memory_usage>10000000000</max_memory_usage>
<!-- How to choose between replicas during distributed query processing.
random - choose random replica from set of replicas with minimum number of errors
nearest_hostname - from set of replicas with minimum number of errors, choose replica
with minimum number of different symbols between replica's hostname and local hostname
(Hamming distance).
in_order - first live replica is chosen in specified order.
first_or_random - if first replica one has higher number of errors, pick a random one from replicas with minimum number of errors.
-->
<load_balancing>random</load_balancing>
</default>
<!-- Profile that allows only read queries. -->
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
<!-- Users and ACL. -->
<users>
<!-- If user name was not specified, 'default' user is used. -->
<default>
<!-- See also the files in users.d directory where the password can be overridden.
Password could be specified in plaintext or in SHA256 (in hex format).
If you want to specify password in plaintext (not recommended), place it in 'password' element.
Example: <password>qwerty</password>.
Password could be empty.
If you want to specify SHA256, place it in 'password_sha256_hex' element.
Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
Restrictions of SHA256: impossibility to connect to ClickHouse using MySQL JS client (as of July 2019).
If you want to specify double SHA1, place it in 'password_double_sha1_hex' element.
Example: <password_double_sha1_hex>e395796d6546b1b65db9d665cd43f0e858dd4303</password_double_sha1_hex>
If you want to specify a previously defined LDAP server (see 'ldap_servers' in the main config) for authentication,
place its name in 'server' element inside 'ldap' element.
Example: <ldap><server>my_ldap_server</server></ldap>
If you want to authenticate the user via Kerberos (assuming Kerberos is enabled, see 'kerberos' in the main config),
place 'kerberos' element instead of 'password' (and similar) elements.
The name part of the canonical principal name of the initiator must match the user name for authentication to succeed.
You can also place 'realm' element inside 'kerberos' element to further restrict authentication to only those requests
whose initiator's realm matches it.
Example: <kerberos />
Example: <kerberos><realm>EXAMPLE.COM</realm></kerberos>
How to generate decent password:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
In first line will be password and in second - corresponding SHA256.
How to generate double SHA1:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha1sum | tr -d '-' | xxd -r -p | sha1sum | tr -d '-'
In first line will be password and in second - corresponding double SHA1.
-->
<password></password>
<!-- List of networks with open access.
To open access from everywhere, specify:
<ip>::/0</ip>
To open access only from localhost, specify:
<ip>::1</ip>
<ip>127.0.0.1</ip>
Each element of list has one of the following forms:
<ip> IP-address or network mask. Examples: 213.180.204.3 or 10.0.0.1/8 or 10.0.0.1/255.255.255.0
2a02:6b8::3 or 2a02:6b8::3/64 or 2a02:6b8::3/ffff:ffff:ffff:ffff::.
<host> Hostname. Example: server01.clickhouse.com.
To check access, DNS query is performed, and all received addresses compared to peer address.
<host_regexp> Regular expression for host names. Example, ^server\d\d-\d\d-\d\.clickhouse\.com$
To check access, DNS PTR query is performed for peer address and then regexp is applied.
Then, for result of PTR query, another DNS query is performed and all received addresses compared to peer address.
Strongly recommended that regexp is ends with $
All results of DNS requests are cached till server restart.
-->
<networks>
<ip>::/0</ip>
</networks>
<!-- Settings profile for user. -->
<profile>default</profile>
<!-- Quota for user. -->
<quota>default</quota>
<!-- User can create other users and grant rights to them. -->
<!-- <access_management>1</access_management> -->
</default>
</users>
<!-- Quotas. -->
<quotas>
<!-- Name of quota. -->
<default>
<!-- Limits for time interval. You could specify many intervals with different limits. -->
<interval>
<!-- Length of interval. -->
<duration>3600</duration>
<!-- No limits. Just calculate resource usage for time interval. -->
<queries>0</queries>
<errors>0</errors>
<result_rows>0</result_rows>
<read_rows>0</read_rows>
<execution_time>0</execution_time>
</interval>
</default>
</quotas>
</clickhouse>

View File

@@ -1,16 +0,0 @@
from locust import HttpUser, task, between
class UserTasks(HttpUser):
wait_time = between(5, 15)
@task
def rachel(self):
self.client.get("/dispatch?customer=123&nonse=0.6308392664170006")
@task
def trom(self):
self.client.get("/dispatch?customer=392&nonse=0.015296363321630757")
@task
def japanese(self):
self.client.get("/dispatch?customer=731&nonse=0.8022286220408668")
@task
def coffee(self):
self.client.get("/dispatch?customer=567&nonse=0.0022220379420636593")

View File

@@ -1 +0,0 @@
server_endpoint: ws://signoz:4320/v1/opamp

View File

@@ -1,25 +0,0 @@
# my global config
global:
scrape_interval: 5s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
- alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files: []
# - "first_rules.yml"
# - "second_rules.yml"
# - 'alerts.yml'
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs: []
remote_read:
- url: tcp://clickhouse:9000/signoz_metrics

View File

@@ -1,288 +0,0 @@
version: "3"
x-common: &common
networks:
- signoz-net
deploy:
restart_policy:
condition: on-failure
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
tty: true
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
- zookeeper-1
- zookeeper-2
- zookeeper-3
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
- clickhouse
- clickhouse-2
- clickhouse-3
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
deploy:
restart_policy:
condition: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- ./clickhouse-setup/data/zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ZOO_SERVERS=0.0.0.0:2888:3888,zookeeper-2:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-2:
!!merge <<: *zookeeper-defaults
# ports:
# - "2182:2181"
# - "2889:2888"
# - "3889:3888"
volumes:
- ./clickhouse-setup/data/zookeeper-2:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=2
- ZOO_SERVERS=zookeeper-1:2888:3888,0.0.0.0:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-3:
!!merge <<: *zookeeper-defaults
# ports:
# - "2183:2181"
# - "2890:2888"
# - "3890:3888"
volumes:
- ./clickhouse-setup/data/zookeeper-3:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=3
- ZOO_SERVERS=zookeeper-1:2888:3888,zookeeper-2:2888:3888,0.0.0.0:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
# TODO: needed for schema-migrator to work, remove this redundancy once we have a better solution
hostname: clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.ha.xml
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ./clickhouse-setup/data/clickhouse/:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-2:
!!merge <<: *clickhouse-defaults
hostname: clickhouse-2
# ports:
# - "9001:9000"
# - "8124:8123"
# - "9182:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.ha.xml
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ./clickhouse-setup/data/clickhouse-2/:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-3:
!!merge <<: *clickhouse-defaults
hostname: clickhouse-3
# ports:
# - "9002:9000"
# - "8125:8123"
# - "9183:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.ha.xml
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ./clickhouse-setup/data/clickhouse-3/:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:v0.129.0
ports:
- "8080:8080" # signoz port
# - "6060:6060" # pprof port
volumes:
- ./clickhouse-setup/data/signoz/:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.5
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
configs:
- source: otel-collector-config
target: /etc/otel-collector-config.yaml
- source: otel-manager-config
target: /etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name={{.Node.Hostname}},os.type={{.Node.Platform.OS}}
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
deploy:
replicas: 3
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.5
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
clickhouse-2:
name: signoz-clickhouse-2
clickhouse-3:
name: signoz-clickhouse-3
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1
zookeeper-2:
name: signoz-zookeeper-2
zookeeper-3:
name: signoz-zookeeper-3
configs:
clickhouse-config:
file: ../common/clickhouse/config.xml
clickhouse-users:
file: ../common/clickhouse/users.xml
clickhouse-custom-function:
file: ../common/clickhouse/custom-function.xml
clickhouse-cluster:
file: ../common/clickhouse/cluster.ha.xml
otel-collector-config:
file: ./otel-collector-config.yaml
otel-manager-config:
file: ../common/signoz/otel-collector-opamp-config.yaml

View File

@@ -1,206 +0,0 @@
version: "3"
x-common: &common
networks:
- signoz-net
deploy:
restart_policy:
condition: on-failure
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
tty: true
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
- init-clickhouse
- zookeeper-1
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
deploy:
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
- clickhouse
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
deploy:
restart_policy:
condition: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
# TODO: needed for clickhouse TCP connectio
hostname: clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
configs:
- source: clickhouse-config
target: /etc/clickhouse-server/config.xml
- source: clickhouse-users
target: /etc/clickhouse-server/users.xml
- source: clickhouse-custom-function
target: /etc/clickhouse-server/custom-function.xml
- source: clickhouse-cluster
target: /etc/clickhouse-server/config.d/cluster.xml
volumes:
- clickhouse:/var/lib/clickhouse/
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:v0.129.0
ports:
- "8080:8080" # signoz port
volumes:
- sqlite:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.5
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
configs:
- source: otel-collector-config
target: /etc/otel-collector-config.yaml
- source: otel-manager-config
target: /etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name={{.Node.Hostname}},os.type={{.Node.Platform.OS}}
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
deploy:
replicas: 3
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:v0.144.5
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1
configs:
clickhouse-config:
file: ../common/clickhouse/config.xml
clickhouse-users:
file: ../common/clickhouse/users.xml
clickhouse-custom-function:
file: ../common/clickhouse/custom-function.xml
clickhouse-cluster:
file: ../common/clickhouse/cluster.xml
otel-collector-config:
file: ./otel-collector-config.yaml
otel-manager-config:
file: ../common/signoz/otel-collector-opamp-config.yaml

View File

@@ -1,118 +0,0 @@
connectors:
signozmeter:
metrics_flush_interval: 1h
dimensions:
- name: service.name
- name: deployment.environment
- name: host.name
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
prometheus:
config:
global:
scrape_interval: 60s
scrape_configs:
- job_name: otel-collector
static_configs:
- targets:
- localhost:8888
labels:
job_name: otel-collector
processors:
batch:
send_batch_size: 10000
send_batch_max_size: 11000
timeout: 10s
batch/meter:
send_batch_max_size: 25000
send_batch_size: 20000
timeout: 1s
resourcedetection:
# Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
detectors: [env, system]
timeout: 2s
signozspanmetrics/delta:
metrics_exporter: signozclickhousemetrics
metrics_flush_interval: 60s
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
dimensions_cache_size: 100000
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
enable_exp_histogram: true
dimensions:
- name: service.namespace
default: default
- name: deployment.environment
default: default
# This is added to ensure the uniqueness of the timeseries
# Otherwise, identical timeseries produced by multiple replicas of
# collectors result in incorrect APM metrics
- name: signoz.collector.id
- name: service.version
- name: browser.platform
- name: browser.mobile
- name: k8s.cluster.name
- name: k8s.node.name
- name: k8s.namespace.name
- name: host.name
- name: host.type
- name: container.name
extensions:
health_check:
endpoint: 0.0.0.0:13133
pprof:
endpoint: 0.0.0.0:1777
exporters:
clickhousetraces:
datasource: tcp://clickhouse:9000/signoz_traces
low_cardinal_exception_grouping: ${env:LOW_CARDINAL_EXCEPTION_GROUPING}
use_new_schema: true
signozclickhousemetrics:
dsn: tcp://clickhouse:9000/signoz_metrics
clickhouselogsexporter:
dsn: tcp://clickhouse:9000/signoz_logs
timeout: 10s
use_new_schema: true
signozclickhousemeter:
dsn: tcp://clickhouse:9000/signoz_meter
timeout: 45s
sending_queue:
enabled: false
metadataexporter:
cache:
provider: in_memory
dsn: tcp://clickhouse:9000/signoz_metadata
enabled: true
timeout: 45s
service:
telemetry:
logs:
encoding: json
extensions:
- health_check
- pprof
pipelines:
traces:
receivers: [otlp]
processors: [signozspanmetrics/delta, batch]
exporters: [clickhousetraces, metadataexporter, signozmeter]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
metrics/prometheus:
receivers: [prometheus]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter, metadataexporter, signozmeter]
metrics/meter:
receivers: [signozmeter]
processors: [batch/meter]
exporters: [signozclickhousemeter]

View File

@@ -1 +0,0 @@
COMPOSE_PROJECT_NAME=signoz

View File

@@ -1,3 +0,0 @@
This data directory is deprecated and will be removed in the future.
Please use the migration script under `scripts/volume-migration` to migrate data from bind mounts to Docker volumes.
The script also renames the project name to `signoz` and the network name to `signoz-net` (if not already in place).

View File

@@ -1,2 +0,0 @@
This directory is deprecated and will be removed in the future.
Please use the new directory for Clickhouse setup scripts: `scripts/clickhouse` instead.

View File

@@ -1,265 +0,0 @@
version: "3"
x-common: &common
networks:
- signoz-net
restart: unless-stopped
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
# addding non LTS version due to this fix https://github.com/ClickHouse/ClickHouse/commit/32caf8716352f45c1b617274c7508c86b7d1afab
image: clickhouse/clickhouse-server:25.5.6
tty: true
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
init-clickhouse:
condition: service_completed_successfully
zookeeper-1:
condition: service_healthy
zookeeper-2:
condition: service_healthy
zookeeper-3:
condition: service_healthy
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
clickhouse:
condition: service_healthy
clickhouse-2:
condition: service_healthy
clickhouse-3:
condition: service_healthy
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
container_name: signoz-init-clickhouse
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
restart: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-1
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ZOO_SERVERS=0.0.0.0:2888:3888,zookeeper-2:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-2:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-2
# ports:
# - "2182:2181"
# - "2889:2888"
# - "3889:3888"
volumes:
- zookeeper-2:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=2
- ZOO_SERVERS=zookeeper-1:2888:3888,0.0.0.0:2888:3888,zookeeper-3:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
zookeeper-3:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-3
# ports:
# - "2183:2181"
# - "2890:2888"
# - "3890:3888"
volumes:
- zookeeper-3:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=3
- ZOO_SERVERS=zookeeper-1:2888:3888,zookeeper-2:2888:3888,0.0.0.0:2888:3888
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.ha.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-2:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse-2
# ports:
# - "9001:9000"
# - "8124:8123"
# - "9182:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.ha.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse-2:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
clickhouse-3:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse-3
# ports:
# - "9002:9000"
# - "8125:8123"
# - "9183:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.ha.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse-3:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:${VERSION:-v0.129.0}
container_name: signoz
ports:
- "8080:8080" # signoz port
volumes:
- sqlite:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.5}
container_name: signoz-otel-collector
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
- ../common/signoz/otel-collector-opamp-config.yaml:/etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.5}
container_name: signoz-telemetrystore-migrator
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
restart: on-failure
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
clickhouse-2:
name: signoz-clickhouse-2
clickhouse-3:
name: signoz-clickhouse-3
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1
zookeeper-2:
name: signoz-zookeeper-2
zookeeper-3:
name: signoz-zookeeper-3

View File

@@ -1,185 +0,0 @@
version: "3"
x-common: &common
networks:
- signoz-net
restart: unless-stopped
logging:
options:
max-size: 50m
max-file: "3"
x-clickhouse-defaults: &clickhouse-defaults
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
tty: true
labels:
signoz.io/scrape: "true"
signoz.io/port: "9363"
signoz.io/path: "/metrics"
depends_on:
init-clickhouse:
condition: service_completed_successfully
zookeeper-1:
condition: service_healthy
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- 0.0.0.0:8123/ping
interval: 30s
timeout: 5s
retries: 3
ulimits:
nproc: 65535
nofile:
soft: 262144
hard: 262144
environment:
- CLICKHOUSE_SKIP_USER_SETUP=1
x-zookeeper-defaults: &zookeeper-defaults
!!merge <<: *common
image: signoz/zookeeper:3.7.1
user: root
labels:
signoz.io/scrape: "true"
signoz.io/port: "9141"
signoz.io/path: "/metrics"
healthcheck:
test:
- CMD-SHELL
- curl -s -m 2 http://localhost:8080/commands/ruok | grep error | grep null
interval: 30s
timeout: 5s
retries: 3
x-db-depend: &db-depend
!!merge <<: *common
depends_on:
clickhouse:
condition: service_healthy
services:
init-clickhouse:
!!merge <<: *common
image: clickhouse/clickhouse-server:25.5.6
container_name: signoz-init-clickhouse
command:
- bash
- -c
- |
version="v0.0.1"
node_os=$$(uname -s | tr '[:upper:]' '[:lower:]')
node_arch=$$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
echo "Fetching histogram-binary for $${node_os}/$${node_arch}"
cd /tmp
wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F$${version}/histogram-quantile_$${node_os}_$${node_arch}.tar.gz"
tar -xvzf histogram-quantile.tar.gz
mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
restart: on-failure
volumes:
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
zookeeper-1:
!!merge <<: *zookeeper-defaults
container_name: signoz-zookeeper-1
# ports:
# - "2181:2181"
# - "2888:2888"
# - "3888:3888"
volumes:
- zookeeper-1:/bitnami/zookeeper
environment:
- ZOO_SERVER_ID=1
- ALLOW_ANONYMOUS_LOGIN=yes
- ZOO_AUTOPURGE_INTERVAL=1
- ZOO_ENABLE_PROMETHEUS_METRICS=yes
- ZOO_PROMETHEUS_METRICS_PORT_NUMBER=9141
clickhouse:
!!merge <<: *clickhouse-defaults
container_name: signoz-clickhouse
# ports:
# - "9000:9000"
# - "8123:8123"
# - "9181:9181"
volumes:
- ../common/clickhouse/config.xml:/etc/clickhouse-server/config.xml
- ../common/clickhouse/users.xml:/etc/clickhouse-server/users.xml
- ../common/clickhouse/custom-function.xml:/etc/clickhouse-server/custom-function.xml
- ../common/clickhouse/user_scripts:/var/lib/clickhouse/user_scripts/
- ../common/clickhouse/cluster.xml:/etc/clickhouse-server/config.d/cluster.xml
- clickhouse:/var/lib/clickhouse/
# - ../common/clickhouse/storage.xml:/etc/clickhouse-server/config.d/storage.xml
signoz:
!!merge <<: *db-depend
image: signoz/signoz:${VERSION:-v0.129.0}
container_name: signoz
ports:
- "8080:8080" # signoz port
volumes:
- sqlite:/var/lib/signoz/
environment:
- SIGNOZ_ALERTMANAGER_PROVIDER=signoz
- SIGNOZ_TELEMETRYSTORE_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_SQLSTORE_SQLITE_PATH=/var/lib/signoz/signoz.db
- SIGNOZ_TOKENIZER_JWT_SECRET=secret
healthcheck:
test:
- CMD
- wget
- --spider
- -q
- localhost:8080/api/v1/health
interval: 30s
timeout: 5s
retries: 3
otel-collector:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.5}
container_name: signoz-otel-collector
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate sync check &&
/signoz-otel-collector --config=/etc/otel-collector-config.yaml --manager-config=/etc/manager-config.yaml --copy-path=/var/tmp/collector-config.yaml
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
- ../common/signoz/otel-collector-opamp-config.yaml:/etc/manager-config.yaml
environment:
- OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
- LOW_CARDINAL_EXCEPTION_GROUPING=false
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
ports:
# - "1777:1777" # pprof extension
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
signoz-telemetrystore-migrator:
!!merge <<: *db-depend
image: signoz/signoz-otel-collector:${OTELCOL_TAG:-v0.144.5}
container_name: signoz-telemetrystore-migrator
environment:
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_DSN=tcp://clickhouse:9000
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_CLUSTER=cluster
- SIGNOZ_OTEL_COLLECTOR_CLICKHOUSE_REPLICATION=true
- SIGNOZ_OTEL_COLLECTOR_TIMEOUT=10m
entrypoint:
- /bin/sh
command:
- -c
- |
/signoz-otel-collector migrate bootstrap &&
/signoz-otel-collector migrate sync up &&
/signoz-otel-collector migrate async up
restart: on-failure
networks:
signoz-net:
name: signoz-net
volumes:
clickhouse:
name: signoz-clickhouse
sqlite:
name: signoz-sqlite
zookeeper-1:
name: signoz-zookeeper-1

View File

@@ -1,118 +0,0 @@
connectors:
signozmeter:
metrics_flush_interval: 1h
dimensions:
- name: service.name
- name: deployment.environment
- name: host.name
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
prometheus:
config:
global:
scrape_interval: 60s
scrape_configs:
- job_name: otel-collector
static_configs:
- targets:
- localhost:8888
labels:
job_name: otel-collector
processors:
batch:
send_batch_size: 10000
send_batch_max_size: 11000
timeout: 10s
batch/meter:
send_batch_max_size: 25000
send_batch_size: 20000
timeout: 1s
resourcedetection:
# Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
detectors: [env, system]
timeout: 2s
signozspanmetrics/delta:
metrics_exporter: signozclickhousemetrics
metrics_flush_interval: 60s
latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
dimensions_cache_size: 100000
aggregation_temporality: AGGREGATION_TEMPORALITY_DELTA
enable_exp_histogram: true
dimensions:
- name: service.namespace
default: default
- name: deployment.environment
default: default
# This is added to ensure the uniqueness of the timeseries
# Otherwise, identical timeseries produced by multiple replicas of
# collectors result in incorrect APM metrics
- name: signoz.collector.id
- name: service.version
- name: browser.platform
- name: browser.mobile
- name: k8s.cluster.name
- name: k8s.node.name
- name: k8s.namespace.name
- name: host.name
- name: host.type
- name: container.name
extensions:
health_check:
endpoint: 0.0.0.0:13133
pprof:
endpoint: 0.0.0.0:1777
exporters:
clickhousetraces:
datasource: tcp://clickhouse:9000/signoz_traces
low_cardinal_exception_grouping: ${env:LOW_CARDINAL_EXCEPTION_GROUPING}
use_new_schema: true
signozclickhousemetrics:
dsn: tcp://clickhouse:9000/signoz_metrics
clickhouselogsexporter:
dsn: tcp://clickhouse:9000/signoz_logs
timeout: 10s
use_new_schema: true
signozclickhousemeter:
dsn: tcp://clickhouse:9000/signoz_meter
timeout: 45s
sending_queue:
enabled: false
metadataexporter:
cache:
provider: in_memory
dsn: tcp://clickhouse:9000/signoz_metadata
enabled: true
timeout: 45s
service:
telemetry:
logs:
encoding: json
extensions:
- health_check
- pprof
pipelines:
traces:
receivers: [otlp]
processors: [signozspanmetrics/delta, batch]
exporters: [clickhousetraces, metadataexporter, signozmeter]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
metrics/prometheus:
receivers: [prometheus]
processors: [batch]
exporters: [signozclickhousemetrics, metadataexporter, signozmeter]
logs:
receivers: [otlp]
processors: [batch]
exporters: [clickhouselogsexporter, metadataexporter, signozmeter]
metrics/meter:
receivers: [signozmeter]
processors: [batch/meter]
exporters: [signozclickhousemeter]

View File

@@ -1,563 +0,0 @@
#!/bin/bash
set -o errexit
# Variables
BASE_DIR="$(dirname "$(readlink -f "$0")")"
DOCKER_STANDALONE_DIR="docker"
DOCKER_SWARM_DIR="docker-swarm" # TODO: Add docker swarm support
# Regular Colors
Black='\033[0;30m' # Black
Red='\[\e[0;31m\]' # Red
Green='\033[0;32m' # Green
Yellow='\033[0;33m' # Yellow
Blue='\033[0;34m' # Blue
Purple='\033[0;35m' # Purple
Cyan='\033[0;36m' # Cyan
White='\033[0;37m' # White
NC='\033[0m' # No Color
is_command_present() {
type "$1" >/dev/null 2>&1
}
# Check whether 'wget' command exists.
has_wget() {
has_cmd wget
}
# Check whether 'curl' command exists.
has_curl() {
has_cmd curl
}
# Check whether the given command exists.
has_cmd() {
command -v "$1" > /dev/null 2>&1
}
# Check if docker compose plugin is present
has_docker_compose_plugin() {
docker compose version > /dev/null 2>&1
}
is_mac() {
[[ $OSTYPE == darwin* ]]
}
is_arm64(){
[[ `uname -m` == 'arm64' || `uname -m` == 'aarch64' ]]
}
check_os() {
if is_mac; then
package_manager="brew"
desired_os=1
os="Mac"
return
fi
if is_arm64; then
arch="arm64"
arch_official="aarch64"
else
arch="amd64"
arch_official="x86_64"
fi
platform=$(uname -s | tr '[:upper:]' '[:lower:]')
os_name="$(cat /etc/*-release | awk -F= '$1 == "NAME" { gsub(/"/, ""); print $2; exit }')"
case "$os_name" in
Ubuntu*|Pop!_OS)
desired_os=1
os="ubuntu"
package_manager="apt-get"
;;
Amazon\ Linux*)
desired_os=1
os="amazon linux"
package_manager="yum"
;;
Debian*)
desired_os=1
os="debian"
package_manager="apt-get"
;;
Linux\ Mint*)
desired_os=1
os="linux mint"
package_manager="apt-get"
;;
Red\ Hat*)
desired_os=1
os="rhel"
package_manager="yum"
;;
CentOS*)
desired_os=1
os="centos"
package_manager="yum"
;;
Rocky*)
desired_os=1
os="centos"
package_manager="yum"
;;
SLES*)
desired_os=1
os="sles"
package_manager="zypper"
;;
openSUSE*)
desired_os=1
os="opensuse"
package_manager="zypper"
;;
*)
desired_os=0
os="Not Found: $os_name"
esac
}
# This function checks if the relevant ports required by SigNoz are available or not
# The script should error out in case they aren't available
check_ports_occupied() {
local port_check_output
local ports_pattern="8080|4317"
if is_mac; then
port_check_output="$(netstat -anp tcp | awk '$6 == "LISTEN" && $4 ~ /^.*\.('"$ports_pattern"')$/')"
elif is_command_present ss; then
# The `ss` command seems to be a better/faster version of `netstat`, but is not available on all Linux
# distributions by default. Other distributions have `ss` but no `netstat`. So, we try for `ss` first, then
# fallback to `netstat`.
port_check_output="$(ss --all --numeric --tcp | awk '$1 == "LISTEN" && $4 ~ /^.*:('"$ports_pattern"')$/')"
elif is_command_present netstat; then
port_check_output="$(netstat --all --numeric --tcp | awk '$6 == "LISTEN" && $4 ~ /^.*:('"$ports_pattern"')$/')"
fi
if [[ -n $port_check_output ]]; then
send_event "port_not_available"
echo "+++++++++++ ERROR ++++++++++++++++++++++"
echo "SigNoz requires ports 8080 & 4317 to be open. Please shut down any other service(s) that may be running on these ports."
echo "You can run SigNoz on another port following this guide https://signoz.io/docs/install/troubleshooting/"
echo "++++++++++++++++++++++++++++++++++++++++"
echo ""
exit 1
fi
}
install_docker() {
echo "++++++++++++++++++++++++"
echo "Setting up docker repos"
if [[ $package_manager == apt-get ]]; then
apt_cmd="$sudo_cmd apt-get --yes --quiet"
$apt_cmd update
$apt_cmd install software-properties-common gnupg-agent
curl -fsSL "https://download.docker.com/linux/$os/gpg" | $sudo_cmd apt-key add -
$sudo_cmd add-apt-repository \
"deb [arch=$arch] https://download.docker.com/linux/$os $(lsb_release -cs) stable"
$apt_cmd update
echo "Installing docker"
$apt_cmd install docker-ce docker-ce-cli containerd.io
elif [[ $package_manager == zypper ]]; then
zypper_cmd="$sudo_cmd zypper --quiet --no-gpg-checks --non-interactive"
echo "Installing docker"
if [[ $os == sles ]]; then
os_sp="$(cat /etc/*-release | awk -F= '$1 == "VERSION_ID" { gsub(/"/, ""); print $2; exit }')"
os_arch="$(uname -i)"
SUSEConnect -p sle-module-containers/$os_sp/$os_arch -r ''
fi
$zypper_cmd install docker docker-runc containerd
$sudo_cmd systemctl enable docker.service
elif [[ $package_manager == yum && $os == 'amazon linux' ]]; then
echo
echo "Amazon Linux detected ... "
echo
# yum install docker
# service docker start
$sudo_cmd yum install -y amazon-linux-extras
$sudo_cmd amazon-linux-extras enable docker
$sudo_cmd yum install -y docker
else
yum_cmd="$sudo_cmd yum --assumeyes --quiet"
$yum_cmd install yum-utils
$sudo_cmd yum-config-manager --add-repo https://download.docker.com/linux/$os/docker-ce.repo
echo "Installing docker"
$yum_cmd install docker-ce docker-ce-cli containerd.io
fi
}
compose_version () {
local compose_version
compose_version="$(curl -s https://api.github.com/repos/docker/compose/releases/latest | grep 'tag_name' | cut -d\" -f4)"
echo "${compose_version:-v2.18.1}"
}
install_docker_compose() {
if [[ $package_manager == "apt-get" || $package_manager == "zypper" || $package_manager == "yum" ]]; then
if [[ ! -f /usr/bin/docker-compose ]];then
echo "++++++++++++++++++++++++"
echo "Installing docker-compose"
compose_url="https://github.com/docker/compose/releases/download/$(compose_version)/docker-compose-$platform-$arch_official"
echo "Downloading docker-compose from $compose_url"
$sudo_cmd curl -L "$compose_url" -o /usr/local/bin/docker-compose
$sudo_cmd chmod +x /usr/local/bin/docker-compose
$sudo_cmd ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
echo "docker-compose installed!"
echo ""
fi
else
send_event "docker_compose_not_found"
echo "+++++++++++ IMPORTANT READ ++++++++++++++++++++++"
echo "docker-compose not found! Please install docker-compose first and then continue with this installation."
echo "Refer https://docs.docker.com/compose/install/ for installing docker-compose."
echo "+++++++++++++++++++++++++++++++++++++++++++++++++"
exit 1
fi
}
start_docker() {
echo -e "🐳 Starting Docker ...\n"
if [[ $os == "Mac" ]]; then
open --background -a Docker && while ! docker system info > /dev/null 2>&1; do sleep 1; done
else
if ! $sudo_cmd systemctl is-active docker.service > /dev/null; then
echo "Starting docker service"
$sudo_cmd systemctl start docker.service
fi
if [[ -z $sudo_cmd ]]; then
if ! docker ps > /dev/null && true; then
request_sudo
fi
fi
fi
}
wait_for_containers_start() {
local timeout=$1
# The while loop is important because for-loops don't work for dynamic values
while [[ $timeout -gt 0 ]]; do
status_code="$(curl -s -o /dev/null -w "%{http_code}" "http://localhost:8080/api/v1/health?live=1" || true)"
if [[ status_code -eq 200 ]]; then
break
else
echo -ne "Waiting for all containers to start. This check will timeout in $timeout seconds ...\r\c"
fi
((timeout--))
sleep 1
done
echo ""
}
bye() { # Prints a friendly good bye message and exits the script.
# Switch back to the original directory
popd > /dev/null 2>&1
if [[ "$?" -ne 0 ]]; then
set +o errexit
echo "🔴 The containers didn't seem to start correctly. Please run the following command to check containers that may have errored out:"
echo ""
echo -e "cd ${DOCKER_STANDALONE_DIR}"
echo -e "$sudo_cmd $docker_compose_cmd ps -a"
echo "Please read our troubleshooting guide https://signoz.io/docs/install/troubleshooting/"
echo "or reach us for support in #help channel in our Slack Community https://signoz.io/slack"
echo "++++++++++++++++++++++++++++++++++++++++"
if [[ $email == "" ]]; then
echo -e "\n📨 Please share your email to receive support with the installation"
read -rp 'Email: ' email
while [[ $email == "" ]]
do
read -rp 'Email: ' email
done
fi
send_event "installation_support"
echo ""
echo -e "\nWe will reach out to you at the email provided shortly, Exiting for now. Bye! 👋 \n"
exit 0
fi
}
request_sudo() {
if hash sudo 2>/dev/null; then
echo -e "\n\n🙇 We will need sudo access to complete the installation."
if (( $EUID != 0 )); then
sudo_cmd="sudo"
echo -e "Please enter your sudo password, if prompted."
if ! $sudo_cmd -l | grep -e "NOPASSWD: ALL" > /dev/null && ! $sudo_cmd -v; then
echo "Need sudo privileges to proceed with the installation."
exit 1;
fi
echo -e "Got it! Thanks!! 🙏\n"
echo -e "Okay! We will bring up the SigNoz cluster from here 🚀\n"
fi
fi
}
echo ""
echo -e "👋 Thank you for trying out SigNoz! "
echo ""
sudo_cmd=""
docker_compose_cmd=""
# Check sudo permissions
if (( $EUID != 0 )); then
echo "🟡 Running installer with non-sudo permissions."
echo " In case of any failure or prompt, please consider running the script with sudo privileges."
echo ""
else
sudo_cmd="sudo"
fi
# Checking OS and assigning package manager
desired_os=0
os=""
email=""
echo -e "🌏 Detecting your OS ...\n"
check_os
# Obtain unique installation id
# sysinfo="$(uname -a)"
# if [[ $? -ne 0 ]]; then
# uuid="$(uuidgen)"
# uuid="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
# sysinfo="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
# fi
if ! sysinfo="$(uname -a)"; then
uuid="$(uuidgen)"
uuid="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
sysinfo="${uuid:-$(cat /proc/sys/kernel/random/uuid)}"
fi
digest_cmd=""
if hash shasum 2>/dev/null; then
digest_cmd="shasum -a 256"
elif hash sha256sum 2>/dev/null; then
digest_cmd="sha256sum"
elif hash openssl 2>/dev/null; then
digest_cmd="openssl dgst -sha256"
fi
if [[ -z $digest_cmd ]]; then
SIGNOZ_INSTALLATION_ID="$sysinfo"
else
SIGNOZ_INSTALLATION_ID=$(echo "$sysinfo" | $digest_cmd | grep -E -o '[a-zA-Z0-9]{64}')
fi
setup_type='clickhouse'
# Run bye if failure happens
trap bye EXIT
URL="https://api.segment.io/v1/track"
HEADER_1="Content-Type: application/json"
HEADER_2="Authorization: Basic OWtScko3b1BDR1BFSkxGNlFqTVBMdDVibGpGaFJRQnI="
send_event() {
error=""
case "$1" in
'install_started')
event="Installation Started"
;;
'os_not_supported')
event="Installation Error"
error="OS Not Supported"
;;
'docker_not_installed')
event="Installation Error"
error="Docker not installed"
;;
'docker_compose_not_found')
event="Installation Error"
event="Docker Compose not found"
;;
'port_not_available')
event="Installation Error"
error="port not available"
;;
'installation_error_checks')
event="Installation Error - Checks"
error="Containers not started"
others='"data": "some_checks",'
;;
'installation_support')
event="Installation Support"
others='"email": "'"$email"'",'
;;
'installation_success')
event="Installation Success"
;;
'identify_successful_installation')
event="Identify Successful Installation"
others='"email": "'"$email"'",'
;;
*)
print_error "unknown event type: $1"
exit 1
;;
esac
if [[ "$error" != "" ]]; then
error='"error": "'"$error"'", '
fi
DATA='{ "anonymousId": "'"$SIGNOZ_INSTALLATION_ID"'", "event": "'"$event"'", "properties": { "os": "'"$os"'", '"$error $others"' "setup_type": "'"$setup_type"'" } }'
if has_curl; then
curl -sfL -d "$DATA" --header "$HEADER_1" --header "$HEADER_2" "$URL" > /dev/null 2>&1
elif has_wget; then
wget -q --post-data="$DATA" --header "$HEADER_1" --header "$HEADER_2" "$URL" > /dev/null 2>&1
fi
}
send_event "install_started"
if [[ $desired_os -eq 0 ]]; then
send_event "os_not_supported"
fi
# Check is Docker daemon is installed and available. If not, the install & start Docker for Linux machines. We cannot automatically install Docker Desktop on Mac OS
if ! is_command_present docker; then
if [[ $package_manager == "apt-get" || $package_manager == "zypper" || $package_manager == "yum" ]]; then
request_sudo
install_docker
# enable docker without sudo from next reboot
sudo usermod -aG docker "${USER}"
elif is_mac; then
echo ""
echo "+++++++++++ IMPORTANT READ ++++++++++++++++++++++"
echo "Docker Desktop must be installed manually on Mac OS to proceed. Docker can only be installed automatically on Ubuntu / openSUSE / SLES / Redhat / Cent OS"
echo "https://docs.docker.com/docker-for-mac/install/"
echo "++++++++++++++++++++++++++++++++++++++++++++++++"
send_event "docker_not_installed"
exit 1
else
echo ""
echo "+++++++++++ IMPORTANT READ ++++++++++++++++++++++"
echo "Docker must be installed manually on your machine to proceed. Docker can only be installed automatically on Ubuntu / openSUSE / SLES / Redhat / Cent OS"
echo "https://docs.docker.com/get-docker/"
echo "++++++++++++++++++++++++++++++++++++++++++++++++"
send_event "docker_not_installed"
exit 1
fi
fi
if has_docker_compose_plugin; then
echo "docker compose plugin is present, using it"
docker_compose_cmd="docker compose"
# Install docker-compose
else
docker_compose_cmd="docker-compose"
if ! is_command_present docker-compose; then
request_sudo
install_docker_compose
fi
fi
start_docker
# Switch to the Docker Standalone directory
pushd "${BASE_DIR}/${DOCKER_STANDALONE_DIR}" > /dev/null 2>&1
# check for open ports, if signoz is not installed
if is_command_present docker-compose; then
if $sudo_cmd $docker_compose_cmd ps | grep "signoz" | grep -q "healthy" > /dev/null 2>&1; then
echo "SigNoz already installed, skipping the occupied ports check"
else
check_ports_occupied
fi
fi
echo ""
echo -e "\n🟡 Pulling the latest container images for SigNoz.\n"
$sudo_cmd $docker_compose_cmd pull
echo ""
echo "🟡 Starting the SigNoz containers. It may take a few minutes ..."
echo
# The $docker_compose_cmd command does some nasty stuff for the `--detach` functionality. So we add a `|| true` so that the
# script doesn't exit because this command looks like it failed to do it's thing.
$sudo_cmd $docker_compose_cmd up --detach --remove-orphans || true
wait_for_containers_start 60
echo ""
if [[ $status_code -ne 200 ]]; then
echo "+++++++++++ ERROR ++++++++++++++++++++++"
echo "🔴 The containers didn't seem to start correctly. Please run the following command to check containers that may have errored out:"
echo ""
echo "cd ${DOCKER_STANDALONE_DIR}"
echo "$sudo_cmd $docker_compose_cmd ps -a"
echo ""
echo "Try bringing down the containers and retrying the installation"
echo "cd ${DOCKER_STANDALONE_DIR}"
echo "$sudo_cmd $docker_compose_cmd down -v"
echo ""
echo "Please read our troubleshooting guide https://signoz.io/docs/install/troubleshooting/"
echo "or reach us on SigNoz for support https://signoz.io/slack"
echo "++++++++++++++++++++++++++++++++++++++++"
send_event "installation_error_checks"
exit 1
else
send_event "installation_success"
echo "++++++++++++++++++ SUCCESS ++++++++++++++++++++++"
echo ""
echo "🟢 Your installation is complete!"
echo ""
echo -e "🟢 SigNoz is running on http://localhost:8080"
echo ""
echo " By default, retention period is set to 15 days for logs and traces, and 30 days for metrics."
echo -e "To change this, navigate to the General tab on the Settings page of SigNoz UI. For more details, refer to https://signoz.io/docs/userguide/retention-period \n"
echo " To bring down SigNoz and clean volumes:"
echo ""
echo "cd ${DOCKER_STANDALONE_DIR}"
echo "$sudo_cmd $docker_compose_cmd down -v"
echo ""
echo "+++++++++++++++++++++++++++++++++++++++++++++++++"
echo ""
echo "👉 Need help in Getting Started?"
echo -e "Join us on Slack https://signoz.io/slack"
echo ""
echo -e "\n📨 Please share your email to receive support & updates about SigNoz!"
read -rp 'Email: ' email
while [[ $email == "" ]]
do
read -rp 'Email: ' email
done
send_event "identify_successful_installation"
fi
echo -e "\n🙏 Thank you!\n"

View File

@@ -3,16 +3,15 @@ package flagger
import "github.com/SigNoz/signoz/pkg/types/featuretypes"
var (
FeatureUseSpanMetrics = featuretypes.MustNewName("use_span_metrics")
FeatureKafkaSpanEval = featuretypes.MustNewName("kafka_span_eval")
FeatureHideRootUser = featuretypes.MustNewName("hide_root_user")
FeatureGetMetersFromZeus = featuretypes.MustNewName("get_meters_from_zeus")
FeaturePutMetersInZeus = featuretypes.MustNewName("put_meters_in_zeus")
FeatureUseMeterReporter = featuretypes.MustNewName("use_meter_reporter")
FeatureUseJSONBody = featuretypes.MustNewName("use_json_body")
FeatureUseFineGrainedAuthz = featuretypes.MustNewName("use_fine_grained_authz")
FeatureUseDashboardV2 = featuretypes.MustNewName("use_dashboard_v2")
FeatureEnableMetricsReduction = featuretypes.MustNewName("enable_metrics_reduction")
FeatureUseSpanMetrics = featuretypes.MustNewName("use_span_metrics")
FeatureKafkaSpanEval = featuretypes.MustNewName("kafka_span_eval")
FeatureHideRootUser = featuretypes.MustNewName("hide_root_user")
FeatureGetMetersFromZeus = featuretypes.MustNewName("get_meters_from_zeus")
FeaturePutMetersInZeus = featuretypes.MustNewName("put_meters_in_zeus")
FeatureUseMeterReporter = featuretypes.MustNewName("use_meter_reporter")
FeatureUseJSONBody = featuretypes.MustNewName("use_json_body")
FeatureUseFineGrainedAuthz = featuretypes.MustNewName("use_fine_grained_authz")
FeatureUseDashboardV2 = featuretypes.MustNewName("use_dashboard_v2")
)
func MustNewRegistry() featuretypes.Registry {
@@ -89,14 +88,6 @@ func MustNewRegistry() featuretypes.Registry {
DefaultVariant: featuretypes.MustNewName("disabled"),
Variants: featuretypes.NewBooleanVariants(),
},
&featuretypes.Feature{
Name: FeatureEnableMetricsReduction,
Kind: featuretypes.KindBoolean,
Stage: featuretypes.StageExperimental,
Description: "Controls whether metrics cardinality reduction (buffer/reduced tables) is read by the querier",
DefaultVariant: featuretypes.MustNewName("disabled"),
Variants: featuretypes.NewBooleanVariants(),
},
)
if err != nil {
panic(err)

View File

@@ -341,12 +341,12 @@ func alignedMetricWindow(startMs, endMs int64) (
}
tsAdjustedStartMs, _, distributedTSTable, localTSTable := telemetrymetrics.WhichTSTableToUse(
samplesAdjustedStartMs, flooredEndMs, false, nil,
samplesAdjustedStartMs, flooredEndMs, nil,
)
distributedSamplesTable, localSamplesTable := telemetrymetrics.WhichSamplesTableToUse(
samplesAdjustedStartMs, flooredEndMs,
metrictypes.UnspecifiedType, metrictypes.TimeAggregationUnspecified, false, nil,
metrictypes.UnspecifiedType, metrictypes.TimeAggregationUnspecified, nil,
)
return samplesAdjustedStartMs, flooredEndMs, tsAdjustedStartMs, distributedTSTable, localTSTable, distributedSamplesTable, localSamplesTable

View File

@@ -141,7 +141,7 @@ func (m *module) listMetrics(ctx context.Context, orgID valuer.UUID, params *met
sb.Select("DISTINCT metric_name")
if params.Start != nil && params.End != nil {
start, end, distributedTsTable, _ := telemetrymetrics.WhichTSTableToUse(uint64(*params.Start), uint64(*params.End), false, nil)
start, end, distributedTsTable, _ := telemetrymetrics.WhichTSTableToUse(uint64(*params.Start), uint64(*params.End), nil)
sb.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, distributedTsTable))
sb.Where(sb.Between("unix_milli", start, end))
} else {
@@ -527,7 +527,7 @@ func (m *module) InspectMetrics(
return nil, err
}
tsStart, _, tsTable, _ := telemetrymetrics.WhichTSTableToUse(start, end, false, nil)
tsStart, _, tsTable, _ := telemetrymetrics.WhichTSTableToUse(start, end, nil)
tsSb := sqlbuilder.NewSelectBuilder()
tsSb.Select("fingerprint", "labels")
tsSb.From(fmt.Sprintf("%s.%s", telemetrymetrics.DBName, tsTable))
@@ -971,8 +971,8 @@ func (m *module) fetchMetricsStatsWithSamples(
}
}
start, end, distributedTsTable, localTsTable := telemetrymetrics.WhichTSTableToUse(uint64(req.Start), uint64(req.End), false, nil)
distributedSamplesTable, _ := telemetrymetrics.WhichSamplesTableToUse(uint64(req.Start), uint64(req.End), metrictypes.UnspecifiedType, metrictypes.TimeAggregationUnspecified, false, nil)
start, end, distributedTsTable, localTsTable := telemetrymetrics.WhichTSTableToUse(uint64(req.Start), uint64(req.End), nil)
distributedSamplesTable, _ := telemetrymetrics.WhichSamplesTableToUse(uint64(req.Start), uint64(req.End), metrictypes.UnspecifiedType, metrictypes.TimeAggregationUnspecified, nil)
countExp := telemetrymetrics.CountExpressionForSamplesTable(distributedSamplesTable)
// Timeseries counts per metric
@@ -1100,7 +1100,7 @@ func (m *module) computeTimeseriesTreemap(ctx context.Context, req *metricsexplo
}
}
start, end, distributedTsTable, _ := telemetrymetrics.WhichTSTableToUse(uint64(req.Start), uint64(req.End), false, nil)
start, end, distributedTsTable, _ := telemetrymetrics.WhichTSTableToUse(uint64(req.Start), uint64(req.End), nil)
totalTSBuilder := sqlbuilder.NewSelectBuilder()
totalTSBuilder.Select("uniq(fingerprint) AS total_time_series")
@@ -1176,8 +1176,8 @@ func (m *module) computeSamplesTreemap(ctx context.Context, req *metricsexplorer
}
}
start, end, distributedTsTable, localTsTable := telemetrymetrics.WhichTSTableToUse(uint64(req.Start), uint64(req.End), false, nil)
distributedSamplesTable, _ := telemetrymetrics.WhichSamplesTableToUse(uint64(req.Start), uint64(req.End), metrictypes.UnspecifiedType, metrictypes.TimeAggregationUnspecified, false, nil)
start, end, distributedTsTable, localTsTable := telemetrymetrics.WhichTSTableToUse(uint64(req.Start), uint64(req.End), nil)
distributedSamplesTable, _ := telemetrymetrics.WhichSamplesTableToUse(uint64(req.Start), uint64(req.End), metrictypes.UnspecifiedType, metrictypes.TimeAggregationUnspecified, nil)
countExp := telemetrymetrics.CountExpressionForSamplesTable(distributedSamplesTable)
candidateLimit := req.Limit + 50

View File

@@ -91,22 +91,13 @@ func (q *builderQuery[T]) Fingerprint() string {
if a.ComparisonSpaceAggregationParam != nil {
spaceAggParamStr = a.ComparisonSpaceAggregationParam.StringValue()
}
part := fmt.Sprintf("%s:%s:%s:%s:%s",
aggParts = append(aggParts, fmt.Sprintf("%s:%s:%s:%s:%s",
a.MetricName,
a.Temporality.StringValue(),
a.TimeAggregation.StringValue(),
a.SpaceAggregation.StringValue(),
spaceAggParamStr,
)
if a.Reduced {
oneDay := uint64(24 * time.Hour.Milliseconds())
route := "reduced"
if q.toMS-q.fromMS < oneDay && q.fromMS >= uint64(time.Now().UnixMilli())-oneDay {
route = "buffer"
}
part += ":" + route
}
aggParts = append(aggParts, part)
))
}
}
parts = append(parts, fmt.Sprintf("aggs=[%s]", strings.Join(aggParts, ",")))

View File

@@ -119,7 +119,7 @@ func (q *querier) QueryRange(ctx context.Context, orgID valuer.UUID, req *qbtype
queries := make(map[string]qbtypes.Query)
steps := make(map[string]qbtypes.Step)
missingMetricQueries, metricWarnings, err := q.resolveMetricMetadata(ctx, orgID, req.CompositeQuery.Queries, req.Start, req.End)
missingMetricQueries, metricWarnings, err := q.resolveMetricMetadata(ctx, req.CompositeQuery.Queries, req.Start, req.End)
if err != nil {
return nil, err
}
@@ -304,7 +304,7 @@ func (q *querier) populateQBEvent(event *qbtypes.QBEvent, queries []qbtypes.Quer
// resolved: never-seen metrics and dormant metrics (seen but no data in
// the query window).
// - err: Internal when a metadata fetch fails.
func (q *querier) resolveMetricMetadata(ctx context.Context, orgID valuer.UUID, queries []qbtypes.QueryEnvelope, start, end uint64) (missingMetricQueries []string, metricWarnings []string, err error) {
func (q *querier) resolveMetricMetadata(ctx context.Context, queries []qbtypes.QueryEnvelope, start, end uint64) (missingMetricQueries []string, metricWarnings []string, err error) {
metricNames := make([]string, 0)
for idx := range queries {
if queries[idx].Type != qbtypes.QueryTypeBuilder {
@@ -325,7 +325,7 @@ func (q *querier) resolveMetricMetadata(ctx context.Context, orgID valuer.UUID,
return nil, nil, nil
}
metricTemporality, metricTypes, reducedMetricsSet, err := q.metadataStore.FetchTemporalityAndTypeMulti(ctx, orgID, start, end, metricNames...)
metricTemporality, metricTypes, err := q.metadataStore.FetchTemporalityAndTypeMulti(ctx, start, end, metricNames...)
if err != nil {
q.logger.WarnContext(ctx, "failed to fetch metric temporality", errors.Attr(err), slog.Any("metrics", metricNames))
return nil, nil, errors.NewInternalf(errors.CodeInternal, "failed to fetch metrics temporality")
@@ -362,9 +362,6 @@ func (q *querier) resolveMetricMetadata(ctx context.Context, orgID valuer.UUID,
if err := spec.Aggregations[i].ValidateForType(); err != nil {
return nil, nil, err
}
if reducedMetricsSet[spec.Aggregations[i].MetricName] {
spec.Aggregations[i].Reduced = true
}
presentAggregations = append(presentAggregations, spec.Aggregations[i])
}
if len(presentAggregations) == 0 {

View File

@@ -2136,12 +2136,12 @@ func (t *telemetryMetaStore) GetAllValues(ctx context.Context, fieldValueSelecto
return values, complete, nil
}
func (t *telemetryMetaStore) FetchTemporality(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricName string) (metrictypes.Temporality, error) {
func (t *telemetryMetaStore) FetchTemporality(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricName string) (metrictypes.Temporality, error) {
if metricName == "" {
return metrictypes.Unknown, errors.Newf(errors.TypeInternal, errors.CodeInternal, "metric name cannot be empty")
}
temporalityMap, err := t.FetchTemporalityMulti(ctx, orgID, queryTimeRangeStartTs, queryTimeRangeEndTs, metricName)
temporalityMap, err := t.FetchTemporalityMulti(ctx, queryTimeRangeStartTs, queryTimeRangeEndTs, metricName)
if err != nil {
return metrictypes.Unknown, err
}
@@ -2154,27 +2154,25 @@ func (t *telemetryMetaStore) FetchTemporality(ctx context.Context, orgID valuer.
return temporality, nil
}
func (t *telemetryMetaStore) FetchTemporalityMulti(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, error) {
temporalities, _, _, err := t.FetchTemporalityAndTypeMulti(ctx, orgID, queryTimeRangeStartTs, queryTimeRangeEndTs, metricNames...)
func (t *telemetryMetaStore) FetchTemporalityMulti(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, error) {
temporalities, _, err := t.FetchTemporalityAndTypeMulti(ctx, queryTimeRangeStartTs, queryTimeRangeEndTs, metricNames...)
return temporalities, err
}
func (t *telemetryMetaStore) FetchTemporalityAndTypeMulti(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, map[string]bool, error) {
func (t *telemetryMetaStore) FetchTemporalityAndTypeMulti(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, error) {
if len(metricNames) == 0 {
return make(map[string]metrictypes.Temporality), make(map[string]metrictypes.Type), make(map[string]bool), nil
return make(map[string]metrictypes.Temporality), make(map[string]metrictypes.Type), nil
}
reductionEnabled := t.fl.BooleanOrEmpty(ctx, flagger.FeatureEnableMetricsReduction, featuretypes.NewFlaggerEvaluationContext(orgID))
temporalities := make(map[string]metrictypes.Temporality)
types := make(map[string]metrictypes.Type)
metricsTemporality, metricTypes, reduced, err := t.fetchMetricsTemporalityAndType(ctx, queryTimeRangeStartTs, queryTimeRangeEndTs, reductionEnabled, metricNames...)
metricsTemporality, metricTypes, err := t.fetchMetricsTemporalityAndType(ctx, queryTimeRangeStartTs, queryTimeRangeEndTs, metricNames...)
if err != nil {
return nil, nil, nil, err
return nil, nil, err
}
meterMetricsTemporality, meterMetricsTypes, err := t.fetchMeterSourceMetricsTemporalityAndType(ctx, metricNames...)
if err != nil {
return nil, nil, nil, err
return nil, nil, err
}
// For metrics not found in the database, set to Unknown
@@ -2199,10 +2197,10 @@ func (t *telemetryMetaStore) FetchTemporalityAndTypeMulti(ctx context.Context, o
}
}
return temporalities, types, reduced, nil
return temporalities, types, nil
}
func (t *telemetryMetaStore) fetchMetricsTemporalityAndType(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, reductionEnabled bool, metricNames ...string) (map[string][]metrictypes.Temporality, map[string]metrictypes.Type, map[string]bool, error) {
func (t *telemetryMetaStore) fetchMetricsTemporalityAndType(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string][]metrictypes.Temporality, map[string]metrictypes.Type, error) {
ctx = ctxtypes.NewContextWithCommentVals(ctx, map[string]string{
instrumentationtypes.TelemetrySignal: telemetrytypes.SignalMetrics.StringValue(),
instrumentationtypes.CodeNamespace: "metadata",
@@ -2210,58 +2208,48 @@ func (t *telemetryMetaStore) fetchMetricsTemporalityAndType(ctx context.Context,
})
temporalities := make(map[string][]metrictypes.Temporality)
types := make(map[string]metrictypes.Type)
reduced := make(map[string]bool)
adjustedStartTs, adjustedEndTs, tsTableName, _ := telemetrymetrics.WhichTSTableToUse(queryTimeRangeStartTs, queryTimeRangeEndTs, false, nil)
adjustedStartTs, adjustedEndTs, tsTableName, _ := telemetrymetrics.WhichTSTableToUse(queryTimeRangeStartTs, queryTimeRangeEndTs, nil)
cols := []string{"metric_name", "temporality", "any(type) AS type", "any(is_monotonic) as is_monotonic"}
// Build query to fetch temporality for all metrics
// We use attr_string_value where attr_name = '__temporality__'
// Note: The columns are mixed in the current data - temporality column contains metric_name
// and metric_name column contains temporality value, so we use the correct mapping
sb := sqlbuilder.Select(
"metric_name",
"temporality",
"any(type) AS type",
"any(is_monotonic) as is_monotonic",
).
From(t.metricsDBName + "." + tsTableName)
// When reduction is enabled, fold the reduced-catalog presence check into the
// same query so a metric's reduced status comes back in one round trip.
var reducedArgs []any
if reductionEnabled {
rs := sqlbuilder.NewSelectBuilder()
rs.Select("metric_name")
rs.From(t.metricsDBName + "." + telemetrymetrics.TimeseriesV4ReducedTableName)
rs.Where(rs.In("metric_name", metricNames), rs.GTE("unix_milli", adjustedStartTs), rs.LT("unix_milli", adjustedEndTs))
rs.GroupBy("metric_name")
rsQuery, rsArgs := rs.BuildWithFlavor(sqlbuilder.ClickHouse)
cols = append(cols, fmt.Sprintf("metric_name GLOBAL IN (%s) AS reduced", rsQuery))
reducedArgs = rsArgs
}
sb := sqlbuilder.NewSelectBuilder()
sb.Select(cols...)
sb.From(t.metricsDBName + "." + tsTableName)
// Filter by metric names (in the temporality column due to data mix-up)
sb.Where(
sb.In("metric_name", metricNames),
sb.GTE("unix_milli", adjustedStartTs),
sb.LT("unix_milli", adjustedEndTs),
)
sb.GroupBy("metric_name", "temporality")
query, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse, reducedArgs...)
query, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse)
t.logger.DebugContext(ctx, "fetching metric temporality", slog.String("query", query), slog.Any("args", args))
rows, err := t.telemetrystore.ClickhouseDB().Query(ctx, query, args...)
if err != nil {
return nil, nil, nil, errors.Wrapf(err, errors.TypeInternal, errors.CodeInternal, "failed to fetch metric temporality")
return nil, nil, errors.Wrapf(err, errors.TypeInternal, errors.CodeInternal, "failed to fetch metric temporality")
}
defer rows.Close()
// Process results
for rows.Next() {
var metricName string
var temporality metrictypes.Temporality
var metricType metrictypes.Type
var isMonotonic bool
var isReduced uint8
dest := []any{&metricName, &temporality, &metricType, &isMonotonic}
if reductionEnabled {
dest = append(dest, &isReduced)
}
if err := rows.Scan(dest...); err != nil {
return nil, nil, nil, errors.Wrapf(err, errors.TypeInternal, errors.CodeInternal, "failed to scan temporality result")
if err := rows.Scan(&metricName, &temporality, &metricType, &isMonotonic); err != nil {
return nil, nil, errors.Wrapf(err, errors.TypeInternal, errors.CodeInternal, "failed to scan temporality result")
}
if temporality != metrictypes.Unknown {
temporalities[metricName] = append(temporalities[metricName], temporality)
@@ -2270,15 +2258,12 @@ func (t *telemetryMetaStore) fetchMetricsTemporalityAndType(ctx context.Context,
metricType = metrictypes.GaugeType
}
types[metricName] = metricType
if isReduced != 0 {
reduced[metricName] = true
}
}
if err := rows.Err(); err != nil {
return nil, nil, nil, errors.Wrapf(err, errors.TypeInternal, errors.CodeInternal, "error iterating over metrics temporality rows")
return nil, nil, errors.Wrapf(err, errors.TypeInternal, errors.CodeInternal, "error iterating over metrics temporality rows")
}
return temporalities, types, reduced, nil
return temporalities, types, nil
}
func (t *telemetryMetaStore) fetchMeterSourceMetricsTemporalityAndType(ctx context.Context, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, error) {

View File

@@ -1,157 +0,0 @@
package telemetrymetrics
import (
"context"
"testing"
"time"
"github.com/SigNoz/signoz/pkg/flagger"
"github.com/SigNoz/signoz/pkg/instrumentation/instrumentationtest"
"github.com/SigNoz/signoz/pkg/types/metrictypes"
qbtypes "github.com/SigNoz/signoz/pkg/types/querybuildertypes/querybuildertypesv5"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes/telemetrytypestest"
"github.com/stretchr/testify/require"
)
func reducedQuery(metric string, ty metrictypes.Type, temp metrictypes.Temporality, ta metrictypes.TimeAggregation, sa metrictypes.SpaceAggregation) qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation] {
return qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]{
Signal: telemetrytypes.SignalMetrics,
StepInterval: qbtypes.Step{Duration: 5 * time.Minute},
Aggregations: []qbtypes.MetricAggregation{{
MetricName: metric,
Type: ty,
Temporality: temp,
TimeAggregation: ta,
SpaceAggregation: sa,
Reduced: true,
}},
}
}
func TestReducedStatementBuilder(t *testing.T) {
cases := []struct {
name string
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]
expected qbtypes.Statement
}{
{
name: "gauge_sum_latest",
query: reducedQuery("test.metric", metrictypes.GaugeType, metrictypes.Unspecified, metrictypes.TimeAggregationLatest, metrictypes.SpaceAggregationSum),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, anyLast(last) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, argMax(value, unix_milli) AS per_series_value FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`sum_last`, computed_at) AS value FROM signoz_metrics.distributed_samples_v4_reduced_last_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "unspecified", false, "test.metric", uint64(1746999900000), uint64(1747172760000), 0, "test.metric", uint64(1746999900000), uint64(1747172760000), "test.metric", uint64(1746999900000), uint64(1747172760000), false},
},
},
{
name: "gauge_avg_avg",
query: reducedQuery("test.metric", metrictypes.GaugeType, metrictypes.Unspecified, metrictypes.TimeAggregationAvg, metrictypes.SpaceAggregationAvg),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, sum(sum) / sum(count) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, avg(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, avg(value) AS per_series_value, avg(weight) AS per_series_weight FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`sum_last`, computed_at) AS value, argMax(`count_series`, computed_at) AS weight FROM signoz_metrics.distributed_samples_v4_reduced_last_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) / sum(per_series_weight) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "unspecified", false, "test.metric", uint64(1746999900000), uint64(1747172760000), 0, "test.metric", uint64(1746999900000), uint64(1747172760000), "test.metric", uint64(1746999900000), uint64(1747172760000), false},
},
},
{
name: "gauge_min_min",
query: reducedQuery("test.metric", metrictypes.GaugeType, metrictypes.Unspecified, metrictypes.TimeAggregationMin, metrictypes.SpaceAggregationMin),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, min(min) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, min(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, min(value) AS per_series_value FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`min`, computed_at) AS value FROM signoz_metrics.distributed_samples_v4_reduced_last_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, min(per_series_value) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "unspecified", false, "test.metric", uint64(1746999900000), uint64(1747172760000), 0, "test.metric", uint64(1746999900000), uint64(1747172760000), "test.metric", uint64(1746999900000), uint64(1747172760000), false},
},
},
{
name: "gauge_max_max",
query: reducedQuery("test.metric", metrictypes.GaugeType, metrictypes.Unspecified, metrictypes.TimeAggregationMax, metrictypes.SpaceAggregationMax),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, max(max) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, max(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, max(value) AS per_series_value FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`max`, computed_at) AS value FROM signoz_metrics.distributed_samples_v4_reduced_last_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, max(per_series_value) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "unspecified", false, "test.metric", uint64(1746999900000), uint64(1747172760000), 0, "test.metric", uint64(1746999900000), uint64(1747172760000), "test.metric", uint64(1746999900000), uint64(1747172760000), false},
},
},
{
name: "counter_sum_rate",
query: reducedQuery("test.metric", metrictypes.SumType, metrictypes.Cumulative, metrictypes.TimeAggregationRate, metrictypes.SpaceAggregationSum),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT ts, multiIf(row_number() OVER rate_window = 1, nan, (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) < 0, per_series_value / (ts - lagInFrame(ts, 1) OVER rate_window), (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) / (ts - lagInFrame(ts, 1) OVER rate_window)) AS per_series_value FROM (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, max(max) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window AS (PARTITION BY fingerprint ORDER BY fingerprint, ts)), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, sum(value) / 300 AS per_series_value FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`sum`, computed_at) AS value FROM signoz_metrics.distributed_samples_v4_reduced_sum_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "cumulative", false, "test.metric", uint64(1746999600000), uint64(1747172760000), 0, "test.metric", uint64(1746999600000), uint64(1747172760000), "test.metric", uint64(1746999600000), uint64(1747172760000), false},
},
},
{
name: "counter_avg_increase",
query: reducedQuery("test.metric", metrictypes.SumType, metrictypes.Cumulative, metrictypes.TimeAggregationIncrease, metrictypes.SpaceAggregationAvg),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT ts, multiIf(row_number() OVER rate_window = 1, nan, (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) < 0, per_series_value, per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) AS per_series_value FROM (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, max(max) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window AS (PARTITION BY fingerprint ORDER BY fingerprint, ts)), __spatial_aggregation_cte AS (SELECT ts, avg(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, sum(value) AS per_series_value, avg(weight) AS per_series_weight FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`sum`, computed_at) AS value, argMax(`count_series`, computed_at) AS weight FROM signoz_metrics.distributed_samples_v4_reduced_sum_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) / sum(per_series_weight) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "cumulative", false, "test.metric", uint64(1746999600000), uint64(1747172760000), 0, "test.metric", uint64(1746999600000), uint64(1747172760000), "test.metric", uint64(1746999600000), uint64(1747172760000), false},
},
},
{
name: "counter_min_omitted",
query: reducedQuery("test.metric", metrictypes.SumType, metrictypes.Cumulative, metrictypes.TimeAggregationRate, metrictypes.SpaceAggregationMin),
expected: qbtypes.Statement{
Query: "WITH __temporal_aggregation_cte AS (SELECT ts, multiIf(row_number() OVER rate_window = 1, nan, (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) < 0, per_series_value / (ts - lagInFrame(ts, 1) OVER rate_window), (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) / (ts - lagInFrame(ts, 1) OVER rate_window)) AS per_series_value FROM (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, max(max) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window AS (PARTITION BY fingerprint ORDER BY fingerprint, ts)), __spatial_aggregation_cte AS (SELECT ts, min(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "cumulative", false, "test.metric", uint64(1746999600000), uint64(1747172760000), 0},
},
},
{
name: "counter_max_omitted",
query: reducedQuery("test.metric", metrictypes.SumType, metrictypes.Cumulative, metrictypes.TimeAggregationRate, metrictypes.SpaceAggregationMax),
expected: qbtypes.Statement{
Query: "WITH __temporal_aggregation_cte AS (SELECT ts, multiIf(row_number() OVER rate_window = 1, nan, (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) < 0, per_series_value / (ts - lagInFrame(ts, 1) OVER rate_window), (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) / (ts - lagInFrame(ts, 1) OVER rate_window)) AS per_series_value FROM (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, max(max) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts) WINDOW rate_window AS (PARTITION BY fingerprint ORDER BY fingerprint, ts)), __spatial_aggregation_cte AS (SELECT ts, max(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "cumulative", false, "test.metric", uint64(1746999600000), uint64(1747172760000), 0},
},
},
{
name: "histogram_p99",
query: reducedQuery("test.metric", metrictypes.HistogramType, metrictypes.Cumulative, metrictypes.TimeAggregationUnspecified, metrictypes.SpaceAggregationPercentile99),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT ts, `le`, multiIf(row_number() OVER rate_window = 1, nan, (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) < 0, per_series_value / (ts - lagInFrame(ts, 1) OVER rate_window), (per_series_value - lagInFrame(per_series_value, 1) OVER rate_window) / (ts - lagInFrame(ts, 1) OVER rate_window)) AS per_series_value FROM (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, `le`, max(max) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint, JSONExtractString(labels, 'le') AS `le` FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint, `le`) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts, `le` ORDER BY fingerprint, ts) WINDOW rate_window AS (PARTITION BY fingerprint ORDER BY fingerprint, ts)), __spatial_aggregation_cte AS (SELECT ts, `le`, sum(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts, `le`) SELECT ts, histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.990) AS value FROM __spatial_aggregation_cte GROUP BY ts ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, `le`, sum(value) / 300 AS per_series_value FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`sum`, computed_at) AS value FROM signoz_metrics.distributed_samples_v4_reduced_sum_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint, JSONExtractString(labels, 'le') AS `le` FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint, `le`) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts, `le`), __spatial_aggregation_cte AS (SELECT ts, `le`, sum(per_series_value) AS value FROM __temporal_aggregation_cte GROUP BY ts, `le`) SELECT ts, histogramQuantile(arrayMap(x -> toFloat64(x), groupArray(le)), groupArray(value), 0.990) AS value FROM __spatial_aggregation_cte GROUP BY ts ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "cumulative", false, "test.metric", uint64(1746999900000), uint64(1747172760000), 0, "test.metric", uint64(1746999900000), uint64(1747172760000), "test.metric", uint64(1746999900000), uint64(1747172760000), false},
},
},
{
name: "summary_avg",
query: reducedQuery("test.metric", metrictypes.SummaryType, metrictypes.Unspecified, metrictypes.TimeAggregationAvg, metrictypes.SpaceAggregationAvg),
expected: qbtypes.Statement{
Query: "SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, sum(sum) / sum(count) AS per_series_value FROM signoz_metrics.distributed_samples_v4_agg_5m AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.time_series_v4_1day WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND LOWER(temporality) LIKE LOWER(?) AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY fingerprint, ts ORDER BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, avg(per_series_value) AS value FROM __temporal_aggregation_cte WHERE isNaN(per_series_value) = ? GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) UNION ALL SELECT * FROM (WITH __temporal_aggregation_cte AS (SELECT fingerprint, toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(300)) AS ts, avg(value) AS per_series_value, avg(weight) AS per_series_weight FROM (SELECT reduced_fingerprint AS fingerprint, unix_milli, argMax(`sum_last`, computed_at) AS value, argMax(`count_series`, computed_at) AS weight FROM signoz_metrics.distributed_samples_v4_reduced_last_60s WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli < ? GROUP BY reduced_fingerprint, unix_milli) AS points INNER JOIN (SELECT fingerprint FROM signoz_metrics.distributed_time_series_v4_reduced WHERE metric_name IN (?) AND unix_milli >= ? AND unix_milli <= ? AND __normalized = ? GROUP BY fingerprint) AS filtered_time_series ON points.fingerprint = filtered_time_series.fingerprint GROUP BY fingerprint, ts), __spatial_aggregation_cte AS (SELECT ts, sum(per_series_value) / sum(per_series_weight) AS value FROM __temporal_aggregation_cte GROUP BY ts) SELECT * FROM __spatial_aggregation_cte ORDER BY ts) ORDER BY ts",
Args: []any{"test.metric", uint64(1746921600000), uint64(1747172760000), "unspecified", false, "test.metric", uint64(1746999900000), uint64(1747172760000), 0, "test.metric", uint64(1746999900000), uint64(1747172760000), "test.metric", uint64(1746999900000), uint64(1747172760000), false},
},
},
}
fm := NewFieldMapper()
cb := NewConditionBuilder(fm)
fl, err := flagger.New(context.Background(), instrumentationtest.New().ToProviderSettings(), flagger.Config{}, flagger.MustNewRegistry())
require.NoError(t, err)
sb := NewMetricQueryStatementBuilder(instrumentationtest.New().ToProviderSettings(), telemetrytypestest.NewMockMetadataStore(), fm, cb, fl)
const start, end = uint64(1747000000000), uint64(1747172800000)
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
got, err := sb.Build(context.Background(), start, end, qbtypes.RequestTypeTimeSeries, c.query, nil)
require.NoError(t, err)
require.Equal(t, c.expected.Query, got.Query)
require.Equal(t, c.expected.Args, got.Args)
})
}
t.Run("buffer_recent_window", func(t *testing.T) {
now := time.Now().UnixMilli()
q := reducedQuery("test.metric", metrictypes.GaugeType, metrictypes.Unspecified, metrictypes.TimeAggregationLatest, metrictypes.SpaceAggregationSum)
got, err := sb.Build(context.Background(), uint64(now-2*time.Hour.Milliseconds()), uint64(now), qbtypes.RequestTypeTimeSeries, q, nil)
require.NoError(t, err)
require.Contains(t, got.Query, "signoz_metrics.distributed_samples_v4_buffer")
require.Contains(t, got.Query, "signoz_metrics.time_series_v4_buffer")
require.Contains(t, got.Query, "is_reduced")
require.NotContains(t, got.Query, "UNION ALL")
})
t.Run("not_reduced", func(t *testing.T) {
q := reducedQuery("test.metric", metrictypes.GaugeType, metrictypes.Unspecified, metrictypes.TimeAggregationLatest, metrictypes.SpaceAggregationSum)
q.Aggregations[0].Reduced = false
got, err := sb.Build(context.Background(), start, end, qbtypes.RequestTypeTimeSeries, q, nil)
require.NoError(t, err)
require.NotContains(t, got.Query, "UNION ALL")
require.NotContains(t, got.Query, "reduced")
require.NotContains(t, got.Query, "buffer")
})
}

View File

@@ -4,7 +4,6 @@ import (
"context"
"fmt"
"log/slog"
"time"
"github.com/SigNoz/signoz/pkg/factory"
"github.com/SigNoz/signoz/pkg/flagger"
@@ -181,30 +180,19 @@ func (b *MetricQueryStatementBuilder) buildPipelineStatement(
query.Aggregations[0].SpaceAggregation = metrictypes.SpaceAggregationSum
}
agg := query.Aggregations[0]
// A reduced metric reads the raw buffer for recent short windows, and
// samples_v4/agg (unioned with the reduced tables) otherwise. The buffer is
// shaped exactly like samples_v4 / time_series_v4, so once the table names are
// chosen the rest of the pipeline is unchanged.
useBuffer := agg.Reduced &&
end-start < oneDayInMilliseconds &&
start >= uint64(time.Now().UnixMilli())-oneDayInMilliseconds
samplesTable, _ := WhichSamplesTableToUse(start, end, agg.Type, agg.TimeAggregation, useBuffer, agg.TableHints)
tsStart, tsEnd, _, tsTable := WhichTSTableToUse(start, end, useBuffer, agg.TableHints)
var timeSeriesCTE string
var timeSeriesCTEArgs []any
var err error
if timeSeriesCTE, timeSeriesCTEArgs, err = b.buildTimeSeriesCTE(ctx, tsStart, tsEnd, query, keys, variables, tsTable); err != nil {
// time_series_cte
// this is applicable for all the queries
if timeSeriesCTE, timeSeriesCTEArgs, err = b.buildTimeSeriesCTE(ctx, start, end, query, keys, variables); err != nil {
return nil, err
}
if qbtypes.CanShortCircuitDelta(query.Aggregations[0]) {
// spatial_aggregation_cte directly for certain delta queries
if frag, args, err := b.buildTemporalAggDeltaFastPath(start, end, query, samplesTable, timeSeriesCTE, timeSeriesCTEArgs); err != nil {
if frag, args, err := b.buildTemporalAggDeltaFastPath(start, end, query, timeSeriesCTE, timeSeriesCTEArgs); err != nil {
return nil, err
} else if frag != "" {
cteFragments = append(cteFragments, frag)
@@ -212,7 +200,7 @@ func (b *MetricQueryStatementBuilder) buildPipelineStatement(
}
} else {
// temporal_aggregation_cte
if frag, args, err := b.buildTemporalAggregationCTE(ctx, start, end, query, keys, samplesTable, timeSeriesCTE, timeSeriesCTEArgs); err != nil {
if frag, args, err := b.buildTemporalAggregationCTE(ctx, start, end, query, keys, timeSeriesCTE, timeSeriesCTEArgs); err != nil {
return nil, err
} else if frag != "" {
cteFragments = append(cteFragments, frag)
@@ -226,188 +214,18 @@ func (b *MetricQueryStatementBuilder) buildPipelineStatement(
}
}
var reducedFragments []string
var reducedArgs [][]any
if agg.Reduced && !useBuffer {
var tsCTE string
var tsArgs []any
if tsCTE, tsArgs, err = b.buildReducedTimeSeriesCTE(ctx, start, end, query, keys, variables); err != nil {
return nil, err
}
if temporalFrag, temporalArgs, ok := b.buildReducedTemporalAggregationCTE(start, end, query, tsCTE, tsArgs); ok {
spatialFrag, spatialArgs := b.buildReducedSpatialAggregationCTE(query)
reducedFragments = []string{temporalFrag, spatialFrag}
reducedArgs = [][]any{temporalArgs, spatialArgs}
}
}
// reset the query to the original state
query.Aggregations[0].SpaceAggregation = origSpaceAgg
query.Aggregations[0].TimeAggregation = origTimeAgg
query.GroupBy = origGroupBy
mainStmt, err := b.BuildFinalSelect(cteFragments, cteArgs, query)
if err != nil {
return nil, err
}
if reducedFragments == nil {
return mainStmt, nil
}
reducedStmt, err := b.BuildFinalSelect(reducedFragments, reducedArgs, query)
if err != nil {
return nil, err
}
return unionStatements(mainStmt, reducedStmt, query)
}
func unionStatements(main, reduced *qbtypes.Statement, query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation]) (*qbtypes.Statement, error) {
orderBy := "ts"
for _, g := range query.GroupBy {
orderBy = fmt.Sprintf("`%s`, ", g.Name) + orderBy
}
q := fmt.Sprintf("SELECT * FROM (%s) UNION ALL SELECT * FROM (%s) ORDER BY %s", main.Query, reduced.Query, orderBy)
args := append(append([]any{}, main.Args...), reduced.Args...)
warnings := append(append([]string{}, main.Warnings...), reduced.Warnings...)
return &qbtypes.Statement{Query: q, Args: args, Warnings: warnings}, nil
}
func (b *MetricQueryStatementBuilder) buildReducedTimeSeriesCTE(
ctx context.Context,
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
keys map[string][]*telemetrytypes.TelemetryFieldKey,
variables map[string]qbtypes.VariableItem,
) (string, []any, error) {
sb := sqlbuilder.NewSelectBuilder()
var preparedWhereClause querybuilder.PreparedWhereClause
var err error
if query.Filter != nil && query.Filter.Expression != "" {
preparedWhereClause, err = querybuilder.PrepareWhereClause(query.Filter.Expression, querybuilder.FilterExprVisitorOpts{
Context: ctx,
Logger: b.logger,
FieldMapper: b.fm,
ConditionBuilder: b.cb,
FieldKeys: keys,
FullTextColumn: &telemetrytypes.TelemetryFieldKey{Name: "labels"},
Variables: variables,
StartNs: start,
EndNs: end,
})
if err != nil {
return "", nil, err
}
}
sb.From(fmt.Sprintf("%s.%s", DBName, TimeseriesV4ReducedTableName))
sb.Select("fingerprint")
for _, g := range query.GroupBy {
col, err := b.fm.ColumnExpressionFor(ctx, start, end, &g.TelemetryFieldKey, keys)
if err != nil {
return "", nil, err
}
sb.SelectMore(col)
}
sb.Where(
sb.In("metric_name", query.Aggregations[0].MetricName),
sb.GTE("unix_milli", start),
sb.LTE("unix_milli", end),
sb.EQ("__normalized", false),
)
if !preparedWhereClause.IsEmpty() {
sb.AddWhereClause(preparedWhereClause.WhereClause)
}
sb.GroupBy("fingerprint")
sb.GroupBy(querybuilder.GroupByKeys(query.GroupBy)...)
q, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse)
return fmt.Sprintf("(%s) AS filtered_time_series", q), args, nil
}
func (b *MetricQueryStatementBuilder) buildReducedTemporalAggregationCTE(
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
timeSeriesCTE string,
timeSeriesCTEArgs []any,
) (string, []any, bool) {
agg := query.Aggregations[0]
stepSec := int64(query.StepInterval.Seconds())
value, weight, ok := ReducedValueColumn(agg.Type, agg.SpaceAggregation)
if !ok {
return "", nil, false
}
// dedup recomputed buckets: latest computed_at wins per (series, 60s bucket)
dedup := sqlbuilder.NewSelectBuilder()
dedup.Select("reduced_fingerprint AS fingerprint", "unix_milli")
dedup.SelectMore(fmt.Sprintf("argMax(%s, computed_at) AS value", value))
if weight != "" {
dedup.SelectMore(fmt.Sprintf("argMax(%s, computed_at) AS weight", weight))
}
dedup.From(fmt.Sprintf("%s.%s", DBName, WhichReducedSamplesTableToUse(agg.Type)))
dedup.Where(
dedup.In("metric_name", agg.MetricName),
dedup.GTE("unix_milli", start),
dedup.LT("unix_milli", end),
)
dedup.GroupBy("reduced_fingerprint", "unix_milli")
dedupQuery, dedupArgs := dedup.BuildWithFlavor(sqlbuilder.ClickHouse)
sb := sqlbuilder.NewSelectBuilder()
sb.Select("fingerprint")
sb.SelectMore(fmt.Sprintf("toStartOfInterval(toDateTime(intDiv(unix_milli, 1000)), toIntervalSecond(%d)) AS ts", stepSec))
for _, g := range query.GroupBy {
sb.SelectMore(fmt.Sprintf("`%s`", g.Name))
}
sb.SelectMore(fmt.Sprintf("%s AS per_series_value", ReducedTimeAggregationColumn(agg.TimeAggregation, stepSec)))
if weight != "" {
// count_series is a series count, not additive over time, so the avg
// denominator is reduced with avg
sb.SelectMore("avg(weight) AS per_series_weight")
}
sb.From(fmt.Sprintf("(%s) AS points", dedupQuery))
sb.JoinWithOption(sqlbuilder.InnerJoin, timeSeriesCTE, "points.fingerprint = filtered_time_series.fingerprint")
sb.GroupBy("fingerprint", "ts")
sb.GroupBy(querybuilder.GroupByKeys(query.GroupBy)...)
initArgs := append(append([]any{}, dedupArgs...), timeSeriesCTEArgs...)
q, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse, initArgs...)
return fmt.Sprintf("__temporal_aggregation_cte AS (%s)", q), args, true
}
func (b *MetricQueryStatementBuilder) buildReducedSpatialAggregationCTE(
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
) (string, []any) {
spatial := "sum(per_series_value)"
switch query.Aggregations[0].SpaceAggregation {
case metrictypes.SpaceAggregationAvg:
spatial = "sum(per_series_value) / sum(per_series_weight)"
case metrictypes.SpaceAggregationMin:
spatial = "min(per_series_value)"
case metrictypes.SpaceAggregationMax:
spatial = "max(per_series_value)"
}
sb := sqlbuilder.NewSelectBuilder()
sb.Select("ts")
for _, g := range query.GroupBy {
sb.SelectMore(fmt.Sprintf("`%s`", g.Name))
}
sb.SelectMore(spatial + " AS value")
sb.From("__temporal_aggregation_cte")
sb.GroupBy("ts")
sb.GroupBy(querybuilder.GroupByKeys(query.GroupBy)...)
q, args := sb.BuildWithFlavor(sqlbuilder.ClickHouse)
return fmt.Sprintf("__spatial_aggregation_cte AS (%s)", q), args
// final SELECT
return b.BuildFinalSelect(cteFragments, cteArgs, query)
}
func (b *MetricQueryStatementBuilder) buildTemporalAggDeltaFastPath(
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
samplesTable string,
timeSeriesCTE string,
timeSeriesCTEArgs []any,
) (string, []any, error) {
@@ -424,7 +242,8 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggDeltaFastPath(
}
aggCol, err := AggregationColumnForSamplesTable(
samplesTable, query.Aggregations[0].Temporality, query.Aggregations[0].TimeAggregation,
start, end, query.Aggregations[0].Type, query.Aggregations[0].Temporality,
query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints,
)
if err != nil {
return "", nil, err
@@ -441,7 +260,8 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggDeltaFastPath(
sb.SelectMore(fmt.Sprintf("%s AS value", aggCol))
sb.From(fmt.Sprintf("%s.%s AS points", DBName, samplesTable))
tbl, _ := WhichSamplesTableToUse(start, end, query.Aggregations[0].Type, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
sb.From(fmt.Sprintf("%s.%s AS points", DBName, tbl))
sb.JoinWithOption(sqlbuilder.InnerJoin, timeSeriesCTE, "points.fingerprint = filtered_time_series.fingerprint")
sb.Where(
sb.In("metric_name", query.Aggregations[0].MetricName),
@@ -461,7 +281,6 @@ func (b *MetricQueryStatementBuilder) buildTimeSeriesCTE(
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
keys map[string][]*telemetrytypes.TelemetryFieldKey,
variables map[string]qbtypes.VariableItem,
tsTable string,
) (string, []any, error) {
sb := sqlbuilder.NewSelectBuilder()
@@ -485,7 +304,8 @@ func (b *MetricQueryStatementBuilder) buildTimeSeriesCTE(
}
}
sb.From(fmt.Sprintf("%s.%s", DBName, tsTable))
start, end, _, tbl := WhichTSTableToUse(start, end, query.Aggregations[0].TableHints)
sb.From(fmt.Sprintf("%s.%s", DBName, tbl))
sb.Select("fingerprint")
for _, g := range query.GroupBy {
@@ -511,12 +331,6 @@ func (b *MetricQueryStatementBuilder) buildTimeSeriesCTE(
sb.EQ("__normalized", false),
)
// the buffer holds both raw rows and the reduced catalog rows; the raw read
// only wants the original series
if tsTable == TimeseriesV4BufferLocalTableName {
sb.Where(sb.EQ("is_reduced", false))
}
if !preparedWhereClause.IsEmpty() {
sb.AddWhereClause(preparedWhereClause.WhereClause)
}
@@ -533,23 +347,21 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggregationCTE(
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
_ map[string][]*telemetrytypes.TelemetryFieldKey,
samplesTable string,
timeSeriesCTE string,
timeSeriesCTEArgs []any,
) (string, []any, error) {
if query.Aggregations[0].Temporality == metrictypes.Delta {
return b.buildTemporalAggDelta(ctx, start, end, query, samplesTable, timeSeriesCTE, timeSeriesCTEArgs)
return b.buildTemporalAggDelta(ctx, start, end, query, timeSeriesCTE, timeSeriesCTEArgs)
} else if query.Aggregations[0].Temporality != metrictypes.Multiple {
return b.buildTemporalAggCumulativeOrUnspecified(ctx, start, end, query, samplesTable, timeSeriesCTE, timeSeriesCTEArgs)
return b.buildTemporalAggCumulativeOrUnspecified(ctx, start, end, query, timeSeriesCTE, timeSeriesCTEArgs)
}
return b.buildTemporalAggForMultipleTemporalities(ctx, start, end, query, samplesTable, timeSeriesCTE, timeSeriesCTEArgs)
return b.buildTemporalAggForMultipleTemporalities(ctx, start, end, query, timeSeriesCTE, timeSeriesCTEArgs)
}
func (b *MetricQueryStatementBuilder) buildTemporalAggDelta(
_ context.Context,
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
samplesTable string,
timeSeriesCTE string,
timeSeriesCTEArgs []any,
) (string, []any, error) {
@@ -566,7 +378,7 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggDelta(
sb.SelectMore(fmt.Sprintf("`%s`", g.Name))
}
aggCol, err := AggregationColumnForSamplesTable(samplesTable, query.Aggregations[0].Temporality, query.Aggregations[0].TimeAggregation)
aggCol, err := AggregationColumnForSamplesTable(start, end, query.Aggregations[0].Type, query.Aggregations[0].Temporality, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
if err != nil {
return "", nil, err
}
@@ -577,7 +389,8 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggDelta(
sb.SelectMore(fmt.Sprintf("%s AS per_series_value", aggCol))
sb.From(fmt.Sprintf("%s.%s AS points", DBName, samplesTable))
tbl, _ := WhichSamplesTableToUse(start, end, query.Aggregations[0].Type, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
sb.From(fmt.Sprintf("%s.%s AS points", DBName, tbl))
sb.JoinWithOption(sqlbuilder.InnerJoin, timeSeriesCTE, "points.fingerprint = filtered_time_series.fingerprint")
sb.Where(
sb.In("metric_name", query.Aggregations[0].MetricName),
@@ -596,7 +409,6 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggCumulativeOrUnspecified(
_ context.Context,
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
samplesTable string,
timeSeriesCTE string,
timeSeriesCTEArgs []any,
) (string, []any, error) {
@@ -612,13 +424,14 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggCumulativeOrUnspecified(
baseSb.SelectMore(fmt.Sprintf("`%s`", g.Name))
}
aggCol, err := AggregationColumnForSamplesTable(samplesTable, query.Aggregations[0].Temporality, query.Aggregations[0].TimeAggregation)
aggCol, err := AggregationColumnForSamplesTable(start, end, query.Aggregations[0].Type, query.Aggregations[0].Temporality, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
if err != nil {
return "", nil, err
}
baseSb.SelectMore(fmt.Sprintf("%s AS per_series_value", aggCol))
baseSb.From(fmt.Sprintf("%s.%s AS points", DBName, samplesTable))
tbl, _ := WhichSamplesTableToUse(start, end, query.Aggregations[0].Type, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
baseSb.From(fmt.Sprintf("%s.%s AS points", DBName, tbl))
baseSb.JoinWithOption(sqlbuilder.InnerJoin, timeSeriesCTE, "points.fingerprint = filtered_time_series.fingerprint")
baseSb.Where(
baseSb.In("metric_name", query.Aggregations[0].MetricName),
@@ -662,7 +475,6 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggForMultipleTemporalities(
_ context.Context,
start, end uint64,
query qbtypes.QueryBuilderQuery[qbtypes.MetricAggregation],
samplesTable string,
timeSeriesCTE string,
timeSeriesCTEArgs []any,
) (string, []any, error) {
@@ -677,11 +489,11 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggForMultipleTemporalities(
sb.SelectMore(fmt.Sprintf("`%s`", g.Name))
}
aggForDeltaTemporality, err := AggregationColumnForSamplesTable(samplesTable, metrictypes.Delta, query.Aggregations[0].TimeAggregation)
aggForDeltaTemporality, err := AggregationColumnForSamplesTable(start, end, query.Aggregations[0].Type, metrictypes.Delta, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
if err != nil {
return "", nil, err
}
aggForCumulativeTemporality, err := AggregationColumnForSamplesTable(samplesTable, metrictypes.Cumulative, query.Aggregations[0].TimeAggregation)
aggForCumulativeTemporality, err := AggregationColumnForSamplesTable(start, end, query.Aggregations[0].Type, metrictypes.Cumulative, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
if err != nil {
return "", nil, err
}
@@ -709,7 +521,8 @@ func (b *MetricQueryStatementBuilder) buildTemporalAggForMultipleTemporalities(
sb.SelectMore(expr)
}
sb.From(fmt.Sprintf("%s.%s AS points", DBName, samplesTable))
tbl, _ := WhichSamplesTableToUse(start, end, query.Aggregations[0].Type, query.Aggregations[0].TimeAggregation, query.Aggregations[0].TableHints)
sb.From(fmt.Sprintf("%s.%s AS points", DBName, tbl))
sb.JoinWithOption(sqlbuilder.InnerJoin, timeSeriesCTE, "points.fingerprint = filtered_time_series.fingerprint")
sb.Where(
sb.In("metric_name", query.Aggregations[0].MetricName),

View File

@@ -30,17 +30,6 @@ const (
TimeseriesV41weekLocalTableName = "time_series_v4_1week"
AttributesMetadataTableName = "distributed_metadata"
AttributesMetadataLocalTableName = "metadata"
// The buffer holds raw points for ~24h; the reduced tables hold 60s
// aggregates of dropped-label series.
SamplesV4BufferTableName = "distributed_samples_v4_buffer"
SamplesV4BufferLocalTableName = "samples_v4_buffer"
TimeseriesV4BufferTableName = "distributed_time_series_v4_buffer"
TimeseriesV4BufferLocalTableName = "time_series_v4_buffer"
SamplesV4ReducedLastTableName = "distributed_samples_v4_reduced_last_60s"
SamplesV4ReducedSumTableName = "distributed_samples_v4_reduced_sum_60s"
TimeseriesV4ReducedTableName = "distributed_time_series_v4_reduced"
TimeseriesV4ReducedLocalTableName = "time_series_v4_reduced"
)
var (
@@ -60,16 +49,8 @@ var (
// in that order.
func WhichTSTableToUse(
start, end uint64,
useBuffer bool,
tableHints *metrictypes.MetricTableHints,
) (uint64, uint64, string, string) {
// the buffer holds the recent raw window for reduced metrics and has the same
// shape as time_series_v4; round the start to the hour like the v4 table.
if useBuffer {
start = start - (start % (oneHourInMilliseconds))
return start, end, TimeseriesV4BufferTableName, TimeseriesV4BufferLocalTableName
}
// if we have a hint for the table, we need to use it
// the hint will be used to override the default table selection logic
if tableHints != nil {
@@ -168,20 +149,14 @@ func WhichSamplesTableToUse(
start, end uint64,
metricType metrictypes.Type,
timeAggregation metrictypes.TimeAggregation,
useBuffer bool,
tableHints *metrictypes.MetricTableHints,
) (string, string) {
// the buffer holds the recent raw window for reduced metrics; same shape as samples_v4
if useBuffer {
return SamplesV4BufferTableName, SamplesV4BufferLocalTableName
}
// if we have a hint for the table, we need to use it
// the hint will be used to override the default table selection logic.
// SamplesTableName is the distributed name; derive the local via switch.
if tableHints != nil && tableHints.SamplesTableName != "" {
switch tableHints.SamplesTableName {
case SamplesV4TableName, SamplesV4BufferTableName:
case SamplesV4TableName:
return SamplesV4TableName, SamplesV4LocalTableName
case SamplesV4Agg5mTableName:
return SamplesV4Agg5mTableName, SamplesV4Agg5mLocalTableName
@@ -213,10 +188,13 @@ func WhichSamplesTableToUse(
}
func AggregationColumnForSamplesTable(
tableName string,
start, end uint64,
metricType metrictypes.Type,
temporality metrictypes.Temporality,
timeAggregation metrictypes.TimeAggregation,
tableHints *metrictypes.MetricTableHints,
) (string, error) {
tableName, _ := WhichSamplesTableToUse(start, end, metricType, timeAggregation, tableHints)
var aggregationColumn string
switch temporality {
case metrictypes.Delta:
@@ -224,7 +202,7 @@ func AggregationColumnForSamplesTable(
// although it doesn't make sense to use anyLast, avg, min, max, count on delta metrics,
// we are keeping it here to make sure that query will not be invalid
switch tableName {
case SamplesV4TableName, SamplesV4BufferTableName:
case SamplesV4TableName:
switch timeAggregation {
case metrictypes.TimeAggregationLatest:
aggregationColumn = "anyLast(value)"
@@ -266,7 +244,7 @@ func AggregationColumnForSamplesTable(
// for cumulative metrics, we only support `RATE`/`INCREASE`. The max value in window is
// used to calculate the sum which is then divided by the window size to get the rate
switch tableName {
case SamplesV4TableName, SamplesV4BufferTableName:
case SamplesV4TableName:
switch timeAggregation {
case metrictypes.TimeAggregationLatest:
aggregationColumn = "anyLast(value)"
@@ -306,7 +284,7 @@ func AggregationColumnForSamplesTable(
}
case metrictypes.Unspecified:
switch tableName {
case SamplesV4TableName, SamplesV4BufferTableName:
case SamplesV4TableName:
switch timeAggregation {
case metrictypes.TimeAggregationLatest:
aggregationColumn = "anyLast(value)"
@@ -354,65 +332,6 @@ func AggregationColumnForSamplesTable(
return aggregationColumn, nil
}
// WhichReducedSamplesTableToUse returns the 60s reduced samples table for a metric
// type: the last_60s table for gauge-like series, the sum_60s table for counters
// and histograms.
func WhichReducedSamplesTableToUse(metricType metrictypes.Type) string {
if metricType == metrictypes.SumType || metricType == metrictypes.HistogramType {
return SamplesV4ReducedSumTableName
}
return SamplesV4ReducedLastTableName
}
// ReducedValueColumn returns the reduced value column (and the avg-denominator
// weight) for a space aggregation. The reduced columns are pre-aggregated across
// the original series, so the space aggregation picks the underlying value; the
// sum table only has `sum`, so min/max across series have no column (ok=false).
func ReducedValueColumn(metricType metrictypes.Type, space metrictypes.SpaceAggregation) (value, weight string, ok bool) {
if metricType == metrictypes.SumType || metricType == metrictypes.HistogramType {
switch space {
case metrictypes.SpaceAggregationSum:
return "`sum`", "", true
case metrictypes.SpaceAggregationAvg:
return "`sum`", "`count_series`", true
}
return "", "", false
}
switch space {
case metrictypes.SpaceAggregationSum:
return "`sum_last`", "", true
case metrictypes.SpaceAggregationAvg:
return "`sum_last`", "`count_series`", true
case metrictypes.SpaceAggregationMin:
return "`min`", "", true
case metrictypes.SpaceAggregationMax:
return "`max`", "", true
}
return "", "", false
}
// ReducedTimeAggregationColumn applies the time aggregation to the reduced `value`
// column over the step's 60s buckets. latest uses argMax over the bucket timestamp
// (the buckets have no read order); rate divides the per-step sum by the step.
func ReducedTimeAggregationColumn(timeAggregation metrictypes.TimeAggregation, stepSec int64) string {
switch timeAggregation {
case metrictypes.TimeAggregationLatest:
return "argMax(value, unix_milli)"
case metrictypes.TimeAggregationAvg:
return "avg(value)"
case metrictypes.TimeAggregationMin:
return "min(value)"
case metrictypes.TimeAggregationMax:
return "max(value)"
case metrictypes.TimeAggregationCount:
return "count(value)"
case metrictypes.TimeAggregationRate:
return fmt.Sprintf("sum(value) / %d", stepSec)
default: // sum, increase
return "sum(value)"
}
}
func AggregationQueryForHistogramCountWithParams(param *metrictypes.ComparisonSpaceAggregationParam) (string, error) {
if param == nil {
return "", errors.NewInvalidInputf(errors.CodeInvalidInput, "no aggregation param provided for histogram count")

View File

@@ -480,9 +480,7 @@ type MetricAggregation struct {
// value filter to apply to the query
ValueFilter *metrictypes.MetricValueFilter `json:"-"`
// reduce to operator for metric scalar requests
ReduceTo ReduceTo `json:"reduceTo,omitzero"`
Reduced bool `json:"-"`
ReduceTo ReduceTo `json:"reduceTo,omitempty"`
}
// Copy creates a deep copy of MetricAggregation.

View File

@@ -4,7 +4,6 @@ import (
"context"
"github.com/SigNoz/signoz/pkg/types/metrictypes"
"github.com/SigNoz/signoz/pkg/valuer"
)
// MetadataStore is the interface for the telemetry metadata store.
@@ -27,12 +26,12 @@ type MetadataStore interface {
GetAllValues(ctx context.Context, fieldValueSelector *FieldValueSelector) (*TelemetryFieldValues, bool, error)
// FetchTemporality fetches the temporality for metric
FetchTemporality(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricName string) (metrictypes.Temporality, error)
FetchTemporality(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricName string) (metrictypes.Temporality, error)
// FetchTemporalityMulti fetches the temporality for multiple metrics
FetchTemporalityMulti(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, error)
FetchTemporalityMulti(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, error)
FetchTemporalityAndTypeMulti(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, map[string]bool, error)
FetchTemporalityAndTypeMulti(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, error)
// ListLogsJSONIndexes lists the JSON indexes for the logs table.
ListLogsJSONIndexes(ctx context.Context, filters ...string) ([]TelemetryFieldKeySkipIndex, error)

View File

@@ -6,7 +6,6 @@ import (
"github.com/SigNoz/signoz/pkg/types/metrictypes"
"github.com/SigNoz/signoz/pkg/types/telemetrytypes"
"github.com/SigNoz/signoz/pkg/valuer"
)
// MockMetadataStore implements the MetadataStore interface for testing purposes.
@@ -17,7 +16,6 @@ type MockMetadataStore struct {
AllValuesMap map[string]*telemetrytypes.TelemetryFieldValues
TemporalityMap map[string]metrictypes.Temporality
TypeMap map[string]metrictypes.Type
ReducedMap map[string]bool
PromotedPathsMap map[string]bool
LogsJSONIndexes []telemetrytypes.TelemetryFieldKeySkipIndex
ColumnEvolutionMetadataMap map[string][]*telemetrytypes.EvolutionEntry
@@ -308,7 +306,7 @@ func (m *MockMetadataStore) SetAllValues(lookupKey string, values *telemetrytype
}
// FetchTemporality fetches the temporality for a metric.
func (m *MockMetadataStore) FetchTemporality(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricName string) (metrictypes.Temporality, error) {
func (m *MockMetadataStore) FetchTemporality(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricName string) (metrictypes.Temporality, error) {
if temporality, exists := m.TemporalityMap[metricName]; exists {
return temporality, nil
}
@@ -316,7 +314,7 @@ func (m *MockMetadataStore) FetchTemporality(ctx context.Context, orgID valuer.U
}
// FetchTemporalityMulti fetches the temporality for multiple metrics.
func (m *MockMetadataStore) FetchTemporalityMulti(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, error) {
func (m *MockMetadataStore) FetchTemporalityMulti(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, error) {
result := make(map[string]metrictypes.Temporality)
for _, metricName := range metricNames {
@@ -331,10 +329,9 @@ func (m *MockMetadataStore) FetchTemporalityMulti(ctx context.Context, orgID val
}
// FetchTemporalityMulti fetches the temporality for multiple metrics.
func (m *MockMetadataStore) FetchTemporalityAndTypeMulti(ctx context.Context, orgID valuer.UUID, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, map[string]bool, error) {
func (m *MockMetadataStore) FetchTemporalityAndTypeMulti(ctx context.Context, queryTimeRangeStartTs, queryTimeRangeEndTs uint64, metricNames ...string) (map[string]metrictypes.Temporality, map[string]metrictypes.Type, error) {
temporalities := make(map[string]metrictypes.Temporality)
types := make(map[string]metrictypes.Type)
reduced := make(map[string]bool)
for _, metricName := range metricNames {
if temporality, exists := m.TemporalityMap[metricName]; exists {
@@ -347,12 +344,9 @@ func (m *MockMetadataStore) FetchTemporalityAndTypeMulti(ctx context.Context, or
} else {
types[metricName] = metrictypes.UnspecifiedType
}
if m.ReducedMap[metricName] {
reduced[metricName] = true
}
}
return temporalities, types, reduced, nil
return temporalities, types, nil
}
// SetTemporality sets the temporality for a metric in the mock store.