I’m evaluating Grafana Tempo as our tracing solution. So far, I’ve installed it in distributed mode and have been playing with full-backend search.
I execute the same search query over a 2-hour period and have tried adjusting some parameters.
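For reference, the call I’m issuing is roughly the following (the query-frontend host/port and the tag filter are placeholders from my deployment; `/api/search` with `start`/`end` epoch seconds is Tempo’s search endpoint):

```shell
# Build a 2-hour search window against the query-frontend.
# Host, port, and the tag filter below are placeholders for my setup.
END=$(date +%s)
START=$((END - 7200))   # 2 hours ago
URL="http://grafana-tempo-tempo-distributed-query-frontend:3100/api/search?tags=service.name%3Dmy-service&start=${START}&end=${END}&limit=20"
echo "$URL"
# Then, for example:
# curl -s "$URL"
```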
The api_search latency on the query_frontend and querier is quite high:
At the same time, backend latency is much lower:
I tried increasing the number of queriers (from 2 to 3), but it didn’t help at all.
I also played a bit with the configuration, following the recommendations in tempo/backend_search.md at main · grafana/tempo · GitHub, but with zero improvement.
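Specifically, these are the search-related knobs I have already tried raising (values as currently deployed, pulled from my config):

```yaml
# Search-related settings currently in effect
query_frontend:
  query_shards: 30
  search:
    concurrent_jobs: 300
    max_duration: 2h0m0s
querier:
  frontend_worker:
    parallelism: 10
  max_concurrent_queries: 10
  search_query_timeout: 1m30s
storage:
  trace:
    pool:
      max_workers: 100
      queue_depth: 10000
    search:
      prefetch_trace_count: 20000
```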
Can someone advise where to look to reduce the search time?

Here is my full config:
`GET /status/config`:

```yaml
---
compactor:
  compaction:
    block_retention: 168h0m0s
    max_block_bytes: 3221225472
  ring:
    kvstore:
      store: memberlist
distributor:
  receivers:
    kafka:
      auth:
        tls:
          ca_file: /tmp/ca.crt
          insecure: true
      brokers: kafka-cluster-kafka-bootstrap.kafka.svc.cluster.local:9093
      client_id: tempo-ingester
      encoding: otlp_proto
      group_id: tempo-ingester
      message_marking:
        after: true
        on_error: false
      protocol_version: 2.8.0
      topic: otlp-tracing
ingester:
  lifecycler:
    readiness_check_ring_health: false
    tokens_file_path: /var/tempo/tokens.json
memberlist:
  abort_if_cluster_join_fails: false
  dead_node_reclaim_time: 10s
  join_members:
  - grafana-tempo-tempo-distributed-gossip-ring
overrides:
  per_tenant_override_config: /conf/overrides.yaml
querier:
  frontend_worker:
    frontend_address: grafana-tempo-tempo-distributed-query-frontend-discovery:9095
    parallelism: 10
  max_concurrent_queries: 10
  search_query_timeout: 1m30s
query_frontend:
  query_shards: 30
  search:
    concurrent_jobs: 300
    max_duration: 2h0m0s
search_enabled: true
server:
  http_listen_port: 3100
  http_server_read_timeout: 2m0s
storage:
  trace:
    backend: s3
    cache: memcached
    local:
      path: /var/tempo/traces
    memcached:
      addresses: dns+memcached:11211
      circuit_breaker_consecutive_failures: 0
      circuit_breaker_interval: 0s
      circuit_breaker_timeout: 0s
      consistent_hash: true
      host: ""
      max_idle_conns: 16
      max_item_size: 0
      service: ""
      timeout: 1s
      ttl: 0s
      update_interval: 1m0s
    pool:
      max_workers: 100
      queue_depth: 10000
    s3:
      bucket: kubernetes-tracing-us-east-1
      endpoint: s3.amazonaws.com
    search:
      prefetch_trace_count: 20000
    wal:
      blocksfilepath: /var/tempo/wal/blocks
      completedfilepath: /var/tempo/wal/completed
target: querier
```