I'm running an OSS stack (Grafana, Tempo, Prometheus) in a local development environment using Docker Compose. Traces are being generated successfully and I can query them in Grafana. However, I cannot get metrics generation or service graphs working for the life of me. Any help is much appreciated!
I have Tempo configured to generate metrics as follows:
tempo.yaml
...
metrics_generator:
  processor:
    service_graphs:
    span_metrics:
  registry:
    external_labels:
      source: tempo
      cluster: docker-compose
  storage:
    path: /var/tempo/generator/wal
    wal:
    remote_write_flush_deadline: 30s
    remote_write:
      - url: http://host.docker.internal:9090/api/v1/write
        send_exemplars: true
  traces_storage:
    path: /var/tempo/generator/traces

overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics, local-blocks]
...
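For reference, the relevant parts of my docker-compose.yml look roughly like the sketch below (trimmed, with illustrative image tags, ports, and volume names rather than an exact copy). The intent is that the metrics generator remote-writes to Prometheus via host.docker.internal, and Prometheus runs with its remote-write receiver enabled so that /api/v1/write accepts the pushes.

docker-compose.yml (sketch)
services:
  tempo:
    image: grafana/tempo:latest
    command: ["-config.file=/etc/tempo/tempo.yaml"]
    volumes:
      - ./tempo.yaml:/etc/tempo/tempo.yaml
      - tempo-data:/var/tempo
    extra_hosts:
      - "host.docker.internal:host-gateway"   # lets the container reach Prometheus published on the host
    ports:
      - "3200:3200"   # Tempo HTTP API
      - "4317:4317"   # OTLP gRPC ingest

  prometheus:
    image: prom/prometheus:latest
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --web.enable-remote-write-receiver   # without this flag Prometheus rejects remote_write pushes
    ports:
      - "9090:9090"

volumes:
  tempo-data: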
In the Tempo logs, I can see the metrics generator collecting, but active_series never rises above 0:
$ docker-compose logs -f tempo
tempo-1 | level=info ts=2024-05-29T16:42:27.186169918Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:42:42.1872593Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:42:57.186681168Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:43:12.187923717Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:43:27.187517543Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:43:42.187359467Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:43:57.188085252Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:44:12.188117259Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:44:27.187242793Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:44:42.187321925Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:44:57.18904496Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:45:12.188460425Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:45:27.186341627Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:45:36.611407423Z caller=poller.go:136 msg="blocklist poll complete" seconds=0.00014825
tempo-1 | level=info ts=2024-05-29T16:45:42.191599509Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:45:57.185562127Z caller=registry.go:257 tenant=single-tenant msg="deleted stale series" active_series=0
tempo-1 | level=info ts=2024-05-29T16:45:57.186138252Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:46:12.187804967Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:46:27.188579127Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:46:42.186421217Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:46:57.187713293Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:47:12.191388217Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:47:27.189738502Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:47:42.186984175Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:47:57.187703418Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
tempo-1 | level=info ts=2024-05-29T16:48:12.19496105Z caller=registry.go:236 tenant=single-tenant msg="collecting metrics" active_series=0
But nothing is ever sent to Prometheus, and what I find odd is that the metrics generator's WAL is empty, containing only zero-byte segment files:
/var/tempo/generator # tree -ah
[4.0K]  .
├── [4.0K]  traces
│   └── [4.0K]  single-tenant
│       ├── [4.0K]  blocks
│       │   └── [4.0K]  single-tenant
│       │       ├── [4.0K]  13a28e4a-d15e-45a1-a841-1414e8e59e57
│       │       │   ├── [100K]  bloom-0
│       │       │   ├── [ 20K]  data.parquet
│       │       │   ├── [  42]  index
│       │       │   └── [ 414]  meta.json
│       │       ├── [4.0K]  7d9a5da1-51ec-4b45-97b5-5c018848d727
│       │       │   ├── [100K]  bloom-0
│       │       │   ├── [ 22K]  data.parquet
│       │       │   ├── [  42]  index
│       │       │   └── [ 415]  meta.json
│       │       ├── [4.0K]  ee59dd30-075c-4dfa-ae17-dc2c7192d5f3
│       │       │   ├── [100K]  bloom-0
│       │       │   ├── [ 21K]  data.parquet
│       │       │   ├── [  42]  index
│       │       │   └── [ 415]  meta.json
│       │       └── [4.0K]  f4bfddf1-e9a6-463e-b33f-72ed561d3836
│       │           ├── [100K]  bloom-0
│       │           ├── [ 22K]  data.parquet
│       │           ├── [  42]  index
│       │           └── [ 415]  meta.json
│       └── [4.0K]  e296a42a-0dfa-4dd6-9581-a085f3037f6d+single-tenant+vParquet3
│           └── [   0]  0000000001
└── [4.0K]  wal
    └── [4.0K]  single-tenant
        ├── [   0]  lock
        └── [4.0K]  wal
            └── [   0]  00000000
Here's an example trace: Trace-f33d2e-2024-05-29 18_00_08.json - Google Drive
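In case the Grafana side matters: the Tempo data source is provisioned with its service graph feature pointed at the Prometheus data source, roughly as in the sketch below (names, UIDs, and URLs are illustrative of my setup, not copied verbatim). As I understand it, the service graph panel only lights up once the linked Prometheus data source actually contains the traces_service_graph_* series, which brings me back to the generator.

datasources.yaml (sketch)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    uid: prometheus
    url: http://host.docker.internal:9090
  - name: Tempo
    type: tempo
    uid: tempo
    url: http://tempo:3200
    jsonData:
      serviceMap:
        datasourceUid: prometheus   # where Grafana looks for the service graph metrics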
I've trawled the docs, forums, GitHub, and anywhere else I can find that mentions metrics generator issues, but I'm still banging my head against this. What is most frustrating is that at some point I briefly had this working and saw metrics such as traces_spanmetrics_size_total in Prometheus, but it was short-lived, and I can't figure out why it started working or why it stopped so abruptly.
Please help!