Tempo - Minio S3 troubleshooting

I have set up Tempo via Helm Chart and the following configuration for S3 in Tempo:

backend:
  s3:
    bucket: tempo
    endpoint: minio-s3.minio.svc.cluster.local
    access_key: **
    secret_key: **
    insecure: true
storage:
  trace:
    backend: local
    local:
      path: /var/tempo/traces
    wal:
      path: /var/tempo/wal

The Python app I use to test:

from flask import Flask, request

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
    OTLPSpanExporter,
)
from opentelemetry.sdk.trace.export import BatchSpanProcessor


app = Flask(__name__)

span_exporter = OTLPSpanExporter(
    endpoint="tempo.monitoring.svc.cluster.local",
)

tracer_provider = TracerProvider()
trace.set_tracer_provider(tracer_provider)

span_processor = BatchSpanProcessor(span_exporter)
tracer_provider.add_span_processor(span_processor)


tracer = trace.get_tracer_provider().get_tracer(__name__)


@app.route("/server_request")
def server_request():
    print(request.args.get("param"))
    return "served"


if __name__ == "__main__":
    app.run(port=8082)

In my Nginx ingress I can see:
{"log":"10.0.1.238 - - [02/Nov/2021:09:43:23 +0000] \"POST /opentelemetry.proto.collector.trace.v1.TraceService/Export HTTP/2.0\" 200 5 \"-\" \"grpc-python/1.41.1 grpc-c/19.0.0 (osx; chttp2)\" 1001 0.002 [monitoring-tempo-4317] [] 10.0.0.26:4317 47 0.004 200 e6b94ca73b5667dd710d5e42961b581d\n","stream":"stdout","time":"2021-11-02T09:43:23.563588542Z"}

So the ingestion seems to be successful.

Logs from Tempo:

level=info ts=2021-11-01T08:46:30.701107722Z caller=main.go:190 msg="initialising OpenTracing tracer"

level=info ts=2021-11-01T08:46:30.720934127Z caller=main.go:109 msg="Starting Tempo" version="(version=, branch=HEAD, revision=e586b3380)"

level=info ts=2021-11-01T08:46:30.724209061Z caller=server.go:239 http=[::]:3100 grpc=[::]:9095 msg="server listening on addresses"

level=info ts=2021-11-01T08:46:30.724620278Z caller=frontend.go:30 msg="creating tripperware in query frontend to shard queries"

level=warn ts=2021-11-01T08:46:30.725623406Z caller=modules.go:145 msg="Worker address is empty in single binary mode. Attempting automatic worker configuration. If queries are unresponsive consider configuring the worker explicitly." address=127.0.0.1:9095

level=info ts=2021-11-01T08:46:30.725866856Z caller=worker.go:113 msg="Starting querier worker connected to query-frontend" frontend=127.0.0.1:9095

ts=2021-11-01T08:46:30Z level=info msg="OTel Shim Logger Initialized" component=tempo

level=info ts=2021-11-01T08:46:30.727561229Z caller=module_service.go:59 msg=initialising module=store

level=info ts=2021-11-01T08:46:30.727921407Z caller=module_service.go:59 msg=initialising module=server

level=info ts=2021-11-01T08:46:30.728098062Z caller=module_service.go:59 msg=initialising module=query-frontend

level=info ts=2021-11-01T08:46:30.728258452Z caller=module_service.go:59 msg=initialising module=memberlist-kv

level=info ts=2021-11-01T08:46:30.728387509Z caller=module_service.go:59 msg=initialising module=ring

level=info ts=2021-11-01T08:46:30.728602596Z caller=ring.go:275 msg="ring doesn't exist in KV store yet"

level=info ts=2021-11-01T08:46:30.728725361Z caller=client.go:247 msg="value is nil" key=collectors/ring index=1

level=info ts=2021-11-01T08:46:30.728837683Z caller=module_service.go:59 msg=initialising module=querier

level=info ts=2021-11-01T08:46:30.728976203Z caller=worker.go:193 msg="adding connection" addr=127.0.0.1:9095

level=info ts=2021-11-01T08:46:30.731534998Z caller=module_service.go:59 msg=initialising module=overrides

level=info ts=2021-11-01T08:46:30.731603832Z caller=module_service.go:59 msg=initialising module=ingester

level=info ts=2021-11-01T08:46:30.731719203Z caller=module_service.go:59 msg=initialising module=distributor

level=info ts=2021-11-01T08:46:30.731761179Z caller=module_service.go:59 msg=initialising module=compactor

level=info ts=2021-11-01T08:46:30.73182886Z caller=tempodb.go:379 msg="polling enabled" interval=5m0s concurrency=50

level=info ts=2021-11-01T08:46:30.732185809Z caller=compactor.go:101 msg="waiting for compaction ring to settle" waitDuration=1m0s

level=info ts=2021-11-01T08:46:30.732375293Z caller=ingester.go:321 msg="beginning wal replay"

ts=2021-11-01T08:46:30Z level=info msg="No sampling strategies provided or URL is unavailable, using defaults" component=tempo

ts=2021-11-01T08:46:30Z level=info msg="Starting GRPC server on endpoint 0.0.0.0:4317" component=tempo

ts=2021-11-01T08:46:30Z level=info msg="Setting up a second GRPC listener on legacy endpoint 0.0.0.0:55680" component=tempo

ts=2021-11-01T08:46:30Z level=info msg="Starting GRPC server on endpoint 0.0.0.0:55680" component=tempo

ts=2021-11-01T08:46:30Z level=info msg="Starting HTTP server on endpoint 0.0.0.0:4318" component=tempo

level=info ts=2021-11-01T08:46:30.732892945Z caller=ingester.go:345 msg="wal replay complete"

level=info ts=2021-11-01T08:46:30.732980635Z caller=ingester.go:359 msg="reloading local blocks" tenants=0

level=info ts=2021-11-01T08:46:30.733098856Z caller=lifecycler.go:530 msg="not loading tokens from file, tokens file path is empty"

level=info ts=2021-11-01T08:46:30.733193106Z caller=app.go:251 msg="Tempo started"

level=info ts=2021-11-01T08:46:30.733308404Z caller=lifecycler.go:559 msg="instance not found in ring, adding with no tokens" ring=ingester

level=info ts=2021-11-01T08:46:30.733491435Z caller=lifecycler.go:406 msg="auto-joining cluster after timeout" ring=ingester

level=info ts=2021-11-01T08:47:30.733746155Z caller=compactor.go:103 msg="enabling compaction"

level=info ts=2021-11-01T08:47:30.733802195Z caller=tempodb.go:359 msg="compaction and retention enabled."

Tempo query:

2021/11/01 08:46:35 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined

{"level":"info","ts":1635756395.4116976,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}

{"level":"info","ts":1635756395.4117658,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}

{"level":"info","ts":1635756395.4120743,"caller":"flags/admin.go:121","msg":"Mounting health check on admin server","route":"/"}

{"level":"info","ts":1635756395.4121535,"caller":"flags/admin.go:127","msg":"Starting admin HTTP server","http-addr":":16687"}

{"level":"info","ts":1635756395.4121768,"caller":"flags/admin.go:113","msg":"Admin server started","http.host-port":"[::]:16687","health-status":"unavailable"}

{"level":"info","ts":1635756395.4386072,"caller":"grpc@v1.29.1/clientconn.go:243","msg":"parsed scheme: \"\"","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4387138,"caller":"grpc@v1.29.1/clientconn.go:249","msg":"scheme \"\" not registered, fallback to default scheme","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4387798,"caller":"grpc@v1.29.1/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{unused 0 }] }","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4388123,"caller":"grpc@v1.29.1/clientconn.go:667","msg":"ClientConn switching balancer to \"pick_first\"","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4388225,"caller":"grpc@v1.29.1/clientconn.go:682","msg":"Channel switches to new LB policy \"pick_first\"","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.438887,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.438935,"caller":"grpc/factory.go:79","msg":"External plugin storage configuration","configuration":{"PluginBinary":"/tmp/tempo-query","PluginConfigurationFile":"/conf/tempo-query.yaml","PluginLogLevel":"warn"}}

{"level":"info","ts":1635756395.4390037,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"unused\" to connect","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.439275,"caller":"grpc@v1.29.1/clientconn.go:881","msg":"blockingPicker: the picked transport is not ready, loop back to repick","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4393234,"caller":"grpc@v1.29.1/clientconn.go:881","msg":"blockingPicker: the picked transport is not ready, loop back to repick","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4395618,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4399667,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.439994,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}

{"level":"info","ts":1635756395.4416337,"caller":"querysvc/query_service.go:137","msg":"Archive storage not created","reason":"archive storage not supported"}

{"level":"info","ts":1635756395.4416702,"caller":"app/flags.go:143","msg":"Archive storage not initialized"}

{"level":"info","ts":1635756395.4422145,"caller":"app/server.go:163","msg":"Query server started","port":16686,"addr":":16686"}

{"level":"info","ts":1635756395.442267,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}

{"level":"info","ts":1635756395.442288,"caller":"app/server.go:232","msg":"Starting CMUX server","port":16686,"addr":":16686"}

{"level":"info","ts":1635756395.4423287,"caller":"app/server.go:208","msg":"Starting HTTP server","port":16686,"addr":":16686"}

{"level":"info","ts":1635756395.4423535,"caller":"app/server.go:221","msg":"Starting GRPC server","port":16686,"addr":":16686"}

How can I check if Tempo can store the traces in S3?

Also I don’t know if I’m using the right port for Tempo and Grafana:

Tempo Query is only required by Grafana 7.4.x and earlier, so it can probably be ignored here.

The Helm chart defaults to port 3100:
helm-charts/charts/tempo at main · grafana/helm-charts · GitHub (search for httpListenPort)

and you can see our example config here:

This is hopefully just a configuration issue between Grafana and Tempo. Try the 3100 port and see how it goes.
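For reference, a minimal Grafana datasource provisioning sketch pointing at port 3100 could look like the following. The service address is taken from the config in the question; the datasource name and file placement are illustrative assumptions, not part of the original post:

```yaml
# Hypothetical provisioning file, e.g. provisioning/datasources/tempo.yaml
apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    access: proxy
    # 3100 is the chart's httpListenPort default mentioned above
    url: http://tempo.monitoring.svc.cluster.local:3100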

How can I check if Tempo can store the traces in S3?

After running Tempo for a bit you can just check the configured bucket to see if it has created any blocks. You can also start Tempo with --distributor.log-received-traces and it will log all the traces it receives. This can be a quick way to confirm ingestion is working.
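As a sketch of how to inspect the bucket from the cluster, assuming the MinIO client (`mc`) is available in the MinIO pod and using the pod/namespace names from the question (both assumptions; adjust to your deployment):

```shell
# List any blocks Tempo has flushed into the "tempo" bucket
# (deployment name "minio" and alias "local" are illustrative assumptions)
kubectl exec -n minio deploy/minio -- mc ls --recursive local/tempo
```

If the bucket is empty after traces have been ingested, the next step is usually to check whether blocks are still sitting in the WAL on local disk.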

Also here are general purpose troubleshooting docs:


Thank you for answering my questions. I have checked the Prometheus metrics, and they return numbers > 0:

tempo_ingester_traces_created_total{container="tempo", endpoint="tempo-prom-metrics", instance="10.0.0.10:3100", job="tempo", namespace="monitoring", pod="tempo-0", service="tempo", tenant="single-tenant"}
12

and

tempo_distributor_spans_received_total{container="tempo", endpoint="tempo-prom-metrics", instance="10.0.0.10:3100", job="tempo", namespace="monitoring", pod="tempo-0", service="tempo", tenant="single-tenant"}

For accessing Tempo via Grafana, I changed tempo.monitoring.svc.cluster.local:3100 to tempo:3100 and no errors are present anymore. However, the Minio bucket is still empty, so no traces are stored in it.

However, the Minio bucket is still empty, so no traces are stored in it.

Tempo stores traces in memory and on disk until a certain threshold is reached, at which point it cuts a block and pushes it to Minio. It's possible this just hasn't happened yet.

See max_block_bytes and max_block_duration for ways to reduce the threshold on cutting blocks:
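A sketch of what lowering those thresholds looks like in the Tempo configuration; the values here are illustrative for testing, not recommendations:

```yaml
ingester:
  # cut a block after 5 minutes even if it is small
  max_block_duration: 5m
  # or once the block reaches roughly 1 MB of trace data
  max_block_bytes: 1000000
```

With smaller thresholds, blocks should appear in the bucket much sooner after ingestion.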

Also you can check Tempo’s filesystem to see if there are wal files in /var/tempo/wal.
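For example, something along these lines, assuming the pod name `tempo-0` and namespace `monitoring` from the metric labels in the previous post:

```shell
# Look for WAL files that have not yet been cut into blocks and flushed
kubectl exec -n monitoring tempo-0 -- ls -l /var/tempo/wal
```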
