Getting (all) data directly out of compacted blocks?

Hi, I want to get tracing data back from blocks which have been soft-deleted on a MinIO storage (I do not need to query it, it would be fine to just extract all data).

  • I restored some blocks,
  • let Tempo recreate the index.json.gz,
  • tweaked timeouts,
  • and queried via the /api/search endpoint.

But currently I am struggling with very long query times (10+ minutes and no result, running in monolithic mode via Docker on my local machine), which makes it impractical for me to pursue this route any further.

Context about the blocks: tempo-cli list blocks lists ~200 blocks, totalling 12953 objects and 20MB, all created around 60d ago, storing 5-10m of duration in 1h windows.

Is there a practical way to just get the data directly, in structured form, from the data file of these blocks? Can you point me to some documentation or tooling? Or do I “just” need to scale up (a lot)?

There is no direct way to just dump every trace out of a block. It could be accomplished with some minor changes to one of the tempo-cli commands, though.

For instance, the tempo-cli list block command with the --scan flag iterates through every trace in the block to generate some basic metrics. Each trace is unmarshalled here:

The trace is a *tempopb.Trace, which you could marshal to JSON and dump to stdout.
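
The marshal-and-dump step could look roughly like the sketch below. It is not the actual tempo-cli code: the Trace struct and the traceToJSON helper are hypothetical stand-ins so the example runs with only the standard library. In the real scan loop the value would be the *tempopb.Trace itself, and because that type is generated from protobuf, a protobuf-aware JSON marshaler may be a better fit than encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Trace is a hypothetical stand-in for *tempopb.Trace, used only so that
// this sketch compiles on its own. The real type has different fields.
type Trace struct {
	TraceID   string   `json:"traceID"`
	SpanNames []string `json:"spanNames"`
}

// traceToJSON renders one trace as a single line of JSON.
func traceToJSON(t *Trace) (string, error) {
	b, err := json.Marshal(t)
	if err != nil {
		return "", fmt.Errorf("marshal trace %s: %w", t.TraceID, err)
	}
	return string(b), nil
}

func main() {
	// In the modified CLI command these would come from the block scan;
	// here they are made-up sample values.
	traces := []*Trace{
		{TraceID: "0001", SpanNames: []string{"GET /"}},
		{TraceID: "0002", SpanNames: []string{"db.query", "cache.get"}},
	}

	// One JSON object per line on stdout, so the output can be redirected
	// to a file and processed line by line.
	for _, t := range traces {
		line, err := traceToJSON(t)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println(line)
	}
}
```

Writing one object per line keeps memory use flat regardless of block size, and the result can be filtered afterwards with ordinary line-oriented tools.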

With your hints I was able to dump the information. That was really helpful and works nicely around my query time issue. Thanks a lot!
