Hi all,
The Grafana stack seems to be heavily focused on alerts and metrics as a starting point: first a metric or alert flags a problem, then you drill down into logs and traces to figure out what happened.
Why did I get that impression? Loki does not index log content; it indexes only labels, which seem to be used mainly for sharding into streams. The recommendation is to use only source identifiers as labels, so searching is efficient only by when and where. Tempo indexes only by trace ID, so a lookup is efficient only if you already have the trace ID from somewhere else (such as metrics).
That is all well and good unless you want to find functional rather than non-functional issues, because you cannot easily search by customer ID or a similar field. For example, if a customer complains about a transaction they made yesterday at 13:44, how would you find it? Maybe Tempo's backend search can help, but as far as I can tell it is not production-grade.
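To make the example concrete, here is a rough sketch of the kind of lookup I have in mind against Loki's `query_range` HTTP endpoint: a label selector to pick the streams, a narrow time window around the complaint, and a line filter for the customer. The host name, the `app` label, the `customer_id=...` log format, and the date are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Hypothetical Loki address; replace with your own.
LOKI_URL = "http://loki.example.internal:3100"


def build_query_range_url(app, customer_id, around, window_minutes=5):
    """Build a Loki query_range request for one customer near a point in time."""
    # The label selector picks the streams ("where"); the |= line filter
    # then scans the log lines in those streams for the customer id.
    # Assumes log lines contain text like customer_id=12345 (hypothetical format).
    logql = f'{{app="{app}"}} |= "customer_id={customer_id}"'

    # A narrow window ("when") keeps the amount of scanned data small.
    start = around - timedelta(minutes=window_minutes)
    end = around + timedelta(minutes=window_minutes)

    params = {
        "query": logql,
        # Timestamps are passed as nanosecond Unix epochs.
        "start": int(start.timestamp()) * 1_000_000_000,
        "end": int(end.timestamp()) * 1_000_000_000,
        "limit": 100,
    }
    return params, f"{LOKI_URL}/loki/api/v1/query_range?{urlencode(params)}"


# Hypothetical complaint: a transaction at 13:44 UTC on an example date.
complaint_time = datetime(2024, 1, 15, 13, 44, tzinfo=timezone.utc)
params, url = build_query_range_url("payments", "12345", complaint_time)
print(params["query"])
print(url)
```

This only works reasonably if I already know which app to look at and roughly when; the customer ID itself is matched by scanning the lines, not by an index, which is exactly my concern.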
Am I getting this right? What can I do about functional issues?
Thanks!