How to determine disk usage per log stream in Loki

I am using Grafana Loki to store logs from multiple microservices. Loki has accumulated a huge amount of log data (approximately 40 GB) in a short period of time, and I would like to find out which application is producing so much data. What is the best way to calculate the amount of storage used per data stream?

I am using three labels: service, stack, and replica.

First, I checked the metrics, but there is no information about per-stream storage. Then I tried logcli, but it has no option for this use case. Finally, I tried analyzing the chunk directory, which might lead to some results.

I know that all data is stored in chunks, but when I list them I just get a list of base64 strings:

ZmFrZS80MThiYjBhMDIyYjc3NDNiOjE4MzU0NDg0NTMxOjE4MzU0YjY3NzEzOmM5ZWE4ODc2
ZmFrZS80MThiYjBhMDIyYjc3NDNiOjE4MzU0YjY5ZWExOjE4MzU1MjRkMzMwOmNkNWUwOTM5
ZmFrZS80MThiYjBhMDIyYjc3NDNiOjE4MzU1MjRmNGE0OjE4MzU1OTMzMWIxOjFiNTNjNmYw
ZmFrZS80MThiYjBhMDIyYjc3NDNiOjE4MzU1OTM1OTM5OjE4MzU2MDE2ZjljOmY1OGE2ZWIw
ZmFrZS80MThiYjBhMDIyYjc3NDNiOjE4MzU2MDE5NzJjOjE4MzU2NmZkMjNiOjIzZmIyMDEw

When I decode them from base64:

fake/418bb0a022b7743b:18354484531:18354b67713:c9ea8876
fake/418bb0a022b7743b:18354b69ea1:1835524d330:cd5e0939
fake/418bb0a022b7743b:1835524f4a4:183559331b1:1b53c6f0
fake/418bb0a022b7743b:18355935939:18356016f9c:f58a6eb0
fake/418bb0a022b7743b:1835601972c:183566fd23b:23fb2010

From the output I can determine only one thing: fake/ is the default tenant ID. I do not know what the other numbers mean.
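For what it's worth, the decoded names appear to follow Loki's chunk key layout of tenant/fingerprint:from:through:checksum, where from and through look like hex Unix timestamps in milliseconds. Assuming that layout (I have not verified it against the Loki source), a small Python sketch can split a chunk name into its parts:

```python
import base64
from datetime import datetime, timezone

def parse_chunk_key(encoded: str) -> dict:
    """Decode a base64 chunk filename and split it into its parts.

    Assumes the key layout <tenant>/<stream fingerprint>:<from>:<through>:<checksum>,
    where <from> and <through> are hex Unix timestamps in milliseconds.
    """
    key = base64.b64decode(encoded).decode("ascii")
    tenant, rest = key.split("/", 1)
    fingerprint, start_hex, end_hex, checksum = rest.split(":")
    return {
        "tenant": tenant,
        "fingerprint": fingerprint,
        "from": datetime.fromtimestamp(int(start_hex, 16) / 1000, tz=timezone.utc),
        "through": datetime.fromtimestamp(int(end_hex, 16) / 1000, tz=timezone.utc),
        "checksum": checksum,
    }

# First chunk name from the listing above
print(parse_chunk_key(
    "ZmFrZS80MThiYjBhMDIyYjc3NDNiOjE4MzU0NDg0NTMxOjE4MzU0YjY3NzEzOmM5ZWE4ODc2"
))
```

If the assumption holds, the two middle numbers are the time range the chunk covers, and 418bb0a022b7743b would be the fingerprint of the label set (stream) the chunk belongs to.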

I am using Promtail to harvest data from Docker Swarm services with docker_sd_configs.

I have also thought about using chunks-inspect, but that is quite a brute-force approach I would rather avoid, because it would take a huge amount of computation time and energy as the volume of logs grows.

What is the best way to analyze how much data is stored per stream?
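As a stopgap along the lines of the chunk-directory idea above, here is a sketch that walks a filesystem chunk directory (the path is hypothetical) and sums file sizes per stream fingerprint, assuming filenames are base64-encoded chunk keys as shown in the listing. Note that this only groups bytes by fingerprint; mapping a fingerprint back to its label set (service, stack, replica) would still require the index, so this is a partial answer at best:

```python
import base64
import os
from collections import defaultdict

def bytes_per_stream(chunk_dir: str) -> dict:
    """Sum on-disk chunk sizes per stream fingerprint.

    Assumes each filename is a base64-encoded key of the form
    <tenant>/<fingerprint>:<from>:<through>:<checksum>.
    """
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(chunk_dir):
        for name in files:
            try:
                key = base64.b64decode(name).decode("ascii")
                # fingerprint sits between the tenant slash and the first colon
                fingerprint = key.split("/", 1)[1].split(":", 1)[0]
            except (ValueError, IndexError, UnicodeDecodeError):
                continue  # skip files that are not base64 chunk keys
            totals[fingerprint] += os.path.getsize(os.path.join(root, name))
    return dict(totals)
```

Sorting the result by value would at least show which fingerprints dominate the 40 GB, even before resolving them to labels.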
