Loki ingesters RAM overflowing

I deployed a cluster with 2 distributors, 3 ingesters, 3 queriers and 3 query-frontends, each on its own machine. I am testing its capabilities by writing 2 MB/s from 30 clients (about 65 KB/s each). All the logs are saved fine, but the ingesters' RAM slowly climbs until it reaches 100% and they crash. It took about 1.5 hours to reach a state where 2 ingesters are at 65% RAM and one of them is at 90%.
Each ingester machine has 8 GB of RAM and 4 CPUs. (It was 4 GB before; I tried scaling it up to see if that would fix it, and it didn't.)
All the data was saved successfully, and the distributors handled the load easily.
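
For a rough sense of scale, here is a back-of-the-envelope calculation using only the numbers above. It is a sketch, not a memory model: it assumes the rates are in bytes (KB/s, MB/s), uses decimal units, and ignores compression and replication, both of which change how much actually sits in ingester memory. The function names are made up for illustration.

```python
# Rough arithmetic on the figures quoted in the post.
# Assumptions (not confirmed by the post): rates are bytes, decimal units,
# no compression, no replication.

CLIENTS = 30
RATE_PER_CLIENT_KB_S = 65
TEST_DURATION_S = 1.5 * 3600  # about 1.5 hours


def total_rate_kb_s(clients, rate_per_client_kb_s):
    """Aggregate write rate across all clients, in KB/s."""
    return clients * rate_per_client_kb_s


def ingested_gb(rate_kb_s, seconds):
    """Raw volume written over the given time, in GB (1 GB = 1,000,000 KB)."""
    return rate_kb_s * seconds / 1_000_000


if __name__ == "__main__":
    rate = total_rate_kb_s(CLIENTS, RATE_PER_CLIENT_KB_S)
    volume = ingested_gb(rate, TEST_DURATION_S)
    print(f"aggregate rate: {rate} KB/s")
    print(f"raw volume after 1.5 h: {volume:.2f} GB")
```

So the 30 clients add up to roughly the 2 MB/s target, and about 10.5 GB of raw logs is written during the 1.5 hours it takes for memory to fill up. That is more than the 8 GB a single ingester has, so memory that keeps growing over that window fits with ingested data not being released.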

Has anyone seen something similar? Any recommendations?

Hi @dror1212, can you please share your Loki configs? I'm particularly interested in the ingester config.

A few clarifications:

  1. Are all 30 of your clients sending logs via promtail? Can you share your client configs as well?
  2. Also curious: where are you running your workloads? (Kubernetes, plain EC2/GCE, or something else)

Can you also please share which version of Loki you are using?

Hi @kavikanagaraj, sorry for the delay, and thanks for the reply.
Yes, they all send logs via promtail, and we are running on our own RHV-based environment.
The problem was that I had misconfigured the chunk_retain_period parameter, so the data was not being removed from memory. After lowering it, the issue was fixed.
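
For anyone hitting the same symptom, this is roughly where the setting lives in the Loki config file. The value below is only an illustrative assumption, not the value from this cluster (the actual config was never posted in this thread). As far as I know, chunk_retain_period controls how long chunks are kept in ingester memory after they have been flushed to storage, so a large value keeps already-flushed data in RAM.

```yaml
# Sketch of the relevant part of the Loki config.
# The value is an illustrative assumption; tune it for your own setup.
ingester:
  # How long flushed chunks stay in ingester memory.
  # A short duration releases memory soon after the flush.
  chunk_retain_period: 30s
```

If memory keeps growing at a steady ingest rate, comparing this value against how long the test has been running is a quick first check.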
