Hi,
I deployed a cluster with 2 distributors, 3 ingesters, 3 queriers, and 3 query-frontends, each on its own machine. I am testing capacity by writing 2 MB/s from 30 clients (about 65 KB/s each). All the logs are saved fine, but the ingesters' RAM slowly climbs until it reaches 100% and they crash. After about 1.5 hours, two ingesters were at 65% RAM and one was at 90%.
Each ingester machine has 8 GB RAM and 4 CPUs. (It was 4 GB RAM before; I scaled it up to see if that would fix things, but it didn't.)
All the data was saved successfully, and the distributors handled the load easily.
Has anyone seen something similar? Any recommendations?
Hi @kavikanagaraj sorry for the delay, and thanks for the reply.
Yes, they all sent logs via Promtail, and we are running on our own RHV-based environment.
The problem was that I had misconfigured the `chunk_retain_period` parameter, so flushed chunks were never removed from memory. After lowering it, the issue was fixed.
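For anyone who hits the same thing: `chunk_retain_period` lives in the `ingester` block of the Loki config. A minimal sketch of the relevant section (the values here are illustrative, not a recommendation; tune them for your own workload):

```yaml
ingester:
  # How long a chunk is kept in memory after it has been flushed to
  # the store, so recent queries can be served from memory. A large
  # value here keeps flushed chunks resident and steadily grows
  # ingester RAM under constant write load.
  chunk_retain_period: 30s
  # Related knobs that also affect how long chunks sit in memory:
  chunk_idle_period: 30m   # flush a chunk if no new lines arrive for this long
  max_chunk_age: 1h        # force-flush chunks older than this
```

Lowering `chunk_retain_period` means flushed data leaves ingester memory sooner, at the cost of more reads going to the backing store.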