"unable to get stream rates from ingester" logged intermittently

Hi All,

I am running Loki 3.1.0 in SSD mode on CKS.

Recently, I found the errors below logged on all of the write pods:

I can confirm there was no data I/O burst at that time, and CPU/memory usage was below average. Besides, there are 2 other Loki stacks with the same configuration and no data ingested at all, and they logged the same error at almost the same time:

I tried increasing remote_timeout from the default 5s to 10s, but it made no difference.

Any thoughts on this issue are very welcome!

@tonyswumac sorry to bother you Tony, but it would be much appreciated if you could take a look at this one.

Something must be causing some sort of delay. Can you share your configuration, please? Did anything change recently?

Also check whether the host was busy for other reasons during that time.

Hi @tonyswumac,

Thanks for jumping in.

Here is the configmap of the loki stack:

And here is the container config of the affected pod loki-write:

I have confirmed with the K8S admin, and they said no scheduled jobs were running during that period.

Btw, the same error appeared again around the same time today in all 3 clusters.

Your configuration looks good to me. You mentioned the same time today. Does that mean the errors are intermittent and only happen at a specific time?

Hi Tony,

Thanks for the reply.

It happens at a specific time, mostly between 21:30 and 21:50, as you can tell from the 2nd screenshot.

If it happens at a specific time then it’s most likely not because of Loki itself. If it were a configuration issue I’d expect it to manifest much more frequently.

Check your Kubernetes cluster and see whether containers or hosts are being replaced at that time for some reason. If you see the error reliably during that window, try a telnet test from one Loki container to another.
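
If telnet is not available in the container, the same connectivity test can be sketched in a few lines of Python, run from any pod in the same namespace that has an interpreter. This is only a rough sketch: the function names check_tcp and probe are made up for illustration, and the host name and ports in the usage note are assumptions (9095 and 7946 are Loki's default gRPC and memberlist ports, but your configmap may override them).

```python
import socket
import time


def check_tcp(host, port, timeout=3.0):
    """Try one TCP connection; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection resolves the name and completes the TCP handshake,
        # which is roughly what a successful "telnet host port" demonstrates.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False, time.monotonic() - start, str(exc)


def probe(targets, timeout=3.0):
    """Check each (host, port) pair once and return one result row per target."""
    rows = []
    for host, port in targets:
        ok, elapsed, err = check_tcp(host, port, timeout)
        rows.append({
            "host": host,
            "port": port,
            "ok": ok,
            "seconds": round(elapsed, 3),
            "error": err,
        })
    return rows
```

For example, probe([("loki-write-1.loki-write-headless", 9095), ("loki-write-1.loki-write-headless", 7946)]) with a hypothetical peer pod name, repeated every few seconds across the 21:30 to 21:50 window. Failures or unusually long connect times that line up with the errors would point at the network or the hosts rather than at Loki.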

Thanks Tony, I will check with K8S admin later and get back to you afterwards if there is any update.