I need help with the error below; it is showing up repeatedly in the ingester component.
I have tried increasing several limits, but the error still reports the limit as 4KB/sec for many streams.
entry with timestamp 2024-08-07 05:08:56.217050506 +0000 UTC ignored, reason: 'Per stream rate limit exceeded (limit: 4KB/sec) while attempting to ingest for stream …, consider splitting a stream via additional labels or contact your Loki administrator to see if the limit can be increased'
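(The "splitting a stream via additional labels" suggestion in that error means adding a label so one high-volume stream becomes several smaller streams, each with its own per-stream rate limit. A sketch, assuming Promtail is the client; the relabel rule is illustrative:)

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Promote the pod name to a stream label so each pod gets its own
      # stream (and its own rate limit) instead of sharing one big stream.
      # Keep an eye on label cardinality when doing this.
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```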
That 4KB/sec limit seems excessively low. The recommendation is 5MB for the per-stream limit and 20MB for the burst; I’d try that and see if it helps. Also hit the /config endpoint on your ingesters and double-check that your configuration is actually applied.
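In the Loki config file that would look roughly like this (values are the recommendation above; the exact path in your Helm values depends on the chart you use):

```yaml
limits_config:
  per_stream_rate_limit: 5MB         # steady-state limit per stream
  per_stream_rate_limit_burst: 20MB  # burst allowance per stream
```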
I have checked the configuration, and it is applied in the pod’s config file. I have even reduced it to the recommended 5MB/20MB settings, but it still shows the same error, unfortunately.
Can you double-check your ingester configuration by hitting the HTTP /config endpoint, just as a sanity check?
You could also try deploying with a different topology (such as simple scalable mode), and maybe even a different version (3.0.0, or even 2.9.*), just as tests. You never know — the more you poke at it, the more likely you are to find something you weren’t seeing before. I still think your configuration is somehow not being applied, though.
I have double-checked the config from inside the container; the Loki config actually in use shows exactly the same values as the Helm config.
I even ran loki -verify-config in all the ingester pods, and it reports the config as valid.
Unfortunately, I cannot downgrade since this is a live environment, and even with the errors I am still receiving most of the logs.
I’d consider standing up a test cluster if you can: use the same Helm chart, deploy to your dev environment, and see what happens, so you can poke at it freely.
Yes, sharding is enabled. As for the test cluster, I have one in a test environment now, but the log volume is definitely not the same because the environments are segregated, and I’m not seeing the error there.
Honestly, I am not quite sure why you are having this problem; I think your best bet is to keep poking at it and try to reproduce it if possible. You can use Grafana’s k6 to generate enough burst traffic to mimic your production cluster, and see if that surfaces the error.
Hi,
Just an update: I no longer get the error messages!
Here is my updated config: bloom has been completely disabled, and ingestion_rate_strategy has been set to local with sane per_stream_rate_limit values.
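For anyone hitting the same issue, the relevant stanzas look roughly like this (illustrative values, not the full production config; the bloom component option names vary between Loki versions, so check the docs for yours):

```yaml
limits_config:
  ingestion_rate_strategy: local     # enforce limits per ingester, not globally
  per_stream_rate_limit: 5MB
  per_stream_rate_limit_burst: 20MB
bloom_gateway:
  enabled: false                     # bloom features disabled (compactor/build
                                     # settings are version-dependent)
```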