Description:
Loki intermittently fails to flush chunks to S3. The error message is:
failed to flush chunks: store put chunk: SlowDown: Please reduce your request rate. status code: 503
Environment:
- Single-tenant deployment using a single S3 bucket (s3.us-east-1.amazonaws.com)
- Ingestion parameters: ingestion_rate_mb is set to 16 and ingestion_burst_size_mb is set to 32
- Chunk parameters: chunk_target_size is set to 8MB, and chunk_idle_period is set to 1h
Steps to Reproduce:
- Run Loki under high log ingestion load. Some chunk flushes to S3 intermittently fail with the “SlowDown” (503) error.
What Has Been Tried:
- Lowering the ingestion rate to smooth out the write flow
- Adjusting chunk_target_size and chunk_idle_period to produce fewer, larger writes to S3
- Tuning HTTP connection settings such as timeout and idle_conn_timeout
- Verifying and adjusting retry strategies for write failures
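
For reference, the kind of retry strategy meant in the last item can be sketched as exponential backoff with jitter. This is only an illustration of the general technique, not Loki's or the AWS SDK's actual implementation. The names SlowDownError, backoff_delays, put_with_backoff, and the parameters min_period, max_period and max_retries are hypothetical and chosen for this example, and the default values are not the ones from our deployment.

```python
import random
import time


class SlowDownError(Exception):
    """Hypothetical stand-in for an S3 503 SlowDown response."""


def backoff_delays(min_period=0.1, max_period=10.0, max_retries=5, rng=random.random):
    """Yield one sleep duration (seconds) per retry.

    The cap doubles on every attempt up to max_period, and the actual delay
    is a random fraction of that cap ("full jitter"), so that many writers
    hitting the limit at once do not retry in lockstep.
    """
    for attempt in range(max_retries):
        cap = min(max_period, min_period * (2 ** attempt))
        yield rng() * cap


def put_with_backoff(put_chunk, sleep=time.sleep, **backoff):
    """Call put_chunk(); on SlowDownError, sleep and retry until retries run out."""
    delays = backoff_delays(**backoff)
    while True:
        try:
            return put_chunk()
        except SlowDownError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the error to the caller
            sleep(delay)
```

With this shape, put_chunk would be whatever function performs the S3 write (assumed here to raise SlowDownError on a 503), and raising max_retries or max_period trades flush latency for a lower chance of a failed flush.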