Hi Loki Team,
I recently upgraded Loki from 2.6.1 to 2.8.2, and since then complex queries that retrieve a lot of data (up to the last 30 days) have been failing. In our case, even queries over the last 7 days fail. Every time, we get a 504 Gateway Timeout error. Going through the logs, we found that the querier and query-frontend components are reporting errors: msg="error processing requests" err=EOF and err="rpc error: code = Canceled desc = context canceled". The ingester component was also showing errors. After going through the recent documentation and the community pages, I made several changes to my configuration:
- Added the index-gateway with a PVC.
- Set the querier timeout in limits_config to 5 minutes.
- Updated the Grafana proxy timeout to 300s.
- Added proxy read, write, and connect timeouts of 300s to the Grafana ingress.
- Added a PVC to the ingester, because the ingester was showing i/o timeout errors.
- Added ingestion_rate_mb, ingestion_burst_size_mb, per_stream_rate_limit, and per_stream_rate_limit_burst, because the Promtail logs were showing warnings that the stream limit was exceeded.
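For reference, the limits_config part of these changes looks roughly like the sketch below. Only the 5-minute timeout is taken from our setup; the key name query_timeout is my assumption for where the querier timeout lives in limits_config, and the rate-limit values are placeholders rather than our exact settings.

```yaml
# Sketch of the limits_config changes listed above (not the full config).
limits_config:
  query_timeout: 5m                  # querier timeout; key name assumed
  ingestion_rate_mb: 10              # placeholder value
  ingestion_burst_size_mb: 20        # placeholder value
  per_stream_rate_limit: 5MB         # placeholder value
  per_stream_rate_limit_burst: 15MB  # placeholder value
```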
All the simple queries run fine, but whenever we run complex queries over the last 7 to 30 days of logs, they fail. Please let me know what we are missing here.
Thanks!