Hi! We are trying to run an aggregation over roughly 200 GB of log data per day. This is our query:
quantile_over_time(0.99, {__aws_log_type="s3_lb", __extra_name="abhinav", __aws_s3_lb_owner="399771530480", __aws_s3_lb="ProdGatewayServicePrivateLB"} | pattern "<> <> <> <> <> <> <> <> <> <> <> <>" | unwrap latency [1m])
The query fails when run from Grafana, and we see these errors in the Loki queriers:
level=error ts=2023-05-17T06:31:39.38601402Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="quantile_over_time(0.99, {__extra_name="abhinav"} |= "join" | pattern "<>\t<>\t<>\t\t<>\t\t<>\t\t\t<>" | unwrap latency [2s])" err="context canceled"
Can someone please help us identify the bottleneck here?
This is our configuration
Judging by the error you most likely hit some sort of limit. Here are a couple of things I’d recommend trying:
- Change both grpc_server_max_recv_msg_size and grpc_server_max_send_msg_size to 100MB.
- Change split_queries_by_interval to 30m or 1 hour.
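For reference, here is a rough sketch of where those settings might go in the Loki configuration. The section placement is an assumption based on recent Loki 2.x releases, so please check it against the configuration reference for your version. With the Helm chart you would normally set these through the chart's values rather than editing the config file directly.

```yaml
# Sketch only: section placement is an assumption, verify against the docs for your Loki version.
server:
  # 100 MiB expressed in bytes (100 * 1024 * 1024)
  grpc_server_max_recv_msg_size: 104857600
  grpc_server_max_send_msg_size: 104857600

limits_config:
  # Split long range queries into smaller time-based sub-queries
  split_queries_by_interval: 30m
```

Start with 30m and try 1h if the smaller interval does not help.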
I tried that, but I am still getting this error in the querier:
level=error ts=2023-05-18T06:33:40.555562424Z caller=series_index_store.go:584 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
And Grafana shows:
502 Bad Gateway
nginx/1.19.10
This error is different from the one in your original post. I assume your querier is working and only fails when given a big query?
My apologies if I have not explained clearly.
Yes, the querier works fine for smaller queries, but it fails for bigger ones.
In Grafana the error is 502 Bad Gateway.
In the logs of the Loki read pods I can see the error mentioned above:
caller=series_index_store.go:584 org_id=fake msg="error querying storage" err="query index: rpc error: code = Canceled desc = context canceled"
I am using Loki's simple scalable deployment mode. We deployed this cluster on Kubernetes using the provided Helm charts.
Looking forward to your response. Thanks in advance.
Hi, were you able to find a solution to this? Please let me know if you did.