Keep-Alive for long-running panel queries when fronted by a reverse proxy

ashbullock123 · February 18, 2023, 4:16pm

What Grafana version and what operating system are you using?
Grafana v9.3.2 Alpine image, running on Azure App Service
What are you trying to achieve?
I’ve got Grafana (9.3.2 Alpine container) deployed to an Azure App Service, connected to a Loki datasource. Everything has been working OK, except that I’m having issues with long-running queries to Loki.

For longer Loki queries, I’m receiving a timeout from the Azure App Service side:

1 queries with total query time of 4.00 min
status: 504
statusText: "Gateway Timeout"

This always times out after 4 minutes, and unfortunately I’ve found that this is an Azure limit that cannot be increased.

Is it possible to handle a query timeout from an external load balancer without increasing the timeout on the load balancer, by sending a TCP Keep-Alive for the connection instead?

I was looking into the dataproxy settings and found the keep_alive_seconds setting, which I thought may work for this use case: Configure Grafana | Grafana documentation

Unfortunately, experimenting with this value hasn’t helped, so I was wondering whether anyone else has experienced something similar and knows of a way around it before I switch infrastructure setup.

Here are my relevant proxy config settings:

[server]
# The full public facing url you use in browser, used for redirects and emails
# If you use reverse proxy and sub path specify full url (with sub path)
root_url = <The fqdn for my loadbalancer>

[dataproxy]
# This enables data proxy logging, default is false
logging = true

# How long the data proxy waits to read the headers of the response before timing out
timeout = 30
# How long the data proxy waits to establish a TCP connection before timing out
dialTimeout = 10
# How many seconds the data proxy waits before sending a keepalive probe request.
keep_alive_seconds = 15
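
To make the question concrete, here is a minimal Python sketch of what a keep-alive setting amounts to at the socket level. The function name `make_keepalive_socket` is hypothetical, and this is not Grafana's implementation of `keep_alive_seconds`; it only illustrates that keep-alive probes are a transport-level option on a single TCP connection and carry no application data, so whether a load balancer's idle timer counts them depends on the load balancer.

```python
import socket


def make_keepalive_socket(idle_seconds: int = 15) -> socket.socket:
    """Create an unconnected TCP socket with keep-alive probes enabled.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to send keep-alive probes on an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE (idle time before the first probe) is platform-specific,
    # so only set it where the constant exists (for example on Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock


sock = make_keepalive_socket(15)
keepalive_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
print(keepalive_enabled)
```

The option applies only to the connection the socket belongs to, so a probe sent on one hop (say, Grafana to the datasource) does not generate traffic on another hop (say, browser to load balancer).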

Thanks in advance!

yosiasz · February 18, 2023, 7:25pm

I have not experienced this, but why do you have long-running queries in the first place? I think fixing that would be my priority. Won't any other config changes (timeout increase, keep-alive settings, etc.) just keep postponing the root issue?
