504 Gateway Error when connecting to MSSQL Server

We are running Grafana Enterprise in an OpenShift environment. We have several MSSQL data sources which connect to different servers and databases. Everything works fine, but as soon as a SQL query takes longer than 30 seconds we receive the following error:

504 Gateway Time-out: The server didn’t respond in time

We have tried increasing the timeout in grafana.ini and we have checked the timeouts in SQL Server, but we could not solve the issue. If we run the query directly on SQL Server it takes around 45s.

We have read some issues that point to a timeout in the web server itself. Has anyone had a similar issue?

We are running Grafana 12.2.1

45s is an eternity for a query. I would recommend you fix the issue in MSSQL.

What does the query look like?

You need to fine-tune things, or have Grafana fetch the data from a flattened single table (emulating a DW) that has all the values you need and is prepopulated via a cron job.

If the tables you query from are also being used by other systems (OLTP), you will face issues.
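A minimal sketch of that pattern, assuming a daily scheduled job (e.g. SQL Server Agent) that repopulates a small summary table which the Grafana panel then queries; all table and column names here are illustrative, not from this thread:

```sql
-- Hypothetical summary table that the dashboard reads from.
CREATE TABLE dbo.DashboardSummary (
    Bucket   datetime2 NOT NULL PRIMARY KEY,
    MaxValue int       NOT NULL,
    AvgValue bigint    NOT NULL
);

-- Daily refresh job: rebuild the summary from the large source table.
TRUNCATE TABLE dbo.DashboardSummary;
INSERT INTO dbo.DashboardSummary (Bucket, MaxValue, AvgValue)
SELECT DATE_BUCKET(MINUTE, 5, EventTimeUtc),
       MAX(SomeValue),
       AVG(CAST(SomeValue AS BIGINT))
FROM dbo.LargeSourceTable
GROUP BY DATE_BUCKET(MINUTE, 5, EventTimeUtc);
```

The dashboard query then scans only the pre-aggregated rows, which should return well under any gateway timeout.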

Thank you for the fast reply.

The query we are using is a data-quality analysis query which is visualized in a Grafana dashboard. It will only be used once a day during backfill operations. That is also why we chose to do it quickly with a query instead of a table which we would have to load first.

As described above, the query runs fine in SQL Server Management Studio. We know 45s is not ideal, but it is also not really, really bad :wink: and we have already optimized the query as much as possible.

We hoped there is a setting / configuration where you can configure the timeout, but unfortunately we still don’t know where the timeout is coming from. We cannot find anything in the Grafana logs.

I guess:

Of course any networking involved (OpenShift included) may have its own timeouts, which you may need to reconfigure as well.

We already tested the [dataproxy] settings and set the timeout to 90 with no effect. For clarification, is the data proxy always used?

#################################### Data proxy ###########################
    [dataproxy]

    # This enables data proxy logging, default is false
    logging = false

    # How long the data proxy waits to read the headers of the response before timing out, default is 30 seconds.
    # This setting also applies to core backend HTTP data sources where query requests use an HTTP client with timeout set.
    timeout = 90

    # How long the data proxy waits to establish a TCP connection before timing out, default is 10 seconds.
    dialTimeout = 10

    # How many seconds the data proxy waits before sending a keepalive request.
    keep_alive_seconds = 30

Can you share the query so we can see what is happening?

The query is used to determine the delay between Kafka and the database. If I run it directly on SQL Server (SSMS), it takes around 45 seconds for the last 7 days.

SELECT
     DATE_BUCKET(MINUTE, ${BucketSize_Min}, ImportDateTimeUtc) AS Bucket
    ,MAX(DATEDIFF(SECOND, [ReceivedDateTimeUtc], [ImportDateTimeUtc])) AS MaxTotalDelayInSeconds
    ,AVG(CAST(DATEDIFF(SECOND, [ReceivedDateTimeUtc], [ImportDateTimeUtc]) AS BIGINT)) AS AvgTotalDelayInSeconds
    ,MAX(DATEDIFF(SECOND, [KafkaTimestamp], [ImportDateTimeUtc])) AS MaxDatabaseDelayInSeconds
    ,AVG(CAST(DATEDIFF(SECOND, [KafkaTimestamp], [ImportDateTimeUtc]) AS BIGINT)) AS AvgDatabaseDelayInSeconds
FROM [ETP].[Disk].[Quote_${Venue_Account}_Active] WITH (NOLOCK)
WHERE ImportDateTimeUtc BETWEEN $__timeFrom() AND $__timeTo()
GROUP BY DATE_BUCKET(MINUTE, ${BucketSize_Min}, ImportDateTimeUtc)
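For anyone unfamiliar with DATE_BUCKET: it floors each timestamp to the start of its N-minute bucket, counted from an origin (SQL Server’s default origin is 1900-01-01). Roughly equivalent to this Python sketch:

```python
from datetime import datetime, timedelta

def date_bucket_minute(ts, bucket_size_min, origin=datetime(1900, 1, 1)):
    """Floor `ts` to the start of its N-minute bucket, like T-SQL's
    DATE_BUCKET(MINUTE, N, ts) with the default 1900-01-01 origin."""
    delta_min = int((ts - origin).total_seconds() // 60)
    floored = delta_min - (delta_min % bucket_size_min)
    return origin + timedelta(minutes=floored)

# With 5-minute buckets, 12:07 and 12:09 both land in the 12:05 bucket.
print(date_bucket_minute(datetime(2024, 1, 1, 12, 7), 5))  # → 2024-01-01 12:05:00
```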

In Grafana I receive the following error:

Sorry for the late response. I posted the query and a screenshot yesterday, but somehow it got caught in a spam filter and needs to be checked by a staff member…

Nevertheless, we have found the issue in OpenShift. The ingress timeout has a default value of 30s, after which it closes the session. So in our case Grafana fires the SQL query, which runs for 45s, but after 30s OpenShift actively closes the session, and that’s why we receive the error.

We could solve it with this config:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: myapp
  annotations:
    haproxy.router.openshift.io/timeout: 90s
spec:
  host: myapp.example.com
  to:
    kind: Service
    name: myapp-service
  port:
    targetPort: 8080

The setting can be added under annotations.

haproxy.router.openshift.io/timeout: 90s
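If you prefer the CLI over editing the Route YAML, the same annotation can presumably be set with `oc annotate` (the route name `myapp` is taken from the example above):

```shell
# Set (or update) the HAProxy timeout annotation on the existing route.
oc annotate route myapp haproxy.router.openshift.io/timeout=90s --overwrite
```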

I mentioned that:

And it may not only be the Ingress. A Service of type LoadBalancer may also be involved, so generally the whole network chain must be configured to support that timeout.


Yes, and that’s why we talked with the team responsible for OpenShift. They analysed it and came up with this solution this morning. Thank you for the hint :slightly_smiling_face: