We are encountering an issue with Grafana, which is installed in our ARO cluster. When trying to filter logs over a time range exceeding 1 hour, we receive the following error:
```
Query error
504 Gateway Time-out
the server didn't respond in time.
```
For logs within a 1-hour time frame, the query works fine. However, extending the query beyond that time frame consistently results in the 504 Gateway Time-out error.
We have tried increasing the dataproxy timeout setting, but unfortunately that hasn’t resolved the problem.
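For reference, here is a minimal sketch of where the data proxy timeout is set, assuming Grafana is deployed with the community Helm chart (the thread does not confirm the deployment method). The `grafana.ini` map in the chart values is rendered into Grafana's configuration file, and `timeout` under `dataproxy` is in seconds. The value 300 is only illustrative.

```yaml
# values.yaml for the Grafana Helm chart (assumed deployment method)
grafana.ini:
  dataproxy:
    # Seconds Grafana waits for the data source to respond (default: 30).
    timeout: 300
```

Raising this value only helps if Grafana's own proxy is the component that gives up first. Any route, ingress, or gateway between the browser, Grafana, and Loki has its own timeout, and Loki's `queryTimeout` also caps the query.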
Could you please assist us in diagnosing and resolving this issue?
Additional details: ARO 4.13
Data source: Loki
Thank you in advance.
Please share your Loki configuration.
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  annotations:
    loki.grafana.com/certRotationRequiredAt: '2024-10-04T05:35:08Z'
    loki.grafana.com/rulesDiscoveredAt: '2024-10-14T05:34:05Z'
  creationTimestamp: '2024-07-24T05:35:04Z'
  generation: 1
  managedFields:
    - apiVersion: loki.grafana.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          .: {}
          'f:limits':
            .: {}
            'f:global':
              .: {}
              'f:queries':
                .: {}
                'f:queryTimeout': {}
          'f:managementState': {}
          'f:size': {}
          'f:storage':
            .: {}
            'f:schemas': {}
            'f:secret':
              .: {}
              'f:name': {}
              'f:type': {}
          'f:storageClassName': {}
          'f:tenants':
            .: {}
            'f:mode': {}
            'f:openshift':
              .: {}
              'f:adminGroups': {}
      manager: Mozilla
      operation: Update
      time: '2024-07-24T05:35:04Z'
    - apiVersion: loki.grafana.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:loki.grafana.com/certRotationRequiredAt': {}
            'f:loki.grafana.com/rulesDiscoveredAt': {}
      manager: manager
      operation: Update
      time: '2024-10-14T05:34:05Z'
    - apiVersion: loki.grafana.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          .: {}
          'f:components':
            .: {}
            'f:compactor':
              .: {}
              'f:Ready': {}
            'f:distributor':
              .: {}
              'f:Ready': {}
            'f:gateway':
              .: {}
              'f:Ready': {}
            'f:indexGateway':
              .: {}
              'f:Ready': {}
            'f:ingester':
              .: {}
              'f:Ready': {}
            'f:querier':
              .: {}
              'f:Ready': {}
            'f:queryFrontend':
              .: {}
              'f:Ready': {}
          'f:conditions': {}
          'f:storage':
            .: {}
            'f:credentialMode': {}
            'f:schemas': {}
      manager: manager
      operation: Update
      subresource: status
      time: '2024-10-14T05:34:05Z'
  name: logging-loki
  namespace: openshift-logging
  resourceVersion: '113440103'
  uid: af894837-ff43-449b-93e1-e6bd0e28679d
spec:
  limits:
    global:
      queries:
        queryTimeout: 3m
  managementState: Managed
  size: 1x.extra-small
  storage:
    schemas:
      - effectiveDate: '2024-07-23'
        version: v13
    secret:
      name: logging-loki-azure
      type: azure
  storageClassName: managed-disk-premium
  tenants:
    mode: openshift-logging
    openshift:
      adminGroups:
        - cluster-admin
status:
  components:
    compactor:
      Ready:
        - logging-loki-compactor-0
    distributor:
      Ready:
        - logging-loki-distributor-XXXXXXXXXXXXXX
        - logging-loki-distributor-XXXXXXXXXXXXXX
    gateway:
      Ready:
        - logging-loki-gateway-XXXXXXXXXXXXXX
        - logging-loki-gateway-XXXXXXXXXXXX
    indexGateway:
      Ready:
        - logging-loki-index-gateway-0
        - logging-loki-index-gateway-1
    ingester:
      Ready:
        - logging-loki-ingester-0
        - logging-loki-ingester-1
    querier:
      Ready:
        - logging-loki-querier-XXXXXXXXXXXXXX
        - logging-loki-querier-XXXXXXXXXXXXXX
    queryFrontend:
      Ready:
        - logging-loki-query-frontend-XXXXXXXXXXXXXX
        - logging-loki-query-frontend-XXXXXXXXXXXXXX
  conditions:
    - lastTransitionTime: '2024-10-14T05:34:05Z'
      message: 'All components are running, but some readiness checks are failing'
      reason: PendingComponents
      status: 'False'
      type: Pending
    - lastTransitionTime: '2024-10-14T05:34:05Z'
      message: One or more LokiStack components failed
      reason: FailedComponents
      status: 'False'
      type: Failed
    - lastTransitionTime: '2024-10-14T05:34:05Z'
      message: All components ready
      reason: ReadyComponents
      status: 'True'
      type: Ready
  storage:
    credentialMode: static
    schemas:
      - effectiveDate: '2024-07-23'
        version: v13
Please find additional details for your reference.
Grafana version:
repository: docker.io/grafana/grafana
tag: "11.1.0"
Loki operator version: 5.9.4
I see you are using a persistent volume. I’d recommend double-checking that logs are actually being written to the persistent volume.
Can you also hit the /config endpoint on your Loki container and share the config, please?
Hi @tonyswumac, thank you for looking into this.
I’m also working on this along with my colleague. Kindly assist us here.
We deployed the Loki stack using the operator. We are facing a network issue and are unable to fetch /config at the moment.
For now, sharing the deployment config file.
Hi @tonyswumac, sharing the LokiStack here for reference.
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
spec:
  limits:
    global:
      queries:
        queryTimeout: 3m
  managementState: Managed
  size: 1x.extra-small
  storage:
    schemas:
      - effectiveDate: '2024-07-23'
        version: v13
    secret:
      name: logging-loki-azure
      type: azure
  storageClassName: managed-disk-premium
  tenants:
    mode: openshift-logging
    openshift:
      adminGroups:
        - cluster-admin
status:
  components:
    compactor:
      Ready:
        - logging-loki-compactor-0
    distributor:
      Ready:
        - logging-loki-distributor-5d78747486-4dv7w
        - logging-loki-distributor-5d78747486-hsk4f
    gateway:
      Ready:
        - logging-loki-gateway-6bf58bf76c-tgr4h
        - logging-loki-gateway-6bf58bf76c-m8ljh
    indexGateway:
      Ready:
        - logging-loki-index-gateway-0
        - logging-loki-index-gateway-1
    ingester:
      Ready:
        - logging-loki-ingester-0
        - logging-loki-ingester-1
    querier:
      Ready:
        - logging-loki-querier-6899fd567c-pwqdh
        - logging-loki-querier-6899fd567c-rdh2d
    queryFrontend:
      Ready:
        - logging-loki-query-frontend-6c5dc84dbf-xpvfp
        - logging-loki-query-frontend-6c5dc84dbf-4lx55
  conditions:
    - lastTransitionTime: '2024-10-29T02:49:12Z'
      message: One or more LokiStack components pending on dependencies
      reason: PendingComponents
      status: 'False'
      type: Pending
    - lastTransitionTime: '2024-10-29T02:49:12Z'
      message: One or more LokiStack components failed
      reason: FailedComponents
      status: 'False'
      type: Failed
    - lastTransitionTime: '2024-10-29T02:49:12Z'
      message: All components ready
      reason: ReadyComponents
      status: 'True'
      type: Ready
  storage:
    credentialMode: static
    schemas:
      - effectiveDate: '2024-07-23'
        version: v13
Can you confirm whether your logs are actually being written to your Azure storage? I see you define Azure storage, but I don’t see it referenced in the schema config.
Hi @tonyswumac,
This wasn’t configured by me, but I suspect the same thing. Here are my observations.
There are 3 StatefulSets where PVCs are defined:
logging-loki-compactor
logging-loki-index-gateway
logging-loki-ingester
Usage is not heavy, and there is no schema_config and no retention defined in the LokiStack.
Can you suggest how to proceed further?
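On the missing retention: as a rough sketch, the LokiStack API lets retention be set under `spec.limits.global.retention`. The 7-day value below is purely illustrative and not taken from this cluster. Retention controls how long stored data is kept, so it would not by itself be expected to fix the 504.

```yaml
# Fragment of a LokiStack resource; only the retention-related fields are shown.
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  limits:
    global:
      retention:
        # Illustrative value: keep log data for 7 days.
        days: 7
```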
If your intent is to store Loki data in Azure storage, I’d recommend looking at the documentation (Single Store TSDB (tsdb) | Grafana Loki documentation) for examples of how to configure a storage backend and reference it from the index schema, and then making sure Loki is writing to your Azure storage without problems.
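A sketch of the kind of configuration that documentation describes, for a Loki instance configured directly through its own config file. The account name, key, container, and directory paths are placeholders, not values from this thread. With the Loki Operator, these sections are, as far as I know, generated from `spec.storage` in the LokiStack resource, so this fragment is better used for comparison against the rendered config than pasted into the cluster.

```yaml
# Plain Loki configuration (not a LokiStack resource); all values are placeholders.
schema_config:
  configs:
    - from: "2024-07-23"
      store: tsdb            # index type
      object_store: azure    # where chunks and index files are shipped
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index   # hypothetical local path
    cache_location: /loki/tsdb-cache           # hypothetical local path
  azure:
    account_name: <storage-account-name>
    account_key: <storage-account-key>
    container_name: <container-name>
```

If the rendered config shows `object_store: azure` for the active schema period, checking the Azure container for recently written objects is a quick way to confirm that logs are landing there.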