I am facing an issue where I cannot query historical data that I imported successfully using Promtail. I deployed the loki-distributed Helm chart in the most recent version as of today.
I have a sample log with one million log lines in proper JSON, which I sorted by timestamp to avoid out-of-order issues. I was able to import them and saw the subdirectory for the tenant (I am using multi-tenancy) in my S3 bucket on AWS. So far so good.
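For context, the relevant part of my Promtail config looks roughly like this; the paths, label names, and the JSON field holding the event time are placeholders rather than my exact setup:

```yaml
# Sketch only – paths, labels, and the JSON field holding the event time are placeholders.
clients:
  - url: https://my-loki/loki/api/v1/push
    tenant_id: my-tenant            # must match the --org-id used when querying

scrape_configs:
  - job_name: import
    static_configs:
      - targets: [localhost]
        labels:
          label1: my-tenant
          __path__: /import/*.json
    pipeline_stages:
      - json:
          expressions:
            ts: timestamp           # extract the event time from the JSON line
      - timestamp:
          source: ts
          format: RFC3339           # use the original event time instead of the scrape time
```

Without the timestamp stage, Promtail uses the scrape time, which is the "timestamp of the import" behaviour I mention below.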
All my queries either returned no result or an error, sometimes even an HTTP 502. I tried both Grafana and logcli with the correct tenant header.
Grafana suggests the correct labels that were imported but does not show any data.
When I import the logs using the timestamp of the import instead, I can query all the data, but then the timestamps are not the correct ones. The actual timestamps of the logs range from 1970-01-01 (obviously a mistake “someone” made) to a week ago.
This is my most recent result: I can retrieve the labels, but not the actual data.
❯ logcli labels --org-id="my-tenant" --from="1970-01-01T00:00:00Z"
2023/02/28 18:12:53 https://my-loki/loki/api/v1/labels?end=1677604373467159000&start=0
label1
label2
label3
❯ logcli query --org-id="my-tenant" --from="1970-01-01T00:00:00Z" --limit=5 '{label1="my-tenant"}'
2023/02/28 18:13:40 https://my-loki/loki/api/v1/query_range?direction=BACKWARD&end=1677604420542177000&limit=5&query=%7Blabel1%3D%22my-tenant%22%7D&start=0
2023/02/28 18:13:56 Error response from server: failed to get s3 object: NoSuchKey: The specified key does not exist.
status code: 404, request id: FCREYBYMEHFS9P5E, host id: RnmSKnZbdym0gnSHnEZcyGU+yX6y8P+CXJYcq9M07W7bIyiU2fcnOLUFul95rzhulQ/HVoDfLEo=
(<nil>) attempts remaining: 0
2023/02/28 18:13:56 Query failed: Run out of attempts while querying the server; response: failed to get s3 object: NoSuchKey: The specified key does not exist.
status code: 404, request id: FCREYBYMEHFS9P5E, host id: RnmSKnZbdym0gnSHnEZcyGU+yX6y8P+CXJYcq9M07W7bIyiU2fcnOLUFul95rzhulQ/HVoDfLEo=
I am using one service account for all pods. I can see that data is created in the S3 bucket, and the ingester logs show that it is uploading data. The compactor has access too, with the very same service account; I can tell because of the newly created files with “compactor” in their names. Could a different (and wrong) access configuration be defined for the querier somewhere? I thought all components share the storage config and the service account.
The policy allows wildcard operations on the bucket and its contents.
Could it be because of the epoch-0 timestamp in my “from”? I will try to remove those lines and re-import the logs tomorrow.
I am not sure if this is possible, but you could try setting a common storage. We don’t use storage_config; instead, we put storage under common, which implicitly gets applied to all components. Like this:
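(A minimal sketch with S3 on AWS; the bucket name and region are placeholders, and credentials are assumed to come from the service account.)

```yaml
# Sketch only – bucket name and region are placeholders; credentials are assumed to come
# from the pod's service account / IAM role, so no access keys are set here.
common:
  storage:
    s3:
      bucketnames: my-loki-bucket
      region: eu-central-1
      s3forcepathstyle: false
```

Since every component reads this one section, the querier, ingester, and compactor all end up with identical storage settings.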
I think I found the culprit. I just replaced all 1970-01-01 timestamps with 1971-01-01 and voilà… it works.
In my queries I still use from=1970-01-01, but now I don’t receive an error.
I also had to disable the gateway pod because I couldn’t adjust its timeout; that was the next issue I had. Now I use the ingress without the gateway and a timeout of 600s.
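In case someone hits the same timeout: I am assuming the ingress-nginx controller here (other controllers use different annotation names), but the annotations on the Ingress look roughly like this:

```yaml
# Sketch only – annotation names are specific to the ingress-nginx controller.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
```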