Loki: no logs after a few seconds

Hello,
I’m new to Loki and I’m trying to configure a simple setup: syslog-ng -> Promtail -> Loki.

Promtail 2.7.1 and Loki 2.9.4 are running inside podman containers with volumes.
syslog-ng is running on a remote machine and sends logs to Promtail over TCP.
The logs reach Promtail correctly, and Promtail forwards them to Loki.
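For reference, the containers are started roughly like this (a sketch, not my exact commands: the image names, host paths and the -config.expand-env flag for the ${...} placeholders below are only illustrative, and the environment variables are passed with -e, omitted here):

$ podman run -d --name promtail --network host \
    -v $PWD/promtail.yaml:/etc/promtail/config.yaml:Z \
    docker.io/grafana/promtail:2.7.1 \
    -config.file=/etc/promtail/config.yaml -config.expand-env=true
$ podman run -d --name loki --network host \
    -v $PWD/loki.yaml:/etc/loki/config.yaml:Z \
    -v loki-data:/loki \
    docker.io/grafana/loki:2.9.4 \
    -config.file=/etc/loki/config.yaml -config.expand-env=true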

The problem is that the logs seem to disappear after around 30 seconds.

I can list the labels:

$ LOKI_ADDR=http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH} logcli labels
2024/03/04 13:02:42 http://127.0.0.1:20015/58f6d2dd-2e40-42e6-96ad-60d278b386f5/loki/api/v1/labels?end=1709553762638211076&start=1709550162638211076
controller_name
host
job

I can also inspect the values:

$ LOKI_ADDR=http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH} logcli labels host
2024/03/04 13:03:14 http://127.0.0.1:20015/58f6d2dd-2e40-42e6-96ad-60d278b386f5/loki/api/v1/label/host/values?end=1709553794563072437&start=1709550194563072437
NethSec

But the series command with --analyze-labels fails:

$ LOKI_ADDR=http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH} logcli series --analyze-labels '{host="NethSec"}'
2024/03/04 13:04:08 http://127.0.0.1:20015/58f6d2dd-2e40-42e6-96ad-60d278b386f5/loki/api/v1/series?end=1709553848601599413&match=%7Bhost%3D%22NethSec%22%7D&start=1709550248601599413
2024/03/04 13:04:08 Error response from server: 404 page not found
 (<nil>) attempts remaining: 0
2024/03/04 13:04:08 Error doing request: run out of attempts while querying the server
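To check whether the 404 comes from logcli or from the endpoint itself, the series API can also be hit directly with curl. A sketch; the one-hour start/end window is just an example:

$ curl -s -G "http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH}/loki/api/v1/series" \
    --data-urlencode 'match[]={host="NethSec"}' \
    --data-urlencode "start=$(date -d '1 hour ago' +%s)000000000" \
    --data-urlencode "end=$(date +%s)000000000"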

Querying the /query endpoint with curl works:

 curl -s -G -H "Content-Type: application/json" http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH}/loki/api/v1/query --data-urlencode 'query={host="NethSec"}' 
{"status":"success","data":{"resultType":"streams","result":[{"stream":{"controller_name":"nethsec","host":"NethSec","job":"syslog"},"values":[["1709553880974212949"," aaaa\n"]]}],"stats":{"summary":{"bytesProcessedPerSecond":3217,"linesProcessedPerSecond":536,"totalBytesProcessed":6,"totalLinesProcessed":1,"execTime":0.001865,"queueTime":0.000454,"subqueries":0,"totalEntriesReturned":1,"splits":0,"shards":0,"totalPostFilterLines":1,"totalStructuredMetadataBytesProcessed":0},"querier":{"store":{"totalChunksRef":0,"totalChunksDownloaded":0,"chunksDownloadTime":0,"chunk":{"headChunkBytes":0,"headChunkLines":0,"decompressedBytes":0,"decompressedLines":0,"compressedBytes":0,"totalDuplicates":0,"postFilterLines":0,"headChunkStructuredMetadataBytes":0,"decompressedStructuredMetadataBytes":0}}},"ingester":{"totalReached":1,"totalChunksMatched":1,"totalBatches":2,"totalLinesSent":1,"store":{"totalChunksRef":0,"totalChunksDownloaded":0,"chunksDownloadTime":0,"chunk":{"headChunkBytes":6,"headChunkLines":1,"decompressedBytes":0,"decompressedLines":0,"compressedBytes":0,"totalDuplicates":0,"postFilterLines":1,"headChunkStructuredMetadataBytes":0,"decompressedStructuredMetadataBytes":0}}},"cache":{"chunk":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0},"index":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0},"result":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0},"statsResult":{"entriesFound":0,"entriesRequested":0,"entriesStored":0,"bytesReceived":0,"bytesSent":0,"requests":0,"downloadTime":0}}}}}

But if I execute the same command after around 30 seconds, the messages are gone:

$  curl -s -G -H "Content-Type: application/json" http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH}/loki/api/v1/query --data-urlencode 'query={host="NethSec"}' | jq .data.result
[]

The same happens with logcli.
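One way to double-check whether the entries are really gone, rather than just falling outside the 30-second window the instant query covers (visible in the debug log below), is to query an explicit time range. A sketch using logcli’s --from/--to flags; the timestamps are only an example:

$ LOKI_ADDR=http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH} logcli query '{host="NethSec"}' \
    --from="2024-03-04T12:00:00Z" --to="2024-03-04T13:10:00Z" --limit=50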

This is the relevant debug log from Loki:

Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.506819721Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/ring
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.506898127Z caller=mock.go:150 msg=Get key=collectors/ring wait_index=331
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.506929744Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/distributor
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.507006769Z caller=mock.go:150 msg=Get key=collectors/distributor wait_index=332
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.50702302Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/scheduler
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.507130411Z caller=mock.go:150 msg=Get key=collectors/scheduler wait_index=330
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.50721575Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/compactor
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.50726812Z caller=mock.go:150 msg=Get key=collectors/compactor wait_index=333
Mar 04 13:08:15 node loki[5317]: level=info ts=2024-03-04T12:08:15.551390597Z caller=roundtrip.go:309 org_id=fake traceID=477bd37b2d8901b1 msg="executing query" type=instant query="{host=\"NethSec\"}" query_hash=2931680508
Mar 04 13:08:15 node loki[5317]: level=info ts=2024-03-04T12:08:15.551753972Z caller=engine.go:232 component=querier org_id=fake traceID=477bd37b2d8901b1 msg="executing query" type=instant query="{host=\"NethSec\"}" query_hash=2931680508
Mar 04 13:08:15 node loki[5317]: ts=2024-03-04T12:08:15.551781911Z caller=spanlogger.go:86 user=fake level=debug msg="querying ingester" params="selector={host=\"NethSec\"}, direction=BACKWARD, start=2024-03-04 12:07:45.551668145 +0000 UTC m=+1195.920211561, end=2024-03-04 12:08:15.551668145 +0000 UTC m=+1225.920211561, limit=100, shards="
Mar 04 13:08:15 node loki[5317]: ts=2024-03-04T12:08:15.551994431Z caller=spanlogger.go:86 user=fake level=debug shortcut=false from=2024-03-04T12:07:45.551Z through=2024-03-04T12:08:15.552Z err=null
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.552036045Z caller=index_set.go:207 table-name=tsdb_index_19786 org_id=fake traceID=477bd37b2d8901b1 index-files-count=0
Mar 04 13:08:15 node loki[5317]: level=debug ts=2024-03-04T12:08:15.552041801Z caller=index_set.go:207 table-name=tsdb_index_19786 user-id=fake org_id=fake traceID=477bd37b2d8901b1 index-files-count=1
Mar 04 13:08:15 node loki[5317]: ts=2024-03-04T12:08:15.552140782Z caller=spanlogger.go:86 user=fake level=debug Ingester.TotalReached=1 Ingester.TotalChunksMatched=0 Ingester.TotalBatches=1 Ingester.TotalLinesSent=0 Ingester.TotalChunksRef=0 Ingester.TotalChunksDownloaded=0 Ingester.ChunksDownloadTime=0s Ingester.HeadChunkBytes="0 B" Ingester.HeadChunkLines=0 Ingester.DecompressedBytes="0 B" Ingester.DecompressedLines=0 Ingester.PostFilterLInes=0 Ingester.CompressedBytes="0 B" Ingester.TotalDuplicates=0 Querier.TotalChunksRef=0 Querier.TotalChunksDownloaded=0 Querier.ChunksDownloadTime=0s Querier.HeadChunkBytes="0 B" Querier.HeadChunkLines=0 Querier.DecompressedBytes="0 B" Querier.DecompressedLines=0 Querier.PostFilterLInes=0 Querier.CompressedBytes="0 B" Querier.TotalDuplicates=0
Mar 04 13:08:15 node loki[5317]: ts=2024-03-04T12:08:15.552155079Z caller=spanlogger.go:86 user=fake level=debug Cache.Chunk.Requests=0 Cache.Chunk.EntriesRequested=0 Cache.Chunk.EntriesFound=0 Cache.Chunk.EntriesStored=0 Cache.Chunk.BytesSent="0 B" Cache.Chunk.BytesReceived="0 B" Cache.Chunk.DownloadTime=0s Cache.Index.Requests=0 Cache.Index.EntriesRequested=0 Cache.Index.EntriesFound=0 Cache.Index.EntriesStored=0 Cache.Index.BytesSent="0 B" Cache.Index.BytesReceived="0 B" Cache.Index.DownloadTime=0s Cache.StatsResult.Requests=0 Cache.StatsResult.EntriesRequested=0 Cache.StatsResult.EntriesFound=0 Cache.StatsResult.EntriesStored=0 Cache.StatsResult.BytesSent="0 B" Cache.StatsResult.BytesReceived="0 B" Cache.Result.DownloadTime=0s Cache.Result.Requests=0 Cache.Result.EntriesRequested=0 Cache.Result.EntriesFound=0 Cache.Result.EntriesStored=0 Cache.Result.BytesSent="0 B" Cache.Result.BytesReceived="0 B" Cache.Result.DownloadTime=0s
Mar 04 13:08:15 node loki[5317]: ts=2024-03-04T12:08:15.552164663Z caller=spanlogger.go:86 user=fake level=debug Summary.BytesProcessedPerSecond="0 B" Summary.LinesProcessedPerSecond=0 Summary.TotalBytesProcessed="0 B" Summary.TotalLinesProcessed=0 Summary.PostFilterLines=0 Summary.ExecTime=369.696µs Summary.QueueTime=50.384µs
Mar 04 13:08:15 node loki[5317]: level=info ts=2024-03-04T12:08:15.552178655Z caller=metrics.go:159 component=querier org_id=fake traceID=477bd37b2d8901b1 latency=fast query="{host=\"NethSec\"}" query_hash=2931680508 query_type=limited range_type=instant length=0s start_delta=507.431µs end_delta=507.473µs step=0s duration=369.696µs status=200 limit=100 returned_lines=0 throughput=0B total_bytes=0B total_bytes_structured_metadata=0B lines_per_second=0 total_lines=0 post_filter_lines=0 total_entries=0 store_chunks_download_time=0s queue_time=50.384µs splits=0 shards=0 cache_chunk_req=0 cache_chunk_hit=0 cache_chunk_bytes_stored=0 cache_chunk_bytes_fetched=0 cache_chunk_download_time=0s cache_index_req=0 cache_index_hit=0 cache_index_download_time=0s cache_stats_results_req=0 cache_stats_results_hit=0 cache_stats_results_download_time=0s cache_result_req=0 cache_result_hit=0 cache_result_download_time=0s
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.507876592Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/compactor
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.507911795Z caller=mock.go:150 msg=Get key=collectors/compactor wait_index=333
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.507926848Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/scheduler
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.507973074Z caller=mock.go:150 msg=Get key=collectors/scheduler wait_index=330
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.507984852Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/distributor
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.50799318Z caller=mock.go:150 msg=Get key=collectors/distributor wait_index=332
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.507999538Z caller=mock.go:186 msg="Get - deadline exceeded" key=collectors/ring
Mar 04 13:08:16 node loki[5317]: level=debug ts=2024-03-04T12:08:16.508008807Z caller=mock.go:150 msg=Get key=collectors/ring wait_index=331
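Since the debug log keeps polling the collectors/* keys in the ring KV store, the readiness and ring status pages can also be checked over HTTP (a sketch, assuming they are served under the same path prefix):

$ curl -s http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH}/ready
$ curl -s http://127.0.0.1:${LOKI_HTTP_PORT}${LOKI_PATH}/ring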

I can also see that chunks are saved inside the container:

$ find /loki
/loki
/loki/rules
/loki/rules-temp
/loki/chunks
/loki/chunks/index
/loki/chunks/index/delete_requests
/loki/chunks/index/delete_requests/delete_requests.gz
/loki/chunks/index/tsdb_index_19786
/loki/chunks/index/tsdb_index_19786/fake
/loki/chunks/index/tsdb_index_19786/fake/1709547695-compactor-1709545020662-1709545038520-498bcaae.tsdb.gz
/loki/chunks/loki_cluster_seed.json
/loki/chunks/fake
/loki/chunks/fake/65ddc5e276f01a9d
/loki/chunks/fake/65ddc5e276f01a9d/MThlMDhkMmRjZjY6MThlMDhkMmRjZjg6Y2ExMWM0OTA=
/loki/chunks/fake/b2850b5b54694cec
/loki/chunks/fake/b2850b5b54694cec/MThlMDhkMmRjZjc6MThlMDhkMmRjZjg6ZGE3N2QyNWE=
/loki/chunks/fake/664b6372e4dfeacc
/loki/chunks/fake/664b6372e4dfeacc/MThlMDhkMzIyYjc6MThlMDhkMzIyYjg6NTdiZjBjNmY=
/loki/chunks/fake/6978c43cee2c4147
/loki/chunks/fake/6978c43cee2c4147/MThlMDhkMmRjZjc6MThlMDhkMzIyYjg6NzQ2ZTcxMWU=
/loki/chunks/fake/bd8e04215265783f
/loki/chunks/fake/bd8e04215265783f/MThlMDhkMzIyYjc6MThlMDhkMzIyYjg6Zjk4ODVlN2Y=
/loki/chunks/fake/f9ad06ec829f4262
/loki/chunks/fake/f9ad06ec829f4262/MThlMDhkMzIyYjc6MThlMDhkMzIyYjg6YmIwZDZhY2M=
/loki/chunks/fake/f6ee43cceea74731
/loki/chunks/fake/f6ee43cceea74731/MThlMDhkMzIyYjQ6MThlMDhkMzIyYjc6MWJiZjBkNmY=
/loki/retention
/loki/retention/deletion
/loki/retention/deletion/delete_requests
/loki/retention/deletion/delete_requests/delete_requests
/loki/retention/retention
/loki/retention/retention/filesystem
/loki/retention/retention/filesystem/markers
/loki/tsdb-shipper-cache
/loki/tsdb-shipper-cache/tsdb_index_19786
/loki/tsdb-shipper-cache/tsdb_index_19786/fake
/loki/tsdb-shipper-cache/tsdb_index_19786/fake/1709547695-compactor-1709545020662-1709545038520-498bcaae.tsdb
/loki/tsdb-shipper-active
/loki/tsdb-shipper-active/uploader
/loki/tsdb-shipper-active/uploader/name
/loki/tsdb-shipper-active/wal
/loki/tsdb-shipper-active/wal/filesystem_2023-10-10
/loki/tsdb-shipper-active/wal/filesystem_2023-10-10/1709553649
/loki/tsdb-shipper-active/wal/filesystem_2023-10-10/1709553649/00000000
/loki/tsdb-shipper-active/multitenant
/loki/tsdb-shipper-active/multitenant/tsdb_index_19786
/loki/tsdb-shipper-active/per_tenant
/loki/tsdb-shipper-active/scratch
/loki/wal
/loki/wal/00000031
/loki/wal/checkpoint.000030
/loki/wal/checkpoint.000030/00000000
/loki/wal/00000032
/loki/wal/00000033
/loki/compactor

This is the Promtail configuration:

server:
  disable: true

clients:
  - url: http://${LOKI_ADDR}:${LOKI_HTTP_PORT}${LOKI_PATH}/loki/api/v1/push

positions:
  filename: /tmp/positions.yaml

scrape_configs:
  - job_name: syslog
    syslog:
      listen_address: ${PROMTAIL_ADDRESS}:${PROMTAIL_PORT}
      label_structured_data: yes
      idle_timeout: 1800s
      labels:
        job: "syslog"
        controller_name: ${OVPN_CN}
    relabel_configs:
      - source_labels: ['__syslog_message_hostname']
        target_label: 'host'

Loki is configured to save everything on the local filesystem:

auth_enabled: false
target: "all"

server:
  http_listen_port: ${LOKI_HTTP_PORT}
  grpc_listen_port: ${LOKIGRPCC_PORT}
  http_listen_address: 127.0.0.1
  grpc_listen_address: 127.0.0.1
  http_path_prefix: ${LOKI_PATH}
  log_level: 'debug'

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules

schema_config:
  configs:
    - from: 2023-10-10
      store: tsdb
      object_store: filesystem
      schema: v12
      index:
        prefix: tsdb_index_
        period: "24h"

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

limits_config:
  retention_period: 180d

compactor:
  working_directory: /loki/retention
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 1h
  retention_delete_worker_count: 15

Extra information: Grafana fails to validate the Loki endpoint, but I can correctly explore the labels from Grafana. Instant queries work from Grafana for only about 30 seconds, just like the behavior seen with curl.

I’ve followed countless threads and suggestions in the issue tracker, but with no luck.

Has anyone encountered a similar issue and can you give any suggestions on how to fix it?

Thanks in advance
Giacomo

  1. If you are running in podman (or Docker), you should avoid binding to 127.0.0.1. I’d first try to get rid of that (just remove all those listen/instance address lines and let Loki figure it out).

  2. Try enabling debug logging on the querier and see what you can find.

Thanks for the response!

The containers are running with host networking, so binding to 127.0.0.1 should not be a problem.

But after 4 days of work I finally managed to make it work: the problem is the http_path_prefix option.
There is a bug about it: HTTP path-prefix option results to 404 on 2.4.1 · Issue #4756 · grafana/loki · GitHub
Even though it is marked as closed, the issue is still present.

The solution is to remove the http_path_prefix option. That’s a security concern for me, because I was using a random path, but I will try to find a workaround for my own deployment.
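For anyone hitting the same thing, the change is just dropping that one line from the server block (roughly this, with the same values as the configuration above) and pointing the clients at the bare address:

server:
  http_listen_port: ${LOKI_HTTP_PORT}
  grpc_listen_port: ${LOKIGRPCC_PORT}
  http_listen_address: 127.0.0.1
  grpc_listen_address: 127.0.0.1
  log_level: 'debug'

$ LOKI_ADDR=http://127.0.0.1:${LOKI_HTTP_PORT} logcli series --analyze-labels '{host="NethSec"}'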

In the end, Loki does not lose any data, but the data was not accessible through the query frontend.

How can I do that? I already enabled debug logging on Loki. Is there any other option?
I think it could be helpful in the future.