Compaction error in Loki logs - invalid tsdb path

Hi!

I’ve got Loki running on Docker Swarm. The Loki container has a volume mounted to store the logs/chunks, compaction/retention is enabled via the compactor, and everything appears to be working fine. But I keep seeing the repeating errors below in the Loki logs, which make me suspicious. They always seem to refer to the exact same index/tsdb path.

2024-05-12 23:58:12.842	
level=error ts=2024-05-12T21:58:12.842199624Z caller=compactor.go:601 msg="failed to compact files" table=index_19412 err="invalid tsdb path: /loki/tsdb-shipper-compactor/index_19412/fb4747e12b92-1623764091715318708-1677258561"

2024-05-12 23:48:12.882	
level=error ts=2024-05-12T21:48:12.882135921Z caller=compactor.go:523 msg="failed to run compaction" err="invalid tsdb path: /loki/tsdb-shipper-compactor/index_19412/fb4747e12b92-1623764091715318708-1677258561"

Any idea/suggestion as to what causes them, whether they are relevant, and how to mitigate them?
The only suggestions I’ve found so far seem to point towards a conflicting schema_config, but since I’m only using a single schema entry, that shouldn’t be the case?
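
For reference, what I understand those suggestions to mean by a "conflicting" schema_config is something like the sketch below, where two entries overlap or point different stores at the same index prefix. This is purely illustrative, not my actual config:

schema_config:
  configs:
    # illustrative only: two entries starting on the same date, pointing
    # different stores at the same index prefix, would conflict
    - from: 2020-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
    - from: 2020-01-01
      store: tsdb
      object_store: filesystem
      schema: v12
      index:
        prefix: index_
        period: 24h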

Loki v2.9.4 with the following configuration:

target: all

auth_enabled: true

server:
  http_listen_port: ${HTTP_LISTEN_PORT:-3100}
  http_server_read_timeout: 300s
  http_server_write_timeout: 300s
  log_level: ${LOG_LEVEL:-info} 

common:
  instance_addr: ${INSTANCE_ADDR:-127.0.0.1}
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks 
      rules_directory: /loki/rules
  ring:
    kvstore:
      store: inmemory
    heartbeat_timeout: 10m 
  replication_factor: 1

schema_config:
  configs:
    - from: 2020-01-01
      store: tsdb
      object_store: filesystem
      schema: v12
      index:
        prefix: index_
        period: 24h # The index period must be 24h for the compactor.

ingester:
  chunk_encoding: snappy
  chunk_target_size: 1572864
  chunk_idle_period: 2h

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

compactor:
  shared_store: filesystem
  working_directory: /loki/tsdb-shipper-compactor
  compaction_interval: 10m 
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150

limits_config:
  retention_period: ${RETENTION_PERIOD:-30d}
  max_query_lookback: 90d

frontend:
  max_outstanding_per_tenant: 2048
  compress_responses: true
  log_queries_longer_than: 20s

query_scheduler:
  max_outstanding_requests_per_tenant: 32000

Does the file /loki/tsdb-shipper-compactor/index_19412/fb4747e12b92-1623764091715318708-1677258561 actually exist? Can you query for logs during that time frame?

Thanks for your reply. I’m probably still somewhat confused about the actual storage structure, but no: I see a folder /loki/index/ with other index files in it (which are actual files, not folders?), and I see the folders deletion and retention inside /loki/tsdb-shipper-compactor/. Within retention/filesystem/markers I find timestamp files that seem to correspond to my logging/retention time window.

As for the time frame: if I interpret the numbers in the file name correctly (they look like Unix timestamps), that would be somewhere between June 2021 and February 2023, so way before I started collecting logs.

I think I might have figured it out. The cause could have been a misconfigured old bind mount volume in the Docker setup, where old compaction data might still have been present. After wiping that volume, the errors no longer appear.
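
For anyone hitting the same error, this is roughly the relevant part of my stack file after the cleanup. It is only a sketch; the service and volume names are placeholders, not my exact setup:

version: "3.8"

services:
  loki:
    image: grafana/loki:2.9.4
    command: -config.file=/etc/loki/loki-config.yaml -config.expand-env=true
    volumes:
      # a single, clean named volume for chunks, index and the compactor
      # working directory, replacing the old bind mount that still held
      # stale compaction data
      - loki_data:/loki

volumes:
  loki_data:

Using a fresh named volume instead of re-using the old bind mount path avoids picking up leftover index/compactor files from a previous deployment.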
