Loki container root filesystem filling up with files starting with dGVuYW50LUdQ

Hi,
I’m running Loki 2.3.0 inside a container, and after one day the container’s / filesystem holds over 30,000 files whose names start with dGVuYW50LUdQ (in case that’s important). The count keeps growing and the host’s Docker filesystem is slowly filling up. I’m not sure where these files are coming from or whether there should be a cleanup function.

The contents of one of the files:

od -c /dGVuYW50LUdQVC80M2Y4YzIxMmZlZTJkMjU0OjE3Y2E0NWQwMTg5OjE3Y2E0NzhiNzMwOmJhNWIwYzlh|more
0000000 \0 \0 001 024 377 006 \0 \0 s N a P p Y 001 002
0000020 001 \0 201 z 242 363 { " f i n g e r p r
0000040 i n t " : 4 8 9 7 8 7 7 9 8 1 6
0000060 0 6 8 9 2 1 1 6 , " u s e r I D
0000100 " : " t e n a n t - G P T " , "
0000120 f r o m " : 1 6 3 4 8 4 5 1 3 1
0000140 . 1 4 5 , " t h r o u g h " : 1
0000160 6 3 4 8 4 6 9 4 7 . 1 2 0 , " m
0000200 e t r i c " : { " _ _ n a m e _
0000220 _ " : " l o g s " , " f i l e n
0000240 a m e " : " / d a t a 1 / p r i
0000260 m a r y / g p s e g 2 7 / p g _
0000300 l o g / g p d b - 2 0 2 1 - 1 0
0000320 - 0 9 _ 1 8 5 7 2 4 . c s v " ,
0000340 " h o s t " : " u l t g p l 1 1
0000360 1 " , " j o b " : " g p t l o g
0000400 s " } , " e n c o d i n g " : 1

My storage config writes to s3 and that appears to be working.

Does anyone have any ideas?

Cheers,

Greg

Hi, yes those are your chunk files.

$ echo -n 'dGVuYW50LUdQVC80M2Y4YzIxMmZlZTJkMjU0OjE3Y2E0NWQwMTg5OjE3Y2E0NzhiNzMwOmJhNWIwYzlh' | base64 -d            
tenant-GPT/43f8c212fee2d254:17ca45d0189:17ca478b730:ba5b0c9a
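
To make the decoded name easier to read, here is a minimal sketch that splits it into its parts. The layout (tenant, then fingerprint, from, through and checksum, all in hex) is inferred from the decoded example above rather than taken from the Loki documentation, and the function name decode_chunk_name is made up for illustration.

```shell
# Decode a base64 chunk file name and split the resulting key.
# Assumed layout, inferred from the example in this thread:
#   <tenant>/<fingerprint hex>:<from hex, ms>:<through hex, ms>:<checksum hex>
decode_chunk_name() {
  key=$(printf '%s' "$1" | base64 -d) || return 1
  tenant=${key%%/*}   # everything before the first "/"
  rest=${key#*/}      # fingerprint:from:through:checksum
  old_ifs=$IFS
  IFS=:
  set -- $rest        # split the remainder on ":" into $1..$4
  IFS=$old_ifs
  printf 'tenant=%s\n' "$tenant"
  printf 'fingerprint=%s\n' "$1"
  printf 'from_ms=%d\n' "$((0x$2))"      # hex milliseconds to decimal
  printf 'through_ms=%d\n' "$((0x$3))"
  printf 'checksum=%s\n' "$4"
}

decode_chunk_name 'dGVuYW50LUdQVC80M2Y4YzIxMmZlZTJkMjU0OjE3Y2E0NWQwMTg5OjE3Y2E0NzhiNzMwOmJhNWIwYzlh'
```

For the file above this gives from_ms=1634845131145 and through_ms=1634846947120, which line up with the "from":1634845131.145 and "through":1634846947.120 values visible in the od dump.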

You can use the chunks-inspect tool (loki/cmd/chunks-inspect on the main branch of the grafana/loki repository on GitHub) to inspect these files if you’re curious.

Can you provide your config please?

Hi Danny,
ok, thanks for explaining that. Still so much to learn.
My config is

auth_enabled: true

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

ingester:
  wal:
    enabled: true
    dir: /mnt/loki/wal
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h       
  max_chunk_age: 30m           
  chunk_target_size: 1048576 
  chunk_retain_period: 5m    
  max_transfer_retries: 0     # Chunk transfers disabled

schema_config:
  configs:
  - from: 2020-05-15
    store: boltdb-shipper
    object_store: filesystem
    schema: v11
    index:
      prefix: index_
      period: 24h

storage_config:
  aws:
    s3: s3://username:password@S3EquivalentLocation/lokidata3103
    s3forcepathstyle: true

  boltdb_shipper:
    active_index_directory: /mnt/loki/index
    shared_store: s3
    cache_location: /mnt/loki/boltdb-cache

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  per_tenant_override_config: /mnt/loki/override-config.yaml
  ingestion_rate_mb: 8

compactor:
  working_directory: /mnt/loki/boltdb-shipper-compactor
  shared_store: filesystem

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: true
  retention_period: 168h

ruler:
  storage:
    type: local
    local:
      directory: /mnt/loki/rules
  rule_path: /mnt/loki/rules-temp
  alertmanager_url: http://localhost:9093
  ring:
    kvstore:
      store: inmemory
  enable_api: true

and I kick off the container with docker run -v $(pwd):/mnt/loki

I had some problems with the k8s cluster where it will eventually end up, so I’m running it under Docker for the time being. Everything seems to be working; it’s just that the container is filling up.

Regards,

Greg

Can you please make your config preformatted? (edit, select config, Ctrl-E)
It’s a bit difficult to read like this.

auth_enabled: true

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

ingester:
  wal:
    enabled: true
    dir: /mnt/loki/wal
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h       # Any chunk not receiving new logs in this time will be flushed
  max_chunk_age: 30m           # All chunks will be flushed when they hit this age, default is 1h
  chunk_target_size: 1048576  # Loki will attempt to build chunks up to 1MB, flushing earlier if chunk_idle_period or max_chunk_age is reached first
  chunk_retain_period: 5m    # Must be greater than index read cache TTL if using an index cache (Default index read cache TTL is 5m)
  max_transfer_retries: 0     # Chunk transfers disabled

schema_config:
  configs:
  - from: 2020-05-15
    store: boltdb-shipper
    object_store: filesystem
    schema: v11
    index:
      prefix: index_
      period: 24h

storage_config:
  aws:
    s3: s3://username:password@S3EquivalentLocation/lokidata3103
    s3forcepathstyle: true

  boltdb_shipper:
    active_index_directory: /mnt/loki/index
    shared_store: s3
    cache_location: /mnt/loki/boltdb-cache

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  per_tenant_override_config: /mnt/loki/override-config.yaml
  ingestion_rate_mb: 8

compactor:
  working_directory: /mnt/loki/boltdb-shipper-compactor
  shared_store: filesystem

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: true
  retention_period: 168h

ruler:
  storage:
    type: local
    local:
      directory: /mnt/loki/rules
  rule_path: /mnt/loki/rules-temp
  alertmanager_url: http://localhost:9093
  ring:
    kvstore:
      store: inmemory
  enable_api: true

I think you need to change object_store to s3.
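
In the posted config that means the object_store: filesystem line under schema_config becomes object_store: s3. As a rough sketch, the change could be applied with sed as below; the function name fix_object_store and the file names in the usage comment are placeholders, not anything from the thread. My guess (not confirmed here) is that with the filesystem store selected and no filesystem directory configured, chunks were written relative to the process's working directory, which would explain them landing in /.

```shell
# Print a copy of the config with the chunk object store switched from
# the local filesystem to s3. Only lines of the form
# "object_store: filesystem" are changed; shared_store lines are left alone.
fix_object_store() {
  sed 's/^\([[:space:]]*object_store:[[:space:]]*\)filesystem[[:space:]]*$/\1s3/' "$1"
}

# Usage (file names are hypothetical):
#   fix_object_store loki-config.yaml > loki-config.fixed.yaml
```

The function writes to standard output instead of editing in place, so the result can be diffed against the original before restarting Loki.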

Hi,
Thanks very much, that worked like a charm. Now my storage system is awash with data.
I guess I was expecting the boltdb-shipper storage destination to be taken from the boltdb_shipper section of the storage config, and I was also confused about why one name has a dash and the other an underscore.

Now I’m getting
level=error ts=2021-10-22T10:16:29.714088781Z caller=flush.go:220 org_id=tenant-GPT msg="failed to flush user" err="OperationAborted: Object already exists.\n\tstatus code: 409, request id: , host id: "

but that might be because it’s trying to process a bit of information that has been banking up. The CPU has gone through the roof too.

Anyway, thanks again for your quick diagnosis and resolution.

Regards

Greg

Our configs could definitely use a lot of work. We’ve got some updates coming in the next few months, so keep an eye out for that.

My pleasure, Greg! Happy Lokiing!