Loki ingester killed by OOM

Hi, I'm operating Grafana Loki on Kubernetes.
But the ingester keeps getting killed because of OOM.
I want the ingester to flush all the chunks it holds in memory to object storage and free that memory before it exhausts the pod's memory limit (a rough sketch of what I'm considering follows my config below).
Can anyone help me with this?

  • loki version: 3.0.0 (distributed mode)
  • ingester cpu: 7000m
  • ingester memory: 14Gi
  • distributor received lines: about 1.5K/s
  • distributor received bytes: about 1.5-2.0MB/s
    limits_config:  # Per-tenant limits (including the retention period)
      query_timeout: 30m
      volume_max_series: 1000000 # The maximum number of aggregated series in a log-volume response
      reject_old_samples: true
      reject_old_samples_max_age: 1w
      split_queries_by_interval: 15m # default: 1h
      max_global_streams_per_user: 0 # default: 5000
      max_streams_per_user: 0 # default: 0
      retention_period: 30d 
      per_stream_rate_limit: 128MB # default: 3MB
      per_stream_rate_limit_burst: 256MB # default 15MB
      max_query_parallelism: 64 # default 32
      max_query_series: 100000
      max_query_length: 0 # default 30d1h
      volume_enabled: true
      ingestion_rate_mb: 128  # default 4
      ingestion_burst_size_mb: 256 # default 8
      discover_log_levels: false # default true
    distributor:
      rate_store:
        max_request_parallelism: 100 # default 200
        debug: true
    ingester:
      wal:
        enabled: false
      autoforget_unhealthy: true
      #chunk_retain_period: 1s
      concurrent_flushes: 1024 # How many flushes can happen concurrently from each stream.
      chunk_block_size: 786432 # 768KB default 262144
      chunk_target_size: 6164480  # ~5.9MB, default 1572864 (1.5MB)
      chunk_idle_period: 30m  # default 30m
      # The validity window for unordered writes is the highest timestamp present minus 1/2 * max-chunk-age.
      max_chunk_age: 2h  # default 2h.
      chunk_encoding: snappy  # The algorithm to use for compressing chunks (none, gzip, lz4-64k, snappy, lz4-256k, lz4-1M, lz4, flate, zstd)
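
What I have in mind (just a sketch with guessed values, nothing I have validated) is to make chunks leave ingester memory sooner, and to use the ingester's /flush endpoint to force all in-memory chunks to be flushed when memory gets tight:

    ingester:
      # sketch only: push chunks out of memory earlier (values are guesses, not tested)
      chunk_idle_period: 10m      # flush streams that stop receiving writes sooner (currently 30m)
      max_chunk_age: 1h           # force-flush long-lived chunks earlier (currently 2h)
      chunk_target_size: 1572864  # smaller target chunks mean less data buffered per stream
      wal:
        enabled: true             # not a memory fix, but lets a restarted ingester replay unflushed data instead of losing it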

I have the same issue.

What happened:
I upgraded Promtail to 2.9.1 and kept Loki at 2.5.0. When I restart my pod, Kubernetes kills the Loki pod with OOM, even though memory usage had not actually reached the limit I set.

Kubernetes exit code: 137

Loki config
auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9095
  grpc_server_max_recv_msg_size: 5368709120
  grpc_server_max_send_msg_size: 5368709120

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

querier:
  query_timeout: 30s

query_range:
  # make queries more cache-able by aligning them with their step intervals
  align_queries_with_step: true
  max_retries: 2
  cache_results: true
  results_cache:
    cache:
      # We're going to use the in-process "FIFO" cache
      enable_fifocache: true
      fifocache:
        size: 2048
        validity: 24h

ingester:
  max_chunk_age: 24h0m0s
  chunk_target_size: 8388608

limits_config:
  ingestion_rate_mb: 16
  ingestion_burst_size_mb: 32
  reject_old_samples: false
  reject_old_samples_max_age: 1d
  max_query_length: 200d1h
  max_query_series: 10000
  max_entries_limit_per_query: 10000
  max_cache_freshness_per_query: '10m'
  max_global_streams_per_user: 0
  # parallelize queries in 15min intervals
  split_queries_by_interval: 15m
  retention_period: 200d

frontend_worker:
  frontend_address: 127.0.0.1:9095
  grpc_client_config:
    max_recv_msg_size: 5368709120
    max_send_msg_size: 5368709120
  parallelism: 8

frontend:
  max_outstanding_per_tenant: 1048576
  log_queries_longer_than: 5s
  compress_responses: true

chunk_store_config:
  max_look_back_period: 672h

table_manager:
  retention_deletes_enabled: true
  retention_period: 672h

compactor:
  retention_enabled: true
  retention_delete_delay: 1h0m0s
  delete_request_cancel_period: 1h0m0s

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
      chunks:
        prefix: ""
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

k8s config

limits:
  cpu: 32000m
  memory: 8000Mi
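
For what it's worth, the part of my config I now suspect (purely a guess, not verified) is whatever lets a lot of data sit in memory at once: max_chunk_age: 24h0m0s keeps chunks buffered in the ingester for up to a day, and the 5GiB gRPC message limits allow a single request to be buffered at that size. A more conservative sketch, where the values are assumptions rather than recommendations:

ingester:
  max_chunk_age: 2h           # flush chunks long before the current 24h
  chunk_idle_period: 30m      # flush streams that stop receiving writes
  chunk_target_size: 1572864  # the default; 8MiB target chunks sit in memory longer before flushing

server:
  grpc_server_max_recv_msg_size: 104857600  # 100MiB instead of 5GiB, so one oversized request cannot approach the 8000Mi pod limit
  grpc_server_max_send_msg_size: 104857600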