Loki crashes every few days

Hi,

my Loki crashes every few days.
It's the newest 64-bit binary download.
The system is a quad-core Celeron with 8 GB of RAM.
Is there any way to get detailed log files for these crashes?

All I have is the systemctl output:

loki.service - Loki service
   Loaded: loaded (/etc/systemd/system/loki.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2021-10-04 00:28:25 CEST; 21h ago
  Process: 17068 ExecStart=/usr/local/bin/loki -config.file /etc/loki-local-config.yaml (code=exited, status=2)
 Main PID: 17068 (code=exited, status=2)

Okt 04 00:28:24 core1 loki[17068]: google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0e35936d0, 0x1, 0x0, 0x0, 0x0, 0x0)
Okt 04 00:28:24 core1 loki[17068]:         /src/loki/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:407 +0xff
Okt 04 00:28:24 core1 loki[17068]: google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc03aef3b60, 0x0, 0x0)
Okt 04 00:28:24 core1 loki[17068]:         /src/loki/vendor/google.golang.org/grpc/internal/transport/controlbuf.go:527 +0x1dd
Okt 04 00:28:24 core1 loki[17068]: google.golang.org/grpc/internal/transport.newHTTP2Server.func2(0xc0007fca80)
Okt 04 00:28:24 core1 loki[17068]:         /src/loki/vendor/google.golang.org/grpc/internal/transport/http2_server.go:292 +0xd7
Okt 04 00:28:24 core1 loki[17068]: created by google.golang.org/grpc/internal/transport.newHTTP2Server
Okt 04 00:28:24 core1 loki[17068]:         /src/loki/vendor/google.golang.org/grpc/internal/transport/http2_server.go:289 +0x1110
Okt 04 00:28:25 core1 systemd[1]: loki.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Okt 04 00:28:25 core1 systemd[1]: loki.service: Failed with result 'exit-code'.

Thanks for your help.

Hi,
Do you have some kind of monitoring on the RAM, like Grafana, to show its usage over time? In my experience this happens when RAM usage gets high because of the ingesters. I would also recommend not deploying the all-in-one setup but running the components as separate services.
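
On the log question: the full panic usually ends up in the journal, and you can raise Loki's verbosity. A sketch, assuming the loki.service unit from your output (-log.level is a standard Loki flag; debug is noisy):

# Dump everything around the last crash, including the complete Go stack trace
journalctl -u loki.service --since "2021-10-04 00:25" --no-pager > loki-crash.log

# Run with more verbose logging (or set log_level: debug under server: in the YAML)
/usr/local/bin/loki -config.file /etc/loki-local-config.yaml -log.level=debug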

Hello @dror1212,

Grafana is monitoring the host system.
I can see that RAM usage was high for a while.
But how much RAM should be enough for Loki?
Am I right that this is a bug in the code?
Normally the program should abort the operation, not take itself down. :slight_smile:
I am monitoring an overview for OPNsense: input and output counts per hour, plus a top 100 of IPs and ports in the firewall block list. Everything covers at most the last day.

One of the ingester configuration values is

chunk_retain_period

and it is used to hold some of the data in RAM as a kind of cache. Try lowering it and check if that helps.
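
It sits under the ingester block; a minimal sketch with example values:

ingester:
  chunk_retain_period: 30s   # how long flushed chunks are kept in memory for queries
  chunk_idle_period: 5m      # flush chunks that have received no new data for this long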

Hi @dror1212,

The default for this value seems to be 30s.
I set it to 20s and will recheck.

Thx

It's 15 minutes by default; you might have looked at the wrong variable.
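
If you want to be certain which value is actually in effect, Loki can print its fully resolved configuration at startup; a sketch (the -print-config-stderr flag should be available in recent releases):

/usr/local/bin/loki -config.file /etc/loki-local-config.yaml -print-config-stderr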

It's in /etc/loki-local-config.yaml:

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 20s
  max_transfer_retries: 0

OK, so my mistake, I looked at the documentation when I quoted the defaults.
It can't be the problem if it's that low… can you send your full configuration?

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 5m
  chunk_retain_period: 20s
  max_transfer_retries: 0

schema_config:
  configs:
    - from: 2018-04-15
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h

storage_config:
  boltdb:
    directory: /usr/share/loki/index

  filesystem:
    directory: /usr/share/loki/chunks

limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h

chunk_store_config:
  max_look_back_period: 0s

table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
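
Not a fix for the panic itself, but while this gets debugged, systemd can bring Loki back up automatically after a crash. A sketch of a drop-in override, assuming the loki.service unit shown above:

# /etc/systemd/system/loki.service.d/override.conf
[Service]
Restart=on-failure
RestartSec=10s

# apply it with:
#   systemctl daemon-reload && systemctl restart loki.service

systemd layers drop-ins on top of the original unit, so the existing ExecStart line stays untouched.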