Debugging a Loki pod in CrashLoopBackOff

Hi, in a k8s cluster deployed with loki-stack I have a Loki server that worked well until its volume filled up (100%). It then started eating RAM. After I enlarged the volume it showed permission problems, which I fixed with chown/chmod.

Now I can’t start it. It seems it does not pass the liveness probe (but honestly I can no longer see the warning that I saw this morning). I just get:

# kubectl describe -n monitoring pod/loki-0
Name:         loki-0
Namespace:    monitoring
...
Containers:
  loki:
    Container ID:  containerd://b310f8f6edf97de394424ba21c905340e972013a1b3324b67854ce633c6a2efe
    Image:         grafana/loki:2.5.0
    Image ID:      docker.io/grafana/loki@sha256:f9ef133793af0b8dc9091fb9694edebb2392a17558639b8a17767afddcca7a0f
    Ports:         3100/TCP, 9095/TCP, 7946/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Args:
      -config.file=/etc/loki/loki.yaml
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 06 Oct 2022 13:47:45 +0000
      Finished:     Thu, 06 Oct 2022 13:48:55 +0000
    Ready:          False
    Restart Count:  73
    Liveness:       http-get http://:http-metrics/ready delay=45s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http-metrics/ready delay=45s timeout=1s period=10s #success=1 #failure=3
...
Events:
  Type     Reason   Age                      From     Message
  ----     ------   ----                     ----     -------
  Warning  BackOff  4m32s (x816 over 4h12m)  kubelet  Back-off restarting failed container

and logs report:

# kubectl logs -f -n monitoring loki-0
level=info ts=2022-10-06T13:53:59.136420745Z caller=main.go:106 msg="Starting Loki" version="(version=2.5.0, branch=HEAD, revision=2d9d0ee23)"
level=info ts=2022-10-06T13:53:59.136966393Z caller=server.go:260 http=[::]:3100 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2022-10-06T13:53:59.137071848Z caller=modules.go:597 msg="RulerStorage is not configured in single binary mode and will not be started."
level=info ts=2022-10-06T13:53:59.137615082Z caller=memberlist_client.go:394 msg="Using memberlist cluster node name" name=loki-0-f8157810
level=info ts=2022-10-06T13:53:59.14276134Z caller=memberlist_client.go:513 msg="joined memberlist cluster" reached_nodes=1
level=warn ts=2022-10-06T13:53:59.144747503Z caller=experimental.go:20 msg="experimental feature in use" feature="In-memory (FIFO) cache"
level=info ts=2022-10-06T13:53:59.181752254Z caller=table_manager.go:239 msg="loading table index_19256"
[...]
level=info ts=2022-10-06T13:55:05.103199819Z caller=table.go:443 msg="cleaning up unwanted dbs from table index_19267"
level=info ts=2022-10-06T13:55:05.10380821Z caller=table.go:358 msg="uploading table index_19256"
level=info ts=2022-10-06T13:55:05.411105415Z caller=table.go:385 msg="finished uploading table index_19256"
level=info ts=2022-10-06T13:55:05.411164686Z caller=table.go:443 msg="cleaning up unwanted dbs from table index_19256"
level=info ts=2022-10-06T13:55:05.411244085Z caller=module_service.go:96 msg="module stopped" module=store
level=info ts=2022-10-06T13:55:05.412562643Z caller=modules.go:877 msg="server stopped"
level=info ts=2022-10-06T13:55:05.412613538Z caller=module_service.go:96 msg="module stopped" module=server
level=info ts=2022-10-06T13:55:05.412644661Z caller=loki.go:373 msg="Loki stopped"
level=error ts=2022-10-06T13:55:05.412703241Z caller=log.go:100 msg="error running loki" err="failed services\ngithub.com/grafana/loki/pkg/loki.(*Loki).Run\n\t/src/loki/pkg/loki/loki.go:419\nmain.main\n\t/src/loki/cmd/loki/main.go:108\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581"

At this point the server restarts…
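
One way to test the liveness suspicion is to compare how long the container lived with how long the probe settings allow it to stay unready. The sketch below only does arithmetic on numbers copied from the describe output above. The variable names are made up for illustration, and kubelet timing is approximate, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# Probe settings from `kubectl describe`: delay=45s period=10s #failure=3
delay=45
period=10
failures=3

# Approximate time of the third consecutive failed probe, in seconds after
# container start (first probe at `delay`, then one every `period`).
third_failure=$((delay + (failures - 1) * period))

# Container lifetime from "Started 13:47:45" / "Finished 13:48:55".
lifetime=$(( (13*3600 + 48*60 + 55) - (13*3600 + 47*60 + 45) ))

echo "third failed probe at ~${third_failure}s, container lived ${lifetime}s"
```

The log excerpt shows a similar picture: Loki starts at 13:53:59 and stops at 13:55:05, about 66 seconds later, while it is still working through the index tables. If the numbers line up like this, the kubelet may be killing the container before /ready ever succeeds. In that case the pod events should show an Unhealthy warning mentioning the liveness probe, and `kubectl logs -n monitoring loki-0 --previous` shows the output of the container that was killed.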

When I had the permission problems, the logs clearly said so. What else could it be? How can I debug it?

TIA

Try setting the Loki log level to debug to see if you can get more clues.
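
For reference, a minimal sketch of what that could look like in the file passed via -config.file (/etc/loki/loki.yaml in the pod above). It assumes the rest of the config stays unchanged and that you can edit the source of that file. With a Helm-based install it is usually rendered from chart values, so the change would go there instead.

```yaml
# Fragment of /etc/loki/loki.yaml; merge into the existing `server:` block
# rather than adding a second one.
server:
  log_level: debug   # default is info
```

Loki also accepts the same setting as a command-line flag, -log.level=debug, next to the existing -config.file argument.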
