Background: I was on the Grafana page and queried about a week of logs, which drove the memory usage of the loki-write-1 pod up to 8G. I then restarted the loki-write-1 pod, but it has stayed unready ever since and I don't understand why.
root@qcloud-singapore-3-dhysrhw-web-1:~/v1.30/addons/loki# kubectl -n monitoring get pods
NAME                                                   READY   STATUS    RESTARTS   AGE
loki-backend-0                                         2/2     Running   0          26d
loki-backend-1                                         2/2     Running   0          26d
loki-canary-29b44                                      1/1     Running   0          26d
loki-canary-dvd77                                      1/1     Running   0          26d
loki-canary-tx6m6                                      1/1     Running   0          26d
loki-chunks-cache-0                                    2/2     Running   0          26d
loki-gateway-746484b579-lnn8n                          1/1     Running   0          26d
loki-read-97fc97848-6xpnn                              1/1     Running   0          26d
loki-read-97fc97848-nvbfk                              1/1     Running   0          26d
loki-results-cache-0                                   2/2     Running   0          26d
loki-write-0                                           1/1     Running   0          26d
loki-write-1                                           0/1     Running   0          22m
monitoring-ingress-nginx-controller-6f5d55d94b-pfrvx   2/2     Running   0          26d
promtail-ltwdf                                         1/1     Running   0          26d
promtail-x98q5                                         1/1     Running   0          26d
promtail-zr88g                                         1/1     Running   0          26d
I then checked the logs of loki-write-1, which read as follows:
level=info ts=2024-12-11T03:35:55.819322153Z caller=flush.go:304 component=ingester msg="flushing stream" user=fake fp=73db404a2542f038 immediate=true num_chunks=1 total_comp="254 B" avg_comp="254 B" total_uncomp="113 B" avg_uncomp="113 B" forced=1 labels="{app=\"hdh5-server\", container=\"setsysctl\", filename=\"/var/log/pods/hdh5_game-hdh5-server-10002-758cd9655b-8wkwv_e68508a8-37ac-4db5-8082-3a4127f47557/setsysctl/0.log\", instance=\"game-10002\", job=\"hdh5/hdh5-server\", namespace=\"hdh5\", node_name=\"10.18.44.12\", pod=\"game-hdh5-server-10002-758cd9655b-8wkwv\", service_name=\"hdh5-server\", stream=\"stdout\"}"
level=error ts=2024-12-11T03:35:55.830387705Z caller=flush.go:261 component=ingester loop=24 org_id=fake msg="failed to flush" retries=7 err="failed to flush chunks: store put chunk: AccessDenied: Access Denied.\n\tstatus code: 403, request id: Njc1OTA4OWJfOGFlZjc4MGJfMmJhOTFfNjdmYjBjYw==, host id: , num_chunks: 1, labels: {app=\"hdh5-server\", container=\"setsysctl\", filename=\"/var/log/pods/hdh5_game-hdh5-server-10002-758cd9655b-8wkwv_e68508a8-37ac-4db5-8082-3a4127f47557/setsysctl/0.log\", instance=\"game-10002\", job=\"hdh5/hdh5-server\", namespace=\"hdh5\", node_name=\"10.18.44.12\", pod=\"game-hdh5-server-10002-758cd9655b-8wkwv\", service_name=\"hdh5-server\", stream=\"stdout\"}"
level=info ts=2024-12-11T03:35:56.090581336Z caller=flush.go:304 component=ingester msg="flushing stream" user=fake fp=36b975fc936ca164 immediate=true num_chunks=1 total_comp="80 kB" avg_comp="80 kB" total_uncomp="446 kB" avg_uncomp="446 kB" forced=1 labels="{app=\"promtail\", container=\"promtail\", filename=\"/var/log/pods/monitoring_promtail-zr88g_7d22001e-7dbf-42e9-b7df-c2736a526238/promtail/0.log\", instance=\"promtail\", job=\"monitoring/promtail\", namespace=\"monitoring\", node_name=\"10.18.44.8\", pod=\"promtail-zr88g\", service_name=\"promtail\", stream=\"stderr\"}"
level=error ts=2024-12-11T03:35:56.106623535Z caller=flush.go:261 component=ingester loop=4 org_id=fake msg="failed to flush" retries=8 err="failed to flush chunks: store put chunk: AccessDenied: Access Denied.\n\tstatus code: 403, request id: Njc1OTA4OWNfOGFlZjc4MGJfMmJhOWVfNjgwNDBjZA==, host id: , num_chunks: 1, labels: {app=\"promtail\", container=\"promtail\", filename=\"/var/log/pods/monitoring_promtail-zr88g_7d22001e-7dbf-42e9-b7df-c2736a526238/promtail/0.log\", instance=\"promtail\", job=\"monitoring/promtail\", namespace=\"monitoring\", node_name=\"10.18.44.8\", pod=\"promtail-zr88g\", service_name=\"promtail\", stream=\"stderr\"}"
level=info ts=2024-12-11T03:35:59.577310697Z caller=flush.go:304 component=ingester msg="flushing stream" user=fake fp=d952a0f6e200a3c8 immediate=true num_chunks=1 total_comp="50 kB" avg_comp="50 kB" total_uncomp="106 kB" avg_uncomp="106 kB" forced=1 labels="{filename=\"/var/local/game-log-10001/debug/2024112917.log\", job=\"阿斯加德-debug\", service_name=\"阿斯加德-debug\"}"
level=error ts=2024-12-11T03:35:59.58793773Z caller=flush.go:261 component=ingester loop=8 org_id=fake msg="failed to flush" retries=8 err="failed to flush chunks: store put chunk: AccessDenied: Access Denied.\n\tstatus code: 403, request id: Njc1OTA4OWZfOGFlZjc4MGJfMmJhOTFfNjdmYjEzZg==, host id: , num_chunks: 1, labels: {filename=\"/var/local/game-log-10001/debug/2024112917.log\", job=\"阿斯加德-debug\", service_name=\"阿斯加德-debug\"}"
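To get a rough picture of how many flush attempts are failing and with which status code, the log lines above could be tallied with a small script. This is only a minimal sketch: it assumes the pod logs have been saved to a text file, and the function names `parse_logfmt` and `summarize_flush_failures` are made up for illustration and are not part of Loki or kubectl.

```python
import re
from collections import Counter

# Matches key=value or key="quoted value" pairs as they appear in the
# log lines above. Escaped quotes (\") inside a quoted value are allowed.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Return the top-level key/value pairs of one log line as a dict.

    Outer quotes are stripped from quoted values; backslash escapes
    inside the value are left untouched.
    """
    fields = {}
    for key, value in PAIR_RE.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        # Keep the first occurrence of a key if it is repeated.
        fields.setdefault(key, value)
    return fields


def summarize_flush_failures(lines):
    """Count 'failed to flush' error lines, grouped by HTTP status code.

    Lines whose err field carries no 'status code: NNN' text are
    counted under the key 'unknown'.
    """
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("level") != "error":
            continue
        if fields.get("msg") != "failed to flush":
            continue
        match = re.search(r"status code: (\d+)", fields.get("err", ""))
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

With the logs saved to a file (the file name here is just an example), it could be called as `summarize_flush_failures(open("loki-write-1.log", encoding="utf-8"))`. In the excerpt above every failure is a 403 AccessDenied on `store put chunk`.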
I don't understand Loki well enough yet. How should I optimise my Loki cluster, and what is the correct way to handle the unready pod?