Hi Team,
We recently got a huge cost on S3, specifically on the PutObject API. On checking, we saw that the PVC of the Loki write pod was full and the pod was showing this error:
level=error ts=2025-01-08T04:57:14.850699038Z caller=flush.go:144 org_id=fake msg="failed to flush" err="failed to flush chunks: store put chunk: open /var/loki/boltdb-shipper-active/loki_index_20014/1736311500: no space left on device, num_chunks: 1, labels: {app_kubernetes_io_name=\"pai-spark-driver\", application_component_name=\"spark_job_driver\", application_name=\"spark_job\", job_name=\"qualyshostvulnerabilityhostedareport20241018023531m\", sds_app_type=\"application\"}"
So we increased the PVC size, and after a few minutes the used space dropped to 10% of the new total size. This also brought the S3 cost back down.
I know that once the logs are finalized, Loki pushes them to S3.
My questions are: does Loki push the logs directly to S3? And why did increasing the PVC size clear it down to 10% after a few minutes?
The question is a bit unclear. Yes, Loki pushes logs directly to S3, in the sense that there is nothing between Loki and S3 in terms of writing objects to the bucket. But on the other hand, no, Loki does not write logs to S3 as soon as they come in (just imagine how many files you'd get if that were the case). There are several factors that control how often a chunk file is written, such as `max_chunk_age` and `chunk_target_size`. There are a couple more; you can read through the documentation for more information.
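For reference, these knobs live under the `ingester` section of the Loki config. This is just a sketch, not your config; the values below are the documented Loki 2.x defaults, shown only to illustrate the trade-off between flush frequency and the number of PutObject calls:

```yaml
ingester:
  # Flush a chunk once its compressed size reaches this many bytes,
  # even if it is still receiving writes (default is ~1.5 MB).
  chunk_target_size: 1572864
  # Flush a chunk once it has been open this long, regardless of size.
  # Lower values mean more frequent, smaller PutObject calls to S3.
  max_chunk_age: 2h
  # Flush a chunk that has received no new log lines for this long.
  chunk_idle_period: 30m
```

Larger targets and longer ages mean fewer, bigger objects in S3 (cheaper on the PutObject side) at the cost of more data buffered on the ingester's local disk.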
Perhaps whatever WAL data was backed up in your PVC was being cleared once the pod could flush again?
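One way to check: the WAL directory is set under the same `ingester` section, so you can see what is actually consuming the volume. A minimal sketch, assuming the common layout where the WAL shares the PVC with the active index (the paths here are illustrative, not taken from your setup):

```yaml
ingester:
  wal:
    enabled: true
    # WAL segments are written here; if this directory sits on the same
    # PVC as /var/loki, a backed-up WAL can fill the volume.
    dir: /var/loki/wal
    # Flush in-memory chunks on shutdown so the WAL can be truncated.
    flush_on_shutdown: true
```

Running something like `du -sh /var/loki/*` inside the pod would show whether it was the WAL or the boltdb-shipper index directories that ballooned before the resize.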