Hello, I’m currently exploring an idea to deploy Loki in a Kubernetes environment that doesn’t have access to persistent volumes (hence no WAL) but does have access to Kafka. My primary concern is making sure that all logs routed to Kafka are safely persisted in Loki without data loss.
My idea is:
1. Deploy Loki write with an additional container that fetches logs from Kafka.
2. Have that container write the logs to the Loki write container (rough sketch below).
3. Periodically call shutdown with flush on the Loki write container.
4. After step 3 succeeds, commit the consumer offset to Kafka.
My question: will this scenario work to ensure the logs are persisted to Loki? And can this pod (which contains the Loki write container and the custom container) be scaled horizontally?
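For step 2, I’m picturing something along these lines (untested sketch; it assumes the Loki write container listens on its default HTTP port inside the pod and exposes the standard push API, and the URL and labels are placeholders):

```python
import json
import time

import requests

# Hypothetical in-pod address of the Loki write container; adjust as needed.
LOKI_PUSH_URL = "http://localhost:3100/loki/api/v1/push"


def push_lines(labels: dict, lines: list[str]) -> None:
    """Push a batch of log lines to Loki through its HTTP push API."""
    now_ns = str(time.time_ns())  # Loki expects nanosecond timestamps as strings
    payload = {
        "streams": [
            {
                "stream": labels,  # the label set, e.g. {"job": "kafka"}
                "values": [[now_ns, line] for line in lines],
            }
        ]
    }
    resp = requests.post(
        LOKI_PUSH_URL,
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()  # Loki answers 204 No Content on success
```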
You should just call the /flush endpoint instead of shutting down. Shutting down and restarting on every flush seems rather inefficient.
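Something like this in the custom container, assuming the ingester’s /flush endpoint is reachable on the write container’s HTTP port (a sketch, not tested):

```python
import requests

# Hypothetical in-pod address of the Loki write container.
LOKI_WRITE_URL = "http://localhost:3100"


def flush_ingester() -> None:
    """Ask the ingester to flush its in-memory chunks to long-term storage.

    Unlike the shutdown endpoint, /flush leaves the process running, so there
    is no restart penalty on every cycle.
    """
    resp = requests.post(f"{LOKI_WRITE_URL}/flush", timeout=60)
    resp.raise_for_status()
```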
I think it sounds good on paper. In terms of scaling, that’s up to how you code your Kafka extractor. You could use the ZooKeeper cluster backing Kafka to store your extractor state, so the extractors form a cluster for parallel processing, and then scale your Kafka extractor along with the Loki writer. Your flow would be (rough sketch after the list):
One extractor per Loki writer.
Extractor retrieves logs from Kafka.
Extractor sends the logs to its dedicated writer.
Extractor sends /flush to its dedicated writer.
Extractor writes the offset back to Kafka.
Repeat.
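Roughly like this, reusing the push_lines / flush_ingester helpers sketched above. This version leans on Kafka consumer groups for partition assignment instead of hand-rolled ZooKeeper state, and the config values, topic name, and batch size are all illustrative:

```python
from confluent_kafka import Consumer

# Illustrative consumer config; auto-commit must stay off so offsets are
# only committed after a successful Loki flush.
consumer = Consumer({
    "bootstrap.servers": "kafka:9092",
    "group.id": "loki-extractor",
    "enable.auto.commit": False,
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["logs"])  # hypothetical topic name

try:
    while True:
        batch = []
        # 1. Retrieve a batch of log lines from Kafka.
        while len(batch) < 500:
            msg = consumer.poll(timeout=1.0)
            if msg is None:
                break
            if msg.error():
                raise RuntimeError(msg.error())
            batch.append(msg.value().decode("utf-8"))
        if not batch:
            continue
        # 2. Send the batch to this pod's dedicated Loki writer.
        push_lines({"job": "kafka-extractor"}, batch)
        # 3. Force the writer to flush its chunks to object storage.
        flush_ingester()
        # 4. Only now record progress back in Kafka.
        consumer.commit(asynchronous=False)
finally:
    consumer.close()
```

Each pod would run one extractor plus one Loki write container, and the consumer group spreads the topic’s partitions across pods as you scale them out.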
Performance would probably be the primary concern here. You probably don’t want to flush too often (each flush writes small chunks out to storage), but you also don’t want to wait too long (the longer you wait, the more data has to be replayed from Kafka if the writer dies before a flush).
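One way to strike that balance, with purely illustrative thresholds:

```python
import time


class FlushPolicy:
    """Decide when to flush: on buffered volume or elapsed time, whichever
    comes first. Thresholds are illustrative; tune them against your ingest
    rate and how much Kafka replay you can tolerate after a crash."""

    def __init__(self, max_lines: int = 5_000, max_age_s: float = 30.0):
        self.max_lines = max_lines
        self.max_age_s = max_age_s
        self.buffered = 0
        self.last_flush = time.monotonic()

    def record(self, batch_size: int) -> bool:
        """Count a pushed batch; return True when a flush (and offset commit) is due."""
        self.buffered += batch_size
        due = (self.buffered >= self.max_lines
               or time.monotonic() - self.last_flush >= self.max_age_s)
        if due:
            self.buffered = 0
            self.last_flush = time.monotonic()
        return due
```

In the loop above you would still push every batch, but only call /flush and commit offsets when record() returns True.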