Recently, I found that the Loki ingester does not de-duplicate well when the replication factor (RF) is greater than 1.
When the ingesters do not de-duplicate chunks and flush duplicated chunks to S3, there is a capacity problem, because unnecessary data is written to S3 even though the querier de-duplicates at query time. I also think this puts a considerable load on the querier, since it has to fetch all of the data and check whether each chunk is a duplicate.
1. Current Situation
Here are my chunk flushed bytes and chunk de-duplicated bytes:
We don’t run Loki with a replication factor of 3, so please take my comment with a grain of salt.
Deduplication based on chunk hash is tricky, because it relies on the chunks from all ingesters being identical. But given that ingesters start at different times, and that chunks are cut based on a rough size, they most likely won’t be the same. Loki deals with this by cutting chunks based on time instead, with an additional setting for the minimum utilization of the expected chunk size. See How Loki Reduces Log Storage | Grafana Labs for a good explanation.
So make sure sync_period and sync_min_utilization are configured accordingly. Also double-check your chunk target size and chunk idle period to make sure chunks aren’t being flushed too often. If it still isn’t working, please share your Loki config as well.
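To illustrate, the relevant knobs all live under the ingester block of the Loki config. The values below are only placeholders, not recommendations for your workload:

```yaml
# Illustrative Loki ingester settings only; tune the values for your workload.
ingester:
  # Cut chunks at synchronized points in time across ingesters, so that the
  # replicas produce identical chunks that de-duplicate by hash.
  sync_period: 15m
  # Only perform the sync cut if the chunk has reached at least this fraction
  # of chunk_target_size, to avoid flushing lots of tiny chunks.
  sync_min_utilization: 0.5
  # How long a chunk can sit idle (no new logs) before it is flushed anyway.
  chunk_idle_period: 30m
  # Target compressed chunk size in bytes; chunks reaching this are cut regardless.
  chunk_target_size: 1572864
  # Upper bound on how long a chunk can stay in memory before being flushed.
  max_chunk_age: 2h
```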
If a chunk reaches chunk_target_size and is flushed before the sync_period (e.g. 15m), I expected that deduplication would not perform well, so I raised chunk_target_size quite a lot.
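Roughly the kind of change I mean, with illustrative values rather than my exact config:

```yaml
# Illustrative values only, not my real config.
ingester:
  sync_period: 15m
  sync_min_utilization: 0.3
  # Raised well above the 1.5 MB default so that chunks are cut by
  # sync_period (time) rather than by size before the sync point.
  chunk_target_size: 6291456
```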
You might try posting your question in the community Slack as well; perhaps someone more knowledgeable than I am, or a developer, could chime in.
We only use a replication factor of 1. The primary driver for us is cost. Obviously this goes against the recommendation, but with simple scalable mode and the WAL, and given that we very rarely lose a node to the point that the WAL becomes unrecoverable, we feel it’s an acceptable risk.
I’d also recommend not scaling down the writer containers automatically. Because there is only one copy of the logs at any given time, we decided not to scale down writers automatically, and only do it manually when needed, so that we can make sure the chunks are flushed properly.
1. Pause log transmission from the log driver (e.g. Fluent Bit, Promtail, a Kafka consumer, …)
2. Then force-flush the chunks (see the sketch below)
3. Then restart the Loki writer
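For step 2, I assume “force flushing” means calling the ingester’s flush endpoint on the writer, something along these lines (this is my assumption; the host/port and the exact endpoint depend on the deployment and the Loki version):

```sh
# Assumed writer address; adjust host/port for your deployment.
# Flush all in-memory chunks without shutting the ingester down:
curl -X POST http://loki-writer:3100/flush

# Or, on newer Loki versions, flush and shut the ingester down in one call:
curl -X POST "http://loki-writer:3100/ingester/shutdown?flush=true"
```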
Is my understanding correct?
If so, I would like to change the log pipeline as follows.