Does Loki deduplicate messages?

Hi! I am backfilling messages from weeks ago into Grafana Loki.

Sometimes I make a mistake and re-run the same backfilling process, which re-posts the exact same messages to Loki. I am using the /v1/push Loki endpoint.

Do the exact same messages with the exact same nanosecond timestamps get automatically deduplicated by Grafana Loki, or do they result in repeated messages?
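
To make the scenario concrete, here is a minimal sketch of the request body that gets re-posted. It assumes the JSON format of Loki's push API (the full path is usually /loki/api/v1/push); the helper name build_push_payload and the label values are hypothetical, and the actual HTTP call is left out.

```python
import json

def build_push_payload(labels, entries):
    """Build a JSON body for Loki's push endpoint (hypothetical helper).

    labels  -- dict of label name to value identifying the stream
    entries -- iterable of (timestamp_ns, line) pairs; timestamp_ns is an int
    """
    # Loki expects each value as [<nanosecond timestamp as a string>, <log line>].
    values = [[str(ts_ns), line] for ts_ns, line in entries]
    return json.dumps({"streams": [{"stream": labels, "values": values}]})

# Example labels and lines are made up for illustration.
body = build_push_payload(
    {"app": "123"},
    [(1700000000000000000, "first line"), (1700000000000000001, "second line")],
)
```

Re-running the backfill sends a byte-for-byte identical body, which is the case the question is about.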

You cannot write older logs to a log stream (logs with same set of labels) that already has newer logs.

If you were backfilling logs and are sending the logs with the same label, then you can’t backfill again once newer logs are present. You can still change your labels and backfill again, of course.

Hi, thank you for your response.

You can write old logs. There are some settings that control this. First, there are reject_old_samples and reject_old_samples_max_age, which just configure how old the first entries of a stream can be.

The most important one is max_chunk_age. You can set max_chunk_age to a big (ridiculous) value to allow importing older chunks.

With max_chunk_age equal to one hour, for example, I should be able to backfill any number of log lines into the last 30 minutes of a stream, in any chronological order.
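
As a rough sketch of that rule: the check below assumes Loki accepts an out-of-order entry when it is no older than the newest entry in the stream minus half of max_chunk_age. The function name is hypothetical and this is only an illustration of the arithmetic, not Loki's actual code.

```python
from datetime import datetime, timedelta, timezone

def is_within_out_of_order_window(entry_ts, newest_ts, max_chunk_age):
    """Return True if entry_ts is no older than newest_ts - max_chunk_age / 2.

    Assumption: the accepted window is half of max_chunk_age, measured
    back from the newest entry already in the stream.
    """
    return entry_ts >= newest_ts - max_chunk_age / 2

newest = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
one_hour = timedelta(hours=1)

# 15 minutes behind the newest entry: inside the 30-minute window.
recent_ok = is_within_out_of_order_window(newest - timedelta(minutes=15), newest, one_hour)
# 45 minutes behind the newest entry: outside the window.
too_old_ok = is_within_out_of_order_window(newest - timedelta(minutes=45), newest, one_hour)
```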

My question is: are these log lines deduplicated? I am seeing deduplication, but I am surprised that such a feature exists. I suspect a flaw in my testing method and I’m trying to confirm with others.

Some of my log files actually span less than 30 minutes, so it does make a difference. I am considering whether I should first run logcli to confirm the log lines are not already there, or worse, make a diff and upload only the new lines. This is expensive, however, both in programming time and in API requests. There are over 40 GB of logs every day across around 300 streams, which adds complexity.
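
The diff approach could look roughly like the sketch below. It assumes the entries already stored in Loki have been fetched beforehand (for example with logcli or the query API, not shown here) and that an entry is identified by its (timestamp_ns, line) pair; select_new_entries is a hypothetical helper name.

```python
def select_new_entries(candidates, existing):
    """Return the candidates that are not already present, preserving order.

    candidates -- iterable of (timestamp_ns, line) tuples about to be uploaded
    existing   -- iterable of (timestamp_ns, line) tuples already in the stream
    """
    seen = set(existing)
    fresh = []
    for entry in candidates:
        if entry not in seen:
            # Also remember it so repeats inside the same batch are dropped.
            seen.add(entry)
            fresh.append(entry)
    return fresh

already_stored = [(1700000000000000000, "first line")]
batch = [
    (1700000000000000000, "first line"),   # exact duplicate, skipped
    (1700000000000000001, "second line"),  # new, kept
    (1700000000000000001, "second line"),  # repeated within the batch, skipped
]
to_upload = select_new_entries(batch, already_stored)
```

This would have to run per stream, so with around 300 streams it means at least one extra query per stream before every upload.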

Thanks.

Yes, you can write older logs. I was referring to writing older logs into a log stream that already has new logs.

For example, if you backfill logs with labels {app="123"}, with timestamps that range from a week ago to today, and then try to backfill them again with the same labels, it will fail (because the log stream {app="123"} already has newer logs).

So, if you backfill multiple times with the same labels, I suspect only the first attempt was successful, which gave you the impression of deduplication. The only deduplication Loki does that I am aware of is when you configure the Loki ingesters with replication; in that case it deduplicates at query time.