Best Way to Retrieve Per-Day Logs in Loki with High Volume (1M+ Lines/Day)

I want to retrieve logs covering a period of two months. We generate over one million log lines per day.

  1. Are there ways to interact with Loki other than REST requests, logcli, and the Grafana UI?
  2. What is the best way to interact with Loki so that no logs are lost during retrieval?
  3. Should I build an application on top of Loki to achieve this?

LogCLI is probably your best bet. To export all logs you can do something like this:

/usr/bin/logcli query \
  --timezone=UTC \
  --from=2025-01-01T00:00:00Z \
  --to=2025-02-01T00:00:00Z \
  --output=default \
  --parallel-duration=1h \
  --parallel-max-workers=<NUM> \
  --part-path-prefix=/logcli-result/export \
  '{QUERY_FILTER}'

This will output all the logs between from and to, split into 1-hour intervals, using NUM parallel workers, into files with the configured path prefix. One thing to note is that within each file the log lines are sorted from newest to oldest (the opposite of what you’d usually see in a log file). I don’t think this can be changed.
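If the newest-first order is a problem for whatever consumes the files, one option is to flip each part file after the export. This is only a rough sketch: the function name is made up for illustration, and it assumes every log entry occupies exactly one line in the file (multi-line entries would get scrambled) and that a one-hour part fits comfortably in memory.

```python
def reverse_part_file(src, dst):
    """Write the lines of src to dst in reverse order; return the line count.

    Assumption: one log entry per line. The whole file is held in memory,
    which should be fine for one-hour parts at roughly 1M lines/day.
    """
    with open(src, encoding="utf-8") as fh:
        lines = [line.rstrip("\n") for line in fh]
    with open(dst, "w", encoding="utf-8") as fh:
        for line in reversed(lines):
            fh.write(line + "\n")
    return len(lines)
```

Calling `reverse_part_file(part, part + ".asc")` on each exported file would give you oldest-first copies next to the originals.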

You can also use the API, but you’d have to take into account the number of results returned so you can set the maximum number of returned log lines appropriately, and you’d of course want to loop your API calls over smaller intervals instead of querying for 2 months at once.
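A minimal sketch of that API loop, in case it helps. It splits the range into small windows and pages through each window against `/loki/api/v1/query_range` with a fixed `limit`, moving `start` forward to the last timestamp seen. The function names, the `LOKI_URL` value, and the injectable `get_json` helper are my own choices for illustration, not anything Loki ships. It also assumes Loki returns the oldest `limit` entries across all streams when `direction=forward`, which is worth verifying on your version.

```python
import json
import urllib.parse
import urllib.request
from datetime import timedelta

LOKI_URL = "http://localhost:3100"  # assumption: adjust to your Loki address
NS = 1_000_000_000

def to_ns(dt):
    """Timezone-aware datetime -> Unix nanoseconds (whole seconds only)."""
    return int(dt.timestamp()) * NS

def windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    cur = start
    while cur < end:
        nxt = min(cur + step, end)
        yield cur, nxt
        cur = nxt

def http_get_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def fetch_window(query, start_ns, end_ns, limit=1000, get_json=http_get_json):
    """Yield (timestamp_ns, line) for one window, paging `limit` entries at a time."""
    cursor, seen = start_ns, set()
    while cursor < end_ns:
        params = urllib.parse.urlencode({
            "query": query, "start": cursor, "end": end_ns,
            "limit": limit, "direction": "forward",
        })
        body = get_json(f"{LOKI_URL}/loki/api/v1/query_range?{params}")
        batch = sorted(
            (int(ts), json.dumps(s["stream"], sort_keys=True), line)
            for s in body["data"]["result"] for ts, line in s["values"]
        )
        # Entries at the previous page's last timestamp come back again; skip them.
        fresh = [e for e in batch if e not in seen]
        for ts, _labels, line in fresh:
            yield ts, line
        if len(batch) < limit:
            return  # short page: nothing left in this window
        last_ts = batch[-1][0]
        if last_ts == cursor and not fresh:
            raise RuntimeError("more than `limit` entries share one timestamp; raise limit")
        # Restart from the last timestamp (not +1) so entries sharing it are not lost.
        seen = {e for e in seen | set(batch) if e[0] == last_ts}
        cursor = last_ts

def export_range(query, start, end, step=timedelta(hours=1), limit=1000,
                 get_json=http_get_json):
    """Yield every (timestamp_ns, line) between start and end, window by window."""
    for w_start, w_end in windows(start, end, step):
        yield from fetch_window(query, to_ns(w_start), to_ns(w_end), limit, get_json)
```

You would iterate over `export_range('{job="prodlogs"}', start, end)` with timezone-aware datetimes and write each line to a file. The requested `limit` still has to stay within whatever maximum the server allows per query.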

http://localhost:3100/loki/api/v1/query_range?direction=BACKWARD&end=1735807219916086688&limit=1000&query=%7Bjob%3D%22prodlogs%22%7D&start=1735803959000000000
http://localhost:3100/loki/api/v1/query_range?direction=BACKWARD&end=1735807218034104990&limit=1000&query=%7Bjob%3D%22prodlogs%22%7D&start=1735803959000000000
http://localhost:3100/loki/api/v1/query_range?direction=BACKWARD&end=1735807214950801269&limit=1000&query=%7Bjob%3D%22prodlogs%22%7D&start=1735803959000000000

I am trying to export using logcli. The above requests are generated when I run the logcli command. I tried setting the limit to 0 and to other values (like 5000) too, but the requests are still generated with a limit of 1000. I have the following questions:

  1. Does this limit cause logs to be lost in the export?
  2. How do I override the limit?
  3. What should I change in the Loki config file to prevent any limits on the export?

If you want to export everything, don’t set any limit.

Even when I don’t set the limit in the logcli command, the requests are generated with limit=1000 in the params.

Then it’s probably batching it. Did you compare the entire output to the actual number of logs from Loki? You can probably try it for a day or two to verify.
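A rough way to run that comparison for a single day is sketched below. It counts the lines in the exported part files and asks Loki for `sum(count_over_time(...))` over the same range via the instant query endpoint. The function names and `LOKI_URL` are illustrative. The glob pattern assumes the part files all start with the `--part-path-prefix` value. The line count only equals the entry count if each entry is a single line in the output. The two ranges can also differ by an entry or two at the boundaries, so treat a small mismatch with care. As far as I recall, logcli also has a `--batch` flag defaulting to 1000, which would explain the `limit=1000` in the requests, but please check `logcli query --help` on your version.

```python
import glob
import json
import urllib.parse
import urllib.request

LOKI_URL = "http://localhost:3100"  # assumption: adjust to your Loki address

def count_exported_lines(part_path_prefix):
    """Count lines across all files whose path starts with the given prefix."""
    total = 0
    for path in glob.glob(part_path_prefix + "*"):
        with open(path, "rb") as fh:
            total += sum(1 for _ in fh)
    return total

def http_get_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def count_in_loki(selector, end_ns, range_s, get_json=http_get_json):
    """Number of entries Loki reports for `selector` in the range_s seconds before end_ns."""
    logql = f"sum(count_over_time({selector}[{range_s}s]))"
    params = urllib.parse.urlencode({"query": logql, "time": end_ns})
    body = get_json(f"{LOKI_URL}/loki/api/v1/query?{params}")
    result = body["data"]["result"]
    # An instant vector sample looks like [<unix seconds>, "<value as string>"].
    return int(float(result[0]["value"][1])) if result else 0
```

For one exported day you would compare `count_exported_lines("/logcli-result/export")` with `count_in_loki('{job="prodlogs"}', end_ns, 86400)`, where `end_ns` is the end of that day in Unix nanoseconds.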