A way of deleting log files after promtail has scraped and forwarded them to Loki?

We have a directory containing syslog files downlinked from a satellite. Each downlink adds a new file with a timestamp in its file name.
The folder with the syslog files, promtail and Loki are all on the same server.
Is there a way for promtail to delete a log file once it has scraped the contents and successfully handed the data over to Loki?
If I delete the log files independently of promtail (e.g. with a cron job), I think a file could get deleted before promtail has processed it and we could lose the data.

@markuswiedemann hi, thanks for the question.

Promtail doesn’t have this functionality right now. However, you could write a script which looks at the positions file.

root@promtail-loki-ops-2wqq5:/# tail /var/log/positions-loki-ops.yml
  /var/log/pods/loki-ops_ingester-65_57e1d3c2-bce8-48a7-a789-6964c3edb888/ingester/0.log: "64596"
  /var/log/pods/loki-ops_memcached-14_a60e6a35-fc57-4a15-88f9-fe618f366556/exporter/0.log: "23836"
  /var/log/pods/loki-ops_memcached-14_a60e6a35-fc57-4a15-88f9-fe618f366556/memcached/0.log: "8667203"
  /var/log/pods/loki-ops_promtail-loki-ops-2wqq5_9dfbdc98-9273-49cb-bb09-d3e007831a26/promtail/0.log: "5234629"
  /var/log/pods/loki-ops_querier-7667578b9d-wck7g_bbd2e76f-50c1-493a-8b54-9bee8c9ec137/querier/1.log: "1807974"
  /var/log/pods/loki-ops_querier-7667578b9d-wck7g_bbd2e76f-50c1-493a-8b54-9bee8c9ec137/querier/2.log: "4402538"

The value for each path is the byte offset that was last read from that file. If this value equals the file's size in bytes, the file has been fully read and can be deleted (provided you don’t expect the file to be appended to at some later time).
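
As a rough illustration of that check, here is a minimal Python sketch. It assumes the positions file has the simple layout shown above (one indented `<path>: "<offset>"` entry per line under a top-level `positions:` key). The function names `read_positions` and `delete_fully_read` and the `min_age_seconds` parameter are made up for this example and are not part of promtail. Note that a matching offset only tells you promtail has read the whole file, so the grace period is there as a cautious buffer for files that might still be written to or still be in flight.

```python
import time
from pathlib import Path


def read_positions(positions_file):
    """Parse a promtail positions file into {path: byte_offset}.

    Assumption: entries are indented `<path>: "<offset>"` lines,
    as in the tail output shown above.
    """
    positions = {}
    for line in Path(positions_file).read_text().splitlines():
        # Skip the top-level `positions:` key, blank lines, and anything else.
        if not line.startswith((" ", "\t")) or ": " not in line:
            continue
        path, _, offset = line.strip().rpartition(": ")
        path = path.strip("\"'")
        offset = offset.strip("\"'")
        if offset.isdigit():
            positions[path] = int(offset)
    return positions


def delete_fully_read(positions_file, log_dir, min_age_seconds=3600, dry_run=True):
    """Delete files under log_dir whose recorded offset equals their size.

    Files modified within the last min_age_seconds are left alone.
    Returns the list of paths that were (or, in dry-run mode, would be) deleted.
    """
    log_dir = Path(log_dir).resolve()
    now = time.time()
    deleted = []
    for path, offset in read_positions(positions_file).items():
        p = Path(path)
        # Only touch regular files that live inside the given directory.
        if not p.is_file() or log_dir not in p.resolve().parents:
            continue
        st = p.stat()
        if now - st.st_mtime < min_age_seconds:
            continue  # possibly still being written
        if offset == st.st_size:
            if not dry_run:
                p.unlink()
            deleted.append(str(p))
    return deleted
```

A cron job could then call something like `delete_fully_read("/var/log/positions-loki-ops.yml", "/data/syslog", dry_run=False)`, where `/data/syslog` is a placeholder for the downlink directory. Running it with `dry_run=True` first shows what would be removed without deleting anything.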


Hi @dannykopping,

thank you for your suggestion. I wrote a shell script that checks the positions file and deletes fully ingested files; it runs once a day as a cron job. The script also stops the promtail service, writes a new positions file containing only the entries for files that are not yet fully ingested, and then starts the promtail service again. I did that because otherwise the positions file would keep growing, as each of our syslog files has a timestamp in its file name.

@markuswiedemann that sounds great.
You actually won’t need to remove the entries from the positions file, since promtail is already aware of files being removed (and will clean up the positions file every -positions.sync-period).

Oh ok, that’s even better. I won’t have to stop and start promtail then and deal with permissions.
Thanks for mentioning this.