At work I sometimes get bug reports for issues that occurred a few days (or up to 2 weeks) ago. I often face the same challenge: parsing through hundreds of (archived) files to find the correct log file and spot where the bug happened (personal record: 10 .log + 150 .log.gz files from 10 instances of a web service, up to 4 GB in total).
Alloy + Loki help me tons, because they let me search for the right keywords without manually unzipping lots of .log.gz files in different subfolders. The problem is that I sometimes have to wait a long time until Alloy has parsed the right time range for me (I think the longest I ever had to wait was 1h to 1h30m).
I could look for certain file patterns in local.file_match and use the ignore_older_than argument, but sometimes I need to look for other occurrences on other days.
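For reference, a minimal sketch of what I mean, assuming the logs live under a placeholder path like /var/log/importer:

```alloy
// ignore_older_than skips files whose modification time is older than
// the given duration, so archives from earlier days are never ingested.
local.file_match "importer" {
  path_targets      = [{"__path__" = "/var/log/importer/*.log*"}]
  ignore_older_than = "168h" // 7 days; has to be raised when hunting older bugs
}
```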
My Question
What is the best practice to deal with hundreds of log and archived log files in Alloy?
- I use a docker-compose setup that also scrapes metrics from Loki and Alloy.
- I would like to accelerate log ingestion without excluding too many log files.
- Archived log files follow the pattern *.log-*.gz (per-day log rotation, e.g. data-importer.log-20250313.gz); see the sketch after this list.
- I use one .alloy config file per web service (because the services often use different log patterns).
- I want to keep manual per-situation adjustments, e.g. tweaking ignore_older_than, to a minimum.
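For the .gz archives specifically, loki.source.file can read compressed files in place via its decompression block, so nothing has to be unpacked on disk. A minimal sketch, using the same placeholder path as above; note that live files and archives need separate source components, because decompression applies to every file a component matches:

```alloy
local.file_match "importer_live" {
  path_targets = [{"__path__" = "/var/log/importer/*.log"}]
}

local.file_match "importer_archive" {
  path_targets = [{"__path__" = "/var/log/importer/*.log-*.gz"}]
}

// Live logs are tailed as plain text.
loki.source.file "importer_live" {
  targets    = local.file_match.importer_live.targets
  forward_to = [loki.write.default.receiver]
}

// Archives are decompressed in memory while being read.
loki.source.file "importer_archive" {
  targets    = local.file_match.importer_archive.targets
  forward_to = [loki.write.default.receiver]

  decompression {
    enabled       = true
    initial_delay = "30s" // grace period in case a file is still being rotated
    format        = "gz"
  }
}

loki.write "default" {
  endpoint {
    url = "http://loki:3100/loki/api/v1/push"
  }
}
```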
Why not divide the files into groups and run many instances of Alloy in parallel?
The only thing you'd need to watch out for is the order of logs: you can't write to a log stream if newer logs already exist in it, so you'd have to give each Alloy instance a unique label so they don't conflict with each other.
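One way to do that is the external_labels argument of loki.write, which stamps a label onto every stream an instance pushes. A minimal sketch; the label name and value are just examples:

```alloy
loki.write "default" {
  endpoint {
    url = "http://loki:3100/loki/api/v1/push"
  }
  external_labels = {
    alloy_instance = "importer-group-a", // unique value per Alloy instance
  }
}
```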
@tonyswumac I’m not worried about the “newer logs” issue, as Alloy adds the “filename” label, which makes the stream unique already (except for log rotation, where importer.log turns into importer.log-20250313.gz). But I like the idea of grouping the logs and running multiple Alloy instances.
Label by date? I thought TSDBs were already efficient at date-range searches?
I want to keep compressed files compressed. Some of them are already 200 MB in size; imagine how big they would be uncompressed on my filesystem.
Thanks for your tips. I will definitely take the grouping and parallel-instances approach to heart. The log server already creates groups and sub-groups by server and instance.
It will be a pain to copy-and-paste some configs and then mount the specific folder names, but that's better than letting one instance deal with 150 log files by itself.
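The copy-and-paste part can be avoided by parameterizing a single config through environment variables, assuming an Alloy version whose standard library includes sys.env. LOG_DIR and INSTANCE_LABEL below are hypothetical variable names that each docker-compose service would set differently:

```alloy
// One shared .alloy file reused by every instance; only the environment differs.
local.file_match "group" {
  path_targets = [{"__path__" = sys.env("LOG_DIR") + "/*.log*"}]
}

loki.source.file "group" {
  targets    = local.file_match.group.targets
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://loki:3100/loki/api/v1/push"
  }
  external_labels = {
    alloy_instance = sys.env("INSTANCE_LABEL"), // keeps streams distinct per instance
  }
}
```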