I have been running Loki for a couple of years now. I just send all logs to promtail without much configuration and then do the logfmt parsing and filtering in Grafana. The performance of that is fairly poor, e.g. looking at my firewall dashboard for a range longer than the last 6 hours (now()-6h to now()) is all but impossible.
But just the other day I stumbled over promtail pipelines, and now I’m wondering whether my performance issues are related to my setup, and whether the basic idea of Loki is to have the feeder, i.e. promtail, do at least all the labeling.
One other thing I have problems with is that I can’t get the logs from my OpenWRT router into promtail via syslog, neither via TCP nor via UDP. I’m currently proxying those through an rsyslogd on the Loki host, but that’s ugly…
But it is not clear what your use case is. For example, if you are searching for a needle in your logs, then bloom filters will help: Loki 3.0 release: Bloom filters, native OpenTelemetry support, and more! | Grafana Labs
Or maybe your dashboard implementation is not very smart, e.g. it loads all the logs (which can be MBs or GBs of logs) into Grafana just to draw a log count graph. With a simple metric LogQL query, Grafana would only be loading a few kB of data.
So you just complained but didn’t provide any evidence of what your case is, so there is no way to give any advice (without guessing) on how it can be improved.
The same goes for the syslog issue. No details: no tested promtail configuration, no info about the syslog format, …
Generally: provide a reproducible example of what you have.
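To make the log count case concrete, here is a minimal Python sketch of the two kinds of query sent to Loki's `/loki/api/v1/query_range` endpoint. The Loki address and the `{job="firewall"}` selector are assumptions for illustration, not taken from your setup.

```python
from urllib.parse import urlencode

# Hypothetical Loki address and label selector; adjust to your setup.
LOKI_URL = "http://localhost:3100"
SELECTOR = '{job="firewall"}'


def query_range_url(logql: str, start_ns: int, end_ns: int, step: str = "5m") -> str:
    """Build a URL for Loki's range query endpoint (times in Unix nanoseconds)."""
    params = {"query": logql, "start": start_ns, "end": end_ns, "step": step}
    return f"{LOKI_URL}/loki/api/v1/query_range?{urlencode(params)}"


# Log query: returns every matching line, so the dashboard has to
# download all of them just to count them.
log_query = SELECTOR

# Metric query: Loki counts the lines per 5m window itself and returns
# only the resulting numbers.
metric_query = f"sum(count_over_time({SELECTOR}[5m]))"

if __name__ == "__main__":
    print(query_range_url(metric_query, 0, 3_600_000_000_000))
```

In a Grafana panel you would paste only the LogQL string; the URL builder is there just to show where the query ends up.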
I did not complain. My question was if it has performance advantages to do the labeling in a promtail pipeline at the time of ingesting the logs vs. in a query at the time of querying.
I wrote that I tried via TCP and UDP. I don’t know which format OpenWRT sends the logs in, or whether there even are different syslog formats.
You have all config options there. Of course, if the configured options don’t match your syslog source, then it won’t work. But you always have error logs, which may help you find out why it’s not working.
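As an illustration of what a mismatch can look like: there are two common syslog wire formats, the older BSD style (RFC 3164) and the newer IETF style (RFC 5424). As far as I know, promtail's syslog receiver is documented as expecting RFC 5424; which one OpenWRT's logd emits is something you would have to check. The sketch below only formats one message each way so you can compare them with what arrives on the wire. The host and tag names are made up.

```python
from datetime import datetime, timezone


def pri(facility: int, severity: int) -> int:
    """Syslog priority value: facility * 8 + severity."""
    return facility * 8 + severity


def rfc3164(facility: int, severity: int, host: str, tag: str, msg: str, ts: datetime) -> str:
    """BSD-style line: <PRI>Mmm dd hh:mm:ss HOST TAG: MSG (day is space-padded)."""
    stamp = f"{ts.strftime('%b')} {ts.day:2d} {ts.strftime('%H:%M:%S')}"
    return f"<{pri(facility, severity)}>{stamp} {host} {tag}: {msg}"


def rfc5424(facility: int, severity: int, host: str, app: str, msg: str, ts: datetime) -> str:
    """IETF-style line: <PRI>1 TIMESTAMP HOST APP PROCID MSGID SD MSG, with '-' for unset fields."""
    stamp = ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"<{pri(facility, severity)}>1 {stamp} {host} {app} - - - {msg}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # facility 16 = local0, severity 6 = informational
    print(rfc3164(16, 6, "router", "dnsmasq", "hello", now))
    print(rfc5424(16, 6, "router", "dnsmasq", "hello", now))
```

If the lines your router sends look like the first form while the receiver expects the second, a relay or a different syslog daemon on the sending side is one way around it.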
If I had found a solution in the docs, I wouldn’t have posted here.
If there had been any errors in the logs, I would have solved the issue myself or posted the logs together with my post.
I did solve the OpenWRT issue now by dumping the default logd and changing to syslog-ng on the OpenWRT box.
My main question is still open, i.e. does it have performance advantages to do the labeling in a promtail pipeline on ingest vs. at query time?
I don’t believe there is any significant performance advantage, certainly not for querying logs. The main reason you’d want to limit labels is to reduce cardinality. As long as you are somewhat reasonable with label usage, I don’t think it’s particularly impactful either way. What labels do you apply to your logs, by the way?
Loki’s performance, especially query performance, comes from distribution. You want query splitting and a query frontend, and with proper configuration I find the performance acceptable. How are you operating your cluster at the moment?
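To put a rough number on the cardinality point: every distinct combination of label values becomes its own stream, so the worst case is the product of the number of values per label. A small sketch, with label names and counts that are purely hypothetical:

```python
from math import prod


def stream_count(label_values: dict[str, int]) -> int:
    """Upper bound on streams: product of the distinct values of each label."""
    return prod(label_values.values())


# Hypothetical counts: a handful of hosts and jobs.
static_labels = {"host": 4, "job": 3}

# Same labels plus a high-cardinality one extracted at ingest,
# e.g. a source IP taken from firewall logs.
with_src_ip = {**static_labels, "src_ip": 5000}

if __name__ == "__main__":
    print(stream_count(static_labels))  # 12
    print(stream_count(with_src_ip))    # 60000
```

A few static labels stay harmless, while one unbounded label turned into an index label multiplies the stream count. Values like that are usually better left in the log line and parsed at query time.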
Actually, I moved Loki yesterday from my Xeon with a spinning disk ZFS array to a dedicated RK3588 ARM64 system with NVME SSDs and it got much MUCH snappier! I wouldn’t have thought the difference would be that big because the ZFS on the Xeon has SSD and RAM cache.
I haven’t thought about a cluster yet because, at least at the moment, the log volume is very low: mainly the firewall and 3 more Linux hosts with a few docker apps. I have been running Loki for a few years now but only just had the time to really deep dive. My ultimate goal is to become familiar enough with Loki on my private infra that I can comfortably use it professionally.
I will look into cluster best practices. That is a good hint, thanks!