Is |~ + | regexp faster than | regexp alone?

Hi. I have quite a lot of logs and I am researching how to optimize my queries. I am parsing some logs in CSV format. Consider the following:

{job="JOBS"}
| regexp `^(?P<time>[^,]*),(?P<category>[^,]*),(?P<etc>[^,]*)`
| time =~ `1[05].*`

vs:

{job="JOBS"}
|~       `^(?P<time>[^,]*),(?P<category>[^,]*),(?P<etc>[^,]*)`
| regexp `^(?P<time>[^,]*),(?P<category>[^,]*),(?P<etc>[^,]*)`
| time =~ `1[05].*`

From my understanding, | regexp does not filter lines; it only extracts labels. The actual filtering happens at the | time =~ stage, where lines without a matching time label are excluded. Because non-matching lines are only dropped late in the pipeline, the first query might be slower. In the second query, such lines are excluded right away by the |~ line filter, but the regex is presumably compiled (and evaluated) twice. Is there a way to extract labels and filter lines in a single step?

My question is, which one is expected to be faster? Or it does not matter? Thanks!

Normally I’d say reducing the number of log lines to be processed by filtering them early is good practice.

But in your case I don’t think it really matters. Because your logs are in CSV format, all the line filter in your second example does is check that each line has at least three comma-separated fields, which isn’t very selective. Essentially, both of the following lines would pass your regex filter:

10,category,something_etc
clearly_not_time,123,456

So in this case I’d just go with your first example and match on the label afterwards.
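That said, if you want the line filter to actually pay off, one option (just a sketch, assuming the time field is always the first CSV field and starts at the beginning of the line) is to make the line filter as selective as the final label match, so the parser only runs on lines that can pass:

{job="JOBS"}
|~       `^1[05][^,]*,`
| regexp `^(?P<time>[^,]*),(?P<category>[^,]*),(?P<etc>[^,]*)`
| time =~ `1[05].*`

Here the line filter runs before the parser stage, so lines whose first field cannot match `1[05].*` are discarded without paying the cost of label extraction; the later | time =~ check should then match almost trivially.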

Hi, thanks for the response.
Actually, many log lines under the same stream labels are not CSV, and the CSV lines have around 11 fields. I would say roughly the lower half, maybe 40%, of the logs are not CSV.
With these assumptions in mind, do you think your opinion changes?
Thank you.