logQL and regexp unexpected unwrap

szymonzy1 · February 26, 2021, 7:15pm

I have an example line from Pentaho logs:

2021-02-26 15:01:21.523 INFO <get.Data - 1000.select> [/public/server/path.kjb /public/server/backup/job.path.kjb /public/get.Base.ktr] Finished processing (I=9730, O=0, R=0, W=9730, U=0, E=0)

I am trying to sum W over time, so I run this LogQL query:

sum by (writes) ({job="integrator"} | regexp `.*W=(?P<writes>.*),` | unwrap writes [5m])

As I understand it, the logic is:

Choose the stream job = integrator.
Extract the field from the stream (W=[field content]) and save its content in the writes label.
Replace the log line with the value from writes as the result.
Define the aggregation period as [5m].
Sum these values over that period.

However, I got an "unexpected unwrap" error. It is fairly hard to find examples of regexp with aggregate functions run from Explore or as a panel query.

Can anyone point me to a good tutorial or point out what I'm doing wrong?

szymonzy1 · February 26, 2021, 8:55pm

Nope, not at all
I know it is quite a new topic, so there are almost no docs anywhere (if I understand correctly, regexp here comes with Loki 2.0).

Do I misunderstand the logical steps in this query? The same regexp works fine with a Promtail label, but having such a label does not make sense.

BTW: It looks like I have 2 accounts here assigned to my GitHub account. Weird.

Hmmmm… shouldn't I aggregate using a different label?

melrose · February 27, 2021, 12:24pm

hello yes please aggregate and then come back again with report

anon74857911 · March 1, 2021, 6:20am

Hey @szymonzy1, I think what you’re trying to do would be a sum_over_time aggregation:

sum_over_time(
  {job="integrator"}
    | regexp `.*W=(?P<writes>\S*),`
    | unwrap writes[5m]
)

Also note I’ve changed your regex from .*W=(?P<writes>.*), to .*W=(?P<writes>\S*),.

If you use .*W=(?P<writes>.*),, you will overrun that field (you can check this on regex101). For
... (I=9730, O=0, R=0, W=9730, U=0, E=0) the greedy .* matches everything after W= up to the last comma on the line, so the capture becomes 9730, U=0 rather than 9730.

I assume you want to capture just the value after W, so searching for any non-space character (\S) is probably what you want in that case.
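
To make the difference concrete, here is a minimal sketch that runs the three capture styles from this thread against the sample line. It uses Python's `re` module as a stand-in; Loki itself uses Go's RE2-style regexps, and the assumption here is that these particular patterns behave the same way in both. The helper name `extract_writes` is made up for illustration.

```python
import re

# Sample Pentaho log line from the original question.
LINE = (
    "2021-02-26 15:01:21.523 INFO <get.Data - 1000.select> "
    "[/public/server/path.kjb /public/server/backup/job.path.kjb "
    "/public/get.Base.ktr] Finished processing "
    "(I=9730, O=0, R=0, W=9730, U=0, E=0)"
)

# The three capture styles that come up in this thread.
PATTERNS = {
    "greedy": r".*W=(?P<writes>.*),",
    "non_space": r".*W=(?P<writes>\S*),",
    "non_greedy": r".*W=(?P<writes>.*?),",
}


def extract_writes(pattern: str, line: str):
    """Return the text captured by the `writes` group, or None if no match."""
    match = re.search(pattern, line)
    return match.group("writes") if match else None


# Hypothetical helper output: one captured string per pattern.
results = {name: extract_writes(p, LINE) for name, p in PATTERNS.items()}
for name, value in results.items():
    print(f"{name:>10}: {value!r}")
```

The greedy pattern captures `9730, U=0`, which cannot be converted to a number, while the `\S*` and non-greedy variants both capture `9730`.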

szymonzy1 · March 1, 2021, 6:59am

@anon74857911 Thank you very much for your support. It is not easy to get any example for such calculations.

The mistake in the regex came from the forum parser. I had trouble escaping special characters here.
It had to be non-greedy:
`.*W=(?P<writes>.*?),`
Anyway, the unexpected unwrap error went away, but I got something new, no matter whether I use your syntax or mine.

pipeline error: 'SampleExtractionErr' for series: '{__error__="SampleExtractionErr", app="pentaho", filename="********/pdi.log", host="*******", job="integrator", level="WARN ", path="/************.kjb"}'. Use a label filter to intentionally skip this error. (e.g | __error__!="SampleExtractionErr"). To skip all potential errors you can match empty errors.(e.g __error__="") The label filter can also be specified after unwrap. (e.g | unwrap latency | __error__="" )

I know that not all lines include this match, so adding the filter
| __error__=""
should avoid that. However, Grafana 7.3.7 reports
syntax error:unexpected |, expecting )
Would that mean it is unsupported in my current version?

szymonzy1 · March 1, 2021, 7:02am

OK, fixed!
The error filter should go before the time range [5m]!

sum_over_time(
{job="integrator"}
| regexp `.*W=(?P<writes>.*?),`
| unwrap writes
| __error__=""[5m]
)
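
For readers wondering what the `__error__=""` filter changes, here is a rough Python model of the query above. It is only a sketch under stated assumptions, not Loki's implementation: the function names and the second and third sample lines are hypothetical, and it only models a label value that cannot be parsed as a number. How Loki treats lines where the label is missing entirely may differ by version.

```python
import re

WRITES_RE = re.compile(r".*W=(?P<writes>.*?),")

# Hypothetical lines falling into one 5m window.
# Only the W=9730 value comes from the thread.
WINDOW = [
    "Finished processing (I=9730, O=0, R=0, W=9730, U=0, E=0)",
    "Finished processing (I=300, O=0, R=0, W=270, U=0, E=0)",
    "Finished processing (I=0, O=0, R=0, W=n/a, U=0, E=0)",  # not numeric
]


def unwrap_writes(line):
    """Model `| regexp ... | unwrap writes`: return (value, error label)."""
    match = WRITES_RE.search(line)
    if match is None:
        # Assumption: treated like a failed extraction in this model.
        return None, "SampleExtractionErr"
    try:
        return float(match.group("writes")), ""
    except ValueError:
        return None, "SampleExtractionErr"


def sum_over_window(lines, skip_errors):
    """Model sum_over_time over one window, with or without | __error__=""."""
    total = 0.0
    for line in lines:
        value, error = unwrap_writes(line)
        if error:
            if skip_errors:
                continue  # plays the role of | __error__=""
            raise ValueError(f"pipeline error: {error!r}")
        total += value
    return total


total = sum_over_window(WINDOW, skip_errors=True)
print(total)
```

Without the filter, the one unparsable value fails the whole window in this model, which mirrors the pipeline error quoted earlier; with the filter, that sample is dropped and the remaining values are summed.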

mishiko · March 26, 2021, 3:17pm

Hi guys. I need your help please.
My log lines look like this:

2021-03-26 12:36:34 2021-03-26 12:36:34.910 INFO 7 --- [or-http-epoll-1] r.m.i.e.r.fobo.components.FoboService : Analytical log: Fobo receipt created in 8.434 seconds
2021-03-26 12:36:32 2021-03-26 12:36:32.819 INFO 7 --- [or-http-epoll-4] r.m.i.e.r.fobo.components.FoboService : Analytical log: Fobo receipt created in 6.810 seconds

In the end I need a dashboard with the 99th percentile of created_time.

I made this query:

quantile_over_time(0.99,
{container="etl-receipt-fobo-create"} 
|="Analytical log" 
| regexp "^(?s)(?P<event_time>\\S+\\s\\S+).\\s(?P<log_lvl>\\S+).*?in\\s(?P<created_time>\\d+\\.\\d+).* $" 
| unwrap created_time
| __error__ = ""[1m])

But the query result is empty =(
No errors, just empty.

What am I doing wrong?

If I run just this:
{container="etl-receipt-fobo-create"} |="Analytical log" | regexp "^(?s)(?P<event_time>\\S+\\s\\S+).\\s(?P<log_lvl>\\S+).*?in\\s(?P<created_time>\\d+\\.\\d+).*$"

then I see the correct labels.
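
One thing worth checking: as posted, the two patterns differ by a single space before the final `$` (`.* $` in the metric query, `.*$` in the log query). Whether that space exists in the real query or is only a posting artifact is unknown. The sketch below uses simplified tails of both patterns and Python's `re` as a stand-in for Loki's regex engine, just to show what such a space would do on a line that does not end in a space.

```python
import re

# Tail of the sample line from the post (no trailing space).
LINE = "Analytical log: Fobo receipt created in 8.434 seconds"

# Simplified tails of the two patterns as posted.
# They differ only by the space before `$`.
WITH_SPACE = r"in\s(?P<created_time>\d+\.\d+).* $"
WITHOUT_SPACE = r"in\s(?P<created_time>\d+\.\d+).*$"


def created_time(pattern, line):
    """Return the captured created_time, or None if the pattern does not match."""
    match = re.search(pattern, line)
    return match.group("created_time") if match else None


with_space = created_time(WITH_SPACE, LINE)
without_space = created_time(WITHOUT_SPACE, LINE)
print(with_space, without_space)
```

If the space is really there, the pattern only matches lines ending in a space, so no created_time label would be extracted for lines like the sample, and filtering on `__error__ = ""` could then leave nothing to aggregate.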

jesuscarpintero · September 29, 2021, 1:50pm

Hi,
I have the exact same issue, did you find a solution?

The labels user, rx and tx are created. If I just unwrap, the error appears; if I add the filter __error__="", I get no results at all.

sum_over_time({filename="/var/log/access.log"}
  | regexp `INFO client-disconnect event user=(?P<user>\S*) .+ bytes_recv:(?P<rx>[0-9]+) bytes_sent:(?P<tx>[0-9]+)`
  | unwrap rx|__error__=""[5m])

haironggao · April 28, 2022, 9:07am

I have the exact same issue, did you find a solution?
