LogQL and regexp: unexpected unwrap

I have an example line from Pentaho logs:

2021-02-26 15:01:21.523 INFO <get.Data - 1000.select> [/public/server/path.kjb /public/server/backup/job.path.kjb /public/get.Base.ktr] Finished processing (I=9730, O=0, R=0, W=9730, U=0, E=0)

I am trying to sum W over time, so I run this LogQL query:

sum by (writes) ({job="integrator"} | regexp `.*W=(?P<writes>.*),` | unwrap writes [5m])

As I understand it, the logic is:

  1. Choose the stream job = integrator.
  2. Extract the field from the line matching *W=[field content] and save its content in the writes label.
  3. Replace the log line with the value from writes as the result.
  4. Define the aggregation period as [5m].
  5. Sum these values over that period.

However, I get an "unexpected unwrap" error. It is fairly hard to find examples of regexp combined with aggregate functions, whether run from Explore or as a panel query.

Can anyone point me to a good tutorial, or point out what I'm doing wrong?

Nope, not at all :wink:
I know it is quite a new topic, so there are almost no docs anywhere (if I understand correctly, regexp here came with Loki 2.0).

Am I misunderstanding the logical steps in this query? The same regexp works fine for a Promtail label, but having such a label does not make sense.

BTW: Looks like I have 2 accounts here assigned to my github account :smiley: Weird.

Hmmmm … shouldn’t I aggregate using different label?

hello yes please aggregate and then come back again with report

Hey @szymonzy1, I think what you’re trying to do would be a sum_over_time aggregation:

    sum_over_time({job="integrator"}
    | regexp `.*W=(?P<writes>\S*),`
    | unwrap writes[5m])

Also note I’ve changed your regex from .*W=(?P<writes>.*), to .*W=(?P<writes>\S*),.

If you use .*W=(?P<writes>.*),, you will overrun that field (easy to check on regex101).
Against ... (I=9730, O=0, R=0, W=9730, U=0, E=0) the greedy .* captures everything after W= up to the last comma in the line: 9730, U=0

I assume you want to capture just the value after W, so searching for any non-space character (\S) is probably what you want in that case.
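To make the difference concrete, here is a small Go sketch that runs the three variants discussed in this thread against the sample line from the question. Go is used because Loki's regexp stage is built on Go's regexp package; the helper name captureWrites is just something made up for this illustration, not part of Loki.

```go
package main

import (
	"fmt"
	"regexp"
)

// sampleLine is the Pentaho log line from the question.
const sampleLine = `2021-02-26 15:01:21.523 INFO <get.Data - 1000.select> [/public/server/path.kjb /public/server/backup/job.path.kjb /public/get.Base.ktr] Finished processing (I=9730, O=0, R=0, W=9730, U=0, E=0)`

// captureWrites (hypothetical helper) returns the text captured by the
// named group "writes", or "" if the pattern does not match the line.
func captureWrites(pattern, line string) string {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(line)
	if m == nil {
		return ""
	}
	return m[re.SubexpIndex("writes")]
}

func main() {
	patterns := []string{
		`.*W=(?P<writes>.*),`,   // greedy capture
		`.*W=(?P<writes>.*?),`,  // non-greedy capture
		`.*W=(?P<writes>\S*),`,  // non-space capture
	}
	for _, p := range patterns {
		fmt.Printf("%-24s -> %q\n", p, captureWrites(p, sampleLine))
	}
}
```

The greedy pattern captures "9730, U=0", which cannot be parsed as a number, while the non-greedy and \S variants both capture "9730".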


@dannykopping Thank you very much for your support. It is not easy to find examples of such calculations.

The mistake in the regex came from the forum parser; I had trouble escaping special characters here.
The capture was meant to be non-greedy.
Anyway, the unexpected unwrap error went away, but I got something new, no matter whether I use your syntax or mine:

pipeline error: 'SampleExtractionErr' for series: '{__error__="SampleExtractionErr", app="pentaho", filename="********/pdi.log", host="*******", job="integrator", level="WARN ", path="/************.kjb"}'. Use a label filter to intentionally skip this error. (e.g | __error__!="SampleExtractionErr"). To skip all potential errors you can match empty errors.(e.g __error__="") The label filter can also be specified after unwrap. (e.g | unwrap latency | __error__="" )

I know that not all lines contain this match, so adding the filter
| __error__=""
should avoid that. However, Grafana 7.3.7 reports:
syntax error:unexpected |, expecting )
Does that mean it is unsupported in my current version?

OK, fixed!
The error filter has to go before the time range [5m]:

| regexp `.*W=(?P<writes>.*?),`
| unwrap writes
| __error__=""[5m]
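For anyone wondering what the filter changes in practice, below is a rough Go analogy of the pipeline. It is not Loki's implementation, only a sketch of the idea: extract writes from each line, try to parse it as a number, skip the lines where that fails (roughly what | __error__="" does after unwrap), and sum the rest. The function name sumWrites and the second and third input lines are made up for the illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// writesRe is the non-greedy pattern from the fixed query.
var writesRe = regexp.MustCompile(`.*W=(?P<writes>.*?),`)

// sumWrites (hypothetical helper) adds up the W= value of every line that
// yields a parseable number. Lines with no match or a non-numeric capture
// are counted as skipped instead of failing the whole computation.
func sumWrites(lines []string) (sum float64, skipped int) {
	idx := writesRe.SubexpIndex("writes")
	for _, line := range lines {
		m := writesRe.FindStringSubmatch(line)
		if m == nil {
			skipped++ // no W= field on this line
			continue
		}
		v, err := strconv.ParseFloat(m[idx], 64)
		if err != nil {
			skipped++ // captured text is not a number
			continue
		}
		sum += v
	}
	return sum, skipped
}

func main() {
	lines := []string{
		"Finished processing (I=9730, O=0, R=0, W=9730, U=0, E=0)",
		"WARN some line without counters",                    // made-up line
		"Finished processing (I=5, O=0, R=0, W=5, U=0, E=0)", // made-up line
	}
	sum, skipped := sumWrites(lines)
	fmt.Println(sum, skipped)
}
```

With these three lines the sum is 9735 and one line is skipped, which mirrors how the WARN lines in the error message above drop out once the filter is in place.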


Hi guys, I need your help, please.
My log lines look like this:

2021-03-26 12:36:34 2021-03-26 12:36:34.910 INFO 7 --- [or-http-epoll-1] r.m.i.e.r.fobo.components.FoboService : Analytical log: Fobo receipt created in 8.434 seconds
2021-03-26 12:36:32 2021-03-26 12:36:32.819 INFO 7 --- [or-http-epoll-4] r.m.i.e.r.fobo.components.FoboService : Analytical log: Fobo receipt created in 6.810 seconds

In the end I need a dashboard with the 99th percentile of created_time.

I wrote this query:

|="Analytical log" 
| regexp "^(?s)(?P<event_time>\\S+\\s\\S+).\\s(?P<log_lvl>\\S+).*?in\\s(?P<created_time>\\d+\\.\\d+).* $" 
| unwrap created_time
| __error__ = ""[1m])

But the query result is empty =(
No errors, just empty.

What am I doing wrong?

If I run this instead:
{container="etl-receipt-fobo-create"} |="Analytical log" | regexp "^(?s)(?P<event_time>\\S+\\s\\S+).\\s(?P<log_lvl>\\S+).*?in\\s(?P<created_time>\\d+\\.\\d+).*$"

then I see the correct labels.
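One visible difference between the two patterns as posted is the space before the final $ in the first one. That may just be a copy-paste artifact, but if the space is really in the query, the pattern only matches lines that end with a space. The Go sketch below checks both variants against the second sample line as posted (minus the highlighting). The helper name createdTime is made up, and the patterns are written as Go raw strings, so the doubled backslashes from the LogQL double-quoted form become single ones.

```go
package main

import (
	"fmt"
	"regexp"
)

const (
	// Pattern from the metric query, with a space before the final $.
	withSpace = `^(?s)(?P<event_time>\S+\s\S+).\s(?P<log_lvl>\S+).*?in\s(?P<created_time>\d+\.\d+).* $`
	// Pattern from the log query that shows the labels correctly.
	withoutSpace = `^(?s)(?P<event_time>\S+\s\S+).\s(?P<log_lvl>\S+).*?in\s(?P<created_time>\d+\.\d+).*$`

	// Second sample line from the post, without the highlighting markup.
	foboLine = `2021-03-26 12:36:32 2021-03-26 12:36:32.819 INFO 7 --- [or-http-epoll-4] r.m.i.e.r.fobo.components.FoboService : Analytical log: Fobo receipt created in 6.810 seconds`
)

// createdTime (hypothetical helper) returns the created_time capture and
// whether the pattern matched the line at all.
func createdTime(pattern, line string) (string, bool) {
	re := regexp.MustCompile(pattern)
	m := re.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[re.SubexpIndex("created_time")], true
}

func main() {
	for _, p := range []string{withSpace, withoutSpace} {
		v, ok := createdTime(p, foboLine)
		fmt.Printf("matched=%v created_time=%q\n", ok, v)
	}
}
```

Here the variant with the space does not match at all, so there would be no created_time label to unwrap, and with | __error__ = "" in place that could plausibly leave an empty result without any error. The variant without the space captures "6.810".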