Loki query is too permissive

Hello everyone,

Since I installed Loki v3.0.0, the following query has been returning log lines that it should not match:

{filename=~".*.log"}
          | regexp "(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}Z?): (?P<level>\\w+)(?:/(?P<process>.*))? \\[(?P<component>.*)\\] - \\[Support\\]\\s*(?P<message>.*)"
          | level=~"(?i)critical"

Here is an example log line that is returned even though it should not be:

2024-07-08T11:55:10.088Z [WARN]  agent: Check is now critical: check=mem2

As you can see, this line should already be excluded because of the T and the colon afterwards, but Loki still returns it.

Is there a bug in the code? Or has the way regexes are handled simply changed?
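The claim that the example line does not match the pattern can be checked outside Loki. Below is a minimal sketch in Go, whose regexp package uses the same RE2 syntax that LogQL does. The pattern is copied from the query, with the LogQL string escapes (\\w, \\[) reduced to single backslashes for a Go raw string; the helper name matches is made up for this illustration. It only tests the expression itself and says nothing about how Loki treats a line that fails to match.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern is the expression from the query above, written as a Go raw string.
var pattern = regexp.MustCompile(`(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}Z?): (?P<level>\w+)(?:/(?P<process>.*))? \[(?P<component>.*)\] - \[Support\]\s*(?P<message>.*)`)

// matches reports whether the pattern finds a match anywhere in the line.
// (Hypothetical helper, not part of Loki.)
func matches(line string) bool {
	return pattern.MatchString(line)
}

func main() {
	// The example log line from the post.
	line := "2024-07-08T11:55:10.088Z [WARN]  agent: Check is now critical: check=mem2"
	fmt.Println(matches(line)) // prints false: the regexp itself does not match
}
```

So the expression behaves as expected; the open question is why a non-matching line is still returned by the query.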

Does your log already come with a label named level?

As far as I know, regexp is a parsing tool, and it doesn’t exclude lines or limit the search. You might try something like this:

{filename=~".*.log"}
  | regexp "(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}Z?): (?P<level>\\w+)(?:/(?P<process>.*))? \\[(?P<component>.*)\\] - \\[Support\\]\\s*(?P<message>.*)"
  | __error__=""
  | level=~"(?i)critical"

Hi,

Yes, we have the level label defined.

In any case, here is some more context: the problem is related not to the Loki version but to the store migration.
During the upgrade we went from boltdb to tsdb.

As you said, regexp does not exclude lines that fail to match; it still includes them (strange, because in the past with boltdb it excluded them).
For now we have modified the query as follows, and it seems to work:

regexp "(?P<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}Z?): (?P<xlevel>\\w+)(?:/(?P<process>.*))? \\[(?P<component>.*)\\] - (?P<xcmd>\\[Support\\])\\s*(?P<message>.*)"

Where did you find out that regexp keeps non-matching lines rather than excluding them?

Thank you very much for the answer

I can’t test on an older version anymore, but I thought this was always the case. regexp, like pattern and logfmt, is a parser: its job is to parse log lines and extract information in the form of labels. So the only difference between a successful and a failed parse should be that the successful one returns a set of parsed labels and the failed one does not; the log lines returned should be identical. Let me illustrate with what the API returns.

Consider log line:

this is a log line host=localhost ip=1.2.3.4

With a successful parse that extracts host and ip as labels, you might see a response like this:

{
  "stream": {
    "host": "localhost",
    "ip": "1.2.3.4"
  },
  "values": [
    [
      "<TIMESTAMP_IN_NANOSECONDS>", "this is a log line host=localhost ip=1.2.3.4"
    ]
  ]
}

Whereas if the parse attempt failed, you should see a response such as:

{
  "stream": {},
  "values": [
    [
      "<TIMESTAMP_IN_NANOSECONDS>", "this is a log line host=localhost ip=1.2.3.4"
    ]
  ]
}

This only becomes a problem when you are aggregating metrics based on label, which can usually be dealt with by checking for __error__.

I could be wrong, and it may have behaved differently in the past, but at least this is what I thought it always did.

OK, thanks for the explanation.

In any case, I have the impression that the problem arises because the word ‘critical’ appears in the log line, so the log level was set independently of the regexp pattern.