Filtering out single line messages that have been parsed from a multi-line log

Hi folks: I poked around but couldn’t find the answer here.

I want to be able to construct a match filter that is based on the number of lines taht were just parsed. My goal is to skip when there is only one line.

My logs look something like this:

batch job running
batch job running
batch job running
batch job running
batch job running
batch job running
batch job running
WARNING Property: Unknown Property name. [11:3: overflow-wrap]
batch job running
batch job running
batch job running
WARNING Property: Unknown Property name. [11:3: overflow-wrap]
batch job running
08:08:20 Traceback (most recent call last):
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/rq/worker.py", line 1013, in perform_job
    rv = job.perform()
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/rq/job.py", line 709, in perform
    self._result = self._execute()
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/rq/job.py", line 732, in _execute
    result = self.func(*self.args, **self.kwargs)
  File "/home/erp/erpnext/apps/frappe/frappe/utils/background_jobs.py", line 164, in execute_job
    frappe.db.commit()
  File "/home/erp/erpnext/apps/frappe/frappe/database/database.py", line 908, in commit
    self.sql("commit")
  File "/home/erp/erpnext/apps/frappe/frappe/database/database.py", line 180, in sql
    self._cursor.execute(query)
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/pymysql/cursors.py", line 148, in execute
    result = self._query(query)
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/pymysql/cursors.py", line 310, in _query
    conn.query(q)
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/pymysql/connections.py", line 547, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/home/erp/erpnext/env/lib/python3.8/site-packages/pymysql/connections.py", line 793, in _execute_command
    raise err.InterfaceError(0, "")
pymysql.err.InterfaceError: (0, '')

batch job running
batch job running
batch job running
batch job running

Yea, there’s no timestamp (we didn’t write the app).

My goal of course is to NOT log the lines “batch j ob running” but do log the error, which frustratingly enough doesn’t have the word ‘error’, and the warning. I don’t know ahead of time exactly what the strings will consist of, so I don’t want to build include match blocks.

Is there a way to filter on # of lines? I thought maybe there was a lines parsed metric I could filter on?

For completeness my promtail setup is:

  - job_name: worker_error_logs
    static_configs:
    - targets:
          - localhost
      labels:
            job: worker
            app: ourapp
            __path__: /home/app/ourapp/logs/worker.error.log

    pipeline_stages:
      - multiline:
            firstline: '^batch job running' 
            max_lines: 128
            max_wait_time: 3s

Thank you,

    == John ==

I think you are overthinking this a bit. I can think of two potential solutions:

  1. Log everything except batch job running. Something like this should work:
pipeline_stages:
  - regex:
      expression: ^(?P<is_batch_job_running>batch\sjob\srunning)$
  - labels:
      is_batch_job_running:
  - match:
      selector: '{is_batch_job_running=~"^batch\sjob\srunning$"}'
      action: drop
      drop_counter_reason: batch_job_running
  1. What is the motivation behind not wanting to log the batch job running line? It’s pretty easy to exclude a matching line with LogQL if you wish to not display it in Grafana.

Personally I’d recommend #2. In general keeping your log forwarding logic as simple as possible should be more preferable in my opinion.

I think you are overthinking this a bit

Probably :slight_smile:

“batch job running” is the start of every log line, and thus is the only text string to start the multi-line on. (It’s actually a slightly different text line but you get the idea).

What is the motivation behind not wanting to log the batch job running line?

It’s not that line I want to exclude. It’s only a log that has ONLY ‘batch job running’ in other words, 0 additional lines.

In other words, this is 3 log entries:

batch job running
batch job running
WARNING Property: Unknown Property name. [11:3: overflow-wrap]
batch job running
08:08:20 Traceback (most recent call last):
  stuff

The start of the line is not the timestamp, frustratingly. The start of a log entry is “batch job running”.

The first log entry was 0 lines long. Those are what I’m trying not to log. So from the above, I would want Loki to ingest 2 log entries, and discard the first. All of them start with “batch job running”

I thought about using non-indented lines that don’t say “batch job running” as the start of a log, but in the larger log on the server, I also found that indented spaces also cannot be a signifier that it’s a continuation log line, as there are multiple non-indented lines between “batch job running”.

If I exclude “batch job running” I’ll log nothing.

    == John ==

My point still stands, what is the reason behind not wanting to log just batch job running? Assuming your concern is not about the additional storage or network traffic cost, you can quite easily ignore all log lines that has just batch job running after they land in Loki.

How? That’s not obvious to me without stripping the multi-line logs we do want.

Additionally, I’m not sure I’m explaining this well. That is every single log line. If I exclude those in loki, we have no logs.

Every log entry starts with “batch job running” and then maybe 1% have additional lines underneath that string.

Assuming your concern is not about the additional storage or network traffic cost,

Exactly. It makes Loki ingest 100x more data than required.

We want ONLY the “Batch Job Running” log entries that have lines in following “Batch Job Running”

Like, “linecount > 1” but I’m not sure how to do that.

Sure, if that could be filtered in Loki, that would work, but there are hundreds of thousands of these lines every day that are logged. Anytime a batch job runs with no errors, and every 5 minutes, we get one log entry with no additional lines that says “batch job running”. I don’t see a need to log meaningless chatter.

    == John ==

I personally haven’t had to deal with multi-line logs, so I may be speaking out of my ass hear, but I’d imagine you can still filter them by regex. For example, you can use regex ^batch job running$ to match exactly that string and that string only, meaning anything that has more than that string (which should include logs with more than one lines) wouldn’t match. I think it’s worth a try.

I guess it comes down to preference. Personally I prefer to filter and drop as little as possible during logging pipeline because I prefer to have more confidence in knowing that nothing is being dropped that shouldn’t be dropped, and with compression and retention policy a bit of junk data I can live with.

But personal preference aside, I don’t think what you want is possible with promtail. With the multiline stage you have to match something as the first line, and because you don’t know what the “legit” lines would start with you have no choice but to match with batch job running, which would then include that line into the log. This is just unavoidable.

I still think it’s worth a try just to log everything, and try to do the filtering when querying. If not, then you might have to look at other log agent such as logstash or fluentd and see if it’s possible to do exactly what you want with those.