Parsing *.log files from Apache Airflow via Docker: Promtail Loki Grafana

Hi all,

I'm a little confused. For a POC running in Docker, I'm trying to collect and parse *.log files. Here is an example line from a log:

[2024-05-29T09:06:12.000+0300] {taskinstance.py:1364} INFO - Starting attempt 1 of 2

From each log line I want to extract the timestamp, the log level, and the message.

My docker-compose.yaml includes Grafana, Loki, and Promtail. Here is my promtail-config.yml:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
- job_name: airflow
  static_configs:
  - targets:
      - localhost
    labels:
      job: "airflow_logs"
      __path__: /var/log/*.log
  pipeline_stages:
    - regex:
        expression: '\[(?P<timestamp>.*?)\]\s+\{(?P<logger>.*?)\}\s+(?P<level>[A-Z]+)\s+-\s+(?P<message>.*)$'
    - timestamp:
        source: timestamp
        format: RFC3339
    - labels:
        level:
        logger:

In the Explore tab I get the filename, job, level, and logger labels, and my queries return data. But the
time is still wrong: instead of the timestamp extracted from the log line, each entry shows the current time.

Any idea how to parse these *.log files? (The log files come from Apache Airflow.) I tried ChatGPT, without success. One more question: should I migrate to Grafana Alloy?

  1. Yes, it is recommended to migrate to Grafana Alloy.

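  2. Your timestamp is neither RFC3339 nor RFC3339Nano: its UTC offset is written as +0300, whereas RFC3339 requires a colon (+03:00). See the Go time package documentation for the layout syntax. You can still parse it with a custom layout string.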
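
To illustrate point 2, here is a minimal sketch of the pipeline_stages from the question with only the timestamp format changed. The layout string is my assumption, based on the single sample line in the question (three fractional digits, offset without a colon). I have not run it against real Airflow logs, so treat it as a starting point.

```yaml
# Sketch only: same stages as in the question, with RFC3339 replaced by a
# custom Go layout. The layout is an assumption derived from the sample line
# [2024-05-29T09:06:12.000+0300].
pipeline_stages:
  - regex:
      expression: '\[(?P<timestamp>.*?)\]\s+\{(?P<logger>.*?)\}\s+(?P<level>[A-Z]+)\s+-\s+(?P<message>.*)$'
  - timestamp:
      source: timestamp
      # Go layouts are written as the reference time 2006-01-02 15:04:05 -0700.
      # ".000" expects exactly three fractional digits; "-0700" expects an
      # offset without a colon, such as +0300.
      format: '2006-01-02T15:04:05.000-0700'
  - labels:
      level:
      logger:
```

Lines that Promtail has already shipped keep the timestamps they were ingested with, so the change should only be visible on newly read lines. If some log lines use a different number of fractional digits, the layout would need adjusting.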
