Bad performance with Loki

Hello,

I have a performance issue with Promtail -> Loki -> Grafana (docker-compose). When I open my dashboard in Grafana with the "Last 90 days" option, Loki pegs all cores (Intel i3) at 100% for more than 10 seconds, even though there are only about 1,100 log lines (100 KB) to read. With Promtail I extract the timestamp and the log message and attach the log level as a label.

Example line from the logfile:

    [2023-05-12 00:31:21,995] INFO: Starting App

promtail-config

    server:
      http_listen_port: 9080
      grpc_listen_port: 0
      log_level: warn 

    positions:
      filename: /tmp/positions.yaml

    clients:
      - url: http://loki:3100/loki/api/v1/push
    limits_config:
      readline_rate_enabled: true
      readline_rate: 1000
      readline_burst: 1000
      readline_rate_drop: false


    scrape_configs:
      - job_name: jobnameexample
        pipeline_stages:
          - regex:
              expression: '^(?s)\[(?P<regextimestamp>[^\]]+)] (?P<regextype>INFO|CRITICAL|WARNING)\: (?P<regexcontent>.*?)$' 
          - timestamp:
              source: regextimestamp
              format: '2006-01-02 15:04:05,000'
          - labels:
              type: regextype
          - drop:
              older_than: 2160h # 90 Days
              drop_counter_reason: "line_too_old"
          - output:
              source: regexcontent
        static_configs:
          - targets:
              - localhost
            labels:
              job: jobnameexample
              host: server
              group: backup
              source: logfile
              __path__: /var/log/log.log

loki-config

    auth_enabled: false

    server:
      http_listen_port: 3100
      grpc_listen_port: 9096
      log_level: "warn"

    common:
      instance_addr: 127.0.0.1
      path_prefix: /tmp/loki
      storage:
        filesystem:
          chunks_directory: /tmp/loki/chunks
          rules_directory: /tmp/loki/rules
      replication_factor: 1
      ring:
        kvstore:
          store: inmemory

    query_range:
      results_cache:
        cache:
          embedded_cache:
            enabled: true
            max_size_mb: 100

    schema_config:
      configs:
        - from: 2020-10-24
          store: boltdb-shipper
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 24h

    ruler:
      alertmanager_url: http://localhost:9093

    analytics:
      reporting_enabled: false

    limits_config:
      max_query_length: 0h # Default: 721h
      ingestion_rate_mb: 10
      reject_old_samples: false

Loki’s performance comes from distribution. Since you are running a single instance with the local filesystem as storage, I don’t think there is much point in worrying about or tuning for performance as is.
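To give an idea of what "distribution" means in practice: Loki's simple scalable mode splits the read and write paths into separate processes. Below is a rough docker-compose sketch of that idea only; the image tag and file paths are assumptions, and this layout needs shared object storage (e.g. S3/MinIO) rather than the local filesystem you use now, so it is not a drop-in change for your setup.

    # Hypothetical sketch of Loki's "simple scalable" split (separate read and write targets).
    # Image tag and paths are assumptions; this also requires shared object storage
    # (e.g. S3/MinIO) instead of the local filesystem used in the config above.
    services:
      loki-write:
        image: grafana/loki:2.8.2
        command: "-config.file=/etc/loki/config.yaml -target=write"
        volumes:
          - ./loki-config.yaml:/etc/loki/config.yaml
      loki-read:
        image: grafana/loki:2.8.2
        command: "-config.file=/etc/loki/config.yaml -target=read"
        volumes:
          - ./loki-config.yaml:/etc/loki/config.yaml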

Grafana Loki performs a search query in the following way:

  1. It selects the needed log streams according to the stream selector.

  2. It scans all the log lines across the selected log streams over the selected time range (see the example query below).
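The dashboard query itself is not shown in the post, so purely for illustration, a panel over these logs might run a LogQL query like the following (the job and type labels come from the Promtail config above; the line filter is an assumption):

    # The stream selector (step 1) narrows the search to matching streams;
    # every line of those streams in the time range is still scanned (step 2).
    {job="jobnameexample", type="INFO"} |= "Starting"

Even for a small volume of logs, a 90-day range means Loki has to look up and decompress every chunk of the selected streams in that window before it can return lines.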

If the number of log lines for the selected log streams over the selected time range is large, then query performance is slow. The following solutions exist:

  - Add more CPU and RAM to the Grafana Loki components (mostly the querier and the ingester), so they can scan log lines at a higher speed.
  - Add more queriers and ingesters to the Grafana Loki setup in order to scale query performance horizontally (a config sketch with the related parallelism settings follows this list).
  - Use alternative solutions that provide better full-text search performance over large volumes of logs, such as Elasticsearch or VictoriaLogs. VictoriaLogs is preferred, since it requires up to 30x less RAM and up to 15x less disk space than Elasticsearch for production logs.
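
If you want to try squeezing more out of the existing single node before going that far, the parallelism-related options live in the Loki config. A minimal sketch, assuming recent Loki 2.x option names; the values are placeholders to tune, not recommendations for this specific machine:

    # Sketch only: query parallelism settings for a single-node Loki (Loki 2.x option names).
    querier:
      max_concurrent: 4                # queries one querier processes in parallel

    limits_config:
      split_queries_by_interval: 24h   # split the 90-day query into per-day subqueries
      max_query_parallelism: 8         # how many split subqueries may run at once

These would be merged into the existing limits_config (plus a new querier block) of the loki-config above, but as noted, on a single i3 with filesystem storage the gains will be limited.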