Help properly setting up Loki with Docker Swarm

Hey! I’ve been trying out Loki and I’m running into some issues.

This is my Loki configuration:

auth_enabled: false

server:
  http_listen_port: 3100
  log_level: error

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

limits_config:
  ingestion_rate_mb: 100

querier:
  max_concurrent: 512
query_scheduler:
  max_outstanding_requests_per_tenant: 512

And I have a Grafana dashboard set up where I run a bunch of queries like this one:

{source=~"$source", swarm_stack=~"$swarm_stack", swarm_service=~"$swarm_service"} |= `$searchable_pattern`

This is kind of nice, because it lets me filter by stdout/stderr, or by stack or service, in the dashboard.

Data is ingested by Promtail and the Loki Docker plugin.

Now the issues I’m encountering are:

  1. If Loki/Grafana queries a log entry that is too long (~1 MB), it crashes and isn’t able to display it. → Is there some way to just cut off long log lines? Or should Loki be able to handle this?
  2. I log quite a lot, around a million entries a day. If I do anything in the dashboard that processes more than 2 days’ worth of data, Loki takes up 100% CPU and somehow causes all Docker services on the same server to become unavailable/unresponsive, so I have to restart. → What could cause this? Is it the regexes in my queries? Or have I configured Loki in some sub-optimal way?

Thanks!

  1. I am not sure what would cause a long log line to crash Grafana or Loki. Check the logs of both and see if there is anything interesting.

  2. If you are running a single instance, you would probably want to:

  • Reduce concurrent queries
  • Reduce query splitting (split_queries_by_interval). The default is 1h; I’d recommend setting this to 6h or 12h.

You’ll also want to reduce the number of chunks written to your filesystem, which means:

  • Bigger chunk_target_size
  • Longer chunk_idle_period
  • Longer max_chunk_age
  • Longer query_ingesters_within
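
As a rough sketch, the suggestions above could look something like this in the Loki config. The only value that comes from this thread is the 6h split interval; every other number is an illustrative assumption and should be tuned for your own workload. The two max_line_size settings were not suggested above; they are Loki's limits_config options for capping line length at ingestion, which may help with the question about cutting off long log lines.

```yaml
querier:
  max_concurrent: 4              # assumed value; far lower than the 512 in the original config
  query_ingesters_within: 5h     # assumed value; should cover at least max_chunk_age

limits_config:
  ingestion_rate_mb: 100         # unchanged from the original config
  split_queries_by_interval: 6h  # from the recommendation above (6h or 12h)
  max_line_size: 256KB           # assumed value; caps the size of a single log line
  max_line_size_truncate: true   # truncate oversized lines instead of rejecting them

ingester:
  chunk_target_size: 5242880     # assumed value, in bytes; larger chunks mean fewer files
  chunk_idle_period: 2h          # assumed value; wait longer before flushing idle chunks
  max_chunk_age: 4h              # assumed value; keep chunks open longer before flushing
```

Keeping chunks in memory longer trades RAM for fewer, larger chunk files on disk, so it is worth watching the container's memory use after changing these.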

Thank you! I tried these settings and it seems to be okay now. I am at least able to run a 7-day query covering 10 million rows. It is still not very fast, but I think that’s probably because the queries rely heavily on regexes.

I’ve now been able to resolve my first issue by limiting the CPU and RAM resources the Loki Docker container is allowed to use.
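
For anyone finding this later, a minimal sketch of what such a limit can look like in a Swarm stack file. The service name, image reference and the actual numbers here are placeholders, not the values I used; pick limits that leave headroom for the other services on the node.

```yaml
services:
  loki:                      # hypothetical service name
    image: grafana/loki      # placeholder; pin the tag you actually run
    deploy:
      resources:
        limits:
          cpus: "2.0"        # assumed value; caps Loki at two CPU cores
          memory: 4G         # assumed value; the container is killed if it exceeds this
```

With a cap like this, a heavy query can still saturate Loki itself, but it should no longer starve the other services on the same host.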