Loki is very slow with tsdb compared to boltdb-shipper in single binary mode

I have a single binary mode Loki server with a write path of roughly 5k events per second. Querying all events (with {foo=~".+"}) over a one-hour window takes 10 s with boltdb-shipper as the store, while with tsdb only a 5 minute window of all events can be queried within Grafana's 30 s timeout. This was tested with both Loki 2.9.x and Loki 3.0.0. Am I missing any tsdb-specific configs, perhaps ones specific to single binary mode?

/etc/loki/config.yml

auth_enabled: false

server:
  http_listen_port: 3101
  grpc_listen_port: 9097
  grpc_server_max_recv_msg_size: 20971520
  grpc_server_max_send_msg_size: 20971520

ingester:
  wal:
    enabled: true
    dir: /opt/loki/wal
  chunk_encoding: snappy


common:
  instance_addr: 127.0.0.1
  path_prefix: /opt/loki/
  storage:
    filesystem:
      chunks_directory: /opt/loki/chunks
      rules_directory: /opt/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

query_scheduler:
  max_outstanding_requests_per_tenant: 32000

limits_config:
  ingestion_burst_size_mb: 4096
  ingestion_rate_mb: 2048
  per_stream_rate_limit: 1024M
  per_stream_rate_limit_burst: 2048M
  retention_period: 7d
  max_global_streams_per_user: 0
  allow_structured_metadata: false  # switched between false and true while testing

schema_config:
  configs:
    - from: 2024-05-01
      store: "boltdb-shipper / tsdb"
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

compactor:
  working_directory: /opt/loki/compactor 
  compaction_interval: 10m
  retention_enabled: true 
  retention_delete_delay: 15m
  retention_delete_worker_count: 150
  delete_request_store: filesystem

table_manager:
  retention_deletes_enabled: true
  retention_period: 7d

I don’t think there is any index configuration specific to single-instance mode.
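
For reference, the tsdb index shipper directories can also be set explicitly under storage_config; this is a minimal sketch, and the paths are illustrative rather than taken from the config above:

storage_config:
  tsdb_shipper:
    active_index_directory: /opt/loki/tsdb-index  # where new index files are written
    cache_location: /opt/loki/tsdb-cache          # local cache for index files fetched at query time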

What does your query splitting look like? And what’s your query_ingesters_within set to?

@tonyswumac: split_queries_by_interval is set to 1h, and query_ingesters_within is set to 1h greater than max_chunk_age.
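
For reference, those options live in different config sections; a minimal sketch, with values assumed from the Loki defaults (max_chunk_age defaults to 2h) rather than copied from the setup above:

limits_config:
  split_queries_by_interval: 1h

querier:
  query_ingesters_within: 3h  # max_chunk_age + 1h

ingester:
  max_chunk_age: 2h           # default value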

I found two problems with my setup:

  1. The system was limited by the amount of available memory. With more memory, tsdb and boltdb-shipper perform about the same.
  2. When comparing 2.9.x with 3.0.0, I found that the default for the querier's max_concurrent changed (old: 10, new: 4), which on a single instance effectively caps the number of CPU cores the Loki server process can use for queries (see the sketch below).
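
To restore the 2.9.x query concurrency, the old value can be set explicitly; a minimal sketch:

querier:
  max_concurrent: 10  # Loki 3.0 default is 4; 2.9.x default was 10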