Lots of "context canceled" errors when running queries

Loki is logging lots and lots of “context canceled” errors, which correlate quite nicely with the queries I run through Grafana.
Also note that the queries themselves execute without apparent problems and return logs just fine.

However, I don’t really understand the error messages (pretty much a noob here).
My setup so far is mostly for testing purposes, so it is local (using the 2.7.1 Docker image; a compose sketch follows the config below) and not scaled at all.
There seems to be a related issue which was reported here:

Here is my config:

auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

        
compactor:
  retention_enabled: true
  delete_request_cancel_period: 15m

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h 
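
For completeness, a minimal docker-compose sketch of how a local setup like this can be run (only the 2.7.1 image tag comes from this post; the service name, port mapping, and config mount path are assumptions):

# docker-compose.yml — sketch only; adjust the volume path to wherever the config above is saved
version: "3"
services:
  loki:
    image: grafana/loki:2.7.1
    ports:
      - "3100:3100"   # matches http_listen_port above
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
    command: -config.file=/etc/loki/local-config.yaml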

And these are the error messages in question:

level=error ts=2022-12-14T08:34:53.836753189Z caller=scheduler_processor.go:137 org_id=fake msg="error notifying scheduler about finished query" err=EOF addr=<loki-container-ip-address>:9095
level=error ts=2022-12-14T08:34:53.836740365Z caller=scheduler_processor.go:182 org_id=fake msg="error notifying frontend about finished query" err="rpc error: code = Canceled desc = context canceled" frontend=<loki-container-ip-address>:9095
level=error ts=2022-12-14T08:34:53.783633288Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.76464648Z caller=scheduler_processor.go:137 org_id=fake msg="error notifying scheduler about finished query" err=EOF addr=<loki-container-ip-address>:9095
level=error ts=2022-12-14T08:33:40.764637123Z caller=scheduler_processor.go:182 org_id=fake msg="error notifying frontend about finished query" err="rpc error: code = Canceled desc = context canceled" frontend=<loki-container-ip-address>:9095
level=error ts=2022-12-14T08:33:40.708772286Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.708748542Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.70873676Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.708736189Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.708733935Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.708732643Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.708732392Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"
level=error ts=2022-12-14T08:33:40.708728806Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 err="context canceled"

I have been trying to understand what is happening here, but have not made any real progress. If anyone could help me understand it, I would be much obliged!


Same problem here.

I’m using the loki-distributed chart 0.69.1 with Loki 2.7.1.

I have the same problem too, using the standard Docker container, version 2.7.4.

My timeout value in the Grafana data source is 600.
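
In case it helps, this is roughly what that setting corresponds to in provisioned form (a sketch only — the URL and maxLines values are placeholders; jsonData.timeout is, as far as I know, the field that the HTTP timeout in the data source UI maps to, in seconds):

# grafana datasource provisioning — sketch only
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100    # placeholder
    jsonData:
      timeout: 600           # matches the 600s timeout mentioned above
      maxLines: 1000         # placeholder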

My Loki configuration looks like this:

auth_enabled: false

analytics:
  reporting_enabled: false

server:
  http_listen_port: 3100
  http_server_read_timeout: 300s # allow longer time span queries
  http_server_write_timeout: 300s # allow longer time span queries
  
# will fix a lot of "context canceled" log messages
# https://github.com/grafana/loki/pull/5077/files#diff-025adfc5a8f641b9f5a1869996e3297b6c17f13933f52354cd9b375548ad7970R399
query_range:
  parallelise_shardable_queries: false
  split_queries_by_interval: 0s

# used to reduce amount of "context canceled" log messages
frontend:
  address: 127.0.0.1           # avoids "caller=scheduler_processor.go:182 org_id=fake msg="error notifying frontend about finished query" err="rpc error: code = Canceled desc = context canceled" frontend=172.16.0.110:9095"
#  max_outstanding_per_tenant: 2048 # default = 100]
#  log_queries_longer_than: 20s

querier:
#  max_concurrent: 20
  engine:
    timeout: 5m

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 30m       # Any chunk not receiving new logs in this time will be flushed
  max_chunk_age: 1h            # All chunks will be flushed when they hit this age, default is 1h
  chunk_retain_period: 30s
  max_transfer_retries: 0
  wal:
    dir: /loki/wal

schema_config:
  configs:
    - from: 2018-04-15
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb:
    directory: /loki/index

  filesystem:
    directory: /loki/chunks

chunk_store_config:
  max_look_back_period: 0s

limits_config:
  retention_period: 4320h # 180 days
  #retention_stream:
  #- selector: '{namespace="dev"}'
  #  priority: 1
  #  period: 24h
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h # 7 days
  max_query_series: 100000
  query_timeout: 5m
#  max_query_parallelism: 64

compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 100
 
ruler:
  storage:
    type: local
    local:
      directory: /etc/loki/rules
  rule_path: /tmp
  alertmanager_url: http://alertmanager:9093
  ring:
    kvstore:
      store: inmemory
  enable_api: true
  enable_alertmanager_v2: true

Hi all,

I see the GitHub issue mentioned above, which many other users have also reported. I have assigned it to the right project team and will follow up with a response soon.

Same problem here.

I’m using the local Loki 2.8.2 binary.

When I execute the query against the local binary’s address, everything is OK.

Then I configured a query-frontend, and an error is reported when I execute the query through the frontend:

These are the frontend’s logs:

level=error ts=2023-05-23T08:47:43.034303408Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="sum by (level)(count_over_time({filename=\"/var/log/messages\", job=\"varlogs\"}[1m]))" err="context canceled"
level=error ts=2023-05-23T08:47:43.034333617Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="sum by (level)(count_over_time({filename=\"/var/log/messages\", job=\"varlogs\"}[1m]))" err="context canceled"
level=error ts=2023-05-23T08:47:43.038425191Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="{filename=\"/var/log/messages\", job=\"varlogs\"}" err="context canceled"
level=error ts=2023-05-23T08:47:43.03843109Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="{filename=\"/var/log/messages\", job=\"varlogs\"}" err="context canceled"

These are the local binary’s logs:

level=error ts=2023-05-23T08:49:08.02761579Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="{filename=\"/var/log/messages\", job=\"varlogs\"}" err="context canceled"
level=error ts=2023-05-23T08:49:08.028011325Z caller=retry.go:73 org_id=fake msg="error processing request" try=0 query="sum by (level)(count_over_time({filename=\"/var/log/messages\", job=\"varlogs\"}[1m]))" err="context canceled"

This is the frontend configuration:

target: query-frontend
auth_enabled: false

http_prefix:

server:
  http_listen_port: 3101
  grpc_listen_port: 9096

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

query_range:
  cache_results: true
  results_cache:
    compression: snappy
    cache:
      default_validity: 3h  
      embedded_cache:
        enabled: true
        max_size_mb: 100
        ttl: 1h
      fifocache:
        max_size_bytes: 1GB
        ttl: 1h
  max_retries: 5


limits_config:
  split_queries_by_interval: 30m
  max_cache_freshness_per_query: '10m'
  ingestion_rate_mb: 100   
  max_query_parallelism: 20    


frontend:
  log_queries_longer_than: 10s
  downstream_url: http://local-binary:3100
  compress_responses: true

common:
  compactor_address: http://local-binary:3100

I want to know what happened.

Is my configuration wrong?
