Loki Live Tail broken in the Explore UI only with Websocket Error

sarasensible · April 20, 2022, 7:27pm

I am setting up Loki for the first time using the loki-distributed Helm chart. Everything works great except live tail in the Grafana Explore UI, which returns:

Query error
Live tailing was stopped due to following error: undefined

If I use logcli with a port-forward to the query-frontend, I can tail successfully:

logcli query --addr="http://localhost:3100" '{filename="/var/log/pods/monitoring_loki-loki-distributed-ingester-0_d439f9ec-713b-4b62-a8b2-b10cdfea28c5/ingester/0.log"}' --tail

If I check the browser's developer tools, I see it is a WebSocket error:

centrifuge.js:585 WebSocket connection to 'wss://monitoring.mysite.io/api/live/ws' failed: 
value @ centrifuge.js:585
WebSocketSubject.js:94 WebSocket connection to 'wss://monitoring.mysite.io/api/datasources/proxy/4/loki/api/v1/tail?query=%7Bnamespace%3D~%22.%2B%22%7D' failed: 
t._connectSocket @ WebSocketSubject.js:94
runRequest.ts:147 runRequest.catchError Live tailing was stopped due to following error: undefined

I am installing through Helm with an S3 backend. Here is my configuration:

loki:
  containerSecurityContext:
    readOnlyRootFilesystem: false
#  image:
#    repository: grafana/loki
#    tag: 2.3.0
  config: |
    auth_enabled: false
    server:
      log_level: debug
      http_listen_port: 3100
    memberlist:
      randomize_node_name: false
      join_members:
        - {{ include "loki.fullname" . }}-memberlist.monitoring.svc.cluster.local
    ingester:
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          replication_factor: 1
      chunk_idle_period: 30m
      chunk_block_size: 262144
      chunk_encoding: snappy
      chunk_retain_period: 1m
      max_transfer_retries: 0
      wal:
        dir: /var/loki/wal
    limits_config:
      enforce_metric_name: false
      ingestion_burst_size_mb: 20
      ingestion_rate_mb: 50
      ingestion_rate_strategy: global
      max_cache_freshness_per_query: 10m
      max_global_streams_per_user: 10000
      max_query_length: 12000h
      max_query_parallelism: 16
      max_streams_per_user: 0
      reject_old_samples: true
      reject_old_samples_max_age: 168h
    {{- if .Values.loki.schemaConfig}}
    schema_config:
    {{- toYaml .Values.loki.schemaConfig | nindent 2}}
    {{- end}}
    {{- if .Values.loki.storageConfig}}
    storage_config:
    {{- if .Values.indexGateway.enabled}}
    {{- $indexGatewayClient := dict "server_address" (printf "dns:///%s:9095" (include "loki.indexGatewayFullname" .)) }}
    {{- $_ := set .Values.loki.storageConfig.boltdb_shipper "index_gateway_client" $indexGatewayClient }}
    {{- end}}
    {{- toYaml .Values.loki.storageConfig | nindent 2}}
    {{- end}}
    query_range:
      # make queries more cache-able by aligning them with their step intervals
      align_queries_with_step: true
      max_retries: 5
      cache_results: true

      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_items: 1024
            validity: 24h

    frontend_worker:
      frontend_address: {{ include "loki.queryFrontendFullname" . }}:9095

    frontend:
      log_queries_longer_than: 5s
      compress_responses: true
      tail_proxy_url: http://{{ include "loki.querierFullname" . }}:3100

    compactor:
      working_directory: /data/loki/boltdb-shipper-compactor
      shared_store: s3
      compaction_interval: 10m
      retention_enabled: true
      retention_delete_delay: 2h
      retention_delete_worker_count: 150
      compactor_ring:
        kvstore:
          store: memberlist
  schemaConfig:
    configs:
      - from: "2020-05-15"
        store: boltdb-shipper
        object_store: s3
        schema: v11
        index:
          prefix: index_
          period: 24h
  storageConfig:
    aws:
      s3: s3://us-east-2/mybucket
      s3forcepathstyle: true
    boltdb_shipper:
      active_index_directory: /var/loki/index
      shared_store: s3
      cache_location: /var/loki/boltdb-cache
      cache_ttl: 168h
      index_gateway_client:
        server_address: dns:///loki-distributed-index-gateway:9095
querier:
  replicas: 1
  extraVolumes:
    - name: bolt-db
      emptyDir: {}
  extraVolumeMounts:
    - name: bolt-db
      mountPath: /var/loki
serviceAccount:
  create: false
  name: s3-full
indexGateway:
  enabled: true
compactor:
  enabled: true

Any help would be greatly appreciated!

sarasensible · April 20, 2022, 7:40pm

As soon as I posted, I found the answer: I am using Pomerium to gatekeep my Grafana ingress, and the ingress needed the following annotation to allow WebSockets:

ingress.pomerium.io/allow_websockets: "true"

With that annotation, live tail works perfectly.
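
For anyone landing here with the same error, here is a minimal sketch of where that annotation sits on the Ingress object. Only the annotation and the host come from this thread. The resource name `grafana`, the `monitoring` namespace, the `pomerium` ingress class, and the backend service name and port are assumptions for illustration, so adjust them to your setup:

```yaml
# Sketch only: name, namespace, ingress class, and backend are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana                # hypothetical resource name
  namespace: monitoring        # assumed namespace
  annotations:
    # Allow WebSocket upgrades through Pomerium (needed for live tail)
    ingress.pomerium.io/allow_websockets: "true"
spec:
  ingressClassName: pomerium   # assumed ingress class
  rules:
    - host: monitoring.mysite.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana  # hypothetical service name
                port:
                  number: 80   # hypothetical service port
```

Both failing requests in the console output (`/api/live/ws` and the `/loki/api/v1/tail` datasource proxy path) are WebSocket connections through the ingress. That is why live tail failed in the browser while logcli over a port-forward, which bypasses the ingress, kept working.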
