Loki live tail broken only in the Explore UI, with WebSocket error

I am setting up Loki for the first time using loki-distributed, and everything works great except for live tail, which returns:

Query error
Live tailing was stopped due to following error: undefined

If I use logcli with a port-forward to the query-frontend, I can tail successfully:

logcli query --addr="http://localhost:3100" '{filename="/var/log/pods/monitoring_loki-loki-distributed-ingester-0_d439f9ec-713b-4b62-a8b2-b10cdfea28c5/ingester/0.log"}' --tail

If I check the browser developer tools, I see it is a WebSocket error:

centrifuge.js:585 WebSocket connection to 'wss://monitoring.mysite.io/api/live/ws' failed: 
value @ centrifuge.js:585
WebSocketSubject.js:94 WebSocket connection to 'wss://monitoring.mysite.io/api/datasources/proxy/4/loki/api/v1/tail?query=%7Bnamespace%3D~%22.%2B%22%7D' failed: 
t._connectSocket @ WebSocketSubject.js:94
runRequest.ts:147 runRequest.catchError Live tailing was stopped due to following error: undefined

I am installing through Helm with an S3 backend. Here is my configuration:

loki:
  containerSecurityContext:
    readOnlyRootFilesystem: false
#  image:
#    repository: grafana/loki
#    tag: 2.3.0
  config: |
    auth_enabled: false
    server:
      log_level: debug
      http_listen_port: 3100
    memberlist:
      randomize_node_name: false
      join_members:
        - {{ include "loki.fullname" . }}-memberlist.monitoring.svc.cluster.local
    ingester:
      lifecycler:
        ring:
          kvstore:
            store: memberlist
          replication_factor: 1
      chunk_idle_period: 30m
      chunk_block_size: 262144
      chunk_encoding: snappy
      chunk_retain_period: 1m
      max_transfer_retries: 0
      wal:
        dir: /var/loki/wal
    limits_config:
      enforce_metric_name: false
      ingestion_burst_size_mb: 20
      ingestion_rate_mb: 50
      ingestion_rate_strategy: global
      max_cache_freshness_per_query: 10m
      max_global_streams_per_user: 10000
      max_query_length: 12000h
      max_query_parallelism: 16
      max_streams_per_user: 0
      reject_old_samples: true
      reject_old_samples_max_age: 168h
    {{- if .Values.loki.schemaConfig}}
    schema_config:
    {{- toYaml .Values.loki.schemaConfig | nindent 2}}
    {{- end}}
    {{- if .Values.loki.storageConfig}}
    storage_config:
    {{- if .Values.indexGateway.enabled}}
    {{- $indexGatewayClient := dict "server_address" (printf "dns:///%s:9095" (include "loki.indexGatewayFullname" .)) }}
    {{- $_ := set .Values.loki.storageConfig.boltdb_shipper "index_gateway_client" $indexGatewayClient }}
    {{- end}}
    {{- toYaml .Values.loki.storageConfig | nindent 2}}
    {{- end}}
    query_range:
      # make queries more cache-able by aligning them with their step intervals
      align_queries_with_step: true
      max_retries: 5
      cache_results: true

      results_cache:
        cache:
          enable_fifocache: true
          fifocache:
            max_size_items: 1024
            validity: 24h

    frontend_worker:
      frontend_address: {{ include "loki.queryFrontendFullname" . }}:9095

    frontend:
      log_queries_longer_than: 5s
      compress_responses: true
      tail_proxy_url: http://{{ include "loki.querierFullname" . }}:3100

    compactor:
      working_directory: /data/loki/boltdb-shipper-compactor
      shared_store: s3
      compaction_interval: 10m
      retention_enabled: true
      retention_delete_delay: 2h
      retention_delete_worker_count: 150
      compactor_ring:
        kvstore:
          store: memberlist
  schemaConfig:
    configs:
      - from: "2020-05-15"
        store: boltdb-shipper
        object_store: s3
        schema: v11
        index:
          prefix: index_
          period: 24h
  storageConfig:
    aws:
      s3: s3://us-east-2/mybucket
      s3forcepathstyle: true
    boltdb_shipper:
      active_index_directory: /var/loki/index
      shared_store: s3
      cache_location: /var/loki/boltdb-cache
      cache_ttl: 168h
      index_gateway_client:
        server_address: dns:///loki-distributed-index-gateway:9095
querier:
  replicas: 1
  extraVolumes:
    - name: bolt-db
      emptyDir: {}
  extraVolumeMounts:
    - name: bolt-db
      mountPath: /var/loki
serviceAccount:
  create: false
  name: s3-full
indexGateway:
  enabled: true
compactor:
  enabled: true

Any help would be greatly appreciated!

As soon as I posted, I found the answer: I am using Pomerium to gatekeep my Grafana ingress, and it needed the following annotation to allow WebSockets:

ingress.pomerium.io/allow_websockets: "true"

With that annotation live tail works perfectly.
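
For anyone landing here with the same symptom, this is a minimal sketch of where that annotation would sit, assuming Grafana is exposed through a plain Kubernetes Ingress handled by the Pomerium ingress controller. Only the annotation and the hostname come from this thread; the resource name, namespace, ingress class, and backend service and port are hypothetical placeholders to adjust for your own setup.

```yaml
# Sketch only: everything marked "assumed" is illustrative, not from the original setup.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana                # assumed resource name
  namespace: monitoring        # assumed namespace for Grafana
  annotations:
    # Lets Pomerium pass WebSocket upgrade requests through to Grafana,
    # which the live tail and /api/live/ws connections rely on.
    ingress.pomerium.io/allow_websockets: "true"
spec:
  ingressClassName: pomerium   # assumed ingress class name
  rules:
    - host: monitoring.mysite.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana  # assumed Grafana service name
                port:
                  number: 80   # assumed service port
```

If Grafana is deployed through a Helm chart that manages the Ingress for you, the same key and value would go wherever that chart accepts ingress annotations instead.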
