Loki datasource not displaying all the services in a Kubernetes (EKS) cluster in the Grafana UI

I have Loki set up in Grafana (added as a datasource); it has proper labels and it runs on a Kubernetes cluster. The values listed under the labels are not shown in full, e.g. only about half of the values assigned to a label appear, even though the cluster has all the pods running, and Promtail is configured with Loki and supplies the logs to it properly. What is missing here?

I have 77 services running in my cluster and Promtail ships all of their logs to Loki, but for some reason, under labels such as container, only 30 or so services are listed and no more than that. I have also logged into the Loki ingester and querier pods to check the index storage; the indexes are present too.

I’ll provide a screenshot of my Grafana UI and of the running pods as well.



Please get back to me with any solutions; I can provide my ConfigMap as well if needed.

Thank You !!

How are you generating the container label? Are you sure all your container logs fit whatever criteria it uses?
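
For reference, in a typical Promtail Kubernetes setup the container label comes from a relabel rule along these lines (an abridged sketch; your actual scrape_configs may differ):

scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod                    # discover pods through the Kubernetes API
  relabel_configs:
  # expose the Kubernetes container name as the container label
  - source_labels: [__meta_kubernetes_pod_container_name]
    target_label: container

Any pod that is dropped by the relabel rules, or never discovered at all, contributes no values under that label.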

Yes @tonyswumac, it fits the criteria. I have found why my deployments were not showing up: since some deployments were created before Loki was set up in my cluster, they were not being picked up. After restarting those deployments (services), they finally showed up in my Grafana UI.

I have one more problem: every time I edit my ConfigMap and restart my Loki deployments (distributor, querier and ingester), the logs for services stop showing up in Grafana, even though I’m storing them in my S3 bucket (the indexes are present). After such a restart only logs written after the restart show up, and I badly want to know how the retention system truly works here. I will share my ConfigMap below as well.

Data
====
config.yaml:
----
auth_enabled: false
chunk_store_config:
  max_look_back_period: 0s
common:
  compactor_address: http://loki-loki-distributed-compactor:3100
compactor:
  shared_store: s3
  working_directory: /var/loki/compactor
distributor:
  ring:
    kvstore:
      store: memberlist
frontend:
  compress_responses: true
  log_queries_longer_than: 30s
  tail_proxy_url: http://loki-loki-distributed-querier:3100
frontend_worker:
  frontend_address: loki-loki-distributed-query-frontend-headless:9095
ingester:
  chunk_block_size: 262144
  chunk_encoding: snappy
  chunk_idle_period: 30m
  chunk_retain_period: 1m
  lifecycler:
    ring:
      kvstore:
        store: memberlist
      replication_factor: 1
  max_transfer_retries: 0
  wal:
    dir: /var/loki/wal
ingester_client:
  grpc_client_config:
    grpc_compression: gzip
limits_config:
  enforce_metric_name: false
  ingestion_rate_mb: 100
  ingestion_burst_size_mb: 200
  max_cache_freshness_per_query: 10m
  reject_old_samples: false
  reject_old_samples_max_age: 168h
  split_queries_by_interval: 15m
  max_streams_per_user: 10000
  max_label_name_length: 2048
  max_label_value_length: 4096
  max_label_names_per_series: 50
  max_entries_limit_per_query: 5000
memberlist:
  cluster_label: loki.monitoring
  join_members:
  - loki-loki-distributed-memberlist
query_range:
  align_queries_with_step: true
  cache_results: true
  max_retries: 5
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        ttl: 24h
ruler:
  alertmanager_url: https://alertmanager.xx
  external_url: https://alertmanager.xx
  ring:
    kvstore:
      store: memberlist
  rule_path: /tmp/loki/scratch
  storage:
    local:
      directory: /etc/loki/rules
    type: local
runtime_config:
  file: /var/loki-distributed-runtime/runtime.yaml
schema_config:
  configs:
  - from: "2020-09-07"
    index:
      period: 24h
      prefix: loki_index_
    object_store: s3
    schema: v11
    store: boltdb-shipper
server:
  http_listen_port: 3100
storage_config:
  boltdb_shipper:
    active_index_directory: /var/loki/index
    cache_location: /var/loki/cache
    cache_ttl: 168h
    shared_store: s3
  aws:
    s3: s3://****@us-west-2/staging-loki-s3
    s3forcepathstyle: true
    endpoint: https://staging-loki-s3.s3.us-west-2.amazonaws.com

Let me know your thoughts on this, as log retention is one of the main things my setup needs.

Thanks in advance !!

Are you sure your logs are actually stored in S3? Do you see the chunk files there?

You don’t have retention configured, and as such Loki is supposed to keep logs forever. Retention is not your problem if you aren’t seeing logs after a restart.
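
For reference, if you do want time-based retention later, a minimal sketch building on the compactor you already run would look like this (retention only kicks in once retention_enabled is set; the retention_period value is just an example):

compactor:
  shared_store: s3
  working_directory: /var/loki/compactor
  retention_enabled: true        # the compactor also enforces retention
  retention_delete_delay: 2h     # grace period before marked chunks are deleted
limits_config:
  retention_period: 744h         # keep logs for 31 days, then delete them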

Here are my S3 bucket screenshots:


If these configurations are wrong, please suggest a better way to store the logs in S3; I put this together after referring to the docs, but I’m still confused about this part.

What’s under the fake directory?

There are 999+ objects like this under fake.

That looks normal to me.

Do you see any error messages in your querier? If you hit your Loki cluster with the API (either the query or the labels API) right after a restart, what do you get?

Let me check on this with restarts and get back to you. But as far as I’ve noticed, the querier doesn’t throw any errors.

Hi @tonyswumac, I have tried restarting the querier, which is deployed as a StatefulSet, and during pod startup I don’t see any errors popping up. So how do I retrieve older logs from S3 after the Loki component deployments (like ingester and querier) have been restarted (and in some cases ran into pod crashes)?

Not sure what you mean here. You should be able to send an API call to the Loki querier and specify a time frame that covers logs from before the containers were restarted.
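
For example, something along these lines should return the older entries (a sketch: the querier address is taken from your config above, while the label selector and time range are placeholders to replace with your own):

# substitute your own label selector and a time range from before the restart
curl -sG "http://loki-loki-distributed-querier:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={container="my-service"}' \
  --data-urlencode 'start=2023-05-01T00:00:00Z' \
  --data-urlencode 'end=2023-05-02T00:00:00Z' \
  --data-urlencode 'limit=100'

If that call returns the old logs, the chunks in S3 are readable and the issue is on the Grafana/datasource side; if it returns nothing, the querier is not reaching the older chunks.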