Error "Empty results, no matching label"

Hi guys,
I shipped a log snippet via FluentBit to Loki, and when I clicked the labels I set for this snippet in Grafana, it gave me this error:

When I tried the API via curl, no labels were found for this log snippet:

{
  "status": "success",
  "data": [
    {
      "app": "jsm",
      "filename": "/home/dujas/fluent-bit/jira.log",
      "hostname": "centos8-1"
    },
    {
      "hostname": "centos8-1",
      "app": "jsm",
      "filename": "/home/dujas/fluent-bit/jira2.log"
    },
    {
      "app": "jsm",
      "hostname": "centos8-1",
      "filename": "/home/dujas/fluent-bit/jira4.log"
    }
  ]
}

Here is the log snippet:

2024-03-01 17:29:45,196+0800 FelixStartLevel INFO [c.a.c.e.c.a.j.i.s.managers.DefaultSearchHandlerManager.helperResettableLazyReference] Cache com.atlassian.jira.issue.search.managers.DefaultSearchHandlerManager.helperResettableLazyReference was flushed
2024-03-01 17:29:45,196+0800 FelixStartLevel INFO [c.a.c.e.c.a.j.i.s.managers.DefaultSearchHandlerManager.helperResettableLazyReference] Cache com.atlassian.jira.issue.search.managers.DefaultSearchHandlerManager.helperResettableLazyReference was flushed
2024-03-01 17:29:45,196+0800 FelixStartLevel INFO [c.a.c.e.c.a.j.i.s.managers.DefaultSearchHandlerManager.helperResettableLazyReference] Cache com.atlassian.jira.issue.search.managers.DefaultSearchHandlerManager.helperResettableLazyReference was flushed
2024-03-01 17:29:45,220+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.plugins.atlassian-whitelist-api-plugin-5.0.6’, not ACTIVE
2024-03-01 17:29:45,232+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.plugin.jslibs’, not ACTIVE
2024-03-01 17:29:45,233+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.plugin.atlassian-spring-scanner-runtime’, not ACTIVE
2024-03-01 17:29:45,233+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.plugin.atlassian-spring-scanner-annotation’, not ACTIVE
2024-03-01 17:29:45,243+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.oauth.atlassian-oauth-service-provider-spi’, not ACTIVE
2024-03-01 17:29:45,243+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.oauth.atlassian-oauth-consumer-spi’, not ACTIVE
2024-03-01 17:29:45,243+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.oauth.admin’, not ACTIVE
2024-03-01 17:29:45,780+0800 FelixStartLevel WARN [c.a.p.osgi.factory.OsgiBundlePlugin] Cannot disable Bundle ‘com.atlassian.bundles.nekohtml-1.9.12.1’, not ACTIVE
2024-03-01 17:29:46,156+0800 main INFO [c.a.jira.versioning.TransactionSupportHelper] [VERSIONING] Shutting down required-new-transaction pool
2024-03-01 17:29:46,183+0800 main INFO [c.a.jira.plugin.PluginTransactionListener] [plugin-transaction] Shutting down

I tried setting max_chunk_age to 108h, but still no luck. For the latest log files, it works fine.

Any thoughts?

Well, the logs are searchable now. May I know what the trick under the hood is? Why can't I search the log files right away?

Please share your configuration.

From your description it sounds like you are not seeing logs until they are written into chunk storage. You'll want to make sure the querier is configured to query the ingesters within a reasonable time frame (the default is 3h, if I remember correctly). If that is configured correctly, then you might want to double-check that your queriers can actually connect to your ingesters.
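
If you want to check or adjust it, the setting lives under the querier block. A minimal sketch (the value shown is just the default):

    querier:
      # Queriers only ask ingesters for data newer than now minus this value;
      # anything older is expected to come from object storage.
      query_ingesters_within: 3h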

Hi Tony,

Thanks for your reply.

I checked the config via "/config"; yes, query_ingesters_within is set to the default, 3h0m0s.

Please see the config below, which is based on the Simple Scalable Deployment on K8s:

    auth_enabled: true

    server:
      http_listen_port: 3100
      grpc_listen_port: 9095
      log_level: debug

      http_tls_config:
        cert_file: /etc/loki/certs/loki-server.com.crt
        key_file: /etc/loki/certs/loki-server.com.key
        client_auth_type: RequireAndVerifyClientCert
        client_ca_file: /etc/loki/certs/ca.crt

      grpc_tls_config:
        cert_file: /etc/loki/certs/loki-server.com.crt
        key_file: /etc/loki/certs/loki-server.com.key
        client_auth_type: RequireAndVerifyClientCert
        client_ca_file: /etc/loki/certs/ca.crt

    common:
      path_prefix: /var/loki
      replication_factor: 3
      ring:
        kvstore:
          store: memberlist
      compactor_address: "loki-backend.logging.svc.cluster.local"

      storage:
        s3:
          endpoint: <ip address of S3>
          bucketnames: chunks
          access_key_id: <key>
          secret_access_key: <key>
          s3forcepathstyle: true
          insecure: true

    ingester:
      max_chunk_age: 108h

    ingester_client:
      grpc_client_config:
        tls_enabled: true
        tls_cert_path: /etc/loki/certs/loki-write.logging.svc.cluster.local.crt
        tls_key_path: /etc/loki/certs/loki-write.logging.svc.cluster.local.key
        tls_ca_path: /etc/loki/certs/ca.crt
        tls_server_name: loki-server.com
        tls_insecure_skip_verify: false

    memberlist:
      join_members:
        - "loki-memberlist.logging.svc.cluster.local"
      tls_enabled: true
      tls_cert_path: /etc/loki/certs/loki-memberlist.logging.svc.cluster.local.crt
      tls_key_path: /etc/loki/certs/loki-memberlist.logging.svc.cluster.local.key
      tls_ca_path: /etc/loki/certs/ca.crt
      tls_server_name: loki-server.com
      tls_insecure_skip_verify: false

    limits_config:
      reject_old_samples: true
      reject_old_samples_max_age: 168h
      max_cache_freshness_per_query: 10m
      split_queries_by_interval: 24h
      max_query_parallelism: 100

    query_scheduler:
      max_outstanding_requests_per_tenant: 4096
      grpc_client_config:
        tls_enabled: true
        tls_cert_path: /etc/loki/certs/loki-backend.logging.svc.cluster.local.crt
        tls_key_path: /etc/loki/certs/loki-backend.logging.svc.cluster.local.key
        tls_ca_path: /etc/loki/certs/ca.crt
        tls_server_name: loki-server.com

    index_gateway:
      mode: ring

    frontend:
      scheduler_address: query-scheduler-discovery.logging.svc.cluster.local:9095
      max_outstanding_per_tenant: 4096
      grpc_client_config:
        tls_enabled: true
        tls_cert_path: /etc/loki/certs/loki-read.logging.svc.cluster.local.crt
        tls_key_path: /etc/loki/certs/loki-read.logging.svc.cluster.local.key
        tls_ca_path: /etc/loki/certs/ca.crt
        tls_server_name: loki-server.com

    frontend_worker:
      scheduler_address: query-scheduler-discovery.logging.svc.cluster.local:9095
      grpc_client_config:
        tls_enabled: true
        tls_cert_path: /etc/loki/certs/loki-read.logging.svc.cluster.local.crt
        tls_key_path: /etc/loki/certs/loki-read.logging.svc.cluster.local.key
        tls_ca_path: /etc/loki/certs/ca.crt
        tls_server_name: loki-server.com

    schema_config:
      configs:
        - from: "2023-12-13"
          store: tsdb
          object_store: s3
          schema: v12
          index:
            period: 24h
            prefix: loki_index_

    storage_config:
      tsdb_shipper:
        active_index_directory: /var/loki/tsdb-index
        cache_location: /var/loki/tsdb-cache
        shared_store: s3
        index_gateway_client:
          grpc_client_config:
            tls_enabled: true
            tls_cert_path: /etc/loki/certs/loki-backend.logging.svc.cluster.local.crt
            tls_key_path: /etc/loki/certs/loki-backend.logging.svc.cluster.local.key
            tls_ca_path: /etc/loki/certs/ca.crt
            tls_server_name: loki-server.com

I don't see anything that's obviously wrong.

  1. Do you see anything interesting in the querier logs?
  2. Does it actually take 3h for a log to show up in your query?
  3. What does /ring look like from your ingesters?

Hi Tony,

  1. There is nothing special in the querier logs as far as I can tell; I may have missed some hints, but there are no errors.
  2. As far as I remember, it doesn't take 3 hours for the logs to show up.
  3. I am running Loki in SSD mode and here is the ring status:

I set max_chunk_age back to the default 2h and ingested a log snippet timestamped around 02:52 AM, March 1st; I got the same error as before, "Empty results, no matching label". However, if I ingest fresh logs, they are searchable right away. The only difference here is the log timestamp, so there must be some config I have missed.

From the querier logs, this is the entry from when I queried the older log:

level=info ts=2024-03-06T08:41:12.052420498Z caller=metrics.go:264 component=frontend org_id=AI traceID=54deac250e11b644 latency=fast query_type=series length=168h0m0s duration=109.604709ms status=200 match="{filename=\"/home/dujas/fluent-bit/jira5.log\"}" splits=8 throughput=0B total_bytes=0B total_entries=0

Thanks.

Tony,

I just set query_ingesters_within to 168h, and now the log snippet I just ingested is searchable:

Per the Loki configuration docs:

query_ingesters_within: Maximum lookback beyond which queries are not sent to ingester

So if the query time range falls outside the scope of query_ingesters_within, Loki will not search the ingesters at all, only the backend storage. However, since the data is still within the max_chunk_age or chunk_idle_period window, the logs in memory have not been flushed to storage yet, so I could find the logs in neither memory nor storage.
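
To illustrate my understanding with example values (a sketch, not my full config):

    querier:
      query_ingesters_within: 3h   # queries over ranges older than now-3h skip the ingesters
    ingester:
      max_chunk_age: 108h          # chunks can sit in ingester memory this long before flushing
    # A log line stamped several days ago is older than now-3h, so the querier only looks
    # at object storage; but the chunk holding it is still in ingester memory and has not
    # been flushed yet, so the query finds it in neither place.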

Please correct me if there is any misunderstanding.

I set it to 0 (all queries are sent to the ingesters) and the older logs are searchable now.

Thanks.

It is correct that the querier won't search the ingesters outside of query_ingesters_within. I would recommend not always searching the ingesters, due to performance concerns (although I don't know how much of an impact it would have).

What we have configured is:

    query_ingesters_within: 2h
    max_chunk_age: 90m
    chunk_idle_period: 90m

Perhaps try that instead and see if it works.
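
For reference, those settings live in different blocks; a sketch of where they would go (adapt to your own file):

    querier:
      query_ingesters_within: 2h   # querier asks ingesters about the most recent 2h of a query
    ingester:
      max_chunk_age: 90m           # cut and flush a chunk once it has been open this long
      chunk_idle_period: 90m       # cut and flush a chunk that has received no new lines for this long

The idea is that query_ingesters_within is at least as large as the flush windows, so recent data can always be found either in the ingesters or in storage.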

Thanks Tony.

In your configuration, older logs (older than query_ingesters_within) that are newly ingested into Loki would not be queryable until they are flushed to the backend storage (within 90 minutes), is that right?

No, all logs should be queryable. The querier will query the ingesters for the configured query_ingesters_within time frame (if it's 2h, that means for the past two hours the querier will query the ingesters first). This means that new logs that haven't been written to the backend storage yet can be queried from the ingesters.

Thanks Tony.

So query_ingesters_within refers to the ingestion time range, not the query range? By setting it to 2 hours, even older log files (older than 2 hours, e.g. from yesterday) that were ingested within the past 2 hours would still be queryable? If that is the case, I am curious why in my case those older log files are not queryable right after they are ingested, unless I set it to 0.

Not sure what you mean. It is the query range, meaning the querier will query the ingesters for however far back it is configured. Then it depends on which logs are on the ingesters during that time frame: chunks that are too small and waiting for the max chunk age, chunks that are idle and waiting for the max idle period, or chunks that are still accumulating to reach the target chunk size before being written.
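
Those three flush triggers map to these ingester settings (a sketch; the values are the defaults as I recall them, and chunk_target_size is not in your posted config):

    ingester:
      max_chunk_age: 2h            # flush a chunk once it has been open this long
      chunk_idle_period: 30m       # flush a chunk that has received no new lines for this long
      chunk_target_size: 1572864   # or flush once the chunk reaches roughly this many bytes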

Thanks Tony for the explanation and apologies for the misunderstanding.

In my very first question, I ingested old log files (2024-03-01) with query_ingesters_within set to the default (3 hours), but they were not searchable right away; I pasted the Grafana error as well. However, once I set query_ingesters_within to 168h, which is long enough to cover the timestamps of the old log files, they became searchable right away.

Therefore, I would like to understand whether there is any relation between this value and the log timestamp.
For example, if I ingest a log snippet with timestamp 2024.03.01-01:00:00 and query_ingesters_within is set to 3 hours, should Loki be able to query the log snippet right after it is ingested?

Then that probably means that particular log managed to stay on the ingester past query_ingesters_within. Perhaps either max idle or max age were set to be bigger than query_ingesters_within?

Hi Tony,

I just reset query_ingesters_within, max_chunk_age and chunk_idle_period back to their defaults, which are 3h, 2h and 30m respectively. Afterwards, I ingested a log snippet generated on March 9th; the details are below:

2024-03-09 11:27:22,725+0800 index-writer-stats-CHANGE_HISTORY-3-0 INFO [c.a.jira.index.WriterWithStats] [JIRA-STATS] [index-writer-stats] CHANGE_HISTORY : snapshot stats: {“addDocumentsMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“deleteDocumentsMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“updateDocumentsMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“updateDocumentConditionallyMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“updateDocumentsWithVersionMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“10”:0,“100”:0,“500”:0,“1000”:0,“5000”:0,“10000”:0,“30000”:0,“60000”:0}},“updateDocumentsWithVersionSize”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“1”:0,“10”:0,“100”:0,“1000”:0,“10000”:0}},“replaceDocumentsWithVersionMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“10”:0,“100”:0,“500”:0,“1000”:0,“5000”:0,“10000”:0,“30000”:0,“60000”:0}},“replaceDocumentsWithVersionSize”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“1”:0,“10”:0,“100”:0,“1000”:0,“10000”:0}},“optimizeMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“closeMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“commitMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}}}, index writer version cache stats: {“put”:0,“get”:0,“getFound”:0,“clear”:0,“clearSizeExceeded”:0}, index writer searcher stats: {“createNew”:0,“reuse”:0}
2024-03-09 11:27:22,745+0800 index-writer-stats-WORKLOG-4-0 INFO [c.a.jira.index.WriterWithStats] [JIRA-STATS] [index-writer-stats] WORKLOG : total stats: {“addDocumentsMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“deleteDocumentsMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“updateDocumentsMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“updateDocumentConditionallyMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“updateDocumentsWithVersionMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“10”:0,“100”:0,“500”:0,“1000”:0,“5000”:0,“10000”:0,“30000”:0,“60000”:0}},“updateDocumentsWithVersionSize”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“1”:0,“10”:0,“100”:0,“1000”:0,“10000”:0}},“replaceDocumentsWithVersionMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“10”:0,“100”:0,“500”:0,“1000”:0,“5000”:0,“10000”:0,“30000”:0,“60000”:0}},“replaceDocumentsWithVersionSize”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{“1”:0,“10”:0,“100”:0,“1000”:0,“10000”:0}},“optimizeMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“closeMillis”:{“count”:0,“min”:0,“max”:0,“sum”:0,“avg”:0,“distributionCounter”:{}},“commitMillis”:{“count”:5,“min”:1,“max”:21,“sum”:56,“avg”:11,“distributionCounter”:{}}}, index writer version cache stats: {“put”:0,“get”:0,“getFound”:0,“clear”:0,“clearSizeExceeded”:0}, index writer searcher stats: {“createNew”:0,“reuse”:0}

When I tried to query from Grafana over the past 7 days, the same error came up:

The same goes for the API:

  1. Double check and make sure your queriers can actually connect to your ingesters on both gRPC and HTTP ports.

  2. If you push a log stream with a timestamp from the past, I am pretty sure you still need to wait out the max_chunk_age or chunk_idle_period.

Thanks Tony for the reply.

As of the time I am typing, I am able to query the logs; 40 minutes have passed since the ingestion:

I just ingested another snippet, which was generated several minutes ago:

2024-03-13 23:53:51,980+0800 JIRA-Bootstrap INFO [c.a.plugin.util.WaitUntil] Plugins that have yet to be enabled: (18): [com.codebarrel.addons.automation, com.atlassian.servicedesk.servicedesk-knowledge-base-plugin, com.atlassian.jira.migration.jira-migration-plugin, com.atlassian.servicedesk.incident-management-plugin, com.atlassian.servicedesk.project-ui, com.atlassian.servicedesk.plugins.automation.servicedesk-automation-modules-plugin, com.atlassian.servicedesk.frontend-webpack-plugin, com.atlassian.servicedesk.public-rest-api, com.atlassian.servicedesk.servicedesk-variable-substitution-plugin, com.atlassian.servicedesk.plugins.servicedesk-search-plugin, com.atlassian.servicedesk.servicedesk-reports-plugin, com.atlassian.servicedesk.servicedesk-lingo-integration-plugin, com.atlassian.servicedesk.servicedesk-notifications-plugin, com.atlassian.jira.mobile.jira-mobile-rest, com.atlassian.servicedesk, com.atlassian.servicedesk.plugins.automation.servicedesk-automation-then-webhook-plugin, com.atlassian.servicedesk.servicedesk-canned-responses-plugin, com.atlassian.servicedesk.approvals-plugin], 282 seconds remaining
2024-03-13 23:53:51,985+0800 ThreadPoolAsyncTaskExecutor::Thread 8 INFO [c.a.j.migration.guardrails.AccessLogProcessingJobRunner] Successfully registered access log processing job {}. jira-migration-access-log-processing-job-key
2024-03-13 23:53:52,029+0800 ThreadPoolAsyncTaskExecutor::Thread 8 INFO [c.a.j.migration.mediaclient.MediaClient] media base url: https://api.media.atlassian.com

It was queryable right away:

Btw, may I know how to check the connection between queriers and ingesters?

I would say you can use docker exec and get into the querier container, then install telnet, then telnet to the ingester on both ports (you can get the ingester IPs from the /ring API).

And to clean up you can just replace that one querier container you installed telnet on.

Hi Tony,

As I am using the SSD deployment mode, there are no dedicated querier and ingester pods, just read and write pods.

I added a busybox sidecar to the loki-read pod for telnet; the IP address 172.16.58.208 is one of the loki-write pods:
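
The sidecar definition was roughly like this (a sketch; the container name and image tag are illustrative):

    # Added to the loki-read pod spec, next to the existing loki container
    containers:
      - name: debug
        image: busybox:1.36
        command: ["sh", "-c", "while true; do sleep 3600; done"]
    # Then, from inside the sidecar:
    #   telnet 172.16.58.208 3100   # HTTP port
    #   telnet 172.16.58.208 9095   # gRPC port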

Thanks.

That looks good. If you send a request with the current timestamp into your cluster, is it queryable right away?