Many logs are missing when querying Loki through LogCLI to retrieve all logs

I am storing about 500 MB of logs in Loki using Promtail (loaded all at once, with a few MB added daily).
I need to gather a year of logs for an annual report, but when I query Loki with LogCLI (Loki stores its data in Minio), most of the logs are missing: I get around 5 MB of logs when it should be around 500 MB.

I have a few questions: Is it good practice to store logs in Loki like this for reporting? Loki works well for live log monitoring, but am I misusing it by storing logs over long periods to produce annual reports? Should I use another system to store logs for reporting? How do most users build reports from their logs?

Anyway, I set the retention period to one year, and queries do return old logs (11 months old), but other logs from that same year are still missing.

Here is my config:

auth_enabled: false

server:
  http_listen_port: 3100

distributor:
  ring:
    kvstore:
      store: memberlist

ingester:
  max_chunk_age: 168h
  chunk_idle_period: 168h
  chunk_target_size: 3145728
  lifecycler:
    ring:
      kvstore:
        store: memberlist
      replication_factor: 1
    final_sleep: 0s
  wal:
    enabled: true
    dir: /loki/wal

memberlist:
  abort_if_cluster_join_fails: false

  bind_port: 7946

  join_members:
  - loki:7946

  max_join_backoff: 1m
  max_join_retries: 10
  min_join_backoff: 1s

schema_config:
  configs:
  - from: 2020-05-15
    store: tsdb
    object_store: s3
    schema: v12
    index:
      prefix: index_
      period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    resync_interval: 5s
    shared_store: s3
  aws:
    s3: http://<minioadmin>:<minioadmin>@minio.:9000/loki
    s3forcepathstyle: true

limits_config:
  enforce_metric_name: false
  reject_old_samples: false
  reject_old_samples_max_age: 0h
  retention_period: 365d

compactor:
  retention_enabled: true
  working_directory: /loki/compactor
  shared_store: aws

query_scheduler:
  max_outstanding_requests_per_tenant: 32768
  
querier:
  max_concurrent: 16
 
ruler:
  alertmanager_url: http://localhost:9093


Thank you for your help. I feel like there is something I don’t understand about Loki when it comes to reporting and log analytics (as opposed to log monitoring).

I don’t think there is any problem with storing logs for a long period of time. I am sure there are people storing logs for just as long as you do, if not longer, with a lot more volume.

What is your storage usage on Minio? Do you see the expected amount of storage used? Also print out the current configuration with the /config API endpoint, and review all limits configuration (such as max_query_length).
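To illustrate why max_query_length is worth reviewing: if it is set below the period you want to report on, a single query covering the whole year would exceed the limit, so the export has to be issued as several shorter queries. Here is a minimal sketch of splitting a reporting period into consecutive windows. The function name split_range, the example dates, and the 30-day window are assumptions for illustration, not values taken from your setup.

```python
from datetime import datetime, timedelta, timezone


def split_range(start, end, max_window):
    """Yield consecutive (window_start, window_end) pairs covering [start, end).

    Each window is at most max_window long, so every query stays under a
    configured maximum query length.
    """
    if max_window <= timedelta(0):
        raise ValueError("max_window must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + max_window, end)
        yield cursor, window_end
        cursor = window_end


# Hypothetical reporting year; adjust to the period you actually need.
report_start = datetime(2023, 1, 1, tzinfo=timezone.utc)
report_end = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Assumed window size: pick something at or below your max_query_length.
windows = list(split_range(report_start, report_end, timedelta(days=30)))

for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each printed pair can then be used as the start and end of one query, and the results concatenated for the report.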

The Minio bucket for Loki is 1.5 MB. I am guessing it is small because the data is compressed, but that is still very low.

Here is my limits_config:

limits_config:
  ingestion_rate_strategy: global
  ingestion_rate_mb: 4
  ingestion_burst_size_mb: 6
  max_label_name_length: 1024
  max_label_value_length: 2048
  max_label_names_per_series: 30
  reject_old_samples: false
  reject_old_samples_max_age: 0s
  creation_grace_period: 10m
  enforce_metric_name: false
  max_line_size: 0
  max_line_size_truncate: false
  increment_duplicate_timestamp: false
  max_streams_per_user: 0
  max_global_streams_per_user: 5000
  unordered_writes: true
  per_stream_rate_limit: 3145728
  per_stream_rate_limit_burst: 15728640
  max_chunks_per_query: 2000000
  max_query_series: 500
  max_query_lookback: 0s
  max_query_length: 30d1h
  max_query_parallelism: 32
  tsdb_max_query_parallelism: 512
  cardinality_limit: 100000
  max_streams_matchers_per_query: 1000
  max_concurrent_tail_requests: 10
  max_entries_limit_per_query: 5000
  max_cache_freshness_per_query: 1m
  max_queriers_per_tenant: 0
  query_ready_index_num_days: 0
  query_timeout: 5m
  split_queries_by_interval: 30m
  min_sharding_lookback: 0s
  ruler_evaluation_delay_duration: 0s
  ruler_max_rules_per_rule_group: 0
  ruler_max_rule_groups_per_tenant: 0
  ruler_alertmanager_config: null
  ruler_tenant_shard_size: 0
  ruler_remote_write_disabled: false
  ruler_remote_write_url: ""
  ruler_remote_write_timeout: 0s
  ruler_remote_write_headers: {}
  ruler_remote_write_queue_capacity: 0
  ruler_remote_write_queue_min_shards: 0
  ruler_remote_write_queue_max_shards: 0
  ruler_remote_write_queue_max_samples_per_send: 0
  ruler_remote_write_queue_batch_send_deadline: 0s
  ruler_remote_write_queue_min_backoff: 0s
  ruler_remote_write_queue_max_backoff: 0s
  ruler_remote_write_queue_retry_on_ratelimit: false
  ruler_remote_write_sigv4_config: null
  deletion_mode: filter-and-delete
  retention_period: 1y
  per_tenant_override_config: ""
  per_tenant_override_period: 10s
  allow_deletes: false
  shard_streams:
    enabled: false
    logging_enabled: false
    desired_rate: 3145728

Are you sure your logs are actually stored in minio? Since you are setting maximum chunk age to 7 days, try querying for anything past 7 days and see if there is any result.

I suspect your storage configuration may be incorrect. If you expected 500 MB of total logs, then your Minio storage should definitely have more than 1.5 MB in there. Check and see what’s actually in your Minio storage as well.
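For the "query past 7 days" check, here is a rough sketch that builds a request URL for Loki's query_range HTTP endpoint, covering a window that ends exactly one max_chunk_age ago. The base URL, the selector {job="myapp"}, the fixed "now" timestamp, and the function names are placeholders I made up; substitute your own address and labels.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def to_unix_ns(ts):
    """Convert an aware datetime to whole-second Unix time in nanoseconds."""
    return int(ts.timestamp()) * 1_000_000_000


def older_than_chunk_age_url(base_url, selector, now, max_chunk_age, window, limit):
    """Build a query_range URL for a window that ends max_chunk_age before now."""
    end = now - max_chunk_age
    start = end - window
    params = {
        "query": selector,
        "start": to_unix_ns(start),
        "end": to_unix_ns(end),
        "limit": limit,
        "direction": "forward",
    }
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


# Placeholder values: address, selector, and "now" are assumptions.
url = older_than_chunk_age_url(
    base_url="http://localhost:3100",
    selector='{job="myapp"}',
    now=datetime(2024, 1, 15, tzinfo=timezone.utc),
    max_chunk_age=timedelta(hours=168),  # matches max_chunk_age in the posted config
    window=timedelta(days=1),
    limit=5000,  # matches max_entries_limit_per_query in the posted limits
)
print(url)
```

If a request like this returns nothing for a period where you know logs were ingested, that would be consistent with chunks never having been flushed to object storage.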