Problems migrating logs to a new Loki instance

Hi,

we are trying to migrate our log data from one S3 bucket to another and to move from monolithic to simple scalable mode, but we can't get it to work.
We tried simply copying the files from one bucket to the other and pointing Loki at the new bucket in its config, but it doesn't seem to find any files. Any ideas what we can do?

Thank you in advance
Sebastian


Please provide more information. Any errors in the logs? What does your config look like? Does the new Loki cluster write chunks to where you’d expect it to?

If you are already using S3, you should be able to just change from monolithic to SSD mode. Of course you’d want to test in a dev environment and make sure you have all the configuration ironed out.
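If it helps, here is a minimal sketch of what that split can look like with Docker Compose; the service names and image tag are just placeholders (they happen to match the names that show up later in this thread), and all three targets point at the same config file:

services:
  loki-write:
    image: grafana/loki:2.9.8
    # write path (distributor + ingester)
    command: -config.file=/etc/loki/loki.yaml -target=write
  loki-read:
    image: grafana/loki:2.9.8
    # read path (query frontend + querier)
    command: -config.file=/etc/loki/loki.yaml -target=read
  loki-backend:
    image: grafana/loki:2.9.8
    # backend components (compactor, ruler, index gateway, etc.)
    command: -config.file=/etc/loki/loki.yaml -target=backend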

Hi,

thanks for the fast response.
We synced the two buckets with aws s3 sync (roughly the command below); it took a decent amount of time.
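The copy itself was essentially just this (bucket names are placeholders here, ours are masked in the configs below):

# one-time copy of all objects from the old bucket to the new one
aws s3 sync s3://old-loki-bucket s3://new-loki-bucket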
We edited the example config from the GitHub repo to fit our needs and to match the old config in the important parts, but there is still no data on the new side.
Old Config:

auth_enabled: false
frontend:
  max_outstanding_per_tenant: 4096
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
  http_server_read_timeout: 60s # allow longer time span queries
  http_server_write_timeout: 60s # allow longer time span queries

common:
  instance_addr: 127.0.0.1
  path_prefix: /tmp/loki
  storage:
    s3:
      access_key_id: ****
      bucketnames: ****
      endpoint: ****
      insecure: false
      region: ****
      s3forcepathstyle: false
      secret_access_key: ****
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory
querier:
  engine:
    timeout: 600s

query_range:
  parallelise_shardable_queries: true
  results_cache:
    cache:
      memcached_client:
        consistent_hash: true
        addresses: "grafana-memcached:11211"
        max_idle_conns: 16
        timeout: 500ms
        update_interval: 1m
chunk_store_config:
  max_look_back_period: 672h
  chunk_cache_config:
    memcached:
      batch_size: 256
      parallelism: 10
    memcached_client:
      addresses:  "grafana-memcached:11211"

query_scheduler:
  max_outstanding_requests_per_tenant: 8192

limits_config:
  query_timeout: 600s
  retention_period: 72h
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_cache_freshness_per_query: 10m
  split_queries_by_interval: 24h
  # for big logs tune
  per_stream_rate_limit: 4096M
  per_stream_rate_limit_burst: 8192M
  max_global_streams_per_user: 0
  cardinality_limit: 200000
  ingestion_burst_size_mb: 2000
  ingestion_rate_mb: 10000
  max_entries_limit_per_query: 1000000
    #reject_old_samples: true
    #reject_old_samples_max_age: 168h
    #max_query_series: 100000
    #max_query_parallelism: 2
  max_label_value_length: 20480
  max_label_name_length: 10240
  max_label_names_per_series: 300

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 336h     # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: s3
  index_queries_cache_config:
    memcached:
      batch_size: 100
      parallelism: 100
    memcached_client:
      consistent_hash: true
      addresses:  "grafana-memcached:11211"
# TSDB shipper added 19.06.24
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index
    cache_location: /loki/tsdb-cache
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: index_
        period: 24h
# TSDB added 19.06.24
    - from: 2024-06-19
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h


table_manager:
  retention_deletes_enabled: true
  retention_period: 672h
ruler:
  alertmanager_url: http://localhost:9093

# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
#
# Statistics help us better understand how Loki is used, and they show us performance
# levels for most users. This helps us prioritize features and documentation.
# For more information on what's sent, look at
# https://github.com/grafana/loki/blob/main/pkg/usagestats/stats.go
# Refer to the buildReport method to see what goes into a report.
#
# If you would like to disable reporting, uncomment the following lines:
analytics:
  reporting_enabled: false

New Config:

auth_enabled: false

server:
  http_listen_address: 0.0.0.0
  grpc_listen_address: 0.0.0.0
  http_listen_port: 3100
  grpc_listen_port: 9095
  log_level: debug

common:
  path_prefix: /loki
  compactor_address: http://loki-backend:3100
  replication_factor: 1
  storage:
    s3:
      access_key_id: ****
      bucketnames: ****
      endpoint: ****
      insecure: false
      region: ****
      s3forcepathstyle: false
      secret_access_key: ****



storage_config:
  aws:
      access_key_id: ****
      bucketnames: ****
      endpoint: ****
      insecure: false
      region: ****
      s3forcepathstyle: false
      secret_access_key: ****
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    cache_location: /loki/boltdb-shipper-cache
    cache_ttl: 336h     # Can be increased for faster performance over longer query periods, uses more disk space
    shared_store: s3
  index_queries_cache_config:
    memcached:
      batch_size: 100
      parallelism: 100
    memcached_client:
      consistent_hash: true
      addresses:  "grafana-memcached:11211"
# TSDB shipper added 19.06.24
  tsdb_shipper:
    active_index_directory: /loki/tsdb-index
    cache_location: /loki/tsdb-cache
    shared_store: s3

memberlist:
  join_members: ["loki-read", "loki-write", "loki-backend"]
  dead_node_reclaim_time: 30s
  gossip_to_dead_nodes_time: 15s
  left_ingesters_timeout: 30s
  bind_addr: ['0.0.0.0']
  bind_port: 7946
  gossip_interval: 2s

ingester:
  lifecycler:
    join_after: 10s
    observe_period: 5s
    ring:
      replication_factor: 3
      kvstore:
        store: memberlist
    final_sleep: 0s
  chunk_idle_period: 1m
  wal:
    enabled: true
    dir: /loki/wal
      #  max_chunk_age: 1m
      #  chunk_retain_period: 30s
      #  chunk_encoding: snappy
      #  chunk_target_size: 1.572864e+06
      #  chunk_block_size: 262144
      #  flush_op_timeout: 10s

ruler:
  enable_api: true
  enable_sharding: true
  wal:
    dir: /loki/ruler-wal
  evaluation:
    mode: remote
    query_frontend:
      address: dns:///loki-read:9095
  storage:
    type: local
    local:
      directory: /loki/rules
  rule_path: /loki/prom-rules
  remote_write:
    enabled: true
    clients:
      local:
         url: http://prometheus:9090/api/v1/write
         queue_config:
           # send immediately as soon as a sample is generated
           capacity: 1
           batch_send_deadline: 0s

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: index_
        period: 24h
# TSDB added 19.06.24
    - from: 2024-06-19
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  query_timeout: 600s
  retention_period: 72h
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  max_cache_freshness_per_query: 10m
  split_queries_by_interval: 24h
  # for big logs tune
  per_stream_rate_limit: 4096M
  per_stream_rate_limit_burst: 8192M
  max_global_streams_per_user: 0
  cardinality_limit: 200000
  ingestion_burst_size_mb: 2000
  ingestion_rate_mb: 10000
  max_entries_limit_per_query: 1000000
    #reject_old_samples: true
    #reject_old_samples_max_age: 168h
    #max_query_series: 100000
    #max_query_parallelism: 2
  max_label_value_length: 20480
  max_label_name_length: 10240
  max_label_names_per_series: 300

table_manager:
  retention_deletes_enabled: true
  retention_period: 672h

query_range:
  parallelise_shardable_queries: true
  results_cache:
    cache:
      memcached_client:
        consistent_hash: true
        addresses: "grafana-memcached:11211"
        max_idle_conns: 16
        timeout: 500ms
        update_interval: 1m
chunk_store_config:
  max_look_back_period: 672h
  chunk_cache_config:
    memcached:
      batch_size: 256
      parallelism: 10
    memcached_client:
      addresses:  "grafana-memcached:11211"

frontend:
  log_queries_longer_than: 5s
  compress_responses: true
  max_outstanding_per_tenant: 2048


query_scheduler:
  max_outstanding_requests_per_tenant: 8192

querier:
  engine:
    timeout: 600s
  #  query_ingesters_within: 2h
    #query_store_only: true
compactor:
  #  working_directory: /tmp/compactor
  working_directory: /tmp/data/retention
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  retention_delete_worker_count: 150
  delete_request_store: s3

And some log output:

loki-backend-2     | level=info ts=2024-07-02T10:40:37.691103547Z caller=table_manager.go:228 index-store=tsdb-2024-06-19 msg="syncing tables"
loki-backend-2     | level=info ts=2024-07-02T10:40:37.691121483Z caller=table_manager.go:271 index-store=tsdb-2024-06-19 msg="query readiness setup completed" duration=1.155µs distinct_users_len=0 distinct_users=
loki-write-1       | level=debug ts=2024-07-02T10:40:37.672753136Z caller=memcached_client_selector.go:105 msg="updating memcached servers" servers=192.168.112.8:11211 count=1
loki-write-1       | level=info ts=2024-07-02T10:40:37.678052897Z caller=table_manager.go:171 index-store=boltdb-shipper-2020-10-24 msg="handing over indexes to shipper"
loki-write-1       | level=info ts=2024-07-02T10:40:37.678074461Z caller=table_manager.go:136 index-store=boltdb-shipper-2020-10-24 msg="uploading tables"
loki-write-1       | level=info ts=2024-07-02T10:40:37.68322787Z caller=table_manager.go:136 index-store=tsdb-2024-06-19 msg="uploading tables"
loki-write-1       | level=info ts=2024-07-02T10:40:37.712936947Z caller=checkpoint.go:611 msg="starting checkpoint"
loki-write-1       | level=info ts=2024-07-02T10:40:37.713046463Z caller=checkpoint.go:336 msg="attempting checkpoint for" dir=/loki/wal/checkpoint.000022
loki-write-1       | level=info ts=2024-07-02T10:40:37.721097796Z caller=checkpoint.go:498 msg="atomic checkpoint finished" old=/loki/wal/checkpoint.000022.tmp new=/loki/wal/checkpoint.000022
loki-read-2        | ts=2024-07-02T10:40:37.69516016Z caller=spanlogger.go:86 level=info msg="table cache built" duration=31.098309ms
loki-read-2        | level=debug ts=2024-07-02T10:40:37.695180796Z caller=index_set.go:300 table-name=index_19906 user-id=fake msg="updates for table index_19906. toDownload: [], toDelete: []"
loki-read-2        | level=debug ts=2024-07-02T10:40:37.695191958Z caller=index_set.go:293 table-name=index_19906 msg="syncing files for table index_19906"
loki-read-2        | level=debug ts=2024-07-02T10:40:37.695199965Z caller=index_set.go:300 table-name=index_19906 msg="updates for table index_19906. toDownload: [], toDelete: []"
loki-read-2        | level=info ts=2024-07-02T10:40:37.695207569Z caller=table_manager.go:271 index-store=tsdb-2024-06-19 msg="query readiness setup completed" duration=1.08µs distinct_users_len=0 distinct_users=
loki-write-2       | ts=2024-07-02T10:40:38.617808082Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: e235e13283aa-427d1456 192.168.112.4:7946"
loki-read-1        | ts=2024-07-02T10:40:38.617812923Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.11:57406"
loki-backend-1     | level=debug ts=2024-07-02T10:40:38.69837392Z caller=ruler.go:566 msg="syncing rules" reason=periodic
loki-backend-2     | level=debug ts=2024-07-02T10:40:38.886888322Z caller=ruler.go:566 msg="syncing rules" reason=periodic
loki-backend-2     | level=info ts=2024-07-02T10:40:43.713478682Z caller=marker.go:202 msg="no marks file found"
loki-backend-1     | ts=2024-07-02T10:40:50.19997532Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: 3e23034f4e3b-4b7e97c4 192.168.112.3:7946"
loki-read-2        | ts=2024-07-02T10:40:50.199989992Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.5:53522"
loki-read-2        | ts=2024-07-02T10:40:53.666745328Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: e235e13283aa-427d1456 192.168.112.4:7946"
loki-read-1        | ts=2024-07-02T10:40:53.666745325Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.3:33734"
loki-backend-2     | ts=2024-07-02T10:40:55.123864501Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: 3e23034f4e3b-4b7e97c4 192.168.112.3:7946"
loki-read-2        | ts=2024-07-02T10:40:55.123892971Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.12:54706"
loki-read-2        | level=debug ts=2024-07-02T10:40:55.858069096Z caller=reporter.go:202 msg="failed to read cluster seed file" err="failed to get s3 object: NoSuchKey: The specified key does not exist.\n\tstatus code: 404, request id: 7eda9c5b-a55d-1e18-a92c-1402ec963940, host id: "
loki-write-1       | ts=2024-07-02T10:41:01.52259584Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: e235e13283aa-427d1456 192.168.112.4:7946"
loki-read-1        | ts=2024-07-02T10:41:01.522601315Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.13:34708"
loki-read-1        | ts=2024-07-02T10:41:01.93200239Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: bef9e124ddbf-d4dd17e2 192.168.112.5:7946"
loki-backend-1     | ts=2024-07-02T10:41:01.932007162Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.4:41148"
loki-write-2       | ts=2024-07-02T10:41:08.619214608Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: 8dc1559e38a2-ce9a3320 192.168.112.12:7946"
loki-backend-2     | ts=2024-07-02T10:41:08.619253596Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.11:48356"
loki-backend-1     | level=debug ts=2024-07-02T10:41:11.298333406Z caller=reporter.go:202 msg="failed to read cluster seed file" err="failed to get s3 object: NoSuchKey: The specified key does not exist.\n\tstatus code: 404, request id: 512cd884-df16-1bb8-9c1f-1402ec94a470, host id: "
loki-read-1        | level=debug ts=2024-07-02T10:41:17.61631587Z caller=reporter.go:202 msg="failed to read cluster seed file" err="failed to get s3 object: NoSuchKey: The specified key does not exist.\n\tstatus code: 404, request id: 512cd88e-df16-1bb8-9c1f-1402ec94a470, host id: "
loki-backend-2     | ts=2024-07-02T10:41:20.201455064Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.5:44462"
loki-backend-1     | ts=2024-07-02T10:41:20.201440547Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: 8dc1559e38a2-ce9a3320 192.168.112.12:7946"
loki-read-2        | ts=2024-07-02T10:41:23.668158391Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: bef9e124ddbf-d4dd17e2 192.168.112.5:7946"
loki-backend-1     | ts=2024-07-02T10:41:23.668178304Z caller=memberlist_logger.go:74 level=debug msg="Stream connection from=192.168.112.3:56472"
loki-backend-2     | ts=2024-07-02T10:41:25.12600321Z caller=memberlist_logger.go:74 level=debug msg="Initiating push/pull sync with: e235e13283aa-427

I don’t see anything that’s critically wrong. A couple of minor things:

  1. I don’t think the aws block under storage_config is needed.
  2. You should disable the table manager if you are using the compactor.
  3. Your old config disables analytics reporting, your new one doesn’t; that’s where your S3 "NoSuchKey" error on the cluster seed file comes from (see the sketch below).
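For 2 and 3, that would roughly mean the following (a sketch against the old config above, not a complete file):

# let the compactor own retention: remove or comment out the table_manager block
#table_manager:
#  retention_deletes_enabled: true
#  retention_period: 672h

# carry the analytics block over from the old config so the usage reporter
# stops looking for the cluster seed object in S3
analytics:
  reporting_enabled: false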

In your original post you said it “doesn’t work”. What’s not working?

We figured it out. The problem was the tenant: by default it is called “docker”, and we had copied the old logs under a different tenant that the old Loki was using. We reconfigured it, so now we can use the old files :slight_smile: What I meant with “doesn’t work” was that we couldn’t see the copied logs in Grafana. Thank you very much for your time and have a nice rest of the week. We will change the config as you mentioned, and I think the rest is resolved :slight_smile:
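In case someone else hits this: Loki stores chunks under a per-tenant prefix in the bucket, so you can check which tenant your copied objects landed under and then either reconfigure the tenant (what we did) or move the objects. Something along these lines, with an illustrative bucket name and a placeholder for the old tenant ID:

# list the top-level (per-tenant) prefixes in the bucket
aws s3 ls s3://new-loki-bucket/

# if you'd rather move the data than reconfigure, copy it from the old
# tenant prefix to the one the new Loki actually queries
aws s3 sync s3://new-loki-bucket/old-tenant/ s3://new-loki-bucket/docker/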