Loki simple scalable deployment on EC2 nodes

Hi Team,
I am new to Loki and want to set up a cluster in the simple scalable deployment mode, with each component (read, write and backend) deployed on separate EC2 nodes (not on Kubernetes).
I need a configuration for this where the storage is S3, and the KV store is Redis if we can use it.
I would also like to use zstd compression on the log files if possible, and I want to use TSDB.

Do we need the backend component for this setup? I tried a similar setup without it and it is working so far, but I am not confident about it.
It would be great if someone could guide me through it, as the official documentation is helpful for Kubernetes but not for an EC2 setup.
Loki Version: 3.2.2
Thanks!!

Sounds like you have it mostly working; what is your question specifically?

Also, yes, you will want the backend.
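If it helps, the backend is just the same Loki binary started with a different target; a minimal launch sketch, assuming the shared config already sits at /etc/loki/config.yml (path and install location are assumptions):

# hypothetical backend node launch on EC2
/usr/local/bin/loki -config.file=/etc/loki/config.yml -target=backend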

Hi @tonyswumac, thanks for replying. Let me know if my understanding is correct:

  • When we set the target type, for example write, it will create all the components for that target: for write, the distributor and ingester; for read, the querier and query frontend; and for the backend target, the compactor, index gateway, query scheduler and ruler. It will create these components regardless of whether we provide their configuration in the configuration file for the respective target, right?
  • Flow: the application sends logs to the load balancer → write node → distributor component → it checks the KV store for which ingester to forward to (how does it select?) → the ingester ingests the log and keeps the index and the actual log in memory until the flush threshold is breached, after which it flushes both to persistent storage (S3 in this case) → the user queries in Grafana → load balancer for the read nodes → any one read node → the querier component parses the query → fetches the data from persistent storage and caches it → returns it to Grafana.
    Let me know if I am missing something in this flow. Also, I am not exactly getting where the backend comes into play.
    → I am also attaching my configuration file for write and read. Kindly let me know if I did something wrong or need to add something, as I am not confident about the configuration.
---
auth_enabled: true

server:
  http_listen_port: 3100
  grpc_listen_port: 9095

schema_config:
  configs:
    - from: 2025-02-01
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h
      chunks:
        prefix: chunk_
        period: 24h

ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: consul
        consul:
          host: consul-ip:8500
      replication_factor: 1
  chunk_idle_period: 5m
  max_chunk_age: 1h
  chunk_encoding: zstd
pattern_ingester:
  enabled: true

common:
  compactor_address: localhost:9095
  path_prefix: /loki
  ring:
    kvstore:
      store: consul
      consul:
        host: consul-ip:8500
    instance_addr: 127.0.0.1
    replication_factor: 1


limits_config:
  retention_period: 168h
  max_query_lookback: 168h
  max_query_parallelism: 256
  max_query_length: 168h
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 30

storage_config:
  aws:
    bucketnames: loki-stg-logs
    region: ap-south-1
    s3forcepathstyle: true
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    cache_ttl: 24h

compactor:
  retention_enabled: true
  delete_request_store: s3
  working_directory: /loki/compactor

Questions:

  1. Since I don’t have the compactor (backend component), it doesn’t matter whether I have this configuration or not, right?
  2. This configuration is not generating log files with zstd compression, or any compression. Why?
  3. All of the log files are going into the fake folder even when I set the X-Scope-OrgID header. Why?
  4. I saw a chunk directory on the read nodes but not on the write nodes; is that the data it reads from S3 and saves locally?
  5. In S3 I am getting an index directory, a fake directory and cluster_seed.json. Is there anything else that is supposed to be there, or is that all?
  6. I need to scale it to handle roughly 100-150 GB of logs daily; how can I tune it?
  7. Read nodes don’t get registered in Consul, right? (I know, stupid question)
  8. There was a gossip method as well; since I am using Consul, it doesn’t use gossip now, right?
  9. Can you please also provide a sample config file for the backend as well?
    Thank you so much!!

Yes.

Double check the configuration from your Loki container directly by hitting the /config API endpoint.
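For example, something along these lines on the node itself (the port is whatever you set for http_listen_port, 3100 in your config):

# print the effective runtime config and check which encoding is actually applied
curl -s http://localhost:3100/config | grep chunk_encoding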

Double check the configuration from your Loki container directly by hitting the /config API endpoint. Also double check that the header is actually getting to Loki; if you are using a load balancer, some of them strip headers.
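One way to rule the load balancer out is to push a test line straight to a write node with the header set; a rough sketch, assuming port 3100, GNU date, and a made-up tenant name team-a:

# push one line directly to a write node, bypassing the load balancer
curl -s -X POST "http://<write-node-ip>:3100/loki/api/v1/push" \
  -H "Content-Type: application/json" \
  -H "X-Scope-OrgID: team-a" \
  --data-raw '{"streams":[{"stream":{"job":"header-test"},"values":[["'"$(date +%s%N)"'","hello from curl"]]}]}'

If logs pushed this way end up under a team-a/ prefix (after flush) instead of fake/, Loki itself handles the header fine and the load balancer is the place to look.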

Not sure. Are there files under the directory?

You should get one directory per tenant.
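So with only the fake tenant, the top of the bucket would look roughly like this (listing is illustrative):

aws s3 ls s3://loki-stg-logs/
#   PRE fake/          <- chunks, one prefix per tenant
#   PRE index/         <- tsdb index uploaded by the shipper
#   cluster_seed.json  <- cluster metadata written by Loki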

Scaling ingestion traffic is pretty easy; mostly you just scale up the write containers.
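For a sense of scale, 100-150 GB/day averages out to roughly 1-2 MB/s, so it is not a huge load; the main knobs are in limits_config plus simply adding write nodes behind the load balancer. The values below are illustrative, not recommendations:

limits_config:
  ingestion_rate_mb: 20             # per-tenant average ingest rate in MB/s (illustrative)
  ingestion_burst_size_mb: 40       # per-tenant burst allowance (illustrative)
  per_stream_rate_limit: 5MB        # cap for any single stream (illustrative)
  per_stream_rate_limit_burst: 20MB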

I believe read nodes also need to be registered, but I don’t use Consul, so I could be wrong.

Correct.

All Loki components can use the same configuration file. You don’t need a separate one for the backend.
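In other words, on EC2 you ship the identical config file to every node and only the -target flag differs, for example via a systemd unit per node; a sketch (unit name, user and paths are assumptions):

# /etc/systemd/system/loki.service -- only the -target value changes per node
[Unit]
Description=Loki (simple scalable, one target per node)
After=network-online.target

[Service]
User=loki
ExecStart=/usr/local/bin/loki -config.file=/etc/loki/config.yml -target=read
Restart=on-failure

[Install]
WantedBy=multi-user.target

Swap -target=read for write or backend on the respective nodes.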

Also, a couple of things I noticed:

  1. You will definitely want a backend target. It has the compactor and ruler, and you’d want at least the compactor.
  2. If you are running Loki as a cluster, you should not advertise the instance address as 127.0.0.1. In a ring, all Loki instances need dedicated IPs; if you set everything to 127.0.0.1 then there is no cluster. You should check /ring and make sure your cluster has actually formed (a quick check is sketched below).
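You can see whether the ring actually formed from any node:

# the ring page should list every ingester with its real private IP
curl -s http://<node-ip>:3100/ring

And instead of hard-coding instance_addr per machine, one option is to advertise the address of the node's NIC; the interface name below is an assumption and the placement just mirrors your existing instance_addr, so double-check it against the config reference for your version:

common:
  ring:
    instance_interface_names:
      - ens5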

Hi @tonyswumac,
I got the requirement for the compactor, thanks, that was helpful.
Another problem I saw: I checked the /config endpoint and there chunk_encoding is indeed zstd (I also tried gzip), but the files I am seeing in S3 look like 194f512e8a5:194f512ede0:1053aa25 and the data is encoded but there is no extension. I need some sort of proof that it is getting compressed in the desired format. How can I make sure of that?
PS: I can see the data fine in Grafana. I tried to download a file and check the encoding using a tool, but it says the format is not correct.

Also, I have a confusion: if some chunk is not big enough to get flushed to S3, it should not be visible in Grafana until it is written to S3, right? But I am seeing logs in Grafana as soon as they are sent to the Loki write node. So I am confused, how?

Loki will not write chunks with a file extension. I’ve never tried to do this, but there is a chunk-inspection tool that might get you what you need: loki/cmd/chunks-inspect at main · grafana/loki · GitHub
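If you want to try it, the rough workflow would be: build the tool from the Loki source at your version, pull one chunk object down from S3, and point the tool at it; flags and output differ between versions, so treat this as a sketch:

# build the inspector from the source tag matching your Loki version
git clone --branch v3.2.2 --depth 1 https://github.com/grafana/loki.git
cd loki/cmd/chunks-inspect && go build .

# copy one chunk object out of the bucket (the key below is just an example based on your listing) and inspect it
aws s3 cp "s3://loki-stg-logs/fake/15797409fd712de2/194f512e8a5:194f512ede0:1053aa25" ./chunk
./chunks-inspect ./chunk

The output should include the block encoding, which is the confirmation you are after.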

The Loki querier will query the ingesters for the period configured by query_ingesters_within. This is precisely to deal with the scenario where logs are on the ingester but not yet flushed to S3.
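For reference, that setting lives under the querier block; 3h is the usual default, so anything newer than that window is served from ingester memory even before it reaches S3:

querier:
  query_ingesters_within: 3h   # queriers also ask ingesters for data newer than this window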

Hi @tonyswumac
Yeah, I tried exactly that tool; it throws an error, failed to read compression: unknown format: 4. It is able to read the labels, UserID and other metadata, but not the compression type or the log itself. Is there any other way to be sure of the compression type of the logs?

Also, I am getting index files on the filesystem and in S3 both. I want index files only on the filesystem and not in S3. In case that is not possible, if I delete the index files from S3 but not from the filesystem, will it break the cluster?

Can I store chunk files at a custom path? For example at a path like /ServiceName/Environment/Year/Month/Date/ in S3?

You could inspect the code and write your own. I don’t know how to assist further on this; it is not a concern for me and I’ve not had to inspect this.

You configured S3 storage, therefore both index and chunks will be stored in S3. You cannot split them. Whatever you see on your filesystem is temporary.

Got it, thanks!!
One last question:
Can I store chunk files at a custom path? For example at a path like /ServiceName/Environment/Year/Month/Date/ in S3?
I tried changing the index prefix and chunks prefix, but I am not able to change the path for chunks, and I want the path to follow the timestamp. Is there any way to achieve this?
The path I am getting is /tenant_id/15797409fd712de2/194fbc1e019:194fbd55aa3:5b6a905f
I want to change this.

I don’t think this is possible. Even if it were, I would still recommend you not do it. Use a dedicated S3 bucket for Loki, and let Loki handle the layout.