How to scale out a Loki cluster on physical servers, since I am not using Docker or k8s?

Hello,

I am currently running a single instance of Loki on Linux, and our production environment has hit the single-node limit.
We run Loki as a service. Please let me know how we can scale it.

Please see "Loki deployment modes" in the Grafana Loki documentation.

Things to watch out for:

  1. You’d probably want one server per Loki instance.
  2. Because you’ll need memberlist to join the Loki instances into a cluster, you’ll want fixed private IPs, along with DNS records that can be used for service discovery.

Thank you so much Tony. Yes, you're correct.
I have read the deployment modes article, but I didn’t find any detailed documentation for this setup. It would be great if you could guide me on how to achieve this.

Thanks in advance.

I don’t run Loki on physical servers, so I can only give you some rough ideas. But before you consider doing that, remember you’ll need some sort of backend storage that all your Loki nodes can access. If you are running on-premises this usually means MinIO. You can use NFS too, but I suspect you’d get much worse performance with NFS.
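
For what it's worth, here is a rough sketch of what the shared-storage part of the config could look like with MinIO. The endpoint, bucket name, and credentials below are placeholders I made up, so swap in your own. I haven't run this exact config on physical servers.

```yaml
# Sketch only: every Loki node would carry the same storage block.
common:
  storage:
    s3:
      # Hypothetical MinIO endpoint reachable from all Loki nodes
      endpoint: minio.example.internal:9000
      # Hypothetical bucket name; create it in MinIO beforehand
      bucketnames: loki-data
      access_key_id: <MINIO_ACCESS_KEY>
      secret_access_key: <MINIO_SECRET_KEY>
      # MinIO is typically addressed path-style rather than by bucket subdomain
      s3forcepathstyle: true
      # Set to true only if the endpoint is served over plain HTTP
      insecure: true
```

With something like this in place, the object_store in schema_config would have to be s3 instead of filesystem for any period that should live in shared storage.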

  1. Since you are not using containers, I would recommend scaling with monolithic deployment. But I’d still recommend at least checking out a container platform (you don’t need to be too fancy, even Docker Swarm is probably sufficient, although I haven’t tried).
  2. Deploy a set number of nodes.
  3. Create a DNS record (let’s say loki1.local) that points to the nodes you created. This is for service discovery, so the DNS record needs to point to all nodes. For example:
loki1.local. 60 A   <node1 ip>
loki1.local. 60 A   <node2 ip>
  4. Configure join_members like so:
memberlist:
  <OTHER_CONFIG>
  join_members:
  - dns+loki1.local:<GOSSIP_PORT (default 7946)>

That’ll give you a cluster. The rest is making sure all your nodes are configured the same way and can access the backend storage.

Hey, thanks a lot Tony. I have deployed multiple instances on different servers. Since I am using the latest Loki version, I need to pass the member details at the service level. Everything looks fine, but when I check the logs in Grafana I can only see the logs from loki1. Why is that? It should show the data from both Loki instances.

Below is the current service file for Loki1 running on host1:

[Unit]
Description=Loki service
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=loki
ExecStart=/usr/bin/loki --config.file /etc/loki/config.yml
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec = 120
Restart = on-failure
RestartSec = 2

[Install]
WantedBy=multi-user.target


Service file for Loki2 running on host2:
[Unit]
Description=Loki service
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=loki
ExecStart=/usr/bin/loki --config.file /etc/loki/config.yml \
                        --memberlist.bind-addr='host1-IP' \
                        --memberlist.bind-port=7946
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec = 120
Restart = on-failure
RestartSec = 2

[Install]
WantedBy=multi-user.target

Not quite sure what you mean by this.

Check the /ring endpoint on each of your Loki instances to verify that a cluster has formed.

Hello Tony,

On both of my instances I am getting this message.

The ring endpoint on each instance also shows only the instance itself.


Please advise what I am missing here.

Below is my config file.

auth_enabled: false

server:
  http_listen_port: 3100
  log_level: debug

common:
  instance_addr: 127.0.0.1
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: memberlist

memberlist:
  node_name: loki-node-61
  join_members:
    - hostip:7946


ingester_rf1:
  enabled: false

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

pattern_ingester:
  enabled: true
  metric_aggregation:
    enabled: true
    loki_address: localhost:3100

ruler:
  alertmanager_url: http://localhost:9093

frontend:
  encoding: protobuf

In order to join two nodes into a cluster, you’ll need a DNS record with two values, pointing to both of your instances, then use service discovery to join them into a cluster. Please review my replies further above, I believe I did put down something on this.
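
As a rough illustration of how that maps onto the config you posted, assuming the loki1.local record from my earlier reply and the default gossip port. The instance_addr change is my guess at a second problem: with 127.0.0.1, each node would advertise a loopback address to the ring, which the other node can't reach.

```yaml
common:
  # Assumption: replace with this node's own private IP (differs per node)
  instance_addr: <THIS_NODE_PRIVATE_IP>
  ring:
    kvstore:
      store: memberlist

memberlist:
  # Default gossip port; must be open between the nodes
  bind_port: 7946
  join_members:
    # One DNS name that resolves to every node in the cluster
    - dns+loki1.local:7946
```

Everything else should be identical on both nodes, and the filesystem storage under /tmp/loki would still need to be replaced with storage both nodes can reach.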