We are evaluating different approaches to run Loki in a multi-environment setup where there is no direct connectivity between Dev/Stage/Live. We have several clusters per environment, plus an S3 bucket for each environment.
Our idea for the simple scalable mode approach:
run the Loki write target in the main cluster of Dev and Stage,
run both Loki read and write targets in the Live environment (main), which will also host Grafana,
have the Loki read target in Live join the memberlist with the write endpoints from each environment,
secure the Dev/Stage Loki write endpoints with authentication and mTLS,
then connect Grafana to the main Loki read target in Live.
Logs should be stored in a separate S3 bucket per environment.
Querying Dev logs: Grafana → Loki read → Loki write (Dev) → fetch logs from the Dev S3 bucket + the Dev ingester.
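To make the per-bucket idea concrete, each environment's Loki config would point at its own bucket. A minimal sketch for the Dev write target, assuming recent Loki with TSDB storage (bucket name and region are placeholders, not from the thread):

```yaml
# Dev environment only; Stage and Live would point at their own buckets.
common:
  storage:
    s3:
      bucketnames: dev-loki-chunks   # hypothetical Dev-only bucket
      region: eu-west-1              # placeholder region
schema_config:
  configs:
    - from: "2024-01-01"
      store: tsdb
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h
```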
Can you please give us some advice/guidance on whether this will work or not?
If you don’t have direct connectivity between your environments then you probably have to operate one Loki cluster in each environment, and you will need both read and write targets.
Then you’ll need a Grafana instance somewhere, which will need connectivity to your Loki clusters. This will be a problem as well if you don’t have connectivity between your environments.
We want a single datasource in Grafana for all environments; that's why we want the read target in Live connected to the write targets in the other environments, plus the write target from Live.
Is this feasible or not? If it's not, could you please explain why?
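For what it's worth, the single-datasource part is just standard Grafana provisioning pointing at whichever read endpoint ends up serving all queries; roughly like this (the service URL is an assumption about an in-cluster read service):

```yaml
# Grafana datasource provisioning; URL assumes a Live in-cluster read service
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-read.loki.svc.cluster.local:3100
```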
Since there's no connectivity between environments, we want to make the Loki write endpoints public and secure them with mTLS.
In my opinion you would be much better off operating one Loki cluster in your main environment along with Grafana.
Let's first consider your first potential solution: having write targets in the Dev and Stage environments. Your biggest problem is security. mTLS is good, but on its own it won't fully secure your endpoint, so you would also have to restrict the public endpoint with ingress IP whitelisting. The second problem is cost: with most cloud providers, each elastic IP incurs an additional charge.
If you are already considering public connectivity, it would be much easier to operate your entire Loki cluster in your main environment. Keep everything internal, but expose the write endpoint API via an external load balancer, and whitelist the egress IPs of your Dev and Stage environments.
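With ingress-nginx, exposing only the push API with source-IP whitelisting could look like this sketch (hostname, IPs, and service name are placeholders):

```yaml
# Expose only /loki/api/v1/push, restricted to known egress IPs (placeholders)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: loki-write
  annotations:
    nginx.ingress.kubernetes.io/whitelist-source-range: "198.51.100.10/32,203.0.113.20/32"
spec:
  ingressClassName: nginx
  rules:
    - host: loki-write.example.com
      http:
        paths:
          - path: /loki/api/v1/push
            pathType: Prefix
            backend:
              service:
                name: loki-write
                port:
                  number: 3100
```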
Having complete isolation between environments is one of our PCI DSS requirements.
We could have a single central Loki exposed to the internet, secured with basic authentication + mTLS, and add its endpoint to the Promtail configs in each cluster.
But for PCI DSS we want to isolate each environment's data in its own S3 bucket, and also make the setup more scalable.
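On the Promtail side, the client section supports both basic auth and mTLS out of the box; a sketch for a Dev cluster (URL, tenant, and file paths are placeholders):

```yaml
# Promtail client config: push to the central Loki over HTTPS
clients:
  - url: https://loki-write.example.com/loki/api/v1/push  # placeholder hostname
    tenant_id: dev                       # optional: one tenant per environment
    basic_auth:
      username: promtail
      password_file: /etc/promtail/secrets/password
    tls_config:
      ca_file: /etc/promtail/tls/ca.crt
      cert_file: /etc/promtail/tls/client.crt
      key_file: /etc/promtail/tls/client.key
```

Using a distinct `tenant_id` per environment would keep logs logically separated even within one cluster, though it would not by itself satisfy a one-bucket-per-environment requirement.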
We can test the following setup:
→ Operate a Loki cluster (deploying all simple scalable mode components) in the main environment.
→ Keep main Loki read ↔ Grafana communication internal.
→ Expose the write component of each environment to the internet via a load balancer.
→ Add the write endpoints to the memberlist.join_members parameter of the Loki ConfigMap so they join the cluster.
→ Whitelist the IPs of the main Loki read in the Dev/Stage write Lokis.
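For the memberlist step above, the config fragment would look something like this (hostnames are placeholders). Note that memberlist gossip defaults to port 7946 and requires bidirectional connectivity between all members, which is hard to reconcile with strict environment isolation:

```yaml
# Live cluster memberlist, joining remote write LBs (placeholder hostnames)
memberlist:
  bind_port: 7946
  join_members:
    - loki-memberlist.loki.svc.cluster.local:7946  # in-cluster members
    - loki-write.dev.example.com:7946              # Dev write LB
    - loki-write.stage.example.com:7946            # Stage write LB
```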
I don't understand your comment here; you are already considering exposing the write endpoint between environments.
If you operate the Loki cluster in one environment and expose the API, you have one exposed endpoint. If you operate Loki writers in all environments and expose the writer gRPC port, you have (number of writers × number of environments) exposed endpoints, not to mention your Loki cluster now spans multiple environments instead of being confined to one.
If you want, share a rough diagram and we can discuss further.
I'm struggling to make the Loki writers in the main k8s clusters in Dev/Stage join the Loki cluster in the main k8s cluster in the Live environment.
Below is the diagram for the POC that I'm working on:
If you really want to do this, then you should configure each writer as a separate cluster. In your Loki cluster in the Live environment you would simply live with the fact that logs from Dev and Stage are delayed: you won't see them until they have been flushed to the S3 buckets. I would strongly recommend against this. It's theoretically possible, but I've not done it, and I don't think it's a good idea.
All you are exposing is the Loki API through an application load balancer. You can even add whitelisting on the ALB, and that is already less attack surface than your diagram.