Roadmap - Loki in a multi-user environment

Today we use Grafana OSS in our Kubernetes cluster to provide metrics to our users. We are planning to add Loki as a log aggregation tool, but want to restrict access so that users/teams can only query logs from their own designated Kubernetes namespace.

What is the best approach to reach our goal on this?

On the Loki side, I would recommend enabling multi-tenancy (see Multi-tenancy | Grafana Loki documentation). This gives you logical separation of logs in Loki based on organization ID. You can then use Promtail’s tenant stage to set the tenant/organization ID. This is a good read: Multi-tenancy with Loki, Promtail, and Grafana demystified | by Sander Rodenhuis | ITNEXT. There is a bit of nuance to this, which I’ll discuss at the bottom.

End users should not be allowed to connect directly to the Loki API endpoint at all, so user authc/authz should happen on the Grafana side. If you are already using Grafana organizations, this is easy: create a Loki data source in each Grafana organization, each with a different tenant ID. If not, you may have to look into some sort of role-based access (which is rather limited in OSS).

Lastly, multi-tenancy does not actually enable authentication on Loki; it only makes Loki require the organization header. You can of course treat that header as a sort of API key, but this has security implications (imagine your tenant ID or namespace is called “abc”; that wouldn’t be a very good API key at all). So a common practice is to put an Nginx instance in front of Loki and let authentication happen there.


Thanks Tony
I will certainly investigate this path; it might well solve our challenge.

Thanks again :slight_smile:


@tonyswumac Thank you for this explanation. I have one more question for clarification. For example, I have Org A and Org B within Grafana. Several tenant IDs are defined in Grafana-Agent-Flow, and multi-tenancy is enabled in Loki. In Org A I set the HTTP header field “X-Scope-OrgID” = “tenant id 1” and that connects, but when I switch to Org B and try to set up a Loki data source with “X-Scope-OrgID” = “tenant id 1”, I get back

Unable to fetch labels from Loki (Failed to call resource), please check the server logs for more details

Any idea what’s happening here? Or things to check for? Thanks!

That probably means there are no logs in org B. After enabling multi-tenancy you should see chunks going into a separate directory structure per tenant on your chunk storage. Take a look at /orgb on your chunk storage and see whether there is actually any data there.
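Another way to check this, independent of Grafana, is to ask Loki directly which label names it has for a given tenant. Below is a minimal Python sketch of that idea. It assumes Loki is reachable without authentication at a hypothetical address such as http://loki:3100; the function names are made up for illustration. Note that the labels endpoint only looks at a recent time window by default, so an empty list can also mean "no recent data".

```python
import json
import urllib.request


def build_labels_request(base_url, tenant_id):
    """Build a GET request for Loki's label-names endpoint, scoped to one tenant."""
    url = base_url.rstrip("/") + "/loki/api/v1/labels"
    # With multi-tenancy enabled, Loki reads the tenant from this header.
    return urllib.request.Request(url, headers={"X-Scope-OrgID": tenant_id})


def fetch_labels(base_url, tenant_id, timeout=10.0):
    """Return the label names Loki reports for the tenant (empty list if none)."""
    request = build_labels_request(base_url, tenant_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # The "data" field may be missing or null when the tenant has no labels.
    return body.get("data") or []


# Example usage (hypothetical address and tenant IDs, adjust to your setup):
#   for tenant in ("tenant_1", "tenant_2"):
#       print(tenant, fetch_labels("http://loki:3100", tenant))
```

If a tenant ID returns labels here but the same ID fails in a Grafana data source, the problem is more likely on the Grafana side (header name, header value, or URL) than in Loki.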

Thanks for the pointer. It looks like there aren’t Org IDs within S3, just the directories for each tenant ID.
As a follow-up, how are the Org IDs defined? I think maybe I’m conflating Grafana Orgs with what we’re talking about for the Loki X-Scope-OrgID.
Theoretically, what I’m trying to accomplish is defining a Loki data source for each Grafana Org and using the HTTP header to access a single (or more) tenant ID(s).

There isn’t anything that magically links a Grafana org and a Loki tenant together. So you should first get Loki tenants working (preferably with authentication instead of just the HTTP header).

Once you have that, you can create an individual Loki data source in each Grafana org with the tenant ID (or simple authentication) that you want.

Here is an overview of how we approached this problem. On the Loki side, we defined a list of tenants with a list of users. For example:

  • tenant_1: [t1_user1, t1_user2]
  • tenant_2: [t2_user1, t2_user2]

We then use an Nginx instance in front of the Loki endpoints for authentication. We use locally defined users with simple authentication, but we add a map so that a request authenticated as t1_user1 or t1_user2 gets the corresponding Loki tenant header. For example:

http {
  map $remote_user $loki_org_id {
    t1_user1 tenant_1;
    t1_user2 tenant_1;
    t2_user1 tenant_2;
    t2_user2 tenant_2;
  }

  server {
    listen               80;
    client_max_body_size 100m;

    location = /loki/api/v1/push {
      auth_basic           "Authenticate";
      auth_basic_user_file /etc/nginx/.htpassword;
      proxy_set_header     X-Scope-OrgID $loki_org_id;

      proxy_pass http://loki$request_uri;
    }

    # Other Loki API endpoints such as tail or catch all /loki/api/.*
  }
}

What this gives you is an authentication layer on Nginx that maps users to tenants with a pre-defined map. Once you have that, you can configure one data source per tenant (using different users for authentication), and create those data sources in the Grafana orgs where access should be granted.
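To smoke-test a setup like this, you can push a log line through Nginx with basic auth only and let the map pick the tenant. The Python sketch below is just an illustration of that idea: the gateway address, the password, and the function name are assumptions, while the push payload follows Loki's documented JSON format (a list of streams, each with labels and [timestamp in nanoseconds, line] pairs).

```python
import base64
import json
import time
import urllib.request


def build_push_request(gateway_url, username, password, labels, line):
    """Build a POST to the push endpoint with basic auth and no tenant header."""
    payload = {
        "streams": [
            {
                "stream": labels,
                # Each value is [<unix epoch in nanoseconds, as a string>, <log line>].
                "values": [[str(time.time_ns()), line]],
            }
        ]
    }
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        gateway_url.rstrip("/") + "/loki/api/v1/push",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Nginx derives X-Scope-OrgID from this user via the map block.
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Example usage (hypothetical gateway address and password):
#   req = build_push_request("http://nginx", "t1_user1", "secret",
#                            {"job": "smoke-test"}, "hello from t1_user1")
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status)
```

On success Loki should answer with an empty 2xx response, and the line should then show up only under the tenant that the user maps to (tenant_1 for t1_user1 in the map above).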

This is how we did it. You can of course forgo the Nginx part and rely on the header alone for the tenant ID. The idea is the same: get tenants working on Loki, then configure one data source per tenant as needed under your Grafana orgs.
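For the header-only variant, the per-org data source can also be created through Grafana's HTTP API instead of the UI. The following Python sketch shows roughly what that looks like; the Grafana address, token, and function names are assumptions for illustration. It relies on the custom-header fields (httpHeaderName1 / httpHeaderValue1) that Grafana data sources use, and on the fact that API tokens are scoped to one org, so a token created in Org B creates the data source in Org B.

```python
import json
import urllib.request


def loki_datasource_payload(name, loki_url, tenant_id):
    """Describe a Loki data source that always sends a fixed X-Scope-OrgID."""
    return {
        "name": name,
        "type": "loki",
        "access": "proxy",  # Grafana's backend talks to Loki, not the browser
        "url": loki_url,
        "jsonData": {"httpHeaderName1": "X-Scope-OrgID"},
        # Header values are stored encrypted, hence secureJsonData.
        "secureJsonData": {"httpHeaderValue1": tenant_id},
    }


def build_create_request(grafana_url, api_token, payload):
    """Build the POST that creates the data source in the token's own org."""
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


# Example usage (hypothetical URL and token; one token per Grafana org):
#   payload = loki_datasource_payload("Loki (tenant_1)", "http://loki:3100", "tenant_1")
#   req = build_create_request("http://grafana:3000", "<token from Org B>", payload)
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status)
```

Nothing stops two orgs from using the same tenant ID here; the payload is identical and only the token (and therefore the org) changes.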

Yep, I’m looking to go super simple like you mentioned at the end.
I have the data source in Grafana Org A working with just the HTTP header pointed at a single tenant.

But when I switch to Org B for set up, it connects to Loki but can’t retrieve that same tenant.

Would I need to add anything to the agent.yaml or the Loki config?
I so appreciate your help with all the newbie questions.

In Org B, what are you using for value for the X-Scope-OrgID header? Make sure that corresponds to an existing tenant in Loki (with data).

Yep! Tenants are named after the applications the logs are coming from, and I can see those in S3. For testing I’m just trying to reuse the tenant that I know is working in Org A inside the data source of Org B. Is there any issue with multiple orgs connecting to the same tenant that you’re aware of?

Not that I am aware of. We currently do this, although we use username/password with tenant ID mapping instead of the header directly. Even so, I don’t see why that would be a problem.

Might need to start checking logs then, both on Loki and Grafana sides.
