This works very well for the version of Grafana given in the yaml file (image: docker.io/grafana/grafana:8.5.6), but I have not been able to make it work with any 9.x version of Grafana, so something seems to break between the two major versions.
With Grafana 9.5.3 I am able to retrieve the labels when I explore the Loki datasource in Grafana, but no data is shown.
I have looked through the Grafana release notes and enabled more logging in Grafana without being able to figure out what goes wrong. That said, I'm a beginner with both Loki and Grafana, so I might have missed something.
Has anybody managed to get it working with a 9.x version of Grafana?
If so any pointers that could lead to a solution would be appreciated.
Thanks for the reply. Here are some screenshots. The first two are from Grafana 9.5.3, where the labels are retrieved but no data is found. The last one is from Grafana 8.5.6 (the image from the example), and there the same query returns data.
I noticed that if I press the "live" button I get data (I don't know why I haven't tried that before). It therefore seems to be a problem related to the time range when I do a search.
If you suspect the time range is the problem, have you tried specifying different time ranges to see if you get data? When you press "live", what's the latest timestamp on the last log? Do new logs continue to come in?
I tried changing the time range, without any luck (including a time range a day in the past, to rule out a time offset causing the problem). I still don't get data this way.
The entries retrieved when doing live look fine. At 7:42 local time I got the following entry:
We updated to the latest version of Red Hat's Loki Operator yesterday (v5.7.2), which didn't make a difference.
As far as we can tell from the images, this should result in Loki version 2.8.x being installed on the cluster. Since we are not sure, we have asked Red Hat, and I will get back when we have an answer from them.
We tried doing direct API calls, and that seems to work. The only differences between our API calls and the way Grafana does them are a) that we go through an OpenShift route and b) that I authenticate using a token, while Grafana is currently configured to authenticate with a CA cert (see screenshot below). Given that we get the labels and things work in Grafana 8.5.6, this is what I would have expected. For example we tried:
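Roughly along these lines (the hostname, namespace label and time range are placeholders, and the tenant path assumes the usual LokiStack gateway layout; a sketch, not the exact command we used):

```bash
# Sketch only: hostname, label selector and time range are placeholders.
# Assumes the LokiStack gateway exposes /api/logs/v1/<tenant>/loki/api/v1/...
TOKEN=$(oc whoami -t)

curl -s -G \
  --cacert ./route-ca.crt \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://lokistack-gateway.apps.example.com/api/logs/v1/application/loki/api/v1/query_range" \
  --data-urlencode 'query={kubernetes_namespace_name="my-namespace"}' \
  --data-urlencode "start=$(date -d '-1 hour' +%s)000000000" \
  --data-urlencode "end=$(date +%s)000000000"
```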
I create the datasource using datasource provisioning, as done in the example in the Loki Operator documentation (see the link in the post above). My current datasource definition looks like this:
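(Reproduced here only in outline; the name, URL and CA environment variable are approximations of the Loki Operator example rather than the verbatim definition:)

```yaml
# Sketch of a provisioned Loki datasource using CA-cert auth; the name, URL
# and environment variable are approximations, not the exact definition.
apiVersion: 1
datasources:
  - name: Loki (Application)
    type: loki
    access: proxy
    url: https://lokistack-gateway-http.openshift-logging.svc:8080/api/logs/v1/application/
    jsonData:
      tlsAuthWithCACert: true
    secureJsonData:
      tlsCACert: ${LOKI_DATASOURCE_CA}
```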
The environment variable is set in the deployment. Here I made a discovery: at some point while trying to get things to work, I had added an extra certificate to this environment variable. I did this because I got the following error while retrieving data (but not when retrieving labels):
It seems that an OAuth flow is triggered when retrieving data. The strange thing is that the CA for the certificate that fails is present among the container's CAs, and as I understand it, Go should use these certificates.
In the 8.5.6 container it hasn't been necessary to add the extra certificate. Could my problems be related to this?
To be honest, I have had trouble finding good, detailed information about the different authentication methods in the datasources, so any links would be appreciated.
Proxy
We are accessing Loki through a service internally on the OpenShift cluster, so there shouldn't be any proxy. But given the error above, I suspect that Red Hat might be using Observatorium.
Do you know if there is any way to configure Grafana to log all the requests that the datasources make? I have already enabled datasource logging and plugin logging, and changed the log level to trace.
You can check in your browser's developer tools (for Chrome, under the Network tab); sometimes that gives you some information.
I haven't used Loki on OpenShift before, but since you were able to authenticate to Loki using a token, can you try configuring the data source with a token as well?
I was more after logging of the requests between the Grafana backend and the Loki backend.
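For reference, this is roughly what has been enabled in grafana.ini so far (a sketch, not the exact configuration):

```ini
# Sketch of roughly what has been enabled so far; not the exact configuration.
[log]
level = trace

[dataproxy]
# Logs the requests the Grafana backend proxies to data source backends.
logging = true
```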
I tried using the token as you suggested (configured in the same way as the Prometheus datasource in my previous post). This works and I get data. My problem therefore seems to be related to authentication or authorization when retrieving data from the backend (strange that I can still get the labels).
Unfortunately I can't configure the datasource with a fixed token, given that I need the multi-tenancy that the other method should provide. But at least we now know what the problem relates to, which in itself is progress.
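For completeness, the working fixed-token variant looks roughly like this (header-based bearer auth; the URL and token value are placeholders):

```yaml
# Sketch of the fixed-token variant (works, but defeats multi-tenancy);
# URL and token value are placeholders.
apiVersion: 1
datasources:
  - name: Loki (token test)
    type: loki
    access: proxy
    url: https://lokistack-gateway-http.openshift-logging.svc:8080/api/logs/v1/application/
    jsonData:
      httpHeaderName1: Authorization
    secureJsonData:
      httpHeaderValue1: Bearer ${LOKI_BEARER_TOKEN}
```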
Yeah, you might want to ask someone with more experience on OpenShift. Could also be a question for the Grafana forum, since it's less likely to be directly related to Loki.
I have gotten a lead through another channel and am working on it now. The trick seems to be to configure Grafana to use the OpenShift OAuth server (without using the proxy) and to set the oauthPassThru property in the datasource definition.
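In the datasource definition that part boils down to a single jsonData flag (sketch; the other datasource fields stay as before):

```yaml
    jsonData:
      # Forward the logged-in user's OAuth access token to Loki instead of a
      # fixed credential (sketch; other datasource fields omitted).
      oauthPassThru: true
```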
Did you make some progress here with OAuth, as you hinted in your last update?
I myself am struggling with similar issues on OpenShift 4.10, connecting an external Grafana to OpenShift Logging / Loki.
So I am dealing with the same issue and managed to get it to work with a workaround, and I am in the process of looking for a better approach. I am new to Loki and am trying to first get it working at least somehow and deal with security later.
So the main issues that I have encountered are:
OpenShift Loki has authentication enabled (a client cert is required)
logs are split into 3 “organizations”: application, audit, infrastructure
I managed to get it to work by stealing the client certificate that is presumably used by the OpenShift Console to retrieve logs:
This is far from ideal but might help some of you kickstart a better solution. I want to avoid having to use TLS client auth, and I can't really do this with a GitOps approach; plus I am worried these keys get rotated by the operator.
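Roughly, that meant pulling the cert/key pair out of a secret in the logging namespace and using it for TLS client auth in the Grafana datasource (a sketch; the secret name below is a placeholder you have to find yourself):

```bash
# Sketch only: the secret name is a placeholder. List the secrets in the
# logging namespace first and pick the one holding the client cert/key pair.
oc -n openshift-logging get secrets | grep -i -e gateway -e tls

# Extract tls.crt and tls.key from the chosen secret into the current directory.
oc -n openshift-logging extract secret/<client-cert-secret> --keys=tls.crt,tls.key --to=.
```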
Hi davtex,
Thanks for updating us with your findings, and great that it works for you in some way.
That's one way of getting it to work with Grafana > v8.5, but I think that way you lose the application/namespace authorization feature, as you are skipping the gateway pod and connecting directly to the query-frontend pod.
Meaning a user can see all application logs; there's no restriction on which namespaces they are permitted to access via RBAC, right?
If that works for you, great, but I really need to restrict app log access based on RBAC permissions.
At least now I know for sure that the rather simple and easy setup using OpenShift's oauth-proxy and Grafana's auth-proxy authentication feature stopped working in Grafana versions >= 9.0 (a sketch of that setup is below, after the version info), and I am still investigating a Grafana OAuth configuration using Keycloak.
Openshift: 4.12
Loki Operator: 5.7.6 (redhat)
Grafana Operator: 4.10.1 using different Grafana images: 8.5.27, 9.5.6, 10.0.2
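For context, the auth-proxy style setup I mean is roughly this grafana.ini fragment sitting behind the openshift-oauth-proxy sidecar (a sketch; the header name depends on how the proxy is configured):

```ini
# Sketch of the Grafana auth-proxy setup behind openshift-oauth-proxy;
# the header name depends on how the proxy sidecar is configured.
[auth.proxy]
enabled = true
header_name = X-Forwarded-User
header_property = username
auto_sign_up = true
```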
Sorry for the late answer, but I had to make sure that I could publish the following (I got it from Red Hat - a big thanks to them for providing it and letting me publish it). It helped me a lot, and based on this I got it working:
Thanks a lot for coming back and sharing that information!
I haven’t had time to test this out yet, but was reviewing the manifests you shared. Seems like all the magic is now done through the Grafana oauth-client configuration!
One thing that confused me at first was the grafana.ini field "role_attribute_path":
It seems that this is a JMESPath query expressing: if the user is a member of the "cluster-admin" (or "dedicated-admin") group, they become GrafanaAdmin, while all others get the "Viewer" role in Grafana. Pretty cool!
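For anyone else reading along, the pattern looks roughly like this (a reconstruction of the idea, not the exact line from the shared manifests):

```ini
# Reconstruction of the role_attribute_path idea; not the exact line from the
# shared manifests.
[auth.generic_oauth]
role_attribute_path = contains(groups[*], 'cluster-admin') && 'GrafanaAdmin' || contains(groups[*], 'dedicated-admin') && 'GrafanaAdmin' || 'Viewer'
# Needed so the 'GrafanaAdmin' (server admin) mapping is actually applied.
allow_assign_grafana_admin = true
```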
I have now successfully implemented and tested it using Grafana 10.1 and have to say it works beautifully! Loki access & authorization based on the user account is working like a charm.
Prometheus/Thanos access did not work for me following your recipe, as I did not want to install everything in the openshift-monitoring namespace, so I switched that configuration back to having the Prometheus datasource use the Grafana SA token, which works equally fine (and I am currently not interested in the Alertmanager parts that you included).
I believe Prometheus & Alertmanager access was the reason you had to install in the openshift-monitoring namespace!?
That way you could enrich the OAuth token scope via the role:logging-grafana-alertmanager-access:openshift-monitoring config.
Is that correct?