For my use case, I have been testing a Promtail instance as a “Deployment” that ships logs to Loki. Promtail has a persistent volume mounted to it, where Kubernetes audit/diagnostic log entries are continuously written to JSON files. The setup runs on a Minikube cluster; the volumes, however, are backed by Azure storage accounts via the CSI driver.
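For context, the tailing side of this setup is roughly a static scrape config pointed at the mount path — the path and labels below are placeholders, not my actual config:

```yaml
# Hypothetical Promtail scrape config for this setup: tail the JSON
# audit/diagnostic log files from the CSI-mounted persistent volume.
# /mnt/azure-logs and the label values are assumptions.
scrape_configs:
  - job_name: azure-audit-logs
    static_configs:
      - targets: [localhost]
        labels:
          job: k8s-audit
          __path__: /mnt/azure-logs/**/*.json
```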
I noticed that log shipping stops a few seconds/minutes after Promtail starts, and only resumes after I restart the Minikube cluster. This happens whenever log entries are written to a new JSON file: Promtail scrapes it for a few seconds, then stops again until the next file is created. The new files are recognized by Promtail and logged (msg=“watching new directory” directory=… and “tail routine: started” path=…).
There are no errors appearing in the logs of Promtail or Loki; Promtail simply ignores the entries.
I understand that the volume mounts could be unstable due to network issues, but this behavior is consistent.
I am experimenting with other approaches such as using sidecar containers, but was curious why this happens.
Would appreciate any tips:)
I assume you have something that pulls logs from Azure storage and then writes them into the volumes? How does that work? If you tail the log from the command line, do you observe the same issue?
I do something similar for some AWS services, and I’ve not seen this problem before.
Hi Tony, the setup is similar to AWS I think, but here the Azure CSI driver is used: the Kubernetes audit logs plus other Azure AKS audit logs are written via “Azure Diagnostic Logs” to blob containers of an Azure Storage Account in JSON format. From there, the storage containers are mounted via Azure CSI as persistent volumes on Pods in the local Minikube cluster, which get scraped by Promtail.
I tried tailing the logs from the command line, and yes, I see the same issue there as well.
When I use Azure Event Hubs to stream audit logs directly to Promtail on the fly (rather than Storage Accounts that archive the logs and get mounted to local cluster Pods), everything works fine and log entries are scraped instantly.
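For reference, the working Event Hubs path uses Promtail's `azure_event_hubs` scrape config, roughly like this — the namespace, hub name, and connection string are placeholders:

```yaml
# Hypothetical sketch of the Event Hubs streaming setup that works:
# Promtail consumes from the hub over its Kafka-compatible endpoint.
# All values below are assumptions, not my real configuration.
scrape_configs:
  - job_name: azure-event-hubs
    azure_event_hubs:
      fully_qualified_namespace: my-namespace.servicebus.windows.net:9093
      event_hubs:
        - my-audit-logs-hub
      connection_string: <connection string>
      labels:
        job: azure-event-hubs
```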
Sounds like the problem is most likely the mount. I don’t have a lot of experience with Azure storage, but for object storage in general I’d say tailing is probably not the best idea: objects aren’t mutable, so when a file changes it is replaced rather than appended to, which just doesn’t work well with tailing.
I would recommend adding one more step to your process: copy the files from Azure storage to a local filesystem, then configure Promtail to tail the files on the local filesystem. You’ll need some sort of cleanup procedure as well, of course.
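A minimal sketch of that extra step, as a one-shot sync pass you could run from a sidecar or cron loop — the paths, the `*.json` pattern, and the 24-hour retention are all assumptions:

```shell
#!/bin/sh
# Sketch: copy JSON log files from the blob mount (src) to a local dir
# (dst) that Promtail tails, then prune stale local copies.
sync_logs() {
  src="$1"; dst="$2"
  mkdir -p "$dst"
  for f in "$src"/*.json; do
    [ -e "$f" ] || continue
    base=$(basename "$f")
    # copy when missing locally, or when the blob was replaced
    # with a newer version (objects get replaced, not appended)
    if [ ! -e "$dst/$base" ] || [ "$f" -nt "$dst/$base" ]; then
      cp "$f" "$dst/$base"
    fi
  done
  # cleanup: drop local copies older than 24 hours so disk use stays bounded
  find "$dst" -type f -mtime +1 -delete
}

# e.g. from a sidecar container:
#   while true; do sync_logs /mnt/azure-logs /var/log/promtail-local; sleep 30; done
```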