I have Alloy configured to gather all files from /var/log/*.log, as well as /var/log/syslog, and it is forwarding them to Loki. Within Loki, only certain logs have the timestamp and other fields identified. How do I get Alloy, or maybe Loki, to see the timestamps and other data as it should? Thanks.
This is from /var/log/auth.log
2024-11-07T13:45:01.205837-06:00 ebpf CRON[1891924]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
This is from /var/log/syslog
2024-11-07T13:54:06.455386-06:00 ebpf alloy[1870079]: ts=2024-11-07T19:54:06.454532494Z level=debug msg="collector succeeded" component_path=/ component_id=prometheus.exporter.unix.default name=cpu duration_seconds=0.001657023
It seems to me that only logs that show ts= have a timestamp, even though /var/log/syslog and /var/log/auth.log also have their own timestamps in the actual log lines.
All logs sent to Loki have timestamps. I am not sure which version of Grafana you are using, but in Grafana 11 there is an option to show or hide the timestamp for Loki logs, and if it is toggled on you’d see the timestamp in front of each log line:
In older versions of Grafana I think ts was displayed as a label, but in epoch format (down to nanoseconds, if I remember correctly). I am not sure why your display seems to be inconsistent.
I am new to Loki, so bear with me here. It appears that by doing the pattern bit I see the fields. However, it also axed the fields from the other logs that had more fields before.
This is more like what I would want to see from Linux, but I’m not sure how to pull it off. It doesn’t need all of these for sure. However, Time, Instance, Level, etc. would be super nice.
I think you may be misunderstanding something. I’ll try to answer your questions as best I understand them:
As mentioned before, all logs sent to Loki have a timestamp. The timestamp is NOT part of the log string, it is NOT a label, and you should not want to see it as a label (because it’s not a label). As my reply above showed, if you toggle Time you’ll see timestamps in front of your logs, otherwise you don’t. Anything else you see is part of the log string, and whatever time you have in the log string may or may not match the actual timestamps of the logs.
Typically you’d parse the timestamp out of the log line and then explicitly set it as the entry’s timestamp (not as a label). Here is an example:
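As a minimal sketch of that idea (not necessarily the exact config used in this thread), assuming the syslog-style lines quoted earlier and a hypothetical loki.write component named "default" defined elsewhere, a loki.process block can capture the leading RFC 3339 timestamp and use it as the entry's timestamp:

```alloy
// Sketch only: "syslog_time" and loki.write.default are assumed names.
loki.process "syslog_time" {
  forward_to = [loki.write.default.receiver]

  // Capture the first whitespace-delimited token of the line, e.g.
  // 2024-11-07T13:45:01.205837-06:00, into the extracted value "time".
  stage.regex {
    expression = "^(?P<time>\\S+)\\s"
  }

  // Replace the entry's timestamp with the captured value.
  // This sets the timestamp itself; it does not create a label.
  stage.timestamp {
    source = "time"
    format = "RFC3339Nano"
  }
}
```

The log line itself is left unchanged, so the original timestamp text would still appear inside the line.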
To make sure I understand correctly, you’re saying that by default Loki is going to set a timestamp on the logs it ingests, which is what I am seeing outlined in red on the screenshot below when “time” is toggled on?
What I should be doing to make the Loki timestamp more accurate is to map the timestamp defined in the log line to the timestamp Loki will use when it’s processing the log?
Since I have no mapping between the log line timestamp and Loki’s handling of it, is that why I see a difference in time between the box in red and the one in yellow? Is this also why I see two timestamps? (One is the “Loki” timestamp, and the other is just part of the log string.)
The last point is that I don’t want to have time as a field, because that would be a very high-cardinality label, and generally the idea with Loki is to keep the index small as a best practice?
I don’t want to make this too long, but with all that said, I have a few more questions about these pipelines, which is where I think I am most confused and am still missing the boat a little bit.
For a log such as the one below, would appropriate labels be the following? Instance = ebpf (this is the computer’s hostname), Service = CRON, LogFile = /var/log/auth.log, and that’s about it? Just labels that define where it came from, not the log data itself?
If so, how would that be defined to pick up the labels correctly?
2024-11-07T21:17:01.772513-06:00 ebpf CRON[106079]: pam_unix(cron:session): session closed for user root
One other point: in this example we can see a bunch of fields in Grafana. However, if I use the label browser I see that there are only a few labels, and most do not correspond to the data being extracted. With that said, where does that leave things?
That could be from some other stuff in Loki unrelated to the files you scraped. This is why, before you dive in head-on, it’s best to vet things out via loki.echo.
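For what it's worth, a dry run with loki.echo might look something like the sketch below. The component labels are made up, and it deliberately reads only /var/log/auth.log: if Alloy's own output ends up in /var/log/syslog (as the sample line above suggests), echoing that file could feed back into itself.

```alloy
// Sketch only: "auth" and "debug" are assumed component labels.
local.file_match "auth" {
  path_targets = [{"__path__" = "/var/log/auth.log"}]
}

loki.source.file "auth" {
  targets    = local.file_match.auth.targets
  forward_to = [loki.echo.debug.receiver]
}

// Prints each entry it receives (timestamp, labels, line) to Alloy's
// standard output instead of sending anything to Loki.
loki.echo "debug" { }
```

Once the echoed entries show the labels and timestamps you expect, forward_to can be pointed at the real Loki writer.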
The way you are going about this seems a bit like a shoot-in-the-dark-and-hope-it-sticks approach, which is exactly what I did, and I dug myself into a very dark rabbit hole.
There are two kinds of labels in Loki. The first are the ones that came with the logs when they were ingested (these are the ones you see in the label browser). The rest of what you show in the screenshot are processed labels, which come from query-time parsers such as json.
That’s a long discussion, so I am just going to look at your first message.
Some basics first:
Unless you instruct Alloy to parse content, it just takes the log line, attaches a timestamp for when the line was grabbed, and pushes it to Loki with “zero” knowledge about the content
Loki doesn’t parse anything on ingestion either. You can store log lines with zero knowledge about what’s inside
Content is parsed at query time (using pattern or logfmt or json, etc)
The typical recommendation is to add labels at Alloy level about what’s AROUND and not INSIDE the logs (ex: hostname=my-server, logfile=/var/log/apache.log, env=prod, job=system)
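A rough sketch of the "around, not inside" labelling described above, assuming placeholder label values and a Loki push URL on localhost (neither comes from this thread):

```alloy
// Sketch only: label values and the endpoint URL are assumptions.
local.file_match "system" {
  path_targets = [{
    "__path__" = "/var/log/*.log",
    "job"      = "system",
    "env"      = "prod",
    "hostname" = constants.hostname
  }]
}

// Labels on the targets are attached to every line read from those files.
// loki.source.file also adds a "filename" label with the file's path.
loki.source.file "system" {
  targets    = local.file_match.system.targets
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://localhost:3100/loki/api/v1/push"
  }
}
```

Nothing here looks inside the log line, so the label set stays small and low-cardinality.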
With that said, your use case is not “weird”: your log line embeds a timestamp that can differ from the time the line is written or grabbed.
First of all: does it matter whether you correct your timestamp? Order is preserved, the data is correct, and the only problem you have is that Loki thinks it happened 0.3s later than it actually did. If it doesn’t matter, just give up on that. You would add a lot of config for no real value.
If you still want to proceed, you have 2 different formats in your logs (one with ts, one without) and you should expect 2 different results. You need to create 2 pipelines in Alloy (you can check localhost:12345 to see your current graph). You can find some knowledge about timestamp replacement here (it uses Promtail, but it’s the same concept expressed with Alloy semantics)
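One way the two pipelines could be sketched is with two stage.match blocks inside a single loki.process. This assumes that lines containing " ts=" should take their time from that field, that everything else should use the leading syslog timestamp, and that a loki.write component named "default" exists elsewhere. The selectors rely on the filename label that loki.source.file adds.

```alloy
// Sketch only: "by_format" and loki.write.default are assumed names.
loki.process "by_format" {
  forward_to = [loki.write.default.receiver]

  // Format 1: lines carrying an embedded "ts=" field.
  stage.match {
    selector = "{filename=~\".+\"} |= \" ts=\""

    stage.regex {
      expression = "ts=(?P<ts>\\S+)"
    }

    stage.timestamp {
      source = "ts"
      format = "RFC3339Nano"
    }
  }

  // Format 2: lines without "ts=", using the leading syslog timestamp.
  stage.match {
    selector = "{filename=~\".+\"} != \" ts=\""

    stage.regex {
      expression = "^(?P<time>\\S+)\\s"
    }

    stage.timestamp {
      source = "time"
      format = "RFC3339Nano"
    }
  }
}
```

Testing this against loki.echo first would show whether each format picks up the intended timestamp before anything is written to Loki.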