Hi there,
I'm using Promtail version 2.9.7 and the latest Loki version (3.2.0). My logs are in JSON format, and there is a field whose value I want to split. The value looks like this:
"remote_ip":"192.168.11.25, 127.0.0.1"
If I want to split the value in Promtail, how do I do it? Let's say I already use a regex pattern to match the value. What should I do after this?
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: application
          #NOTE: Need to be modified to scrape any additional logs of the system.
          __path__: /var/log/gitlab/gitlab-rails/*.log
    pipeline_stages:
      - regex:
          expression: '"remote_ip":"(?P<remote_ip>[^"]+)"'
Thanks
The simplest regex:
`\"remote_ip\"\:\"(?P<remote_ip>[0-9\.]+).*\"`
It only matches digits and dots, so it does not actually check that the value is a valid IP. But given that you are probably getting it from some sort of proxy, I think that's acceptable. Otherwise you can search for a regex that matches a valid IP.
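To illustrate what that pattern captures, here is a minimal sketch in Python. Promtail itself uses Go's RE2 regex engine, but this particular pattern should behave the same way in Python's `re` module. The sample log line (including the `method` field) and the `extract_first_ip` helper are assumptions made up for the example.

```python
import re

# Pattern from the reply above. [0-9.]+ stops at the first character that is
# neither a digit nor a dot, so only the first address lands in the group.
FIRST_IP = re.compile(r'\"remote_ip\"\:\"(?P<remote_ip>[0-9\.]+).*\"')


def extract_first_ip(line):
    """Return the remote_ip capture, or None if the pattern does not match."""
    match = FIRST_IP.search(line)
    return match.group("remote_ip") if match else None


# Hypothetical log line; only the remote_ip value comes from the question.
sample = '{"method":"GET","remote_ip":"192.168.11.25, 127.0.0.1"}'
print(extract_first_ip(sample))  # 192.168.11.25
```

Everything after the first address, including `127.0.0.1`, is consumed by `.*` and is not captured.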
If I use that kind of pattern, Promtail just relabels remote_ip with the first IP that matches the pattern, am I right?
But the thing is, I still need the localhost IP. That's why I want to split them on the comma separator. If there is a way to split them in Promtail, please let me know. Thanks
Then you can just add another capture group to the regex:
`\"remote_ip\"\:\"(?P<remote_ip>[0-9\.]+)[^0-9\.]+(?P<local_ip>[0-9\.]+)\"`
You should keep in mind that if you are using something like real_ip in Nginx, you can have more than two IPs in the remote IP header. In general you only really care about what the real remote IP is; whether localhost is 127.0.0.1 or not (well, it shouldn't be anything else) really has no value.
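A quick sketch of how the two-group pattern behaves, written in Python as a stand-in for Promtail's Go regex engine (this pattern should behave the same in both). The inputs with one and three addresses are assumptions added to show the caveat about longer proxy chains, and `extract_ips` is a hypothetical helper name.

```python
import re

# Two-group pattern from the reply above: it expects exactly two addresses
# separated by characters that are neither digits nor dots.
TWO_IPS = re.compile(
    r'\"remote_ip\"\:\"(?P<remote_ip>[0-9\.]+)[^0-9\.]+(?P<local_ip>[0-9\.]+)\"'
)


def extract_ips(line):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = TWO_IPS.search(line)
    return match.groupdict() if match else None


print(extract_ips('"remote_ip":"192.168.11.25, 127.0.0.1"'))
# {'remote_ip': '192.168.11.25', 'local_ip': '127.0.0.1'}
print(extract_ips('"remote_ip":"10.0.0.1, 192.168.11.25, 127.0.0.1"'))
# None
print(extract_ips('"remote_ip":"192.168.11.25"'))
# None
```

With one address or with three, the pattern does not match at all, so neither group would be extracted for those lines.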
Maybe I forgot to mention what my goal is. My goal is to visualize it in Grafana using the table visualization and group the data by remote_ip. If I create a new field for the localhost IP, then I need to group the data by two fields in the table, and that doesn't match my use case. Thanks
I am confused. My first reply assigned remote IP to remote_ip, and you said you needed localhost too. My second reply added a local_ip label, and you said you don’t need localhost.
What exactly are you looking to do? If you just want to aggregate by remote IP my first reply should work for you.
count by(remote_ip) (count_over_time({job="application"} | json | method != "" [$__auto]))
This is my query. When I said I need to split the value on the comma separator, it's so that I can aggregate by remote_ip only and have both 192.168.11.25 and 127.0.0.1 appear in the table visualization.
If I use your first reply, then the localhost IP will not be included, because the value gets replaced by the regex match, right?
If I use your second reply, then I also need to add the new field to my query. If the remote_ip field has 5 or 7 values, do I need to match each of them with the regex and define them one by one?
count by(remote_ip,local_ip) (count_over_time({job="application"} | json | method != "" [$__auto]))
My point is clear: I just want to split the value on the comma separator, so that the values are not put into a new field and I can still aggregate the data by remote_ip only. I just want to know whether that's possible. If not, is there another way to achieve it? Thanks
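To make the request concrete, the desired result can be sketched in plain Python, outside of Promtail and Loki. The sample values and the `count_by_ip` helper are hypothetical; the sketch only shows the grouping being asked for and does not claim that Promtail can do this.

```python
from collections import Counter


def count_by_ip(remote_ip_values):
    """Split each comma-separated remote_ip value and count every address."""
    counts = Counter()
    for value in remote_ip_values:
        for ip in value.split(","):
            ip = ip.strip()
            if ip:  # ignore empty pieces such as a trailing comma
                counts[ip] += 1
    return counts


# Hypothetical remote_ip values taken from three log lines.
values = ["192.168.11.25, 127.0.0.1", "192.168.11.25, 127.0.0.1", "10.0.0.7"]
print(dict(count_by_ip(values)))
# {'192.168.11.25': 2, '127.0.0.1': 2, '10.0.0.7': 1}
```

Each address is counted under the same key no matter how many addresses a single log line carries, which is the "group by remote_ip only" behaviour described above.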