Regex with repeated same-name groups: can only the first match be accessed?

I'm trying to extract metrics from log entries in which the same regex pattern (with the same group names) matches several times per line. However, I only get metrics for the first match. Here are the log lines and the Promtail config.
Log lines, where each entry has the form pool_name(resp_time):
dataset1-pool-1(0.05) dataset2-pool-1(0.981) dataset3-pool-1(1.32) dataset4-pool-1(0.23)
dataset1-pool-1(0.342) dataset2-pool-1(0.981) dataset3-pool-1(1.51) dataset4-pool-1(0.123)
dataset1-pool-1(0.212) dataset2-pool-1(0.981) dataset3-pool-1(2.34) dataset4-pool-1(0.32)
dataset1-pool-1(0.451) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)
dataset1-pool-1(0.251) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)
dataset1-pool-1(0.151) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)

Scrape config:
pipeline_stages:
  - match:
      selector: '{job="accesserlog"}'
      stages:
        - regex:
            expression: '(?P<pool_name>[^ ]+)\((?P<resp_time>[\d.]+)\)'
        - labels:
            pool_name:
        - metrics:
            latency:
              type: Histogram
              prefix: pool_resp_
              source: resp_time
              config:
                buckets: [0.005,1,10]

With this config, Promtail only exposes metrics for the first pool in each line, as shown below:

pool_resp_latency_bucket{filename="/var/log/test.log",job="accesserlog",pool_name="dataset1-pool-1",le="0.005"} 0
pool_resp_latency_bucket{filename="/var/log/test.log",job="accesserlog",pool_name="dataset1-pool-1",le="1"} 6
pool_resp_latency_bucket{filename="/var/log/test.log",job="accesserlog",pool_name="dataset1-pool-1",le="10"} 6
pool_resp_latency_bucket{filename="/var/log/test.log",job="accesserlog",pool_name="dataset1-pool-1",le="+Inf"} 6
pool_resp_latency_sum{filename="/var/log/test.log",job="accesserlog",pool_name="dataset1-pool-1"} 1.457
pool_resp_latency_count{filename="/var/log/test.log",job="accesserlog",pool_name="dataset1-pool-1"} 6
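
For reference, the reported sum of 1.457 equals the first resp_time of each of the six lines added together, which is consistent with only the first match per line being used. The sketch below reproduces that number. It uses Python's `re.search` as a stand-in for Promtail's regex stage, which is an assumption made purely for illustration, and the names `PAIR` and `first_matches` are hypothetical.

```python
import re

LOG_LINES = [
    "dataset1-pool-1(0.05) dataset2-pool-1(0.981) dataset3-pool-1(1.32) dataset4-pool-1(0.23)",
    "dataset1-pool-1(0.342) dataset2-pool-1(0.981) dataset3-pool-1(1.51) dataset4-pool-1(0.123)",
    "dataset1-pool-1(0.212) dataset2-pool-1(0.981) dataset3-pool-1(2.34) dataset4-pool-1(0.32)",
    "dataset1-pool-1(0.451) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)",
    "dataset1-pool-1(0.251) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)",
    "dataset1-pool-1(0.151) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)",
]

# Same pattern as in the scrape config: pool_name followed by (resp_time).
PAIR = re.compile(r"(?P<pool_name>[^ ]+)\((?P<resp_time>[\d.]+)\)")


def first_matches(lines):
    """Return (pool_name, resp_time) for the first match on each line only."""
    results = []
    for line in lines:
        match = PAIR.search(line)  # search() stops at the first match
        if match:
            results.append((match.group("pool_name"), float(match.group("resp_time"))))
    return results


firsts = first_matches(LOG_LINES)
total = round(sum(resp_time for _, resp_time in firsts), 3)
print(len(firsts), total)  # 6 observations, all for dataset1-pool-1
```

Every observation lands on dataset1-pool-1, and the total agrees with pool_resp_latency_sum above.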

How can I get histogram metrics for all pools? Is my configuration wrong?

Thanks in advance.

I don't think what you are trying to do is easily possible. You would either have to write a regex that matches each pool and extracts it as separate data, then create separate histograms, or you would need to split your log line at the source so that each log line contains only one pool.
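
The second option, one pool per log line, could be sketched as a small pre-processing step that runs before the log reaches Promtail. This is only an illustration under assumptions: the function name `split_pools` and the one-entry-per-line output format are hypothetical, and where such a step would live depends on how the log is produced.

```python
import re

# One pool entry: pool_name followed by (resp_time), as in the log lines above.
PAIR = re.compile(r"(?P<pool_name>[^ ]+)\((?P<resp_time>[\d.]+)\)")


def split_pools(line):
    """Return one 'pool_name(resp_time)' string per pool found in the line."""
    return [
        f"{m.group('pool_name')}({m.group('resp_time')})"
        for m in PAIR.finditer(line)  # finditer() visits every match, not just the first
    ]


original = "dataset1-pool-1(0.05) dataset2-pool-1(0.981) dataset3-pool-1(1.32) dataset4-pool-1(0.23)"
for entry in split_pools(original):
    print(entry)
```

With one pool per line, a single pool_name/resp_time regex would see each pool as the first (and only) match on its line.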

Hi ewelch,
Thank you for your prompt reply. I have now configured it as below:
pipeline_stages:
  - regex:
      expression: '(?P<pool1>[^ ]+)\((?P<resp1>[\d.]+)\) (?P<pool2>[^ ]+)\((?P<resp2>[\d.]+)\) (?P<pool3>[^ ]+)\((?P<resp3>[\d.]+)\) (?P<pool4>[^ ]+)\((?P<resp4>[\d.]+)\)'
  - labels:
      poold: ""
  - match:
      selector: '{poold=""}'
      stages:
        - labels:
            pool_name: pool1
        - metrics:
            latency:
              type: Histogram
              prefix: pool_resp_
              source: resp1
              config:
                buckets: [0.005,1,10]
  - match:
      selector: '{poold=""}'
      stages:
        - labels:
            pool_name: pool2
        - metrics:
            latency:
              type: Histogram
              prefix: pool_resp_
              source: resp2
              config:
                buckets: [0.005,1,10]
  - match:
      selector: '{poold=""}'
      stages:
        - labels:
            pool_name: pool3
        - metrics:
            latency:
              type: Histogram
              prefix: pool_resp_
              source: resp3
              config:
                buckets: [0.005,1,10]
  - match:
      selector: '{poold=""}'
      stages:
        - labels:
            pool_name: pool4
        - metrics:
            latency:
              type: Histogram
              prefix: pool_resp_
              source: resp4
              config:
                buckets: [0.005,1,10]

This captures all pools, but when two pools swap positions in the log line, for example:
dataset1-pool-1(0.251) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)
dataset4-pool-1(0.251) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset1-pool-1(0.863)

the Promtail metrics go wrong, with errors as below:

  • collected metric "pool_resp_latency" { label:<name:"filename" value:"/var/log/test.log" > label:<name:"job" value:"accesserlog" > label:<name:"pool_name" value:"dataset1-pool-1" > histogram:<sample_count:1 sample_sum:0.863 bucket:<cumulative_count:0 upper_bound:0.005 > bucket:<cumulative_count:1 upper_bound:1 > bucket:<cumulative_count:1 upper_bound:10 > > } was collected before with the same name and label values
  • collected metric "pool_resp_latency" { label:<name:"filename" value:"/var/log/test.log" > label:<name:"job" value:"accesserlog" > label:<name:"pool_name" value:"dataset4-pool-1" > histogram:<sample_count:10 sample_sum:4.798 bucket:<cumulative_count:0 upper_bound:0.005 > bucket:<cumulative_count:10 upper_bound:1 > bucket:<cumulative_count:10 upper_bound:10 > > } was collected before with the same name and label values

I look forward to your feedback and suggestions.
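
One plausible reading of these errors, which is not confirmed anywhere in this thread: each of the four metrics stages keeps its own pool_resp_latency histogram, labeled with whichever pool name appears at that stage's position in the line. Once two pools swap positions, two different stages end up holding a series with the same name and the same pool_name label. The sketch below models only that bookkeeping. It is not Promtail code, and the names `series_per_stage` and `duplicated_series` are hypothetical.

```python
import re
from collections import defaultdict

PAIR = re.compile(r"(?P<pool_name>[^ ]+)\((?P<resp_time>[\d.]+)\)")


def series_per_stage(lines):
    """Map position in the line (1..4) to the set of pool_name values seen there."""
    stages = defaultdict(set)
    for line in lines:
        for position, match in enumerate(PAIR.finditer(line), start=1):
            stages[position].add(match.group("pool_name"))
    return stages


def duplicated_series(stages):
    """Return pool names that more than one position has produced a series for."""
    owners = defaultdict(list)
    for position, names in stages.items():
        for name in names:
            owners[name].append(position)
    return {name: sorted(positions) for name, positions in owners.items() if len(positions) > 1}


lines = [
    "dataset1-pool-1(0.251) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset4-pool-1(0.863)",
    "dataset4-pool-1(0.251) dataset2-pool-1(0.981) dataset3-pool-1(1.432) dataset1-pool-1(0.863)",
]
duplicates = duplicated_series(series_per_stage(lines))
print(duplicates)
```

Under this model the clashing names are dataset1-pool-1 and dataset4-pool-1, each seen at positions 1 and 4, which are the same two pools named in the error output above.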