Hi. My dataset has some abberations. To filter them out, I want to filter all values above and below some percentile.
How do I do that? In a normal database, I would do JOIN
between temporary SELECT
s that calculate the percentile and then filter them out of the second table. However in loki that is not possible.
I could execute two queries to get lower range percentile_over_time(0.1, ... | unwrap value [1d])
and upper range percentile_over_time(0.9, ... | unwrap value [1d])
and get a range for each set of labels.
But then, how do I avg_over_time(.... | <how to filter multiple thresholds for distinct labels> | unwrap value [1d])
?
The only solution I found is to convert the thresholds from percentile_over_time
to a big JSON array and then index it one by one. The data
below has the indexes of labels and values extracted from percentile_over_time
queries above.
avg_over_time(....
| label_format data=`{"job1:"{"name1":{"category1":{"high":1.0,"low":2.0},"category2":..... etc. etc. }`
| label_format high=`{{index (index (index (index (fromJson .data) .job) .category) .name) "high"}}`
| label_format low=`{{index (index (index (index (fromJson .data) .job) .category) .name) "low"}}`
| __error__==""
| value>high | value<low
| drop high, low, data
| unwrap value [1d]
)
However, that is not simple to construct. Is there a simple way?
Ideally I would want to write a nested query to match the difference on labels:
avg_over_time(....
| value>percentile_over_time(0.1, .... | unwrap value [1d])
| value<percentile_over_time(0.9, .... | unwrap value [1d])
| unwrap value [1d]
)
Thank you.