Normalized histogram

Hello,

I have InfluxDB time series data (roughly 40,000 {_time, _value} pairs), and I’m using Flux.

I’m able to build the histogram using the histogram plugin, with the Y-axis going from 0 to 20000.

I would like the Y-axis to be normalized from 0 to 1. The idea would be to divide each bin by 40000.

How do I do that? I can’t find any solution!

thanks all,

have a good day.

Olivier

Hello,
You will need the map() function to perform calculations on your values.
I think your situation is the same as calculating a percentage.
You can find resources here:

An example query:

from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => (r._measurement == "m1" or r._measurement == "m2") and (r._field == "field1" or r._field == "field2"))
    |> group()
    |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
    |> map(fn: (r) => ({r with _value: r.field1 / r.field2 * 100.0}))

But in your case, if you want to normalize, you should divide by the maximum value of your series, not by the count of your key-value pairs.

Hello, thanks for your answer. I’m not sure I follow you … what I want is to normalize the statistical distribution associated with the time series, not the time series itself. So what interests me is how many occurrences fall into each bin of the distribution, which I then need to divide by the total number of occurrences.

My problem here is that:

1 - the Grafana distribution plugin does not allow this. Am I correct?
2 - I can use the Flux histogram(normalize: true) function, and it works, but I don’t know how to display the result in Grafana, because I get what looks like two _value columns … and no real x-axis

 ...
 |> histogram(bins: linearBins(start: 0.0, width: 1.0, count: 20), normalize: true) 
 |> difference()
...

That generates 2 columns that I don’t know how to display …

bin A01
1 0
2 0
3 0
4 0.00333
5 0.00595
6 0.0123
7 0.0193
8 0.0161
9 0.0483
10 0.135
11 0.328
12 0.302
13 0.102
14 0.0238
15 0.00298
16 0
17 0
18 0
19 0

3 - I have to rebuild the whole histogram by myself …
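
For point 2, one possible way to get a table that a panel can plot is to turn the bin upper bound into a string label. This is only a minimal sketch: the bucket, measurement, and field names are placeholders, the field is assumed to hold floats, and the output column names bin and fraction are made up for illustration.

```flux
from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m1" and r._field == "field1")
    // histogram() writes the bin upper bound to le and the cumulative value to _value
    |> histogram(bins: linearBins(start: 0.0, width: 1.0, count: 20), normalize: true)
    // difference() turns the cumulative fractions into per-bin fractions
    |> difference()
    // keep one label column (x-axis) and one numeric column (y-axis)
    |> map(fn: (r) => ({bin: string(v: r.le), fraction: r._value}))
```

Note that difference() drops the first row by default, so the lowest bin (le = 0 here) does not appear in the result.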

With the bin column as a string, I think you should be able to display this data in a Grafana bar chart panel. If your first column is in a time format, you should be able to display it as a time series histogram.

"1 - the Grafana distribution plugin does not allow this. Am I correct?"
Do you mean this one? CDF - Cumulative Distribution Function plugin for Grafana | Grafana Labs
I think you have to do this in your Flux query: a Grafana panel expects a specific data format, whichever panel you use.

"2 - I can use the Flux histogram(normalize: true) function, and it works, but I don’t know how to display the result in Grafana, because I get what looks like two _value columns … and no real x-axis"
I don’t know, I have never used this function; I always build the histogram myself.

"3 - I have to rebuild the whole histogram by myself …"
Yes, I think so, but it can be really quick.

To help with your normalization issue: if you really want to normalize by the number of values, you have to divide by a count of those values.
In Flux, you can assign a query to a variable:

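data = from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => (r._measurement == "m1" or r._measurement == "m2") and (r._field == "field1" or r._field == "field2"))
    |> group()
    |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")

// count() returns a stream of tables, not a number, so extract the scalar with findRecord()
countRecord = data
    |> count(column: "field1")
    |> findRecord(fn: (key) => true, idx: 0)

max_count = float(v: countRecord.field1)

// field1 is assumed to hold floats
res = data |> map(fn: (r) => ({r with _value: r.field1 / max_count}))

res |> yield(name: "normalized")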

Something like that
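
Applied to the histogram itself rather than to the raw values, the same idea (divide by a scalar count) could look roughly like the sketch below. It assumes a single float field; the bucket, measurement, and field names are placeholders, and totalRecord and total are names chosen here for illustration.

```flux
data = from(bucket: "example-bucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "m1" and r._field == "field1")
    |> group()

// total number of points, extracted from the stream as a scalar
totalRecord = data
    |> count()
    |> findRecord(fn: (key) => true, idx: 0)

total = float(v: totalRecord._value)

data
    // cumulative counts per bin upper bound (le)
    |> histogram(bins: linearBins(start: 0.0, width: 1.0, count: 20))
    // per-bin counts
    |> difference()
    // per-bin fraction of all points, between 0 and 1
    |> map(fn: (r) => ({r with _value: r._value / total}))
```

This should give the same per-bin fractions as histogram(normalize: true) followed by difference(), but it keeps the total available as a variable if it is needed elsewhere in the query.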

Yes, a lot of interesting things here, thank you very much.

For 1, I’ll definitely have a look at that plugin.
For 3, I understand what you want to do. I’ve already tried something like it but had difficulties using variables; I’m looking into it too.

Thanks again :)

For 3, I think there is a cumulative sum missing here.