- 10s
a)
...
metrics:
  wal_directory: /tmp/agent
  global:
    scrape_interval: 10s
...
grafana agent config
b)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)
works
- 60s
a)
...
metrics:
  wal_directory: /tmp/agent
  global:
    scrape_interval: 60s
...
grafana agent config
b) 1m
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)
no data
c) 5m
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
shows a chart with a value every 15 seconds
Why? How to fix?
scrape_interval: 60s
What does it do? Does it collect data every 60 seconds, so that for each 60s I get exactly one value, or more? If one, then why do I see a value on the chart every 15 seconds? Actually, with whatever setting, I always see a point every 15 seconds.
What is happening here?
Why rate?
I read that rate computes the difference between the start and end points. Don't we want the raw value, not a difference between points on the timeline? Unless these things mean something different than I expect.
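A quick sketch of why rate() is needed here: node_cpu_seconds_total is a cumulative counter (total seconds spent in each CPU mode since boot), so its raw value only ever grows; rate() turns it into a per-second average over the query window. Roughly (ignoring Prometheus's boundary extrapolation and counter-reset handling):

```python
# Hedged sketch of what PromQL's rate() computes. Real rate() also
# extrapolates to the window boundaries and handles counter resets.
def simple_rate(samples):
    """samples: list of (timestamp, counter_value) pairs inside the rate window."""
    if len(samples) < 2:
        return None  # fewer than two samples in the window -> "no data"
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

# A CPU idle 90% of the time, scraped every 10s, produces a counter like:
samples = [(0, 100.0), (10, 109.0), (20, 118.0), (30, 127.0)]
print(simple_rate(samples))  # 0.9 -> 90% idle, so 100 - 0.9*100 = 10% usage
```

This also explains the "no data" cases above: with fewer than two samples inside the window there is nothing to difference, which is exactly what happens with a [1m] window over 60s scrapes.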
But first of all, why are the points on the chart always 15 seconds apart, and why does scrape_interval: 60s break the chart?
How should I do it?
I guess this blog post may help you:
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[$__rate_interval])) * 100)
no data
How can I read the $__rate_interval value?
For example, you can use that variable in the panel title, so you will have the current value visible in the UI.
I have found the value in “inspector”.
Let me know if my conclusion is correct:
the agent scrape_interval value
metrics:
  wal_directory: /tmp/agent
  global:
    scrape_interval: 60s
should be equal to prometheus scrape_interval
global:
  scrape_interval: 15s
to make things consistent and simple to use.
This raises two questions:
- why is the default interval 15s for Prometheus, but 60s for the agent?
- what is “best practice” value?
edit:
not to
global:
  scrape_interval: 15s
but to the grafana "Interval behaviour / Scrape interval" setting in data sources.
1.) I don’t know. Money? I.e. to save storage/infra cost? I believe 15s is overkill for 99% of people anyway (for non-rate metrics).
2.) See linked blog post:
It is recommended you use the same scrape interval throughout your organization
But of course you may have your own requirements, e.g. a rate metric with 10s precision, and then 15s is not the right setup for that requirement.
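Part of the answer is how Grafana computes $__rate_interval. Per the Grafana documentation it is max(interval + scrape interval, 4 * scrape interval), where "scrape interval" is the data source's Min step / timeInterval setting, not the agent's real scrape interval — which is why the two must be kept in sync. A sketch of that formula (verify the exact behaviour against your Grafana version):

```python
# Hedged sketch of the documented $__rate_interval formula:
# max(interval + scrape_interval, 4 * scrape_interval), where
# scrape_interval is the datasource timeInterval ("Min step").
def rate_interval(interval_s, scrape_interval_s):
    return max(interval_s + scrape_interval_s, 4 * scrape_interval_s)

# Datasource still at the 15s default while the agent scrapes every 60s:
print(rate_interval(15, 15))   # 60 -> a 1m window over 60s scrapes holds <2 samples: "no data"
# After setting timeInterval: 60s to match the agent:
print(rate_interval(60, 60))   # 240 -> a 4m window always spans several samples
```

This is why the query returned "no data" before the datasource was told about the 60s scrape interval: Grafana assumed the 15s default and chose a rate window too small to contain two samples.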
The final solution is:
agent
metrics:
  global:
    scrape_interval: 60s
grafana prometheus provisioning
datasources:
  - name: Prometheus
    type: prometheus
    jsonData:
      timeInterval: 60s
which makes
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[$__rate_interval])) * 100)
work as expected
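For reference, a complete datasource provisioning file might look like the following sketch (the url is an assumed local Prometheus address; adjust it to your setup):

```yaml
# Hedged sketch of a full Grafana datasource provisioning file.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://localhost:9090   # assumption: local Prometheus
    jsonData:
      timeInterval: 60s          # must match the agent's scrape_interval
```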