Agent scrape_interval breaks CPU chart

  1. 10s

a)

...
metrics:
  wal_directory: /tmp/agent
  global:
    scrape_interval: 10s
...

Grafana Agent config

b)
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)

works

  2. 60s

a)

...
metrics:
  wal_directory: /tmp/agent
  global:
    scrape_interval: 60s 
...

Grafana Agent config

b) 1m
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)

no data

c) 5m
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

shows a chart with a value every 15 seconds


Why? How to fix?

scrape_interval: 60s

What does it do? Does it collect data every 60 seconds, or at an interval of 60s between samples? So for each 60 seconds will I have one value or more? If one, then why do I see a value on the chart every 15 seconds? Actually, with whatever setting I use, I always see a point every 15 seconds.

What is happening here?

Why rate?
I read that rate computes the difference between the start and end points. We want the raw value, not a difference between points on the timeline, right? Unless these things mean something different than I expect.
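What I read so far, roughly (the numbers below are made up by me, just to check my understanding):

rate(node_cpu_seconds_total{mode="idle"}[1m])
  ≈ (idle_seconds_now - idle_seconds_1m_ago) / 60
  e.g. (54321.0 - 54264.0) / 60 = 0.95, i.e. the CPU was idle ~95% of that minute

But that is still a difference, not the raw value, which is exactly what confuses me.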

But first of all, why are the points on the chart always 15 seconds apart, and why does scrape_interval: 60s blow up the chart?

How should I do it?

I guess this blog post may help you:


100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[$__rate_interval])) * 100)

no data

How can I read the $__rate_interval value?

For example, you can use that variable in the panel title, so you will have the current value visible in the UI.
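For example, a panel title like this (just a sketch; any title containing the variable should work, Grafana substitutes it when the panel renders):

CPU usage (rate interval: $__rate_interval)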

I have found the value in the “inspector”.

Let me know if my conclusion is correct:

the Agent scrape_interval value

metrics:
  wal_directory: /tmp/agent
  global:
    scrape_interval: 60s

should be equal to the Prometheus scrape_interval

global:
  scrape_interval: 15s

to make things consistent and simple to use.

This raises two questions:

  1. Why is the default interval for Prometheus 15s, but for the Agent 60s?
  2. What is the “best practice” value?

edit:

not to

global:
  scrape_interval: 15s

but to the Grafana “Interval behaviour / Scrape interval” setting in the data source.

1.) I don’t know :man_shrugging:. Money? I.e. to save storage/infra cost? I believe 15 sec is overkill for 99% of people anyway (for non-rate metrics).
2.) See the linked blog post:

It is recommended you use the same scrape interval throughout your organization

But of course you may have your own requirements, e.g. a rate metric with 10-second precision, and then 15s is not the right setup for that requirement.
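If you do need something like 10-second precision for a rate, the idea would be to lower both the Agent scrape interval and the data source interval together, e.g. (a sketch only, the exact values depend on what you really need):

metrics:
  global:
    scrape_interval: 5s

datasources:
  - name: Prometheus
    type: prometheus
    jsonData:
      timeInterval: 5s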

The final solution is:

agent

metrics:
  global:
    scrape_interval: 60s

Grafana Prometheus data source provisioning

datasources:
  - name: Prometheus
    type: prometheus
    jsonData:
      timeInterval: 60s

which makes
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[$__rate_interval])) * 100)

work as expected
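If I understand the Grafana docs correctly (my reading, not something confirmed in this thread), $__rate_interval resolves to roughly max(__interval + scrape interval, 4 * scrape interval), where “scrape interval” here is the data source timeInterval. So with timeInterval: 60s you get, for example:

max(30s + 60s, 4 * 60s) = 4m

and every rate window spans several 60s samples, so the query always has at least two points to work with.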
