Histogram graph data shifted by 2 minute?

I’ve got prometheus scraping a histogram of request latency from my API server every 1m .

I’m then visualizing the histograms as a heatmap in Grafana. It looks like this:

2018-10-26_23%3A28%3A35

My query is:

sum(http_request_duration_seconds_bucket) by (le) 
- 
sum(http_request_duration_seconds_bucket offset 1m) by (le)

What I understand this to be doing is: take the bucket vector at the time concerned and subtract the bucket vector 1 minute prior to get the bucket vector for that minute (histograms accumulate and don’t clear their buckets each scrape)

But the issue is, looking at my request logs, I see:

...
13:56:13 GET 304 3 ms
13:56:18 GET 304 4 ms
13:56:18 POST - -
13:56:23 GET 304 4 ms
13:56:28 GET 304 0 ms
13:56:32 GET 304 3 ms
13:56:38 GET 304 6 ms
13:56:42 GET 304 4 ms
13:56:48 GET 304 9 ms
13:56:53 GET 304 2 ms
13:56:54 POST - -
13:56:58 GET 304 2 ms

13:57:01 GET 200 2 ms
13:57:03 GET 304 6 ms
13:57:08 GET 304 4 ms
13:57:12 GET 304 3 ms
13:57:18 POST - -
13:57:21 GET 304 8 ms
13:57:21 GET 304 8 ms
13:57:23 GET 304 4 ms
...

As you can see, the first time there were requests of “Inf” runtime was in 13:56 , but if we look at the histogram, we don’t see that until 13:38 .

Why is the data shifted by 2 minutes?

FTR this question received an answer on StackOverflow, which, although not definitive, has pointed the way for further investigation. I will update both here and there with any findings.

I think I can account for the delay. There are two aspects at play:

  • I had wrongly assumed that prometheus was preserving the timestamps of the individual events. Instead, the data in the histogram is collected at the moment the scrape hits, and the data is tagged as occuring at that moment.
  • I had forgotten the details of my own instrumentation - I add the data point to the histogram metric using the prometheus client library at the moment the request finishes (and how could it be any other way? we want to know the latency!)

So, if the data for the request at 13:56:18 (the first long-running request) was added to the histogram at the moment the request returned (more than 30 seconds after that point), say at 13:57:15 , and then the scrape times were: [ 13:56:10 , 13:57:10 , 13:58:10 ], then the request which began at 13:56:18 would end up being part of the 13:58 scrape.