My goal is to identify slow requests and reduce their response times. Right now I’m viewing the 5-minute sum of TargetResponseTime in CloudWatch, though it doesn’t quite look the same in Grafana. Why is that?
You can’t view individual requests in CloudWatch, only aggregated metrics => Grafana on top of CloudWatch is not the right tool for your goal.
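For context, a hedged sketch (Python + boto3) of roughly what that 5-minute CloudWatch view corresponds to at the API level: each datapoint is an aggregate over a whole period, so individual requests are not recoverable from it. The LoadBalancer dimension value below is a placeholder.

```python
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)

# One aggregated datapoint per 5-minute period -- never individual requests.
resp = cw.get_metric_statistics(
    Namespace="AWS/ApplicationELB",
    MetricName="TargetResponseTime",
    Dimensions=[{"Name": "LoadBalancer", "Value": "app/my-alb/1234567890abcdef"}],  # placeholder
    StartTime=end - timedelta(hours=3),
    EndTime=end,
    Period=300,               # the 5-minute aggregation from the question
    Statistics=["Sum"],
)
for p in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
    print(p["Timestamp"], p["Sum"])
```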
You should enable access logging and then analyze the access logs:

- Parse the access logs from S3 into Elasticsearch, and then you can query/visualize individual requests in Grafana. You can find ready-made AWS Lambda code that parses access logs into ES; a minimal sketch follows this list.
- Use another log analytics tool; there are plenty of them.
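Here is a minimal sketch of such a Lambda, assuming the standard ALB access-log field layout; it is not the ready-made code referenced above. The `ES_ENDPOINT` value and the `alb-logs` index name are assumptions, and a real deployment would also need auth/request signing, error handling, and batching.

```python
import gzip
import json
import shlex
import urllib.request

import boto3

ES_ENDPOINT = "https://my-es-domain.example.com"  # assumption: your ES/OpenSearch URL

s3 = boto3.client("s3")

def handler(event, context):
    """Triggered by an S3 put: gunzip an ALB access log, bulk-index into ES."""
    bulk_lines = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        for line in gzip.decompress(body).decode("utf-8").splitlines():
            fields = shlex.split(line)  # honors the quoted "request" field
            doc = {
                # field positions assume the documented ALB access-log format
                "timestamp": fields[1],
                "target_processing_time": float(fields[6]),  # -1 if no target answered
                "elb_status_code": fields[8],
                "request": fields[12],  # e.g. 'GET https://host:443/path HTTP/1.1'
            }
            bulk_lines.append(json.dumps({"index": {"_index": "alb-logs"}}))
            bulk_lines.append(json.dumps(doc))
    payload = ("\n".join(bulk_lines) + "\n").encode("utf-8")
    req = urllib.request.Request(
        ES_ENDPOINT + "/_bulk",
        data=payload,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    urllib.request.urlopen(req)
```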
AWS Athena is a good and quick solution for a one-time job. If you also need other dashboards (histograms per response code, backend, user agent, …) and you want to analyze them over the long term, then you need some storage (Elasticsearch, …) plus a visualization tool (we have to say Grafana here, of course).
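For the Athena route, a hedged sketch of a one-time "find the slow paths" query, assuming an `alb_logs` table has already been created over the S3 logs (AWS documents a DDL for this); the database name and results bucket are placeholders:

```python
import boto3

athena = boto3.client("athena")

QUERY = """
SELECT request_url,
       approx_percentile(target_processing_time, 0.99) AS p99,
       count(*) AS hits
FROM alb_logs
WHERE target_processing_time >= 0   -- -1 means the target never answered
GROUP BY request_url
ORDER BY p99 DESC
LIMIT 20
"""

response = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "default"},          # placeholder database
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # placeholder bucket
)
print(response["QueryExecutionId"])  # poll get_query_execution() for completion
```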
I did look into Athena, though I think a low-tech version, `zcat *.log.gz | awk '{print $7, $13, $14, $15}' | sort -n`, suited me better for finding the slow request paths in the logs.
Nonetheless, I’m still a little confused about how to specify the period in Grafana.
=> It considers the selected timespan, the namespace, and the AWS retention policy. It doesn’t make sense to use a 60-second period when you are displaying the last month. So the implementation is smart: it tries to display the finest metric granularity available.
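As an illustration (a sketch under stated assumptions, not Grafana’s actual source), auto-picking a period could work like this: choose the smallest standard CloudWatch period whose data has not yet expired under the documented retention tiers, and which keeps the datapoint count under the GetMetricStatistics cap of 1,440 per request.

```python
from datetime import datetime, timedelta, timezone

# (period_seconds, retention) pairs per CloudWatch's documented retention tiers
PERIODS = [
    (60, timedelta(days=15)),
    (300, timedelta(days=63)),
    (3600, timedelta(days=455)),
]

def choose_period(start: datetime, end: datetime) -> int:
    now = datetime.now(timezone.utc)
    span = (end - start).total_seconds()
    for period, retention in PERIODS:
        if now - start > retention:
            continue  # data at this granularity has already expired
        if span / period <= 1440:  # GetMetricStatistics datapoint cap
            return period
    return 86400  # fall back to daily datapoints for very wide ranges

end = datetime.now(timezone.utc)
print(choose_period(end - timedelta(days=30), end))  # -> 3600 for "last month"
```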
Because you have sparse data values. If your request-count metric values are:
00:00 1
01:00 100
Does that mean you had ~50 requests at 00:30?
No, the metric value at 00:30 is NA (null, nil, …). But almost all monitoring graphs render it that way (the CloudWatch console as well): they just connect the two values, and that is fine for many use cases. You can enable/disable this behavior in Grafana (Display → Null value: connected).
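A tiny Python illustration of the point: connecting the two samples with a line implies a value at 00:30 that the datastore never recorded.

```python
points = {"00:00": 1, "01:00": 100}

# what a connected line chart implies halfway between the two samples
implied_0030 = (points["00:00"] + points["01:00"]) / 2
print(implied_0030)                # 50.5 -- looks like ~50 requests

# what the datastore actually has at 00:30
actual_0030 = points.get("00:30")  # None (NA/null/nil)
print(actual_0030)
```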
I prefer a bar graph for sparse values, for example for Lambda function stats.