K6 failed to upload metrics due to canceled context (OTEL)

kiksplx · October 14, 2024, 5:09am

Hi,

I noticed that the last batch of metrics fails to upload to an Otel Collector.
Both gRPC and HTTP protocols have the same behavior.

To replicate, set the export interval to 1 second with K6_OTEL_EXPORT_INTERVAL=1. (With the default 10s interval, nothing is logged when the test completes, but the metrics from the final batch are still not sent.)

Example: .\k6 run -e K6_OTEL_EXPORT_INTERVAL=1 -e K6_OTEL_GRPC_EXPORTER_INSECURE=true -e K6_OTEL_GRPC_EXPORTER_ENDPOINT=0.0.0.0:4317 -e K6_OTEL_EXPORTER_TYPE=grpc -o experimental-opentelemetry --vus 3 --duration 10s test.js

gRPC logs:
INFO[0000] Setting up source=console
INFO[0017] Tearing down source=console
INFO[0020] 2024/10/14 17:46:58 failed to upload metrics: context canceled: rpc error: code = Canceled desc = context canceled
INFO[0023] Handling summary source=console

HTTP logs:
INFO[0000] Setting up source=console
INFO[0017] Tearing down source=console
INFO[0020] 2024/10/14 17:00:44 failed to upload metrics: Post "http://0.0.0.0:4318/v1/metrics": context canceled
INFO[0023] Handling summary source=console

Please advise how to resolve this.

olegbespalov · October 17, 2024, 11:45am

Hi @kiksplx !

Welcome to the community forums!

I believe, in that case, k6 doesn’t control the metrics upload. It delegates it entirely to the OTEL SDK. By checking it, I see that there is a configuration option:

// WithTimeout configures the time a PeriodicReader waits for an export to
// complete before canceling it. This includes an export which occurs as part
// of Shutdown or ForceFlush if the user passed context does not have a
// deadline. If the user passed context does have a deadline, it will be used
// instead.
//
// This option overrides any value set for the
// OTEL_METRIC_EXPORT_TIMEOUT environment variable.
//
// If this option is not used or d is less than or equal to zero, 30 seconds
// is used as the default.

Right now, k6 doesn’t provide a way to override it, so you could try setting the OTEL_METRIC_EXPORT_TIMEOUT environment variable directly.

Let me know if that helps!

Cheers!

kiksplx · October 21, 2024, 2:10am

Hi @olegbespalov! Thanks for your response.

I tried using OTEL_METRIC_EXPORT_TIMEOUT, but it had no impact. Even with just 1 VU and 100 iterations, the summary report confirms that all 100 iterations completed, but the OTLP receiver misses 1-2% of them, and this log appears at the end:

INFO[0006] 2024/10/21 14:53:25 failed to upload metrics: context canceled: rpc error: code = Canceled desc = context canceled

Are you able to replicate this issue on your end?

Please let me know if you need more information. I appreciate your help.

olegbespalov · October 21, 2024, 7:23am

Hi @kiksplx

After looking again at your original message, I believe the issue is that you didn’t specify a unit for the export interval in K6_OTEL_EXPORT_INTERVAL=1. This resolves not to 1s (one second) but to 1ms (one millisecond), which is too short.

Could you please try using K6_OTEL_EXPORT_INTERVAL=1s?
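
To illustrate why the unit matters, here is a minimal sketch in JavaScript. The `parseIntervalMs` helper is hypothetical and is not k6's actual parsing code; it only assumes, as described above, that a value without a unit is read as milliseconds.

```javascript
// Hypothetical helper for illustration only; this is NOT k6's real parser.
// Assumption: a bare number (no unit) is interpreted as milliseconds.
function parseIntervalMs(value) {
  // Multipliers that convert each supported unit to milliseconds.
  const units = { ms: 1, s: 1000, m: 60000, h: 3600000 };
  const match = /^(\d+(?:\.\d+)?)(ms|s|m|h)?$/.exec(String(value).trim());
  if (!match) {
    throw new Error(`invalid duration: ${value}`);
  }
  // Fall back to milliseconds when no unit is given.
  const unit = match[2] || 'ms';
  return Number(match[1]) * units[unit];
}

console.log(parseIntervalMs('1'));   // 1     (one millisecond)
console.log(parseIntervalMs('1s'));  // 1000  (one second)
console.log(parseIntervalMs('10s')); // 10000 (ten seconds)
```

Under that assumption, K6_OTEL_EXPORT_INTERVAL=1 asks for an export every millisecond, while K6_OTEL_EXPORT_INTERVAL=1s asks for one every second.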

Hope that helps.

kiksplx · October 21, 2024, 9:41am

Hi @olegbespalov,

Ah yes, you’re right, it needs the unit.
I tried passing K6_OTEL_EXPORT_INTERVAL=1s, but unfortunately more metrics were not sent (~80 of 100). I also tested using the default (10s) but lost more metrics (~60 of 100).

It seems that the increased duration of the K6_OTEL_EXPORT_INTERVAL results in more metrics potentially being lost at the end.

I’m guessing that the k6 exporter ends as soon as the main function exits, and so fails to push the remaining metrics to OTLP. Thoughts?

I tried adding the --linger option to see if it completes the push, but that did not work.

olegbespalov · October 21, 2024, 1:12pm

@kiksplx

> I’m guessing that the k6 exporter ends as soon as the main function exits, thus missing to push the remaining metrics to OTLP. Thoughts?

Do you still see the message in the logs?

INFO[0006] 2024/10/21 14:53:25 failed to upload metrics: context canceled: rpc error: code = Canceled desc = context canceled

> I tried passing K6_OTEL_EXPORT_INTERVAL=1s unfortunately more metrics were not sent (~80 of 100). I also tested using the default (10s) but lost more metrics (~60 of 100).
> It seems that the increased duration of the K6_OTEL_EXPORT_INTERVAL results in more metrics potentially being lost at the end.

Is there a way to get the script that can be used to reproduce this?

kiksplx · October 21, 2024, 9:54pm

Hi @olegbespalov,

> Do you still see the message in the logs? (failed to upload metrics: context canceled)

I don’t. It has not happened since I applied your recommendation to set K6_OTEL_EXPORT_INTERVAL to at least 1s. Thank you.

> Is there a way to get the script that can be used to reproduce this?

Yes. I was hoping you could confirm if you can reproduce it.

Here’s the test script:

import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
    http.get('https://test-api.k6.io/public/crocodiles/');
    sleep(1);
}

and command:

./k6 run -e K6_OTEL_METRIC_PREFIX=k6_ -e K6_OTEL_GRPC_EXPORTER_INSECURE=true -e K6_OTEL_GRPC_EXPORTER_ENDPOINT=0.0.0.0:4317 .\test.js -o experimental-opentelemetry --vus 1 --iterations 100

olegbespalov · October 22, 2024, 6:16am

To reproduce it, I need to understand what exactly I should reproduce. After fixing the export interval, the error is gone; I see that on my end, and you have just confirmed it. If an issue remains, I need more details, such as the script and the exact symptoms.

Like:

> I tried passing K6_OTEL_EXPORT_INTERVAL=1s unfortunately more metrics were not sent (~80 of 100). I also tested using the default (10s) but lost more metrics (~60 of 100).

Which metrics did you find were lost? How do you measure the loss? In short, to help you, I need as many details as you can provide.

kiksplx · October 24, 2024, 12:32am

Hi @olegbespalov,

I think it’s working now, please ignore my previous findings. I’ve marked your recommendation as the solution.

Thank you for being so helpful.
