Metrics aggregation in large tests seems to cause excessive memory consumption

Hi there,

I have a test in k6 that consumes excessive, constantly increasing memory unless I specify --no-thresholds and --no-summary. The test uses only 50 VUs, but memory ramps up linearly over the course of a couple of hours to over 25 GB, at which point I run into OOM problems.

I’ve been through the “Running large tests” guide and tried all of its suggestions - the only thing having a significant impact on memory consumption for my test is adding --no-thresholds and --no-summary.

I believe this is because of the nature of the URLs used in the test - I have approximately 30 unique URL “patterns”, but each of these contains a dynamically generated number with 10 million possible values.

I’ve tried using URL grouping - both explicitly with the “name” tag on each request and with the http.url template helper. I can see the grouping working, but it doesn’t affect the memory growth.
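As an illustrative sketch (the endpoint, token, and helper names here are placeholders, not from my actual test), the grouping relies on a params object carrying a tags.name label so that k6 records one metric series per URL pattern instead of one per unique URL:

```javascript
// Sketch: the "name" tag groups metrics for dynamic URLs under one label.
function buildParams(apiToken) {
	return {
		headers: { Authorization: `Bearer ${apiToken}` },
		tags: { name: 'GET /api/{id}' }, // grouped label for all dynamic ids
	};
}

const id = Math.floor(Math.random() * 10000000); // one of ~10M values
const url = `https://example.test/api/${id}`;    // unique per request
const params = buildParams('dummy-token');
// In a k6 script this would then be: http.get(url, params);
console.log(params.tags.name); // → GET /api/{id}
```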

(Side note - I don’t think the “name” tag works properly for http.del() requests - I can only get it to work if I set a global name tag within options)

Setting --no-thresholds and --no-summary, I see no memory growth. Setting the output to cloud and looking at the performance insights, I don’t get warnings about the number of URLs or metrics, showing the grouping is working as expected.

However, I need the local summary to work, since I need to run tests for several hours, and everything else works locally except the metrics reporting.

I attempted to output to a local InfluxDB, but the volume/cardinality of metrics quickly blows up memory and CPU there too. Looking at the raw metrics by outputting to JSON, I see that although the name tag is correctly set, the unique url tag is still included with every metric - I believe this is the problem.

I tried to globally overwrite the url tag with a dummy value to work around this, but it doesn’t seem to be possible. Is there any way around this, or anything else I should look at that might help the situation?

Hi @ian.frazer,
welcome to the community forum. :tada:

Thanks for taking the time to describe your problem in detail. If URL grouping (the name tag) works for you, you can simply remove url from the systemTags option. It would also be worth disabling all the other non-required tags from that list.
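For example, a sketch of what the options could look like (the exact tag list depends on what your thresholds and summary actually reference; in a k6 script this object would be exported as `export const options`):

```javascript
// Sketch: trimming k6's systemTags so the high-cardinality "url" tag
// is not attached to every metric sample. Keep only the tags your
// thresholds or summary use (e.g. "name", "status", "check").
const options = {
	systemTags: ['status', 'method', 'name', 'check', 'error'],
};
console.log(options.systemTags.includes('url'));  // → false
console.log(options.systemTags.includes('name')); // → true
```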

(Side note - I don’t think the “name” tag works properly for http.del() requests - I can only get it to work if I set a global name tag within options)

Thanks for reporting, going to open an issue to fix it.


Thanks for the reply, and for highlighting the systemTags option, which I was unaware of. Unfortunately, even after removing the majority of the system tags and attempting some tuning of InfluxDB, there still seems to be too much data for InfluxDB to cope with.

Reducing the tags also doesn’t appear to have made a significant difference to k6’s own memory consumption when attempting to run with thresholds and summary active. I presume at this point I’m waiting on the implementation of Use HDR histograms for calculating percentiles in thresholds and summary stats · Issue #763 · grafana/k6 · GitHub to be able to have these options enabled for longer-running tests?

Regarding this part I have to correct myself - I think you are passing the wrong argument. As documented, the params argument for http.get is the second one, but for methods that accept a body it is the third.

it seems there’s still too much data for influxdb to cope.

In this case, I think you have to put Telegraf in front of InfluxDB to aggregate the metrics further.

Yeah, unfortunately in this specific case, if you need the local run plus summary and thresholds, there are no easier solutions.

@codebien are you saying that when unique URIs are used alongside a custom name tag, k6 still exports the unique URI?

Here is a case where a unique URI is produced due to accountIds, but I’m aggregating them with the tags param. Is there a bug in this operation based on what Ian reported?

	var uri = `/api/${accountId}`;
	var url = urlbase + uri;
	const params = {
		headers: {
			'Authorization': 'Bearer ' + api_token,
			'X-PSN-QA-Data-Marker': 'ltip',
			'Content-Type': 'application/json',
		},
		tags: {
			name: 'GET /api/accountIds',
		},
		timeout: request_timeout,
	};

Hi @PlayStay,
no, I meant that I expect @ian.frazer is doing:

http.del("", params)

instead, the correct call is:

http.del("", null, params)

because the del method’s signature expects a body as the second argument: del(url, [body], [params]).
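To illustrate the argument-order pitfall with a runnable mock (this `del` is a stand-in sketch, not k6’s actual implementation):

```javascript
// Mock of the del(url, [body], [params]) signature, showing why passing
// params as the second argument silently drops the name tag: it is
// consumed as the request body, and params stays null.
function del(url, body = null, params = null) {
	const tags = (params && params.tags) || {};
	return { url, body, name: tags.name || url };
}

const params = { tags: { name: 'DELETE /api/{accountId}' } };
const wrong = del('/api/123', params);       // params consumed as the body
const right = del('/api/123', null, params); // params in the correct slot
console.log(wrong.name); // → /api/123
console.log(right.name); // → DELETE /api/{accountId}
```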

Thanks - you’re correct, I was indeed calling http.del() incorrectly. With the arguments in the right order it works as expected.

Thanks for the pointer to use telegraf - will give that a try.