Hey folks. Based on this page - Cloud IPs - I’m trying to figure out a way to calculate the most efficient use of VUs for the arrival-rate executor. As you know, it’s great for maintaining throughput under variable latency by dynamically adjusting VUs throughout a test scenario. Love it! However, it’s expensive on VUh usage. I’m zeroing in on a crude way to calculate what to set for maxVUs when using arrival-rate and ramping-arrival-rate, based on the response time of my service under test, but it’s proving to be a brittle solution.
My biggest problem is how k6 decides which instance tier to use based on VUs. Given variables such as LZ count, think time, response time, and VUs: if I get it wrong, I max out CPU on the instance(s) in the LZ; if I kinda get it right, one scenario works fine but another runs out of VUs and truncates load before it reaches the target rate. The only time I get a clean run is when I give up and just allocate 1 VU per unit of transaction rate, which is, again, expensive.
So again: any thoughts on how to optimize usage of load generation instances within an LZ using ramping-arrival-rate? I’d like to take my two known values, target rate and expected response time, and determine the minimum value to set for maxVUs on any given test.
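For reference, here’s the rough back-of-envelope calculation I’ve been leaning on (it’s just Little’s law; the variable names and the 1.5x safety factor are my own placeholders, not anything from the k6 docs):

// crude estimate: concurrent VUs needed ≈ arrival rate * iteration duration (Little's law)
const targetRate = 3000;        // requests per second I want from a single LZ
const expectedRespTime = 0.25;  // expected response time per request, in seconds
const safetyFactor = 1.5;       // headroom so latency spikes don't drop iterations
const estimatedMaxVUs = Math.ceil(targetRate * expectedRespTime * safetyFactor);
// e.g. 3000 * 0.25 * 1.5 = 1125 VUs

It works on paper, but the moment response times drift from the estimate it either under-provisions (dropped iterations) or over-provisions (wasted VUh), which is the brittleness I mentioned.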
In this case I’m trying to drive no more than 3000 RPS from any given load zone; ultimately I want to run a test generating 30K (3K per LZ) regardless of the testing use case. Here’s the load-zone math I’m working from, followed by an example of my scenario options.
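Just arithmetic, with variable names made up for illustration:

// the per-LZ cap stays fixed, so the LZ count falls out of the total target
const totalTargetRPS = 30000;  // overall rate I eventually want
const perLZCapRPS = 3000;      // the most I want out of any single LZ
const lzCount = totalTargetRPS / perLZCapRPS;  // 10 load zones
const percentPerLZ = 100 / lzCount;            // 10% each, like the commented-out distribution entries below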
// excerpt of my options object (the scenarios block itself is omitted here)
export const options = {
noConnectionReuse: options_trigger_true_false,
noVUConnectionReuse: options_trigger_true_false,
thresholds: {
// we can set different thresholds for the different scenarios because
// of the extra metric tags we set!
'http_req_duration{test_type:peak}': [{ threshold: 'med<200', abortOnFail: latency_trigger_true_false, delayAbortEval: '180s' }],
'http_req_duration{test_type:gate_rush}': [{ threshold: 'med<400', abortOnFail: latency_trigger_true_false, delayAbortEval: '180s' }],
// we can reference the scenario names as well
'http_req_failed{scenario:peak}': [{ threshold: 'rate < 0.05', abortOnFail: error_trigger_true_false, delayAbortEval: '180s' }],
'http_req_failed{scenario:gate_rush}': [{ threshold: 'rate < 0.05', abortOnFail: error_trigger_true_false, delayAbortEval: '180s' }],
'vus_max': [{ threshold: `value < ${maxvu_allow}`, abortOnFail: vu_trigger_true_false, delayAbortEval: '180s' }],
},
discardResponseBodies: false,
summaryTrendStats: ['avg', 'min', 'max', 'p(95)', 'p(99)'],
insecureSkipTLSVerify: error_trigger_true_false,
ext: {
loadimpact: {
distribution: {
ashburnDistribution1: { loadZone: 'amazon:us:ashburn', percent: 100 },
// ashburnDistribution2: { loadZone: 'amazon:us:ashburn', percent: 50 },
/* // dublinDistribution: { loadZone: 'amazon:ie:dublin', percent: 10},
// capeTownDistribution: { loadZone: 'amazon:sa:cape town', percent: 10},
// hongKongDistribution: { loadZone: 'amazon:cn:hong kong', percent: 10},
// mumbaiDistribution: { loadZone: 'amazon:in:mumbai', percent: 10},
osakaDistribution: { loadZone: 'amazon:jp:osaka', percent: 10},
seoulDistribution: { loadZone: 'amazon:kr:seoul', percent: 10},
singaporeDistribution: { loadZone: 'amazon:sg:singapore', percent: 10},
sydneyDistribution: { loadZone: 'amazon:au:sydney', percent: 10},
tokyoDistribution: { loadZone: 'amazon:jp:tokyo', percent: 10},
// montrealDistribution: { loadZone: 'amazon:ca:montreal', percent: 10},
frankfurtDistribution: { loadZone: 'amazon:de:frankfurt', percent: 10},
londonDistribution: { loadZone: 'amazon:gb:london', percent: 10},
// milanDistribution: { loadZone: 'amazon:it:milan', percent: 10},
// parisDistribution: { loadZone: 'amazon:fr:paris', percent: 10},
// stockholmDistribution: { loadZone: 'amazon:se:stockholm', percent: 10},
// bahrainDistribution: { loadZone: 'amazon:bh:bahrain', percent: 10},
saoPauloDistribution: { loadZone: 'amazon:br:sao paulo', percent: 10},
// paloAltoDistribution: { loadZone: 'amazon:us:palo alto', percent: 10},
portlandDistribution: { loadZone: 'amazon:us:portland', percent: 10 }, */
},
projectID: nnnnnn,
name: 'Accounts Smoke Test'
}
}
};
Sorry, forgot to add that the target rates for these two scenarios are different. For instance:
peak target - 1000 RPS
gate_rush target - 3000 RPS
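And here’s roughly what those two scenario blocks look like (the stage durations and VU counts are illustrative placeholders, not my exact values):

scenarios: {
  peak: {
    executor: 'ramping-arrival-rate',
    startRate: 0,
    timeUnit: '1s',
    preAllocatedVUs: 250, // ~1000 RPS * ~0.25s expected response time
    maxVUs: 400,          // the number I'm trying to derive instead of guessing
    stages: [
      { target: 1000, duration: '5m' },
      { target: 1000, duration: '10m' },
      { target: 0, duration: '2m' },
    ],
    tags: { test_type: 'peak' },
  },
  gate_rush: {
    executor: 'ramping-arrival-rate',
    startRate: 0,
    timeUnit: '1s',
    preAllocatedVUs: 750, // ~3000 RPS * ~0.25s
    maxVUs: 1200,
    stages: [
      { target: 3000, duration: '5m' },
      { target: 3000, duration: '10m' },
      { target: 0, duration: '2m' },
    ],
    tags: { test_type: 'gate_rush' },
  },
},

The per-scenario test_type tag is what the http_req_duration thresholds in the options above key off.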
By my observations, for my test conditions, 1K RPS should be possible with 200-300 VUs spread amongst the LZ instance(s) and IP(s), and 3K should be possible with a capacity of 1K VUs. However, what I’m seeing is that at 3K RPS the single m5.large instance maxes out CPU on a number of the LGs (load generator IPs). When I add a second Ashburn LZ the CPU issue resolves, but having to use two LZs in the same region is kind of a pain. I could change think time/pacing, but then I’m playing whack-a-mole…
Thanks in advance,
PlayStay