I noticed with k6 output to InfluxDB 1.8/Grafana that the VU count reported in Grafana is always off by a few VUs. I even tried logging the output locally to a JSON log file and converting it to InfluxDB line protocol, and the VU count still shows as under-reported.
Example at https://github.com/centminmod/k6-benchmarking/blob/master/bench-ramping-vus.sh-2.md
Parsed k6 JSON output log at /home/k6-workdir/influxdb-vus.log for a benchmark run using the ramping-vus executor with 4 stages targeting 5, 10, 15, and 0 VUs respectively.
cat /home/k6-workdir/influxdb-vus.log
vus,testname=rampingvus value=1 1665497928000154000
vus,testname=rampingvus value=3 1665497929000153300
vus,testname=rampingvus value=4 1665497930000158500
vus,testname=rampingvus value=6 1665497931000157200
vus,testname=rampingvus value=8 1665497932000220000
vus,testname=rampingvus value=9 1665497933000154400
vus,testname=rampingvus value=11 1665497934000158500
vus,testname=rampingvus value=13 1665497935000171500
vus,testname=rampingvus value=14 1665497936000158000
vus,testname=rampingvus value=11 1665497937000153900
vus,testname=rampingvus value=6 1665497938000155600
vus,testname=rampingvus value=1 1665497939000153000
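For reference, the conversion from k6's JSON output into the line protocol above was done with something along these lines. This is only a sketch, not the exact parser used; the input path and the testname tag are assumptions:

```javascript
// convert-vus.js - minimal sketch: turn k6 `--out json` output into InfluxDB
// line protocol for the `vus` metric. Assumes k6's JSON format of one object
// per line, e.g. {"type":"Point","metric":"vus","data":{"time":...,"value":...}}.
const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({
  input: fs.createReadStream('/home/k6-workdir/summary-raw.json'), // hypothetical input path
});

rl.on('line', (line) => {
  let point;
  try {
    point = JSON.parse(line);
  } catch (e) {
    return; // skip malformed lines
  }
  if (point.type !== 'Point' || point.metric !== 'vus') {
    return; // only interested in the vus samples
  }
  // Line protocol expects a nanosecond timestamp; Date parsing drops
  // sub-millisecond precision, which is close enough for this sketch.
  const ns = BigInt(new Date(point.data.time).getTime()) * 1000000n;
  console.log(`vus,testname=rampingvus value=${point.data.value} ${ns}`);
});
```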
Edit: I’m using a gracefulStop of 30s and I’m guessing I’m hitting this: The VU metric shows a smaller number than what I've configured - #3 by mstoykov - OSS Support - Grafana Labs Community Forums?
mstoykov:
gracefulStop of 30s
By default we will wait for VUs to finish their already started iterations for 30s. This means that some VUs will return before others (arguably this is always the case … even if we stop them “simultaneously” without gracefulStop).
Now unfortunately in this case (and I think this is the only such case at all …) vus is a counter which increases and decreases (see stages/ramping-vus for an example of this); k6 actually stops all VUs at the end and that counter does go down to 0.
Unfortunately, it seems that “all VUs have stopped, we can print the summary now” and “all VUs have stopped, we have updated the counters, and that has gone through so it will be seen in the summary” are two distinct events, which leads to the last vus value (the one shown here) not being 0 as it should be … since at the end of a test we have always stopped all VUs.
tl;dr - in your particular case it was 10 VUs for the whole duration you care about.
Hope this answers your question
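For context, a minimal sketch of the kind of ramping-vus setup described above (4 stages targeting 5, 10, 15, and 0 VUs with a 30s gracefulStop). The stage durations and URL are assumptions for illustration, not the actual benchmark script:

```javascript
// ramping-vus-example.js - sketch of the setup described in this thread.
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    ramping_vus: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '3s', target: 5 },   // assumed stage durations
        { duration: '3s', target: 10 },
        { duration: '3s', target: 15 },
        { duration: '3s', target: 0 },
      ],
      gracefulRampDown: '30s',
      gracefulStop: '30s',
    },
  },
};

export default function () {
  const res = http.get('https://example.com/'); // hypothetical URL
  check(res, { 'is status 200': (r) => r.status === 200 });
}
```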
Sorry for the super-late response, I forgot to reply here earlier in the week
There are a couple of things that might be going on here. First, k6 only emits the vus metric once every second. And second, the target in a ramping-vus executor's stages represents the number of VUs the executor is supposed to reach right at the end of that stage. If the following stage has a lower target, k6 might just execute a single iteration (or no iteration, if gracefulRampDown is 0 or some small duration) and drop the VU. If that happens in less than a second, the vus metric emission might miss that one VU that was initialized.
Can you provide a self-contained example script that demonstrates the issue? There might also be some sort of off-by-one bug in k6 or something like that.
eva2000 (October 20, 2022, 12:48pm):
So a longer gracefulRampDown would mean more chance of hitting all the desired VU numbers for multi-stage tests?
I don’t have a self-contained example right now. I’ve evolved the script to combine 2 executors, constant-arrival-rate + ramping-vus (benchmark-scenarios-multi.js), and I’m testing higher concurrency numbers, which sort of hides the finer reported VU numbers, i.e. a 16,000 VU test at https://github.com/centminmod/k6-benchmarking/blob/master/bench-ramping-vus.sh-3.md
taskset -c 0-3 k6 run --tag testname=rampingvus -e RPS=5000 -e REQRATE_USERS=16000 -e USERS=0 -e STAGETIME=180s -e STAGE_VU1=5000 -e STAGE_VU2=10000 -e STAGE_VU3=15000 -e STAGE_VU4=0 -e URL=https://domain1.com --no-usage-report --out json=/home/k6-workdir/summary-raw-scenarios-multi.gz benchmark-scenarios-multi.js
execution: local
script: benchmark-scenarios-multi.js
output: json (/home/k6-workdir/summary-raw-scenarios-multi.gz)
scenarios: (100.00%) 2 scenarios, 150000 max VUs, 15m23s max duration (incl. graceful stop):
* constant_arrival_rate: 5000.00 iterations/s for 3m0s (maxVUs: 16000-150000, exec: constantarrival, gracefulStop: 5s)
* ramping_vus: Up to 15000 looping VUs for 12m0s over 4 stages (gracefulRampDown: 5s, exec: default, startTime: 3m18s, gracefulStop: 5s)
running (15m22.6s), 000000/016000 VUs, 2707350 complete and 2 interrupted iterations
constant_arrival_rate ✓ [======================================] 000000/016000 VUs 3m0s 5000.00 iters/s
ramping_vus ✓ [======================================] 00000/15000 VUs 12m0s
INFO[0931] [k6-reporter v2.3.0] Generating HTML summary report source=console
✓ is status 200
checks.........................: 100.00% ✓ 2707352 ✗ 0
data_received..................: 6.5 GB 7.0 MB/s
data_sent......................: 122 MB 132 kB/s
http_req_blocked...............: avg=7.18ms min=132ns med=248ns max=2.41s p(95)=367ns p(99)=792.56µs p(99.99)=1.94s count=2707352
http_req_connecting............: avg=3.38ms min=0s med=0s max=1.73s p(95)=0s p(99)=70.6µs p(99.99)=1.02s count=2707352
http_req_duration..............: avg=3.93ms min=167.36µs med=241.02µs max=1.75s p(95)=1.32ms p(99)=17.46ms p(99.99)=1.03s count=2707352
{ expected_response:true }...: avg=3.93ms min=167.36µs med=241.02µs max=1.75s p(95)=1.32ms p(99)=17.46ms p(99.99)=1.03s count=2707352
http_req_failed................: 0.00% ✓ 0 ✗ 2707352
http_req_receiving.............: avg=176.86µs min=7.85µs med=19.12µs max=1.11s p(95)=205.75µs p(99)=2.1ms p(99.99)=70.45ms count=2707352
http_req_sending...............: avg=87.23µs min=18.38µs med=28.61µs max=890.98ms p(95)=262.05µs p(99)=601.72µs p(99.99)=27.21ms count=2707352
http_req_tls_handshaking.......: avg=3.77ms min=0s med=0s max=1.58s p(95)=0s p(99)=645.25µs p(99.99)=928.06ms count=2707352
http_req_waiting...............: avg=3.67ms min=0s med=174.41µs max=1.16s p(95)=860.58µs p(99)=14.12ms p(99.99)=995.83ms count=2707352
http_reqs......................: 2707352 2934.531442/s
iteration_duration.............: avg=2.01s min=206.09µs med=2s max=5.03s p(95)=5s p(99)=5s p(99.99)=5.01s count=2707351
iterations.....................: 2707351 2934.530359/s
vus............................: 1 min=0 max=16000
vus_max........................: 16000 min=3089 max=16000
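For reference, the -e flags in the taskset command above are exposed to the script as __ENV variables. Here is a hedged sketch of how benchmark-scenarios-multi.js might consume them; the exact wiring inside the real script may differ:

```javascript
// Sketch only: how the -e flags above could be read via k6's __ENV.
import http from 'k6/http';
import { check } from 'k6';

function hit() {
  const res = http.get(__ENV.URL); // -e URL=https://domain1.com
  check(res, { 'is status 200': (r) => r.status === 200 });
}

export const options = {
  scenarios: {
    constant_arrival_rate: {
      executor: 'constant-arrival-rate',
      rate: Number(__ENV.RPS),                      // -e RPS=5000
      timeUnit: '1s',
      duration: __ENV.STAGETIME,                    // -e STAGETIME=180s
      preAllocatedVUs: Number(__ENV.REQRATE_USERS), // -e REQRATE_USERS=16000
      maxVUs: 150000,                               // per the run output above
      exec: 'constantarrival',
      gracefulStop: '5s',
    },
    ramping_vus: {
      executor: 'ramping-vus',
      startVUs: Number(__ENV.USERS),                // -e USERS=0
      stages: [
        { duration: __ENV.STAGETIME, target: Number(__ENV.STAGE_VU1) },
        { duration: __ENV.STAGETIME, target: Number(__ENV.STAGE_VU2) },
        { duration: __ENV.STAGETIME, target: Number(__ENV.STAGE_VU3) },
        { duration: __ENV.STAGETIME, target: Number(__ENV.STAGE_VU4) },
      ],
      gracefulRampDown: '5s',
      gracefulStop: '5s',
      startTime: '3m18s',
    },
  },
};

export function constantarrival() { hit(); }
export default function () { hit(); }
```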
Yes, but the VU iterations will also need to be more than a second long. k6 will hit whatever VU target you give it, but if the next target is lower, it will immediately try to decrease the actively running VUs. And if either your iteration duration or your gracefulRampDown option is super short, the 1-second measurement and emission of the vus metric might not coincide with the short peak of active VUs.
If you don’t want to mess with these (and you might not be able to, depending on the test), you can also add a stage with the same peak target and a duration of a few seconds. Like this:
stages: [
  { duration: '30m', target: 16000 },
  { duration: '5s', target: 16000 },
  { duration: '30m', target: 0 },
],
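(Holding the peak target for a few extra seconds should keep all 16,000 VUs active across several of the once-per-second vus samples, so the peak is captured rather than falling between emissions.)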
Yes, but the VU iterations will also need to be more than a second long. k6 will hit whatever VU target you give it, but if the next target is lower, it will immediately try to decrease the actively running VUs. And if either your iteration duration or your gracefulRampDown option is super short, the 1-second measurement and emission of the vus metric might not coincide with the short peak of active VUs.
I see. I compared my previous runs with the current one and noticed that after I introduced a random sleep time between 1.5s and 4.5s, the reported VUs are slightly more accurate. Is that because of the longer iteration duration?
before
iteration_duration.............: avg=1.85ms min=555µs med=1.15ms max=36.28ms p(95)=5.13ms p(99)=12.22ms p(99.99)=30.84ms count=46674
after
iteration_duration.............: avg=2.01s min=206.09µs med=2s max=5.03s p(95)=5s p(99)=5s p(99.99)=5.01s count=2707351
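The random sleep was added roughly like this (a sketch; the exact expression in the real script may differ):

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  http.get(__ENV.URL);
  // Random pause between 1.5s and 4.5s, so each iteration lasts well
  // over the 1-second vus metric emission interval.
  sleep(1.5 + Math.random() * 3);
}
```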
Sweet, thanks for the tip