Discrepancy between k6 response times and other tools (+20 seconds!) with large response body

Hi there

I hope someone can help with this issue which is driving me crazy.

While testing some newly implemented caching on an export endpoint here (locally), we ran into a weird problem with the k6 tests we use to verify the new, hopefully faster, response times.

After some troubleshooting we have simplified the test as much as possible, and are now sending a single HTTP GET request (receiving about 23 MB of JSON). We have set discardResponseBodies to true. The script is about as minimal as possible.
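
The script boils down to something like this minimal sketch (the URL here is a stand-in for our local export endpoint):

import http from "k6/http";

export const options = {
  discardResponseBodies: true, // the body is still downloaded (and timed), just not kept in memory
};

export default () => {
  // stand-in URL for the local export endpoint that returns ~23 MB of JSON
  http.get("https://localhost:52452/FlrExport");
};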

It seems like k6 adds about 20 seconds of overhead to the response times. Using other tools (we have tried Postman, Insomnia, ThunderClient, curl and now finally Gatling) we measured response times of 300-500 ms with a populated cache, while k6 consistently shows ~20 seconds.
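
For reference, the curl measurement can be taken along these lines (stand-in URL again; -w prints the total transfer time in seconds):

# add -k if the local dev certificate is not trusted
curl -s -o /dev/null -w "%{time_total}\n" https://localhost:52452/FlrExport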

Requests to other endpoints in the same API are showing fast response times, so we figure it must have something to do with k6 and the 20+ megabyte response, somehow.

Any idea what could be causing this?

edit:
here is an example of a report showing an http_req_receiving time of 19.4s:

     data_received..................: 23 MB  1.2 MB/s
     data_sent......................: 109 kB 5.5 kB/s
     http_req_blocked...............: avg=103.83ms min=103.83ms med=103.83ms max=103.83ms p(90)=103.83ms p(95)=103.83ms
     http_req_connecting............: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_duration..............: avg=19.51s   min=19.51s   med=19.51s   max=19.51s   p(90)=19.51s   p(95)=19.51s
       { expected_response:true }...: avg=19.51s   min=19.51s   med=19.51s   max=19.51s   p(90)=19.51s   p(95)=19.51s
     http_req_failed................: 0.00%  ✓ 0        ✗ 1
     http_req_receiving.............: avg=19.4s    min=19.4s    med=19.4s    max=19.4s    p(90)=19.4s    p(95)=19.4s
     http_req_sending...............: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_tls_handshaking.......: avg=102.1ms  min=102.1ms  med=102.1ms  max=102.1ms  p(90)=102.1ms  p(95)=102.1ms
     http_req_waiting...............: avg=111.22ms min=111.22ms med=111.22ms max=111.22ms p(90)=111.22ms p(95)=111.22ms
     http_reqs......................: 1      0.050977/s
     iteration_duration.............: avg=19.61s   min=19.61s   med=19.61s   max=19.61s   p(90)=19.61s   p(95)=19.61s
     iterations.....................: 1      0.050977/s
     vus............................: 1      min=1      max=1
     vus_max........................: 1      min=1      max=1

Hi @rodafr, Welcome to the community forum!

Can you please try to make a reproducible example that I can run?

I quickly tried this with the second Google result for “big json” (https://raw.githubusercontent.com/json-iterator/test-data/master/large-file.json), and curl and k6 had basically the same output:

[mstoykov@fedora ~]$ curl https://raw.githubusercontent.com/json-iterator/test-data/master/large-file.json -O /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 24.9M  100 24.9M    0     0  20.5M      0  0:00:01  0:00:01 --:--:-- 20.5M
curl: (3) URL using bad/illegal format or missing URL
[mstoykov@fedora ~]$ nvim big-json.js
[mstoykov@fedora ~]$ k6.v0.46.0 run big-json.js

          /\      |‾‾| /‾‾/   /‾‾/
     /\  /  \     |  |/  /   /  /
    /  \/    \    |     (   /   ‾‾\
   /          \   |  |\  \ |  (‾)  |
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: big-json.js
     output: -

  scenarios: (100.00%) 1 scenario, 1 max VUs, 10m30s max duration (incl. graceful stop):
           * default: 1 iterations for each of 1 VUs (maxDuration: 10m0s, gracefulStop: 30s)


     data_received..................: 26 MB 22 MB/s
     data_sent......................: 33 kB 28 kB/s
     http_req_blocked...............: avg=145.69ms min=145.69ms med=145.69ms max=145.69ms p(90)=145.69ms p(95)=145.69ms
     http_req_connecting............: avg=137.47ms min=137.47ms med=137.47ms max=137.47ms p(90)=137.47ms p(95)=137.47ms
     http_req_duration..............: avg=1.02s    min=1.02s    med=1.02s    max=1.02s    p(90)=1.02s    p(95)=1.02s
       { expected_response:true }...: avg=1.02s    min=1.02s    med=1.02s    max=1.02s    p(90)=1.02s    p(95)=1.02s
     http_req_failed................: 0.00% ✓ 0        ✗ 1
     http_req_receiving.............: avg=1.02s    min=1.02s    med=1.02s    max=1.02s    p(90)=1.02s    p(95)=1.02s
     http_req_sending...............: avg=265.21µs min=265.21µs med=265.21µs max=265.21µs p(90)=265.21µs p(95)=265.21µs
     http_req_tls_handshaking.......: avg=7.25ms   min=7.25ms   med=7.25ms   max=7.25ms   p(90)=7.25ms   p(95)=7.25ms
     http_req_waiting...............: avg=5.68ms   min=5.68ms   med=5.68ms   max=5.68ms   p(90)=5.68ms   p(95)=5.68ms
     http_reqs......................: 1     0.851984/s
     iteration_duration.............: avg=1.17s    min=1.17s    med=1.17s    max=1.17s    p(90)=1.17s    p(95)=1.17s
     iterations.....................: 1     0.851984/s
     vus............................: 1     min=1      max=1
     vus_max........................: 1     min=1      max=1


running (00m01.2s), 0/1 VUs, 1 complete and 0 interrupted iterations
default ✓ [======================================] 1 VUs  00m01.2s/10m0s  1/1 iters, 1 per VU

Even without discardResponseBodies

import http from "k6/http";

export default () => {
  http.get("https://raw.githubusercontent.com/json-iterator/test-data/master/large-file.json")
}

The version of k6 might also be relevant, I guess :person_shrugging: as well as anything else about the request. You might try --http-debug and copy the output here after some cleanup.
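
For example (script.js standing in for your test script):

k6 run --http-debug script.js        # logs request and response headers
k6 run --http-debug=full script.js   # also includes the bodies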

Do you have a proxy of some kind in between? Maybe it is configured to let Postman and co. through without throttling, but not k6?

Thanks for replying and trying to reproduce the situation.

Using the URL you provided, I can’t reproduce it either.

I have upgraded k6 to the latest version available (0.46.0) and tried again locally, and I get the same behaviour.
When I ran the test with the --http-debug flag I saw that I got the HTTP 200 OK almost immediately, so it seems like it’s the data transfer stage that takes the time (although this was already indicated by the report in my OP). I don’t understand why this would take so long.

INFO[0000] Request:
GET /FlrExport HTTP/1.1
Host: localhost:52452
User-Agent: k6/0.46.0 (https://k6.io/)
Accept-Encoding: gzip

  group= iter=0 request_id=3c5cd14c-6f34-4aa2-545a-4aff364df5bb scenario=default source=http-debug vu=1
INFO[0000] Response:
HTTP/2.0 200 OK
Connection: close
Content-Type: application/json; charset=utf-8
Date: Thu, 21 Sep 2023 09:26:39 GMT
Server: Kestrel

  group= iter=0 request_id=3c5cd14c-6f34-4aa2-545a-4aff364df5bb scenario=default source=http-debug vu=1

     data_received..................: 23 MB 1.1 MB/s
     data_sent......................: 73 kB 3.4 kB/s
     http_req_blocked...............: avg=17.33ms  min=17.33ms  med=17.33ms  max=17.33ms  p(90)=17.33ms  p(95)=17.33ms
     http_req_connecting............: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_duration..............: avg=21.6s    min=21.6s    med=21.6s    max=21.6s    p(90)=21.6s    p(95)=21.6s
       { expected_response:true }...: avg=21.6s    min=21.6s    med=21.6s    max=21.6s    p(90)=21.6s    p(95)=21.6s
     http_req_failed................: 0.00% ✓ 0        ✗ 1
     http_req_receiving.............: avg=21.48s   min=21.48s   med=21.48s   max=21.48s   p(90)=21.48s   p(95)=21.48s
     http_req_sending...............: avg=526.59µs min=526.59µs med=526.59µs max=526.59µs p(90)=526.59µs p(95)=526.59µs
     http_req_tls_handshaking.......: avg=16.27ms  min=16.27ms  med=16.27ms  max=16.27ms  p(90)=16.27ms  p(95)=16.27ms
     http_req_waiting...............: avg=112.13ms min=112.13ms med=112.13ms max=112.13ms p(90)=112.13ms p(95)=112.13ms
     http_reqs......................: 1     0.046254/s
     iteration_duration.............: avg=21.61s   min=21.61s   med=21.61s   max=21.61s   p(90)=21.61s   p(95)=21.61s
     iterations.....................: 1     0.046254/s
     vus............................: 1     min=1      max=1
     vus_max........................: 1     min=1      max=1

I don’t have a proxy in between that would differentiate between user agents, though the target API is run in a Docker container. This doesn’t seem to affect any of the other tools, so I’m not sure if it’s relevant.
I do see in my report that I only get about 1 MB/s transfer speed, which lines up: 23 MB at ~1.1 MB/s comes out to roughly the 21 seconds spent in http_req_receiving.

To test this further, I will deploy the solution to an actual environment and do some further investigation.

I’m not sure how I could make a reproducible example for you, I’m afraid.

update:
I do not get the same behaviour if I simply serve the same JSON as a static file, so something must be happening when the request goes through the ASP.NET Core API logic.

This seems super strange.

With you saying the words “docker” and “asp.net” … I now wonder whether you are on Windows and running some of these things through WSL and some not through it and … there being strange performance implications :thinking:

I agree, it’s super weird and it now gets even weirder.

I am on Windows (10) and using Docker Desktop (which runs in WSL 2) to run the API container, so your assumptions are correct. :slight_smile:

I have recreated the same export functionality (with caching) in an API written in Go, and when I run that in the same way (in a container) I do not get the same performance issue! :exploding_head:

The first request takes ~17 seconds to fetch the data and fill the cache:

     data_received..................: 30 MB 1.8 MB/s
     data_sent......................: 89 B  5.27996084713258 B/s
     http_req_blocked...............: avg=1.68ms   min=1.68ms   med=1.68ms   max=1.68ms   p(90)=1.68ms   p(95)=1.68ms
     http_req_connecting............: avg=722µs    min=722µs    med=722µs    max=722µs    p(90)=722µs    p(95)=722µs
     http_req_duration..............: avg=16.85s   min=16.85s   med=16.85s   max=16.85s   p(90)=16.85s   p(95)=16.85s
       { expected_response:true }...: avg=16.85s   min=16.85s   med=16.85s   max=16.85s   p(90)=16.85s   p(95)=16.85s
     http_req_failed................: 0.00% ✓ 0                  ✗ 1
     http_req_receiving.............: avg=258.45ms min=258.45ms med=258.45ms max=258.45ms p(90)=258.45ms p(95)=258.45ms
     http_req_sending...............: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=16.59s   min=16.59s   med=16.59s   max=16.59s   p(90)=16.59s   p(95)=16.59s
     http_reqs......................: 1     0.059325/s
     iteration_duration.............: avg=16.85s   min=16.85s   med=16.85s   max=16.85s   p(90)=16.85s   p(95)=16.85s
     iterations.....................: 1     0.059325/s
     vus............................: 1     min=1                max=1
     vus_max........................: 1     min=1                max=1

and the following requests, with caching, take just milliseconds as expected:

     data_received..................: 30 MB 113 MB/s
     data_sent......................: 89 B  332 B/s
     http_req_blocked...............: avg=2.52ms   min=2.52ms   med=2.52ms   max=2.52ms   p(90)=2.52ms   p(95)=2.52ms
     http_req_connecting............: avg=508.49µs min=508.49µs med=508.49µs max=508.49µs p(90)=508.49µs p(95)=508.49µs
     http_req_duration..............: avg=264.79ms min=264.79ms med=264.79ms max=264.79ms p(90)=264.79ms p(95)=264.79ms
       { expected_response:true }...: avg=264.79ms min=264.79ms med=264.79ms max=264.79ms p(90)=264.79ms p(95)=264.79ms
     http_req_failed................: 0.00% ✓ 0        ✗ 1
     http_req_receiving.............: avg=263.05ms min=263.05ms med=263.05ms max=263.05ms p(90)=263.05ms p(95)=263.05ms
     http_req_sending...............: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_tls_handshaking.......: avg=0s       min=0s       med=0s       max=0s       p(90)=0s       p(95)=0s
     http_req_waiting...............: avg=1.73ms   min=1.73ms   med=1.73ms   max=1.73ms   p(90)=1.73ms   p(95)=1.73ms
     http_reqs......................: 1     3.726341/s
     iteration_duration.............: avg=268.35ms min=268.35ms med=268.35ms max=268.35ms p(90)=268.35ms p(95)=268.35ms
     iterations.....................: 1     3.726341/s

Also, using the --http-debug flag I found the following difference in the http responses:

// dotnet api
INFO[0042] Response:
HTTP/2.0 200 OK
Connection: close
Content-Type: application/json; charset=utf-8
Date: Fri, 22 Sep 2023 09:38:53 GMT
Server: Kestrel

// go api 
INFO[0017] Response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/json; charset=utf-8
Date: Fri, 22 Sep 2023 09:40:05 GMT

So it seems like it has something to do with HTTP 1.1 vs 2 maybe, and the handling of connections?

So it seems like it has something to do with HTTP 1.1 vs 2 maybe, and the handling of connections?

My GitHub test uses HTTP/2 :person_shrugging:

I can’t find any issues like this on the internet.

Can you test your code with just a bit of Go code? As in just calling http.Get from the net/http package?

I just tested it using some super simple Go. My main looks like this:

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	startTime := time.Now()

	call, err := http.Get("https://localhost:50818/flrexport")
	if err != nil {
		fmt.Print(err)
		return // call is nil on error, so don't dereference it
	}
	defer call.Body.Close()

	// Note: http.Get returns as soon as the response headers are in,
	// so this timing does not include downloading the body.
	fmt.Println(call.Proto)
	fmt.Println(call.StatusCode)
	fmt.Println(time.Since(startTime))
}
I got these results; the first run is populating the cache:

C:\dev\source\flr-go> go run .
HTTP/2.0
200
23.7698516s
C:\dev\source\flr-go> go run .
HTTP/2.0
200
111.2603ms
C:\dev\source\flr-go> go run .
HTTP/2.0
200
50.9962ms
C:\dev\source\flr-go> go run .
HTTP/2.0
200
260.4256ms
C:\dev\source\flr-go> go run .
HTTP/2.0
200
43.1727ms

Then I did the same call a couple of times using k6 (with a filled cache) and got this:

http_req_duration..............: avg=29.65s  

http_req_duration..............: avg=33.34s

edit: I added the call.Proto output. So HTTP/2.0 is excluded as a possible cause for the problem here too. I’m out of ideas.

Is 404 the status you get?

Whoops, sorry I pasted the wrong thing there … I’ll update

Fixed the post.
Some of these times look a bit small, but at least they give an indication I think.

Also, I guess I forgot to give an update on what happened after I deployed the API to an actual environment and tested it there.

When it runs in our on-prem cloud service I do not get the super delayed response times.

So I guess the problem only seems to exist under the following conditions:

  • k6 v0.46.0 (or older) running on Windows 10.
  • an ASP.NET Core minimal API on .NET 7, not too dissimilar to this.
  • the API runs in a Docker container using Visual Studio’s automatic Docker Compose tools.
  • the API endpoint called returns a big JSON response (23 MB).
  • the endpoint is completely open, i.e. not behind any kind of auth.

Symptoms are:

  • slow response times, with about 20 seconds spent in the http_req_receiving stage
  • transfer speeds of about 1 MB/s

@rodafr can you create a (github) repo with a reproducible example including:

  1. the script
  2. the ASP.NET application
  3. instructions on how to run it.

At this point this seems strange, and as someone who doesn’t use ASP.NET or Windows I doubt I will be able to continue debugging it, but hopefully someone else on the team will be able to reproduce it and track it down to something more concrete.