What is an idiomatic way of tracking degradation between test runs?

Hey @rdt.one, thanks for the reply.

But my question is not about what to track, but how to track it. How do I collect the p95 response times from multiple runs and track their degradation over time?
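
To make the question concrete, this is roughly what I have in mind: a minimal sketch, not something from any tool's documentation. It assumes each run can hand me its raw response times in milliseconds, and that a plain JSON file is an acceptable place to keep the history. The names `p95`, `record_run`, `degradation`, and the `p95_ms` field are all hypothetical.

```python
import json
import math
from pathlib import Path


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of response times."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank into the sorted list
    return ordered[rank - 1]


def record_run(history_path, run_id, samples):
    """Append this run's p95 to a JSON history file and return the full history."""
    path = Path(history_path)
    history = json.loads(path.read_text()) if path.exists() else []
    history.append({"run": run_id, "p95_ms": p95(samples)})
    path.write_text(json.dumps(history, indent=2))
    return history


def degradation(history, window=5):
    """Relative change of the latest p95 versus the mean of up to `window` earlier runs.

    Returns None when there is no earlier run to compare against.
    Assumes response times are positive, so the baseline is never zero.
    """
    if len(history) < 2:
        return None
    previous = [entry["p95_ms"] for entry in history[-(window + 1):-1]]
    baseline = sum(previous) / len(previous)
    return (history[-1]["p95_ms"] - baseline) / baseline
```

With something like this, a CI job could call `record_run` after every test run and fail the build when `degradation(history)` exceeds some threshold (say `0.10` for a 10% slowdown). Is there a more idiomatic, built-in way to get the same result?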