We have a varied set of applications that need to be "benchmarked". Now I'm a little confused here. My confusions:
1> Is benchmarking simply running the system under peak load? How is it different from load testing? Pardon my naivety on the topic, but I seem to have lost any power of reasoning it out for myself.
I know that the client isn't looking to benchmark the servers in place, only the transactions on the SUT.
Well, they want me to suggest an approach. I have two approaches that I could think of:
1> Find the user load through a workload modelling exercise, then subject the system to peak-load conditions while monitoring the application parameters.
2> Find the user load through workload modelling, then subject the system to varying peak loads. But to what degree do I take this varying peak-load testing? When can I stop? Do I stop when, say, response time is breached, or when throughput takes a beating?
Am I headed in the right direction? It's all very theoretical, but in practice it may be gruelling.
It sounds like you're on the right track, but again, your client should dictate which components THEY want to use as their benchmark.
When response time exceeds a predefined threshold, my company usually refers to that as a capacity test.
We typically establish our "benchmark requirements" for an application and then use that test as a comparison to all future tests in which changes are made to any of the system components or the application itself.
As SteveO says, the point of a benchmark is to compare it against future results. That's it. Variables will be future versions of code, new hardware, and so forth. Whatever it is you measure, you simply have to be able to make the same measurements again.
One approach you might consider is determining your own benchmarks. You are probably not the only person on your team confused about this. Most people will check it off their project list and be satisfied that benchmarks were collected. It will be easier to get a definition of the SUT's transactions. Stick to that, and the rest is easy.
Without more context, I might collect the following datapoints and present them as benchmarks:
- response times for a single user
- response times and relevant resource metrics (server CPU, process memory consumption, etc.) at 50% of peak load
- response times and relevant resource metrics at 100% of peak load
What's peak load? Maybe you need to do some exploratory testing to find out. Ramp users up slowly until the response-time curve bends sharply upward. Pick that point as peak load, or at least peak load for your test environment. It can be arbitrary; just make sure it is within operating ranges.
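To make the ramp-up idea concrete, here's a minimal sketch of stepping user load upward and stopping when response time curves up sharply. Everything here is an assumption for illustration: `measure_response_time()` is a hypothetical stand-in for running your actual load tool at a given user count (its saturation curve is simulated), and the "knee" rule of 3x the single-user baseline is an arbitrary threshold you'd tune for your own system.

```python
# Exploratory ramp to locate "peak load": step the user count up and stop
# when response time rises sharply past the single-user baseline.

def measure_response_time(users, capacity=200):
    # Hypothetical stand-in for a real measurement: response time stays
    # near the 0.25 s baseline at low load, then rises steeply as the
    # simulated system approaches its capacity (hockey-stick curve).
    base = 0.25  # seconds
    utilization = min(users / capacity, 0.99)
    return base / (1 - utilization)

def find_peak_load(step=10, max_users=1000, knee_factor=3.0):
    """Ramp users in increments of `step`; return the last load level
    before response time exceeds knee_factor x the 1-user baseline."""
    baseline = measure_response_time(1)
    peak = step
    for users in range(step, max_users + 1, step):
        if measure_response_time(users) > knee_factor * baseline:
            return peak  # last "healthy" level before the knee
        peak = users
    return peak

if __name__ == "__main__":
    print(f"peak load for this environment: ~{find_peak_load()} users")
```

In a real test you would replace `measure_response_time()` with a call into your load tool and average several samples per step, since single measurements are noisy.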
Maybe you collect metrics at 75% of peak too, but keep it relatively simple so that you can repeat it easily in the future. Report these metrics, and save all your scripts, scenarios, resource data, and so forth for future use and comparison. Benchmarks complete!
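Since the whole point of the benchmark is comparison against future runs, here's one way the save-and-compare step could look. This is a sketch under assumed conventions: metrics are flat name/value pairs where higher is worse (response times, CPU%), and the 10% regression tolerance is an arbitrary example, not a recommendation.

```python
# Save a benchmark run to disk, then flag regressions in a later run.
import json

def save_benchmark(path, metrics):
    # Persist the baseline metrics (e.g. after the first benchmark run).
    with open(path, "w") as f:
        json.dump(metrics, f, indent=2)

def compare_to_benchmark(path, new_metrics, tolerance=0.10):
    """Return metrics that regressed more than `tolerance` (10% default)
    versus the saved baseline. Higher values are treated as worse."""
    with open(path) as f:
        baseline = json.load(f)
    regressions = {}
    for name, old in baseline.items():
        new = new_metrics.get(name)
        if new is not None and new > old * (1 + tolerance):
            regressions[name] = (old, new)
    return regressions

if __name__ == "__main__":
    save_benchmark("benchmark.json",
                   {"rt_1user_s": 0.25, "rt_50pct_s": 0.40, "rt_100pct_s": 0.90})
    worse = compare_to_benchmark("benchmark.json",
                                 {"rt_1user_s": 0.26, "rt_50pct_s": 0.55,
                                  "rt_100pct_s": 0.92})
    print(worse)
```

After a code drop or hardware change, you'd rerun the same scripts and scenarios, feed the new numbers into `compare_to_benchmark()`, and investigate anything it flags.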