Questions to ask
- Have performance test workflow(s) been identified?
- Have sanity functional tests been successful? You would not want to carry out performance tests on an application that is functionally broken.
- What does the target system (hardware) look like (specify all server and network appliance configurations)? Monitor load average, CPU, memory, etc. on all servers.
These can be monitored easily using the top command on a Unix system.
Don't forget to monitor the health of the test agents too. Load average, CPU utilization, I/O and memory usage are the least you should monitor on a test agent. You will often find that you have hit a system limit on the load test agent, which degrades test results.
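The agent-health check above can be sketched with Python's standard library; the 0.9 saturation threshold is an illustrative assumption, not a figure from this checklist.

```python
import os

def agent_saturated(threshold=0.9):
    """Flag a test agent whose 1-minute load average approaches its
    core count (the threshold fraction is an illustrative assumption)."""
    load1, _, _ = os.getloadavg()      # 1-, 5-, 15-minute load averages
    cores = os.cpu_count() or 1
    return load1 >= threshold * cores

print(agent_saturated())  # True when the agent is close to saturation
```

Run something like this periodically on each agent during the test, alongside I/O and memory checks.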
- Is the test environment the same as the live environment? If not, test results will have to be extrapolated.
- Has benchmarking been done? (The objective of benchmark tests is to determine the end-to-end timing of various critical business processes and transactions while the system is under low load with a production-sized database.)
- Have performance test requirements been identified? They could entail:
- Response Time -
- Is response time with respect to workload identified?
- Is it browser render time or delivery time to browser?
- Are excluded components defined, e.g. calls to third parties beyond the control of the system developer?
- What is the acceptable error rate during the measurement of response time?
- Workload Definition -
- What is the workload pattern? For example: begin with 1 user, add 1 new user every 5 seconds, and extend to 100 users. Note that if transaction completion time is short and the test is not run repeatedly over a longer period, you may never have all users up and running at any one time, since the initial users will have finished before the last user begins its transaction.
- Duration of test?
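The ramp-up caveat above is easy to sanity-check with arithmetic; a sketch (function name and numbers are illustrative):

```python
import math

def steady_state_concurrency(ramp_interval_s, transaction_time_s, total_users):
    """Approximate users active at once when one new user starts every
    ramp_interval_s seconds and each finishes after transaction_time_s."""
    return min(total_users, math.ceil(transaction_time_s / ramp_interval_s))

# 1 new user every 5 s, each transaction lasting 20 s, 100 users total:
print(steady_state_concurrency(5, 20, 100))  # -> 4, far below 100
```

If concurrency never reaches the target, either shorten the ramp interval or have each user loop through transactions for the full test duration.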
- Transactions -
- How many transactions should be completed during the load test? This is also known as throughput.
- Have the types of tests been identified?
- Stress test, Targeted Infrastructure test,
- Soak test (endurance test),
- Volume test (i.e. performance test with large database size etc),
- Failover test (start failing components (servers, routers, etc) and observe how response times are affected during and after the failover and how long the system takes to transition back to steady state)
- Network sensitivity tests (measure the impact of network traffic on an application that is bandwidth dependent)
- Have performance measure characteristics been identified? For example:
- System performance characteristics -
The load average represents the average system load over a period of time. It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.
It's (almost always) interpreted relative to the number of cores you have, so a load average of 4 on a 4-core box probably means the box is saturated. See http://stackoverflow.com/questions/21617500/understanding-load-average-vs-cpu-usage
- CPU, broken out by process including I/O wait time, user/system time, idle time >
How does your total CPU usage compare to load average? If you’ve routinely got a load average of 4 but your CPU usage is always under 50% (aggregated across all cores), then you’ve got some disk or network bottlenecks that aren’t letting you take advantage of all your cores.
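The rule of thumb above can be written as a small heuristic; the 50% figure comes from the text, while the function itself is just a sketch:

```python
def likely_io_bound(load_avg_1m, cpu_util_fraction, cores):
    """High sustained load with low aggregate CPU use suggests
    processes are stuck waiting on disk or network, not on CPU."""
    return load_avg_1m >= cores and cpu_util_fraction < 0.5

print(likely_io_bound(4.0, 0.45, 4))  # load 4 on 4 cores, CPU under 50%
```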
- Memory usage, broken out by process, and used, cached, free >
What does your memory usage look like? Is free + cached memory very close to zero? Most apps, daemons, etc. will work much better with a sizeable disk cache. You don’t want to completely exhaust system memory or you’ll start swapping to disk, and that’s very bad. How much free (unused, non-cached) memory do you have? How does this vary over time? Tune your processes to use that free memory. But keep enough (a small margin, perhaps 10% of total) in reserve for sudden spikes.
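The ~10% reserve suggested above translates directly into a memory budget; a sketch, where the totals and the reserve fraction are only examples:

```python
def memory_budget_mb(total_mb, reserve_fraction=0.10):
    """Memory that processes plus disk cache may consume while keeping
    a reserve (~10% per the text) for sudden spikes."""
    return int(total_mb * (1 - reserve_fraction))

print(memory_budget_mb(16384))  # 16 GiB box -> 14745 MB usable
```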
- Disk activity, including requests and bytes read/written per second
- Network bytes read/written per second
- Is your web server dumping nearly a MB/sec to disk during normal operations? That could be poorly tuned logging from Apache or one of your applications. Turn that chattiness down to get more performance.
- Server Response Time - This refers to the time taken for one system node to respond to the request of another. A simple example would be an HTTP 'GET' request from a browser client to a web server. In terms of response time, this is what all load-testing tools actually measure. It may be relevant to set server response time goals between all nodes of the system.
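What the tools measure, node to node, is essentially the wall-clock time of one request; a minimal standard-library sketch (URL and timeout are placeholders):

```python
import time
import urllib.request

def timed_get(url, timeout=10):
    """Time a single HTTP GET end to end -- delivery time to the
    client, which is what load-testing tools typically report."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
    return resp.status, len(body), time.perf_counter() - start
```

Measuring between internal nodes works the same way: point the client at each hop in turn.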
For performance data collection, I like collectd, RRDTool, DStat, and IOStat. You may also want to learn more about Linux performance management commands.
- Client-side, perceived performance -
- Distribution of response times (or at least mean, median and 90th percentile). Load-testing tools have difficulty measuring render time, since they generally have no concept of what happens within a node apart from recognizing a period of time where there is no activity 'on the wire'. To measure render time, it is generally necessary to include functional test scripts as part of the performance test scenario; many load-testing tools do not offer this feature.
- counts of successful (probably 200 OK) and failed (anything else) responses
- throughput, e.g. the total time to run a certain number of reports
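The per-request statistics above (mean, median, 90th percentile, success/failure counts) are easy to compute from raw samples; nearest-rank is one common percentile choice:

```python
import statistics

def summarize(samples_ms, statuses):
    """Mean/median/90th percentile of response times plus pass/fail
    counts (200 OK counted as success, anything else as failure)."""
    xs = sorted(samples_ms)
    p90 = xs[round(0.9 * (len(xs) - 1))]   # nearest-rank style percentile
    ok = sum(1 for s in statuses if s == 200)
    return (float(statistics.mean(xs)), float(statistics.median(xs)),
            p90, ok, len(statuses) - ok)

times = [120, 130, 140, 150, 160, 170, 180, 190, 200, 800]
codes = [200] * 9 + [500]
print(summarize(times, codes))  # -> (224.0, 165.0, 200, 9, 1)
```

Note how one 800 ms outlier drags the mean well above the median; this is why percentiles matter more than averages here.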
- Server-side errors and per-request details -
- You'll almost certainly uncover some errors under load. Make sure your application (and other server processes) have a reasonable amount of logging. Debug logging can cause a lot of unnecessary disk writes, so be sure to turn it off. But it's certainly okay to log errors during performance tests and in production. It's also a good idea to have Apache request logging, with timings included, turned on so you can see the responses the server gave out and the time taken to process them. This will back up what you're recording at the client.
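To back up client-side numbers from the server side, one option is an Apache LogFormat with %D (request duration in microseconds) appended; this parser is a sketch and assumes that custom format:

```python
import re

# Sketch assuming the Apache combined log format with "%D" (request
# duration in microseconds) appended via a custom LogFormat directive.
LOG_RE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+ .* (?P<micros>\d+)$'
)

def parse_line(line):
    """Return (status, duration in ms) or None for non-matching lines."""
    m = LOG_RE.search(line)
    return (int(m.group("status")), int(m.group("micros")) / 1000) if m else None

sample = ('127.0.0.1 - - [10/Oct/2024:13:55:36 +0000] '
          '"GET /login HTTP/1.1" 200 2326 "-" "curl/8.0" 18542')
print(parse_line(sample))  # -> (200, 18.542)
```

Aggregating these per URL and comparing against the load tool's figures quickly shows whether latency is added by the server or by the network in between.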