Following Part 2 of this series of posts about my session at AUSPC Melbourne, in this third part I’ll focus on describing a methodology to adopt for performance testing.

Methodology for Performance Testing

There is no successful testing without a proper strategy and execution. Nothing should be left to chance, and a good methodology should be adopted to obtain the best results.

Very good guidance on performance testing can be found on MSDN: Performance Testing Guidance for Web Applications. Although specific to web applications, the recommendations also apply to SharePoint apps.

The core principles of a performance test plan can be summarised in the following points:

1. Identify the Test Environment
– Ideally, it should be an exact replica of the production environment. Oh yes, I forgot to mention: do NOT test directly on the live environment! Performance testing puts your system under significant stress, and you really don’t want to place your live environment under this burden.
– Monitor hardware, software and network: CPU allocation, memory consumption, disk and Ethernet usage are the minimum aspects to consider. Network latency, especially when testing remote services, is an important factor that can skew performance results, as throttled bandwidth may prevent the application from fully scaling. A minimal monitoring sketch follows below.

[Slide 7]
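To make the monitoring point concrete, here is a minimal sketch, in Python purely for illustration (it is not the Microsoft tooling I’ll present later in this series), that samples CPU, memory, disk and network counters at a fixed interval during a test run. It assumes the third-party psutil package is installed; the interval and duration values are arbitrary examples, not recommendations.

```python
# Sketch: sample the counters mentioned above (CPU, memory, disk, network)
# at a fixed interval during a test run. Assumes the third-party 'psutil'
# package is installed (pip install psutil); interval and duration values
# are illustrative, not recommendations.
import time
import psutil

def sample_counters(duration_s=60, interval_s=5):
    """Collect one dictionary of counter readings per interval."""
    samples = []
    for _ in range(duration_s // interval_s):
        net = psutil.net_io_counters()
        samples.append({
            "cpu_percent": psutil.cpu_percent(interval=None),   # CPU utilisation since the previous call
            "memory_percent": psutil.virtual_memory().percent,  # physical memory in use
            "disk_percent": psutil.disk_usage("/").percent,     # used capacity on the root volume
            "net_bytes_sent": net.bytes_sent,                   # cumulative bytes sent
            "net_bytes_recv": net.bytes_recv,                   # cumulative bytes received
        })
        time.sleep(interval_s)
    return samples

if __name__ == "__main__":
    for sample in sample_counters(duration_s=30, interval_s=5):
        print(sample)
```

In a real test you would run a collector of this kind on every server in the farm and correlate the samples with the timeline of the load test.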

2. Prepare your Test Scenarios
– We now enter the specifics of the tool to use for executing performance tests. In the next posts, I’ll present Microsoft’s offering for executing integrated performance tests. Irrespective of the tool adopted, the key non-functional requirements to test in a software application are: Responsiveness, Scalability and Stability.
– Before executing any test, it is advisable to run a Smoke Test: a smoke test is the initial run of a performance test to see if your application can perform its operations under a normal load. Consider it a warm-up before the proper exercise. Applications that use JIT compilation or pre-load data into a cache before the first execution benefit enormously from such a preparation phase; without it, the cold start would skew the performance results.
– Spike Tests, by contrast, are a type of performance test focused on determining or validating the performance characteristics of the application when it is subjected to workload models and load volumes that repeatedly increase beyond anticipated production levels for short periods of time. In practice, Spike Tests measure the performance of an application under intense peak usage. Testing these conditions will reveal the performance boundaries of your application; a minimal sketch of both kinds of test follows below.
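As a minimal illustration of both kinds of run (again just a Python sketch, not the integrated Microsoft tooling covered in the next posts), the snippet below performs a short sequential smoke test and then a spike test that pushes bursts of concurrent requests well beyond the expected load. The URL, user levels and request counts are hypothetical.

```python
# Sketch: a sequential smoke test followed by a spike test that pushes short
# bursts of concurrent requests beyond the anticipated load. The URL, user
# levels and request counts are hypothetical examples.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "https://test.contoso.com/sites/demo/home.aspx"  # hypothetical test page

def hit_page(url=URL):
    """Request the page once and return the response time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start

def smoke_test(iterations=10):
    """Warm-up under normal load: a handful of sequential requests."""
    return [hit_page() for _ in range(iterations)]

def spike_test(user_levels=(10, 50, 200), requests_per_user=5):
    """Repeatedly push the load well beyond the expected level in short bursts."""
    results = {}
    for users in user_levels:
        with ThreadPoolExecutor(max_workers=users) as pool:
            timings = list(pool.map(lambda _: hit_page(),
                                    range(users * requests_per_user)))
        results[users] = timings
    return results
```

Running the smoke test first also gives JIT compilation and caches the warm-up mentioned above, so the spike measurements start from a steady state.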

3. Create the Baselines
– Setting the standard for comparison: this is the purpose of creating a baseline. Performance indicators are not absolute metrics; they are relative to an initial condition that is measured in the baseline. Knowing that your application consumes 10% of CPU is meaningless without context: which CPU is it, is 10% good or bad, and what are your expectations?
– Baseline results can be articulated using a broad set of key performance indicators, including response time, processor capacity, memory usage, disk capacity, and network bandwidth; the sketch below shows one way to record and compare them.
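As an illustration of the idea, the following sketch records a set of baseline KPIs to a file and later compares a new run against them. The metric names, the file location and the 10% tolerance are assumptions made purely for the example.

```python
# Sketch: record a baseline of key performance indicators and compare a later
# run against it. Metric names, the file location and the 10% tolerance are
# assumptions made for the example.
import json

def save_baseline(kpis, path="baseline.json"):
    """Persist baseline KPIs, e.g. {"avg_response_time_s": 1.2, "avg_cpu_percent": 35.0}."""
    with open(path, "w") as f:
        json.dump(kpis, f, indent=2)

def compare_to_baseline(current_kpis, path="baseline.json", tolerance=0.10):
    """Report the relative change of each KPI against the baseline and whether
    it stays within the given tolerance."""
    with open(path) as f:
        baseline = json.load(f)
    report = {}
    for name, baseline_value in baseline.items():
        current_value = current_kpis.get(name)
        if current_value is None or baseline_value == 0:
            continue  # nothing meaningful to compare for this KPI
        change = (current_value - baseline_value) / baseline_value
        report[name] = {
            "baseline": baseline_value,
            "current": current_value,
            "change": change,
            "within_tolerance": abs(change) <= tolerance,
        }
    return report
```

Whether you store the baseline in a file, a database or a test-management tool is secondary; what matters is that every subsequent run is compared against the same reference.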

4. Analyse the Test Results
– So you wrote your test scenarios, you executed them and you got some results… so what? Are they good, are they bad, should you worry? The answer lies in comparing against the baselines: is your app performing better or worse? Is the performance acceptable or not? Define “acceptable”! Don’t use vague adjectives; be specific about your expectations. To put it more formally, define your KPIs (Key Performance Indicators) in terms of SMART objectives: Specific, Measurable, Attainable, Relevant and Timed. You don’t want to say that a web page should open “quickly”, but rather that it should open in less than 2 seconds when subjected to a concurrent load of 100 users.
– The collection of all the KPIs you set represents your SLA to your stakeholders: do not commit to performance targets that are unattainable or unrealistic. Measure first, then deliver.
– Understand minimum, maximum, median, percentile, standard deviation: how does your app respond to limit conditions? What’s the average behaviour? What’s the user load that breaks your KPIs? Is your app a weightlifter or a marathon runner? A short analysis sketch follows below.

[Slide 11]
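The snippet below is a small sketch of this kind of analysis: it summarises a set of response times (minimum, maximum, median, 95th percentile, standard deviation) and checks them against a SMART KPI such as “the page opens in under 2 seconds for 100 concurrent users”. The timings and the 2-second threshold are illustrative assumptions.

```python
# Sketch: summarise a set of response times and check them against a SMART KPI
# ("the page opens in under 2 seconds for 100 concurrent users"). The timings
# and the 2-second threshold are illustrative assumptions.
import statistics

def summarise(response_times_s):
    """Minimum, maximum, median, 95th percentile and standard deviation."""
    ordered = sorted(response_times_s)
    p95_index = int(0.95 * (len(ordered) - 1))  # nearest-rank percentile, good enough for a sketch
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "std_dev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }

def meets_kpi(response_times_s, max_seconds=2.0):
    """True if the 95th percentile stays within the agreed response time."""
    return summarise(response_times_s)["p95"] <= max_seconds

# Example: timings (in seconds) collected from a run with 100 concurrent users.
timings = [0.8, 1.1, 1.3, 1.9, 2.4, 1.0, 1.2, 1.5]
print(summarise(timings))
print("KPI met:", meets_kpi(timings))
```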

Performance testing is not rocket science, but analysing the results does not give an immediate solution either. Rather, the process is an iterative one of testing, tuning, re-testing, re-tuning, and so on, until the best attainable performance is achieved. This introduces the need to automate the execution of performance tests, which is the subject of one of the next parts in this series of posts.

– Stefano Tempesta