How to write a test strategy

I’ve documented my overall approach to rapid, lightweight test strategy before but thought it might be helpful to post an example.  If you haven’t read the original post above, see that first.

This is a sanitised version of the first test strategy I ever wrote, and while there are some concessions to enterprise concerns, it mostly holds up as a useful example of how a strategy might look.  I’m not defending it as a good strategy, but I think it worked well as a document of an agreed approach.

Project X Test Strategy


The purpose of this document is –

  • To ensure that testing is in line with the business objectives of Company.
  • To ensure that testing is addressing critical business risks.
  • To ensure that tradeoffs made during testing accurately reflect business priorities.
  • To provide a framework that allows testers to report test issues in a pertinent and timely manner.
  • To provide guidance for test-related decisions.
  • To define the overall test strategy for testing in accordance with business priorities and agreed business risks.
  • To define test responsibilities and scope.
  • To communicate understanding of the above to the project team and business.


Project X release 2 adds reporting for several new products, and a new report format.

The current plan is to add metric capture for the new products, but not generate reports from this data until the new reports are ready.  Irrespective of whether the complete Project X implementation is put into production, the affected products must still be capturing Usage information.

Key Features

  • Generation of event messages by the new products (Adamantium, DBC, Product C, Product A and Product B).
  • New database schema for event capture and reporting
  • New format reports with new fields for added products
  • Change of existing events to work with the new database schema

Key Dates

  • UAT – Mon 26/03/07 to Tue 10/04/07
  • Performance/Load – Wed 28/03/07 to Wed 11/04/07
  • Production – Thu 12/04/07

Key Risks

Risk: The team’s domain knowledge of the applications being modified is weak or incomplete for key products.
Impact: The impact of changes to existing products may be misjudged by the development team, and products adversely affected.
Mitigation:
  1. Product C team to modify their application to create the required messages and perform regression testing.
  2. Product B application will not be modified; only the logs will be processed.
  3. Product A has good Selenium and unit test coverage, and domain skills exist in the team.
  4. Greater focus will be placed on creating regression tests for Product D and leveraging automated QTP scripts from other teams.

Risk: Data architect being replaced.
Impact: Supporting information that is necessary for generating reports may not be captured.  There may be some churn in technical details.
Mitigation:
  1. Development team has improved domain knowledge from the first release.
  2. Intention is to provide improved technical specifications and mapping documents.
  3. Early involvement of reporting testers to inspect output and provide up-front test cases.

Risk: Strategy for maintaining version 1 and version 2 of the ZING database in production has not been defined.
Impact: Test strategy may not be appropriate.
Mitigation: None.
Risk area: Project

Risk: Insufficient time to test all events and perform regression testing of existing products.
Impact: Events not captured, not captured correctly, or applications degraded in functionality or performance.
Mitigation:
  1. Early involvement of reporting testers to inspect output and provide up-front test cases (defect prevention).
  2. Additional responsibility of developers to write DbUnit tests.
These two activities will free testers to focus on QTP regression scripts.

Risk: Technical specifications and mapping documents not ready prior to story development.
Impact: Mappings may be incorrect.  Test-to-requirement traceability difficult to retrofit.
Mitigation: Retrofit where time permits.  Business to determine the value of this activity.
Risk area: Project

Risk: XML may not be correctly transformed.
Impact: Incorrect data will be collected.
Mitigation: Developers will use DbUnit to perform integration testing of the XML-to-database mapping.  This will minimise error in human inspection.
Risk area: Product

Risk: Usage information may be lost.  It is critical that enough information be captured to relate event information to customers, and that the information is correct.
Impact: No Usage information available.
Mitigation: Alternate mechanisms exist for capturing information for Product A and Product B.  Product B needs to implement a solution.  Regression testing needs to ensure that existing Product D events are unaffected.
Risk area: Product

Risk: No robust and comprehensive automated regression test suite for Product D components.  There may not be time to develop a full suite of QTP tests for all events and field mappings.
Impact: Product D regressions introduced, or regression testing of Product D requires extra resourcing.
Mitigation: Will attempt to leverage Product D scripts from other projects and existing scripts, while extending the QTP suite.

Risk: Project X changes affect performance of existing products.
Impact: Downtime of Product D products and/or loss of business.
Mitigation: Performance testing needs to cover combined product tests and individual products, compared to previous benchmark performance results.
Risk area: Parafunctional

Risk: Project X may affect products when under stress.
Impact: Downtime of Product D products and/or loss of business.
Mitigation: Volume tests should simulate large tables, full disks and overloaded queues to see the impact on application performance.
Risk area: Parafunctional

Risk: Reliability tests may not have been performed previously; that is, tests that all events were captured under load.  (I need to confirm this.)
Impact: Usage information may be going missing.
Mitigation: Performance testing should include some database checks to ensure all messages are being stored.
Risk area: Historical

Risk: Unable to integrate with the new reports prior to release of new event capturing.
Impact: Important data may not be collected, or data may not be suitable for use in reports.
Mitigation:
  1. Production data being collected after deployment needs to be monitored.
  2. Output of transformations to be inspected by reports testers.
  3. Domain knowledge of developers is improved.

Project X Strategy Model

[Diagram: high-level architecture of the application under test, showing key interfaces and flows]

The diagram above defines the conceptual view of the components for testing.  From this model, we understand the key interfaces that pertain to the test effort, and the responsibilities of different subsystems.

Products (Product D, Product A, Product C, Product B)

  • Generate events
      • Event messages should be generated in response to the correct user actions.
      • Event messages should contain the correct information.
      • Event messages should be well-formed XML.
      • Error handling?

SCE

  • Receive events
  • Pass events to the ZING database
      • Transform event XML to the correct fields in the ZING database for each event type.
      • Error handling?

Reports

  • Transform raw event information into aggregate metrics
  • Re-submit rejected events to ZING
  • Generate reports
      • Correctly generate reports for event data which meets specifications.
      • Correct data and re-load into ZING.


Product to SCE

This interface will not be tested in isolation.  Developers will be writing DbUnit integration tests, which will take XML messages and verify that the values in the XML are mapped to the correct place in the ZING database.
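As an illustration only (the real schema, element names and mapping live in the technical specifications, and the actual tests are DbUnit/Java), the core of such a mapping check amounts to something like this sketch:

```ruby
require "rexml/document"

# Hypothetical mapping of XML elements to ZING database columns.
MAPPING = {
  "event/type"       => :event_type,
  "event/customerId" => :customer_id,
  "event/timestamp"  => :occurred_at,
}

# Build the database row we expect the loader to insert for a message,
# so it can be compared against the row actually written to ZING.
def expected_row(xml)
  doc = REXML::Document.new(xml)
  MAPPING.each_with_object({}) do |(xpath, column), row|
    row[column] = doc.elements[xpath]&.text
  end
end
```

The value of automating this check, rather than eyeballing it, is exactly the point made above: it removes human inspection error from a large, repetitive comparison.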

ZING to Reports

The reporting component will not be available to test against, and domain expertise may not be as strong as for previous releases with the departure of senior personnel.  Available domain experts will be involved as early as possible to validate the contents of the ZING database.

Product to ZING

System testing will primarily focus on driving the applications and ensuring that –

  • Application’s function is unaffected
  • Product generates events in response to correct user actions
  • XML can be received by SCE
  • Products send the correct data through

Key testing focus

  • Ensuring existing event capture is unaffected (Product D).
  • Ensuring event details are correctly captured for systems.  This is more critical for systems in which there is currently no alternative capture mechanism (Product C, Adamantium, DBC).  Alternative event capture mechanisms exist for Product B and Product A.
  • Ensuring existing system functionality is not affected.  Responsibility for Product C’s regression testing will lie with Product C’s team.  There is no change to the Product B application, but sociability testing may be required for log processing.  Product A has an effective regression suite (Selenium), so the critical focus is on testing of Product D functionality.

Test prioritisation strategy

These factors guide prioritisation of testing effort:

  • What is the application’s visibility?  (ie. Cost of failure)
  • What is the application’s value? (ie. Revenue)

For the products in scope, cost of failure and application value are proportional.

There may be other strategic factors as presented by the business as we go, but the above are the primary drivers.

Priority of products –

  1. Adamantium/Product D
  2. Product C
  3. Product A
  4. Product B
  5. DBC

Within Product D, the monthly Usage statistics show the following –

  • 97% of searches are business type or business name searches.
  • 3% of searches are browse category searches
  • Map based searches are less than 0.2% of searches

Test design strategy

Customer (Acceptance) Tests

For each event, test cases should address:

  • Ensuring modified applications generate messages in all expected situations.
  • Ensuring modified applications generate messages correctly (correct data and correct XML).
  • Ensuring valid messages can be processed by SCE.
  • Ensuring valid messages are transformed correctly to the specified database fields.
  • Ensuring data in the database is acceptable for reporting needs.
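The first two checks can be driven as a data-driven table of user actions and the event types they should produce.  This sketch assumes a hypothetical `generate_event` entry point and an invented message shape, purely to show the pattern:

```ruby
require "rexml/document"

# Hypothetical table of user actions and the event type each should emit.
ACTION_TO_EVENT = {
  "search_by_business_name" => "business_name_search",
  "browse_by_category"      => "browse_category",
}

# Stub standing in for the application under test: it returns the XML
# message the product emits in response to a user action (shape invented).
def generate_event(action)
  "<event><type>#{ACTION_TO_EVENT[action]}</type></event>"
end

# Keep only the actions whose message is missing, malformed, or wrong.
def event_failures
  ACTION_TO_EVENT.reject do |action, expected_type|
    doc = begin
      REXML::Document.new(generate_event(action))
    rescue REXML::ParseException
      nil
    end
    doc && doc.elements["event/type"]&.text == expected_type
  end
end
```

A run over every in-scope event would then report `event_failures` rather than relying on case-by-case manual inspection.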

Regression Tests

For each product where event sending functionality is added:

  • All other application functionality should be unchanged

Additionally, the performance test phase will measure the impact of modifications to each product.

Risk Factors

These tests correspond to the following failure modes –

  • Events are not captured at all.
  • Events are captured in a way which renders them unusable.
  • Systems whose code is instrumented to allow sending of events to SCE are adversely affected in their functionality.
  • Event data is mapped to field(s) incorrectly
  • Performance is degraded
  • Data is unsuitable for reporting purposes

Team Process

The development phase will consist of multiple iterations.

  1. At the beginning of each iteration, the planning meeting will schedule stories to be undertaken by the development team.
  2. The planning meeting will include representatives from the business, test and development teams.
  3. The goal of the planning meeting is to arrive at a shared understanding of scope for each story and acceptance criteria and record that understanding via acceptance tests in JIRA.
  4. Collaboration through the iteration to ensure that stories are tested to address the business needs (as defined by business representatives and specifications) and risks (as defined by business representatives and agreed to in this document).  This may include testing by business representatives, system testers and developers.
  5. The status of each story will be recorded in JIRA.

When development iterations have delivered the functionality agreed to by the business, deployment to environments for UAT and Performance and Load testing will take place.


High priority

  • QTP regression suite for Project X events (including Adamantium, DBC) related to business type and business name searches
  • Test summary report prior to go/no go meeting

Secondary priority

  • QTP regression suite for Product A (lower volume, fewer events, and the application already collects metrics).  Manual scripts and database queries will be provided in lieu of this.
  • Product C should create Project X QTP regression tests as part of their development work.
  • Product B test suite will likely not be a QTP script, as log files are being parsed as a batch process.  GUI regression scripts will be suitable when Product B code is instrumented to add event generation.  If time permits, we will attempt to develop a tool to parse a log file and confirm that the correct events were generated.
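If that log-parsing tool were built, its core could be quite small.  The sketch below invents a log format and event names just to show the shape of the check:

```ruby
# Match lines like "2007-04-12 09:00:01 EVENT business_name_search"
# (this log format is invented for illustration).
LOG_PATTERN = /\A\S+ \S+ EVENT (\w+)\z/

# Tally how many times each event appears in the processed log, so the
# counts can be compared against the actions performed in the test run.
def events_in(log_lines)
  log_lines.each_with_object(Hash.new(0)) do |line, counts|
    if (m = LOG_PATTERN.match(line.strip))
      counts[m[1]] += 1
    end
  end
end
```

The comparison of expected versus actual counts is then a trivial hash equality, which makes the tool cheap enough to justify even for a single release.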

To do

  • Confirm strategy with stakeholders
  • Confirm test scope with Product C testers
  • Confirm events that are in scope for this release
  • Define scope of Product D testing and obtain Product D App. Sustain team testers for regression testing.

Tools for thinking about context – Agile sliders reimagined

Philosophically, I’m aligned to the context-driven testing view of the world. Largely, this is influenced by a very early awareness of contextual factors to success in my first job, and the wild difference between games testing and corporate testing roles that I had. Since 2003, the work of the context-driven school founders has been a significant influence in how I speak about testing.

In 2004, when I worked on my first agile project at ANZ, I was lucky to fall in with a group of developers and analysts who were skilled and keen to solve some of the problems we had in enterprise intranet projects. A huge piece of that was using agile ideas to solve *our* most pressing problems, not all of the problems that enterprise had. To do that we aligned the corporate project management practices and rules to more general principles, and then set about satisfying the principles in a way that met our other objectives (the main one being to make it cost less than $30,000 to put a static page on the intranet).

Another question that came up was how we might move away from a rule-based governance framework to one that was oriented to principles and context. The meat of this blog post is the result of how that initial idea connected to my context-driven approach. It then turned into a model for project context. It was also spurred on slightly by the over-simplification I perceived in the commonly used agile project sliders – Cost, Time, Scope, Quality (though Mike Cohn has a somewhat improved version, and reminds me I should finally read that Rob Thomsett book Steve Hayes recommended).

This has hidden in my blog drafts for a good seven or eight years. It is intended to support my test/delivery strategy mnemonic, though both are useful independently. I’ve recently started sharing it with my testers and colleagues, so I feel it’s time to open it up to the world for review.

The usual caveat applies: this is a model that works for me. If you find it helpful, I would love to hear from you. If you improve it or change it, I’d love to know about that too. Here are a few ways I hope it might help:

- To help us and other stakeholders consider elements of context that require us to assess the suitability of our standard approaches.
- To help ensure stakeholders understand that each piece of work they undertake is different in subtle but important ways.
- To help ensure that the test/delivery strategy/approach is reasonable.
- It may help us to create a record of project characteristics that we could search for stories about projects similar to the one we’re undertaking now.

This is rough, but given my observations of the context-driven community and software development in general, getting this out is more important than polishing it. So here is a model for context, intended to be put up somewhere visible with associated sliders:

Time to Market/Time constrained/Time criticality
A scale that indicates how time critical this piece of work is. That is, how bad is it if it takes longer than expected?

Business Risk
A scale that indicates the likelihood of failing for business reasons

Technical Risk
A scale that indicates the likelihood of failing for technical/technology reasons

Complexity
Is this inherently complex?

Size
Similar, but different to complexity. A big system with simple functions brings its own challenges.

Novelty
How well understood is this problem? Have others solved similar problems before?

Value ($)
What is the size of the benefit?

Team Size
How big is the team?

# External Stakeholders
How many of the stakeholders are not within the same management structures (ie. not subject to shared KPIs)?

Interfaces (external, internal)
Are there lots of interfaces to this product?

Cost/Budget/$ Constrained
How significant is the impact of spending more than planned?

Criticality (failure impact)
How bad will it be if this fails in production? (Max is life/safety critical)

Relative priority
How important is this relative to other projects in the organisation?

Scope/Feature constrained
How much opportunity is there to vary the scope of what is delivered? Fixed scope translates to risk if other things are constrained (especially time and budget).

Dependencies
Is everything required to solve the problem within the team? What things/people/knowledge needed to deliver are shared or external to the group?

Feedback cycle time
How quickly can you get feedback on questions regarding the product? This includes how quickly and how often you can test, as well as how long it takes questions regarding direction of the solution to be answered (eg. Availability of product stakeholders).

Communication bandwidth
When you are able to communicate as a team, what is the quality of that communication? Is it limited by technology or language? Offshore teams frequently have low-feedback, low-bandwidth communication.

Communication frequency
How often are you able to communicate with the team? Subtly different to the feedback cycle, in that someone may be able to provide answers quickly when available, but not very often.

Time constrained?
How fixed is the schedule? What is the impact of overrunning the planned completion date?

Team cohesion/familiarity
How long has the team worked together?

Team experience/maturity/skill
Has the team worked on this domain for a long time? Is there broad experience of different ways of working? Is the team strong technically?

Compliance requirement? (Note that this is not a slider, it’s a checkbox)
This can arguably be modelled using other properties, but may be worth flagging separately when a project is something that must be done.
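If you want the sliders visible somewhere other than a wall, they can be captured as data. A rough Ruby sketch (slider names abbreviated and values invented for illustration), where flagging sliders at either extreme gives you an automatic list of talking points for the strategy conversation:

```ruby
# A project's context captured as slider positions (1 = low, 5 = high).
# The compliance entry is the one checkbox, not a slider.
CONTEXT = {
  time_criticality: 5,
  business_risk:    3,
  technical_risk:   2,
  value:            4,
  team_size:        2,
  criticality:      4,
  compliance:       true,
}

# Sliders pushed to either extreme deserve explicit discussion.
def extreme_sliders(context)
  context.select { |_name, v| v.is_a?(Integer) && (v == 1 || v == 5) }.keys
end
```

For `CONTEXT` above, only `:time_criticality` would surface, which matches how I’d want the conversation to start on that project.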

Additional contributors:
Thanks to Shane Clauson for prompting an addition to this model last week. Thanks to Vito Trifilo for cooking the barbecued ribs that brought me and Shane together!

Why record-playback (almost) never works and why almost nobody is using it anyway

Alister Scott once again calls out a number of spot-on technical points regarding the use of automation tools. In this case, he discusses record/playback automation tools.

Technical reasons aside, we also need to look at the non-technical reasons.

I’ve only once encountered someone trying to rely on the record-playback feature of an automation tool (my boss, working as a consultant, and earning a commission on the licence of the tool). Record-playback exists primarily as a marketing tool. When we say ‘record-playback fails’, I generally take that to mean the product was purchased based on the dream of programmerless programming (I’m looking at you too, ‘Business Process Modelling’) and quickly fell into disuse when the maintenance cost exceeded the benefit of the automation.

The other common failing of course is that the most developed record-playback tools are (were?) expensive. I’m not sure what the per-licence cost of QTP/UFT is these days, but it used to be about 30% of the cost of a junior tester. Calculating the costs for even small teams, I could never defend the cost of the tool for regression automation over the value of an extra thinking person to do non-rote testing activities.

So if we can get through the bogus value proposition, especially relative to the abundance of licence-free options, there is a very limited space in which record-playback might add value:

- Generating code as a starting point for legacy projects. I’ve used record-playback to show testers how to start growing tools organically. That is, begin with a long, procedural record of something you want to automate. Factor out common steps, then activities, then business outcomes. Factor out data. Factor out environment, and so forth as you determine which parts of your system are stable in the face of change.
- If your test approach is highly data-driven, and the interface is pretty stable with common fields for test variations, you could quite feasibly get sufficient benefit from record-playback if your testers were mostly business SMEs and there was little technical expertise. For example, if testing a lending product you might have input files for different kinds of loans with corresponding amortisation and loan repayment schedules. When testing a toll road, we had a pretty simple interface with lots of input variations to test pricing. In this situation, the cost of the test execution piece relative to the cost of identifying and maintaining test data is relatively small.
- When we have some execution that we want to repeat a lot, in a short space of time with an expectation that it will be thrown away, quickly recording a test can be beneficial. In this case, we still have free macro-recording tools as alternatives to expensive ‘testing’ tools.
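The organic refactoring described in the first bullet can be sketched in a few lines. Here is a hypothetical recorded script (the driver calls are invented stand-ins for a GUI automation API) before and after factoring out activities and data:

```ruby
# Stub driver standing in for a GUI automation API; it records each call
# so the refactored script can be shown to do the same thing.
$actions = []

def click(target)
  $actions << [:click, target]
end

def type(field, value)
  $actions << [:type, field, value]
end

# A raw recording: one long procedure with data baked into every step.
def recorded_script
  type("username", "jquinert")
  type("password", "secret")
  click("login")
  type("search_box", "plumbers")
  click("search")
end

# After factoring out activities and data: reusable building blocks.
def login(user, pass)
  type("username", user)
  type("password", pass)
  click("login")
end

def search_for(term)
  type("search_box", term)
  click("search")
end

login("jquinert", "secret")
search_for("plumbers")
```

The next passes would factor out the environment and business outcomes in the same way, keeping only the parts of the system that are stable in the face of change.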

If we think of record-playback as tool assistance, rather than a complete test approach, there are some in-principle opportunities. In practice, they don’t usually stack up economically and we have other strategic options to achieve similar ends.

Testing does not prevent defects

There seems to be a bunch of discussion regarding whether testers prevent defects or not.

The main source of confusion that I see is confusing ‘testers’ with ‘testing’. Clearing this up seems pretty simple.

Testing does not prevent defects. Testers may. I do. But I don’t call that part of my work ‘testing’, even if testing and experiment design is a part of that work. Clarifying which aspects of my work are ‘testing’ and which are not is important for at least two reasons.

The first, is that I can discuss those skills clearly and keep that knowledge in one place. This allows me to make shortcut reference to those skills when I apply them in other roles I may be playing.

Secondly, time I spend playing some other role that may be about defect prevention (or team/product alignment, shared focus, or anything else) usually involves (sometimes irreconcilable) tradeoffs in the quality of my testing work or less available time for test effort. If I’m not making this clear to stakeholders who matter, or if I am unaware of the potential for conflict, I expose myself to a number of potential problems.

This blog post regarding test code being harder than application code was passed around the office, and I thought I would preserve my response here. You’ll need to read it first for this to make much sense.

I think there’s a reasonable point that testing is frequently trivialised, but I don’t think saying that the non-testing code is trivial is the right way to get that message across. Discussions regarding what developer testing looks like need to take into account a whole raft of context, which this article doesn’t seem to get into (eg. Defect prevention strategies, architecture, system criticality).

For example, this is completely untrue, unless he is assuming an ideological TDD view of the world:

“The tests are the program. Without the tests, the program does not work.”

Without *testing*, the program probably won’t work. Without tests, the program could easily work perfectly well. You’ll certainly have a program (of possibly unknown quality).

This is sometimes true:

“Sometimes the tests are the hard part, sometimes the code is the hard part.”

…but just as often, the code is the hard part and the tests are easy.

At my last job, I remember a simple code change that took a week of test planning, a custom execution framework, two weeks of execution and another week or so of results analysis (plus DBA support to write performant queries against the data). It would have been nice if his examples had included something along these lines. His search example could have provided an interesting example of this, as well as the limits of automated testing when the results are subjective (ie. ‘Good’ search results).

I’d suggest the main reason tests contain as many errors as code is because both code and test cases are the result of similar communication and modelling processes. The general process of getting working software is the process of bringing the code and the test models into alignment.

Where tests have *more* bugs than code, I’d suggest it’s usually a result of the test design time being squeezed, and the fact that nobody really likes writing procedural test cases.


More haiku updates

I’ve added some new ones and need to take one out. At some point, there should probably be a bunch of Scaled Agile haiku.

See the agile haiku page.

Updated haiku

I’m surprised at how relevant it all still is, but I have added something that I think is missing that I’ve learned in my last couple of roles. See my ‘Essence of Agile’ Haiku for the update.

Some thoughts on iteration vs incrementation

Alister Scott’s post on Incrementation vs Iteration was doing the rounds at work with some comments, and I felt the need to comment.  I had a couple of attempts at responding to this.  It’s a big topic, but to some degree I think iterative vs incremental is a bit of a distraction as a general philosophical discussion (and I *love* philosophical discussion, so I am not meaning to be dismissive).

I think it’s more important to ask -

- What is our plan for validating the product definition?
- What is our plan for validating the architecture?
- How does the organisation want to ‘manage the work’?
- As a team, how will we know when our work is ‘good enough’?
- As a team, how do we plan to manage incomplete work (ie. future enhancements and defects)?

The first two are probably the biggest factors of uncertainty that will drive the degree of iteration vs incrementation.

The third is a question relating to the organisation’s belief systems around predictability of software projects and how much power management wants in designing work methods to support their agendas.

The fourth question will shape how actively the team works against ‘doneness’ by finding bugs, soliciting feedback and exploring the solution space throughout the iteration.

The last question is about how the team wants to manage defects and backlog.  If you don’t want to carry bugs and want to minimise backlog management, my experience is that the sprint needs to plan for internal iteration (or you need to get rid of sprints/iterations and go with a pull/dock assembly approach).

The most important thing is understanding why we iterate.  Alister highlights a couple of examples when he talks about time to market vs user experience prioritisation, but it’s only sensible to talk about iteration in specific contexts of uncertainty and risk management.  Similarly, we discuss incrementation in the context of the cost/benefit of a particular release approach and/or schedule.

I feel the biggest lesson in the Android/iOS history is as an example of how expensive it is to fix architecture if you get it wrong.  Excuse me before I start on my old-guy ‘RUP got it right’ rant.

More Ruby goodness for testing

Did I mention how much I love Ruby?

require 'active_support/core_ext/array/grouping'  # in_groups is an ActiveSupport extension

items = ("A".."Z").to_a.in_groups(5, false)

5.times do |i|
  puts items[i].to_s
  puts "----"
end
Source code is at

Yet another reason why testers should use Ruby

Instant sequence of New South Wales licence plates:



def plate_sequence(start_lpn, end_lpn)
  # String#upto uses String#succ, so "AAA-000".upto("AAA-002") yields each plate in turn
  start_lpn.upto(end_lpn).to_a
end


About me

I'm Jared Quinert, a testing consultant located in Melbourne, Australia. With over fifteen years of experience, I specialise in agile testing, context-driven testing and intelligent toolsmithing with a focus on business outcomes over process. As one of the most experienced agile testers in Australia, I've been diving in hands-on since 2003 to discover how to build successful whole-team approaches to software development.
