PROFESSIONAL TESTER
Essential for software testers
February 2012 | v2.0 | number 13 | £4 / €5
SUBSCRIBE: it's FREE for testers

development testing

Including articles by:
Chris Adlard, Coverity
Les Hatton, Kingston University and Oakwood Computing Associates
Harry M. Sneed and Manfred Baumgartner, ANECON
Boguslaw Czwartkowski, Parasoft
Geoff Quentin

THIS ISSUE OF PROFESSIONAL TESTER IS SPONSORED BY COVERITY
Easily record automated tests for
your modern HTML5 apps
Test the reliability of your rich, interactive JavaScript apps with just a few clicks.
Benefit from built-in translators for the new HTML5 controls, cross-browser support,
JavaScript event handling, and codeless test automation of multimedia elements.
www.telerik.com/html5-testing
From the editor
Coming soon in PT: you?
A preview of themes planned for
forthcoming issues of PT can be found
at http://professionaltester.com/magazine/calendar.asp. We ask readers to spend
a few minutes considering them. If you
have any comments, or suggestions for
more themes, please email me direct at
editor@professionaltester.com.
Please consider also contributing to PT.
Fellow testers want to share your personal
insight and learn about how you and your
colleagues are meeting your challenges.
Some of our very best articles come from
first-time authors who are busy delivering
testing rather than commenting on,
speaking about or serving it. You don't
have to be working at the cutting edge
of innovation, be a great writer or have
time on your hands: what we want is your
idea, and we will offer any help you need
with presenting it.
Development testing

PT has always advocated doing more testing earlier, and the current, very encouraging renewed interest in development testing makes this the right time to deliver an issue we have wanted to produce for some years. We commend these approaches to readers and thank contributors, advertisers and especially this issue's sponsor, Coverity, for making it possible.

Edward Bishop
Editor

Editor
Edward Bishop
editor@professionaltester.com

Managing Director
Niels Valkering
ops@professionaltester.com

Art Director
Christiaan van Heest
art@professionaltester.com

Sales
Rikkert van Erp
advertise@professionaltester.com

Publisher
Jerome H. Mol
publisher@professionaltester.com

Subscriptions
subscribe@professionaltester.com

Contributors to this issue:
Chris Adlard
Les Hatton
Harry M. Sneed
Manfred Baumgartner
Boguslaw Czwartkowski
Geoff Quentin

Professional Tester is published by Professional Tester Inc. We aim to promote editorial independence and free debate: views expressed by contributors are not necessarily those of the editor or of the proprietors. ©Professional Tester Inc 2012. All rights reserved. No part of this publication may be reproduced in any form without prior written permission. “Professional Tester” is a trademark of Professional Tester Inc.
IN THIS ISSUE

Development testing
4 Why testers should hate finding defects
How development testing makes empirical testing better. With Chris Adlard
8 Be honest
Professor Les Hatton on design, development and testing for reliability
12 The covers are off
Edward Bishop discusses code coverage and suggests an idea
16 From test-driven to tester-driven agile development
Harry M. Sneed and Manfred Baumgartner propose a new approach to team and process organization
22 Static dynamism
Boguslaw Czwartkowski says it's time to reassess the benefits of code analysis

Series
25 CTP7
Geoff Quentin concludes his Consolidated Testing Process by explaining its meaning and aims

Visit professionaltester.com for the latest news and commentary
Why testers should hate finding defects
by Chris Adlard

Empirical testing should demonstrate correctness. If it does not, analytical testing has already failed.

Chris Adlard explains the importance of development testing.
Every defect detected in empirical
testing, that is by executing code after
integration, causes cost, delay and risk
that could have been avoided. Even in
the best case, where the fix is quite trivial,
retesting and regression testing are
necessitated. More complex rework often
requires complex, extensive revision of
test design and implementation which
diminishes the value and effectiveness of
testing already done. Doing these things
is expensive and doing them thoroughly
is difficult, but the alternatives are
dangerous.
Testers aim to make empirical testing
powerful. In other words, they try to make
their tests effective at finding defects if
they exist. In practice, that activity is the
same as trying to find defects and it is
only human nature to feel pleased when
one succeeds in what one has been
trying to do. But we must never allow
that emotional reaction to make us feel
that detecting a defect in empirical
testing is a good thing. In fact it is a
disaster for all concerned, except
possibly business competitors.
Why testers hate not finding defects
The only thing worse than finding defects
is missing them and empirical testing is
very good at doing that. Without effectively
infinite resources, it is dependent on luck:
testing with a particular combination of
circumstances that reveals failure, out of
a very large number not tested, then
observing it. Test design techniques exist
to help to increase the probability of that
happening but cannot do so by much.
Automating test execution provides
important help too, by increasing the
amount of testing that can be done with
the available resources and making it
more consistent, but doing it is very
challenging: the entire career of some
testers is built on mastering and improving
test automation skills. Most software is
designed either for use by humans or (as
in control systems etc) to react to physical
change. Making other software simulate
either of these and validate the resulting
behaviour correctly is generally approximately as hard as building the test item
itself. It's a good example of making a
computer do something for which it is
not suited.
Transform quality control into
quality assurance
So as IT systems become more complex
and critical it is more important than ever
to detect defects earlier, in analytical
testing, and thus deliver cleaner builds to
empirical testing. That does not make
empirical testing less necessary: it makes
it better by preventing delay, iteration and
complication caused by detection of
defects that should not exist by the time
it begins. That means the expertise and
effort put into it yield faster demonstration
of correctness and other defects, often
ones of fundamental business logic or
difficult non-functional attributes, that
might otherwise be missed. Opportunities
to achieve all this occur before integration,
one of the best being when code is being
written. The testing done at that time is
called development testing.
Development testing can be done by
developers or testers but, like agile,
works best when they collaborate closely.
A range of techniques is available. One
of the best known is code review: it is
essential and can bring great benefit as
one of the most effective ways to detect
functional defects caused by human error
in logic, understanding, or manipulation of
code (for example reuse). Again it is very
difficult and expensive. Even the best
developers find it a hard, painstaking
and slow task and typically miss as
many defects as they find. Here we have
an example of humans doing something
for which they are not suited, but unlike
empirical testing this work is perfectly
suited to be automated.
Source code is designed to be read and
analysed by a machine: even the simplest
compiler or interpreter must validate
syntax and structure before the conversion
to executable code. It is a logical extension
of that concept to write programs that
check other attributes, scanning for inconsistencies and constructs that could indicate error or cause risk of failure. Unlike a human, such a program will not miss any defect it is capable of finding. Such programs are very fast, typically covering code at
least one order of magnitude faster than
empirical testing, even if automated.
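For illustration (a contrived example, not one taken from Coverity's output), here is the kind of C fragment such a scanner flags mechanically and exhaustively, however rarely the affected paths execute:

#include <stdio.h>

/* Illustrative only: two classic statically-detectable defects. */
int first_byte(const char *path)
{
    FILE *f = fopen(path, "rb");
    int c = fgetc(f);   /* null pointer dereference: fopen may return NULL */
    if (c == EOF) {
        return -1;      /* resource leak: f is never closed on this path */
    }
    fclose(f);
    return c;
}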
Automated code analysis is not an alternative to code review: rather, it makes it better. Every tester who participates in document reviews knows how important it is that the deliverable is of good quality before the review begins. Minor defects cause critical ones to be missed by obfuscating the creators' true intentions, obstructing the defect-detection process in use and causing the reviewers to lose focus. The same syndrome affects code reviews but is prevented by effective automated code analysis beforehand.

Today's development testing is much more than highlighting wrong or suspicious constructs and enforcing standards. Coverity has spent almost a decade advancing the technology in its static, dynamic and architecture analysers, continually enhancing their ability to detect defects that previously only humans could. They have been combined to form a complete development testing platform with sophisticated visibility and reporting facilities. This platform can integrate other analysis and unit testing tools, including Open Source ones, into a single user interface and can itself be integrated directly with HP ALM, Microsoft Visual Studio, Eclipse IDE and continuous integration servers such as Jenkins.
The effectiveness of automated development testing can be measured precisely.
It deals with absolutes, not the uncertainty
that always surrounds testing later in the
life cycle. The defects it finds are evident
and specific, the difficulty in reproducing
and defining empirical failure does not
apply and there is certainty that more
instances of them do not exist anywhere
else in scanned code. That is important
because development testing is especially
good at detecting the types of defect that
cause security failure.
Keeping code agile
The focus of any software delivery
organization must be on the next release,
but that can cause problems to be stored,
both inadvertently and as a result of
necessary short-term decisions, which
will have their impact later when more
Figure 1: High- and medium-risk defects detected in the Android kernel, from the Coverity Software Integrity Report published in 2010. [Bar chart, not reproduced, showing counts from 0 to 90 of high- and medium-risk defects by type: memory corruption; illegal memory access; resource leak; uninitialized variable; API usage error; class hierarchy inconsistency; concurrent data access violation; control flow issue; error handling issue; incorrect expression; insecure data handling; integer handling issue; null pointer dereference; program hang.]
extensive change of nature as yet unknown is needed. The cost of the rework
that will be needed to make that possible
can be called technical debt. Without auto-
mated code analysis, measuring it is
speculative, subjective and prone to error.
Development testing makes the
measurement objective and accurate
by finding and providing clear visibility
of real properties of the code that cause
technical debt such as complexity that
can be simplified, tight coupling that
can be loosened and unnecessary
dependencies. With it in place, as well
as supporting development of correct
functionality and attributes, testing
becomes able to inform and monitor
effective refactoring to extend that
correctness into the future and
eliminate technical debt.
The SCAN project
Because its development testing products
are so fast and efficient, Coverity is able
to provide them as a free cloud-based
service to Open Source projects. This
began in 2006 in collaboration with the
US Department of Homeland Security
and has now tested over 61 million lines
of code from more than 290 varied
products including Linux, Apache, PHP
and Android (see figure 1). Nearly
50,000 defects have been detected of
which over 15,000 have been fixed.
This work benefits countless people by
continually improving the quality of the
software they use, including to develop
more software.
The vastness of this sample enables
accurate defect statistics to be derived,
which we are certain are a true reflection
of defects in non-Open Source products
too. Figure 2 shows the frequency of
occurrence of the most common defect
types. It's important to note that while
some of these defects may have been
detected by empirical testing, with consequent cost, delay and risk as described above, most would not; left undetected, they would enter production, where they would cause serious functional, non-functional and security risk.

Figure 2: Frequency of detection of common defects in 61 MLOC. [Pie chart, not reproduced; the most common defect types shown are: null pointer dereference, resource leak, unintentional expression, uninitialized value read, use after free, buffer overflow, other.]
Chris Adlard is EMEA marketing director at Coverity (http://coverity.com). For more
information about the SCAN project see http://scan.coverity.com.
Be honest
by Les Hatton

Development and testing must be inextricably intertwined.

Les Hatton explains why formal specification, prototyping and parallel test design are at the heart of successful software.
Design, development and testing are
emphatically not phases of a project.
They are states of mind which co-exist
throughout a project. This merely reflects
the essentially iterative nature of software
development. I don't think I have ever
embarked on a software project where
I knew what I was doing at the beginning
(in fortunately few, but memorable, cases
I didn't at the end either). Instead,
requirements capture, implementation
and testing must be woven together at
each stage of the project. In the early days
requirements dominate, but in all but the
most trivial of projects they are necessarily
accompanied by mini-prototypes to test
feasibility and testability. In the later
stages, this balance will change to favour
active testing and re-development.
In spite of this, we seem to feel a need to
compartmentalize software processes.
This may simply be a manifestation of the
bureaucratic urge to create mini-empires
and I have certainly seen enough of these
in my career. Separating software
processes in this way is very damaging
to the intellectual coherence of a software
project and in my view has had numerous
bad side-effects, for example:

• testing is still often considered a low-level and, in particular, a low-status activity

• implementation (ie development) is considered almost a detail by managers who, after writing a few Excel macros, think that the act of coding is 'just like writing a big macro'

• testers are still rarely involved in the design stage of a project. I co-edit with Michiel van Genuchten the Software Impact column in IEEE Software (see http://computer.org/software) and recently we interviewed all the contributors to date to solicit comments about the design process. They were talking about really major systems with millions of lines of code and many users. All the contributors stated that testers should be involved at design but none were. This seems astonishing to me but is the norm

• design is considered a high-status activity but in truth, we have almost no idea how to do it properly and there is precious little standardization (another fact which has emerged from the column)

• the influence of bureaucracy and the intrusion of management into software skills definition has meant that programming skills are in significant decline. I make this observation partly as a part-time academic and partly through my involvement with commercial projects. For example, it is now possible to graduate with a degree in computing with almost no programming experience.

So the situation is bad. No surprise there for testers or developers. Now what might we do to improve it?

Design IS development IS testing

Having caught your eye with a pithy subheading, let me put a little flesh on this. Consider the following computer program. [The specification, equations (1)-(4), is not reproduced in this copy.]

It might not look like it, but this is actually the specification for the treatment of errno in the ISO C programming language, the error number returned by the C run-time library. A1-A4 are the actions to be taken, S1-S3 are the three independent binary states which must be considered and the operators, in order of first appearance, are logical AND, NOT and OR.

The design specification is written in the predicate calculus, more or less directly from the fairly dense verbal description of this in the ISO C standard itself. However, it is also a program which can be translated into C. For example the first equation becomes:

if ( (S1 && S2 && !S3) ||
     (S1 && S2 && S3) )
{
    A1;
}

Last but not least, it also shows the 12 test cases, which are:

test case 1: (S1 true, S2 true, S3 false): results in A1
test case 2: (S1 true, S2 true, S3 true): results in A1
...

By using this specification language, it can be seen that design, implementation and testing are all intimately interlinked, so much so that separating them is artificial and in my view damaging to the clarity of the development. Warnier-Orr diagrams can be used instead of specification languages (figure 1 shows the first equation in this form) although these can't be refactored logically in any trivial way as the predicate calculus version will be below.

If, as in this case, I am using a specification language with formal mathematical properties (the laws of predicate calculus), I can use these laws to simplify this design specification, whilst simultaneously simplifying the code and the testing. The design specification (1)-(4) then turns out to be identical to: [the simplified specification is not reproduced in this copy.]
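Although the simplified specification itself is missing here, the reduction of its first equation can be reconstructed from the C translation above, using only distribution and the law of the excluded middle:

\[
(S1 \land S2 \land \lnot S3) \lor (S1 \land S2 \land S3) = (S1 \land S2) \land (\lnot S3 \lor S3) = S1 \land S2
\]

so the guard in the C fragment collapses to if ( S1 && S2 ).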
This radically simpler but identical
program has only 4 test cases.
You might well ask if this is the way I
develop all programs. The answer is no,
but you need some kind of similar
representation when things are getting
logically difficult. However the fact that I
might carry the specification informally in
my head whilst developing the code does
not mean that I should forget about the
test cases. As a developer I should be
developing the test cases in parallel so
that the package I would pass on to an
independent testing team is the code and
the test cases (and maybe the formal
design specification as well, if I have to
develop one). The independent test team
can then judge the package as a whole.
I would go further. The independent test
team should be able to swap places with
the developers.
Development testing and honesty
One of the most important traits a
developer should acquire is intellectual
honesty. After years of developing myself
and talking to many other developers, I
know that experienced developers sense
when they are in trouble. They know
when their code is incoherent and they
know when they have used a clumsy set
of data structures for a particular algorithm. The question is what do you do about it?
In 1976 James W. Hunt and M. Douglas
McIlroy wrote the important paper An
Algorithm for Differential File Comparison
(Bell Labs CS Tech Rep 41) defining the
Unix utility diff (differential file comparator).
In an interview in The Unix Programming
Environment (Brian W. Kernighan and Rob
Pike, ISBN 9780139376818) McIlroy said
"I had tried at least three completely
different algorithms before the final one.
Diff is a quintessential case of not settling
for mere competency in a program but
revising it until it was right". They
developed it several times, but they were
intellectually satisfied only with the final
version. In essence, the others were good
working prototypes but the authors felt
they could do better. They did. The final
version was astonishingly reliable and half
the size of the first version. However, in
order to be able to see this, they had to
take the earlier versions through much of
the normal software process, linking the
various phases. Contemporaneously,
Fred Brooks was writing in The Mythical
Man Month (Addison-Wesley, ISBN
9780201835953) that we should plan to
throw (at least) one version away because
we will anyway and the third version is
usually the best.
We seem to have forgotten much of this
pioneering work. If managers allowed
developers to prototype more and
developers understood the need for it
rather better than they do, we would have
made a lot more progress. Successful
prototyping is at the heart of development
testing. It is at the heart of all successful
software systems. It is really at the heart
of all successful human creative achievement, as I was reminded at the
Leonardo da Vinci exhibition at the
National Gallery in London last month.
Da Vinci prototyped parts of his great
works, such as the depiction of the fall of cloth on an arm, over and over until at
last he could put the parts together.
Beethoven did the same with his
symphonies.
Development testing and patterns
Coding is, for most who do it, pleasurable.
I like making code look good and I like
making code read transparently. Above all,
I like thinking about how I would test it.
Doing that is fun but really it's for self-defence. I still maintain a lot of commercial
and Open Source code (in C, Tcl, Perl and
Fortran) and things get out of hand rapidly
if code is incoherent or rushed. Even
throw-away code should be good because
you never know when you will have to
re-use it for more prototyping or to plunder a neat trick you once learned but whose mind-bending syntax you have temporarily forgotten.
As time goes by, you begin to accumulate
knowledge and patterns of failure. This
knowledge can come from various sources.
Here is an example in C:
if ( a && b && c ... ) {
    ...
} else {
    /* There is about a 70% chance you will get this bit wrong */
}

This of course is an example of a complex decision with an else clause. However, we know from the work of psychologist T. R. G. Green (see http://homepage.ntlworld.com/greenery) that humans are very poor at reversing complex decisions. In other words we are more likely to get the logical condition in which we enter the else clause above wrong than right. Here's an example of one of my mistakes with this construct:

/*
 * Check division operators for zero division.
 * right_ok = TRUE, if the divisor has been defined correctly.
 */
case OP_DIV:
case OP_MOD:
    if ( right_ok && (right_cval.cv_long == 0) )
    {
        /*
         * Can't do the division, the divisor is zero.
         * Give a hard warning if the expression has to be
         * evaluated and a soft warning otherwise.
         */
    }
    else
    {
        /*
         * All OK, do the division with right_cval.cv_long.
         */
    }
Is it OK to do the division? Well actually
no. I had forgotten the case when right_ok
is not true but right_cval.cv_long is zero.
This emerged five months after releasing
a product with a very aggressive
(but not aggressive enough) set of
regression tests.
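One way to make this mistake harder to write (a sketch of mine, not the fix that actually shipped) is to enumerate the states positively instead of leaving the last one to an else clause:

case OP_DIV:
case OP_MOD:
    if ( !right_ok )
    {
        /* Divisor not defined correctly: the forgotten state. */
    }
    else if ( right_cval.cv_long == 0 )
    {
        /* Defined but zero: give the hard or soft warning as before. */
    }
    else
    {
        /* Only here is the division with right_cval.cv_long known to be safe. */
    }

With the states written out positively there is nothing left for the reader, or the author, to reverse.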
Development testing and open source
Perhaps the most obvious place in which
development testing has flourished is in
the Open Source world from which we can
learn many lessons because it is very
common in Open Source projects for
development to go hand in hand with
testing (this is not the case in all projects
but it appears to be so in all the successful ones).
Availability of source code means that any
algorithm has its deepest secrets exposed
to anybody who wants to look. Faults are
flushed out and fixed perhaps before they
have ever had the chance to cause failure.
In very widely used systems such as the
Linux kernel, PHP, Perl and Apache, this
has led to exceptionally reliable systems
in an environment in which development
and testing are inextricably intertwined as
they must be.
Les Hatton (http://leshatton.org) is professor of forensic software engineering at
Kingston University and also works at Oakwood Computing Associates
(http://oakcomp.com). His newest book E-mail Forensics: Eliminating Spam, Scams
and Phishing (Bluespear Publishing, ISBN 9781908422002) is available now
Figure 1: Warnier-Orr diagram equivalent to specification equation (1). [Diagram, not reproduced, branching on S1, S2 and S3/NOT S3 to action A1.]

The covers are off
by Edward Bishop

Coverage is not a number.

PT editor Edward Bishop proposes a test design improvement method using Ranorex.
The full test suite is run and statement
coverage is measured. It's 90% and the
exit criterion for the phase calls for only
80%. Great!
But 95% of the tests are covering exactly
the same statements. 50% of the
statements covered are executed by only
a handful of very similar tests. Not great.
10% of statements are not executed by the
test suite, yet static analysis indicates that
there is no unreachable code. This is
because the statements can be reached,
but only if the last transaction on the
currently-logged-in account happened at
0000 hours UTC. Also not great.
Simple and more complex coverage
Programmers who practice test-driven
development correctly measure coverage
of their code continually. At component
level its meaning is clear, especially using
one of the many powerful coverage measurement tools available. These integrate
closely with development environments
and show in visual reports how many
times each line of code has been covered
and, in some cases, which tests covered
a particular line. This is all the information
a developer needs to know whether (i) the
unit tests are working as expected and (ii)
the code needs to be refactored: if code is
written, as it should be, only to pass unit
tests the coverage should always be close
to 100% and if it falls below that when all
the unit tests needed are run something
is wrong.
To help assure the effectiveness of testing
at higher levels, more and different
coverage information is needed. But after
integration, when tests are run in a test
rather than a development environment,
getting it becomes much more complicated
because simple measurements may or
may not be meaningful and that is hard to
establish. Most coverage measurement
tools can highlight code that is “insufficiently covered” but what appears to be
“sufficiently covered” may only have been
covered repeatedly by too few tests.
Dynamic analysers that generate diagrams
visualizing control and data flow can also
help to find rarely-occurring paths. But
testers need a way to find out more: which
statements or decisions are covered as
the result of which test events and data.
Having that information would create a lot
of potential: not just to detect more defects
with existing tests, but to improve the tests
to provide more assurance and detect yet
more defects if they exist.
Getting it requires a way to link each
coverage event to the test that caused it.
It has been suggested, including by
Harry M. Sneed, a contributor to this issue,
that this can be done using time. Here is a
theoretical method:
1. The probes inserted when the code is instrumented are designed to record not only that they have been executed, but when, according to the system clock.

2. Each test script, or the tool executing it, also records when it starts and terminates and, in data-driven testing, a means of identifying the data it uses (eg the number of a line read from a CSV file).

3. After execution, the two output files are correlated and analysed.

Doing this could reveal which tests are very similar and, more interestingly, which are very different, to others in terms of the code they cause to be executed. Introducing more variations of these tests, then repeating steps 2 and 3 to show that more code is executed by more different tests, would increase the defect-finding potential of the suite.

A coverage comparator tool

It may be possible to get the time-based method to work, but there are obvious difficulties including the familiar problem of time dependencies. There will be a discrepancy between the two recorded times and for many system architectures it will vary significantly within and between runs. Reliable correlation may be a significant challenge. It seems likely but not certain that technical solutions could be found.

A more direct method is easier: execute tests or groups of tests individually and discover the detailed coverage each achieves separately. That removes the need for the probes to do anything other than identify themselves, which is how nearly all coverage measurement tools work. One of these could be used, or a new program written, to instrument the code: figure 1 shows how it can be done in and for VB.NET for simple line coverage. The number of probes needed could be reduced and other coverage types achieved by slightly more sophisticated parsing. The analysis program is nearly as easy to write: it reads all the coverage data generated by the test runs, searches for long blocks of text duplicated between them, then sorts the lines in each of them and compares the sorted lists. Figure 2 shows an example of some of the output that can be generated.

Sub Main()
    Dim CommandLineArgs As System.Collections.ObjectModel.ReadOnlyCollection(Of String) = My.Application.CommandLineArgs
    Dim themodule, theinstrumentedmodule, thecoveragefile, thisline, probeline As String, probenum As Integer
    themodule = CommandLineArgs(0)
    theinstrumentedmodule = "instrumented-" & themodule
    thecoveragefile = "coverage.txt"
    If System.IO.File.Exists(theinstrumentedmodule) Then
        My.Computer.FileSystem.DeleteFile(theinstrumentedmodule)
    End If
    FileOpen(1, themodule, 1)
    FileOpen(2, theinstrumentedmodule, 8)
    thisline = ""
    While Not Left(LTrim(thisline), 3) = "Sub"
        thisline = LineInput(1)
        PrintLine(2, thisline)
    End While
    probenum = 0
    PrintLine(2, "FileOpen(3, """ & thecoveragefile & """, 8)")
    While Not EOF(1) And Not Left(LTrim(thisline), 7) = "End Sub"
        thisline = LineInput(1)
        If thisline <> "" Then
            probenum = probenum + 1
            probeline = "PrintLine(3, """ & themodule & "--PROBE--" & probenum & """)"
            PrintLine(2, probeline)
        End If
        PrintLine(2, thisline)
    End While
    While Not EOF(1)
        thisline = LineInput(1)
        PrintLine(2, thisline)
    End While
    FileClose()
End Sub

Figure 1: Instrumentation program

Identical statement coverage (same path of control): Tests: 2, 2a, 2b, 2c, 2d, 2e, 3, 5a
Equivalent statement coverage (same statements executed in different order): 2, 2f, 2g, 5, 5c
Unique statement coverage (cover statements no other test does): 3, 3c, 6b

Figure 2: Output of the analysis program

The hardest part is running the individual tests or groups of tests and organizing their output. Starting them manually would take a lot of effort, and between runs each coverage file created would have to be renamed or moved, then the information about the name or location of the files passed to the analysis program. It would be better to collect all the coverage information (the lines of text written by the probes) in one large file, but the blocks created by each run need to be separated. That could be achieved by a tiny program that appends the identity of the run to the coverage file, but that would have to be executed (and fed the run ID) between runs. Conducting any of these tasks manually would be onerous and prone to human error. It might be possible to automate them with a complex configuration of the test execution tool.

I have noted in previous articles (see the July 2010 and April 2011 issues of PT) that the functional test automation tool Ranorex provides exceptional flexibility that promotes close collaboration between development and testing. An example of this is the key to a simple way to build the coverage comparator tool this article describes. A test suite created in Ranorex is a standard .NET project saved as a .EXE file executable from the command line, requiring only standard Windows runtime components and accepting arguments. One of them allows an individual test in the suite to be executed, thus:

project.exe /testcase|tc:<name of test case>

So individual tests and the “run boundary marker” program can be run from a batch file. To define a group of tests as a single test for coverage measurement purposes they are run consecutively without running that program between them.

A free trial of Ranorex is available from http://ranorex.com
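For readers who prefer C to VB.NET, the probe and the run-boundary marker can be sketched as follows (the file name and identifiers are mine, chosen to mirror the VB.NET version):

#include <stdio.h>

/* Probe inserted by the instrumenter: appends "module--PROBE--n" to the
   shared coverage file, exactly as the instrumented VB.NET code does. */
#define PROBE(module, n) do { \
        FILE *cov_ = fopen("coverage.txt", "a"); \
        if (cov_) { fprintf(cov_, "%s--PROBE--%d\n", (module), (n)); fclose(cov_); } \
    } while (0)

/* Run boundary marker: a tiny program executed between tests from the
   batch file, appending the identity of the run it closes. */
int main(int argc, char **argv)
{
    FILE *cov = fopen("coverage.txt", "a");
    if (cov) {
        fprintf(cov, "--RUN--%s\n", argc > 1 ? argv[1] : "unnamed");
        fclose(cov);
    }
    return 0;
}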
CERN + Coverity = Certainty
How static analysis is supporting the world's most dynamic experiments

More than 1,100 organizations use development testing products from Coverity to help them develop dependable software. This figure does not include the many Open Source projects Coverity assists but comprises commercial customers in a large range of domains. One of Coverity's customers, which engages with Coverity on both Open Source and proprietary software projects, is CERN, the European Organization for Nuclear Research.

CERN is a household name because of frequent media coverage of its pioneering research in particle physics using the world's largest and most complex scientific instruments, such as the vast Large Hadron Collider (LHC) which smashes particles together at close-to-light speeds. In particular, the search for the Standard Model Higgs boson, a particle predicted by theory which may or may not exist, has captured the interest and imagination of both scientists and laypeople everywhere. CERN is also famous as the workplace of Tim Berners-Lee when he invented the World Wide Web in 1989, originally as a collaboration tool for CERN scientists.

Tangled ROOT

CERN provides the software tools used by its physicists, many of whom are also active in their development and maintenance. One of the tools, ROOT, is used by all 10,000 physicists to store, analyse and visualize the many petabytes of data generated by experiments using the LHC. In addition, each individual experiment has around 1,000 people developing programs to record its execution and process its results, of which only a few are software specialists. At the Coverity Exchange event in London last December, physicist Fons Rademakers described the environment as “anarchic development” in which “we can convince people to do something but cannot tell them to” because their only motivation is “to make software that helps them do their work”. Multiple platforms and environments are in use and there are hundreds of external dependencies, including many to Open Source code.

Yet the quality of the tools is critical. Their failure could cause invalidation or even loss of results and findings at immense cost. The Higgs mechanism that might lead to the discovery of the elusive particle occurs approximately once in ten billion collisions. Risk of failing to observe and record every detail when it does cannot be tolerated.

The quality accelerator

These challenges led CERN to implement Coverity Static Analysis. Within a week, it detected thousands of defects in ROOT with very few false positives. Six weeks later, the defects were all resolved. Over the next two months adoption became viral and CSA is now in continuous use by thousands of people developing and maintaining ROOT and the experiment-specific programs, some 50 million lines of code. Using CSA's web interface, even non-software specialists can understand defects and how to fix them easily and have clear visibility of the quality status of their software. In a few months CERN has achieved greater accuracy and reduced risk.

Detecting and removing defects while code is being written saves time and money and makes all other development and testing activities work better. Learn more about the benefits of implementing or improving development testing in your organization at http://coverity.com
From test-driven to tester-driven agile development
by Harry M. Sneed and Manfred Baumgartner

Testing's role will be to inhibit code proliferation.

The future of software, predicted by Harry M. Sneed and Manfred Baumgartner.
As the Ancient Greek philosopher
Heraclitus observed, everything is in a
state of fluctuation. In today's fast-moving
IT world software systems and the people
who develop and maintain them must be
flexible. Goals, requirements and
environments change and to remain
consistent so must software, reflecting
the changing world of which it is part. It is
never finished. This is the main reason for
agile development.
Development must be able to accommodate change in code architecture and even
the reasons for a system to be developed
in the first place. The direction of an agile
development project may change on a
monthly, weekly or even daily basis [1].
But change is the enemy of test. To test
something requires that it remains stable
long enough to test it. A tester wants to
know that what was tested yesterday is
still valid today. The proper goal of testing
is to confirm that a given state of the
software corresponds to a given state
of the requirements. For that the software
and the requirements must remain in one
state long enough to demonstrate that
their states are equivalent.
Here we will explain how this goal can
be reconciled with the goal of agile
development which is to be flexible.
Conventional testing approach
In conventional testing, empirical testing
follows development. The test is planned
in advance based on the requirements.
While the developers are producing the
code, the testers are specifying test cases
and designing test scenarios. When the
software components are finished, they
are turned over to the testers and remain
in a steady state as long as they are being tested. If a new version of the software is delivered in the meantime, the testers put it on standby status until they reach their testing goals for the current version.

This approach is based on the waterfall model. As long as the interval between versions is long enough it can be justified. If versions are delivered faster than they can be tested, testing becomes a bottleneck. For that reason such separation between development and test is not possible in an agile project.

Tester in the team approach

Agile development has created a new situation which has to be dealt with in another way. It is no longer possible to wait until the developers are finished to begin the integration and system testing. The system has to be tested while it is in development. One solution is to put the tester or testers in the development team. They sit next to the developers, testing components as soon as they have been compiled (see figure 1).

Do-it-yourself test-driven approach

Another solution is to have the developers test. The tester is a test adviser helping the developer to set up test environments, design tests and above all validate test results. One could extend the testing responsibility to analysts and designers as well. This is the approach taken by the Israeli Air Force in their agile development projects [4]. They claim that in their project everyone is involved in test: developers, business analysts and customers. The testers are only there to help them and to ensure they do their work properly. The authors state “having everyone test offers
several project advantages. First, it eliminates the team's reliance on a single, assigned project tester who would otherwise constitute a major bottleneck... The professional tester adds value not through more testing but by writing some of the developers' tests, thus freeing them to code more new features…”. It is interesting to note that in this project the testing went on even when the tester was temporarily removed from the project. This was because by that time everyone was involved in the testing effort. The developers tested themselves, in do-it-yourself mode (figure 2).

Initially, the testers did not work with developers but alone, to make cross-feature tests and automate customer-defined use cases. The test manager, who was not a member of the team, demanded this separation for two reasons. First, that testers should be independent of programmers to prevent implementation details from affecting their test designs. This is referred to as the contamination argument [5]: that a tester cannot write adequate tests if a programmer, who is biased, dictates what the testing requirements are. The second reason is the verification argument [6], that one of the goals of testing is to verify that the product meets its specification, therefore testers must create tests according to approved specifications, not according to what someone has told them informally or unofficially.

These two arguments represent the main reasons many QA professionals object to agile testing. The counter argument is not that the arguments are wrong, but that to ignore them is better than to enforce them. Enforcing them would mean bringing the project to a grinding halt due to lack of testing capacity. Accepting them as a fact of life may weaken the validity of the test, but not enough to endanger the project. Some published reports on testers' experience in agile teams support this counter argument, for example [7].

To ignore the two arguments the tester must change his or her mindset and
abandon some principles. The tester must
see that continuous interaction with the
developers is better than withdrawing into
a corner and writing tests based on one's
own interpretation of the requirements
specification: especially if, as is always
the case in agile environments, that lacks
detail and rather is implied by stories told
by the “user in the team” which are subject
to continual change and documented
informally if at all. The only accurate, up-to-date description of what the system does
is the code. Therefore, agile testing as
practiced by the Israeli Air Force was not
about comparing documents and system
behaviour with those documents, but
human judgment of what was appropriate
for each situation that arose.
According to the Israeli authors the key
was communication. Everyone, including
the testers, must sit in the same room.
Everyone participated in planning,
everyone was present when the stories
were told and everyone attended daily
stand-up meetings. In this way everyone
knew the strengths and weaknesses of
colleagues and could sense where to
look for defects. Test decisions were
based more on human intuition than on
written documents. When a defect was
detected they knew who was responsible
for it and got him or her to fix it with minimum bureaucratic administration. The
goal was to find as many defects, in as
little time, as possible and for that intimate
collaboration by all team members was
necessary. This testing, together with the tester acting as adviser to others, proved very
effective. The number of defects remaining
in the software after release was four times
fewer and completion was reached faster
than in comparable projects using a
conventional testing approach [8].
Tester-driven approach
There is a large body of literature on
test-driven development. It is well
summarized in a book by Kent Beck, the
father of extreme programming, of which
it is a pillar [9].
TDD is a method developers use when
coding. Before beginning to code a
method, test cases for it are devised.
These can be coded in a separate test
driver class or built into the class
containing the method to be tested. Now
code intended to pass the test cases is
written. In this way the developer can test
one method at a time and then combinations, at first with one object state and
then with others. The goal should be
to test all relevant combinations of
methods with all relevant representative
states of the object type under test [10].
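A minimal sketch of the discipline in C (the function and its cases are invented for illustration; in a class-based language the test cases would live in a separate test driver class, as described above):

#include <assert.h>

/* Method under test: written only after the cases below were devised. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* Test driver: one method at a time, then combinations of states. */
int main(void)
{
    assert(clamp(5, 0, 10) == 5);    /* in range */
    assert(clamp(-1, 0, 10) == 0);   /* below lower bound */
    assert(clamp(99, 0, 10) == 10);  /* above upper bound */
    assert(clamp(7, 7, 7) == 7);     /* degenerate bounds */
    return 0;
}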
In test-driven development the developer
tests. There is no role for a tester.
Tester-driven agile development carries
that principle to a higher level. The test
objects are not classes but components,
ie packages and subsystems. One or more
professional testers drive the project. Their
first task, after the system architecture is
designed but before coding begins, is to
set up an integration test environment
based on that design. An essential
element is the specification of the
interfaces between subsystems and
components and between the system
and users, which are implemented as web
services, XML batches or mocked-up user
interfaces, for example HTML forms. The
testers can then specify test cases and
generate test data based on these
interface definitions.
Only at this point are the developers
involved. The testers assign them
components to implement. When one is
ready it is turned over to the testers who
build it into the integration test environment and start testing it. Components are
integrated one after another so change to
requirements can be accommodated
easily. The project is agile but its pace is
determined by testing not by development,
in tune with the Kanban method [11].
Many project effort distribution studies
assert that testing takes the greatest part
of project effort. If that is true then
according to the principles of Kanban
testing should be started first. The testers
should be the leaders and the developers
the followers. Instead of running after the
developers trying to test what they
produce as soon as possible and falling
farther and farther behind in an attempt
to do so, the testers determine what is
to be delivered next, on the basis of
testing expediency, and order it from
the developers, who act as suppliers
to the testing project.
The user in the team works with the
testers and system architects but only
indirectly with the developers. First the
user must get the architect to either
take over or construct a suitable system
framework. Often one will already be available. Once it is in place the user can begin
to formulate requirements. The stories are
told to the testers who decide how best to
implement them. In many cases it will be
possible to implement them with existing
software or with a cloud service: then
there would be no need for developers.
Only if some function is absolutely unique
to the target application will developers be
asked to develop code to implement it.
The goal of TDAD is to develop as little as
possible and reuse as much as possible.
It may be that some services have to be
wrapped in order to fit into the target
environment and would constitute an
order to the developers. So the developers
may get orders to produce some entirely
new software but more often they will
get orders to adapt existing
software (figure 3).
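What such an order might look like is sketched below in C (all names are hypothetical): the framework programs against a local interface, and the commissioned wrapper adapts an existing service to it rather than implementing anything new.

/* Interface the system framework expects (hypothetical). */
typedef struct {
    int (*lookup_rate)(const char *currency, double *rate_out);
} rate_provider;

/* Stand-in for an existing foreign service with its own signature
   (hypothetical; in reality perhaps a web service or cloud API). */
static double legacy_fx_rate(const char *iso_code)
{
    return (iso_code && iso_code[0]) ? 1.25 : -1.0;
}

/* The wrapper ordered from the developers: it adapts, and adds nothing new. */
static int wrapped_lookup(const char *currency, double *rate_out)
{
    double r = legacy_fx_rate(currency);
    if (r < 0) return -1;
    *rate_out = r;
    return 0;
}

static const rate_provider legacy_provider = { wrapped_lookup };

int main(void)
{
    double rate;
    return legacy_provider.lookup_rate("EUR", &rate) == 0 ? 0 : 1;
}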
There is already far too much code in the
world. Maintaining and improving it costs
a lot and ties down valuable resources
leading to a worsening shortage of skilled
personnel. The only way to get out of this
legacy code trap is to produce less code.
Extreme programming and agile non-
tester-driven development are certainly
not ways to achieve that. Given the
freedom that extreme programming grants
them developers inevitably produce more
code than is necessary. No self-respecting
programmer would ever miss the opportunity to prove his competence on some
hairy problem for which there are already
dozens of other solutions. We could hardly
expect him to look for them, since his main purpose in life is to solve hairy problems himself.

Figure 1: The tester in the team. [Diagram, not reproduced: the user in the team requests, the developers develop the software, and the tester runs after the developers to test what they have produced as soon as it has been developed.]

Figure 2: Do-it-yourself test-driven approach. [Diagram, not reproduced: the tester works next to the developers, as adviser, helping them test themselves.]

Figure 3: Tester-driven agile development. [Diagram, not reproduced: the user requests; the tester, as integrator, integrates and tests the system from own software and standard services, ordering development only where needed.]
The goal of a software project should not
be solving the user's problem as quickly
as possible. It should be solving it with the
least new, untested code possible: in other
words, to deliver the best and longest-term
solution possible. Every additional code
unit produced by a user organization is
like a mortgage whose interest is its
annual maintenance which accumulates
and increases over time. Producing more
and more new code is as destructive as
producing more and more superfluous
consumer goods.

But will developers be willing to accept this
role and will testers have the ability to take
on responsibility for assembling a system?
This is mainly a question of training and
conditioning. In agile projects as they are
currently conducted developers are asked
to transform the wishes of the user into
executing code as quickly as possible.
Testers are trained and conditioned to
run after the developers and to clean up
whatever mess they create. They are kept
eager to test anything developers decide
to give them, in any state and of
any quality.
TDAD calls for these roles to be reversed.
Developers must be convinced that it is
in their best interest to produce as little
new code as possible because every line
they do not produce is a code line that
they will not have to maintain. Users must
be made to understand that it is in their
best interest to depend on as little new
code as possible.
Testers must grow into their new role as
system integrators: practicing the search
for and testing of available services then
persuading the user to adjust require-
ments to make them acceptable. If the
user refuses, testers commission
developers to build a wrapper around
suitable services with the objective of
using as much of their existing
functionality as possible. Only if there is
evidently no suitable ready-made solution
should the testers consider contracting
developers to produce a new one.
The role of the testers involves the same
considerations as in any environment:
functional, non-functional and security-related [12]. When an appropriate service
is found, acceptance testing is carried out.
What is accepted and new components
required because of what is not accepted
are integrated iteratively into the required
whole and that integration is tested.
Should an integration problem occur the
testers specify an appropriate solution
and have the developers implement it.

The target system emerges component-by-component until it has reached a state
that the user can begin to use. After it has
gone into productive use it can continue
to grow by addition of more functionality,
as far as possible using ready-made cloud
services. The production system will be
a hybrid, comprising as little locally-developed code as possible. The cost of
developing and maintaining that will be
small and will tend to fall. Users and
developers will be freed from the burden
of legacy code.
TDD has already proven its merit [13].
TDAD is a logical continuation that defines
a new supply chain. The user gives requirements (tells stories) to the test/integration team (minimum two members)
who search for suitable services in service
libraries, eg http://uddi.xml.org. If they find
a candidate they integrate it then test it.
If no candidate passes they commission
code necessary to meet requirements then
test it just as they would a foreign service.
They treat services of both origins equally:
both must be integrated into the system
framework then tested.
It goes without saying that the framework
must be tested. This may take time but is
essential. Once it is done the framework
can be filled with application-oriented
functions that deliver user requirements
very quickly and these can be exchanged
and added to very easily. At that point
managing application evolution will
become cheap.

Harry Sneed (http://harrysneed.de) is a software development consultant, trainer, published author and conference speaker. Manfred Baumgartner is head of software testing at ANECON (http://www.anecon.com). The authors wish to thank Rudolf van Megen, CEO and co-founder of SQS (http://sqs.com), for his original suggestion of combining cloud services and agile development principles, which was their inspiration for this article.

References
[1] Mugridge, R.: Managing Agile Project Requirements with Storytest-Driven
Development, IEEE Software Magazine, Jan. 2008, p. 68
[2] Crispin, L., Gregory, J.: Agile Testing: A Practical Guide for Testers and Agile Teams, Addison-Wesley, Boston MA, 2008
[3] Sneed, H., Majoros, M.: Testing Programs Against a Formal Specification, Proc. of
IEEE COMPSAC-83 Conference, IEEE Computer Society Press, Chicago, 1983, p. 512
[4] Talby, D., Keren, A., Hazzan, O., Dubinsky, Y.: Agile Software Testing in a Large-
Scale Project, IEEE Software Magazine, July 2006, p. 30
[5] Stephens, M., Rosenberg, D.: Extreme Programming Refactored – The Case Against XP, Apress Publishers, Chicago, 2003
[6] Kuhn, R., Kacker, R.: Combinatorial Software Testing, IEEE Computer Magazine,
August, 2009, p. 94
[7] Crispin, L., Extreme Rules of the Road – How an XP Tester Can Steer the Project
Toward Success, STQE Magazine, July 2001, p. 24
[8] Dubinsky, Y.: Agile Metrics at the Israeli Air Force, Proceedings of Agile 2005
Conference, IEEE Press, May 2005, p. 12
[9] Beck, K.: Test-Driven Development by Example, Addison-Wesley, Boston, 2003
[10] Zhang, Y., Patel, S.: Agile Model-Driven Development in Practice, IEEE Software
Magazine, March, 2011, p. 84
[11] Poppendieck, M., Poppendieck, T.: Implementing Lean Software Development,
Addison-Wesley, Boston, 2006
[12] Tsai, W.T., Zhou, X., Chen, Y.: On Testing and Evaluating Service-Oriented
Software, IEEE Computer Magazine, August, 2008, p. 40
[13] Shull, F. et al.: What Do We Know about Test-Driven Development?, IEEE Software
Magazine, Dec. 2010, p. 16
Static dynamism
The latest development testing tools blur distinctions between test types
by Boguslaw Czwartkowski
Boguslaw Czwartkowski presents the argument for more DT
More testers and developers should use
more development testing tools. Not doing
so means ignoring opportunities to make
testing, whoever does it, better and easier.
Of course there are valid reasons why
some do not or cannot, but all should be
sure not to make the most common
mistake: assuming one understands all the
tools can do from the too-simple
explanations in basic testing books and
syllabuses. Things have moved on a great
deal in recent years. In this article I will
present some of the key benefits available,
aiming to make you want to take a fresh
look at them.
What makes static analysis so good?
Static analysis is a broad term used for
many activities done for many different
reasons. The thing they all have in
common is that they involve scanning (ie
having a program examine) source code.
This is very fast and cheap and detects
critical defects. It achieves 100% coverage
and its results are 100% objective. It's
almost impossible to argue against doing
it continually.
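For instance (a contrived fragment of our own), a scan reports the defect below on every build, with no test data, environment or execution needed, while empirical testing could miss it indefinitely:

  class ScanExample {
      // Classic checker finding: a suspicious self-comparison.
      static boolean sameOwner(String ownerA, String ownerB) {
          return ownerA.equals(ownerA); // defect: ownerB is never used
      }
  }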
One way to try is to point to the false
positives issue. Even the most advanced
tools sometimes flag things that don't
need to be fixed, so time is wasted
investigating and discussing them. But
all the information needed to do that
correctly is at hand: it's a purely technical
job whose risk is close to zero and whose
cost is insignificant compared with the
ever-present burden of having to argue
with biased people about the meaning
of results of empirical testing of what
they have produced based on
ambiguous information.
Even so, it's important to work on
reducing the false positive rate, because
it can have a negative effect on the
attitude of developers to actual defects.
The best tools are easily configurable to
suppress reporting of defined issue types
so become quieter as they are adapted to
the organization's or team's way of coding.
Another approach, favoured by some, is to
change coding practices to prevent false
positives even though doing so is not
strictly necessary: there are almost always
alternative ways to code the same thing
just as well. This “spotless code” policy reduces the risk that a genuine defect is dismissed, incorrectly, as yet another false positive.
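As a simplified illustration of our own: an analyser that cannot prove a map lookup returns non-null may flag a dereference; rewriting the code to make the invariant explicit removes the report without weakening the program.

  import java.util.Map;

  class SpotlessExample {
      // Version an analyser may flag: it cannot know that callers
      // always pass a key that is present, so "name may be null".
      static int flagged(Map<String, String> lookup, String key) {
          String name = lookup.get(key);
          return name.length(); // possible null dereference warning
      }

      // "Spotless" rewrite: same behaviour for valid input, but the
      // invariant is explicit and there is nothing left to debate.
      static int spotless(Map<String, String> lookup, String key) {
          String name = lookup.get(key);
          if (name == null) {
              throw new IllegalStateException("key must be present: " + key);
          }
          return name.length();
      }
  }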
Pattern-based analysis
This type of static analysis looks for
patterns that violate defined coding rules,
also known as “checkers”. A set of these
rules can be called a policy, a standard or
best practice guidelines, depending on how
rigorously they are defined and enforced,
which usually depends on the criticality of
the software being produced. Many
organizations maintain their own set or use
external ones, for example the JavaServer
Faces (JSF) specification or the Motor
Industry Software Reliability Association C
standard (MISRA-C). These would be very
difficult to enforce without a pattern-based
analysis tool so it is essential to any
organization whose products must comply
with them, but it also detects other defect
types including serious ones such as
resource leaks, performance and security
issues, logical errors and misuse of APIs
as well as less serious ones such as
violation of naming conventions and text
formatting, which are still important because they can cause human error and have a negative impact on maintainability.

Pattern-based static analysis (PBSA) tools are usually integrated with the IDE, where they can perform analysis continually (similar to as-you-type spell-checking in a word processor) or on demand, and provide easy-to-use facilities to view, understand and fix (sometimes automatically) the issues they highlight, in the same way as a debugger helps with simple syntax or runtime errors. They can be used to promote and enforce architectural and design rules and decisions as well as coding ones: for example some developers stipulate that common or core components should not have dependencies on domain-specific or specialized ones.
Flow analysis
Despite its name this is another static
analysis technique: it means finding and
analyzing the various paths that can be
taken through the code, both by “control”
(ie the orders in which lines can be
executed) and by data (ie the sequences
in which a variable or similar entity can be
created, changed, used and destroyed).
Whereas pattern-based analysis highlights
code that in some way violates a rule, flow
analysis finds combinations of input that
will cause failure and displays them along
with the path they will cause to be
executed and information about why it
is undesirable (see figure 1), detecting
different types of defect. Many of these
types will lead to system-level errors,
hangs or crashes that many people
wrongly assume can be prevented only
by dynamic testing, for example memory
corruptions (buffer overwrites), memory
access violations, null pointer
dereferences, race conditions or
deadlocks. Flow analysis can also detect
security issues by pointing out paths that
bypass code that performs, for example,
authentication or encryption.
Figure 1: Flow analysis results
Because in most non-trivial programs
there are many possible paths, flow
analysis needs more computing power
and takes longer than pattern-based
analysis, but nowhere near as much
as empirical testing. Even with modest
resources it's usually possible to achieve
100% coverage. More importantly, flow
analysis computes the combinations of
ranges of input that will cause failure
rather than hoping to hit on them by luck
as empirical testing does, greatly reducing
the risk of missing severe defects.
Flow analysis also tends to give more false
positives than pattern-based analysis,
usually because of uncertain assumptions
that must be made about the behaviour of
external systems and services, proprietary
third-party libraries for which source code
is not available etc.
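A simplified illustration of the kind of result shown in figure 1 (the code and values are our own): flow analysis does not merely flag the dereference, it computes an input combination, such as member = false and total = 150 here, that drives execution down the failing path.

  class FlowExample {
      static String banner(boolean member, int total) {
          String msg = null;
          if (member) {
              msg = "Member discount applied";
          }
          if (total > 100 || member) {
              // Reached with msg still null when member == false and
              // total > 100: a NullPointerException on this path only.
              return msg.toUpperCase();
          }
          return "";
      }
  }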
Metrics analysis
Measuring and visualizing various aspects
of code can help detect current defects,
but more often it warns of potential
difficulty in preventing and detecting
future defects when code is maintained,
by finding complexity and unwieldiness
such as overly large components,
excessive nesting of loops, too-lengthy
series of decisions and convoluted inter-
component dependencies (see figure 2).
This information of course is useful only
if acted upon in time, so metrics analysis
needs to be done right from the beginning
of a project and used to trigger and inform
refactoring whenever needed. It's less
effective in projects that use a lot of
overcomplex legacy code: the horse
has already been allowed to bolt.

Figure 2: Metrics analysis results
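As a toy illustration (ours, not taken from the tool output in figure 2), here is the kind of nesting a complexity metric flags, followed by the guard-clause refactoring such a warning should trigger; Order, isPaid and weightKg are invented names.

  class MetricsExample {
      interface Order { boolean isPaid(); double weightKg(); }

      // The shape metrics analysis warns about: nesting depth grows
      // with every new rule, and with it the cost of testing and
      // maintaining the component.
      static String shippingBandNested(Order o) {
          if (o != null) {
              if (o.isPaid()) {
                  if (o.weightKg() > 20) {
                      return "freight";
                  } else {
                      return "parcel";
                  }
              } else {
                  return "hold";
              }
          }
          return "invalid";
      }

      // Same behaviour after the refactoring the metric prompts.
      static String shippingBand(Order o) {
          if (o == null) return "invalid";
          if (!o.isPaid()) return "hold";
          return o.weightKg() > 20 ? "freight" : "parcel";
      }
  }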
Metrics analysis can also provide useful
information to influence test strategy, by
identifying very critical, frequently-
executed or very-multi-purpose code, which should be taken into account when defining scope, choosing techniques and setting exit criteria.

Further static analysis
Static analysis tools can also find other types of defect including unreachable and duplicated code. Neither of these can in itself cause failure, although they sometimes indicate that an error, often of logic or configuration management, has occurred that may also have introduced other defects that can. In any event they should be eliminated because they make empirical testing, especially where dynamic coverage analysis is used, more difficult and prone to error.

What static analysis cannot do
Static analysis provides information to help predict what may happen when code is integrated and executed. It detects defects, according to however its user defines what is and is not a defect, but cannot demonstrate failure. That can be achieved only by executing the code under test.

So how do we define what is and is not a failure? A common definition is deviation from expected behaviour. In other words, the system under test or in production does not do what it is expected to do, or does something it is not expected to do.

The challenge is observing the unexpected behaviour. For example a transaction may appear to proceed correctly to a user (or tester or test execution tool) whereas in fact a component has thrown an unhandled exception and failed to process it correctly. A control system may respond quickly and correctly under test for 3 days yet be leaking memory and heading for a crash on its 4th day in production. Fixing all defects detected by static analysis gives no assurance against other defects that will cause failures like these.

This makes it important to apply the definition of failure to internal as well as to external behaviour, even after integration. The internal failure must be detected before it manifests itself externally. At Parasoft we refer to this activity as automated runtime error detection (RED), a form of dynamic analysis. Some weak definitions of testing types break down here: it is analytical testing in that the intention is to examine the test item rather than exercise it. It is white box testing in that we examine internal rather than external behaviour. Yet the code under test must be executed, and that is done by running the same black box tests used for dynamic testing.

RED detects and reports internal failure at the instant it occurs, so it is easy for the tester to correlate it exactly with test actions for incident reporting. Like good static analysis, it provides full technical details to enable the developer to isolate and fix the underlying defect. RED extends the capability of empirical testing at all levels, from unit to acceptance, by making it able to detect internal failure that indicates that otherwise unobservable external failure has occurred, or will occur after testing has stopped.

Boguslaw Czwartkowski is professional services manager at Parasoft (http://parasoft.com)
CTP7
Part seven – key points and messages of the Consolidated Testing Process
by Geoff Quentin
The conclusion of Professional Tester's exclusive publication of Geoff Quentin's formal, rigorous, standards-based method for testing any new software-based system
For everyone's sake – and I do mean
everyone – testing must no longer be
considered an optional extra. Putting an end
to this dreadful mistake is the purpose of the
CTP. It aims to achieve it by showing how a
mature test process can be consolidated into
a mature development process.
Process maturity is a critical issue for me. As
a trainer I have spent several decades repeatedly introducing, examining and justifying what I understand about testing then working with many project teams trying to apply those ideas to their varied projects. That has left me in no doubt that to succeed the test process must be as mature as the development process and vice versa. Much has been made
of “test process improvement” but that is
pointless without a formal development
process including integrated test processes.
ISO/IEC 12207 is a very mature and scalable
development process with close alignment to
many other standards. This was my reason for
choosing it to be the starting point of the CTP.
It does not describe development alone: it
goes on to provide processes for support and
maintenance. More work is needed to analyse these and consolidate testing into them too.

Terminology
“Consolidated” refers to a closely-bound combination of a robust development process with testing ideas to create a testing process which is manageable by the use of checkpoints. The V model is only a model, not a process, and is too often cited, explained and even implemented using poorly-defined terms that mean different things to different people. The CTP is the V model but uses defined terms from established standards.

Weak terminology has always dogged testing and continues to do so. The list of useless words I hate includes “smoke test” (a joke describing what electronics hobbyists do to their projects) and “white box” (you can't see into it any more than you can a black one). The CTP is an attempt to move away from the ambiguous and towards the defined. The key message is that all project information is open, available and intelligible to all involved. It is especially critical that test results such as coverage and issue reports are clearly defined, measured, graded and balanced against other project data such as risks and benefits. Nothing should be obscured, whether deliberately or inadvertently, for example due to incompetent use of words. Testers must use only formal and defined terminology.

Requirements
Both development and testing, at all levels, absolutely require detailed lists of attributes to be tested, expressed in such a way that they can be assigned a binary (yes or no) result. This can be complicated for non-functional or content-related requirements, so these must be analysed and expanded until it becomes simple. For example “the system must be very responsive” breaks down into testable attributes such as “when the user has chosen a valid file 10MB in size to be processed, no more than 5 seconds must elapse between the user pressing the return key and the result being displayed”. The failure described by Hans Schaefer in the December 2011 issue of PT (see http://professionaltester.com/files/PT-issue12.pdf, page 20), where a help file displayed text in the wrong language, should have been avoided by specifying and testing something like “after the user has selected German mode, all words displayed must be in the list http://busintranet/resources/dictionary/german.pdf until either (i) the user selects a different language mode; (ii) the user logs out; or (iii) the user's session times out”.

Vague statements that can give rise to difficult non-functional requirements can often be converted to functional ones: for example “the system must be appropriately secure” may be proposed to mean “all data must be 128-bit AES encrypted before it is stored or transmitted by the system”.
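To show how such an attribute yields a binary result, here is a minimal sketch of our own (ResponseTimeCheck and processFile are stand-ins for the real system and the real 10MB file):

  class ResponseTimeCheck {
      // Returns a yes/no answer to the testable attribute: did
      // processing the chosen file complete within 5 seconds?
      static boolean within5Seconds(Runnable processFile) {
          long start = System.nanoTime();
          processFile.run();
          long elapsedMs = (System.nanoTime() - start) / 1_000_000;
          return elapsedMs <= 5_000;
      }

      public static void main(String[] args) {
          // Stand-in for processing a valid 10MB file.
          boolean pass = within5Seconds(() -> { /* process the file */ });
          System.out.println(pass ? "yes" : "no");
      }
  }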
Standards and reviews
Understanding the purpose and proper conduct of acceptance testing eventually leads to the logical conclusion that the most important sentence in a specification of requirements document is “The product must pass all acceptance tests devised by the acceptance test team”. This places great responsibility upon that team: to test all its tests and to ensure that the tests represent what is truly required and acceptable. In order to rise to this challenge testers must:
• know and apply all the standards that relate to software testing
• draw on existing standards as appropriate and enhance the standards where ambiguity exists
• enhance and develop further standards within the framework
• ensure that all testing work is done within an agreed framework and to agreed standards established at project start
• ensure that testing activities and materials are reviewed thoroughly at each project milestone
• have a good understanding of all development processes appropriate within the framework.
The key points of the CTP support this endeavour.

Traceability
The attributes to be tested need to be applicable to the level where they occur. For example the maximum response time for acceptance expressed above may give rise to file system retrieval and screen refresh speeds at unit test, interface throughput speeds at integration test and performance-under-load profiles at system test. But all must form part of and be easily traceable to the requirement for acceptance and the business objective it supports. A programmer required to demonstrate that a component meets given criteria must be able to check that each criterion exists for a purpose. It is the unique, and much discussed, nature of software that every fragment of source code contributes to the product. All code must have a known purpose and be checked to see it achieves it. Then tests must be devised which can demonstrate that check has been done.

Business and project leaders
The challenge for people whose view is too far removed from detailed code is to avoid specifying subjective, untestable or otherwise useless attributes. Those in senior positions are too often allowed to concentrate on what is easy, for example performance and capacity,
and neglect more difficult areas such as ease
of use and security. They should consider more
frequently a possible future scenario in which
they are required to explain from a witness box
how these attributes were handled at all stages
of their project. The CTP aims to encourage
and provide a basis for visibility that will allow
them to do so and to show that all testing done
was cost-effective, appropriate and accountable.
It can allow everyone involved to be confident
at all times that what is being developed will
be acceptable.
Testing in all disciplines
We are often told that the number of testers
is growing fast. For those of us who believe in
testing, that seems at first sight a good thing,
and of course it's partly attributable to the
evolution and spread of technology. But how
many of those testers are needed for the
wrong reasons: to shore up badly-run projects
or replace proper process with thankless
hard work hunting for defects whose
introduction should not have been allowed?
Following the CTP properly requires that
every analyst, architect, designer and
programmer must have testing as part of his
or her job specification. In the future perhaps
people will become professional testers first
then move into one of these roles, rather than
moving in the opposite direction or being
seen as an optional addition to them.
Geoff Quentin was chairman of the British Computer Society Special Interest Group in
Software Testing at its foundation in 1989, is author of The Tester's Handbook and
many seminal training courses, won the European Testing Excellence Award in 2006
and founded, with his wife Caroline, this magazine. The previous parts of this series are
in the March, May, July and November 2010 and June and October 2011 issues. All are
available free in the archive at http://professionaltester.com