Model-Driven Test Design


Jeff Offutt

Professor, Software Engineering

George Mason University

Fairfax, VA USA

www.cs.gmu.edu/~offutt/

offutt@gmu.edu

IDGA Military Test & Evaluation Summit

OUTLINE

1. Why Do We Test?
2. Poor Testing Has Real Costs
3. Model-Driven Test Design
4. Industrial Software Problems
5. Testing Solutions that Work
   1. Web Applications are Different
   2. Input Validation Testing
   3. Bypass Testing
6. Coming Changes in How We Test Software


Here! Test This!

[Slide image: a Verbatim DataLife MF2-HD 1.44 MB floppy disk labeled "MicroSteff, big software system for the Mac, V.1.5.1, Jan/2001"; a big software program, Jan/2007]

My first "professional" job: a stack of computer printouts, and no documentation.


Cost of Testing

- In the real world, testing is the principal post-design activity
- Restricting early testing usually increases cost
- Extensive hardware-software integration requires more testing

You're going to spend at least half of your development budget on testing, whether you want to or not.


Part 1: Why Test?

- Written test objectives and requirements are rare
  - What are your planned coverage levels?
  - How much testing is enough?
- Common objective: spend the budget

If you don't know why you're conducting a test, it won't be very helpful.


Why Test?

- 1980: "The software shall be easily maintainable"
- Threshold reliability requirements?
- What fact is each test trying to verify?
- Requirements definition teams should include testers!

If you don't start planning for each test when the functional requirements are formed, you'll never know why you're conducting the test.


Cost of Not Testing

Program managers often say: "Testing is too expensive."

- Not testing is even more expensive
- Planning for testing after development is prohibitively expensive
- A test station for circuit boards costs half a million dollars ...
- ... while software test tools cost less than $10,000 !!!

Testing & Gov Procurement

- A common government model is to develop systems
  - usually by procuring components
  - and integrating them into a fully working system
  - software is merely one component
- This model has many advantages ...
- ... but some disadvantages
  - Little control over the quality of components
  - Lots and lots of really bad software
- 21st-century software is more than a "component"
  - Software is the brains that defines a system's core behavior
- Government must impose testing requirements



Software Testing: Academic View

- 1970s and 1980s: academics looked almost exclusively at unit testing
  - Meanwhile industry & government focused almost exclusively on system testing
- 1990s: some academics looked at system testing, some at integration testing
  - The growth of OO put complexity in the interconnections
- 2000s: academics are trying to move our rich collection of ideas into practice
  - Reliability requirements in industry & government are increasing exponentially


Academics and Practitioners

- Academics focus on coverage criteria: quantitative techniques with strong bases in theory
- Industry has focused on human-driven, domain-knowledge-based, qualitative techniques
- Practitioners say: "criteria-based coverage is too expensive"
- Academics say: "human-based testing is more expensive and ineffective"

Practice is going through a revolution in what testing means to the success of software products.

My Personal Evolution

- In the '80s, my PhD work was on unit testing
  - That was all I saw in testing
- In the '90s I recognized that OO put most of the complexity into software connections
  - Integration testing
- I woke up in the '00s ...
  - Web apps need to be very reliable, and system testing is often appropriate
  - Agitar and Certess convinced me that we must relax theory to put our ideas into practice

Testers ain't mathematicians !


Tech Transition in the 1990s

[Cartoon: "They're teaching a new way of plowing over at the Grange tonight. You going?" "Naw, I already don't plow as good as I know how..."]

"Knowing is not enough, we must apply. Willing is not enough, we must do." (Goethe)

Failures in Production Software

- NASA's Mars lander, September 1999: crashed due to a units integration fault; over $50 million US !
- Huge losses due to web application failures
  - Financial services: $6.5 million per hour
  - Credit card sales applications: $2.4 million per hour
- In Dec 2006, amazon.com's BOGO offer turned into a double discount
- 2007: Symantec says that most security vulnerabilities are due to faulty software

Stronger testing could solve most of these problems. The world-wide monetary loss due to poor software is staggering.

Testing in the 21st Century

- We are going through a dramatic time of change
- Software defines behavior
  - network routers, finance, switching networks, other infrastructure
- Today's software market:
  - is much bigger
  - is more competitive
  - has more users
- The way we use software continues to expand
- Software testing theory is very advanced, yet practice continues to lag

Industry & government organizations are going through a revolution in what testing means to the success of software products.

Testing in the 21st Century

- The web offers a new deployment platform
  - Very competitive, and very available to more users
- Enterprise applications mean bigger programs and more users
- Embedded software is ubiquitous ... check your pockets !
- Paradoxically, free software increases our expectations !
- Security is now all about software faults
  - Secure software is reliable software
- Agile processes put enormous pressure on testers and programmers to test better

There is no theory, little research, and little practical knowledge about how to test integrated software systems.

How to Improve Testing ?

- We need more and better software tools
  - A stunning increase in available tools in the last 10 years!
- We need to adopt practices and techniques that lead to more efficient and effective testing
  - More education
  - Different management and organizational strategies
- Testing / QA teams need to specialize more
  - The same trend happened for development in the 1990s
- Testing / QA teams need more technical expertise
  - Developer expertise has been increasing dramatically


Test Design in Context

- Test design is the process of designing input values that will effectively test software
- Test design is one of several activities for testing software
  - The most mathematical
  - The most technically challenging
- This process is based on my textbook with Ammann, Introduction to Software Testing
  - http://www.cs.gmu.edu/~offutt/softwaretest/

Types of Test Activities

- Testing can be broken up into four general types of activities:
  1. Test Design
     1a) criteria-based
     1b) human-based
  2. Test Automation
  3. Test Execution
  4. Test Evaluation
- Each type of activity requires different skills, background knowledge, education and training
- No reasonable software development organization uses the same people for requirements, design, implementation, integration and configuration control

Why do test organizations still use the same people for all four test activities?? This clearly wastes resources.

Summary of Test Activities

- These four general test activities are quite different
- It is a poor use of resources to use people inappropriately

1a. Design (criteria-based): design test values to satisfy engineering goals (coverage criteria). Requires knowledge of discrete math, programming, and testing.
1b. Design (human-based): design test values from domain knowledge and intuition. Requires knowledge of the domain, the UI, and testing.
2. Automation: embed test values into executable scripts. Requires knowledge of scripting.
3. Execution: run tests on the software and record the results. Requires very little knowledge.
4. Evaluation: evaluate the results of testing and report to developers. Requires domain knowledge.
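To make the split concrete, here is a minimal JUnit sketch of activity 2 (automation): test values chosen during design (1a and 1b) are embedded into an executable script. The method under test and its expected outputs are invented for illustration.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class DiscountTest {

        // Stand-in for the unit under test (hypothetical): orders of
        // $100 or more earn a 10% discount.
        static int discountRate(int orderTotal) {
            return (orderTotal >= 100) ? 10 : 0;
        }

        @Test
        public void criteriaBasedValues() {
            // 1a: values designed to satisfy an engineering goal,
            // here the boundary around the branch at 100
            assertEquals(0, discountRate(99));
            assertEquals(10, discountRate(100));
        }

        @Test
        public void humanBasedValue() {
            // 1b: a value chosen from domain knowledge and intuition:
            // an empty order should never earn a discount
            assertEquals(0, discountRate(0));
        }
    }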

Model-Driven Test Design: Steps

[Diagram: the MDTD process, moving from the implementation abstraction level up to the design abstraction level and back down]

1. Analyze the software artifacts mathematically to build a model / structure (up at the design abstraction level)
2. Apply a coverage criterion to the model to produce test requirements
3. Refine the test requirements into refined requirements / test specs
4. Generate input values that satisfy the specs; domain analysis of the artifacts also feeds the input values
5. Add prefix values, postfix values, and expected results to turn input values into test cases (back at the implementation abstraction level)
6. Automate the test cases into test scripts
7. Execute the scripts to produce test results
8. Evaluate the results: pass / fail
MDTD: Activities

[Diagram: the same chain, grouped by activity. Test design covers everything from the software artifact through the input values; this is where the math lives ("Here be math"). Test execution covers test cases, test scripts, and test results. Test evaluation turns test results into pass / fail.]

Raising our abstraction level makes test design MUCH easier.

Using MDTD in Practice

- This approach lets one test designer do the math
- Then traditional testers and programmers can do their parts:
  - find values
  - automate the tests
  - run the tests
  - evaluate the tests

Testers ain't mathematicians !
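As a worked example of the MDTD chain, here is a minimal sketch on an invented one-branch method: the model is its control flow graph, the criterion is edge coverage, the test requirements are the two branch edges, and the refined values become two JUnit tests.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class MdtdExampleTest {

        // Software artifact: a small method with one decision.
        static int absValue(int x) {
            if (x < 0) {   // a true edge and a false edge leave this decision
                x = -x;
            }
            return x;
        }

        // Model: the control flow graph of absValue.
        // Criterion (edge coverage) -> test requirements: cover both edges.
        // Refine -> input values: x = -3 (true edge), x = 3 (false edge).
        // Automate, execute, evaluate -> the JUnit tests below.

        @Test
        public void coversTrueEdge() {
            assertEquals(3, absValue(-3));
        }

        @Test
        public void coversFalseEdge() {
            assertEquals(3, absValue(3));
        }
    }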


GMU's Software Engineering Lab

- At GMU's Software Engineering Lab, we are committed to finding practical solutions to real problems: useful research in making better software
- We are inventing new testing solutions
  - Web applications and web services
  - Critical software
- We are investigating the current state of the practice and comparing it with available techniques

Mismatch in Needs and Goals

- Industry & contractors want simple and easy testing
  - Testers with no background in computing or math
- Universities are graduating scientists, but industry needs engineers
- Testing needs to be done more rigorously
- Agile processes put lots of demands on testing
  - Programmers have to do unit testing, with no training, education, or tools !
  - Tests are key components of functional requirements ... but who builds those tests ?

Bottom-line result: lots of crappy software.


Quality of Industry Tools

- A recent evaluation of three industrial automatic unit test data generators: JCrasher, TestGen, and JUB
  - They generate tests for Java classes
  - Evaluated on the basis of mutants killed
- Compared with two test criteria:
  - Random test generation (special-purpose tool)
  - Edge coverage criterion (by hand)
- Eight Java classes: 61 methods, 534 LOC, 1070 faults (seeded by mutation)

Shuang Wang and Jeff Offutt, Comparison of Unit-Level Automated Test Generation Tools, Mutation 2009
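For readers unfamiliar with the measure: mutation analysis seeds faults by making small syntactic changes to the program, and a test "kills" a mutant when it makes the mutant behave differently from the original. A minimal invented Java example:

    public class MutantExample {

        // Original unit under test (invented for illustration)
        static int max(int a, int b) {
            return (a > b) ? a : b;
        }

        // One mutant: the relational operator is replaced (> becomes <)
        static int maxMutant(int a, int b) {
            return (a < b) ? a : b;
        }

        public static void main(String[] args) {
            // The input (3, 1) kills the mutant: the versions disagree.
            System.out.println(max(3, 1));        // prints 3
            System.out.println(maxMutant(3, 1));  // prints 1
            // The input (2, 2) would NOT kill it: both return 2.
            // The study reports the fraction of the 1070 seeded
            // mutants killed by each tool's generated tests.
        }
    }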

Unit Level ATDG Results

[Chart: percentage of seeded mutants killed]
- JCrasher: 45%
- TestGen: 40%
- JUB: 33%
- Edge coverage (EC): 68%
- Random: 39%

These tools essentially generate random values !

Quality of Criteria-Based Tests

- In another study, we compared four test criteria: edge-pair, all-uses, prime path, and mutation
  - Generated tests for Java classes
  - Evaluated on the basis of finding hand-seeded faults
- Twenty-nine Java packages: 51 classes, 174 methods, 2909 LOC
- Eighty-eight faults

Nan Li, Upsorn Praphamontripong and Jeff Offutt, An Experimental Comparison of Four Unit Test Criteria: Mutation, Edge-Pair, All-uses and Prime Path Coverage, Mutation 2009

Criteria-Based Test Results

[Chart: faults found, with the number of tests normalized]
- Edge: 35
- Edge-Pair: 54
- All-Uses: 53
- Prime Path: 56
- Mutation: 75

Researchers have invented very powerful techniques.

Industry and Research Tool Gap

- We cannot compare these two studies directly
- However, we can compare the conclusions:
  - Industrial test data generators are ineffective
    - Edge coverage is much better than the tests the tools generated
    - Yet edge coverage is by far the weakest criterion
  - The biggest challenge was hand generation of tests
- Software companies need to test better
- And luckily, we have lots of room for improvement !

Four Roadblocks to Adoption

1. Lack of test education
   - Bill Gates says half of MS engineers are testers, and programmers spend half their time testing
   - Number of UG CS programs in the US that require testing? 0
   - Number of MS CS programs in the US that require testing? 0
   - Number of UG testing classes in the US? ~10
2. Necessity to change process
   - Adoption of many test techniques and tools requires changes in the development process
   - This is very expensive for most software companies
3. Usability of tools
   - Many testing tools require the user to know the underlying theory to use them
   - Do we need to understand an internal combustion engine to drive? Parsing and code generation to use a compiler?
4. Weak and ineffective tools
   - Most test tools don't do much ... but most users do not realize they could be better
   - Few tools solve the key technical problem: generating test values automatically


General Problems with Web Apps

- The web offers a new deployment platform
  - Very competitive, and very available to more users
- Web apps are distributed and must be highly reliable
- Web applications are heterogeneous, dynamic, and must satisfy very high quality attributes
- Use of the Web is hindered by low-quality Web sites and applications
- Web applications need to be built better and tested more
  - Most software faults are introduced during maintenance
  - This is difficult because the web uses new and novel technologies

Technical Web App Issues

1. Software components are extremely loosely coupled
   - Coupled through the Internet, separated by space
   - Coupled to diverse hardware and software applications
2. Potential control flows change dynamically
   - User control: back buttons, URL rewriting, refresh, caching
   - Server control: redirect, forward, include, event listeners
3. State management is completely different (see the sketch below)
   - HTTP is stateless and the software is distributed
   - Traditional object-oriented scopes are not available
   - Instead: page, request, session, and application scope
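As a small illustration of issue 3, here is a hypothetical servlet fragment showing the three server-side scopes named above (page scope exists only inside a JSP). The class and attribute names are invented; recent containers use the jakarta.* packages, older ones javax.*.

    import java.io.IOException;
    import jakarta.servlet.ServletContext;
    import jakarta.servlet.http.HttpServlet;
    import jakarta.servlet.http.HttpServletRequest;
    import jakarta.servlet.http.HttpServletResponse;
    import jakarta.servlet.http.HttpSession;

    public class CartServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // Request scope: visible only while this one response is built.
            req.setAttribute("message", "added to cart");

            // Session scope: survives across requests from the same user;
            // the back button, refresh, and caching can still show stale
            // views of it.
            HttpSession session = req.getSession();
            Integer size = (Integer) session.getAttribute("cartSize");
            session.setAttribute("cartSize", size == null ? 1 : size + 1);

            // Application scope: state shared by every user of the app.
            ServletContext app = getServletContext();
            Integer hits = (Integer) app.getAttribute("hits");
            app.setAttribute("hits", hits == null ? 1 : hits + 1);

            resp.getWriter().println("cart size: " + session.getAttribute("cartSize"));
        }
    }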

Input Space Grammars

Input space: the set of allowable inputs to the software.

- The input space can be described in many ways
  - User manuals
  - Unix man pages
  - Method signatures / collections of method preconditions
  - A language
- Most input spaces can be described as grammars
- Grammars are usually not provided, but creating them is a valuable service by the tester (a sketch follows below)
  - Errors will often be found simply by creating the grammar
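As a minimal sketch, here is an invented grammar for one date-valued input field, checked mechanically with a regular expression. Just writing the grammar down raises the questions (is month 13 legal? is February 31?) that often expose errors by themselves.

    import java.util.regex.Pattern;

    //   date  ::= year "-" month "-" day
    //   year  ::= digit digit digit digit
    //   month ::= "01" | "02" | ... | "12"
    //   day   ::= "01" | "02" | ... | "31"
    public final class DateInput {

        private static final Pattern DATE =
            Pattern.compile("\\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])");

        static boolean isValid(String s) {
            return s != null && DATE.matcher(s).matches();
        }

        public static void main(String[] args) {
            System.out.println(isValid("2007-01-15")); // true
            System.out.println(isValid("2007-13-01")); // false: no month 13
            System.out.println(isValid("Jan 15"));     // false: not in the language
            System.out.println(isValid("2007-02-31")); // true: the written grammar
                                                       // still over-approximates the
                                                       // goal domain (no Feb 31)
        }
    }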

Using Input Grammars

- Software should reject or handle invalid data
  - Programs often do this incorrectly
  - Some programs (rashly) assume all input data is correct
- Even if it works today ...
  - What about after the program goes through some maintenance changes ?
  - What about if the component is reused in a new program ?
- Consequences can be severe
  - The database can be corrupted
  - Users are not satisfied
  - Most security vulnerabilities are due to unhandled exceptions ... from invalid data

Validating Web App Inputs

Input validation: deciding if input values can be processed by the software.

- Before starting to process inputs, wisely written programs check that the inputs are valid
  - How should a program recognize invalid inputs ?
  - What should a program do with invalid inputs ?
- If the input space is described as a grammar, a parser can check for validity automatically
  - This is very rare
- It is easy to write input checkers ... but also easy to make mistakes (see the sketch below)
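Here is a minimal sketch of how such a mistake looks, with an invented field: the buggy checker accepts any input that merely contains valid characters, while the correct one requires the whole input to be in the language.

    import java.util.regex.Pattern;

    public final class QuantityCheck {

        private static final Pattern DIGITS = Pattern.compile("[0-9]+");

        // Buggy: find() succeeds if ANY substring matches.
        static boolean buggyIsValid(String s) {
            return DIGITS.matcher(s).find();
        }

        // Correct: matches() requires the entire input to match.
        static boolean isValid(String s) {
            return DIGITS.matcher(s).matches();
        }

        public static void main(String[] args) {
            String attack = "12; DROP TABLE orders";
            System.out.println(buggyIsValid(attack)); // true  (wrong!)
            System.out.println(isValid(attack));      // false (right)
        }
    }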

Representing Input Domains

[Diagram: three views of an input domain]
- Desired inputs (the goal domain)
- Described inputs (the specified domain)
- Accepted inputs (the implemented domain)

Representing Input Domains

- Goal domains are often irregular
- Goal domain for credit cards:
  - The first digit is the Major Industry Identifier (bank, government, ...)
  - The first 6 digits and the length specify the issuer
  - The final digit is a "check digit"
  - The other digits identify a specific account
  - More details are at: http://www.merriampark.com/anatomycc.htm
- Common specified domain:
  - The first digit is in { 3, 4, 5, 6 } (travel and banking)
  - The length is between 13 and 16
- Common implemented domain (see the sketch below):
  - All digits are numeric
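The three domains above can be written directly as code. A minimal sketch: the check-digit rule in the goal domain is the standard Luhn algorithm, and the sample number is a made-up test value, not a real account.

    public final class CardDomains {

        // Implemented domain: all digits are numeric.
        static boolean implemented(String n) {
            return n.matches("[0-9]+");
        }

        // Specified domain: first digit in {3,4,5,6}, length 13 to 16.
        static boolean specified(String n) {
            return n.matches("[3-6][0-9]{12,15}");
        }

        // Goal domain (approximated): specified, plus a valid check digit
        // computed with the standard Luhn algorithm.
        static boolean goal(String n) {
            if (!specified(n)) return false;
            int sum = 0;
            boolean dbl = false;          // double every 2nd digit from the right
            for (int i = n.length() - 1; i >= 0; i--) {
                int d = n.charAt(i) - '0';
                if (dbl) { d *= 2; if (d > 9) d -= 9; }
                sum += d;
                dbl = !dbl;
            }
            return sum % 10 == 0;         // valid numbers sum to a multiple of 10
        }

        public static void main(String[] args) {
            String input = "4111111111111112";       // right shape, bad check digit
            System.out.println(implemented(input));  // true
            System.out.println(specified(input));    // true
            System.out.println(goal(input));         // false: the gap where faults hide
        }
    }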

Representing Input Domains

[Diagram: the goal, specified, and implemented domains only partially overlap. The regions where they disagree are a rich source of software errors ...]

Web Application Input Validation

[Diagram: a client sends data to a server, and both sides "check data". Bad data corrupts the database, crashes the server, and causes security violations. Malicious data can "bypass" the client-side data checking entirely.]
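A minimal sketch of a bypass test, with a hypothetical URL and form fields: the request is sent straight to the server, carrying values the client-side checks would never allow. It uses Java's built-in HTTP client (Java 11+).

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class BypassProbe {
        public static void main(String[] args) throws Exception {
            // Suppose the HTML form limits qty to 1..9 with JavaScript.
            // We skip the form and post malformed values directly.
            String body = "qty=-1&item=<script>alert(1)</script>";

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://example.com/app/order"))
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());

            // A well-built app rejects this cleanly (a 4xx and an error
            // page); a stack trace or a 500 counts as a failure.
            System.out.println(response.statusCode());
        }
    }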

Bypass Testing Results

[Slide: screenshots of bypass testing results]

Vasileios Papadimitriou, masters thesis, Automating Bypass Testing for Web Applications, GMU 2006

Theory to Practice: Bypass Testing

- Inventions from scientists are slow to move into practice
- We wanted to investigate whether the obstacles are:
  1. technical difficulties of applying the ideas to practical use
  2. social barriers
  3. business constraints
- We tried to transition bypass testing to the research arm of Avaya Research Labs

Offutt, Wang and Ordille, An Industrial Case Study of Bypass Testing on Web Applications, ICST 2008

Avaya Bypass Testing Results

- Six screens were tested
- Tests are invalid inputs: exceptions are expected
- Effects on the back-end were not checked
- Failure analysis was based on response screens

Web Screen           | Tests | Failing Tests | Unique Failures
Points of Contact    |   42  |      23       |       12
Time Profile         |   53  |      23       |       23
Notification Profile |   34  |      12       |        6
Notification Filter  |   26  |      16       |        7
Change PIN           |    5  |       1       |        1
Create Account       |   24  |      17       |       14
TOTAL                |  184  |      92       |       63

A 33% "efficiency" rate is spectacular!


Needs From Researchers

1. Isolate: invent processes and techniques that isolate the theory from most test practitioners
2. Disguise: discover engineering techniques, standards, and frameworks that disguise the theory
3. Embed: put theoretical ideas into tools
4. Experiment: demonstrate the economic value of criteria-based testing and ATDG
   - Which criteria should be used, and when ?
   - When does the extra effort pay off ?
5. Integrate: integrate high-end testing with development

Needs From Educators

1. Disguise theory from engineers in classes
2. Omit theory when it is not needed
3. Restructure the curriculum to teach more than test design and theory:
   - test automation
   - test evaluation
   - human-based testing
   - test-driven development

Changes in Practice

1. Reorganize test and QA teams to make effective use of individual abilities
   - One math-head can support many testers
2. Retrain test and QA teams
   - Use a process like MDTD
   - Learn more of the concepts in testing
3. Encourage researchers to embed and isolate
   - We are very responsive to research grants
4. Get involved in curricular design efforts through industrial advisory boards

Future of Software Testing

1. Increased specialization in testing teams will lead to more efficient and effective testing
2. Testing and QA teams will have more technical expertise
3. Developers will have more knowledge about testing, and more motivation to test better
4. Agile processes put testing first, putting pressure on both testers and developers to test better
5. Testing and security are starting to merge
6. We will develop new ways to test connections within software-based systems

Contact

Jeff Offutt
offutt@gmu.edu
http://cs.gmu.edu/~offutt/