
Nov 7, 2013 (3 years and 9 months ago)



Software Measurement:

Uncertainty and Causal Modeling

Norman Fenton, Paul Krause, Martin Neil

Review of Article by: Chua Jit Chee (S1)


This article is about the important role of
software measurements in process control and
risk management. In the past, project managers
and quality engineers relied on simple regression
models that fail to take into account the major
causal influences on a project’s quality goals.
The article suggests using Bayesian networks to
construct models that encapsulate the causal
influences on a development project.

1. Introduction

The two specific roles of software measurement
are quality control and effort estimation. There
are two different viewpoints of software quality:
the external and the internal product view.

The external product view looks at the
characteristics that make up the user’s perception
of quality in the final product and can only be
assessed when the product is complete. The
internal product view identifies factors that can
be used to control the quality of the software as
it is being produced. These factors can form early
predictions of the external product quality.
However, the relationships between external and
internal quality are uncertain.

Project managers use informal assessments of
factors such as complexity measures and test
results to predict the final product’s quality
during software development. However, there is
no formal attempt to combine the evidence
collected into a single quality model. This has
resulted in the use of naïve regression models.

The problem with naïve regression models is
that they pick out one simple measurement and
conclude from it that there is a potential hazard.
The authors highlight the example of the
development of a product from a set of software
modules. We may assume that the modules with
the highest number of defects during testing
would have the highest risk of causing a failure
once the product is in operation. However, two
published studies indicate the opposite effect.

Because of these problems, the authors propose
identifying the causal influences in a problem of
interest. As in the earlier example, we cannot
judge software quality from defect data alone.
We must at least take into account the
effectiveness of the testing.

Then we have to augment the causal model with
a calculus that lets us update the model as we get
new evidence. The authors suggest doing this by
assigning probability tables to the nodes in the
model and then using Bayes’ theorem to revise
the probabilities as we obtain new information
about a specific problem.

This kind of approach, known as a Bayesian
network, models problems that involve
uncertainty. A Bayesian network is a directed
graph whose nodes are the uncertain variables
and whose edges are the causal or influential
links between the variables. Associated with each
node is a set of conditional probability functions
that model the uncertain relationship between the
node and its parents. I will elaborate on causal
models and Bayesian networks in the next
section.

2. Elaboration on the techniques

2.1 Causal models

The authors illustrate how a causal model handles
the relationship between defects detected during
software testing and residual defects delivered.

Fig 1. Simple causal model: test effectiveness (TE) and defects present (DP) both influence defects detected (DD)

The attribute we are interested in is the defects
present. The defects present have a causal
influence on the number of defects detected
during testing. Test effectiveness also has a
causal influence on the number of defects
detected and fixed.

From the model, we can reason from cause to
effect. If test effectiveness is low, then the
number of defects detected will also be low.
However, it will also be low if the number of
defects present is low.

Causal models also allow us to reason in both
the forward and the reverse direction. We can
identify the possible causes given the observation
of some effect. In this example, if the number of
defects detected is low, the possible causes are
low test effectiveness and a low number of
defects present.
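
The forward and reverse reasoning described above can be sketched by enumerating the three-node model directly. This is only an illustration: every probability value below is hypothetical and is not taken from the article, and the names `joint` and `posterior` are my own.

```python
from itertools import product

LEVELS = ("high", "low")

# Hypothetical priors for the two cause nodes (not from the article).
P_TE = {"high": 0.2, "low": 0.8}  # test effectiveness
P_DP = {"high": 0.5, "low": 0.5}  # defects present

# Hypothetical P(DD = high | DP, TE): many defects are detected only
# when many are present and the testing is effective.
P_DD_HIGH = {
    ("high", "high"): 0.9,
    ("high", "low"): 0.3,
    ("low", "high"): 0.1,
    ("low", "low"): 0.05,
}


def joint(dp, te, dd):
    """Joint probability of one assignment of DP, TE and DD."""
    p_high = P_DD_HIGH[(dp, te)]
    p_dd = p_high if dd == "high" else 1.0 - p_high
    return P_DP[dp] * P_TE[te] * p_dd


def posterior(target, value, evidence):
    """P(target = value | evidence), summing over the joint distribution."""
    num = den = 0.0
    for dp, te, dd in product(LEVELS, repeat=3):
        world = {"DP": dp, "TE": te, "DD": dd}
        if any(world[name] != seen for name, seen in evidence.items()):
            continue  # this assignment contradicts the evidence
        p = joint(dp, te, dd)
        den += p
        if world[target] == value:
            num += p
    return num / den


# Forward (cause to effect): low test effectiveness makes few detections likelier.
print(posterior("DD", "low", {"TE": "low"}), posterior("DD", "low", {"TE": "high"}))

# Reverse (effect to cause): few detections raise belief in both possible causes.
print(posterior("TE", "low", {"DD": "low"}), posterior("DP", "low", {"DD": "low"}))
```

With these assumed tables, observing few detected defects raises the probability of both low test effectiveness and few defects present above their priors, which is the ambiguity the causal model is meant to expose.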

2.2 Bayesian networks

A Bayesian network combines a causal graphical
model with node probability tables. That is, we
assign probabilities to the nodes in the model and
apply Bayes’ theorem:

\[
p(E \mid C) = \frac{p(C \mid E)\, p(E)}{p(C)}
\]


Using the model in Fig 1, with test effectiveness
(TE), defects present (DP), and defects detected
(DD) as the nodes, we will explain this idea by
looking at one branch of the model.
Suppose the likelihood p(DD = High | TE = High)
= 0.8. This is the probability that DD is high
given that TE is high. We want to find
p(TE = High | DD = High), which is the
probability that the product is well tested when
the number of defects detected is high. To
calculate this we need to know p(TE = High) and
p(DD = High). We will take p(TE = High) = 0.2 as
a measure of the typical case and p(DD = High) =
0.5, assuming the number of defects detected is
equally likely to be high as low.

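\[
p(\mathrm{TE} = \mathrm{High} \mid \mathrm{DD} = \mathrm{High})
= \frac{p(\mathrm{DD} = \mathrm{High} \mid \mathrm{TE} = \mathrm{High})\, p(\mathrm{TE} = \mathrm{High})}{p(\mathrm{DD} = \mathrm{High})}
= \frac{0.8 \times 0.2}{0.5} = 0.32
\]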



We see that, given a high number of defects
detected, the probability that the product was
effectively tested has increased from 0.2 to 0.32.
This approach can be used with more than two
variables.
2.3 AID Tool

The authors also make use of a tool called AID
(assess, improve, decide) to illustrate the
relationship between defects detected during test
and residual defects delivered. This tool allows
us to assign one of the five states, ranging from
very simple to very complex for the intrinsic
complexity of the problem.

The tool plots bar charts for both very complex
and very simple modules, with the probability
that the number of defects falls within a certain
interval on the y axis and the interval values for
the number of defects detected and fixed on the
x axis. The median is 30 for a very complex
module and 125 for a very simple one, showing
that more complex modules are harder to test.
The respective figures for residual defects
delivered are 70 for very complex and 30 for very
simple modules. The more complex module is
predicted to have more residual defects, which
corresponds to real-world data.

3. Topics related to articles

This article is not really related to any of the
topics discussed in the SC207 lectures. However,
I learned about Bayes’ theorem in SC109, and this
article shows me how it can be of practical use in
software measurement.

I think it is also somewhat related to the software
cost estimation lectures, as they have the same
purpose: to eliminate software failures through
some form of calculation. In cost estimation, we
use the COCOMO model and FP. In fact, a
Bayesian network is a dynamic prediction model,
which is better than static prediction models like
COCOMO. A Bayesian network is better because
it deals with uncertainty, while COCOMO does
not take into account the causal relationships
that exist between the various variables.

4. New contributions

The techniques discussed in this article can help
to reduce risk when developing software.

In particular, causal predictive models can be
used to help identify potential hazards if some
aspect of a process underperforms, and to
identify improvement measures if a current
prediction fails to meet a target.

Risk management is no longer associated only
with critical systems. It is also very important to
a company’s reputation and profits, so software
measurement is a very important topic.

5. Relation to Lab project

My lab project is the development of a timetable
planner. It is a small-scale project, so software
measurement is not really crucial to its
development.

I think it is not an easy task to come up with a
causal model for my project. It is difficult to
identify the causal influences related to my
project, and the article does not really mention
ways to help identify them.

Bayesian networks are extremely computationally
intensive, and my team does not have much time
to implement one. In addition, it is very difficult
or almost impossible to know the probabilities of
the causal influences related to my project,
because I do not have typical cases of such
projects to get the probabilities from.

However, it might still be feasible if a few of my
team members focus on gathering statistics
related to our project from past years’ teams in
order to derive the probabilities of the causal
influences.

6. Possible extensions

I would like to add an important issue: how to
determine whether a node in the Bayesian model
is updated given new evidence. We can do so by
looking at the three types of dependency
connection, or “d-connection”. I think they are
very efficient guides.

Fig a. Pipeline

a) Serial d-connection: Node C is conditionally
dependent on B, and B is conditionally dependent
on A. Entering evidence at node A or C will lead
to an update in the probability distribution of B.




However, if we enter evidence at node B, nodes A
and C are conditionally independent given that
evidence. This means that evidence at node B
“blocks” the path between A and C.

Fig b. Converging

b) Converging d-connection: Node B is
conditionally dependent on nodes A and C.
Entering evidence at node A will update node B
but will have no effect on node C. If we have
evidence at node B, then entering evidence at
node A will update node C. Here, nodes A and C
are conditionally dependent given evidence at
node B.

Fig c. Diverging

c) Diverging d-connection: Nodes A and C are
conditionally dependent on node B. Entering
evidence at node B will affect nodes A and C, but
if we then enter evidence at node A it will not
affect C, because there is already evidence at
node B. Here, nodes A and C are conditionally
independent given evidence at node B.
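
The three connection types can be checked numerically by building a tiny joint distribution for each structure and comparing the belief in C with and without evidence at B. This is a sketch under stated assumptions: the nodes are binary, all probability values are hypothetical, and the helper names (`bern`, `serial`, `converging`, `diverging`, `prob_c`) are my own.

```python
from itertools import product

STATES = (0, 1)


def bern(p_one, value):
    """Probability of a binary value when P(value = 1) is p_one."""
    return p_one if value == 1 else 1.0 - p_one


def serial(a, b, c):
    """A -> B -> C (hypothetical numbers)."""
    return bern(0.3, a) * bern(0.9 if a else 0.2, b) * bern(0.8 if b else 0.1, c)


def converging(a, b, c):
    """A -> B <- C (hypothetical numbers)."""
    return bern(0.3, a) * bern(0.6, c) * bern(0.9 if (a or c) else 0.1, b)


def diverging(a, b, c):
    """A <- B -> C (hypothetical numbers)."""
    return bern(0.4, b) * bern(0.7 if b else 0.2, a) * bern(0.8 if b else 0.1, c)


def prob_c(joint, **evidence):
    """P(C = 1 | evidence), with evidence passed as a=... and/or b=..."""
    num = den = 0.0
    for a, b, c in product(STATES, repeat=3):
        world = {"a": a, "b": b, "c": c}
        if any(world[name] != seen for name, seen in evidence.items()):
            continue  # this assignment contradicts the evidence
        p = joint(a, b, c)
        den += p
        num += p * c
    return num / den


for name, model in (("serial", serial), ("converging", converging),
                    ("diverging", diverging)):
    # How much does evidence at A move the belief in C, without and with B known?
    shift_b_unknown = abs(prob_c(model, a=1) - prob_c(model, a=0))
    shift_b_known = abs(prob_c(model, a=1, b=1) - prob_c(model, a=0, b=1))
    print(name, round(shift_b_unknown, 3), round(shift_b_known, 3))
```

In the serial and diverging cases the shift disappears once B is known, while in the converging case it appears only once B is known, matching the three descriptions above.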

7. Critical comments on notations

Figure 1a in the article shows a hypothetical plot
of prerelease against post-release defects for a
range of modules using regression models. Dots
are used to represent the modules in the graphs.

Figure 1b in the article shows the actual case.

I think the two diagrams are very hard to read
and understand at a glance with so many dots on
the plots. It would be better to use lines or bar
charts. The authors could compute the average
number of post-release defects for modules with
the same number of prerelease defects and plot
these as lines or bars instead of showing so many
modules on the diagrams. My suggested versions
of figures 1a and 1b are shown in Fig 2 and Fig 3.
Fig 2 makes it easy to see that, when a regression
model is used, modules with more prerelease
defects will have more post-release defects. The
actual case in Fig 3 is the opposite.




Fig 2. Suggested figure 1a in article (x axis: prerelease defects)

Fig 3. Suggested figure 1b in article (x axis: prerelease defects)
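
The averaging I suggest for these plots can be sketched as follows. The module data here is invented purely for illustration and does not come from the article; `mean_post_by_pre` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented (prerelease defects, post-release defects) pairs, one per module.
modules = [(0, 4), (0, 6), (5, 3), (5, 1), (10, 0), (10, 2)]


def mean_post_by_pre(pairs):
    """Average post-release defects over modules sharing a prerelease count."""
    groups = defaultdict(list)
    for pre, post in pairs:
        groups[pre].append(post)
    return {pre: sum(posts) / len(posts) for pre, posts in sorted(groups.items())}


averages = mean_post_by_pre(modules)
for pre, mean_post in averages.items():
    print(pre, mean_post)  # one point per bar or line segment
```

Each prerelease count then gives a single point to plot, instead of one dot per module.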

Other than that, the other bar charts related to the
AID tool are quite clear with the related
information including mean and median
highlighted near the diagrams.

8. Validation of the ideas

According to the IEEE Trans. Software Eng.
reference in the article, “A Probabilistic Model for
Software Defect Prediction” by N.E. Fenton, P.J.
Krause, and M. Neil, the AID tool has been
validated against data from over 20 development
projects within Philips. 15 of the 20 projects with
extensive data available produced predictions
within 10 percent of the actual values. The rest
involved a lot of graphical user interface work or
subsystems that integrated modules developed at
different sites, which is beyond the expected
usage of the AID tool. These results from real
projects are strong evidence that the AID tool is
very reliable in its predictions.

9. Comments on results and relevance to the
immediate future

The results are shown in great detail with the
support of various diagrams. I think it would be
better to highlight more real-life results of the
use of Bayesian networks. In this article, the
authors mention real-life examples only very
briefly.

I think that the ideas in this article, especially
those on Bayesian networks, will help project
managers build better prediction models than the
current approaches allow.

10. Related Work


Authors: Rodriguez, D.; Harrison, R.; Satpathy, M.; Dolado, J.
Title: An investigation of prediction models for project management
Summary: Experiments to show that prediction models are indeed helpful to project managers in making predictions.
Relation to main article: Gives more real-life results to support the ideas in the main article.

Authors: N. Ohlsson and N.E. Fenton
Title: Quantitative analysis of faults and failures in a complex software system
Summary: Describes a number of results from a quantitative study of faults and failures in two releases of a major commercial software system.
Relation to main article: More details on the results of the example in the main article about post-release defects and prerelease defects.

Authors: N.E. Fenton and M. Neil
Title: A critique of software defect prediction models
Summary: Shows the problems with current defect prediction approaches and recommends Bayesian belief networks.
Relation to main article: More on the Bayesian networks mentioned in the main article.