SC207 SOFTWARE ENGINEERING TERM PAPER 2003
Software Measurement:
Uncertainty and Causal Modeling
Norman Fenton, Paul Krause, Martin Neil
Review of Article by: Chua Jit Chee (S1)
Abstract
This article is about the important role of
software measurement in process control and
risk management. In the past, project managers
and quality engineers relied on simple regression
models that fail to take into account the major
causal influences on a project’s quality goals.
The article suggests using Bayesian
networks to construct models that encapsulate
the causal influences on a development project.
1. Introduction
The two specific roles of software measurement
are quality control and effort estimation. There
are two different viewpoints of software quality:
the external and the internal product
views.
The external product view looks at the characteristics
that make up the user’s perception of quality in
the final product, and it can only be assessed when the
product is complete. The internal product view
includes factors that can be used to control the
quality of the software as it is being produced.
These factors can form early predictions of the
external product quality. However, the
relationships between external and internal
quality are uncertain.
Typically, project managers use informal
assessments of factors such as complexity
measures and test results to predict the final
product’s quality during software
development. However, there is no formal
attempt to combine the evidence collected into a
single quality model. This has resulted in the use of
naïve regression models.
The problem with naïve regression models is that
they single out one simple measurement
and conclude from it that there is a potential hazard. The
authors highlight the example of the
development of a product from a set of software
modules. We may assume that the modules with
the highest number of defects during testing
would have the highest risk of causing a failure
once the product is in operation. However, two
published studies indicate the opposite effect.
Due to the above problems, the authors proposed
the identification of causal influences in a
problem of interest. As in the earlier example, we
cannot judge the software quality from defect
data alone. We must at least take
into account the
effectiveness of the testing.
Then we have to augment the causal model with
a calculus that lets us update the model as we get
new evidence. The authors suggested that we
could do this by assigning probability tables to
the nodes in the
model and then using Bayes’
theorem to revise the probabilities as we obtain
new information about a specific problem.
This kind of approach is known as a Bayesian
network, a model for problems that involve
uncertainty. A Bayesian network is a directed
graph whose nodes are the uncertain variables
and whose edges are the causal or influential
links between the variables. Associated with each
node is a set of conditional probability functions
that model the uncertain relationship between the
node and its parents.
I will elaborate on
causal models and Bayesian networks in the
techniques section.
2. Elaboration on the techniques
2.1 Causal models
The authors illustrate how a causal model handles
the relationship between defects detected during
software testing and residual defects delivered.
Fig 1. Simple causal model
The attribute we are interested in is the defects
present. The defects present have a causal
influence on the number of defects detected
during testing. Test effectiveness also has a
causal influence on the number of defects
detected and fixed.
From the model, we can reason from cause to effect. If
test effectiveness is low, then the number of
defects detected will also be low. However, it
will also be low if the number of defects
present is low.
Causal models also allow us to reason in both the
forward and reverse directions. We can identify
the possible causes given the observation of some
effect. In this example, if the number of defects
detected is low, the possible causes are low test
effectiveness and a low number of defects
present.
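The forward and reverse reasoning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give the probability tables for Fig 1, so every number below, and the helper names `joint` and `query`, are hypothetical.

```python
from itertools import product

# Hypothetical prior probabilities (assumptions, not taken from the article).
P_DP = {"low": 0.5, "high": 0.5}   # defects present (DP)
P_TE = {"low": 0.8, "high": 0.2}   # test effectiveness (TE)

# Hypothetical node probability table: p(DD = high | DP, TE).
P_DD_HIGH = {
    ("low", "low"): 0.05,
    ("low", "high"): 0.20,
    ("high", "low"): 0.30,
    ("high", "high"): 0.90,
}


def joint(dp, te, dd):
    """Joint probability p(DP = dp, TE = te, DD = dd)."""
    p_high = P_DD_HIGH[(dp, te)]
    p_dd = p_high if dd == "high" else 1.0 - p_high
    return P_DP[dp] * P_TE[te] * p_dd


def query(target, evidence):
    """Return p(target | evidence) by summing the joint over all states."""
    totals = {"low": 0.0, "high": 0.0}
    for dp, te, dd in product(("low", "high"), repeat=3):
        state = {"DP": dp, "TE": te, "DD": dd}
        if all(state[name] == value for name, value in evidence.items()):
            totals[state[target]] += joint(dp, te, dd)
    norm = sum(totals.values())
    return {value: p / norm for value, p in totals.items()}


# Forward reasoning: from a cause (low test effectiveness) to the effect.
forward = query("DD", {"TE": "low"})

# Reverse reasoning: from the effect (few defects detected) to its causes.
reverse_te = query("TE", {"DD": "low"})
reverse_dp = query("DP", {"DD": "low"})
```

With these assumed tables, observing a low number of detected defects raises the probability of both candidate causes (low test effectiveness and few defects present) above their priors, which is the reverse reasoning the paragraph describes.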
2.2 Bayesian networks
A Bayesian network is the combination of a causal
graphical model with node probability tables.
That is, we assign probabilities to the nodes in the
model and use Bayes’ theorem, as shown
below.
$$p(E \mid C) = \frac{p(C \mid E)\,p(E)}{p(C)}$$
Using the model in fig 1 and the variables in the
parentheses to represent the nodes, we will
explain this idea by looking at one branch of the
model.
Suppose the likelihood p(DD = High | TE = High)
= 0.8. This is the probability that DD
is high given that TE is high. We want to find
p(TE = High | DD = High), which is the
probability that the product is well tested when
the number of defects detected is high. We need
to know p(TE = High) and p(DD = High) to
calculate this. We will take p(TE = High) = 0.2 as
a measure of the typical case and p(DD = High) =
0.5, assuming the number of defects detected is
equally likely to be high or low.
$$p(TE = \text{High} \mid DD = \text{High}) = \frac{p(DD = \text{High} \mid TE = \text{High})\,p(TE = \text{High})}{p(DD = \text{High})} = \frac{0.8 \times 0.2}{0.5} = 0.32$$
We see that, given a high number of detected
defects, the probability that the product was
effectively tested increases from 0.2 to 0.32.
This approach can be extended to more than two
variables.
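The calculation above can be checked with a short Python sketch. The three input values come from the text; the function name `posterior` is my own, and the last value, p(DD = High | TE = Low), is not stated in the article but is what the law of total probability implies for these inputs.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' theorem: p(H | E) = p(E | H) * p(H) / p(E)."""
    if evidence <= 0:
        raise ValueError("p(E) must be positive")
    return likelihood * prior / evidence


p_dd_given_te = 0.8   # p(DD = High | TE = High), from the text
p_te = 0.2            # p(TE = High), from the text
p_dd = 0.5            # p(DD = High), from the text

p_te_given_dd = posterior(p_dd_given_te, p_te, p_dd)
print(round(p_te_given_dd, 2))  # 0.32

# Derived, not stated in the article: by the law of total probability,
# p(DD=High) = p(DD=High | TE=High) p(TE=High) + p(DD=High | TE=Low) p(TE=Low),
# so the three inputs above fix the remaining likelihood.
p_dd_given_te_low = (p_dd - p_dd_given_te * p_te) / (1 - p_te)
```

The derived likelihood works out to about 0.425, so the inputs are consistent with each other: a high defect count is more likely under effective testing (0.8) than under ineffective testing, which is why the evidence raises p(TE = High).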
2.3 AID Tool
The authors also use a tool called AID
(assess, improve, decide) to illustrate the
relationship between defects detected during testing
and residual defects delivered. This tool allows
us to assign one of five states, ranging from
very simple to very complex, to the intrinsic
complexity of the problem.
(Fig 1 nodes: Test effectiveness (TE), Defects present (DP), Defects detected (DD))
Bar charts are plotted for both very complex and
very simple modules, with the probability that the
number of defects will fall within a certain
interval on the y axis and the interval values for the
number of defects detected and fixed on the x axis.
The median for the very
complex module is 30 and for the very simple module is 125,
showing that more complex modules are harder
to test. The respective figures for residual defects
delivered are 70 for very complex and 30 for very
simple modules. The more complex module is
predicted to have more residual defects, which
corresponds to real-world data.
3. Topics related to articles
This article is not really related to any of the
topics discussed in the SC207 lectures. However,
I learned about Bayes’ theorem in
SC109, and this article shows me how it can be of
practical use in software measurement.
I think it is also somewhat related to the software cost
estimation lectures, as they have the same purpose
of eliminating software failures through some form
of calculation. In cost estimation, we use the
COCOMO model and function points (FP). In fact, a Bayesian
network is a dynamic prediction model, which is
better than static prediction models like the
COCOMO model. A Bayesian network is better because it
deals with uncertainties, while COCOMO does not
take into account the causal relationships that
exist between the various variables.
4. New contributions
The techniques discussed in this article can help
to reduce risk when developing software.
In particular, the causal predictive models can be
used to help identify potential hazards if some
aspect of a process underperforms and also to
identify improvement measures if a current
prediction fails to meet a target.
Risk management is no longer only associated with
safety-critical systems; it is also very important to
a company’s reputation and profits, so software
measurement is a very important topic.
5. Relation to Lab project
My lab project is about the development of a
timetable planner. It is a small-scale project, so
software measurement is not really crucial to
the development of the project.
I think it is not an easy task to come up with a
causal model for my project. It is difficult to
identify the causal influences related to my project,
and this article does not really mention ways to
help identify them.
Bayesian networks are extremely computationally
intensive, and my team does not have that much
time to implement one. In addition, it is very
difficult or almost impossible to know the
probabilities of the causal influences related to
my project. This is because I do not have typical
cases of such projects to get the probabilities
from.
However, it might still be feasible if a few of my
team members can focus on gathering statistics
related to our project from past years’ teams
in order to derive the probabilities of the causal
influences.
6. Possible extensions
I would like to include an important issue: how
to determine whether a node in the Bayesian
model is updated given new evidence. We can do so by
looking at the three types of dependency
connection or “d-connection”. I think they are
very efficient guides.
Fig a. Pipeline (nodes A, B and C in series)
a) Serial d-connection: Node C is conditionally
dependent on B, and B is conditionally dependent on
A. Entering evidence at node A or C will lead to
an update in the probability distribution of B.
However, if we enter evidence at node B, we can
see that nodes A and C are conditionally
independent given evidence at node B. This
means that evidence at node B “blocks the
pipeline”.
Fig b. Converging
b) Converging d-connection: Node B is
conditionally dependent on nodes A and C.
Entering evidence at node A will update node B
but will have no effect on node C. If we have
entered evidence at node B, then entering
evidence at node A will update node C. Here
nodes A and C are conditionally dependent given
evidence at node B.
Fig c. Diverging
c) Diverging d-connection: Nodes A and C are
conditionally dependent on node B. Entering
evidence at node B will affect nodes A and C, but
if we then enter evidence at node A it will not
affect C when there is evidence at node B. Here,
nodes A and C are conditionally independent
given evidence at node B.
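The three d-connection rules above can be checked numerically on small binary networks. The Python sketch below is only an illustration: the probabilities are hypothetical values I chose, and the helper names (`bern`, `prob`, `a_affects_c`) are my own, not from the article.

```python
from itertools import product

STATES = (0, 1)


def bern(p_one, value):
    """Probability of a binary value when p(value = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# All numbers below are hypothetical, chosen only for illustration.
def serial(a, b, c):
    """Joint for A -> B -> C."""
    return bern(0.3, a) * bern(0.9 if a else 0.2, b) * bern(0.8 if b else 0.1, c)


def converging(a, b, c):
    """Joint for A -> B <- C."""
    return bern(0.3, a) * bern(0.6, c) * bern(0.9 if (a or c) else 0.1, b)


def diverging(a, b, c):
    """Joint for A <- B -> C."""
    return bern(0.4, b) * bern(0.7 if b else 0.2, a) * bern(0.8 if b else 0.1, c)


def prob(joint, **fixed):
    """Sum the joint over every state consistent with the fixed values."""
    total = 0.0
    for a, b, c in product(STATES, repeat=3):
        state = {"a": a, "b": b, "c": c}
        if all(state[name] == value for name, value in fixed.items()):
            total += joint(a, b, c)
    return total


def a_affects_c(joint, given_b=None):
    """True if evidence at A changes p(C = 1), optionally with B observed."""
    base = {} if given_b is None else {"b": given_b}
    results = []
    for a in STATES:
        numerator = prob(joint, a=a, c=1, **base)
        denominator = prob(joint, a=a, **base)
        results.append(numerator / denominator)
    return abs(results[0] - results[1]) > 1e-9


for name, joint in (("serial", serial), ("converging", converging),
                    ("diverging", diverging)):
    print(name, a_affects_c(joint), a_affects_c(joint, given_b=1))
```

For the serial and diverging networks, evidence at A changes C only while B is unobserved. For the converging network it is the other way round: A and C are independent until evidence is entered at B.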
7. Critical comments on notations/diagrams
Figure 1a in the article shows a hypothetical
plot of prerelease against post-release defects for
a range of modules using regression models. Dots
are used to represent modules in the graphs.
Figure 1b in the article shows the actual case
reported by studies.
I think the two diagrams are very hard to see and
understand at a glance with so many dots on the
plots. It would be better if they used lines or bar
charts. They could compute the average number of
post-release defects for modules with the same
number of prerelease defects to plot the lines or
bar charts instead of having so many modules on
the diagrams. My suggested versions of figures 1a
and 1b are shown below in Fig 2 and Fig 3. It can
be seen easily in Fig 2 that the modules with more
prerelease defects will have more
post-release defects when a regression
model is used. The actual case in Fig 3 is the
opposite.
Fig 2. Suggested figure 1a in article (axes: post-release defects against prerelease defects)
Fig 3. Suggested figure 1b in article (axes: post-release defects against prerelease defects)
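The averaging I suggest for Fig 2 and Fig 3 could be computed along the lines of the Python sketch below. The module data is invented purely for illustration and does not come from the article or from the published studies; the function name `average_post_release` is my own.

```python
from collections import defaultdict


def average_post_release(modules):
    """Map each prerelease defect count to the mean post-release defect count."""
    groups = defaultdict(list)
    for prerelease, post_release in modules:
        groups[prerelease].append(post_release)
    return {pre: sum(posts) / len(posts) for pre, posts in sorted(groups.items())}


# Hypothetical (prerelease, post-release) pairs, one per module.
modules = [(0, 4), (0, 6), (5, 3), (5, 1), (10, 0), (10, 2)]

averages = average_post_release(modules)
for prerelease, mean_post in averages.items():
    print(prerelease, mean_post)
```

Each entry in `averages` would become one bar or one point on the line, so the chart has one mark per prerelease defect count instead of one dot per module.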
Apart from these, the bar charts related to the
AID tool are quite clear, with the related
information, including the mean and median,
highlighted near the diagrams.
8. Validation of the ideas
According to the IEEE Trans. Software Eng. reference in
the article, titled “A Probabilistic Model for
Software Defect Prediction” by N.E. Fenton, P.J.
Krause, and M. Neil, the AID tool has been
validated against data from over 20 development
projects within Philips. 15 of the 20 projects with
extensive data available produced predictions
within 10 percent of the actual values. The rest
made heavy use of graphical user interfaces or
involved subsystems that integrated modules
developed at different sites. These are beyond the
expected usage of the AID tool. These results from
real projects are strong evidence that the AID
tool is very reliable in its predictions.
9. Comments on results and relevance to the
immediate future
The results are shown in great detail with the
support of various diagrams. I think it would be
better to highlight more real-life results of the
usage of Bayesian networks. In this article, the
authors mention real-life examples only very
briefly.
I think that the ideas in this article, especially
Bayesian networks, will help
project managers to build better prediction
models than the current approaches.
10. Related Work
Author(s): Rodriguez, D.; Harrison, R.; Satpathy, M.; Dolado, J.
Year: 2002
Article: An investigation of prediction models for project management
Description: Experiments to show that prediction models are indeed helpful to project managers in making predictions.
Relation to main article: Gives more real-life results to support the ideas in the main article.

Author(s): N. Ohlsson and N.E. Fenton
Year: 2000
Article: Quantitative analysis of faults and failures
Description: Describes a number of results from a quantitative study of faults and failures in two releases of a major commercial software system.
Relation to main article: More details on the result of the example in the main article about post-release defects and prerelease defects.

Author(s): N.E. Fenton and M. Neil
Year: 1999
Article: A critique of software defect prediction research
Description: Shows the problems with current defect prediction approaches and recommends Bayesian belief networks.
Relation to main article: More on the Bayesian networks mentioned in the main article.