Budgeted Machine Learning of Bayesian Networks
Michael R. Gubbels
Dr. Stephen D. Scott
Department of Computer Science and Engineering
University of Nebraska–Lincoln
McNair Scholars Program
August 2009
Overview
• Introduction
• Methods
• Results
• Discussion
• Conclusions
Machine Learning
The concern of machine learning is to design algorithms for learning general knowledge from a collection of related special cases called examples. A collection of related examples forms a data set for the concept relating its examples.
Learning algorithms construct a general model of the relationships among the characteristics, or attributes, in a data set. Using this model, the attributes of interest, or labels, of an example can be learned in terms of its other attributes. This allows the label of a new example, or instance, to be predicted in terms of its attributes.
Machine Learning

ATTRIBUTES: Smoker | Bronchitis | Fatigue | Chest X-ray | Lung Cancer

EXAMPLES:
Yes | No  | Yes | Negative | No
No  | No  | No  | Positive | No
∙∙∙
Yes | Yes | No  | Positive | Yes

[Diagram: the examples are observed by a LEARNING ALGORITHM, which generates a LEARNED MODEL; a new instance (Yes, No, Yes, Positive, ?) is presented to the model, which predicts its label ("Yes").]
Budgeted Machine Learning
In budgeted machine learning, an algorithm is given a budget and must pay for the attributes it observes during learning. Using the attributes purchased from examples, the algorithm constructs a general model for a concept.
The labels of examples are free, and so can be observed without penalty, but each attribute has an associated cost. Attributes are purchased until the budget is exhausted.
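The purchasing loop described above can be sketched as follows (a minimal illustration; the attribute costs, example data, and helper names are hypothetical, not taken from the experiments in this work):

```python
import random

# Hypothetical per-attribute costs; the label (Lung Cancer) is free.
COSTS = {"Smoker": 10, "Bronchitis": 50, "Fatigue": 10, "ChestXray": 100}

def budgeted_purchases(examples, budget, choose_attribute):
    """Buy attribute values one at a time until the budget is exhausted.

    `choose_attribute` is the attribute selection policy; it maps the
    cost table to the next attribute to purchase a value from.
    """
    purchased = []  # (example index, attribute, value) triples observed
    while True:
        attr = choose_attribute(COSTS)
        if COSTS[attr] > budget:
            break  # cannot afford another purchase
        budget -= COSTS[attr]
        i = random.randrange(len(examples))  # purchase from a random example
        purchased.append((i, attr, examples[i][attr]))
    return purchased, budget

# Usage: spend a $200 budget, always buying the cheapest attribute.
data = [{"Smoker": "Yes", "Bronchitis": "No",
         "Fatigue": "Yes", "ChestXray": "Negative"}]
bought, remaining = budgeted_purchases(data, 200, lambda c: min(c, key=c.get))
```

With these costs the cheapest attribute is Smoker at $10, so a $200 budget buys exactly 20 values before the loop stops.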
Budgeted Machine Learning

ATTRIBUTES: Smoker | Bronchitis | Fatigue | Chest X-ray | Lung Cancer
COST: $10 | $50 | $10 | $100 | (label, free)

EXAMPLES (attribute values hidden until purchased):
? | ? | ? | ? | No
? | ? | ? | ? | No
∙∙∙
? | ? | ? | ? | Yes

[Diagram: the examples are observed by a BUDGETED LEARNING ALGORITHM, which purchases attribute values from them and generates a LEARNED MODEL; the model is evaluated by presenting it a new instance (Yes, No, Yes, Positive, ?), for which it predicts the label ("Yes").]
Budgeted Machine Learning
• Attribute selection policies
– Round robin: purchases one value at a time from each attribute in turn
e.g., (A1, A2, A3, A4, A1, A2, A3, A4, A1, …)
– Biased robin: repeatedly purchases values from the same attribute while those purchases improve the model
e.g., (A1, A1, A2, A3, A3, A3, A4, A1, A1, …)
Budgeted Machine Learning
Naïve Bayes classifier
– Simpler to train
– Unrealistic assumption of attribute independence
– Poor estimates of true probabilities
Bayesian network
– Difficult to train
– Can model conditional independence of attributes
– Computes accurate probabilities
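The independence assumption that distinguishes the naïve Bayes classifier can be shown directly in code (a sketch with made-up counts; the data set and function names are illustrative, not from this study):

```python
from collections import defaultdict

def train_naive_bayes(examples, label):
    """Estimate P(label) and P(attribute value | label) by counting."""
    label_counts = defaultdict(int)
    cond_counts = defaultdict(int)  # (label value, attribute, value) -> count
    for ex in examples:
        y = ex[label]
        label_counts[y] += 1
        for attr, value in ex.items():
            if attr != label:
                cond_counts[(y, attr, value)] += 1
    return label_counts, cond_counts

def predict(instance, label_counts, cond_counts):
    """Return the label maximizing P(y) * prod_i P(x_i | y).

    The product form is exactly the naive assumption: attributes are
    treated as independent of one another given the label.
    """
    n = sum(label_counts.values())
    best, best_score = None, -1.0
    for y, c in label_counts.items():
        score = c / n  # prior P(y)
        for attr, value in instance.items():
            score *= cond_counts[(y, attr, value)] / c  # P(x_i | y)
        if score > best_score:
            best, best_score = y, score
    return best

# Usage on a tiny made-up data set.
data = [
    {"Smoker": "Yes", "Fatigue": "Yes", "LungCancer": "Yes"},
    {"Smoker": "Yes", "Fatigue": "No",  "LungCancer": "Yes"},
    {"Smoker": "No",  "Fatigue": "No",  "LungCancer": "No"},
    {"Smoker": "No",  "Fatigue": "Yes", "LungCancer": "No"},
]
priors, conds = train_naive_bayes(data, "LungCancer")
```

A full Bayesian network would instead condition each attribute on its parents in the network structure, which is what lets it model conditional independencies among attributes and produce better probability estimates.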
Purpose
• Evaluate how well existing algorithms learn Bayesian networks for use in classification
– Produce more accurate probability estimates
• Should improve efficacy of existing algorithms that depend on such estimates
– Explicitly represent attribute independencies
• Should facilitate learning of a more accurate model
• Should improve classification performance of model
Methods
1. Generated data sets
– Used the Asia and ALARM Bayesian network models
• The Asia network has 8 attributes
– Predicted Bronchitis
• The ALARM network has 37 attributes
– Predicted Breathing Pressure
2. Constructed model
3. Evaluated the learned networks
Methods
1. Generated data sets
2. Constructed model
– Used the round robin and biased robin policies
– Learned naïve and complex Bayesian networks
• Structures were given
• Uniform and noisy prior knowledge
– Uniform attribute cost
– Varied the learning algorithm’s total budget
3. Evaluated the learned networks
Methods
1. Generated data sets
2. Constructed model
3. Evaluated the learned networks
– For uniform and noisy prior knowledge
– For many numbers of purchases
– Against the baseline (“best possible”) classification performance of the model
Results
Discussion
• Naïve Bayesian networks
– Converge to baseline faster with uniform priors
• Bayesian networks
– Have a more accurate baseline than naïve networks
– Converge to baseline faster with noisy priors
Conclusions
• Bayesian networks learned using existing algorithms converge to baseline
• Bayesian networks may be preferable to naïve networks when learning complex concepts or when prior knowledge is available
• Future work
– Evaluate more existing policies with Bayesian networks
– Analyze the models learned for complex concepts
– Develop new algorithms to exploit Bayesian network structure
– Learn from data with different cost models
Acknowledgements
• Dr. Stephen D. Scott, Research Mentor
• Kun Deng, Graduate Student
• Amy Lehman, Graduate Student Mentor
• UNL McNair Scholars Program
Pseudocode

ATTRIBUTE SELECTION POLICIES

ROUND ROBIN
    a ← SELECT(MIN-COST(A))
    UNTIL(BUDGET-EXHAUSTED?):
        e ← SELECT(RANDOM(E))
        v ← PURCHASE(a, e)
        M ← LEARN-M(v)
        a ← SELECT(NEXT(A))

BIASED ROBIN
    a ← SELECT(MIN-COST(A))
    UNTIL(BUDGET-EXHAUSTED?):
        e ← SELECT(RANDOM(E))
        m_old ← CORRECTNESS(M)
        v ← PURCHASE(a, e)
        M ← LEARN-M(v)
        m_new ← CORRECTNESS(M)
        IF(m_new < m_old):
            a ← SELECT(NEXT(A))

RANDOM
    a ← SELECT(RANDOM(A))
    UNTIL(BUDGET-EXHAUSTED?):
        e ← SELECT(RANDOM(E))
        v ← PURCHASE(a, e)
        M ← LEARN-M(v)
        a ← SELECT(RANDOM(A))

Figure 4. Pseudocode for the round robin, biased robin, and random data selection policies. A is the set of attributes available to purchase values from, and a is a particular attribute in A. E is the set of examples, and e is a particular example in E. v is an attribute value in an example. M denotes the specific model being learned, and the variables m_old and m_new represent the correctness of model M before and after learning new information.
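The pseudocode above can be translated into runnable form (a sketch; the purchase, learning, and correctness functions are stand-ins supplied by the caller, since the model and its evaluation are specific to the experiments):

```python
import itertools
import random

def round_robin(attributes, examples, budget, purchase, learn, costs):
    """Cycle through the attributes, buying one value from each in turn,
    starting with the cheapest attribute."""
    order = itertools.cycle(sorted(attributes, key=costs.get))
    a = next(order)
    while budget >= costs[a]:
        e = random.choice(examples)
        budget -= costs[a]
        learn(purchase(a, e))
        a = next(order)

def biased_robin(attributes, examples, budget, purchase, learn,
                 correctness, costs):
    """Keep buying from the same attribute while purchases improve the
    model; advance to the next attribute when a purchase fails to help."""
    order = itertools.cycle(sorted(attributes, key=costs.get))
    a = next(order)
    while budget >= costs[a]:
        e = random.choice(examples)
        m_old = correctness()
        budget -= costs[a]
        learn(purchase(a, e))
        if correctness() < m_old:
            a = next(order)  # the purchase hurt the model: move on

# Usage with trivial stand-ins that just record which attribute was bought.
bought = []
costs = {"A1": 1, "A2": 1, "A3": 1}
round_robin(list(costs), [0], 5,
            lambda a, e: bought.append(a), lambda v: None, costs)
```

With equal costs and a budget of 5 units, round robin buys A1, A2, A3, A1, A2 and then stops.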