A Revised Comparison of Bayesian Logic Programs and Stochastic Logic Programs


Jianzhong Chen, Stephen Muggleton
Department of Computing, Imperial College London, UK
{cjz,shm}@doc.ic.ac.uk
http://www.doc.ic.ac.uk/~cjz/research
August 22, 2006
A Revised Comparison of BLPs and SLPs, ILP’06, Santiago de Compostela, Spain
Outline
1. Introduction
2. BLPs vs. SLPs: Semantics
3. BLPs vs. SLPs: a Revised Translation
4. BLPs vs. SLPs: Learnability
5. Conclusions
Introduction
- Probabilistic Logic Learning (PLL) [DeRaedtKersting03], Probabilistic ILP (PILP) [DeRaedtKersting06].
- Expressivity of different PLL models: Bayesian Logic Programs (BLPs) [Kersting00] and Stochastic Logic Programs (SLPs) [Muggleton96].
- Extension of a previous study [PuechMuggleton03]: semantics, a revised translation, and learnability.
- Developing an integrated theory of PLL.
BLPs vs. SLPs: Semantics - Benchmark
Halpern's first-order probabilistic logics [Halpern89]:
- Type 1 probability structure: (D, π, µ); probabilities on the domain; the domain-frequency approach; objective, 'sampling' probabilities.
- Type 2 probability structure: (D, W, π, µ); probabilities on possible worlds; the possible-world approach; subjective, 'degree-of-belief' probabilities.
PLL learning settings [DeRaedtKersting06]:
- Probabilistic learning from entailment.
- Probabilistic learning from interpretations.
- Probabilistic learning from proofs.
BLPs vs. SLPs: Semantics - Comparison
BLPs
- First-order Bayesian networks; type 2 probability structure; the possible-world approach.
- Probabilistic learning from interpretations setting: examples are Herbrand interpretations, and the probabilistic cover relation can be computed in the induced Bayesian networks.
SLPs
- Close to the type 1 domain-frequency approach: (HB, <G, SLDT_G>, π, µ); probabilities on atoms of the Herbrand base in terms of their proofs (SLD-trees); 'sampling' probabilities of domains.
- Two PLL settings: learning from probabilistic entailment (examples are ground atoms entailed by the SLP) [Muggleton00, Cussens01], and learning from probabilistic proofs (examples are proofs).
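The domain-frequency reading of SLPs can be made concrete with a minimal Python sketch. The ground SLP below is hypothetical (an illustration, not taken from the slides); the probability of an atom is the sum over its SLD refutations of the product of clause labels, then normalised over the predicate's success set:

```python
import math

# A hypothetical ground SLP (illustration only, not from the slides).
# Each ground atom maps to its labelled clauses: (label, [body atoms]).
slp = {
    "s(a)": [(0.4, ["p(a)", "p(a)"]), (0.6, ["q(a)"])],
    "s(b)": [(0.4, ["p(b)", "p(b)"]), (0.6, ["q(b)"])],
    "p(a)": [(0.3, [])], "p(b)": [(0.7, [])],
    "q(a)": [(0.2, [])], "q(b)": [(0.8, [])],
}

def success_prob(atom):
    """Sum over SLD refutations of the product of clause labels."""
    return sum(label * math.prod(success_prob(b) for b in body)
               for label, body in slp[atom])

# Distribution over the s/1 atoms of the Herbrand base,
# normalised by the total success probability.
unnorm = {a: success_prob(a) for a in ("s(a)", "s(b)")}
z = sum(unnorm.values())
dist = {a: p / z for a, p in unnorm.items()}
```

Here P(s(a)) = 0.4·0.3·0.3 + 0.6·0.2 = 0.156 before normalisation; dividing by z = 0.832 gives the 'sampling' distribution over the domain, which is the type 1 reading.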
BLPs vs. SLPs: a Revised Translation
- A behavioral translation approach: knowledge encoded in one framework is transformed into the other framework in terms of probability distributions.
- A BLP corresponds to an impure SLP: deterministic/logical background knowledge needs to be translated as well as the probabilistic predicates.
- Extension of a previous study [PuechMuggleton03]: solving the potential 'contradictory refutation' problem in the BLP-SLP translation.

Figure: (a) a first-order BN and (b) the corresponding BLP:
A(tom)
B(tom) | A(tom)
C(tom) | A(tom)
D(tom) | B(tom), C(tom)
where each of A, B, C, D has domain {y, n}.
BLPs vs. SLPs: a Revised Translation
p1:  a(tom, y) ← .
p2:  a(tom, n) ← .
p3:  b(T, y) ← a(T, y).
p4:  b(T, y) ← a(T, n).
p5:  b(T, n) ← a(T, y).
p6:  b(T, n) ← a(T, n).
p7:  c(T, y) ← a(T, y).
p8:  c(T, y) ← a(T, n).
p9:  c(T, n) ← a(T, y).
p10: c(T, n) ← a(T, n).
p11: d(T, y) ← b(T, y), c(T, y).
p12: d(T, y) ← b(T, y), c(T, n).
p13: d(T, y) ← b(T, n), c(T, y).
p14: d(T, y) ← b(T, n), c(T, n).
p15: d(T, n) ← b(T, y), c(T, y).
p16: d(T, n) ← b(T, y), c(T, n).
p17: d(T, n) ← b(T, n), c(T, y).
p18: d(T, n) ← b(T, n), c(T, n).
[SSLD-tree residue: from the goal ← d(tom, y), resolution with p11 yields ← b(tom, y), c(tom, y); p3 and p4 then yield ← a(tom, y), c(tom, y) and ← a(tom, n), c(tom, y); resolving further with p1/p2 and p7/p8 produces refutations, one of which resolves against both a(tom, y) and a(tom, n) and is marked as a contradictory refutation.]
Figure: Translated SLP with contradictory refutation and SSLD-tree
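The contradictory refutations can be exhibited mechanically. The Python sketch below is a reconstruction (not the authors' code, with the probability labels omitted since only proof structure matters); it enumerates the proofs of d(tom, y) under the naive translation and counts those resting on both a(tom, y) and a(tom, n):

```python
import itertools

# Ground instances (T = tom) of the naive translated SLP;
# probability labels omitted, only proof structure matters here.
clauses = {("a", "y"): [[]], ("a", "n"): [[]]}        # p1, p2
for v in "yn":
    clauses[("b", v)] = [[("a", w)] for w in "yn"]    # p3-p6
    clauses[("c", v)] = [[("a", w)] for w in "yn"]    # p7-p10
clauses[("d", "y")] = [[("b", vb), ("c", vc)]         # p11-p14
                       for vb in "yn" for vc in "yn"]

def proofs(atom):
    """Enumerate proofs of `atom` as lists of the unit clauses used."""
    for body in clauses[atom]:
        if not body:
            yield [atom]
        else:
            for combo in itertools.product(*(list(proofs(g)) for g in body)):
                yield [leaf for pr in combo for leaf in pr]

all_proofs = list(proofs(("d", "y")))
bad = [p for p in all_proofs if ("a", "y") in p and ("a", "n") in p]
# half of the 16 proofs of d(tom, y) rest on both a(tom, y) and
# a(tom, n): these are the contradictory refutations
```

Any probability mass assigned to those 8 proofs is mass the corresponding Bayesian network never generates, which is why the naive translation distorts the distribution.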
BLPs vs. SLPs: a Revised Translation
p1:  a(tom, [y,B,C,D]) ← .
p2:  a(tom, [n,B,C,D]) ← .
p3:  b(T, [y,y,C,D]) ← a(T, [y,y,C,D]).
p4:  b(T, [n,y,C,D]) ← a(T, [n,y,C,D]).
p5:  b(T, [y,n,C,D]) ← a(T, [y,n,C,D]).
p6:  b(T, [n,n,C,D]) ← a(T, [n,n,C,D]).
p7:  c(T, [y,B,y,D]) ← a(T, [y,B,y,D]).
p8:  c(T, [n,B,y,D]) ← a(T, [n,B,y,D]).
p9:  c(T, [y,B,n,D]) ← a(T, [y,B,n,D]).
p10: c(T, [n,B,n,D]) ← a(T, [n,B,n,D]).
p11: d(T, [A,y,y,y]) ← b(T, [A,y,y,y]), c(T, [A,y,y,y]).
p12: d(T, [A,y,n,y]) ← b(T, [A,y,n,y]), c(T, [A,y,n,y]).
p13: d(T, [A,n,y,y]) ← b(T, [A,n,y,y]), c(T, [A,n,y,y]).
p14: d(T, [A,n,n,y]) ← b(T, [A,n,n,y]), c(T, [A,n,n,y]).
p15: d(T, [A,y,y,n]) ← b(T, [A,y,y,n]), c(T, [A,y,y,n]).
p16: d(T, [A,y,n,n]) ← b(T, [A,y,n,n]), c(T, [A,y,n,n]).
p17: d(T, [A,n,y,n]) ← b(T, [A,n,y,n]), c(T, [A,n,y,n]).
p18: d(T, [A,n,n,n]) ← b(T, [A,n,n,n]), c(T, [A,n,n,n]).






[SSLD-tree residue: from the goal ← d(tom, [A,y,y,y]), resolution with p11 yields ← b(tom, [A,y,y,y]), c(tom, [A,y,y,y]); p3 and p4 yield ← a(tom, [y,y,y,y]), c(tom, [y,y,y,y]) and ← a(tom, [n,y,y,y]), c(tom, [n,y,y,y]); each branch now carries a single consistent state vector, so resolving with p1/p2 and p7 gives refutations with no contradictory counterpart.]
Figure: Revised translation and SSLD-tree
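The effect of the revised encoding can also be checked mechanically. In this Python sketch (a reconstruction, not the authors' code; probability labels omitted) each atom carries the full state vector (A, B, C, D), so every proof is forced onto one consistent binding and no contradictory refutation survives:

```python
import itertools

# Revised ground clauses for T = tom: every atom carries the full
# state vector (A, B, C, D), shared across an entire proof.
vals = "yn"
clauses = {}
for s in itertools.product(vals, repeat=4):
    clauses[("a", s)] = [[]]                      # p1/p2 instances
    clauses[("b", s)] = [[("a", s)]]              # p3-p6 instances
    clauses[("c", s)] = [[("a", s)]]              # p7-p10 instances
    clauses[("d", s)] = [[("b", s), ("c", s)]]    # p11-p18 instances

def proofs(atom):
    """Enumerate proofs of `atom` as lists of the unit clauses used."""
    for body in clauses[atom]:
        if not body:
            yield [atom]
        else:
            for combo in itertools.product(*(list(proofs(g)) for g in body)):
                yield [leaf for pr in combo for leaf in pr]

# All refutations of d(tom, [A,y,y,y]): A ranges over {y, n}.
goal_proofs = [p for a in vals
               for p in proofs(("d", (a, "y", "y", "y")))]
contradictory = [p for p in goal_proofs
                 if {leaf[1][0] for leaf in p if leaf[0] == "a"} == {"y", "n"}]
# contradictory == []: the shared state vector rules them out
```

Exactly two refutations remain, one per value of A, matching the two consistent possible worlds of the original Bayesian network.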
BLPs vs. SLPs: Learnability
Purpose: to investigate the computational properties and sample complexities of different PLL models.
Based on: computational learning theory [Valiant84, Mitchell97]; the Probably Approximately Correct (PAC) learning paradigm [Haussler90]; Vapnik-Chervonenkis (VC) dimensions [Blumer+89]; and the result that first-order jk-clausal theories are PAC-learnable [DeRaedtDzeroski94].
Result: BLPs and SLPs are polynomial-sample polynomial-time PAC-learnable with one-sided error from positive examples only; the ratio of the sample complexities of a BLP and its translated SLP can be estimated from the encoding information of the annotated logic programs.
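The flavour of the comparison can be illustrated with the classic PAC bound for a consistent learner over a finite hypothesis space, m ≥ (1/ε)(ln|H| + ln(1/δ)) [Blumer+89, Haussler90]. The hypothesis-space sizes below are invented for illustration; the slides' actual bound for jk-clausal theories is not reproduced here:

```python
import math

def pac_sample_bound(hyp_count, eps, delta):
    """Classic PAC bound for a consistent learner over a finite
    hypothesis space: m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(hyp_count) + math.log(1 / delta)) / eps)

# Hypothetical sizes: translating a BLP into an SLP multiplies the
# number of clauses, enlarging the hypothesis space and the bound.
m_blp = pac_sample_bound(2 ** 20, eps=0.05, delta=0.01)   # 370
m_slp = pac_sample_bound(2 ** 26, eps=0.05, delta=0.01)   # 453
ratio = m_slp / m_blp   # grows with the extra encoding information
```

Since the bound is logarithmic in |H|, the ratio of sample complexities tracks the ratio of the programs' encoding lengths, which is the quantity the slides propose to estimate.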
Conclusions
Conclusions
1. A comparison of SLPs' and BLPs' semantics in terms of their definitions of probability distributions and their learning settings.
2. A revised translation that solves the existing 'contradictory refutation' problem.
3. An initial result on the learnability of different PLL models, based on the translations and computational learning theory.
4. Further study towards an integrated theory of PLL.
Acknowledgements
APrIL II Project (www.aprill.org) – Application of Probabilistic Inductive Logic Programming
References
De Raedt, L., Kersting, K.: Probabilistic Logic Learning. ACM-SIGKDD Explorations: Special Issue on Multi-Relational Data Mining. 5(1) (2003) 31–48
Kersting, K., De Raedt, L.: Bayesian Logic Programs. In: Cussens, J., Frisch, A. (eds.): Proceedings of the Work-in-Progress Track at the 10th International Conference on Inductive Logic Programming. (2000) 138–155
Muggleton, S.: Stochastic logic programs. In: De Raedt, L. (ed.): Advances in Inductive Logic Programming. (1996) 254–264
Puech, A., Muggleton, S.: A Comparison of Stochastic Logic Programs and Bayesian Logic Programs. IJCAI03 Workshop on Learning Statistical Models from Relational Data. (2003)
Halpern, J.Y.: An Analysis of First-Order Logics of Probability. Artificial Intelligence. 46 (1989) 311–350
De Raedt, L., et al.: Overview of Probabilistic Logic Theory, Representations, Inference, Learning and Systems. APrIL II Project Deliverables, www.aprill.org, 2006
Muggleton, S.: Learning Stochastic Logic Programs. Electronic Transactions in Artificial Intelligence. 5(041) (2000)
Cussens, J.: Parameter Estimation in Stochastic Logic Programs. Machine Learning. 44(3) (2001) 245–271
Valiant, L.G.: A Theory of the Learnable. Communications of the ACM. 27(11) (1984) 1134–1142
Haussler, D.: Probably Approximately Correct Learning. In: National Conference on Artificial Intelligence. (1990) 1101–1108
Mitchell, T.M.: Machine Learning. The McGraw-Hill Companies, Inc. (1997)
De Raedt, L., Džeroski, S.: First-Order jk-clausal Theories are PAC-learnable. Artificial Intelligence. 70 (1994) 375–392
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.: Learnability and the Vapnik-Chervonenkis Dimension. Journal of the ACM. 36(4) (1989) 929–965