Research at the Decision Making Lab



Fabio Cozman

Universidade de São Paulo

Decision Making Lab (2002)

Research tree

[Figure: a tree of research topics — Robotics (a bit); Bayes nets; Sets of probabilities; Algorithms and independence; Applications (MDPs, robustness analysis, auctions); Anytime, anyspace inference (embedded systems); Classification; Applications (medical decisions); MCMC algorithms for inference & testing.]

Some (bio)robotics

Bayesian networks

Decisions in medical domains
(with the University Hospital)

Idea: to improve decisions at medical posts in urban, poor areas.

We are building networks that represent cardiac arrest, which can be caused by stress, cardiac problems, respiratory problems, etc.

Supported by FAPESP.

The HU network

A better interface for teaching

Embedded Bayesian networks


Challenge: to implement inference algorithms compactly and efficiently.

Real challenge: to develop anytime, anyspace inference algorithms.

Idea: decompose networks and apply several algorithms (UAI 2002 workshop on RT).

Supported by HP Labs.

Decomposing networks


How to decompose the network and assign algorithms so as to meet space and time constraints with reasonable accuracy.

Application: failure analysis in car-wash systems.

The car-wash network

Generating random networks


The problem is easy to state but hard to solve: critical properties of DAGs are not known.

Method based on MCMC simulation, with constraints on induced width and node degree.

Supported by FAPESP.
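To give a rough idea of the MCMC approach, here is a hedged sketch (this is not the lab's actual algorithm; it only enforces an illustrative maximum-degree constraint, and the induced-width constraint is omitted for brevity):

```python
# Minimal sketch: a Markov chain over DAGs that proposes single-arc changes and
# rejects proposals that create cycles or violate a maximum-degree constraint.
import random
import networkx as nx

def random_dag_mcmc(n_nodes, n_steps=10_000, max_degree=4, seed=0):
    rng = random.Random(seed)
    g = nx.DiGraph()
    g.add_nodes_from(range(n_nodes))
    for _ in range(n_steps):
        u, v = rng.sample(range(n_nodes), 2)
        if g.has_edge(u, v):
            g.remove_edge(u, v)              # propose removing an existing arc
        else:
            g.add_edge(u, v)                 # propose adding a new arc
            bad = (g.degree(u) > max_degree or g.degree(v) > max_degree
                   or not nx.is_directed_acyclic_graph(g))
            if bad:
                g.remove_edge(u, v)          # reject and stay at the current DAG
    return g

dag = random_dag_mcmc(20)
print(dag.number_of_edges(), nx.is_directed_acyclic_graph(dag))
```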

Research tree (again)

[Figure: the research tree again — Biorobotics (a bit of it); Bayes nets; Sets of probabilities; and the same algorithm and application branches listed before.]

Bayesian network classifiers


The goal is to use probabilistic models for classification: to "learn" classifiers using labeled and unlabeled data.

Work with Ira Cohen, Alex Bronstein and Marsha Duro (UIUC and HP Labs).

Using Bayesian networks to learn
from labeled and unlabeled data




Suppose we want to classify events based on observations; we have recorded data that are sometimes labeled and sometimes unlabeled.

What is the value of unlabeled data?
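One way to picture the learning setup is expectation-maximization over a Naïve Bayes model, where labeled records keep their labels and unlabeled records contribute posterior class probabilities. The sketch below is illustrative only (binary attributes are assumed, and the function and parameter names are hypothetical), not the code used in this work:

```python
# Hedged sketch: EM for a Naive Bayes classifier with labeled + unlabeled data.
import numpy as np

def nb_em(X, y, n_classes, n_iters=20, alpha=1.0):
    # X: (n, d) 0/1 attribute matrix; y: class index per row, or -1 if unlabeled.
    n, d = X.shape
    labeled = y >= 0
    # Responsibilities: one-hot for labeled rows, uniform for unlabeled rows.
    R = np.full((n, n_classes), 1.0 / n_classes)
    R[labeled] = np.eye(n_classes)[y[labeled]]
    for _ in range(n_iters):
        # M-step: class priors and per-class attribute probabilities (smoothed).
        prior = (R.sum(axis=0) + alpha) / (n + alpha * n_classes)
        theta = (R.T @ X + alpha) / (R.sum(axis=0)[:, None] + 2 * alpha)
        # E-step: recompute class posteriors, but only unlabeled rows change.
        log_p = (np.log(prior)
                 + X @ np.log(theta).T
                 + (1 - X) @ np.log(1 - theta).T)
        post = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
        R[~labeled] = post[~labeled]
    return prior, theta
```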

The Naïve Bayes classifier


A Bayesian-network-like classifier with excellent credentials.

Use Bayes rule to get the classification:

p(Class | attributes) ∝ p(Class) ∏_{i=1…N} p(Attribute_i | Class)

[Figure: the Naïve Bayes structure — a single Class node with arcs to Attribute 1, Attribute 2, …, Attribute N.]
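As an illustration (not code from the talk), the decision rule above amounts to picking the class with the highest log-score; it would pair with parameters such as those produced by the EM sketch earlier:

```python
# Hedged sketch: Naive Bayes decision rule for binary attributes.
import numpy as np

def nb_predict(x, prior, theta):
    # x: (d,) 0/1 attribute vector; prior: (c,) class priors;
    # theta: (c, d) matrix with theta[k, i] = p(attribute_i = 1 | class k).
    log_score = (np.log(prior)
                 + x @ np.log(theta).T
                 + (1 - x) @ np.log(1 - theta).T)
    # Class maximizing p(class) * prod_i p(attribute_i | class).
    return int(np.argmax(log_score))
```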

The TAN classifier

[Figure: the TAN (tree-augmented Naïve Bayes) structure — the Class node points to every attribute X1, …, XN, and the attributes are additionally linked by a tree, so each attribute has at most one other attribute as a parent.]

Now, let’s consider unlabeled data


Our database:

Label       Sport           Food
American    baseball        hamburger
Brazilian   soccer          rice and beans
American    golf            apple pie
?           indoor soccer   rice and beans
?           golf            rice and beans

Question: how can we use the unlabeled data?

Unlabeled data can help…


Learning a Naïve Bayes classifier from data generated by a Naïve Bayes model (10 attributes):

[Plot: probability of error versus number of unlabeled records (log scale, 10^0 to 10^4), with curves for 30, 300, and 3000 labeled records.]
… but unlabeled data may degrade
performance!


Surprising fact: more data may not help; more data may hurt.

Some math: asymptotic analysis


Asymptotic bias:








Variance decreases with more data
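As a hedged reconstruction of the kind of asymptotic statement involved (assuming the standard maximum-likelihood limit argument; this may not be the exact formula shown in the talk), with a fraction λ of labeled records:

```latex
% Estimates converge to the parameters that best fit a mixture of the joint
% (labeled) and marginal (unlabeled) log-likelihoods:
\hat{\theta}_n \xrightarrow{\,n\to\infty\,} \theta^*(\lambda)
  = \arg\max_{\theta}\;
    \lambda\,\mathbb{E}_{p^*}\!\bigl[\log p(C, X \mid \theta)\bigr]
    + (1-\lambda)\,\mathbb{E}_{p^*}\!\bigl[\log p(X \mid \theta)\bigr].
% If the assumed model contains the true distribution p^*, both terms point to the
% same optimum and unlabeled data only reduce variance; if the model is wrong, the
% two terms generally disagree, so the limit (the asymptotic bias) shifts as the
% proportion of unlabeled data grows, and classification error can increase.
```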

A very simple example


Consider the following situation:

[Figure: two graphical structures over Class, X and Y — one labeled "Real" and one labeled "Assumed".]

X and Y are Gaussian given Class.

Effect of unlabeled data: a different perspective

[Plot: classification error versus number of records (log scale, 10^1 to 10^5) for datasets with 0%, 50% and 99% unlabeled records; curves compare using only the labeled records against using the complete data.]
Searching for structures


The previous tests suggest that we should pay attention to modeling assumptions when dealing with unlabeled data.

In the context of Bayesian network classifiers, we must search for structures.

This is not easy; worse, existing structure-learning algorithms do not focus on classification.

Stochastic Structure Search (SSS)


Idea: search for structures using classification error.

Hard: the search space is too messy.

Solution: Metropolis-Hastings sampling with an underlying measure proportional to 1/p_error.
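The following hedged sketch shows the general shape of such a search (it is not the lab's SSS code; `random_neighbor` and `classification_error` are hypothetical helpers that would propose a local structure change and estimate error on held-out data):

```python
# Minimal sketch: Metropolis-Hastings over classifier structures with a target
# measure proportional to 1 / p_error (symmetric proposals assumed).
import random

def sss(initial_structure, random_neighbor, classification_error,
        n_steps=1000, temperature=1.0, seed=0):
    rng = random.Random(seed)
    current = initial_structure
    current_err = classification_error(current)
    best, best_err = current, current_err
    for _ in range(n_steps):
        proposal = random_neighbor(current, rng)
        err = classification_error(proposal)
        # Acceptance ratio for a target ~ 1/p_error: (current_err / proposal_err).
        accept = min(1.0, (current_err / max(err, 1e-12)) ** (1.0 / temperature))
        if rng.random() < accept:
            current, current_err = proposal, err
        if current_err < best_err:
            best, best_err = current, current_err
    return best, best_err
```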

Some classification results

Some words on unlabeled data


Unlabeled data can improve performance, but they can also degrade it; this problem is really hard!

Current understanding of the problem is shaky: people think that outliers, or mismatches between the labeled and unlabeled data, cause the degradation.

Research tree (once again)

[Figure: the research tree once again — Biorobotics (a bit of it); Bayes nets; Sets of probabilities; and the same algorithm and application branches listed before.]

Sets of probabilities


Instead of


probability of rain is 0.2,

say


probability of rain is [0.1, 0.3]



Instead of


expected value of stock is 10
,

admit


expected value of stock is [0, 1000]

An example


Consider a set of probabilities


p(
q
1
) p(
q
2
), p(
q
3
)


Set of

probabilities
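As a concrete illustration (an assumed example, not taken from the slides), a credal set over three states can be given by interval constraints, and it induces lower and upper probabilities by optimization:

```latex
% A credal set K over \theta_1, \theta_2, \theta_3 defined by intervals:
K = \{\, p : 0.1 \le p(\theta_1) \le 0.3,\;
           0.2 \le p(\theta_2) \le 0.5,\;
           p(\theta_1) + p(\theta_2) + p(\theta_3) = 1,\; p \ge 0 \,\}.
% Lower and upper probabilities of an event A come from optimizing over K:
\underline{P}(A) = \min_{p \in K} \sum_{\theta \in A} p(\theta), \qquad
\overline{P}(A) = \max_{p \in K} \sum_{\theta \in A} p(\theta).
% For A = \{\theta_2, \theta_3\}:
\underline{P}(A) = 1 - 0.3 = 0.7, \qquad \overline{P}(A) = 1 - 0.1 = 0.9 .
```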

Why?


More realistic, and quite expressive as a representation language.

An excellent tool for:
  robustness/sensitivity analysis
  modeling incomplete beliefs (probabilistic logic)
  group decision-making
  analysis of economic interactions (for example, to study arbitrage and to design auctions)

What we have been doing


Trying to formalize and apply “interval”
reasoning, particularly
independence



Building algorithms for manipulation of these
intervals and sets


To deal with independence and networks


JavaBayes is the only available software that can
deal with this (to some extent!)
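To show what "manipulating these sets" can mean in practice, here is a hedged sketch (illustrative only, not JavaBayes code): computing lower and upper expectations over a credal set described by interval constraints, via linear programming. The numbers match the three-state example above.

```python
# Minimal sketch: bounds on an expectation over a credal set via linear programming.
import numpy as np
from scipy.optimize import linprog

def expectation_bounds(f, lower, upper):
    # f: payoff for each state; lower/upper: interval bounds on each p(theta_i).
    n = len(f)
    a_eq = [np.ones(n)]                    # probabilities must sum to one
    b_eq = [1.0]
    bounds = list(zip(lower, upper))       # box constraints on each probability
    lo = linprog(c=np.array(f), A_eq=a_eq, b_eq=b_eq, bounds=bounds)
    hi = linprog(c=-np.array(f), A_eq=a_eq, b_eq=b_eq, bounds=bounds)
    return lo.fun, -hi.fun

# Lower/upper probability of {theta_2, theta_3}: use its indicator as the payoff.
print(expectation_bounds([0, 1, 1], lower=[0.1, 0.2, 0.0], upper=[0.3, 0.5, 1.0]))
# -> approximately (0.7, 0.9)
```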

Credal networks


Using graphical models to represent sets of joint probabilities.

Question: what exactly do such networks represent?

Several open questions, and a need for algorithms.

[Figure: an example credal network with nodes Family In?, Dog Sick?, Lights On?, Dog Barking?, Dog Out?]
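One common answer to the question above, stated here as a hedged note rather than as the semantics necessarily adopted in the talk, is the "strong extension":

```latex
% Under strong independence, a credal network over X_1, \dots, X_n represents the
% set of joint distributions that factorize as in a Bayesian network, with each
% conditional picked independently from its local credal set:
\Bigl\{\, p(X_1,\dots,X_n) = \prod_{i=1}^{n} p\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)
   \;:\; p\bigl(X_i \mid \mathrm{pa}(X_i)\bigr) \in K\bigl(X_i \mid \mathrm{pa}(X_i)\bigr) \,\Bigr\}.
```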

Concluding



To summarize, we want to understand how to use probabilities in AI, and then we add a bit of robotics.

Support from FAPESP and HP Labs has been generous.

Visit the lab on your next trip to São Paulo!