In Proceedings of the Tenth National Conference on Artificial Intelligence. San Jose, CA: AAAI Press.

An Analysis of Bayesian Classifiers

Pat Langley, Wayne Iba, and Kevin Thompson
({Langley, Iba, KThompson}@ptolemy.arc.nasa.gov)

AI Research Branch
NASA Ames Research Center
Moffett Field, CA, USA

(* Also affiliated with RECOM Technologies. † Also affiliated with Sterling Software.)

Abstract

In this paper we present an average-case analysis of the Bayesian classifier, a simple induction algorithm that fares remarkably well on many learning tasks. Our analysis assumes a monotone conjunctive target concept and independent, noise-free Boolean attributes. We calculate the probability that the algorithm will induce an arbitrary pair of concept descriptions, and then use this to compute the probability of correct classification over the instance space. The analysis takes into account the number of training instances, the number of attributes, the distribution of these attributes, and the level of class noise. We also explore the behavioral implications of the analysis by presenting predicted learning curves for artificial domains, and give experimental results on these domains as a check on our reasoning.

Introduction

One goal of research in machine learning is to discover principles that relate algorithms and domain characteristics to behavior. To this end, many researchers have carried out systematic experimentation with natural and artificial domains in search of empirical regularities (Kibler & Langley, 1988). Others have focused on theoretical analyses, often within the paradigm of probably approximately correct learning (Haussler, 1990). However, most experimental studies are based only on informal analyses of the learning task, whereas most formal analyses address the worst case, and thus bear little relation to empirical results.

A third approach, proposed by Cohen and Howe (1988), involves the formulation of average-case models for specific algorithms and testing them through experimentation. Pazzani and Sarrett's (1990) study of conjunctive learning provides an excellent example of this technique, as does Hirschberg and Pazzani's (1991) work on inducing k-CNF concepts. By assuming information about the target concept, the number of attributes, and the class and attribute frequencies, they obtain predictions about the behavior of induction algorithms, and use experiments to check their analyses. However, their research does not focus on algorithms typically used by the experimental and practical sides of machine learning, and it is important that average-case analyses be extended to such methods. A related approach involves deriving the optimal learning algorithm under certain assumptions and then implementing an approximation of that algorithm (Opper & Haussler, 1991).

Recently there has been growing interest in probabilistic approaches to inductive learning. For example, Fisher (1987) has described Cobweb, an incremental algorithm for conceptual clustering that draws heavily on Bayesian ideas, and the literature reports a number of systems that build on this work (Allen & Langley, 1990; Iba & Gennari, 1991; Thompson & Langley, 1991). Cheeseman et al. (1988) have outlined AutoClass, a nonincremental system that uses Bayesian methods to cluster instances into groups, and other researchers have focused on the induction of Bayesian inference networks (Cooper & Herskovits, 1991).

These recent Bayesian learning algorithms are complex and not easily amenable to analysis, but they share a common ancestor that is simpler and more tractable. This supervised algorithm, which we refer to simply as a Bayesian classifier, comes originally from work in pattern recognition (Duda & Hart, 1973). The method stores a probabilistic summary for each class; this summary contains the conditional probability of each attribute value given the class, as well as the probability (or base rate) of the class. This data structure approximates the representational power of a perceptron; it describes a single decision boundary through the instance space. When the algorithm encounters a new instance, it updates the probabilities stored with the specified class. Neither the order of training instances nor the occurrence of classification errors has any effect on this process. When given a test instance, the classifier uses an evaluation function (which we describe in detail later) to rank the alternative classes based on their probabilistic summaries, and assigns the instance to the highest scoring class.
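A minimal Python sketch may make the method concrete. This is our own illustration under the paper's assumptions (two classes, Boolean attributes), not the authors' implementation; replacing a zero estimate with 1/n follows the Clark and Niblett suggestion that the paper adopts later:

```python
from collections import defaultdict

class BayesianClassifier:
    """Minimal Bayesian classifier over Boolean attributes."""

    def __init__(self, n_attrs):
        self.n_attrs = n_attrs
        self.n = 0                                     # training instances seen
        self.class_count = defaultdict(int)            # per-class instance counts
        self.attr_count = defaultdict(lambda: [0] * n_attrs)  # per-class counts of "attribute present"

    def update(self, instance, label):
        # Incremental update of the stored counts; neither instance order
        # nor classification errors affect the resulting summaries.
        self.n += 1
        self.class_count[label] += 1
        for j, present in enumerate(instance):
            if present:
                self.attr_count[label][j] += 1

    def score(self, instance, label):
        # Evaluation function: class base rate times the product of the
        # conditional probability of each observed attribute value.
        k = self.class_count[label]
        if k == 0:
            return 0.0
        s = k / self.n
        for j, present in enumerate(instance):
            p = self.attr_count[label][j] / k
            p = p if present else 1.0 - p
            s *= p if p > 0 else 1.0 / self.n   # avoid multiplying by zero
        return s

    def predict(self, instance):
        # Rank the alternative classes and assign the highest scoring one.
        return max(self.class_count, key=lambda c: self.score(instance, c))
```

Training on a handful of instances of the conjunction of A1 and A2, with A3 irrelevant, yields the expected predictions; since only counts are stored, the order of updates does not matter.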

Both the evaluation function and the summary descriptions used in Bayesian classifiers assume that attributes are statistically independent. Since this seems unrealistic for many natural domains, researchers have often concluded that the algorithm will behave poorly in comparison to other induction methods. However, no studies have examined the extent to which violation of this assumption leads to performance degradation, and the probabilistic approach should be quite robust with respect to both noise and irrelevant attributes. Moreover, earlier studies (Clark & Niblett, 1989) present evidence of the practicality of the algorithm.

Table 1 presents additional experimental evidence for the utility of Bayesian classifiers. In this study we compare the method to IND's emulation of the C4 algorithm (Buntine & Caruana, 1991) and an algorithm that simply predicts the modal class. The five domains from the UCI database collection (Murphy & Aha, 1992) include the small soybean dataset, chess end games involving a king-rook/king-pawn confrontation, cases of lymphography diseases, and two biological datasets. For each domain we randomly split the data set into training and test instances, repeating this process to obtain separate pairs of training and test sets. The table shows the mean accuracy and confidence intervals on the test sets for each domain.

Table 1. Percentage accuracies for two induction algorithms on five classification domains, along with the accuracy of predicting the most frequent class.

    Domain      Bayes   IND   Freq
    Soybean       -      -     -
    Chess         -      -     -
    Lympho        -      -     -
    Splice        -      -     -
    Promoters     -      -     -

In four of the five domains, the Bayesian classifier is at least as accurate as the C4 reimplementation. We will not argue that the Bayesian classifier is superior to this more sophisticated method, but the results do show that it behaves well across a variety of domains. Thus the Bayesian classifier is a promising induction algorithm that deserves closer inspection, and a careful analysis should give us insights into its behavior.

We simplify matters by limiting our analysis to the induction of conjunctive concepts. Furthermore, we assume that there are only two classes, that each attribute is Boolean, and that attributes are independent of each other. We divide our study into three parts. We first determine the probability that the algorithm will learn a particular pair of concept descriptions. After this, we derive the accuracy of an arbitrary pair of descriptions over all instances. Taken together, these expressions give us the overall accuracy of the learned concepts. We find that a number of factors influence behavior of the algorithm, including the number of training instances, the number of relevant and irrelevant attributes, the amount of class and attribute noise, and the class and attribute frequencies. Finally, we examine the implications of the analysis by predicting behavior in specific domains and check our reasoning with experiments in these domains.

Consider a concept C defined as the monotone conjunction of r relevant features A_1, ..., A_r, in which none of the features are negated. Also assume there are i irrelevant features A_{r+1}, ..., A_{r+i}. Let P(A_j) be the probability of feature A_j occurring in an instance.

The concept descriptions learned by a Bayesian classifier are fully determined by the n training instances it has observed. Thus, to compute the probability of each such concept description, we must consider the different possible combinations of n training instances. First let us consider the probability that the algorithm has observed exactly k out of n positive instances. If we let P(C) be the probability of observing a positive instance and we let x be the observed fraction of positive instances, then we have

    P(x = k/n) = \binom{n}{k} P(C)^k [1 - P(C)]^{n-k} .

This expression also represents the probability that one has observed exactly n - k negative instances. Since we assume that the concept is monotone conjunctive and that the attributes are independent, we have P(C) = \prod_{j=1}^{r} P(A_j), which is simply the product of the probabilities for all relevant attributes.

A given number of positive instances k can produce many alternative descriptions of the positive class, depending on the instances that are observed. One can envision each such concept description as a cell in an (r + i)-dimensional matrix, with each dimension ranging from 0 to k, and with the count on dimension j representing the number of positive instances in which attribute A_j was present. One can envision a similar matrix for the negative instances, again having dimensionality r + i, but with each dimension ranging from 0 to n - k, and with the count on each dimension j representing the number of negative instances in which A_j occurred. Figure 1 shows a positive cell matrix for three attributes; the designated cell holds the probability that the algorithm has seen a particular number of positive instances with each attribute present, for example two instances in which A_1 occurred.

In both matrices, one can index each cell (or concept description) by a vector of length r + i. Let P(cell_u | k) be the probability that the algorithm has produced the cell indexed by vector u in the positive matrix, given k positive instances.
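The two quantities defined so far are straightforward to evaluate directly. The sketch below (helper names are ours, for illustration) computes P(C) for a monotone conjunction and the binomial probability of observing exactly k positives among n training instances:

```python
from math import comb, prod

def p_concept(attr_probs, r):
    """P(C) for a monotone conjunction of the first r independent attributes:
    the product of their marginal probabilities."""
    return prod(attr_probs[:r])

def p_k_positives(k, n, p_c):
    """Binomial probability P(x = k/n) of observing exactly k positive
    instances among n independent training instances."""
    return comb(n, k) * p_c**k * (1 - p_c)**(n - k)
```

Summing p_k_positives over k = 0, ..., n returns 1, as a probability distribution should.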

Let P(cell_v | n - k) be the analogous probability for a cell in the negative matrix. Then a weighted product of these terms gives the probability that the learning algorithm will generate any particular pair of concept descriptions, which is

    P_n(k, u, v) = P(x = k/n) \cdot P(cell_u | k) \cdot P(cell_v | n - k) .

In other words, one multiplies the probability of seeing k out of n positive instances and the probabilities of encountering cell u in the positive matrix and cell v in the negative matrix.

However, we must still determine the probability of a given cell from the matrix. For those in the positive matrix this is straightforward, since the attributes remain independent when the instance is a member of a conjunctive concept. Thus we have

    P(cell_u | k) = \prod_{j=1}^{r+i} P(y_j = u_j / k)

as the probability for cell_u in the positive matrix, where y_j represents the observed fraction of the k instances in which attribute A_j was present. Furthermore, the probability that one will observe A_j in exactly u_j out of k such instances is

    P(y_j = u_j / k) = \binom{k}{u_j} P(A_j | C)^{u_j} [1 - P(A_j | C)]^{k - u_j} .

In the absence of noise, we have P(A_j | C) = 1 for all relevant attributes and P(A_j | C) = P(A_j) for all irrelevant attributes.

[Figure 1. A positive cell matrix for three attributes. Values along the axes represent the numbers of positive instances for which A_j was present.]

The calculation is more difficult for cells in the negative matrix. One cannot simply take the product of the probabilities for each index of the cell since, for a conjunctive concept, the attributes are not statistically independent. However, one can compute the probability that the n - k observed negative instances will be composed of a particular combination of instances.

If we let P(I_j | ¬C) be the probability of I_j given a negative instance, we can use the multinomial distribution to compute the probability that exactly d_1 of the n - k instances will be instance I_1, d_2 will be instance I_2, ..., and d_w will be instance I_w. Thus the expression

    \frac{(n-k)!}{d_1! d_2! \cdots d_w!} P(I_1 | ¬C)^{d_1} P(I_2 | ¬C)^{d_2} \cdots P(I_w | ¬C)^{d_w}

gives us the probability of a particular combination of negative instances, and from that combination we can compute the concept description (cell indices) that results. Of course, two or more combinations of instances may produce the same concept description, but one simply sums the probabilities for all such combinations to get the total probability for the cell. All that we need to make this operational is P(I_j | ¬C), the probability of I_j given a negative instance. In the absence of noise this is simply P(I_j) / P(¬C), since P(¬C | I_j) = 1 for a negative instance.

We can extend the framework to handle class noise by modifying the definitions of three basic terms: P(C), P(A_j | C), and P(I_j | ¬C). One common definition of class noise involves the corruption of class names, replacing the actual class with its opposite with a certain probability z between 0 and 1. The probability of the (possibly corrupted) class C' after one has corrupted values is

    P(C') = (1 - z) P(C) + z [1 - P(C)] = P(C)(1 - 2z) + z ,

as we have noted elsewhere (Iba & Langley, 1992).

For an irrelevant attribute A_j, the probability P(A_j | C') is unaffected by class noise and remains equal to P(A_j), since the attribute is still independent of the class. However, the situation for relevant attributes is more complicated. By definition, we can re-express the corrupted conditional probability of a relevant attribute A_j given the possibly corrupted class C' as

    P(A_j | C') = P(A_j ∧ C') / P(C') ,

where P(C') is the noisy class probability given above. Also, we can rewrite the numerator to specify the situations in which corruption of the class name does and does not occur, giving

    P(A_j | C') = [ (1 - z) P(C) P(A_j | C) + z P(¬C) P(A_j | ¬C) ] / P(C') .

Since we know that P(A_j | C) = 1 for a relevant attribute, and since P(A_j | ¬C) = [P(A_j) - P(C)] / [1 - P(C)] for conjunctive concepts, we have

    P(A_j | C') = [ (1 - z) P(C) + z (P(A_j) - P(C)) ] / [ P(C)(1 - 2z) + z ] ,

which involves only terms that existed before corruption of the class name.

We can use similar reasoning to compute the post-noise probability of any particular instance, given that it is negative. As before, we can rewrite P(I_j | ¬C') as

    P(I_j | ¬C') = [ (1 - z) P(¬C) P(I_j | ¬C) + z P(C) P(I_j | C) ] / [ P(¬C)(1 - 2z) + z ] .
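The corrupted terms reduce to simple arithmetic in z. The following sketch (our naming, for illustration) mirrors the definitions of P(C') and, for a relevant attribute of a monotone conjunction, P(A_j | C'):

```python
def p_class_noisy(p_c, z):
    """P(C'): class probability after each label is flipped with probability z."""
    return (1 - z) * p_c + z * (1 - p_c)          # equals P(C)(1 - 2z) + z

def p_attr_noisy(p_a, p_c, z):
    """P(A_j | C') for a relevant attribute of a monotone conjunction,
    using P(A_j | C) = 1 and P(A_j | not C) = (P(A_j) - P(C)) / (1 - P(C))."""
    return ((1 - z) * p_c + z * (p_a - p_c)) / p_class_noisy(p_c, z)
```

With z = 0 both functions return the noise-free values, and at z = 1/2 the corrupted conditional collapses to the unconditional P(A_j), so the class label no longer carries any information about its relevant attributes.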

In this case, however, the special conditions are somewhat different. For a negative instance we have P(I_j | C) = 0, so that the second term in the numerator becomes zero. In contrast, for a positive instance we have P(I_j | ¬C) = 0, so that the first term disappears. Taken together, these conditions let us generate probabilities for cells in the negative matrix after one has added noise to the class name.

After replacing P(C) with P(C'), P(A_j | C) with P(A_j | C'), and P(I_j | ¬C) with P(I_j | ¬C'), the expressions earlier in this section let us compute the probability that a Bayesian classifier will induce any particular pair of concept descriptions (cells in the two matrices). The information necessary for this calculation is the number of training instances, the number of relevant and irrelevant attributes, their distributions, and the level of class noise. This analysis holds only for monotone conjunctive concepts and in domains with independent attributes, but many of the ideas should carry over to less restricted classes of domains.

To calculate overall accuracy after n training instances, we must sum the expected accuracy for each possible instance, weighted by that instance's probability of occurrence. More formally, the expected accuracy is

    K_n = \sum_{j=1}^{|I|} P(I_j) K_n(I_j) .

To compute the expected accuracy K_n(I_j) for instance I_j, we must determine, for each pair of cells in the positive and negative matrices, the instance's classification. A test instance I_j is classified by computing its score for each class description and selecting the class with the highest score (choosing randomly in case of ties). We will define accuracy_n(I_j, u, v) for the pair of concept descriptions u and v to be 1 if this scheme correctly predicts I_j's class, 0 if it incorrectly predicts the class, and 1/2 if a tie occurs.

Following our previous notation, let n be the number of observed instances, k the number of observed positive instances, u_j the number of positive instances in which attribute A_j occurs, and v_j the number of negative instances in which A_j occurs. For a given instance I, one can compute the score for the positive class description as

    score(C) = (k/n) \prod_{j=1}^{r+i} q_j ,  where q_j = u_j / k if A_j is present in I, and q_j = (k - u_j) / k otherwise,

and an analogous equation for the negative class, substituting n - k for k and v_j for u_j. To avoid multiplying by 0 when an attribute has never (always) been observed in the training instances but is (is not) present in the test instance, we follow Clark and Niblett's (1989) suggestion of replacing 0 with a small value such as 1/n. (An alternative approach would hold this estimate constant for relevant attributes; this nudges the initial accuracies upward but otherwise has little effect on the learning curves.)

To compute the expected accuracy for instance I_j, we sum, over all possible values of k and pairs of concept descriptions, the product of the probability of selecting the particular pair of concept descriptions after k positive instances and the pair's accuracy on I_j. Thus we have

    K_n(I_j) = \sum_{k=0}^{n} \sum_{u} \sum_{v} P_n(k, u, v) \cdot accuracy_n(I_j, u, v) ,

where the second and third summations occur over the possible vectors that index into the positive matrix and the negative matrix. To complete our calculations, we need an expression for P(I_j), which is the product of the probabilities of the features present in I_j and the complements of those that are absent.

Although the equations in the previous sections give a formal description of the Bayesian classifier's behavior, their implications are not obvious. In this section we examine the effects of various domain characteristics on the algorithm's classification accuracy. However, because the number of possible concept descriptions grows exponentially with the number of training instances and the number of attributes, our predictions have been limited to a small number of each.

In addition to theoretical predictions, we report learning curves that summarize runs on randomly generated training sets. Each curve reports the average classification accuracy over these runs on a single test set of randomly generated instances containing no noise. In each case we bound the mean accuracy with confidence intervals to show the degree to which our predicted learning curves fit the observed ones. These experimental results provide an important check on our reasoning, and they revealed a number of problems during development of the analysis.

Figure 2 shows the effects of concept complexity on the rate of learning in the Bayesian classifier when no noise is present. In this case we hold the number of irrelevant attributes i constant at one, and we hold their probability of occurrence P(A_j) constant. We vary both the number of training instances and the number of relevant attributes r, which determine the complexity of the target concept. To normalize for effects of the base rate, we also hold P(C), the probability of the concept, constant; this means that for each of the r relevant attributes, P(A_j) is P(C)^{1/r}, and thus is varied for the different conditions.

As is typical with learning curves, the initial accuracies begin low and gradually improve with increasing numbers of training instances. The effect of concept complexity also agrees with our intuitions: introducing additional features into the target concept slows the learning rate but does not affect asymptotic accuracy, which is always 1 for conjunctive concepts on noise-free test cases.
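The experimental procedure just described, namely drawing random Boolean instances, labeling them by a monotone conjunction, training incrementally, and scoring a noise-free test set after each instance, can be simulated in a few lines. This is a simplified illustration; the sampling scheme, tie-breaking, and names are our assumptions rather than the authors' exact protocol:

```python
import random

def sample_instance(probs, rng):
    # Each attribute A_j is present independently with probability probs[j].
    return tuple(int(rng.random() < p) for p in probs)

def is_positive(x, r):
    # Monotone conjunctive target: positive iff the first r attributes are all present.
    return all(x[:r])

def bayes_score(x, members, n):
    """Score one class: base rate (len(members)/n) times the product of
    per-attribute conditionals, with zero estimates replaced by 1/n."""
    k = len(members)
    if k == 0:
        return 0.0
    s = k / n
    for j, present in enumerate(x):
        p = sum(m[j] for m in members) / k
        p = p if present else 1 - p
        s *= p if p > 0 else 1 / n
    return s

def learning_curve(r, i, n_train, n_test, p_attr, seed=0):
    rng = random.Random(seed)
    probs = [p_attr] * (r + i)
    test = [sample_instance(probs, rng) for _ in range(n_test)]  # noise-free test set
    pos, neg, curve = [], [], []
    for n in range(1, n_train + 1):
        x = sample_instance(probs, rng)
        (pos if is_positive(x, r) else neg).append(x)
        hits = sum(
            (bayes_score(t, pos, n) > bayes_score(t, neg, n)) == is_positive(t, r)
            for t in test
        )
        curve.append(hits / n_test)
    return curve
```

Ties are broken toward the negative class here for brevity, whereas the analysis scores a tie as one half; with a fixed seed the resulting curve is reproducible.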

[Figure 2. Predictive accuracy of a Bayesian classifier on a conjunctive concept, assuming the presence of one irrelevant attribute, as a function of the number of training instances and (a) the number of relevant attributes (one, two, or three) and (b) the amount of class noise (0.0, 0.1, or 0.2). Both panels plot the probability of correct classification (0.5 to 1) against the number of training instances (0 to 40). The lines represent theoretical learning curves, whereas the error bars indicate experimental results.]

The rate of learning appears to degrade gracefully with increasing complexity, and the predicted and observed learning curves are in close agreement, which lends confidence to our average-case analysis. Theory and experiment show similar effects when we vary the number of irrelevant attributes: the learning rate slows as we introduce misleading features, but the algorithm gradually converges on perfect accuracy.

Figure 2 (b) presents similar results on the interaction between class noise and the number of training instances. Here we hold the number of relevant attributes constant at two and the number of irrelevants constant at one, and we examine three separate levels of class noise. Following the analysis, we assume the test instances are free of noise, which normalizes accuracies and eases comparison. As one might expect, increasing the noise level z decreases the rate of learning. However, the probabilistic nature of the Bayesian classifier leads to graceful degradation, and asymptotic accuracy should be unaffected. We find a close fit between the theoretical behavior and the experimental learning curves. Although our analysis does not incorporate attribute noise, experiments with this factor produce similar results. In this case, equivalent levels lead to somewhat slower learning rates, as one would expect given that attribute noise can corrupt multiple values whereas class noise affects only one.

Finally, we can compare the behavior of the Bayesian classifier to that of Wholist (Pazzani & Sarrett, 1990). One issue of interest is the number of training instances required to achieve some criterion level of accuracy. A quantitative comparison of this nature is beyond the scope of this paper, but the respective analyses and experiments show that the Wholist algorithm is affected only by the number of irrelevant attributes, whereas the Bayesian classifier is sensitive to the number of both relevant and irrelevant attributes. However, the Bayesian classifier is robust with respect to noise, whereas the Wholist algorithm is not.

In this paper we have presented an average-case analysis of a Bayesian classifier. Our treatment requires that the concept be monotone conjunctive, that instances be free of attribute noise, and that attributes be Boolean and independent. Given information about the number of relevant and irrelevant attributes, their frequencies, and the level of class noise, our equations compute the expected classification accuracy after a given number of training instances.

To explore the implications of the analysis, we have plotted the predicted behavior of the algorithm as a function of the number of training instances, the number of relevant attributes, and the amount of noise, finding graceful degradation as the latter two increased. As a check on our analysis, we ran the algorithm on artificial domains with the same characteristics. We obtained close fits to the predicted behavior, but only after correcting several errors in our reasoning that the empirical studies revealed.

In additional experiments, we compared the behavior of the Bayesian classifier to that of a reimplementation of C4, a more widely used algorithm that induces decision trees. In general, the probabilistic method performs comparably to C4, despite the latter's greater sophistication. These results suggest that such simple methods deserve increased attention in future studies, whether theoretical or experimental.

In future work we plan to extend this analysis in several ways. In particular, our current equations handle only class noise, but as Angluin and Laird (1988) have shown, attribute noise can be even more problematic for learning algorithms. We have developed tentative equations for the case of attribute noise, but the expressions are more complex than for class noise, in that the possible corruption of any combination of attributes can make any instance appear like another.

We also need to relax the constraint that target concepts must be monotone conjunctive.

Another direction in which we can extend the present work involves running additional experiments. Even within the assumptions of the current analysis, we could empirically study the extent to which violated assumptions alter the observed behavior of the algorithm. In addition, we could analyze the attribute frequencies in several of the domains commonly used in experiments, to determine the analytic model's ability to predict behavior on these domains given their frequencies as input. This approach would extend the usefulness of our average-case model beyond the artificial domains on which we have tested it to date.

Overall, we are encouraged by the results that we have obtained. We have demonstrated that a simple Bayesian classifier compares favorably with a more sophisticated induction algorithm and, more important, we have characterized its average-case behavior for a restricted class of domains. Our analysis confirms intuitions about the robustness of the Bayesian algorithm in the face of noise and concept complexity, and it provides fertile ground for further research on this understudied approach to induction.

Acknowledgements

Thanks to Stephanie Sage, Kimball Collins, and Andy Philips for discussions that helped clarify our ideas.

References

Allen, J., & Langley, P. (1990). Integrating memory and search in planning. Proceedings of the Workshop on Innovative Approaches to Planning, Scheduling and Control. San Diego: Morgan Kaufmann.

Angluin, D., & Laird, P. (1988). Learning from noisy examples. Machine Learning, 2.

Buntine, W., & Caruana, R. (1991). Introduction to IND and recursive partitioning (Technical Report FIA). Moffett Field, CA: NASA Ames Research Center, Artificial Intelligence Research Branch.

Cheeseman, P., Kelly, J., Self, M., Stutz, J., Taylor, W., & Freeman, D. (1988). AutoClass: A Bayesian classification system. Proceedings of the Fifth International Conference on Machine Learning. Ann Arbor, MI: Morgan Kaufmann.

Clark, P., & Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3.

Cohen, P. R., & Howe, A. E. (1988). How evaluation guides AI research. AI Magazine, 9.

Cooper, G. F., & Herskovits, E. (1991). A Bayesian method for constructing Bayesian belief networks from databases. Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence. Los Angeles: Morgan Kaufmann.

Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis. New York: John Wiley & Sons.

Fisher, D. H. (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2.

Haussler, D. (1990). Probably approximately correct learning. Proceedings of the Eighth National Conference on Artificial Intelligence. Boston: AAAI Press.

Hirschberg, D. S., & Pazzani, M. J. (1991). Average-case analysis of a k-CNF learning algorithm (Technical Report). Irvine: University of California, Department of Information & Computer Science.

Iba, W., & Gennari, J. H. (1991). Learning to recognize movements. In D. H. Fisher, M. J. Pazzani, & P. Langley (Eds.), Concept formation: Knowledge and experience in unsupervised learning. San Mateo, CA: Morgan Kaufmann.

Iba, W., & Langley, P. (1992). Induction of one-level decision trees. Proceedings of the Ninth International Conference on Machine Learning. Aberdeen: Morgan Kaufmann.

Kibler, D., & Langley, P. (1988). Machine learning as an experimental science. Proceedings of the Third European Working Session on Learning. Glasgow: Pitman.

Murphy, P. M., & Aha, D. W. (1992). UCI repository of machine learning databases [machine-readable data repository]. Irvine: University of California, Department of Information & Computer Science.

Opper, M., & Haussler, D. (1991). Calculation of the learning curve of Bayes optimal classification algorithm for learning a perceptron with noise. Proceedings of the Fourth Annual Workshop on Computational Learning Theory. Santa Cruz, CA: Morgan Kaufmann.

Pazzani, M. J., & Sarrett, W. (1990). Average-case analysis of conjunctive learning algorithms. Proceedings of the Seventh International Conference on Machine Learning. Austin, TX: Morgan Kaufmann.

Thompson, K., & Langley, P. (1991). Concept formation in structured domains. In D. H. Fisher, M. J. Pazzani, & P. Langley (Eds.), Concept formation: Knowledge and experience in unsupervised learning. San Mateo, CA: Morgan Kaufmann.
