Introduction to the 40 algorithms whose correct rate on the training set is higher than 50%.
BayesNet
BayesNet learns Bayesian networks under the assumptions of nominal attributes and no missing
values, with two different algorithms for estimating the conditional probability tables of the
network. Search is performed with the K2 or TAN algorithm, or with more sophisticated methods.
ComplementNaiveBayes
ComplementNaiveBayes builds and uses a Complement class Naive Bayes classifier. (Jason et al. 2003)
NaiveBayes
NaiveBayes implements the probabilistic Naive Bayes classifier. Kernel density estimators can be
employed in this classifier. (George and Pat 1995)
NaiveBayesMultinomial
NaiveBayesMultinomial implements the multinomial Naive Bayes classifier, a modified form of
Naive Bayes that accommodates word frequencies. (Andrew and Kamal 1998)
NaiveBayesSimple
NaiveBayesSimple builds and uses a simple Naive Bayes classifier. A normal distribution is
employed to model numeric attributes. (Richard and Peter 1973)
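As an illustration of modelling numeric attributes with a normal distribution, the following is a minimal sketch in Python, not the WEKA implementation. The function names, the variance floor of 1e-9, and the use of the population variance are assumptions made for this example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and a per-attribute (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(rows)
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, max(var, 1e-9)))  # assumed floor, avoids division by zero
        model[label] = (prior, stats)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior under the normal model."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)
```

Working in log space keeps the product of many small densities from underflowing.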
NaiveBayesUpdateable
NaiveBayesUpdateable is the updateable version of NaiveBayes, which processes one instance at a
time. A kernel estimator, but not discretization, can be employed in this classifier. (Jason et
al. 2003)
Logistic
Logistic builds and uses a multinomial logistic regression model with a ridge estimator, which
guards against overfitting by penalizing large coefficients. (le Cessie and van Houwelingen 1992)
MultilayerPerceptron
MultilayerPerceptron is a neural network that is trained using backpropagation to classify
instances. The network can be built either by hand or by an algorithm, and it can also be
monitored and modified during training time.
SimpleLogistic
SimpleLogistic builds linear logistic regression models. To fit these models, LogitBoost with
simple regression functions as base learners is employed. The optimal number of iterations to
perform is determined using cross-validation, which supports automatic attribute selection.
(Niels et al. 2005, Marc et al. 2005)
SMO
SMO implements John Platt's sequential minimal optimization algorithm, using polynomial or
Gaussian kernels, for training a support vector classifier. (Platt 1998, Keerthi 2001, Trevor and
Robert 1998)
IB1
IB1 is a nearest-neighbour classifier. Normalized Euclidean distance is employed to find the
training instance closest to the given test instance, and the classifier predicts the same class
as this training instance. If several instances have the same (smallest) distance to the test
instance, the first one found is used. (Aha and Kibler 1991)
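The nearest-neighbour rule with its tie-breaking behaviour can be sketched as follows. This is an illustrative Python version, not WEKA code. It assumes that "normalized" means min-max scaling of each attribute over the training set, and the function name is hypothetical.

```python
import math


def ib1_predict(train_rows, train_labels, query):
    """1-NN with min-max normalised Euclidean distance; the first match wins ties."""
    columns = list(zip(*train_rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # constant attributes get span 1

    def scale(row):
        return [(v - lo) / span for v, lo, span in zip(row, lows, spans)]

    q = scale(query)
    best_label, best_dist = None, math.inf
    for row, label in zip(train_rows, train_labels):
        dist = math.dist(scale(row), q)
        if dist < best_dist:  # strict '<' keeps the first of several equal distances
            best_label, best_dist = label, dist
    return best_label
```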
IBk
IBk is a k-nearest-neighbour classifier that uses the Euclidean distance metric. The number of
nearest neighbours can be determined automatically using leave-one-out cross-validation. (Aha
and Kibler 1991)
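Selecting the number of neighbours by leave-one-out cross-validation might look like the sketch below. The function names and the candidate list are assumptions for illustration; ties between candidates are resolved in favour of the earliest one, which is a choice made here rather than something stated above.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k):
    """Majority vote among the k training rows closest to the query."""
    order = sorted(range(len(train_rows)), key=lambda i: math.dist(train_rows[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def choose_k_by_loo(rows, labels, candidate_ks):
    """Return the candidate k with the fewest leave-one-out errors."""
    def loo_errors(k):
        errors = 0
        for i in range(len(rows)):
            # Hold out instance i and classify it from all remaining instances.
            rest_rows = rows[:i] + rows[i + 1:]
            rest_labels = labels[:i] + labels[i + 1:]
            if knn_predict(rest_rows, rest_labels, rows[i], k) != labels[i]:
                errors += 1
        return errors
    return min(candidate_ks, key=loo_errors)
```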
KStar
KStar is a nearest-neighbour classifier using a generalized distance function, which is defined
as the complexity of transforming one instance into another. It uses an entropy-based distance
function, which is different from other instance-based learners. (John and Leonard 1995)
BFTree
BFTree builds a best-first decision tree, which uses binary splits for both nominal and numeric
attributes. (Shi 2007, Jerome et al. 2000)
J48
J48 generates a pruned or unpruned C4.5 decision tree. (Ross 1993)
J48graft
J48graft generates a grafted (pruned or unpruned) C4.5 decision tree. (Geoff 1999)
NBTree
NBTree is a hybrid between a decision tree and Naive Bayes: it creates trees whose leaves are
Naive Bayes classifiers for the instances that reach the leaf. (Ron 1996)
RandomForest
RandomForest constructs random forests by bagging ensembles of random trees. (Leo 2001)
REPTree
REPTree builds a decision or regression tree using information gain or variance, and prunes the
tree using reduced-error pruning.
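The information gain criterion for a candidate split can be written out as a short sketch. This shows only the split score for a numeric attribute and a given threshold, under the assumption of a binary "less than or equal" split; it is not the tree builder itself, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on value <= threshold versus value > threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that sends everything one way carries no information
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted
```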
SimpleCart
SimpleCart implements minimal cost-complexity pruning. It deals with missing values by using the
method of fractional instances instead of the surrogate split method. (Leo et al. 1984)
DecisionTable
DecisionTable builds a simple decision table majority classifier, which evaluates feature
subsets using best-first search and uses cross-validation for evaluation. (Ron 1995)
JRip
JRip implements Repeated Incremental Pruning to Produce Error Reduction (RIPPER), which is an
optimized version of IREP. (William 1995)
PART
PART generates a PART decision list using separate-and-conquer. It builds a partial C4.5
decision tree in each iteration and makes the best leaf into a rule. (Eibe and Ian 1998)
AttributeSelectedClassifier
AttributeSelectedClassifier selects attributes to reduce the data’s dimensionality before
passing it to the classifier.
Bagging
Bagging bags a classifier to reduce variance. It can do classification and regression, depending
on the base learner. (Leo 1996)
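A rough sketch of bagging for the classification case is given below. The nearest-mean base learner is a toy stand-in chosen for brevity, and the function names, the default of 11 bags, and the seed are all assumptions of this example rather than settings from the study.

```python
import random
from collections import Counter


def fit_nearest_mean(rows, labels):
    """Toy base learner: predict the class whose mean vector is closest."""
    sums, counts = {}, Counter(labels)
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    means = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(row):
        return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(row, means[c])))
    return predict


def bagging_fit(rows, labels, fit_base, n_bags=11, seed=0):
    """Train one base model per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_bags):
        picks = [rng.randrange(len(rows)) for _ in rows]
        models.append(fit_base([rows[i] for i in picks], [labels[i] for i in picks]))
    return models


def bagging_predict(models, row):
    """Combine the base models by majority vote."""
    return Counter(m(row) for m in models).most_common(1)[0][0]
```

For regression, the vote in `bagging_predict` would be replaced by an average of the base models' numeric outputs.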
ClassificationViaClustering
ClassificationViaClustering uses a clusterer for classification, with a fixed number of clusters
in the clustering algorithm. To obtain a useful model, the number of clusters to generate is set
equal to the number of class labels in the dataset.
ClassificationViaRegression
ClassificationViaRegression performs classification using regression methods. The class is
binarized and one regression model is built for each class value. (Frank et al. 1998)
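The binarize-then-regress idea can be sketched with a deliberately simple regressor. The example below uses ordinary least squares on a single attribute as an assumed stand-in for the regression scheme, which is not the model-tree learner of the cited paper; all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept on one attribute."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def fit_classification_via_regression(xs, labels):
    """Build one regression model per class value on a 0/1 indicator target."""
    return {
        c: fit_line(xs, [1.0 if label == c else 0.0 for label in labels])
        for c in sorted(set(labels))
    }


def predict_class(models, x):
    """Predict the class whose regression model gives the largest output."""
    return max(models, key=lambda c: models[c][0] * x + models[c][1])
```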
Dagging
Dagging creates a number of disjoint, stratified folds out of the data and feeds each chunk of
data to a copy of the supplied base classifier. Since all generated base classifiers are put into
the vote classifier, majority voting is employed to predict. (Ting and Witten 1997)
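The fold construction and the vote can be sketched as follows. The round-robin stratification and the function names are assumptions for illustration; the base learner is passed in as a hypothetical `fit_base(rows, labels)` function that returns a prediction function.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds):
    """Split instance indices into disjoint folds, dealing each class out round-robin."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    position = 0
    for indices in by_class.values():
        for index in indices:
            folds[position % n_folds].append(index)
            position += 1
    return folds


def dagging_fit(rows, labels, fit_base, n_folds):
    """Train one copy of the base learner on each disjoint fold."""
    return [
        fit_base([rows[i] for i in fold], [labels[i] for i in fold])
        for fold in stratified_folds(labels, n_folds)
    ]


def dagging_predict(models, row):
    """Majority vote over the base models."""
    return Counter(m(row) for m in models).most_common(1)[0][0]
```

Unlike bagging, every training instance ends up in exactly one fold, so each base model sees a different, smaller slice of the data.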
Decorate
Decorate builds diverse ensembles of classifiers by using specially constructed artificial
training examples. (Melville and Mooney 2003, Melville and Mooney 2004)
END
END builds an ensemble of nested dichotomies to handle multi-class datasets with 2-class
classifiers. (Dong et al. 2005, Eibe and Stefan 2004)
EnsembleSelection
EnsembleSelection uses the ensemble selection method to combine several classifiers from
libraries of thousands of models, which are generated using different learning algorithms and
parameter settings. (Caruana et al. 2004)
FilteredClassifier
FilteredClassifier runs an arbitrary classifier on data that has been passed through an
arbitrary filter, whose structure is based exclusively on the training data. Test instances are
processed by the filter without changing their structure.
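The point that the filter is built from the training data alone can be illustrated with a small sketch. A standardization filter is used here as an assumed example of "an arbitrary filter"; the function names and the `fit_base(rows, labels)` interface are hypothetical.

```python
import math


def fit_standardize_filter(train_rows):
    """Learn per-attribute mean and standard deviation from the training rows only."""
    columns = list(zip(*train_rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0  # constant column: use 1
        for c, m in zip(columns, means)
    ]

    def apply(row):
        # Test rows reuse the training statistics; nothing is re-estimated here.
        return [(v - m) / s for v, m, s in zip(row, means, stds)]
    return apply


def fit_filtered_classifier(train_rows, train_labels, fit_filter, fit_base):
    """Fit the filter on the training data, then train the base learner on filtered rows."""
    apply = fit_filter(train_rows)
    base = fit_base([apply(r) for r in train_rows], train_labels)
    return lambda row: base(apply(row))
```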
LogitBoost
LogitBoost performs additive logistic regression using a regression scheme as the base learner,
and it can handle multi-class problems. (Friedman et al. 1998)
MultiClassClassifier
MultiClassClassifier handles multi-class datasets with 2-class classifiers using any of the
following methods: one versus all the rest, pairwise classification using voting to predict,
exhaustive error-correcting codes, and randomly selected error-correcting codes.
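Of the methods listed, pairwise classification with voting is perhaps the easiest to sketch. The example below trains one 2-class model per pair of classes and predicts by vote. The threshold learner is a toy stand-in on a single numeric attribute, and all names are assumptions of this example.

```python
from collections import Counter
from itertools import combinations


def fit_threshold(xs, labels):
    """Toy 2-class learner on one attribute: pick the class with the nearer mean."""
    first, second = sorted(set(labels))

    def class_mean(c):
        members = [x for x, label in zip(xs, labels) if label == c]
        return sum(members) / len(members)

    m1, m2 = class_mean(first), class_mean(second)
    return lambda x: first if abs(x - m1) <= abs(x - m2) else second


def fit_pairwise(xs, labels, fit_binary):
    """Train one 2-class model for every pair of class values."""
    models = []
    for a, b in combinations(sorted(set(labels)), 2):
        subset = [(x, label) for x, label in zip(xs, labels) if label in (a, b)]
        models.append(fit_binary([x for x, _ in subset], [label for _, label in subset]))
    return models


def predict_pairwise(models, x):
    """Each pairwise model casts one vote; the most frequent class wins."""
    return Counter(m(x) for m in models).most_common(1)[0][0]
```

With k classes this trains k(k-1)/2 models, each on only the instances of its two classes.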
RacedIncrementalLogitBoost
RacedIncrementalLogitBoost learns from large datasets by way of racing LogitBoosted committees,
and operates incrementally by processing the datasets in batches.
RandomCommittee
RandomCommittee builds an ensemble of randomizable base classifiers, each built using a
different random number seed (but based on the same data). The final prediction is a straight
average of the predictions generated by the individual base classifiers.
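The seed-per-member construction and the straight average can be sketched as follows. The random stump is a toy randomizable learner invented for this example (it splits on a seed-chosen attribute at that attribute's mean), and the function names and defaults are assumptions.

```python
import random


def fit_random_stump(rows, labels, seed):
    """Randomizable toy learner: split on a seed-chosen attribute at its mean."""
    rng = random.Random(seed)
    attr = rng.randrange(len(rows[0]))
    cut = sum(r[attr] for r in rows) / len(rows)
    classes = sorted(set(labels))

    def distribution(on_left):
        members = [l for r, l in zip(rows, labels) if (r[attr] <= cut) == on_left]
        total = len(members) or 1
        return {c: members.count(c) / total for c in classes}

    left, right = distribution(True), distribution(False)
    return lambda row: left if row[attr] <= cut else right


def fit_committee(rows, labels, n_members=10, seed=1):
    """Every member sees the same data; only the random seed differs."""
    return [fit_random_stump(rows, labels, seed + i) for i in range(n_members)]


def committee_distribution(members, row):
    """Straight (unweighted) average of the members' class distributions."""
    outputs = [m(row) for m in members]
    return {c: sum(o[c] for o in outputs) / len(outputs) for c in outputs[0]}
```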
RandomSubSpace
RandomSubSpace constructs a decision tree based classifier that maintains highest accuracy on
training data and improves on generalization accuracy as it grows in complexity. The classifier
consists of multiple trees constructed systematically by pseudorandomly selecting subsets of
components of the feature vector, that is, trees constructed in randomly chosen subspaces. (Tin
1998)
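The subspace selection step can be sketched in a few lines. To keep the example short, a 1-nearest-neighbour rule is used as an assumed stand-in for the decision trees described above, and the members are combined by majority vote; the function names and defaults are hypothetical.

```python
import random
from collections import Counter


def fit_random_subspace(rows, labels, n_models=5, subspace_size=2, seed=0):
    """Give each ensemble member a pseudorandomly chosen subset of the features."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    ensemble = []
    for _ in range(n_models):
        features = sorted(rng.sample(range(n_features), subspace_size))
        projected = [[row[j] for j in features] for row in rows]
        ensemble.append((features, projected))
    return ensemble, labels


def predict_random_subspace(model, row):
    """Each member classifies within its own subspace (1-NN here); the majority wins."""
    ensemble, labels = model
    votes = Counter()
    for features, projected in ensemble:
        q = [row[j] for j in features]
        nearest = min(
            range(len(projected)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(projected[i], q)),
        )
        votes[labels[nearest]] += 1
    return votes.most_common(1)[0][0]
```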
ClassBalancedND
ClassBalancedND handles multi-class datasets with 2-class classifiers by building a random
class-balanced tree structure. (Dong et al. 2005, Eibe and Stefan 2004)
DataNearBalancedND
DataNearBalancedND handles multi-class datasets with 2-class classifiers by building a random
data-balanced tree structure. (Dong et al. 2005, Eibe and Stefan 2004)
ND
ND handles multi-class datasets with 2-class classifiers by building a random tree structure.
(Dong et al. 2005, Eibe and Stefan 2004)
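A random tree structure over the class values can be sketched as a recursive split of the class set. The sampling scheme below (shuffle, then cut at a random position) is an assumption made for brevity and is not claimed to match the distribution over trees used in the cited papers. Each internal node would hold one 2-class classifier separating its two subsets. Class labels are assumed not to be tuples, since tuples mark internal nodes here.

```python
import random


def build_random_dichotomy(classes, rng):
    """Recursively split a list of classes into two random non-empty subsets."""
    if len(classes) == 1:
        return classes[0]  # leaf: a single class
    shuffled = classes[:]
    rng.shuffle(shuffled)
    cut = rng.randint(1, len(shuffled) - 1)  # both sides keep at least one class
    return (
        build_random_dichotomy(shuffled[:cut], rng),
        build_random_dichotomy(shuffled[cut:], rng),
    )


def leaves(tree):
    """List the classes at the leaves of a nested dichotomy."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


def count_dichotomies(tree):
    """Number of internal nodes, i.e. 2-class problems to be learned."""
    if isinstance(tree, tuple):
        return 1 + count_dichotomies(tree[0]) + count_dichotomies(tree[1])
    return 0
```

A class-balanced variant would fix the cut near the middle of the shuffled list instead of drawing it at random.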
References:
Aha, D., Kibler, D. 1991. Instance-based learning algorithms. Machine Learning. 6:37-66.
Andrew McCallum, Kamal Nigam. 1998. A Comparison of Event Models for Naive Bayes Text
Classification. In: AAAI-98 Workshop on 'Learning for Text Categorization'.
Caruana, Rich, Niculescu, Alex, Crew, Geoff, and Ksikes, Alex. 2004. Ensemble Selection from
Libraries of Models. In: The International Conference on Machine Learning (ICML'04).
Dong Lin, Eibe Frank, Stefan Kramer. 2005. Ensembles of Balanced Nested Dichotomies for
Multi-class Problems. In: PKDD, 84-95.
Eibe Frank, Ian H. Witten. 1998. Generating Accurate Rule Sets Without Global Optimization. In:
Fifteenth International Conference on Machine Learning, 144-151.
Eibe Frank, Stefan Kramer. 2004. Ensembles of nested dichotomies for multi-class problems. In:
Twenty-first International Conference on Machine Learning.
Frank, E., Wang, Y., Inglis, S., Holmes, G., Witten, I.H. 1998. Using model trees for
classification. Machine Learning. 32(1):63-76.
Friedman, J., Hastie, T., Tibshirani, R. 1998. Additive Logistic Regression: a Statistical View of
Boosting. Stanford University.
Geoff Webb. 1999. Decision Tree Grafting From the All-Tests-But-One Partition. In: San
Francisco, CA.
George H. John, Pat Langley. 1995. Estimating Continuous Distributions in Bayesian Classifiers.
In: Eleventh Conference on Uncertainty in Artificial Intelligence, San Mateo, 338-345.
Ian H. Witten, Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques
(Second Edition). Morgan Kaufmann Publishers.
Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger. 2003. Tackling the Poor
Assumptions of Naive Bayes Text Classifiers. In: ICML, 616-623.
Jerome Friedman, Trevor Hastie, Robert Tibshirani. 2000. Additive logistic regression: A
statistical view of boosting. Annals of Statistics. 28(2):337-407.
John G. Cleary, Leonard E. Trigg. 1995. K*: An Instance-based Learner Using an Entropic
Distance Measure. In: 12th International Conference on Machine Learning, 108-114.
Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K. 2001. Improvements to Platt's
SMO Algorithm for SVM Classifier Design. Neural Computation. 13(3):637-649.
le Cessie, S., van Houwelingen, J.C. 1992. Ridge Estimators in Logistic Regression. Applied
Statistics. 41(1):191-201.
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone. 1984. Classification and
Regression Trees. Wadsworth International Group, Belmont, California.
Leo Breiman. 1996. Bagging predictors. Machine Learning. 24(2):123-140.
Leo Breiman. 2001. Random Forests. Machine Learning. 45(1):5-32.
Marc Sumner, Eibe Frank, Mark Hall. 2005. Speeding up Logistic Model Tree Induction. In: 9th
European Conference on Principles and Practice of Knowledge Discovery in Databases, 675-683.
Melville, P., Mooney, R.J. 2003. Constructing Diverse Classifier Ensembles Using Artificial
Training Examples. In: Eighteenth International Joint Conference on Artificial Intelligence,
505-510.
Melville, P., Mooney, R.J. 2004. Creating Diversity in Ensembles Using Artificial Data.
Information Fusion: Special Issue on Diversity in Multiclassifier Systems.
Niels Landwehr, Mark Hall, Eibe Frank. 2005. Logistic Model Trees.
Platt, J. 1998. Fast Training of Support Vector Machines using Sequential Minimal Optimization.
In: B. Schoelkopf, C. Burges and A. Smola, editors, Advances in Kernel Methods - Support Vector
Learning.
Richard Duda, Peter Hart. 1973. Pattern Classification and Scene Analysis. Wiley, New York.
Ron Kohavi. 1995. The Power of Decision Tables. In: 8th European Conference on Machine
Learning, 174-189.
Ron Kohavi. 1996. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid.
In: Second International Conference on Knowledge Discovery and Data Mining, 202-207.
Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San
Mateo, CA.
Shi Haijian. 2007. Best-first decision tree learning. Hamilton, NZ.
Tin Kam Ho. 1998. The Random Subspace Method for Constructing Decision Forests. IEEE
Transactions on Pattern Analysis and Machine Intelligence. 20(8):832-844.
Ting, K. M., Witten, I. H. 1997. Stacking Bagged and Dagged Models. In: Fourteenth
International Conference on Machine Learning, San Francisco, CA, 367-375.
Trevor Hastie, Robert Tibshirani. 1998. Classification by Pairwise Coupling. In: Advances in
Neural Information Processing Systems.
William W. Cohen. 1995. Fast Effective Rule Induction. In: Twelfth International Conference on
Machine Learning, 115-123.