COMPARISON OF NEAREST NEIGHBOR (IBK), REGRESSION BY DISCRETIZATION AND ISOTONIC REGRESSION CLASSIFICATION ALGORITHMS FOR PRECIPITATION CLASSES PREDICTION. Solomon Mwanjele Mwagha, P. W. Waiganjo, C. M. Moturi Department of Mathematics and Physics, Pwani University College, Kilifi, Kenya soproltd@gmail.com

COMPARISON OF NEAREST NEIGHBOR (IBK), REGRESSION BY DISCRETIZATION
AND ISOTONIC REGRESSION CLASSIFICATION ALGORITHMS FOR PRECIPITATION
CLASSES PREDICTION.

Solomon Mwanjele Mwagha¹, P. W. Waiganjo², C. M. Moturi²

¹ Department of Mathematics and Physics, Pwani University College, Kilifi, Kenya
soproltd@gmail.com

² School of Computing and Informatics, University of Nairobi, Kenya


ABSTRACT


Selecting a classifier for use in prediction is a challenge. To select the best classifier,
comparisons can be made on various aspects of the candidate classifiers. The key objective of
this paper was to compare the performance of the nearest neighbor (ibk), regression by
discretization and isotonic regression classifiers in predicting predefined precipitation classes
over Voi, Kenya. We sought to train, test and evaluate these three classification algorithms on
that task. A daily KMD historical dataset of minimum/maximum temperatures and precipitation
for the Voi KMD station, covering 1979 to 2008, was obtained. The steps of the knowledge
discovery and data mining (KDD) process were applied. A preprocessing module was designed to
produce the training and testing files used with the classifiers. Three classifiers (Isotonic
Regression, K-nearest neighbours classifier, and RegressionByDiscretization) were trained and
tested on these data sets. For each classifier run, the error of the predicted values, the root
relative squared error and the time taken to train/build the classifier model were computed. Each
classifier provided predicted output classes 12 months in advance. The performance of the three
classifiers was compared in terms of these three measures. The predicted output classes were also
compared to the actual classes, and the percentage of predictions matching the actual
precipitation classes was computed and compared for each classifier. The evaluation showed that
the nearest neighbor classifier is a suitable tool for training on rainfall data to predict
precipitation classes.
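
As an illustration of two of the comparison measures named above, the following minimal Python sketch computes a root relative squared error and the percentage of predicted classes that match the actual classes. It assumes the common textbook definition of root relative squared error (squared error relative to always predicting the mean of the actual values, expressed as a percentage); the exact computation used by the classifier toolkit in the study may differ in detail. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def root_relative_squared_error(actual, predicted):
    """Return RRSE as a percentage (assumed textbook definition).

    The squared error of the predictions is divided by the squared error
    of a baseline that always predicts the mean of the actual values.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    baseline_error = sum((a - mean_actual) ** 2 for a in actual)
    if baseline_error == 0:
        raise ValueError("actual values are constant; RRSE is undefined")
    return 100.0 * math.sqrt(squared_error / baseline_error)


def percent_agreement(actual_classes, predicted_classes):
    """Return the percentage of predicted classes equal to the actual classes."""
    if len(actual_classes) != len(predicted_classes) or not actual_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == p for a, p in zip(actual_classes, predicted_classes))
    return 100.0 * matches / len(actual_classes)


# Hypothetical class labels for illustration only (not data from the study).
actual = [1, 2, 3, 2, 1, 3]
predicted = [1, 2, 2, 2, 1, 3]

rrse = root_relative_squared_error(actual, predicted)
agreement = percent_agreement(actual, predicted)
print(f"RRSE: {rrse:.1f}%  agreement: {agreement:.1f}%")
```

Under this definition, a value below 100% means the predictions have a smaller squared error than a baseline that always predicts the mean, so lower values indicate a better model.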


Keywords

Regression by Discretization, isotonic regression, nearest neighbor (ibk), precipitation
prediction, supervised classifier, classifier performance, cross validation.