Evaluating the Quality of Attributes



Igor Kononenko¹

¹ University of Ljubljana, Faculty of Computer and Information Science
Trzaska 25, 1000 Ljubljana, Slovenia
e-mail: igor.kononenko@fri.uni-lj.si



Abstract


One of the crucial tasks in machine learning is the evaluation of the
quality of attributes. For that purpose, a number of measures have been
developed that estimate the usefulness of an attribute for predicting the
target variable. We will describe separately the measures for classification
(which are also appropriate for relational problems) and the measures for
regression. Most of the measures estimate the quality of one attribute
independently of the context of other attributes. However, the ReliefF
algorithm and its regressional version RReliefF also take into account the
context of other attributes and are therefore appropriate for problems with
strong dependencies between attributes. The following measures will be
described:

- Measures for guiding the search in classification and relational
  problems are: information gain, gain ratio, distance measure, minimum
  description length (MDL), J-measure, Gini-index and ReliefF.

- The quality of attributes in regression can be evaluated using the
  following measures: expected change of variance, regressional ReliefF,
  and the minimum description length principle (MDL).
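
The contrast between context-independent measures and the Relief family can
be illustrated with a small sketch. The code below is not the ReliefF
algorithm itself: it is a simplified two-class Relief for discrete
attributes that uses Hamming distance and averages over tied nearest
neighbours (ReliefF additionally uses k nearest neighbours and handles
multiple classes, noisy and missing values). The function names, the toy
data set and the tie-handling rule are assumptions made for illustration.
The class is the XOR of attributes A1 and A2, and A3 is irrelevant.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete attribute, ignoring all others."""
    n = len(labels)
    remainder = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def hamming_diff(r, s):
    """Per-attribute difference (0 or 1) between two discrete instances."""
    return [1.0 if a != b else 0.0 for a, b in zip(r, s)]


def mean_diff_to_nearest(rows, labels, i, same_class):
    """Average per-attribute difference between instance i and its nearest
    hits (same_class=True) or nearest misses (same_class=False).
    Assumes each class contains at least two instances."""
    diffs = [hamming_diff(rows[i], rows[j])
             for j in range(len(rows))
             if j != i and (labels[j] == labels[i]) == same_class]
    d_min = min(sum(d) for d in diffs)
    nearest = [d for d in diffs if sum(d) == d_min]
    return [sum(col) / len(nearest) for col in zip(*nearest)]


def relief(rows, labels):
    """Simplified Relief: reward attributes that differ on nearest misses,
    penalise attributes that differ on nearest hits."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    for i in range(n):
        hit = mean_diff_to_nearest(rows, labels, i, True)
        miss = mean_diff_to_nearest(rows, labels, i, False)
        for a in range(len(weights)):
            weights[a] += (miss[a] - hit[a]) / n
    return weights


# Toy data: all combinations of three binary attributes, class = A1 XOR A2.
rows = list(product([0, 1], repeat=3))
labels = [a1 ^ a2 for a1, a2, _ in rows]

gains = [information_gain([r[a] for r in rows], labels) for a in range(3)]
weights = relief(rows, labels)

print("information gain:", gains)
print("Relief weights:  ", weights)
```

On this toy problem, information gain is zero for all three attributes,
because each attribute is uninformative when considered alone. The
simplified Relief estimate is positive for A1 and A2 and negative for A3,
because nearest neighbours are found using all attributes together.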



References



I. Kononenko, M. Kukar: Machine Learning and Data Mining: Introduction to
Principles and Algorithms, Chichester, UK: Horwood publ., to appear in
Jan. 2006.

I. Kononenko: Estimating attributes: Analysis and extensions of RELIEF.
Proc. Machine Learning: ECML-94 / European Conference on Machine Learning,
Catania, Sicily, April 1994 (F. Bergadano, L. De Raedt (eds.)), Springer
Verlag, pp. 171-182.

I. Kononenko: On Biases in Estimating Multivalued Attributes, Proc.
International Joint Conference on Artificial Intelligence IJCAI-95,
Montreal, August 20-25, 1995, pp. 1034-1040.

P. Smyth and R. M. Goodman: Rule induction using information theory. In:
G. Piatetsky-Shapiro and W. Frawley (eds.), Knowledge Discovery in
Databases, MIT Press, 1990.


Keywords


Impurity measures, entropy, Gini-index, ReliefF, MDL, J-measure