Introduction To Machine Learning


Oct 15, 2013 (3 years and 7 months ago)


Chapter 1: Introduction To Machine Learning



Dr. Essam Al Daoud









Study material

• Handouts, your notes and course readings

Primary textbook:

• Chris Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

Other books:

• Friedman, Hastie, Tibshirani. Elements of Statistical Learning. Springer, 2001.
• Duda, Hart, Stork. Pattern Classification. 2nd edition. J. Wiley and Sons, 2000.
• C. Bishop. Neural Networks for Pattern Recognition. Oxford U. Press, 1996.
• T. Mitchell. Machine Learning. McGraw Hill, 1997.


Machine learning

The field of machine learning studies the design of computer programs (agents) capable of learning from past experience or adapting to changes in the environment.

Learning process: the learner (a computer program) processes data D representing past experiences and tries either to develop an appropriate response to future data or to describe the data seen in some meaningful way.



Example:

A learner sees a set of patient cases (patient records) with corresponding diagnoses. It can try either:

• to predict the presence of a disease for future patients, or
• to describe the dependencies between diseases and symptoms.

Machine Learning Applications

The need for building agents (applications) capable of learning is everywhere:

• Perception: computer vision, natural language processing, syntactic pattern recognition, search engines, medical diagnosis, bioinformatics, brain-machine interfaces, cheminformatics
• Detecting: credit card fraud, stock market, email spam detection
• Classifying: DNA sequences, speech and handwriting recognition, voice recognition, iris classification, text/graphics discrimination, sleep EEG staging
• Object recognition, game playing, adaptive websites, robot locomotion, structural health monitoring, and breast X-ray screening




From the Real World to Machine Learning: The Big Picture

Application

• Financial
• Computer vision
• Search engines
• Medical diagnosis
• Cheminformatics
• Credit card
• Stock market
• Geography

Feature Extraction

• Images and video: all pixels; edge detection (Canny, Canny-Deriche, Differential, Sobel, Prewitt, Roberts Cross); corner detection (Harris operator, Shi and Tomasi, level curve curvature, SUSAN, FAST); blob detection (Laplacian of Gaussian (LoG), Difference of Gaussians (DoG), Determinant of Hessian (DoH), maximally stable extremal regions, PCBR); ridge detection; Hough transform; Chord Competition; structure tensor; template matching
• Voice: Zero Crossings with Peak Amplitudes (ZCPA), Mel frequency cepstral coefficients (MFCCs), perceptually based linear prediction analysis (PLP), generalized synchrony detector (GSD), ensemble interval histogram (EIH), Linear Predictive Cepstral (LPC), RastaPLP, Subband-based Periodicity and Aperiodicity DEcomposition (SPADE)
• Handwritten: intersection, shadow feature, chain code histogram, line fitting, curvature, horizontal and vertical histogram, topological features, parameters of polynomials, contour information
• Bioinformatic: …
• Natural language: …
• Records: …

Feature Selection and Preparation

• Standardization
• Normalization
• Global data matrix normalization
• Selection with Gram-Schmidt orthogonalization
• Ranking with the Relief score
• Ranking with Random Forests
• Ranking with the signal-to-noise ratio
• Ranking with recursive feature elimination using an SVC classifier

Dimensionality Reduction

• Principal component analysis
• Semidefinite embedding
• Multifactor dimensionality reduction
• Multilinear subspace learning
• Nonlinear dimensionality reduction
• Isomap
• Kernel PCA
• Multilinear PCA
• Latent semantic analysis
• Partial least squares
• Independent component analysis
• Autoencoder

Machine Learning

• Kernel Ridge Regression
• Naive Bayes
• Neural Network
• Random Forest
• Support Vector Machine
• Bayesian method
• Gaussian Processes
• Relevance Vector Machine
• Graphical Models
• Hidden Markov Model
• Mixture Models and EM
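Two of the preparation steps named in the big picture, standardization and normalization, can be sketched in a few lines. The snippet assumes that "standardization" means rescaling to zero mean and unit standard deviation and that "normalization" means min-max scaling onto [0, 1]; other conventions exist, and the feature values are hypothetical.

```python
import statistics

def standardize(column):
    """Rescale a feature column to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)      # assumes the column is not constant
    return [(v - mean) / std for v in column]

def min_max_normalize(column):
    """Rescale a feature column linearly onto the interval [0, 1]."""
    lo, hi = min(column), max(column)    # assumes hi > lo
    return [(v - lo) / (hi - lo) for v in column]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]   # hypothetical feature values
z = standardize(heights_cm)
u = min_max_normalize(heights_cm)
print(z)
print(u)
```

Either transform is applied per feature (per column of the data matrix), so that features measured in different units become comparable before selection, dimensionality reduction or learning.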










Types of learning

Supervised learning

• Learning a mapping between input x and desired output y
• A teacher provides the y's for learning purposes

Unsupervised learning

• Learning relations between data components
• No specific outputs are given by a teacher

Reinforcement learning

• Learning a mapping between input x and desired output y
• A critic does not provide the y's but instead a signal (reinforcement) of how good the answer was

Other types of learning:

• Concept learning, explanation-based learning, etc.
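The difference between the first two settings can be made concrete on one small data set. This is a minimal sketch under assumptions made here: the measurements are invented, the supervised learner is a per-class mean, and the unsupervised learner is 2-means clustering with a fixed initialisation.

```python
# Hypothetical 1-D measurements; values invented for illustration.
xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
ys = ["a", "a", "a", "b", "b", "b"]      # outputs a teacher would supply

def fit_supervised(xs, ys):
    """Supervised: use the teacher's labels to compute one mean per class."""
    means = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(members) / len(members)
    return means

def fit_unsupervised(xs, iterations=10):
    """Unsupervised: 2-means clustering; the labels are never consulted."""
    centers = [min(xs), max(xs)]         # simple deterministic initialisation
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        centers = [sum(g) / len(g) for g in groups]
    return centers

print(fit_supervised(xs, ys))    # class means learned from (x, y) pairs
print(fit_unsupervised(xs))      # cluster centres found from the x values alone
```

On this data both learners recover the same two groups, but only the supervised one knows which group is called "a" and which "b", because only it was given the y's.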












Optimization










Optimization has been used in many machine learning models, such as:

• Perceptron
• Neural network
• Support vector machine
• HMM
• Density estimation
• …
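As one concrete case from the list above, the perceptron's training rule can be read as stochastic gradient descent on the perceptron criterion, the sum of -y (w·x + b) over misclassified samples. The sketch below assumes labels in {-1, +1}, a learning rate of 1 and the logical AND function as a small linearly separable data set; all of these are choices made here for illustration.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Perceptron rule: on each misclassified sample, move (w, b) towards y * (x, 1)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in samples:
            activation = w[0] * x1 + w[1] * x2 + b
            if y * activation <= 0:          # on the wrong side of (or on) the boundary
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
                mistakes += 1
        if mistakes == 0:                    # every sample is classified correctly
            break
    return w, b

# Logical AND with labels in {-1, +1}.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
predictions = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for (x1, x2), _ in data]
print(w, b, predictions)
```

Because the data are linearly separable, the loop stops after a finite number of updates with all four points on the correct side of the line w·x + b = 0.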






Nonlinear Optimization


Examples















Steepest descent

• The gradient is everywhere perpendicular to the contour lines.
• After each line minimization the new gradient is always orthogonal to the previous step direction (true of any line minimization).
• Consequently, the iterates tend to zig-zag down the valley in a very inefficient manner.
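The orthogonality and zig-zag behaviour described above can be checked numerically. The sketch below assumes a simple elongated quadratic, f(x) = 0.5 (x1² + 10 x2²), for which the exact line minimum along the negative gradient has a closed form; the function and the starting point are choices made here.

```python
# Minimise f(x) = 0.5 * (x1^2 + 10 * x2^2), an elongated quadratic "valley".
A = [[1.0, 0.0], [0.0, 10.0]]            # Hessian of f (assumed example)

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def gradient(x):
    return [dot(A[0], x), dot(A[1], x)]

def steepest_descent(x, steps):
    """Steepest descent with exact line minimisation along the negative gradient."""
    path, grads = [x], [gradient(x)]
    for _ in range(steps):
        g = grads[-1]
        Ag = [dot(A[0], g), dot(A[1], g)]
        alpha = dot(g, g) / dot(g, Ag)   # exact step length for a quadratic
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
        path.append(x)
        grads.append(gradient(x))
    return path, grads

path, grads = steepest_descent([10.0, 1.0], steps=20)
for g_old, g_new in zip(grads, grads[1:4]):
    print(dot(g_old, g_new))             # close to zero: consecutive gradients are orthogonal
print(path[:4])                          # x2 changes sign at every step: the zig-zag
print(path[-1])                          # still short of the minimiser (0, 0) after 20 steps
```

With this starting point the error shrinks by the same constant factor at every step, which is why the method needs many iterations even on a two-variable quadratic.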



Conjugate gradient

Each $p_k$ is chosen to be conjugate to all previous search directions with respect to the Hessian $H$:

$$p_i^\top H\, p_j = 0, \qquad i \neq j$$





The resulting search directions are mutually linearly independent.
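A minimal sketch of the linear conjugate gradient iteration for a quadratic f(x) = 0.5 xᵀAx - bᵀx, whose Hessian is A. The matrix, the right-hand side and the starting point are assumptions chosen here; the update formula for the new direction is the standard Fletcher-Reeves form.

```python
A = [[4.0, 1.0], [1.0, 3.0]]     # symmetric positive definite Hessian (assumed example)
b = [1.0, 2.0]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def conjugate_gradient(x, iterations):
    """Linear conjugate gradient for f(x) = 0.5 x^T A x - b^T x."""
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]    # residual = negative gradient
    p = r[:]
    directions = []
    for _ in range(iterations):
        Ap = matvec(A, p)
        alpha = dot(r, r) / dot(p, Ap)                  # exact line minimisation along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r_new = [ri - alpha * api for ri, api in zip(r, Ap)]
        directions.append(p)
        beta = dot(r_new, r_new) / dot(r, r)
        p = [rn + beta * pi for rn, pi in zip(r_new, p)]  # next direction, conjugate to the earlier ones
        r = r_new
    return x, directions

x, (p0, p1) = conjugate_gradient([2.0, 1.0], iterations=2)
print(x)                          # the minimiser, reached after n = 2 steps
print(dot(p0, matvec(A, p1)))     # close to zero: p0 and p1 are conjugate with respect to A
```

For an n-variable quadratic the method terminates in at most n steps in exact arithmetic, in contrast to the slow zig-zag of steepest descent.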



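Newton method

Expand $f(x)$ by its Taylor series about the point $x_k$ and minimize the resulting second-order (quadratic) approximation to obtain the next iterate.

It is fast, but Newton's method requires computing the Hessian matrix at each iteration; this is not always feasible.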
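Minimizing the second-order Taylor model $f(x_k) + g_k^\top \delta + \tfrac{1}{2}\delta^\top H_k \delta$ over $\delta$ gives the Newton step $\delta = -H_k^{-1} g_k$, where $g_k$ and $H_k$ denote the gradient and Hessian at $x_k$ (notation assumed here). The sketch below applies this to a small convex test function, f(x, y) = exp(x + y) + x² + y², which is an example chosen for illustration.

```python
import math

def grad(x, y):
    e = math.exp(x + y)
    return [e + 2 * x, e + 2 * y]

def hessian(x, y):
    e = math.exp(x + y)
    return [[e + 2, e], [e, e + 2]]

def newton(x, y, iterations=8):
    """Pure Newton iteration: solve H_k * step = -g_k, then x_{k+1} = x_k + step."""
    for _ in range(iterations):
        g = grad(x, y)
        (a, b), (c, d) = hessian(x, y)
        det = a * d - b * c
        # Solve the 2x2 linear system with Cramer's rule.
        step_x = (-g[0] * d + b * g[1]) / det
        step_y = (-a * g[1] + c * g[0]) / det
        x, y = x + step_x, y + step_y
    return x, y

x, y = newton(1.0, 1.0)
print(x, y, grad(x, y))    # the gradient is numerically zero at the minimiser
```

Each iteration needs the full Hessian and a linear solve, which is the cost the slide refers to; in two variables this is trivial, in thousands of variables it may not be.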


Quasi-Newton methods

• If the problem size is large and the Hessian matrix is dense, then it may be infeasible or inconvenient to compute it directly.
• Quasi-Newton methods avoid this problem by keeping a “rolling estimate” of H(x), updated at each iteration using new gradient information.
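One widely used way to maintain such a rolling estimate is the BFGS update; the slides do not name a specific update, so its use here is an assumption. The sketch keeps a 2x2 estimate B of the Hessian, starts it at the identity, and uses a simple backtracking line search on the test function f(x, y) = exp(x + y) + x² + y² (also chosen here).

```python
import math

def f(x):
    return math.exp(x[0] + x[1]) + x[0] ** 2 + x[1] ** 2

def grad(x):
    e = math.exp(x[0] + x[1])
    return [e + 2 * x[0], e + 2 * x[1]]

def solve2(B, rhs):
    """Solve the 2x2 system B p = rhs with Cramer's rule."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return [(rhs[0] * B[1][1] - B[0][1] * rhs[1]) / det,
            (B[0][0] * rhs[1] - B[1][0] * rhs[0]) / det]

def bfgs(x, iterations=100, tol=1e-6):
    B = [[1.0, 0.0], [0.0, 1.0]]             # rolling estimate of the Hessian
    g = grad(x)
    for _ in range(iterations):
        if math.hypot(g[0], g[1]) < tol:
            break
        p = solve2(B, [-g[0], -g[1]])        # quasi-Newton search direction
        slope = g[0] * p[0] + g[1] * p[1]
        t = 1.0
        for _halving in range(40):           # backtracking (Armijo) line search
            x_new = [x[0] + t * p[0], x[1] + t * p[1]]
            if f(x_new) <= f(x) + 1e-4 * t * slope:
                break
            t *= 0.5
        g_new = grad(x_new)
        s = [x_new[0] - x[0], x_new[1] - x[1]]
        y = [g_new[0] - g[0], g_new[1] - g[1]]
        sy = s[0] * y[0] + s[1] * y[1]
        if sy > 0:                           # update only if it keeps B positive definite
            Bs = [B[0][0] * s[0] + B[0][1] * s[1], B[1][0] * s[0] + B[1][1] * s[1]]
            sBs = s[0] * Bs[0] + s[1] * Bs[1]
            for i in range(2):
                for j in range(2):
                    B[i][j] += y[i] * y[j] / sy - Bs[i] * Bs[j] / sBs
        x, g = x_new, g_new
    return x

x_min = bfgs([1.0, 1.0])
print(x_min, grad(x_min))
```

Only gradients are evaluated; the curvature information in B is accumulated from the differences between successive gradients, which is the "new gradient information" mentioned above.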


Levenberg-Marquardt method

This is another useful method when the main problem has a least-squares subproblem.
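A minimal sketch of the Levenberg-Marquardt idea on a small curve-fitting problem. Everything specific here is an assumption: the model a·exp(-b·t), the noise-free synthetic data, the starting guess, and the simple rule of multiplying or dividing the damping parameter by 10.

```python
import math

# Noise-free synthetic data from y = 2 * exp(-1 * t); model and values are assumptions.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-t) for t in ts]

def residuals(a, b):
    return [y - a * math.exp(-b * t) for t, y in zip(ts, ys)]

def cost(a, b):
    return sum(r * r for r in residuals(a, b))

def levenberg_marquardt(a, b, iterations=50, lam=1e-3):
    for _ in range(iterations):
        r = residuals(a, b)
        # Columns of the Jacobian of the model a*exp(-b*t) with respect to a and b.
        Ja = [math.exp(-b * t) for t in ts]
        Jb = [-a * t * math.exp(-b * t) for t in ts]
        jaa = sum(u * u for u in Ja)
        jab = sum(u * v for u, v in zip(Ja, Jb))
        jbb = sum(v * v for v in Jb)
        ga = sum(u * ri for u, ri in zip(Ja, r))
        gb = sum(v * ri for v, ri in zip(Jb, r))
        # Damped normal equations: (J^T J + lam * I) step = J^T r, solved by Cramer's rule.
        det = (jaa + lam) * (jbb + lam) - jab * jab
        da = (ga * (jbb + lam) - jab * gb) / det
        db = ((jaa + lam) * gb - jab * ga) / det
        if cost(a + da, b + db) < cost(a, b):
            a, b, lam = a + da, b + db, lam / 10   # good step: behave more like Gauss-Newton
        else:
            lam *= 10                              # bad step: behave more like gradient descent
    return a, b

a_fit, b_fit = levenberg_marquardt(1.0, 0.5)
print(a_fit, b_fit)    # approaches the generating parameters a = 2, b = 1
```

The damping parameter lam interpolates between a Gauss-Newton step (lam small) and a short gradient-descent step (lam large), and a step is only accepted if it lowers the sum of squared residuals.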


Sequential quadratic programming


Penalty methods


Constrained Optimization


Example
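As a minimal illustration of a penalty method on a constrained problem, consider minimizing x² + y² subject to x + y = 1. This toy problem is an assumption made here, not necessarily the example used in the lecture. The constraint is replaced by a quadratic penalty term mu·(x + y - 1)², and the unconstrained minimizer is computed for increasing mu.

```python
def penalised_minimiser(mu):
    """Minimise x^2 + y^2 + mu * (x + y - 1)^2 by solving its 2x2 stationarity system."""
    # Setting the gradient to zero gives:
    #   (1 + mu) x + mu y = mu   and   mu x + (1 + mu) y = mu.
    a, b, rhs = 1.0 + mu, mu, mu
    det = a * a - b * b
    x = (rhs * a - b * rhs) / det
    y = (a * rhs - b * rhs) / det
    return x, y

for mu in [1.0, 10.0, 100.0, 1000.0, 1e6]:
    x, y = penalised_minimiser(mu)
    print(mu, x, y, x + y - 1.0)    # the constraint violation shrinks as mu grows
```

The penalized minimizers approach the constrained solution (0.5, 0.5) only in the limit of large mu, which is the usual trade-off of penalty methods: a large penalty weight gives accuracy but makes the unconstrained problem badly conditioned.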









Matlab


Minimization Problems





Least-Squares (Model-Fitting) Problems





Optimization problems can be:

Simple                           Hard
Few decision variables           Many decision variables
Differentiable                   Discontinuous, combinatorial
Single modal                     Multi modal
Objective easy to calculate      Objective difficult to calculate
No or light constraints          Severe constraints
Feasibility easy to determine    Feasibility difficult to determine
Single objective                 Multiple objectives
Deterministic                    Stochastic


Meta-Heuristics

• Meta-heuristics are not tied to any special problem type; they are general methods that can be altered to fit the specific problem.
• Combinatorial optimization consists of finding an optimal object from a finite set of objects, as in the TSP and minimum cover problems.
• The inspiration methods such as:







Advantages

• Very flexible
• Often global optimizers
• Often robust to problem size, problem instance and random variables
• May be the only practical alternative

Disadvantages

• Often need problem-specific information / techniques
• Optimality (convergence) may not be guaranteed
• Lack of theoretical basis
• Different searches may yield different solutions to the same problem (stochastic)
• Stopping criteria
• Multiple search parameters
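Several of the points above can be seen in a small meta-heuristic applied to the TSP. The sketch uses simulated annealing, which is one possible choice and is not named in the slides; the city coordinates, the cooling schedule and the step budget are all hypothetical search parameters picked for illustration.

```python
import math
import random

# Hypothetical city coordinates, invented for illustration.
cities = [(0, 0), (3, 1), (6, 0), (7, 4), (4, 6), (1, 5), (2, 2), (5, 3)]

def tour_length(tour):
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def simulated_annealing(seed, steps=5000, t_start=10.0, cooling=0.999):
    rng = random.Random(seed)
    tour = list(range(len(cities)))
    rng.shuffle(tour)                            # random starting tour
    best = tour[:]
    temperature = t_start
    for _ in range(steps):                       # stopping criterion: fixed step budget
        i, j = sorted(rng.sample(range(len(tour)), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # reverse one segment
        delta = tour_length(candidate) - tour_length(tour)
        # Always accept improvements; accept worse tours with probability exp(-delta / T).
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            tour = candidate
            if tour_length(tour) < tour_length(best):
                best = tour[:]
        temperature *= cooling
    return best, tour_length(best)

for seed in (0, 1, 2):
    print(seed, simulated_annealing(seed))       # different seeds may return different tours
```

The method needs nothing from the problem except a way to score and perturb a candidate, which is the flexibility listed under advantages; the seed, step budget, starting temperature and cooling rate are the kind of search parameters and stopping criteria listed under disadvantages.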




















Independent