Machine Learning Tutorial


Oct 16, 2013



Amit Gruber

The Hebrew University of Jerusalem

Example: Spam Filter

Spam message: unwanted email

Dozens or even hundreds per day

Goal: Automatically distinguish
between spam and non-spam email

Spam message 1

Spam message 2

Spam message 3

Spam message 4

How to Distinguish ?

Message contents ?

Automatic semantic analysis is yet to be solved

Message sender ?

What about unfamiliar senders or fake
senders ?

Collection of keywords ?

Message Length ?

Mail server ? Time of delivery ?

How to Distinguish ?

It’s hard to define an explicit set of rules to
distinguish between spam and non-spam

Learn the concept of “spam” from
examples !
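Learning "spam" from examples can be sketched with a small naive Bayes word-count classifier. This is only one possible learner, chosen here for brevity; the training messages, the labels and the function names below are invented for illustration and do not come from the slides.

```python
import math
from collections import Counter

# Tiny hand-made training set (hypothetical messages and labels).
TRAIN = [
    ("win money now", "spam"),
    ("cheap pills win prize", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
    ("project meeting notes attached", "ham"),
]

def train_naive_bayes(examples):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total)  # class prior
        n_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train_naive_bayes(TRAIN)
print(classify("win free money", model))
print(classify("team meeting tomorrow", model))
```

No rule about any particular word was written by hand; the word statistics of the labeled examples decide the outcome.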

Example: Gender Classification

The Power of Learning:

Real Life example

How much time does it take you to get to
work ?

First approach: Analyze your route

Distance, traffic lights, traffic, etc…

Can be quite complicated…

Second Approach: how much time does it
usually take ?

Despite some variance, works remarkably well!

Requires “training” for different times

May fail in special cases
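The second approach amounts to averaging past observations, with separate "training" per departure time. A minimal sketch, assuming a hypothetical commute log of (departure hour, minutes) pairs; the numbers are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical commute log: (departure hour, minutes taken).
history = [(8, 42), (8, 38), (8, 40), (10, 25), (10, 27), (17, 50), (17, 46)]

def train(log):
    """Average the observed travel time separately for each departure hour."""
    by_hour = defaultdict(list)
    for hour, minutes in log:
        by_hour[hour].append(minutes)
    return {hour: mean(times) for hour, times in by_hour.items()}

def predict(model, hour):
    """Use the hour's average; fall back to the overall average for unseen hours."""
    if hour in model:
        return model[hour]
    return mean(list(model.values()))

commute_model = train(history)
print(predict(commute_model, 8))
print(predict(commute_model, 12))
```

The fallback for an unseen hour is exactly the kind of special case where such a predictor may fail.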

Machine Translation

Collaborative Filtering

Collaborative Filtering: Prediction of user
ratings based on the ratings of other users


Movie ratings

Product recommendation
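One simple way to predict a rating from other users' ratings is a similarity-weighted average. The sketch below uses cosine similarity over co-rated movies; the users, movies and ratings are hypothetical, and this is an illustration rather than the method used by any particular system.

```python
import math

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2, "D": 5},
    "cat": {"A": 1, "B": 2, "C": 5, "D": 1},
}

def similarity(u, v):
    """Cosine similarity between two users over the movies both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][m] * ratings[v][m] for m in common)
    norm_u = math.sqrt(sum(ratings[u][m] ** 2 for m in common))
    norm_v = math.sqrt(sum(ratings[v][m] ** 2 for m in common))
    return dot / (norm_u * norm_v)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for the movie."""
    num = den = 0.0
    for other in ratings:
        if other == user or movie not in ratings[other]:
            continue
        weight = similarity(user, other)
        num += weight * ratings[other][movie]
        den += weight
    return num / den if den else None

print(predict("ann", "D"))
```

Because ann's ratings resemble bob's more than cat's, the prediction for movie D is pulled toward bob's rating.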

Is this of merely theoretical interest ??

Netflix Prize

Over 100 million ratings from 480 thousand customers over
17000 movie titles (sparsity: 0.0123)
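The figure in parentheses can be checked with a line of arithmetic. Using the rounded counts quoted above, it is the fraction of the customer-by-movie rating matrix that is actually filled in:

```python
ratings = 100_000_000   # "over 100 million ratings"
customers = 480_000     # "480 thousand customers"
movies = 17_000         # "17000 movie titles"

# Fraction of all possible (customer, movie) pairs that have a rating.
density = ratings / (customers * movies)
print(round(density, 4))  # 0.0123
```

In other words, roughly 99% of the entries are missing, and those are the ones the system has to predict.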

Recommendation system

Machine Learning Applications

Search Engines

Collaborative Filtering (Netflix, Amazon)

Face, speech and pattern Recognition

Machine Translation

Natural language processing

Medical diagnosis and treatment


Computer games

Many more !

Generalization: Train vs. Test

The central assumption we make is that
the train set and the new examples are similar

Formally, the assumption is that samples
are drawn from the same distribution

Is this assumption realistic ?

Train vs. Test:

Might Fail to Generalize
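A small simulation illustrates why the same-distribution assumption matters. The setup is entirely hypothetical: two classes drawn from one-dimensional Gaussians, a threshold classifier placed midway between the class means, and a second test set whose distribution has been shifted.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample(n, shift=0.0):
    """Draw n points per class: class 0 around 0+shift, class 1 around 4+shift."""
    data = [(rng.gauss(shift, 1.0), 0) for _ in range(n)]
    data += [(rng.gauss(4.0 + shift, 1.0), 1) for _ in range(n)]
    return data

def fit_threshold(data):
    """Place the decision threshold midway between the two class means."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(data, threshold):
    """Fraction of points whose side of the threshold matches their label."""
    correct = sum((x > threshold) == (y == 1) for x, y in data)
    return correct / len(data)

threshold = fit_threshold(sample(500))
acc_same = accuracy(sample(500), threshold)                # same distribution
acc_shifted = accuracy(sample(500, shift=3.0), threshold)  # shifted distribution
print(acc_same, acc_shifted)
```

The classifier does well on new data drawn like the train set and noticeably worse on the shifted data, although nothing about the classifier itself changed.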

Acquiring a good train set

Have a huge train set

Train data might be available on the web

Use humans to collect data

Collect results (or aggregations thereof) of
user actions

Unsupervised methods: require only raw
data, no need for labels !

Machine Learning Strategies

Discriminative Approach

Feature selection: find the features that carry
the most information for separation

Generative Approach

Model the data using a generative process

Estimate the parameters of the model

Supervised vs. Unsupervised

Supervised Machine Learning

Classification (learning)

Collection of large representative train set
might not be simple

Unsupervised Machine Learning


Clustering

The number of clusters may be known or unknown

Usually plenty of train data is available
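As an illustration of unsupervised learning with a known number of clusters, here is a minimal k-means sketch on one-dimensional data. k-means is one common choice and is not named in the slides; the points and the initial centers are hypothetical.

```python
def kmeans(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]   # raw data, no labels
centers, clusters = kmeans(points, centers=[0.0, 6.0])
print(centers)
print(clusters)
```

No labels are used at any point; the grouping comes from the data alone.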

Discriminative Learning

Data representation and Feature selection: What
is relevant for classification ?

Gender classification: hair, ears, make up, beard,
moustache, etc.

Linear Separation

SVM, Fisher LDA, Perceptron and more

Different criteria for separation: what would
generalize well ?
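Of the linear methods listed, the perceptron is the shortest to write down. The sketch below trains one on a hypothetical, linearly separable 2D data set with labels +1 and -1; it finds some separating line, not necessarily the one that generalizes best (that is where criteria such as the SVM margin come in).

```python
def train_perceptron(samples, epochs=20):
    """Classic perceptron rule: on each mistake, move the line toward the sample."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in samples:
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified or on the line
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:  # every sample is on the correct side
            break
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the line w.x + b = 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

# Hypothetical training points.
samples = [
    ((2.0, 3.0), 1), ((3.0, 3.0), 1), ((3.0, 4.0), 1),
    ((-1.0, -1.0), -1), ((-2.0, -1.5), -1), ((0.0, -2.0), -1),
]
w, b = train_perceptron(samples)
print(w, b)
```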

linear separation

Linear Separation

Nonlinear Separation

(Kernel Trick)
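The kernel trick can be illustrated with a kernel perceptron on XOR-like data, which no straight line separates. The sketch assumes a degree-2 polynomial kernel k(x, z) = (x·z)^2 and four hypothetical points; the classifier only ever evaluates the kernel and never builds the higher-dimensional features explicitly.

```python
def poly_kernel(x, z):
    """Degree-2 polynomial kernel: squared dot product."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def decision(x, samples, alphas, kernel):
    """Kernel expansion: sum over training points of alpha_i * y_i * k(x_i, x)."""
    return sum(a * y_i * kernel(x_i, x) for a, (x_i, y_i) in zip(alphas, samples))

def train_kernel_perceptron(samples, kernel, epochs=10):
    """Count, per training point, how often it was misclassified."""
    alphas = [0] * len(samples)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(samples):
            if y * decision(x, samples, alphas, kernel) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alphas

# XOR-like data: the label is the sign of x1 * x2.
xor_samples = [((1, 1), 1), ((-1, -1), 1), ((1, -1), -1), ((-1, 1), -1)]
alphas = train_kernel_perceptron(xor_samples, poly_kernel)
print(alphas)
```

With this kernel the decision function is a quadratic in the original coordinates, which is enough to separate the two diagonals.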

Generative Approach

Model the observations using a generative process

The generative process induces a
distribution over the observations

Learn a set of parameters
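A minimal sketch of the generative recipe, assuming each class generates a one-dimensional feature from its own Gaussian. The parameters (mean, variance, class prior) are estimated from hypothetical labeled data, and a new point is assigned to the class under which it is most probable.

```python
import math
from statistics import mean, pvariance

# Hypothetical labeled observations of a single feature.
data_by_class = {
    "a": [1.0, 1.2, 0.8, 1.1, 0.9],
    "b": [3.0, 3.3, 2.7, 3.1, 2.9],
}

def fit(data):
    """Estimate (mean, variance, prior) for each class."""
    total = sum(len(xs) for xs in data.values())
    return {label: (mean(xs), pvariance(xs), len(xs) / total)
            for label, xs in data.items()}

def log_joint(x, params):
    """log P(class) + log N(x | mean, variance)."""
    mu, var, prior = params
    return (math.log(prior)
            - 0.5 * math.log(2 * math.pi * var)
            - (x - mu) ** 2 / (2 * var))

def classify(x, model):
    """Pick the class with the highest joint log probability."""
    return max(model, key=lambda label: log_joint(x, model[label]))

gaussian_model = fit(data_by_class)
print(classify(1.05, gaussian_model))
print(classify(2.8, gaussian_model))
```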

Statistical Approach

Real Life

You’re stuck in traffic. Which lane is faster ?

The complicated approach:

Consider the traffic, trucks, merging lanes, etc.

The statistical (Bayesian) Approach:

Which lane is usually faster ? (prior)

What are you seeing ? (evidence)
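Combining prior and evidence is Bayes' rule: the posterior is proportional to prior times likelihood. The numbers below are invented for illustration: the left lane is usually faster (prior 0.7), but you currently see a truck in it, which is assumed to be less likely when the left lane really is the faster one.

```python
# Prior: which lane is usually faster? (hypothetical values)
prior = {"left": 0.7, "right": 0.3}

# Likelihood of the evidence "a truck is in the left lane"
# given which lane is actually faster (hypothetical values).
likelihood = {"left": 0.2, "right": 0.6}

# Bayes' rule: multiply prior by likelihood, then normalize.
unnormalized = {lane: prior[lane] * likelihood[lane] for lane in prior}
evidence = sum(unnormalized.values())
posterior = {lane: value / evidence for lane, value in unnormalized.items()}

print(posterior)
```

With these numbers the evidence outweighs the prior, and the right lane becomes the better bet.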


Machine Learning: Learn a concept from examples

For good generalization, train data has to
faithfully represent test data

Many potential applications

Already in use and works remarkably well