Dense Object Recognition


4054 Machine Vision


Dense Object Recognition


Dr. Simon Prince

Dept. Computer Science

University College London

http://www.cs.ucl.ac.uk/s.prince/4054.htm

Introduction

1. Individual pixels: measurement at single pixels; pixel models not connected
2. Markov Random Fields: pixels connected to 4 neighbours
3. Textures: models larger regions
4. Dense Object Recognition: larger regions still; spatially varying structure

Modelling Textures

Dense Object Recognition

1. Introduction to Object Recognition
2. Template Matching
3. Mixtures of Templates
(INTERLUDE: Gaussian and Matrix Identities)
4. Subspace Models: factor analysis
5. Known objects across pose and illumination
6. Finding objects under partial occlusion
7. Relationship to non-probabilistic methods

1. Introduction to Object Recognition

What is an object?

[Figure labels: perceptible, vision, material, thing]

Regardless of exact definition, there are three things we can say for sure:

1. There are a very large number of object categories

The number of object categories can be estimated by selecting a large dictionary and counting the number of object nouns. Based on this type of methodology it has been estimated that there are between 10,000 and 30,000 object categories (Biederman, 1987).

2. Objects form groups

Objects fall into various hierarchies, based on refinements of their descriptions (household goods, kettles, plastic kettles), and by other criteria (for example, all living creatures are organized into a taxonomy).

[Figure: hierarchy of object categories]

OBJECTS
  ANIMALS
    VERTEBRATE
      MAMMALS: BOAR, TAPIR
      BIRDS: GROUSE
    …
  PLANTS
  INANIMATE
    MAN-MADE: CAMERA
    NATURAL
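The grouping of categories above can be written down as a small tree. The sketch below is only an illustration: the exact nesting is an assumption read off the figure labels, and the names `HIERARCHY` and `path_to` are hypothetical, not part of the lecture.

```python
# Hypothetical reconstruction of the slide's tree; the nesting is assumed.
HIERARCHY = {
    "OBJECTS": {
        "ANIMALS": {"VERTEBRATE": {"MAMMALS": {"BOAR": {}, "TAPIR": {}},
                                   "BIRDS": {"GROUSE": {}}}},
        "PLANTS": {},
        "INANIMATE": {"MAN-MADE": {"CAMERA": {}}, "NATURAL": {}},
    }
}


def path_to(category, tree=HIERARCHY, prefix=()):
    """Return the chain of groups from the root down to `category`, or None."""
    for name, children in tree.items():
        here = prefix + (name,)
        if name == category:
            return list(here)
        found = path_to(category, children, here)
        if found is not None:
            return found
    return None


print(path_to("TAPIR"))
```

Each step down the returned path is one refinement of the description, in the same sense as household goods, kettles, plastic kettles.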

3. Objects have parts

Objects can often be broken down into smaller constituent parts, which may or may not be necessary in a given instance. For example, faces certainly have eyes, mouths and noses, but only optionally have beards.

What is object recognition?

Object recognition concerns a number of tasks:

Verification: is that a lamp?
Detection: are there people?
Identification: is that Potala Palace?
Object categorization: mountain, building, tree, banner, vendor, people, street lamp
Scene and context categorization: outdoor, city, …

What is object recognition good for?

Computational photography
Assisted driving: lane detection; pedestrian and car detection; collision warning systems with adaptive cruise control; lane departure warning systems; rear object detection systems
[Figure labels: Ped, Ped, Car; axes in meters]
Improving online search (example query: STREET)
Organizing photo collections

Why is object recognition hard?

Viewpoint variation (image: Michelangelo, 1475-1564)
Illumination changes (slide credit: S. Ullman)
Occlusion (image: Magritte, 1957)
Scale
Deformation (image: Xu, Beihong 1943)
Background clutter (image: Klimt, 1913)
Intra-class variation


History: early object categorization

Most early work focused on common and useful categories such as faces, digits and cars.

Turk and Pentland, 1991
Belhumeur, Hespanha, & Kriegman, 1997
Schneiderman & Kanade, 2004
Viola and Jones, 2000
Amit and Geman, 1999
LeCun et al., 1998
Belongie and Malik, 2002
Agarwal and Roth, 2002
Poggio et al., 1993

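Object categorization: the statistical viewpoint

$p(\text{zebra} \mid \text{image})$ vs. $p(\text{no zebra} \mid \text{image})$

Bayes rule:

$$\underbrace{\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})}}_{\text{posterior ratio}} = \underbrace{\frac{p(\text{image} \mid \text{zebra})}{p(\text{image} \mid \text{no zebra})}}_{\text{likelihood ratio}} \cdot \underbrace{\frac{p(\text{zebra})}{p(\text{no zebra})}}_{\text{prior ratio}}$$

Discriminative methods model the posterior
Generative methods model the likelihood and prior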
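As a numerical illustration of the ratio form of Bayes' rule above, the sketch below multiplies a likelihood ratio by a prior ratio and converts the result to a posterior probability. The numbers and the function names `posterior_ratio` and `posterior_probability` are made up for illustration and are not from the lecture.

```python
def posterior_ratio(lik_zebra, lik_no_zebra, prior_zebra):
    """Return p(zebra | image) / p(no zebra | image) via Bayes' rule."""
    likelihood_ratio = lik_zebra / lik_no_zebra
    prior_ratio = prior_zebra / (1.0 - prior_zebra)
    return likelihood_ratio * prior_ratio


def posterior_probability(ratio):
    """Convert a posterior ratio r into p(zebra | image) = r / (1 + r)."""
    return ratio / (1.0 + ratio)


# Hypothetical numbers: the image is 20x more likely under the zebra model,
# but zebras are rare a priori (1 image in 100).
ratio = posterior_ratio(lik_zebra=0.02, lik_no_zebra=0.001, prior_zebra=0.01)
print(ratio, posterior_probability(ratio))
```

With these assumed values the posterior ratio stays below 1: a likelihood ratio of 20 is not enough to overcome a prior ratio of 1/99, so "no zebra" remains the more probable answer.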


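Discriminative

Direct modeling of $\frac{p(\text{zebra} \mid \text{image})}{p(\text{no zebra} \mid \text{image})}$
[Figure: decision boundary separating zebra from non-zebra examples]

Generative

Model $p(\text{image} \mid \text{zebra})$ and $p(\text{image} \mid \text{no zebra})$
[Figure: example images with their likelihoods under each model: Low, Middle, High, Middle, Low]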
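The contrast between the two approaches can be sketched on toy data. This is an illustrative example only: the 1-D feature, the Gaussian class models and the logistic-regression fit are all assumptions chosen for brevity, not the models used in the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D feature per image (an assumption, e.g. amount of stripe texture).
x_zebra = rng.normal(loc=2.0, scale=1.0, size=200)
x_other = rng.normal(loc=-2.0, scale=1.0, size=200)
x = np.concatenate([x_zebra, x_other])
y = np.concatenate([np.ones(200), np.zeros(200)])  # 1 = zebra, 0 = no zebra


def gaussian_log_pdf(v, mean, var):
    return -0.5 * (np.log(2.0 * np.pi * var) + (v - mean) ** 2 / var)


# Generative: model p(x | class) and p(class), then apply Bayes' rule.
mean1, var1 = x_zebra.mean(), x_zebra.var()
mean0, var0 = x_other.mean(), x_other.var()
log_prior_ratio = np.log(y.mean() / (1.0 - y.mean()))


def generative_log_posterior_ratio(v):
    log_lik_ratio = gaussian_log_pdf(v, mean1, var1) - gaussian_log_pdf(v, mean0, var0)
    return log_lik_ratio + log_prior_ratio


# Discriminative: model p(class | x) directly with logistic regression,
# fitted by gradient ascent on the conditional log-likelihood.
w, b = 0.0, 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w += 0.1 * np.mean((y - p) * x)
    b += 0.1 * np.mean(y - p)


def discriminative_log_posterior_ratio(v):
    return w * v + b  # log-odds of zebra vs. no zebra


gen_acc = np.mean((generative_log_posterior_ratio(x) > 0) == (y == 1))
disc_acc = np.mean((discriminative_log_posterior_ratio(x) > 0) == (y == 1))
print(gen_acc, disc_acc)
```

Both functions return a log posterior ratio, and the decision boundary is where it crosses zero. The generative route gets there through two class-conditional densities and a prior; the discriminative route never models how the data in each class are distributed.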

Three main issues

Representation: how to represent an object category
Learning: how to form the classifier, given training data
Recognition: how the classifier is to be used on novel data
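The three issues map naturally onto three functions. The sketch below is a deliberately simple stand-in, assuming a raw-pixel representation and a nearest-class-mean classifier; the function names and the tiny 2x2 "images" are hypothetical and chosen only to make the roles concrete.

```python
import numpy as np


def represent(image):
    """Representation: flatten the image into a vector of pixel values."""
    return np.asarray(image, dtype=float).ravel()


def learn(images, labels):
    """Learning: store the mean feature vector of each category."""
    feats = np.stack([represent(im) for im in images])
    labels_arr = np.asarray(labels)
    return {c: feats[labels_arr == c].mean(axis=0) for c in sorted(set(labels))}


def recognize(model, image):
    """Recognition: return the category whose mean is closest to the image."""
    f = represent(image)
    return min(model, key=lambda c: np.sum((f - model[c]) ** 2))


# Hypothetical 2x2 "images": bright ones labelled "lamp", dark ones "night".
train = [np.full((2, 2), 0.9), np.full((2, 2), 0.8),
         np.full((2, 2), 0.1), np.full((2, 2), 0.2)]
labels = ["lamp", "lamp", "night", "night"]
model = learn(train, labels)
print(recognize(model, np.full((2, 2), 0.7)))
```

Swapping any one function (for example, a feature-based `represent`) leaves the other two roles unchanged, which is why the three issues can be discussed separately.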

Representation

Generative / discriminative / hybrid
Appearance only or location and appearance
Invariances: viewpoint, illumination, occlusion, scale, deformation, clutter, etc.
Part-based or global with sub-window
Use set of features or each pixel in image

Learning

Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference (hence current interest in machine learning)
Methods of training: generative vs. discriminative
What are you maximizing? Likelihood (generative) or performance on train/validation set (discriminative)
Level of supervision: manual segmentation; bounding box; image labels (e.g. "Contains a motorbike"); noisy labels
Batch/incremental (on category and image level; user feedback)
Training images: issue of overfitting; negative images for discriminative methods
Priors


Recognition

Scale / orientation range to search over
Speed
Context (Hoiem, Efros, Hebert, 2006)
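To see why the search range and speed are linked, one can count how many sub-windows a sliding-window search would visit. The sketch below assumes square windows, a fixed stride and a short list of scales; the function `count_windows` and all of its default values are hypothetical.

```python
def count_windows(image_w, image_h, base=24, scales=(1.0, 1.5, 2.0), stride=4):
    """Count square sub-windows visited by a sliding-window search."""
    total = 0
    for s in scales:
        size = int(round(base * s))
        if size > image_w or size > image_h:
            continue  # window does not fit at this scale
        nx = (image_w - size) // stride + 1  # horizontal positions
        ny = (image_h - size) // stride + 1  # vertical positions
        total += nx * ny
    return total


print(count_windows(320, 240))
print(count_windows(320, 240, stride=1))
```

Under these assumptions a 320x240 image already yields thousands of candidate windows at a stride of 4 and over a hundred thousand at a stride of 1. Each extra scale or orientation multiplies the work further, so the classifier applied to each window has to be cheap.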
