TRACKING and DETECTION in COMPUTER VISION - Chair for ...

TRACKING and DETECTION in COMPUTER VISION
Slobodan Ilić
Technische Universität München
Winter Semester 2009/2010
Intro: Tracking and Detection in Computer Vision Ilic Slobodan
A Picture’s Worth a Thousand Words!
“One of the Family”, Frederick Cotman, 1880

What is Computer Vision?
Human Vision (eyes and the visual cortex in the brain) discovers from images what objects are present in the scene, where they are, how they move and what their shape is.
Computer Vision (using cameras attached to computers) automatically interprets images, trying to understand their content in a way similar to human vision.

What is not Computer Vision?
• Image Processing - Takes an image and processes it to produce a new, more desirable image: image enhancement, image compression, image restoration.
• Pattern Recognition - Takes a pattern and classifies it into one of a predefined, finite set of classes.
• Computer Graphics - Synthesizes images using powerful algorithms so that they correspond as closely as possible to real images.

How did everything start?
Computer Vision started as a semester project at MIT in 1965.
The assumptions were very strong (blocks world) and the data were perfect, so it seemed to researchers to be an easy task.
However, the world is not perfect!

Why study Computer Vision?
• Intellectual curiosity: trying to mimic the most powerful human sense
• A number of industrial applications:
  • automation of industrial processes
  • medicine, diagnostics
  • entertainment: film and video games
  • security and surveillance
  • visualization and augmented reality
  • communication
  • human-computer interaction (HCI)
  • military and space research

The eye
• The retina measures about 5 × 5 cm and contains 10^8 sampling elements (rods, which sense brightness at low light intensity, e.g. for night vision, and cones, which sense color at higher light intensity).
• The eye’s spatial resolution is about 0.01° over a 150° field of view (the elements are not evenly spaced: there is a fovea and a peripheral region).
• Intensity resolution is about 11 bits/element; spectral resolution is about 2 bits/element (400–700 nm).
• Temporal resolution is about 100 ms (10 Hz).
• Two eyes give a data rate of about 3 GBytes/s!
• A large chunk of our brain is dedicated to processing the signals from our eyes.
[Figure: the eye, with the fovea labeled]
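The data rate quoted for the eyes can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration: it assumes that every sampling element delivers its intensity and spectral bits at the full temporal rate and that 1 GByte = 10^9 bytes. The function name is hypothetical.

```python
def raw_data_rate_bytes(elements, bits_per_element, rate_hz, sensors=1):
    """Raw data rate in bytes per second for `sensors` identical sensors."""
    bits_per_second = elements * bits_per_element * rate_hz * sensors
    return bits_per_second / 8


# Figures from the slide: 10^8 elements, 11 + 2 bits each, 10 Hz, two eyes.
eye_rate = raw_data_rate_bytes(
    elements=10**8, bits_per_element=11 + 2, rate_hz=10, sensors=2
)
print(f"two eyes: {eye_rate / 1e9:.2f} GBytes/s")  # two eyes: 3.25 GBytes/s
```

Under these assumptions the result is consistent with the "about 3 GBytes/s" stated above.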

The camera
• A typical digital SLR CCD measures about 24 × 16 mm and contains about 6 × 10^6 sampling elements (pixels).
• Intensity resolution is about 8 bits/pixel for each colour channel (RGB).
• Most computer vision applications work with monochrome images.
• Temporal resolution is about 40 ms (25 Hz); SNR is about 50 dB (Pulnix camera spec.).
• One camera gives a raw data rate of about 450 MBytes/s (color), or 150 MBytes/s (mono).
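The camera figures can be reproduced the same way. This is a rough sketch, assuming uncompressed frames, 8 bits per channel and 1 MByte = 10^6 bytes; the variable and function names are illustrative only.

```python
PIXELS = 6 * 10**6        # sampling elements on the sensor
BITS_PER_CHANNEL = 8      # intensity resolution per colour channel
FRAME_RATE_HZ = 25        # 40 ms per frame


def camera_rate_bytes(channels):
    """Raw data rate in bytes per second for the given number of channels."""
    return PIXELS * BITS_PER_CHANNEL * channels * FRAME_RATE_HZ / 8


color_rate = camera_rate_bytes(channels=3)  # RGB
mono_rate = camera_rate_bytes(channels=1)   # monochrome
print(f"color: {color_rate / 1e6:.0f} MBytes/s")  # color: 450 MBytes/s
print(f"mono:  {mono_rate / 1e6:.0f} MBytes/s")   # mono:  150 MBytes/s
```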

Should we copy biology?
• No! Human vision is a product of millions of years of evolution, created under different constraints.
• It consists of 60 billion neurons, heavily interconnected.
• The computers we have today cannot perform like a human brain.
We really do not understand how the brain works!
We need to try to understand the underlying principles rather than the particular implementation.

Image ambiguities
Image formation is a many-to-one mapping. It is a simple projection of the 3D object representation and does not say anything about depth.
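A small numerical illustration of the many-to-one mapping, assuming an ideal pinhole camera at the origin looking along the Z axis (the slide does not fix a camera model, so this is only one possible choice; the function name and the points are made up for the example). Two 3D points on the same viewing ray project to the same image point, so depth cannot be recovered from one image alone.

```python
def project(point_3d, focal_length=1.0):
    """Ideal pinhole projection: (X, Y, Z) -> (f * X / Z, f * Y / Z)."""
    x, y, z = point_3d
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = tuple(2.5 * c for c in near)  # same viewing ray, 2.5 times farther away

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```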

What should we do?
“One cannot understand what seeing is and how it works unless one understands the underlying information processing task being solved” (David Marr)
The imaging process is ambiguous, and we should try to resolve the ambiguities by introducing constraints to our problem, such as:
• use more than one image of the scene
• make assumptions about the world in the scene
• introduce knowledge about the observed problem

What is tracking?
• Tracking means following one or multiple objects of interest in the scene and continuously providing their position
  • estimate parameters of the dynamic system, e.g. feature point positions, object position, human joint angles etc.
• The applications of tracking are various, and tracking is one of the primary Computer Vision tasks
• The source of information is video from one or multiple cameras
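To make "estimating the parameters of a dynamic system" concrete, here is a minimal sketch of a 1D tracker. It uses an alpha-beta filter, a simple fixed-gain filter with a constant-velocity motion model, chosen here only for brevity; it is not a method taken from these slides. The gains, the function name and the synthetic measurements are all assumptions.

```python
def alpha_beta_track(measurements, x0, v0, dt=1.0, alpha=0.5, beta=0.1):
    """Track a 1D position frame by frame with a fixed-gain alpha-beta filter."""
    x, v = x0, v0
    estimates = []
    for z in measurements:
        x_pred = x + v * dt              # predict with the constant-velocity model
        residual = z - x_pred            # disagreement with the new measurement
        x = x_pred + alpha * residual    # correct the position estimate
        v = v + (beta / dt) * residual   # correct the velocity estimate
        estimates.append((x, v))
    return estimates


# Synthetic object moving at 2 units per frame, observed without noise.
measurements = [2.0 * k for k in range(1, 61)]
# Start with a wrong velocity guess; the filter has to recover it from the video.
estimates = alpha_beta_track(measurements, x0=0.0, v0=0.0)
final_position, final_velocity = estimates[-1]
print(final_position, final_velocity)  # close to 120 and 2
```

Each estimate depends on the previous one, which is exactly the temporal consistency that tracking exploits.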

Different Approaches to Vision-based 3D Tracking
• Make use of visual markers
• Consider natural features
  • No a priori 3D knowledge
  • Use some 3D knowledge

Tracking examples
Tracking 3D objects, CVLAB, EPFL
Tracking in 2D: CONDENSATION algorithm, M. Isard, A. Blake

Face tracking
2D face tracking using AAM, courtesy of Robotics Institute, CMU
3D face tracking, courtesy of CVLAB, EPFL

People tracking - monocular
A. Ess, B. Leibe, K. Schindler and L. van Gool, A Mobile Vision System for Robust Multi-Person Tracking, CVPR 2008

People tracking - multi camera
F. Fleuret, J. Berclaz, R. Lengagne and P. Fua, Multi-Camera People Tracking with a Probabilistic Occupancy Map, IEEE Transactions on PAMI, Vol. 30, Nr. 2, pp. 267-282, February 2008

Articulated people tracking
R. Urtasun, D. Fleet and P. Fua, Temporal Motion Models for Monocular and Multiview 3-D Human Body Tracking, Computer Vision and Image Understanding, Vol. 104, Nr. 2, pp. 157-177, December 2006.

Deformable surface tracking
S. Ilic, M. Salzmann, P. Fua, Implicit Meshes for Effective Silhouette Handling, International Journal of Computer Vision, Vol. 72, Nr. 2, pp. 159-178, 2007

Augmented reality
J. Pilet, A. Geiger, P. Lagger, V. Lepetit and P. Fua, An all-in-one solution to geometric and photometric calibration, International Symposium on Mixed and Augmented Reality, October 2006
M. Salzmann, J. Pilet, S. Ilic, P. Fua, Surface Deformation Models for Non-Rigid 3-D Shape Recovery, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, Nr. 8, pp. 1481-1487, August 2007

Sport applications
N. Gehrig, V. Lepetit, and P. Fua. "Golf Club Visual Tracking for Enhanced Swing Analysis Tools", In proceedings of British Machine Vision Conference, September 2003.
Video is courtesy of Dartfish

Robotic and industrial AR applications
Video is courtesy of METAIO
Video is courtesy of University of Cambridge

Entertainment applications
Videos above are courtesy of Prof. A. Hilton, Univ. of Surrey
C. Cagniart, E. Boyer, S. Ilic, “Iterative Mesh Deformation for Dense Surface Tracking”, 3DIM workshop at ICCV09, Kyoto, Japan, 2009

Medical applications

More applications
Videos are courtesy of EPFL, Alinghi and Hydroptere companies

What is detection?
• Detection means finding the object of interest and providing its position in the image.
  • no assumption about the system dynamics
  • the response is not based on temporal consistency
• The applications of detection are various: machine vision and quality control, surveillance, robotics etc.
• The source of information is a single image
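As a toy illustration of detection from a single image, the sketch below slides a template over every position of a small grayscale image and keeps the position with the lowest sum of squared differences (SSD). This exhaustive template matching is just one simple detector picked for the example, not the method of any paper cited in these slides; the image, the template and the function name are invented.

```python
def detect(image, template):
    """Return ((row, col), ssd) for the best-matching top-left position,
    found by exhaustive search with the sum of squared differences."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_pos, best_ssd = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos, best_ssd


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
position, score = detect(image, template)
print(position, score)  # (1, 2) 0
```

No motion model and no previous frame are involved: the whole image is searched from scratch, which is what distinguishes detection from tracking.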

Toy example
M. Ozuysal, M. Calonder, V. Lepetit and P. Fua, Fast Keypoint Recognition using Random Ferns, accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009.

3D object detection
S. Hinterstoisser, S. Benhimane, N. Navab, “N3M: Natural 3D Markers for Real-Time Object Detection and Pose Estimation”, IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil, October 14-20, 2007

Face detection

Deformable object detection
J. Pilet, V. Lepetit, and P. Fua, Fast Non-Rigid Surface Detection, Registration and Realistic Augmentation, International Journal of Computer Vision, Vol. 76, Nr. 2, February 2008.

People detection
M. Dimitrijevic, V. Lepetit and P. Fua, Human Body Pose Detection Using Bayesian Spatio-Temporal Templates, Computer Vision and Image Understanding, Vol. 104, Nr. 2, pp. 127-139, December 2006
A. Fossati, M. Dimitrijevic, V. Lepetit and P. Fua, Bridging the Gap between Detection and Tracking for 3D Monocular Video-Based Motion Capture, Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, June 2007

Detection in Machine Vision
A. Hofhauser, C. Steger, N. Navab, Harmonic deformation model for edge based template matching, International Conference on Computer Vision Theory and Applications, Funchal, Portugal, January 2008.

Detecting objects in range data
Ajmal S. Mian, M. Bennamoun and R. Owens, "3D Model-based Object Recognition and Segmentation in Cluttered Scenes", to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2006

Patch detection
S. Hinterstoisser, O. Kutter, N. Navab, P. Fua, V. Lepetit, Real-Time Learning of Accurate Patch Rectification (Oral presentation), IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Miami, Florida (USA), June 2009

Contour Detection
S. Holzer, S. Hinterstoisser, S. Ilic, N. Navab, Distance Transform Templates for Object Detection and Pose Estimation, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Miami, Florida (USA), June 2009.

Overview of the course
• Introduction
• Filtering and edge detection: Convolution; Gaussians; Image derivatives; Edge detection; Canny edge detector
• Local invariant feature detectors:
  • Corner detection: Harris corner detector; Scale space; Harris-Laplace; Harris-Affine; EBR (Edge Based Regions)
  • Region detectors: MSER (Maximally Stable Extremal Regions); IBR (Intensity Based Regions)
  • Blob detectors: Hessian; Hessian-Laplace/Affine
• Feature descriptors: SIFT, SURF, HoG
• Feature point recognition: Randomized Trees, FERNS, Keypoint Signatures
• Advanced methods: Panter, Lepard, Gepard and DTT

Overview of the course
• Camera models and projections; Model based tracking; Pose estimation from 2D-3D correspondences (DLT, PnP, POSIT); Rotation parameterization; Non-linear optimization
• Robust estimators; RANSAC. Active appearance models
• Template matching: Similarity measures; KLT tracker; Jurie-Dhome algorithm; ESM
• Kalman filtering, Particle filtering, CONDENSATION algorithm
• Visual SLAM, PTAM
• Haar features; Integral images; AdaBoost; Viola-Jones face detection

Exam, exercises and homeworks
• Final exam: 100pts (50pts to pass)
• Mid-term exam (20pts), Homeworks (20pts)
• Total: 140pts (100pts → 1.0!)
• Lectures: Mondays, 10am-11:30am, at MI 03.13.010
• Exercises: Wednesdays, 10am-11:30am, at MI 03.13.008
  • mainly practical, on the computer; they serve to explain the given homework tasks from both the theoretical and the practical point of view
  • checking of your previous homework
  • you can start doing your homework in the exercise class and ask questions
• Homeworks: will be checked individually during the exercises!