Themes in Computer Vision


Feb 23, 2014 (3 years and 7 months ago)

Carlo Tomasi

Applications


autonomous cars, planes, missiles, robots, ...


space exploration


aid to the blind, ASL recognition


manufacturing, quality control


surveillance, security


image retrieval


medical imaging


...


perceptual input for cognition


(CMU NavLab ‘90)

Vision is Effortless to Us


driving a car


threading a needle


recognizing a distant, occluded object


understanding (flat!) pictures


perceiving the mood of a painting

Technical Difficulties


512 x 512 x 3 x 30 ≈ 23.6 MB/s was a problem 10 years ago


technology just got good enough


great opportunity!
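
The figure on this slide is plain arithmetic: frame width times height times color channels times frames per second. A minimal sketch, assuming one byte per color channel (the slide does not state the bit depth):

```python
# Raw data rate of a color video stream, using the numbers on the slide.
width, height = 512, 512      # pixels per frame
bytes_per_pixel = 3           # assumption: one byte each for R, G, B
frames_per_second = 30

bytes_per_second = width * height * bytes_per_pixel * frames_per_second
print(bytes_per_second)                  # 23592960
print(round(bytes_per_second / 1e6, 1))  # 23.6 (decimal megabytes per second)
print(bytes_per_second / 2**20)          # 22.5 (binary mebibytes per second)
```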

Fundamental Challenges I


3D → 2D implies information loss








sensitivity to errors


need for models

graphics

vision
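
The information loss can be seen in a toy pinhole camera: every 3D point on the same viewing ray lands on the same image point, so depth cannot be read off a single image. A minimal sketch, assuming an ideal pinhole model with focal length 1 (the function name `project` is illustrative, not from the slides):

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) to image coordinates (x, y)."""
    X, Y, Z = point
    return (focal_length * X / Z, focal_length * Y / Z)

near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)   # same viewing ray, twice as far from the camera

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Going from the image point back to the scene therefore needs extra constraints, which is where the models come in.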

Reconstruction and Geometry

must use redundancy to address sensitivity to noise
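
Why redundancy helps can be shown with a toy estimate: averaging many noisy measurements of one quantity is far more stable than trusting a single measurement. This is only an illustration under assumed Gaussian noise, not the reconstruction method cited on the next slide; the names `estimate` and `true_depth` are made up for the example.

```python
import random

def estimate(true_value, n, sigma, rng):
    """Average n noisy measurements of the same quantity."""
    samples = [true_value + rng.gauss(0.0, sigma) for _ in range(n)]
    return sum(samples) / n

rng = random.Random(0)        # fixed seed so the run is repeatable
true_depth = 5.0              # hypothetical quantity being measured
single = estimate(true_depth, 1, 1.0, rng)       # one measurement
many = estimate(true_depth, 10000, 1.0, rng)     # highly redundant measurements

print("error with 1 measurement:     ", abs(single - true_depth))
print("error with 10000 measurements:", abs(many - true_depth))
```

With independent noise of standard deviation sigma, the error of the average shrinks roughly as sigma divided by the square root of the number of measurements.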

Reconstruction Example

(Tomasi & Kanade ‘91)

Fundamental Challenges II


Appearance changes with viewpoint, i.e., the same thing looks different


Geometric changes: surface slant depends on viewpoint


Photometric changes: surface brightness and color depend on viewpoint


Occlusions: what is hidden depends on viewpoint


Ambiguity: different things look similar


Correspondence is hard

Photometric and Geometric Change

Occlusion

?

Technicality: Motion Blur

Wrong Correspondence

Simple Images are Harder

(Birchfield and Tomasi ‘01)

Models


must be insensitive to


viewing position changes


lighting changes


object configuration changes


occlusion


clutter


must be sensitive to


object changes!

Low-Level Models are General

Model: surfaces are smooth, connected

(Marr and Poggio ‘80)

Higher-Level Models Work Better…




… when they are right



(and much worse when they are wrong)

(Lin and Tomasi ‘01)

State of the Art

left input image

ground truth disparity

our result

disparity error

(Lin and Tomasi ‘01)

Fundamental Challenges III


An old problem in the new context of recognition:


Variation of appearance: Objects change over time, with context, viewpoint, lighting, pose, expression,…


Similarity: Different objects look similar


[BTW, objects do not always appear in isolation…]

(US Army FERET Database)

Modeling Images as Points

[figure: images shown as points, with axes labeled 1, 2, …, n]

principal components form an approximate basis for all the images in the set


Example: Eigenfaces

(Turk, Pentland ‘91; Murase-Nayar ‘93; many others)

the projection of a new image onto the eigenbasis is a compressed representation of that image

can use this to recognize faces, synthesize new images, ...
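
The idea can be sketched on tiny 4-pixel "images": treat each image as a vector, find the dominant principal component, and represent a new image by its coefficient along that component. This is a toy under stated assumptions, not the cited eigenface systems: it keeps a single component and finds it by power iteration, whereas practical systems keep several components. All function names and the data are made up for illustration.

```python
import math

def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def first_principal_component(vectors, iterations=50):
    """Mean image and top covariance eigenvector, found by power iteration."""
    mu = mean_vector(vectors)
    centered = [[x - m for x, m in zip(v, mu)] for v in vectors]
    d = len(mu)
    u = [1.0 / math.sqrt(d)] * d            # initial guess (must not be orthogonal to the answer)
    for _ in range(iterations):
        # Apply the (unnormalized) covariance without forming it: C u = sum_k (c_k . u) c_k
        w = [0.0] * d
        for c in centered:
            dot = sum(ci * ui for ci, ui in zip(c, u))
            for i in range(d):
                w[i] += dot * c[i]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    return mu, u

def project(image, mu, u):
    """Compressed representation: one coefficient along the principal component."""
    return sum((x - m) * ui for x, m, ui in zip(image, mu, u))

def reconstruct(coefficient, mu, u):
    """Synthesize an image from its coefficient."""
    return [m + coefficient * ui for m, ui in zip(mu, u)]

# Hypothetical training set: a mean image plus multiples of one pattern.
base = [10.0, 20.0, 30.0, 40.0]
pattern = [3.0, 0.0, 4.0, 0.0]
training = [[b + t * p for b, p in zip(base, pattern)] for t in (-2, -1, 0, 1, 2)]

mu, u = first_principal_component(training)
new_image = [13.0, 20.0, 34.0, 40.0]
coefficient = project(new_image, mu, u)       # 4 pixel values compressed to 1 number
restored = reconstruct(coefficient, mu, u)
print(coefficient, restored)
```

Because the toy data vary along a single pattern, one coefficient reconstructs the new image almost exactly; real face images need more components and are only approximated.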

Fundamental Challenges IV:

“read my lips”

“run”



Variation, self-occlusion, occlusion, clutter, …

Motions can be complex

Simple Models Are Fast

(Birchfield ‘98)

a head is an ellipse with two colors,

surrounded by strong intensity gradients
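
The gradient half of this head model can be sketched as a score: sample points along a candidate ellipse and average the image gradient magnitude there. A candidate that sits on the head boundary scores high. This is a minimal illustration on a synthetic image, not the cited tracker; the color part of the model is omitted, and all names and numbers are assumptions.

```python
import math

def make_image(width, height, cx, cy, a, b):
    """Synthetic image: a bright filled ellipse (the 'head') on a dark background."""
    return [[1.0 if ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0 else 0.0
             for x in range(width)] for y in range(height)]

def gradient_magnitude(img, x, y):
    """Central-difference gradient magnitude at an interior pixel."""
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return math.hypot(gx, gy)

def perimeter_score(img, cx, cy, a, b, samples=64):
    """Mean gradient magnitude sampled along the candidate ellipse boundary."""
    total = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x = int(round(cx + a * math.cos(t)))
        y = int(round(cy + b * math.sin(t)))
        total += gradient_magnitude(img, x, y)
    return total / samples

# The true head is centered at (30, 32) with semi-axes 10 and 14.
img = make_image(64, 64, 30, 32, 10, 14)
score_on_head = perimeter_score(img, 30, 32, 10, 14)
score_off_head = perimeter_score(img, 35, 32, 10, 14)   # candidate shifted 5 pixels
print(score_on_head, score_off_head)
```

A tracker built on this idea would evaluate the score for a handful of candidate positions and sizes near the previous estimate, which is what keeps such a simple model fast.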

(Bregler ‘93)

2D Articulated Models for Tracking

3D Models are More Accurate…


… when they are right



[BTW, why is she wearing a black shirt?]

(Isard & Blake ‘99)

Probabilistic Models Handle Uncertainty


world state w, observation (image) p


prior P(w)


colors change moderately (?)


arms move with limited acceleration (boxing?)


the height of a head can only change so much (dancing?)


contours are smooth and change smoothly


balls follow the laws of gravity





sensor model P(p|w)


image motion can be measured only so well


motion blurs the image


noise corrupts pixel values


...


Bayesian Tracking


Bayes’ rule: P(w|p) ∝ P(p|w) P(w)


what is the world state w likely to be, given that we observed the image p?

(Isard & Blake ‘99)
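
Bayes' rule can be exercised on a toy world with five possible target positions. The sketch below is a grid (histogram) filter, chosen for brevity and not necessarily the method of the cited work; the motion prior, the likelihood values and the function names are all assumptions made for illustration.

```python
def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def predict(belief, stay=0.5):
    """Motion prior: the target stays put or moves one cell to the right (circular world)."""
    n = len(belief)
    return [stay * belief[i] + (1.0 - stay) * belief[(i - 1) % n] for i in range(n)]

def update(belief, likelihood):
    """Bayes' rule: the posterior is proportional to likelihood times prior."""
    return normalize([l * b for l, b in zip(likelihood, belief)])

prior = [0.2] * 5                          # P(w): no idea where the target is
likelihood = [0.1, 0.1, 0.6, 0.1, 0.1]     # P(p | w): the image favors cell 2
posterior = update(prior, likelihood)      # P(w | p)
next_prior = predict(posterior)            # prior for the next frame

print(posterior)
print(next_prior)
```

Tracking alternates the two steps: `predict` spreads the belief according to the motion prior, and `update` sharpens it with each new image.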

Even Higher Models May Be Needed


[MY COMPUTER CAN UNDERSTAND SIGN]

computer No(1(HandsIpsi 1 1 0 S Out Down, NeutralIpsi 0 0 0 S Out Down)( ,-)
    0(" " 0 -1 " " ", " " " " " " ") (",-)
    0(" " -1 0 " " ", " " " " " " ") (",-)
    0(" " 0 1 " " ", " " " " " " ") (",-)
    1(" " 1 0 " " ", " " " " " " "))

understand No(1(HandIn 0 0 0 X Out Contra,NeutralOut 0 0 0 D Up Contra)(-,-)
    "(" 1 " " " " ", " " " " " " "))

signs No(1( 0 0 0 B Up Out,- - - - - - -) (-,-)
    "(" 1 0 0 " " ",- - - - - - -))

can No(1(HandUp 0 0 0 Out Contra,NeutralOut 0 0 -1 B Out Up) (-,-)
    "(" " " " " " ", " " " 1 " " "))

(Richards & Tomasi ‘02)

Fundamental Challenge V:

Images are Diverse

Previous Work in Image Retrieval

Hulton Deutsch

Color and Texture Models

orientation

scale

texture

Image Distances

(Rubner & Tomasi ‘97)
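
One distance used for comparing color or texture histograms is the earth mover's distance: the least work needed to turn one histogram into the other, where moving mass between nearby bins is cheap. Whether this is the distance pictured on the slide is an assumption. The sketch covers only the 1-D case with unit spacing between bins, where the distance reduces to a running sum; the name `emd_1d` is made up for the example.

```python
def emd_1d(hist_a, hist_b):
    """Earth mover's distance between two 1-D histograms, each normalized to total mass 1."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    carried = 0.0   # surplus mass that still has to be moved past the current bin
    work = 0.0
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        work += abs(carried)
    return work

dark = [1, 0, 0, 0]          # all pixels in the darkest bin
slightly_lighter = [0, 1, 0, 0]
bright = [0, 0, 0, 1]

print(emd_1d(dark, slightly_lighter))  # 1.0
print(emd_1d(dark, bright))            # 3.0
```

A bin-by-bin comparison would rate both pairs as equally different, since no bins overlap in either case; the earth mover's distance ranks the nearby histogram as closer.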

(Rubner & Tomasi ‘97)

Retrieval by Refinement - 1

(Rubner & Tomasi ‘97)

Retrieval by Refinement - 2

(Rubner & Tomasi ‘97)

Vision is AI Complete


Vision is an inverse problem


Strong models of the world are required


Vision implies reasoning about the world


Vision is AI