Computer Vision: Multiview Stereo - TAMU Computer Science ...

CSCE 641 Computer Graphics: Image-based Modeling

Jinxiang Chai

Image-based modeling

Estimating 3D structure


Estimating motion, e.g., camera motion


Estimating lighting


Estimating surface model


Traditional modeling and rendering

[Figure: user input, texture maps, and survey data → modeling → geometry, reflectance, light source, camera model → rendering → images]

For photorealism:
- Modeling is hard
- Rendering is slow

Can we model and render this?

What do we want to do for this model?

Image-based modeling and rendering

[Figure: images, user input, and range scans → image-based modeling → model → image-based rendering → images]

Spectrum of IBMR

[Figure: images, user input, and range scans → image-based modeling → model → image-based rendering → images, where the model can be any of the following]

- Geometry + images
- Geometry + materials
- Images + depth
- Light field
- Panorama
- Kinematics
- Dynamics
- Camera + geometry
- Etc.

Stereo reconstruction

Given two or more images of the same scene or object, compute a representation of its shape.

[Figure: a scene observed from known camera viewpoints]

What are some possible applications?

3D modeling

From one stereo pair to a 3D head model [Frederic Devernay, INRIA]

3D modeling

The Digital Michelangelo Project, Levoy et al.

Optical mocap

Vicon mocap system

Z-keying: mix live and synthetic

Takeo Kanade, CMU (Stereo Machine)

Virtualized Reality™

[Takeo Kanade et al., CMU]


- collect video from 50+ streams
- reconstruct 3D model sequences
- steerable version used for Super Bowl XXXV "EyeVision"

http://www.cs.cmu.edu/afs/cs/project/VirtualizedR/www/VirtualizedR.html

View interpolation

[Figure: input, depth image, novel view] [Szeliski & Kang '95]

View morphing

Morph between a pair of images using epipolar geometry [Seitz & Dyer, SIGGRAPH '96]

Image warping

Video view interpolation

Performance Interface

Microsoft Natal project

Additional applications?


- Real-time people tracking (systems from Point Grey Research and SRI)
- "Gaze" correction for video conferencing [Ott, Lewis, Cox, InterCHI '93]
- Other ideas?

Stereo matching

Given two or more images of the same scene or object, compute a representation of its shape.

What are some possible representations for shapes?
- depth maps
- volumetric models
- 3D surface models
- planar (or offset) layers

Outline

Stereo matching
- Traditional stereo
- Multi-baseline stereo
- Active stereo

Volumetric stereo
- Visual hull
- Voxel coloring
- Space carving

Papers

Stereo matching
- Okutomi & Kanade, "A Multiple-Baseline Stereo", IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 15(4), 1993, pp. 353-363.
- Scharstein & Szeliski, "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms", International Journal of Computer Vision, 47(1/2/3), April-June 2002, pp. 7-42.

Visual-hull reconstruction
- Szeliski, "Rapid Octree Construction from Image Sequences", Computer Vision, Graphics, and Image Processing: Image Understanding, 58(1), 1993, pp. 23-32.
- Matusik, Buehler, Raskar, McMillan, and Gortler, "Image-Based Visual Hulls", Proc. SIGGRAPH 2000, pp. 369-374.

Photo-hull reconstruction
- Seitz & Dyer, "Photorealistic Scene Reconstruction by Voxel Coloring", International Journal of Computer Vision, 35(2), 1999, pp. 151-173.
- Kutulakos & Seitz, "A Theory of Shape by Space Carving", International Journal of Computer Vision, 38(3), 2000, pp. 199-218.

Stereo

[Figure: a scene point seen through each camera's optical center and image plane]

Basic Principle: Triangulation
- Gives reconstruction as intersection of two rays
- Requires
  > calibration
  > point correspondence
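
A minimal sketch of the triangulation step, assuming calibration and correspondence have already produced one ray per camera. With noisy data the two rays rarely intersect exactly, so this version returns the midpoint of the shortest segment between them, which is one common choice rather than the method prescribed by these slides. The function and argument names are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    Assumption: c1, c2 are optical centers and d1, d2 are ray directions
    through the matched pixels, all expressed in one world frame.
    """
    dot = lambda p, q: sum(a * b for a, b in zip(p, q))
    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth is undefined")
    # Closed-form minimizer of |c1 + s*d1 - (c2 + t*d2)|^2 over s and t.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]
```

When the two rays do meet, the midpoint coincides with their intersection.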

Camera calibration

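From world coordinates to image coordinates:

$$
\underbrace{\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}}_{\text{2D projections}}
\simeq
\underbrace{
\underbrace{\begin{pmatrix} s_x & a & u_0 \\ 0 & -s_y & v_0 \\ 0 & 0 & 1 \end{pmatrix}}_{\text{viewport projection}}
\cdot \big[\text{perspective projection}\big]
\cdot \big[\text{view transformation}\big]
}_{\text{camera parameters}}
\cdot \big[\text{3D points}\big]
$$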
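
The chain above can be sketched in code. The slide names the perspective projection and view transformation without spelling them out, so this sketch assumes a standard pinhole model: a rotation `R` and translation `t` for the view transformation and a focal length `f` for the perspective projection. Those three names are assumptions; `s_x`, `s_y`, `a`, `u_0`, `v_0` follow the viewport matrix.

```python
def project(point_w, R, t, f, sx, sy, a, u0, v0):
    """Map a 3D world point to pixel coordinates (u, v).

    Assumed pinhole model: R (3x3) and t (3,) are the view transformation,
    f is the focal length; sx, sy, a, u0, v0 are the viewport parameters.
    """
    # View transformation: world coordinates -> camera coordinates.
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective projection onto the image plane.
    x = f * xc[0] / xc[2]
    y = f * xc[1] / xc[2]
    # Viewport projection: note the -sy term, which flips the vertical axis.
    u = sx * x + a * y + u0
    v = -sy * y + v0
    return u, v
```

Calibration is the inverse problem: given 3D points and their 2D projections, estimate these parameters.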

Stereo correspondence

Determine Pixel Correspondence
- Pairs of points that correspond to same scene point

Epipolar Constraint
- Reduces correspondence problem to 1D search along conjugate epipolar lines
- Java demo: http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html

[Figure: an epipolar plane cutting the two image planes in a pair of epipolar lines]
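
A small sketch of why the search becomes 1D. It assumes the epipolar geometry is encoded in a fundamental matrix `F`, which these slides do not introduce, with the convention that a left-image pixel `x` maps to the right-image line `l' = F x`. The helper names are hypothetical.

```python
def epipolar_line(F, pixel):
    """Line (a, b, c) in the right image, a*u' + b*v' + c = 0, for a left pixel.

    Assumption: F is a 3x3 fundamental matrix with x_right^T F x_left = 0.
    """
    p = (float(pixel[0]), float(pixel[1]), 1.0)
    return tuple(sum(F[i][j] * p[j] for j in range(3)) for i in range(3))


def lies_on_line(line, pixel, tol=1e-9):
    """True if the pixel satisfies the line equation within tol."""
    a, b, c = line
    return abs(a * pixel[0] + b * pixel[1] + c) <= tol


# For a rectified pair (pure horizontal translation) F has this form,
# and the epipolar line of (u, v) is simply the scanline v' = v.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]
```

Only pixels on the returned line need to be tested as candidate matches.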

Stereo image rectification

- reproject image planes onto a common plane parallel to the line between optical centers
- pixel motion is horizontal after this transformation
- two homographies (3x3 transforms), one for each input image reprojection

C. Loop and Z. Zhang. Computing Rectifying Homographies for Stereo Vision. IEEE Conf. Computer Vision and Pattern Recognition, 1999.
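
As a sketch of what "reprojection by a homography" means per pixel: the 3x3 matrix acts on homogeneous pixel coordinates, followed by a divide. How the two rectifying homographies are computed is the subject of the Loop and Zhang paper and is not shown here; `H` below is simply assumed to be given.

```python
def apply_homography(H, u, v):
    """Map pixel (u, v) through a 3x3 homography H (assumed given)."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to a point at infinity")
    # Homogeneous divide; scaling H by a constant leaves the result unchanged.
    return x / w, y / w
```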

Rectification

[Figure: original image pairs and rectified image pairs]

Stereo matching algorithms

Match Pixels in Conjugate Epipolar Lines
- Assume brightness constancy
- This is a tough problem
- Numerous approaches
  > A good survey and evaluation: http://www.middlebury.edu/stereo/


Your basic stereo algorithm

For each epipolar line
  For each pixel in the left image
  - compare with every pixel on same epipolar line in right image
  - pick pixel with minimum matching cost
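
The loop above, sketched for one rectified scanline. Two assumptions not stated on the slide: the matching cost is the squared intensity difference, and the search is limited to `max_disp` pixels to the left of the reference position instead of the whole line.

```python
def match_scanline(left_row, right_row, max_disp):
    """Per-pixel disparity for one epipolar line of a rectified pair.

    Assumption: a left pixel at x matches the right pixel at x - d,
    with 0 <= d <= max_disp, and cost is squared intensity difference.
    """
    disparities = []
    for x, value in enumerate(left_row):
        best_d, best_cost = 0, float("inf")
        for d in range(0, min(max_disp, x) + 1):
            cost = (value - right_row[x - d]) ** 2
            if cost < best_cost:  # keep the first minimum on ties
                best_cost, best_d = cost, d
        disparities.append(best_d)
    return disparities
```

Single-pixel costs are highly ambiguous on real images, which motivates matching windows instead.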

Improvement: match windows
- This should look familiar...
- Can use Lucas-Kanade or discrete search (latter more common)
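
A sketch of the discrete-search variant with a square window and a sum-of-squared-differences (SSD) cost. The function names, the `half` window parameter, and the convention that a left pixel at x matches the right pixel at x - d are assumptions; the caller is expected to pick (x, y) far enough from the image border for the window to fit.

```python
def window_ssd(left, right, x, y, d, half):
    """SSD between the (2*half+1)^2 window at (x, y) in left and (x - d, y) in right."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[y + dy][x + dx] - right[y + dy][x + dx - d]
            total += diff * diff
    return total


def window_disparity(left, right, x, y, max_disp, half=1):
    """Disparity in [0, max_disp] with the lowest window SSD at pixel (x, y)."""
    # Cap d so the shifted window stays inside the right image.
    candidates = range(0, min(max_disp, x - half) + 1)
    return min(candidates, key=lambda d: window_ssd(left, right, x, y, d, half))
```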

Window size


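Smaller window
+ preserves fine detail and depth discontinuities
- noisier, more ambiguous matches

Larger window
+ smoother, less noisy disparity
- loses detail and blurs object boundaries

[Figure: effect of window size, W = 3 vs. W = 20]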

Stereo results

- Data from University of Tsukuba
- Similar results on other images without ground truth

[Figure: scene and ground truth]

Results with window search

[Figure: window-based matching (best window size) vs. ground truth]

Better methods exist...

[Figure: state of the art method vs. ground truth]

Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision, September 1999.

Stereo reconstruction pipeline

Steps
- Calibrate cameras
- Rectify images
- Compute disparity
- Estimate depth

What will cause errors?
- Camera calibration errors
- Poor image resolution
- Occlusions
- Violations of brightness constancy (specular reflections)
- Large motions
- Low-contrast image regions

Depth from disparity

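[Figure: cameras with optical centers C and C', focal length f, separated by the baseline; scene point X at depth z projects to x and x']

[Figure: input image (1 of 2), disparity map, 3D rendering] [Szeliski & Kang '95]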
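
For a rectified pair, similar triangles in the figure give disparity = x - x' = baseline * f / z, so depth follows directly from disparity. A minimal sketch, assuming the focal length is expressed in pixels and the baseline in the unit wanted for depth (the parameter names are hypothetical):

```python
def depth_from_disparity(disparity_px, focal_px, baseline):
    """Depth z = f * B / d for a rectified stereo pair.

    Assumption: disparity_px = x - x' in pixels, focal_px in pixels,
    baseline in the unit the returned depth should have.
    """
    if disparity_px <= 0:
        raise ValueError("zero or negative disparity: point at infinity or a bad match")
    return focal_px * baseline / disparity_px
```

Nearby points have large disparity and distant points have small disparity, which is why the disparity map already looks like an inverse-depth image.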

Choosing the stereo baseline

What's the optimal baseline?
- Too small: large depth error
- Too large: difficult search problem

[Figure: large baseline vs. small baseline; for a pixel of fixed width, a whole range of depths ("all of these points") projects to the same pair of pixels]

The effect of baseline on depth estimation

[Figure: pixel matching score plotted against 1/z for different baselines, with the width of a pixel marked]
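
The trade-off can be made quantitative under the usual rectified-pair model. The symbols are assumptions, since the slides give only the figure: $B$ is the baseline, $f$ the focal length in pixels, $d = x - x'$ the disparity, and $z$ the depth.

$$
d = \frac{fB}{z}
\quad\Longrightarrow\quad
z = \frac{fB}{d},
\qquad
\left|\frac{\partial z}{\partial d}\right| = \frac{fB}{d^{2}} = \frac{z^{2}}{fB}
\quad\Longrightarrow\quad
|\delta z| \approx \frac{z^{2}}{fB}\,|\delta d|
$$

A fixed matching error $\delta d$ (say, one pixel) therefore produces a depth error that shrinks as $B$ grows, while the disparity range to be searched, $fB\,(1/z_{\min} - 1/z_{\max})$, grows in proportion to $B$.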

Multi-baseline stereo

Basic Approach
- Choose a reference view
- Use your favorite stereo algorithm BUT
  > replace two-view SSD with SSD over all baselines

Limitations
- Must choose a reference view (bad)
- Visibility!
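
A sketch of the Basic Approach for one pixel on one scanline, in the spirit of Okutomi and Kanade's multiple-baseline stereo. It assumes rectified views along a common baseline direction, so that a candidate inverse depth `zeta` = 1/z induces disparity `focal_px * b * zeta` in the view with baseline `b`. Summing SSD over views as a function of `zeta`, rather than per-view disparity, is what lets all baselines vote for one answer. All names are hypothetical.

```python
def row_window_ssd(ref_row, row, x, d, half):
    """1D window SSD between ref_row around x and row around x - d."""
    total = 0.0
    for dx in range(-half, half + 1):
        xr = x + dx - d
        if not (0 <= xr < len(row)):
            return float("inf")  # candidate falls outside this view
        diff = ref_row[x + dx] - row[xr]
        total += diff * diff
    return total


def multi_baseline_inverse_depth(ref_row, other_rows, baselines,
                                 focal_px, x, inv_depths, half=1):
    """Candidate inverse depth minimizing the SSD summed over all baselines."""
    def summed_ssd(zeta):
        return sum(
            row_window_ssd(ref_row, row, x, int(round(focal_px * b * zeta)), half)
            for row, b in zip(other_rows, baselines)
        )
    return min(inv_depths, key=summed_ssd)
```

This sketch ignores the visibility limitation: a point occluded in one view still contributes that view's cost to the sum.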


CMU’s 3D Room Video

Active stereo with structured light

Project "structured" light patterns onto the object
- simplifies the correspondence problem

[Figure: two setups, one with camera 1, camera 2, and a projector, the other with camera 1 and a projector]

Li Zhang's one-shot stereo
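
One way to see why projected patterns simplify correspondence: if each projector column is given a unique temporal code, a camera pixel can read off which column lit it without any search. The sketch below assumes binary Gray-code stripe patterns, a common choice that these slides do not specify (the one-shot method named above uses a different, single-frame pattern). Function names are hypothetical.

```python
def encode_column(col, n_bits):
    """Bits (most significant first) that projector column `col` shows over n_bits patterns."""
    gray = col ^ (col >> 1)  # binary-reflected Gray code
    return [(gray >> i) & 1 for i in range(n_bits - 1, -1, -1)]


def decode_column(bits):
    """Recover the projector column from the bits a camera pixel observed."""
    gray = 0
    for bit in bits:
        gray = (gray << 1) | bit
    col = 0
    while gray:  # Gray -> binary: XOR of all right shifts
        col ^= gray
        gray >>= 1
    return col
```

Gray codes are used here because adjacent columns differ in exactly one bit, so a misread at a stripe boundary shifts the decoded column by at most one.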

Laser scanning

Optical triangulation
- Project a single stripe of laser light
- Scan it across the surface of the object
- This is a very precise version of structured light scanning
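
Geometrically, the laser stripe sweeps out a plane, and each lit camera pixel defines a ray; the surface point is where the two meet. A minimal sketch, assuming the camera ray and the laser plane are already calibrated in a common frame (the names and the plane convention dot(normal, X) = offset are assumptions):

```python
def intersect_ray_plane(origin, direction, normal, offset):
    """Point where the ray origin + s*direction (s >= 0) meets the plane dot(normal, X) = offset."""
    dot = lambda p, q: sum(a * b for a, b in zip(p, q))
    denom = dot(normal, direction)
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the laser plane")
    s = (offset - dot(normal, origin)) / denom
    if s < 0:
        raise ValueError("intersection lies behind the camera")
    return [o + s * d for o, d in zip(origin, direction)]
```

Repeating this for every lit pixel at every stripe position yields the scanned point cloud.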

Digital Michelangelo Project

http://graphics.stanford.edu/projects/mich/


Laser scanned models

The Digital Michelangelo Project, Levoy et al.

Desktop scanner

- Convenient to use
- Good quality
- Relatively low-cost
  - NextEngine (about $2k)