stereo video 2x


Nov 6, 2013


Stereo Video

1. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos

2. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid

3. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering

4. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering

A. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos

Michael Bleyer and Margrit Gelautz

International Symposium on Image and Signal Processing and Analysis (ISPA) 2009

B. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid

Christian Richardt, Douglas Orr, Ian Davies, Antonio Criminisi, and Neil A. Dodgson

The European Conference on Computer Vision (ECCV) 2010

C. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering

Asmaa Hosni, Christoph Rhemann, Michael Bleyer, and Margrit Gelautz

The Pacific-Rim Symposium on Image and Video Technology (PSIVT) 2011

D. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering

Cuong Cao Pham, Vinh Dinh Nguyen, and Jae Wook Jeon

International Conference on Image Processing (ICIP) 2012

Outline

Introduction
Related Works
Methods and Results
  A. Median Filter
  B. Temporal DCB Grid
  C. Spatial-temporal Weighted Smoothing
  D. Three-pass Aggregation
Comparison
Conclusion

INTRODUCTION


Stereo matching research has mostly focused on static image pairs.

Conventional methods estimate disparities using spatial and color information.

The main problem in extending these methods to video is flickering.

Solution:
  Build on local methods (for real-time performance)
  Enforce temporal consistency (to suppress flickering)



RELATED WORKS


About local methods

The key to a local method lies in the cost aggregation step.

Cost data is aggregated from the neighboring pixels within a finite-size window.

The best-known approaches are edge-preserving algorithms:
  Adaptive support weight
  Geodesic diffusion
  Bilateral filter
  Guided filter
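
To make the cost aggregation step concrete, here is a minimal sketch on a single scanline. It uses a plain box window (equal weights) rather than any of the edge-preserving weightings listed above, and the function name, the absolute-difference cost and the border cost of 255 are assumptions chosen for illustration only.

```python
def match_scanline(left, right, max_disp, radius):
    """Box-window cost aggregation followed by winner-take-all on one scanline."""
    width = len(left)
    border_cost = 255.0  # assumed cost when the match falls outside the image
    disparities = []
    for x in range(width):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            # Aggregate absolute differences over the window [x - radius, x + radius].
            total = 0.0
            for u in range(max(0, x - radius), min(width, x + radius + 1)):
                total += abs(left[u] - right[u - d]) if u - d >= 0 else border_cost
            if total < best_cost:
                best_d, best_cost = d, total
        disparities.append(best_d)
    return disparities


# Toy scanline: the right view is the left view shifted by 2 pixels.
left = [10, 10, 50, 90, 30, 70, 20, 60, 10, 10]
right = left[2:] + [10, 10]
disp = match_scanline(left, right, max_disp=3, radius=1)
```

An edge-preserving method would replace the equal weights inside the window with weights that drop across intensity edges.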

Single-frame stereo matching

Spatio-temporal stereo matching

The disparity difference between two successive frames is minimized to enforce temporal consistency.

METHODS AND RESULTS

A. Median filter


Computing one disparity map takes 1 second.

Video content, however, runs at about 30 to 60 frames per second.

=> The method cannot achieve real-time performance.

No data or comparison is provided.
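
As a rough illustration of temporal median filtering over three frames (previous, current, next), the sketch below takes the per-pixel median of three disparity maps. It assumes the maps are already aligned to the current frame; the motion-compensation step is omitted, and the function name is hypothetical. It is not the authors' implementation.

```python
from statistics import median


def temporal_median(prev_map, cur_map, next_map):
    """Per-pixel median over three temporally adjacent disparity maps (lists of rows)."""
    return [
        [median(triple) for triple in zip(p_row, c_row, n_row)]
        for p_row, c_row, n_row in zip(prev_map, cur_map, next_map)
    ]


prev_map = [[4, 4, 7], [4, 4, 7]]
cur_map = [[4, 9, 7], [4, 4, 1]]  # two flickering outliers: 9 and 1
next_map = [[4, 4, 7], [4, 5, 7]]
smoothed = temporal_median(prev_map, cur_map, next_map)
```

A value that appears in only one of the three frames is discarded, which is why the median suppresses single-frame flicker.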


B. Temporal DCB Grid

Bilateral Grid
  It runs faster and uses less memory as σ increases.

Dual-Cross-Bilateral Grid
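
One way to see why a bilateral grid gets cheaper as σ increases: the grid is sampled once per σ along each axis, so a larger σ means fewer cells to store and filter. The sketch below only computes grid dimensions. The function names, the extra padding cell per axis and the 256-level intensity range are assumptions for illustration.

```python
import math


def bilateral_grid_shape(height, width, sigma_s, sigma_r, intensity_range=256):
    """Grid cells along y, x and intensity for sampling rates sigma_s and sigma_r."""
    return (
        math.ceil(height / sigma_s) + 1,  # +1 is an assumed padding cell
        math.ceil(width / sigma_s) + 1,
        math.ceil(intensity_range / sigma_r) + 1,
    )


def cell_count(shape):
    gy, gx, gi = shape
    return gy * gx * gi


small_sigma = bilateral_grid_shape(480, 640, sigma_s=4, sigma_r=8)
large_sigma = bilateral_grid_shape(480, 640, sigma_s=16, sigma_r=32)
```

Under these assumptions, quadrupling both σ values shrinks the grid for a 640 × 480 image from 121 × 161 × 33 cells to 31 × 41 × 9 cells.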






Dichromatic DCB Grid

Comparison (fps)
  200x

Temporal DCB Grid

Weighted sum of the last n = 5 frames, each weighted by w_i
  i = 0: current frame
  i = 1: previous frame
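
The weighted sum over the last n frames can be sketched as follows. The slides do not give the weights w_i, so the Gaussian falloff, the `sigma_t` parameter and the class name are hypothetical; the per-frame data is reduced to a flat list of values for brevity.

```python
from collections import deque
import math


def temporal_weights(n, sigma_t):
    """Hypothetical Gaussian falloff; i = 0 is the current frame."""
    raw = [math.exp(-(i * i) / (2.0 * sigma_t * sigma_t)) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


class TemporalBlend:
    def __init__(self, n=5, sigma_t=2.0):
        self.weights = temporal_weights(n, sigma_t)
        self.history = deque(maxlen=n)  # newest frame first, oldest dropped when full

    def push(self, frame_values):
        """Add the newest values and return the weighted sum over the stored frames."""
        self.history.appendleft(list(frame_values))
        used = self.weights[: len(self.history)]
        norm = sum(used)  # renormalize while fewer than n frames are available
        return [
            sum(w * frame[k] for w, frame in zip(used, self.history)) / norm
            for k in range(len(frame_values))
        ]


blend = TemporalBlend(n=5)
for _ in range(5):
    steady = blend.push([10.0, 0.0])
jumped = blend.push([20.0, 0.0])  # a sudden change is damped by the older frames
```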

[Figure labels: 16 fps, 14 fps]

[Figure label: Source data]










Uses only intensity information

Only near real-time

C. Spatial-temporal Weighted Smoothing

Cost initialization
  Construct a spatio-temporal cost volume for each disparity d.

Cost aggregation
  Smooth the cost volume with a spatio-temporal filter (guided filter [1]).

Disparity computation
  Select the disparity with the lowest cost (winner-take-all, WTA).

Refinement
  Weighted median filter

[1] Rhemann, C., Hosni, A., Bleyer, M., Rother, C., Gelautz, M.: Fast Cost-Volume Filtering for Visual Correspondence and Beyond. In: CVPR (2011) and PAMI (2013)

Cost initialization

Cost aggregation
  ω_k: spatio-temporal window of size w_x × w_y × w_t
  ε: smoothness parameter
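
The formulas on this slide did not survive extraction. For orientation, the guided filter kernel for a grayscale guidance image I is commonly written as below; this is the standard form from the guided filter literature and an assumption about what the slide showed. Here i and j index voxels, μ_k and σ_k² are the mean and variance of I inside the window ω_k, |ω| is the number of voxels in a window, ε is the smoothness parameter, and C and C' are the cost volume before and after aggregation.

```latex
W_{i,j} = \frac{1}{|\omega|^{2}} \sum_{k:\,(i,j)\in\omega_k}
  \left( 1 + \frac{(I_i - \mu_k)(I_j - \mu_k)}{\sigma_k^{2} + \varepsilon} \right),
\qquad
C'_{i,d} = \sum_{j} W_{i,j}\, C_{j,d}
```

A larger ε pushes every weight toward 1/|ω|, so the filter behaves more like a plain box average.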


The guided filter weights can be implemented by a sequence of linear operations.

All summations are 3D box filters and can be computed in O(N) time.
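
The O(N) claim rests on the fact that a box sum does not depend on the window size once prefix sums are available. A minimal 1-D sketch follows; the function name and the border clipping are assumptions. Applying the same pass along x, then y, then t gives a 3D box filter.

```python
from itertools import accumulate


def box_sum_1d(values, radius):
    """Sum of values over [i - radius, i + radius], clipped at the borders, for every i."""
    prefix = [0.0] + list(accumulate(values))  # prefix[i] = sum(values[:i])
    n = len(values)
    return [prefix[min(n, i + radius + 1)] - prefix[max(0, i - radius)] for i in range(n)]


sums = box_sum_1d([1, 2, 3, 4, 5], radius=1)
```

Each output costs one subtraction regardless of the radius, so the total work is linear in the number of samples.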

Disparity computation: winner-take-all (WTA)

Refinement: weighted median filter

=> This step only reduces single-frame errors.
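
A weighted median can be sketched as below. The slides do not say how the weights are chosen, so the weights in the example are hypothetical (for instance, similarity to the center pixel's color), and the function name is an assumption.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("values must be non-empty")


# Disparities in a small window; the outlier 20 carries a low (hypothetical) weight.
window_disparities = [5, 5, 20, 6, 5]
window_weights = [1.0, 0.9, 0.1, 0.8, 1.0]
refined = weighted_median(window_disparities, window_weights)
```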


Temporal vs. frame-by-frame processing

2nd row: Disparity maps computed by a frame-by-frame implementation show flickering artifacts.

3rd row: The proposed method exploits temporal information and thus removes most artifacts.

D. Three-pass cost aggregation


Three-pass cost aggregation technique based on information permeability (adaptive support-weight) [2].

[2] Yoon, K.J., Kweon, I.S.: Locally Adaptive Support-Weight Approach for Visual Correspondence Search. In: CVPR (2005)

[Figure labels: Frame i+1, Frame i, Frame i-1]

Matching cost initialization
  v = (x, y, t) represents the spatial and temporal position of a voxel.

Similarity (weighted) function
  Shows the effectiveness of using temporal information in addition to spatial information.

Spatial aggregation: horizontal, then vertical
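
A single aggregation pass along one axis can be sketched as a pair of recursions, one in each direction. The exponential permeability weight, the `sigma` value and the function names are assumptions, since the slide formulas are not recoverable; the same routine would be applied along the horizontal, vertical and temporal axes in turn.

```python
import math


def permeability(intensity_a, intensity_b, sigma=12.0):
    """Weight in (0, 1]: near 1 for similar neighbors, near 0 across an edge."""
    return math.exp(-abs(intensity_a - intensity_b) / sigma)


def aggregate_pass(costs, intensities, sigma=12.0):
    """Two-direction recursive aggregation along one axis (one scanline shown)."""
    n = len(costs)
    forward = list(costs)
    for x in range(1, n):  # one multiplication and one addition per element
        mu = permeability(intensities[x], intensities[x - 1], sigma)
        forward[x] = costs[x] + mu * forward[x - 1]
    backward = list(costs)
    for x in range(n - 2, -1, -1):  # again one multiplication and one addition
        mu = permeability(intensities[x], intensities[x + 1], sigma)
        backward[x] = costs[x] + mu * backward[x + 1]
    return [f + b for f, b in zip(forward, backward)]  # one more addition


flat = aggregate_pass([1.0, 0.0, 0.0], [100, 100, 100])  # cost spreads freely
edge = aggregate_pass([1.0, 0.0, 0.0], [0, 200, 200])    # an edge blocks the spread
```

With precomputed weights, this form needs two multiplications and three additions per element per pass, which over three passes gives six multiplications and nine additions per voxel.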



Temporal aggregation: forward and backward

Disparity computation: WTA

Refinement
  Consistency check
  3 × 3 median filter
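
The two refinement steps can be sketched as follows. The slides do not specify the tolerance, how inconsistent pixels are handled afterwards, or the border treatment, so the tolerance of 1, the `invalid` marker and the clipped border windows are assumptions, as are the function names.

```python
from statistics import median


def consistency_check(disp_left, disp_right, tolerance=1, invalid=-1):
    """Mark left-view pixels whose disparity disagrees with the right-view map."""
    checked = []
    for row_l, row_r in zip(disp_left, disp_right):
        out_row = []
        for x, d in enumerate(row_l):
            xr = x - d  # matching column in the right view
            ok = 0 <= xr < len(row_r) and abs(d - row_r[xr]) <= tolerance
            out_row.append(d if ok else invalid)
        checked.append(out_row)
    return checked


def median_3x3(disp):
    """3 x 3 median filter; border pixels use the part of the window inside the image."""
    h, w = len(disp), len(disp[0])
    return [
        [median(disp[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2)))
         for x in range(w)]
        for y in range(h)
    ]


checked = consistency_check([[1, 1, 1, 3]], [[1, 1, 1, 1]])
filtered = median_3x3([[5, 5, 5], [5, 50, 5], [5, 5, 5]])
```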


Computational complexity
  Only six multiplications and nine additions per voxel
  Still more efficient than the adaptive support-weight approach
  No motion estimation required



COMPARISON

      Method                          Reference frames
A.    Optical flow + median filter    3 frames (-1 to 1)
B.    Weighted last 5 frames          5 frames (-4 to 0)
C.    Temporal guided filter          5 frames (-2 to 2)
D.    Three-pass                      3 frames (-1 to 1)

Drawbacks: too slow; over-smoothing

No post-processing

With post-processing: consistency check and 3 × 3 median filter


CONCLUSION


The methods are based on edge-preserving methods.

They extend these concepts to the time dimension.

These methods only handle scenes with slow motion.

They do not perform well on dynamic scenes that contain large object motions.