
Multi-view real-time depth estimation based on combination of visual-hull and hybrid recursive matching


HHI


Wolfgang Waizenegger

Overview


Field of application: 3D Presence


2D Videoconferencing


3D Videoconferencing


3D Presence concept and 3D displays


The camera system


3D Analysis


3D algorithmic chain


Hybrid recursive matching (HRM)


Visual Hull (VH)


HRM and VH combination


Results


Hardware


Conclusion and Outlook

3D Presence Consortium

SoA of Telepresence Systems

Polycom TPX System

Telepresence System by CISCO

HP Halo Telepresence System

Drawbacks of conventional telepresence systems

Drawbacks:
- No eye contact, e.g. it is hard to recognize who is talking to whom
- Misleading gestures and body language

Ideal situation:
- Every local participant has their own view of each remote conferee

Solution:
- Immersive 3D videoconferencing

Figure: Missing eye contact (CISCO system)

SoA of 3D Videoconferencing

MultiView by Univ. of California, Berkeley, 2004

Virtue/im.point by Fraunhofer HHI, 2003/2004

Real Meet Room, France Telecom R&D, 2001

The concept of 3D Presence

Three parties

Two conferees per party

- Multi-party 3D videoconferencing
- 3D multi-user auto-stereoscopic display technology
- Multi-party eye contact and gesture-based interaction

Replace remote conferees by 3D displays

Multi-View 3D Displays

Multiple 3D views from different perspectives

Advantages:
- Own view for each local conferee
- Adapted viewing perspective
- 3D impression
- Multiple views allow conferees to switch perspective by moving the head

Figure: Multi-View 3D Display with multiple viewing cones

The Multi-View Camera System

Narrow baseline system
- Robust disparity estimation
- Consistency check by trifocal matching

Wide baseline system
- Increased depth resolution
- Option to combine with Visual Hull

Figure: camera arrangement showing the combined trifocal system, the horizontal and vertical narrow baseline systems, and the horizontal and vertical wide baseline systems (baseline labels b and k b)
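The link between a wider baseline and increased depth resolution follows from the standard rectified-stereo relation Z = f * b / d. The sketch below assumes a rectified pinhole pair; the focal length, distance, and baselines in the usage note are illustrative values, not parameters of the 3D Presence rig.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * b / d for a rectified stereo pair."""
    return focal_px * baseline_m / disparity_px


def depth_step(focal_px, baseline_m, depth_m, disparity_step_px=1.0):
    """First-order depth change caused by one disparity step at depth Z.

    Differentiating Z = f * b / d gives |dZ| ~= Z^2 / (f * b) * |dd|,
    so a larger baseline b yields a finer depth step.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_step_px
```

With a hypothetical focal length of 1000 pixels and a conferee at 2 m, `depth_step(1000.0, 0.1, 2.0)` and `depth_step(1000.0, 0.4, 2.0)` compare a narrow and a four times wider baseline: the depth step per disparity level shrinks by the same factor of four.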

The Mock-up for Camera Configuration Testing

3D Analysis Chain

Diagram blocks, in order: n stereo streams; segmentation; disparity estimation; volumetric reconstruction; head tracking; hand tracking; data fusion; depth maps; 3D modeling data; occlusion information etc.; video + depth (n)

Hybrid-Recursive Matching (HRM)

Diagram blocks: left image, right image, block recursion (3 candidates), pixel recursion, choice of best disparity, disparity memory

Diagram signals: start vector, update vector, disparity vector
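The block-recursion stage picks the best of three candidate disparities. The sketch below is a much simplified illustration, not the HHI implementation. It assumes the three candidates are the disparities of the left neighbour, the upper neighbour, and the same pixel in the previous frame, and that candidates are scored by a sum of absolute differences (SAD) over a small block. The pixel-recursion stage, which the diagram shows turning the start vector into an update vector, is omitted. All function and parameter names are hypothetical.

```python
import numpy as np


def sad(left, right, y, x, d, half=2):
    """SAD between the block around (y, x) in `left` and the block shifted by d in `right`."""
    h, w = left.shape
    y0, y1 = max(y - half, 0), min(y + half + 1, h)
    x0, x1 = max(x - half, 0), min(x + half + 1, w)
    # Convention assumed here: left pixel x corresponds to right pixel x - d.
    if x0 - d < 0 or x1 - d > w:
        return np.inf  # candidate would leave the image
    a = left[y0:y1, x0:x1].astype(np.float64)
    b = right[y0:y1, x0 - d:x1 - d].astype(np.float64)
    return float(np.abs(a - b).sum())


def block_recursion(left, right, prev_disp, half=2):
    """Per pixel, keep the candidate disparity with the lowest SAD."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            candidates = [
                int(disp[y, x - 1]) if x > 0 else 0,  # horizontal predecessor
                int(disp[y - 1, x]) if y > 0 else 0,  # vertical predecessor
                int(prev_disp[y, x]),                 # temporal predecessor
            ]
            costs = [sad(left, right, y, x, d, half) for d in candidates]
            disp[y, x] = candidates[int(np.argmin(costs))]
    return disp
```

Because each pixel only evaluates a handful of candidates instead of a full disparity search range, this kind of recursion keeps the per-pixel cost small, at the price of relying on good predecessors.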

Trifocal system

Figure panels: vertical narrow baseline, horizontal narrow baseline, after consistency check
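The slides do not spell out how the trifocal consistency check works. One plausible form, sketched here purely as an assumption, is that the horizontal and the vertical pair share a reference camera, so every reference pixel gets two disparities; both are converted to depth and a pixel is kept only where the two depths agree. The names and the relative tolerance are hypothetical.

```python
import numpy as np


def trifocal_consistency(disp_h, disp_v, baseline_h, baseline_v,
                         focal_px, rel_tol=0.05):
    """Keep pixels whose horizontal and vertical stereo depths agree.

    Returns (depth, mask): depth is NaN wherever the check fails.
    """
    disp_h = np.asarray(disp_h, dtype=np.float64)
    disp_v = np.asarray(disp_v, dtype=np.float64)
    valid = (disp_h > 0) & (disp_v > 0)
    # Substitute 1.0 for invalid disparities to avoid division by zero;
    # those pixels are masked out again below.
    depth_h = focal_px * baseline_h / np.where(valid, disp_h, 1.0)
    depth_v = focal_px * baseline_v / np.where(valid, disp_v, 1.0)
    agree = np.abs(depth_h - depth_v) <= rel_tol * depth_h
    mask = valid & agree
    depth = np.where(mask, 0.5 * (depth_h + depth_v), np.nan)
    return depth, mask
```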

Multi-View Video Analysis Chain

Diagram: the 3D analysis chain repeated, from n stereo streams to video + depth (n)

Figure: Colored Visual Hull reconstruction

Visual Hull Techniques

- Polygonal
- Volume based space carving (VH)
- Image based (IBVH)

3D Presence demands real-time processing!

Parallelization of the last two approaches on graphics hardware is straightforward!
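Volume based space carving can be sketched as follows: a voxel belongs to the visual hull only if it projects inside the silhouette in every camera. Each voxel is tested independently, which is why the approach maps well to graphics hardware. This is a minimal CPU sketch in NumPy, assuming 3x4 projection matrices and boolean silhouette masks; the function name and data layout are hypothetical.

```python
import numpy as np


def carve(voxels, projections, silhouettes):
    """Return a boolean array: True where a voxel lies inside all silhouettes.

    voxels:      (N, 3) array of voxel centres in world coordinates
    projections: list of 3x4 camera projection matrices
    silhouettes: list of boolean (H, W) foreground masks, one per camera
    """
    voxels = np.asarray(voxels, dtype=np.float64)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])
    inside = np.ones(len(voxels), dtype=bool)
    for P, mask in zip(projections, silhouettes):
        uvw = homog @ np.asarray(P, dtype=np.float64).T
        in_front = uvw[:, 2] > 0
        w = np.where(in_front, uvw[:, 2], 1.0)  # avoid dividing by <= 0
        u = np.round(uvw[:, 0] / w).astype(int)
        v = np.round(uvw[:, 1] / w).astype(int)
        height, width = mask.shape
        in_image = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[in_image] = mask[v[in_image], u[in_image]]
        inside &= hit  # carve away voxels outside this silhouette
    return inside
```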

IBVH Algorithm

Our implementation is based on the initial work of Matusik et al. (2000)

Advantages of our algorithm
- Improved caching strategy that allows pixel pre-selection, which significantly speeds up the computation
- GPU-only implementation using CUDA
- Establishes an interconnection to voxel based implementations by applying cameras at infinity
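At the core of an image based visual hull is a per-ray operation: for each viewing ray, the depth intervals that fall inside each camera's silhouette are intersected across all cameras. The sketch below shows only that 1D interval intersection. It assumes each camera has already contributed a sorted list of disjoint (near, far) intervals along the ray; projecting the ray into the reference images, the caching strategy, and the pixel pre-selection are not shown, and the function names are hypothetical.

```python
def intersect_intervals(a, b):
    """Intersect two sorted lists of disjoint (start, end) intervals."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def ray_visual_hull(per_camera_intervals):
    """Intersect the silhouette intervals of all cameras along one ray."""
    if not per_camera_intervals:
        return []
    hull = list(per_camera_intervals[0])
    for intervals in per_camera_intervals[1:]:
        hull = intersect_intervals(hull, intervals)
        if not hull:
            break  # ray misses the visual hull
    return hull
```

If the ray hits the hull, the start of the first remaining interval is the nearest surface point along the ray, which is one way an image based depth map could be read off the result.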

IBVH interconnection to voxel based methods

VH vs. IBVH

Timings for two GPU based implementations at different resolutions. The image upload time is included.

- Volume based approach from Ladikos et al. 2008 (VH_Lad)
- Our image based approach with pixel pre-selection (PPSIBVH) and without it (IBVH)

Input: Middlebury dinoRig dataset (48 images, 640 x 480)

Method    Hardware      128^3      256^3       512^3
VH_Lad    4 x 8800GTX   99.89 ms   296.71 ms   -
IBVH      1 x GTX280    47.9 ms    82.5 ms     280.6 ms
PPSIBVH   1 x GTX280    41.6 ms    60.9 ms     150.6 ms
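The gain from pixel pre-selection can be estimated by dividing the IBVH time by the PPSIBVH time at each resolution; both were measured on the same single GTX280. The short calculation below uses only the timings reported above. The VH_Lad row is left out because it ran on different hardware.

```python
# Timings in milliseconds, keyed by volume resolution per axis.
ibvh_ms = {128: 47.9, 256: 82.5, 512: 280.6}
ppsibvh_ms = {128: 41.6, 256: 60.9, 512: 150.6}

# Speedup of pixel pre-selection over plain IBVH at each resolution.
speedup = {n: ibvh_ms[n] / ppsibvh_ms[n] for n in ibvh_ms}

for n, s in speedup.items():
    print(f"{n}^3: {s:.2f}x")
```

The ratio grows with resolution, from roughly 1.15x at 128^3 to roughly 1.86x at 512^3.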

IBVH result for the dinoRig dataset

Left: voxel representation of the IBVH result (512^3). Right: image based depth map.

IBVH result for a 3D Presence conferee

Timing for a typical 3D Presence setup with depth maps of 192 x 256 and 8 Visual Hull cameras: 10 to 20 msec on a single GTX280.

Soares et al. use an eight CPU dual Opteron 2.2GHz machine to achieve almost the same results with 5 cameras and an octree based Visual Hull algorithm.

Combination HRM and VH

Result for the combination of HRM and VH

Combination HRM and VH (cont.)

Realization: Hardware Overview for the 3D Presence setup



5 x PCs with dual Nehalem Xeon CPUs


2 x Geforce GTX295 per cluster node


Infiniband 40 Gbit/s interconnection

3D Presence System Architecture

Diagram nodes: Node_VH, Node_0, Node_1, Node_2, Node_3, Node_N

Processing steps listed in the diagram:
- Capture (4 cameras)
- Segmentation
- Lens un-distortion
- Rectification
- HRM (trifocal)
- Bilateral filtering
- Virtual view generation
- Encoding (video+depth)
- Networking

Indispensability of GPUs


Hardware:


CPU: Intel 3.0GHz (single core computation)


GPU: Geforce GTX280


Input:


Images: 1024 x 768, RGB24


Depth Maps: 1024 x 768, float



GPU results include up- and download times

Step                                 GPU       CPU
Lens un-distortion + rectification   2 msec    68 msec
Bilateral filtering of depth map     11 msec   1000 msec
Virtual view synthesis (RGB)         1 msec    150 msec
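Bilateral filtering smooths a depth map while preserving depth discontinuities: each output value is a weighted mean of its neighbours, weighted by spatial distance and by depth difference. The sketch below is a plain NumPy CPU version with Gaussian weights, not the GPU implementation timed above. The function name, window radius, and sigma values are hypothetical.

```python
import numpy as np


def bilateral_filter(depth, radius=2, sigma_space=2.0, sigma_range=0.1):
    """Edge-preserving smoothing of a 2D depth map."""
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    padded = np.pad(depth, radius, mode="edge")
    acc = np.zeros_like(depth)
    norm = np.zeros_like(depth)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbour values at offset (dy, dx) for every pixel at once.
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
            w_range = np.exp(-((shifted - depth) ** 2) / (2.0 * sigma_range ** 2))
            weight = w_space * w_range
            acc += weight * shifted
            norm += weight
    # The centre pixel always contributes weight 1, so norm is never zero.
    return acc / norm
```

Every pixel is computed independently from a small neighbourhood, which is consistent with the large GPU-over-CPU gap reported for this step.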

Demo

Virtual view generation based on estimated depth maps

Conclusion and Outlook



- Three-party immersive 3D videoconferencing system
- Real-time 3D analysis for a 16 camera setup
- Fast IBVH algorithm which runs entirely on a single GPU
- Combination of trifocal HRM and VH significantly improves the results
- All processing runs in real-time on only 5 PCs
- System allows rapid testing of various camera configurations

First real-time demonstrator prototype available by October 2009

Future: Full HD real-time 3D processing chain



Thank you!

Contact: Wolfgang.Waizenegger@fraunhofer.hhi.de

Web: www.3dpresence.eu

References

Atzpadin, N., Kauff, P. and Schreer, O.: Stereo Analysis by Hybrid Recursive Matching for Real-Time Immersive Video Conferencing, IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on Immersive Telecommunications, vol. 14, no. 3, pp. 321-334, January 2004.

Matusik, W., Buehler, C., Raskar, R., Gortler, S. J., and McMillan, L.: Image-based visual hulls. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, 2000.

Ladikos, A., Benhimane, S., Navab, N.: Efficient Visual Hull Computation for Real-Time 3D Reconstruction using CUDA, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Anchorage, Alaska (USA), June 2008. Workshop on Visual Computer Vision on GPUs (CVGPU).

Soares, L., Menier, C., Raffin, B., and Roch, J.L.: Parallel adaptive octree carving for real-time 3D modeling. Poster at IEEE VR'2007 - Virtual Reality, Charlotte, North Carolina, USA, March 2007.