Oct 19, 2013

MACHINE VISION GROUP

Head-tracking virtual 3D display for mobile devices

Miguel Bordallo López*, Jari Hannuksela*, Olli Silvén* and Lixin Fan**

* University of Oulu, Finland

** Nokia Research Center, Tampere, Finland

Contents

Introduction
Head-tracking 3D virtual display
Interaction design
Face-tracking for mobile devices
Mobile device constraints
Field of view
Energy efficiency
Implementation
Latency considerations
Performance
Summary

Introduction

3D virtual displays
- Calculate the relative position of the user with respect to the screen
- Calculate the angle of the user's point of view
- Render an image according to the point of view
- Result is a Virtual Window:
  - Shows realistic 3D objects
  - Based on parallax effect

* Video from Johnny Lee (Wiimote head tracking project)

The position information is used to render the 3D UI/content as if the user watched it from different angles.

The technology enables users to watch the content from different angles and become more immersed.
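
The three steps of the virtual window (position, angle, render) can be illustrated with a little geometry. This is a minimal sketch, not the presented implementation: the function names and the centimetre coordinate system centred on the screen are assumptions made for illustration.

```python
import math

def view_angles(eye_x_cm, eye_y_cm, eye_dist_cm):
    """Horizontal and vertical viewing angles (degrees) relative to the screen normal.

    Assumed convention: the eye is eye_dist_cm in front of the screen centre,
    offset by (eye_x_cm, eye_y_cm) in the screen plane.
    """
    yaw = math.degrees(math.atan2(eye_x_cm, eye_dist_cm))
    pitch = math.degrees(math.atan2(eye_y_cm, eye_dist_cm))
    return yaw, pitch

def project_to_screen(point_x_cm, depth_cm, eye_x_cm, eye_dist_cm):
    """Where a virtual point behind the screen appears on the screen plane.

    The point sits depth_cm behind the screen; the result is the intersection
    of the eye-to-point line with the screen plane (one axis only).
    """
    t = eye_dist_cm / (eye_dist_cm + depth_cm)
    return eye_x_cm + (point_x_cm - eye_x_cm) * t

# Moving the eye 10 cm sideways shifts a point 30 cm "behind" the screen by 5 cm,
# while a point lying on the screen plane (depth 0) does not move: the parallax effect.
shift_far = project_to_screen(0.0, 30.0, 10.0, 30.0)
shift_on_screen = project_to_screen(0.0, 0.0, 10.0, 30.0)
```

Applying the same projection on both axes to every vertex gives the "window" behaviour: distant objects move with the viewer relative to the screen frame, nearby ones barely move.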

Introduction

Mobile 3D virtual displays
- Mobile head-coupled displays can take advantage of the small size
  - Movement of either user or device
- Mobile devices have cameras and sensors integrated
  - No need for external peripherals
- Can increase UI functionalities
  - New applications and concepts
  - Realistic 3D objects can be rendered and perceived
- New interaction methods can be developed
  - We know what the user looks at and we can use that information

Demo

Head-tracking mobile virtual 3D display

A simple use case

Interaction design




Introduction

Mobile face-tracking
- Head-coupled displays require robust and fast face-tracking
- Based on multiscale LBP, cascade classifier and AdaBoost
- Excellent results in face recognition and authentication, face detection, facial expression recognition, gender classification

Introduction

Evaluating the distance to the screen
- Essential to compute the relative angle
- Ground truth determined with Kinect
- Two methods evaluated:
  - Face size obtained with face tracking
    - Flickering between frames
    - No extra computations needed
    - Good accuracy
  - Motion estimation library:
    - Harris corners + BLUE
    - Computes changes of scale between frames
    - About 10% more accurate
    - Less flickering between frames
    - Needs extra computations:
      - Introduces latency, decreases framerate
      - Worse input sequence for tracking
      - More differences between frames
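
The first method (distance from face size) can be sketched with a pinhole camera model. This is an illustration under assumptions, not the evaluated code: the 15 cm average face width, the function names, and the exponential smoothing used here to damp frame-to-frame flicker are all hypothetical choices.

```python
import math

def focal_length_px(image_width_px, horizontal_fov_deg):
    """Focal length in pixels for a pinhole camera with the given horizontal FoV."""
    return (image_width_px / 2.0) / math.tan(math.radians(horizontal_fov_deg) / 2.0)

def distance_from_face(face_width_px, focal_px, real_face_width_cm=15.0):
    """Distance to the camera by similar triangles.

    real_face_width_cm is an assumed average; it is not a value from the slides.
    """
    return focal_px * real_face_width_cm / face_width_px

def smooth(previous_cm, current_cm, alpha=0.5):
    """Exponential smoothing of consecutive estimates (assumed flicker mitigation)."""
    return alpha * current_cm + (1.0 - alpha) * previous_cm

# Example with a 320-pixel-wide input frame and an assumed 45 degree horizontal FoV.
f_px = focal_length_px(320, 45.0)
estimate_cm = distance_from_face(face_width_px=120, focal_px=f_px)
```

Because the face detector's bounding box jitters by a few pixels per frame, the raw estimate jitters too; smoothing trades some of that flicker for a small amount of extra lag.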

Mobile constraints

Field of view
- Front camera is on the device's corner and not pointing to the user:
  - Reduced field of view (< 45°)
  - Asymmetric FoV
  - Even more reduced effective FoV
  - Considerable minimum distance to the screen
- User often outside of the field of view
  - Tracking sometimes lost
  - Need to show viewfinder on the screen
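
To see why a narrow field of view forces a considerable minimum distance, the width covered by the camera at a given distance can be computed directly. A rough sketch, assuming an ideal camera centred on and facing the user (which the slide notes is not the case on the device, so real figures are worse); the function names and the 15 cm face width are illustrative assumptions.

```python
import math

def visible_width_cm(distance_cm, fov_deg):
    """Width of the scene covered by the camera at the given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(fov_deg) / 2.0)

def min_distance_cm(face_width_cm, fov_deg):
    """Closest distance at which a centred face of this width still fits in view."""
    return (face_width_cm / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

# With a 45 degree FoV, only about 21 cm of scene width is covered at 25 cm,
# leaving little room for the head to move sideways before tracking is lost.
width_at_25cm = visible_width_cm(25.0, 45.0)
closest = min_distance_cm(15.0, 45.0)
```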

Mobile constraints

Field of view
- Implemented solution: Wide angle lens
  - Dramatically increases the effective field of view (< 160°)
  - Requires calibrated lens
  - Requires de-warping routine
    - Implemented with lookup tables
  - Problems when several faces are in the field of view
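
A lookup-table de-warping routine can be sketched as follows. The table is built once from the lens calibration and then applied to every frame with plain indexing. The one-parameter radial model and the coefficient `k` are assumptions for illustration; the actual calibration model is not given in the slides.

```python
def build_dewarp_lut(width, height, k):
    """Precompute, for each output pixel, the source pixel in the distorted frame.

    Assumed model: r_src = r_out * (1 + k * r_out^2) in normalised coordinates,
    with k < 0 for a barrel-distorting wide angle lens. k would come from calibration.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    norm = max(cx, cy, 1.0)
    lut = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = (x - cx) / norm, (y - cy) / norm
            scale = 1.0 + k * (dx * dx + dy * dy)
            # Nearest-neighbour source coordinates, clamped to the image.
            sx = min(max(round(cx + dx * scale * norm), 0), width - 1)
            sy = min(max(round(cy + dy * scale * norm), 0), height - 1)
            row.append((sy, sx))
        lut.append(row)
    return lut

def dewarp(image, lut):
    """Apply the precomputed table: one table read and one pixel read per output pixel."""
    return [[image[sy][sx] for (sy, sx) in row] for row in lut]

# Build once, reuse for every frame.
lut = build_dewarp_lut(8, 6, -0.2)
frame = [[y * 8 + x for x in range(8)] for y in range(6)]
corrected = dewarp(frame, lut)
```

The point of the table is that all trigonometry and floating-point work happens once at start-up; the per-frame cost is only memory lookups.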


Mobile constraints

Energy efficiency

A practical challenge of camera-based UIs is keeping the camera always active
- Lower framerate -> High UI starting latencies
- Higher framerate -> Low energy efficiency

Application processor (even in mobile) is power hungry
- Specific processors closer to the sensors are needed

Current devices include HW codecs and GPUs:
- Better energy efficiency due to small EPI (energy per instruction)
- Mobile GPUs are already programmable:
  - OpenGL ES
  - OpenCL Embedded Profile

Energy efficiency

GPU-accelerated face-tracking

Computational and energy costs per VGA frame of feature extraction

GPU can be treated as an independent entity
- Can be used concurrently with CPU

Use of GPU for feature extraction (format conversion + multiscaling + LBP)

Mobile GPUs still not very efficient for certain tasks

Implementation

- Demo platform: N900 (Qt + GStreamer + OpenGL ES)
- Based on face-tracking external library
- Implementation details:
  - Input image resolution: 320x240
  - Frame rate: 16-20 fps
  - Base latency: 90-100 ms
  - Accepted field of view: < 45° horiz. & < 35° vert.
  - User's distance range: 25-300 cm

Implementation


Simple block diagram

Implementation

Task distribution

[Diagram: Camera module, Application Processor (CPU), Graphics Processor (GPU), Touchscreen, Display]

Mobile constraints

Latency
- User interface latency is a critical issue
  - Latency > 100 ms is very disturbing
- Realistic 3D rendering is even more sensitive
  - Not realistic if it happened a while ago!



Mobile constraints

Latency hiding

A possible solution: Latency hiding
- Requires good knowledge of the system's timing
- Predict the current position based on motion vector
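
Prediction from the motion vector can be sketched as a constant-velocity extrapolation over the known system latency. This is one simple way to do it, assumed here for illustration; the slides do not specify the predictor. The function name and the millisecond timestamps are hypothetical.

```python
def predict_position(prev_pos, prev_t_ms, last_pos, last_t_ms, display_t_ms):
    """Extrapolate the face position to the moment the frame will be displayed.

    prev_pos / last_pos are (x, y) positions from the two most recent tracked
    frames; display_t_ms is capture time plus the measured pipeline latency.
    """
    dt = last_t_ms - prev_t_ms
    if dt <= 0:
        # No usable motion vector: fall back to the last measurement.
        return last_pos
    velocity = tuple((b - a) / dt for a, b in zip(prev_pos, last_pos))
    lead = display_t_ms - last_t_ms
    return tuple(p + v * lead for p, v in zip(last_pos, velocity))

# Frames captured 50 ms apart; rendering lands about 100 ms after the last capture,
# in line with the 90-100 ms base latency quoted for the demo.
predicted = predict_position((0.0, 0.0), 0, (1.0, 2.0), 50, 150)
```

The prediction is only as good as the timing figures fed into it, which is why knowledge of the system's timing is listed as a requirement; overshoot on sudden stops is the usual cost of this kind of extrapolation.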

Performance

Demo platform: Nokia N900
- ARM Cortex-A8, 600 MHz + PowerVR535 GPU

Comparison platform: Nokia N9
- ARM Cortex-A8, 1 GHz + PowerVR535 GPU

Remaining problems

- Face-tracking based 3D user interfaces provide support for new concepts
  - Face tracking can be offered at the platform level
- Current mobile platforms still present several shortcomings
  - Energy efficiency compromises battery life
  - Camera not designed for UI purposes
  - Single camera implies difficult 3D context recognition



Thank you

Any questions?

LBP fragment shader implementation

- Access the image via texture lookup
- Fetch the selected picture pixel
- Fetch the neighbours' values
- Compute binary vector
- Multiply by weighting factor

Uses OpenGL ES interface

Two versions:
- Version 1: calculates LBP map in one grayscale channel
- Version 2: calculates 4 LBP maps in RGBA channels
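
The per-pixel steps listed above can be mirrored on the CPU to make the arithmetic concrete. This is a reference sketch of the basic 8-neighbour LBP operator, not the shader itself; the neighbour ordering and the `>=` comparison are assumed conventions.

```python
# Neighbour offsets (dy, dx), clockwise from the top-left (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
# Weighting factors: one power of two per neighbour.
WEIGHTS = [1 << i for i in range(8)]

def lbp_pixel(image, y, x):
    """LBP code of one interior pixel of a grayscale image (list of rows)."""
    centre = image[y][x]                                   # fetch the selected pixel
    neighbours = [image[y + dy][x + dx] for dy, dx in OFFSETS]
    bits = [1 if n >= centre else 0 for n in neighbours]   # binary vector
    return sum(b * w for b, w in zip(bits, WEIGHTS))       # weighted sum, 0..255

def lbp_map(image):
    """LBP codes for all interior pixels; the one-pixel border is skipped."""
    h, w = len(image), len(image[0])
    return [[lbp_pixel(image, y, x) for x in range(1, w - 1)]
            for y in range(1, h - 1)]

patch = [[9, 0, 0],
         [0, 5, 0],
         [0, 0, 0]]
code = lbp_pixel(patch, 1, 1)  # only the top-left neighbour is >= the centre
```

In a fragment shader the same computation runs once per output fragment, with the image reads replaced by texture lookups; the resulting code fits in a single 8-bit channel, which is what makes it possible to store four maps in one RGBA output.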

Preprocessing

- Create quad
- Divide texture & convert to grayscale
- Render each piece in one channel
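
The packing idea behind this preprocessing step can be shown on the CPU. A hedged sketch: the slide does not say how the texture is divided, so splitting into four quadrants is an assumption, as are the function names and the integer BT.601 luma weights used for the grayscale conversion.

```python
def to_gray(rgb):
    """Integer luma from an (r, g, b) pixel using BT.601 weights."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000

def pack_quadrants(image):
    """Pack the four grayscale quadrants of an RGB image into one 4-channel image.

    image is a list of rows of (r, g, b) tuples with even width and height.
    The result is half the size in each dimension; each output pixel holds
    (top-left, top-right, bottom-left, bottom-right) quadrant values.
    """
    h, w = len(image), len(image[0])
    hh, hw = h // 2, w // 2
    gray = [[to_gray(p) for p in row] for row in image]
    return [[(gray[y][x], gray[y][x + hw], gray[y + hh][x], gray[y + hh][x + hw])
             for x in range(hw)]
            for y in range(hh)]

tiny = [[(10, 10, 10), (20, 20, 20)],
        [(30, 30, 30), (40, 40, 40)]]
packed = pack_quadrants(tiny)
```

With the image laid out this way, a single pass over the quarter-size texture processes all four pieces at once, one per RGBA channel.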

GPU assisted face analysis process