Dissertation Description - Kinect Head Tracking


Mark Semenenko

COM3600

01/10/2011

KINECT-BASED HEAD TRACKING FOR CONTROL OF A GAME CHARACTER'S VIEW

Supervised by Steve Maddock

1. INTRODUCTION


This project aims to use the Kinect to track the head of a user in order to create a more immersive pseudo-3D user experience. There are already examples of this technique, such as Johnny Chung Lee's[1] head tracking with a Wii remote and mutualmobile's[2] head tracking with the Kinect. Both of these projects show simple demos of using head tracking to change the perspective of the image on the screen. In his head tracking example, Lee demonstrates how, by tracking the head of a user, you can change the perspective of the camera in the 3D computer space in such a way that it turns the TV from a 2D image of a 3D scene into a window onto a 3D scene: the distance of the user from the screen and their position relative to the screen change the image to create the illusion of looking through a window into a 3D space.

This is very different to current stereoscopic 3D technologies, in which the user is presented with a different image for each eye to make up a 3D scene, but is shown the same images when they move their head laterally. Head tracking differs in that ducking to avoid something jumping out of the screen actually changes what is displayed on the TV. This experience would be much more immersive than the current technique, in which the illusion is broken and the experience jarred the moment the user moves their head and it becomes obvious that they are not looking at a 3D scene at all.

This project will implement these techniques in an SDK format and demonstrate the potential of head tracking by creating an immersive demonstration computer game in which the player dodges attacks in physical space, which in turn changes the perspective on the computer screen as well as having an effect on gameplay.

Other similar current projects are simply demos, whereas this project will develop tools which will enable the technique to be applied easily to a wide variety of other programs. For example, it could be used to create a virtual fish tank, or a virtual window, in which any user walking past would cause the image to change accordingly to show a different perspective. This could be placed in a wall and would be visually appealing.
Head tracking could also be used in more practical ways, for example to help CAD designers view 3D prototypes without having to manufacture one: by intuitively looking around the object on the screen, the software could change the perspective on that object, providing a much more natural way of viewing it. An extension of this could be to add hand gestures with which the user could spin the object round, zoom in or out, or even make changes in a very 'Minority Report'[3] or 'Iron Man'[4] kind of way.

[1] http://johnnylee.net/projects/wii/
[2] http://www.mutualmobile.com/2011/kinect-adds-a-new-dimension-to-businesses-literally/
[3] http://www.technovelgy.com/ct/content.asp?Bnum=1186
[4] http://www.youtube.com/watch?v=-KPhqy7ZwHU&feature=related

2. LITERATURE REVIEW

2.1 INTERACTION TECHNIQUES


In modern day computing there are a growing number of user-friendly ways to interface with computers, ranging from the original keyboard and mouse, or other pointing devices, to touch screens, voice commands, and input devices designed especially for gaming, such as gamepads, joysticks and steering wheels. A trend in the development of input devices is towards more natural ways to interface with computers. For example, speech recognition is a very large field of study, as it would be very beneficial to be able to communicate with a computer in the same way that we would communicate with another human.

With the advent of the Kinect, with its depth camera and other sensors, there is a very large scope for using natural body movements and voice commands to interact with computers in a very natural way. Using a Kinect as opposed to the Wii's IR camera will allow any user to interact, rather than only those equipped with the right sensors or emitters.

2.2 KINECT

The Kinect, born as a device to control a game console, is currently used to 'turn the user into the controller'[5]. As such it provides many excellent tools to aid in this project, especially since the release of the Microsoft Kinect SDK[6].

Currently the Kinect is used primarily as a games controller for the Xbox 360, using your hands, arms and the rest of your body to control the game. This project will take the core ideas of the Kinect but apply them in a different way.

2.3 GAME ENGINES

Today's game engines mostly operate in the same way for each genre. For example, in first person shooter style games the character is controlled by a mouse and keyboard or a gamepad: there is a control to look around the world, rotating the camera in a fixed place, or to move the character/camera in the y-z plane. This style of game has persisted for over two decades, and the control of your character has changed very little during that time. The potential is there, however, for other methods of input to control different aspects of the game, as will be demonstrated in this project.


3. ANALYSIS

This project consists of three constituent parts: the input, the game, and the mapping between the two.

3.1 INPUT

The input will be the data collected from the Kinect, including information from the depth camera and webcam. The scope of the project is large enough that, once methods of tracking the head's position with the depth camera have been developed, the tilt, or pitch and yaw, of the head could also be investigated. Information from the webcam could also potentially be used to detect facial expressions and eye movements. This is the information that will be made available in the SDK.
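As an illustration of the kind of per-frame data the SDK might expose (the field names below are placeholders rather than a finalised interface), a single tracked sample could look like this sketch:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class HeadSample:
        """One frame of head-tracking data (illustrative names only)."""
        position: Tuple[float, float, float]  # head x, y, z in metres, from the depth camera
        pitch: Optional[float] = None         # degrees; only if head-orientation tracking is added
        yaw: Optional[float] = None           # degrees; as above
        expression: Optional[str] = None      # e.g. a label from webcam analysis, if investigated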

The initial input will use heuristic methods to detect a player making a ducking movement using the depth camera, potentially using the Kinect SDK[7]. The potential for this is demonstrated by the skeletal tracking from the Kinect shown in the Microsoft-supplied skeletal viewer sample, so this method will need to be investigated, as there is a lot of potential for extracting the head tracking data from it. After the simple heuristic approach to detecting whether or not a player is ducking, more sophisticated ways of tracking the precise location of the head will be determined.

[5] http://www.xbox.com/en-GB/kinect
[6] https://research.microsoft.com/en-us/um/redmond/projects/kinectsdk/
[7] http://research.microsoft.com/en-us/um/redmond/projects/kinectsdk/
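As a minimal sketch of such a heuristic (assuming a hypothetical get_head_position() helper that returns the head joint's (x, y, z) coordinates in metres from the skeleton stream, and a drop threshold that would need tuning against real data), ducking could be detected by comparing the current head height with a calibrated standing height:

    DUCK_DROP_METRES = 0.25  # assumed threshold; would be tuned experimentally

    class DuckDetector:
        """Heuristic ducking check against a calibrated standing head height."""

        def __init__(self, drop_threshold=DUCK_DROP_METRES):
            self.drop_threshold = drop_threshold
            self.standing_y = None

        def calibrate(self, head_y):
            # Call once while the player is standing upright, e.g. at game start.
            self.standing_y = head_y

        def is_ducking(self, head_y):
            # Not yet calibrated: assume the player is not ducking.
            if self.standing_y is None:
                return False
            return (self.standing_y - head_y) > self.drop_threshold

    # Example per-frame use, with get_head_position() standing in for the real
    # Kinect skeleton query:
    #   x, y, z = get_head_position()
    #   if detector.is_ducking(y):
    #       trigger_dodge()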



3.2 THE GAME

In the finished project, the data obtained from the position of the head will be used to move the camera position and angle in the 3D space. If the TV is treated as the near view plane, then the player moving towards or away from the TV would increase or decrease the amount that is viewable, and moving laterally could rotate the camera around the near view plane. For this to be effective, the point of rotation that gives the best user experience will need to be determined: whether it is better to rotate the camera around the near view plane, the far clip plane or the origin.
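One candidate mapping, sketched below under the 'TV as a window' assumption, is an off-axis (asymmetric) view frustum: the frustum is pinned to the physical screen rectangle, so moving closer widens the visible field and moving laterally shears the view around the near plane. The screen dimensions and the way the bounds are fed to the renderer (for example OpenGL's glFrustum, or an engine's equivalent off-axis projection) are assumptions to be confirmed during the project.

    def offaxis_frustum(head, half_w, half_h, near, far):
        """Near-plane frustum bounds for a screen-aligned 'window' camera.

        head:           (hx, hy, hz) head position in metres relative to the
                        screen centre, with hz > 0 the distance from the screen.
        half_w, half_h: physical half-width and half-height of the screen in metres.
        Returns (left, right, bottom, top, near, far) suitable for an
        off-axis projection such as glFrustum.
        """
        hx, hy, hz = head
        scale = near / hz  # project the screen rectangle onto the near plane
        left = (-half_w - hx) * scale
        right = (half_w - hx) * scale
        bottom = (-half_h - hy) * scale
        top = (half_h - hy) * scale
        return left, right, bottom, top, near, far

    # Example: a head 0.6 m right of centre, 1.2 m from a 1 m-wide screen.
    print(offaxis_frustum((0.6, 0.0, 1.2), 0.5, 0.28, 0.1, 100.0))

In this sketch the camera itself would sit at the head position, looking perpendicular to the screen plane, which is one of the placement options the project will need to compare against rotating about the far clip plane or the origin.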

Since the Kinect SDK provides skeletal tracking, obtaining the position of the head shouldn't prove too difficult. However, trying to track the pitch and yaw of the head could prove challenging, as the depth data may well not be accurate enough on its own. If the accuracy of the depth data turns out to be an issue, more useful data could be gathered from the webcam. For this, OpenCV[8] could be used to help determine what suitable data can be obtained from the webcam.
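As a rough indication of what OpenCV can provide from the webcam alone, the sketch below uses one of its bundled Haar cascades to find the face in a single frame; the bounding box gives the head's lateral and vertical position in the image and a crude size-based distance cue, while proper pitch and yaw estimation would still need landmark-based methods or the depth data. The cascade path assumes the opencv-python package layout.

    import cv2

    # Load the frontal-face Haar cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    cap = cv2.VideoCapture(0)          # default webcam
    ok, frame = cap.read()
    cap.release()

    if ok:
        grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(grey, scaleFactor=1.1, minNeighbors=5)
        if len(faces) > 0:
            # Take the largest detection as the user's face.
            x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
            centre = (x + w / 2, y + h / 2)
            print(f"face centre {centre}, box {w}x{h} px")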

With regards to a game engine, the Unreal Development Kit (UDK)[9] by Epic Games or the Source engine[10] by Valve will be considered. Whilst the main objective is to create an SDK for head tracking that can be used in many applications, this project will include a demonstration game that is visually appealing in order to fully demonstrate how immersive adding head tracking to existing technology can really be.

3.3 MAPPING

This constitutes the bulk of the project. Its importance lies in the ability to map the inputs from the Kinect to the output on the screen in such a way that they can be used meaningfully and easily to create a head tracking experience in a variety of different applications.

This will therefore require the most time, spent investigating which data can be obtained from the Kinect with regards to the position of the head, and also the accuracy and timing of such information.
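Because the raw head positions will arrive at the sensor's frame rate and with some noise, the mapping layer will probably need at least simple temporal smoothing. The sketch below shows one option, an exponentially weighted moving average; the alpha value is an assumption that would be tuned against the measured accuracy and latency of the Kinect data.

    class HeadSmoother:
        """Exponentially weighted smoothing for noisy (x, y, z) head positions."""

        def __init__(self, alpha=0.5):
            # alpha near 1.0 favours responsiveness, near 0.0 favours stability.
            self.alpha = alpha
            self.state = None

        def update(self, position):
            # position: (x, y, z) in metres from the tracker for the latest frame.
            if self.state is None:
                self.state = tuple(position)
            else:
                self.state = tuple(
                    self.alpha * new + (1.0 - self.alpha) * old
                    for new, old in zip(position, self.state))
            return self.state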




[8] http://opencv.willowgarage.com/wiki/
[9] http://www.unrealengine.com/features
[10] http://source.valvesoftware.com/



In this section it will also need to be determined which functions or methods the SDK will provide. As previously mentioned, methods returning Boolean values for actions such as ducking, jumping or leaning could be implemented, but the main aim is to have one-to-one, real-time tracking of the head, so perhaps a more useful output would be to provide co-ordinates and pitch and yaw angles for a camera in the 3D space.
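The sketch below illustrates what such an interface could look like, combining both levels of detail; the class and method names are placeholders rather than a finalised design, and the tracker and detector objects stand in for whichever implementations the earlier investigation settles on (for example the ducking heuristic sketched in section 3.1).

    class HeadTrackingSDK:
        """Illustrative SDK surface: Boolean actions plus a continuous camera pose."""

        def __init__(self, tracker, duck_detector):
            self.tracker = tracker            # supplies smoothed head position/orientation
            self.duck_detector = duck_detector

        def is_ducking(self):
            _, y, _ = self.tracker.head_position()
            return self.duck_detector.is_ducking(y)

        def camera_pose(self):
            """Return (x, y, z, pitch, yaw) for driving the game camera each frame."""
            x, y, z = self.tracker.head_position()
            pitch, yaw = self.tracker.head_orientation()  # may be (None, None) early on
            return x, y, z, pitch, yaw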


At this stage of the project, the use of heuristic methods versus machine learning methods, and the robustness of each, will also be considered. This will include considering whether there is any learning, setup, or calibration time: ideally the head would be tracked with no conscious input from the user, so as not to jar the user's experience.


4. EVALUATION

With regards to evaluating the success of the project, it has to be considered whether this can be done objectively or subjectively. Since the main objective is to create an SDK which can be easily and widely applied, a subjective evaluation of the ease of using this SDK must be considered. The results of using the SDK, as demonstrated through the game, must also be considered; this must be done subjectively, as it is the end user experience that the project aims to improve by creating a very immersive experience.

As such, the project will be demonstrated to users, who will then complete a survey; this will help to measure how immersive the experience is compared with other technologies, and also how intuitive it is.

5. PLAN OF ACTION

Week 1: Install the Kinect and SDK; investigate the available data streams
Week 2: Write the project description; review literature relevant to the project
Week 3: Extract data from the Kinect; save or view the data that can be obtained
Week 4: Install a games engine and script a simple game
Week 5: Find data relevant to tracking the head from the Kinect
Week 6: Experiment with camera options available in the games engine
Week 7: Investigate machine learning techniques versus a heuristic approach
Week 8: Decide on the functions to be made available in the SDK and the feasibility of each
Week 9: Use input from the Kinect to control a character or camera in the games engine
Week 10: Investigate the best way to change the camera to create the correct perspective
Week 11: Look at another input device to use as a pointing device during game play
Week 12: Summarise into survey and analysis document