pptx - DiUF


Nov 17, 2013


Didier Perroud, Raynald Seydoux, Frédéric Barras


Abstract
Objectives
Modalities
Project modalities
CASE/CARE
Implementation
VICI, iPhone, Voice recognition, Network
Demonstration
Conclusion


Two people coordinate to move a ball through a labyrinth
Rotation is possible on the x and y axes
Gates can be opened with vocal and gestural commands


Coordinate the following technologies:
  Augmented reality with tags
  Gesture detection (with iPhone accelerometers)
  Voice recognition (words)
  Collaborative environments
  Physics engine

Inputs
  Hand rotation on the x and y axes (one axis per player): direct manipulation of the labyrinth board
  Hand pumping to open gates
  Voice recognition (words) to select the gate to open and to start the game
Outputs
  Image on the projector
  iPhone vibrations
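As an illustration of the "one axis per player" input, the following minimal sketch combines two players' hand rotations into a single board tilt and derives the acceleration a frictionless ball would get on that tilted board. It is not the project's code: `Board`, `ball_acceleration`, the 15-degree tilt limit, and the player numbering are all assumptions made for the example.

```python
import math
from dataclasses import dataclass

G = 9.81                      # gravitational acceleration in m/s^2
MAX_TILT = math.radians(15)   # assumed mechanical limit of the virtual board


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class Board:
    """Board tilt in radians: slope along x and slope along y."""
    tilt_x: float = 0.0
    tilt_y: float = 0.0

    def apply(self, player: int, angle: float) -> None:
        """Player 1 drives the x slope, player 2 the y slope (assumed mapping)."""
        angle = clamp(angle, -MAX_TILT, MAX_TILT)
        if player == 1:
            self.tilt_x = angle
        elif player == 2:
            self.tilt_y = angle
        else:
            raise ValueError("player must be 1 or 2")


def ball_acceleration(board: Board) -> tuple:
    """Acceleration of a frictionless ball along x and y on the tilted board."""
    return (G * math.sin(board.tilt_x), G * math.sin(board.tilt_y))
```

Because each player only writes one field of `Board`, the ball can move diagonally only when both players tilt at the same time, which is where the coordination requirement comes from.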




CASE
  Semantic level of abstraction
CARE
  Gesture orientation: assignment
  Gesture pumping / voice selection: complementary, to open a gate
  Voice commands: assignment
Decision-level fusion
Fission: image, vibration
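The complementarity between voice selection and the pumping gesture can be sketched as a small decision-level fusion step: a gate opens only when a pump arrives shortly after a spoken gate selection. This is a hedged illustration, not the project's implementation; `GateFusion`, its method names, and the 2-second window are assumptions, and the sketch only handles the voice-then-gesture order.

```python
from dataclasses import dataclass
from typing import Optional

FUSION_WINDOW_S = 2.0  # assumed maximum delay between voice selection and pump


@dataclass
class GateFusion:
    """Fuses a voice 'select gate' event with a later pumping gesture."""
    window: float = FUSION_WINDOW_S
    selected_gate: Optional[int] = None
    selected_at: float = 0.0

    def on_voice_select(self, gate: int, t: float) -> None:
        """Remember which gate was named and when (t in seconds)."""
        self.selected_gate = gate
        self.selected_at = t

    def on_pump(self, t: float) -> Optional[int]:
        """Return the gate to open, or None if the pump has no valid selection."""
        if self.selected_gate is None:
            return None
        gate = self.selected_gate
        self.selected_gate = None  # a selection is consumed by one pump
        if t - self.selected_at > self.window:
            return None            # selection expired before the gesture
        return gate
```

For example, `on_voice_select(1, t=1.0)` followed by `on_pump(t=2.0)` yields gate 1, while a pump with no preceding selection, or one arriving after the window, yields `None`.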


Blocks
  Webcam, tag detection
  OpenGL, physics engine
  Multimodality management
  State machine
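The slides do not detail the state machine, so the following is only a plausible sketch of how game states could react to the voice commands "New game", "Pause", and "Exit" that appear in the grammar later in the deck. The state names, the transition table, and the choice to let "Pause" toggle are all assumptions.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    PLAYING = auto()
    PAUSED = auto()
    STOPPED = auto()


# (current state, normalized command) -> next state; assumed transitions
TRANSITIONS = {
    (State.IDLE, "new game"): State.PLAYING,
    (State.PLAYING, "pause"): State.PAUSED,
    (State.PAUSED, "pause"): State.PLAYING,    # assumption: "Pause" toggles
    (State.PAUSED, "new game"): State.PLAYING,
    (State.IDLE, "exit"): State.STOPPED,
    (State.PLAYING, "exit"): State.STOPPED,
    (State.PAUSED, "exit"): State.STOPPED,
}


def step(state: State, command: str) -> State:
    """Return the next state; commands with no transition leave the state unchanged."""
    key = (state, command.strip().lower())
    return TRANSITIONS.get(key, state)
```

Keeping the transitions in a table rather than in nested conditionals makes it easy to see which commands are ignored in which state.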


Augmented reality application
  Event based
  Messages from the gateway
    Voice events
    Gesture events (orientation X and Y, shake)
  Messages to the gateway
    Vibration events
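An event-based application of this kind can be sketched as a small publish/subscribe dispatcher that routes gateway messages to handlers. The slides do not specify the message format, so the event kinds (`"voice"`, `"gesture.shake"`) and the dictionary payloads below are hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class EventBus:
    """Routes events of a given kind to every handler subscribed to that kind."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, kind: str, handler: Handler) -> None:
        """Register handler for events of the given kind."""
        self._handlers[kind].append(handler)

    def publish(self, kind: str, payload: dict) -> int:
        """Call each subscribed handler; return how many handlers were called."""
        handlers = self._handlers.get(kind, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

A usage sketch: the application subscribes one handler to voice events and another to shake events, and the gateway side calls `publish` for each incoming message; events nobody subscribed to are silently dropped.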



Handle the UIAccelerometer interface
Generate a motionEvent when shaking
Messages to the gateway
  Orientations (X or Y)
  Shake
Messages from the gateway
  Vibrate
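The slides rely on the iPhone's own shake event, but the underlying idea can be illustrated with a simple threshold test on raw accelerometer samples. This is a sketch in Python rather than the device code; the function name, the 1.5 g threshold, and the two-peak rule are assumptions. Samples are (x, y, z) tuples in units of g, so a device at rest has magnitude close to 1.

```python
import math
from typing import Iterable, Tuple

SHAKE_THRESHOLD_G = 1.5  # assumed deviation from 1 g that counts as a jolt
MIN_PEAKS = 2            # assumed number of jolts needed to call it a shake


def is_shake(samples: Iterable[Tuple[float, float, float]],
             threshold: float = SHAKE_THRESHOLD_G,
             min_peaks: int = MIN_PEAKS) -> bool:
    """Return True if enough samples deviate strongly from resting gravity."""
    peaks = 0
    for x, y, z in samples:
        magnitude = math.sqrt(x * x + y * y + z * z)
        if abs(magnitude - 1.0) > threshold:
            peaks += 1
    return peaks >= min_peaks
```

Requiring more than one peak is a cheap way to avoid treating a single bump as a deliberate pumping gesture.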

Windows Speech API
SDK features:
  API definition files
  Runtime component
  Control Panel applet
  Text-to-Speech engines in multiple languages
  Speech Recognition engines in multiple languages
  Redistributable components
  Sample application code
  Sample engines
  Documentation


Our system
  A speech recognition engine
  A grammar


<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/06/grammar
                             http://www.w3.org/TR/speech-grammar/grammar.xsd"
         xml:lang="en-EN" version="1.0">
  <rule id="Labyrinth" scope="public">
    <one-of>
      <item>New game</item>
      <item>Pause</item>
      <item>Exit</item>
      <item>Open gate one</item>
      <item>Open gate two</item>
      <item>Close gate one</item>
      <item>Close gate two</item>
    </one-of>
  </rule>
</grammar>
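To show what the grammar exposes to the application, the following sketch reads the phrases of the "Labyrinth" rule with Python's standard `xml.etree.ElementTree`. It only inspects the XML and is independent of the speech engine; the `phrases` helper and the abbreviated `GRAMMAR` string (schema attributes omitted) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Default namespace declared by the grammar element
SRGS_NS = {"g": "http://www.w3.org/2001/06/grammar"}

GRAMMAR = """<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-EN" version="1.0">
  <rule id="Labyrinth" scope="public">
    <one-of>
      <item>New game</item>
      <item>Pause</item>
      <item>Exit</item>
      <item>Open gate one</item>
      <item>Open gate two</item>
      <item>Close gate one</item>
      <item>Close gate two</item>
    </one-of>
  </rule>
</grammar>"""


def phrases(grammar_xml: str, rule_id: str) -> list:
    """Return the alternatives of the named rule, with whitespace normalized."""
    root = ET.fromstring(grammar_xml)
    for rule in root.findall("g:rule", SRGS_NS):
        if rule.get("id") == rule_id:
            items = rule.findall("g:one-of/g:item", SRGS_NS)
            return [" ".join((item.text or "").split()) for item in items]
    return []


if __name__ == "__main__":
    for phrase in phrases(GRAMMAR, "Labyrinth"):
        print(phrase)
```

Restricting recognition to these seven phrases is what keeps the vocabulary small: the engine only has to tell the listed alternatives apart.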



Recognition comparison before training / after training



Live


Videos


Problems with the physics engine
  Coordination between user moves and physics moves
Voice recognition OK
High-level programming
Heterogeneity not a problem
Functional prototype






Thank you