ppt - Soft Computing Lab.


Nov 6, 2013

SOMM: Self Organizing Markov Map for Gesture Recognition

Pattern Recognition

2010 Spring

Seung-Hyun Lee

G. Caridakis et al., Pattern Recognition, Vol. 31, pp. 52-59, 2010.

SOFT COMPUTING @ YONSEI UNIV. KOREA

Contents

Introduction
Related Work
  Hidden Markov Models
  Other Methods
Proposed Method
Experiments
Conclusion


Introduction

Gesture
  A motion of the body that conveys information
In this paper
  Focus on hand gestures


Introduction

Taxonomy of gestures (McNeill, 1992)
  Gesticulation
  Speech-linked
  Pantomime
  Emblems
  Sign languages
Other taxonomies: Kendon (1992), Quek (1994)


Introduction

Taxonomy by functionality

Gestures and their definitions
  Symbolic gestures: gestures that, within each culture, have come to have a single meaning.
  Deictic gestures: the type of gesture most commonly seen in HCI; gestures that point to entities or directions.
  Iconic gestures: gestures used to convey information about the size, spatial relations, actions, shape, or orientation of the object of discourse.
  Pantomimic gestures: gestures typically used to mimic an action, object, or concept.

Related Work: Hidden Markov Model

Cogan (2006)
  Discrete HMM that fuses hand shape and position
Hossain (2005)
  Implicit/explicit temporal information encoded HMM
  Discriminates between attention and non-attention gestures
Mantyla (2000)
  Targets mobile devices
  Uses SOM and HMM methods
Starner (1998)
  HMM-based American Sign Language (ASL) recognition
  Sentence-level recognition is possible

Related Work: Other Methods

Black and Jepson (1998)
  CONDitional dENSity propagATION (CONDENSATION) algorithm
Wong and Cipolla (2006)
  Sparse Bayesian classifier
Hong et al. (2000)
  Finite State Machines (FSM)
Su (2000)
  Fuzzy logic and rule-based approaches, and hyper-rectangular composite neural networks (HRCNNs)
Juang and Ku (2005)
  Fuzzified Takagi-Sugeno-Kang (TSK) type recurrent network
Yang et al. (2002)
  Time Delay Neural Network
Huang and Huang (1998)
  3D Hopfield Neural Network

Proposed Method: Overview

Modules
  Image processing: detection and tracking of hands
  SOM: quantization of hand location and direction
  HMM: transition probability matrix

Proposed Method: Feature Extraction

Video-based method
  Creation of moving skin masks (skin-color areas)
  Tracking the centroid of the skin masks
  Prior knowledge is required
    It should indicate the different body parts (left hand, right hand, and head)
Environment
  PC platform
  OpenCV
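
The skin-mask and centroid steps can be illustrated with a minimal pure-Python sketch. The RGB threshold rule in `is_skin` is an assumed, commonly used heuristic, not the rule from the slides, and the frame is a toy nested list rather than an OpenCV image. Telling the left hand, right hand, and head apart (the prior knowledge mentioned above) is not covered here.

```python
def is_skin(r, g, b):
    """Rough RGB skin test (an assumed heuristic, not the slides' rule)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

def skin_mask(frame):
    """frame: list of rows of (r, g, b) tuples -> binary mask of 0/1."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in frame]

def centroid(mask):
    """Return the (x, y) centroid of the nonzero pixels, or None if empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    return (sum_x / count, sum_y / count) if count else None

# Toy 5x4 frame with a 2x2 skin-colored patch (hypothetical pixel values).
SKIN, BACKGROUND = (200, 120, 90), (30, 30, 30)
frame = [[BACKGROUND] * 5 for _ in range(4)]
frame[1][2] = frame[1][3] = frame[2][2] = frame[2][3] = SKIN

print(centroid(skin_mask(frame)))  # (2.5, 1.5)
```

Tracking then amounts to computing this centroid for every frame and collecting the resulting sequence of points.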

Proposed Method

Dataset
  [Figures: gesture instances]


Proposed Method: Position Model

cf.) SOM
  (1) Continuous input space
  (2) Discrete output space in the form of a lattice
  (3) Time-varying neighborhood function defined around the winning neuron
  (4) Decreasing learning rate parameter
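
The four SOM properties listed above can be seen together in a small training loop. This is a minimal sketch, not the configuration used in the paper: the lattice size, epoch count, linear decay schedules, and Gaussian neighborhood are all assumptions chosen for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the lattice coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda node: sum(
        (w - xi) ** 2 for w, xi in zip(weights[node], x)))

def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # (2) Discrete output space: a rows x cols lattice of weight vectors.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # (4) decreasing learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # (3) shrinking neighborhood
        for x in data:                            # (1) continuous input vectors
            win = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - win[0]) ** 2 + (node[1] - win[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights

# Toy data: two clusters of 2-D "hand positions" (hypothetical values).
rng = random.Random(1)
data = []
for _ in range(20):
    data.append([0.1 + rng.gauss(0, 0.03), 0.1 + rng.gauss(0, 0.03)])
    data.append([0.9 + rng.gauss(0, 0.03), 0.9 + rng.gauss(0, 0.03)])

som = train_som(data)
unit_a = best_matching_unit(som, [0.1, 0.1])
unit_b = best_matching_unit(som, [0.9, 0.9])
print(unit_a, unit_b)
```

After training, each continuous position is replaced by the lattice coordinate of its winning unit, which is the discrete symbol passed on to the Markov model.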

Proposed Method: Position Model

SOM-based representation of hand position

Proposed Method: Direction Model

Additional information: moving direction
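
One way to turn the moving direction into a discrete symbol is to quantize the angle between consecutive centroid positions. The sketch below assumes 8 direction bins and skips frames with no movement; both choices are illustrative and are not taken from the slides.

```python
import math

def direction_code(p_prev, p_curr, bins=8, min_move=1e-9):
    """Quantize the movement from p_prev to p_curr into one of `bins` codes.

    Code 0 is centered on the +x axis; codes increase with the atan2 angle.
    Returns None when the point has not moved.
    """
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    if math.hypot(dx, dy) < min_move:
        return None
    angle = math.atan2(dy, dx) % (2 * math.pi)   # map to [0, 2*pi)
    width = 2 * math.pi / bins
    return int((angle + width / 2) // width) % bins

def direction_sequence(track, bins=8):
    """Convert a list of (x, y) positions into a list of direction codes."""
    codes = [direction_code(a, b, bins) for a, b in zip(track, track[1:])]
    return [c for c in codes if c is not None]

track = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 1), (1, 1)]
print(direction_sequence(track))  # [0, 0, 2, 4]
```

In image coordinates the y axis usually points down, so the code that means "up" depends on the coordinate convention in use.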

Proposed Method: Generalized Median

Based on Levenshtein distance (edit distance)
  Measures the amount of difference between two sequences
Generalized median of data set Mj
Mean Levenshtein distance between members
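
The Levenshtein distance and a median sequence can be sketched as follows. The equations from this slide were not recovered, so this is an approximation under a stated assumption: instead of searching over all possible sequences, `set_median` returns the member of the set with the smallest total distance to the others (often called the set median).

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of symbols)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]

def set_median(sequences):
    """Member with the smallest summed distance to all members."""
    return min(sequences,
               key=lambda s: sum(levenshtein(s, t) for t in sequences))

def mean_pairwise_distance(sequences):
    """Mean Levenshtein distance over all unordered pairs of members."""
    n = len(sequences)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    total = sum(levenshtein(sequences[i], sequences[j]) for i, j in pairs)
    return total / len(pairs)

# Hypothetical sequences of SOM unit indices for one gesture class.
instances = [[1, 2, 3, 4], [1, 2, 4], [1, 2, 3, 4, 5], [1, 3, 4]]
print(set_median(instances))              # [1, 2, 3, 4]
print(mean_pairwise_distance(instances))
```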

Proposed Method: Gesture Decoding

Position probability
  Calculation of S_som
    First state: initial probability
    From the second state: transition probability
  Unit u
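
The "initial probability for the first state, transition probability afterwards" scheme corresponds to scoring a unit sequence with a first-order Markov chain. The sketch below estimates both tables by counting over training sequences; the additive smoothing term `alpha` and the toy sequences are assumptions, and the exact formula for S_som on the slide was not recovered.

```python
from collections import Counter, defaultdict

def fit_markov(sequences, n_units, alpha=1.0):
    """Estimate initial and transition probabilities from unit sequences."""
    first = Counter(seq[0] for seq in sequences)
    trans = defaultdict(Counter)
    for seq in sequences:
        for u, v in zip(seq, seq[1:]):
            trans[u][v] += 1
    total_first = sum(first.values()) + alpha * n_units
    initial = [(first[u] + alpha) / total_first for u in range(n_units)]
    transition = []
    for u in range(n_units):
        row_total = sum(trans[u].values()) + alpha * n_units
        transition.append([(trans[u][v] + alpha) / row_total
                           for v in range(n_units)])
    return initial, transition

def sequence_probability(seq, initial, transition):
    """Initial probability of the first unit times the transition chain."""
    p = initial[seq[0]]
    for u, v in zip(seq, seq[1:]):
        p *= transition[u][v]
    return p

# Hypothetical training sequences of SOM unit indices for one gesture.
training = [[0, 1, 2], [0, 1, 1, 2], [0, 2]]
initial, transition = fit_markov(training, n_units=3)
print(sequence_probability([0, 1, 2], initial, transition))
print(sequence_probability([2, 1, 0], initial, transition))
```

A sequence that follows the trained order scores higher than the reversed one, which is what lets the model discriminate between gestures.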

Proposed Method: Gesture Decoding

Direction probability
  Calculation of S_of
  Unit u

Proposed Method: Gesture Decoding

Similarity measurement
  Problem
    Shorter gesture instances tend to gain an advantage because they have fewer transitions and thus fewer probability multiplications
  Measurement
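
The length bias described above is easy to reproduce numerically. The measurement used in the paper was not recovered from the slide, so the normalization below (the geometric mean of the step probabilities) is only one assumed way of removing the bias, shown for illustration.

```python
import math

def raw_score(step_probs):
    """Plain product of step probabilities; shrinks with sequence length."""
    return math.prod(step_probs)

def normalized_score(step_probs):
    """Geometric mean of the step probabilities (assumed normalization)."""
    if any(p == 0 for p in step_probs):
        return 0.0
    return math.exp(sum(math.log(p) for p in step_probs) / len(step_probs))

short = [0.5, 0.5]      # 2 steps, each fairly unlikely
long_ = [0.6] * 6       # 6 steps, each more likely

print(raw_score(short), raw_score(long_))                # short sequence wins
print(normalized_score(short), normalized_score(long_))  # long sequence wins
```

With the raw product the short instance scores higher even though each of its steps is less likely; after normalization the comparison reflects per-step quality.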

Proposed Method: Error Propagation

Error definition for function f
SOM-based approach
  If data containing a small error is mapped to the same node of the SOM
    No problem
  Otherwise
  Consequently, because of the neighboring relation of u, the error is not propagated to the next steps of the recognition process
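
The idea that quantization absorbs small errors can be illustrated with a fixed codebook. The 3x3 lattice and the sample points below are hypothetical, and the error definition for f from the slide was not recovered; the sketch only shows that a small perturbation keeps the same winning node, while a larger one lands on a lattice neighbor.

```python
def nearest_node(codebook, x):
    """Return the lattice coordinate whose weight vector is closest to x."""
    return min(codebook, key=lambda node: sum(
        (w - xi) ** 2 for w, xi in zip(codebook[node], x)))

# Hypothetical trained 3x3 lattice: node (row, col) has weights (col/2, row/2).
codebook = {(r, c): (c / 2.0, r / 2.0) for r in range(3) for c in range(3)}

clean = (0.52, 0.48)
small_error = (0.58, 0.41)   # small tracking error
large_error = (0.80, 0.48)   # larger tracking error

print(nearest_node(codebook, clean))        # (1, 1)
print(nearest_node(codebook, small_error))  # (1, 1)
print(nearest_node(codebook, large_error))  # (1, 2)
```

The larger error moves the point to an adjacent lattice node, which stays close to the original one in the map topology.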

Experiment: Data Set

30 gestures, 10 repetitions each

Experiment: Result

SOM clustering
  Blue: close to input vector
  Red: not close
Recognition accuracy
  Test with training data: 100%
  10-fold cross validation: 93%
  0.843 ms for decoding a gesture
  Only HMM-based classifier: 86.36%
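
For a data set of 30 gestures with 10 repetitions each, a 10-fold cross validation could be organized as sketched below. How the folds were actually built is not stated on the slides; holding out one repetition of every gesture per fold is an assumption, and `evaluate` is a hypothetical callback standing in for training and testing a recognizer.

```python
def stratified_folds(n_gestures=30, n_reps=10, n_folds=10):
    """Yield (train, test) lists of (gesture, repetition) pairs per fold."""
    samples = [(g, r) for g in range(n_gestures) for r in range(n_reps)]
    for k in range(n_folds):
        test = [s for s in samples if s[1] % n_folds == k]
        train = [s for s in samples if s[1] % n_folds != k]
        yield train, test

def cross_validated_accuracy(evaluate, **fold_args):
    """Average accuracy, where evaluate(train, test) returns correct count."""
    correct = total = 0
    for train, test in stratified_folds(**fold_args):
        correct += evaluate(train, test)
        total += len(test)
    return correct / total

folds = list(stratified_folds())
print(len(folds), len(folds[0][0]), len(folds[0][1]))  # 10 270 30
```

Each of the 300 instances is tested exactly once, and every fold contains one instance of each of the 30 gestures.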

Conclusion

Key features
  SOM and HMM based automatic recognition architecture
  ROI
  Relative hand position
  Moving direction
  Similarity of pattern
Application
  Sign language
  Gaming environment


Thank you