Voice-Output Communication Aid

17 Nov 2013

Development of a Voice-Input Voice-Output Communication Aid (VIVOCA) for people with severe dysarthria



Mark Hawley, Phil Green, Pam Enderby, Stuart Cunningham and Rebecca Palmer

Barnsley Hospital and University of Sheffield

[Figure: VIVOCA system diagram. Components shown: microphone, speech recogniser, ‘translation’ algorithm, speech synthesiser or recording, PDA.]

User and professional consultation

Method

- VOCA users and speech therapists
- Semi-structured interviews and focus groups
- Thematic analysis

Results

- Acceptable as a means of communication
- Potential advantages over conventional VOCA
    - Quicker
    - Easier to use
    - Increased communication and independence
- Useful where speed and intelligibility are crucial
    - Meeting new people
    - Telephone
    - Shopping
- Range of requirements for hardware and software

Speech Recognition (STARDUST)

- Designed for use by people with severe dysarthria for control tasks
- Speaker-dependent recognition
- Vocabulary tailored to the speech capabilities of the individual
- Confusability measures to predict performance and refine vocabulary
- User training to increase speech consistency
- Closed loop between recogniser training and user training


Recognition accuracy for small vocabularies

Recognition accuracy in the home in normal usage under uncontrolled conditions

Participant  Sex  Diagnosis            Intelligibility  Recognition accuracy (%)
                                       rating (%)       Word (test)  Word   Phrase
1            F    Cerebral palsy (CP)  10               100.0        90.0   83.3
2            M    Multiple sclerosis   22               100.0        81.6   76.7
3            M    CP                   0                86.0         93.0   86.7
4            M    CP                   22               99.7         86.7   76.7
5            F    CP                   0                90.8         83.3   70.0

Effect of training

Participant  Vocabulary  Pre-training                Post-training
             size        No. of     Recognition      No. of     Recognition
                         examples   accuracy (%)     examples   accuracy (%)
1            11          30         95.8             103-108    100.0
2            7           13         96.2             32-34      100.0
3            10          30         82.0             51-58      86.0
5            13          30         96.9             79-110     99.7
6            11          30         92.7             46-55      96.4
7            11          30         77.3             35-50      95.5
8            13          30         80.0             56-66      90.8
Overall                             88.5                        95.4
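
The training effect can also be read as a reduction in error rate rather than a gain in accuracy. The sketch below does that arithmetic on the figures in the table above. The function name `relative_error_reduction` is an illustrative choice, not something from the slides, and the slides do not say how the "overall" row was computed, so it is used here as given.

```python
# Pre- and post-training recognition accuracy (%) copied from the
# "Effect of training" table, keyed by participant number.
accuracy = {
    1: (95.8, 100.0),
    2: (96.2, 100.0),
    3: (82.0, 86.0),
    5: (96.9, 99.7),
    6: (92.7, 96.4),
    7: (77.3, 95.5),
    8: (80.0, 90.8),
}


def relative_error_reduction(pre, post):
    """Fraction of the pre-training errors that training removed."""
    pre_error = 100.0 - pre
    post_error = 100.0 - post
    return (pre_error - post_error) / pre_error


for participant, (pre, post) in accuracy.items():
    reduction = relative_error_reduction(pre, post)
    print(f"participant {participant}: {reduction:.0%} fewer errors")

# The "overall" row as reported on the slide (88.5 % before, 95.4 % after).
overall = relative_error_reduction(88.5, 95.4)
print(f"overall: {overall:.0%} fewer errors")
```

On the reported overall figures, the error rate falls from 11.5 % to 4.6 %, which is a relative reduction of about 60 %.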


‘Translation’ algorithms

- Small vocabularies: coding
- Large vocabularies: direct ‘translation’
‘Translation’ methods

Number of inputs  Methods
1-5               Switch scanning AAC; Morse code
~10               Mobile phone/T9
~30               Spelling; word prediction AAC
~100              AAC fixed overlay; AAC dynamic screen
~1000             Word-for-word translation
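
The "number of inputs" figures above imply a trade-off: the fewer distinct inputs the recogniser has to tell apart, the more inputs are needed to select one output word. A minimal sketch of that relationship, assuming a fixed-length code and an illustrative target of 1000 output words (both are assumptions made here; real schemes such as Morse code or T9 use variable-length or dictionary-based codes):

```python
def inputs_per_word(input_vocabulary, output_vocabulary):
    """Smallest n such that input_vocabulary ** n >= output_vocabulary.

    This is the length of a fixed-length code that can address every
    output word using only the recognisable inputs.
    """
    if input_vocabulary < 2:
        raise ValueError("need at least two distinct inputs to encode anything")
    length = 1
    capacity = input_vocabulary
    while capacity < output_vocabulary:
        capacity *= input_vocabulary
        length += 1
    return length


# Illustrative output vocabulary size; not a figure taken from the slides
# beyond the "~1000" order of magnitude used for word-for-word translation.
OUTPUT_WORDS = 1000

for inputs in (2, 5, 10, 30, 100, 1000):
    print(f"{inputs:>4} inputs -> {inputs_per_word(inputs, OUTPUT_WORDS)} per word")
```

Under these assumptions, about 10 inputs need three utterances per word, about 100 need two, and about 1000 need one, which corresponds to direct word-for-word translation.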

Communication rate

[Figure: two plots of communication rate (correct words per minute) against recognition rate (%), from 50 to 100, for direct translation, MinSpeak, word prediction, spelling, T9 and mobile phone multi-press. The vertical axis of the first plot runs from 0 to 35; that of the second runs from 0 to 10.]
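
One way to see why coded methods might be more sensitive to recognition rate than direct translation is a toy model, sketched below. It is not the model behind the plots, which the slides do not describe. It assumes that recognition errors are independent, that a word counts as correct only if every input in it is recognised, and that no time is spent on corrections. The figure of 30 spoken inputs per minute is hypothetical.

```python
def correct_words_per_minute(inputs_per_minute, inputs_per_word, recognition_rate):
    """Toy model: a word is correct only if all of its inputs are recognised."""
    words_attempted = inputs_per_minute / inputs_per_word
    word_correct_probability = recognition_rate ** inputs_per_word
    return words_attempted * word_correct_probability


# Hypothetical speaking rate, chosen only for illustration.
INPUTS_PER_MINUTE = 30

for rate in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    direct = correct_words_per_minute(INPUTS_PER_MINUTE, 1, rate)
    coded = correct_words_per_minute(INPUTS_PER_MINUTE, 3, rate)
    print(f"recognition {rate:.0%}: direct {direct:.1f} wpm, 3-input code {coded:.1f} wpm")
```

In this model the output of a three-input code scales with the cube of the recognition rate, so it loses proportionally more than direct translation as recognition degrades.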
Considerations

- Input vocabulary
- Communication rate
- Output vocabulary
- Memory load
- Perceptual load
- Knowledge requirement
- Learning load
- Set-up cost


www.shef.ac.uk/cast


www.barnsleyrd.nhs.uk

mark.hawley@nhs.net