Speech Recognition Technology Applications


Denise Bilyeu, M.S., CCC-SLP

Scottish Rite Computer Supported Literacy Program

Munroe-Meyer Institute

Omaha, NE


Speech Recognition


Utilizes hardware and software to
transcribe spoken words into orthographic
text


Allows users hands-free operation of
computer systems


Applications for Persons
with Disabilities



Academic opportunities


Vocational opportunities


Access to the World Wide Web


Implementation Issues



System Training Requirements


Dictation in Written Form


Absence of Graphical Representation


Functional Grade Level


Dictation Environment


Higher Order Organizational
Skills/Strategies

System Training
Requirements


Samples of training protocol text (500
words) were taken from each of the
following programs


Dragon Naturally Speaking Standard*


Dragon Naturally Speaking Teen*


IBM Via Voice Gold


L & H Voice XPress


*two samples were analyzed and averaged


Samples were analyzed using Readability
Stack
(Tice, B. 1990)


Flesch Index


Dale Index


Dale-Chall Formula


Fry Readability Graph

Flesch Index


RE = 206.835 - (1.015 x words/sentence) - (84.6 x syllables/word)


Rates text on a 100-point scale


High scores indicate easier reading levels


Reading Ease based upon


Mean Sentence Length


Syllables per 100 words
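
As a minimal sketch, the Reading Ease formula on this slide can be computed from raw counts. The function name and the sample counts below are illustrative assumptions; the original analysis used Readability Stack, not this code.

```python
def flesch_reading_ease(total_words: int, total_sentences: int,
                        total_syllables: int) -> float:
    """Flesch Reading Ease from raw counts of a text sample."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences    # mean sentence length
    syllables_per_word = total_syllables / total_words    # mean word length
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical 100-word sample: 8 sentences, 130 syllables (not from the study).
score = flesch_reading_ease(100, 8, 130)
print(f"Reading Ease: {score:.2f}")
```

Higher results mean easier text, so a training passage with shorter sentences and shorter words scores closer to 100.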



Dale Index


DI = 11.534 - (0.053 x RE)


Based on the Flesch Index Reading Ease
Score
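
Because the Dale Index is a linear function of the Reading Ease score, it can be checked against the values reported on the Flesch Index and Dale Index results slides later in this presentation. The sketch below uses those reported numbers; the function and variable names are assumptions made for illustration.

```python
def dale_index(reading_ease: float) -> float:
    """Dale Index derived from a Flesch Reading Ease score."""
    return 11.534 - 0.053 * reading_ease


# (Flesch Index, Dale Index) pairs as reported on the results slides.
reported = {
    "Dragon Naturally Speaking Standard": (68.59, 7.90),
    "Dragon Naturally Speaking Teen": (78.24, 7.39),
    "IBM Via Voice Gold": (86.26, 6.96),
    "L & H Voice XPress": (74.13, 7.61),
}

for program, (re_score, reported_di) in reported.items():
    computed = dale_index(re_score)
    print(f"{program}: computed {computed:.2f}, reported {reported_di:.2f}")
```

The computed values agree with the reported ones to within rounding, which suggests the Dale Index column was derived directly from the Flesch scores.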


Dale-Chall Formula


Reading Grade Score (RGS) = (0.1579 x DS) + (0.0496 x SL) + 3.6365,
where DS is the Dale Score and SL is the Sentence Length


Dale Score = % of words not on Dale list of
3000


Sentence Length = average # of words per
sentence
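
A minimal sketch of the Dale-Chall calculation follows. The function name and the example inputs are hypothetical; identifying which words fall outside the Dale list of 3000 is assumed to have been done already.

```python
def dale_chall_rgs(dale_score: float, sentence_length: float) -> float:
    """Dale-Chall Reading Grade Score.

    dale_score: percentage of words NOT on the Dale list of 3000 (0-100).
    sentence_length: average number of words per sentence.
    """
    if not 0.0 <= dale_score <= 100.0:
        raise ValueError("dale_score is a percentage between 0 and 100")
    if sentence_length <= 0:
        raise ValueError("sentence_length must be positive")
    return 0.1579 * dale_score + 0.0496 * sentence_length + 3.6365


# Hypothetical passage: 20% unfamiliar words, 15 words per sentence.
rgs = dale_chall_rgs(20.0, 15.0)
print(f"Reading Grade Score: {rgs:.2f}")
```

Both terms push the score upward, so a passage gets harder as either the share of unfamiliar words or the average sentence length grows.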


Fry Readability Graph


Yields Readability Grade Score (RGS)
based upon:


Syllables per 100 words


Sentences per 100 words



Average the RGS for 3+ random passages
for reliable score
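
The averaging step can be sketched as below. The grade for each passage is assumed to have been read off the Fry graph by hand; the graph lookup itself is not reproduced, and the function name and sample grades are hypothetical.

```python
from statistics import mean


def average_fry_grade(passage_grades: list[float],
                      minimum_passages: int = 3) -> float:
    """Average Fry grade scores, requiring at least three passages."""
    if len(passage_grades) < minimum_passages:
        raise ValueError(
            f"need at least {minimum_passages} passages, "
            f"got {len(passage_grades)}"
        )
    return mean(passage_grades)


# Hypothetical grades read from the Fry graph for three random passages.
fry_grade = average_fry_grade([6.0, 7.0, 8.0])
print(f"Averaged Fry grade: {fry_grade}")
```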



Dale-Chall Analysis

Program                              Mean Sentence Length   Reading Grade Score
Dragon Naturally Speaking Standard   17.43                  8.60
Dragon Naturally Speaking Teen       14.18                  8.47
IBM Via Voice Gold                   10.44                  7.46
L & H Voice XPress                   12.39                  7.70

Fry Readability

Program                              Sentences / 100 words   Grade Level
Dragon Naturally Speaking Standard   5.8                     8
Dragon Naturally Speaking Teen       7.3                     6.5
IBM Via Voice Gold                   9.6                     4
L & H Voice XPress                   8.1                     7

Flesch Index

Program                              Flesch Index
Dragon Naturally Speaking Standard   68.59
Dragon Naturally Speaking Teen       78.24
IBM Via Voice Gold                   86.26
L & H Voice XPress                   74.13

Dale Index

Program                              Dale Index
Dragon Naturally Speaking Standard   7.90
Dragon Naturally Speaking Teen       7.39
IBM Via Voice Gold                   6.96
L & H Voice XPress                   7.61

Conclusions



4th grade minimum literacy level required
to train voice recognition programs (most
programs need 6th to 8th grade reading
levels)


Respiratory support sufficient to produce
sentences of M = 10.44 words


No statistically significant differences in
training protocols
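
One way to read these conclusions against the results tables is sketched below. The numbers are the reported values from the Dale-Chall, Fry, and Dale Index slides; the dictionary layout and variable names are assumptions, and this is only one interpretation of how the summary figures were reached.

```python
# Reported values per program: mean sentence length and three grade estimates.
results = {
    "Dragon Naturally Speaking Standard": {
        "mean_sentence_length": 17.43, "dale_chall": 8.60, "fry": 8.0, "dale_index": 7.90},
    "Dragon Naturally Speaking Teen": {
        "mean_sentence_length": 14.18, "dale_chall": 8.47, "fry": 6.5, "dale_index": 7.39},
    "IBM Via Voice Gold": {
        "mean_sentence_length": 10.44, "dale_chall": 7.46, "fry": 4.0, "dale_index": 6.96},
    "L & H Voice XPress": {
        "mean_sentence_length": 12.39, "dale_chall": 7.70, "fry": 7.0, "dale_index": 7.61},
}

grade_measures = ("dale_chall", "fry", "dale_index")

# Lowest grade estimate on any measure: the floor for literacy level.
lowest_grade = min(r[m] for r in results.values() for m in grade_measures)

# Shortest mean sentence length: the floor for respiratory support.
shortest_sentences = min(r["mean_sentence_length"] for r in results.values())

print(f"Lowest grade estimate: {lowest_grade}")
print(f"Shortest mean sentence length: {shortest_sentences} words")
for program, r in results.items():
    grades = [r[m] for m in grade_measures]
    print(f"{program}: grade estimates from {min(grades)} to {max(grades)}")
```

Under this reading, the 4th grade minimum and the 10.44-word sentence length both come from the IBM Via Voice Gold training text, while the other three programs fall roughly in the 6th to 8th grade range.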

Dictation in Written Form


Dictation vs. Conversational speech


Children produce 86% more words in
slow dictation than in writing and 163%
more words in normal dictation than in
writing

(Bereiter & Scardamalia)


Process is vastly different


Dictation skills must be taught

Absence of Graphical
Representation


Difficulty with dictation is often attributed
to absence of graphical representation;
may cause problems in text development
and revision
(Wetzel)


Speech Recognition has graphical
representation, but often with a delay that
interrupts the dictation process

Functional Grade Level


Classroom placement and curriculum
demands contribute to written text needs


Written text requirements may not be
extensive enough to warrant a Speech
Recognition system


Consider cognitive and/or language skills


Dictating Environment


Voice recognition requires an environment
relatively free of auditory stimuli


Ambient noise will affect the system’s
ability to function well


Dictating may be disruptive to others


Removal from the environment may solve
dictation problems, but may result in
educational or vocational disruptions

Higher Order Organizational
Skills / Strategies


Persons must have cognitive abilities to
dictate and often need strategies to help
with the process


Pre-Writing Strategies


Writing instruction


Planning


Outlining/Mapping


Inspiration


Evaluation


Intelligibility


Sentence Intelligibility Test
(Yorkston, Beukelman & Tice,
1991)


Utilizes ten unrelated sentences


Transcribed by unfamiliar listeners


Variables elicited


Intelligibility (% of intelligible speech to unfamiliar
listener without context)


Rate of speech


Grade/Literacy Level


Fluency of Dictation


Attention to task


Writing/Dictating environment



Trial with voice recognition system


Set up microphone/sound system to see if
voice is perceived


Run system training session if user is capable


Dictate known passages that require little
cognitive demand (e.g., the Pledge of Allegiance)


Dictate text that requires cognitive demand
(e.g., a short expository passage)


Alternate means for training systems


Utilize another person with similar voice
characteristics


Transcribe training protocols and allow user
to learn and practice dictating


Transcribe training protocols and dictate to
tape for user to listen to while dictating

Case Studies

Janae



9 years old


Athetoid Cerebral Palsy


Sentence Intelligibility Test score = 10%


Current System


Discover Board, Mouse Key


Reason for Referral


Mousing slow and fatiguing


Evaluation Tool


Dragon Dictate v. 3.0



Evaluation Results


With no training, could utilize Mouse Grid
with 80% accuracy; after a one-hour session,
could utilize Mouse Grid with 95% accuracy


With extensive training could dictate small
amounts of text


Voice Recognition Status


Utilizing Dragon Dictate Mouse Grid on trial
basis


Training on selected, commonly used words
in progress to determine efficiency and
fatigue effect of dictating text

James



14 years old


Learning Disabled, reading and writing
skills 4 years below grade level


Sentence Intelligibility Test score = 100%


Reason for Referral


Slow input method


Input impeded cognitive writing process


Inability to monitor written work



Evaluation Tool


Dragon Naturally Speaking Standard


Dragon Naturally Speaking Teen


Evaluation Results


Training materials printed and practiced
before actual program training


Training required 2 weeks, 3 sessions/week


Needed alternate text program to review text


Worked on phrasing, assisted punctuation


Voice Recognition Status


Uses voice recognition at home for
homework and correspondence


Does not use voice recognition at school

Brett


18 years old


Quadriplegia, ventilator dependent


Sentence Intelligibility Test score = 100%


Current system


EZKeys for Windows with Morse Code input
via pneumatic switch


Reason for referral


slow input method


Evaluation Tool


Kurzweil


Evaluation Results


Ventilator had to be physically blocked at
Brett’s neck and in the back of the wheelchair


Training on segmentation of words and
phrases was necessary


Training required one month


Voice Recognition Status


Able to use voice recognition at home for
homework, correspondence and the internet


Unable to use voice recognition at school
because of ambient noise and disruption that
dictation causes

Katie


16 years old


Traumatic brain injury


Sentence Intelligibility Test score = 89%


Current system


regular keyboard with track ball


Reason for referral


slow input method


fine motor movement fatiguing


Evaluation tool


Dragon Naturally Speaking Standard


Evaluation results


Unable to train system during evaluation,
likely because of nasal emission on specific
sounds, affecting the intelligibility of
surrounding sounds


Palatal lift was fitted subsequent to
evaluation, but further voice recognition
evaluation was not done


Voice Recognition Status


Unable to utilize voice recognition at time of
evaluation


Further evaluation was not done as fine
motor abilities were improving and alternate
strategies (word prediction, abbreviation-expansion)
were effective

John


53 years old


Friedreich’s Ataxia


Sentence Intelligibility Test score = 53%


Current system


EZKeys for Windows scanning via pneumatic
switch


Reason for referral


slow input method


alternate access for versatility and fatigue


Evaluation Tool


Dragon Naturally Speaking Standard


IBM Via Voice Gold


Evaluation Results


Unable to train system after extensive trial
period (4 weeks, daily)


System would not “perceive” John’s voice


Voice Recognition Status


Unable to utilize voice recognition


Trial with Dragon Dictate scheduled

Clinical Implications


Decreased intelligibility results in decreased
success with voice recognition


Intelligibility may NOT predict success
with voice recognition


Rate of speech may affect success with
voice recognition

Future Directions


New voice recognition programs require
minimal training


New programs that do not learn as they
are used are in development


New programs that utilize a standard set
of distinct “sounds” are in development