Lab. 4: Speech Recognition

Spring Semester 2008: Networking Techniques and Applications



Communication Networks Research (CNR) Lab.

EECS, KAIST

2008. 2. 25


Dan Keun Sung

dksung@ee.kaist.ac.kr

Introduction

Needs/Advantages
Applications
Classification of Speech Recognition Systems
Block Diagram of Speech Recognition System

Needs/Advantages of Voice (Speech) Recognition Systems

Increased productivity
  Inspection, financial trading workstations, training systems, inventory control, product demonstration systems
Rapid return on investment: reducing staffing costs
  Voice-activated dialing on telephone networks
Access to new markets
  24-hour and weekend telephone service for customers using rotary-dial telephones
Environmental control
  Remote control over personal environments for disabled people
User-friendliness
  The most natural and universal interface with machines
Differentiation in products

Applications

Command-and-control: verbal control of equipment or software programs
  Utilities in smart buildings and automobiles
  Consumer products: toys, personal digital assistants, VCRs
  Navigating software menus and screens
  Heavy equipment in dangerous environments
  Controlling a wheelchair
  Telephone applications: operator service and voice dialing

Applications (continued)

Data entry: verbal input of data into databases
  Completing a form
  Entering an order
Data access: information retrieval from online sources
  Banking by phone
  Accessing directory assistance

Applications (continued)

Dictation: creation of documents (listening typewriters)
  Dictating a letter
  Dictating a structured medical report
Requirements for a successful voice recognition application
  Actual benefit to user
  User friendliness
  Accuracy
  Real-time response

System Classifications

Mode of speech
  Discrete word
  Connected word
  Continuous speech
  Reading speech
Vocabulary size (S = number of words)
  Small: 10 ≤ S < 100
  Medium: 100 ≤ S < 1000
  Large: 1000 ≤ S < 10,000
  Very large: S ≥ 10,000
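
The vocabulary-size ranges above can be written as a small lookup. This is only a minimal sketch: the function name `vocabulary_class` is hypothetical, and because the slide does not say how to label vocabularies below 10 words, the sketch assumes they are left unclassified and returns `None`.

```python
from typing import Optional


def vocabulary_class(size: int) -> Optional[str]:
    """Map a vocabulary size S to the class listed on the slide.

    Sizes below 10 are not covered by the slide; returning None for
    them is an assumption of this sketch.
    """
    if size < 10:
        return None
    if size < 100:
        return "small"
    if size < 1000:
        return "medium"
    if size < 10_000:
        return "large"
    return "very large"
```

For example, a ten-digit dialing vocabulary (S = 10) falls in the small class, while a 20,000-word dictation vocabulary falls in the very large class.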

System Classifications (continued)

Speaker dependence
  Speaker dependent system
  Speaker independent system
  Speaker adaptable system
Acoustic ambiguity (confusability)
Environmental noise (adverse conditions)
  Mismatch between training and test environments

Block Diagram of a Speech Recognition System

Speech Recognition as Pattern Recognition

[Figure: pattern recognition approach. Speech -> Parameter Measurement -> Test Pattern -> Pattern Comparison (against Reference Patterns) -> Decision Rule -> Recognized Speech]
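
The blocks of the pattern recognition approach (parameter measurement, pattern comparison against reference patterns, decision rule) can be sketched in a few lines. The slide names the blocks but not the techniques inside them, so everything specific here is an assumption: the per-frame features (log energy and zero-crossing count), dynamic time warping as the comparison method, a nearest-reference decision rule, and all function names.

```python
import math
from typing import Dict, List, Sequence

Pattern = List[List[float]]  # one feature vector per frame


def measure_parameters(signal: Sequence[float], frame_len: int = 4) -> Pattern:
    """Parameter measurement: cut the signal into non-overlapping frames and
    compute [log energy, zero-crossing count] per frame (illustrative features)."""
    pattern = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame)
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        pattern.append([math.log(energy + 1e-9), float(crossings)])
    return pattern


def dtw_distance(test: Pattern, ref: Pattern) -> float:
    """Pattern comparison: dynamic time warping with Euclidean frame cost."""
    inf = float("inf")
    n, m = len(test), len(ref)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(test[i - 1], ref[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(signal: Sequence[float], references: Dict[str, Pattern]) -> str:
    """Decision rule: return the label of the closest reference pattern."""
    test = measure_parameters(signal)
    return min(references, key=lambda label: dtw_distance(test, references[label]))


# Synthetic stand-ins for recorded reference words (not real speech data).
references = {
    "low": measure_parameters([1.0, 1.0, -1.0, -1.0] * 4),
    "high": measure_parameters([1.0, -1.0] * 8),
}
result = recognize([0.9, 1.1, -1.0, -0.8] * 3, references)
```

Here `result` is `"low"`: the test signal is shorter than the reference and slightly distorted, but its frames have the same zero-crossing count as the "low" reference, and the time warping absorbs the length difference.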

Block Diagram of a Speech Recognition System (continued)

Acoustic phonetic based sentence recognition

[Figure: Speech input -> Acoustical analysis -> Phoneme recognition -> Word recognition -> Understanding -> Recognized output. Knowledge sources: Phoneme reference patterns, Word dictionary, Syntactic and semantic analysis]
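
The word recognition stage of this diagram takes the output of phoneme recognition and consults the word dictionary. One way to illustrate that step, assuming the phoneme recognizer may make errors, is to pick the dictionary word whose phoneme sequence is closest in edit distance. The dictionary contents, the phoneme labels, and the use of edit distance are all assumptions of this sketch, not something the slide specifies.

```python
from typing import Dict, List, Sequence

# Hypothetical word dictionary: word -> phoneme sequence (labels are illustrative).
WORD_DICTIONARY: Dict[str, List[str]] = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "dog": ["d", "ao", "g"],
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete a phoneme from a
                curr[j - 1] + 1,           # insert a phoneme into a
                prev[j - 1] + (pa != pb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def recognize_word(phonemes: Sequence[str],
                   dictionary: Dict[str, List[str]] = WORD_DICTIONARY) -> str:
    """Return the dictionary word closest to the recognized phoneme sequence."""
    return min(dictionary, key=lambda word: edit_distance(phonemes, dictionary[word]))
```

With this dictionary, the erroneous phoneme string `["d", "ao", "k"]` still maps to "dog" (one substitution away). Words such as "cat" and "cap" differ by a single phoneme, which is a small instance of the acoustic confusability mentioned under system classifications.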

Block Diagram of a Speech Recognition System

Acoustic phonetic based sentence recognition

[Figure: block diagram. Blocks: Signal Representation, Acoustic Segmentation, Feature Extraction & Phoneme Recognition, Lexical Expansion, Language Modeling, Decoder. Inputs: Signal, Lexicon, Higher-Level Linguistic Knowledge. Output: Decoded Utterance]
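
The Decoder in this diagram receives acoustic evidence together with lexical and language-model knowledge and outputs the decoded utterance. The slide does not say how the decoder combines them, so the following is only a sketch under stated assumptions: word hypotheses arrive with acoustic log scores, the language model is a bigram table of log probabilities, and the search is a Viterbi pass. The function name `decode` and all numbers are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def decode(candidates: List[Dict[str, float]],
           bigram: Dict[Tuple[str, str], float],
           unseen_logp: float = math.log(1e-4)) -> List[str]:
    """Viterbi search over word hypotheses.

    candidates[t] maps each word hypothesis at position t to its acoustic
    log score (the list must be non-empty). bigram[(prev, word)] is a log
    probability; pairs missing from the table get unseen_logp.
    """
    # best[word] = (total log score, best word sequence ending in word)
    best = {word: (score, [word]) for word, score in candidates[0].items()}
    for position in candidates[1:]:
        new_best = {}
        for word, acoustic in position.items():
            score, path = max(
                (prev_score + bigram.get((prev, word), unseen_logp) + acoustic, prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            new_best[word] = (score, path + [word])
        best = new_best
    return max(best.values())[1]


# Made-up scores: acoustically, "beach" is slightly preferred over "speech".
candidates = [
    {"recognize": math.log(0.5), "wreck": math.log(0.5)},
    {"speech": math.log(0.4), "beach": math.log(0.6)},
]
bigram = {
    ("recognize", "speech"): math.log(0.8),
    ("recognize", "beach"): math.log(0.1),
    ("wreck", "speech"): math.log(0.1),
    ("wreck", "beach"): math.log(0.2),
}
utterance = decode(candidates, bigram)
```

In this toy case `utterance` is `["recognize", "speech"]`: the language model outweighs the small acoustic preference for "beach", which illustrates why the diagram feeds higher-level linguistic knowledge into the decoder.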