Speech Processing


AEGIS RET All-Hands Meeting

University of Central Florida

June 22, 2012

Applications of Images and Signals in High Schools

Contributors

Dr. Veton Këpuska, Faculty Mentor, FIT

Jacob Zurasky, Graduate Student Mentor, FIT

Becky Dowell, RET Teacher, BPS Titusville High

Motivation


Speech audio processing has become increasingly useful.


Applications


Siri on iPhone 4S

Automated telephone systems

Voice transcription (e.g., dictation software)

Hands-free computing (e.g., OnStar)

Video games (e.g., Xbox Kinect)


Military applications (e.g., aircraft control)


Healthcare applications


Motivation


Speech recognition requires speech to first be characterized by a set of “features”.

Features are used to determine what words are spoken.

Understanding how the features are computed is very important.

Our project will implement the feature extraction stage of a speech processing application.


Work Completed


MATLAB fundamentals

Introduction to Signal Processing and Filtering

Beginning Project Implementation

Speech Recognition

[Block diagram: Speech → Front End: Pre-processing → Features → Back End: Recognition → Recognized speech]

Speech: large amount of data. Ex: 256 samples

Features: reduced data size. Ex: 13 features

Front End: reduces the amount of data for the back end, but keeps enough data to accurately describe the signal. The output is a feature vector.

256 samples ------> 13 features

Back End: statistical models used to classify feature vectors as a certain sound in speech.

Discrete Time Signals


A computer is a discrete system with finite memory resources, so it requires a discrete representation of sound



Sound represented as a sequence of samples


time vs. amplitude


Amplitude = volume








Discrete Time Signals


Sampling rate (# of samples per second)

8 kHz - telephone

44.1 kHz - CD audio

96 kHz - DVD audio
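
A minimal MATLAB sketch of what a sampling rate means in practice: one second of a tone sampled at the 8 kHz telephone rate becomes a sequence of 8000 amplitude values. The 440 Hz tone and the one-second duration are assumptions chosen for illustration, not values from the project.

```matlab
% Sketch: sample a sine tone at the telephone rate of 8 kHz.
fs = 8000;               % sampling rate in samples per second
f0 = 440;                % tone frequency in Hz (assumed for illustration)
duration = 1;            % length of the signal in seconds (assumed)

n = 0:(fs*duration - 1); % sample numbers
t = n / fs;              % time of each sample in seconds
x = sin(2*pi*f0*t);      % amplitude of each sample

fprintf('%d samples represent %g second(s) of sound\n', numel(x), duration);
```

Changing fs to 44100 or 96000 gives proportionally more samples for the same second of sound.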


Frequency Domain


We need to analyze signals over frequency rather than time.

Sound is composed of many frequencies at the same time.

Frequency determines the pitch of the sound.

To recognize a sound, we need to know the frequencies that make up the sound.

Fast Fourier Transform (FFT)


Algorithm used to transform a signal from the time domain to the frequency domain.










MATLAB function: fft(X, N)

X - discrete time signal

N - FFT size

$$X_k = \sum_{n=0}^{N-1} x[n] \, e^{-i 2 \pi k n / N}, \qquad k = 0, \ldots, N-1$$

X_k - frequency spectrum

k - frequency bin

N - FFT size

n - sample number

x[n] - input signal
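
To make the transform concrete, the sum that the FFT evaluates can be written out as a loop and compared with MATLAB's built-in fft. This is a sketch for illustration only: the test signal (one cosine cycle) and the small size N = 8 are assumptions, and the loop is far slower than fft for realistic sizes.

```matlab
% Sketch: evaluate the DFT sum directly and compare it with fft.
N = 8;                          % FFT size (small, for illustration)
n = 0:N-1;                      % sample numbers
x = cos(2*pi*n/N);              % input signal: one cosine cycle over N samples

X = zeros(1, N);                % frequency spectrum
for k = 0:N-1                   % frequency bins
    X(k+1) = sum(x .* exp(-1i*2*pi*k*n/N));   % MATLAB indexes from 1
end

max_difference = max(abs(X - fft(x, N)));     % should be near machine precision
disp(abs(X));                   % magnitude per bin: energy in bins k = 1 and k = N-1
```

A real cosine shows up in two bins (k = 1 and k = N-1) because it is the sum of two complex exponentials rotating in opposite directions.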

Sine Wave Example

MATLAB function sine_sound

Generate 3 sine waves and a composite signal

Play sound and plot graphs

Compute and plot FFT of composite signal

Sine Wave Example


% plays a C major chord (C4, E4, G4)

sine_sound(8000, 261.626, 329.628, 391.995, 1, 4096);
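
The source of sine_sound is not shown in the slides, so the following is a hypothetical sketch of the steps it describes, not the original function. It assumes the arguments of the call above mean sampling rate, three frequencies, duration in seconds, and FFT size. Playback and plotting are left as comments so the script also runs without audio or a display.

```matlab
% Hypothetical sketch of the steps behind sine_sound (not the original code).
fs = 8000;                           % sampling rate in Hz
freqs = [261.626 329.628 391.995];   % C4, E4, G4 in Hz
duration = 1;                        % seconds (assumed meaning of the 5th argument)
N = 4096;                            % FFT size (assumed meaning of the 6th argument)

t = (0:fs*duration - 1) / fs;        % sample times
waves = sin(2*pi*freqs' * t);        % one sine wave per row (3 x numel(t))
composite = sum(waves, 1) / 3;       % average keeps the amplitude within [-1, 1]

% sound(composite, fs);              % uncomment to play the chord

X = fft(composite, N);               % uses the first N samples of the composite
f = (0:N/2 - 1) * fs / N;            % frequency in Hz of each bin up to fs/2
magnitude = abs(X(1:N/2));

% plot(f, magnitude); xlabel('Frequency (Hz)'); ylabel('Magnitude');

% Local maxima above half the largest magnitude mark the three notes.
inner = magnitude(2:end-1);
is_peak = inner > magnitude(1:end-2) & inner > magnitude(3:end) ...
          & inner > 0.5*max(magnitude);
peak_freqs = f(find(is_peak) + 1);
disp(peak_freqs);
```

The peaks land on the bins nearest to each note, so they match the input frequencies only to within the bin spacing fs/N (about 2 Hz here).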

Front-End Processing of Speech Recognizer

Pre-emphasis → Window → FFT → Mel-Scale → log → IFFT
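
The stages above can be sketched end to end for a single frame, turning 256 samples into 13 features. This is an illustrative sketch under stated assumptions, not the project's implementation: the 8 kHz rate, the 0.97 pre-emphasis coefficient, the Hamming window, the 26 triangular mel filters, the mel formula, and the two-tone test frame are all assumed. The last step follows the slide and takes the real part of an inverse FFT; many practical front ends use a discrete cosine transform at this stage instead.

```matlab
% Sketch of the front end for one frame: 256 samples in, 13 features out.
fs = 8000;  N = 256;  num_filters = 26;  num_features = 13;  % fs, num_filters assumed
n = 0:N-1;
frame = sin(2*pi*500*n/fs) + 0.5*sin(2*pi*1500*n/fs);   % stand-in for a speech frame

% 1. Pre-emphasis: first-order FIR high-pass filter (coefficient assumed).
y = filter([1 -0.97], 1, frame);

% 2. Window: Hamming window written out so no toolbox is needed.
y = y .* (0.54 - 0.46*cos(2*pi*n/(N-1)));

% 3. FFT: power spectrum for bins 0 .. N/2.
P = abs(fft(y, N)).^2;
P = P(1:N/2+1);

% 4. Mel scale: triangular filters spaced evenly in mel.
hz2mel = @(f) 2595*log10(1 + f/700);
mel2hz = @(m) 700*(10.^(m/2595) - 1);
edges_hz = mel2hz(linspace(0, hz2mel(fs/2), num_filters + 2));  % filter edges in Hz
bin_hz = (0:N/2) * fs / N;                                       % frequency of each bin
E = zeros(1, num_filters);
for j = 1:num_filters
    lo = edges_hz(j);  mid = edges_hz(j+1);  hi = edges_hz(j+2);
    rising  = (bin_hz - lo) / (mid - lo);
    falling = (hi - bin_hz) / (hi - mid);
    weights = max(0, min(rising, falling));    % triangle peaking at mid
    E(j) = sum(weights .* P);                  % energy in mel band j
end

% 5. log: compress the dynamic range (eps avoids log(0)).
L = log(E + eps);

% 6. IFFT: transform the log energies and keep the first 13 coefficients.
c = real(ifft(L));
features = c(1:num_features);
disp(size(features));
```

The data reduction happens in steps 4 and 6: 129 spectrum bins collapse into 26 band energies, and only 13 coefficients of their transform are kept.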

Work Completed

Project Implementation

Pre-emphasis

Windowing

FFT

Pre-Emphasis

1st order FIR filter

In human speech, higher frequencies have less energy, so the front end needs to compensate for this high-frequency roll-off.

High-pass filter
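
A minimal sketch of a first-order FIR pre-emphasis filter, y[n] = x[n] - a*x[n-1], using MATLAB's filter function. The coefficient a = 0.97 is an assumed, commonly used value and the two test tones are made up for illustration; the slides do not state which coefficient the project uses.

```matlab
% Sketch: first-order FIR pre-emphasis, y[n] = x[n] - a*x[n-1].
a = 0.97;                        % pre-emphasis coefficient (assumed typical value)
fs = 8000;                       % sampling rate in Hz (assumed)
t = (0:fs-1) / fs;               % one second of sample times

low  = sin(2*pi*100*t);          % 100 Hz test tone
high = sin(2*pi*3000*t);         % 3000 Hz test tone

low_out  = filter([1 -a], 1, low);    % numerator [1 -a], denominator 1 => FIR filter
high_out = filter([1 -a], 1, high);

% A high-pass filter attenuates the low tone and boosts the high tone.
fprintf('100 Hz gain:  %.2f\n', max(abs(low_out))  / max(abs(low)));
fprintf('3000 Hz gain: %.2f\n', max(abs(high_out)) / max(abs(high)));
```

The gain at each frequency follows from the filter's magnitude response, sqrt(1 + a^2 - 2*a*cos(2*pi*f/fs)), which grows from near zero at low frequencies to 1 + a at fs/2.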

Windowing


Separates the speech signal into frames

Smooths the edges of each frame of the speech signal
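
Both steps can be sketched in a few lines of MATLAB. The frame length of 256 matches the 256-sample example used earlier; the 50% overlap, the Hamming window, and the random stand-in signal are assumptions for illustration. The window is written out explicitly so the sketch does not depend on a toolbox.

```matlab
% Sketch: split a signal into overlapping frames and apply a Hamming window.
frame_len = 256;                     % samples per frame
hop = 128;                           % step between frame starts (assumed 50% overlap)
x = randn(1, 1000);                  % stand-in for a speech signal

% Hamming window: close to 1 in the middle, 0.08 at both edges.
n = 0:frame_len-1;
w = 0.54 - 0.46*cos(2*pi*n/(frame_len-1));

num_frames = floor((numel(x) - frame_len) / hop) + 1;
frames = zeros(num_frames, frame_len);
for m = 1:num_frames
    start = (m-1)*hop + 1;                           % first sample of frame m
    frames(m, :) = x(start:start+frame_len-1) .* w;  % taper the frame edges
end
disp(size(frames));
```

Tapering each frame toward its edges reduces the abrupt jumps at frame boundaries, which would otherwise spread energy across many FFT bins.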

Connections to High School Mathematics Curriculum

Florida Math Standard (NGSSS) MA.912.T.1.8: Solve real-world problems involving applications of trigonometric functions using graphing technology when appropriate.

Pre-Calculus course: related topics include graphs of trigonometric functions, unit circle, logarithmic scale, complex numbers in trig form

Timeline


Week 1



MATLAB fundamentals


MATLAB Filter Design & Analysis Tool


Introduction to Signal Processing, FFT, Filtering


Identified topics connected to high school math curriculum



Week 2


Continued tutorials on signal processing and filtering


Implementation of sample code for use in lesson plans


Implementation of Pre-emphasis, Windowing, FFT


Timeline


Week 3

Cepstral Transform

Implementation of Front-End Speech Processing

Week 4

Implementation of Front-End Speech Processing

Week 5

Implementation of Front-End Speech Processing

Work on deliverables.

Week 6

Work on deliverables.



Thank you!


Questions?