•
Professor Douglas Lyon
•
Lyon@docjava.com
•
Fairfield University
•
http://www.docjava.com
Voice and Signal Processing
Two Course Texts!
•
Java for Programmers
•
Available from:
–
http://www.docjava.com
Java Digital Signal Processing
•
Java for Programmers
•
Available from:
–
http://www.docjava.com
Grading
•
Midterm: 1/3
•
Homework: 1/3
•
Final: 1/3
•
Midterm and Final
–
Take home!
Email
•
Please send me an e-mail asking to be
placed on the CR310 list
•
E-mail: lyon@docjava.com
Pre-reqs
•
You should have CS232 and MA 172
•
OR permission of the instructor
•
You need a working knowledge of Java!
What do I need to learn this?
•
Basic multimedia programming
–
It helps implement interesting programs
–
It enables active learning
–
It requires a good background in Java
programming
Preliminary Java Topics
•
exceptions (ch11)
•
nested reference data types (ch 12)
•
threads (ch13)
Preliminary IO Topics
•
files (ch14)
•
streams (ch 15)
•
readers (ch 16)
•
writers (ch 17)
Preliminary GUI Topics
•
Swing (ch 18)
•
Events (ch 19)
What is Voice and Signal
Processing?
•
1D data processing
–
input sound
–
output sound
–
time-varying functions are used as both input
and output.
What is Digital Signal Processing?
•
A kind of data processing.
•
Typically numeric data processing
–
Look at the kind and dimension of the data.
–
1D in, 1D out -> DSP.
–
2D in, 2D out -> Image Processing
–
2D in, symbols out -> computer vision
–
3D in, 2D out -> computer graphics
What are some DSP examples?
•
If the input is images and the output is images we
call it image processing
•
If the input is images and the output is symbols we
call it pattern recognition or machine vision
•
If the input is text and the output is voice we call it
voice synthesis.
•
If the input is voice and the output is text we call it
voice recognition
•
If the input is images and geometry and the output is
images we call it image warping
What are some 1D DSP
applications?
•
Analysis
–
weak variables -> strong variables
•
Synthesis
–
strong variables -> weak variables
What are some kinds of 1D data?
•
Any form of energy that can be digitized.
•
Any source of data (a function in 1D).
–
Voice data
–
Sound data
–
Temperature data
–
Range, blood pressure, EEG (brain stuff), EKG
(heart stuff), weight, age…..
Non-physical phenomena and DSP
•
Anything that can produce a digital stream
of data is suitable for DSP
–
e.g., financial data,
–
statistical data,
–
network traffic, etc.
What is Audio?
•
Pressure wave that moves air.
•
Human auditory system (ear).
•
Audio is a sensation.
What is digitization?
A low-pass filter removes high frequencies
ADC samples the signal and quantizes it
Parallel to serial converter is a shift register
Sampling and Quantization
Quantization
•
One part of digitization
•
input v(t)
•
output Vq(t)
•
let N = the number of quantization levels.
•
Suppose minimum voltage is 0 V DC
•
Suppose max voltage is 1 V DC
•
What is the min quantization step?
Computing the quantization step
•
maximum voltage / total number of steps.
•
For example, a CD uses 16-bit audio samples.
–
N = 2**16 = 65536
–
Voltage of quantization = 1/65536 ≈ 0.000015 V
•
For AU files, N = 2**8 = 256
–
Voltage of quantization = 1/256 ≈ 0.0039 V
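The step-size arithmetic above can be sketched in Java. This is a minimal illustration, assuming the 0 to 1 V range from the slide; the class and method names (`QuantizerSketch`, `stepSize`, `quantize`) are hypothetical and not taken from the course texts.

```java
// Illustrative sketch; names are hypothetical, not from the course texts.
final class QuantizerSketch {
    /** Step size = (vMax - vMin) / N, where N = 2^bits quantization levels. */
    static double stepSize(double vMin, double vMax, int bits) {
        long levels = 1L << bits;
        return (vMax - vMin) / levels;
    }

    /** Maps a voltage to the index of its quantization level, clamped to 0..N-1. */
    static long quantize(double v, double vMin, double vMax, int bits) {
        long levels = 1L << bits;
        long index = (long) Math.floor((v - vMin) / stepSize(vMin, vMax, bits));
        return Math.max(0, Math.min(levels - 1, index));
    }

    public static void main(String[] args) {
        System.out.println("16-bit step: " + stepSize(0.0, 1.0, 16));
        System.out.println("8-bit step:  " + stepSize(0.0, 1.0, 8));
        System.out.println("0.5 V at 8 bits -> level " + quantize(0.5, 0.0, 1.0, 8));
    }
}
```

Clamping the index is a design choice of this sketch: it keeps an input at exactly the maximum voltage inside the top level rather than overflowing to level N.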
What is the noise relative to the
signal?
•
SNR = signal to noise ratio
•
SNR in bels = Log(signal power / noise power) to base 10.
•
The bel is named after Alexander Graham Bell.
•
One tenth of a bel is called the decibel (dB).
•
10 Log(65536 / (1/65536)) = 20 Log(65536) ≈ 96 dB
•
Usually about 6 dB per bit.
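The "about 6 dB per bit" rule can be checked numerically. The sketch below follows the slide's own calculation, 10 Log(N / step) with step = 1/N for a 0 to 1 V range; the class and method names are hypothetical, and this is a rough estimate rather than a full noise analysis.

```java
// Illustrative sketch; names are hypothetical.
final class SnrSketch {
    /** 10*log10(N / step) for a 0..1 V range, with N = 2^bits and step = 1/N. */
    static double snrDb(int bits) {
        double levels = Math.pow(2, bits);
        double step = 1.0 / levels;
        return 10.0 * Math.log10(levels / step);
    }

    public static void main(String[] args) {
        for (int bits : new int[] {8, 16}) {
            double db = snrDb(bits);
            System.out.println(bits + " bits: " + db + " dB, " + (db / bits) + " dB per bit");
        }
    }
}
```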
General Analysis for the ADC
The role of the low-pass filter
•
anti-aliasing filter
•
Nyquist frequency = sample freq / 2
•
only pass freqs below the Nyquist frequency
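To see why frequencies above the Nyquist frequency must be filtered out, here is a small sketch of where a pure tone lands after sampling. The folding formula is the standard one for a single sinusoid; the class and method names are hypothetical.

```java
// Illustrative sketch; class and method names are hypothetical.
final class NyquistSketch {
    /** Nyquist frequency = sample frequency / 2. */
    static double nyquist(double sampleRate) {
        return sampleRate / 2.0;
    }

    /**
     * Frequency at which a pure tone of frequency f appears after sampling,
     * found by folding f back into the band 0..Nyquist.
     */
    static double apparentFrequency(double f, double sampleRate) {
        double nearestMultiple = sampleRate * Math.rint(f / sampleRate);
        return Math.abs(f - nearestMultiple);
    }

    public static void main(String[] args) {
        double fs = 8000.0;
        System.out.println("Nyquist: " + nyquist(fs));
        System.out.println("3000 Hz appears at " + apparentFrequency(3000.0, fs));
        System.out.println("5000 Hz appears at " + apparentFrequency(5000.0, fs));
    }
}
```

At 8000 samples per second, a 5000 Hz tone is indistinguishable from a 3000 Hz tone once sampled, which is why the filter must come before the ADC.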
How do I reconstruct a signal?
[Figure: sample/reconstruction process. Labels: v(t), f_s, amplifier, low-pass filter, output, R]
Digitizing Voice: PCM
Waveform Encoding
•
Nyquist Theorem: sample at at least twice the
highest frequency
–
Voice frequency range: 300-3400 Hz
–
Sampling frequency = 8000 samples/sec (every 125 µs)
–
Bit rate: (2 x 4 kHz) x 8 bits per sample
–
= 64,000 bits per second (DS-0)
•
By far the most commonly used method
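The PCM arithmetic above is small enough to verify in code. A minimal sketch, with hypothetical class and method names:

```java
// Illustrative sketch; names are hypothetical.
final class PcmRateSketch {
    /** Nyquist: sample at (at least) twice the highest frequency. */
    static int minimumSampleRate(int highestFrequencyHz) {
        return 2 * highestFrequencyHz;
    }

    /** Bit rate = samples per second x bits per sample. */
    static long bitsPerSecond(int sampleRateHz, int bitsPerSample) {
        return (long) sampleRateHz * bitsPerSample;
    }

    /** Time between samples, in microseconds. */
    static double samplePeriodMicros(int sampleRateHz) {
        return 1_000_000.0 / sampleRateHz;
    }

    public static void main(String[] args) {
        int fs = minimumSampleRate(4000); // voice band rounded up to 4 kHz
        System.out.println("Sample rate: " + fs + " samples/sec");
        System.out.println("Sample period: " + samplePeriodMicros(fs) + " us");
        System.out.println("Bit rate: " + bitsPerSecond(fs, 8) + " bits/sec");
    }
}
```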
[Figure: CODEC producing PCM at 64 kbps = DS-0]
In 1D, DSP Is…
•
1D Digital signal processing is a kind of
data processing that operates on 1D PCM
data.
O-scope
Harmonics
•
The fundamental frequency of a sound is the
lowest-frequency component; it is often, but not
always, the component of strongest magnitude.
•
Few sounds are just sine waves.
•
The extra waves in a sound make up its
harmonic content, or timbre.
Harmonic formula
•
A harmonic is an integer multiple of the
fundamental frequency.
•
If 440 Hz is the 1st harmonic then
•
880 Hz is the 2nd harmonic
•
Individual sine waves are called partials.
Harmonic Motion
The frequency of the oscillations is given by
How do I model Spectra?
•
Suppose the continuous signal is v(t)
•
Let the Fourier coefficients be denoted $a_0, a_1, b_1, a_2, b_2, \ldots$:
$$v(t) = a_0 + (a_1 \cos t + b_1 \sin t) + (a_2 \cos 2t + b_2 \sin 2t) + \cdots$$
Sawtooth Wave Form
K=10
Model of a Saw Wave
$$f(x) = \frac{2}{\pi} \sum_{n=1}^{K} (-1)^{n+1} \frac{\sin(n x)}{n}$$
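The K-term saw wave model, f(x) = (2/pi) * sum over n = 1..K of (-1)^(n+1) sin(n x)/n, can be evaluated directly. The sketch below is an illustration with hypothetical names; on the interval -pi < x < pi the sum approaches x/pi as K grows.

```java
// Illustrative sketch; names are hypothetical.
final class SawWaveSketch {
    /** Partial Fourier sum of the saw wave using the first k harmonics. */
    static double saw(double x, int k) {
        double sum = 0.0;
        for (int n = 1; n <= k; n++) {
            double sign = (n % 2 == 1) ? 1.0 : -1.0; // (-1)^(n+1)
            sum += sign * Math.sin(n * x) / n;
        }
        return 2.0 / Math.PI * sum;
    }

    public static void main(String[] args) {
        for (int k : new int[] {10, 100, 10000}) {
            System.out.println("K=" + k + ": f(1.0) = " + saw(1.0, k));
        }
        System.out.println("x/pi at x=1.0:  " + (1.0 / Math.PI));
    }
}
```

Raising K sharpens the ramp, which is the difference between the K=10 and K=100 plots.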
Saw wave, K=100
Example: a 4 voice synthesizer
•
Design a program that can:
–
Play sound
–
Provide a GUI for determining the amplitudes
of up to 7 harmonics
–
Enable the user to alter the frequency for the
fundamental tone.
–
Enable the playing of 4 voices
–
Enable the control of the overall volume.
Building an Oscillator in software
•
// the period of the waveform, in seconds, is
•
lambda = 1 / frequency;
•
// the number of samples per period is
•
samplesPerCycle = sampleRate * lambda;
•
sampleRate = 8000; // samples/second
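The oscillator arithmetic above can be turned into a runnable sketch. This version fills an array with sine samples and does no audio playback; the class name, method names, and the choice of a sine waveform are assumptions made for illustration.

```java
// Illustrative sketch; names are hypothetical and no audio is played.
final class OscillatorSketch {
    static final double SAMPLE_RATE = 8000.0; // samples per second

    /** Number of samples in one period of the waveform. */
    static double samplesPerCycle(double frequency) {
        double lambda = 1.0 / frequency; // period in seconds
        return SAMPLE_RATE * lambda;
    }

    /** Generates numSamples samples of a unit-amplitude sine wave. */
    static double[] sineWave(double frequency, int numSamples) {
        double[] out = new double[numSamples];
        for (int i = 0; i < numSamples; i++) {
            out[i] = Math.sin(2.0 * Math.PI * frequency * i / SAMPLE_RATE);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("Samples per cycle at 440 Hz: " + samplesPerCycle(440.0));
        double[] tone = sineWave(1000.0, 8); // one full cycle at 1 kHz
        for (double sample : tone) {
            System.out.println(sample);
        }
    }
}
```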
Fourier transform
$$V(f) = F[v(t)] = \int_{-\infty}^{\infty} v(t)\, e^{-2\pi i f t}\, dt$$
$$v(t) = F^{-1}[V(f)] = \int_{-\infty}^{\infty} V(f)\, e^{2\pi i f t}\, df$$
How do you compute the Fourier
Coefficients?
•
Use the Fourier transform!
$$v(t) = a_0 + (a_1 \cos t + b_1 \sin t) + (a_2 \cos 2t + b_2 \sin 2t) + \cdots$$
$$V(f) = F[v(t)] = \int_{-\infty}^{\infty} v(t)\, e^{-2\pi i f t}\, dt$$
$$v(t) = F^{-1}[V(f)] = \int_{-\infty}^{\infty} V(f)\, e^{2\pi i f t}\, df$$
Recall Euler’s identity
•
Complex numbers have a real and
imaginary part:
$$e^{i\theta} = \cos\theta + i \sin\theta$$
Another way to express a function
$$v(t) = a_0 + (a_1 \cos t + b_1 \sin t) + (a_2 \cos 2t + b_2 \sin 2t) + \cdots$$
$f_0$ = frequency
$n f_0$ = nth harmonic of $f_0$
Sine-Cosine Representation
$$x(t) = \sum_{n=0}^{\infty} a_n \cos(2\pi n f_0 t) + \sum_{n=1}^{\infty} b_n \sin(2\pi n f_0 t)$$
$f_0$ = frequency
$n f_0$ = nth harmonic of $f_0$
Correlation
•
Fourier coefficients are found by
correlating the time-dependent function,
x(t), with an nth harmonic sine-cosine pair:
$$a_0 = \frac{1}{T} \int_0^T x(t)\, dt$$
$$a_n = \frac{2}{T} \int_0^T x(t) \cos(2\pi n f_0 t)\, dt$$
$$b_n = \frac{2}{T} \int_0^T x(t) \sin(2\pi n f_0 t)\, dt$$
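The correlation integrals can be approximated numerically. The sketch below assumes the input array holds M evenly spaced samples covering exactly one period T, so that sample i sits at t = iT/M and each integral becomes a sum. The class and method names, and the test signal, are hypothetical.

```java
// Illustrative sketch; names and the test signal are hypothetical.
final class FourierCoefficientSketch {
    /** a0 = (1/T) * integral of x(t) dt, approximated by the sample mean. */
    static double a0(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** an = (2/T) * integral of x(t) cos(2 pi n f0 t) dt, as a sum over one period. */
    static double an(double[] x, int n) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * Math.cos(2.0 * Math.PI * n * i / x.length);
        }
        return 2.0 * sum / x.length;
    }

    /** bn = (2/T) * integral of x(t) sin(2 pi n f0 t) dt, as a sum over one period. */
    static double bn(double[] x, int n) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * Math.sin(2.0 * Math.PI * n * i / x.length);
        }
        return 2.0 * sum / x.length;
    }

    /** One period of 0.5 + 3 cos(2 pi f0 t) + 2 sin(2 pi 2 f0 t), m samples. */
    static double[] testSignal(int m) {
        double[] x = new double[m];
        for (int i = 0; i < m; i++) {
            double phase = 2.0 * Math.PI * i / m;
            x[i] = 0.5 + 3.0 * Math.cos(phase) + 2.0 * Math.sin(2.0 * phase);
        }
        return x;
    }

    public static void main(String[] args) {
        double[] x = testSignal(64);
        System.out.println("a0 = " + a0(x));
        System.out.println("a1 = " + an(x, 1) + ", b1 = " + bn(x, 1));
        System.out.println("a2 = " + an(x, 2) + ", b2 = " + bn(x, 2));
    }
}
```

Correlating against each harmonic picks out only the matching component of the signal; the other harmonics average to zero over a full period.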
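Amplitude-phase representation
$$x(t) = c_0 + \sum_{n=1}^{\infty} c_n \cos(2\pi n f_0 t - \theta_n)$$
$$c_0 = \frac{1}{T} \int_0^T x(t)\, dt$$
$$c_n = \sqrt{a_n^2 + b_n^2}$$
$$\theta_n = \tan^{-1}\!\left(\frac{b_n}{a_n}\right)$$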
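Converting a sine-cosine pair to amplitude and phase is a two-line computation. The sketch below assumes the convention a cos(p) + b sin(p) = c cos(p - theta), and it uses `Math.atan2` in place of a plain inverse tangent so the phase lands in the correct quadrant when a is negative or zero. The class and method names are hypothetical.

```java
// Illustrative sketch; names are hypothetical.
final class AmplitudePhaseSketch {
    /** c_n = sqrt(a_n^2 + b_n^2). */
    static double amplitude(double an, double bn) {
        return Math.hypot(an, bn);
    }

    /** theta_n = atan(b_n / a_n), computed quadrant-aware with atan2. */
    static double phase(double an, double bn) {
        return Math.atan2(bn, an);
    }

    public static void main(String[] args) {
        double a = 3.0, b = 4.0, p = 0.7;
        double c = amplitude(a, b);
        double theta = phase(a, b);
        System.out.println("sine-cosine form:     " + (a * Math.cos(p) + b * Math.sin(p)));
        System.out.println("amplitude-phase form: " + (c * Math.cos(p - theta)));
    }
}
```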
Average Power
$$P = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} |x(t)|^2\, dt$$
Periodic signal avg power:
$$P = \frac{1}{T} \int_0^T |x(t)|^2\, dt$$
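For sampled data, the average power integral reduces to the mean of the squared samples. A minimal sketch, assuming the array covers exactly one period; the names are hypothetical.

```java
// Illustrative sketch; names are hypothetical.
final class AveragePowerSketch {
    /** Mean of the squared samples, a discrete stand-in for (1/T) * integral of x(t)^2 dt. */
    static double averagePower(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v * v;
        }
        return sum / x.length;
    }

    /** One period of a sine wave with the given amplitude. */
    static double[] sine(double amplitude, int samples) {
        double[] x = new double[samples];
        for (int i = 0; i < samples; i++) {
            x[i] = amplitude * Math.sin(2.0 * Math.PI * i / samples);
        }
        return x;
    }

    public static void main(String[] args) {
        // A sine of amplitude A has average power A^2 / 2.
        System.out.println(averagePower(sine(2.0, 1000)));
    }
}
```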
PSD (Power Spectral Density)
•
$S(f)$ is the power at a
specific frequency, $f$.
Linear combinations in the time
domain become linear combinations
in the frequency domain
$$a_1 V_1(f) + a_2 V_2(f) = F[a_1 v_1(t) + a_2 v_2(t)]$$
Delay in the time domain causes a
phase shift in the frequency domain
$$V(f)\, e^{-2\pi i f t_d} = F(v(t - t_d))$$
Scale change in the time domain
causes a reciprocal scale change in
the frequency domain
$$\frac{1}{|\alpha|} V\!\left(\frac{f}{\alpha}\right) = F(v(\alpha t)), \quad \alpha \neq 0$$
convolution theorem: multiplication
in the time domain causes
convolution in the frequency domain
$$V * W(f) = F(v(t)\, w(t))$$
Convolution between two functions
of the same variable is defined by
$$V * W(f) = \int_{-\infty}^{\infty} V(\lambda)\, W(f - \lambda)\, d\lambda$$
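The convolution integral has a direct discrete analogue in which the integral becomes a sum over array indices. The sketch below is an illustration for finite sequences, not the continuous definition itself; the class and method names are hypothetical.

```java
// Illustrative sketch; names are hypothetical.
final class ConvolutionSketch {
    /** Full linear convolution: out[k] = sum over j of v[j] * w[k - j]. */
    static double[] convolve(double[] v, double[] w) {
        double[] out = new double[v.length + w.length - 1];
        for (int j = 0; j < v.length; j++) {
            for (int m = 0; m < w.length; m++) {
                out[j + m] += v[j] * w[m];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] result = convolve(new double[] {1, 2, 3}, new double[] {0, 1, 0.5});
        System.out.println(java.util.Arrays.toString(result));
    }
}
```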
Various Codec Bandwidth Consumptions
Encoding/Compression     | Result Bit Rate
G.711 PCM A-Law/u-Law    | 64 kbps (DS0), the standard transmission rate for voice
G.726 ADPCM              | 16, 24, 32, 40 kbps
G.727 E-ADPCM            | Variable: 16, 24, 32, 40 kbps
G.729 CS-ACELP           | 8 kbps
G.728 LD-CELP            | 16 kbps
G.723.1 CELP             | 6.3/5.3 kbps
A means to improve SNR
•
Compression uses a coder and a decoder.
•
One CODEC is called U-Law.
•
U-Law runs at 8 kHz sampling and 8 bits per
digitized sample.
•
U-Law is meant for voice.
Voice grade audio - Application
•
voice over IP
•
Voice ranges up to about 3.4 kHz
•
Sample at 8 kHz, that should be plenty
•
Quantize to 8 bits of data (about 48 dB
SNR)
•
Improve the SNR with compression
Voice Quality of Service (QoS)
Requirements
Loss
Delay
Delay Variation (Jitter)
Avoiding The 3 Main QoS Challenges
The u-law codec
•
X is a number whose range is 0..255
•
Log, to the base 2 of X is a number whose
range is 0..8 (for X >= 1)
•
U-law uses a scale factor (mu) that
multiplies the input before log is taken.
•
Log (x), base 2 = Log(x)/Log(2)
•
Mu-law takes the log to the base 1+mu.
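The points above can be sketched with the standard continuous mu-law curve, y = sgn(x) * Log(1 + mu|x|) / Log(1 + mu), which is a log to the base 1+mu of the scaled input. This sketch assumes inputs normalized to -1..1 and mu = 255; it does not reproduce the 8-bit encoding used by real codecs, and the class and method names are hypothetical.

```java
// Illustrative sketch; names are hypothetical, and mu = 255 is an assumption.
final class MuLawSketch {
    static final double MU = 255.0;

    /** Compresses x in -1..1: log base (1 + mu) of (1 + mu*|x|), with the sign restored. */
    static double compress(double x) {
        return Math.signum(x) * Math.log(1.0 + MU * Math.abs(x)) / Math.log(1.0 + MU);
    }

    /** Inverse of compress: recovers x from the compressed value y. */
    static double expand(double y) {
        return Math.signum(y) * (Math.pow(1.0 + MU, Math.abs(y)) - 1.0) / MU;
    }

    public static void main(String[] args) {
        for (double x : new double[] {0.01, 0.1, 0.5, 1.0}) {
            double y = compress(x);
            System.out.println(x + " -> " + y + " -> " + expand(y));
        }
    }
}
```

Quiet inputs are spread over a much larger share of the output range than loud ones, which is how the compression improves SNR for low-level speech.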