Signal Processing Review - Electrical and Computer Engineering


Nov 24, 2013 (3 years and 11 months ago)


Topic 2


Signal Processing Review

(Some slides are adapted from Bryan Pardo’s course slides on Machine Perception of Music)

Recording Sound

Mechanical vibration -> pressure waves -> transducer (motion -> voltage) -> voltage over time

ECE 492 - Computer Audition and Its Applications in Music, Zhiyao Duan 2013

Microphones

http://www.mediacollege.com/audio/microphones/how-microphones-work.html


Pure Tone = Sine Wave

$x(t) = A \sin(2\pi f t + \varphi)$

where $t$ is time, $A$ is the amplitude, $f$ is the frequency, and $\varphi$ is the initial phase.

[Figure: 440 Hz sine wave, amplitude (-1 to 1) vs. time (0 to 6 ms), with one period T marked.]


Reminders

• Frequency, $f = 1/T$, is measured in cycles per second, a.k.a. Hertz (Hz).

• One cycle contains $2\pi$ radians.

• Angular frequency, $\Omega$, is measured in radians per second and is related to frequency by $\Omega = 2\pi f$.

• So we can rewrite the sine wave as

$x(t) = A \sin(\Omega t + \varphi)$
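The relations above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sine_wave` and the 44100 Hz sample rate are assumptions made for the example, not part of the slides.

```python
import math

def sine_wave(amplitude, freq_hz, phase, sample_rate, num_samples):
    """Sample x(t) = A sin(Omega t + phi) at t = n / sample_rate."""
    omega = 2 * math.pi * freq_hz  # angular frequency in radians per second
    return [amplitude * math.sin(omega * n / sample_rate + phase)
            for n in range(num_samples)]

freq_hz = 440.0
period_ms = 1000.0 / freq_hz  # T = 1/f, expressed in milliseconds
samples = sine_wave(1.0, freq_hz, 0.0, 44100, 5)  # 44100 Hz is an assumed rate
print(round(period_ms, 3), [round(s, 4) for s in samples])
```

For the 440 Hz tone the period comes out a little under 2.3 ms, which matches the spacing of the cycles in the plot on the sine wave slide.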


Fourier Transform

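[Figure: 440 Hz sine wave, amplitude vs. time (ms).]

$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$

[Figure: magnitude spectrum $|X(f)|$ vs. frequency (Hz), with the axis marked at -440, 0, and 440.]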


We can also write

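[Figure: 440 Hz sine wave, amplitude vs. time (ms).]

$X(\Omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j \Omega t}\, dt$

[Figure: magnitude spectrum $|X(\Omega)|$ vs. angular frequency (radians per second), with the axis marked at $-440 \times 2\pi$, 0, and $440 \times 2\pi$.]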


Complex Tone = Sine Waves

[Figure: three sine waves at 220 Hz, 660 Hz, and 1100 Hz (amplitude vs. time) are added to form one complex tone: 220 Hz + 660 Hz + 1100 Hz = complex tone.]


Frequency Domain

[Figure: the complex tone, amplitude vs. time (ms).]

$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$

[Figure: magnitude spectrum $|X(f)|$ vs. frequency (Hz), with the axis marked at 220, 660, and 1100.]


Harmonic Sound

• 1 or more sine waves

• Strong components at integer multiples of a fundamental frequency (F0) in the range of human hearing (20 Hz to 20,000 Hz)

• Examples

– 220 + 660 + 1100 is harmonic

– 220 + 375 + 770 is not harmonic




Noise

• Lots of sines at random frequencies = NOISE

• Example: 100 sines with random frequencies, such that $100 < f < 10000$.

[Figure: waveform of the resulting noise signal.]

How strong is the signal?

• Instantaneous value?

• Average value?

• Something else?

[Figure: two example waveforms of $x(t)$, the sine wave and the noise signal.]


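Acoustical or Electrical

• Acoustical: view $x(t)$ as sound pressure.

Average intensity: $I = \frac{1}{\rho c} \cdot \frac{1}{T_D} \int_0^{T_D} x^2(t)\, dt$, where $\rho$ is the density of the medium and $c$ is the sound speed.

• Electrical: view $x(t)$ as electric voltage.

Average power: $P = \frac{1}{R} \cdot \frac{1}{T_D} \int_0^{T_D} x^2(t)\, dt$, where $R$ is the resistance.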


Root-Mean-Square (RMS)

$x_{RMS} = \sqrt{\frac{1}{T_D} \int_0^{T_D} x^2(t)\, dt}$

• $T_D$ should be long enough.

• $x(t)$ should have 0 mean, otherwise the DC component will be integrated.

• For sinusoids

$x_{RMS} = \sqrt{\frac{1}{T} \int_0^{T} A^2 \sin^2(2\pi f t)\, dt} = \sqrt{A^2/2} \approx 0.707A$
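A quick numerical check of the sinusoid result, as a sketch only. The helper name `rms`, the amplitude of 2, and the choice of 1000 samples per period are assumptions for illustration. The second computation shows how a DC offset inflates the RMS value, which is the reason for the zero-mean requirement.

```python
import math

def rms(samples):
    """Discrete approximation of the RMS integral over the given samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

amplitude = 2.0
num_samples = 1000  # the list below covers exactly one period
one_period = [amplitude * math.sin(2 * math.pi * n / num_samples)
              for n in range(num_samples)]
ratio = rms(one_period) / amplitude  # close to 1 / sqrt(2), about 0.707

# Adding a DC offset of 1 changes the result to sqrt(A^2 / 2 + 1).
with_dc = rms([s + 1.0 for s in one_period])
print(round(ratio, 4), round(with_dc, 4))
```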


Sound Pressure Level (SPL)

• Softest audible sound intensity: 0.000000000001 watt/m² ($10^{-12}$ watt/m²)

• Threshold of pain is around 1 watt/m²

• 12 orders of magnitude difference

• A log scale helps with this

• The decibel (dB) scale is a log scale, with respect to a reference value


The Decibel

• A logarithmic measurement that expresses the magnitude of a physical quantity (e.g. power or intensity) relative to a specified reference level.

• Since it expresses a ratio of two (same unit) quantities, it is dimensionless.

$L - L_{ref} = 10 \log_{10} \frac{I}{I_{ref}} = 20 \log_{10} \frac{x_{RMS}}{x_{RMS,ref}}$
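The two forms of the decibel formula can be sketched as follows. The function names are hypothetical; the reference intensity of $10^{-12}$ watt/m² is the softest audible intensity quoted on the SPL slide. The factor of 20 for amplitudes follows from intensity being proportional to the square of the RMS amplitude.

```python
import math

def db_from_intensity(intensity, reference):
    """Level difference in dB for power-like quantities (intensity, power)."""
    return 10 * math.log10(intensity / reference)

def db_from_amplitude(rms_value, reference):
    """Level difference in dB for amplitude-like quantities (pressure, voltage)."""
    return 20 * math.log10(rms_value / reference)

I_REF = 1e-12  # watt/m^2, softest audible intensity from the SPL slide
span_db = db_from_intensity(1.0, I_REF)       # 12 orders of magnitude
doubling_db = db_from_amplitude(2.0, 1.0)     # doubling the RMS amplitude
print(round(span_db, 2), round(doubling_db, 2))
```

Twelve orders of magnitude in intensity correspond to about 120 dB, and doubling the amplitude adds about 6 dB.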


Lots of references!

• dB SPL – A measure of sound pressure level. 0 dB SPL is approximately the quietest sound a human can hear, roughly the sound of a mosquito flying 3 meters away.

• dBFS – relative to digital full-scale. 0 dBFS is the maximum allowable signal. Values typically negative.

• dBV – relative to 1 Volt RMS. 0 dBV = 1 V.

• dBu – relative to 0.775 Volts RMS with an unloaded, open circuit.

• dBmV – relative to 1 millivolt across 75 Ω. Widely used in cable television networks.

• ……


Typical Values

• Jet engine at 3m: 140 dB SPL

• Pain threshold: 130 dB SPL

• Loud motorcycle, 5m: 110 dB SPL

• Vacuum cleaner: 80 dB SPL

• Quiet restaurant: 50 dB SPL

• Rustling leaves: 20 dB SPL

• Human breathing, 3m: 10 dB SPL

• Hearing threshold: 0 dB SPL


Digital Sampling

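[Figure: a waveform sampled in time and quantized in amplitude, amplitude (levels -2 to 3) vs. time. The sample interval and the quantization increment are marked, each sample is coded as a 3-bit word (000, 001, 010, 011, 100, 101), and the reconstruction is shown.]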


More quantization levels = more dynamic range

[Figure: the same kind of plot with finer quantization, amplitude (levels -4 to 6) vs. time. The sample interval and the quantization increment are marked, and each sample is coded as a 4-bit word (0000 to 1011).]


Bit Depth and Dynamics

• More bits = more quantization levels = better sound

• Compact Disc: 16 bits = 65,536 levels

• POTS (plain old telephone service): 8 bits = 256 levels

• Signal-to-quantization-noise ratio (SQNR), if the signal is uniformly distributed in the whole range, for a bit depth of $N$ bits:

$\mathrm{SQNR} = 20 \log_{10} 2^N \approx 6.02N$ dB

– E.g. 16 bits depth gives about 96 dB SQNR.
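The figures on this slide can be reproduced directly from the formula. The function name `sqnr_db` is hypothetical and the snippet is only a sketch of the arithmetic.

```python
import math

def sqnr_db(bits):
    """SQNR in dB for a full-range, uniformly distributed signal: 20 log10(2^N)."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16):
    # prints the bit depth, the number of quantization levels, and the SQNR
    print(bits, 2 ** bits, round(sqnr_db(bits), 2))
```

Each extra bit adds about 6.02 dB, so 8 bits gives roughly 48 dB and 16 bits roughly 96 dB.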



RMS

$x_{RMS} = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x^2[n]}$

[Figure: sampled sine wave, amplitude vs. time. The red dots form the discrete signal $x[n]$.]


Aliasing and Nyquist

[Figure: a waveform sampled at a fixed sample interval, amplitude (levels -4 to 6) vs. time.]


Nyquist-Shannon Sampling Theorem

• You can’t reproduce the signal if your sample rate isn’t faster than twice the highest frequency in the signal.

• Nyquist rate: twice the highest frequency in the signal.

– A property of the continuous-time signal.

• Nyquist frequency: half of the sampling rate.

– A property of the discrete-time system.



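Discrete-Time Fourier Transform (DTFT)

$X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j \omega n}$

[Figure: sampled waveform, amplitude vs. time. The red dots form the discrete signal $x[n]$, where $n = 0, \pm 1, \pm 2, \ldots$]

[Figure: magnitude spectrum $|X(\omega)|$ vs. angular frequency $\omega$, with the axis marked at $-2\pi$, $-\pi$, 0, $\pi$, and $2\pi$.]

• $X(\omega)$ is periodic with period $2\pi$. We often only show $[-\pi, \pi]$.

• $\omega$ is a continuous variable.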


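Relation between FT and DTFT

Sampling: $x[n] = x_c(nT)$

FT: $X_c(\Omega) = \int_{-\infty}^{\infty} x_c(t)\, e^{-j \Omega t}\, dt$

DTFT: $X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j \omega n}$

$X(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} X_c\left(\frac{\omega}{T} + \frac{2\pi k}{T}\right)$

• Scaling: $\omega = \Omega T$, i.e. $\omega = 2\pi$ corresponds to $\Omega = 2\pi / T = 2\pi f_s$, which corresponds to $f = f_s$.

• Repetition: $X(\omega)$ contains infinite copies of $X_c$, spaced by $2\pi$.

[Figure: continuous waveform and its samples, amplitude vs. time (ms).]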


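Aliasing

Complex tone: 900 Hz + 1800 Hz

[Figure: continuous-time spectrum $|X_c(\Omega)|$ vs. $\Omega$, with the axis marked at $\pm 1800\pi$ and $\pm 3600\pi$.]

Sampling rate = 8000 Hz

[Figure: $|X(\omega)|$ vs. $\omega$ over $[-2\pi, 2\pi]$. The highest component sits at $\omega = 3600\pi / 8000$, inside $[-\pi, \pi]$, so there is no aliasing.]

Sampling rate = 2000 Hz

[Figure: $|X(\omega)|$ vs. $\omega$ over $[-2\pi, 2\pi]$. The components sit at $\omega = 1800\pi / 2000$ and $\omega = 3600\pi / 2000$. The second lies beyond $\pi$, so the 1800 Hz component aliases to 200 Hz.]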
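The 200 Hz alias in this example can be reproduced with a short sketch. The helper `alias_frequency` is a hypothetical name, and folding around multiples of the sampling rate is the standard rule for real sinusoids rather than something stated on the slide. Cosines are used so that the sampled values match exactly, without a sign flip.

```python
import math

def alias_frequency(freq_hz, sample_rate):
    """Apparent frequency, within [0, sample_rate / 2], of a real sinusoid
    at freq_hz after sampling at sample_rate."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

for fs in (8000, 2000):
    print(fs, [alias_frequency(f, fs) for f in (900, 1800)])

# At fs = 2000 Hz, samples of an 1800 Hz cosine equal samples of a 200 Hz cosine.
fs = 2000
high = [math.cos(2 * math.pi * 1800 * n / fs) for n in range(20)]
low = [math.cos(2 * math.pi * 200 * n / fs) for n in range(20)]
max_diff = max(abs(a - b) for a, b in zip(high, low))
```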


Fourier Series

• FT and DTFT do not require the signal to be periodic, i.e. the signal may contain arbitrary frequencies, which is why the frequency domain is continuous.

• Now, if the signal is periodic:

$x(t + mT) = x(t) \quad \forall m \in \mathbb{Z}$

• It can be reproduced by a series of sine and cosine functions:

$x(t) = a_0 + \sum_{n=1}^{\infty} \left[ a_n \cos(\Omega_n t) + b_n \sin(\Omega_n t) \right]$, with $\Omega_n = 2\pi n / T$

• In other words, the frequency domain is discrete.


Discrete Fourier Transform (DFT)

• FT and DTFT are great, but the infinite integral or summations are hard to deal with.

• In digital computers, everything is discrete, including both the signal and its spectrum.

$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$

where $k$ is the frequency domain index, $n$ is the time domain index, and $N$ is the length of the signal, i.e. the length of the DFT.


DFT and IDFT

DFT: $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$

IDFT: $x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N}$

• Both $x[n]$ and $X[k]$ are discrete and of length $N$.

• Treats $x[n]$ as if it were infinite and periodic.

• Treats $X[k]$ as if it were infinite and periodic.

• Only one period is involved in calculation.
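The two sums translate almost literally into code. The following is a direct $O(N^2)$ sketch with hypothetical function names, meant to make the definitions concrete rather than to be efficient. The 4-sample input is an arbitrary example.

```python
import cmath

def dft(x):
    """X[k] = sum over n of x[n] * exp(-j 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) * sum over k of X[k] * exp(+j 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0]   # arbitrary real-valued test signal
X = dft(x)
x_back = idft(X)            # recovers x up to rounding error
print([complex(round(v.real, 6), round(v.imag, 6)) for v in X])
```

Because the input is real, the result also shows the conjugate symmetry of the spectrum: `X[3]` is the complex conjugate of `X[1]`.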


Discrete Fourier Transform

• If the time-domain signal has no imaginary part (like an audio signal), then the frequency-domain signal is conjugate symmetric around N/2.

[Figure: the DFT maps the time-domain signal $x[n]$ (real and imaginary portions, indices 0 to N-1) to the frequency-domain signal $X[k]$ (real and imaginary portions, indices 0 to N-1, with N/2 marked), and the IDFT maps it back. Index 0 is DC and index N/2 corresponds to $f_s/2$.]

Kinds of Fourier Transforms

Fourier Transform

Signals: continuous, aperiodic

Spectrum: aperiodic, continuous

Fourier Series

Signals: continuous, periodic

Spectrum: aperiodic, discrete

Discrete Time Fourier Transform

Signals: discrete, aperiodic

Spectrum: periodic, continuous

Discrete Fourier Transform

Signals: discrete, periodic

Spectrum: periodic, discrete


The FFT

• Fast Fourier Transform

– A much, much faster way to do the DFT

– Introduced by Carl F. Gauss in 1805

– Rediscovered by J.W. Cooley and John Tukey in 1965

– The Cooley-Tukey algorithm is the one we use today (mostly)

– Big O notation for this is O(N log N)

– Matlab functions fft and ifft are standard.
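To illustrate where the O(N log N) cost comes from, here is a minimal recursive radix-2 sketch in the Cooley-Tukey style. It is a teaching example under the assumption that the input length is a power of two, not the implementation behind Matlab's `fft`; production libraries use more general and heavily optimized variants.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of the even-indexed samples
    odd = fft(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

X = fft([1.0, 2.0, 0.0, -1.0])
print([complex(round(v.real, 6), round(v.imag, 6)) for v in X])
```

Each level of recursion does O(N) work and there are log2(N) levels, which gives the O(N log N) total.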


Windowing

• A function that is zero-valued outside of some chosen interval.

– When a signal (data) is multiplied by a window function, the product is zero-valued outside the interval: all that is left is the "view" through the window.

[Figure: x[n] × w[n] = z[n]. Example: windowing x[n] with a rectangular window.]


Some famous windows

• Rectangular

$w[n] = 1$

• Triangular (Bartlett)

$w[n] = \frac{2}{N-1} \left( \frac{N-1}{2} - \left| n - \frac{N-1}{2} \right| \right)$

• Hann

$w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$

Note: we assume w[n] = 0 outside the range [0, N-1].

[Figure: the three windows, amplitude vs. sample.]


Why window shape matters

• Don’t forget that a DFT assumes the signal in the window is periodic.

• The boundary conditions mess things up… unless you manage to have a window whose length is exactly 1 period of your signal.

• Making the edges of the window less prominent helps suppress undesirable artifacts.


Fourier Transform of Windows

[Figure: spectrum of a window, amplitude (dB) vs. normalized angular frequency, with the main lobe and the sidelobes labeled.]

We want

- Narrow main lobe

- Low sidelobes


Which window is better?

Hann window: $w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$

Hamming window: $w[n] = 0.54 - 0.46 \cos \frac{2\pi n}{N-1}$

[Figure: spectra of the two windows, amplitude (dB) vs. normalized angular frequency.]
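The window formulas from the last few slides can be written out as below. This is a sketch with hypothetical function names; each function returns the $N$ values $w[0], \ldots, w[N-1]$ and assumes $N \ge 2$, since the formulas divide by $N - 1$.

```python
import math

def rectangular(N):
    return [1.0] * N

def bartlett(N):
    half = (N - 1) / 2
    return [(2 / (N - 1)) * (half - abs(n - half)) for n in range(N)]

def hann(N):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def hamming(N):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

for name, window in (("bartlett", bartlett), ("hann", hann), ("hamming", hamming)):
    print(name, [round(v, 3) for v in window(5)])
```

With these definitions the Hann window reaches zero at both ends, while the Hamming window stops at 0.08, which is one visible difference between the two shapes.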


Multiplication vs. Convolution

Time domain: $x[n] \cdot y[n]$ <-> Frequency domain: $\frac{1}{N} X[k] \ast Y[k]$

Time domain: $x[n] \ast y[n]$ <-> Frequency domain: $X[k] \cdot Y[k]$

• Windowing is multiplication in time domain, so the spectrum will be a convolution between the signal’s spectrum and the window’s spectrum.

• Convolution in time domain takes $O(N^2)$, but if we perform it in the frequency domain…

• FFT takes $O(N \log N)$

• Multiplication takes $O(N)$

• IFFT takes $O(N \log N)$
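The frequency-domain route can be sketched as follows. For brevity this example uses a naive $O(N^2)$ DFT with a hypothetical `inverse` flag; swapping in an FFT is what yields the $O(N \log N)$ cost. Zero padding both inputs to the full output length is an added step, needed so that the DFT's circular convolution equals ordinary linear convolution.

```python
import cmath

def dft(x, inverse=False):
    """Naive DFT; with inverse=True computes the IDFT (including the 1/N factor)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def convolve_direct(x, h):
    """Linear convolution straight from the definition."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def convolve_via_dft(x, h):
    """Transform, multiply bin by bin, transform back."""
    L = len(x) + len(h) - 1
    X = dft(list(x) + [0.0] * (L - len(x)))
    H = dft(list(h) + [0.0] * (L - len(h)))
    y = dft([a * b for a, b in zip(X, H)], inverse=True)
    return [v.real for v in y]

x = [1.0, 2.0, 3.0]
h = [1.0, -1.0]
direct = convolve_direct(x, h)
via_dft = convolve_via_dft(x, h)
print(direct, [round(v, 6) for v in via_dft])
```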



Windowed Signal

[Figure: two waveform panels, amplitude (-3 to 3) vs. sample index (0 to 400).]

Spectrum of Windowed Signal

[Figure: magnitude spectrum of the windowed signal, amplitude (dB) vs. frequency (0 to 5000 Hz).]

• Two sinusoids: 1000 Hz + 1500 Hz

• Sampling rate: 10 kHz

• Window length: 100 samples (i.e. 100/10K = 0.01 s)

• FFT length: 400 (i.e. 4 times zero padding)


Zero Padding

• Add zeros after (or before) the signal to make it longer

• Perform DFT on the padded signal

[Figure: amplitude (-3 to 3) vs. sample index (0 to 1600), showing the windowed signal followed by the padded zeros.]

Why Zero Padding?

• Zero padding in time domain gives the ideal interpolation in the frequency domain.

• It doesn’t increase (the real) frequency resolution!

– 4 times is generally enough

– Here the resolution is always fs/L = 100 Hz

[Figure: three spectra, amplitude (dB) vs. frequency (0 to 5000 Hz): no zero padding, 4 times zero padding, and 8 times zero padding.]
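A small sketch of why zero padding only interpolates: the DFT of the padded signal contains every bin of the unpadded DFT, plus extra samples of the same underlying spectrum in between. The 4-sample signal is an arbitrary stand-in, and the naive `dft` helper is used for brevity. The bin-spacing figures reuse the slide's numbers (fs = 10 kHz, window length L = 100, FFT length 400).

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 0.0, -1.0]
factor = 4                                   # "4 times zero padding"
padded = x + [0.0] * (len(x) * (factor - 1))

X = dft(x)
X_padded = dft(padded)

# Every 4th bin of the padded spectrum coincides with an original bin.
max_diff = max(abs(X_padded[factor * k] - X[k]) for k in range(len(x)))

fs, L, fft_len = 10000, 100, 400
resolution_hz = fs / L         # set by the window length, unchanged by padding
bin_spacing_hz = fs / fft_len  # finer sampling of the same spectrum
print(max_diff < 1e-9, resolution_hz, bin_spacing_hz)
```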


How to increase frequency resolution?

• Time-frequency resolution tradeoff

$\Delta t \cdot \Delta f = 1$, with $\Delta t$ in seconds and $\Delta f$ in Hz.

[Figure: three spectra, amplitude (dB) vs. frequency (0 to 5000 Hz), for window lengths of 10 ms, 20 ms, and 40 ms.]

Short-Time Fourier Transform

• Break signal into windows

• Calculate DFT of each window
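The two steps above can be sketched as follows. The function names, the Hann window, the frame length of 16, and the hop of 8 samples are all assumptions chosen for illustration. The test signal is a sinusoid placed exactly on DFT bin 4, so every frame should peak there.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def hann(N):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def stft(signal, frame_len, hop):
    """Break the signal into windowed frames and take the DFT of each frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append(dft(frame))
    return frames

def peak_bin(spectrum):
    """Index of the largest magnitude among bins 0 .. N/2."""
    half = len(spectrum) // 2 + 1
    return max(range(half), key=lambda k: abs(spectrum[k]))

frame_len, hop = 16, 8
signal = [math.sin(2 * math.pi * 4 * n / frame_len) for n in range(64)]
spectra = stft(signal, frame_len, hop)
peaks = [peak_bin(s) for s in spectra]
print(len(spectra), peaks)
```

Stacking the magnitudes of these per-frame spectra side by side, with time on one axis and frequency on the other, gives a spectrogram.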


The Spectrogram

• There is a “spectrogram” function in Matlab, but you can’t do zero padding using it.


A Fun Example

(Thanks to Robert Remez)


Overlap-Add Synthesis

• IDFT on each spectrum

– The complex, full spectrum

– Don’t forget the phase (often using the original phase).

– If you do it right, the time signal you get is real.

• Multiply with a synthesis window (e.g. Hamming)

– Not dividing by the analysis window

• Overlap and add different frames together.
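The steps above can be sketched end to end as below. Several choices here are assumptions rather than part of the slides: the same Hamming window is used for analysis and synthesis, the frame length is 8 with a hop of 4, and the overlap-added result is normalized by the accumulated squared window so that an unmodified spectrum reconstructs the input. All function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def hamming(N):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def analyze(signal, frame_len, hop):
    """STFT: window each frame and take its DFT (complex, full spectrum)."""
    window = hamming(frame_len)
    return [dft([signal[s + n] * window[n] for n in range(frame_len)])
            for s in range(0, len(signal) - frame_len + 1, hop)]

def overlap_add(spectra, frame_len, hop):
    """IDFT each spectrum, apply the synthesis window, then overlap and add."""
    window = hamming(frame_len)
    length = hop * (len(spectra) - 1) + frame_len
    out = [0.0] * length
    norm = [0.0] * length   # accumulated analysis * synthesis window (assumed step)
    for i, spectrum in enumerate(spectra):
        frame = dft(spectrum, inverse=True)
        for n in range(frame_len):
            out[i * hop + n] += frame[n].real * window[n]
            norm[i * hop + n] += window[n] ** 2
    return [o / w for o, w in zip(out, norm)]

signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(32)]
spectra = analyze(signal, frame_len=8, hop=4)
rebuilt = overlap_add(spectra, frame_len=8, hop=4)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(len(spectra), max_error < 1e-9)
```

The Hamming window never reaches zero, so the normalization term stays positive for every sample and no division by zero occurs.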



Shepard Tones

Continuous Risset scale

Barber’s pole


Shepard Tones

• Make a sound composed of sine waves spaced at octave intervals.

• Control their amplitudes by imposing a Gaussian (or something like it) filter in the (log) frequency dimension.

• Move all the sine waves up a musical ½ step.

• Wrap around in frequency.
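The recipe above can be sketched by computing the frequency and amplitude of each partial for a given step. Every parameter value here (8 octaves, a 27.5 Hz base, a Gaussian centred mid-range with a width of one sixth of the range) is an assumption picked for illustration, and `shepard_partials` is a hypothetical name. Synthesizing audio would then sum sine waves with these frequencies and amplitudes.

```python
import math

def shepard_partials(step, num_octaves=8, base_hz=27.5, steps_per_octave=12):
    """Return (frequency, amplitude) pairs for the Shepard tone at a given half step."""
    center = num_octaves / 2    # Gaussian centre, in octaves above base_hz
    sigma = num_octaves / 6     # Gaussian width, in octaves
    position = (step % steps_per_octave) / steps_per_octave  # wrap around
    partials = []
    for octave in range(num_octaves):
        log_pos = octave + position              # octaves above base_hz
        freq = base_hz * 2 ** log_pos            # partials sit an octave apart
        amp = math.exp(-0.5 * ((log_pos - center) / sigma) ** 2)
        partials.append((freq, amp))
    return partials

start = shepard_partials(0)
up_one = shepard_partials(1)       # all partials a half step higher
wrapped = shepard_partials(12)     # a full octave later: same as the start
print([round(f, 1) for f, _ in start])
```

Because the envelope is fixed in log frequency while the partials move under it, the tone after twelve half steps is identical to the starting tone, which is what produces the endlessly rising effect.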
