Signal Processing Review - Electrical and Computer Engineering

24 Nov 2013

Topic 2


Signal Processing Review

(Some slides are adapted from Bryan Pardo’s course slides on Machine Perception of Music)

Recording Sound

Mechanical vibration -> pressure waves -> motion-to-voltage transducer -> voltage over time

ECE 492 - Computer Audition and Its Applications in Music, Zhiyao Duan 2013

Microphones

http://www.mediacollege.com/audio/microphones/how-microphones-work.html


Pure Tone = Sine Wave

$x(t) = A \sin(2\pi f t + \varphi)$

where $t$ is time, $A$ is the amplitude, $f$ is the frequency, and $\varphi$ is the initial phase.

[Figure: a 440 Hz sine wave, amplitude (-1 to 1) versus time (0 to 6 ms), with one period T marked.]

Reminders


Frequency, $f = 1/T$, is measured in cycles per second, a.k.a. Hertz (Hz).

One cycle contains $2\pi$ radians.

Angular frequency, $\Omega$, is measured in radians per second and is related to frequency by $\Omega = 2\pi f$.

So we can rewrite the sine wave as

$x(t) = A \sin(\Omega t + \varphi)$
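As a rough illustration of the pure-tone formula and the reminders, the sketch below samples a 440 Hz sine wave and computes its period. The function name `sine_wave` and the 44100 Hz sample rate are assumptions made for this example; they do not come from the slides.

```python
import math

def sine_wave(amplitude, freq_hz, phase_rad, sample_rate_hz, num_samples):
    """Sample x(t) = A sin(2*pi*f*t + phi) at t = n / sample_rate_hz."""
    omega = 2 * math.pi * freq_hz  # angular frequency in radians per second
    return [amplitude * math.sin(omega * n / sample_rate_hz + phase_rad)
            for n in range(num_samples)]

freq_hz = 440.0
period_ms = 1000.0 / freq_hz  # T = 1/f, roughly 2.27 ms

# Assumed sample rate of 44100 Hz, chosen only for illustration.
samples = sine_wave(1.0, freq_hz, 0.0, 44100, 100)
print(round(period_ms, 2), len(samples))
```

Changing `phase_rad` shifts the waveform in time without changing its period or amplitude.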

Fourier Transform

$X(f) = \int_{-\infty}^{\infty} x(t) e^{-j 2\pi f t} \, dt$

[Figure: a 440 Hz sine wave, amplitude versus time (ms), and its magnitude spectrum $|X(f)|$ versus frequency (Hz), with peaks at -440 Hz and 440 Hz.]

We can also write

$X(\Omega) = \int_{-\infty}^{\infty} x(t) e^{-j \Omega t} \, dt$

[Figure: a 440 Hz sine wave, amplitude versus time (ms), and its magnitude spectrum $|X(\Omega)|$ versus angular frequency (radians per second), with peaks at $-440 \times 2\pi$ and $440 \times 2\pi$.]

Complex Tone = Sine Waves

[Figure: three sine waves at 220 Hz, 660 Hz, and 1100 Hz, each with amplitude between -1 and 1, are added together (+, +, =) to give one complex tone with amplitude between -2.5 and 2.5.]

Frequency Domain

$X(f) = \int_{-\infty}^{\infty} x(t) e^{-j 2\pi f t} \, dt$

[Figure: the complex tone, amplitude versus time (0 to 100 ms), and its magnitude spectrum $|X(f)|$ versus frequency (Hz), with peaks at 220, 660, and 1100 Hz.]

Harmonic Sound


1 or more sine waves

Strong components at integer multiples of a fundamental frequency (F0) in the range of human hearing (20 Hz ~ 20,000 Hz)

Examples

220 + 660 + 1100 is harmonic

220 + 375 + 770 is not harmonic
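One possible way to check the two examples is to compute the largest frequency that divides every component and ask whether it is audible. This is only a sketch of one reading of the definition: it assumes integer frequencies in Hz, ignores how strong each component is, and the names `implied_fundamental` and `is_harmonic` are made up for this example.

```python
from functools import reduce
from math import gcd

def implied_fundamental(freqs_hz):
    """Largest integer frequency of which every component is a multiple."""
    return reduce(gcd, freqs_hz)

def is_harmonic(freqs_hz, lowest_audible_hz=20):
    """Call the set harmonic if its common fundamental is within hearing range."""
    return implied_fundamental(freqs_hz) >= lowest_audible_hz

print(implied_fundamental([220, 660, 1100]), is_harmonic([220, 660, 1100]))  # 220 True
print(implied_fundamental([220, 375, 770]), is_harmonic([220, 375, 770]))    # 5 False
```

For 220 + 375 + 770 the only common divisor is 5 Hz, which is below the 20 Hz lower limit of hearing, so no audible fundamental explains all three components.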

Noise


Lots of sines at random frequencies = NOISE

Example: 100 sines with random frequencies, such that $100 < f < 10000$.

[Figure: the resulting noise signal, amplitude between -30 and 30, horizontal axis from 0 to 3.5 x 10^4.]
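The example can be sketched as follows. The random phases, the 44100 Hz sample rate, the number of samples, and the fixed seed are all assumptions added for illustration; the slide only specifies 100 sines with frequencies between 100 and 10000.

```python
import math
import random

def random_sine_mixture(num_sines=100, f_low=100.0, f_high=10000.0,
                        sample_rate_hz=44100, num_samples=1024, seed=0):
    """Sum unit-amplitude sines with random frequencies and (assumed) random phases."""
    rng = random.Random(seed)
    components = [(rng.uniform(f_low, f_high), rng.uniform(0.0, 2 * math.pi))
                  for _ in range(num_sines)]
    signal = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        signal.append(sum(math.sin(2 * math.pi * f * t + phi)
                          for f, phi in components))
    return components, signal

components, signal = random_sine_mixture()
print(len(components), len(signal))
```

Because the components have no common fundamental, the sum has no repeating period and sounds like noise rather than a pitched tone.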

How strong is the signal?


Instantaneous value?


Average value?


Something else?


[Figure: two example signals $x(t)$: a sine wave (amplitude -1 to 1) and a noise signal (amplitude -30 to 30).]

Acoustical or Electrical


Acoustical: view $x(t)$ as sound pressure.

Average intensity: $I = \frac{1}{\rho c} \cdot \frac{1}{T_D} \int_0^{T_D} x^2(t) \, dt$, where $\rho$ is the density and $c$ is the sound speed.

Electrical: view $x(t)$ as electric voltage.

Average power: $P = \frac{1}{R} \cdot \frac{1}{T_D} \int_0^{T_D} x^2(t) \, dt$, where $R$ is the resistance.

Root-Mean-Square (RMS)

$x_{RMS} = \sqrt{\frac{1}{T_D} \int_0^{T_D} x^2(t) \, dt}$

$T_D$ should be long enough.

$x(t)$ should have 0 mean, otherwise the DC component will be integrated.

For sinusoids

$x_{RMS} = \sqrt{\frac{1}{T} \int_0^{T} A^2 \sin^2(2\pi f t) \, dt} = \sqrt{A^2 / 2} = 0.707 A$
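The sinusoid result can be worked out step by step. The derivation below assumes $x(t) = A \sin(2\pi f t)$ with zero initial phase and an integration time of exactly one period, $T = 1/f$; the half-angle identity is the only extra ingredient.

```latex
\begin{aligned}
x_{RMS}^2 &= \frac{1}{T} \int_0^{T} A^2 \sin^2(2\pi f t) \, dt \\
          &= \frac{A^2}{T} \int_0^{T} \frac{1 - \cos(4\pi f t)}{2} \, dt \\
          &= \frac{A^2}{T} \left( \frac{T}{2} - 0 \right) = \frac{A^2}{2}, \\
x_{RMS}   &= \frac{A}{\sqrt{2}} \approx 0.707 A .
\end{aligned}
```

The cosine term integrates to zero because the interval $[0, T]$ covers two full cycles of $\cos(4\pi f t)$.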

Sound Pressure Level (SPL)


Softest audible sound intensity: 0.000000000001 watt/m^2

Threshold of pain is around 1 watt/m^2

12 orders of magnitude difference

A log scale helps with this

The decibel (dB) scale is a log scale, with respect to a reference value

The Decibel


A logarithmic measurement that expresses the magnitude of a physical quantity (e.g. power or intensity) relative to a specified reference level.

Since it expresses a ratio of two (same unit) quantities, it is dimensionless.

$L - L_{ref} = 10 \log_{10} \frac{I}{I_{ref}} = 20 \log_{10} \frac{x_{RMS}}{x_{RMS,ref}}$
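A minimal sketch of the two forms of the decibel formula, applied to the intensities quoted on the Sound Pressure Level slide (1e-12 watt/m^2 for the softest audible sound, about 1 watt/m^2 for the threshold of pain). The function names are hypothetical.

```python
import math

def intensity_level_db(intensity, reference_intensity):
    """Level of a power-like quantity: 10 log10(I / I_ref)."""
    return 10 * math.log10(intensity / reference_intensity)

def amplitude_level_db(rms, reference_rms):
    """Level of an amplitude-like quantity: 20 log10(x_rms / x_rms_ref)."""
    return 20 * math.log10(rms / reference_rms)

hearing_threshold = 1e-12  # watt/m^2, from the slide
pain_threshold = 1.0       # watt/m^2, from the slide

print(round(intensity_level_db(pain_threshold, hearing_threshold), 6))
```

The 12 orders of magnitude between the two intensities become a span of about 120 dB. The factor 20 in the amplitude form arises because intensity is proportional to the square of the RMS amplitude.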

Lots of references!


dB SPL: a measure of sound pressure level. 0 dB SPL is approximately the quietest sound a human can hear, roughly the sound of a mosquito flying 3 meters away.

dBFS: relative to digital full scale. 0 dBFS is the maximum allowable signal. Values are typically negative.

dBV: relative to 1 Volt RMS. 0 dBV = 1 V.

dBu: relative to 0.775 Volts RMS with an unloaded, open circuit.

dBmV: relative to 1 millivolt across 75 Ω. Widely used in cable television networks.

……

Typical Values


Jet engine at 3m       140 dB SPL
Pain threshold         130 dB SPL
Loud motorcycle, 5m    110 dB SPL
Vacuum cleaner          80 dB SPL
Quiet restaurant        50 dB SPL
Rustling leaves         20 dB SPL
Human breathing, 3m     10 dB SPL
Hearing threshold        0 dB SPL

Digital Sampling

[Figure: a waveform plotted as amplitude (-2 to 3) versus time. It is sampled once per sample interval and each sample is rounded to the nearest quantization increment and stored as a 3-bit code (011, 010, 001, 101, 100, 000). The reconstruction is built from those codes.]

More quantization levels = more dynamic range

[Figure: a waveform plotted as amplitude (-4 to 6) versus time, with the sample interval and the quantization increment marked. Each sample is stored as a 4-bit code (0000 through 1011).]

Bit Depth and Dynamics


More bits = more quantization levels = better sound

Compact Disc: 16 bits = 65,536 levels

POTS (plain old telephone service): 8 bits = 256 levels

Signal-to-quantization-noise ratio (SQNR), if the signal is uniformly distributed in the whole range:

$SQNR = 20 \log_{10} 2^{Q} \approx 6.02 Q$ dB, where $Q$ is the number of bits.

E.g. 16 bits depth gives about 96 dB SQNR.
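The bit-depth arithmetic can be checked with a few lines. This is a small sketch using the SQNR formula from the slide; the function names are made up for the example.

```python
import math

def quantization_levels(bits):
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bits

def sqnr_db(bits):
    """SQNR = 20 log10(2^bits), roughly 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16):
    print(bits, quantization_levels(bits), round(sqnr_db(bits), 2))
# 8 256 48.16
# 16 65536 96.33
```

Each extra bit doubles the number of levels and adds about 6 dB of SQNR.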

RMS

$x_{RMS} = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x^2[n]}$

[Figure: a sine wave, amplitude versus time, sampled at the red dots. The red dots form the discrete signal $x[n]$.]
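A short sketch of the discrete RMS formula, applied to one full period of a sampled sine wave. The amplitude, the number of samples, and the added offset of 1 are arbitrary choices for illustration.

```python
import math

def rms(samples):
    """Root-mean-square of a discrete signal x[n]."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

num_samples = 1000
amplitude = 2.0
one_period = [amplitude * math.sin(2 * math.pi * n / num_samples)
              for n in range(num_samples)]

# Same sine with a DC offset of 1 added: the offset inflates the RMS.
with_dc = [v + 1.0 for v in one_period]

print(round(rms(one_period), 4), round(rms(with_dc), 4))
```

The first value is close to 0.707 times the amplitude. The second is larger because the DC component is included in the sum, which is why a zero-mean signal is preferred when measuring RMS.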

Aliasing and Nyquist

[Figure: a waveform plotted as amplitude (-4 to 6) versus time, with the sample interval marked.]


Nyquist-Shannon Sampling Theorem

You can’t reproduce the signal if your sample rate isn’t faster than twice the highest frequency in the signal.

Nyquist rate: twice the highest frequency in the signal. A property of the continuous-time signal.

Nyquist frequency: half of the sampling rate. A property of the discrete-time system.

Discrete-Time Fourier Transform (DTFT)

$X(\omega) = \sum_{n=-\infty}^{\infty} x[n] e^{-j \omega n}$

[Figure: a sine wave, amplitude versus time, sampled at the red dots, and its magnitude spectrum $|X(\omega)|$ versus angular frequency $\omega$.]

The red dots form the discrete signal $x[n]$, where $n = 0, \pm 1, \pm 2, \ldots$

$X(\omega)$ is periodic with period $2\pi$. We often only show the range from $-\pi$ to $\pi$.

$\omega$ is a continuous variable.

Relation between FT and DTFT


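Sampling: $x[n] = x_c(nT)$, where $T$ is the sample interval and $f_s = 1/T$ is the sampling rate.

FT: $X_c(\Omega) = \int_{-\infty}^{\infty} x_c(t) e^{-j \Omega t} \, dt$

DTFT: $X(\omega) = \sum_{n=-\infty}^{\infty} x[n] e^{-j \omega n}$

$X(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} X_c\left( \frac{\omega}{T} + \frac{2\pi k}{T} \right)$

Scaling: $\omega = \Omega T$, i.e. $\omega = 2\pi$ corresponds to $\Omega = 2\pi / T = 2\pi f_s$, which corresponds to $f = f_s$.

Repetition: $X(\omega)$ contains infinite copies of $X_c$, spaced by $2\pi$.

[Figure: a continuous-time signal and its samples, amplitude versus time (ms).]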

Aliasing

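Complex tone: 900 Hz + 1800 Hz

[Figure: $|X_c(\Omega)|$ versus $\Omega$, with peaks at $\pm 1800\pi$ and $\pm 3600\pi$.]

Sampling rate = 8000 Hz

[Figure: $|X(\omega)|$ versus $\omega$ from $-2\pi$ to $2\pi$. The highest component sits at $\omega = 3600\pi / 8000$, which is below $\pi$.]

Sampling rate = 2000 Hz

[Figure: $|X(\omega)|$ versus $\omega$ from $-2\pi$ to $2\pi$. The components sit at $\omega = 1800\pi / 2000$ and $\omega = 3600\pi / 2000$. The latter is above $\pi$, so the 1800 Hz component aliases to 200 Hz.]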
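The aliasing example can be reproduced with a small helper that folds a frequency into the range from 0 to half the sampling rate. This is a sketch for real sinusoids only; the function name `aliased_frequency` is made up for this example.

```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency in [0, fs/2] at which a real sinusoid appears after sampling."""
    folded = freq_hz % sample_rate_hz          # spectrum repeats every fs
    return min(folded, sample_rate_hz - folded)  # mirror around fs/2

for fs in (8000, 2000):
    print(fs, aliased_frequency(900, fs), aliased_frequency(1800, fs))
# 8000 900 1800
# 2000 900 200
```

At 8000 Hz both components are below the Nyquist frequency of 4000 Hz and are unchanged. At 2000 Hz the Nyquist frequency is 1000 Hz, so 1800 Hz folds down to 200 Hz while 900 Hz survives.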

Fourier Series


FT and DTFT do not require the signal to be periodic, i.e. the signal may contain arbitrary frequencies, which is why the frequency domain is continuous.

Now, if the signal is periodic:

$x(t + mT) = x(t), \quad \forall m \in \mathbb{Z}$

It can be reproduced by a series of sine and cosine functions:

$x(t) = a_0 + \sum_{n=1}^{\infty} \left[ a_n \cos(\Omega_n t) + b_n \sin(\Omega_n t) \right]$, where $\Omega_n = 2\pi n / T$.

In other words, the frequency domain is discrete.

Discrete Fourier Transform (DFT)


FT and DTFT are great, but the infinite integral or summations are hard to deal with.

In digital computers, everything is discrete, including both the signal and its spectrum.

$X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi k n / N}$

where $k$ is the frequency domain index, $n$ is the time domain index, and $N$ is the length of the signal, i.e. the length of the DFT.

DFT and IDFT


DFT: $X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi k n / N}$

IDFT: $x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{j 2\pi k n / N}$

Both $x[n]$ and $X[k]$ are discrete and of length $N$.

Treats $x[n]$ as if it were infinite and periodic.

Treats $X[k]$ as if it were infinite and periodic.

Only one period is involved in calculation.
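The DFT and IDFT sums translate almost directly into code. The sketch below is a naive O(N^2) implementation meant only to mirror the formulas; the four-sample test signal is an arbitrary choice.

```python
import cmath

def dft(x):
    """X[k] = sum over n of x[n] * exp(-j 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) * sum over k of X[k] * exp(+j 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0]
X = dft(x)
x_back = idft(X)  # recovers x up to floating-point rounding
print([round(v.real, 6) for v in x_back])
```

Applying `idft` after `dft` returns the original samples, with imaginary parts that are zero up to rounding error.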

Discrete Fourier Transform


If the time-domain signal has no imaginary part (like an audio signal) then the frequency-domain signal is conjugate symmetric around N/2.

[Figure: the DFT maps the time domain $x[n]$ (real portion and imaginary portion, indices 0 to N-1) to the frequency domain $X[k]$ (real portion and imaginary portion, indices 0 to N-1, with N/2 marked), and the IDFT maps it back. In the frequency domain, index 0 is DC and index N/2 corresponds to $f_s/2$.]
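The conjugate symmetry of a real signal's DFT can be checked numerically. The sketch below uses a naive DFT and an arbitrary six-sample real signal; it tests the property $X[N-k] = X^*[k]$, which is what symmetry around N/2 means.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5]  # real-valued, arbitrary
N = len(x)
X = dft(x)

# Bin N-k should be the complex conjugate of bin k.
symmetric = all(abs(X[k] - X[N - k].conjugate()) < 1e-9 for k in range(1, N))
print(symmetric)  # True
```

As a consequence, the bins at DC (k = 0) and at N/2 are purely real, and only bins 0 through N/2 carry independent information.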

Kinds of Fourier Transforms

Transform                        Signals                  Spectrum
Fourier Transform                continuous, aperiodic    aperiodic, continuous
Fourier Series                   continuous, periodic     aperiodic, discrete
Discrete-Time Fourier Transform  discrete, aperiodic      periodic, continuous
Discrete Fourier Transform       discrete, periodic       periodic, discrete

The FFT


Fast Fourier Transform

A much, much faster way to do the DFT

Introduced by Carl F. Gauss in 1805

Rediscovered by J.W. Cooley and John Tukey in 1965

The Cooley-Tukey algorithm is the one we use today (mostly)

Big O notation for this is O(N log N)

Matlab functions fft and ifft are standard.
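To give a feel for the Cooley-Tukey idea, here is a minimal recursive radix-2 sketch in Python. It is a teaching example, not how Matlab's `fft` is implemented: it assumes the input length is a power of two, and it is checked against a naive DFT on an arbitrary 16-sample signal.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    if N % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])  # DFT of even-indexed samples
    odd = fft(x[1::2])   # DFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

def naive_dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [float(n % 3) for n in range(16)]
max_error = max(abs(a - b) for a, b in zip(fft(x), naive_dft(x)))
print(max_error < 1e-9)  # True
```

Each level of recursion does O(N) work and there are log2(N) levels, which is where the O(N log N) cost comes from.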

Windowing


A function that is zero-valued outside of some chosen interval.

When a signal (data) is multiplied by a window function, the product is zero-valued outside the interval: all that is left is the "view" through the window.

x[n] × w[n] = z[n]

[Figure: example of windowing x[n] with a rectangular window.]

Some famous windows


Rectangular: $w[n] = 1$

Triangular (Bartlett): $w[n] = \frac{2}{N-1} \left( \frac{N-1}{2} - \left| n - \frac{N-1}{2} \right| \right)$

Hann: $w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$

Note: we assume w[n] = 0 outside some range [0, N]

[Figures: each window plotted as amplitude versus sample.]

Why window shape matters


Don’t forget that a DFT assumes the signal in the window is periodic

The boundary conditions mess things up… unless you manage to have a window whose length is exactly 1 period of your signal

Making the edges of the window less prominent helps suppress undesirable artifacts

Fourier Transform of Windows

[Figure: magnitude spectrum of a window, amplitude (dB) versus normalized angular frequency, with the main lobe and the sidelobes marked.]

We want

- Narrow main lobe

- Low sidelobes

Which window is better?

Hann window: $w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$

Hamming window: $w[n] = 0.54 - 0.46 \times \cos \frac{2\pi n}{N-1}$

[Figures: magnitude spectra of the two windows, amplitude (dB) versus normalized angular frequency.]
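The window formulas can be written out as short functions. This sketch follows the symmetric definitions with N - 1 in the denominator, so it assumes N is at least 2; the function names are chosen for this example.

```python
import math

def rectangular(N):
    return [1.0] * N

def bartlett(N):
    half = (N - 1) / 2
    return [(2 / (N - 1)) * (half - abs(n - half)) for n in range(N)]

def hann(N):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def hamming(N):
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

N = 9
for name, window in (("bartlett", bartlett(N)), ("hann", hann(N)), ("hamming", hamming(N))):
    print(name, [round(w, 2) for w in window])
```

The Bartlett and Hann windows fall to zero at both edges, while the Hamming window stops at 0.08. That small pedestal is the main difference between the Hann and Hamming shapes.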

Multiplication vs. Convolution

Time domain           Frequency domain
$x[n] \cdot y[n]$     $\frac{1}{N} X[k] * Y[k]$
$x[n] * y[n]$         $X[k] \cdot Y[k]$

(Here $*$ denotes convolution.)

Windowing is multiplication in time domain, so the spectrum will be a convolution between the signal’s spectrum and the window’s spectrum

Convolution in time domain takes $O(N^2)$, but if we perform in the frequency domain…

FFT takes $O(N \log N)$

Multiplication takes $O(N)$

IFFT takes $O(N \log N)$
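The second row of the table can be verified numerically. The sketch below compares direct circular convolution with multiplication of DFTs. It uses a naive O(N^2) DFT for brevity, so it demonstrates the identity rather than the speedup; with an FFT in place of `dft` and `idft` the frequency-domain route would cost O(N log N). The two four-sample signals are arbitrary.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circular_convolution(x, y):
    """Direct O(N^2) circular convolution of two equal-length signals."""
    N = len(x)
    return [sum(x[m] * y[(n - m) % N] for m in range(N)) for n in range(N)]

def circular_convolution_via_dft(x, y):
    """Transform, multiply bin by bin, transform back."""
    product = [a * b for a, b in zip(dft(x), dft(y))]
    return [v.real for v in idft(product)]

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 0.0, -1.0, 0.5]
direct = circular_convolution(x, y)
fast = circular_convolution_via_dft(x, y)
print([round(v, 6) for v in direct], [round(v, 6) for v in fast])
```

Note that the DFT identity holds for circular convolution. To get ordinary linear convolution this way, both signals are usually zero padded first.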

Windowed Signal

[Figure: two time-domain plots, amplitude (-3 to 3) versus sample index (0 to 400).]

Spectrum of Windowed Signal

[Figure: amplitude (dB) versus frequency (0 to 5000 Hz).]

Two sinusoids: 1000 Hz + 1500 Hz

Sampling rate: 10 kHz

Window length: 100 (i.e. 100/10K = 0.01 s)

FFT length: 400 (i.e. 4 times zero padding)

Zero Padding


Add zeros after (or before) the signal to make it longer

Perform DFT on the padded signal

[Figure: amplitude (-3 to 3) versus sample index (0 to 1600), showing the windowed signal followed by the padded zeros.]

Why Zero Padding?


Zero padding in time domain gives the ideal interpolation in the frequency domain.

It doesn’t increase (the real) frequency resolution!

4 times is generally enough

Here the resolution is always fs/L = 100 Hz

[Figures: amplitude (dB) versus frequency (0 to 5000 Hz) with no zero padding, 4 times zero padding, and 8 times zero padding.]
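The interpolation claim can be illustrated with the example parameters (two sinusoids at 1000 Hz and 1500 Hz, 10 kHz sampling rate, window length 100, 4 times zero padding). The sketch uses a naive DFT and a rectangular window, both simplifications for this example; `zero_pad` is a hypothetical helper.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def zero_pad(x, factor):
    """Append zeros so the result is `factor` times as long as x."""
    return list(x) + [0.0] * (len(x) * (factor - 1))

fs = 10000.0  # sampling rate in Hz
L = 100       # window length in samples
x = [math.sin(2 * math.pi * 1000 * n / fs) + math.sin(2 * math.pi * 1500 * n / fs)
     for n in range(L)]

X = dft(x)                       # bins spaced fs / L = 100 Hz apart
X_padded = dft(zero_pad(x, 4))   # bins spaced fs / (4 L) = 25 Hz apart

# Every 4th bin of the padded spectrum coincides with an original bin.
max_diff = max(abs(X_padded[4 * k] - X[k]) for k in range(L))
print(len(X), len(X_padded), max_diff < 1e-6)
```

The padded spectrum only adds points between the original bins. The width of each spectral peak is still set by the 100-sample window, so two components closer than about fs/L remain hard to separate.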

How to increase frequency resolution?


Time-frequency resolution tradeoff

$\Delta t \cdot \Delta f = 1$ (second) (Hz)

[Figures: amplitude (dB) versus frequency (0 to 5000 Hz) for window lengths of 10 ms, 20 ms, and 40 ms.]

By the relation above, these window lengths correspond to frequency resolutions of 100 Hz, 50 Hz, and 25 Hz: a longer window gives finer frequency resolution at the cost of coarser time resolution.

Short-Time Fourier Transform

Break signal into windows

Calculate DFT of each window
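The two steps can be sketched as below. The Hann window, the 50% hop, the 8000 Hz sample rate, and the 1000 Hz test tone are assumptions chosen for this example, and a naive DFT stands in for the FFT to keep the code short.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def hann(N):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def stft(signal, window, hop):
    """Slide the window along the signal and take the DFT of each windowed frame."""
    N = len(window)
    frames = []
    for start in range(0, len(signal) - N + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(N)]
        frames.append(dft(frame))
    return frames

fs = 8000.0
N = 64
signal = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(256)]
frames = stft(signal, hann(N), hop=N // 2)

# Strongest bin (among 0..N/2) in each frame; bin k corresponds to k * fs / N Hz.
peaks = [max(range(N // 2 + 1), key=lambda k: abs(frame[k])) for frame in frames]
print(len(frames), peaks)
```

For this tone every frame peaks at bin 8, which corresponds to 8 × 8000 / 64 = 1000 Hz. Plotting the magnitudes of all frames side by side gives a spectrogram.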

The Spectrogram


There is a “spectrogram” function in Matlab, but you can’t do zero padding using it.

A Fun Example

(Thanks to Robert Remez)

Overlap-Add Synthesis

IDFT on each spectrum

- The complex, full spectrum

- Don’t forget the phase (often using the original phase).

- If you do it right, the time signal you get is real.

Multiply with a synthesis window (e.g. Hamming)

- Not dividing by the analysis window

Overlap and add different frames together.
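The steps above can be sketched end to end. Note the substitution: instead of a Hamming synthesis window, this example uses a square-root periodic Hann window for both analysis and synthesis with a hop of half the window length, because that pair sums to exactly 1 where frames overlap. The test signal, window length, and function names are also assumptions for illustration.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def sqrt_hann(N):
    """Square root of a periodic Hann window (assumed choice, not from the slide)."""
    return [math.sqrt(0.5 * (1 - math.cos(2 * math.pi * n / N))) for n in range(N)]

def analyze(signal, window, hop):
    N = len(window)
    return [dft([signal[s + n] * window[n] for n in range(N)])
            for s in range(0, len(signal) - N + 1, hop)]

def overlap_add(spectra, window, hop):
    N = len(window)
    out = [0.0] * (hop * (len(spectra) - 1) + N)
    for i, spectrum in enumerate(spectra):
        frame = idft(spectrum)  # full complex spectrum with its original phase
        for n in range(N):
            # Multiply by the synthesis window, then add into the output.
            out[i * hop + n] += frame[n].real * window[n]
    return out

N, hop = 16, 8
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
window = sqrt_hann(N)
rebuilt = overlap_add(analyze(signal, window, hop), window, hop)

# Samples covered by two overlapping frames are reconstructed.
max_error = max(abs(rebuilt[n] - signal[n]) for n in range(hop, len(signal) - hop))
print(max_error < 1e-9)
```

The first and last half window are covered by only one frame, so they come back attenuated. In practice the spectra would be modified between analysis and synthesis; here they are passed through unchanged to show that the chain itself is transparent.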

Shepard Tones

Continuous Risset scale

Barber’s pole

Shepard Tones


Make a sound composed of sine waves spaced at octave intervals.

Control their amplitudes by imposing a Gaussian (or something like it) filter in the (log) frequency dimension

Move all the sine waves up a musical ½ step.

Wrap around in frequency.
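The recipe can be sketched by listing the partials for each half step. Every numeric parameter below (27.5 Hz base, 8 octaves, Gaussian centered on octave 4 with a width of 1.5 octaves) is an assumption picked for illustration, as is the function name.

```python
import math

def shepard_partials(step, base_hz=27.5, num_octaves=8,
                     center_octave=4.0, width_octaves=1.5):
    """Return (frequency, amplitude) pairs after moving up `step` half steps."""
    shift = (step % 12) / 12.0  # wrap around after one octave (12 half steps)
    partials = []
    for octave in range(num_octaves):
        position = octave + shift  # position on the log2-frequency axis
        freq = base_hz * 2.0 ** position
        # Gaussian amplitude envelope over log frequency.
        amp = math.exp(-0.5 * ((position - center_octave) / width_octaves) ** 2)
        partials.append((freq, amp))
    return partials

p0 = shepard_partials(0)
p1 = shepard_partials(1)
p12 = shepard_partials(12)
print(p12 == p0, round(p1[0][0] / p0[0][0], 4))
```

After 12 half steps the set of partials is identical to the starting set, even though each step sounded like a rise in pitch. The envelope keeps the lowest and highest partials quiet, so the wrap-around is hard to hear.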