# Signal Processing Review - Electrical and Computer Engineering

Posted 24 Nov 2013

Topic 2

Signal Processing Review

(Some slides are adapted from Bryan Pardo’s course slides on Machine Perception of Music)

Recording Sound

Mechanical vibration

Pressure waves

Motion -> voltage transducer

Voltage over time

ECE 492 - Computer Audition and Its Applications in Music, Zhiyao Duan 2013

Microphones

http://www.mediacollege.com/audio/microphones/how-microphones-work.html


Pure Tone = Sine Wave

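$$x(t) = A \sin(2\pi f t + \varphi)$$

where $t$ is time, $A$ is the amplitude, $f$ is the frequency, and $\varphi$ is the initial phase.

[Figure: a 440 Hz sine wave; amplitude (-1 to 1) vs. time (0 to 6 ms), with one period T marked.]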
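As a quick illustration, the pure-tone formula can be evaluated at discrete time points. This is a minimal sketch in plain Python; the function name `sine_wave` and the 44.1 kHz sample rate are assumptions made for the example, not something taken from the slides.

```python
import math

def sine_wave(amplitude, freq_hz, phase_rad, sample_rate_hz, duration_s):
    """Sample x(t) = A*sin(2*pi*f*t + phi) at t = n / sample_rate_hz."""
    num_samples = int(round(duration_s * sample_rate_hz))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase_rad)
        for n in range(num_samples)
    ]

# 6 ms of a 440 Hz tone, as in the plot; 44.1 kHz is an assumed sample rate.
x = sine_wave(amplitude=1.0, freq_hz=440.0, phase_rad=0.0,
              sample_rate_hz=44100, duration_s=0.006)
period_ms = 1000.0 / 440.0  # T = 1/f, about 2.27 ms
```

With these values the 6 ms excerpt covers a little more than two and a half periods of the tone.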


Reminders

Frequency, $f = 1/T$, is measured in cycles per second, a.k.a. Hertz (Hz).

One cycle contains $2\pi$ radians.

Angular frequency, $\Omega$, is measured in radians per second and is related to frequency by $\Omega = 2\pi f$.

So we can rewrite the sine wave as

$$x(t) = A \sin(\Omega t + \varphi)$$


Fourier Transform

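[Figure: the 440 Hz sine wave in the time domain; amplitude (-1 to 1) vs. time (0 to 6 ms).]

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$

[Figure: magnitude spectrum $|X(f)|$ vs. frequency (Hz), with peaks at -440 and 440.]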


We can also write

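[Figure: the 440 Hz sine wave in the time domain; amplitude (-1 to 1) vs. time (0 to 6 ms).]

$$X(\Omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j \Omega t}\, dt$$

[Figure: magnitude spectrum $|X(\Omega)|$ vs. angular frequency, with peaks at $-440 \times 2\pi$ and $440 \times 2\pi$.]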


Complex Tone = Sine Waves

[Figure: three sine waves at 220 Hz, 660 Hz, and 1100 Hz (amplitude -1 to 1) are added to give a complex tone (amplitude -2.5 to 2.5).]

220 Hz + 660 Hz + 1100 Hz = complex tone


Frequency Domain

[Figure: the complex tone in the time domain; amplitude (-2.5 to 2.5) vs. time (0 to 100 ms).]

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$

[Figure: magnitude spectrum $|X(f)|$ vs. frequency (Hz), with peaks at 220, 660, and 1100.]


Harmonic Sound

1 or more sine waves

Strong components at integer multiples of a fundamental frequency (F0) in the range of human hearing (20 Hz ~ 20,000 Hz)

Examples

220 + 660 + 1100 is harmonic

220 + 375 + 770 is not harmonic
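The two examples can be checked mechanically. The sketch below assumes the component frequencies are whole numbers of Hz, so the candidate F0 is simply their greatest common divisor; the helper `is_harmonic` is a hypothetical name, and measured frequencies would need a tolerance-based test instead.

```python
from functools import reduce
from math import gcd

def is_harmonic(freqs_hz, f0_min_hz=20, f0_max_hz=20000):
    """Return (flag, f0): flag is True when all components are integer
    multiples of a common fundamental f0 inside the hearing range."""
    f0 = reduce(gcd, freqs_hz)  # largest frequency that divides every component
    return f0_min_hz <= f0 <= f0_max_hz, f0

print(is_harmonic([220, 660, 1100]))  # (True, 220)
print(is_harmonic([220, 375, 770]))   # (False, 5)
```

In the second case the only common divisor is 5 Hz, which lies below the range of human hearing, so the sound is not heard as harmonic.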


Noise

Lots of sines at random frequencies = NOISE

Example: 100 sines with random frequencies $f$, such that $100 < f < 10000$.

[Figure: waveform of the resulting noise signal; horizontal axis 0 to 3.5 x 10^4, amplitude -30 to 30.]

How strong is the signal?

Instantaneous value?

Average value?

Something else?

[Figure: two example signals $x(t)$: the sine wave (amplitude -1 to 1) and the noise signal (amplitude -30 to 30).]


Acoustical or Electrical

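Acoustical

View $x(t)$ as sound pressure. Average intensity:

$$I = \frac{1}{\rho c} \cdot \frac{1}{T_D} \int_0^{T_D} x^2(t)\, dt$$

where $\rho$ is the density and $c$ is the sound speed.

Electrical

View $x(t)$ as electric voltage. Average power:

$$P = \frac{1}{R} \cdot \frac{1}{T_D} \int_0^{T_D} x^2(t)\, dt$$

where $R$ is the resistance.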


Root-Mean-Square (RMS)

$$x_{RMS} = \sqrt{\frac{1}{T_D} \int_0^{T_D} x^2(t)\, dt}$$

$T_D$ should be long enough.

$x(t)$ should have 0 mean, otherwise the DC component will be integrated.

For sinusoids

$$x_{RMS} = \sqrt{\frac{1}{T} \int_0^{T} A^2 \sin^2(2\pi f t)\, dt} = \sqrt{A^2 / 2} = 0.707 A$$
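The 0.707 factor for sinusoids, and the effect of a DC offset, can be checked numerically. This is a small sketch that approximates the integral by an average over samples; the helper name `rms` and the 1000-point grid are assumptions for illustration.

```python
import math

def rms(samples):
    """Root-mean-square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))

# One full period of a unit-amplitude sinusoid, sampled at 1000 points.
N = 1000
sinusoid = [math.sin(2 * math.pi * n / N) for n in range(N)]
sine_rms = rms(sinusoid)                         # close to 1/sqrt(2) = 0.707
offset_rms = rms([v + 0.5 for v in sinusoid])    # larger: the DC offset is included
```

Adding a constant 0.5 raises the result from about 0.707 to about 0.866, which is why the signal should have zero mean before its RMS is taken.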


Sound Pressure Level (SPL)

Softest audible sound intensity: 0.000000000001 watt/m^2

Threshold of pain is around 1 watt/m^2

12 orders of magnitude difference

A log scale helps with this

The decibel (dB) scale is a log scale, with respect to a reference value


The Decibel

A logarithmic measurement that expresses the magnitude of a physical quantity (e.g. power or intensity) relative to a specified reference level.

Since it expresses a ratio of two (same unit) quantities, it is dimensionless.

$$L - L_{ref} = 10 \log_{10} \frac{I}{I_{ref}} = 20 \log_{10} \frac{x_{RMS}}{x_{RMS,ref}}$$
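The two forms of the decibel formula can be tried on concrete numbers. The sketch below uses hypothetical helper names; the first call uses the intensities quoted for the hearing and pain thresholds.

```python
import math

def db_from_intensity(intensity, reference):
    """Level difference in dB for power-like quantities (10*log10)."""
    return 10 * math.log10(intensity / reference)

def db_from_rms(x_rms, reference_rms):
    """Level difference in dB for amplitude-like quantities (20*log10)."""
    return 20 * math.log10(x_rms / reference_rms)

pain_vs_threshold = db_from_intensity(1.0, 1e-12)  # 12 orders of magnitude -> 120 dB
doubling_amplitude = db_from_rms(2.0, 1.0)          # about 6.02 dB
```

The factor of 20 for RMS amplitudes follows from intensity being proportional to the square of the amplitude.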


Lots of references!

dB SPL: a measure of sound pressure level. 0 dB SPL is approximately the quietest sound a human can hear, roughly the sound of a mosquito flying 3 meters away.

dBFS: relative to digital full scale. 0 dBFS is the maximum allowable signal. Values are typically negative.

dBV: relative to 1 volt RMS. 0 dBV = 1 V.

dBu: relative to 0.775 volts RMS with an unloaded, open circuit.

dBmV: relative to 1 millivolt across 75 Ω. Widely used in cable television networks.

……


Typical Values

Jet engine at 3 m: 140 dB SPL

Pain threshold: 130 dB SPL

Loud motorcycle, 5 m: 110 dB SPL

Vacuum cleaner: 80 dB SPL

Quiet restaurant: 50 dB SPL

Rustling leaves: 20 dB SPL

Human breathing, 3 m: 10 dB SPL

Hearing threshold: 0 dB SPL


Digital Sampling

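[Figure: a waveform sampled at regular sample intervals and quantized to 3-bit codes (000 to 101); axes AMPLITUDE and TIME, with the quantization increment, the sample interval, and the RECONSTRUCTION marked.]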


More quantization levels = more dynamic range

[Figure: the waveform quantized with 4-bit codes (0000 to 1011); axes AMPLITUDE and TIME, with the sample interval and the quantization increment marked.]


Bit Depth and Dynamics

More bits = more quantization levels = better sound

Compact Disc: 16 bits = 65,536 levels

POTS (plain old telephone service): 8 bits = 256 levels

Signal-to-quantization-noise ratio (SQNR), if the signal is uniformly distributed in the whole range:

$$\mathrm{SQNR} = 20 \log_{10} 2^{Q} \approx 6.02\, Q \ \text{dB}$$

where $Q$ is the number of bits.

E.g. 16 bits depth gives about 96 dB SQNR.
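The SQNR rule of thumb is easy to tabulate. A minimal sketch, assuming the same full-range, uniformly distributed signal as the formula; `sqnr_db` is a name chosen for this example.

```python
import math

def sqnr_db(bits):
    """SQNR = 20*log10(2**bits), assuming a full-range, uniformly distributed signal."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(bits, "bits:", 2 ** bits, "levels,", round(sqnr_db(bits), 2), "dB")
```

Each extra bit doubles the number of levels and adds roughly 6 dB.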


RMS


 
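$$x_{RMS} = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x^2[n]}$$

[Figure: the sine wave with its samples marked as red dots; amplitude (-1 to 1) vs. time (0 to 6).]

The red dots form the discrete signal $x[n]$.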


Aliasing and Nyquist

[Figure: a waveform sampled at regular sample intervals; axes AMPLITUDE and TIME.]


Nyquist-Shannon Sampling Theorem

You can’t reproduce the signal unless your sample rate is faster than twice the highest frequency in the signal.

Nyquist rate: twice the highest frequency in the signal. A property of the continuous-time signal.

Nyquist frequency: half of the sampling rate. A property of the discrete-time system.


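Discrete-Time Fourier Transform (DTFT)

[Figure: the sine wave with its samples marked as red dots; amplitude (-1 to 1) vs. time.]

$$X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j \omega n}$$

[Figure: magnitude spectrum $|X(\omega)|$ vs. angular frequency $\omega$, with $0$ and $\pm 2\pi$ marked.]

The red dots form the discrete signal $x[n]$, where $n = 0, \pm 1, \pm 2, \ldots$

$X(\omega)$ is periodic with period $2\pi$.

We often only show one period, from $-\pi$ to $\pi$.

$\omega$ is a continuous variable.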


Relation between FT and DTFT

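[Figure: the continuous signal and its samples; amplitude (-1 to 1) vs. time (ms).]

Sampling: $x[n] = x_c(nT)$, where $T$ is the sampling interval and $f_s = 1/T$ is the sampling rate.

FT: $X_c(\Omega) = \int_{-\infty}^{\infty} x_c(t)\, e^{-j \Omega t}\, dt$

DTFT: $X(\omega) = \sum_{n=-\infty}^{\infty} x[n]\, e^{-j \omega n}$

The two are related by

$$X(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} X_c\!\left(\frac{\omega}{T} + \frac{2\pi k}{T}\right)$$

Scaling: $\omega = \Omega T$, i.e. $\omega = 2\pi$ corresponds to $\Omega = 2\pi / T = 2\pi f_s$, which corresponds to $f = f_s$.

Repetition: $X(\omega)$ contains infinite copies of $X_c$, spaced by $2\pi$.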


Aliasing

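Complex tone: 900 Hz + 1800 Hz

[Figure: continuous-time spectrum $|X_c(\Omega)|$ vs. $\Omega$, with peaks at $\pm 1800\pi$ and $\pm 3600\pi$.]

Sampling rate = 8000 Hz

[Figure: DTFT magnitude $|X(\omega)|$ vs. $\omega$ from $-2\pi$ to $2\pi$; the highest component sits at $\omega = 3600\pi / 8000$, below $\pi$, so there is no aliasing.]

Sampling rate = 2000 Hz

[Figure: DTFT magnitude $|X(\omega)|$ vs. $\omega$ from $-2\pi$ to $2\pi$; the components are marked at $\omega = 1800\pi / 2000$ and $3600\pi / 2000$. The 1800 Hz component lies above the Nyquist frequency (1000 Hz) and appears as an alias at 200 Hz.]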
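The 200 Hz alias in this example can be reproduced with a few lines of arithmetic. This is a sketch for real sinusoids only; `apparent_frequency` is a hypothetical helper that folds a frequency into the range from 0 to half the sampling rate.

```python
def apparent_frequency(freq_hz, sample_rate_hz):
    """Frequency (0 to fs/2) at which a real sinusoid appears after sampling."""
    folded = freq_hz % sample_rate_hz            # the spectrum repeats every fs
    return min(folded, sample_rate_hz - folded)  # fold into [0, fs/2]

for fs in (8000, 2000):
    print(fs, [apparent_frequency(f, fs) for f in (900, 1800)])
# 8000 Hz: both components stay at 900 and 1800 Hz.
# 2000 Hz: 900 Hz is preserved, but 1800 Hz shows up at 200 Hz.
```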


Fourier Series

FT and DTFT do not require the signal to be periodic, i.e.
the signal may contain arbitrary frequencies, which is
why the frequency domain is continuous.

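Now, if the signal is periodic:

$$x(t + mT) = x(t), \quad \forall m \in \mathbb{Z}$$

It can be reproduced by a series of sine and cosine functions:

$$x(t) = a_0 + \sum_{n=1}^{\infty} \left[ a_n \cos(\Omega_n t) + b_n \sin(\Omega_n t) \right]$$

with $\Omega_n = 2\pi n / T$.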

In other words, the frequency domain is discrete.


Discrete Fourier Transform (DFT)

FT and DTFT are great, but the infinite integral
or summations are hard to deal with.

In digital computers, everything is discrete,
including both the signal and its spectrum

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$$

where $k$ is the frequency domain index, $n$ is the time domain index, and $N$ is the length of the signal, i.e. the length of the DFT.


DFT and IDFT

DFT:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$$

IDFT:

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N}$$

Both $x[n]$ and $X[k]$ are discrete and of length $N$.

Treats $x[n]$ as if it were infinite and periodic.

Treats $X[k]$ as if it were infinite and periodic.

Only one period is involved in calculation.
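The DFT and IDFT sums translate almost literally into code. The sketch below is a direct, slow implementation in plain Python meant only to mirror the two definitions; the function names and the test signal are arbitrary choices for illustration.

```python
import cmath

def dft(x):
    """X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N), computed directly in O(N^2)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) * sum_k X[k] * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.5]   # arbitrary real test signal
X = dft(x)
x_back = idft(X)                                   # matches x up to rounding error
```

Applying `idft` to the output of `dft` recovers the original samples, which is a convenient sanity check on the sign and scaling conventions.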


Discrete Fourier Transform

If the time-domain signal has no imaginary part (like an audio signal), then the frequency-domain signal is conjugate symmetric around N/2.

[Figure: the time-domain signal $x[n]$ (real portion and imaginary portion, indices 0 to N-1) is mapped by the DFT to the frequency-domain signal $X[k]$ (real portion and imaginary portion, indices 0 to N-1, with N/2 marked); the IDFT maps back. The labels DC and $f_s/2$ appear on the frequency axis.]
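The conjugate symmetry of a real signal's DFT can be verified directly. This is a sketch using a direct DFT on a short, arbitrary real sequence; the variable names are illustrative.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, enough for a small check."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [0.3, -1.2, 0.8, 2.0, -0.5, 0.1]   # real-valued, N = 6
X = dft(x)
N = len(x)

# X[N - k] should equal the complex conjugate of X[k] for k = 1 .. N-1.
max_error = max(abs(X[N - k] - X[k].conjugate()) for k in range(1, N))
# The bins at k = 0 (DC) and k = N/2 are their own mirror images, so they are real.
dc_imag, nyquist_imag = X[0].imag, X[N // 2].imag
```

Because of this symmetry, only the bins from 0 to N/2 need to be stored or plotted for real audio signals.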


Kinds of Fourier Transforms

Fourier Transform

Signals: continuous, aperiodic

Spectrum: aperiodic, continuous

Fourier Series

Signals: continuous, periodic

Spectrum: aperiodic, discrete

Discrete Time Fourier Transform

Signals: discrete, aperiodic

Spectrum: periodic, continuous

Discrete Fourier Transform

Signals: discrete, periodic

Spectrum: periodic, discrete


The FFT

Fast Fourier Transform

A much, much faster way to do the DFT

Introduced by Carl F. Gauss in 1805

Rediscovered by J.W. Cooley and John Tukey in 1965

The Cooley-Tukey algorithm is the one we use today (mostly)

Big O notation for this is O(N log N)

Matlab functions fft and ifft are standard.
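To make the divide-and-conquer idea concrete, here is a minimal sketch of the recursive radix-2 form of the Cooley-Tukey algorithm in plain Python. It is a teaching version that assumes the input length is a power of two; it is not how Matlab's `fft` is implemented, and the direct `dft` is included only as a reference for comparison.

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])   # DFT of even-indexed samples
    odd = fft(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, -0.25]
max_diff = max(abs(a - b) for a, b in zip(fft(x), dft(x)))  # rounding error only
```

Each level of recursion does O(N) work and there are log2(N) levels, which is where the O(N log N) cost comes from.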


Windowing

A function that is zero-valued outside of some chosen interval.

When a signal (data) is multiplied by a window function, the product is zero-valued outside the interval: all that is left is the "view" through the window.

[Figure: x[n] x w[n] = z[n]. Example: windowing x[n] with a rectangular window.]


Some famous windows

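Rectangular

$$w[n] = 1$$

Triangular (Bartlett)

$$w[n] = \frac{2}{N-1} \left( \frac{N-1}{2} - \left| n - \frac{N-1}{2} \right| \right)$$

Hann

$$w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$$

Note: we assume w[n] = 0 outside some range [0, N]

[Figure: the three windows; amplitude vs. sample.]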
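The three window formulas can be written out directly. The sketch below evaluates them for n = 0 to N-1 in plain Python; the function names and the short test signal are assumptions for illustration.

```python
import math

def rectangular(N):
    """w[n] = 1 for n = 0 .. N-1."""
    return [1.0] * N

def bartlett(N):
    """Triangular window: (2/(N-1)) * ((N-1)/2 - |n - (N-1)/2|)."""
    half = (N - 1) / 2
    return [(2 / (N - 1)) * (half - abs(n - half)) for n in range(N)]

def hann(N):
    """w[n] = 0.5 * (1 - cos(2*pi*n/(N-1)))."""
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def apply_window(x, w):
    """Pointwise product z[n] = x[n] * w[n]."""
    return [xn * wn for xn, wn in zip(x, w)]

N = 9
x = [math.sin(2 * math.pi * 0.2 * n) for n in range(N)]
z = apply_window(x, hann(N))   # tapers to zero at both ends
```

The Bartlett and Hann windows both start and end at zero and reach 1 at the centre sample, whereas the rectangular window leaves the samples unchanged.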


Why window shape matters

Don’t forget that a DFT assumes the signal in the window is periodic

The boundary conditions mess things up… unless you manage to have a window whose length is exactly 1 period of your signal

Making the edges of the window less prominent helps suppress undesirable artifacts


Fourier Transform of Windows

[Figure: magnitude spectrum of a window; amplitude (dB) vs. normalized angular frequency, with the main lobe and the sidelobes marked.]

We want

- Narrow main lobe

- Low sidelobes


Which window is better?

[Figure: magnitude spectra of the two windows; amplitude (dB) vs. normalized angular frequency.]

Hann window

$$w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$$

Hamming window

$$w[n] = 0.54 - 0.46 \times \cos \frac{2\pi n}{N-1}$$


Multiplication vs. Convolution

Time domain <-> Frequency domain

$x[n] \cdot y[n]$ <-> $\frac{1}{N}\, X[k] * Y[k]$

$x[n] * y[n]$ <-> $X[k] \cdot Y[k]$

Here $*$ denotes convolution (circular convolution in the case of the DFT).

Windowing is multiplication in time domain, so the spectrum will be a convolution between the signal’s spectrum and the window’s spectrum

Convolution in time domain takes $O(N^2)$, but if we perform it in the frequency domain…

FFT takes $O(N \log N)$

Multiplication takes $O(N)$

IFFT takes $O(N \log N)$
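The equivalence between time-domain convolution and frequency-domain multiplication can be checked on a small example. The sketch below uses a direct DFT for brevity, so it shows the equivalence rather than the speed-up; an FFT would replace `dft` in practice. Both sequences are zero-padded to the full output length so that the circular convolution implied by the DFT equals the ordinary (linear) one. All names are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Direct DFT (or inverse DFT); an FFT would be used in practice."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def convolve_direct(x, h):
    """Linear convolution by the definition, O(N^2)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def convolve_via_dft(x, h):
    """Zero-pad, transform, multiply bin by bin, transform back."""
    L = len(x) + len(h) - 1            # padding avoids circular wrap-around
    X = dft(list(x) + [0.0] * (L - len(x)))
    H = dft(list(h) + [0.0] * (L - len(h)))
    y = dft([a * b for a, b in zip(X, H)], inverse=True)
    return [v.real for v in y]

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, -1.0, 0.25]
direct = convolve_direct(x, h)
fast = convolve_via_dft(x, h)
```

The two results agree up to rounding error; the imaginary parts of the inverse transform are negligible because both inputs are real.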


Windowed Signal


[Figure: two time-domain plots of the windowed signal; amplitude (-3 to 3), horizontal axis 0 to 400.]

Spectrum of Windowed Signal

[Figure: magnitude spectrum; amplitude (dB) vs. frequency (0 to 5000 Hz).]

Two sinusoids: 1000 Hz + 1500 Hz

Sampling rate: 10 kHz

Window length: 100 (i.e. 100/10K = 0.01 s)

FFT length: 400 (i.e. 4 times zero padding)


Add zeros after (or before) the signal to
make it longer

Perform DFT on the padded signal


[Figure: the windowed signal after zero padding; amplitude (-3 to 3), horizontal axis 0 to 1600.]

Zero padding in time domain gives the ideal interpolation in the frequency domain.

It doesn’t increase (the real) frequency resolution!

4 times is generally enough

Here the resolution is always $f_s / L = 100$ Hz

[Figure: three magnitude spectra; amplitude (dB) vs. frequency (0 to 5000 Hz).]
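A small experiment shows why zero padding interpolates the spectrum without adding resolution. The sketch reuses the parameters quoted for this example (two sinusoids at 1000 Hz and 1500 Hz, 10 kHz sampling rate, window length 100, FFT length 400) but, as a simplifying assumption, applies a rectangular window and a direct DFT.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT; fine for a few hundred samples."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

fs = 10000            # sampling rate (Hz)
L = 100               # window length (samples)
pad_factor = 4        # FFT length = 4 * L = 400

x = [math.sin(2 * math.pi * 1000 * n / fs) + math.sin(2 * math.pi * 1500 * n / fs)
     for n in range(L)]                    # rectangular window, for simplicity
X = dft(x)
X_padded = dft(x + [0.0] * ((pad_factor - 1) * L))

bin_spacing_hz = fs / (pad_factor * L)     # 25 Hz between padded bins
resolution_hz = fs / L                     # still 100 Hz
# Every 4th padded bin coincides with an unpadded bin: padding only interpolates.
max_diff = max(abs(X_padded[pad_factor * k] - X[k]) for k in range(L))
```

The padded spectrum passes through the same values as the unpadded one and fills in three extra points between each pair of bins, so the ability to separate nearby sinusoids is still set by the window length.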


How to increase frequency resolution?

Time-frequency tradeoff:

$$\Delta t \cdot \Delta f = 1 \quad (\text{second} \cdot \text{Hz})$$

[Figure: three magnitude spectra, one per window length; amplitude (dB) vs. frequency (0 to 5000 Hz).]

Window length: 10ms

Window length: 20ms

Window length: 40ms

Short time Fourier Transform

Break signal into windows

Calculate DFT of each window
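The two steps can be sketched in a few lines. This is a minimal illustration in plain Python with a direct DFT and a Hann window; the frame length, hop size, and 8 kHz test tone are arbitrary assumptions, and a real implementation would use an FFT.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT of one frame."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def hann(N):
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def stft(x, frame_len, hop):
    """Return a list of spectra, one per windowed frame."""
    w = hann(frame_len)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = [x[start + n] * w[n] for n in range(frame_len)]
        frames.append(dft(frame))
    return frames

fs = 8000
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]
S = stft(x, frame_len=64, hop=32)      # 7 frames of 64 bins each
magnitudes = [[abs(v) for v in spectrum[:33]] for spectrum in S]  # bins 0..fs/2
peak_bin = max(range(33), key=lambda k: magnitudes[0][k])          # 1000 Hz -> bin 8
```

Stacking the magnitude of each frame's spectrum side by side, one column per frame, gives a spectrogram.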


The Spectrogram

There is a “spectrogram” function in Matlab, but you can’t do zero padding using it.


A Fun Example

(Thanks to Robert Remez)


Overlap-Add

IDFT on each spectrum

The complex, full spectrum

Don’t forget the phase (often using the original phase).

If you do it right, the time signal you get is real.

Multiply with a synthesis window (e.g. Hamming)

Not dividing by the analysis window

Overlap and add different frames together.


Shepard Tones

Continuous Risset scale

Barber’s pole


Shepard Tones

Make a sound composed of sine waves spaced at octave intervals.

Control their amplitudes by imposing a Gaussian (or something like it) filter in the (log) frequency dimension.

Move all the sine waves up a musical ½ step.

Wrap around in frequency.
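The recipe can be sketched as follows. This is one possible reading of the steps in plain Python: the base frequency of 27.5 Hz, the eight octaves, and the width of the Gaussian are all assumptions chosen for illustration.

```python
import math

def shepard_components(step, base_hz=27.5, num_octaves=8, steps_per_octave=12):
    """Frequencies and amplitudes of one Shepard tone.

    `step` counts half-step moves; the partials wrap around every octave.
    """
    center = num_octaves / 2                 # centre of the bell curve, in octaves
    sigma = num_octaves / 6                  # assumed width of the Gaussian
    position = (step % steps_per_octave) / steps_per_octave   # wrap around
    components = []
    for octave in range(num_octaves):
        log_pos = octave + position          # position on the log2-frequency axis
        freq = base_hz * 2 ** log_pos
        amp = math.exp(-0.5 * ((log_pos - center) / sigma) ** 2)
        components.append((freq, amp))
    return components

def shepard_sample(components, t):
    """Value of the tone at time t (seconds): a sum of sine waves."""
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in components)

tone0 = shepard_components(0)
tone12 = shepard_components(12)   # twelve half steps later: identical to tone0
```

After twelve half steps every partial has moved up exactly one octave and the set of components is the same as at the start, which is what lets the scale seem to rise forever.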
