Topic 2
Signal Processing Review
(Some slides are adapted from Bryan Pardo’s course slides on Machine Perception of Music)
Recording Sound
Mechanical vibration → pressure waves → transducer (motion → voltage) → voltage over time
ECE 492 - Computer Audition and Its Applications in Music, Zhiyao Duan 2013
Microphones
http://www.mediacollege.com/audio/microphones/how-microphones-work.html
Pure Tone = Sine Wave
$x(t) = A \sin(2\pi f t + \varphi)$
where $t$ is time, $A$ is the amplitude, $f$ is the frequency, and $\varphi$ is the initial phase.
[Figure: a 440 Hz sine wave, amplitude (-1 to 1) vs. time (0 to 6 ms), with the period T marked]
Reminders
• Frequency, $f = 1/T$, is measured in cycles per second, a.k.a. Hertz (Hz).
• One cycle contains $2\pi$ radians.
• Angular frequency, $\Omega$, is measured in radians per second and is related to frequency by $\Omega = 2\pi f$.
• So we can rewrite the sine wave as $x(t) = A \sin(\Omega t + \varphi)$
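The relations above can be tried out numerically. The following is a minimal sketch, not part of the original slides: the function name `sine_wave`, the 44100 Hz sample rate, and the 6 ms duration are assumptions chosen only for illustration.

```python
import math

def sine_wave(freq_hz, duration_s, sample_rate_hz, amplitude=1.0, phase=0.0):
    """Sample x(t) = A sin(Omega t + phi) at t = n / sample_rate_hz."""
    omega = 2.0 * math.pi * freq_hz  # angular frequency in radians per second
    n_samples = int(round(duration_s * sample_rate_hz))
    return [amplitude * math.sin(omega * n / sample_rate_hz + phase)
            for n in range(n_samples)]

freq_hz = 440.0
period_ms = 1000.0 / freq_hz  # T = 1/f, roughly 2.27 ms for 440 Hz
tone = sine_wave(freq_hz, duration_s=0.006, sample_rate_hz=44100)
```

Here `tone` holds about 6 ms of a 440 Hz pure tone, which spans a little more than two and a half periods.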
Fourier Transform
[Figure: the 440 Hz sine wave, amplitude vs. time (ms)]
$X(f) = \int_{-\infty}^{\infty} x(t) e^{-j 2\pi f t}\,dt$
[Figure: magnitude spectrum $|X(f)|$ vs. frequency (Hz), with peaks at -440 and 440]
We can also write
[Figure: the 440 Hz sine wave, amplitude vs. time (ms)]
$X(\Omega) = \int_{-\infty}^{\infty} x(t) e^{-j \Omega t}\,dt$
[Figure: magnitude spectrum $|X(\Omega)|$ vs. angular frequency (radians per second), with peaks at $-440 \times 2\pi$ and $440 \times 2\pi$]
Complex Tone = Sine Waves
[Figure: 220 Hz + 660 Hz + 1100 Hz = complex tone. Three sine waves, each with amplitude between -1 and 1, are summed into one waveform with amplitude between -2.5 and 2.5]
Frequency Domain
[Figure: the complex tone, amplitude (-2.5 to 2.5) vs. time (ms)]
$X(f) = \int_{-\infty}^{\infty} x(t) e^{-j 2\pi f t}\,dt$
[Figure: magnitude spectrum $|X(f)|$ vs. frequency (Hz), with peaks at 220, 660, and 1100]
Harmonic Sound
• 1 or more sine waves
• Strong components at integer multiples of a fundamental frequency (F0) in the range of human hearing (20 Hz ~ 20,000 Hz)
• Examples
– 220 + 660 + 1100 is harmonic (1, 3, and 5 times an F0 of 220 Hz)
– 220 + 375 + 770 is not harmonic
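One way to read the two examples is to look for the largest frequency that divides every component and ask whether it falls in the audible range. The sketch below assumes integer frequencies in Hz; the helper names `fundamental_hz` and `is_harmonic` are hypothetical, and real signals would need a tolerance rather than an exact divisor.

```python
import math
from functools import reduce

def fundamental_hz(freqs_hz):
    """Largest frequency of which every component is an integer multiple."""
    return reduce(math.gcd, freqs_hz)

def is_harmonic(freqs_hz, lo_hz=20, hi_hz=20000):
    """Harmonic if the common fundamental lies in the range of human hearing."""
    return lo_hz <= fundamental_hz(freqs_hz) <= hi_hz

harmonic_example = is_harmonic([220, 660, 1100])   # common fundamental 220 Hz
inharmonic_example = is_harmonic([220, 375, 770])  # common divisor only 5 Hz
```

For 220 + 375 + 770 the only common divisor is 5 Hz, which is below 20 Hz, so the components do not share an audible fundamental.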
Noise
• Lots of sines at random frequencies = NOISE
• Example: 100 sines with random frequencies $f$, such that $100 < f < 10000$.
[Figure: waveform of the resulting noise signal, amplitude roughly -30 to 30, horizontal axis 0 to $3.5 \times 10^4$]
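The example can be sketched as follows. The slide only fixes the number of sines and the frequency range; the random phases, the 44100 Hz sample rate, the signal length, and the name `random_sine_mix` are assumptions made for this illustration.

```python
import math
import random

def random_sine_mix(n_sines=100, lo_hz=100.0, hi_hz=10000.0,
                    sample_rate_hz=44100, n_samples=1024, seed=0):
    """Sum n_sines unit-amplitude sines with random frequencies and phases."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    freqs = [rng.uniform(lo_hz, hi_hz) for _ in range(n_sines)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_sines)]
    signal = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        signal.append(sum(math.sin(2.0 * math.pi * f * t + p)
                          for f, p in zip(freqs, phases)))
    return freqs, signal

freqs, noise = random_sine_mix()
```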
How strong is the signal?
• Instantaneous value?
• Average value?
• Something else?
[Figure: two example signals $x(t)$: the sine wave and the noise signal from the earlier slides]
Acoustical or Electrical
• Acoustical: view $x(t)$ as sound pressure. Average intensity:
$I = \frac{1}{\rho c} \frac{1}{T_D} \int_0^{T_D} x^2(t)\,dt$, where $\rho$ is the density and $c$ is the sound speed.
• Electrical: view $x(t)$ as electric voltage. Average power:
$P = \frac{1}{R} \frac{1}{T_D} \int_0^{T_D} x^2(t)\,dt$, where $R$ is the resistance.
Root-Mean-Square (RMS)
$x_{RMS} = \sqrt{\frac{1}{T_D} \int_0^{T_D} x^2(t)\,dt}$
• $T_D$ should be long enough.
• $x(t)$ should have 0 mean, otherwise the DC component will be integrated.
• For sinusoids: $x_{RMS} = \sqrt{\frac{1}{T} \int_0^{T} A^2 \sin^2(2\pi f t)\,dt} = \frac{\sqrt{2}}{2} A = 0.707 A$
Sound Pressure Level (SPL)
• Softest audible sound intensity: 0.000000000001 watt/m² ($10^{-12}$ watt/m²)
• Threshold of pain is around 1 watt/m²
• 12 orders of magnitude difference
• A log scale helps with this
• The decibel (dB) scale is a log scale, with respect to a reference value
The Decibel
• A logarithmic measurement that expresses the magnitude of a physical quantity (e.g. power or intensity) relative to a specified reference level.
• Since it expresses a ratio of two (same unit) quantities, it is dimensionless.
$L - L_{ref} = 10 \log_{10} \frac{I}{I_{ref}} = 20 \log_{10} \frac{x_{RMS}}{x_{RMS,ref}}$
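A small sketch of the two forms of the formula, assuming intensity is proportional to the square of the RMS amplitude. The function names are hypothetical; the intensity values are the ones quoted on the Sound Pressure Level slide.

```python
import math

def intensity_db(intensity, intensity_ref):
    """Level difference in dB between two intensities (or powers)."""
    return 10.0 * math.log10(intensity / intensity_ref)

def amplitude_db(x_rms, x_rms_ref):
    """Level difference in dB between two RMS amplitudes."""
    return 20.0 * math.log10(x_rms / x_rms_ref)

# 1 watt/m^2 relative to 1e-12 watt/m^2: twelve orders of magnitude
span_db = intensity_db(1.0, 1e-12)
# Doubling the RMS amplitude quadruples the intensity
doubling_db = amplitude_db(2.0, 1.0)
```

Twelve orders of magnitude in intensity collapse to a span of about 120 dB, and doubling the amplitude adds about 6 dB.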
Lots of references!
• dB SPL – A measure of sound pressure level. 0 dB SPL is approximately the quietest sound a human can hear, roughly the sound of a mosquito flying 3 meters away.
• dBFS – relative to digital full-scale. 0 dBFS is the maximum allowable signal. Values typically negative.
• dBV – relative to 1 Volt RMS. 0 dBV = 1 V.
• dBu – relative to 0.775 Volts RMS with an unloaded, open circuit.
• dBmV – relative to 1 millivolt across 75 Ω. Widely used in cable television networks.
• ……
Typical Values
• Jet engine at 3 m: 140 dB SPL
• Pain threshold: 130 dB SPL
• Loud motorcycle, 5 m: 110 dB SPL
• Vacuum cleaner: 80 dB SPL
• Quiet restaurant: 50 dB SPL
• Rustling leaves: 20 dB SPL
• Human breathing, 3 m: 10 dB SPL
• Hearing threshold: 0 dB SPL
Digital Sampling
[Figure: a waveform on an amplitude vs. time grid, with the sample interval and the quantization increment marked. Each sample is mapped to a 3-bit code (011, 010, 001, 101, 100, 000), and the signal is reconstructed from those codes]
More quantization levels = more dynamic range
[Figure: the waveform on an amplitude vs. time grid, now quantized with 4-bit codes (0000 to 1011), with the sample interval and the quantization increment marked]
Bit Depth and Dynamics
• More bits = more quantization levels = better sound
• Compact Disc: 16 bits = 65,536 levels
• POTS (plain old telephone service): 8 bits = 256 levels
• Signal-to-quantization-noise ratio (SQNR), if the signal is uniformly distributed in the whole range:
$\mathrm{SQNR} = 20 \log_{10} 2^N \approx 6.02 N$ dB, where $N$ is the number of bits
– E.g. 16 bits depth gives about 96 dB SQNR.
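The formula can be checked against a direct measurement. The sketch below is an illustration under stated assumptions: the signal is spread evenly over [-1, 1), the quantizer is a plain uniform one that returns the midpoint of each cell, and the names `sqnr_db`, `quantize`, and `measured_sqnr_db` are hypothetical.

```python
import math

def sqnr_db(bits):
    """SQNR predicted by 20 log10(2^N) for a full-range uniform signal."""
    return 20.0 * math.log10(2 ** bits)

def quantize(x, bits):
    """Uniform quantizer for x in [-1, 1): return the midpoint of x's cell."""
    levels = 2 ** bits
    step = 2.0 / levels  # quantization increment
    index = min(levels - 1, max(0, math.floor((x + 1.0) / step)))
    return -1.0 + (index + 0.5) * step

def measured_sqnr_db(bits, points_per_level=400):
    """Quantize an evenly spread signal and measure signal/noise power."""
    n = (2 ** bits) * points_per_level
    xs = [-1.0 + 2.0 * (i + 0.5) / n for i in range(n)]
    signal_power = sum(x * x for x in xs) / n
    noise_power = sum((x - quantize(x, bits)) ** 2 for x in xs) / n
    return 10.0 * math.log10(signal_power / noise_power)

predicted_16 = sqnr_db(16)
measured_8 = measured_sqnr_db(8)
```

Under these assumptions the measured value for 8 bits agrees closely with the predicted 48.2 dB, and each additional bit adds about 6.02 dB.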
RMS
$x_{RMS} = \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x^2[n]}$
[Figure: a sine wave, amplitude vs. time, with the samples marked as red dots. The red dots form the discrete signal $x[n]$]
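A minimal sketch of the discrete RMS, assuming a unit-amplitude 440 Hz sine sampled at 44000 Hz for one second so that the sum covers a whole number of periods. The function name `rms` and the sampling parameters are illustrative choices.

```python
import math

def rms(x):
    """Root-mean-square of a discrete signal x[0..N-1]."""
    return math.sqrt(sum(v * v for v in x) / len(x))

sample_rate_hz = 44000
freq_hz = 440.0
sine = [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(sample_rate_hz)]  # one second = 440 whole periods

sine_rms = rms(sine)                          # about 0.707
offset_rms = rms([1.0 + v for v in sine])     # a DC offset inflates the RMS
```

The second value illustrates why the signal should have zero mean: adding a DC offset of 1 raises the RMS from about 0.707 to about 1.225.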
Aliasing and Nyquist
[Figure, shown over three slides: a waveform on an amplitude vs. time grid, sampled at a fixed sample interval]
Nyquist-Shannon Sampling Theorem
• You can't reproduce the signal if your sample rate isn't faster than twice the highest frequency in the signal.
• Nyquist rate: twice the highest frequency in the signal.
– A property of the continuous-time signal.
• Nyquist frequency: half of the sampling rate.
– A property of the discrete-time system.
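The theorem can be illustrated with two sines that become indistinguishable once sampled. The numbers below (a 10 Hz sample rate, a 7 Hz tone, and a 3 Hz tone) are hypothetical values chosen for this sketch, not taken from the slides.

```python
import math

sample_rate_hz = 10.0
high_hz = 7.0  # above the Nyquist frequency of 5 Hz
low_hz = 3.0   # sample_rate_hz - high_hz

# Samples of the 7 Hz sine
high = [math.sin(2.0 * math.pi * high_hz * n / sample_rate_hz)
        for n in range(20)]
# Samples of a sign-flipped 3 Hz sine
low = [-math.sin(2.0 * math.pi * low_hz * n / sample_rate_hz)
       for n in range(20)]

max_difference = max(abs(a - b) for a, b in zip(high, low))
```

The two sample sequences agree to within rounding error, so after sampling at 10 Hz nothing distinguishes the 7 Hz tone from a 3 Hz one.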
Discrete-Time Fourier Transform (DTFT)
$X(\omega) = \sum_{n=-\infty}^{\infty} x[n] e^{-j \omega n}$
[Figure: a sampled sine wave. The red dots form the discrete signal $x[n]$, where $n = 0, \pm 1, \pm 2, \ldots$]
[Figure: $|X(\omega)|$ vs. angular frequency $\omega$, shown from $-2\pi$ to $2\pi$]
• $\omega$ is a continuous variable.
• $X(\omega)$ is periodic with period $2\pi$. We often only show $[-\pi, \pi]$.
Relation between FT and DTFT
Sampling: $x[n] = x_c(nT)$
FT: $X_c(\Omega) = \int_{-\infty}^{\infty} x_c(t) e^{-j \Omega t}\,dt$
DTFT: $X(\omega) = \sum_{n=-\infty}^{\infty} x[n] e^{-j \omega n}$
Relation: $X(\omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} X_c\left(\frac{\omega}{T} + \frac{2\pi k}{T}\right)$
• Scaling: $\omega = \Omega T$, i.e. $\omega = 2\pi$ corresponds to $\Omega = 2\pi / T = 2\pi f_s$, which corresponds to $f = f_s$.
• Repetition: $X(\omega)$ contains infinite copies of $X_c$, spaced by $2\pi$.
[Figure: a sampled sine wave, amplitude vs. time (ms)]
Aliasing
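• Complex tone: 900 Hz + 1800 Hz
[Figure: $|X_c(\Omega)|$ vs. $\Omega$, with peaks at $\pm 1800\pi$ and $\pm 3600\pi$]
• Sampling rate = 8000 Hz
[Figure: $|X(\omega)|$ vs. $\omega$ from $-2\pi$ to $2\pi$. The highest component sits at $\omega = 3600\pi / 8000$, inside $[-\pi, \pi]$]
• Sampling rate = 2000 Hz
[Figure: $|X(\omega)|$ vs. $\omega$ from $-2\pi$ to $2\pi$, with components at $\omega = 1800\pi / 2000$ and $\omega = 3600\pi / 2000$. The second lies beyond $\pi$, so a copy of it appears inside $[-\pi, \pi]$ as an alias at 200 Hz]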
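The 200 Hz alias can be reproduced with a short calculation: fold each component to the nearest multiple of the sampling rate and take the distance. This is a sketch with a hypothetical helper name, `alias_frequency`.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency in [0, fs/2] at which a sampled sinusoid appears."""
    nearest_copy = sample_rate_hz * round(freq_hz / sample_rate_hz)
    return abs(freq_hz - nearest_copy)

tone_hz = [900, 1800]
seen_at_8000 = [alias_frequency(f, 8000) for f in tone_hz]
seen_at_2000 = [alias_frequency(f, 2000) for f in tone_hz]
```

At 8000 Hz both components stay where they are. At 2000 Hz the Nyquist frequency is 1000 Hz, so the 900 Hz component survives while the 1800 Hz component shows up at 200 Hz.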
Fourier Series
• FT and DTFT do not require the signal to be periodic, i.e. the signal may contain arbitrary frequencies, which is why the frequency domain is continuous.
• Now, if the signal is periodic: $x(t + mT) = x(t), \ \forall m \in \mathbb{Z}$
• It can be reproduced by a series of sine and cosine functions:
$x(t) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos(\Omega n t) + b_n \sin(\Omega n t) \right)$, with $\Omega = 2\pi / T$
• In other words, the frequency domain is discrete.
Discrete Fourier Transform (DFT)
• FT and DTFT are great, but the infinite integral or summations are hard to deal with.
• In digital computers, everything is discrete, including both the signal and its spectrum:
$X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi k n / N}$
where $k$ is the frequency domain index, $n$ is the time domain index, and $N$ is the length of the signal, i.e. the length of the DFT.
DFT and IDFT
DFT: $X[k] = \sum_{n=0}^{N-1} x[n] e^{-j 2\pi k n / N}$
IDFT: $x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{j 2\pi k n / N}$
• Both $x[n]$ and $X[k]$ are discrete and of length $N$.
• Treats $x[n]$ as if it were infinite and periodic.
• Treats $X[k]$ as if it were infinite and periodic.
• Only one period is involved in calculation.
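The two sums translate almost literally into code. This is a direct $O(N^2)$ sketch for illustration only; the function names `dft` and `idft` and the four-sample test signal are arbitrary choices.

```python
import cmath

def dft(x):
    """X[k] = sum over n of x[n] * exp(-j 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

def idft(X):
    """x[n] = (1/N) * sum over k of X[k] * exp(+j 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N
            for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)
x_back = idft(X)  # recovers x up to rounding error
```

For this signal, `X[0]` is the sum of the samples (10) and `X[1]` is `-2 + 2j`.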
Discrete Fourier Transform
• If the time-domain signal has no imaginary part (like an audio signal) then the frequency-domain signal is conjugate symmetric around N/2.
[Figure: the DFT maps the time-domain signal $x[n]$ (real and imaginary portions, indices 0 to N-1) to the frequency-domain signal $X[k]$ (real and imaginary portions, indices 0 to N-1, with N/2 marked). The IDFT maps back. Index 0 corresponds to DC and index N/2 to $f_s/2$]
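The symmetry can be verified numerically on any real signal. The sketch below uses a direct DFT and an arbitrary eight-sample signal; both are assumptions made for illustration.

```python
import cmath

def dft(x):
    """Direct DFT, O(N^2), used here only to inspect symmetry."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 1.0]  # real-valued signal
X = dft(x)
N = len(x)

# X[N - k] should equal the complex conjugate of X[k]
is_conjugate_symmetric = all(abs(X[N - k] - X[k].conjugate()) < 1e-9
                             for k in range(1, N))
```

A consequence is that the bins at DC (index 0) and at N/2 are purely real, and bins above N/2 carry no information beyond those below it.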
Kinds of Fourier Transforms
Fourier Transform
Signals: continuous, aperiodic
Spectrum: aperiodic, continuous
Fourier Series
Signals: continuous, periodic
Spectrum: aperiodic, discrete
Discrete Time Fourier Transform
Signals: discrete, aperiodic
Spectrum: periodic, continuous
Discrete Fourier Transform
Signals: discrete, periodic
Spectrum: periodic, discrete
The FFT
• Fast Fourier Transform
– A much, much faster way to do the DFT
– Introduced by Carl F. Gauss in 1805
– Rediscovered by J.W. Cooley and John Tukey in 1965
– The Cooley-Tukey algorithm is the one we use today (mostly)
– Big O notation for this is O(N log N)
– Matlab functions fft and ifft are standard.
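To show where the O(N log N) comes from, here is a minimal recursive radix-2 sketch in the Cooley-Tukey style. It assumes the length is a power of two and is meant for reading, not for speed; production code would call a library routine such as Matlab's `fft`. The name `naive_dft` is a hypothetical helper used only for comparison.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT (length must be 2^m)."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # DFT of the even-indexed samples
    odd = fft(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

def naive_dft(x):
    """Direct O(N^2) DFT for comparison."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

x = [1.0, 0.0, -1.0, 2.0, 0.5, 0.5, -2.0, 3.0]
fast, slow = fft(x), naive_dft(x)
```

Each level of recursion does O(N) work and there are log2(N) levels, which gives the O(N log N) total.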
Windowing
• A function that is zero-valued outside of some chosen interval.
– When a signal (data) is multiplied by a window function, the product is zero-valued outside the interval: all that is left is the "view" through the window.
Example: windowing x[n] with a rectangular window: x[n] × w[n] = z[n]
[Figure: plots of x[n], w[n], and z[n]]
Some famous windows
• Rectangular: $w[n] = 1$
• Triangular (Bartlett): $w[n] = \frac{2}{N-1} \left( \frac{N-1}{2} - \left| n - \frac{N-1}{2} \right| \right)$
• Hann: $w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$
Note: we assume w[n] = 0 outside the range [0, N-1]
[Figure: three plots of amplitude vs. sample, one per window]
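The three formulas can be written out directly. This sketch assumes a window length N of at least 2 and uses hypothetical function names.

```python
import math

def rectangular(N):
    """w[n] = 1 for n = 0..N-1."""
    return [1.0] * N

def bartlett(N):
    """Triangular window: rises linearly to 1 at the centre, 0 at both ends."""
    half = (N - 1) / 2.0
    return [(2.0 / (N - 1)) * (half - abs(n - half)) for n in range(N)]

def hann(N):
    """w[n] = 0.5 * (1 - cos(2 pi n / (N - 1)))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (N - 1)))
            for n in range(N)]

example_bartlett = bartlett(5)
example_hann = hann(5)
```

For N = 5 both the Bartlett and Hann windows take the values 0, 0.5, 1, 0.5, 0; they differ in shape for longer windows.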
Why window shape matters
• Don't forget that a DFT assumes the signal in the window is periodic
• The boundary conditions mess things up… unless you manage to have a window whose length is exactly 1 period of your signal
• Making the edges of the window less prominent helps suppress undesirable artifacts
Fourier Transform of Windows
[Figure: magnitude spectrum of a window, amplitude (dB) vs. normalized angular frequency (-4 to 4), with the main lobe and the sidelobes labeled]
We want
- Narrow main lobe
- Low sidelobes
Which window is better?
[Figure: spectra of the two windows, amplitude (dB) vs. normalized angular frequency (-4 to 4)]
Hann window: $w[n] = 0.5 \left( 1 - \cos \frac{2\pi n}{N-1} \right)$
Hamming window: $w[n] = 0.54 - 0.46 \times \cos \frac{2\pi n}{N-1}$
Multiplication vs. Convolution
Time domain               Frequency domain
$x[n] \cdot y[n]$         $\frac{1}{N} X[k] * Y[k]$
$x[n] * y[n]$             $X[k] \cdot Y[k]$
• Windowing is multiplication in time domain, so the spectrum will be a convolution between the signal's spectrum and the window's spectrum
• Convolution in time domain takes $O(N^2)$, but if we perform in the frequency domain…
– FFT takes $O(N \log N)$
– Multiplication takes $O(N)$
– IFFT takes $O(N \log N)$
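The second row of the table can be checked on a small example. Note that with the DFT the convolution is circular. The sketch below uses a direct DFT for brevity, so it does not show the speed-up itself; swapping in an FFT would give the O(N log N) cost listed above. All function names are hypothetical.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N for n in range(N)]

def circular_convolve(x, y):
    """Direct circular convolution in the time domain, O(N^2)."""
    N = len(x)
    return [sum(x[m] * y[(n - m) % N] for m in range(N)) for n in range(N)]

def convolve_via_dft(x, y):
    """Transform, multiply bin by bin, transform back."""
    product = [a * b for a, b in zip(dft(x), dft(y))]
    return [v.real for v in idft(product)]

x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 0.0, 0.0]  # a one-sample circular delay
direct = circular_convolve(x, y)
via_dft = convolve_via_dft(x, y)
```

Both routes return x delayed by one sample with wrap-around, [4, 1, 2, 3].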
Windowed Signal
[Figure: two panels showing the signal before and after windowing, amplitude (-3 to 3) vs. sample index (0 to 400)]
Spectrum of Windowed Signal
[Figure: magnitude spectrum of the windowed signal, amplitude (dB) vs. frequency (0 to 5000 Hz)]
• Two sinusoids: 1000 Hz + 1500 Hz
• Sampling rate: 10 kHz
• Window length: 100 (i.e. 100/10K = 0.01 s)
• FFT length: 400 (i.e. 4 times zero padding)
Zero Padding
• Add zeros after (or before) the signal to make it longer
• Perform DFT on the padded signal
[Figure: the windowed signal followed by padded zeros, amplitude (-3 to 3) vs. sample index (0 to 1600)]
Why Zero Padding?
• Zero padding in time domain gives the ideal interpolation in the frequency domain.
• It doesn't increase (the real) frequency resolution!
– 4 times is generally enough
– Here the resolution is always fs/L = 100 Hz
[Figure: three spectra of the same windowed signal, amplitude (dB) vs. frequency (0 to 5000 Hz): no zero padding, 4 times zero padding, and 8 times zero padding]
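The interpolation claim can be seen on a tiny example: after padding by a factor of 4, every fourth bin of the padded DFT equals a bin of the unpadded DFT, and the bins in between are new samples of the same underlying spectrum. The helper names and the four-sample signal are assumptions for this sketch.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def zero_pad(x, factor):
    """Append zeros so the result is factor times as long as x."""
    return list(x) + [0.0] * (len(x) * (factor - 1))

x = [1.0, 2.0, 3.0, 4.0]
factor = 4
X = dft(x)
X_padded = dft(zero_pad(x, factor))

# Original bins reappear at every factor-th bin of the padded spectrum
bins_match = all(abs(X_padded[factor * k] - X[k]) < 1e-9
                 for k in range(len(x)))
```

No new information enters the signal, which is why the true resolution stays tied to the window length L rather than to the FFT length.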
How to increase frequency resolution?
• Time-frequency resolution tradeoff: $\Delta t \cdot \Delta f = 1$, with $\Delta t$ in seconds and $\Delta f$ in Hz
[Figure: three spectra, amplitude (dB) vs. frequency (0 to 5000 Hz), for window lengths of 10 ms, 20 ms, and 40 ms]
Short-Time Fourier Transform
• Break signal into windows
• Calculate DFT of each window
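The two steps can be sketched as below. The Hann window, the frame length of 32, the hop of 16, and the test sinusoid are all assumptions for illustration, and a direct DFT stands in for an FFT.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def stft(x, frame_len, hop):
    """Window successive frames of x and return the DFT of each one."""
    window = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (frame_len - 1)))
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = [x[start + n] * window[n] for n in range(frame_len)]
        spectra.append(dft(frame))
    return spectra

# A sinusoid completing 4 cycles per 32-sample frame
signal = [math.sin(2.0 * math.pi * 4 * n / 32) for n in range(128)]
spectra = stft(signal, frame_len=32, hop=16)
peak_bins = [max(range(17), key=lambda k: abs(s[k])) for s in spectra]
```

Stacking the magnitudes of `spectra` column by column gives a spectrogram; for this steady sinusoid every frame peaks at bin 4.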
The Spectrogram
• There is a "spectrogram" function in Matlab, but you can't do zero padding using it.
A Fun Example
(Thanks to Robert Remez)
Overlap-Add Synthesis
• IDFT on each spectrum
– The complex, full spectrum
– Don't forget the phase (often using the original phase).
– If you do it right, the time signal you get is real.
• Multiply with a synthesis window (e.g. Hamming)
– Do not divide by the analysis window
• Overlap and add different frames together.
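The windowing and overlap-add steps can be sketched as follows. Several things here are assumptions rather than the slide's method: the DFT and IDFT are left out (an unmodified spectrum simply returns its frame), a sine window is used for both analysis and synthesis instead of Hamming because its square sums to one at 50% overlap, and the frame length, hop, and test signal are arbitrary.

```python
import math

def sine_window(frame_len):
    """w[n] = sin(pi n / frame_len); w^2 overlaps to 1 at a hop of half a frame."""
    return [math.sin(math.pi * n / frame_len) for n in range(frame_len)]

def overlap_add(frames, hop):
    """Place each frame hop samples after the previous one and sum."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        for n, value in enumerate(frame):
            out[i * hop + n] += value
    return out

frame_len, hop = 8, 4
signal = [float(n % 5) for n in range(32)]
window = sine_window(frame_len)

# Analysis: cut and window the frames (each would be sent to the DFT here)
analysis = [[signal[s + n] * window[n] for n in range(frame_len)]
            for s in range(0, len(signal) - frame_len + 1, hop)]
# Synthesis: multiply by the synthesis window, with no division anywhere
synthesis = [[v * window[n] for n, v in enumerate(frame)]
             for frame in analysis]
rebuilt = overlap_add(synthesis, hop)
```

Away from the first and last half frame, `rebuilt` matches `signal`, which shows that multiplying by a suitable synthesis window and adding is enough; the analysis window never has to be divided out.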
Shepard Tones
• Continuous Risset scale
• Barber's pole
Shepard Tones
• Make a sound composed of sine waves spaced at octave intervals.
• Control their amplitudes by imposing a Gaussian (or something like it) filter in the (log) frequency dimension
• Move all the sine waves up a musical ½ step.
• Wrap around in frequency.