Digital Signal Processing
1. Define statistical variance and covariance.

The variance of a random variable X measures its spread about the mean: Var(X) = E[(X − E[X])²]. In probability theory and statistics, covariance is a measure of how much two random variables change together: Cov(X, Y) = E[(X − E[X])(Y − E[Y])]. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the smaller values, i.e. the variables tend to show similar behavior, the covariance is a positive number. In the opposite case, when the greater values of one variable mainly correspond to the smaller values of the other, i.e. the variables tend to show opposite behavior, the covariance is negative. The sign of the covariance therefore shows the tendency in the linear relationship between the variables. The magnitude of the covariance is not easy to interpret; the normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation.

A distinction has to be made between the covariance of two random variables, a population parameter that can be seen as a property of the joint probability distribution on one side, and on the other side the sample covariance, which serves as an estimated value of that parameter.
Relationship to inner products

Many of the properties of covariance can be extracted elegantly by observing that it satisfies similar properties to those of an inner product:

1. bilinear: for constants a and b and random variables X, Y, and U, Cov(aX + bY, U) = a Cov(X, U) + b Cov(Y, U)
2. symmetric: Cov(X, Y) = Cov(Y, X)
3. positive semi-definite: Var(X) = Cov(X, X) ≥ 0, and Cov(X, X) = 0 implies that X is a constant random variable (K).

In fact these properties imply that the covariance defines an inner product over the quotient vector space obtained by taking the subspace of random variables with finite second moment and identifying any two that differ by a constant. (This identification turns the positive semi-definiteness above into positive definiteness.) That quotient vector space is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the covariance is exactly the L² inner product of real-valued functions on the sample space.
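These three identities can be checked numerically on sample covariances, which satisfy them exactly (a sketch; the random data and the helper name `cov` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X, Y, U = rng.standard_normal((3, 1000))
a, b = 2.0, -3.0

def cov(p, q):
    # sample covariance of two 1-D arrays (off-diagonal of the 2x2 matrix)
    return np.cov(p, q)[0, 1]

# 1. bilinearity: Cov(aX + bY, U) = a Cov(X, U) + b Cov(Y, U)
assert np.isclose(cov(a * X + b * Y, U), a * cov(X, U) + b * cov(Y, U))

# 2. symmetry: Cov(X, Y) = Cov(Y, X)
assert np.isclose(cov(X, Y), cov(Y, X))

# 3. positive semi-definiteness: Var(X) = Cov(X, X) >= 0
assert cov(X, X) >= 0
```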
2. How do you compute the energy of a discrete signal in time and frequency domains?

The energy of a discrete signal x[n] in the time domain is E = Σ_n |x[n]|². By Parseval's theorem, the same energy can be computed in the frequency domain: E = (1/2π) ∫ from −π to π of |X(e^{jω})|² dω for the DTFT, or E = (1/N) Σ_{k=0}^{N−1} |X[k]|² for a length-N DFT.

The Discrete-Time Fourier Transform is a version of the Fourier transform that is used to convert a discrete data set into a continuous-frequency representation. The DTFT is used mostly in theory, and less in practice, because computers are not usually capable of handling continuous-frequency data. The DTFT is also useful because it provides a theoretical basis for the Z-transform.
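The time/frequency energy identity, Σ_n |x[n]|² = (1/N) Σ_k |X[k]|² for the DFT (Parseval's theorem), can be checked numerically (a sketch; the random test signal is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)

E_time = np.sum(np.abs(x) ** 2)            # energy summed in the time domain
X = np.fft.fft(x)
E_freq = np.sum(np.abs(X) ** 2) / len(x)   # Parseval: same energy from the DFT

assert np.isclose(E_time, E_freq)
```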
Conceptually, it is important to note that signal processing operates on an abstract representation of a physical quantity and not on the quantity itself. At the same time, the type of abstract representation we choose for the physical phenomenon of interest determines the nature of a signal processing unit. A temperature regulation device, for instance, is not a signal processing system as a whole. The device does however contain a signal processing core in the feedback control unit which converts the instantaneous measure of the temperature into an ON/OFF trigger for the heating element. The physical nature of this unit depends on the temperature model: a simple design is that of a mechanical device based on the dilation of a metal sensor; more likely, the temperature signal is a voltage generated by a thermocouple, and in this case the matched signal processing unit is an operational amplifier.

Finally, the adjective "digital" derives from digitus, the Latin word for finger: it concisely describes a world view where everything can ultimately be represented as an integer number. Counting, first on one's fingers and then in one's head, is the earliest and most fundamental form of abstraction; as children we quickly learn that counting does indeed bring disparate objects (the proverbial "apples and oranges") into a common modeling paradigm, i.e. their cardinality. Digital signal processing is a flavor of signal processing in which everything, including time, is described in terms of integer numbers; in other words, the abstract representation of choice is a one-size-fits-all countability. Note that our earlier "thought experiment" about ambient temperature fits this paradigm very naturally: the measuring instants form a countable set (the days in a month) and so do the measures themselves (imagine a finite number of ticks on the thermometer's scale). In digital signal processing the underlying abstract representation is always the set of natural numbers regardless of the signal's origins; as a consequence, the physical nature of the processing device will also always remain the same, that is, a general digital (micro)processor. The extraordinary power and success of digital signal processing derives from the inherent universality of its associated "world view".
3. Define sample autocorrelation function. Give the mean value of this estimate.

Autocorrelation is the cross-correlation of a signal with itself. Informally, it is the similarity between observations as a function of the time separation between them. It is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal which has been buried under noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies. It is often used in signal processing for analyzing functions or series of values, such as time-domain signals.
In statistics, the autocorrelation of a random process describes the correlation between values of the process at different points in time, as a function of the two times or of the time difference. Let X be some repeatable process, and i be some point in time after the start of that process. (i may be an integer for a discrete-time process or a real number for a continuous-time process.) Then X_i is the value (or realization) produced by a given run of the process at time i. Suppose that the process is further known to have defined values for mean μ_i and variance σ_i² for all times i. Then the definition of the autocorrelation between times s and t is

R(s, t) = E[(X_t − μ_t)(X_s − μ_s)] / (σ_t σ_s),

where "E" is the expected value operator. Note that this expression is not well-defined for all time series or processes, because the variance may be zero (for a constant process) or infinite. If the function R is well-defined, its value must lie in the range [−1, 1], with 1 indicating perfect correlation and −1 indicating perfect anti-correlation.

If X_t is a second-order stationary process then the mean μ and the variance σ² are time-independent, and further the autocorrelation depends only on the difference between t and s: the correlation depends only on the time-distance between the pair of values but not on their position in time. This further implies that the autocorrelation can be expressed as a function of the time-lag, and that this would be an even function of the lag τ = s − t. This gives the more familiar form

R(τ) = E[(X_t − μ)(X_{t+τ} − μ)] / σ²

and the fact that this is an even function can be stated as

R(τ) = R(−τ).
It is common practice in some disciplines, other than statistics and time series analysis, to drop the normalization by σ² and use the term "autocorrelation" interchangeably with "autocovariance". However, the normalization is important both because the interpretation of the autocorrelation as a correlation provides a scale-free measure of the strength of statistical dependence, and because the normalization has an effect on the statistical properties of the estimated autocorrelations.
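As for the sample autocorrelation itself: for data x[0], …, x[N−1] with sample mean x̄, the biased estimator at lag k is r̂(k) = (1/N) Σ_{n=0}^{N−1−k} (x[n] − x̄)(x[n+k] − x̄). Its mean is approximately (1 − k/N) times the true autocovariance, so the estimate is biased toward zero at large lags but asymptotically unbiased as N → ∞. A minimal sketch (function name and test data are my own):

```python
import numpy as np

def sample_autocorr(x, k):
    # biased sample autocovariance at lag k (normalized by N, not N - k)
    x = np.asarray(x, dtype=float)
    N = len(x)
    xc = x - x.mean()
    return np.dot(xc[:N - k], xc[k:]) / N

rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)     # white noise: true autocovariance is 0 for k > 0
r0 = sample_autocorr(x, 0)          # close to the noise variance (about 1)
r5 = sample_autocorr(x, 5)          # close to 0
```

Dividing r̂(k) by r̂(0) gives the normalized sample autocorrelation, which lies in [−1, 1].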
4. What is the basic principle of Welch method to estimate power spectrum?

In physics, engineering, and applied mathematics, Welch's method, named after P. D. Welch, is used for estimating the power of a signal at different frequencies: that is, it is an approach to spectral density estimation. The method is based on the concept of using periodogram spectrum estimates, which are the result of converting a signal from the time domain to the frequency domain. Welch's method is an improvement on the standard periodogram spectrum estimating method and on Bartlett's method, in that it reduces noise in the estimated power spectra in exchange for reducing the frequency resolution. Because of the noise caused by imperfect and finite data, this noise reduction is often desired.

The Welch method is based on Bartlett's method and differs in two ways:
1. The signal is split up into overlapping segments: the original data segment is split up into L data segments of length M, overlapping by D points.
   1. If D = M/2, the overlap is said to be 50%.
   2. If D = 0, the overlap is said to be 0%. This is the same situation as in Bartlett's method.
2. The overlapping segments are then windowed: after the data is split up into overlapping segments, the individual L data segments have a window applied to them (in the time domain).
   1. Most window functions afford more influence to the data at the center of the set than to data at the edges, which represents a loss of information. To mitigate that loss, the individual data sets are commonly overlapped in time (as in the step above).
   2. The windowing of the segments is what makes the Welch method a "modified" periodogram.
After doing the above, the periodogram is calculated by computing the discrete Fourier transform, and then computing the squared magnitude of the result. The individual periodograms are then time-averaged, which reduces the variance of the individual power measurements. The end result is an array of power measurements vs. frequency "bin".
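The steps above can be sketched directly with numpy (a sketch, not a reference implementation: segment length M, overlap D, and the Hann window and normalization convention are my choices; `scipy.signal.welch` provides a production version):

```python
import numpy as np

def welch_psd(x, M, D, fs=1.0):
    # Welch PSD: split x into length-M segments overlapping by D points,
    # window each (Hann), take |DFT|^2, and average the periodograms.
    x = np.asarray(x, dtype=float)
    w = np.hanning(M)
    U = np.sum(w ** 2)                    # window power, for normalization
    step = M - D
    segs = [x[i:i + M] for i in range(0, len(x) - M + 1, step)]
    psd = np.zeros(M)
    for s in segs:
        X = np.fft.fft(w * s)
        psd += np.abs(X) ** 2
    psd /= len(segs) * U * fs
    f = np.fft.fftfreq(M, d=1.0 / fs)
    return f[:M // 2], psd[:M // 2]       # keep the positive-frequency half

# illustrative test: a 50 Hz sinusoid in noise, sampled at 1000 Hz
fs = 1000.0
t = np.arange(8192) / fs
x = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.default_rng(0).standard_normal(t.size)
f, P = welch_psd(x, M=256, D=128, fs=fs)
peak = f[np.argmax(P)]                    # the spectral peak lands near 50 Hz
```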
5. How do you find the ML estimate?

In statistics, maximum-likelihood estimation (MLE) is a method of estimating the parameters of a statistical model. When applied to a data set and given a statistical model, maximum-likelihood estimation provides estimates for the model's parameters. The method of maximum likelihood corresponds to many well-known estimation methods in statistics. For example, one may be interested in the heights of adult female giraffes, but be unable, due to cost or time constraints, to measure the height of every single giraffe in a population. Assuming that the heights are normally (Gaussian) distributed with some unknown mean and variance, the mean and variance can be estimated with MLE while only knowing the heights of some sample of the overall population. MLE would accomplish this by taking the mean and variance as parameters and finding the particular parameter values that make the observed results the most probable (given the model).

In general, for a fixed set of data and underlying statistical model, the method of maximum likelihood selects the values of the model parameters that produce a distribution that gives the observed data the greatest probability (i.e., parameters that maximize the likelihood function). Maximum-likelihood estimation gives a unified approach to estimation, which is well-defined in the case of the normal distribution and many other problems. However, in some complicated problems difficulties do occur: in such problems, maximum-likelihood estimators are unsuitable or do not exist.
Suppose there is a sample x_1, x_2, …, x_n of n independent and identically distributed observations, coming from a distribution with an unknown pdf f_0(·). It is however surmised that the function f_0 belongs to a certain family of distributions {f(·|θ), θ ∈ Θ}, called the parametric model, so that f_0 = f(·|θ_0). The value θ_0 is unknown and is referred to as the "true value" of the parameter. It is desirable to find some estimator which would be as close to the true value θ_0 as possible. Both the observed variables x_i and the parameter θ can be vectors.
To use the method of maximum likelihood, one first specifies the joint density function for all observations. For an iid sample this joint density function is

f(x_1, x_2, …, x_n | θ) = f(x_1|θ) · f(x_2|θ) ⋯ f(x_n|θ).

Now we look at this function from a different perspective by considering the observed values x_1, x_2, …, x_n to be fixed "parameters" of this function, whereas θ will be the function's variable and allowed to vary freely. From this point of view this function is called the likelihood:

L(θ; x_1, …, x_n) = ∏_{i=1}^{n} f(x_i|θ).
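For the Gaussian example above, maximizing the likelihood has a closed form: the ML estimates are the sample mean and the 1/n sample variance. A minimal sketch (the "giraffe height" numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
true_mu, true_sigma = 4.5, 0.3                # hypothetical giraffe heights, metres
x = rng.normal(true_mu, true_sigma, 2000)     # the observed sample

mu_hat = x.mean()                             # ML estimate of the mean
var_hat = np.mean((x - mu_hat) ** 2)          # ML estimate of the variance (1/n, not 1/(n-1))
```

Note the 1/n variance estimator: maximizing the likelihood gives the biased divisor n, not the unbiased n − 1.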
6.
Give the basic principle of Levinson recursion.
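The notes leave this question unanswered. The basic principle: the Levinson (Levinson-Durbin) recursion exploits the Toeplitz structure of the Yule-Walker normal equations to solve them in O(p²) operations instead of the O(p³) of general Gaussian elimination, by extending the order-(m−1) solution to order m through a single reflection coefficient per step. A hedged sketch (function name and test values are my own):

```python
import numpy as np

def levinson_durbin(r, p):
    # Solve the order-p Yule-Walker (Toeplitz) equations recursively.
    # r: autocorrelation values r[0..p]; returns AR coefficients a[0..p]
    # (with a[0] = 1) and the final prediction-error power E.
    a = np.zeros(p + 1)
    a[0] = 1.0
    E = r[0]
    for m in range(1, p + 1):
        # reflection coefficient from the order-(m-1) solution
        k = -(r[m] + np.dot(a[1:m], r[m-1:0:-1])) / E
        a[1:m] = a[1:m] + k * a[m-1:0:-1]   # update the interior coefficients
        a[m] = k
        E *= (1.0 - k * k)                  # error power shrinks at each order
    return a, E

# AR(1) check: r[k] = 0.5**k has the exact order-2 solution a = [1, -0.5, 0]
r = np.array([1.0, 0.5, 0.25])
a, E = levinson_durbin(r, 2)
```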
7.
Why are FIR filters widely used for adaptive filters?
8. Express the Widrow-Hoff LMS adaptive algorithm. State its properties.
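The notes leave this answer blank as well. The Widrow-Hoff LMS update is w[n+1] = w[n] + μ e[n] x[n], with a-priori error e[n] = d[n] − wᵀx[n]. Its key properties: O(p) cost per sample, no matrix inversion, convergence in the mean for step sizes roughly 0 < μ < 2/(p·σx²), and a μ-controlled trade-off between adaptation speed and steady-state misadjustment. A minimal sketch (the system-identification setup is illustrative):

```python
import numpy as np

def lms(x, d, p, mu):
    # Widrow-Hoff LMS: adapt a length-p FIR filter w so that w^T x_vec
    # tracks the desired signal d[n]; update w <- w + mu * e[n] * x_vec.
    N = len(x)
    w = np.zeros(p)
    e = np.zeros(N)
    for n in range(p, N):
        x_vec = x[n:n - p:-1]            # the most recent p input samples
        y = np.dot(w, x_vec)             # filter output
        e[n] = d[n] - y                  # a-priori error
        w += mu * e[n] * x_vec           # stochastic-gradient step
    return w, e

# identify an "unknown" 3-tap FIR system from white-noise input
rng = np.random.default_rng(0)
h = np.array([0.5, -0.3, 0.2])           # the system to identify
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)]           # desired signal = system output
w, e = lms(x, d, p=3, mu=0.01)           # w converges toward h
```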