LITERATURE REVIEW on OPTICAL FLOW
Recovering 3-D structure using a set of 2-D images is an important problem in the field of computer vision. One method for collecting a set of images is the temporal accumulation of information through a monocular observer. The relationship between subsequent still images in a video stream provides a wealth of information in the form of spatio-temporal change. The temporal integration of such velocity information in both 2-D and 3-D space is essential for solving shape-from-motion [26], time-of-collision [8], object tracking [20], object recognition [2], and figure-ground problems.
It is intuitively sound to suggest that changes in intensity on an image plane are somewhat coupled with the projection of the apparent motion of the 3-D space surrounding the plane. It is, however, incorrect to say that such projections are unique and complete. The loss of a dimension, the quantization of intensity, the discrete sampling of infinitesimal spatial data, and sensor noise make the problem of recovering 3-D motion from a 2-D intensity distribution ill-posed.
There are two levels of ill-posedness in recovering 3-D motion from a sequence of intensity distributions. The first level involves recovering the true 2-D velocity field from the subsequent intensity images formed on the retina of the 2-D imaging device. The second involves constructing a fully determined system of 2-D flow fields, such that the 3-D motion can be fully, uniquely, and stably recovered.
Under constraints of rigid motion and weak perspective, the second problem becomes tractable [26]. This literature review will concern itself with the more difficult problem of recovering the 2-D flow from the intensity fields. This is commonly referred to in the literature as the optical flow problem.
Barron et al. [3] define the optical flow problem as that of "computing the approximation to the 2-D motion field – a projection of the 3-D velocities of surface points onto the imaging surface – from spatio-temporal patterns".
Two comprehensive papers on the subject of optical flow performance exist. Barron et al. [3] have produced a paper that compares nine classic flow algorithms on the basis of accuracy and density. They provide a clear test set of image sequences that can be used for quantitative and qualitative comparison of the different algorithms. More recent work by Liu et al. [16] has improved on Barron et al.'s study by including efficiency in the evaluation of the algorithms. They provide a coordinate system that compares accuracy with efficiency. A curve is constructed in this coordinate system by changing the search areas of the different algorithms.
As stated above, recovering the 2-D velocity field from a sequence of intensity images is ill-posed [13]. The problem is ill-posed because local intensity alone fails to completely encode motion information. For example, in regions of constant intensity, motion cannot be detected and an infinite number of solutions exist. Even when the intensity is constant in a given direction, the solution is still only partially available. Under such conditions it is only possible to recover the component of the velocity that lies along the local intensity gradient, perpendicular to the direction in which intensity is constant. This is referred to as the aperture problem [21].
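The underdetermination can be made concrete with a small numerical sketch (the derivative values below are illustrative, not drawn from any of the surveyed implementations): a single gradient constraint fixes only the velocity component along the intensity gradient.

```python
import numpy as np

# A single brightness-constancy constraint  Ix*u + Iy*v + It = 0  is one
# equation in two unknowns, so only the velocity component along the
# intensity gradient (the "normal flow") is determined.
Ix, Iy, It = 2.0, 0.0, -3.0   # assumed local derivatives: intensity varies only in x

grad = np.array([Ix, Iy])
normal_flow = -It * grad / np.dot(grad, grad)   # component along the gradient
print(normal_flow)            # the x-component is fixed at 1.5 ...

# ... but any velocity with the same normal component satisfies the constraint:
for v in ([1.5, 0.0], [1.5, 7.0], [1.5, -2.0]):
    print(np.isclose(Ix * v[0] + Iy * v[1] + It, 0.0))
```

All three candidate velocities satisfy the constraint exactly, which is precisely the ambiguity the aperture problem names.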
Another example of how intensity fails to describe motion is Horn's mirror ball problem [14].
Consider a sphere with no texture. As the sphere rotates about its center, no change in intensity
is observed, yet the sphere does possess a motion field.
Because the problem of recovering the 2-D velocity field is ill-posed, additional information must be added to the problem statement to clearly define a closed set in which a single stable solution exists. This additional information takes the form of additional constraints imposed on the problem and regularization strategies.
In [14], Horn formalizes the image flow constraint, thus creating an incomplete correlation between the motion domain and the intensity domain. This constraint can be interpreted as the assumption that a point on the 3-D shape, when projected onto the 2-D image plane, maintains a constant intensity over time. Mathematically this is formulated as

dI(x, y, t)/dt = 0    (1)

where I(x, y, t) is the intensity of the image at a point (x, y) at a time t.
Simoncelli [23] presents this constraint from a Bayesian perspective by suggesting that the maximum a posteriori 2-D displacement v of a point in 3-D space, when projected into an intensity plane I, is obtained as

v̂ = arg max_v P(v | I)    (2)
On top of constraints, regularization strategies are used to interpolate and propagate flow estimates from areas of greater certainty to those of lower certainty where the aperture problem prevails. Some algorithms simply perform post-filtering, while others provide a measure of confidence for each estimate, thus providing a criterion for better redistribution of the high-confidence velocity estimates to areas of lower confidence.
Thus an optical flow algorithm is specified by three elements [3]:
- the spatio-temporal operators that are applied to the image sequence to extract features and improve the signal-to-noise ratio,
- how velocity estimates are produced from a gradient search of the extracted feature space, and
- the form of regularization applied to the flow field, considering confidence measures if they exist.
Barron et al. [3] classify optical flow algorithms by their signal-extraction stage. This provides four groups: differential techniques, energy-based methods, phase-based techniques, and region-based matching.
Liu et al. [16] prefer to classify algorithms into two groups: those that perform a gradient search on extracted structure of the image sequence and those that do not. This effectively groups the differential techniques, energy-based methods, and phase-based methods into a single class of algorithms referred to as gradient methods, while leaving the region-matching techniques in a class of their own. This classification is not truly justified, as region-based matching also performs hill climbing, only on a much coarser level than filter-dependent algorithms. The coarser search provides quicker, more robust results. The cost of the improved efficiency is reduced accuracy.
Differential Techniques

Differential techniques are characterized by a gradient search performed on extracted first- and second-order spatial derivatives and temporal derivatives. From the Taylor expansion of the flow constraint equation, the gradient constraint equation is obtained:

∇I · v + I_t = 0    (3)

where ∇I = (I_x, I_y) is the spatial intensity gradient, I_t the temporal derivative, and v = (u, v) the image velocity.
Horn and Schunck [15] combine a global smoothness term with the gradient constraint equation to obtain a functional for estimating optical flow. Their choice of smoothness term minimizes the absolute gradient of the velocity:

∬ [ (∇I · v + I_t)² + λ² (|∇u|² + |∇v|²) ] dx dy    (4)

This functional can be reduced to a pair of recursive equations that must be solved iteratively.
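The iterative scheme can be sketched in a few lines of numpy. This is a minimal illustration of the recursive update, assuming the derivative fields Ix, Iy, It are already estimated; the periodic borders and the ramp-image test input are conveniences of the sketch, not part of the authors' implementation.

```python
import numpy as np

def horn_schunck(Ix, Iy, It, alpha=1.0, n_iter=100):
    """Iterative Horn-Schunck update from precomputed derivatives."""
    u = np.zeros_like(Ix, dtype=float)
    v = np.zeros_like(Ix, dtype=float)
    for _ in range(n_iter):
        # local flow averages (periodic borders are adequate for a sketch)
        u_bar = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                        + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        v_bar = 0.25 * (np.roll(v, 1, 0) + np.roll(v, -1, 0)
                        + np.roll(v, 1, 1) + np.roll(v, -1, 1))
        # correct the averaged flow against the gradient-constraint residual
        num = Ix * u_bar + Iy * v_bar + It
        den = alpha**2 + Ix**2 + Iy**2
        u = u_bar - Ix * num / den
        v = v_bar - Iy * num / den
    return u, v

# a ramp image translating one pixel per frame in x: Ix = 1, Iy = 0, It = -1
Ix = np.ones((8, 8)); Iy = np.zeros((8, 8)); It = -np.ones((8, 8))
u, v = horn_schunck(Ix, Iy, It)
print(round(u.mean(), 3), round(v.mean(), 3))   # → 1.0 0.0
```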
Lucas and Kanade [18] also construct a flow estimation technique based on first-order derivatives of the image sequence. In contrast to Horn and Schunck's post-smoothing regularization, they choose to pre-smooth the data before using the gradient constraint equation. Mathematically, the flow estimate minimizes

Σ_x W²(x) [∇I(x, t) · v + I_t(x, t)]²    (5)

where W(x) is a window that gives more influence to constraints near the center of the neighborhood.

This can be reduced to a closed-form solution for the flow estimates:

v = (Aᵀ W² A)⁻¹ Aᵀ W² b    (6)

and

A = [∇I(x₁), …, ∇I(x_n)]ᵀ,  b = −(I_t(x₁), …, I_t(x_n))ᵀ    (7)

One important advantage of this approach over Horn and Schunck [15] is the existence of a confidence measure. The smallest eigenvalue of

Aᵀ W² A    (8)

provides a measure to distinguish estimates of normal velocity from full 2-D velocity.
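The closed-form estimate and its eigenvalue confidence measure can be sketched as follows. The derivative samples and window weights are assumed given, and the test data is a synthetic, exactly consistent translation; this is an illustrative sketch, not Lucas and Kanade's implementation.

```python
import numpy as np

def lucas_kanade_patch(Ix, Iy, It, w):
    """Weighted least-squares flow for one neighborhood.

    Ix, Iy, It: flattened derivative samples in the window; w: window weights.
    Returns the flow estimate of Eq. (6) and the smallest eigenvalue of
    A^T W^2 A (Eq. 8), used as the confidence measure.
    """
    A = np.stack([Ix, Iy], axis=1)        # one gradient constraint per pixel
    W2 = np.diag(w ** 2)
    M = A.T @ W2 @ A                      # the 2x2 matrix of Eq. (8)
    b = A.T @ W2 @ (-It)
    flow = np.linalg.solve(M, b)
    lam_min = np.linalg.eigvalsh(M)[0]    # eigvalsh returns ascending order
    return flow, lam_min

# synthetic consistent constraints for a patch translating at v = (1, 2)
rng = np.random.default_rng(0)
Ix = rng.normal(size=25); Iy = rng.normal(size=25)
It = -(Ix * 1.0 + Iy * 2.0)
flow, lam_min = lucas_kanade_patch(Ix, Iy, It, np.ones(25))
print(flow.round(3))                      # → [1. 2.]
```

A small λ_min signals an ill-conditioned system, i.e. a neighborhood where only normal velocity is recoverable.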
Barron et al. report in their survey [3] that Lucas and Kanade's algorithm provides the second most accurate results. Liu et al. [16] evaluate Lucas and Kanade as providing the third best efficiency-accuracy curve. This has motivated much work on Lucas and Kanade's algorithm.
Fleet and Langley [10] attempt a more efficient implementation using IIR temporal pre-filtering and temporal recursive estimation for regularization. The temporal support was reduced to 3 frames and the computation time improved, while only slightly diminishing performance.
Accuracy issues are also tackled, especially in the domain of discontinuities. Discontinuities provide information about occlusion and shape. Thus researchers have attempted to reduce the effects of smoothing along steep intensity gradients. Nagel and Enkelmann [19] were the first to formulate an oriented smoothness constraint:

∬ (∇I · v + I_t)² + (α² / (|∇I|² + 2δ)) [ (u_x I_y − u_y I_x)² + (v_x I_y − v_y I_x)² + δ (u_x² + u_y² + v_x² + v_y²) ] dx dy    (9)

where δ is a fixed constant.
Ghosal and Vanek [11] look at weighted anisotropic filtering to reduce the loss of discontinuity information when regularizing. Using the eigenvalues of (8), when W is the identity matrix, they establish weights for imposing smoothness along the x and y directions.
In the same spirit, Spetsakis [25] uses an adaptive Gaussian filter approach to minimize the loss of occluding information. He applies this Gaussian to the normal equations of (8) and obtains

(10)

A velocity estimate is obtained from the solution of the system when the residual constraints are zero.

The size of the Gaussian filter applied to a given pixel should be governed by the following rules: the Gaussian should be larger when
- the flow is smooth, and
- instabilities in the system of equations occur.

Spetsakis derives a measure of confidence called the incompatible measure. This value is determined from the residuals that result when reverse-injecting the estimate into (10) with the Gaussian taken as the identity operator. If the residual is high, the solution is considered robust. If the residual is low, then the Gaussian has not gathered enough information and its size must be increased.
More generally, differential methods can be seen as band-pass signal extraction. Therefore they only provide a local representation in frequency space and thus are restricted to performing well on an interval of velocities characterized by the pre-smoothing of the spatio-temporal signal before numerical differentiation. Flow information may not always be limited to a tight frequency band. These problems are somewhat tackled by Spetsakis, and by Ghosal and Vanek, through adaptive filtering. Heeger [12], and Fleet and Jepson [9], however, suggest that providing information for multiple frequency bands is much more accurate. This approach is much more computationally expensive as well [16].
Energy-Based Methods

The advantage of energy-based methods is the hierarchical decomposition of the image sequence in the frequency domain. Energy-based techniques extract velocities by using families of band-pass filters that are velocity- and orientation-tuned. The Fourier transform of a translating 2-D pattern is

Î(k, ω) = Î₀(k) δ(ω + vᵀk)    (11)

where Î₀(k) is the Fourier transform of the pattern I(x, 0), δ is a Dirac delta function, ω denotes temporal frequency, and k denotes spatial frequency. This effectively implies that all power associated with the translation will be mapped to a plane that traverses the origin in frequency space.
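This plane property can be checked numerically. The sketch below uses assumed synthetic data: a random periodic texture shifted by an integer velocity each frame, so that essentially all spectral power satisfies ω + v·k ≡ 0 (mod 1) in cycles per sample.

```python
import numpy as np

N, T = 16, 16
v = np.array([1.0, 2.0])                  # pixels per frame along (y, x)

rng = np.random.default_rng(1)
texture = rng.normal(size=(N, N))
# shift the texture by v each frame; axes of seq are (t, y, x)
seq = np.stack([np.roll(texture, (int(v[0]) * t, int(v[1]) * t), axis=(0, 1))
                for t in range(T)])

F = np.fft.fftn(seq)
om, ky, kx = np.meshgrid(np.fft.fftfreq(T), np.fft.fftfreq(N),
                         np.fft.fftfreq(N), indexing="ij")
phase = (om + v[0] * ky + v[1] * kx) % 1.0
on_plane = np.isclose(phase, 0.0) | np.isclose(phase, 1.0)

power = np.abs(F) ** 2
ratio = power[~on_plane].sum() / power.sum()  # fraction of power off the plane
print(ratio)                                  # ≈ 0
```

All of the translating pattern's power concentrates on the motion plane, which is what velocity-tuned filter banks exploit.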
As such, Heeger [12] uses a family of twelve Gabor filters of different spatial resolutions to extract velocity information from the image sequences. By using Gabor filters [9], which provide simultaneous spatio-temporal and frequential localization, a clean band-pass representation is obtained. A least-squares fit is then applied to the resulting distribution in frequency space.
Phase-Based Techniques

The most accurate optical flow estimations are produced using Fleet and Jepson's [9] phase-based approach [3,16]. Phase-based methods also use a family of velocity-tuned filters to extract a local-frequency representation of the image sequence. Flow estimates are provided by gradient search in the phase space of the extracted signatures.
Motivation for this approach is based on the argument that the evolution of phase contours provides a good approximation to the projected motion field. The phase output of band-pass filters is generally more stable than the amplitude when small translations in the scene are sought. As optical flow is a localized measurement, it is often characterized by small displacements. Thus deriving the velocity from phase as opposed to magnitude is advantageous.
Region Matching

Region matching is particular in that it forms the filters for feature extraction from the previous image in the sequence. Tiles from the previous image are correlated with the next image using some distance measure. The best match provides the most likely displacement. This is equivalent to searching a spatially shifted and temporally differentiated space.

This approach provides more robustness with respect to numerical differentiation and is generally quicker, since it constructs a highly quantized gradient distribution. As mentioned earlier, this distribution is so coarse that Liu et al. [16] classify it as a non-gradient algorithm.
The distance measure used by more classical algorithms, such as Anandan's [1] and Singh and Allen's [24], is referred to as the sum-of-squared differences (SSD). It is formulated as

SSD(x, d) = Σ_{(i,j)} W(i, j) [ I₁(x + (i, j)) − I₂(x + d + (i, j)) ]²    (12)

where W is a 2-D window function and d denotes the suggested displacement vector.
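A brute-force version of this search can be sketched as follows, with a uniform window W and synthetic test frames; this is an illustrative sketch, not any surveyed algorithm's implementation.

```python
import numpy as np

def ssd_match(patch, image2, center, search=3):
    """Brute-force SSD region matching with a uniform window W.

    Slides `patch` over a (2*search+1)^2 neighborhood of `center` in
    `image2` and returns the displacement with the smallest SSD.
    """
    h, w = patch.shape
    cy, cx = center
    best, best_d = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = image2[cy + dy:cy + dy + h, cx + dx:cx + dx + w]
            ssd = np.sum((patch - cand) ** 2)
            if ssd < best:
                best, best_d = ssd, (dy, dx)
    return best_d

# synthetic pair: frame 2 is frame 1 shifted by (1, 2) pixels
rng = np.random.default_rng(2)
frame1 = rng.normal(size=(24, 24))
frame2 = np.roll(frame1, (1, 2), axis=(0, 1))
d = ssd_match(frame1[8:13, 8:13], frame2, (8, 8))
print(d)   # → (1, 2)
```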
Anandan constructs a multi-scale method based on the Burt Laplacian pyramid [6]. A coarse-to-fine strategy is adopted such that larger displacements are first determined from less resolved versions of the images and then improved with more accurate, higher-resolution versions of the image. This strategy is well suited for large-scale displacements but is less successful for sub-pixel velocities.
Confidence measures c_max and c_min, which are based on the principal curvatures of the SSD surface, are used to steer the smoothing process. The smoothness constraint is based on the principal axes of the SSD surface, e_max and e_min, the estimated displacement d, and the sought best-fit velocity estimate v. Anandan also includes Horn and Schunck's [15] formulation of the smoothness constraint. Mathematically,

∬ (u_x² + u_y² + v_x² + v_y²) + c_max (v · e_max − d · e_max)² + c_min (v · e_min − d · e_min)² dx dy    (13)
Singh and Allen [24] provide another approach using the SSD. They use a three-frame approach to the region matching method to average out temporal error in the SSD. For a frame 0, they form an SSD distribution with respect to frame −1 and frame +1 as such:

SSD₀(x, d) = SSD(I₀, I₊₁; x, d) + SSD(I₀, I₋₁; x, −d)    (14)

From SSD₀, Singh and Allen build a probability distribution:

R(d) = e^(−k SSD₀(x, d))    (15)

where k is a normalization constant. The sub-pixel flow estimates are then obtained by considering the mean of the distribution with respect to the candidate displacements d = (d_x, d_y):

u = Σ_d R(d) d_x / Σ_d R(d),  v = Σ_d R(d) d_y / Σ_d R(d)    (16)
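The response-distribution step can be sketched numerically; beta below is an assumed sharpness parameter standing in for k, and the 1-D SSD values are toy numbers chosen for illustration.

```python
import numpy as np

def subpixel_mean(ssd, displacements, beta=1.0):
    """Weighted-mean displacement from an SSD distribution (Eqs. 15-16 sketch)."""
    R = np.exp(-beta * ssd)       # response distribution
    R = R / R.sum()               # normalize (the constant of Eq. 15)
    return R @ displacements      # probability-weighted mean displacement

# toy 1-D SSD surface over integer candidates; the minimum is shared by
# d = 0 and d = 1, so the weighted mean lands between them
disp = np.array([-1.0, 0.0, 1.0])
ssd = np.array([4.0, 0.5, 0.5])
est = subpixel_mean(ssd, disp, beta=2.0)
print(round(est, 3))              # ≈ 0.5, a sub-pixel estimate
```

The weighted mean interpolates between integer candidates, which is how the method reaches sub-pixel resolution without a finer search grid.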
Singh and Allen [24] employ a Laplacian pyramid strategy similar to that of Anandan [1]. This provides a more symmetric distribution about displacement estimates in the SSD. A covariance matrix is then constructed from these estimates as such:

S_c = (1 / Σ_d R(d)) Σ_d R(d) (d − d̄)(d − d̄)ᵀ    (17)

Singh suggests that the eigenvalues of the inverse of S_c provide a measure of confidence for the displacement estimate. For a given flow field, the least-squares estimate v̄ in a neighborhood about a point can be obtained from

(18)

A covariance matrix S_n can then be generated in the same manner as (17) from (18). Flow regularization is then obtained by minimizing the sum of the Mahalanobis distances between the estimated flow field v and the two distributions (d̄, S_c) and (v̄, S_n):

(v − d̄)ᵀ S_c⁻¹ (v − d̄) + (v − v̄)ᵀ S_n⁻¹ (v − v̄)    (19)

The eigenvalues of the covariance matrices serve as confidence measures for the regularization process.
Benoits and Ferrie [4] build a more robust and simplified region matching metric that is similar to the SSD. They denote a tile of the pixel pattern in frame 1 at a given position and a corresponding pattern in frame 2, and the match distance between the two tiles is provided by a combination of the absolute difference and sum in intensities for the tiles:

(20)

Thresholding is applied to the difference and sum distributions. When the sum of pixel intensities is small, the data is considered unusable. When the difference between successive inputs is small, the intensities should be considered the same. From the total difference-sum ratio distribution, the average pixel-matching error is summarized.
Linear flow consistency is implemented using adaptive diffusion. Similarity measures are constructed for neighborhoods of flow based on magnitude and direction. These are averaged to form a single similarity metric.
An interesting region matching flow algorithm is Camus' quantized flow [7]. Camus constructs a real-time flow algorithm on the idea that performing a search over time instead of over space is linear in nature rather than quadratic. A quantized sub-pixel displacement field results. Liu et al. [16] report that Camus' algorithm provides one of the two best accuracy-efficiency ratio curves.
The SSD search space constructed in Camus' algorithm is limited spatially to small areas yet extends itself in time. This is denoted as a temporal search S frames deep and a spatial search over a (2n+1)×(2n+1) pixel area. The success of this algorithm is based on the idea that support for a faster frame rate reduces the necessary area of spatial search by providing a better sampling rate. Another efficient element of this algorithm is its suitability for integer arithmetic. However, it only provides a quantized flow field drawn from a fixed, discrete set of possible velocities.
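The resulting velocity quantization can be enumerated directly. The values of n and S below are illustrative search limits, not values from Camus' paper: matching a displacement of up to n pixels t frames back corresponds to a velocity of d/t pixels per frame.

```python
def quantized_velocities(n=1, S=3):
    """Candidate velocities d/t for displacements |d| <= n matched t frames back."""
    vels = {(dx / t, dy / t)
            for t in range(1, S + 1)
            for dx in range(-n, n + 1)
            for dy in range(-n, n + 1)}
    return sorted(vels)

# deeper temporal search (larger S) refines the quantization near zero
# without enlarging the (2n+1)^2 spatial window
vels = quantized_velocities(n=1, S=3)
print(len(vels))                          # → 25 distinct velocities
print(sorted({abs(x) for x, _ in vels}))  # speeds along x: 0, 1/3, 1/2, 1
```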
Other Algorithms

The only other algorithm that obtains efficiency-accuracy results comparable to Camus' approach is that of Liu et al. [17]. Using a general steady 3-D motion model in combination with 3-D Hermite polynomial differentiation filters, an efficient and accurate algorithm is constructed [16]. Their approach is similar to that of Heeger [12] and Fleet and Jepson [9], as it requires generating a family of spatio-temporal filters. The filters are thus tuned to reflect the 3-D motion model when projected onto a 2-D perspective model.
Hermite polynomial filters offer several advantages. Orthogonal and Gaussian properties ensure stability. They are extensible to higher-order derivatives. Finally, they reflect numerous physiological models that support receptive fields being modeled by Gaussian derivatives of various widths.
Other original approaches to flow estimation include Nesi et al.'s [21] work. They obtain better discontinuous flow estimates using clustering techniques. They use the Combinatorial Hough Transform to propose a multipoint solution with maximum-likelihood estimation. They argue that clustering techniques provide a better approach to solving the flow-blurring problem than more traditional techniques that use least-squares estimation.
Using the flow constraint equation, Nesi et al. build a line parameterization of the linear system provided by a neighborhood of pixels:

I_x u + I_y v + I_t = 0    (21)

where each pixel's constraint defines a line in the (u, v) velocity space.
Votes are accumulated by counting the number of lines that intersect in any given neighborhood of the parameterized space. This effectively provides a discriminating function for possible solutions. Outliers are ignored and multiple velocities in the polled neighborhood are segregated, thus avoiding the aliasing that results from traditional least-squares filter estimation methods.
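The voting idea can be sketched with a small accumulator over a discretized velocity grid. The grid range, distance tolerance, and synthetic constraints below are assumptions for illustration; this is the general spirit of Hough-style voting on constraint lines, not Nesi et al.'s Combinatorial Hough Transform.

```python
import numpy as np

def hough_flow(Ix, Iy, It, grid=np.linspace(-3, 3, 61), tol=0.05):
    """Vote for (u, v) cells lying near each pixel's constraint line."""
    U, V = np.meshgrid(grid, grid, indexing="ij")
    votes = np.zeros_like(U)
    for ix, iy, it in zip(Ix, Iy, It):
        norm = np.hypot(ix, iy)
        if norm == 0:
            continue
        # perpendicular distance from each cell to the line ix*u + iy*v + it = 0
        votes += np.abs(ix * U + iy * V + it) / norm < tol
    i, j = np.unravel_index(np.argmax(votes), votes.shape)
    return grid[i], grid[j]

# 40 constraints consistent with flow (1, -2), five of them corrupted
rng = np.random.default_rng(3)
Ix = rng.normal(size=40); Iy = rng.normal(size=40)
It = -(Ix * 1.0 + Iy * (-2.0))
It[:5] += rng.normal(scale=5.0, size=5)   # gross outliers
u, v = hough_flow(Ix, Iy, It)
print(round(u, 2), round(v, 2))           # peak lies at the true flow (1, -2)
```

Unlike a least-squares fit, the vote peak is unaffected by the corrupted constraints, which is the robustness argument made above.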
In [13], Heitz and Bouthemy also use a statistical model to provide better estimation of discontinuous flow fields. They suggest that using the intersection of solution sets provided by complementary constraints provides a more robust estimate. Data fusion between constraints is formulated using a Bayesian framework associated with Markov Random Fields.
A two-constraint system is provided in [13]. The first is the traditional image flow constraint (1). The second is Bouthemy's feature-based motion constraint [5]. It is based on spatio-temporal surface modeling and hypothesis-testing techniques and incorporates occlusion information into the flow estimate.
Heitz and Bouthemy's data fusion approach is interesting, as it is scalable to incorporate any number of constraints (correlation, similarity functions, etc.).
The final point of interest of this literature review considers how one measures the validity of the estimated flow field that results from the image formation process with respect to the actual 2-D velocity field that results from the projection of a 3-D motion field onto a perspective plane. The work of Verri and Poggio [27] should be mentioned. These authors contend that the flow constraint model provides a near-correct solution set for true 2-D flow along areas of high curvature in the intensity domain. This bridges the gap between the confidence measure of Lucas and Kanade [18] and the confidence of the estimate with respect to the actual 2-D field.
In conclusion, there has been much work done on the optical flow problem. Researchers have experimented with different representations of the image sequence, different regularization techniques, and different confidence measures to provide a large family of flow algorithms. Some algorithms provide near real-time frame rates with poorer accuracy, while others provide more accurate results at a higher computational cost. Some provide sparser flow fields that are more accurate, while others provide more estimates that are smoothed out. It is clear that the choice of flow algorithm when implementing a computer vision system is dependent on the application in question. It is also clear that with advances in computing power and parallel processing, the frame rates of these algorithms will continue to improve.
References
[1] Anandan, P., "A Computational Framework and an Algorithm for Measurement of Visual Motion", International Journal of Computer Vision, Vol. 2, pp. 283-310, 1989.

[2] Arbel, T., Ferrie, F.P. and Mitran, M., "Recognizing Objects From Curvilinear Motion", Submitted to the International Conference on Computer Vision and Pattern Recognition, 2000.

[3] Barron, J.L., Fleet, D.J. and Beauchemin, S.S., "Performance of Optical Flow Techniques", International Journal of Computer Vision, 12:1, pp. 43-77, 1994.

[4] Benoits, S.M. and Ferrie, F.P., "Monocular Optical Flow for Real-Time Vision Systems", Technical Report, Center for Intelligent Machines, McGill University, 1996.

[5] Bouthemy, P., "A Maximum-Likelihood Framework for Determining Moving Edges", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 5, pp. 499-511, May 1989.

[6] Burt, P.J., "Fast Filter Transforms for Image Processing", Computer Graphics and Image Processing, Vol. 16, pp. 20-51, 1981.

[7] Camus, T., "Real-Time Quantized Optical Flow", Proceedings of the IEEE Conference on Computer Architecture for Machine Perception, Como, Italy, pp. 126-131, 1995.

[8] De Micheli, E., Torre, V. and Uras, S., "The Accuracy of the Computation of Optical Flow and of the Recovery of Motion Parameters", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 5, May 1993.

[9] Fleet, D.J. and Jepson, A.D., "Computation of Component Image Velocity from Local Phase Information", International Journal of Computer Vision, 5:1, pp. 77-104, 1990.

[10] Fleet, D.J. and Langley, K., "Recursive Filters for Optical Flow", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 1, pp. 61-67, Jan. 1995.

[11] Ghosal, S. and Vanek, P., "A Fast Scalable Algorithm for Discontinuous Optical Flow Estimation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 2, pp. 181-194, Feb. 1996.

[12] Heeger, D.J., "Optical Flow using Spatiotemporal Filters", International Journal of Computer Vision, Vol. 1, pp. 279-302, 1988.

[13] Heitz, F. and Bouthemy, P., "Multimodal Estimation of Discontinuous Optical Flow Using Markov Random Fields", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 12, pp. 1217-1232, Dec. 1993.

[14] Horn, B.K.P., "Robot Vision", The MIT Press, Cambridge, Massachusetts, 1986.

[15] Horn, B.K.P. and Schunck, B.G., "Determining Optical Flow", Artificial Intelligence, Vol. 17, pp. 185-201, 1981.

[16] Liu, H., Hong, T.H., Herman, M., Camus, T. and Chellappa, R., "Accuracy vs Efficiency Trade-offs in Optical Flow Algorithms", Computer Vision and Image Understanding, Vol. 72, No. 3, pp. 271-286, 1998.

[17] Liu, H., Hong, T.H., Herman, M. and Chellappa, R., "A Generalized Motion Model for Estimating Optical Flow Using 3-D Hermite Polynomials", Proceedings of the IEEE International Conference on Pattern Recognition, Jerusalem, Israel, pp. 360-366, 1994.

[18] Lucas, B. and Kanade, T., "An Iterative Image Registration Technique with Applications in Stereo Vision", Proceedings of the DARPA Image Understanding Workshop, pp. 121-130, 1981.

[19] Nagel, H.H. and Enkelmann, W., "An Investigation of Smoothness Constraints for the Estimation of Displacement Vector Fields from Image Sequences", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 8, pp. 565-593, 1986.

[20] Negahdaripour, S., Yu, C.H. and Shokrollahi, A.H., "Recovering Shape and Motion From Undersea Images", IEEE Journal of Oceanic Engineering, Vol. 15, No. 3, pp. 189-198, July 1990.

[21] Nesi, P., Del Bimbo, A. and Ben-Tzvi, D., "A Robust Algorithm for Optical Flow Estimation", Computer Vision and Image Understanding, Vol. 62, No. 1, pp. 59-68, July 1995.

[22] Schunck, B.G., "The Image Flow Constraint Equation", Computer Vision, Graphics and Image Processing, Vol. 35, pp. 20-46, 1986.

[23] Simoncelli, E.P., "Distributed Representation and Analysis of Visual Motion", Ph.D. dissertation, Dept. of Electrical Engineering and Computer Science, MIT, 1993.

[24] Singh, A. and Allen, P., "Image-Flow Computation: An Estimation-Theoretic Framework and a Unified Perspective", Computer Vision, Graphics and Image Processing, Vol. 56, pp. 152-177, Sept. 1992.

[25] Spetsakis, M.E., "Optical Flow Estimation Using Discontinuity Conforming Filters", Computer Vision and Image Understanding, Vol. 68, No. 3, pp. 276-289, Dec. 1997.

[26] Weber, J. and Malik, J., "Rigid Body Segmentation and Shape Description from Dense Optical Flow Under Weak Perspective", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 2, pp. 139-143, Feb. 1997.

[27] Verri, A. and Poggio, T., "Motion Field and Optical Flow: Qualitative Properties", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 5, pp. 490-498, May 1989.