LITERATURE REVIEW on OPTICAL FLOW

Recovering 3-D structure using a set of 2-D images is an important problem in the field of computer vision. One method for collecting a set of images is the temporal accumulation of information through a monocular observer. The relationship between subsequent still images in a video stream provides a wealth of information, in the form of spatio-temporal change. The temporal integration of such velocity information in both 2-D and 3-D space is essential for solving shape-from-motion [26], time-of-collision [8], object tracking [20], object recognition [2], and figure-ground problems.


It is intuitively sound to suggest that changes in intensity on an image plane are somewhat coupled with the projection of the apparent motion of the 3-D space surrounding the plane. It is, however, incorrect to say that such projections are unique and complete. The loss of a dimension, quantization of intensity, discrete sampling of infinitesimal spatial data, and sensor noise make the problem of recovering 3-D motion from a 2-D intensity distribution ill-posed.


There are two levels of ill-posedness in recovering 3-D motion from a sequence of intensity distributions. The first level involves recovering the true 2-D velocity field from the subsequent intensity images formed on the retina of the 2-D imaging device. The second problem involves constructing a fully determined system of 2-D flow fields, such that the 3-D motion can be fully, uniquely, and stably recovered.


Under constraints of rigid motion and weak perspective the second problem becomes tractable [26]. This literature review will concern itself with the more difficult problem of recovering the 2-D flow from the intensity fields. This is commonly referred to in the literature as the optical-flow problem.


Barron et al. [3] define the optical flow problem as that of "computing the approximation to the 2-D motion field (a projection of the 3-D velocities of surface points onto the imaging surface) from spatio-temporal patterns".


Two comprehensive papers on the subject of optical flow performance exist. Barron et al. [3] have produced a paper that compares nine classic flow algorithms on the basis of accuracy and density. They provide a clear test set of image sequences that can be used for quantitative and qualitative comparison of the different algorithms. More recent work by Liu et al. [16] has improved on Barron et al.'s study by including efficiency in the evaluation of the algorithms. They provide a coordinate system that compares accuracy with efficiency. A curve is constructed in this coordinate system by varying the search areas of the different algorithms.


As stated above, recovering the 2-D velocity field from a sequence of intensity images is ill-posed [13]. The problem is ill-posed because local intensity alone fails to completely encode motion information. For example, in regions of constant intensity, motion cannot be detected and an infinite number of solutions exist. Even when the intensity is constant in a given direction, the solution is still only partially available. Under such conditions it is only possible to recover the component of the solution that is perpendicular to that direction. This is referred to as the aperture problem [21].


Another example of how intensity fails to describe motion is Horn's mirror ball problem [14]. Consider a sphere with no texture. As the sphere rotates about its center, no change in intensity is observed, yet the sphere does possess a motion field.


Because the problem of recovering the 2-D velocity field is ill-posed, additional information must be added to the problem statement to clearly define a closed set in which a single stable solution exists. This additional information takes the form of additional constraints imposed on the problem and regularization strategies.


In [14], Horn formalizes the image flow constraint, thus creating an incomplete correlation between the motion domain and the intensity domain. This constraint can be interpreted as the assumption that a point in the 3-D shape, when projected onto the 2-D image plane, maintains a constant intensity over time. Mathematically this is formulated as

    dI(x, y, t)/dt = 0                                                (1)

where I(x, y, t) is the intensity distribution of the image at a pixel at point (x, y) at a time t.
Simoncelli [23] presents this constraint from a Bayesian perspective by suggesting that the maximum a posteriori 2-D displacement v̂ of a point in 3-D space, when projected into an intensity plane, is obtained as

    v̂ = arg max_v p(v | I)                                           (2)



On top of constraints, regularization strategies are used to interpolate and propagate flow estimates from areas of greater certainty to those of lower certainty where the aperture problem prevails. Some algorithms simply perform post-filtering, while other algorithms provide a measure of confidence for each estimate, thus providing a criterion for better redistribution of the high-confidence velocity estimates to areas of lower confidence.


Thus an optical flow algorithm is specified by three elements [3]:

- the spatio-temporal operators that are applied to the image sequence to extract features and improve the signal-to-noise ratio,

- how velocity estimates are produced from a gradient search of the extracted feature space, and

- the form of regularization applied to the flow field, considering confidence measures if they exist.



Barron et al. [3] classify optical flow algorithms by their signal-extraction stage. This provides four groups: differential techniques, energy-based methods, phase-based techniques, and region-based matching.


Liu et al. [16] prefer to classify algorithms into two groups: those that perform a gradient search on extracted structure of the image sequence and those that do not. This effectively groups the differential techniques, energy-based methods, and phase-based methods into a single class of algorithms referred to as gradient methods, while leaving the region-matching techniques in a class of their own. This classification is not truly justified, as region-based matching also performs hill climbing, only on a much coarser level than filter-dependent algorithms. The coarser search provides quicker, more robust results. The cost of the improved efficiency is reduced accuracy.



Differential Techniques


Differential techniques are characterized by a gradient search performed on extracted first- and second-order spatial derivatives, and temporal derivatives. From the Taylor expansion of the flow constraint equation, the gradient constraint equation is obtained:

    I_x u + I_y v + I_t = 0                                           (3)


Horn and Schunck [15] combine a global smoothness term with the gradient constraint equation to obtain a functional for estimating optical flow. Their choice of smoothness term minimizes the absolute gradient of the velocity (u, v):

    ∫∫ (I_x u + I_y v + I_t)^2 + λ^2 (||∇u||^2 + ||∇v||^2) dx dy      (4)

This functional can be reduced to a pair of recursive equations that must be solved iteratively.
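The iterative scheme can be sketched in a few lines (a minimal NumPy illustration, not Horn and Schunck's exact discretization; the derivative estimates, the 4-neighbor averaging, and the parameter values are simplifying assumptions):

```python
import numpy as np

def horn_schunck(I1, I2, alpha=1.0, n_iter=100):
    """Minimal Horn-Schunck sketch: iterate the coupled update equations
    that arise from the smoothness-regularized gradient constraint."""
    I1, I2 = I1.astype(float), I2.astype(float)
    Ix = np.gradient(I1, axis=1)   # simple finite-difference derivatives
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)

    def local_mean(f):
        # 4-neighbor average used as the local mean of the flow field.
        fp = np.pad(f, 1, mode='edge')
        return 0.25 * (fp[:-2, 1:-1] + fp[2:, 1:-1] + fp[1:-1, :-2] + fp[1:-1, 2:])

    for _ in range(n_iter):
        u_bar, v_bar = local_mean(u), local_mean(v)
        # Gradient-constraint residual, shared by both update equations.
        r = (Ix * u_bar + Iy * v_bar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_bar - Ix * r
        v = v_bar - Iy * r
    return u, v
```

For a sinusoidal pattern translating by a fraction of a pixel per frame, the interior of the recovered field settles near the true displacement.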


Lucas and Kanade [18] also construct a flow estimation technique based on first-order derivatives of the image sequence. In contrast to Horn and Schunck's post-smoothing regularization, they choose to pre-smooth the data before using the gradient constraint equation. Mathematically, they minimize

    Σ_x W^2(x) [I_x(x) u + I_y(x) v + I_t(x)]^2                       (5)

where W(x) is a window that gives more influence to constraints near the center of the neighborhood.

This can be reduced to a closed-form solution for the flow estimates where:


    v = (A^T W^2 A)^{-1} A^T W^2 b                                    (6)

and

    A = [∇I(x_1), ..., ∇I(x_N)]^T
    W = diag(W(x_1), ..., W(x_N))                                     (7)
    b = -(I_t(x_1), ..., I_t(x_N))^T

One important advantage of this approach over Horn and Schunck [15] is the existence of a confidence measure. The smallest eigenvalue of

    A^T W^2 A = [ Σ W^2 I_x^2     Σ W^2 I_x I_y ]
                [ Σ W^2 I_x I_y   Σ W^2 I_y^2   ]                     (8)

provides a measure to distinguish estimates of normal velocity from full 2-D velocity.
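The closed-form solve and its eigenvalue confidence test translate almost directly into code (a sketch for a single window, assuming W is taken as the identity; the threshold tau is an illustrative choice):

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It, tau=0.01):
    """Solve v = (A^T A)^{-1} A^T b for one window (W taken as the identity)
    and report the smallest eigenvalue of A^T A as a confidence measure."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)  # N x 2 matrix of gradients
    b = -It.ravel()                                 # negated temporal derivatives
    ATA = A.T @ A                                   # the 2x2 matrix of (8)
    lam_min = np.linalg.eigvalsh(ATA)[0]
    if lam_min < tau:
        # Aperture problem: only the normal component is recoverable.
        return None, lam_min
    return np.linalg.solve(ATA, A.T @ b), lam_min
```

A window whose gradients all point one way returns None (aperture problem); a textured window yields the full 2-D velocity.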


Barron et al. report in their survey [3] that Lucas and Kanade's algorithm provides the second most accurate results. Liu et al. [16] evaluate Lucas and Kanade as providing the third best efficiency-accuracy curve. This has motivated much work on Lucas and Kanade's algorithm.


Fleet and Langley [10] attempt a more efficient implementation using IIR temporal pre-filtering and temporal recursive estimation for regularization. The temporal support was reduced to 3 frames and the computation time improved, while only slightly diminishing performance.


Accuracy issues are also tackled, especially in the domain of discontinuities. Discontinuities provide information about occlusion and shape. Thus researchers have attempted to reduce the effects of smoothing along steep intensity gradients. Nagel and Enkelmann [19] were the first to formulate an oriented smoothness constraint:

    (9)

where the weighting parameter is a fixed constant.


Ghosal and Vanek [11] look at weighted anisotropic filtering to reduce the loss of discontinuity information when regularizing. Using the eigenvalues of (8), when W is the identity matrix, they establish weights for imposing isotropic smoothness along the x and y directions.


In the same spirit, Spetsakis [25] uses an adaptive Gaussian filter approach to minimize the loss of occluding information. He applies this Gaussian to the normal equations of (8) and obtains

    (10)

A velocity estimate is obtained from the solution of the system when the residual constraint terms are zero.


The size of the Gaussian filter applied to a given pixel should be governed by the following rules: the Gaussian should be larger when

- the flow is smooth, and

- instabilities in the system of equations occur.


Spetsakis derives a measure of confidence called the incompatible measure. This value is determined from the residual values that result when the estimate is substituted back into (10) with the window taken as the identity operator. If the residual is high, the solution is considered robust. If the residual is low, then the Gaussian has not gathered enough information and its size must be increased.


More generally, differential methods can be seen as band-pass signal extraction. Therefore they only provide a local representation in frequency space and thus are restricted to performing well on an interval of velocities characterized by a pre-smoothing of the spatio-temporal signal before numerical differentiation. Flow information may not always be limited to a tight frequency band. Solutions to these problems are somewhat tackled by Spetsakis, and by Ghosal and Vanek, through adaptive filtering. Heeger [12], and Fleet and Jepson [9], however, suggest that providing information for multiple frequency bands is much more accurate. This approach is much more computationally expensive as well [16].


Energy-Based Methods


The advantage of energy-based methods is the hierarchical decomposition of the image sequence in the frequency domain. Energy-based techniques extract velocities by using families of band-pass filters that are velocity- and orientation-tuned. The Fourier transform of a translating 2-D pattern is

    Î(k, ω) = Î_0(k) δ(ω + v·k)                                       (11)


where Î_0(k) is the Fourier transform of the pattern I_0(x), δ is a Dirac delta function, ω denotes temporal frequency, and k denotes spatial frequency. This effectively implies that all power associated with the translation will be mapped to a plane that traverses the origin in frequency space.
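This plane property is easy to verify numerically (a small NumPy check under the assumption of a 1-D spatial pattern, for which the plane reduces to the line ω + v·k = 0):

```python
import numpy as np

# A band-limited 1-D pattern translating at v pixels/frame over T frames.
v, N, T = 2.0, 64, 64
x = np.arange(N)
rng = np.random.default_rng(1)
coeffs = rng.normal(size=5) + 1j * rng.normal(size=5)

def pattern(pos):
    # Sum of the first few spatial harmonics, evaluated at shifted positions.
    return sum(np.real(c * np.exp(2j * np.pi * (k + 1) * pos / N))
               for k, c in enumerate(coeffs))

I = np.stack([pattern(x - v * t) for t in range(T)])  # (T, N): time x space

F = np.fft.fft2(I)
wt = np.fft.fftfreq(T)[:, None]   # temporal frequencies (cycles/frame)
kx = np.fft.fftfreq(N)[None, :]   # spatial frequencies (cycles/pixel)
power = np.abs(F) ** 2
mask = power > 1e-6 * power.max()
# Every bin carrying significant power satisfies wt + v*kx = 0:
on_plane = np.allclose((wt + v * kx)[mask], 0, atol=1e-9)
```

All of the significant spectral power does land on that line, which is exactly the constraint that velocity-tuned filters exploit.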



As such, Heeger [12] uses a family of twelve Gabor filters of different spatial resolutions to extract velocity information from the image sequences. By using Gabor filters [9], which provide simultaneous spatio-temporal and frequential localization, a clean band-pass representation is obtained. A least-squares fit is then applied to the resulting distribution in frequency space.



Phase-Based Techniques


The most accurate optical flow estimations are produced using Fleet and Jepson's [9] phase-based approach [3,16]. Phase-based methods also use a family of velocity-tuned filters to extract a local-frequency representation of the image sequence. Flow estimates are provided by gradient search in the phase space of the extracted signatures.


Motivation for this approach is based on the argument that the evolution of phase contours provides a good approximation to the projected motion field. The phase output of band-pass filters is generally more stable than the amplitude when small translations in the scene are sought. As optical flow is a localized measurement, it is often characterized by small displacements. Thus deriving the velocity from phase as opposed to magnitude is advantageous.



Region Matching


Region matching is particular in that it forms the filters for feature extraction from the previous image in the sequence. Tiles from the previous image are correlated with the next image using some distance measure. The best match provides the most likely displacement. This is equivalent to searching a spatially shifted and temporally differentiated space.


This approach provides more robustness with respect to numerical differentiation and is generally quicker, since it constructs a highly quantized gradient distribution. As mentioned earlier, this distribution is so coarse that Liu et al. [16] classify it as a non-gradient algorithm.


The distance measure used by more classical algorithms, such as Anandan's [1] and Singh and Allen's [24], is referred to as the sum-of-squared differences (SSD). It is formulated as

    SSD(x, d) = Σ_{j=-n}^{n} Σ_{i=-n}^{n} W(i, j) [I_1(x + (i, j)) - I_2(x + d + (i, j))]^2     (12)

where W is a 2-D window function and d denotes the suggested displacement vector.
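The core matching step can be sketched as follows (a minimal NumPy version with a uniform window and exhaustive integer search; the tile size and search radius are illustrative choices):

```python
import numpy as np

def ssd_match(I1, I2, x, y, n=5, search=4):
    """Find the integer displacement (dx, dy) of the (2n+1)x(2n+1) tile of I1
    centered at (y, x) that minimizes the SSD against tiles of I2."""
    tile = I1[y - n:y + n + 1, x - n:x + n + 1]
    best, best_d = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = I2[y + dy - n:y + dy + n + 1, x + dx - n:x + dx + n + 1]
            ssd = np.sum((tile - cand) ** 2)
            if ssd < best:
                best, best_d = ssd, (dx, dy)
    return best_d
```

Shifting a random image by (3, 1) and matching a central tile recovers exactly that displacement; the quadratic cost of the double loop is the inefficiency the multi-scale and temporal-search variants below try to avoid.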


Anandan constructs a multi-scale method based on the Burt Laplacian pyramid [6]. A coarse-to-fine strategy is adopted such that larger displacements are first determined from less resolved versions of the images and then improved with more accurate, higher resolution versions of the image. This strategy is well suited for large-scale displacements but is less successful for sub-pixel velocities.


Confidence measures, which are based on the principal curvatures of the SSD surface, are used to steer the smoothing process. The smoothness constraint is based on the principal axes of the SSD surface, the estimated displacement, and the sought best-fit velocity estimate. Anandan also includes Horn and Schunck's [15] formulation of the smoothness constraint. Mathematically,

    (13)


Singh and Allen [24] provide another approach using the SSD. They use a three-frame approach to the region matching method to average out temporal error in the SSD. For a frame 0, they form an SSD distribution with respect to frame -1 and frame +1 as such:

    SSD(x, d) = SSD_{0,+1}(x, d) + SSD_{0,-1}(x, -d)                  (14)


From this combined distribution, Singh and Allen build a probability distribution:

    R(d) = k e^{-SSD(x, d)}                                           (15)

where k is a normalization constant. The sub-pixel flow estimates are then obtained by considering the mean of the distribution with respect to the displacements d:

    v̂ = Σ_d R(d) d                                                   (16)
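Converting an SSD surface into a sub-pixel estimate in this way can be sketched as follows (a NumPy illustration of the exponential weighting and mean; the exponent scale c and the grid size are illustrative assumptions):

```python
import numpy as np

def subpixel_from_ssd(ssd, c=1.0):
    """Turn an SSD surface over integer displacements into a sub-pixel
    flow estimate: weight each displacement by exp(-c*SSD), normalize,
    and take the mean displacement under the resulting distribution."""
    n = ssd.shape[0] // 2
    dy, dx = np.mgrid[-n:n + 1, -n:n + 1]
    R = np.exp(-c * (ssd - ssd.min()))  # shifted for numerical stability
    R /= R.sum()                        # normalize to a probability distribution
    return (R * dx).sum(), (R * dy).sum()
```

An SSD surface whose minimum sits between grid points yields a fractional estimate near that true minimum, which is exactly what the integer-displacement best match cannot provide.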


Singh and Allen [24] employ a Laplacian pyramid strategy similar to that of Anandan [1]. This provides a more symmetric distribution about displacement estimates in the SSD. A covariance matrix is then constructed from these estimates as such:

    S_c = Σ_d R(d) (d - v̂)(d - v̂)^T                                 (17)


Singh suggests that the eigenvalues of the inverse of this covariance matrix provide a measure of confidence for the estimate.


For a given flow field, the least-squares estimate in a neighborhood about a point can be obtained from

    (18)


A covariance matrix can then be generated in the same manner as (17) from (18). Flow regularization is then obtained by minimizing the sum of the Mahalanobis distances between the estimated flow field and the two distributions:

    (19)

The eigenvalues of the covariance matrix serve as confidence measures for the regularization process.


Benoits and Ferrie [4] build a more robust and simplified region matching metric that is similar to the SSD. They denote a tile of the pixel pattern in frame 1 at a given position, and a corresponding pattern in frame 2 at a displaced position. The match distance between the two tiles is provided by a combination of the absolute difference and sum in intensities for the tiles:

    (20)


Thresholding is applied to the difference and sum distributions. When the sum of pixel intensities is small, the data is considered unusable. When the difference between successive inputs is small, the intensities should be considered the same. From the total difference-sum ratio distribution, an average pixel-matching error is summarized.


Linear flow consistency is implemented using adaptive diffusion. Similarity measures are constructed for neighborhoods of flow based on magnitude and direction. These are averaged to form a single similarity metric.


An interesting region matching flow algorithm is Camus' quantized flow [7]. Camus constructs a real-time flow algorithm on the idea that performing a search over time instead of over space is linear in nature rather than quadratic. A quantized sub-pixel displacement field results. Liu et al. [16] report that Camus' algorithm provides one of the two best accuracy-efficiency ratio curves.


The SSD search space constructed in Camus' algorithm is limited spatially to small areas yet extends itself in time. This is denoted as a temporal search S frames deep, and a spatial search over a (2n+1)x(2n+1) pixel area. The success of this algorithm is based on the idea that support for a faster frame rate reduces the necessary area of spatial search by providing a better sampling rate. Another efficient element of this algorithm is its suitability for integer arithmetic. However, it only provides a quantized flow field containing a limited number of different possible velocities.


Other Algorithms


The only other algorithm that obtains efficiency-accuracy results comparable to Camus' approach is that of Liu et al. [17]. Using a general steady 3-D motion model in combination with 3-D Hermite polynomial differentiation filters, an efficient and accurate algorithm is constructed [16]. Their approach is similar to that of Heeger [12] and Fleet and Jepson [9], as it requires generating a family of spatio-temporal filters. The filters are thus tuned to reflect the 3-D motion model when projected onto a 2-D perspective model.


Hermite polynomial filters offer several advantages. Orthogonal and Gaussian properties ensure stability. They are extensible to higher order derivatives. Finally, they reflect numerous physiological models that support receptive fields being modeled by Gaussian derivatives of various widths.


Other original approaches to flow estimation include Nesi et al.'s [21] work. They obtain better discontinuous flow estimates using clustering techniques. They use the Combinatorial Hough Transform to propose a multipoint solution with maximum-likelihood estimation. They argue that clustering techniques provide a better approach to solving the flow-blurring problem than more traditional techniques that use least-squares estimation.


Using the flow constraint equation, Nesi et al. build a line parameterization of the linear system provided by a neighborhood of pixels:

    (21)

where the line parameters are determined by the spatial and temporal derivatives at each pixel.


Votes are accumulated by counting the number of lines that intersect in any given neighborhood of the parameterized space. This effectively provides a discriminating function for possible solutions. Outliers are ignored and multiple velocities in the polled neighborhood are segregated, thus avoiding the aliasing that results from traditional least-squares filter estimation methods.
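The voting scheme can be illustrated with a simple accumulator (a sketch, not Nesi et al.'s Combinatorial Hough Transform: here each pixel's gradient constraint I_x u + I_y v + I_t = 0 is rasterized as a line in a quantized (u, v) space, and the most-voted cell is taken as the dominant velocity):

```python
import numpy as np

def hough_flow(Ix, Iy, It, u_range=(-3, 3), bins=61):
    """Accumulate votes in a quantized (u, v) space: each pixel's constraint
    line Ix*u + Iy*v + It = 0 votes for the cells it passes through."""
    us = np.linspace(u_range[0], u_range[1], bins)
    step = us[1] - us[0]
    acc = np.zeros((bins, bins), dtype=int)   # acc[v_index, u_index]
    for ix, iy, it in zip(Ix.ravel(), Iy.ravel(), It.ravel()):
        if abs(iy) < 1e-9:
            continue  # near-vertical line in (u, v); skipped for simplicity
        v_line = -(ix * us + it) / iy         # v as a function of u on this line
        iv = np.round((v_line - u_range[0]) / step).astype(int)
        ok = (iv >= 0) & (iv < bins)
        acc[iv[ok], np.arange(bins)[ok]] += 1
    iv, iu = np.unravel_index(np.argmax(acc), acc.shape)
    return us[iu], us[iv]
```

Because every constraint line from a single translating patch passes through the true (u, v), the most-voted cell recovers that velocity even though no least-squares averaging is performed; an outlier line merely scatters its votes elsewhere.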


In [13], Heitz and Bouthemy also use a statistical model to provide better estimation of discontinuous flow fields. They suggest that using the intersection of solution sets provided by complementary constraints provides a more robust estimate. Data fusion between constraints is formulated using a Bayesian framework associated with Markov Random Fields.


A two-constraint system is provided in [13]. The first is the traditional image flow constraint (1). The second is Bouthemy's feature-based motion constraint [5]. It is based on spatio-temporal surface modeling and hypothesis testing techniques and incorporates occlusion information into the flow estimate.


Heitz and Bouthemy’s data fusion approach is interesting as
it is scalable to incorporate any
number of constraints (correlation, similarity functions, etc.).


The final point of interest of this literature review considers how one measures the validity of the estimated flow field that results from the image formation process with respect to the actual 2-D velocity field that results from the projection of a 3-D motion field onto a perspective plane. The work of Verri and Poggio [27] should be mentioned. These authors contend that the flow constraint model provides a near-correct solution set for the true 2-D flow along areas of high curvature in the intensity domain. This indeed bridges the gap between the confidence measure of Lucas and Kanade [18] and the confidence of the estimate with respect to the actual 2-D field.


In conclusion, there has been much work done on the optical flow problem. Researchers have experimented with different representations of the image sequence, different regularization techniques, and different confidence measures to provide a large family of flow algorithms. Some algorithms provide near real-time frame rates with poorer accuracy, while others provide more accurate results at a higher computational cost. Some provide sparser flow fields that are more accurate, while others provide more estimates that are smoothed out. It is clear that the choice of flow algorithm when implementing a computer vision system is dependent on the application in question. It is also clear that with advances in computing power and parallel processing, the frame rates of these algorithms will continue to improve.


References


[1] Anandan, P., "A Computational Framework and an Algorithm for Measurement of Visual Motion", International Journal of Computer Vision, Vol. 2, pp. 283-310, 1989.

[2] Arbel, T., Ferrie, F.P. and Mitran, M., "Recognizing Objects From Curvilinear Motion", Submitted to the International Conference on Computer Vision and Pattern Recognition, 2000.

[3] Barron, J.L., Fleet, D.J. and Beauchemin, S.S., "Performance of Optical Flow Techniques", International Journal of Computer Vision, 12:1, pp. 43-77, 1994.

[4] Benoits, S.M. and Ferrie, F.P., "Monocular Optical Flow for Real-Time Vision Systems", Technical Report, Center for Intelligent Machines, McGill University, 1996.

[5] Bouthemy, P., "A Maximum-Likelihood Framework for Determining Moving Edges", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 5, pp. 499-511, May 1989.

[6] Burt, P.J., "Fast Filter Transforms for Image Processing", Computer Graphics and Image Processing, Vol. 16, pp. 20-51, 1981.

[7] Camus, T., "Real-Time Quantized Optical Flow", Proceedings of the IEEE Conference on Computer Architecture for Machine Perception, Como, Italy, pp. 126-131, 1995.

[8] De Micheli, E., Torre, V. and Uras, S., "The Accuracy of the Computation of Optical Flow and of the Recovery of Motion Parameters", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 5, May 1993.

[9] Fleet, D.J. and Jepson, A.D., "Computation of Component Image Velocity from Local Phase Information", International Journal of Computer Vision, 5:1, pp. 77-104, 1990.

[10] Fleet, D.J. and Langley, K., "Recursive Filters for Optical Flow", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 1, pp. 61-67, Jan. 1995.

[11] Ghosal, S. and Vanek, P., "A Fast Scalable Algorithm for Discontinuous Optical Flow Estimation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 2, pp. 181-194, Feb. 1996.

[12] Heeger, D.J., "Optical Flow using Spatiotemporal Filters", International Journal of Computer Vision, Vol. 1, pp. 279-302, 1988.


[13] Heitz, F. and Bouthemy, P., "Multimodal Estimation of Discontinuous Optical Flow Using Markov Random Fields", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 12, pp. 1217-1232, Dec. 1993.

[14] Horn, B.K.P., "Robot Vision", The MIT Press, Cambridge, Massachusetts, 1986.

[15] Horn, B.K.P. and Schunck, B.G., "Determining Optical Flow", Artificial Intelligence, Vol. 17, pp. 185-201, 1981.

[16] Liu, H., Hong, T.H., Herman, M., Camus, T. and Chellappa, R., "Accuracy vs Efficiency Trade-offs in Optical Flow Algorithms", Computer Vision and Image Understanding, Vol. 72, No. 3, pp. 271-286, 1998.

[17] Liu, H., Hong, T.H., Herman, M. and Chellappa, R., "A Generalized Motion Model for Estimating Optical Flow Using 3-D Hermite Polynomials", Proceedings of the IEEE International Conference on Pattern Recognition, Jerusalem, Israel, pp. 360-366, 1994.

[18] Lucas, B. and Kanade, T., "An Iterative Image Registration Technique with Applications in Stereo Vision", Proceedings of the DARPA Image Understanding Workshop, pp. 121-130, 1981.

[19] Nagel, H.H. and Enkelmann, W., "An Investigation of Smoothness Constraints for the Estimation of Displacement Vector Fields from Image Sequences", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 8, pp. 565-593, 1986.

[20] Negahdaripour, S., Yu, C.H. and Shokrollahi, A.H., "Recovering Shape and Motion From Undersea Images", IEEE Journal of Oceanic Engineering, Vol. 15, No. 3, pp. 189-198, July 1990.

[21] Nesi, P., Del Bimbo, A. and Ben-Tzvi, D., "A Robust Algorithm for Optical Flow Estimation", Computer Vision and Image Understanding, Vol. 62, No. 1, pp. 59-68, July 1995.

[22] Schunck, B.G., "The Image Flow Constraint Equation", Computer Vision, Graphics and Image Processing, Vol. 35, pp. 20-46, 1986.

[23] Simoncelli, E.P., "Distributed Representation and Analysis of Visual Motion", Ph.D. dissertation, Dept. of Electrical Engineering and Computer Science, MIT, 1993.

[24] Singh, A. and Allen, P., "Image-Flow Computation: An Estimation-Theoretic Framework and a Unified Perspective", Computer Vision, Graphics and Image Processing, Vol. 56, pp. 152-177, Sept. 1992.

[25] Spetsakis, M.E., "Optical Flow Estimation Using Discontinuity Conforming Filters", Computer Vision and Image Understanding, Vol. 68, No. 3, pp. 276-289, Dec. 1997.

[26] Weber, J. and Malik, J., "Rigid Body Segmentation and Shape Description from Dense Optical Flow Under Weak Perspective", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 2, pp. 139-143, Feb. 1997.

[27] Verri, A. and Poggio, T., "Motion Field and Optical Flow: Qualitative Properties", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 5, pp. 490-498, May 1989.