Machine Vision-Assisted In-Situ
Ichthyoplankton Imaging System
Computer Vision Technology for High-Throughput Specimen
Recognition in the New In-Situ Ichthyoplankton Imaging System
By Dr. Gavriil Tsechpenakis
Lead Scientist
Data Visualization Group
University of Miami
Center for Computational Science
Coral Gables, Florida
Cedric Guigand
Senior Research Associate
Dr. Robert K. Cowen
Marine Biology and Fisheries
Rosenstiel School of Marine and
Atmospheric Science
University of Miami
Miami, Florida
Current technologies available for the study of plankton remain limited in comparison to the spatial-temporal resolution and data acquisition rate available for physical oceanographic measurements. Specifically, plankton measurements are made primarily by use of net collections, as opposed to the high-speed digital output possible for physical sampling. Though net technology has become quite sophisticated (e.g., the multiple opening/closing net and environmental sensing system), enabling vertical resolution coupled with detailed physical data, net samples still require manual processing, which is a time-consuming and costly effort. Furthermore, nets integrate organisms over the sampling distance and depth, significantly reducing sample resolution.
To address this problem, a high-resolution towed plankton imaging system, the In-Situ Ichthyoplankton Imaging System (ISIIS), was built, capable of imaging water volumes sufficient to accurately quantify even rare plankton (e.g., larval fish) in situ.1 This imaging system produces very high-resolution imagery at very high data rates, necessitating automated image analysis. Since the goal is the identification and quantification of a large number of specimens, whose shapes can be relatively similar to each other, an automated system for detection and recognition of specimens of interest is being developed using computer vision and machine learning tools.

[Table: True Positive % and False Positive % for copepod, fish larvae, chaetognath, larvacean and Trichodesmium. Only the values 84, 88 and 93 in the true-positive row are legible in this copy; the remaining entries are not recoverable.]
The initial version of the system is tested in a case study of five classes of specimens, namely copepods, larvaceans, chaetognaths, fish larvae and Trichodesmium.

The existing approaches to similar problems either assume that the specimens or regions of interest (ROIs) are already segmented (this is done manually) and thereby focus on the recognition methodology, or they focus on the visual features to be used for recognition, assuming that the data is free of noise (intensity ambiguities).2,3 Obviously, the first approach requires the extremely tedious work of manually segmenting large amounts of data, which produces small instances from extracted images, each containing single, relatively easy-to-recognize specimens. The second approach also assumes the existence of a single extracted specimen in the examined image, where the specimen outline or region features can be clearly distinguished visually and computationally. This article describes an approach that automatically extracts and classifies specimens from the digital images.
Detection of Candidate Specimens
Let $\Omega$ be the image domain, $B \subset \Omega$ be the background region and $\Omega \setminus B$ be the set of ROIs. To model the background gray-scale intensity distribution, the probability density function, $p_B$, of the intensity values is estimated from sample (training) background regions.4 Since a single parametric distribution (e.g., Gaussian) cannot sufficiently model the background, a mixture of $n$ Gaussian distributions is used, estimated by the expectation maximization (EM) algorithm.

It was experimentally shown that three to four Gaussians are sufficient to model such nonparametric distributions, while the computational cost of EM remains low. Then, the probability of an image pixel, $x_i \in \Omega$, with gray-scale intensity value, $I(x_i)$, being consistent with (i.e., belonging to) the background is given by Equation 1 (shown on page 18), where $dI$ determines a small gray-scale interval. The probabilities of the image pixels belonging to the ROIs are then calculated as $P(x_i \mid \Omega \setminus B) = 1 - P(x_i \mid B)$.
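The background model described above can be illustrated with a minimal sketch. It fits a one-dimensional Gaussian mixture to background intensities with plain EM and then evaluates the background probability of a pixel. Because Equation 1 is not reproduced in this copy, the sketch assumes it integrates the mixture density over the interval $[I(x_i) - dI,\ I(x_i) + dI]$; that form, the quantile-based initialization and all function names (`fit_gmm_1d`, `background_probability`, `roi_probability`) are assumptions for illustration, not the authors' implementation.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def normal_cdf(x, mean, var):
    """Cumulative distribution of a 1-D Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def fit_gmm_1d(samples, n_components=3, n_iter=50, var_floor=1e-3):
    """Fit a 1-D Gaussian mixture to intensity samples with plain EM."""
    n = len(samples)
    ordered = sorted(samples)
    # Assumed initialization: spread the means over the sample quantiles.
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    mean_all = sum(samples) / n
    var_all = max(sum((s - mean_all) ** 2 for s in samples) / n, var_floor)
    variances = [var_all] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for s in samples:
            dens = [w * normal_pdf(s, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(n_components):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * s for r, s in zip(resp, samples)) / nk
            spread = sum(r[k] * (s - means[k]) ** 2 for r, s in zip(resp, samples)) / nk
            variances[k] = max(spread, var_floor)
    return weights, means, variances


def background_probability(intensity, model, d_i=1.0):
    """Assumed form of Equation 1: mixture mass in [I - dI, I + dI]."""
    weights, means, variances = model
    return sum(
        w * (normal_cdf(intensity + d_i, m, v) - normal_cdf(intensity - d_i, m, v))
        for w, m, v in zip(weights, means, variances)
    )


def roi_probability(intensity, model, d_i=1.0):
    """P(x | ROI) = 1 - P(x | B), as stated in the text."""
    return 1.0 - background_probability(intensity, model, d_i)
```

Applying `roi_probability` to every pixel yields the probability map that the following step smooths. The interval half-width `d_i` is a free parameter in this sketch.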
To eliminate the noisy effects in this probability map (i.e., those caused by intensity inhomogeneities, shadowing effects and speckle noise in the input images) and apart from the size constraint imposed, the probability map is smoothed using a discriminative conditional random field (CRF), taking into account the data-driven interactions between spatially neighboring pixels on the image plane.4,5

Let $L = \{l_i\}$ be the labels associated with the image pixels. A label can have two values; in other words, a pixel belongs to a ROI (specimen) or the background. The discriminative CRF can directly estimate the labels' distribution for the entire image, $I$, using the formula shown in Equation 2, where $Z$ is a normalization constant, $\|\Omega\|$ is the size of the image domain and $N_i$ indicates the spatial neighborhood of the $i$-th pixel.

$$p(L \mid I) = \frac{1}{Z} \exp\left\{ \sum_{i=1}^{\|\Omega\|} \left[ \phi\big(l_i, I(x_i)\big) + \sum_{j \in N_i} \psi\big(l_i, l_j, I(x_i), I(x_j)\big) \right] \right\} \qquad (2)$$

In this definition, $\phi$ is called the association potential, since it associates the label of a pixel with its intensity (mapping). Also, $\psi$ is called the interaction potential, since it determines the interaction between the neighboring pixels in terms of both their intensity values and their labels. For the estimation of these two potentials, the discriminative classification and label smoothing formulations are adopted, and the inference problem (estimation of the model's parameters) is solved locally using the highest confidence first algorithm.4 Note that common random fields can be expressed in terms of an association and interaction potential, with the difference being that only the interactions between labels are taken into account in the interaction potential. That is, the Markovian (neighboring) property between image pixels is used only at the labels level (intuitively, "neighboring pixels must have similar labels"), without taking into account the underlying features (intensities). This causes the known label bias problem.

Although the distribution $p(L \mid I)$ of Equation 2 provides a fast global optimization (segmentation) for the entire input image, and it is the best possible solution one can obtain given the nature of the data, in practice the following are observed: Off-focus effects and intensity inhomogeneities cause loss of informative pixels/regions after thresholding the probability field; several regions from the background are extracted as ROIs, and using only intensity information, such false positives always appear in the segmentation process but at this stage are not eliminated; and all detected regions are retained as candidate specimens, and the focus is on a robust recognition approach that can eliminate the falsely detected (background) regions.

(Above) Detection of "candidate" specimens: blue boxes show the ground truth, while the red boxes indicate the detection results. The false positives are eliminated during recognition.
(Right) The CoCRF initially uses a limited-length library. During recognition, the confidently labeled data and a pool of unlabeled data are used to update the library.5
[Figure labels: centers of the detected candidate ROIs; collaborative information with a KNN graph (K=2): neighboring ROIs may portray instances of the same ...]
DECEMBER 2008 / st 15

Feature Extraction
The detection scheme produces anywhere from a few to many small regions to be recognized from the input images, using the bounding boxes of the estimated ROIs. Apart from the intensity distributions, efficient, robust and compact shape and appearance features are used to best describe these small regions that may or may not contain desired specimens, without the need of explicit segmentation (boundary estimation). Besides, the automated estimation of precise boundaries for the desired specimen is not a realistic scenario due to the high level of ambiguities described above.
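How bounding boxes of candidate ROIs might be obtained from the ROI probability map can be sketched as follows. The sketch thresholds the map, groups the remaining pixels into 4-connected regions, applies a minimum-size constraint and returns one bounding box per region. The threshold value, the connectivity, the minimum size and the function name `candidate_boxes` are illustrative assumptions; the article does not state the values used in ISIIS processing.

```python
from collections import deque


def candidate_boxes(roi_prob, threshold=0.5, min_pixels=4):
    """Return bounding boxes (row_min, col_min, row_max, col_max) of
    4-connected regions whose ROI probability is at least `threshold`.
    Regions smaller than `min_pixels` are dropped (size constraint)."""
    rows, cols = len(roi_prob), len(roi_prob[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or roi_prob[r][c] < threshold:
                continue
            # Breadth-first flood fill of one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1, size = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                size += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and roi_prob[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                boxes.append((r0, c0, r1, c1))
    return boxes
```

Every box returned here is only a candidate: as the text notes, background regions that pass this stage are left for the recognition step to reject.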
SIFT Descriptors. In order to describe the shape and texture of the extracted objects of interest and to use this description for recognition, the scale invariant feature transform (SIFT) is used. SIFT descriptors are commonly used today in computer vision applications, mainly for image/shape retrieval and registration.4 In contrast to most of the existing approaches for local feature generation, SIFT transforms an image into a collection of local feature vectors, each one of which is invariant to image translation, scaling and rotation, and partially invariant to illumination changes and affine or 3D projection. Therefore, SIFT provides robustness to small changes in the object's appearance in the examined images, which is a key factor for the recognition performance.
Shape Histograms. The shape histogram is a popular shape feature, invariant to on-plane transformations (translation, rotation and uniform scaling). It is used for the description of binary images (e.g., white background and a black ROI). Thresholding the probability field inside each extracted region provides binary images that can be transformed into histograms. In contrast to the SIFT features, this descriptor is not capable of uniquely characterizing single specimens in different off-plane orientations. Therefore, the shape histogram is used mainly to assist the SIFT-based recognition in instances of missing information, as described below.
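One common way to build such a histogram from a binary region is sketched below. The article does not say which shape-histogram variant is used, so the sketch assumes a radial (shell) model: foreground pixels are binned by their distance from the region centroid, normalized by the largest distance. Measuring from the centroid gives translation invariance, using only distances gives rotation invariance, and the normalization gives approximate invariance to uniform scaling on a pixel grid. The function name `shell_histogram` and the bin count are hypothetical.

```python
import math


def shell_histogram(binary, n_bins=8):
    """Radial shape histogram of the foreground (truthy) pixels of a
    binary image given as a list of rows. Returns bin frequencies."""
    points = [(r, c) for r, row in enumerate(binary)
              for c, value in enumerate(row) if value]
    if not points:
        return [0.0] * n_bins
    # Centroid of the foreground pixels.
    cy = sum(p[0] for p in points) / len(points)
    cx = sum(p[1] for p in points) / len(points)
    dists = [math.hypot(r - cy, c - cx) for r, c in points]
    max_d = max(dists)
    hist = [0] * n_bins
    for d in dists:
        if max_d > 0:
            # Normalized distance in [0, 1] mapped to a shell index.
            k = min(int(n_bins * d / max_d), n_bins - 1)
        else:
            k = 0
        hist[k] += 1
    return [h / len(points) for h in hist]
```

Two such histograms can then be compared with any histogram distance; the choice of distance is left open here.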
Multiple Specimen Recognition
To achieve robust recognition, two major issues need to be tackled. First, the recognition module must reject regions that are detected during segmentation but do not correspond to specimens of interest. Second, some specimens of interest are partially off-focus, and thus, only some of their parts can be detected. Therefore, the recognition approach must handle instances of missing region information and also must be able to discriminate between different specimens existing in the same segmented ROI.

If a simple classification approach is followed, such as a support vector machine, then an important constraint, namely the neighboring property in the spatial domain, is ignored. Neighboring regions are very likely to belong to the same class (specimen or background) or even be parts of the same specimen that was over-segmented (usually due to off-focus effects).2 On the other hand, if the contextual information of the image is exploited, several parts of the background that were falsely segmented as ROIs can be immediately excluded, and different ROIs can be merged if they portray a single specimen; this is done using the novel method called collaborative conditional random field (CoCRF).5 This framework is the most promising solution so far for recognizing specimens without manually cropping the ROIs; moreover, the CoCRF is a generic classification scheme that can be used in general recognition and image partitioning problems.
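The figure labels for the CoCRF indicate that the neighborhood structure among candidate ROIs is determined with a KNN graph (K=2). A minimal sketch of that graph construction is given below, assuming Euclidean distance between the centers of the ROI bounding boxes. It covers only the neighborhood structure, not the CoCRF inference itself, and the function names `box_center` and `knn_graph` are hypothetical.

```python
import math


def box_center(box):
    """Center (row, col) of a box given as (row_min, col_min, row_max, col_max)."""
    r0, c0, r1, c1 = box
    return ((r0 + r1) / 2.0, (c0 + c1) / 2.0)


def knn_graph(centers, k=2):
    """Map each ROI index to the indices of its k nearest other ROIs,
    ordered from nearest to farthest (Euclidean distance assumed)."""
    graph = {}
    for i, (yi, xi) in enumerate(centers):
        others = sorted(
            (math.hypot(yi - yj, xi - xj), j)
            for j, (yj, xj) in enumerate(centers)
            if j != i
        )
        graph[i] = [j for _, j in others[:k]]
    return graph
```

For example, `knn_graph([box_center(b) for b in boxes])` gives, for each candidate ROI, the two neighbors whose labels would be considered jointly with its own.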
Integration of Active Learning. CoCRF, as a supervised discriminative learning model, requires specimen samples that are manually cropped and labeled for training purposes. A major problem, especially in discriminative models, is how to avoid the extremely tedious work of collecting large amounts of such training samples, since the recognition accuracy depends on the estimated distributions during training. To tackle this problem, the idea of active learning is integrated with the CoCRF.
Graphical representation of the CoCRF idea for simultaneous multiple-specimen recognition to handle instances of over-segmentation and missing regions.
[Figure labels: content of the extracted bounding boxes (local representation); hidden layer: class labels to be assigned to each ROI; detected candidate organisms of interest; local structure (neighborhood) determined with a KNN graph (K=2).]

A limited number of samples for each specimen is used (i.e., five to 10 manually cropped and labeled regions from ISIIS images), and the CoCRF is initially trained based only on this information. Along with these samples, a pool of unlabeled samples cropped from ISIIS images is used, which will enrich the training set. For the input image, both the detected ROIs and the unlabeled samples in the pool are recognized. The system automatically detects informative unlabeled samples from the pool (i.e., samples that can improve the classification confidence) and either asks the user to give labels for these samples manually or applies an online learning procedure (local classification) to improve the decision boundary for each specimen class.5 In this setup, the first option is used. The system selects up to 10 of the most informative samples for manual labeling. These samples are compared with the initial training set, and a new set of training samples is created; this new set may or may not include the manually selected samples, and also some of the samples initially found in the training set can be rejected as redundant. With a cross-validation procedure, this new training set is evaluated on how it improves the recognition, and the CoCRF parameters are updated. In practice, this active learning procedure is not run in every recognition task; this scheme is to update the classifiers and, therefore, update the ...
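The sample-selection step can be illustrated with a short sketch. The article describes informative samples only as those that can improve the classification confidence. The sketch assumes a common stand-in, margin-based uncertainty sampling, in which a pool sample is informative when the gap between its two highest class probabilities is small. The criterion, the margin threshold and the name `select_informative` are assumptions, not the published CoCRF procedure.

```python
def select_informative(pool_scores, max_queries=10, margin_threshold=0.2):
    """Pick up to `max_queries` pool samples for manual labeling.

    pool_scores maps a sample id to its list of class probabilities.
    A sample qualifies when the margin between its two most probable
    classes is below `margin_threshold` (assumed criterion)."""
    candidates = []
    for sample_id, probs in pool_scores.items():
        top = sorted(probs, reverse=True)
        margin = top[0] - top[1] if len(top) > 1 else top[0]
        if margin < margin_threshold:
            candidates.append((margin, sample_id))
    # Most ambiguous (smallest margin) first; ties keep insertion order.
    candidates.sort(key=lambda item: item[0])
    return [sample_id for _, sample_id in candidates[:max_queries]]
```

The default of 10 queries mirrors the limit stated in the text; the returned ids would be shown to the user for labeling before the training set is rebuilt and cross-validated.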
Conclusions and Future Work
Fully automated specimen recognition software is being developed to assist the plankton imaging system, ISIIS. State-of-the-art machine vision and learning methods are used to achieve the best possible robustness to a high amount of data ambiguities, such as speckle noise, off-focus effects and shadowing. This framework is based on the idea that the most realistic scenario for high-throughput organism recognition in plankton images is to assume imperfect segmentation of the existing organisms and use a robust classification scheme that can merge, split or exclude the detected ROIs during recognition. The performance of the segmentation and recognition algorithms is being validated in a case study of five target specimens in a binary one-against-all classification manner.

Future work will include the extension of the software capability for the recognition of more (15 to 20) classes of specimens and the improvement of the recognition accuracy using direct multi-class classification (instead of the binary one-against-all) and manifold learning approaches.

This work was supported in part by the U.S. Department of Commerce under grants BS123456, National Science Foundation OCE-0513490 and the National Oceanic and Atmospheric Administration (NA04NMF4550391 via the University of New Hampshire-Large Pelagic Research Center 06-131). The authors wish to thank Oscar L. Martinez, a graduate student in the Department of Electrical and Computer Engineering at the University of Miami, for his significant contribution in the implementation of parts of the algorithms.

References
1. Cowen, R.K., and C.M. Guigand, "In-Situ Ichthyoplankton Imaging System (ISIIS): System Design and Preliminary Results," Limnology and Oceanography Methods, no. 6, pp.
2. Hu, Q., and C. Davis, "Automatic Plankton Image Recognition With Co-Occurrence Matrices and Support Vector Machine," Marine Ecology Progress Series, vol. 295, pp. 21-31, 2005.
3. Benfield, M.C., P. Grosjean, P.P. Culverhouse, X. Irigoien, M.E. Sieracki, A. Lopez-Urrutia, H.G. Dam, Q. Hu, C.S. Davis, A. Hansen, C.H. Pilskaln, E.M. Riseman, H. Schultz, P.E. Utgoff and G. Gorsky, "RAPID: Research on Automated Plankton Identification," Oceanography, vol. 20, no. 2, pp.
4. Tsechpenakis, G., C. Guigand and R. Cowen, "Image Analysis Techniques to Accompany a New In-Situ Ichthyoplankton Imaging System (ISIIS)," Oceans 2007 Europe, 2007.
5. Martinez, O., and G. Tsechpenakis, "Integration of Active Learning in a Collaborative CRF," Online Learning for Classification, IEEE Conference on Computer Vision and Pattern Recognition, 2008.

Dr. Gavriil Tsechpenakis is the lead scientist of the Data Visualization Group at the Center for Computational Science at the University of Miami. His research focus is on machine vision and learning, including applications in medical vision and computational biology. He is a member of IEEE and the Association for Computing Machinery.

Cedric Guigand is a senior research associate at the Rosenstiel School of Marine and Atmospheric Science in Miami. He is mainly responsible for instrument design and system integration, as well as at-sea deployment and ...

Dr. Robert K. Cowen is a professor, the Maytag chair of ichthyology and the chair of the Marine Biology and Fisheries Division in the Rosenstiel School of Marine and Atmospheric Science at the University of Miami. His research interests are in fisheries oceanography, larval fish ecology and population dynamics of marine fishes.