Detecting and Segmenting Objects for Mobile Manipulation


Oct 19, 2013


OpenCV Tutorial


Omri Perez



Adapted from:


Gary Bradski

Senior Scientist, Willow Garage

Consulting Professor: Stanford CS Dept.

http://opencv.willowgarage.com


www.willowgarage.com




Vision is Hard


Camera Model, Lens, Problems and Corrections


OpenCV


OpenCV Tour





CS324


What is it?


Turning sensor readings into perception.


Why is it hard?


It’s just numbers.

Maybe try gradients to find edges?

Vision is Hard


Depth discontinuity

Surface orientation discontinuity

Reflectance discontinuity (i.e., change in surface material properties)

Illumination discontinuity (e.g., shadow)

Slide credit: Christopher Rasmussen


Use Edges? … It’s not so simple


Must deal with Lighting Changes …


Lighting is also a Strong Cue


Gary Bradski (c) 2008


The Brain Assumes 3D Geometry


Perception is ambiguous … depending on your point of view!


Geometrical aberrations

spherical distortion

astigmatism

tangential distortion

coma

Aberrations are reduced by combining lenses.

Marc Pollefeys

Non-geometrical aberrations

Chromatic

Vignetting

These are typically what are corrected for in camera calibration.

Distortion Correction so that Lens can Approximate a Pinhole Camera


Distortions are corrected mathematically


We use a calibration pattern


We find where the points ended up


We know where the points should be




OpenCV 2.2 Function:

double calibrateCamera(const vector<vector<Point3f> >& objectPoints,
                       const vector<vector<Point2f> >& imagePoints,
                       Size imageSize,
                       Mat& cameraMatrix,
                       Mat& distCoeffs,
                       vector<Mat>& rvecs,
                       vector<Mat>& tvecs,
                       int flags=0);
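To make "corrected mathematically" concrete, here is a minimal sketch of a two-coefficient radial distortion polynomial and its numerical inverse. It is a standalone illustration in plain C++, not the calibrateCamera implementation; the names NormPoint, distortRadial and undistortRadial and the coefficients k1, k2 are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>

// A point in normalized image coordinates (pinhole model, focal length 1).
struct NormPoint { double x, y; };

// Apply a two-coefficient radial distortion polynomial to an ideal point.
// k1 and k2 are illustrative coefficients, not values from the slides.
NormPoint distortRadial(NormPoint p, double k1, double k2) {
    double r2 = p.x * p.x + p.y * p.y;
    double scale = 1.0 + k1 * r2 + k2 * r2 * r2;
    return { p.x * scale, p.y * scale };
}

// Invert the distortion by fixed-point iteration: start from the observed
// point and repeatedly divide by the scale evaluated at the current guess.
NormPoint undistortRadial(NormPoint observed, double k1, double k2, int iterations = 20) {
    assert(iterations > 0);
    NormPoint p = observed;
    for (int i = 0; i < iterations; ++i) {
        double r2 = p.x * p.x + p.y * p.y;
        double scale = 1.0 + k1 * r2 + k2 * r2 * r2;
        p = { observed.x / scale, observed.y / scale };
    }
    return p;
}
```

Calibration estimates such coefficients from where the pattern points ended up versus where they should be; undistortion then applies the inverse mapping to every pixel.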




OpenCV Overview:

General Image Processing Functions

Machine Learning: Detection, Recognition

Segmentation

Tracking

Matrix Math

Utilities and Data Structures

Fitting

Image Pyramids

Camera calibration, Stereo, 3D

Transforms

Features

Geometric descriptors

Robot support

opencv.willowgarage.com

> 2000 algorithms


OpenCV Tends Towards Real Time

http://opencv.willowgarage.com

Where is OpenCV Used?

Well over 2M downloads

Screen shots by Gary Bradski, 2005


Google Maps, Google street view, Google Earth, Books


Academic and Industry Research


Safety monitoring (dam sites, mines, swimming pools)


Security systems


Image retrieval


Video search


Structure from motion in movies


Machine vision factory production inspection systems


Robotics


OpenCV Modules


Calib3d - Calibration, stereo, homography, rectify, projection, solvePnP

Contrib - Octree, self-similar feature, sparse L-M, bundle adj, chamfer match

Core - Data structures, access, matrix ops, basic image operations

features2D - Feature detectors, descriptors and matchers in one architecture

Flann - Fast library for approximate nearest neighbors

Gpu - CUDA speedups

Highgui - GUI to read, write, draw, print and interact with images

Imgproc - Image processing functions

Ml - Statistical machine learning, boosting, clustering

Objdetect - PASCAL VOC latent SVM and data reading

Traincascade - Boosted rejection cascade

Software Engineering


Works on:


Linux, Windows, Mac OS (+ Android since OpenCV 2.2)


Languages:


C++, Python, C


Online documentation:


Online reference manuals: C++, C and Python.



Gradients: Scharr instead of Sobel


Sobel has been the traditional 3x3 gradient finder.

Use the 3x3 Scharr operator instead since it is just as fast but has more accurate response on diagonals.

void Scharr(const Mat& src, Mat& dst,
int ddepth, int xorder, int yorder, double
scale=1, double delta=0, int
borderType=BORDER_DEFAULT)
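As a rough illustration of what the two operators compute, the sketch below correlates a tiny integer image with the 3x3 Sobel and Scharr x-derivative kernels. It is plain C++ with no OpenCV dependency; filter3x3, sobelX and scharrX are names invented for this example, and borders are simply left at zero instead of using a border mode.

```cpp
#include <cassert>
#include <vector>

using Image = std::vector<std::vector<int>>;

// Correlate the interior of img with a 3x3 kernel; border pixels stay 0.
Image filter3x3(const Image& img, const int kernel[3][3]) {
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    Image out(rows, std::vector<int>(cols, 0));
    for (int y = 1; y + 1 < rows; ++y)
        for (int x = 1; x + 1 < cols; ++x) {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += kernel[dy + 1][dx + 1] * img[y + dy][x + dx];
            out[y][x] = sum;
        }
    return out;
}

// Horizontal-derivative kernels (d/dx).
const int kSobelX[3][3]  = { {-1, 0, 1}, {-2, 0, 2},  {-1, 0, 1} };
const int kScharrX[3][3] = { {-3, 0, 3}, {-10, 0, 10}, {-3, 0, 3} };

Image sobelX(const Image& img)  { return filter3x3(img, kSobelX); }
Image scharrX(const Image& img) { return filter3x3(img, kScharrX); }
```

Both kernels cost the same nine multiply-adds per pixel; only the weights differ, which is why Scharr can be "just as fast".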

Canny Edge Detector


OpenCV team, Gary Bradski

Canny()

Hough Transform


HoughCircles(), HoughLines(), HoughLinesP() (probabilistic Hough)

Scale Space

void cvPyrDown(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);

void cvPyrUp(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);

Space Variant vision: Log-Polar Transform

cvLogPolar(src,dst,center,size, CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS)

Delaunay Triangulation, Voronoi Tessellation

CvSubdiv2D* cvCreateSubdivDelaunay2D(CvRect rect, CvMemStorage* storage)

Contours


void findContours()

Histogram Equalization


void equalizeHist(const Mat& src, Mat& dst)
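For intuition, here is a textbook version of histogram equalization on 8-bit pixels: each gray level is mapped through the normalized cumulative histogram. This is a sketch of the general technique in plain C++, and the library routine may differ in rounding and normalization details; the function name equalize is an assumption for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Textbook histogram equalization for 8-bit pixels: map each level through
// the normalized cumulative histogram so the output levels spread over 0..255.
std::vector<std::uint8_t> equalize(const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());
    std::vector<long> hist(256, 0);
    for (std::uint8_t p : pixels) ++hist[p];

    // Build the lookup table from the running (cumulative) count.
    std::vector<std::uint8_t> lut(256, 0);
    const long total = static_cast<long>(pixels.size());
    long cumulative = 0;
    for (int level = 0; level < 256; ++level) {
        cumulative += hist[level];
        lut[level] = static_cast<std::uint8_t>((cumulative * 255 + total / 2) / total);
    }

    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) out.push_back(lut[p]);
    return out;
}
```

A low-contrast input whose values sit in a narrow band (say 100 to 102) comes out stretched across most of the 0..255 range.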

Image textures



Inpainting:


Removes damage to images, in this case, it removes the text.


void inpaint(const Mat& src, const Mat& inpaintMask, Mat& dst, double
inpaintRadius, int flags);

Morphological Operations Examples

Morphology - applying Min-Max filters and their combinations

Image I

Erosion I ⊖ B

Dilation I ⊕ B

Opening I∘B = (I ⊖ B) ⊕ B

Closing I•B = (I ⊕ B) ⊖ B

TopHat(I) = I - (I∘B)

BlackHat(I) = (I•B) - I

Grad(I) = (I ⊕ B) - (I ⊖ B)

void morphologyEx()

createMorphologyFilter()

erode()

dilate()
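The min-max view of morphology can be sketched on a 1D signal, where the structuring element B is a window of a given radius. This is an illustrative simplification in plain C++ (the image case runs the same min or max over a 2D neighborhood); the names minMaxFilter, erode1d, dilate1d, open1d and close1d are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Signal = std::vector<int>;

// Slide a window of the given radius over the signal and keep the min
// (erosion) or max (dilation); the window is clipped at the borders.
Signal minMaxFilter(const Signal& in, int radius, bool takeMax) {
    assert(radius >= 0);
    const int n = static_cast<int>(in.size());
    Signal out(in.size());
    for (int i = 0; i < n; ++i) {
        int best = in[i];
        for (int j = std::max(0, i - radius); j <= std::min(n - 1, i + radius); ++j)
            best = takeMax ? std::max(best, in[j]) : std::min(best, in[j]);
        out[i] = best;
    }
    return out;
}

Signal erode1d(const Signal& s, int r)  { return minMaxFilter(s, r, false); }
Signal dilate1d(const Signal& s, int r) { return minMaxFilter(s, r, true); }

// Opening removes bright features narrower than the window;
// closing fills dark gaps narrower than the window.
Signal open1d(const Signal& s, int r)  { return dilate1d(erode1d(s, r), r); }
Signal close1d(const Signal& s, int r) { return erode1d(dilate1d(s, r), r); }
```

With radius 1, opening the signal 0 0 9 0 0 5 5 5 0 drops the one-sample spike but keeps the three-sample plateau.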

Distance Transform


Distance field from edges of objects

Flood Filling

void distanceTransform(const Mat& src, Mat& dst, int distanceType, int maskSize)

int floodFill(Mat& image, Point seed, Scalar newVal, Rect* rect=0, Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)
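A minimal sketch of flood filling with 4-connectivity (matching the default flags=4) and exact value matching (the zero loDiff/upDiff case) is shown below. It works on a small integer grid in plain C++; floodFill4 and Grid are names made up for this illustration, and the return value here is a simple count of repainted cells.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Fill the 4-connected region of equal-valued cells containing (row, col)
// with newVal and return the number of cells repainted.
int floodFill4(Grid& grid, int row, int col, int newVal) {
    assert(!grid.empty() && !grid[0].empty());
    const int rows = static_cast<int>(grid.size());
    const int cols = static_cast<int>(grid[0].size());
    assert(row >= 0 && row < rows && col >= 0 && col < cols);
    const int oldVal = grid[row][col];
    if (oldVal == newVal) return 0;

    int filled = 0;
    std::vector<std::pair<int, int>> stack;
    stack.push_back(std::make_pair(row, col));
    while (!stack.empty()) {
        const std::pair<int, int> cell = stack.back();
        stack.pop_back();
        const int r = cell.first, c = cell.second;
        if (r < 0 || r >= rows || c < 0 || c >= cols) continue;
        if (grid[r][c] != oldVal) continue;
        grid[r][c] = newVal;
        ++filled;
        // Visit the four edge-adjacent neighbors (no diagonals).
        stack.push_back(std::make_pair(r + 1, c));
        stack.push_back(std::make_pair(r - 1, c));
        stack.push_back(std::make_pair(r, c + 1));
        stack.push_back(std::make_pair(r, c - 1));
    }
    return filled;
}
```

An explicit stack is used instead of recursion so that large regions cannot overflow the call stack.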

Thresholds


void adaptiveThreshold()

double threshold()

Segmentation


Pyramid, mean-shift, graph-cut

Here: Watershed

void watershed(const Mat& image, Mat& markers)

Background Subtraction

BackgroundSubtractorMOG2(), see samples/cpp/bgfg_segm.cpp

Image Segmentation & Minimum Cut

[Figure: image pixels as graph nodes, pixel neighborhoods linked by edges weighted with a similarity measure w, and the minimum cut through the graph]

* From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003

Graph Cut based segmentation

GrabCut

void grabCut(const Mat& image, Mat& mask, Rect rect, Mat& bgdModel, Mat& fgdModel, int iterCount, int mode)

Motion Templates
(My work with James Davis)

Object silhouette

Motion history images (MHI)

Motion history gradients (MHG)

Motion segmentation algorithm

Segmentation, Motion Tracking


Pose Recognition

Motion Segmentation

Gesture Recognition

void updateMotionHistory();

void calcMotionGradient();

double calcGlobalOrientation();

James Davis, Gary Bradski

Tracking with CAMSHIFT


Control game with head

RotatedRect CamShift(const Mat& probImage, Rect& window, TermCriteria criteria)

3D tracking


Camera Calibration


View Morphing


POSIT

void POSIT()

A more general technique for solving pose is solving the Perspective-n-Point (PnP) problem:

void solvePnP(…)


Mean-Shift for Tracking

CamShift();

MeanShift();

Optical Flow

// opencv/samples/c/lkdemo.c

int main(…){
    CvCapture* capture = <…> ? cvCaptureFromCAM(camera_id) : cvCaptureFromFile(path);
    if( !capture ) return -1;
    for(;;) {
        IplImage* frame = cvQueryFrame(capture);
        if( !frame ) break;
        // … copy and process image
        cvCalcOpticalFlowPyrLK( … );
        cvShowImage( "LkDemo", result );
        int c = cvWaitKey(30); // run at ~20-30fps speed
        if( c >= 0 ) {
            // process key
        }
    }
    cvReleaseCapture(&capture);
}

calcOpticalFlowPyrLK()

Also see dense optical flow:

calcOpticalFlowFarneback()



Features2D

Read two input images:

Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);
// img2, keypoints2 and descriptors2 are obtained the same way

Detect keypoints in both images:

// detecting keypoints
FastFeatureDetector detector(15);
vector<KeyPoint> keypoints1;
detector.detect(img1, keypoints1);

Compute descriptors for each of the keypoints:

// computing descriptors
SurfDescriptorExtractor extractor;
Mat descriptors1;
extractor.compute(img1, keypoints1, descriptors1);

Now, find the closest matches between descriptors from the first image to the second:

// matching descriptors
BruteForceMatcher<L2<float> > matcher;
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);


Features2D continued …

Visualize the results:

namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);

Find the homography transformation between two sets of points:

vector<Point2f> points1, points2;
// fill the arrays with the points
....
Mat H = findHomography(Mat(points1), Mat(points2), CV_RANSAC, ransacReprojThreshold);

Create a set of inlier matches and draw them.

Use perspectiveTransform function to map points with homography:

Mat points1Projected;
perspectiveTransform(Mat(points1), points1Projected, H);

Use drawMatches() again for drawing inliers.

Features2D contents

Detection:

Detectors available

SIFT

SURF

FAST

STAR

MSER

GFTT (Good Features To Track)

Description:

Descriptors available

SIFT

SURF

One way

Calonder (under construction)

FERNS

Kalman Filter, Particle Filter for Tracking

Kalman

Condensation or Particle Filter

::KalmanFilter class

ConDensation

Projections

Find:

Mat getAffineTransform()

Mat getPerspectiveTransform()

Warp:

void warpAffine()

void warpPerspective()

Homography


Maps one plane to another


In our case: A plane in the world to the camera plane


Great notes on this: Robert Collins CSE486

http://www.cse.psu.edu/~rcollins/CSE486/lecture16.pdf

Derivation details: Learning OpenCV 384-387

Gary Bradski, CS223A, Intro to Robotics

Gary Bradski and Adrian Kaehler: Learning OpenCV

Perspective Matrix Equation

(camera coords Pt in world to pt on image)

Homography


We often use the chessboard detector to find 4 non-collinear points (X,Y * 4 = 8 constraints) to solve for the 8 homography parameters.

Code: Once again, OpenCV makes this easy

findHomography(…) or:

getPerspectiveTransform(…)
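Once the homography is known, mapping a point from one plane to the other is a 3x3 matrix product followed by a perspective divide. The sketch below shows only that step in plain C++; Point2, Homography and applyHomography are names assumed for this illustration, and the matrix is stored row-major.

```cpp
#include <cassert>
#include <cmath>

struct Point2 { double x, y; };

// Row-major 3x3 homography. Scale is arbitrary, so fixing h[8] = 1
// leaves the 8 free parameters mentioned above.
struct Homography { double h[9]; };

// Map a point through H using homogeneous coordinates, then divide by w.
Point2 applyHomography(const Homography& H, Point2 p) {
    const double u = H.h[0] * p.x + H.h[1] * p.y + H.h[2];
    const double v = H.h[3] * p.x + H.h[4] * p.y + H.h[5];
    const double w = H.h[6] * p.x + H.h[7] * p.y + H.h[8];
    assert(std::fabs(w) > 1e-12);  // point must not map to infinity
    return { u / w, v / w };
}
```

Each point correspondence gives two equations (one for x, one for y), which is where the 4 points and 8 constraints come from.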


Single Camera Calibration

Now, camera calibration can be done by holding a checkerboard in front of the camera for a few seconds.

And after that you’ll get:

3D view of checkerboard

Undistorted image

See samples/cpp/calibration.cpp

Stereo … Depth from Triangulation


Involved topic, here we will just skim the basic geometry.

Imagine two perfectly aligned image planes:

Depth “Z” and disparity “d” are inversely related:

Stereo


In aligned stereo, depth is from similar triangles:
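The similar-triangles relation for two aligned cameras is commonly written Z = f * T / d, with disparity d = xLeft - xRight. The sketch below assumes that form; the symbols f (focal length in pixels) and T (baseline between the camera centers) and the function name depthFromDisparity are assumptions for this example, since the slide's own equation did not survive extraction.

```cpp
#include <cassert>

// Depth of a point seen by two aligned cameras.
// focalPx: focal length in pixels; baseline: distance between the camera
// centers; xLeft/xRight: horizontal image coordinate of the point in each view.
double depthFromDisparity(double focalPx, double baseline, double xLeft, double xRight) {
    const double disparity = xLeft - xRight;
    assert(disparity > 0.0);  // zero disparity means the point is at infinity
    return focalPx * baseline / disparity;
}
```

Doubling the disparity halves the depth, so depth resolution is fine for near objects and coarse for far ones.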






Problem: Cameras are almost impossible to align


Solution: Mathematically align them:


All: Gary Bradski and Adrian Kaehler: Learning OpenCV

Stereo Rectification


Algorithm steps are shown at right:


Goal:


Each row of the image contains the same world points


“Epipolar constraint”


Result: Epipolar alignment of features:

samples/c

In ...\opencv_incomp\samples\c

bgfg_codebook.cpp - Use of an image value codebook for background detection for collecting objects
bgfg_segm.cpp - Use of a background learning engine
blobtrack.cpp - Engine for blob tracking in images
calibration.cpp - Camera Calibration
camshiftdemo.c - Use of meanshift in simple color tracking
contours.c - Demonstrates how to compute and use object contours
convert_cascade.c - Change the window size in a recognition cascade
convexhull.c - Find the convex hull of an object
delaunay.c - Triangulate a 2D point cloud
demhist.c - Show how to use histograms for recognition
dft.c - Discrete Fourier transform
distrans.c - Distance map from edges in an image
drawing.c - Various drawing functions
edge.c - Edge detection
facedetect.c - Face detection by classifier cascade
ffilldemo.c - Flood filling demo
find_obj.cpp - Demo use of SURF features
fitellipse.c - Robust ellipse fitting
houghlines.c - Line detection
image.cpp - Shows use of new image class, CvImage();
inpaint.cpp - Texture infill to repair imagery
kalman.c - Kalman filter for tracking
kmeans.c - K-Means
laplace.c - Convolve image with Laplacian.
letter_recog.cpp - Example of using machine learning: Boosting, Backpropagation (MLP) and Random forests
lkdemo.c - Lucas-Kanade optical flow
minarea.c - For a cloud of points in 2D, find min bounding box and circle. Shows use of CvSeq
morphology.c - Demonstrates Erode, Dilate, Open, Close
motempl.c - Demonstrates motion templates (orthogonal optical flow given silhouettes)
mushroom.cpp - Demonstrates use of decision trees (CART) for recognition
pyramid_segmentation.c - Color segmentation in pyramid
squares.c - Uses contour processing to find squares in an image
stereo_calib.cpp - Stereo calibration, recognition and disparity map computation
watershed.cpp - Watershed transform demo.

samples/cpp Code of Possible use for Projects

Brief_match_test - Use of fast det., brief descrp. ORB will replace. See video_homography.cpp
Calibration (single camera)
Chamfer (2D edge matching)
Connected_components - Using contours to clean up regions in images.
Contours2 (finding and drawing)
Convexhull (finding in 2D)
Cout_mat (print out Mat)
Demhist using calcHist() - histograms and histogram normalization
Descriptor_extractor_matcher - Use of features2D detector descriptor. Also see matcher_simple.cpp
Distrans - Use of the distanceTransform on edge images and Voronoi tessel.
Edge (Canny edge detection)
Ffilldemo (flood fill methods)
Filestorage (I/O of data structs)
Fitellipse (find contours, fit ellipse)
Grabcut (energy based segmentation)
Imagelist_creator (yaml or xml lists). Read using: starter_imagelist.cpp
Kalman (using the Kalman filter)
Kinect_maps (using Kinect in OpenCV)
Kmeans (using kmeans clustering)
Laplace (finding points/edges)
Letter_recog (machine learning) - Use of Random trees, boosting, MLP
Lkdemo (Lucas-Kanade optical flow)
Morphology2 (erosion, dilation etc)
Multicascadeclassifier (rejection cascade)
Peopledetect (use of HOG)
Select3dobj (calc R and t from calib)
Stereo_* (stereo calib. and matching)
Watershed (segmentation algorithm)



ML for Recognition

CLASSIFICATION / REGRESSION
(new) Fast Approximate NN (FLANN)
(new) Extremely Random Trees
(coming) LSH
CART
Naïve Bayes
MLP (Back propagation)
Statistical Boosting, 4 flavors
Random Forests
SVM
Face Detector
(Histogram matching)
(Correlation)

CLUSTERING
K-Means
EM
(Mahalanobis distance)

TUNING/VALIDATION
Cross validation
Bootstrapping
Variable importance
Sampling methods

Machine Learning Library (MLL)



K-Means, Mahalanobis

double kmeans()

double Mahalanobis()

K-Means:



Choose K data points as cluster centers



While cluster centers change:



Assign each data point to the closest center



If a cluster has no points, choose a random point from points far away from other cluster centers

Move the centers to the mean position of points in their cluster
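The loop above can be sketched on scalar data as follows. This is a simplified illustration in plain C++, not the library's kmeans(): the caller supplies the initial centers, and an empty cluster just keeps its previous center instead of being re-seeded from a far-away point. The name kmeans1d and the maxIterations cap are assumptions for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Lloyd-style k-means on scalar data, following the loop outlined above.
// Initial centers are passed in by the caller; an empty cluster simply
// keeps its old center here (a simplification of the re-seeding step).
std::vector<double> kmeans1d(const std::vector<double>& data,
                             std::vector<double> centers,
                             int maxIterations = 100) {
    assert(!data.empty() && !centers.empty());
    for (int iter = 0; iter < maxIterations; ++iter) {
        std::vector<double> sum(centers.size(), 0.0);
        std::vector<int> count(centers.size(), 0);

        // Assignment step: each point goes to its nearest center.
        for (double x : data) {
            std::size_t best = 0;
            for (std::size_t k = 1; k < centers.size(); ++k)
                if (std::fabs(x - centers[k]) < std::fabs(x - centers[best])) best = k;
            sum[best] += x;
            ++count[best];
        }

        // Update step: move each center to the mean of its points.
        bool changed = false;
        for (std::size_t k = 0; k < centers.size(); ++k) {
            if (count[k] == 0) continue;
            const double mean = sum[k] / count[k];
            if (mean != centers[k]) changed = true;
            centers[k] = mean;
        }
        if (!changed) break;  // centers stopped moving
    }
    return centers;
}
```

For the data 1, 2, 3, 10, 11, 12 with initial centers 1 and 12, the centers settle at 2 and 11 after one update.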

Patch Matching


void matchTemplate()

Gesture Recognition

Gestures: Up, R, L, Stop, OK

Meanshift algorithm used to track, histogram intersection with gradient used to recognize.

Gesture via: Gradient histogram* based gesture recognition with tracking.

*Bill Freeman

double compareHist()
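Histogram intersection, the comparison named above, is the sum of bin-wise minima of two histograms. The sketch below implements that measure in plain C++ and uses it to pick the closest stored gesture model; histogramIntersection and bestMatch are names invented for this illustration, and the model histograms are assumed to be normalized to sum to 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Histogram intersection: the sum of bin-wise minima. For histograms that
// each sum to 1, the score is 1 for identical shapes and 0 for no overlap.
double histogramIntersection(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double score = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) score += std::min(a[i], b[i]);
    return score;
}

// Pick the stored gesture model whose histogram overlaps the query the most.
std::size_t bestMatch(const std::vector<double>& query,
                      const std::vector<std::vector<double>>& models) {
    assert(!models.empty());
    std::size_t best = 0;
    for (std::size_t m = 1; m < models.size(); ++m)
        if (histogramIntersection(query, models[m]) > histogramIntersection(query, models[best]))
            best = m;
    return best;
}
```

In a gesture recognizer the bins would hold gradient-orientation counts from the tracked hand region, one model histogram per gesture.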

Boosting: Face Detection with Viola-Jones Rejection Cascade

In samples/cpp, see: Multicascadeclassifier.cpp

Machine learning


Good features often beat good algorithms

Choose an operating point that trades off accuracy vs. cost

[Figure: operating-point plot with TP, FN, FP and TN regions, both axes running to 100%]
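One way to compare operating points is to compute the hit rate, the false alarm rate and an application-weighted cost from the TP, FN, FP and TN counts at each threshold. The sketch below is a plain C++ illustration; the Confusion struct, the function names and the per-error cost weights are hypothetical and not taken from the slides.

```cpp
#include <cassert>

// Counts of the four outcomes of a binary detector at one threshold.
struct Confusion { int tp, fn, fp, tn; };

// Fraction of real positives that were detected (hit rate).
double truePositiveRate(const Confusion& c) {
    assert(c.tp + c.fn > 0);
    return static_cast<double>(c.tp) / (c.tp + c.fn);
}

// Fraction of real negatives that were wrongly flagged (false alarm rate).
double falsePositiveRate(const Confusion& c) {
    assert(c.fp + c.tn > 0);
    return static_cast<double>(c.fp) / (c.fp + c.tn);
}

// Expected cost of operating at this point, given per-error costs chosen
// by the application (hypothetical weights, not from the slides).
double expectedCost(const Confusion& c, double costMiss, double costFalseAlarm) {
    const int total = c.tp + c.fn + c.fp + c.tn;
    assert(total > 0);
    return (costMiss * c.fn + costFalseAlarm * c.fp) / total;
}
```

Sweeping the detector threshold and evaluating these numbers at each setting shows where extra hits stop being worth the extra false alarms.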

Some project ideas: (feel free to steal, modify or ignore)

1. Identify faces in (cellphone) pictures using Facebook as database.
2. Use the (cellphone) camera to detect dangerous road events and/or detect when someone is awake or sleeping (even with sunglasses on?), also in low light conditions.
3. Use webcam/cellphone to take pictures or videos of a room and then generate the floor plan.
4. Photograph or video a Jenga tower, and advise the player which is the safest block to remove.
5. Make a multiplayer game (if possible with more than one computer/camera) based on CV.
6. Make an intuitive two-handed UI for the OS (extra points for adding the use of facial gestures).
7. Do something with Kinect (e.g. a golf game).
8. For engineers: make a paintball turret (e.g. http://www.paintballsentry.com/Videos.htm).
9. Make a security system with multiple cameras that records high quality portrait images and low quality video and alerts to the presence of suspicious people in real time (e.g. covered faces).
10. Use the camera to cheat/gain an advantage in real life interactions (sports, gambling).
11. Make a system (on the cellphone) that identifies/classifies photographed objects (for example mushrooms).

Questions?

Useful OpenCV Links


OpenCV Wiki:
http://opencv.willowgarage.com/wiki

OpenCV Code Repository:
svn co https://code.ros.org/svn/opencv/trunk/opencv

New Book on OpenCV:
http://oreilly.com/catalog/9780596516130/

Or, direct from Amazon:
http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134

Code examples from the book:
http://examples.oreilly.com/9780596516130/

Documentation
http://opencv.willowgarage.com/documentation/index.html

User Group (44700 members 4/2011):
http://tech.groups.yahoo.com/group/OpenCV/join