# Detecting and Segmenting Objects for Mobile Manipulation

AI and Robotics

Oct 19, 2013 (4 years and 8 months ago)

OpenCV Tutorial

Omri Perez

Senior Scientist, Willow Garage

Consulting Professor: Stanford CS Dept.

http://opencv.willowgarage.com

www.willowgarage.com

Vision is Hard

Camera Model, Lens, Problems and Corrections

OpenCV

OpenCV Tour

CS324

What is it?

Why is it hard?

It’s just numbers.

Maybe try gradients to find edges?

Vision is Hard

Depth discontinuity

Surface orientation discontinuity

Reflectance discontinuity (i.e., change in surface material properties)

Illumination discontinuity (e.g.,

Slide credit: Christopher Rasmussen

Use Edges? … It’s not so simple

Must deal with Lighting Changes …

Lighting is also a Strong Cue

The Brain Assumes 3D Geometry

Perception is ambiguous … depending on your point of view!

Geometrical aberrations

spherical distortion

astigmatism

tangential distortion

coma

Aberrations are reduced by combining lenses.

Marc Pollefeys

Non-geometrical aberrations

chromatic

vignetting

These are typically what are corrected for in camera calibration.

Distortion Correction so that Lens can Approximate a Pinhole Camera

Distortions are corrected mathematically

We use a calibration pattern

We find where the points ended up

We know where the points should be

OpenCV 2.2 Function:

double calibrateCamera(
    const vector<vector<Point3f> >& objectPoints,
    const vector<vector<Point2f> >& imagePoints,
    Size imageSize,
    Mat& cameraMatrix,
    Mat& distCoeffs,
    vector<Mat>& rvecs,
    vector<Mat>& tvecs,
    int flags=0);
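To make "corrected mathematically" concrete, here is a minimal sketch of the radial-plus-tangential lens model that, as far as I know, OpenCV documents for `distCoeffs` in the order (k1, k2, p1, p2, k3). The names `DistCoeffs`, `NormPoint` and `distortPoint` are illustrative and not part of OpenCV. The function applies the forward model to a normalized image point; calibration estimates the coefficients so that this mapping can be undone.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the five coefficients (k1, k2, p1, p2, k3).
struct DistCoeffs { double k1, k2, p1, p2, k3; };

// A point in normalized image coordinates (pinhole projection, before pixels).
struct NormPoint { double x, y; };

// Forward distortion: where an ideal pinhole point lands after the lens.
NormPoint distortPoint(NormPoint p, const DistCoeffs& d) {
    assert(std::isfinite(p.x) && std::isfinite(p.y));
    const double r2 = p.x * p.x + p.y * p.y;  // squared radius
    // Radial term: 1 + k1 r^2 + k2 r^4 + k3 r^6
    const double radial = 1.0 + d.k1 * r2 + d.k2 * r2 * r2 + d.k3 * r2 * r2 * r2;
    // Tangential terms use p1 and p2.
    const double xd = p.x * radial + 2.0 * d.p1 * p.x * p.y
                    + d.p2 * (r2 + 2.0 * p.x * p.x);
    const double yd = p.y * radial + d.p1 * (r2 + 2.0 * p.y * p.y)
                    + 2.0 * d.p2 * p.x * p.y;
    return {xd, yd};
}
```

With all coefficients at zero the lens behaves like the ideal pinhole camera, which is the state that distortion correction tries to restore.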

Vision is Hard

Camera Model, Lens, Problems and Corrections

OpenCV

OpenCV Tour

OpenCV Overview:

General Image Processing Functions

Machine Learning: Detection, Recognition

Segmentation

Tracking

Matrix Math

Utilities and Data Structures

Fitting

Image Pyramids

Camera calibration, Stereo, 3D

Transforms

Features

Geometric descriptors

Robot support

opencv.willowgarage.com

> 2000 algorithms

OpenCV Tends Towards Real Time

http://opencv.willowgarage.com

Where is OpenCV Used?

Well over 2

Safety monitoring (dam sites, mines, swimming pools)

Security systems

Image retrieval

Video search

Structure from motion in movies

Machine vision factory production inspection systems

Robotics

OpenCV Modules

Calib3d

Calibration, stereo, homography, rectify, projection, solvePNP

Contrib

Octree, self-similar feature, sparse L-

Core

Data structures, access, matrix ops, basic image operations

features2D

Feature detectors, descriptors and matchers in one architecture

Flann

(Fast library for approximate nearest neighbors)

Gpu

CUDA speedups

Highgui

Gui to read, write, draw, print and interact with images

Imgproc

image processing functions

Ml

statistical machine learning, boosting, clustering

Objdetect

PASCAL VOC latent SVM and data reading

Software Engineering

Works on:

Linux, Windows, Mac OS (+ Android since OpenCV 2.2)

Languages:

C++, Python, C

Online documentation:

Online reference manuals: C++, C and Python.

Vision is Hard

Camera Model, Lens, Problems and Corrections

OpenCV

OpenCV Tour

Use the 3x3 Scharr operator instead of the 3x3 Sobel operator, since it is just as fast but has a more accurate response on diagonals.

void Scharr(const Mat& src, Mat& dst, int ddepth, int xorder, int yorder, double scale=1, double delta=0, int borderType=BORDER_DEFAULT)
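As a rough illustration of what the Scharr operator computes, the sketch below evaluates the standard 3x3 Scharr kernels at one interior pixel of a row-major grayscale image. It is a plain C++ stand-in, not the OpenCV implementation; `Gradient` and `scharrAt` are hypothetical names, and border pixels are simply disallowed rather than handled with a `borderType`.

```cpp
#include <cassert>
#include <vector>

// Standard 3x3 Scharr kernels for the x- and y-derivatives.
static const int kScharrX[3][3] = {{-3, 0, 3}, {-10, 0, 10}, {-3, 0, 3}};
static const int kScharrY[3][3] = {{-3, -10, -3}, {0, 0, 0}, {3, 10, 3}};

struct Gradient { int gx, gy; };

// Correlate both kernels with the 3x3 neighbourhood centred on (x, y).
// Assumption: (x, y) is an interior pixel, so no border handling is needed.
Gradient scharrAt(const std::vector<int>& img, int width, int height,
                  int x, int y) {
    assert(static_cast<int>(img.size()) == width * height);
    assert(x > 0 && y > 0 && x < width - 1 && y < height - 1);
    Gradient g{0, 0};
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            const int v = img[(y + dy) * width + (x + dx)];
            g.gx += kScharrX[dy + 1][dx + 1] * v;
            g.gy += kScharrY[dy + 1][dx + 1] * v;
        }
    }
    return g;
}
```

On a vertical step edge the x-response is large and the y-response is zero, which is the behaviour an edge detector built on gradients relies on.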

Canny Edge Detector

Canny()

Hough Transform

HoughCircles(), HoughLines(), HoughLinesP() (probabilistic Hough)

Scale Space

void cvPyrDown(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);

void cvPyrUp(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);

Space Variant vision: Log-Polar Transform

cvLogPolar(src,dst,center,size, CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS)

Delaunay Triangulation, Voronoi Tessellation

CvSubdiv2D* cvCreateSubdivDelaunay2D(CvRect rect, CvMemStorage* storage)

Contours

void findContours()

Histogram Equalization

void equalizeHist(const Mat& src, Mat& dst)

Image textures

Inpainting:

Removes damage to images, in this case, it removes the text.

void inpaint(const Mat& src, const Mat& inpaintMask, Mat& dst, double

Morphological Operations Examples

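Morphology: applying min-max filters and their combinations

Image $I$

Erosion: $I \ominus B$

Dilation: $I \oplus B$

Opening: $I \circ B = (I \ominus B) \oplus B$

Closing: $I \bullet B = (I \oplus B) \ominus B$

TopHat(I) $= I - (I \circ B)$

BlackHat(I) $= (I \bullet B) - I$

Grad(I) $= (I \oplus B) - (I \ominus B)$

void morphologyEx()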

createMorphologyFilter()

erode()

dilate()
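The "min-max filter" view of morphology can be sketched in a few lines of plain C++. This is an illustration under stated assumptions, not OpenCV's implementation: the structuring element B is fixed to a 3x3 square, neighbours outside the image are skipped, and `Gray`, `erode3x3`, `dilate3x3`, `open3x3` and `close3x3` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Row-major 8-bit grayscale image.
struct Gray { int w, h; std::vector<unsigned char> px; };

// Shared 3x3 min/max filter; neighbours outside the image are ignored.
static Gray minMax3x3(const Gray& in, bool takeMax) {
    assert(static_cast<int>(in.px.size()) == in.w * in.h);
    Gray out = in;
    for (int y = 0; y < in.h; ++y) {
        for (int x = 0; x < in.w; ++x) {
            unsigned char v = in.px[y * in.w + x];
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= in.h || xx < 0 || xx >= in.w) continue;
                    const unsigned char n = in.px[yy * in.w + xx];
                    v = takeMax ? std::max(v, n) : std::min(v, n);
                }
            }
            out.px[y * in.w + x] = v;
        }
    }
    return out;
}

Gray erode3x3(const Gray& in)  { return minMax3x3(in, false); }      // local minimum
Gray dilate3x3(const Gray& in) { return minMax3x3(in, true); }       // local maximum
Gray open3x3(const Gray& in)   { return dilate3x3(erode3x3(in)); }   // erode, then dilate
Gray close3x3(const Gray& in)  { return erode3x3(dilate3x3(in)); }   // dilate, then erode
```

Opening removes bright specks smaller than the structuring element, while closing fills dark holes of that size; the top-hat, black-hat and gradient variants are differences of these results and the input.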

Distance Transform

Distance field from edges of objects

Flood Filling

void distanceTransform(const Mat& src, Mat& dst, int distanceType, int

int floodFill(Mat& image, Point seed, Scalar newVal, Rect* rect=0, Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)
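The defaults in the floodFill signature above (zero `loDiff`/`upDiff` and `flags=4`) suggest an exact-match fill over 4-connected neighbours. The sketch below shows that idea on a plain integer grid. It is an assumption-laden stand-in rather than OpenCV's code: `floodFill4` is a hypothetical name, and returning the number of repainted pixels is a choice made for this example.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Repaint the 4-connected region that contains the seed and shares its value.
// Returns how many pixels were repainted (a convention chosen for this sketch).
int floodFill4(std::vector<int>& img, int width, int height,
               int seedX, int seedY, int newVal) {
    assert(static_cast<int>(img.size()) == width * height);
    assert(seedX >= 0 && seedX < width && seedY >= 0 && seedY < height);
    const int oldVal = img[seedY * width + seedX];
    if (oldVal == newVal) return 0;  // nothing to do, and avoids an endless loop

    std::queue<std::pair<int, int> > pending;
    img[seedY * width + seedX] = newVal;
    pending.push(std::make_pair(seedX, seedY));
    int filled = 1;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!pending.empty()) {
        const std::pair<int, int> p = pending.front();
        pending.pop();
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k], ny = p.second + dy[k];
            if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
            if (img[ny * width + nx] != oldVal) continue;
            img[ny * width + nx] = newVal;  // mark before queueing to avoid repeats
            pending.push(std::make_pair(nx, ny));
            ++filled;
        }
    }
    return filled;
}
```

A breadth-first queue is used instead of recursion so that large regions cannot overflow the call stack.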

Thresholds

double threshold()

Segmentation

Pyramid, mean-shift, graph-cut

Here: Watershed

void watershed(const Mat& image, Mat& markers)

Background Subtraction

BackgroundSubtractorMOG2(), see samples/cpp/bgfg_segm.cpp

Image Segmentation & Minimum Cut

[Figure: image pixels, pixel neighborhood, similarity measure w, minimum cut]

* From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003

Graph Cut based segmentation

GrabCut

void grabCut(const Mat& image, Mat& mask, Rect rect, Mat& bgdModel, Mat& fgdModel, int iterCount, int mode)

Motion Templates (My work with James Davies)

Object silhouette

Motion history images

Motion segmentation algorithm

[Figure: silhouette, MHI, MHG]

Segmentation, Motion Tracking

[Figure: pose recognition, motion segmentation, gesture recognition]

void updateMotionHistory();

double calcGlobalOrientation();

Tracking with CAMSHIFT

RotatedRect CamShift(const Mat& probImage, Rect& window, TermCriteria criteria)

3D tracking

Camera Calibration

View Morphing

POSIT

void POSIT()

A more general technique for solving pose is solving the Perspective N Point problem:

void solvePnP(…)

Mean-Shift for Tracking

CamShift();

MeanShift();

Optical Flow

// opencv/samples/c/lkdemo.c

int main(…){
    CvCapture* capture = <…> ? cvCaptureFromCAM(camera_id) : cvCaptureFromFile(path);
    if( !capture ) return -1;
    for(;;) {
        IplImage* frame=cvQueryFrame(capture);
        if(!frame) break;
        // … copy and process image
        cvCalcOpticalFlowPyrLK( …)
        cvShowImage( "LkDemo", result );
        c=cvWaitKey(30); // run at ~20-30fps speed
        if(c >= 0) {
            // process key
        }}
    cvReleaseCapture(&capture);}

calcOpticalFlowPyrLK()

Also see dense optical flow:

calcOpticalFlowFarneback()

Features2D

Mat img1, img2; // the two input images

Detect keypoints in both images:

// detecting keypoints
FastFeatureDetector detector(15);
vector<KeyPoint> keypoints1;
detector.detect(img1, keypoints1);

Compute descriptors for each of the keypoints:

// computing descriptors
SurfDescriptorExtractor extractor;
Mat descriptors1;
extractor.compute(img1, keypoints1, descriptors1);

Now, find the closest matches between descriptors from the first image to the second:

// matching descriptors
BruteForceMatcher<L2<float> > matcher;
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);

Features2D continued …

Visualize the results

namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);

Find the homography transformation between two sets of points:

vector<Point2f> points1, points2;
// fill the arrays with the points
....
Mat H = findHomography(Mat(points1), Mat(points2), CV_RANSAC, ransacReprojThreshold);

Create a set of inlier matches and draw them.

Use perspectiveTransform function to map points with homography:

Mat points1Projected;
perspectiveTransform(Mat(points1), points1Projected, H);

Use drawMatches() again for drawing inliers.

Features2D contents

Detectors available

SIFT

SURF

FAST

STAR

MSER

GFTT (Good Features To Track)

Descriptors available

SIFT

SURF

One way

Calonder (under construction)

FERNS

Kalman Filter, Particle Filter for Tracking

Kalman

Condensation or Particle Filter

::KalmanFilter class

ConDensation

Projections

Find:

Mat getAffineTransform()

Mat getPerspectiveTransform()

Warp:

void warpAffine()

void warpPerspective()

Homography

Maps one plane to another

In our case: A plane in the world to the camera plane

Great notes on this: Robert Collins CSE486

http://www.cse.psu.edu/~rcollins/CSE486/lecture16.pdf

Derivation details: Learning OpenCV, pp. 384-387

223A, Intro to Robotics

Perspective Matrix Equation

(camera coords Pt in world to pt on image)

Homography

We often use the chessboard detector to find 4 non-collinear points (X,Y * 4 = 8 constraints) to solve for the 8 homography parameters.

Code: Once again, OpenCV makes this easy

findHomography(…) or: getPerspectiveTransform(…)
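To show why 4 point pairs are enough, the sketch below sets up the 8 linear equations and solves them directly in plain C++. It is an illustration, not how OpenCV implements findHomography or getPerspectiveTransform. It assumes the bottom-right entry of H can be fixed to 1 and that no three points are collinear; `Pt`, `Homography`, `homographyFrom4Points` and `applyHomography` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

struct Pt { double x, y; };
using Homography = std::array<double, 9>;  // row-major 3x3, h[8] fixed to 1

// Each pair (x, y) -> (u, v) contributes two linear equations in h[0..7]:
//   h0 x + h1 y + h2 - u (h6 x + h7 y) = u
//   h3 x + h4 y + h5 - v (h6 x + h7 y) = v
Homography homographyFrom4Points(const std::array<Pt, 4>& src,
                                 const std::array<Pt, 4>& dst) {
    double a[8][9] = {};  // augmented system [A | b]
    for (int i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y, u = dst[i].x, v = dst[i].y;
        const double r0[9] = {x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, u};
        const double r1[9] = {0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, v};
        for (int j = 0; j < 9; ++j) {
            a[2 * i][j] = r0[j];
            a[2 * i + 1][j] = r1[j];
        }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        assert(std::fabs(a[pivot][col]) > 1e-12 && "degenerate point configuration");
        for (int j = 0; j < 9; ++j) std::swap(a[col][j], a[pivot][j]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int j = col; j < 9; ++j) a[r][j] -= f * a[col][j];
        }
    }
    Homography h{};
    for (int i = 0; i < 8; ++i) h[i] = a[i][8] / a[i][i];
    h[8] = 1.0;
    return h;
}

// Map a point through H, dividing by the projective coordinate w.
Pt applyHomography(const Homography& h, Pt p) {
    const double w = h[6] * p.x + h[7] * p.y + h[8];
    return {(h[0] * p.x + h[1] * p.y + h[2]) / w,
            (h[3] * p.x + h[4] * p.y + h[5]) / w};
}
```

With more than 4 noisy correspondences an exact solve no longer applies, which is where a robust estimator such as the CV_RANSAC option shown earlier comes in.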

Single Camera Calibration

Now, camera calibration can be done by holding a checkerboard in front of the camera for a few seconds.

And after that you’ll get:

3D view of checkerboard

Undistorted image

See samples/cpp/calibration.cpp

Stereo … Depth from Triangulation

Involved topic, here we will just skim the basic geometry.

Imagine two perfectly aligned image planes:

Depth “Z” and disparity “d” are inversely related:

Stereo

In aligned stereo, depth is from similar triangles:
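The similar-triangles relation for an ideal, row-aligned stereo pair is commonly written Z = f T / d. The symbols f (focal length in pixels) and T (baseline between the two camera centers) are assumptions added here; only Z and d appear in the text above, with d taken as x_left - x_right. A minimal sketch, with `depthFromDisparity` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cmath>

// Depth of a point seen by two perfectly aligned cameras.
// focalLengthPx: focal length in pixels (assumed equal for both cameras)
// baseline:      distance between the camera centers (sets the unit of Z)
// disparityPx:   x_left - x_right in pixels, must be positive
double depthFromDisparity(double focalLengthPx, double baseline,
                          double disparityPx) {
    assert(disparityPx > 0.0 && "zero disparity corresponds to a point at infinity");
    return focalLengthPx * baseline / disparityPx;
}
```

Because Z is inversely proportional to d, halving the disparity doubles the depth, so depth resolution is fine for nearby objects and coarse for distant ones.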

Problem: Cameras are almost impossible to align

Solution: Mathematically align them:

Stereo Rectification

Algorithm steps are shown at right:

Goal:

Each row of the image contains the same world points

“Epipolar constraint”

Result: Epipolar alignment of features:

samples/c

In ...\opencv_incomp\samples\c

bgfg_codebook.cpp - Use of an image value codebook for background detection for collecting objects

bgfg_segm.cpp - Use of a background learning engine

blobtrack.cpp - Engine for blob tracking in images

calibration.cpp - Camera Calibration

camshiftdemo.c - Use of meanshift in simple color tracking

contours.c - Demonstrates how to compute and use object contours

- Change the window size in a recognition

convexhull.c - Find the convex hull of an object

delaunay.c - Triangulate a 2D point cloud

demhist.c - Show how to use histograms for recognition

dft.c - Discrete Fourier transform

distrans.c - Distance map from edges in an image

drawing.c - Various drawing functions

edge.c - Edge detection

facedetect.c -

ffilldemo.c - Flood filling demo

find_obj.cpp - Demo use of SURF features

fitellipse.c - Robust ellipse fitting

houghlines.c - Line detection

image.cpp - Shows use of new image class, CvImage();

inpaint.cpp - Texture infill to repair imagery

kalman.c - Kalman filter for tracking

kmeans.c - K-Means

laplace.c - Convolve image with Laplacian.

letter_recog.cpp - Example of using machine learning: Boosting, Backpropagation (MLP) and Random forests

lkdemo.c - Lukas-

minarea.c - For a cloud of points in 2D, find min bounding box and circle. Shows use of Cv_SEQ

morphology.c - Demonstrates Erode, Dilate, Open, Close

motempl.c - Demonstrates motion templates (orthogonal optical flow given silhouettes)

mushroom.cpp - Demonstrates use of decision trees (CART) for recognition

pyramid_segmentation.c - Color segmentation in pyramid

squares.c - Uses contour processing to find squares in an image

stereo_calib.cpp - Stereo calibration, recognition and disparity map computation

watershed.cpp - Watershed transform demo.

samples/cpp Code of Possible use for Projects

Brief_match_test - Use of fast det., brief descrp. ORB will replace. See video_homography.cpp

Calibration (single camera)

Chamfer (2D edge matching)

Connected_components - Using contours to clean up regions in images.

Contours2 (finding and drawing)

Convexhull (finding in 2D)

Cout_mat (print out Mat)

Demhist - using calcHist() histograms and histogram normalization

Descriptor_extractor_matcher - Use of features2D detector descriptor. Also see matcher_simple.cpp

Distrans - Use of the distanceTransform on edge images and Voronoi tessel.

Edge (Canny edge detection)

Ffilldemo (flood fill methods)

Filestorage (I/O of data structs)

Fitellipse (find contours, fit ellipse)

Grabcut (energy based segmentation)

Imagelist_creator (yaml or xml lists)

starter_imagelist.cpp

Kalman (Using the kalman filter)

Kinect_maps (using kinect in OpenCV)

Kmeans (using kmeans clustering)

Laplace (finding points/edges)

Letter_recog (machine learning) - Use of Random trees, boosting, MLP

Lkdemo

Morphology2 (erosion, dilation etc)

Peopledetect (use of HOG)

Select3dobj (calc R and t from calib)

Stereo_* (stereo calib. and matching)

Watershed (segmentation algorithm)

ML for Recognition

CLASSIFICATION / REGRESSION

(new) Fast Approximate NN (FLANN)

(new) Extremely Random Trees

(coming) LSH

CART

Naïve Bayes

MLP (Back propagation)

Statistical Boosting, 4 flavors

Random Forests

SVM

Face Detector

(Histogram matching)

(Correlation)

CLUSTERING

K-Means

EM

(Mahalanobis distance)

TUNING/VALIDATION

Cross validation

Bootstrapping

Variable importance

Sampling methods

Machine Learning Library (MLL)

K-Means, Mahalanobis

double kmeans()

double Mahalanobis()

K-Means:

Choose K data points as cluster centers

While cluster centers change:

Assign each data point to the closest center

If a cluster has no points, choose a random point from points far away from other cluster centers

Move the centers to the mean position of points in their cluster
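The loop above can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not OpenCV's kmeans(): the caller supplies the K initial centers, and an empty cluster is re-seeded deterministically at the point farthest from its own center, which stands in for the "random point far away" rule so that the result is reproducible. `Point2` and `kmeansSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Point2 { double x, y; };

static double sqDist(Point2 a, Point2 b) {
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Returns one cluster label per point; `centers` is refined in place.
std::vector<int> kmeansSketch(const std::vector<Point2>& pts,
                              std::vector<Point2>& centers,
                              int maxIter = 100) {
    assert(!pts.empty() && !centers.empty());
    const std::size_t k = centers.size();
    std::vector<int> labels(pts.size(), 0);
    for (int iter = 0; iter < maxIter; ++iter) {
        // Assign each data point to the closest center.
        for (std::size_t i = 0; i < pts.size(); ++i) {
            double best = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const double d = sqDist(pts[i], centers[c]);
                if (d < best) { best = d; labels[i] = static_cast<int>(c); }
            }
        }
        // Accumulate per-cluster sums to compute the means.
        std::vector<Point2> sum(k, Point2{0.0, 0.0});
        std::vector<int> count(k, 0);
        for (std::size_t i = 0; i < pts.size(); ++i) {
            sum[labels[i]].x += pts[i].x;
            sum[labels[i]].y += pts[i].y;
            ++count[labels[i]];
        }
        std::vector<Point2> next = centers;
        for (std::size_t c = 0; c < k; ++c) {
            if (count[c] > 0) {
                next[c] = Point2{sum[c].x / count[c], sum[c].y / count[c]};
            } else {
                // Empty cluster: re-seed at the point farthest from its center.
                std::size_t far = 0;
                double farDist = -1.0;
                for (std::size_t i = 0; i < pts.size(); ++i) {
                    const double d = sqDist(pts[i], centers[labels[i]]);
                    if (d > farDist) { farDist = d; far = i; }
                }
                next[c] = pts[far];
            }
        }
        // Stop once no center moved.
        bool changed = false;
        for (std::size_t c = 0; c < k; ++c)
            if (next[c].x != centers[c].x || next[c].y != centers[c].y) changed = true;
        centers = next;
        if (!changed) break;
    }
    return labels;
}
```

The result depends on the initial centers, so in practice the procedure is often run several times from different starting points and the tightest clustering is kept.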

Patch Matching

void matchTemplate()

Gesture Recognition

Gestures: Up, R, L, Stop, OK

Meanshift algorithm used to track, histogram intersection used to recognize.

Gesture via: histogram*-based gesture recognition with tracking.

*Bill Freeman

double compareHist()
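Histogram intersection, the comparison mentioned for gesture recognition, is the sum of bin-wise minima. To my understanding this matches what compareHist() computes with its intersection method, but the code below is a plain C++ sketch rather than the OpenCV call, and `histogramIntersection` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum over bins of min(h1[i], h2[i]).
// For histograms normalized to sum to 1, identical inputs score 1.0
// and histograms with no overlapping mass score 0.0.
double histogramIntersection(const std::vector<double>& h1,
                             const std::vector<double>& h2) {
    assert(h1.size() == h2.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < h1.size(); ++i) sum += std::min(h1[i], h2[i]);
    return sum;
}
```

A recognizer along these lines would compare the tracked region's histogram against one stored histogram per gesture and pick the gesture with the largest score.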

Boosting: Face Detection with Viola-

In samples/cpp, see:

Machine learning

Good features often beat good algorithms

Choose an operating point that trades off accuracy vs. cost

[Figure: operating-point plot with TP, FN, FP and TN regions; axes run to 100%]
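One way to make the operating-point trade-off concrete is to count TP, FN, FP and TN at a given score threshold and report the two rates. The sketch below assumes a classifier that outputs a score per sample, with higher meaning "more likely positive"; `Rates` and `ratesAtThreshold` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Rates { double truePositiveRate, falsePositiveRate; };

// Classify every sample as positive when score >= threshold, then compare
// against the ground truth to get TP / (TP + FN) and FP / (FP + TN).
Rates ratesAtThreshold(const std::vector<double>& scores,
                       const std::vector<bool>& isPositive,
                       double threshold) {
    assert(scores.size() == isPositive.size());
    int tp = 0, fn = 0, fp = 0, tn = 0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        const bool predicted = scores[i] >= threshold;
        if (isPositive[i]) {
            if (predicted) ++tp; else ++fn;
        } else {
            if (predicted) ++fp; else ++tn;
        }
    }
    assert(tp + fn > 0 && fp + tn > 0);  // need both classes in the data
    return {static_cast<double>(tp) / (tp + fn),
            static_cast<double>(fp) / (fp + tn)};
}
```

Sweeping the threshold from high to low moves both rates from 0% toward 100%; the operating point is the threshold whose mix of missed detections and false alarms is cheapest for the application.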

Some project ideas: (feel free to steal, modify or ignore)

1. Identify faces in (cellphone) pictures using … as a database.

2. Use the (cellphone) camera to detect dangerous events and/or detect when someone is awake or sleeping (even with sunglasses on?), also in low light conditions.

3. Use webcam/cellphone to take pictures or videos of a room and then generate the floor plan.

4. Photograph or video a Jenga tower, and tell the player which is the safest block to remove.

5. Make a multiplayer game (if possible with more than one computer/camera) based on CV.

6. Make an intuitive two-handed UI for the OS (extra points for the use of facial gestures).

7. Do something with Kinect (e.g. a golf game).

8. For engineers: make a paintball turret (e.g. http://www.paintballsentry.com/Videos.htm).

9. Make a security system with multiple cameras that records high quality portrait images and low quality video and detects the presence of suspicious people in real time (e.g. covered faces).

10. Use the camera to cheat/gain an advantage in real life interactions (sports, gambling).

11. Make a system (on the cellphone) that identifies/classifies photographed objects (for example mushrooms).

Questions?

OpenCV Wiki:

http://opencv.willowgarage.com/wiki

OpenCV Code Repository:

svn co https://code.ros.org/svn/opencv/trunk/opencv

New Book on OpenCV:

http://oreilly.com/catalog/9780596516130/

Or, direct from Amazon:

http://www.amazon.com/Learning-OpenCV-Computer-Vision-Library/dp/0596516134

Code examples from the book:

http://examples.oreilly.com/9780596516130/

Documentation

http://opencv.willowgarage.com/documentation/index.html

User Group (44700 members 4/2011):

http://tech.groups.yahoo.com/group/OpenCV/join