# Chapter 21

AI and Robotics

Oct 19, 2013 (4 years and 6 months ago)


Machine Vision

Chapter 21 Contents

Human Vision

Image Processing

Edge Detection

Convolution and the Canny Edge Detector

Segmentation

Classifying Edges

Using Texture

Structural Texture Analysis

Determining Shape and Orientation from Texture

Interpreting Motion

Making Use of Vision

Face Recognition

Human Vision

Image Processing

Image processing consists of the following components:

Image capture

Edge detection

Segmentation

Three-dimensional segmentation

Recognition and analysis

Image capture is the process of converting a visual scene into data that can be processed.

Edge Detection (1)

Edge detection is the first phase in processing image data.

The following images show a photograph of a hand and the edges detected in this image.

Edge Detection (2)

Every edge represents some kind of discontinuity in the image.

Most edges are depth discontinuities.

Other discontinuities are:

Surface orientation discontinuities

Surface reflectance discontinuities

Convolution and the Canny Edge Detector (1)

One method of edge detection is to differentiate the image:

Discontinuities will have the highest differentials.

This does not work well with noisy images, because noise also produces large differentials.

Convolution is better for such images.

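Convolution and the Canny Edge Detector (2)

The convolution of two discrete functions f(a, b) and g(a, b) is defined as follows:

$$ f(a, b) * g(a, b) = \sum_{u=-\infty}^{\infty} \sum_{v=-\infty}^{\infty} f(u, v)\, g(a - u, b - v) $$

The convolution of continuous functions f(a, b) and g(a, b) is defined as follows:

$$ f(a, b) * g(a, b) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(u, v)\, g(a - u, b - v)\, du\, dv $$

An image can be smoothed, to reduce noise, by convolving it with the Gaussian function:

$$ G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2 / (2\sigma^2)} $$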

Convolution and the Canny Edge Detector (3)

The image, after smoothing, can be differentiated to detect the edges.

The peaks in the differential correspond to the edges in the original image.

In fact, the same result can be obtained by convolving the image with the differential of G:

$$ (I * G_\sigma)' = I * G'_\sigma $$
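The idea can be tried on a one-dimensional signal. The following is a minimal sketch, not a definitive implementation: the function names, the test signal, the choice of σ = 2 and the kernel radius of 3σ are all assumptions made for illustration. It convolves a noisy step with the differential of the Gaussian and reports where the response is strongest.

```python
import math

def gaussian(x, sigma):
    """G_sigma(x): the one-dimensional Gaussian."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def gaussian_derivative(x, sigma):
    """G'_sigma(x): the derivative of the Gaussian with respect to x."""
    return -x / (sigma * sigma) * gaussian(x, sigma)

def convolve(signal, kernel):
    """Discrete 1-D convolution; samples beyond either end repeat the end value."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    result = []
    for i in range(len(signal)):
        total = 0.0
        for k in range(-radius, radius + 1):
            j = min(max(i - k, 0), last)  # clamp the index to the signal
            total += signal[j] * kernel[k + radius]
        result.append(total)
    return result

def strongest_edge(signal, sigma=2.0):
    """Index at which the derivative-of-Gaussian response has the largest magnitude."""
    radius = int(3 * sigma)  # assumed cut-off for the kernel
    kernel = [gaussian_derivative(k, sigma) for k in range(-radius, radius + 1)]
    response = convolve(signal, kernel)
    return max(range(len(signal)), key=lambda i: abs(response[i]))

# Hypothetical test signal: a step from 0 to 1 between samples 19 and 20,
# with a small alternating disturbance standing in for noise.
signal = [(0.0 if i < 20 else 1.0) + 0.05 * (-1) ** i for i in range(40)]
edge_index = strongest_edge(signal)
print(edge_index)
```

Because the step lies between samples 19 and 20, the reported index is one of those two samples.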

Convolution and the Canny Edge Detector (4)

This method only works with one-dimensional edges. To detect two-dimensional edges we convolve with two filters, and square and add the results:

$$ (I(x, y) * f_1(x, y))^2 + (I(x, y) * f_2(x, y))^2 $$

where I(x, y) is the value of the pixel at location (x, y) in the image.

Filter 1 is $f_1(x, y) = G'_\sigma(x)\, G_\sigma(y)$

Filter 2 is $f_2(x, y) = G'_\sigma(y)\, G_\sigma(x)$

This is the Canny edge detector.
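The two-filter step can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than a full detector: the function names, σ = 1, the 3σ kernel radius, the border handling and the tiny test image are all invented for the example, and only the "convolve, square and add" step described above is shown.

```python
import math

def gaussian(t, sigma):
    return math.exp(-t * t / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def gaussian_derivative(t, sigma):
    return -t / (sigma * sigma) * gaussian(t, sigma)

def convolve2d(image, kernel):
    """2-D convolution; pixels outside the image repeat the nearest border pixel."""
    height, width = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for v in range(-r, r + 1):
                for u in range(-r, r + 1):
                    yy = min(max(y - v, 0), height - 1)
                    xx = min(max(x - u, 0), width - 1)
                    total += image[yy][xx] * kernel[v + r][u + r]
            out[y][x] = total
    return out

def edge_strength(image, sigma=1.0):
    """Square and add the responses to the two filters at every pixel."""
    r = int(3 * sigma)  # assumed kernel cut-off
    offsets = range(-r, r + 1)
    # kernel[v + r][u + r] holds the filter value at offset x = u, y = v
    filter1 = [[gaussian_derivative(u, sigma) * gaussian(v, sigma) for u in offsets]
               for v in offsets]
    filter2 = [[gaussian_derivative(v, sigma) * gaussian(u, sigma) for u in offsets]
               for v in offsets]
    a = convolve2d(image, filter1)
    b = convolve2d(image, filter2)
    return [[a[y][x] ** 2 + b[y][x] ** 2 for x in range(len(image[0]))]
            for y in range(len(image))]

# Hypothetical 12 x 12 image: dark left half, bright right half,
# so a vertical edge lies between columns 5 and 6.
image = [[0.0 if x < 6 else 1.0 for x in range(12)] for y in range(12)]
strength = edge_strength(image)
row = strength[6]
peak_column = max(range(12), key=lambda x: row[x])
print(peak_column)
```

The response is largest in the columns next to the edge and falls to zero in the flat regions away from it.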

Segmentation

Once the edges have been detected, they can be used to segment the image.

Segmentation involves dividing the image into areas which do not contain edges.

These areas will not have sharp changes in pixel values.

In fact, edge detection will not always entirely segment an image.

Another method is thresholding.

Thresholding involves joining pixels together that have similar colors.
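One simple reading of this is to join neighboring pixels whenever their values differ by no more than a threshold. The sketch below assumes gray-level pixels, four-way neighbors and a threshold of 10; these choices, the function name `segment` and the sample image are assumptions for illustration only.

```python
from collections import deque

def segment(image, threshold):
    """Label connected groups of pixels whose neighboring values differ by <= threshold."""
    height, width = len(image), len(image[0])
    labels = [[None] * width for _ in range(height)]
    region_count = 0
    for start_y in range(height):
        for start_x in range(width):
            if labels[start_y][start_x] is not None:
                continue  # already joined to an earlier region
            labels[start_y][start_x] = region_count
            queue = deque([(start_y, start_x)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if (inside and labels[ny][nx] is None
                            and abs(image[ny][nx] - image[y][x]) <= threshold):
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
            region_count += 1
    return labels, region_count

# Hypothetical gray-level image with a dark, a mid-gray and a bright area.
image = [
    [10, 12, 11, 200, 205],
    [11, 13, 12, 198, 202],
    [10, 11, 90, 92, 201],
    [12, 10, 91, 93, 199],
]
labels, region_count = segment(image, threshold=10)
for row in labels:
    print(row)
```

With this image the dark, mid-gray and bright pixels end up in three separate regions.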

Classifying Edges (1)

After extracting edges, it is useful to classify the edges.

A convex edge is an edge between two faces that are at an angle of more than 180° from each other.

A concave edge is an edge between two faces that are at an angle of less than 180° from each other.

An occluding edge is a depth discontinuity.

Classifying Edges (2)

The following diagram shows a line drawing that has had all its edges classified as convex (+), concave (-) or occluding (arrow):

Classifying Edges (3)

Most vertices represent a meeting of three faces.

There are only sixteen possible ways these trihedral vertices can be labeled:

Classifying Edges (4)

The Waltz algorithm uses this constraint.

This works as follows:

The first edge that is visited is marked with all possible labels.

Then the algorithm moves on to an adjacent edge, and attempts to label it.

If an edge cannot be labeled, the algorithm backtracks.

Thus, depth-first search is applied to attempt to find a consistent labeling for the whole image.
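The depth-first search with backtracking described above can be sketched as follows. This is a toy illustration, not the Waltz algorithm as published: the catalogue used here is a made-up three-entry table for a hypothetical two-edge junction type, not the real set of sixteen trihedral labelings, and the function and variable names are assumptions.

```python
def label_edges(edges, junctions, catalogue, labels=("+", "-", ">")):
    """Depth-first search for a labeling in which every junction is allowed.

    junctions: list of (junction_type, tuple_of_edges)
    catalogue: junction_type -> set of allowed label tuples, in edge order
    """
    assignment = {}

    def consistent():
        # Only junctions whose edges are all labeled can be checked.
        for junction_type, junction_edges in junctions:
            if all(edge in assignment for edge in junction_edges):
                combo = tuple(assignment[edge] for edge in junction_edges)
                if combo not in catalogue[junction_type]:
                    return False
        return True

    def search(index):
        if index == len(edges):
            return True  # every edge has a label
        edge = edges[index]
        for label in labels:
            assignment[edge] = label
            if consistent() and search(index + 1):
                return True
        del assignment[edge]  # no label fits: backtrack
        return False

    return dict(assignment) if search(0) else None

# Hypothetical toy problem: three edges joined pairwise at three junctions.
# The catalogue below is invented for the example.
toy_catalogue = {"L": {(">", "+"), ("+", ">"), ("-", "-")}}
toy_junctions = [("L", ("a", "b")), ("L", ("b", "c")), ("L", ("c", "a"))]
result = label_edges(["a", "b", "c"], toy_junctions, toy_catalogue)
print(result)
```

In this toy problem the search first tries "+" for edge a, finds that no labeling of c is then allowed, backtracks, and settles on "-" for all three edges.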

Using Texture (1)

Textures, such as these, tell us a great deal about images, including:

Orientation

Shape

We can also determine what the pictures on the right are showing, simply by their textures.

Using Texture (2)

A statistical method of determining texture is to use co-occurrence matrices.

D(m, n) is the number of pairs of pixels in our picture, P, for which:

P(i, j) = m

P(i + δi, j + δj) = n

(i, j) is the position of a pixel in P, and δi and δj are small increments.

D defines how likely it is that any two pixels a particular distance apart (δi and δj) will have a particular pair of values.

The co-occurrence matrix is defined as:

$$ C = D + D^T $$

where $D^T$ is the transpose of D.
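The definition translates directly into code. The sketch below counts the pairs for one offset (δi, δj) and then adds the transpose; the function name, the parameter names `di`, `dj` and `levels`, and the 3 x 3 sample picture are assumptions for illustration.

```python
def cooccurrence(picture, di, dj, levels):
    """Return D and C = D + D^T for pixel pairs separated by (di, dj)."""
    height, width = len(picture), len(picture[0])
    D = [[0] * levels for _ in range(levels)]
    for i in range(height):
        for j in range(width):
            i2, j2 = i + di, j + dj
            if 0 <= i2 < height and 0 <= j2 < width:
                m = picture[i][j]    # P(i, j) = m
                n = picture[i2][j2]  # P(i + di, j + dj) = n
                D[m][n] += 1
    C = [[D[m][n] + D[n][m] for n in range(levels)] for m in range(levels)]
    return D, C

# Hypothetical picture with three gray levels (0, 1, 2).
picture = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 1],
]
D, C = cooccurrence(picture, di=0, dj=1, levels=3)
print(D)  # [[1, 2, 0], [0, 1, 0], [0, 1, 1]]
print(C)  # [[2, 2, 0], [2, 2, 1], [0, 1, 2]]
```

Here each pixel is paired with its right-hand neighbor, so D[0][1] = 2 records that a 0 is followed by a 1 twice in the picture.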

Structural Texture Analysis

The structural approach treats textures as being made up of individual units called texels.

In this image, each tile is a texel.

Texel analysis involves searching for repeated patterns and shapes within an image.

Determining Shape and Orientation from Texture (1)

These are good examples of pictures where texture helps to determine the shape and orientation.

Note that the second image, although it is a flat, two-dimensional shape, looks like a sphere.

This is because this is the only sensible way for our brains to explain the texture.

Determining Shape and Orientation from Texture (2)

One way to determine orientation is to assume that each texel is flat.

Thus the extent of distortion of the shape of the texel will tell us what angle it is being viewed at.

Orientation involves determining slant (σ) and tilt (τ), as shown here:
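As a worked example, suppose each texel is a flat circle and the camera is far enough away for perspective effects to be ignored (both are assumptions, as are the function and parameter names below). A circle seen at slant σ then appears as an ellipse whose minor axis is the major axis multiplied by cos σ, and the tilt is the image direction along which the circle has been squashed. A texel shape alone cannot tell a tilt of τ from τ + 180°, so the sketch reports tilt in the range 0° to 180°.

```python
import math

def slant_and_tilt(major_axis, minor_axis, minor_axis_angle_deg):
    """Estimate slant and tilt (in degrees) from the ellipse a circular texel projects to.

    Assumes a flat circular texel and an orthographic view.
    """
    slant = math.degrees(math.acos(minor_axis / major_axis))
    tilt = minor_axis_angle_deg % 180.0  # direction of greatest foreshortening
    return slant, tilt

# Hypothetical measurement: an ellipse twice as wide as it is tall,
# squashed along the vertical (90 degree) image direction.
slant, tilt = slant_and_tilt(major_axis=10.0, minor_axis=5.0, minor_axis_angle_deg=90.0)
print(round(slant, 6), tilt)
```

An axis ratio of one half corresponds to a slant of 60°, since cos 60° = 0.5.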


Interpreting Motion (1)

Detecting motion is vital in mammalian vision.

Similarly, agents that interact with the real world need to be able to interpret motion.

We are interested in two types of motion:

Actual motion of other objects

Apparent motion caused by the motion of the agent.

This apparent motion is known as optical flow, and the vectors that define the apparent motion are the motion field.
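One vector of a motion field can be estimated by taking a small patch from one frame and looking for the best-matching patch in the next frame. The sketch below uses block matching with a sum of squared differences; this is one common technique chosen for illustration, not necessarily the method the slides have in mind, and the function names, patch size, search range and synthetic frames are assumptions.

```python
def patch_difference(frame_a, frame_b, y, x, dy, dx, size):
    """Sum of squared differences between a patch in frame_a and a shifted patch in frame_b."""
    total = 0
    for v in range(size):
        for u in range(size):
            diff = frame_a[y + v][x + u] - frame_b[y + v + dy][x + u + dx]
            total += diff * diff
    return total

def estimate_motion(frame_a, frame_b, y, x, size=3, search=2):
    """Return the (dy, dx) shift, within the search range, that best matches the patch."""
    height, width = len(frame_b), len(frame_b[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            fits = (0 <= y + dy and y + dy + size <= height
                    and 0 <= x + dx and x + dx + size <= width)
            if not fits:
                continue  # shifted patch would fall outside the frame
            cost = patch_difference(frame_a, frame_b, y, x, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]

def make_frame(top, left):
    """Hypothetical 8 x 8 frame containing one textured 3 x 3 block."""
    frame = [[0] * 8 for _ in range(8)]
    for v in range(3):
        for u in range(3):
            frame[top + v][left + u] = 10 + 3 * v + u
    return frame

frame_a = make_frame(top=2, left=2)
frame_b = make_frame(top=3, left=4)  # the block has moved 1 down and 2 right
motion = estimate_motion(frame_a, frame_b, y=2, x=2)
print(motion)  # (1, 2)
```

Repeating this for patches across the whole image gives one vector per patch, which together approximate the motion field.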

Interpreting Motion (2)

The arrows on this photo show the motion field.

Making Use of Vision

What purpose does machine vision really serve?

It can be used to control mobile agents or unmanned vehicles such as those sent to other planets.

Another purpose is to identify objects in the agent’s environment.

If the agent is to interact with these objects (pick them up, sit on them, talk to them) it must be able to recognize that they are there.

Face Recognition

Face recognition is an example of a problem that humans are extremely good at solving, but computers are very bad at.

Faces must be recognized in varying lighting conditions, from different angles and distances, and with other variable elements such as facial hair, glasses, hats and natural aging.

Methods used in face recognition vary, but many involve principal component analysis:

Identifying those features that most differentiate one face from another, and treating those as a vector which is to be compared.
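The comparison step can be sketched on toy data. In the example below each "face" is a made-up list of four numbers standing in for pixel values, the principal components are found by power iteration with deflation, and a probe is matched to the stored face whose projection is nearest. The names, the data, the choice of two components and the start vector are all assumptions; a real system would use far larger images and a numerical library.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def principal_components(centered, k, iterations=100):
    """Top-k principal directions of mean-centered data, by power iteration with deflation."""
    dim = len(centered[0])
    components = []
    for _ in range(k):
        # Arbitrary start vector; assumed not to be orthogonal to the direction sought.
        w = [1.0] + [0.0] * (dim - 1)
        for _step in range(iterations):
            for c in components:  # remove directions already found
                overlap = dot(w, c)
                w = [wi - overlap * ci for wi, ci in zip(w, c)]
            new_w = [0.0] * dim
            for x in centered:    # multiply by the scatter matrix X^T X
                weight = dot(x, w)
                new_w = [n + weight * xi for n, xi in zip(new_w, x)]
            norm = math.sqrt(dot(new_w, new_w))
            w = [n / norm for n in new_w]
        components.append(w)
    return components

def project(vector, mean, components):
    centered = [v - m for v, m in zip(vector, mean)]
    return [dot(centered, c) for c in components]

def recognize(probe, gallery, k=2):
    """Name of the gallery face whose projection is closest to the probe's."""
    names = list(gallery)
    vectors = [gallery[name] for name in names]
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    centered = [[v - m for v, m in zip(vec, mean)] for vec in vectors]
    components = principal_components(centered, k)
    target = project(probe, mean, components)

    def squared_distance(name):
        candidate = project(gallery[name], mean, components)
        return sum((t - c) ** 2 for t, c in zip(target, candidate))

    return min(names, key=squared_distance)

# Hypothetical four-value "faces" and a slightly altered view of the first one.
gallery = {"alice": [9, 1, 8, 2], "bob": [1, 9, 2, 8], "carol": [5, 5, 9, 1]}
probe = [8, 2, 8, 3]
match = recognize(probe, gallery)
print(match)
```

The probe differs a little from every stored face, but its projection lies closest to the one labeled "alice".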