1
Chapter 21
Machine Vision
2
Chapter 21 Contents (1)
Human Vision
Image Processing
Edge Detection
Convolution and the Canny Edge Detector
Segmentation
Classifying Edges
3
Chapter 21 Contents (2)
Using Texture
Structural Texture Analysis
Determining Shape and Orientation from Texture
Interpreting Motion
Making Use of Vision
Face Recognition
4
Human Vision
5
Image Processing
Image Processing consists of the following
components:
Image capture
Edge detection
Segmentation
Three-dimensional segmentation
Recognition and analysis
Image capture is the process of converting
a visual scene into data that can be
processed.
6
Edge Detection (1)
Edge detection is the first phase in
processing image data.
The following images show a
photograph of a hand and the edges
detected in this image.
7
Edge Detection (2)
Every edge represents some kind of
discontinuity in the image.
Most edges are depth discontinuities.
Other discontinuities are:
Surface orientation discontinuities
Surface reflectance discontinuities
Illumination discontinuities (shadows)
8
Convolution and the Canny Edge Detector (1)
One method of edge detection is to
differentiate the image:
Discontinuities will have the highest
differentials.
This does not work well with noisy
images, because differentiation amplifies noise.
Smoothing the image by convolution first
works better for such images.
9
Convolution and the Canny Edge Detector (2)
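The convolution of two discrete functions f(a, b) and g(a, b) is defined as follows:
$$(f * g)(a, b) = \sum_{u=-\infty}^{\infty} \sum_{v=-\infty}^{\infty} f(u, v)\, g(a - u, b - v)$$
The convolution of continuous functions f(a, b) and g(a, b) is defined as follows:
$$(f * g)(a, b) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(u, v)\, g(a - u, b - v)\, du\, dv$$
An image can be smoothed, to reduce noise, by convolving it with the Gaussian function:
$$G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2 / (2\sigma^2)}$$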
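A minimal sketch of Gaussian smoothing by discrete convolution, reduced to one dimension to keep it short. The function names, the truncation of the kernel at about 3σ, and the choice to repeat the edge sample at the borders are assumptions made for illustration, not part of the slides.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Sampled 1-D Gaussian, normalised so that the weights sum to 1."""
    if radius is None:
        radius = int(3 * sigma + 0.5)  # assumption: truncate at about 3 sigma
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve1d(signal, kernel):
    """Discrete convolution (f * g)(a) = sum over u of f(u) g(a - u).

    The output has the same length as the input. Samples that fall
    outside the signal reuse the nearest edge value (an assumption).
    """
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for a in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            offset = k - radius               # the argument of g
            u = min(max(a - offset, 0), n - 1)
            acc += signal[u] * weight
        out.append(acc)
    return out

# Usage: a step edge with one noisy sample at index 2.
noisy = [0.0, 0.0, 1.0, 0.0, 0.0, 10.0, 9.0, 10.0, 10.0, 10.0]
smoothed = convolve1d(noisy, gaussian_kernel(1.0))
```

A larger σ removes more noise but also blurs the edge over more samples, so the choice of σ is a trade-off.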
10
Convolution and the Canny Edge Detector (3)
The image, after smoothing, can be
differentiated to detect the edges.
The peaks in the differential correspond
to the edges in the original image.
In fact, the same result can be obtained
by convolving the image with the
differential of G:
$$(I * G_\sigma)' = I * G'_\sigma$$
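The idea can be tried on a one-dimensional signal. The sketch below samples the derivative of the standard Gaussian, G'_σ(x) = -(x / σ²) G_σ(x), convolves a signal with it, and reports where the response is strongest. The function names and the border handling are illustrative assumptions.

```python
import math

def gaussian_derivative_kernel(sigma, radius=None):
    """Sampled derivative of the Gaussian: -(x / sigma^2) * G_sigma(x)."""
    if radius is None:
        radius = int(3 * sigma + 0.5)  # assumption: truncate at about 3 sigma
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return [-x / (sigma * sigma) * norm * math.exp(-(x * x) / (2.0 * sigma * sigma))
            for x in range(-radius, radius + 1)]

def convolve1d(signal, kernel):
    """Same-length discrete convolution; borders reuse the nearest sample."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for a in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            u = min(max(a - (k - radius), 0), n - 1)
            acc += signal[u] * weight
        out.append(acc)
    return out

def strongest_edge(signal, sigma=1.0):
    """Index where the smoothed derivative has the largest magnitude."""
    response = convolve1d(signal, gaussian_derivative_kernel(sigma))
    return max(range(len(response)), key=lambda i: abs(response[i]))

# Usage: a step between index 9 and index 10.
step = [0.0] * 10 + [10.0] * 10
edge_index = strongest_edge(step)
```

A single convolution with G' replaces the two separate steps of smoothing and then differentiating.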
11
Convolution and the Canny Edge Detector (4)
This method only works with one-dimensional
edges. To detect two-dimensional edges we
convolve with two filters, and square and add
the results:
$$(I(x, y) * f_1(x, y))^2 + (I(x, y) * f_2(x, y))^2$$
where I(x, y) is the value of the pixel at
location (x, y) in the image.
Filter 1 is $f_1(x, y) = G'_\sigma(x)\, G_\sigma(y)$
Filter 2 is $f_2(x, y) = G'_\sigma(y)\, G_\sigma(x)$
This is the Canny edge detector.
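The filtering step can be sketched as follows. Each filter is a product of a function of x and a function of y, so each can be applied as a horizontal pass followed by a vertical pass. The function names, kernel radius and border handling are assumptions. The sketch covers only the filter responses described above; the Canny detector as usually presented goes on to thin and threshold these responses.

```python
import math

def gaussian(sigma, radius):
    """Sampled 1-D Gaussian G_sigma."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return [norm * math.exp(-(x * x) / (2.0 * sigma * sigma))
            for x in range(-radius, radius + 1)]

def gaussian_derivative(sigma, radius):
    """Sampled derivative G'_sigma(x) = -(x / sigma^2) G_sigma(x)."""
    g = gaussian(sigma, radius)
    return [-x / (sigma * sigma) * g[x + radius]
            for x in range(-radius, radius + 1)]

def convolve_separable(image, kernel_x, kernel_y):
    """Convolve with the 2-D filter kernel_x(x) * kernel_y(y).

    Both kernels must have the same length. Pixels outside the image
    reuse the nearest border pixel (an assumption).
    """
    height, width = len(image), len(image[0])
    radius = len(kernel_x) // 2
    size = len(kernel_x)
    # Horizontal pass along each row.
    rows = [[sum(kernel_x[k] * row[min(max(x - (k - radius), 0), width - 1)]
                 for k in range(size))
             for x in range(width)]
            for row in image]
    # Vertical pass along each column.
    return [[sum(kernel_y[k] * rows[min(max(y - (k - radius), 0), height - 1)][x]
                 for k in range(size))
             for x in range(width)]
            for y in range(height)]

def edge_strength(image, sigma=1.0):
    """Sum of the squared responses to the two filters."""
    radius = int(3 * sigma + 0.5)  # assumption: truncate at about 3 sigma
    g = gaussian(sigma, radius)
    dg = gaussian_derivative(sigma, radius)
    r1 = convolve_separable(image, dg, g)  # filter 1: G'(x) G(y)
    r2 = convolve_separable(image, g, dg)  # filter 2: G'(y) G(x)
    return [[a * a + b * b for a, b in zip(row1, row2)]
            for row1, row2 in zip(r1, r2)]

# Usage: an 8 x 12 image with a vertical edge between columns 5 and 6.
image = [[0.0] * 6 + [10.0] * 6 for _ in range(8)]
strength = edge_strength(image)
```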
12
Segmentation
Once the edges have been detected, they
can be used to segment the image.
Segmentation involves dividing the image
into areas which do not contain edges.
These areas will not have sharp changes in
color or shading.
In practice, edge detection alone will not
always entirely segment an image.
Another method is thresholding.
Thresholding involves grouping together
pixels that have similar colors.
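One simple reading of thresholding, sketched for a grey-level image: pixels are split into two classes by a cut-off value, and neighboring pixels of the same class are then grouped into regions. The cut-off `level`, the use of 4-connectivity and the function names are assumptions for illustration.

```python
def threshold(image, level):
    """Binary mask: True where the pixel value is at least `level`."""
    return [[pixel >= level for pixel in row] for row in image]

def label_regions(mask):
    """Give every 4-connected group of equal mask values its own label.

    Returns (labels, region_count), where labels has the shape of mask.
    """
    height, width = len(mask), len(mask[0])
    labels = [[None] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if labels[y][x] is not None:
                continue
            # Flood fill from this unlabeled pixel.
            labels[y][x] = next_label
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] is None
                            and mask[ny][nx] == mask[cy][cx]):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
            next_label += 1
    return labels, next_label

# Usage: a bright 2 x 2 patch on a dark background.
image = [[10, 10, 200, 200],
         [10, 10, 200, 200],
         [10, 10, 10, 10]]
labels, count = label_regions(threshold(image, 128))
```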
13
Classifying Edges (1)
After extracting edges, it is useful to
classify the edges.
A convex edge is an edge between two faces
that are at an angle of more than 180° from
each other.
A concave edge is an edge between two faces
that are at an angle of less than 180° from
each other.
An occluding edge is a depth discontinuity.
14
Classifying Edges (2)
The following diagram shows a line
drawing that has had all its edges
classified as convex (+), concave (−)
or occluding (arrow):
15
Classifying Edges (3)
Most vertices represent a meeting of
three faces.
There are only sixteen possible ways these
trihedral vertices can be labeled:
16
Classifying Edges (4)
The Waltz algorithm uses this constraint.
This works as follows:
The first edge that is visited is marked with all
possible labels.
Then the algorithm moves onto an adjacent
edge, and attempts to label it.
If an edge cannot be labeled, the algorithm
backtracks.
Thus, depth-first search is applied to attempt to
find a consistent labeling for the whole image.
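The depth-first search described above can be sketched as a small backtracking routine. This version tries candidate labels for each edge one at a time instead of storing the whole candidate set. The junction catalogue in the usage example is a made-up toy, not the real catalogue of trihedral vertex labelings, and the function and parameter names are hypothetical.

```python
def label_edges(edges, junctions, labels=("+", "-", ">", "<")):
    """Depth-first search for an edge labeling consistent with every junction.

    edges:     list of edge names, visited in this order.
    junctions: list of (incident_edges, allowed) pairs, where allowed is a
               set of label tuples in the same order as incident_edges.
    Returns a dict mapping edge to label, or None if no labeling exists.
    """
    assignment = {}

    def consistent():
        # A junction stays viable while at least one allowed tuple agrees
        # with every incident edge that has been labeled so far.
        for incident, allowed in junctions:
            if not any(
                all(assignment.get(edge, label) == label
                    for edge, label in zip(incident, combo))
                for combo in allowed
            ):
                return False
        return True

    def search(index):
        if index == len(edges):
            return True
        edge = edges[index]
        for label in labels:
            assignment[edge] = label
            if consistent() and search(index + 1):
                return True
        assignment.pop(edge, None)  # no label worked: backtrack
        return False

    return dict(assignment) if search(0) else None

# Usage with a toy catalogue (illustrative only).
toy_junctions = [
    (("a", "b"), {("+", "+"), ("-", ">")}),
    (("b", "c"), {(">", "-")}),
]
result = label_edges(["a", "b", "c"], toy_junctions)
```

In this toy case the first choice for edge a has to be undone, which is the backtracking step the slide refers to.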
17
Using Texture (1)
Textures, such as these, tell us a great deal about images, including:
Orientation
Shape
We can also determine what the pictures on the right are showing, simply by their textures.
18
Using Texture (2)
A statistical method of determining texture is to use
co-occurrence matrices.
D(m, n) is the number of pairs of pixels in our picture,
P, for which:
P(i, j) = m
P(i + δi, j + δj) = n
i and j are the coordinates of a pixel in P, and δi and δj
are small increments.
D records how often two pixels a particular distance
apart (δi and δj) have a particular pair of values.
The co-occurrence matrix is defined as:
$C = D + D^{T}$
where $D^{T}$ is the transpose of D.
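The definition can be written out directly. In this sketch the picture is a list of rows of integer grey levels in the range 0 to levels - 1; the function name and that representation are assumptions.

```python
def cooccurrence(picture, di, dj, levels):
    """Return (D, C) for the displacement (di, dj).

    D[m][n] counts the pixel pairs with P(i, j) = m and
    P(i + di, j + dj) = n. C is D plus its transpose.
    """
    rows, cols = len(picture), len(picture[0])
    d = [[0] * levels for _ in range(levels)]
    for i in range(rows):
        for j in range(cols):
            i2, j2 = i + di, j + dj
            if 0 <= i2 < rows and 0 <= j2 < cols:  # skip pairs that leave the picture
                d[picture[i][j]][picture[i2][j2]] += 1
    c = [[d[m][n] + d[n][m] for n in range(levels)] for m in range(levels)]
    return d, c

# Usage: pair each pixel with its right-hand neighbor (di = 0, dj = 1).
picture = [[0, 0, 1],
           [0, 1, 1],
           [2, 2, 2]]
d, c = cooccurrence(picture, 0, 1, 3)
# d == [[1, 2, 0], [0, 1, 0], [0, 0, 2]]
# c == [[2, 2, 0], [2, 2, 0], [0, 0, 4]]
```

Adding the transpose makes C symmetric, so it no longer matters which pixel of a pair is visited first.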
19
Structural Texture Analysis
The structural approach treats textures as
being made up of individual units called
texels.
In this image, each tile
is a texel.
Texel analysis involves
searching for repeated
patterns and shapes
within an image.
20
Determining Shape and Orientation from Texture (1)
These are good examples of pictures where texture
helps to determine the shape and orientation.
Note that the second image, although it is a flat,
two-dimensional shape, looks like a sphere.
This is because a sphere is the only sensible way for
our brains to explain the texture.
21
Determining Shape and Orientation from Texture (2)
One way to determine orientation is to
assume that each texel is flat.
Thus the extent of distortion of the shape of
the texel will tell us what angle it is being
viewed at.
Orientation involves determining slant (σ)
and tilt (τ), as shown here:
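As a worked example of reading slant from distortion, suppose each texel is a flat circle and the view is an orthographic projection (both are assumptions for this sketch, as is the function name). The circle then appears as an ellipse whose minor-to-major axis ratio equals cos σ, and the tilt τ is the image direction along which the texel is compressed, which is the direction of the minor axis.

```python
import math

def slant_degrees(major_axis, minor_axis):
    """Slant of a circular texel seen as an ellipse, in degrees.

    Assumes orthographic projection, so minor / major = cos(slant).
    """
    if major_axis <= 0 or minor_axis < 0 or minor_axis > major_axis:
        raise ValueError("expected 0 <= minor_axis <= major_axis and major_axis > 0")
    return math.degrees(math.acos(minor_axis / major_axis))

# Usage: a texel that looks half as tall as it is wide.
slant = slant_degrees(2.0, 1.0)  # about 60 degrees
```

A texel that still looks circular gives a slant of 0, meaning the surface faces the viewer directly.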
23
Interpreting Motion (1)
Detecting motion is vital in mammalian
vision.
Similarly, agents that interact with the real
world need to be able to interpret motion.
We are interested in two types of motion:
Actual motion of other objects
Apparent motion caused by the motion of the
agent.
This apparent motion is known as optical
flow, and the vectors that define the
apparent motion are the motion field.
24
Interpreting Motion (2)
The arrows on this photo show the
motion field.
25
Making Use of Vision
What purpose does machine vision really
serve?
It can be used to control mobile agents or
unmanned vehicles such as those sent to
other planets.
Another purpose is to identify objects in
the agent’s environment.
If the agent is to interact with these objects (pick
them up, sit on them, talk to them) it must be able
to recognize that they are there.
26
Face Recognition
Face recognition is an example of a problem that humans
are extremely good at solving, but that computers have
traditionally been very bad at.
Faces must be recognized in varying lighting
conditions, from different angles and distances, and
with other variable elements such as facial hair,
glasses, hats and natural aging.
Methods used in face recognition vary, but many
involve principal component analysis:
Identifying those features that most differentiate
one face from another, and treating those as a
vector which is to be compared.
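A minimal sketch of the principal component idea, using power iteration to find the single direction along which a set of measurement vectors varies most. The tiny two-number "faces", the function names and the fixed iteration count are illustrative assumptions. A practical system would use many more measurements per face and keep several components.

```python
import math

def principal_component(samples, iterations=200):
    """Return (mean, direction) of greatest variance, found by power iteration."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(dim)]
    centred = [[s[k] - mean[k] for k in range(dim)] for s in samples]
    # Start from the centred sample that lies furthest from the mean.
    start = max(centred, key=lambda c: sum(v * v for v in c))
    norm = math.sqrt(sum(v * v for v in start))
    if norm == 0.0:
        return mean, [0.0] * dim  # all samples are identical
    vector = [v / norm for v in start]
    for _ in range(iterations):
        # Apply X^T X to the vector without building the covariance matrix.
        projections = [sum(c[k] * vector[k] for k in range(dim)) for c in centred]
        new = [sum(p * c[k] for p, c in zip(projections, centred)) for k in range(dim)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        vector = [v / norm for v in new]
    return mean, vector

def project(sample, mean, component):
    """Coordinate of a sample along the principal direction."""
    return sum((s - m) * c for s, m, c in zip(sample, mean, component))

# Usage: four hypothetical faces described by two measurements each.
faces = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mean, component = principal_component(faces)
score = project([3.0, 3.0], mean, component)
```

Two faces can then be compared by the distance between their projected coordinates instead of by every raw measurement. The sign of the component is arbitrary, so only differences between scores are meaningful.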