CIS 601 Fall 2003


Introduction to Computer Vision


Longin Jan Latecki

Based on the lectures of Rolf Lakaemper and David Young

Computer Vision?


“Computer vision’s great trick is
extracting descriptions of the world
from pictures or sequences of pictures”

(Forsyth/Ponce: Computer Vision)

Pictures/Movies: How to

Represent
Process / Prepare
Handle
Recognize Objects

Representation

Digital Images
Color Spaces
Gray Images
Binary Images
Geometrical Properties

How to process / prepare:

Filters
Edges
Geometric Primitives
Lines, Circles


Introduction to Image Analysis and Processing

Low Level Object Handling:

Image / Video Compression
Huffman
JPEG
MPEG

JPEG - Joint Photographic Experts Group

JPEG is designed with photographs in mind. It supports 24-bit color (about 16.7 million colors), which is enough for photographic images.

JPEG compresses images in a lossy way. At a low compression level the loss is largely unnoticeable, but at high compression an image can become blurry and show visible block artifacts.
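
The trade-off between compression level and quality can be illustrated with a minimal sketch. It assumes the loss comes from quantizing transform coefficients, which is the step where baseline JPEG discards information; the coefficient values, the step sizes and the function names are made up for illustration and are not taken from the lecture.

```python
# Hypothetical transform coefficients for part of one image block.
COEFFICIENTS = [-415, -30, -61, 27, 56, -20, -2, 0]


def quantize(coefficients, step):
    """Divide each coefficient by the step and round to an integer level."""
    return [round(c / step) for c in coefficients]


def dequantize(levels, step):
    """Approximate the original coefficients from the stored levels."""
    return [level * step for level in levels]


def max_error(coefficients, step):
    """Largest absolute difference after a quantize/dequantize round trip."""
    restored = dequantize(quantize(coefficients, step), step)
    return max(abs(a - b) for a, b in zip(coefficients, restored))


if __name__ == "__main__":
    for step in (1, 4, 40):
        levels = quantize(COEFFICIENTS, step)
        print(step, levels.count(0), max_error(COEFFICIENTS, step))
```

A larger step produces more zero levels, which compress well, at the price of a larger reconstruction error. The error of this round trip never exceeds half the step size.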




BMP - Bitmap Format

BMP uses a pixel map that stores the image line by line. It is a very common format, as it got its start in Windows.

Because the pixel data is usually stored uncompressed, BMP files can become very large.
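
How large a line-by-line pixel map gets can be estimated with a little arithmetic. The sketch below assumes an uncompressed BMP whose rows are padded to a multiple of 4 bytes, with the common 54-byte header (14-byte file header plus 40-byte info header); the function names are illustrative.

```python
BMP_HEADER_BYTES = 54  # assumed: 14-byte file header + 40-byte info header


def bmp_pixel_bytes(width, height, bits_per_pixel=24):
    """Bytes needed for the pixel map, with each row padded to 4 bytes."""
    row_bytes = ((width * bits_per_pixel + 31) // 32) * 4
    return row_bytes * height


def bmp_file_bytes(width, height, bits_per_pixel=24):
    """Approximate file size of an uncompressed BMP without a palette."""
    return BMP_HEADER_BYTES + bmp_pixel_bytes(width, height, bits_per_pixel)


if __name__ == "__main__":
    size = bmp_file_bytes(1024, 768)
    print(f"1024 x 768 at 24 bits per pixel: {size / 2**20:.2f} MiB")
```

Under these assumptions a 1024 x 768 true-color image needs about 2.25 MiB, regardless of what the picture shows.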

Image File Formats

GIF - Graphics Interchange Format

GIF is one of the most popular formats on the Internet, mainly because of its small file size. It is ideal for small navigational icons, for simple diagrams and illustrations where accuracy is required, and for graphics with large blocks of a single color. The format is lossless, meaning the stored image does not get blurry or messy.


The 256-color maximum is sometimes limiting, so there is the option to dither, which means approximating a needed color by mixing pixels of two or more available colors.
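
Dithering can be sketched in a few lines. The example below uses Floyd-Steinberg error diffusion, one common dithering method (the slides do not say which method is meant), and reduces a gray image to a palette of only two values, 0 and 255, rather than a 256-color palette. The function name is illustrative.

```python
def floyd_steinberg(gray):
    """Dither a gray image (rows of values 0..255) to the palette {0, 255}."""
    height, width = len(gray), len(gray[0])
    work = [[float(v) for v in row] for row in gray]  # working copy
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= 128 else 0  # nearest available color
            out[y][x] = new
            err = old - new
            # Spread the rounding error over pixels not yet visited.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


if __name__ == "__main__":
    flat = [[64] * 16 for _ in range(16)]  # uniform dark gray
    result = floyd_steinberg(flat)
    white = sum(v == 255 for row in result for v in row)
    print(f"white pixels: {white} of {16 * 16}")
```

For a uniform dark gray input, roughly a quarter of the output pixels come out white, so the area still looks dark gray from a distance even though only black and white are available.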


GIF uses a technique called LZW compression to reduce the file size of images by finding repeated patterns. This compression never degrades the image quality.
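
The idea of finding repeated patterns without losing information can be sketched as follows. This is a simplified, textbook-style LZW over bytes, not the exact GIF variant: it assumes a fixed initial table of 256 single-byte entries and leaves out the variable-width codes and the clear and end codes that GIF adds.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer codes using a growing pattern table."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate          # keep extending a known pattern
        else:
            codes.append(table[current])
            table[candidate] = next_code  # remember the new pattern
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes; the table is reconstructed on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            entry = previous + previous[:1]  # pattern defined by this very step
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


if __name__ == "__main__":
    row = b"ABABABABABAB"  # e.g. one image row with a repeating pattern
    codes = lzw_compress(row)
    print(len(row), "bytes ->", len(codes), "codes")
    assert lzw_decompress(codes) == row
```

The round trip returns the input unchanged, which is what lossless means here, and repetitive input needs fewer codes than it has bytes.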

GIF can also be animated.


Low Level Object Handling:



Object representation






Low Level Object Handling:



Segmentation






The “bottom-up” approach

These operations fit into a processing scheme strongly associated with David Marr, whose seminal book Vision appeared in 1982.

Marr espoused a principle of least commitment, and proposed a processing scheme involving a series of representations:

Grey level array (the image, in effect)
Raw primal sketch (edges)
Primal sketch (groupings of edges)
Two-and-a-half-D sketch (surface depths and orientations, camera centered)
3-D model (object-centered shapes and relationships).

In some sense, the 3-D model is taken as the goal of the visual processing. It can be used for matching against a database of object shapes to achieve object identification.
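
The first step of this bottom-up scheme, from the grey level array to an edge representation, can be sketched roughly as below. Note the substitution: Marr's raw primal sketch is built from zero-crossings of a smoothed second derivative, while this example only thresholds simple intensity differences between neighbouring pixels. The function name and the threshold value are illustrative assumptions.

```python
def edge_map(gray, threshold):
    """Mark pixels whose right/lower intensity difference reaches the threshold."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    # The last row and column have no right/lower neighbour and stay 0.
    for y in range(height - 1):
        for x in range(width - 1):
            gx = gray[y][x + 1] - gray[y][x]  # horizontal difference
            gy = gray[y + 1][x] - gray[y][x]  # vertical difference
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


if __name__ == "__main__":
    # Dark left half, bright right half: one vertical step edge.
    image = [[0, 0, 200, 200] for _ in range(4)]
    for row in edge_map(image, threshold=100):
        print(row)
```

Later stages in the scheme would group such edge pixels into larger structures; that part is not attempted here.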

But that is not the whole story

A better goal is to produce systems that enable successful interaction with the environment. Interaction may mean, for example:

navigating a robot or autonomous vehicle through obstacles, or along a road;
moving a robot arm to manipulate parts for assembly;
recognizing human gestures and movements for computer control;
identifying images in a database on the basis of their content.


For many applications, a top-down, model-based or hypothesis-driven approach is more successful. In such an approach the system starts from an assumption about what is in front of it, and tests and updates this hypothesis to attempt to match the image data.
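
A toy version of such a test-and-update loop might look as follows. It assumes the hypothesis is simply the position of a known template in a binary image, scores a hypothesis by counting agreeing pixels, and updates it by moving to a better-scoring neighbouring position. The names `match_score` and `refine_hypothesis` and the data are hypothetical, and real model-based systems use far richer models.

```python
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
TEMPLATE = [
    [1, 1],
    [1, 1],
]


def match_score(image, template, row, col):
    """Count pixels where the template placed at (row, col) agrees with the image."""
    score = 0
    for dr, template_row in enumerate(template):
        for dc, expected in enumerate(template_row):
            if image[row + dr][col + dc] == expected:
                score += 1
    return score


def refine_hypothesis(image, template, start):
    """Test the current position hypothesis and move it while the score improves."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    best, best_score = start, match_score(image, template, *start)
    improved = True
    while improved:
        improved = False
        r0, c0 = best
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                r, c = r0 + dr, c0 + dc
                if 0 <= r < rows and 0 <= c < cols:
                    score = match_score(image, template, r, c)
                    if score > best_score:
                        best, best_score = (r, c), score
                        improved = True
    return best, best_score


if __name__ == "__main__":
    print(refine_hypothesis(IMAGE, TEMPLATE, start=(2, 1)))
```

Starting from a guess that overlaps the object, the loop ends at the true position. A guess with no overlap gives the search nothing to improve on, which is one reason the initial assumption matters in this kind of approach.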


Vision is becoming increasingly dynamic. Change and motion are integral to the goals and methods, not simply techniques for recognizing shape or inferring the third dimension. Dynamic vision needs to be predictive and goal-directed.


Biological vision remains the most important inspiration for computer vision. Increasing attention is being paid to the role of foveal vision and eye movements, and computer modeling continues to shed light on how biological visual systems work.

Object Recognition:



Color, Texture, Shape


Object Recognition:




Applications

Character Recognition
Face Recognition
Shape Recognition (Image Databases)
3D Distance Histogram

(MATLAB DEMO)

The Interface (Java Applet)

The Sketchpad: Query by Shape

The First Guess: Different Shape Classes

Selected shape defines query by shape class

Result

Specification of different shape in shape class

Result

Let's go for another shape...

...first guess...

...and final result

Query by Shape, Texture and Keyword

Result