Digital image processing
Introduction
Hardware for image processing Basics
Eye–Human vision sensor
Mars Orbiter depicts Marineris canyon onMars.
This image is obtained using several images with so called stereo
camera.
Course information
Lecturer:Dr Igor Đurović
Number of lectures:3+1
Type of examination:depending on number of students
Literature:book
Other resources:
www.obradaslike.cg.ac.yu
PPT presentations
examples
test images
English textbooks available
etc
Covered topics
Introduction and image acquisition
Human vision, color models and image formation
Color and point transformshistogram
Geometrical transforms
Interpolation
Image in spectral domains
Filtering
Basics of image reconstruction
Edge detection
Basics of image recognition
Other topics
Compression
Digital image protection
Motion picture processing
Stereo images
Superresolution
Computer graphics
etc.
courseMultimedia
systems
out of scope of the
basic digital image
processing course
History
Photography appeared in XIX century.
Ideas for developing fax machines and usage of telegraphic linesfor
image submission was born during the World War I.
The idea about TV development was born around 1930.
The key event in digital image processing was development of the
first electronic computing machines. These machines enabled simple
and fast image processing.
The second important event was astronomical exploration and race
to space.
JPL fromCalifornia had the first task related to the digital image
processing within NASA USA space program.
Image acquisition
Usually we will
assume that source of
radiation is within
visible light frequency
domain but it could be
other electromagnetic
frequency bands (X
rays, gamma rays,
radio waves),
ultrasound, vibrations
etc.
sensorsreceive signals
reflected from the
surface.
source of radiation
normal line
surface
Image Acquisition
Image reflected from the
surface f(x,y)is input in
optical system.
This is nonelectronic part that
consists of lenses and other
similar parts; it can be modeled
as2D linear spaceinvariant
system with impulse response
h(x,y)that can be approximately
known in advance.
senzor transformsoptical
signal toelectrical
continuous time
electrical signal
transformed to digital
Optical
system
SensorDigitaizer
Image Acquisition
f(x,y)can be considered as a “power”of the light(or
some other signal that is subject of visualization).
h(x,y)is impulse response of the optical system.
Output of the optical system is optical signal.
For linear space invariant systems output is given as:
bxyfhxydd(,)(,)(,)=−−
−∞
∞
−∞
∞
zz
ξηξηξη
2D convolution(very similar to
the 1D convolution for linear time
invariant systems)
sinceb(x,y)andf(x,y)represent
optical power they are nonnegative
Image Acquisition
Optical system can deform image.
Companies producing systems can estimate distortion
caused byh(x,y) and they can develop techniques to
reduce effects of distortion.
The second important element in optical system is sensor
that transforms optical signal into electrical equivalent.
We will not study sensors since they are subject to quite
fast development and we will not consider it within this
course.
Some details related to sensors are given in our textbook
but development in this area is quite fast and some of
them can already be too old.
Digitizer
Analog electric equivalenti(x,y) is transformed to the
digital version within two procedures:
Sampling
Quantization
i(x,y)
1 2 3 4 5 6 7
1
2
3
4
5
6
7
Based on image context
in point or in area value
of the luminance for
image pixeli(1,1) is
determined
Pixel(picture element) is elementary point of the image. The fact
that eye considers numerous close dots as a continual image is used.
Digitizer
Sampling phase is followed withdigitalization
(digitalization is performed by using quantization to
closest value of the quantization steps multiple).
This integer can easily be represented by binary
number.
Number of quantization levels is commonly2k
wherek
is integer.
i(n1,for fixedn2)
n1
Δ
Most important systems
Photography.
Discovered in XIX century.
Not producing electrical
equivalent.
Based on the chemical
process.
It has still the best
resolution among all
sensors.
It is little by little removed
from the market but
technological advance still
exists.
Quality improvement in the
last 5 years is larger than
in the last 50 years.
Fax.
Developed in some variant
at beginning of the XX
century.
Digitalization of black points
on the paper and coded
submission.
Standardized.
Image Digitizers
Parts
Sampling apparatus
Scanning mechanical parts
Source of “light”
Quantizier(commonlyAD converter)
Memory medium
Important characteristics
Pixel size
Image size
What is transformed to image(for visible objects it is its transmittance)
Linearity
Scanning technologies:
SCANIN–source of light is
moving through the image part
by part
SCANOUT–entire image is
illuminated while sampling is
performed part by part
Check in the textbook and on the Internet
for more details on digitalization technology
Most important systems
TV.
Discovered around 1930.
Camera (previouslyvidicon
–tube camera) in TV studio
records images (frames).
Images are transmitted
through communication
channel.
Images are presented on
the TV screen.
Eye can see only 24 images
in second and TV image
seems to looks as
continuous image.
In our country TV has 25 images per
second.
Reason: In the case of presenting 24 per
second due to nonsynchronization
between eye and TV we would see
shaken images.
25 appropriate for Europe since it is½
of frequency in power electrical network.
Standards for analog TV:
PAL (used in our country and numerous
other countries in the world 25
images/second)
SECAM (used in France and francophone
countries, Iran etc)
NTSC (US standard with 30 frames per
second; US power network has 60Hz)
TV
Digital TV (DVB) is in rapid progress presently.
Digital TV standard is called the HDTV.
Since there is rapid development it can happen that
other standards emerge.
High compression lower quality formats are
developed for video streaming over the Internet.
Development of new generations of digital video
discs requires also new high quality video and image
formats.
Image in nonvisible spectrum
Detailsrelated to image acquisition for nonvisible spectrum are given in
textbook. The most important types of nonvisible spectrum used in
digital image are:
Gamma rays(medicine, astronomy etc).
Xrays(medicine, electronic industry, astronomy)
Ultraviolet spectrum(lithography, industrial inspection, microscopy, lasers,
biological imaging, astronomy)
Infrared spectrum(the same application area as in visible spectrum plus
food industry, satellite observations, meteorology, seismology, industrial
inspection)
Radio waves(medicine, astronomy)
It is possible to visualize signals that are not produced with
electromagnetic rays such as for examplevibrationsin seismology.
Computers in image processing
Revolution in development and application of the digital
image processing algorithms begins with development of fast
and reliable image processing machines.
Development of new devices for image processing is very
rapid and data given in the textbookjust couple of years old
can be assumed just as historical overview.
Commonly PCs used for general purpose have special
graphic card(its duty is to process graphical data and to
submit them to monitor).
This card can contain processor used for data processing as
well as the memory.
The main purpose of the processoris usuallydata
compressionwhilememorycommonly has special
constructionto allow processing of huge amount of data.
Computers in image processing
The main problem in realization of these cards is access to
other computer resources(data bus and communication
with computer processor).
There are several methods for solving this problem.
In addition, there could be additional cards for video data
processing.
Also, there are rather expensive special purpose machines
–graphicalstations.
video data requires large amount of
memory for example each pixel of
monitor requires 3 bytes of memory
multiplied with number of pixels and
number of frames persecond that is
larger than in the case of TV.
Printers
Printers are developed for giving images on paper.
There are numerous types of printers. Here, we will give some
data about the most important ones:
Matrix printers(inked ribbon pressed on the paper with needles
called pins).
Line printers(similar technology like in the matrix printers but with
ability to print entire line at once)
Laser andLED printers(electro photography based technology
described in 1938 but applied in 1980byCanon). This technology as
one of the most important is described in details on next coupleof
slides.
Inkjet andbubble jet printers(one or more boxes with ink is
positioned above the paper and appropriate amount of color is puton
the paperby adjusting piezoelectric device in inkjet printers and by
heater in the bubble jet printers).
Laser and LED printers
1.Laser printers have cylinder made of special glass
with uniform amount of electricity.
2.This cylinder is lightenedselectively according to
the page that should be printed.
3.Illuminated partsof cylinder remain without
electricity.
4.Then we are putting toner on cylinder.
5.Toner iskeptonly on places with electricity.
6.At the image is formed on the paper creating
permanent copy(it remains on the paper due to
the electrostatic effect).
Laser andLED printers
Laser beam source
Lens system
Photoconductive material
For illuminating photoconductive
material in the LEDT printers LED
diodes are used. The LED
technology has ability to print
entire line at once.
Printing
Laser printers print material in single color and due to
the imperfection of human visual system we see
different amounts of the same color as different
colors.
There are numerous alternative printing schemes and
apparatus.
Two the most important characteristics of printers
aredpi=dots per inch(determines printer resolutions
and how fine is our printed page;600dpi is today
assumed as reasonable minimum) andppm=page
per minute.
Displays
Displays are devices for displaying images.
We will not consider here so called
permanent displays that produce permanent
images(for example Xerox machines).
Under displays we assume:
Computer monitors,
Medical devices for displaying images,
Projectors,
etc.
Monitors
Three basic technologies:
CRTcathode ray tube technology based on physical properties
of illuminated phosphorous
LCDliquide crystal displays
PDPplasma display panel(with special purpose gas on high
voltage).
CRT monitors have numerous disadvantages (they are
large, energy demanding, different problems with
image presentation (shaking etc); not usable in
military applications etc.)
However, they still have important advantages: cheap;
produced by numerous companies; allow looking
under largeangles; etc.
HowCRT monitorworks?
Image in the CRT monitors are results of the phosphorous
coatinginside of the cathode ray tube.
Phosphorous hit by electrons produce illumination.
For each pixel we have phosphorous dots: red, greenand
blue(RGB).
Their combination produces corresponding colors.
On the other end of the cathode ray tube (it is narrower
part) we have electronic cannon that is sending electrons
toward the phosphorous. This cannon consists of cathode,
device for heating and elements for focusing.
Focusing and directing electron rays toward the
phosphorous coating is performed using the strong
electromagnetic field produced using theVN cascade.
Properties of the CRT monitors?
Three the most important properties of the CRTs are:
Diagonal length
Resolution
Refreshing rate
Usually we are giving diagonal in
inches1”=2.54cm
for CRT monitors ratio
width/height=4/3
Parts of monitor close to limits are
not usable in the CRT monitors
4
3
Resolution is
number of pixels
in image
typical1600x1200
How many images
is presented in the
screen in1sec
Refreshing and resolution
•Today we can adjust number of pixels that will be turned on
(resolution adjustment) up to some maximal value.
•We can adjust refreshing(number of semiframesper second,
70Hz means35 frames/s, i.e., 70 semiimages).
•For 1 pixelwe need data about three colors (each color
represented with k bits).
•Letamount of memorythat is available for onscreen
presentation is Wper second.
•If resolutionisMxNpixels maximal refreshing rate is
v=2W/(3kMN). Why?
Lowpasspattern
Numerous close pixels on small distance are
recognized as continuous shade and we are unable to
recognize separate pixels.
One pixel can be seen inthe following manner.
Pixel region
Due to the numerous reasons
(imprefection of the monitor,
imperfection of eyes and other
reasons) we cannot see pixel in
ideal manner.
pixel enlarged
more than 100
times
Lowpasspattern
Luminance of singe pixel is modeled as:
Aexp((x2+y2
)/R2
) , A is maximal luminance, Rdepends on
monitor quality.
What happens when pixels with the same luminance are in
neighborhood?
d
d
2
dis distance
between pixels
x
y
x
dd
resulting luminance is
not “flat”
luminance of
single pixel
yaxis is
neglected for
brevity
Lowpasspattern
Monitors have good lowpasspattern when
oscillations in luminance are as small as possible.
Details related to the lowpasspattern optimization
are given in textbook.
Highpasspattern
Highpasspattern is formed with alternating lines in
different colors.
Due to the previously described
imperfections changes between colors
are not abrupt and in the case of bad
monitors it is smooth or blurred.
Details related to highpqsspattern can
be found in the textbook.
Why this pattern is called highpass?
With the same purposecheckboard patternformed as the
chess board is introduced.
Cameras
Analogous(vidicon camerabased on the
photomultiplicativetube)
Svjetlost
Primarni
elektroni
Sekundarni
elektroni
0V
Polutransparentna
fotokatoda
+100V
+200V
+300V
+400V
+500V
+600V
R
+700V
I
Light photons hit
the
photocathode
that release
primary
electrons. These
electrons
accelerated by
electric field hit
next cathodes
causing
avalanche effect
with amplified
number of
electrons.
Output of the system isvideo signal, i.e., electrical equivalent
(significantly amplified) of light signal.
Semitransparent
photocathode
Primary
electrons
Secondary
electrons
light
Digital cameras
Analogous cameras have several advantages but
cheaper and cheaper digital video cameras are in
use.
There are two types of digital cameras:
CCD(Charged Coupled Device)sensors
CID(Charge Injection Device)sensors
CCD sensors are dominant(due to price reasons) and
only they will be analyzed here.
Details related to the CID sensors can be found in the
textbook.
Digital cameras
The main part of this camera is photodiode field:
Photodiode field is made on the single chip. Diodes works
as condensatorsin “light integrating regime”. Opening of
MOS switches causes flow of electricity that is proportional
to the charge integrated on diodes.
Photodiodes
MOS switches
Digital cameras
This electricity is proportional to luminance within
given period of integration.
This is similar(with little bit higher inertia) to the
vidicon.
There are problems with reading data from this
sensors.
There are three reading strategy.
Pomjeraj
linije
Pomjeraj
linije
Pomjeraj
linije
Pomjeraj pikselaPomjeraj pikselaPomjeraj piksela
Serijski registarSerijski registarSerijski registar
Full frame CCDInterline Transfer CCDFrame Transfer CC
D
Full frame
oblast
Oblast slike
Oblast za
smije{tanje
line shift
line shift
line shift
pixel shift
pixel shift
pixel shift
Image region
Storage region
Reading strategies forCCD
Full frame transfer
Assume that photodiode filed is matrix with single row
masked. This is row not used for light integration. Formed
image is moved row by row to the masked row and from
masked row is read using shift register.
Advantages:Simple hardware and large percentage of the
photodiode field used for image acquisition.
Drawbacks:Slowness.
Reading strategies forCCD
Interline transfer
Each even row is masked. After integration interval
integrated pixels are moved to masked row and information
is read from these rows while new acquisition can begin
from nonmasked rows.
Advantage:Fastness.
Drawback:Low percentage of used diode field for acquisition.
Check textbook for details on the third strategy.
In the case of color images acquisition there are color
filters that allow only single basic color for one pixel
and combination of images from diodes with various
filters form the color image.
Basic colors will be discussed later.
Human eye
Eyeisoneoffivehumansenses.
5/6 of all information we are getting using theeyes.
Work of eyes is similar to other sensors.
1. posterior compartment
2. oraserrata
3. ciliarymuscle
4. ciliaryzonules
5. canal of Schlemm
6. pupil
7. anterior chamber
8. cornea
9. iris
10. lens cortex
11. lens nucleus
12. ciliaryprocess
13. conjunctiva
14. inferior oblique muscule
15. inferior rectus muscule
16. medial rectus muscle
17. retinal arteries and veins
18. optic disc
19. duramater
20. central retinal artery
21. central retinal vein
22. optical nerve
23. vorticosevein
24. bulbar sheath
25. macula
26. fovea
27. sclera
28. choroid
29. superior rectus muscule
30. retina
“Steps in human vision”
Light is coming toiris.
Photosensitive musclesadjust size of iris and in this
way it is regulated amount of light allowed in eye.
Light is coming to thelens.
Other group of musclesadjust lens in order to give
appropriate focusing of image.
Light passes through the posterior compartment
(glass tissue).
Light is coming to the retina.
Light should come to the exact position on the
retina calledyellow spotthat has size about 1mm2.
“Steps in human vision”
On retina we have human visual cells:
rods(about125 million used for night vision)
cones(about5.5 millionused for color daylight vision –what do
you think why they have prismatic shape?)
Light is transformed in visual cells to electric impulses.
Number of visual cells decreases with moving away from
the yellow spot
On relatively small distance from the yellow spot is so
calledblind spot. It is position where optical nerveexits.
This nerve is connected with the brain.
Optical nerve is connected with visual cells with ganglions.
This nerve “integrate”outputs of visual cells.
“Steps”in vision
Optical nerve is rather long (about 6m) and it connects
eye with the very sensitive part of the brain called
(visual) cortex.
Our brains reconstruct object images.
We are looking with eyes but we see with brain!!!
Human visual system is inertial. We are unable to
create image immediately after abrupt luminance
change.
Transport of optical information through the optical
nerve is relatively slow (comparing to “wired
connections”in machines).
This leads to eye persistence–we are able to see 24
images in second.
Problems in eye
Problems in focusing light in the eye leads to missing
yellow spot–then we have two known drawbacks–
shortsightednessandlongsightedness.
Due to slower reaction of the muscles that manage the
size of eye lenses special type of longsightedness
common for older people can develop.
Problem with development of one of types of cone cells
can follow to daltonism.
There are other very danger deceases trachoma,
conjunctivitis, cataract.
Different other problems within eyes including physical
damage can cause problems in vision.
Color model for eye
There are 3 types of cone cells sensitive to 3 different
colors.
λ[nm]wavelength
≈650nm≈580nm
≈460nm
response of cones that react to:
bluegreenred color
There are smaller number of
cells sensitive to blue than to
other two basic colors. It
follows to smaller sensitivity
to this color (this is probably
caused by human evolution).
Combination of response from these three groups of cone cells
gives our color vision.
Color model for eye
During the night our cells for blackwhite vision are more active
but they are not loosing entirely information related to colors.
Humans are able to distinguish about 1 million of colors
(sensitivity to colors and colors that can be distinguished
varying between humans) but levels of gray are able to see only
about 4080.
Within the next week we will consider different color models for
digital images
Within next lecture we will consider color models for digital
image but we will also review some additional important
features of eyes.
ExerciseNo 1
Sensitivity to grayscale
The procedure will be performed as follows
We will create image with16x16 pixels with256 shades of gray
Then this image will be shown
Try to detect different shades
MATLAB program
clear
k=0;
for m=1:16
for n=1:16
A(m,n)=k;
k=k+1;
end
end
pcolor(A),colormap(gray(256)),shading interp
ExerciseNo 1
Obtained image
Question:
How many different
grayscale you can
observe?
ExerciseNo 2
Create binary image with chessboard pattern.
MATLAB code:
[m,n]=meshgrid(1:100,1:100);
r=rem(m+n,2);
imshow(r)
Adjust image to 100x100
monitor pixels
What can you conclude
about quality of your monitor
for highpassimages?
For selfexercise
Inform yourselves about recent development in the
CCD sensors.
Consider work of some scanner. Detect basic parts.
Consider construction of own scanner. Try to find
data about required parts.
Inform yourselves about work of the LCD and plasma
displays.
Analyze prices and performance of videoprojectors.
For selfexercise
Find data about various types of printing machines,
especially about construction and parts.
Find data about modern computer cards form
graphics and video processing, performance and
construction.
Find data about devices for medical imaging.
Digital Image Processing
EYE –Human Vision Sensor
Color Models
Eye sensitivity
Eye is more sensitive to lower frequencies with red
colors than to higher frequencies with blue colors.
During the daylight the eye is the most sensitive to
wavelength555μm(close to green).
With decrease of the luminance the frequency of the
most sensitive point decreases and it is approaching
toward the dark red colors(wavelength increases).
Eye has ability to adopt itself to the luminance level.
In order to have an impression of looking to the
same object it is just required to keep constant ratio
between object luminance and total luminance.
Eye sensitivity
Eye ability to detect luminance variations depend on
total luminance of the object.
For bright objects we need larger variation in
luminance in order to be able to detect it while for
dark object it is smaller.
Ratio between detectable variation of luminance with
total luminance is constant for a particular eye.
There is a threshold in detection of noise
(disturbances) in image.
Threshold level increases with luminance and it is
more difficult to detect noise in bright than in dark
images.
Eye sensitivity
Sensitivity to contrast assumes ability of eye to
distinguish certain number of different luminance
levels.
This sensitivity decreases with increasing of noise
level.
Noise is more difficult for detection in images with
large number of details than in image without details.
Noise correlated with image can be detected in easier
manner than noncorrelated noise.
Noise in motion (video) sequences causes fatigue of
spectators.
History of color models
History of color models is very long.
Aristotelthought that colors are rays coming from
havens.
Aguiloniusand SigfridForsiusconsidered color model
during renaissance. The former tried to construct the
color model based on colors appearing in nature from
daybreak to the dusk.
Newtondiscovered using the prism that white color
consists of all other colors.
Gothedefined color model based on equalsided
triangle (philosophical considerations).
History of color models
Rungeconstructed revolutionary color model based
on sphere and three coordinates“colorness,
whitenessanddarkness”.
Maxwellconsidered color models and he came to the
Gothetriangle with red, greenandbluecolors at the
corners.
In 1931.the International Comity for LightCIE
(Commission Internationalede l'Eclairage)
standardizedRGBcolor model.
Color model and image
All colors that can be seen by humans can be described
with combination of three images coming from our cone
cells in our eyessensitive to blue,greenandred color.
Then all colors can be represented as a vector with three
components color=[r,g,b].
Note that T. Younghas found in 1802.that combination of
three independent colors (!?) can produce all visible colors.
This model is called three channel model where channels
are shades of particular colors in the model.
RGB cube
The simplest method for representing colors is RGB cube in
Cartesian coordinates.
Coordinates correspond to three basic colors: red, greenand
blue.
Maximal luminance is 1 in any of colors while minimal is 0.
R
G
B
(1,0,0)
(0,1,0)
(0,0,1)
(0,1,1) Cian –lack of red
(1,0,1) Magenta
lack of green
(1,1,0) Yellow
lack of blue
RGB cube
Black=(0,0,0)
White=(1,1,1)
On the main diagonal r=g=bare grayscale (achromatic)
colors.
Achromatic colors can be described by single coordinate only
(luminance).
Example: Section of the color
cube with the plane
0.5r+0.3g+0.2b=1.
RGB model
RGB model is used in computer and TV monitors.
For example classical cathode ray tube monitors have
phosphorous grains that bombarded with electrons produce
red, greenandbluecolors.
RGB model is called additive modelsince resulting color is
obtained as a“sum”of three basic colors.
Based on the Newton experiment the white color (the
highest luminance) is obtained with full luminance in all
basic colors.
Memory representation of the RGB model data requires
MxN (pixels) x 3(colors) x k(bitsfor representing each
color).
Commonly k=8but there are alternative models.
Color table
The RGB model is memory demanding.
There are several techniques for reducing memory
requirements. The simplest is to reduce number of
bits used for representing blue color since humans
have the smallest sensitivity to this color.
Other, more sophisticated, methods are based on
the colormaporCLUT –color lookup table.
Any pixel is coded with rbits.
Numbers from0 to2r1represent one of colors
from the RGB model.
Color table
Information about color coding are given in the
special table(CLUT) that can be recorded together
with image.
The most common type of the CLUT is:
b2
r
1
b2
r
2
...b3
b2
b1
b0
g2
r
1
g2
r
2
...g3
g2
g1
g0
r2
r
1
r2
r
2
...r3
r2
r1
r0
2r12r2...3210
color code
amountsof
red, greenand
bluecolors for
a coded color
Color table
What to do with colors from the RGB model that exist in
image but are not given within the CLUT?
The most commonly these colors are replaced with some color
from the CLUT that is based on some criterion the “closest”to
desired color.
Memory requirements of the CLUTbased representation are:
MxNxrbitsfor image representation+ (r+3x8)2r
bitsfor the
CLUT.
There are alternatives. For example, instead of the color table
we can record and submit information which table is selected
since decoder have the same set of the CLUTs.
There are several color tables commonly used in practice.
Printing color models
While monitor and projectors produce colors by adding
basis colors it is not the case with printing machines.
Everybody knows that combination of colors on the paper
produces very dark close to black color.
The reason is the fact that we are receiving from the paper
reflected part of luminance.
Paper is white until we do not add some color and on this
way we reduce some part of reflected luminance.
On this way by adding colors on the paper we are getting
darker and darker image.
Printing color models
To conclude if we are adding colors we will
obtain darker image while not adding color
produces brighter image.
From this reason color model for printing are
called subtractive.
Colors for printing can be formed in the CMY
(cianmagentayellow) model.
All basic colors represent lack of some of
colors from the RBG models.
check their position in the color cube
CMY model
Relationship between colors from the CMY and RGB models:
C=1RM=1GY=1B
Problem in the CMY model is inability to reproduce large
number of colors.
It is difficult to reproduce fluorescent and similar colors and
the most importantly the black!!!
Approximately just about 1 million of colors is possible to
reproduce with good quality.
The main problem is black since it is very important for
human(important for image recognition, edges of shapes,
etc).
Reason: mix cyan, magenta and yellow and you will get dark
but not black color (or you will use huge amount of color
what is not economical).
CMYK model
Printing machines usually print images in the CMYK model
(4channel model). K means blacK sinceBis reserved for
Blue.
Relationship betweenCMY andCMYK models:
K=min(C,M,Y)
C’=CK
M’=MK
Y’=YK
CMY i CMYK model –comparison
CMYK image
CMY image
Other printing color models
The CMYK can produce significantly larger number of colors
than the CMY model.
This is very important especially for some important colors
such as for example black.
Unfortunately, numerous colors cannot be reproduced in such
manner. It motivated research in design newer and better
printing color models.
The most important group of models are so calledHIFI
models.
These models use (it is still not subject to standardization)6
or 7 printing colors(channels).
Additional channels are:green, orangei violet.
RGB→GRAY
Sometimes it is required to transform RGB color space to
grayscale.
Since grayscaleare on main diagonal of the color cube we
can simply use corresponding projection on this diagonal
consider as a grayscale variant of used color:
GRAY=(R2+G2+B2)1/2
Square root is numerically demanding
operation.
Sometimes it is assumed that obtained
results are not realistic with respect to
image colors.
Result should be quantized
to proper number of color
levels.
RGB→GRAY
The most common RGB→GRAYconverter is mean
value of channels:
GRAY=(R+G+B)/3
Even simpler techniques such as considering some of
channels (commonly red or green) as a grayscale are
sometimes used: GRAY=R GRAY=G.
Blue channel is rarely used since it is assumed to be
nonrealistic.
There are also some alternative schemes some of
them presented latter within this lecture.
GRAY→RGB will be taught at the end of course when
we present story about pseudo coloring.
Binary image
Binary image has only two colors.
Usually black and white are used.
White can be represented as 1 while black is0.
Usually this image is obtained from the grayscale by
comparing the grayscale with threshold:
1(,)
(,)
0(,)
gmnT
bmn
gmnT
≥
⎧
=
⎨
<
⎩
binary image
grayscale image
threshold
Binary image is used in industry application,edge detectionetc. Threshold selection
will be discussed lately.
Three channel model
In Cartesian coordinates any color can be represented
using the vector with three coordinates(R,G,B).
However, T. Young (1802.) has concluded that color
can be determined using3 independent information
(threeindependent coordinates). It is quite similar to
the RGB model.
However, instead of using Cartesian coordinates we can
also use polar or sphere coordinates and develop
corresponding color space.
Some other color models are also available in practice.
An overview of these color models that are different
from the RGB and CMYK follows.
Three channels models
Assume that we have three independent basic colors
(c1,c2,c3).
All other colors can be represented as a vector
(C1,C2,C3) where Ci
corresponds to the amount of the
basic color ci.
Chromais defined as:
123
,(1,2,3)
i
i
C
hi
CCC
==
++
An alternative for memorizing of colors is color space(h1,h2,Y)
where Y=C1+C2+C3 is total luminance. This procedure is used in
development of numerous color models.
CIE color models
International Commission on Illumination CIE
developed in1931 theRGB model (we call it theRGB
CIEsince it is different from the computer model).
RCIE
corresponds to wavelengths700nm, GCIE
λ=546.1nmand BCIE
λ=435.8nm.
Referent white for this model is RCIE=GCIE=BCIE=1.
RGB CIE model does not contain all colors that can
be reproduced and from this reason it is defined via
linear transformation model called the XYZ model
that can represent all visible colors.
RGB CIE and XYZ CIE
relationships
XYZ are RGB linearlydependent and their transform
can be given as:
0.4900.3100.200
0.1770.8120.011
0.0000.0100.990
X
R
YG
Z
B
⎡⎤⎡⎤⎡⎤
⎢⎥⎢⎥⎢⎥
=
⎢⎥⎢⎥⎢⎥
⎢⎥⎢⎥⎢⎥
⎣⎦⎣⎦⎣⎦
X, Y, Z should be represent
wavelegth on which the cone and
rode cells are the most sensitive.
Y corresponds to the most
sensitive wavelength for rode cells.
Chrominent components can
be defined as:
x=X/(X+Y+Z)
y=Y/(X+Y+Z)
X=Y=Z=1 referent white
CIE chromatic diagram
Illuminance of numerous light
sources is still given with (x,y)
chromatic diagram. Then this
diagram is widely used for
illuminancesystem design.
Problem in this diagram is the
fact that there are color areas of
eliptic shape within diagram that
cannot be visible byhumens.
“Computer”and CIE RGB models
Current monitor RGB model is developed as a
recommendation of the NTSC(National Television
Systems Commitee).
Relationship between RGB CIE andRGB NTSC is linear
and it can be given using the transformation matrix:
1.1670.1460.151
0.1140.7530.159
0.0010.0591.128
CIE
CIE
CIE
RR
GG
BB
−−
⎡⎤⎡⎤⎡⎤
⎢⎥⎢⎥⎢⎥
=
⎢⎥⎢⎥⎢⎥
⎢⎥⎢⎥⎢⎥
−
⎣⎦⎣⎦⎣⎦
Modification of the XYZ model
Three modifications are used for overcoming
drawbacks of the XYZ model:
UCS model (model with uniform chromatic scale)
UVW model
U=2X/3 V=Y W=(X+3Y+Z)/2
4
153
X
u
X
YZ
=
++
6
153
X
v
X
YZ
=
++
luminance is the same as in the XYZmodel
Modification of the XYZ model
U*V*W*
model is formed in such manner that referent
white be in origin
*1/3
25(100)17,0.011WYY=−≤≤
**
0
13()UWuu=−
**
0
13()VWvv=−
(u0,v0)are coordinates of
referent white color
Colorimetry
Colorimetryis scientific area specialized for color
comparison.
For example in industry we can have some process that is
done when color of some object is the same or close to
some color known in advance.
Assume that current color in the RGB model is(R
1,G1,B1
)
while the target color is(R2,G2,B
2).Distance between these
two colors can be described as:
222
121212
()()()
R
RGGBB−+−+−
Euclidian distance but some alternative
distances are also used.
Unfortunately, distance defined in this manner for RGB model does not produce
reliable results since similar colors could produce large distance and quite
different color relatively small distance.
Colorimetry
All models with linear dependency to the RGB suffers
from the same problem as the RGB.
Therefore, for colorimetryapplications it is defined
theLabcolor space.
Labmodel can be defined in various manners but
here we adopted definition based on the XYZmodel:
*
3
0
16116/
L
YY=−+
*
33
00
500[//]aXXYY=−
*
33
00
200[//]bYYZZ=−
luminancecomponent
a*>0 meansredwhilea
*<0 means green
b*>0 means blue whileb
*<0 means yellow
Colorimetry
(X
0,Y0,Z0)is referent white(almost always it is
(1,1,1)).
Euclidian distance in the Lab coordinates is assumed
good measure of color difference.
However, there are alternative approaches for
defining color difference measures.
HSL and related models
As we have already seen:
All colors can be represented using three independent colors;
Humans have two types of vision: night vision based on luminance
and daylight vision based on colors;
Colors can be represented using Cartesian coordinates (RGB cube)
but also they can be represented using spherical and polar
coordinates;
Before development of the color TV existed the blackwhite
(grayscale) TV. It was important to develop new system and that
people with old TV sets could follow new signal with the same
functionality as they did with the old one.
Numerous color models are developed for these purposes and
we will describe only probably the most popular theHSL.
HSL color model
H–
hue
S–
saturation
L
luminosity
H is represented with angle. In the HSL model angles
between0 and240 represent colors that can be seen by
humans while angles between240360 are UV colors.
Procedure for transformation of the RGBmodel to the HSL
model is:
Step1.Coordinate transform.
1
[2]
6
HS
x
RGB=−−
1
[]
3
L
RGB
=
++
1
[]
2
HS
y
GB
=
−
RGB→HSL
Step2.FromCartesian coordinates (
x
HS
,
y
HS
)to polar
coordinates(radius is measure of saturation while angle is
measure of hue)
Obtained coordinate system(φ,ρ,
L
) corresponds to theHSL
but commonly several additional operations are performed.
Step3.Normalized saturation.
22
HSHS
x
y
ρ
=+
(,)
HSHS
xy
=
∠
φ
max
3min(,,)3
11min(,,)
RGB
SRGB
RGBL
ρ
ρ
==−=−
++
RGB→HSL
Step4.Additional processing of angle(hue).
Step 5.Final relationship forH:
θ=
−+−
−+−−
L
N
M
M
O
Q
P
P
arccos
.[()()]
()()()
05
2
RGRB
RGRBGB
H
GB
GB
=
≥
−≤
R
S
T
θ
πθ2
HSL model
Similarly it can
be performed
HSL→RGB
transformation.
Determine this
transform for
selfexercise
and with usage
of the
textbook.
Popular
presentation of
the HSL color
model.
White
Red
Yellow
Blue
Black
Color models for video signal
They are similar to the HSL and similar relevant models.
Here, we have single component that corresponds to the
grayscale(so called blackwhite) image due to the
backward compatibility with older models for the TF
signal.
NTSC uses theYIQcolor model where intensity is given
as:
Y=0.2999R+0.587G+0.114Bwith chrominent
components given as:
IRYBYRGB=−−−=−−cos.)sin.()...300877(300493056902710322
QRYBYRGB=−−−=−−sin.)cos.()...300877(300493021105220311
Color models for video signals
PAL color model(YUV–Y is the same as in theNTSC):
SECAM color model (YDrDb–Y is the same as in the
NTSC):
U
B
R
G
B
Y
=
−
−
=
−
0463014702890493....()
V
RRGR
Y
=
−
−
=
−
0615051501000877(....)
DBRGBY
b
=
−
−
=
−
1333045008831505....()
DBRGRY
r
=
−
+
=
−
0217013311161902(....)
These models are quite similar to the HSL.
Exercise No.1
Realize relationship between RGB CIE and standard RGB model.
Here, we will realize several aspects of the problem:
We will create matrix that produces inverse transform from the RGBCIE to theRGB model.
We will determine limits in which we should perform discretizationof the RGB CIE model.
Visualize channels of image forRGB andRGBCIE models.
»A=[1.167 0.146 0.151;
0.114 0.753 0.159;
0.001 0.059 1.128];
»B=inv(A)
B=
0.8417 0.1561 0.0907
0.1290 1.3189 0.2032
0.0075 0.0688 0.8972
ExerciseNo.1
Limits of the RGB model are usually 0 and1 along all coordinates but they are
different for the RGB CIE model.
Minimum of the R component in the CIE model is obtained forR=0, G=1, B=1 and it is
equal to–0.297 while the maximum is produced withR=1, G=0, B=0 and it exhibits
1.167.
Minimum of G in the CIE model follows forR=G=B=0 and it exhibits0, while
maximum follows forR=G=B=1 and it exhibits1.026.
B component achieves maximum forR=0, G=B=1 and it exhibits1.186 while the
minimum is produced forR=1, G=B=0 and it exhibits0.001.
For visualization of color channels we can use the following commands:
a=double(imread('spep.jpg'));
b(:,:,1)=1.167*a(:,:,1)0.146*a(:,:,2)0.151*a(:,:,3);
b(:,:,2)=0.114*a(:,:,1)+0.753*a(:,:,2)+0.159*a(:,:,3);
b(:,:,3)=0.001*a(:,:,1)+0.059*a(:,:,2)+1.128*a(:,:,3);
Channels can be represented with commands as the follows:
pcolor(flipud(b(:,:,1)),shading interp
For selfexercise
List of miniprojects and tasks for selfexercise:
1.
Solve problems from the textbook.
2.
Realize all color models given on these slides and in textbook
and create transformations between them. Visualize channels
for considered models.
3.
Consider the following experiment. Assume that colors that
can not be printed in the CMYK modelhave any of channels
except black channelwhen it is represented with more than
90% of maximal value. Assume that in addition we have color
space with three alternative colors(for examplerose, green
andorange). Colors can be printed in CMYK or in alternative
model with appropriate amount of black. Rules for printing in
the corresponding model are the same as for theCMY (it is
possible to print up to 90%of any color). How many colors
from the RGB model is possible to print the CMYK color and
how many in model with three additional alternative colors.
For selfexercised
List of miniprojectsand tasks for selfexercise:
4.
Create introduced color models and transformations between
these color models and RGB. Make images of cakes in process
of baking or some other similar kitchen experiment. The main
results of the first set of experiments should be: average
color of cake several minutes before we assume that it is
done and average color when cake is done. The second set of
experiments is performed after that. Try to determine
algorithm for automatic turning off the baking appliance
based on the first set of experiments and check if it is
performing well for the second set of experiments. Determine
number of correct and wrong results.
Digital Image Processing
Histogram
Point operation
Geometricaltransforms
Interpolation
Histogram
Histogramis simple (but
very useful) image statistics.
H(X)=number of pixels with
luminanceX
Sum histogram values of image for all luminances(here
grayscale image with bi bits/pixel is considered) is
equal to number of pixels in image.
NMXH
X
×=
∑
=
255
0
)(
Example of histogram
Histogram has numerous applications. It is very useful in techniques that
use probabilistic model of image with probability density function of image
luminance (or at least estimation of the pdf). How to connect histogram
and probability density function?
Histogram –Common shapes
L
P
H
()
P
L
P
H
()
P
L
P
H
()
P
unipolar histograms
for dark and bright
images
bipolar histogram can be used for
threshold determination and
obtaining of binary images(how?)
Histogram extension
Optical sensors very often concentrate image in very
narrow region of luminance.
Then software systems are usually employed to solve
this problem.
Problems are solved by using information obtained
using histogram.
Let image be contained into luminance domain of
[A,B] (estimation of AandBcan be performed using
histogram).
Then assume that we want to extend the histogram
over entire 8bit grayscale domain[0,255]:
255255
()
A
fXX
BABA
=−
−
−
Luminance of
originalimage
luminance of image
with extended
histogram
Histogram ExtensionExample
original
after
histogram
extension
original
histogram
histogram
after
operation
Histogram equalization
Histogram equalization is also one of the
most common histogram operations.
In equalized image histogram is
approximately flat, i.e., the goal is to have
image with approximately the same number
of pixels for all luminances.
Then, we want to transform histogram to be
approximately uniform.
Images with equalized histogram have good
contrast and it is the main reason for
performing this operation.
Histogram Equalization
L
P
H
()
P
originalhistogram
L P
H(P)
goalequalizedhistogram
This can be considered as the following problem: There is a random
variable with probability density functionfx(x)(it can be estimated
using the histogram of original image). We are looking for transform
y=g(x)producing probability density functionfy(y).In this case it is
proportional to the equalized histogram.
Histogram equalization
From probability theory follows:
where(x1,x2,...,xN)are real roots of equationy=g(x).
Assume that solution is unique (it is possible for
monotone functions g(x)).
12
12
()()()
()...

'()

'()

'()

N
xxxN
y
x
xxxxx
fxfxfx
fy
gxgxgx
===
=+++
1
1
()
()

'()

x
y
x
x
fx
fy
gx
=
=
Histogram Equalization
Sincef
y
(y) is constant it means thatg’(x
1) is
proportional to fx(x1
).
Assume that g(x) is monotone increasing function it
means g’(x)=c fx(x).
Selectc=1 (it means that output image would have
the same luminance domain as input one) :
constant value
11
()()
x
x
gxfxdx
−∞
=
∫
Integral of probability density
function. This is monotone
function in its domain.
Histogram Equalization
Since image is not continual but discrete function
here we have no continual probability density
function but its discrete version(histogram).
MATLAB realizationis quite simple:
I=imread('pout.tif');
a=imhist(I);
g=cumsum(a)/sum(a);
J=uint8(255*g(I));
Reading original image
functiong
output image
Histogram EqualizationExample
0
100
200
0
1000
2000
3000
4000
0
100
200
0
1000
2000
3000
4000
original image
equalized image
(significantly
improved contrast)
corresponding
histograms
Obtained
density is not
uniform due to
discrete nature
of images
Histogram matching
Histogram equalization is operation that produces
uniform probability density function.
Similarly histogram can be matched to any desired
probability density function.
The procedure is the same for any monotone
functiong(x)(increasing and decreasing) that is
satisfied in the equalization case.
Otherwise we need more complicated operation
involving segmentation of g(x) in monotone regions.
Applications of Histogram
All methods that uses probability density function estimate
are histogram based.
Improvement of image contrast(equalization).
Histogram matching.
Modifications of colors.
Histogram can be applied locally to image parts. For
example we have very bright object on dark background.
We can perform histogrambased operations on selected
region: object of background depending on our task.
Also, histogram can be calculated for parts of image or for
channels in color images.
Image Negative
There numerous operations that can be applied to
the image luminance. Some of them are applied to
each pixel in independent manner of other pixels.
These operations are called point operations.
One of the simplest of them is determination of the
image negative(or positive if we have image
negative).
This operation can be performed in different manner
depending on the image format.
Image negative
Binary image:
Negativ(n,m)=~Original(n,m)
Grayscale
Negativ(n,m)=2k1Original(n,m)
RGB
Operation for grayscale images
is performed for each image channel.
logical operation negation
number of bits used for
memory representation of
image pixels
Color Clipping
Color clippingis operation performed on colors but it
is not related to geometrical clipping.We are keeping
some image colors as in original image but other
colors are limited to some selected limits:
maxmax
maxmin
minmin
(,)
(,)(,)(,)
(,)
aij
bijaijaij
a
cc
cc
icjc
>
⎧
⎪
=≥≥
⎨
⎪
<
⎩
Brightening (Darkening)
There are several methods to perform these
operations.
For example,f(n,m)=g(n,m)+rwould increase
luminance for r>0while forr<0image will darker.
The second technique: f(n,m)=g(n,m)xr brightening for
r>1darkening for 0<r<1.
These techniques are not of high quality since they
have several drawbacks. The most common is the
following procedure:
f(n,m)=[2k1] {g(n,m)/ [2k1]}γ
Number of bits per pixel
γ>1darkening γ<1
brightening
Luminance correction
Histogram modification function:
0
0.2
0.4
0.6
0.8
1
0
0.2
0.4
0.6
0.8
1
brightening
darkening
This way of representing point
operation is accepted by almost all
image processing software (for
examplephotoshop).
Results of point operations
Point operations and other operations usually
performed on images do not produce always
integers and numbers in proper domain.
Then after operations are performed we should
return results to image format.
It is performed through simple steps (here
described for grayscale image in 256 levels):
Rounding or truncating to integer(rounding is more
precise but truncation is commonly performed);
Luminance below0 are set to 0 while those above
255 are set to255.
Point operationsConclusion
Point operations are very common and usually they
are performed after observing histogram.
We assumed that pixel of target image depends only
on pixel in original image that is on the same
position.
It is possible to define more complex functions that
establish relationship between more pixels.
Some of these combinations remains for your self
exercise.
Geometrical transforms
In the case of geometrical transform we have that
pixel from position(x,y)moves to the position in
(x1,y1)target image. Here we are in fact considering
transformation of coordinates in digital image that
can be written as:
The simplest transform is translation where entire
image is moved with a given vector(x0,y0).
()
1
1
1
x
y
x
y
g
⎡⎤
⎡
⎤
===
⎢⎥
⎢
⎥
⎣
⎦
⎣⎦
XX
Translation
0
1
0
0
0
xx
yy
x
y
⎡⎤⎡⎤
=−=
⎢⎥⎢⎥
⎣⎦
−
⎣
−
⎦
XX
00
(,)(,)gxyfxxyy=−−
x
0
y
0
f(x,y)g(x,y)
In this case we keep
dimension of the target
image to the same dimension
as in the original image. In
region appearing by
translation we put white or
some other default color.
This strategy is called
cropping. An alternative
strategy (when we want to
change image dimension) is
enlarging image in order to
entire original image be kept
in target image.
Also, it is possible to have cyclical translation
with part that are removed from the image
cyclically shifted at the beginning.
Cropping
Cropping is operation were part of original image is
used as a new image. Of course this image has
smaller dimensions then the original one.
For example let the original image f(x,y)has
dimensions (M,N)and let we want to crop region
between (M1,N1)and(M2,N2) where0<M1<M2<M
and 0<N1
<N2<N.
11
(1,1)(,)gxMyNfxy
−
+−+=
forx∈[M1,M2] i y∈[N1,N2]. Determine dimensions of
the target image?
Rotation
Coordinate transform in the case of the rotation is defined as:
Obtained image is given as:
g(x,y)=f(xcosα+ysinα,xsinα+ycosα).
We assumed that coordinate transform is performed around
origin. This is rare situation in digital images. Develop the
coordinate transform for rotation around pixel(x0,y0).
Positive direction for rotation is counter clockwise.
cossin
'
sincos
x
y
αα
⎡
⎤⎡⎤
=
⎢
⎥⎢⎥
−αα
⎣
⎦⎣⎦
X
negative direction –clockwise
positive direction –counter clockwise
Distortion
y
x
θ
(x',y')
(x,y)
We will demonstrate distortion along one
of coordinate axis.
Coordinate transform:
'1cot
'01
xx
yy
−θ
⎡
⎤⎡⎤⎡⎤
=
⎢
⎥⎢⎥⎢⎥
⎣
⎦⎣⎦⎣⎦
g(x,y)=f(xycotθ,y)
Consider distortion that would be performed in paralelto
the liney=ax+b.
Scaling
Coordinate scaling can be described as:
0
'
0
a
b
⎡
⎤
=
⎢
⎥
⎣
⎦
XX
Determine function of the output image based on input image.
Determine dimensions of output image in functionaandb. For which
parameter values image is enlarged?
This is scaling along xandyaxes. Is it possible to define scaling
along alternative directions?
Could reflection with respect to coordinate axes or origin be described
using scaling?
Nonlinear transforms
There are numerous nonlinear transform used in
digital image processing.
Number of nonlinear transforms is significantly
greater than of linear ones.
Here, we give a simple example:
g(x,y)=f(x+Asinby,y+Asinbx).
Important example is the fisheyenonlinearity.
Fisheye transform
The fisheye effect is cause by shape and limited
(relatively small) dimensions of camera lens. It causes
that objects in the middle of the scene are larger than
objects on borders of the scene.
Sometimes this effect is desired in photography and
photographers simulate it or they produce it using
special form of lenses.
Try to simulate fisheye transform and to propose
method for removing this effect.
Geom. transf. Problem
Original image with pixels on
grid.
Image rotated for 45 degrees.
Pixels are dislocated from the
grid.
Need for interpolation
Commonly only small number of pixels after
geometrical transforms is on the grid while others are
displaced.
Then we have a problem to determine pixels in grid
for the target images.
Techniques for determination of grid values are
called interpolation.
Here we will describe several strategies for
interpolation.
For other techniques look at textbook, Internet, and
for additional material available at the lecturer office.
Nearest neighbor
Nearest neighbortechnique is the simplest
interpolation strategy.
For pixel in the grid we are taking value of
the nearest pixel of interpolated image.
This technique suffers from low quality
problem.
original
rectangle
after rotation for 5 degrees and this
interpolation technique
Human eye is very sensitive to
broken edges and disturbed small
details that are caused by this
interpolation form.
Bilinear Interpolation
Strategy of bilinear interpolation is slightly better with
respect to image quality than the nearest neighbor
but little bit slower. However, calculation burden in
this strategy is still reasonable.
Let pixel of original image be surrounded with four
pixels of transformed image.
pixel that we want to determined
luminanceg(x,y)
transformed pixels (we
assume that dimensions of
square in which we perform
interpolation are not
changed)
x
1x
y
1y
Bilinear interpolation
For simpler determination we
will rotate the coordinate
system.
f(m,n)f(m+1,n)
f(m,n+1)
f(m+1,n+1)
f(m+x,n+y)
Bilinear interpolation determines luminance in point(m+x,n+y)
as:
f(m+x,n+y)=axy+bx+cy+dconstantsa, b, candd
should be determined
Bilinear interpolation
Constants can be determined from the following condition:
f(m,n)=ax0x0+bx0+cx0+d ⇒d=f(m,n)
f(m+1,n)=ax1x0+bx1+cx0+d ⇒b=f(m+1,n)f(m,n)
f(m,n+1)=ax0x1+bx0+cx1+d ⇒c=f(m,n+1)f(m,n)
f(m+1,n+1)=a+b+c+d ⇒
a=f(m+1,n+1)+f(m,n)f(m+1,n)f(m,n+1)
Consider the following case. We are not performing geometrical
transform but we want to change number of pixels in image(for
example instead of NxMwe want to getkNxlNwhere kand lare
integers and k>1and l>1. Determine relationship that connects
original and target image with bilinear interpolation?
This operation is called image resize.
Rotation–MATLAB program
function b=rotacija_i_interpolacija(a, theta, x0, y0)
%realization is performed within function where a is source image
%theta is rotation angle,x0 andy0 are rotation center
[M, N] = size(a);%Size of grayscale image
b = zeros(M, N);%Target image
%we assume that the target image has the same dimensions as source image
%we will perform cropping of remaining parts
for xp= 1 : M
for yp= 1 : N%performing operation for all image pixels
x = (xpx0) * cos(theta) (ypy0) * sin(theta) + x0;
y = (xpx0) * sin(theta) + (ypy0) * cos(theta) + y0;
%Determination of the origin of the pixel mapped to(xp,yp)i.e.,
%where it is in the original image(inverse transform)
Rotation–MATLAB program
if ((x >= 1) & (x <= M) & (y >= 1) & (y <= N))
%Is pixel within proper domain?
xd= floor(x); xg= ceil(x);
yd = floor(y); yg= ceil(y);%(xd,yd) bottom left corner
%rectangular for interpolation; (xg,yg) upper right corner
D = double(a(xd, yd));
B= double(a(xg, yd)) double(a(xd, yd));
C= double(a(xd, yg)) double(a(xd, yd));
A= double(a(xg, yg)) double(a(xd, yd))...
double(a(xd, yg)) double(a(xg, yd));
%Determination of coefficients
b(xp, yp) = A*(xxd)*(yyd)+A*(xxd) + B* (yyd)+D;
%Values of target image
Rotation–MATLAB program
end
end
end
b = uint8(b);
%%%end of the program (end of if selections and two for cycles)
%%%and return image to proper format
Write program for distortion.
Write program for rotation and nearest neighbor.
Rotate image for 5 degrees using the nearest neighbor two times and
perform rotation for 5 degrees two times. Perform the same operation
with bilinear interpolation and compare results.
Polar to rectangular raster
In medicine numerous images are created by axial
recording of objects under various angles. This is
imaging technique is common for various medical
scanner types.
Obtained image has circular shape.
Similar images are obtained in radars, sonars, and
some other acquisition instruments.
These images have polar raster(pixels distribution).
Since monitors have rectangular raster we have to
perform corresponding interpolation.
Polar to rectangular raster
4
p
1
p
3
p
2
p
c
p1, p2, p3, p4
pixels of polar raster
c pixels of rectangular raster
Bilinear interpolation form that is commonly
applied here:
fc
fppfppfppfpp
pppp
()
()/()/()/()/
////
=
+
+
+
+++
11223344
1234
1111
pi
are distances betweenpi
and cwhile f()is luminance in the
corresponding pixel
Other interpolation methods
Earlier the bicubic interpolation was not used due to its
demands. It is based on the third order polynomial function.
Today it is assumed that calculation demands are
reasonable and it is one of the most common strategies.
Some sensors in acquisition process deform image(for
example scanners). Fortunately distortion can be known in
advance and we can select appropriate interpolation
strategy commonly using the grid.
The procedure is as follows: scan of the rectangular grid is
performed; then it is observed position where nodes of the
rectangular are moved. Based on this we create inverse
transform that returns image to proper shape using
software means.
Other interpolation forms
A group of the polynomial interpolation algorithm is
quite common and among them the Lagrange
interpolation technique is quite popular.
There are numerous wellestablished interpolation
technique able to preserve important image features
such for example edges.
Quite common interpolation technique today is based
on spline. This technique is related to both
polynomial interpolation and wavelets.
The Fourier transform can also be used for
interpolation purpose.
For selfexercise
Write own program for evaluation of the image histogram.
Write program for histogram adjusting where upper and lower
bounds are adjusted that reject 5% darkest and 5% brightest
pixels. Pixels outside of this range should be set to the maximal,
i.e., minimal luminance.
How to determined negative of image written using colormap?
Calculation of image negative for color models different from
the RGB?
Write programs for calculation of the image negative.
Create target image based on original image of the hexagonal
shaped range in the size 242where pixels of destination
image are equal to mean of pixels of original image.
Realize own functions for all variants of translation.
For selfexercise
Write coordinate transform where rotation is performed for arbitrary
angle.
Write coordinate transform that performs distortion parallel to arbitrary
liney=ax+b.
Determine functional relationship between output and input imagefor
all transforms defined within lectures.
Is original image enlarged or shrinkedwith respect to aandbin the
case of scaling?
Can scaling be defined for alternative directions than along thex andy
axes?
Can reflection along coordinate axes and with respect to origin be
described using scaling?
Realize all introduced linear geometric transforms.
Realize coordinate transform: g(x,y)=f(x+Asinby,y+Asinxb). Perform
experiments with A andb.
For selfexercise
Create program for image resize based on bilinear transform. This program
should be able to handle with noninteger values of scalek andl, as well as
with possibility that k andl are smaller than 1.
Write program for distortion.
Write program for rotation and the nearest neighbor interpolation.
Perform rotation for5 degrees twice using the nearest neighbor and after
that for 5 degrees twice. Also, these operations repeat with bilinear
interpolation. Compare results.
Check if the bilinear interpolation used for transformation of polar to
rectangular raster is the same as the standard bilinear interpolation
introduced previously.
Project
Write program that allows that users can select
colors and adjust colors in interactive manner
including usage of curves presented on slide 18.
When user define more different points curve should
be interpolated using the Lagrange multipliers.
Write program that perform the fisheye transform as
well as program able to image distorted with fisheye
return to normal (or close to normal) shape.
Project
Review the Lagrange interpolation formula and use it
for polynomial interpolation using the grid.
Find Internet resources related to interpolation and
write seminar paper related to found facts.
Digital Image Processing
Image and Fourier transform
FT of multidimensional signals
Imagesare2D signals.
The Fourier transformof2D continuous time signal
x(t1,t2)is:
InverseFourier transform gives the original signal:
1122
122121
(,,)()
jtjt
edtdttXxt
∞∞
−ω−ω
−∞−∞
=ωω
∫∫
1122
12121
2
2
1
(2)
)((,,)
jtjt
eddxXtt
∞∞
ω+ω
−∞−∞
ω
ω
ω
π
ω=
∫∫
FT of multidimensionalsignals
Signal and its FT represent the Fourier transform pair.
This pair can be written in compact form by introducing
vectors (allowing larger number of coordinates than 2):
Fourier transform can be written as:
12
(,,...,)
Q
tttt=
12
(,,...,)
Q
ω
=ωωω
(())
jt
t
xdtetX
−ω
ω=
∫
12
12
......
Q
Q
tttt
dtdtdtdt
∞∞∞
=−∞=−∞=−∞
==
∫
∫∫∫
11
...
QQ
ttt
ω
=ω++ω
FT of multidimensionalsignals
InversemultidimensionalFourier transform:
We will consider the 2D signals only.
Since we are considering discretizedsignals we will not
consider in details FT of continuous time signals.
Before we proceed with “story”about discretizedsignals
we will give several general comments about the FT.
((
1
(2)
))
jt
Q
Xtdxe
ω
ω
=
ω
π
ω
∫
Fourier transform
FT establishes the “1 to1”mapping with signal.
Roughly speaking signal in time and spectral domain
(its Fourier transform) are different representations
of the same signal.
In addition the FT and its inverse have quite similar
definitions (difference in constant and sign of the
complex exponential).
Why we in fact use the FT?
Motivationfor introducing the FT
signal in time domain
signal in frequency domain
Consider the sinusoid
represented with red line.In
the FT domain it can be
represented with two clearly
visible spectral peaks. When
we add large amount of
Gaussian noise to sinusoid it
cannot be recognized in time
domain (blue lines) while in
spectral domain still the
spectra achieves maximum for
frequency corresponding to
considered sinusoid.
Motivationfor introducing the FT
Roughly speaking: some signals are represented in
better manner in frequency than in time domain.
In addition, filter design is much simpler in spectral
than in space domain. In previous example we will
detect spectral maximum and we will design cutoff
filter that will take just narrow region in the
frequency domain around spectral maximum. In this
way we would reduce significantly influence of noise
to sinusoid.
Similar motivation is used for introducing additional
transforms in digital image processing but we will
consider them later in this course
Similar motivation is used in the case 2D signals and
images.
2D FT of discrete signals
Here we will consider discrete signals.
Then we will skip properties of the 2D FT of
continuous time signals but I propose you to check
that in textbook.
Discretetime signals are obtained from the
continuous time counterpart by using sampling
procedure.
The sampling in the case of 1D signals is simple and
we can take equidistantly separated samples:
x(n)=c xa(n T)
discretetime signal
continuous time signal(T is sampling
rate)
constant
(commonlyc=T)
2D FT of discrete signals
In order that we can reconstruct continuoustime
signal based on discretetime counterpart the
sampling theorem should be satisfied.
This theorem is satisfied if sampling rate satisfies:
T ≤1/2f
m
If the sampling theorem is not satisfied we are
making smaller or bigger mistake.
How to perform sampling in the case of digital
images?
maximal signal frequency
Sampling of 2D signals
The simplest sampling in digital images is:
x(n,m)=c xa(nT
1,m T2)
Constant cis commonly selected asc=T
1
T2.
Sampling rate is usually equal for both coordinatesT
1=T2.
Sampling theorem is satisfied when T1≤1/2fm1 andT
2≤1/2fm2.
Here,fm1
andfm2
are maximal frequencies along corresponding
coordinates(note that 2D FT of continuous time signals has two
coordinatesω1
andω2. Thenfm1
andfm2
are maximal
frequencies along these coordinatesf
mi=ω
mi/2π).
Sampling of 2D signals
Previously described rectangularsampling is not
unique sampling scheme.
In the case of rectangular sampling we are replacing
the rectangular of image with single sample:
Entire rectangular can be replaced with single sample
This is practical sampling scheme but we can
apply some alternative sampling patterns.
2D signal samplings
Some of alternative sampling schemes are given
below:
It can also be rhomb but the hexagon is best
pattern with respect to some wellestablished
criterion.
However, we will continue with
usage of hexagonal sampling due to
simplicity and practical reasons!!!
Quantization
Discretizedsignal is not used directly but it is
quantized.
Instead of exact values we are taking values rounded
(or truncated) to the closest value from the set of
possible values (quant).
Errors caused to rounding is smaller than error
caused by truncation but truncation is used more
often in practice.
Quantization
Quantization can be performed with possible values
equidistantly separated but some systems and
sensors are using nonuniform quantization.
Number of quantization levels is commonly2
k
and
these quant levels are commonly represented as
integers in domain [0,2k1].
We will almost always assume that we have
discretizedandquantized(these signals are called
digitalized).
2D FT of discrete signals
Fourier transform pair between discretetime signal and
corresponding FT can be represented using the following
relationships:
12
12
(,)(,)
j
njm
nm
Xxnem
∞∞
−ω−ω
=−∞=−∞
=ωω
∑∑
12
12
12
2
12
(,(,)
1
2)
)
(
jnjm
Xdxendm
ππ
ω+ω
ω=−πω=−π
=ω
ω
ω
π
ω
∫∫
2D FT of discretetime signal is continuous variable and it is not
suitable in this form for operation on the computer machines.
2D FT is periodic with period along both coordinates of2π.
2D DFT
We will not explain in details properties of the 2D
FT of discretetime signals since we will not use it
in process.
Our goal is to have discretizedtransform that is
suitable for processing using computer machines.
In order to achieve this we use periodicity
extensionproperty.
Namely, assume that signalx(n,m)is defined
within limited domain(it is always the case for
digital images).
Let size of signal(image)isNxM.
2D DFT
Perform periodical extension of the original signal with period
N1xM1
(it should be satisfiedN1≥NandM1≥Mbut here from the
brevity reasons we assumeN1=NandM1=M).
n
m
originalsignal x(n,m)
(,)(,)
p
rp
xnrNmxnm
p
M
∞∞
=−∞=−∞
++=
∑∑
periodical extension
2D DFT
FT of periodically extended signal is:
12
1212
12
12
12121122
(,)(,)
(,)
(,)(,)(2)(2)
jnjm
p
nmrp
jnjmjrNjpM
rpnm
jrNjpM
rp
XxnrNmpMe
xnmee
X
eXNkMk
∞∞∞∞
−ω−ω
=−∞=−∞=−∞=−∞
∞∞∞∞
−ω−ωω+ω
=−∞=−∞=−∞=−∞
∞∞
ω+ω
=−∞=−∞
ωω=++=
==
=ωω=ωωδω−πδω−π
∑∑∑∑
∑∑∑∑
∑∑
Here we changed places of sums and we used property of FT of
translated (shifted) signal with possible neglecting some multiplicative
constants.
Analog Dirac pulses
(generalized functions)
produced by sumation
over infinity number of
terms in sums
2D DFT
Finally we obtain:
Thus, the FT of periodically extended signal is equal to samples
of the 2D FT taken in the discrete gridk1
∈[0,N)and k2∈[0,M).
Periodical extension produce discretizedFT(DFT).
Periodical extension is commonly not performed in practice due
to infinity number of terms in sums and usage of generalized
functions.
However, we should keep in mind that we assumed and that
the smallest period for extension is equal to dimension of image
NxM.
12
12
22
(,),
p
kk
XX
NM
π
π
⎛⎞
ωω=
⎜⎟
⎝⎠
2D DFT
2D discrete signal and2D DFT are transformation pair
12
11
12
00
22
(,)(,)e
kk
N
nm
M
NM
jj
nm
Xkkxnm
−−
−−
==
ππ
=
∑∑
It is the FT of discrete
signal calculated in limited
interval and for discretized
frequencies.
12
12
22
11
12
00
1
(,)(,)e
knkm
NM
jj
NM
kk
xnmXkk
NM
ππ
−−
+
==
=
∑∑
Important fact:The inverseDFT can be calculated in almost the same way as direct on usign
sums. Differences are very small(minus in complex exponential and normalization constant
1/NM).
Domains of the 2D DFT and FT of
discrete signals
Domain of the 2D FT of the discrete time signal is
Descartes product:
Domain of the 2D DFT is discrete set of points
(k
1,k
2)∈[0,N)x[0,M).
We have to determine relationship between frequencies
in these two domains!!!
Relationshipω1=2πk1/Nandω2
=2πk2/Mcan be satisfied
only for0≤k1<N/2and 0≤k2<M/2while for largerk
1
and
k2
these relationship would produce frequencies outside
of ω1
andω2
domain.
12
(,)[,)[,)
ω
ω∈−ππ×−ππ
[belongs to interval
(does not belong to
interval
Domains 2D DFT and FT of
discrete signals
For largerk
1
andk
2
we can use the fact that the2D FT of
discrete time signals is periodical along both coordinates ω1
andω2
with period2πand we can establish relationships:
ω1
=2(Nk1)π/Nand ω2=2(Mk2)π/Mfor N/2≤k1<Nand
M/2≤k2<M.
(0,0)(N,0)
(N,M)
(0,M)
Arrows depict
quadrants of the 2D
DFT shift in this
operation for obtaining
properly ordered
frequencies.
2D DFT, convolutionandLSIS
Digital image can be subject to numerous forms of
processing.
Assume that image is input in linear space invariant
system:
System is linear when linearcombination of inputs
produceslinearcombination of outputs:
If x(n,m)is input andT{x(n,m)}is transformation of
input produced by the LSIS then it holds:
T{ax1(n,m)+bx2(n,m)}=aT{x1(n,m)}+bT{x2(n,m)}
LPIS
?
These systems can be for various
purposes but we will assume their
application for image filtering and
denoising
2D DFT, convolution and LSIS
System is space invariant(extension of the concept of time
invariance) when transform of shifted image is equal to
shifted transform of original image:
Ify(n,m)=T{x(n,m)}if should hold
y(nn0,mm0)=T{x(nn0,mm0)}.
LSIS has important property that
its output can be given as a
convolution of input signal with
2D impulse response of the system:
Due to limits in image size,
rounding, discrete nature of
image systems that process
digital imagers are rarely LSIS
but most of them can be
approximated with theLSIS.
y(n,m)=x(n,m)*n*mh(n,m)
2D convolution
2D DFT, convolutionandLSIS
h(n,m)=T{δ(n,m)}where:
Assume that impulse response is finiteN
1xM1. Then
linear convolution (here, we consider only this type of convolution and
other types will not be considered here)
can be definedas :
1for =0 and =0
(,)
0elsewhere
nm
nm
⎧
δ=
⎨
⎩
11
11
11
1111
00
(,){(,)}(,)**(,)
(,)(,)
nm
NM
nm
ynmTxnmxnmhnm
hnmxnnmm
−−
==
=
=
=−−
∑∑
Output domain is:
[0,N+N1)x[0,M+M1)!!!!
2D DFT convolution and LSIS
We should keep in mind size of the output in important
applications.
The FT of convolution is equal to the products of the FTs.
In order to be able to reproduce results in the case of the
2D DFT we should follow the following steps:
Perform zeropadding up to domain[0,N+N
1)x[0,M+M1).
The same operation should be performed with impulse response.
Determine 2D DFT of signal and impulse response.
Multiply obtaned2D DFTsand calculate inverse2D DFT.
2D DFT,convolution andLSIS
Write these steps in mathematical form:
111
(,)[0,)[0,)
'(,)
0[,)[,)
xnmnNmM
xnm
nNNNmMMM
∈
∧∈
⎧
=
⎨
∈+∨∈+
⎩
11
11111
(,)[0,)[0,)
'(,)
0[,)[,)
hnmnNmM
hnm
nNNNmMMM
∈
∧∈
⎧
=
⎨
∈+∨∈+
⎩
12
11
11
22
11
12
00
'(,)'(,)
knkn
NNMM
jj
NNMM
nm
Xkkxnme
π
π
+−+−
−−
++
=
=
=
∑∑
12
11
11
22
11
12
00
'(,)'(,)
knkn
NNMM
jj
NNMM
nm
Hkkhnme
π
π
+−+−
−−
++
=
=
=
∑∑
2D DFT,convolution andLPIS
12
11
11
12
11
22
11
1212
00
1
(,)
(1)(1)
'(,)'(,)
knkn
NNMM
jj
NNMM
kk
ynm
NNMM
XkkHkke
ππ
+−+−
+
++
==
=×
+−+−
∑∑
There are cases when calculation of the convolution is faster inthe case
of using 3 2D DFTsthan by direct computation.
This is possible when we use fast algorithms for evaluation of the 2D
DFTs.
Need for fast algorithms
In the digital signal processing course we have learnt
two the simplest fast algorithms for the 1D DFT
evaluation:
decimation in time
decimation in frequency
Since digital images have more samples than 1D
signals these algorithms are even more important
than in the case of 1D signals.
We can freely claim that the modern digital image
processing would not be developed without FFT
algorithms(FFT is the same transform as DFT but
name only indicates fast evaluation algorithm!!!).
Need for fast algorithms
Direct evaluation for single frequency requires NxM complex
multiplications
(complex multiplication is 4 real multiplications and two real
additions)
andNxM complex additions
(2 real additions). For each
k1, k2
these operations should be repeated NxM times. Then
calculation complexity isof order:
N2M2
complex additions(2N
2M2
real)
N2M2
complex multiplications(4N
2M2
real+2N2M2
real
additions)
For exampleN=M=1024 on the PC that can perform 1x10
9
operations in second requires: 8N2M2 >8x10
12
realoperations,
i.e., 8x103
secthat is more than 2h.
12
11
12
00
22
(,)(,)e
kk
N
nm
M
NM
jj
nm
Xkkxnm
−−
−−
==
ππ
=
∑∑
Stepbystep fast algorithm
The second sum represents the FT of obtained result in the first
step.
All applied DFTsare 1D and there are in total N+MDFTs.
All 1D DFT can be realized using the fast algorithms.
1
2
1
2
2
2
11
1
2
2
00
1
0
(,)(,)e
(,)
n
m
NM
j
j
k
k
N
M
m
N
n
k
n
j
n
N
Xkkxnme
nkeX
−−
−
−
==
π
π
π
−
−
=
⎡⎤
==
⎢⎥
⎣⎦
=
∑∑
∑
2D DFT can be written as:
1D DFT of image row
StepbyStep algorithm
For simplification assume thatN=M.
For each of the FFT in rows and columns we need
Nlog2N complexadditions and multiplicationsand for
entire freqeuncyfrequency plane it is required:
2NxNlog2N complex additions and multiplications.
This ieequal to 8N
2log2N real additions and
multiplications, i.e.,16N
2log2N operations.
ForN=M=1024 required number of operations is
160x106. It is equal to0.16secon considered
machine.
Stepbystep algorithm
For 1D signals 2FFT algorithms are in usage:
decimation in time
decimation in frequency.
Complexity of both of these algorithms is similar.
The stepbystep algorithm is not optimal for 2D signals but it is quite
popular due to its simplicity.
PC earlier and today mobile devices have problems with memory
demands of the 2D FFT algorithms since today some machines like
mobile devices still have moderate memory amounts. Then we have
problem since in the step by step algorithm we need 3 matrices for:
original image
FT of columns or rows(complex matrix written in memory as two real
valued matrices)
2D DFT (again complexvalued matrix)
Stepbystep algorithm
memory
Then in fact we need memory for 5 realvalued matrices.
Fortunately, image is realvalued signal where the following rule
holds:
12
12
22
11
*
12
00
2()2()
11
12
00
(,)(,)e
(,)e(,)
knkm
NM
jj
NM
nm
NknMkm
NM
jj
NM
nm
Xkkxnm
xnmXNkMk
ππ
−−
++
==
π−π−
−−
−−
==
==
==−−
∑∑
∑∑
it holds
x*(n,m)=x(n,m)
You can find in the textbook how this relationship can be
used to save memory space!!!
Advanced 2D FFTalgorithms
Advanced FFT algorithms for images perform decimation
directly along both coordinates.
12
11
12
00
(,)(,)
NN
nkmk
NN
nm
XkkxnmWW
−−
==
=
∑∑
W
N
=exp(j2π/
N
)
We assume N=M
This 2D DFT can be given with 4 subsumsfor even and odd
coefficients
1212
1212
1111
12
0000
even even even odd
1111
0000
odd even odd odd
(,)(,)(,)
(,)(,)
NNNN
nkmknkmk
NNNN
nmnm
nmnm
NNNN
nkmknkmk
NNNN
nmnm
nmnm
XkkxnmWWxnmWW
xnmWWxnmWW
−−−−
====
−−−−
====
=++
++
∑∑∑∑
∑∑∑∑
Advanced2D FFTalgorithms
After simple manipulations we obtain all sums within limits of
[0,N/2)x[0,N/2)
XkkSkkSkkWSkkWSkkW
N
k
N
k
N
kk
(,)(,)(,)(,)(,)
120012011210121112
2112
=+++
+
where
SkkxmmW
N
mkmk
m
N
m
N
001212
22
0
21
0
21
22
1122
21
(,)(,)
//
=
+
=
−
=
−
∑∑
SkkxmmW
N
mkmk
m
N
m
N
011212
22
0
21
0
21
221
1122
21
(,)(,)
//
=+
+
=
−
=
−
∑∑
SkkxmmW
N
mkmk
m
N
m
N
101212
22
0
21
0
21
212
1122
21
(,)(,)
//
=+
+
=
−
=
−
∑∑
SkkxmmW
N
mkmk
m
N
m
N
111212
22
0
21
0
21
2121
1122
21
(,)(,)
//
=++
+
=
−
=
−
∑∑
Advanced2D FFT algorithms
2D DFT can now be represented depending on the
(k1,k2)as:
XkkSkkWSkkWSkkWSkk
N
k
N
k
N
kk
(,)(,)(,)(,)(,)
120012011210121112
2112
=+++
+
for
021
1
≤≤
−
kN/
021
2
≤
≤
−
kN/
XkkSkkWSkkWSkkWSkk
N
N
k
N
k
N
kk
(,)(,)(,)(,)(,)
1
2
20012011210121112
2112
+=+−−
+
for
021
1
≤≤
−
kN/
021
2
≤
≤
−
kN/
XkkSkkWSkkWSkkWSkk
N
N
k
N
k
N
kk
(,)(,)(,)(,)(,)
12
2
0012011210121112
2112
+=−+−
+
for
021
1
≤≤−kN/
021
2
≤
≤
−
kN/
Advanced2D FFT algorithms
Finally:
XkkSkkWSkkWSkkWSkk
NN
N
k
N
k
N
kk
(,)(,)(,)(,)(,)
1
2
2
2
0012011210121112
2112
++=−−+
+
for
021
1
≤≤
−
kN/
021
2
≤
≤
−
kN/
Skk
0012
(,)
Skk
0112
(,)
Skk
1012
(,)
Skk
1112
(
,
)
1
WN
WN
WN
k1
k2
k1
+k2
Xkk
(,)
12
Xkk
(,)
12
+/2
N
Xkk
(,)
12
+/2
N
X
(
,
)
kk
12
+/2+/2
N
N
1
1
1
1
1
1
Decomposition can be presented using the following lattice:
Decomposition can
be performed in the
next stage on each
S
ij
(
k
1,
k
2) block
Comments 0
Log in to post a comment