Changes and Modifications:
Everyone corrected their part according to the feedback given by the other groups. We also revised each other's parts, checking grammar and spelling. We tried to simplify the parts that people found confusing in the feedback.

We changed the structure of some parts of the report, i.e. we moved the "Project definition" to the "Method" part and made sure that the "Method" part only contained the method and not a lot of theory that should go under "Theory". We also removed the "Acknowledgement" part.

We shortened the introduction to only mention the aspects truly relevant to our project. We also added a part to the introduction briefly explaining the medical application of our project.

We added some result images, such as different views of our objects, so one can visualize the 3D shape of the point cloud. We also added brief descriptions of the result images.

We added the attempts we made at finding a solution to the 3D visualization problem. These attempts are explained in the "Discussion". The "Conclusion" and "Future work" have been updated.
HL2027 3D IMAGE RECONSTRUCTION AND ANALYSIS IN MEDICINE

3D SURFACE SCANNER USING VISIBLE LIGHT

GROUP 6

Project Members:
Hannah Reuterdahl
Hassanain Najah Kubba
Mathias Järnberg
Panchamy Krishnan K
Yasir Rasool

Supervisors:
Hamed H Muhammed
Philip Köck
Fredrik Bergholm
Abstract
This project aims to develop a 3D surface scanner using a camera. From a number of 2D image sequences taken from different views of different objects, we have reconstructed 3D models in Matlab. There are different algorithms to achieve 3D model reconstruction. Among these, the point cloud algorithm is the general approach to solving 3D reconstruction problems, as it is applicable to any kind of object. We decided to implement this method instead of back projection, since it does not suffer from the same constraints as back projection, i.e. being applicable only to semi-transparent objects using a strong light source. In our project, we employ the point cloud algorithm for the 3D reconstruction; it uses feature points of the object and the projection matrix obtained from the camera to triangulate the 2D points into 3D object space and reconstruct the surface. The results of our project show that the point cloud algorithm is applicable to both semi-transparent and opaque objects, as we ran the code on both a semi-transparent perfume bottle and a steel thread and obtained consistent results.
Work Declaration
Yasir and Hassanain set up the scene and took the images of all the different test objects. They also did the camera calibration using the Matlab Calibration Toolbox. Panchamy implemented the pre-processing of the images, i.e. filtering, cropping etc. Hannah implemented the Harris Corner Detector to find the feature points and also matchbycorrelation to match the features across image pairs. Mathias implemented the RANSAC algorithm. Panchamy, Hannah and Mathias tried several different approaches to obtaining the 3D point cloud, such as the triangulation algorithm, the DLT algorithm etc. Panchamy also implemented a thresholding function to eliminate the outliers in the point cloud.
Table of Contents

PART A .......... 1
1 Introduction .......... 1
2 Theory .......... 2
2.1 Camera Calibration – Y.R. .......... 3
2.2 Feature Extraction – H.R. .......... 5
2.3 RANSAC Algorithm – M.J. .......... 6
2.4 Triangulation – P.K. .......... 7
3 Method .......... 8
3.1 Data Acquisition – H. N. K. .......... 8
3.2 Preprocessing – P.K. .......... 9
3.3 Feature Detection and Matching – H.R. .......... 10
PART B .......... 12
3.4 RANSAC Algorithm – M.J. .......... 12
3.5 Camera Calibration – Y.R. .......... 13
3.6 Point Cloud – P.K. .......... 15
3.7 3D Modeling – P.K. .......... 16
4 Results .......... 17
5 Discussion .......... 24
6 Conclusion .......... 26
7 Future Work .......... 27
8 References .......... 28
9 Evaluation of References .......... 30
Appendix .......... 31
Table of Figures

1. Different views obtained from projection data .......... 1
2. Description of an algorithm for projective 3D reconstruction .......... 2
3. The principle of triangulation using two different camera views .......... 3
4. Parameters of the camera .......... 4
5. Pinhole camera model .......... 4
6. Pseudo code for finding and matching feature points .......... 5
7. Triangulation example .......... 7
8. System Description .......... 8
9. Data acquisition setup .......... 9
10. Principle of Harris Corner Detector .......... 10
11. Inliers and outliers .......... 12
12. Epipolar lines, true and false .......... 12
13. Two modes of operations to load the images .......... 13
14. Main calibration toolbox .......... 14
15. Conversion of image from color to gray scale to reduce the size .......... 14
16. Calibration Images .......... 14
17. Final Extraction Image .......... 15
18. Pseudo code for the method .......... 16
PART A
1 Introduction

In medical care, it is often required to obtain high-quality images of our bodies to be able to establish a diagnosis. Sometimes it is even necessary to reconstruct a set of images into a 3D reconstruction to fully understand the anatomy. This is a process in which projections of the body are taken from different angles and then assembled using image reconstruction, see Figure 1.
Figure 1: Different views obtained from projection data [1]
Image reconstruction techniques can be divided into two categories, the analytical and the iterative. Filtered back projection is an analytical technique that is commonly used in image reconstruction. This method was evaluated in the prior year of this course using a light source to reconstruct a semi-transparent object. Unfortunately, using only a light source for the back projection gives unsatisfying results. We therefore decided to employ another method, using a point cloud created from feature points and triangulation, to perform the 3D reconstruction.

This method could be useful in medical applications since it does not need slices of the body but instead uses different views of the target to create the shape. It could for example be used together with ultrasound to generate 3D shapes by retrieving different views with the ultrasound probe. Matching and transforming feature points also has other medical applications, e.g. combining different image modalities such as ultrasound with MR or CT images. This is used to fuse preoperative planning images or 3D images with the intra-operative images used during surgery.
2 Theory
To be able to re-create a 3D shape of an object's surface using multiple images, we need information about the depth. The principle of retrieving depth information from an image is called triangulation [4]. There are several different methods incorporating this principle, such as Stereo, Structure from Motion, Active Triangulation, Time of Flight, Shape from Shading, Shape from Texture etc. A general algorithm for 3D reconstruction using two uncalibrated images is shown in Figure 2.
Figure 2: Description of an algorithm for projective 3D reconstruction [4]
We will employ a variant of the method Structure from Motion. This method requires several images of a static scene taken at different angles; the more camera views used, the more complete the resulting model of the object [4]. The camera views are correlated by the correspondences between the different images. This method has the advantage that the camera does not need to be calibrated beforehand; instead, the camera projection matrix can be created from the information in the multiple images. This is called self-calibration of the camera, meaning we can retrieve the external and internal parameters of the camera directly from the images. The extrinsic and intrinsic camera parameters are then formed into the camera projection matrix that transforms image points from 2D image space to 3D object space.
The mathematical derivation of the 3D reconstruction is beyond the scope of this report. Briefly, one can say that for a given pixel coordinate (x, y) of a point x in the image, the corresponding world coordinate (X, Y, Z) of the point X in the scene is determined by finding the projection of x. This point X can, however, lie anywhere on the so-called projecting ray, or line of sight, of the point x in the camera. The projecting lines are found by using a projection matrix given by homography or the camera calibration. By using two different views, we can then find the intersection between these two lines, as shown in Figure 3. This intersection gives us the correct position of X in world coordinates [4].
Figure 3: The principle of triangulation using two different camera views [4]
2.1 Camera Calibration – Y.R.
There are two types of camera calibration, geometric camera calibration and photometric calibration. We employ geometric camera calibration in our project. Camera calibration refers to camera resectioning, which is a method to find the intrinsic and extrinsic parameters of a camera. The calibration is necessary to obtain intrinsic and extrinsic parameters regarding the focal length, the skew factor of the image, lens distortion and the translation occurring in the camera itself. The images to be reconstructed are calibrated according to the characteristics of the camera. Camera resectioning can also be used to estimate the true parameters of the camera, since it examines the image by comparing two images.
According to [1]:

"In the camera calibration, each camera contains a lens that forms a 2D projection of the 3D scene on the image plane where the sensor is located. This projection causes direct depth information to be lost so that each point on the image plane corresponds to a ray in 3D space. Therefore, additional information is needed to determine the 3D coordinates corresponding to points on an image plane. More specifically, the transformation from 3D world coordinates to 2D image coordinates must be determined by solving the unknown parameters of the camera model."
Camera calibration is used to figure out where the camera was in relation to a view in a photograph, and also to obtain information about the camera in relation to the scene. By using camera calibration, we can also check whether the actual camera pixels are square, and if they are not, we can find the horizontal and vertical scaling factors for the pixels.
Parameters of the camera model

There are two types of camera parameters, intrinsic and extrinsic. Figure 4 illustrates the parameters of the camera. The intrinsic parameters, such as focal length, image format, principal point, lens distortion and skew factor, are the characteristics of the camera, i.e. (α, β, γ, u0, v0), where γ = α cot θ. The focal length, f, is the distance between the image frame and the camera frame (C), and the intersection of the image plane and the optical axis, i.e. the z-axis, is called the principal point. The extrinsic parameters are the orientation (rotation) and location (translation) of the camera, i.e. (R, t) [2].
Figure 4: Parameters of the camera [2]
Figure 5 displays the intrinsic and extrinsic parameters of a pinhole camera (ideal lens distortion), where the camera is modeled as a pinhole camera: m is the 2D point (m = [u, v]^T), M is the 3D point (M = [X, Y, Z]^T) in the world frame, (u0, v0) are the coordinates of the principal point, (α, β) are the scale factors along the image u and v axes, and γ is the parameter describing the skew of the two image axes. The image of a 3D point M, denoted by m, is formed by an optical ray from M passing through the optical center C [2]. The pinhole camera model is only used to describe the parameters of the camera; in our project, we used a CCD camera (not ideal lenses).
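The pinhole mapping described above can be sketched numerically. The following is a minimal plain-Python illustration (not the project's Matlab code); the intrinsic values, identity rotation and zero translation are chosen only for the example:

```python
def project_point(M, K, R, t):
    """Pinhole projection m ~ K [R | t] M: world point M -> pixel (u, v)."""
    # Camera-frame coordinates: Xc = R * M + t
    Xc = [sum(R[i][j] * M[j] for j in range(3)) + t[i] for i in range(3)]
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]   # perspective division by depth Z
    alpha, gamma, u0 = K[0]               # scale factor, skew, principal point u0
    beta, v0 = K[1][1], K[1][2]           # scale factor, principal point v0
    u = alpha * x + gamma * y + u0
    v = beta * y + v0
    return u, v

# Illustrative intrinsics: alpha = beta = 800, no skew, principal point (320, 240)
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # identity rotation
t = [0.0, 0.0, 0.0]                                       # camera at the origin
print(project_point([0.1, -0.05, 2.0], K, R, t))          # (360.0, 220.0)
```

Note how the division by the depth Xc[2] is exactly the step where depth information is lost, as described in the quoted passage above.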
Figure 5: Pinhole camera model [7]
2.2 Feature Extraction – H.R.
The 3D reconstruction is in practice done by finding point correspondences (i.e. point matches) between different viewpoints of the object in a static scene. To establish a 3D model of the object, a dense set of correspondences between the images is needed. If not enough correspondences can be found, it will be difficult to reconstruct the object, since the resulting point cloud will be too small (RANSAC and the reconstruction algorithms need at least 8 correspondences to even run). The process of finding these correspondences is called feature detection and matching. Pseudo code for a general algorithm is given in Figure 6:
Figure 6: Pseudo code for finding and matching feature points

    Read N images
    Preprocess N images
    For i = 1:num_images
        Find feature points
    For i = 1:num_imagepairs
        Compute descriptors
        Match feature points between first and second frame using maximized correlation

The process of feature extraction can be divided into three stages [5]:

The first stage is to find the feature points in each image frame by searching for locations that are likely to match between the image frames. These points will then have a higher probability of being extracted with good repeatability and with accurate positions throughout the images [4]. The feature points can be classified differently according to their properties in the image. First of all there are the obvious points, the key features that can easily be tracked among several images, such as the top of a mountain, the corners of a building, or oddly shaped textures [5]. Other classes of features are edges, blobs and ridges.

To detect the features there are several different feature detection algorithms, such as the Canny Edge Detector, Harris Corner Detector, Level Curve Curvature, Förstner, Difference of Gaussians etc. These algorithms are all based on the same principle as described above, although their inner details are out of the scope of this report. Since there are many different feature detection algorithms, there is a need for a measure to evaluate their performance. One research group [5] measured the repeatability of the feature detectors as a measure of how well each algorithm was performing. The result of their study was that the Gaussian Derivative version of the Harris Corner Detector worked best.
The next stage is to describe the feature. The descriptors, or so-called feature vectors, have to be invariant under viewpoint, object orientation and illumination changes [5]. A good descriptor is defined by having large variations in all directions in the neighbourhood of the feature point. By looking at the intensity values at the point, it should be easy to recognize the point, and moving this window in any direction should induce a large change in intensity. The gradients of the descriptors can be calculated with different techniques. Examples of such descriptors are SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), which both detect and describe local features in an image. They use the pixel intensities in a square around a point to describe the feature point.

The final stage is to match the descriptors by comparing them in every pair of frames. The comparison is often made using normalized correlation.
2.3 RANSAC Algorithm – M.J.
The RANSAC algorithm, or Random Sampling Consensus, has the purpose of downscaling an amount of data. It is an iterative process that tests a set of data to determine its reliability. The reliability of the data is connected to how well a certain set of points may or may not fit a mathematical model, hence the terms "inliers" and "outliers". Inliers are data points that truly describe the model, whereas outliers are divergent data. A model fitted to data with a high concentration of outliers will result in a description that does not correspond to the real model, thus polluting the final result [5, 6].
To rid a data set of outliers, the RANSAC algorithm assumes that the input data are somewhat polluted by outliers. Knowing this, it randomly selects a number of data points, in this report correspondences, and assumes that these are inliers, i.e. true information. The number of correspondences differs depending on which version of the algorithm is being used. The chosen correspondences define a predetermined model; using the remaining data, the algorithm then tests how good a representation this is. Using an error threshold value, the remaining data are tested against the model to evaluate how good the correspondence is [4, 5].
The threshold is predefined and can be calculated from various probability density functions or manually defined, normally between 0.001 and 0.0001 [5]. If the value is less than the given threshold, this subset cannot provide a true fit to the model, and a new subset is chosen. The process is iterated until an optimized fit is found or the process is terminated by a user-defined maximum number of iterations [4, 5, 6].
During each iteration, a score is kept of how good the fit is, and in the end the fit with the maximum score is chosen. Depending on what problem RANSAC is applied to, the number of needed samples or the minimizing distance equation can be modified, making it flexible towards whatever application it is to be used in [4, 6].
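The sample-score-iterate loop described above can be sketched as follows. This is a minimal, generic RANSAC for fitting a 2D line in plain Python, not the project's Matlab implementation; the sample size, threshold and iteration count are chosen only for illustration:

```python
import random

def ransac_line(points, n_iter=200, threshold=0.1):
    """Generic RANSAC sketch: repeatedly sample a minimal subset,
    build a model from it, count inliers, and keep the best model."""
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        # Randomly pick a minimal sample: 2 points define a line
        (x1, y1), (x2, y2) = random.sample(points, 2)
        if x1 == x2:
            continue                        # vertical line, skip this sample
        a = (y2 - y1) / (x2 - x1)           # slope
        b = y1 - a * x1                     # intercept
        # Score the candidate: points within the error threshold are inliers
        inliers = [(x, y) for (x, y) in points if abs(y - (a * x + b)) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

random.seed(0)
pts = [(float(x), 2.0 * x + 1.0) for x in range(10)]   # inliers on y = 2x + 1
pts += [(2.0, 40.0), (5.0, -30.0)]                     # gross outliers
model, inliers = ransac_line(pts)
print(model, len(inliers))                             # (2.0, 1.0) 10
```

The two outliers never attract enough support, so the line through the ten true points wins, which is exactly the "maximum score" selection described above.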
2.4 Triangulation – P.K.
Triangulation is a widely used algorithm in geometric space. It maps 2D points to 3D points based on the mapping of the object space, or the projection matrix. In triangulation, the location of a point in object space is determined by measuring the angles to it from two known points, rather than measuring the distances to the point directly, which is not possible in the image case. The determined point can then be fixed as the third point of a triangle with one known side and two known angles [12].
Figure 7: Triangulation example [3]
The method of triangulation can be explained using Figure 7. Suppose you want to calculate the coordinates of, and the distance from the shore to, a ship. You need two projections to determine this, so we assume that there are two observers: one at A, who measures the angle α between the shore and the ship, and another observer at B, who measures the angle β. If either the length l or the coordinates of A and B are known, the law of sines can be applied to find the coordinates of the ship at C and the distance d [12].
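The shore example can be worked through numerically. The sketch below (plain Python; the baseline length and angles are chosen only for illustration) applies the law of sines to recover the ship's position C and its distance d from the baseline:

```python
import math

def triangulate_distance(l, alpha, beta):
    """Law-of-sines triangulation: baseline of length l between observers
    A (origin) and B, angles alpha (at A) and beta (at B) toward the
    target C. Returns C's position (x along the shore, d perpendicular)."""
    gamma = math.pi - alpha - beta             # remaining angle, at the ship C
    ac = l * math.sin(beta) / math.sin(gamma)  # side AC via the law of sines
    x = ac * math.cos(alpha)                   # C along the shore, from A
    d = ac * math.sin(alpha)                   # perpendicular distance d
    return x, d

# Observers 100 m apart, both sighting the ship at 60 degrees:
# the triangle is equilateral, so C sits midway, sqrt(3)/2 * 100 m offshore
x, d = triangulate_distance(100.0, math.radians(60), math.radians(60))
print(round(x, 2), round(d, 2))                # 50.0 86.6
```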
An image is a projection from a 3D scene to a 2D plane, during which process the depth information is lost. So, in an image, the 3D point is constrained to lie on the line of sight. From a single image it is not possible to determine which point on the line of sight corresponds to the image point, but from a pair of images the position of the 3D point can be found as the intersection of the two projection rays. This process is the same as triangulation. We have two sets of matching points in an image pair and two projection matrices; the only remaining task is to project them to 3D by finding the intersection of the projection rays.
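In practice, because of noise, two back-projected rays rarely intersect exactly; a common workaround (an assumption here, not necessarily the project's Matlab approach) is to take the midpoint of the shortest segment between the two rays. A small plain-Python sketch of this midpoint method:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def midpoint_triangulation(p1, d1, p2, d2):
    """Approximate intersection of two 3D rays p1 + t*d1 and p2 + s*d2:
    the midpoint of the shortest segment between them."""
    w0 = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b                       # zero only for parallel rays
    t = (b * e - c * d) / denom                 # parameter on ray 1
    s = (a * e - b * d) / denom                 # parameter on ray 2
    q1 = [p + t * u for p, u in zip(p1, d1)]    # closest point on ray 1
    q2 = [p + s * u for p, u in zip(p2, d2)]    # closest point on ray 2
    return [(u + v) / 2 for u, v in zip(q1, q2)]

# Two skew rays whose closest points are (0,0,0) and (0,1,0)
print(midpoint_triangulation([0, 0, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]))
# [0.0, 0.5, 0.0]
```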
3 Method
The system implementation of the 3D model reconstruction using the point cloud algorithm is shown in Figure 8.

Figure 8: System Description
The system works according to the following steps:

1. Obtain pictures of the semi-transparent object at different angles
2. Filter the acquired images according to the system requirements
3. Find the feature points of all the images
4. Find the matches between corresponding image pairs
5. Classify the matches as inliers and outliers using RANSAC
6. Obtain the camera calibration parameters
7. Construct the 3D point cloud using triangulation
8. Model the 3D surface of the object from the point cloud
3.1 Data Acquisition – H. N. K.
For data acquisition we used the setup shown in Figure 9. We used a fluorescent lamp as the light source, which gave the best illumination of our objects. A white paper was used to avoid specular reflections (used to dim the light source); the object was then rotated in consecutive angular steps. A paper that could be rotated was divided into 32 angles to obtain the different views of the objects, as shown in Figure 9. The images were taken using a digital camera while rotating the object every 5 degrees. For further calibration and registration of the camera, we used a checkerboard to retrieve the camera transformation matrix.
Figure 9: Data acquisition setup
List of the materials used to acquire the images:

1. Digital camera Canon D503
2. Cylindrical paper which rotates above a fixed paper
3. Fixed white paper divided into 32 steps with a separation of 5 degrees
4. White background paper
5. Good lighting source
6. Steel thread shaped as a heart
7. Semi-transparent perfume bottle
3.2 Preprocessing – P.K.
The images obtained from the data acquisition step must be preprocessed according to the requirements of the process. In order to use the program for different sets of images, we have tried to make it as general as possible; the program does not depend on the names of the saved images. The input to the program is the name of the folder in which the images are saved. The folder name is passed to the fpreprocessing function, where all the images are preprocessed for the further steps of the algorithm.

In the fpreprocessing function, all the images in the folder are written into a cell array to make the computations easier. The main part of the algorithm is the feature detection, and for that the resolution does not play an important role. The resolution of the images taken by the Canon camera is around 4752 x 3168, and we have more than 30 images. Such high-resolution pictures consume a lot of processing time, and in order to reduce this, we have decreased the resolution of the images to 1/10.
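This kind of resolution reduction can be sketched in plain Python on a nested-list grayscale image (a simplification of the Matlab cell-array code; nearest-neighbour decimation by striding, with the stride mirroring the 1/10 factor):

```python
def downsample(image, factor):
    """Reduce resolution by keeping every `factor`-th pixel in both
    directions (nearest-neighbour decimation)."""
    return [row[::factor] for row in image[::factor]]

# A synthetic 40x40 grayscale "image" whose value encodes its position
img = [[10 * r + c for c in range(40)] for r in range(40)]
small = downsample(img, 10)        # 1/10 of the original resolution per axis
print(len(small), len(small[0]))   # 4 4
```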
The feature detection is sensitive to the background information in the images: if the image contains any distinctive features in the background other than our required object, the algorithm will detect these features too, which leads to wrong matches in the point cloud. To avoid this, the images are cropped to remove the background and keep the object alone. Since all the pictures are taken with a static camera, we can assume that all of them have the object at the same position, which made the cropping easier.
Another important part of the preprocessing is the type of filtering used. Since the images are taken with a high-resolution camera in our own setup, there is not much noise in the pictures; hence, we decided to avoid low-pass filtering. Feature points are usually edges, corners or texture differences, and in order to detect these, the images must retain their high-frequency components. Therefore, we decided to use a sharpening filter. We tested different sharpening filters such as Sobel, Prewitt, etc. In the end, we decided to use the unsharp filter, a contrast enhancement filter, as it yielded better results than the others.
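The idea behind unsharp filtering can be illustrated in one dimension: subtract a blurred copy from the signal and add the difference back, which amplifies the high-frequency (edge) content. A small plain-Python sketch with illustrative parameters (not Matlab's fspecial('unsharp') kernel):

```python
def unsharp_1d(signal, amount=1.0):
    """Unsharp masking on a 1D signal: sharpened = original + amount *
    (original - blurred), with a simple 3-tap moving average as the blur."""
    n = len(signal)
    blurred = [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]

# A step edge: sharpening overshoots on both sides of the transition,
# which is what makes edges and corners easier to detect afterwards
step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print([round(v, 2) for v in unsharp_1d(step)])
# [0.0, 0.0, -0.33, 1.33, 1.0, 1.0]
```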
3.3 Feature Detection and Matching – H.R.
Reviewing different feature detector algorithms, the Harris Corner Detector seemed to be a popular choice. As mentioned above, this algorithm was also shown to perform well when measuring its repeatability, and it is invariant to rotation, which also makes it appropriate for our project. We therefore chose this method for the feature detection.

The implementation of the Harris Corner Detector in Matlab was done using a function one of the project members had implemented in a previous assignment to stitch two photographs together. With some modifications, this code could be used for this project as well. Many implementations of the Harris Corner Detector can be accessed via the MathWorks website; one of these [10] was also tested for comparison, but since there was no distinct difference, we chose to go with the code we implemented ourselves.
The basic idea of the function findHarrisCorner is to find patches in the image where a small displacement of the patch would yield a large difference in intensity, see Figure 10.
Figure 10: Principle of Harris Corner Detector [5]
The algorithm uses an autocorrelation function calculating the change of intensity for a certain shift of a window or patch [3]. For nearly constant patches this function will be close to 0, and for distinctive patches it will be large. By combining the derivatives in each direction it creates a 2x2 matrix, M = [Ix^2, IxIy; IxIy, Iy^2], where Ix and Iy are the gradients of the image [3]. The image points are then classified by the eigenvalues λ1 and λ2 of M. If λ1 and λ2 are small we have a region of no interest, but if λ1 and λ2 are large we have a large intensity difference in every direction of the image, i.e. a corner. Using the eigenvalues, a corner response can be calculated which describes how well a point in the image will serve as a feature point. The corner response, R, is given by the following equation:

    R = det(M) - k (trace(M))^2
The algorithm also thresholds the value of R with non-maximum suppression, which throws away weaker scores. This threshold needs to be set manually depending on which feature points one wants to find: a threshold below -10000 will result in the edges, while a threshold above 10000 will result in the corners of the image. A feature point is then defined if the score is high enough. The results of using different thresholds can be seen in figures 4-6 below. In the end, a threshold of 5e7 gave a satisfactory result.
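The corner response computation can be sketched as follows. This plain-Python sketch works on precomputed gradient samples for a patch, with uniform window weights for simplicity; it does not reproduce the Matlab code's Gaussian smoothing:

```python
def harris_response(Ix, Iy, k=0.04):
    """Corner response R = det(M) - k * trace(M)^2, where M sums the
    gradient products Ix^2, Ix*Iy and Iy^2 over the patch."""
    sxx = sum(ix * ix for ix in Ix)
    syy = sum(iy * iy for iy in Iy)
    sxy = sum(ix * iy for ix, iy in zip(Ix, Iy))
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# Corner-like patch: strong gradients in both directions -> positive R
print(harris_response([3, 0, 3, 0], [0, 3, 0, 3]) > 0)   # True
# Edge-like patch: gradients in one direction only -> negative R
print(harris_response([3, 3, 3, 3], [0, 0, 0, 0]) < 0)   # True
```

The sign behaviour matches the thresholding described above: negative responses correspond to edges, large positive responses to corners.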
To match the feature points, a function called matchbycorrelation [5] was used. This function takes the feature points given by the Harris Corner Detector and tries to match these points between the image pairs. The algorithm correlates an image patch around the feature point, called a feature vector (size given by the parameter windowSize, in our case set to 5), with the feature vectors in another frame. The algorithm only chooses matches where the correlation is maximized, i.e. it looks for the maximum value of the correlation matrix to find the strongest match between the feature vectors. The algorithm also checks that this correlation is maximized in each direction, i.e. feature vector 1 in image 1 correlates to feature vector 1 in image 2 and the other way around.

Depending on the parameters windowSize, sigma and threshold, the algorithm will be more or less discriminating when choosing feature points. WindowSize defines the size of the Gaussian filter used for smoothing the image and also the size for the non-maxima suppression, which finds the local maxima for the thresholding part of the Harris Corner Detector. Sigma defines the standard deviation of the Gaussian kernel; a larger value of sigma will result in a smoother image.
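The mutual-maximum check described above can be sketched like this (plain Python with a toy correlation matrix of hypothetical values; not the actual matchbycorrelation code):

```python
def mutual_best_matches(corr):
    """Keep a match (i, j) only if j is the best match for feature i
    (row-wise maximum) AND i is the best match for j (column-wise maximum)."""
    n_rows, n_cols = len(corr), len(corr[0])
    matches = []
    for i in range(n_rows):
        j = max(range(n_cols), key=lambda c: corr[i][c])       # best in row i
        i_back = max(range(n_rows), key=lambda r: corr[r][j])  # best in column j
        if i_back == i:
            matches.append((i, j))
    return matches

# corr[i][j]: correlation of feature i (image 1) with feature j (image 2)
corr = [
    [0.9, 0.2, 0.1],
    [0.8, 0.3, 0.2],   # row max is column 0, but column 0 prefers row 0
    [0.1, 0.2, 0.7],
]
print(mutual_best_matches(corr))   # [(0, 0), (2, 2)]
```

The second feature is left unmatched because the correlation is not maximal in both directions, which is the consistency check described above.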
The value of the parameter threshold should be larger than 1e4 to detect corners in the image. The constant k in the corner response is typically chosen between 0.04 and 0.06.
The function matchbycorrelation may allow false matches of image points that do not correspond at all. There could also be a problem when the algorithm locates feature points on the ground, since we only care about the feature points on the object itself. Both problems can be solved by using the RANSAC algorithm, which tests the data and chooses which points are inliers and which are outliers.
PART B
3.4 RANSAC Algorithm - M.J.
In this project the RANSAC algorithm is used to calculate a fundamental matrix for each image pair. The fundamental matrix is a description of the epipolar geometry of the frame set, i.e. the geometry of stereo vision [5]. In the process, 8 matches are randomly selected and assumed to be inliers. From these, a fundamental matrix is constructed, making it possible to relate point correspondences between the two images. The remaining data is tested against the matrix, and matches whose points are within an accepted distance of each other's epipolar lines are considered inliers, see Figure 11. As mentioned above, the model with the highest number of inliers, i.e. the best fit, is in the end chosen and used in the 3D reconstruction. Using an incorrect fundamental matrix in the reconstruction phase will place points in false positions and distort the final result, see Figure 12. An incorrect matrix corresponds to a homography between two frames with correspondences that do not exist, thus resulting in false placement of matches in 3D space [4, 5].
In 3D image reconstruction, RANSAC plays a crucial part in determining true correspondences between two frames. From the feature extraction, an attempt is made to match the features between the two frames, i.e. checking whether they represent the same point in the respective images. Unfortunately, this does not always work, as the information in a feature might correspond to similar conditions in another region of the next image, resulting in a false match. Lines between two feature points represent the matching points, and as one can see, the two points do not always correspond across the images.
Figure 11: Inliers and outliers [4]
Figure 12: Epipolar lines, true and false [4]
When letting the RANSAC algorithm evaluate the probability of each matching pair, these outliers are hopefully removed. In the algorithm one can change the threshold coefficient to tune which data is kept. This is a trade-off between the amount of noise and good information: if the threshold is too low, the risk of keeping outliers is increased, and if it is too high, inliers might be excluded [6].
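The hypothesize-and-verify loop is the same regardless of the model being fitted. As a minimal illustration of the inlier-counting principle only — using a simple 2D translation model in place of the fundamental matrix, so this is not our actual estimator — a sketch could be:

```python
import numpy as np

def ransac_translation(p1, p2, n_iter=200, tol=2.0, seed=None):
    """Toy RANSAC over putative 2D matches p1, p2 (both (N, 2) arrays).

    A pure translation stands in for the fundamental matrix here; the
    sample / score / keep-best structure is what RANSAC always does."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(p1), bool)
    best_t = np.zeros(2)
    for _ in range(n_iter):
        # 1. sample a minimal set (one match fixes a translation)
        i = rng.integers(len(p1))
        t = p2[i] - p1[i]
        # 2. score the hypothesis: matches within tol are inliers
        err = np.linalg.norm(p1 + t - p2, axis=1)
        inliers = err < tol
        # 3. keep the model supported by the most inliers
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_t = inliers, t
    return best_t, best_inliers
```

With 80 % of the matches consistent with one translation, the loop recovers that translation and flags the corrupted matches as outliers.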
Epipolar Geometry
When RANSAC evaluates the truthfulness of a match, it calculates its coordinates using epipolar geometry. Epipolar geometry is a fundamental concept in 3D imaging, as it describes the same object in different frames and how the frames are related. In epipolar geometry, a point seen by camera 1 extends a projection line towards infinity; in Figure 12 this is represented by the vector C1-Q1-X. Using the knowledge of the straight line between the camera centers C1-C2, with corresponding epipoles e1 and e2, camera 2 will view our point as a line, the epipolar line, in its frame. This line can be calculated from the position of its epipole e2 and any point where camera 2 projects a line onto C1-Q1-X. If the second point of an assumed match lies somewhere on this line, the match of the two points from the two views is true and is treated as an inlier [4, 5].
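The inlier test described above reduces to measuring the distance from a point to the epipolar line l = F x1 of its partner, in both directions. A small sketch, assuming a known fundamental matrix F (the function names are our own):

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Distance from x2 to the epipolar line F @ x1 in the second image.
    x1, x2 are inhomogeneous (x, y) points; F is a 3x3 fundamental matrix."""
    p1 = np.array([x1[0], x1[1], 1.0])
    a, b, c = F @ p1                  # epipolar line a*x + b*y + c = 0
    return abs(a * x2[0] + b * x2[1] + c) / np.hypot(a, b)

def is_inlier(F, x1, x2, tol=1.0):
    # a match is kept only if each point lies near the other's epipolar line
    return (epipolar_distance(F, x1, x2) < tol
            and epipolar_distance(F.T, x2, x1) < tol)
```

For a rectified stereo pair with a pure horizontal translation, F = [[0,0,0],[0,0,-1],[0,1,0]] and the epipolar line of (x1, y1) is simply y2 = y1, which makes the check easy to verify by hand.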
3.5 Camera Calibration - Y.R.
To retrieve the camera projection matrix, several images of a planar checkerboard were taken at different positions and distances. The requirements were a digital camera (Canon D503), a checkerboard and good lighting in the room. The parameters needed to make the projection matrix are the intrinsic camera parameters: focal length fc, principal point cc, skew αc and distortion kc.
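These intrinsic parameters combine into the 3x3 camera matrix used in the projection P = K[R|t]; the layout below follows the common toolbox convention K = [fc(1), αc*fc(1), cc(1); 0, fc(2), cc(2); 0, 0, 1]. An illustrative Python sketch (lens distortion kc is ignored here, and the function names are our own):

```python
import numpy as np

def intrinsic_matrix(fc, cc, alpha_c=0.0):
    """Assemble the 3x3 intrinsic matrix from focal lengths fc (pixels),
    principal point cc and skew coefficient alpha_c."""
    return np.array([
        [fc[0], alpha_c * fc[0], cc[0]],
        [0.0,   fc[1],           cc[1]],
        [0.0,   0.0,             1.0],
    ])

def project(K, R, t, Xw):
    """Project a 3D world point to pixel coordinates with P = K [R | t]."""
    x = K @ (R @ np.asarray(Xw, float) + np.asarray(t, float))
    return x[:2] / x[2]
```

For example, a point (0.1, 0.2, 1.0) in front of an identity-pose camera with fc = (1000, 1000) and cc = (320, 240) lands at pixel (420, 440).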
The Camera Calibration Toolbox for Matlab enables us to load calibration images, extract image corners, run the main calibration engine and display the results.
Figure 13: Two modes of operation to load the images [6]
The first step in using this toolbox is to load the images into Matlab using one of two modes, Standard or Memory efficient [8]. We used the second one (Memory efficient). This mode enables us to load the images one by one. After clicking on memory efficient mode, the main calibration toolbox window appears on screen, see Figures 13 and 14:
Figure 14: Main calibration toolbox [6]
The second step is to convert the images from color (jpg) to grayscale, read them in 2D and save them with the tif extension. See Figure 15:
Figure 15: Conversion of an image from color to grayscale to reduce the size
By clicking on Image names, all the images (with tif extensions) are loaded and read. In the end, we can see all the calibration images, as shown in Figure 16.
Figure 16: Calibration Images
The image corners are extracted by clicking on Extract grid corners in the camera calibration tool window and then clicking on the four extreme corners of the checkerboard pattern. The parameters that need to be selected in the program are: the default window size of the corner finder (wintx, winty), where the window size was 73x73; the sizes dX and dY in X and Y of each square in the grid; the number of squares along the X and Y directions; and the initial guess for the distortion [8].
The final detected corners after clicking on the fourth corner of the checkerboard can be seen in Figure 17.
Figure 17: Final Extraction Image
3.6 Point Cloud - P.K.
The point cloud is a representation of the 2D points in 3D space. The point cloud is generated using triangulation: the algorithm projects the 2D points back into object space using projection matrices computed from the camera calibration parameters. From RANSAC, we get the inliers in the image pairs. The next task is to construct the 3D point cloud from these image pairs. For this, we need two projection matrices that project the inliers of the images into object space. These are obtained from the camera calibration parameters of the first image pair. After obtaining the projection matrices, triangulation is applied to the true point correspondences from the RANSAC algorithm to project them into the 3D cloud. As explained in the theory part on triangulation, the true matches are two perspectives of the same object, and the angles corresponding to these perspectives are given by the projection matrices. The 3D point for the image pair is obtained in this way.
For the rest of the image pairs, the same process is applied to compute the point cloud, as shown in Figure 18.
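The triangulation step can be written as a standard linear (DLT) triangulation: each image point contributes two rows to a homogeneous system whose least-squares solution (via SVD) is the 3D point. A Python sketch of this idea — not our exact Matlab routine:

```python
import numpy as np

def triangulate(P1, x1, P2, x2):
    """Linear (DLT) triangulation of one correspondence.

    P1, P2 are 3x4 projection matrices; x1, x2 are (x, y) image points.
    Each view gives two equations x*P[2] - P[0] = 0, y*P[2] - P[1] = 0."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # the homogeneous solution is the right singular vector of the
    # smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

A quick round trip — project a known 3D point into two synthetic cameras and triangulate it back — recovers the original point to numerical precision.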
There might be some wrong matches left by RANSAC, which can lead to a distorted 3D surface. In order to remove outliers, the point cloud is passed through a thresholding function. The thresholding function works on the principle that the position of the object in 3D space is known, and anything outside these coordinates is removed from the point cloud.
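Under the assumption that the object's bounding box in object space is known, this thresholding amounts to a per-axis range check. A minimal sketch (the function name and bounds are illustrative):

```python
import numpy as np

def threshold_point_cloud(points, lo, hi):
    """Keep only points inside the known bounding box of the object.

    points is an (N, 3) array; lo and hi are per-axis bounds. Everything
    outside [lo, hi] on any axis is treated as an outlier and dropped."""
    points = np.asarray(points, float)
    keep = np.all((points >= lo) & (points <= hi), axis=1)
    return points[keep]
```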
Figure 18: Pseudo code for the method. Mij and Rij have two sets of coordinates corresponding to the ith and jth images.
3.7 3D Modeling - P.K.
After computing the 3D point cloud, the next goal is to obtain a good 3D visualization of the object. The point cloud is a three-dimensional set of points with x, y and z coordinates as columns. We plotted the point cloud using the plot3 command in Matlab, which gives a quite good representation of the object. The point cloud shows all aspects of the object in object space, such as a good representation of the top and side views, which is the ultimate aim of the 3D model. The 3D representation of the point cloud was also able to provide some information regarding the transparency of the object, to some extent.
We tried to model the object using different approaches. One of the methods we tested is Delaunay triangulation. The 3D coordinates from the point cloud are passed to the built-in function in Matlab to create the tetrahedra that make up the triangulation. From the Delaunay tetrahedra, the 3D surface is then plotted using trisurf. Some basic commands to clean up the plot, such as shading interp, are also applied. But this did not yield a better representation of the object. Some 3D modeling software was also tested to obtain a better visualization from our point cloud data, but most of it was unsuccessful in providing good results. The different approaches tested for the visualization are detailed in the "Discussion" part of the report. We concluded that the best result for 3D modeling based on feature detection and matching was the 3D point cloud plot.
Preprocess N images
Feature Detection in N images
FOR i=1:N-1 and j=2:N
    Matches Mij between the ith and jth images
    RANSAC algorithm on the matches - Rij
END
Load camera matrix
Projection Matrix P1 using camera parameters of the 1st image
Projection Matrix P2 using camera parameters of the 2nd image
FOR i=2:N-1 and j=3:N
    Point Cloud Xwij = Triangulation(P1, P2, Rij)
    ThresholdingFunction(Xwij)
END
4 Results
Image before pre-processing
Perfume Bottle    Steel Thread
Image after reducing resolution, cropping and filtering
Perfume Bottle    Steel Thread
Feature points and matches
The results from the feature detection and matching using different parameters can be seen below.
Harris Corner Detection using windowSize 5, sigma 5 and threshold 2e7
Harris Corner Detection using windowSize 5, sigma 5 and threshold 2500
Harris Corner Detection using windowSize 3, sigma 5 and threshold 5e7
Harris Corner Detection using windowSize 3, sigma 5 and threshold 3e5 on the steel thread object
Using the result from the Harris Corner Detector we can then match the points across the frames:
Feature point matching across images using maximized correlation
RANSAC
Corresponding matches with RANSAC.
Camera Calibration
In the calibration, five randomly selected images of the checkerboard were chosen. Using the built-in Matlab toolbox for camera calibration, the following parameters could be extracted:
Focal length: [x, y] = [5465.75805, 6012.87096] ± [669.72529, 794.32774]
Principal point: [x, y] = [2360.16809, -315.08749] ± [0.00000, 0.00000]
Skew: αc = [0.00000] => angle of pixel axes = 90.00000 degrees
Distortion: [0.14119, -0.06530, -0.03627, 0.00787, 0.00000] ± [0.06312, 0.06263, 0.01306, 0.00524, 0.00000]
Pixel error: [1.52000, 0.91932]
Point cloud
Results from the point cloud algorithm are shown below. The final result can be rotated in a 3D coordinate system in Matlab. Below we present the point cloud before thresholding out the outliers, and three different views of the point cloud after thresholding: front view, top view and side view.
Perfume bottle before thresholding    Steel thread before thresholding
Front view of the perfume bottle    Front view of the steel thread
Top view of the perfume bottle    Top view of the steel thread
Side view of the perfume bottle    Side view of the steel thread
Surface reconstruction
Since the surface reconstruction problem wasn't fully solved, we show two of the attempts we made at reconstructing the surface below. K-means was an approach for finding dense clusters in the point cloud; its result can also be seen below. Another attempt was to reconstruct the surface using Delaunay triangulation.
An attempt to find critical points using K-means
Attempt to reconstruct the surface of the perfume bottle and the heart using Delaunay triangulation.
5 Discussion
The scene is important for performing good 3D reconstruction. There has to be a clean background behind the object so that no feature points are detected that do not belong to the object, i.e. on the ground and walls. It is also important that the object has characteristics that can be used as feature points. That is why we tried several different objects, such as a plastic bottle, a steel thread, a glass horse, different types of perfume bottles, a banana and a pineapple. In the end we decided to go with the objects that seemed to give the best results: the perfume bottle and the steel thread.
Different combinations of the feature detection parameters were tested. Using too high a threshold results in well-defined feature points, but not enough of them for the reconstruction part. Lowering the threshold does not yield a better result either, since the algorithm then accepts points that are not really good feature points. What we need is a smaller window size that finds the best feature points while still keeping the threshold high. The optimal parameters for the perfume bottle were found to be windowSize = 3, sigma = 5 and threshold = 3e5, and for the steel thread windowSize = 3, sigma = 5 and threshold = 3e5.
These parameters should be optimized rather than found by testing in order to improve accuracy. It seems, though, that the parameters have to be set according to the specific task, making a general optimization difficult. They also had to be set differently depending on the object; for example, since the steel thread had neither sharp corners nor texture, the feature detection had trouble finding true feature points, which resulted in a lower threshold.
The matchbycorrelation function was a satisfying algorithm for localizing the features across the image pairs. It did, however, make some mistakes, matching image points that do not correspond at all (see the lines with large slopes in the Results part). There is also a problem when the algorithm locates feature points on the ground, since we only care about the feature points on the object itself.
To correct the false matches we used the RANSAC algorithm. Compared to the matches without RANSAC, it can be seen that there are no longer any false correspondences in the form of diagonals or mixed-up feature points. These points were instead classified as outliers by RANSAC.
We tested different methods for projecting the 2D points into the 3D point cloud. The first method used the DLT algorithm and triangulation, computing the projection matrices from the RANSAC inliers rather than from the camera projections. Since this did not yield a promising result, we computed the 3D point cloud with the triangulation alone. The triangulation was done using the camera parameters given by the camera calibration toolbox, combined into a camera matrix. This resulted in a point cloud that represented the object well.
The next step, fitting the point cloud to a surface model, turned out to be more difficult than we estimated at the beginning of the project. When searching for a solution we found that this is not trivial, but we decided to try to solve it anyway. Using the built-in triangulation methods of Matlab seemed like a suitable solution to start with. These methods are supposed to fit a mesh of triangles building a 2D surface in 3D space. However, due to the high number of points, the algorithms had problems producing the smooth surface we wanted. This led to a representation of a bulky surface rather than a volume. The algorithm was not capable of interpreting the point cloud data as an object with different sides, so no correct representation could be made.
To build a good mesh grid, one needs a data set with a low error rate and without redundant data points, which would otherwise lead to an over-representation of a simple model and increased complexity. The difficulty with our point cloud is that it contains a lot of noise, i.e. points that are not placed on the true object. The first approach to solving this problem was to restrict the parameters set for the image filtering, feature detection and RANSAC in order to allow few but correct points into the point cloud. This lowered the number of 3D points but was not enough, nor a guarantee that only redundant information was filtered out. To further reduce the noise, a threshold algorithm was implemented which checked the distances between neighboring points and then, by a threshold, decided whether a point should be kept or thrown away as noise. The result, which can be seen above, turned out to be effective in removing most of the noise.
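Such neighbor-distance thresholding can be sketched as a brute-force filter that keeps a point only if enough other points lie nearby; the radius and neighbor count below are illustrative, not the values we actually used:

```python
import numpy as np

def denoise_by_neighbor_distance(points, radius, min_neighbors=3):
    """Drop isolated points from an (N, 3) cloud.

    A point survives only if at least `min_neighbors` other points lie
    within `radius` of it. Brute-force O(N^2) distances for clarity."""
    points = np.asarray(points, float)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = (d < radius).sum(axis=1) - 1  # exclude the point itself
    return points[neighbors >= min_neighbors]
```

Dense regions (the object) pass the test, while scattered triangulation outliers have too few neighbors and are removed.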
After removing most of the noise, a second approach was tested. Since the mesh algorithm was unable to connect the points in the point cloud, we tried to construct a frame of correlated critical points before applying the mesh function. By finding all the corners and connecting them, one should be able to filter out any point that is not within a predefined distance of the line between two critical points. In order to create a base frame from the point cloud, two methods were chosen to analyze the data: K-means and the Self-Organizing Map (SOM).
The K-means method calculates the Euclidean distance from a reference point to the rest of the points. The aim is to establish a representation where all data points are divided into a predefined number of clusters, such that every point belonging to a cluster has minimum distance to the cluster core. The Self-Organizing Map is, like K-means, a classification algorithm. Instead of creating clusters, this method tries to fit an NxM or NxN grid to a data set. Each crossing point is individually movable but still connected to its neighbors. The method attempts to fit this grid to the position of each data point and is thus able to create a 3D topology of a 3D structure. The reason for using a SOM is its ability to form a surface and to fold around a data set, creating a volumetric surface.
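As a concrete reference for the K-means step, a plain Lloyd's-iteration sketch in Python/NumPy — not the Matlab kmeans we called — could look like this:

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=None):
    """Plain Lloyd's k-means: alternate assigning each point to its
    nearest centroid (Euclidean distance) and moving each centroid to
    the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, float)
    # initialize centroids with k distinct data points
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    # final assignment consistent with the returned centroids
    d = np.linalg.norm(points[:, None] - centroids[None], axis=2)
    return centroids, d.argmin(axis=1)
```

On two well-separated blobs the centroids converge to the blob means, which is exactly the "dense cluster" behavior we were after in the point cloud.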
Unfortunately, neither of these methods could be implemented in such a way that the final result could be evaluated. With a deeper understanding of these classification methods, there is a possibility that the 3D surface of the object could be created from the noisy point cloud.
6 Conclusion
The 3D reconstruction algorithm using feature points is an effective method for reconstructing the surface of an object. We explored this method and found it not to be as trivial as we thought at the beginning, but we did come to some interesting conclusions.
The Harris Corner Detector worked well on both objects. The parameters set for this algorithm are crucial since they strongly affect the result. This is important not only for finding the correct matches but also because the RANSAC algorithm needs at least 8 matches to run.
The point cloud can be created using the camera projection matrix, which was created from two frames together with the intrinsic parameters of the camera. Since the point cloud is noisy, the visualization of its surface was found to be non-trivial and needs further exploration.
According to the results we got, this method of 3D reconstruction was proven applicable to both semi-transparent and opaque objects, at least for creating a point cloud. If the point cloud could be visualized easily in 3D, this would be a method well suited for general 3D reconstruction.
7 Future Work
Future work for our project would be to optimize the threshold and other parameters for both the Harris Corner Detector and RANSAC, as we currently set these manually according to the images. Another task would be to eliminate wrong matches by using a discriminative algorithm together with RANSAC. The Harris Corner Detector could also be combined with other descriptors, such as the SIFT descriptor, which is used a lot in reconstruction. The thresholding of the point cloud could be further optimized by looking at the center of mass to calculate which points are likely to be on the surface of the object. Further, the camera calibration can be improved by testing other algorithms and toolboxes. The largest remaining task is the visualization of the noisy point cloud, which was not as trivial as we first assumed. Here one has to explore different data analysis algorithms such as K-means and Self-Organizing Maps, briefly described in the "Discussion".
8 References
Web Sources
[1] Perez, U., Cho, S., Asfour, S. (2009). Volumetric Calibration of Stereo Camera in Visual Servo Based Robot Control. Visited on 18-05-2013. Available at: http://cdn.intechopen.com/pdfs/6281/InTech-volumetric_calibration_of_stereo_camera_in_visual_servo_based_robot_control.pdf
[2] Zhengyou Zhang. Camera Calibration. Chapter 2. Available at: http://cronos.rutgers.edu/~meer/TEACHTOO/PAPERS/zhang.pdf
[3] R. Collins. (N.A.). Lecture 06: Harris Corner Detector. Available at: http://www.cse.psu.edu/~rcollins/CSE486/lecture06.pdf
[4] Moons, T., Vergauwen, M., Van Gool, L. (2008). 3D Reconstruction from Multiple Images. Visited on 15-04-2013. Available at: ftp://ftp.esat.kuleuven.ac.be/psi/visics/konijn/ICVSS08/vangool.pdf
[5] Szeliski, R. (2012). Computer Vision: Algorithms and Applications. Springer. Visited on 15-04-2013. Available at: http://szeliski.org/Book/drafts/SzeliskiBook_20100903_draft.pdf
[6] Fischler, M. A., Bolles, R. C. (1981). Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24(6).
[7] Central Panoramic Camera. Visited on 18-04-2013. Available at: http://cmp.felk.cvut.cz/demos/Omnivis/CenPanCam.html
[8] Jean-Yves Bouguet. (2010). Camera Calibration Toolbox for Matlab. Visited on 18-04-2013. Available at: http://www.vision.caltech.edu/bouguetj/calib_doc/
[9] P. Kovesi. Matlab and Octave Functions for Computer Vision and Image Processing. Available at: http://www.csse.uwa.edu.au/~pk/Research/MatlabFns/index.html#spatial
[10] Apisit A. and Jednipat M. (2009). Matlab Tutorial. Available at: http://se.cs.ait.ac.th/cvwiki/matlab:tutorial:3d_reconstruction_with_calibrated_image_sequences
[11] Geometric Camera Parameters. Visited on 25-04-2013. Available at: http://www.cse.unr.edu/~bebis/CS791E/Notes/CameraParameters.pdf
[12] Wikipedia (2013-04-24). Triangulation. Visited on 15-04-2013. Available at: http://en.wikipedia.org/wiki/Triangulation
Figures
[1] Zeng, G.L. (2001). Image Reconstruction - a Tutorial. [electronic] Computerized Medical Imaging and Graphics, vol 25, pp 97-123. Available at: http://www.guillemet.org/irene/coursem/simpletomo.pdf [2013-04-15]
[2] University of Nevada, Reno. Computer Vision Course Notes. CE791E. (2004). Geometric Camera Parameters. Available at: http://www.cse.unr.edu/~bebis/CS791E/Notes/CameraParameters.pdf [2013-04-25]
[3] Wikipedia (2013-04-24). Triangulation. Available at: http://en.wikipedia.org/wiki/Triangulation [2013-04-15]
[4] T. Moons, M. Vergauwen, L. Van Gool. (2008). 3D Reconstruction from Multiple Images. Available at: ftp://ftp.esat.kuleuven.ac.be/psi/visics/konijn/ICVSS08/vangool.pdf [2013-04-18]
[5] R. Collins. Lecture 06: Harris Corner Detector. Available at: http://www.cse.psu.edu/~rcollins/CSE486/lecture06.pdf
[6] Jean-Yves Bouguet. (2010, last updated July 9th). Camera Calibration Toolbox for Matlab. Available at: http://www.vision.caltech.edu/bouguetj/calib_doc/ [2013-04-18]
[7] Zhengyou Zhang. Camera Calibration. Chapter 2. Available at: http://cronos.rutgers.edu/~meer/TEACHTOO/PAPERS/zhang.pdf
9 Evaluation of References
We decided to use the Matlab tutorial [10] since it defined our problem well and was divided into four different steps, so we could easily divide the work between us. It can be considered a good tutorial since its method seems to be often used in reconstruction contexts.
The references [1, 4 and 5] are well-known works in the field of computer vision, meaning that their quality and correctness have been verified. They provided some basic knowledge about 3D image processing, describing the various methods for 3D image acquisition, processing and modeling in simple words. A lecture presentation [3], where the information given is a briefer explanation of what is given in the books, was also used. What can be criticized about these references is that they were published two to four years ago and might therefore not be state of the art, but compared to the methods found when searching the web, this still seems to be a good approach to solving the 3D reconstruction problem.
For the camera calibration, [8, 11] provided good references. The website [8] helped in understanding how to calibrate the parameters of the camera using different methods. The reference [11] helped us understand the parameters of the camera, including the focal length, skew and other parameters.
Appendix
Matlab Code
Instructions on running the Matlab code
Run main.m. All parameters are set according to what we found resulted in the best reconstruction. The script is divided into cells to be able to run separate parts of the code, e.g. the Harris Corner Detector.
The parameters that can be altered in order to get different results are:
directory: specifies the directory of the image files to be used. We used two different objects; their corresponding images are located in 'steel thread' and 'perfume'. Change between these to choose which object to reconstruct.
windowSize: the size of the kernel that will be used for the Gaussian smoothing and the non-maxima suppression.
threshold: value set for the Harris Corner Detector; should be larger than 1e4 to get corners.
sigma: defines the standard deviation used for the Gaussian kernel.
main.m

%% clean up workspace and close all figures
clc
clear all;
close all;

%% read and process all the images:
% specify directory where the images are
%directory = 'perfume';
directory = 'perfume';
imMat = fpreprocessing(directory);
num_images = size(imMat,2);

%% run this cell to see the result of the preprocessing of the first two images
figure
subplot(1,2,1);
imshow(imMat{1});
title('first image');
subplot(1,2,2);
imshow(imMat{2});
title('second image');

%% Harris Corner Detector: find the corners for all images
% parameters for the Harris corner detector
% window size for the kernel of the Gaussian filter and the non-maxima
% suppression
windowSize = 3;
% standard deviation for the Gaussian kernel
sigma = 5;
% threshold for the local maxima
threshold = 3e5;
% initialize
uv = cell(1,num_images);
row = cell(1,num_images);
col = cell(1,num_images);
% loop through all the images and find their respective corners
for num = 1:num_images
    uv{num} = findHarrisCorners((imMat{num}),windowSize,sigma,threshold);
    if (uv{num}(:,1))
        row{num} = uv{num}(:,1);
        col{num} = uv{num}(:,2);
    else
        row{num} = [];
        col{num} = [];
    end
end

%% display the corners of the first image pair
figure;
subplot(1,2,1);
imshow(imMat{1});
title('harris corners on the first image');
hold on
plot(col{1},row{1},'.y');
subplot(1,2,2);
imshow(imMat{2});
title('harris corners on the second image');
hold on
plot(col{2},row{2},'.y');

%% match the feature points between all pairs by using maximized correlation
k = 1;
for i = 1:num_images-1
    [m{k} m{k+1}] = matchbycorrelation(imMat{i}, [row{i}';col{i}'], imMat{i+1}, [row{i+1}';col{i+1}'], 5);
    matches{i} = [m{k}(2,:); m{k}(1,:); m{k+1}(2,:); m{k+1}(1,:)];
    k = k+2;
end

%% display the matches
f = displayMatches(imMat{1},imMat{2},matches{1}','matches between corners');

%% RANSAC
k = 1;
for i = 1:num_images-1
    m{k}(3,:) = ones(1,size(m{k},2));
    m{k+1}(3,:) = ones(1,size(m{k+1},2));
    [F{i}, X1{i}, X2{i}] = RANSACF(m{k}, m{k+1});
    matchesRANSAC{i} = [X1{i}(2,:); X1{i}(1,:); X2{i}(2,:); X2{i}(1,:)];
    k = k+2;
end
Fgs = gsFundamental(F{1}, X1{1}, X2{1});

%% to show the RANSAC matches
mRANSAC = matchesRANSAC{1};
f = displayMatches(imMat{1},imMat{2},mRANSAC','matches between corners');

%% camera calibration
% We used two different cameras for the different objects so we need to
% change the parameters according to which object we want to reconstruct
if strcmp(directory,'perfume')
    load('Calib_Results1.mat');
    kk = [fc(1) 0 cc(1); 0 fc(2) cc(2); 0 0 1];
    k = 1;
else
    load('Calib_Results1.mat');
    kk = [fc(1) 0 cc(1); 0 fc(2) cc(2); 0 0 1];
    k = 1;
end
for num = 1:num_images-1
    % projection matrices for the first two images
    P1 = kk*[Rc_1,Tc_1];
    P2 = kk*[Rc_2,Tc_2];
    m1 = matchesRANSAC{num}(1:2,:);
    m2 = matchesRANSAC{num}(3:4,:);
    for i = 1:size(m1,2)
        % calculate the triangulation given the projection matrices and the
        % matched points
        M(k,:) = triangulation(P1, m1(:,i), P2, m2(:,i));
        k = k+1;
    end
end

%% 3D plot
% plots the point cloud and should also show the surface of the object
figure;
plot3(M(:,1), M(:,2), M(:,3),'+');
title('3D point cloud of the object');
points = thresholding(M, directory);
%vol3d(points);
figure;
plot3(points(:,1), points(:,2), points(:,3),'+');
title('3D point cloud of the object');
set(gca,'DataAspectRatio',[1 1 1]);
[i,c] = kmeans(points,100);
figure
plot3(c(:,1), c(:,2), c(:,3),'+');
figure
tri = delaunay(c(:,1), c(:,2));
trisurf(tri, c(:,1), c(:,2), c(:,3));
shading interp;
fpreprocessing.m

function grayimage = fpreprocessing(d)
% Read all images in the given directory and save them in a cell array.
% The output is the cell array grayimage which contains all the
% preprocessed images. For accessing a preprocessed image use
% grayimage{1} and so on.
% Input:  directory where the images are located
% Output: cell array of preprocessed images, 1 x number of images
file_directory = d;
D = dir(file_directory);
imageName = strcat(file_directory,'/',D(4).name);
% remove the first three elements in the listing since they are not images
for k = 1:2
    D(k) = [ ];
end
D(1) = [ ];
num_images = length(D);
% initializing
image_sequence = cell(1,num_images);
% sharpening filter
filter = fspecial('unsharp');
% loop for preprocessing all the images
for num = 1:num_images
    % read the images and reduce the resolution to 1/10
    image{num} = imresize(imread(strcat(file_directory,'/',D(num).name)),0.1);
    % crop the image depending on the directory
    if strcmp(file_directory,'steel thread');
        image_sequence{num} = image{num}(150:300,200:300,:);
    else
        image_sequence{num} = image{num}(1:250,150:300,:);
    end
    % convert to gray scale
    grayimage{num} = rgb2gray(image_sequence{num});
    % sharpening filtering
    grayimage{num} = imfilter(grayimage{num},filter);
end
% For seeing any of the preprocessed images, use the following code.
% The index can be any number from 1 to the number of images.
imshow(image{1});
figure
imshow(image_sequence{1});
figure
imshow(grayimage{1});
figure
end
findharriscorners.m

function uv = findHarrisCorners(im, windowSize, sigma, threshold)
% This function finds the Harris corners in a gray-scale input image.
% L - number of Harris corners found in the image, each is considered
% as a feature point in the rest of the exercise.
%
% inputs:
%   im         : gray-scale input image
%   windowSize : a positive scalar value that defines the window size for
%                both Gaussian filtering and non-maxima suppression.
%   sigma      : a positive scalar that defines the standard deviation of
%                the Gaussian kernel.
%   threshold  : the threshold value that is used for non-maxima suppression.
%
% outputs:
%   uv : (L x 2) matrix where each line is the <u,v> coordinates of a
%        detected Harris corner.

% compute the gradients
dx = [-1 0 1;
      -1 0 1;
      -1 0 1];   % filter to compute dx
dy = dx';        % filter to compute dy
Ix = conv2(double(im), dx, 'same');
Iy = conv2(double(im), dy, 'same');

% compute A, B and C images
gaussianFilter = fspecial('gaussian', windowSize, sigma);
A = conv2(Ix.^2,  gaussianFilter, 'same');
B = conv2(Iy.^2,  gaussianFilter, 'same');
C = conv2(Ix.*Iy, gaussianFilter, 'same');

height = size(im,1);
width  = size(im,2);
H = zeros(height, width);
[I,J] = ind2sub([height,width], 1:width*height);

% compute Harris scores for each pixel
tic
for i = 1:length(I)
    % I(i) = u, J(i) = v
    M = [A(I(i),J(i)) C(I(i),J(i)); C(I(i),J(i)) B(I(i),J(i))];
    H(I(i),J(i)) = det(M) - 0.04*(trace(M)^2);
end
toc

% Make mask to exclude points within radius of the image boundary.
bordermask = zeros(size(im));
radius = ((windowSize - 1)/2) + 1;
bordermask(radius+1:end-radius, radius+1:end-radius) = 1;

% local maxima (grey-scale dilate)
mx = ordfilt2(H, windowSize*windowSize, ones(windowSize));
Hmx = (H == mx) & H > threshold & bordermask;
[r,c] = find(Hmx);   % Find row, col coords.

% return u and v coordinates for the corners
uv = zeros(length(c),2);
uv(:,1) = r;
uv(:,2) = c;
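The per-pixel score computed in the loop above can be cross-checked with a minimal sketch. Python/NumPy is used here only for illustration (the report's code is Matlab); `A`, `B`, `C` are the Gaussian-smoothed gradient products exactly as in the listing, and `k = 0.04` is the same empirical constant.

```python
import numpy as np

def harris_response(A, B, C, k=0.04):
    """Harris corner score for one pixel, given the smoothed gradient
    products A = <Ix^2>, B = <Iy^2>, C = <Ix*Iy> at that pixel.
    Mirrors H = det(M) - 0.04*trace(M)^2 in findHarrisCorners."""
    M = np.array([[A, C], [C, B]], dtype=float)
    return np.linalg.det(M) - k * np.trace(M) ** 2
```

A strong corner (both eigenvalues large, e.g. A = B = 2, C = 1) gives a positive score, while an edge-like structure (one dominant direction) gives a negative one, which is what the `H > threshold` test in the listing exploits.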
matchbycorrelation.m

% MATCHBYCORRELATION - match image feature points by correlation
%
% Function generates putative matches between previously detected
% feature points in two images by looking for points that are maximally
% correlated with each other within windows surrounding each point.
% Only points that correlate most strongly with each other in *both*
% directions are returned.
% This is a simple-minded N^2 comparison.
%
% Usage: [m1, m2, p1ind, p2ind, cormat] = ...
%        matchbycorrelation(im1, p1, im2, p2, w, dmax)
%
% Arguments:
%   im1, im2 - Images containing points that we wish to match.
%   p1, p2   - Coordinates of feature points detected in im1 and
%              im2 respectively using a corner detector (say Harris
%              or phasecong2). p1 and p2 are [2xnpts] arrays though
%              p1 and p2 are not expected to have the same number
%              of points. The first row of p1 and p2 gives the row
%              coordinate of each feature point, the second row
%              gives the column of each point.
%   w        - Window size (in pixels) over which the correlation
%              around each feature point is performed. This should
%              be an odd number.
%   dmax     - (Optional) Maximum search radius for matching
%              points. Used to improve speed when there is little
%              disparity between images. Even setting it to a generous
%              value of 1/4 of the image size gives a useful
%              speedup. If this parameter is omitted it defaults to Inf.
%
% Returns:
%   m1, m2       - Coordinates of points selected from p1 and p2
%                  respectively such that (putatively) m1(:,i) matches
%                  m2(:,i). m1 and m2 are [2xnpts] arrays defining the
%                  points in each of the images in the form [row;col].
%   p1ind, p2ind - Indices of points in p1 and p2 that form a match.
%                  Thus, m1 = p1(:,p1ind) and m2 = p2(:,p2ind)
%   cormat       - Correlation matrix; rows correspond to points in p1,
%                  columns correspond to points in p2
%
% REFERENCE:
% Copyright (c) 2004-2009 Peter Kovesi
% School of Computer Science & Software Engineering
% The University of Western Australia
% http://www.csse.uwa.edu.au/

function [m1, m2, p1ind, p2ind, cormat] = ...
    matchbycorrelation(im1, p1, im2, p2, w, dmax)

if nargin == 5
    dmax = Inf;
end

im1 = double(im1);
im2 = double(im2);

% Subtract image smoothed with an averaging filter of size wXw from
% each of the images. This compensates for brightness differences in
% each image. Doing it now allows faster correlation calculation.
im1 = im1 - filter2(fspecial('average',w), im1);
im2 = im2 - filter2(fspecial('average',w), im2);

% Generate correlation matrix
cormat = correlatiomatrix(im1, p1, im2, p2, w, dmax);

[corrows, corcols] = size(cormat);

% Find max along rows give strongest match in p2 for each p1
[mp2forp1, colp2forp1] = max(cormat, [], 2);

% Find max down cols give strongest match in p1 for each p2
[mp1forp2, rowp1forp2] = max(cormat, [], 1);

% Now find matches that were consistent in both directions
p1ind = zeros(1,length(p1));   % Arrays for storing matched indices
p2ind = zeros(1,length(p2));
indcount = 0;
for n = 1:corrows
    if rowp1forp2(colp2forp1(n)) == n   % consistent both ways
        indcount = indcount + 1;
        p1ind(indcount) = n;
        p2ind(indcount) = colp2forp1(n);
    end
end

% Trim arrays of indices of matched points
p1ind = p1ind(1:indcount);
p2ind = p2ind(1:indcount);

% Extract matched points from original arrays
m1 = p1(:,p1ind);
m2 = p2(:,p2ind);

%--------------------------------------------------------------------
% This function builds a correlation matrix that holds the correlation
% strength of every point relative to every other point.
%
% This code assumes im1 and im2 have zero mean. This speeds the
% calculation of the normalised correlation measure.

function cormat = correlatiomatrix(im1, p1, im2, p2, w, dmax)

if mod(w, 2) == 0
    error('Window size should be odd');
end

[rows1, npts1] = size(p1);
[rows2, npts2] = size(p2);

% Initialize correlation matrix values to -infinity
cormat = -ones(npts1,npts2)*Inf;

if rows1 ~= 2 | rows2 ~= 2
    error('Feature points must be specified in 2xN arrays');
end

[im1rows, im1cols] = size(im1);
[im2rows, im2cols] = size(im2);

r = (w-1)/2;   % 'radius' of correlation window

% For every feature point in the first image extract a window of data
% and correlate with a window corresponding to every feature point in
% the other image. Any feature point less than distance 'r' from the
% boundary of an image is not considered.

% Find indices of points that are distance 'r' or greater from the
% boundary on image1 and image2;
n1ind = find(p1(1,:)>r & p1(1,:)<im1rows+1-r & ...
             p1(2,:)>r & p1(2,:)<im1cols+1-r);
n2ind = find(p2(1,:)>r & p2(1,:)<im2rows+1-r & ...
             p2(2,:)>r & p2(2,:)<im2cols+1-r);

for n1 = n1ind
    % Generate window in 1st image
    w1 = im1(p1(1,n1)-r:p1(1,n1)+r, p1(2,n1)-r:p1(2,n1)+r);
    % Pre-normalise w1 to a unit vector.
    w1 = w1./sqrt(sum(sum(w1.*w1)));

    % Identify the indices of points in p2 that we need to consider.
    if dmax == inf
        n2indmod = n2ind;   % We have to consider all of n2ind
    else
        % Compute distances from p1(:,n1) to all available p2.
        p1pad = repmat(p1(:,n1), 1, length(n2ind));
        dists2 = sum((p1pad - p2(:,n2ind)).^2);
        % Find indices of points in p2 that are within distance dmax of
        % p1(:,n1)
        n2indmod = n2ind(find(dists2 < dmax^2));
    end

    % Calculate normalised correlation measure, which will give us better
    % matches
    for n2 = n2indmod
        % Generate window in 2nd image
        w2 = im2(p2(1,n2)-r:p2(1,n2)+r, p2(2,n2)-r:p2(2,n2)+r);
        cormat(n1,n2) = sum(sum(w1.*w2))/sqrt(sum(sum(w2.*w2)));
    end
end
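The two-way consistency test at the heart of the function above (a pair is kept only if each point is the other's strongest correlation) can be sketched compactly. Python/NumPy is used here only as an illustration of that one step, not of the full Matlab listing:

```python
import numpy as np

def mutual_best_matches(cormat):
    """Return index pairs (i, j) such that column j is row i's maximum
    AND row i is column j's maximum -- the 'consistent both ways' test
    in matchbycorrelation. cormat[i, j] is the correlation between
    feature i in image 1 and feature j in image 2."""
    best_col = cormat.argmax(axis=1)   # strongest p2 match for each p1
    best_row = cormat.argmax(axis=0)   # strongest p1 match for each p2
    return [(i, j) for i, j in enumerate(best_col) if best_row[j] == i]
```

For example, with `cormat = [[0.9, 0.8], [0.95, 0.1]]` only the pair (1, 0) survives: point 0 in image 1 prefers point 0 in image 2, but point 0 in image 2 prefers point 1 in image 1, so the (0, 0) match is discarded as inconsistent.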
RANSACF.m

% RANSACF - RANSAC algorithm for the estimation of the fundamental matrix
%
% Computes the fundamental matrix given two sets of corresponding points.
% Based on the 8-point algorithm described by Zisserman on p. 282.
%
% Input  - A -> 3xn set of homogeneous points in image A
%        - B -> 3xn set of homogeneous points in image B
%
% Output - 3x3 fundamental matrix F
%
% Author: Isaac Esteban
% IAS, University of Amsterdam
% TNO Defense, Security and Safety
% iesteban@science.uva.nl
% isaac.esteban@tno.nl
% Modified by Apisit A. July 2009

function [F, X1_out, X2_out] = RANSACF(X1,X2)

% Penk Modify
%%%%%%%%%% Parameter Set-up %%%%%%%%%%
% Threshold
t = 0.0075;
% Maximum number of iterations
N = 10000;
% Total population
Pop = size(X1,2);
% Assumed ratio of outliers in the total population
P_Ratio = 0.05;
% Acceptable number of inliers
Accept_N = round((1 - P_Ratio)*Pop);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Best scores
bestscore = 0;
bestmodel = [];

% List of points
inliers = [];
counter = 1;

% First normalize the points
[X1t,T1] = normalize2Dpoints(X1);
[X2t,T2] = normalize2Dpoints(X2);

for i = 1:N
    % Select 8 random points % Penk Modified
    RandGroup = randperm(size(X1,2));   % Penk Modified
    rindex = RandGroup(1:8);            % Penk Modified

    % Compute the fundamental matrix
    Ft = eightpoint(X1(:,rindex),X2(:,rindex));

    % If it can be computed...
    if (sum(sum(isnan(Ft))) == 0)
        X2tFX1 = zeros(1,size(X1t,2));
        for n = 1:size(X1t,2)
            X2tFX1(n) = X2t(:,n)'*Ft*X1t(:,n);
        end
        FX1  = Ft*X1t;
        FtX2 = Ft'*X2t;

        % Evaluate distances
        d = X2tFX1.^2 ./ (FX1(1,:).^2 + FX1(2,:).^2 + FtX2(1,:).^2 + FtX2(2,:).^2);
        inliersL = find(abs(d) < t);   % Indices of inlying points

        % Check this model (size of the inliers set)
        if (length(inliersL) > bestscore)
            bestscore = length(inliersL);
            bestmodel = inliersL;
        end;

%         % Find which inliers are new
%         for j = 1:length(inliersL)
%             if (sum(inliers == inliersL(j)) == 0)
%                 inliers(counter) = inliersL(j);
%                 counter = counter + 1;
%             end;
%         end;
    end;
    if bestscore >= Accept_N, break, end   % Penk Modified
end;

fprintf('N. Inliers: %d\n', size(X1(:,bestmodel),2));
fprintf('N. Outliers: %d\n', size(X1,2) - size(X1(:,bestmodel),2));

% Re-estimate F based on the inliers of the best model only
F = eightpoint(X1(:,bestmodel),X2(:,bestmodel));

%%%%%%%%%% Penk Modified %%%%%%%%%%
X1_out = X1(:,bestmodel);
X2_out = X2(:,bestmodel);
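The listing above uses a fixed iteration cap (N = 10000) together with an early exit once enough inliers are found. As a point of comparison, the standard adaptive formula from Hartley & Zisserman gives the number of trials needed so that, with probability p, at least one 8-point sample is outlier-free. This Python sketch is illustrative only and not part of the report's Matlab code:

```python
import math

def ransac_trials(p, outlier_ratio, sample_size):
    """Number of RANSAC iterations needed so that with probability p at
    least one random sample of `sample_size` points contains no outliers
    (Hartley & Zisserman, Multiple View Geometry, Sec. 4.7)."""
    w = (1.0 - outlier_ratio) ** sample_size   # P(sample is all inliers)
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w))
```

With the 5% outlier ratio assumed in the listing and 8-point samples, only a handful of iterations are needed in theory, so the early-exit test usually fires long before the 10000-iteration cap; the cap mainly guards against a worse-than-assumed outlier ratio.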
triangulation.m

function M = triangulation(P1, m1, P2, m2)
% TRIANGULATION computes the 3D point location using 2D camera views
% P1: camera matrix of the first camera.
% m1: pixel location (x1, y1) on the first view. Row vector.
% P2: camera matrix of the second camera.
% m2: pixel location (x2, y2) on the second view. Row vector.
% M:  the (x, y, z) coordinate of the reconstructed 3D point. Row vector.

% Camera one
C1 = inv(P1(1:3, 1:3)) * (-P1(:,4));
x0 = C1(1);
y0 = C1(2);
z0 = C1(3);
onerow = ones(size(m1,2),1)';
m1 = [m1; onerow];
M1 = pinv(P1) * m1;
x = M1(1)/M1(4);
y = M1(2)/M1(4);
z = M1(3)/M1(4);
a = x - x0;
b = y - y0;
c = z - z0;

% Camera two
C2 = inv(P2(1:3, 1:3)) * (-P2(:,4));
x1 = C2(1);
y1 = C2(2);
z1 = C2(3);
m2 = [m2; onerow];
M2 = pinv(P2) * m2;
x = M2(1)/M2(4);
y = M2(2)/M2(4);
z = M2(3)/M2(4);
d = x - x1;
e = y - y1;
f = z - z1;

% Solve u and v
A = [a^2 + b^2 + c^2,    -(a*d + e*b + f*c); ...
     -(a*d + e*b + f*c),  d^2 + e^2 + f^2];
v = [(x1 - x0)*a + (y1 - y0)*b + (z1 - z0)*c; ...
     (x0 - x1)*d + (y0 - y1)*e + (z0 - z1)*f];
r = inv(A) * v;
M = [x0 + a*r(1), y0 + b*r(1), z0 + c*r(1)];
end
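The 2x2 system solved at the end of the function above finds the parameters of the closest points between the two back-projected viewing rays. The same algebra can be sketched and verified on a simple pair of skew rays; Python/NumPy is used here only for illustration, and the ray origins/directions correspond to the camera centres and the (a,b,c), (d,e,f) direction vectors in the listing:

```python
import numpy as np

def closest_points_on_rays(P0, D1, P1, D2):
    """Closest points between rays P0 + u*D1 and P1 + v*D2, obtained by
    solving the same 2x2 normal equations as in triangulation.m."""
    P0, D1, P1, D2 = (np.asarray(a, dtype=float) for a in (P0, D1, P1, D2))
    A = np.array([[D1 @ D1,    -(D1 @ D2)],
                  [-(D1 @ D2),  D2 @ D2]])
    v = np.array([(P1 - P0) @ D1, (P0 - P1) @ D2])
    u, w = np.linalg.solve(A, v)
    return P0 + u * D1, P1 + w * D2
```

Note that the Matlab function returns only the point on the first ray; averaging the two returned points would give the conventional midpoint triangulation instead.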
THRESHOLDING THE 3D POINT CLOUD:

function points = thresholding(M, directory)
row = length(M);
k = 1;
if strcmp(directory, 'steel thread')
    for i = 1:row
        if M(i,1) > 403 && M(i,1) < 420 && M(i,2) > -110 && M(i,2) < -100
            points(k,1) = M(i,1);
            points(k,2) = M(i,2);
            points(k,3) = M(i,3);
            k = k + 1;
        end
    end
else
    for i = 1:row
        if M(i,1) > 397 && M(i,1) < 437 && M(i,2) > -117 && M(i,2) < -99 && M(i,3) > 557
            points(k,1) = M(i,1);
            points(k,2) = M(i,2);
            points(k,3) = M(i,3);
            k = k + 1;
        end
    end
end
end
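The function above keeps only the points that fall inside a hard-coded axis-aligned box for each object. The same filtering can be written generically with a vectorized mask instead of per-point loops; this Python/NumPy sketch is an illustration of the idea, with the bound values taken from the listing's second branch:

```python
import numpy as np

def crop_point_cloud(points, xlim, ylim, zlim=(-np.inf, np.inf)):
    """Keep rows of an (N, 3) point array whose coordinates lie strictly
    inside the given axis-aligned box -- the same filtering performed by
    the hard-coded bounds in thresholding.m."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    mask = ((x > xlim[0]) & (x < xlim[1]) &
            (y > ylim[0]) & (y < ylim[1]) &
            (z > zlim[0]) & (z < zlim[1]))
    return points[mask]
```

Passing the limits as arguments avoids having to edit the function for each new object, which is the main drawback of the hard-coded Matlab version.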
CAMERA CALIBRATION:

% Convert images from RGB to gray
x = imread('Image1.jpg');   % read the image as a 3D matrix
k = rgb2gray(x);            % convert the image from RGB to gray
imshow(k)                   % display the image
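Matlab's rgb2gray forms a weighted luminance sum of the three colour channels using the ITU-R BT.601 coefficients. The same conversion can be sketched directly; Python/NumPy is used here only for illustration:

```python
import numpy as np

def rgb_to_gray(rgb):
    """Luminance conversion with the BT.601 weights used by Matlab's
    rgb2gray: 0.2989*R + 0.5870*G + 0.1140*B."""
    return rgb[..., 0] * 0.2989 + rgb[..., 1] * 0.5870 + rgb[..., 2] * 0.1140
```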