LiDAR AND MULTISPECTRAL DATA FUSION


SINAGRA Ophelie, LIM Samsung

KEY WORDS:

LiDAR, Multispectral image, Classification, Support Vector Machine

ABSTRACT:

The proposed research aims to amalgamate a lidar point cloud and a multispectral image in order to classify the surveyed area into chosen categories by using a supervised classification algorithm. An airborne lidar point cloud and a QuickBird satellite image with red, green, blue and near-infrared bands over the city of Strasbourg, France, were processed to achieve the data fusion and estimate the accuracy of the proposed method.

Firstly, the multispectral image was used to create three images representing the Normalised Difference Vegetation Index (NDVI), the Soil-Adjusted Vegetation Index (SAVI) and the Global Environment Monitoring Index (GEMI), respectively. Secondly, the lidar point cloud was used to calculate the height of the features above ground, and the three-dimensional information was converted to a raster image. Those four rasters were combined into a single raster composed of two, three or four bands in order to run different tests.

The Support Vector Machine (SVM) algorithm, which is a supervised classification method, was applied to the rasters to classify them into three categories: vegetation, ground and man-made objects. Confusion matrices of the classification results indicate that the overall accuracy of the classification improves when the lidar data is integrated, compared with using the multispectral imagery alone.


1. INTRODUCTION:

2. METHODS:

2.1. Study area and data set

The study was carried out over the city of Strasbourg, in the east of France. It is a dense urban area, mostly composed of medium-sized buildings, roads and small parks.

QuickBird sensors capture multispectral images with a resolution of 2.44 meters (for the red, green, blue and near-infrared bands). The scene was captured in May 2002 over Strasbourg, in the WGS84-UTM32N system.

A lidar point cloud was acquired about two and a half years later, in September 2004, over the same area as the QuickBird image. On average, the density of the cloud is 1.3 points per square meter. The projection system used is Lambert-I.


2.2. Classification algorithm

The algorithm used for this classification is the Support Vector Machine (SVM). It was introduced by Cortes and Vapnik in 1995. Nowadays, it is considered a powerful algorithm for classifying data. It is a supervised statistical method, able to handle nonlinear problems, that defines a hyperplane separating two classes by maximising the margin (the distance between the hyperplane and the closest samples).

Several parameters need to be chosen in order to perform the best classification: the kernel (linear, polynomial, Gaussian or Laplacian), the coefficient C (controlling the trade-off between the errors of the SVM on the training data and margin maximisation), the coefficient γ (width of the kernel: influence of each sample on the algorithm), etc.
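To illustrate the role of γ, the following minimal sketch evaluates a Gaussian kernel between two samples. It assumes the common convention K(x, y) = exp(-γ‖x − y‖²), which the paper does not state explicitly; the function name and the two sample pixels are hypothetical.

```python
import math


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel, assuming the form exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical pixels described by (NDVI, height above ground in meters).
pixel_a = (0.60, 0.0)
pixel_b = (0.10, 1.0)

# Under this convention, a larger gamma makes the similarity between
# distant samples decay faster, so each sample has a more local influence.
for gamma in (0.1, 1.0, 10.0):
    print(gamma, gaussian_kernel(pixel_a, pixel_b, gamma))
```

With this convention, identical samples always have a kernel value of 1, and the value drops towards 0 as the samples move apart or as γ grows.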


2.3. Classification method:

Classification was performed on a raster composed of four layers. Each layer was produced from either the lidar data or the multispectral image.


Figure 1. QuickBird image (a) and LIDAR elevation data (b)

Figure 2. Classification steps flowchart (LiDAR: ground points, non-ground points, height variation; multispectral imagery: NDVI; SVM; classes: ground, water, buildings, vegetation)


2.2.1. Multispectral data processing

Converting the radiances of the spectral bands into thematic variables can, after combination, give information on the composition of the pixel analysed. The Normalized Difference Vegetation Index (NDVI) was calculated from the red and near-infrared spectral bands. This index is sensitive to many external factors such as atmospheric effects, the influence of the soil, etc. In order to overcome those effects, the Global Environmental Monitoring Index (GEMI, which minimises atmospheric effects) and the Soil-Adjusted Vegetation Index (SAVI, which minimises the influence of the soil) were calculated and were also used for the classification.
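The paper does not give the formulas of the three indices. As a minimal sketch, the functions below use the commonly published definitions, assuming that the inputs are red and near-infrared reflectances between 0 and 1 and that the SAVI soil factor is the usual default of 0.5. The function and parameter names are illustrative only.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor = 0.5 is an assumed default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def gemi(nir, red):
    """Global Environmental Monitoring Index (standard published form)."""
    eta = (2.0 * (nir ** 2 - red ** 2) + 1.5 * nir + 0.5 * red) / (nir + red + 0.5)
    return eta * (1.0 - 0.25 * eta) - (red - 0.125) / (1.0 - red)


# Hypothetical vegetated pixel: high near-infrared, low red reflectance.
nir, red = 0.5, 0.1
print(ndvi(nir, red), savi(nir, red), gemi(nir, red))
```

These functions work on single pixel values; applying them to a whole image means evaluating them cell by cell, and cells where a denominator is zero would need separate handling.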


2.2.2. LiDAR data processing

The height was extracted from the lidar data. To do so, ground and non-ground points were classified with LAStools. The DTM of the area was produced, and the height of each non-ground point was calculated (distance along the Z-axis between the considered point and the triangulated surface of the ground) and substituted for the third coordinate of the point.

The TIN of the area was created and then converted to a raster with the same cell size as the previous rasters (2.44 meters).
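The height computation was done with LAStools in the study. Purely as an illustration of the geometry, the sketch below interpolates the ground elevation under a point inside one triangle of the ground surface, using barycentric weights, and subtracts it from the point's Z coordinate. The function names and the sample coordinates are hypothetical.

```python
def ground_elevation(x, y, triangle):
    """Interpolate the ground Z at (x, y) inside a triangle of ground points."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    # Barycentric weights of (x, y) with respect to the three vertices.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3


def height_above_ground(point, triangle):
    """Distance along the Z-axis between a point and the ground triangle."""
    x, y, z = point
    return z - ground_elevation(x, y, triangle)


# Hypothetical ground triangle (x, y, z) and a non-ground point above it.
ground_triangle = ((0.0, 0.0, 100.0), (10.0, 0.0, 100.0), (0.0, 10.0, 110.0))
roof_point = (2.0, 5.0, 120.0)
print(height_above_ground(roof_point, ground_triangle))
```

In this example the ground rises by one meter per meter along Y, so the ground elevation under the point is 105 m and its height above ground is 15 m.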


2.2.3. Computation of the layers

The four layers calculated (NDVI, SAVI, GEMI, height variation) were combined into a single raster.

A few issues arose. Firstly, the projection systems of the two data sets were different, so a reprojection of the lidar layer (height variation) to the system of the multispectral layers (NDVI, SAVI, GEMI) had to be performed. Secondly, a grid shift was observed between the different layers after the reprojection.

The correction was carried out by converting one layer into points and adding values from the other rasters to those points. A reconversion into rasters was performed before the combination.
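The grid-shift correction can be pictured with the following sketch. It is only one possible reading of the procedure: each cell of a reference layer is turned into a point at its centre, and the value of the other layer is read at that point with a nearest-cell lookup. The raster layout (list of rows, origin at the top-left corner), the function names and the sample values are all assumptions.

```python
def cell_center(row, col, origin_x, origin_y, cell_size):
    """Map coordinates of a cell centre; origin is the top-left corner."""
    x = origin_x + (col + 0.5) * cell_size
    y = origin_y - (row + 0.5) * cell_size
    return x, y


def sample(raster, origin, cell_size, x, y, nodata=None):
    """Value of the raster cell that contains (x, y), or nodata if outside."""
    col = int((x - origin[0]) // cell_size)
    row = int((origin[1] - y) // cell_size)
    if 0 <= row < len(raster) and 0 <= col < len(raster[0]):
        return raster[row][col]
    return nodata


def stack_on_reference(reference, ref_origin, other, other_origin, cell_size):
    """Pair each reference cell with the value of the shifted raster."""
    stacked = []
    for r, row_values in enumerate(reference):
        out_row = []
        for c, ref_value in enumerate(row_values):
            x, y = cell_center(r, c, ref_origin[0], ref_origin[1], cell_size)
            out_row.append((ref_value, sample(other, other_origin, cell_size, x, y)))
        stacked.append(out_row)
    return stacked


# Hypothetical 2 x 2 layers; the second grid is shifted by one meter.
ndvi_layer = [[1, 2], [3, 4]]
height_layer = [[10, 20], [30, 40]]
print(stack_on_reference(ndvi_layer, (0.0, 10.0), height_layer, (-1.0, 11.0), 2.44))
```

After this step every cell of the reference grid carries one value per layer, so the layers share a single grid and can be written back as a multi-band raster.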


2.2.4. Classification of the data

The classification was processed with the Support Vector Machine (SVM) algorithm.

First of all, the SVM is a supervised algorithm: sampling zones had to be defined in a shapefile. Each of the four classes was numbered from 1 to 4 (Table 1) and the polygons were drawn with the help of the satellite image, the vegetation indexes and the height raster.

Then the parameters for the SVM were chosen by testing different values. The confusion matrix of the SVM classification file was produced and its Cohen’s Kappa coefficient was calculated. The following parameters were chosen (Table 2):

The Cohen’s Kappa coefficient was 95.6%, which is an excellent value. Our SVM classification file was validated and could be used on the four-layer raster.



Figure 3 - Layers (Height variation, GEMI, SAVI, NDVI)

Table 1 - Classes

Value   Class
1       Vegetation
2       Buildings
3       Water
4       Ground

Table 2 - Parameters values

Parameter                     Value
Kernel                        Gaussian
C                             3
Ratio training / validation   0.5
γ                             1








After a first classification, the “water” class was over-represented. The satellite image was inspected and a problem was spotted: the supposed water areas seemed to be corrupted. This class was removed (by deleting its polygons), and the SVM classification file as well as the classification itself were processed again. The Cohen’s Kappa coefficient was then 99.1% and the raster was visually well classified: buildings, vegetation and roads could be clearly seen.


2.2.5. Reference raster

In order to estimate the classification accuracy, a reference raster is needed. Since there was no possibility of gathering ground truth samples, two types of data were used to create this raster: the lidar point cloud and the NDVI layer.

First, the lidar point cloud was manually classified and converted to a raster. Secondly, since there was a gap of about two and a half years between the two acquisitions, the cells of the NDVI raster with a value over 0.2 replaced the corresponding cells of the manually classified lidar raster. Indeed, it was believed that the SVM classifier was mainly relying on the vegetation indexes to classify the vegetation and that, therefore, during the comparison with a reference raster built from the lidar data alone, this vegetation would be wrongly considered misclassified.
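The substitution step can be sketched as follows. The sketch assumes that cells whose NDVI exceeds 0.2 are relabelled with the vegetation class value of Table 1, which is an interpretation of the text rather than a documented detail; the function name and the sample rasters are hypothetical.

```python
VEGETATION = 1  # class value from Table 1 (assumed label for high-NDVI cells)


def build_reference(lidar_classes, ndvi_raster, threshold=0.2):
    """Override the manual lidar classes wherever NDVI exceeds the threshold."""
    return [
        [VEGETATION if ndvi_value > threshold else class_value
         for class_value, ndvi_value in zip(class_row, ndvi_row)]
        for class_row, ndvi_row in zip(lidar_classes, ndvi_raster)
    ]


# Hypothetical 2 x 2 rasters: 2 = buildings, 4 = ground.
lidar_classes = [[2, 4], [4, 2]]
ndvi_raster = [[0.50, 0.10], [0.25, 0.20]]
print(build_reference(lidar_classes, ndvi_raster))
```

Only cells strictly above the threshold are changed, so a cell with an NDVI of exactly 0.2 keeps its lidar-based class.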

3. RESULTS:

3.1. First classification

The results were obtained by comparing the reference raster with the classified one. The confusion
matrix was then computed and different accuracy estimators were calculated.
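As a minimal sketch of how such estimators can be derived from a confusion matrix, the functions below assume that rows hold the reference classes and columns hold the classified ones. The orientation, the function names and the sample matrix are assumptions, not values from the study.

```python
def overall_accuracy(matrix):
    """Share of cells on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def producer_accuracy(matrix, k):
    """Correct cells of class k over all reference cells of class k."""
    return matrix[k][k] / sum(matrix[k])


def user_accuracy(matrix, k):
    """Correct cells of class k over all cells classified as class k."""
    return matrix[k][k] / sum(row[k] for row in matrix)


# Hypothetical 3-class confusion matrix (rows: reference, columns: classified).
confusion = [
    [50, 5, 5],
    [4, 40, 6],
    [6, 5, 79],
]
print(overall_accuracy(confusion))
print(1 - producer_accuracy(confusion, 0))  # error of omission for class 0
print(1 - user_accuracy(confusion, 0))      # error of commission for class 0
```

The error of omission is the complement of the producer accuracy, and the error of commission is the complement of the user accuracy.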







More than 83% of our data are well classified (Table 3). This rate was improved later, after different tests.

After analysing the classified raster, it was noticed that both vegetation and roads were sometimes misclassified as buildings. This is particularly noticeable around trees or in narrow streets.






Table 3 - Accuracy estimators

Error of commission   6.69%
User accuracy         93.31%
Error of omission     9.68%
Producer accuracy     90.32%
Overall error         16.37%
Overall accuracy      83.63%



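\[
\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)} \tag{1}
\]

Figure 4 - Cohen’s Kappa coefficient

With
\[
\Pr(a) = \frac{\sum \text{diagonal values}}{\text{total number of cells}}
\]
And
\[
\Pr(e) = \frac{\sum \left(\text{row total} \times \text{column total}\right)}{\left(\text{total number of cells}\right)^{2}}
\]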
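Equation (1) can be checked numerically with a short sketch. The function name and the 2 x 2 confusion matrix are hypothetical; Pr(a) is taken as the observed agreement (diagonal over total) and Pr(e) as the agreement expected by chance (sum of row total times column total, over the squared total).

```python
def cohen_kappa(matrix):
    """Cohen's Kappa coefficient of a square confusion matrix."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of cells on the diagonal.
    pr_a = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total x column total) / total^2.
    pr_e = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    return (pr_a - pr_e) / (1 - pr_e)


# Hypothetical confusion matrix: Pr(a) = 0.7 and Pr(e) = 0.5.
example = [
    [20, 5],
    [10, 15],
]
print(cohen_kappa(example))
```

For this matrix the observed agreement is 35/50 and the chance agreement is 1250/2500, so the coefficient is (0.7 − 0.5) / (1 − 0.5) = 0.4.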



















3.2. Improvement of the results

To determine the importance of each layer, the layers were removed one at a time from the four-layer raster; the classification file and the classification itself were then produced again and the accuracies were calculated.
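The test design, removing one layer at a time, amounts to enumerating every three-layer subset of the four layers. A small sketch of that enumeration is given below; the layer abbreviations follow the tables of this paper (H for the height layer), while the function name is hypothetical and the classification itself is not reproduced here.

```python
from itertools import combinations

LAYERS = ("NDVI", "SAVI", "GEMI", "H")


def layer_subsets(layers, size):
    """All combinations of `size` layers, formatted as in the result tables."""
    return [" + ".join(subset) for subset in combinations(layers, size)]


# Removing one layer at a time from four layers yields four test rasters.
three_layer_tests = layer_subsets(LAYERS, 3)
for name in three_layer_tests:
    print(name)
```

Each printed combination corresponds to one raster on which the classification and the accuracy assessment would be run again.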









The lidar information improved the overall accuracy of the classification by about 20 percentage points. Indeed, when the “height variation” layer was not used, the overall accuracy of the classification was only 63.1% (Table 4). Moreover, the GEMI layer was bringing more error than improvement to the classification.

A second test was then carried out: the classification of two-layer rasters. Since the GEMI layer seemed to degrade the results and the “height variation” layer was improving them by about 20 percentage points, the tests were performed with the NDVI, SAVI and height layers.








4. DISCUSSION:

Figure 5 - Aerial image (a), classified reference (b), SVM classification (c)


Table 4 - Classification rates of three-layer rasters

Layers               Cohen’s κ of the SVM   Overall accuracy of the classification
NDVI + SAVI + GEMI   67.7 %                 63.1 %
NDVI + SAVI + H      97.4 %                 84.3 %
GEMI + SAVI + H      97.6 %                 83.7 %
NDVI + GEMI + H      97.6 %                 83.7 %

Table 5 - Classification rates of two-layer rasters

Layers     Cohen’s κ of the SVM   Overall accuracy of the classification
NDVI + H   97.4 %                 84.2 %
SAVI + H   97.4 %                 84.2 %












5. CONCLUSION:

The tests showed that only two layers were essential: one calculated from the lidar data and giving information on height, and one from the multispectral image with a vegetation index. Indeed, the height information enables the algorithm to separate ground from non-ground areas, while the vegetation index helps to distinguish man-made objects from natural ones.

6. ACKNOWLEDGMENTS:

7. REFERENCES: