New image processing software for analyzing object size-frequency

lemonadeviraginityΤεχνίτη Νοημοσύνη και Ρομποτική

6 Νοε 2013 (πριν από 3 χρόνια και 5 μήνες)

111 εμφανίσεις


1

New image processing software for analyzing
object

size
-
frequency
1

distributions, geometry, orientation, and spatial distribution
*

2

Ciar
á
n Beggan
1

and Christopher
W.
Hamilton
2


3

1
Grant Institute,
School of GeoSciences,
University of Edinburgh
, UK

4

2
Hawai

i I
nstitut
e of Geophysics and Planetology,
University of Hawai

i
, USA

5


6

Corresponding
a
uthor:
Ciarán Beggan,
E
-
mail: ciaran.beggan@ed.ac.uk
;
Phone: +44
7

131 650 5916
;
Address: School of GeoSciences, Grant Institute, West Mains Road,
8

Edinburgh, EH9 3JW, UK

9


10

*Cod
e available from server at
http://www.geoanalysis.org

11


12

Abstract

13

Geological Image Analysis Software (GIAS)
combines

basic to
ols for calculating
14

object area
,
abundance,
radi
us, perimeter
, eccentricity, o
rientation, and centroid
15

location
,

with the first automated
method for

characterizing
the
aerial

distribution of
16

objects
using

sample
-
size
-
dependent

nearest neighbor (NN) statistics.
The

NN
17

analys
e
s include

tests for:
(1
) Poisson
,

(2) Normalized Poisson
, (
3) Scavenged

k

= 1
,

18

and
(4) Scavenged
k

= 2

NN distributions.
GIAS is

implemented in MATLAB with a
19

G
raphical
U
ser
I
nterface
(GUI) that is available as
pre
-
parsed pseudocode
for use
20

with MATLAB
,

or as a stand
-
alone application that run
s

on Windows and Unix
21

systems. GIAS can
process
raster

data
(
e.g.
, satellite imagery, photomicrographs,
etc.
)
22

and

tables of
object coordinate
s

to characterize the size, geometry, orientation, and
23

spatial organization of
a
wide
range of
geological features.

This information

expe
dites
24

quantitative
measurements of
2D
object properties
,
provides criteria for validating the
25


2

use
of stereolog
y to transform
2D
object sections into

3D models, and establishes a
26

standardized NN methodology that
can be used to

compar
e

the results of
differe
nt
27

geospatial
studies

and identify objects using non
-
morphological parameters
.


28


29

Key Words:
Image analysis, software, size
-
frequency, spatial distribution, nearest
30

neighbor, vesicles, volcanology

31


32

1. Introduction


33

Geological i
mage analysis extracts informa
tion
from representations of natural objects
34

that may either be captured by an imaging system
(
e.g.
, photomicrographs, aerial
35

photographs
, digital satellite

imagery
)

or
schematically
rendered into visual form
36

(
e.g.
, geological maps).
In addition to examini
ng the properties of individual objects,
37

spatial analys
i
s may be used
to
quantify object distributions

and
investigate

their

38

formation

processes
.

39

ImageJ

(Rasbald, 2005)
is
a commonly used image processing application

that
40

was

developed as
an open source Ja
va
-
based program by the National Institutes of
41

Health (NIH). Custom plug
-
in modules enable
ImageJ
to

solve numerous

tasks,
42

including the analysis of vesicle
size
-
frequency
distributions within geological thin
-
43

sections

(
e.g
.,
Szramek
et al
, 2006
, Polacci
et

al.
, 2007
)
.
However
, ImageJ and similar
44

programs
for
Windows
(
e.g.
, Scion Image and ImageTool)

and
Macintosh (
e.g.
, NIH
45

Image)

are not
specifically
designed for geological applications nor do they
provide
46

analytical tools
for

investigating patterns of

spa
tial
organization
.

47


To
take greater advantage of the information
contained
within geological
48

images,
we have developed
Geological Image Analysis Software (GIAS)
. This
49

program

combines
(1)
an image processing module f
or calculating and visualizing

50


3

object

ar
ea
s
,

abundance,
radi
i
, perimeter
s
, eccentricit
ies
, orientation
s
, and centroid
51

location
s
,

and

(2)
a spatial distribution module that automates sample
-
size
dependent
52

nearest neighbor
(NN)
analyses.
Although other programs can perform the basic
53

functions in t
he “Image Analysis
"

module
, GIAS is the first program to automate
54

sample
-
size
-
dependent

analyses of NN distributions.

55


56

2. Motivation

57

Nearest neighbor (NN) analysis is well
-
suited for investigating patterns of spatial
58

distribution within
in
trinsically

two
-
dimensional (
2D
)

datasets, such as orthorectified
59

aerial photographs and satellite imagery. Applications of NN analyses to remote
60

sensing imagery include the study of volcanic landforms (Bruno
et al.
, 2004; 2006;
61

Baloga
et al.
, 2007; Bishop, 2008; Hamilton

et al.
, 2009; Bleacher
et al.,

2009),
62

sedimentary mud volcanoes (Burr
et al.
, 2009), periglacial ice
-
cored mounds (Bruno
63

et al.
, 2006), glaciofluvial features (Burr
et al.
, 2009), dune fields (Wilkins and Ford,
64

2007), and impact craters (Bruno
et al.
, 200
6).
Despite widespread utilization of NN
65

analys
e
s, its
value
as a remote sensing tool and
effectiveness as basis

for

comparison

66

between different datasets

is
limited

by the lack of a standardized NN methodology

67

particularly in terms of defining feature fie
ld areas, thresholds of significance, and
68

criteria for
apply

higher
-
order NN
methods
.

69


In addition to remote sensing
applications
, NN analyses can be
used to study

70

objects in photomicrographs such as crystals (Jerram
et al.,
1996; 2003
; Jerram and
71

Cheadl
e, 2000
) and vesicles. In this study, we emphasize the application of NN
72

analyses to vesicle distributions to demonstrate how GIAS can be used to validate (or
73

refute) the assumption of randomness, which is a prerequisite for effectively applying
74


4

stereologi
cal techniques to derive vesicle volumes from photomicrographs and for
75

selecting appropriate statistical models to characterize those vesicle populations.

76


Vesicle textures preserve information about the pre
-
eruptive history of
77

magmas and can be used to in
vestigate the dynamics of explosive and effusive
78

volcanic eruptions (
e.g.
, Mangan
et al.,

1993; Cashman
et al.,

1994; Mangan and
79

Cashman 1996; Cashman and Kauahikaua, 1997; Polacci and Papale 1997; Blower
et
80

al.
, 2001; Blower
et al.
, 2003; Gaonac'h
et al.,

2005; Shin
et al.
, 2005; Lautze and
81

Houghton, 2005; Adams
et al.
, 2006; Polacci
et al.,

2006; Sable
et al.
, 2006; Gurioli
82

et al.,
2008). Quantitative vesicle analyses stem from Marsh (1988), who explored the
83

physics of crystal nucleation and growth dynami
cs to derive an analytical formulation
84

for crystal size distributions. This research was
then
applied to numerous bub
ble size
-
85

frequency distribution studies

(
e.g.
, Sarda and Graham, 1990; Cashman and Mangan,
86

1994; Cashman
et al.
, 1994; Blower
et al.
, 2003)
. Early studies of vesicle distributions
87

(
e.g.
, Cashman and Mangan, 1994) were limited by their inability to characterize the
88

full range of vesicle sizes because their methodology could not resolve the smallest
89

vesicles. Nested photomicrographs solve this
problem because photomicrographs
90

captured at multiple magnifications enable the reconstruction of total vesicle size
-
91

frequency distributions (
e.g.
, Adams
et al.
, 2006; Gurioli
et al.,
2008).

92

In general, vesicle studies are limited by two major factors: tr
ansformations of
93

2D

cross
-
sections into representative vesicle volumes (
Mangan
et al.
, 1993;
Sahagian
94

and Proussevitch, 1998; Higgins, 2000; Jerram and Cheadle, 2000); and development
95

of reliable statistical characterizations of vesicle populations in term
s of distribution
96

functions and their spatial characteristics (Morgan and Jerram, 2007; Proussevitch
et
97

al.
, 2007a). These difficulties have been partially addressed by improved statistical
98

techniques for inv
estigating vesicle populations; however, a

singl
e cross
-
section
99


5

cannot be used to reconstruct a representative 3D vesicle distribution unless
the
100

objects viewed in a 2D section can be characterized using 2D reference textures with
101

known 3D distributions (Jerram
et al.
, 1996; 2003; Jerram and Cheadle 200
0
;
102

Proussevitch
et al.
, 2007
a
).
Although s
ynchrotron X
-
ray tomography
is increasingly
103

being used
to

directly
generate 3D vesicle distributions (
e.g.
, Gualda and Rivers,
104

2006; Polacci
et al.
, 2007; Proussevitch
et al.
, 2007b)
,
nested datasets containing
105

mul
tiple Scanning Electron Microscope (SEM) images remain the most common
106

input for vesicle studies because of their superior spatial resolution relative to X
-
ray
107

tomography.

To facilitate
the analysis of vesicles in
SEM
imag
ery,
GIAS can be used
108

to determine

the geometric properties of vesicles within the plane of a given
109

photomicrograph and

establish if objects fulfil the criteria

of spatial randomness,
110

which is

required for
transforming 2D sections into 3D models using stereology
.

111



112

3
.
Nearest n
eighbor
(NN
) analysis

113

Clark and Evans (1954)

proposed a simple test for spatial randomness in which the
114

actual
mean
NN

distance
(
r
a
)
in a population of known density is compared
to
the
115

expected mean
NN

distance

(
r
e
)

within

a
random
ly distributed population of
116

equival
ent density
.
Following Clark and Evans (1954),
r
e

and expected standard
error

117

(
σ
e
) of the Poisson distribution
are
:


118


0
0
26136
.
0
;
2
1



N
r
e
e


,




[1; 2]

119

w
here

the i
nput population density
,
ρ
0
,
equals

the number of objects (
N
) divided by
120

the area
(
A
)
of the
feature field

(
ρ
0

= N / A
).
T
he following tw
o
test statistics

(termed
121

R

and
c
)

are used

to determine if

r
a

follows a Poisson random distribution
:

122


e
e
a
e
a
r
r
c
r
r
R




;
.




[
3
;
4
]

123


6

If a test population exhibits a Poisson random distribution,
R

should ideally
124

have a value of

1
,

w
hile
c

should equal

0
.
If
R

is approximately equal to 1, then the
125

test population may have a Poisson random distribution
. If

R

> 1
,

the
n the

test
126

population exhibits greater than random NN spacing (
i.e.
, tends towards a
maximum
127

packing

arrangement
)
,

wherea
s

if
R

< 1
,

the
n the

NN distances in the test case are
128

more closely spaced than expected within

a random distribution and thus

exhibit
129

clustering relative to the Poisson model
.

130

T
o identify
statistically significant departures from randomness
at

the 0.
9
5
and
131

0.
99

confidence
levels, |
c
| must exceed the critical values of 1.96 and 2.58,
132

respectively (Clark and Evans,
1954)
; however,

t
hese critical values

implicitly

assume
133

large

sample populations

(
N

>
10
4
).
Jerram
et al.
(1996) and

Baloga
et al.

(2007)
note
134

that
finite
-
sampling effects introduce biases
i
nto

the variation of

NN statistics
.

These
135

biases become significant for small populations (
N

<
3
00)

and

thereby

necessitat
e

the
136

use of sample
-
size
-
dependent

calculations of
R

and
c

thresholds
.

137

The NN methodolo
gy of
Clark and Evans (1954)

assumes that objects can be
138

approximated as points
. H
owever, f
or frameworks of solid objects
,
NN statistics
139

exhibit scale
-
dependent

variations
in randomness due to
the
restriction
s imposed by
140

avoiding
overlapping object placeme
nts.
For
2D sections through 3D
distributions of
141

equal
-
sized solid spheres
,

Jerram
et al.
(
2003
)

define a porosity
-
dependent

threshold
142

for
R
, which they term the random sphere distribution line (RSDL).
With increasing
N

143

(
i.e
., decreasing porosity),

the RSD
L
threshold increases from a value just above 1 to
144

a maximum
of
1.65 at 37% porosity
,
based on the experimental
spherical packing
145

model of Fin
n
ey (
1970
). Jerram
et al.
(1996
, 2003
)

also note that compaction,
146

overgrowth, and sorting can introduce systematic

vari
ations

in
R

relative to the RSDL
147


7

and thus, when correctly identified, can provide information
relating to the

relative
148

importance of these

processes
during rock formation
.

149

In addition to sample
-
size
-
dependent

tests for
Poisson randomness,

Baloga
et
150

a
l.

(2007)

proposed
the following three

NN models
:

151


152

1.

Normalized

Poisson NN

distributions

account

for

data
resolution
-
limit
s, which
153

require
normaliz
ation

of

r
e
,
σ
e
, skewness, and kurtosis
values
based on a user
-
154

defined minimum distance threshold.


155

2.

Scavenged

Poisson NN

distributions

assume

feature
s

consume

(
or “
scavenge

)
156

resources

in their surroundings, which affects the
k
th
nearest neighbor

by
157

increasing
r
e

relat
ive to the standard Poisson

NN model.
The order of the
158

scavenging model is given by the Poisson index,
k
, which identifies the
159

number of NN objects

participating in the scavenging process and, if
k

= 0, the
160

standard
Poisson
NN distribution is obtained.

161

3.

Lo
gistic NN

distributions

apply

the classical probability distribution for self
-
162

limiting population growth due to
the
use of available resources. Under this
163

probability assumption, NN distances become more uniform,
which
lead
s

to
164

smaller skewness
,

but larger

kurtosis compared to the Poisson
NN model
.

165


166

S
ample
-
size
-
dependent

R

and
c

test

statistics
,

in addition to the

expected
167

skewness versus kurtosis

values,

allow

for robust statistical
tests of

spatial
168

organization
. H
owever
,

prior to this study, these extens
ions to the classical NN
169

method
have not been automated

and freely available for use
.

170


171

4
. Method


172


8

4
.1.

Input

data

and
processing
parameters

173

GIAS is

designed to process

two input
data
types
: images
and tables of object
174

coordinates
. In the first instance,
i
mages may include
rectangular raster
files
(
e.g.
,
175

*.tiff, *.jpg, *.gif,
et
c
.
)

encoded

in

8
-
bit grey
-
scale format
s
.

All input
s

are

assumed to
176

have an orthogonal projection

with equal
x
and
y

pixel dimensions
. I
f
the
input
image
177

pixels are not square, their
area can be
uniquely
specified
.
To appropriately scale the
178

output results, the pixel size must be
defined

by the analyst
for each input image.

179

Image segmentation is achieved using Digital Number (DN)
thresholding
,
180

such that i
nput
s

are

converted to a binar
y threshold logical matrix
. GIAS
assum
es
181

input
grey
-
scale
pixel

values

>
250
are

white
; however,

the

upper and lower
threshold
s

182

can be adjusted
in the

input options.
Within input image
s
,
black pixels
are assumed to
183

be
the objects of interest (
e.g.
, vesicles
)

whereas
white
pixels are assumed to be
part
184

of the background (
e.g.
,
non
-
vesicles
)
.

An option is provided
for image inversion.

185

Other

than
performing
the binary conversion, GIAS

does not
modify

input
186

images.

Consequently,
pre
-
processing may be
necessary
to
remove
image
artifacts

187

and segment objects into feature classes
. To facilitate image analysis and pre
-
188

processing of inputs to the NN module,

GIAS suppl
ies

labeling information and
189

metadata for each object in

the image analysis output file;

however,

the
level of image
190

modification

will depend

upon

the
particular
scientific
investigation and the
191

preferences of the analyst.

192

I
nputs to the
NN
module
can be

passed from the image analysis module
or
193

load
ed as a

two column table (tab
-

or space
-

delimited)
with

x
-
coordinates in the first
194

column and

y
-
coordinates in the second column.
GIAS automatically checks input
195

files for duplicate entries and

retains only unique coordinate pairs
.

196


9

S
lider bars

control the
detection
thresholds for the
minimum and maximum

197

object

areas
, in addition to the

distance

threshold
used within the Normalized Poisson
198

tests
.
A
check
-
box

excludes objects that touch

the
periphery

of the image
from
199

subsequent

processing
. This option is enabled by default because
meaningful

cross
-
200

sectional prope
rties (
e.g.
, area, geometry, and

orientation) cannot
be
calculated for
201

truncated objects
.

Calculations of object abundance (as a percentage of total image
202

area) are based on total image properties and are unaffected by peripheral object
203

exclusion.
Once the

input
options are

satisfactor
il
y

set
,
the analyst simpl
y

press
es

the
204


P
rocess


button
to calculate

the output graphs and statistics.


205


206

4
.2.

Output results

207

GIAS

outputs
its
results

to
on
-
screen
graphs and text box
e
s
, which are presented in

208

two

tabbed panel
s. Statistics relating to the
object area
, geometry, and orientation

are
209

shown in the first panel
,

labeled “
Image Analysis
” (Figure
1).
The frequency axis can

210

be plotted in linear or log
units

with on
-
screen options for
defining

custom
211

dimensions
.

Object
a
rea
s,

radii,
perimeter
s
,
and
eccentricit
ies

are presented as
212

histograms,

whereas
object

orientations

are displayed using a rose plot
.

Object

radii
213

are

estimated

by
calculating the radi
i

of
equal area

circle
s
.

In addition to the graphical
214

output,
there are

text
-
box summaries
of
the total number of
objects
,

the
i
r

abundance,
215

minimum, maximum, mean, standard
deviation
(at 1 σ
),
skewness
,

and kurtosis

based
216

on

the object area

calculations
.


217

R
esults of the spatial analys
e
s

are summarized in
a

second panel, labele
d
218

“Nearest Neighbor Analysis” (Figure
2
). These results

includ
e

histograms of
the

219

measured NN distances,
in

addition to graphs of
R
,
c
,
and

skew
ness

versus kurtosis

220

(Baloga
et al.
,

2007)
.

The
R

and
c

values are presented with sample
-
size
-
dependent

221


10

threshol
ds for rejecting the null hypothesis (
i.e.
,
a
given
model of
r
e
) at significance
222

levels of
1

and
2 σ
. By default,
t
he program assumes
a
Poisson random model

for the
223

expected spatial distribution;

however, the user may also view results based on
224

Normalized Poisson,
or
Scavenged (
k

= 1 or 2)

models.

Similar
ly,

the skewness
225

versus kurtosis plots are prov
ided with sample
-
size
-
dependent

significance levels

of

226

0.
9
0
, 0.
9
5, and 0.
99
.

227

Text
box
outputs

provide

the
model
-
dependent

value of
r
e

and
summarize
r
a

228

statistics, including

the

total number of objects, number within the convex hull,
and

229

NN distance

statisti
cs (minimum
, maximum, mean, and standard deviation
at 1 and 2
230

σ).
The convex hull is a polygon defined by the outermost points in a feature field
231

such that each interior angle of the polygon measurers less than 180º (Graham, 1972
)
.

232

The convex hull thus forms a boundary around a feature field with the minim
um
233

possible perimeter length

(see

Appendix 1

for examples)
.

234


The

input parameters and

computed

statistics
from both tab
bed

pane
ls

can be
235

output in separate user
-
named text files. The data
are

saved in a tab
-
delimited format,
236

allowing the files to be
viewe
d using a text editor or
imported into a spreadsheet
.

237


238

4
.3.

Implementation

239

GIAS is

written and implement
ed

in
MATLAB with a

G
raphical
U
ser
I
nterface
(GUI)
240

that is
designed

for use within the

geological
community
.

The program

is available as

241

preparsed pseud
ocode
for use with MATLAB
(provided both “Image Processing” and
242

“Statistics” toolboxes are installed)
or as
a standalone application
that

can
run
on
243

Windows and Unix systems without
requiring
MATLAB

or its auxiliary toolboxes
.

244

These software products may b
e downloaded from the internet for free via
245

http://
www.geoanalysis.org
.

246


11

The
regionprops

function within the MATLAB

Image Processing toolbox

247

calculates the following key properties of each
object
:

a
rea
,
c
entroid
position
,
248

p
erimeter
,
e
ccentricity
,
o
rientation
,
e
qual
-
area circle equivalent
diameter

and a l
ist of
249

pixel
positions
within each
vesicle
.

The
pixel list is used to determine which
objects

250

are in contact with the image edge

and the equal area circle

diameter
s are used to
251

calculate

object
radii.

T
he eccentricity and orientation are calculated by fitting an
252

ellipse about the

object
.
T
he angle between the major axis of the ellipse and the
x
-
axis
253

of the image defines the orientation
.

254

By default
,

the

NN

analysis u
tilizes centroid positions
calculated in the image
255

analysis module
, however, object locations may be

uploaded as

a table of

x
-

and
y
-
256

coordinates
.

Within the NN module,
the
MATLAB
function
c
on
vhull
is used
to
257

identify the vertices of the convex hull for all

of the input objects.

The function
258

c
on
vhull

is also used to calculate the area of the convex hull
(
A
hull
)
and
to
separate
259

interior objects
(
N
i
)
from the hull

vertices

(see Appendix 1 for examples)
.

260

NN

distance
s

are
determined by calculating the Euclidean

distance between
261

each

interior
object
(centroid)
and all other
objects

within the dataset
,
including

the
262

vertices of the hull
.
The
results for each interior point

are
sort
ed
in
ascending order

to
263

identify
the minimum NN distance

and calculate the actual m
ean NN distance (
r
a
)
by
264

averaging the NN distances
for all

interior objects

(
N
i
)
.

The
interior population
265

density
(
ρ
i

=
N
i

/
A
hull
)
, provides a
statistical estimate of the true density within an
266

infinite domain and thus substitutes for
ρ
0
.
We then
calculate
R

as

the ratio of

r
a
-
to
-
r
e
.

267

Using sample
-
size
-
dependent

values of

R
,
c
, and their respective thresholds of
268

signi
ficance,

we may define
the following scenarios. If
the values of

R

and
c

fall
269

within ±2

σ of the
ir

expected value
s
, then
the results suggest that the

model correctly
270

describes
the

input distribution
.
If the value of
c

is

outside the

range of

±2

σ,
then
the

271


12

inference is that the input distribution is inconsistent with the model
, regardless of the
272

value of
R
. If
c

is
outside ±2

σ
,
and
R

is
greater than the +2
σ

threshold, then the input
273

distribution exhibits a statistically significant repelling relative to t
he model.
274

Conversely, if
c

is
outside
the
±2

σ
range, and
R

less then than the
-
2
σ

threshold, then
275

the input distribution is
inferred to be
clustered relative to the model. Lastly, if
R

is
276

outside ±2

σ
,

and

c

is within

the

±2

σ range,
then the test result
s are

ambiguous,
277

which
generally indicat
es

that
there is insufficient data to perform the analysis and/or
278

the standard error
(
σ
e
) of the input distribution is
large.

279

One of the novel contributions of our application
is the introduction of
280

automated calcula
tion
s

of sample
-
size
-
dependent

biases

in

NN statistics.

Baloga
et al.

281

(2007)

described

how finite sampling would affect
several

NN models;

however,
they
282

only implemented a

solution for the

Poisson NN
case
. In our application, we

have

283

automated the Poisson
NN procedure, extended it to include the
three
other
models
of
284

Baloga
et al.

(2007), and derived the
k

= 2 case for the scavenged NN
distribution

285

(Appendix
2
)
. Automation of the NN method within GIAS
enables users to
rapidly
286

determine if their data fulfill
s the
size
-
dependent

criteria
of

statistical
significance
for

287

the following
five
spatial distribution

models
: (1) Poisson
,

(2) Normalized Poisson
,
288

(3) Scavenged

k

= 1;
and
(4) Scavenged
k

= 2
.

Table
2

summarizes the properties of
289

these models.

290

To
obtain
sa
mple
-
size
-
dependent

values of
R
,
c
, and their confidence limits,
291

we
generated

simulations in MATLAB
.

F
ollowing the method of Baloga
et al.
(2007)
,

292

c
onfidence limits for the Poisson and the Scavenged
k

= 1 and 2
models were
293

simulated using 1000 random draws

from a modeled distribution.
However, for the

294

Normalized
Poisson
model
,
limits for the

standard

Poisson

case
are employed
because

295

simulating
unique
limit
s
for each

user
-
defined

distance threshold

is

computationally
296


13

intensive
in real
-
time
(
e.g.
,

requiring
approximately four minutes
to perform the
297

simulations using a computer with a

2 GHz processor
and

1 GB of
RAM
).

298


GIAS does not include the logistic density distribution described by Baloga
et
299

al.

(2007) because its
parameters (
i.e.
, the logistic
mean (
m
)
and the standard
300

deviation (
b
)
, see Table 2)

are estimated using a best
-
fit to
the
input

data
.
The logistic
301

case therefore differs substantially from the other NN analyses because the
y
involve a
302

comparison between input spatial distributions and expected m
odel distributions.
303

Additionally, the fit of a logistic curve to an input distribution does not necessarily
304

imply an underlying logistic process because other processes could generate
305

distributions that
better fits
the data. Thus given the differences betw
een the logistic
306

density distribution described by
Baloga
et al.

(2007) and the other Poisson NN
307

models, we have omitted the logistic case from the geospatial analysis tools
308

implemented within GIAS.

309

To
correctly interpret
the results of the

S
cavenging
NN
m
odels,
it is important
310

to

recognize that the formulation of Baloga
et al.

(2007) is
a scale
-
free distribution
.
311

Modification to expected NN distances depend upon the number of objects
312

participating in the scavenging process and not on the location of th
e

ob
jects.
313

Consequently, the

mean and standard
deviation statistics are not directly related to the
314

radius

of
the
scaveng
ed area
. This is
an
unrealistic

condition

for

resource
-
scavenging
315

phenomena

in

nature,

which

have

a

fixed upper limit to
r
.

Alternative mod
eling
316

scenarios
,

with a user
-
defined scavenging
radius,

may be more appropriate for some
317

applications

provided that a scale for the scavenging process
can be determined
.


318

D
istinguish
ing

between Poisson NN and Scavenged Poisson NN models
319

requires statistica
l separation of their respective
NN
distributions. The minimum
320

number of objects required will depend on
ρ
0

and
k
. In Appendix
3
, we sho
w

that

321


14

separation of the Poisson and
k

= 1 Scavenged Poisson models, at the 95% confidence
322

level, requires a minimum
50
to 60 objects for
ρ
0

values of 0.5 to 0.0005

objects / unit
323

area
, whereas 100 to 150 objects are required to distinguish between
k

= 1 and 2
324

models.

325

In addition to sample
-
size
-
dependent

R

and
c

statistics, we

expand

upon the
326

methods of

Baloga
et al.

(2007
)

by
simulat
ing

the
sample
-
size
-
dependent

range
of
327

expected
skewness and kurtosis
values
for
each NN

hypothesis
.
C
alculate
d
328

confidence intervals

at
9
0%, 95% and 99
%

are plotted within GIAS
to
illustrate
329

the
expected
range
s

of

skew
ness

versus kurtosis

for
e
ach

model
.
Plots of skewness
330

versus kurtosis are also useful for discriminating between populations with otherwise
331

similar
R

and
c

values, such as
ice
-
cored mounds (
e.g.
, pingos) and volcanic
rootless
332

cones (
Bruno

et al.
,
2006).

333


334

5
. Results

335

To demonstrate

the utility of GIAS, we have
applied the program to
vesicle

patterns
336

within
proximal
magmatic tephra

deposits

from
Fissure 3 of
the 1783
-
1784 Laki
cone
337

row, Iceland
.

Table
s

3

and 4 summarize

the results

of the “Image Analysis” and
338

“Nearest Neighbor
Analys
is
” modules, respectively.
In Appendix 1, we include
339

additional examples of geospatial distributions that are clustered and repelled relative
340

to the Poisson NN model. These examples are based on the Hamilton
et al.

(2010a,
341

2010
b
) and demonstrate how
statis
tical NN analyses can be combined with
field

342

observations

to obtain information about
the effects of

environmental resource

343

limitation on the spatial distribution of volcanic landforms. Appendix 1 also explores
344

how the geospatial analysis tools
that are pr
ovided

within GIAS can be applied to
345

remote sensing data to
quantitatively compar
e

the spatial distribution of landforms in
346


15

different planetary environments
,
supply non
-
morphological evidence to discriminate
347

between feature identities
,

and
examine
self
-
org
anization processes
within geological
348

systems.

349


350

5
.
1
.

Vesicle analysis

351

Vesicle characteristics can provide information about the pre
-
eruptive history of
352

magmas, volatile exsolution dynamics, and conduit ascent

processes

all of which are

353

important factors in

controlling the intensity of explosive

volcanic eruptions.
354

Quantification of these p
arameters

relies
largely
upon the use of
vesicle
size,
number
,
355

and

volume
d
istributions

to establish

link
s

between natural volcanic samples and
356

experimental data (Adams
et

al.
, 2006)
.
Nested
SEM

images at several magnifications
357

are typically used in combination
with
stereology

(
e.g.
, Sahagian and Proussevitch,
358

1998) to transform 2D vesicle properties into 3D
distributions
.
Although

acquisition
359

and analysis of a complete nes
ted SEM dataset is outside the scope of this study, we
360

present an example of how GIAS can be used to analyze vesicle properties within a
361

single image.
For this example, we have chosen
a

magmatic tephra sample
(LAK
-
t
-
362

23c, Emma Passmore, unpublished data)
fr
om
1783
-
1784 Laki eruption in Iceland.
363

The resolution of the image is 3.81 μm / pixel and we have
followed the method of
364

Gurioli
et al.

(
2008) by
us
ing a
minimum object area threshold of 20 pixels (76.2
365

μm
2
)

to avoid complications associated with poorly re
solved objects near the
limit of
366

resolution
within our data
.

367


The
output of the "I
mage
A
nalysis
"

module

(Table 3 and Figure 1)
reveals
that
368

the

sample has a
vesicularity
of

65.48%

and

excluding objects in contact with the
369

periphery of the image

there are
537 objects larger than
the detection threshold of
370

76.2 μm
2
.
Assuming the properties of an equal area circle, the
v
esicle diameters range
371


16

from 1
9
.
21

μm
to 2.85 × 10
3

μm, with a mean of
3
.
01

× 10
2

μm
±
6.89

× 10
2

(
at 1 σ).
372

The
ratio of
equal area

circle

circumference
to object perimeter is 0.77
±
1.53

(
at 1 σ
)
;
373

however
;
for the 21 vesicles
larger
than 3.10
× 10
5

μm
2

the ratio decreases to 0.55
374

±
0.21, which indicates that the largest vesicles strongly deviate from a circular
375

geometry
(see the graph of “Objects Perimeters”
).
The vesicles ha
ve a mean
376

eccentricity

of 0.7
1

± 0.18

and a
mean

orientation

trending

7
.
27
° to 18
7
.
27
° ±
4
3
.
91
°
377

(
n.b.
,
90° point
s

t
owards the top of the image;
see “Object Eccentricity” and “Object
378

Orientations” in Figure 1).

379


The NN
module
results (Table 4 and Figure 2) show that NN distances

380

between vesicle centroids range from
28
.
05
to 1.53 ×10
3

μm, with a mean (
r
a
) of
1.75

381

× 10
2

±
1.15

×10
2

(
see “Nearest Neighbor Frequency Distribution” in Figure 2). The
382

sample has
R

and
c

values
within the

2 σ

thresholds of significance (Figure 2) and
383

thus
we conclude that
the
vesicle centroids
exhibit a

Poisson
(random) distribution
.
384

We have
also
analyzed the sample using a
N
ormalized
Poisson
threshold

of 5 pixels
385

(
19.24 μm)
, which corresponds to
the
diameter of a circle that is equal
in area
to our
386

object area threshold of 20 pixels
. Given
this threshold, the sa
mple exhibits a repelled
387

distribution with
R

and
c

values greater than their respective upper 2
σ limits.
The
388

Scavenged
k

= 1 and 2 cases overestimate
r
e

relative to
r
a
, and thus result in clustered
389

results relative to both Scavenged models.

390

The GIAS outp
ut
s

suggest
that
vesicles within this

sample
exhibit an overall
391

Poisson distribution despite

the presence of a vesicle
fabric
trending

7.27° to 187.27°
392

±43.91°
.

The random distribution of vesicles

supports the usage of a
stereological
393

transformation
, howev
er, deformation and coalescence have caused vesicles
>
3.10
×
394

10
5

μm
2

to
strongly deviate from a circular geometry. Consequently, if the vesicles
395

represented within this thin
-
section were transformed into a 3D distribution

using
396


17

stereology, it would be necessary to acknowledge propagating errors associated with
397

the non
-
c
ircular geometry of the largest vesicles in the sample.

Additionally,
it is
398

important to recognize the resolution limitations associated with analysis because we
399

have applied a
minimum object area threshold

of 20 pixels
, which excludes all
400

vesicles <
76.2 μ
m
2

(
i.e
., 630 of 1167 objects)
.
A detection threshold of 20 pixels is
401

appropriate for studies of vesicle size
-
frequency distributions because an
402

misclassification
of 1 pixel results in an error of 5%

of the estimate of total object area

403

(
Lucia Gurioli,
per
sonal communication,

2009
)
.

Nevertheless, this threshold appears
404

insufficient for

filtering
artifacts associated with the
geometric properties of the
405

smallest
vesicles in the input image. For instance,
eccentricity values calculated for
406

vesicles within the

Laki magmatic tephra sample

appear to be very
high
(
0.71 ± 0.18
),
407

but this is
a
resolution
-
induced artifact resulting from
pixel
-
scale asymmetries in
a
408

large

population of

small vesicle
s
.
Resolution limitations
also

affect the Normalized
409

Poisson NN
distri
bution

because

the large number of objects below the

detection
410

thresholds lead
s

to underestimate
s

of

ρ
0

and
r
e
, which in turn results in a repelled
411

distribution relative to the model.

In instances where the majority of objects are below
412

the detection limit of analysis, it is therefore imperative to use
nested
image
datasets
,
413

and/or high magnification mosa
ics

to resolve the properties of the smallest objects in
414

a given distribution.

415


416

6
. Discussion


417

For
randomly distributed
spherical
object
s
, stereological techniques may be used to
418

convert cross
-
section
al

areas into volumes (
Sahagian and Proussevitch, 1998)
.
419

Higgins (2000) and Morgan and Jerram (2007)
have also applied

2D to 3D
420

transformations
for

crystal shapes such as cubes and prisms based upon a large
421


18

number of observed cross
-
sectional areas
. Nevertheless,
for irregular
object

422

geometries,
transformat
ions

do not exist for generating a
n accurate

3D model
from a
423

single
2D image

(
Mock and Jerram, 2005;
Sable
et al
., 2006).

For vesicle studies,
424

d
irect
measurements of

3D vesicle distributions

using

X
-
ray microtomography

can
be
425

used to
circumvent
the

problem
s as
sociated with stereological transformations

(
e.g.
,
426

Gualda and Rivers, 2006
; Polacci
et al.
, 2007
; Proussevitch
et al.
, 2007b
), but the

427

resolution of synchrotron
-
based
X
-
ray tomogra
phy is

insufficient

to resolve micron
-
428

sized
vesicle walls, which can result
in erroneously large estimates of vesicle
429

permeability (Gurioli
et al.
, 2008)
.

Tomographic resolution limitations similarly
430

prohibit the detection of sub
-
micron vesicles, which
decreases

estimates of vesicle
431

number densities relative to investigations of h
igher resolution

SEM images (Gurioli
432

et al.
, 2008). Consequently,
even given the availability of 3D imaging techniques,
433

thin
-
section analysis continues to supply

unique

information about vesiculation

434

processes and, therefore
, 2D image processing should co
ntinue to be explored.

435


Similarly, NN analysis of a

single

2D image cannot resolve the spatial
436

organization of the centroids
in 3D
, unless additional
reference
information
is
437

provided

(
e.g.
,

Jerram
et al.
, 1996; 2003; Jerram and Cheadle
,

2000)
. Instead,

2
D
438

methods

characterize the

aerial distribution
of objects within

the

plane of the image.
439

When applying NN analyses to vesicle distributions in thin
-
section
s
,

it is important to
440

remember that object centroids
are unlikely to
coincide
with the center of

a
ve
sicle in
441

3D
, especially when vesicles are irregularly shaped. To
obtain
a
more meaningful
442

statistical
description of vesicle organization it may be necessary to analyze several
443

photomicrographs

preferably three mutually perpendicular thin
-
sections that sam
ple
444

the principal component directions of the vesicle fabric.
Vertical offsets in the
445

positi
on of objects in orthometric aerial

photographs and satellite imagery may
446


19

similarly
introduce discrepancies between projected object sep
arations and true NN
447

distanc
es, but
the vertical offset
s

are
generally
small relative to the separation
of
448

objects
in the plane of the image and hence the differences can be
ignored
.

449


450

7
. Conclusions


451

We have developed Geological
Image Analysis Software (GIAS) for
calculating the
452

geom
etric properties of
objects

and
quantifying

their spatial distribution
. The program
453

improves the efficiency and reproducibility of geological object characterizations by
454

combin
ing

image processing tools for calculating object areas,

abundance,

radii,
455

perim
eters, eccentricity, orientation, and centroid location with sample
-
size
-
dependent

456

NN
tests for (1
) Poisson; (2) Normalized Poisson; (3) Scavenged,
k

= 1; (4)

and

457

Scavenged
k

= 2

distributions
. These geospatial
analyses

have not

been implemented
458

in other s
oftware packages

and, c
onsequently, GIAS represent
s

a novel approach
to
459

characterizing vesicle distributions in thin
-
sections and examining the spatial
460

organization of other geological features

in datasets ranging from photomicrographs
461

(
e.g.
, Section 5)
to

remote sensing imagery of planetary surfaces

(
e.g.
, Appendix 1
462

and Hamilton
et al.
, 2010
b)
.

463


Future work will focus on developing the following aspects of GIAS: (1)
464

image segmentation using Kriging analysis of DN histograms (
e.g.
, Oh and Lindquist,
465

1999);

(2) automated object recognition (
e.g.,

Proussevitch and Sahagian, 2001;
466

Lindquist
et al.
, 1996; Shin
et al
, 2005); (3) geometric cluster analysis (
e.g.
, Still
et
467

al.
, 2004);
(4) statistical analyses to address
truncated and multimodal populations
468

(
e.g.
,
Proussevitch

et al.
, 2007a)
; (5) processing nested image datasets (
e.g.
, Gurioli
et
469

al.
, 2008); (6) incorporating stereological techniques for converting 2D data into 3D
470

distributions and then using the 3D models to calculate vesicle number density (
e.g.
,
471


20

Sahagian
and Proussevitch,
1998;
Proussevitch

et al
., 2007a)
; (7) developing NN
472

analyses for 3D datasets (Jerram and Cheadle, 2000; Mock and Jerram, 2005); and (8)
473

treatment of solid object interactions on NN statistics (
e.g
.
,

Jerram
et al.;
2003
)
.

474


475

Acknow
ledgements

476

We thank Alexander
Proussevitch
, Steve Baloga,
Sarah Fagents,
Bruce Houghton
,
477

Lucia Gurioli,
and
Mark Naylor for insightful discussions relating to vesicle
478

distributions and NN simulations
;

Emma Passmore for kindly providing
the
479

geological thin
-
sections examined within this

study
; and
Dougal Jerram

and
480

Guilherme Gualda

for their thorough reviews, which have greatly improved this
481

manuscript.

CDB wishes to acknowledge the assistance of Sue Rigby in providing
482

funding for software purchases. CDB
wa
s

funded under NERC studentship award
483

NER/S/J/2005/13496.

CWH is funded by the Icelandic Centre for Research
484

(RANNÍS) and the
NASA Mars Data Analysis Program (MDAP)

grant
485

NNG05GQ39G
.

SOEST publication number:

7808
. HIGP publication number
1802
.


486


487


488

Reference
s


489

Adams
,

N
.
K
.
,
Houghton
,

B
.
F
.
, Fagents
,

S
.
A
.
, Hildreth
,

W
.
,

2006
.

The transition from
490

explosive to effusive eruptive regime: The example of the 1912 Novarupta eruption,
491

Alaska
.

Geological Society of America Bulletin

118
, 620

634
.

492

doi:10.1130/B25768.1
.


493


494


21

B
aloga
,

S
.
M., Glaze
,

L
.
S
.
, Bruno
,

B
.
C
.
,

2007
.

Nearest
-
neighbor analysis of small
495

features on Mars: Applicatio
ns to tumuli and rootless cones.

J
ournal of

Geophys
ical

496

Res
earch

1
12
,

E03002,

1
-
17
.
doi:10.1029/2005JE002652
.

497


498

Bishop
,

M.A., 2008. Higher
-
order neig
hbor analysis of the Tartarus Colles cone
499

groups, Mars: The application of

geographical indices to the und
erstanding of

cone
500

pattern evolution. Icarus

197, 73

83.

501


502

Bleacher
,

J
.E., Glaze
,

L
.S., Greeley
,

R.
, Hauber

E., Baloga
,

S
.M., Sakimoto
,

S
.
E.H.,
503

William
s
,

D.
A., Glotch
,

T
.D., 2009.
Spatial and alignment analyses for a field of
504

small
volcanic vents south of Pavonis

Mons and implications for the Tharsis province,
505

Mars
.

Journal of Volcanology and Geothermal

Research

185, 96
-
102.
506

doi:
10.1016/j.jvolgeores.2009
.04.008
.

507


508

Blower
,

J
.
D
.
, Keating
,

J
.
P
.
, Mader
,

H
.
M
.
, Phillips
,

J
.
C
.
,
2001
.

Inferring

volcanic
509

degassing

processes from vesicle size distributions.

Geophys
ical

Res
earch

Lett
ers

28
,
510

347

350
.

511


512

Blower
,

J
.
D
.
, Keating
,

J
.
P
.
, Mader
,

H
.
M
.
, Phillip
s
,

J
.
C
.
,
2003
.

The

evolution

of
513

bubble size distrib
utions in volcanic eruptions. J
ournal of

Volcanol
ogy and

514

Geotherm
al
Res
earch

120
,
1

23
.

515


516

Bruno B
.
C
.
, Fagents S
.
A
.
, Tho
rdarson T
.
, Baloga S
.
M
.
, Pilger E
.
, 2004
.

Clustering
517

within rootless cone groups on Iceland and Mars: Eff
ect of nonrandom processes.
518

J
ournal of

Geophys
ical

Res
earch

109
,
E07009
,

1
-
11.

doi:
10.1029/2004JE002273
.

519


22


520

Bruno
,

B
.
C
.
, Fagents
,

S
.
A
.
, Hamilton
,

C
.
W
.
, Burr
,

D
.
M
.
, Baloga
,

S
.
M
.
, 2006
.

521

Identification of volcanic rootless cones, ice mounds, and impact craters
on Earth and
522

Mars: Using spatial distribution as a remote sensing tool.
J
ournal of

Geophys
ical

523

Res
earch

111
,
E06017
,

1
-
16.

doi:10.1029/2005JE002510
.

524


525

Burr
,

D.M., Bruno
,

B.C., Lanagan
,

P.D., Glaze
,

L.S., Jaeger
,

W.L., Soare
,

R.J., Wan
526

Bun Tseung
,

J.M., Skin
ner
,

J.A., Baloga
,

S.M
., 2009.

Mesoscale raised
-
rim
527

depressions (MRRDs) on Earth: A review of the characteristics, processes and spatial
528

distributi
ons of analogs for Mars. Planetary

and
Space Sci
ence

57, 579
-
596.

529


530

Cashman
,

K
.
V
.
, Mangan
,

M
.
T
.
, 1994
.

Physic
al aspects of magmatic degassing: II.
531

Constraints on vesiculation processes from textural studies of eruption products. In:
532

Carroll
,

M.R.
and Holloway
, J.R.

(Ed
s
.)
,
Volatiles in Magmas. Reviews in
533

Mineralogy
, Mineralogical Society of America
,

p
p
.
447

478.

534


535

Cashman
,

K
.
V
.
, Kauahikaua, J
.
P
.
, 1997
.

Reevaluation of vesicle
distributions in
536

basaltic lava flows. Geology 25
,
419

422
.

537


538

Clark
,

P
.
J
.
, Evans
,

F
.
C
.
, 1954
.

Distance to nearest neighbor as a measure of spati
al
539

relationships in populations.

Ecology

35
, 445

4
53.

540


541

Finney J
.
,
1970
.

The geometry of random packing. Proceedings of the Royal
Society

542

of London, Series A 318
,
479
-
493.

543


544


23

Gaonac'h
,

H
.
, Lovejoy
,

S
.
, Scher
tzer
,

D
.
,
2005
.

Scaling vesicle distributions and
545

volcan
ic eruptions.
Bull
etin

of Volcanol
ogy

67,
350

357.

546


547

Gualda
,

G
.
A
.
R
.
, Rivers
,

M
.
,

2006
.

Quantitative 3D petrography using
X
-
ray
548

tomography: Application

to Bishop Tuff pumice clasts.

J
ournal of
Volcanol
ogy and

549

G
eotherm
al

Res
earch

154
,
48

62
.

550

Gurioli
,

L
.
,
Harris
,

A
.
J
.
L
.
, Houghton
,

B
.
F
.
, Polacci
,

M
.
, Ripep
e
,

M
.
, 2008
.

Textural
551

and geophysical characterization of explosive basaltic activity at Villarrica volcano
,
552

Journal of Geophysical

Res
earch
113,

B08206
, 1
-
16.

doi:10.1029/2007JB005328
.

553


554

Hamilton
,

C
.
W
.
, Thordarson
,

T
.
, Fagents
,

S
.
A
.
, 2010
a,
Explosive lava
-
water
555

interactions I: architecture and emplacement chronology of volcanic rootless cone
556

groups in the 1783
-
1784 Laki lava flow, Iceland. Bulletin of Volcanology, (in press)

557


558

Hamilton
,

C.W., Fagent
s,

S.A., Thordarson
,

T., 2010
b
, Explosive lava
-
water
559

interac
tions II: self
-
organization processes among volcanic rootless eruption sites in
560

the 1783
-
1784 Laki lava flow, Iceland.
Bulletin of Volcanology
, (in press).

561


562

Higgins, M.D., 2000. Measurement of crystal size distributions. American
563

Mineralogist 85
,

1105
-
1116
.

564


565

Jerram
,

D
.
A
.
, Cheadle
,

M
.
J
.
, Hunter
,

R
.
H
.
, Elliot
,

M
.
T
.
,
1996
.


The spatial
566

distribution of crystals in rocks.
Contributions to Mineralogy and Petrology 125(1)
,
567

60
-
74.

568


24


569

Jerram, D. A.,

Cheadle, M. J., 2000. On the cluster analysis of roc
ks. American
570

Mine
ralogist 84
, 47
-
67.

571


572

Jerram
,

D
.
A
.
, Cheadle
,

M
.
J
.
, Philpotts
,

A
.
R
.
,
2003
.

Quantifying the building blocks of
573

igneous rocks: are clustered crystal frameworks the foundation? Journal of Petrology
574

44
,
2033
-
2051
.

doi:10.1093/petrology/egg069.

575


576

Lautze
,

N
.
C
.
, Hou
ghton
,

B
.
F
.
, 2005.

Physical mingling of magma and complex
577

eruption dynamics in the shallow conduit at Stromboli volcano, Italy
.

Geology

33

578

425

428.

579


580

Lindquist
,

W
.
B
.
, Lee
,

S
-
M
.
, Coker
,

D
.
A
.
, Jones
,

K
.
W
.
, Spanne
,

P
.
, 1996
.

Medial axis
581

analysis of void struct
ure in three
-
dimensional tomographic images of porous media.
582

J
ournal of

Geophys
ical

Res
earch 101,

8297

8310
.

583


584

Mangan
,

M
.
T
.
,

Cashman
,

K
.
V
.
, 1996
.

The structure of basaltic scoria and reticulite
585

and inferences for vesiculation, foam formation, and fragmentat
ion in lava fountains
.

586

Journal of Volcano
logy and Geothermal Research 73,

1

18.

587


588

Mangan
,

M
.
T
.
, Cashman
,

K
.
V
.
,

Newman S
.
, 1993
.

Vesiculation of basaltic ma
gma
589

during eruption, Geology 21,
157

160
.

590


591

Mangan
,

M
.
T
.
, Mastin
,

M
.
, Sisson
,

T
.
, 2004
.

Gas evolution
in eruptive conduits:
592

combining insights from high temperature and pressure decompression experiments
593


25

with steady
-
state flo
w modelling.

Journal of Volcanol
ogy and Geothermal Research
594

129,

23

36.

595


596

Marsh
,

B
.
D
.
,
1988
.

Crystal size distributions (CSD) in rock
s and the

kinetics and
597

dynamic
s of crystallization: 1. Theory
.

Contr
ibutions to Mineralogy and

Pet
rology

99
,

598

277

291.

599


600

Mock, A
.,

Jerram, DA.
,

2005. Crystal size distributions (CSD) in three dimensions:
601

Insights from the 3D reconstruction of a highly porphy
ritic rhyolite. Journal of
602

Petrology 46
,

1525
-
1541
.

603


604

Morgan, D.,

Jerram, D.A
.,
2006
.

On estimating crystal shape for crystal size
605

distribution analysis. Journal of Volcanology and Geothermal Research 154
,

1
-
7
.

606


607

Oh W
.
, Lindquist W
.
, 1999.

Image thresholding

by indicator Kriging. IEEE
608

Trans
actions

on Pattern Analysis and Machine Intelligence

21,
590

602.


609


610

Polacci
,

M
.
,
Baker
,

D
.
R
.
,
Bai
,

L
.
,

Mancini L
.
, 2007.

Large vesicles record pathways
611

of degassing at basaltic volcanoes. Bull
etin of

Volcanol
ogy

70, 1023
-
1
029.
doi

612

10.1007/s00445
-
007
-
0184
-
8
.

613


614

Polacci
,

M
.
, Baker
,

D
.
R
.
, Mancin
i
,

L
.
, Tromba
,

G
.
, Zanini
,

F
.
, 2006.

Three
-
615

dimensional investigation of volcanic textures by X
-
ray

microtomography and
616

implications for conduit processes
.
Geophys
ical

Res
earch

Lett
ers 33,

L13312
, 1
-
5.
617

doi:10.1029/2006GL026241
.


618


26


619

Polacci
,

M
.
, Papale
,

P
.
, 1997
.

The evolution of lava flows from

ephemeral vents at
620

Mount Etna
: insights from
vesicle distribution

and morphological studies. J
ournal of

621

Volcanol
ogy and

Geotherm
al

Res
earch 76,
1

17
.

622


623

Proussevitch
,

A
.
, Sahagian
,

D
.
,

Tsentalovich, E
.
P
.
,
2007
a
.

Statistical analysis of
624

bubble

and crystal size distributions:
Formulations and procedures. Journal of
625

Volcanology and Geothermal Research
164
, 95
-
111.

626


627

Proussevitch
,

A
.
,

Sahagian
,

D
.
,
Carlson
,

W
.
,
2007
b
.

Statistical analysis of bubble and
628

crystal size distributions: Application to Colorado Plateau
Basalts. J
ournal of

629

Volc
anology and

Geotherm
al

Res
earch 164,
112
-
126
.

630


631

Proussevitch
,

A
.
, Sahagian
,

D
.
, 2001
.

Recognition and separation of discrete obje
cts
632

within complex 3D voxelized structures. Computers
& Geosciences

27,
441

454
.

633


634

Rasband
,

W.S., 2005. ImageJ, U. S. National Institutes of Health, Bethesda,
635

Maryland, USA,
http://rsb.info.nih.gov/ij/

[accessed
November 28, 2009].

636


637

Sahagian
,

D
.
,

Proussevitch
,

A
.
, 1998.

3D particle size distributions from 2D
638

observations: stereology

for natural applications
.
J
ournal of

Volcanol
ogy and

639

Geotherm
al

Res
earch

84
,
173

196
.


640


641

Sable
,

J
.
, Hough
ton
,

B
.
F
.
, Del Carlo
,

P
.
, Col
telli,
M.,
2006
.

Changing

conditions of
642

magma ascent and fragmentation during the Etna 122

BC basaltic Plinian eruption:
643


27

evidence
from clast microtextures. J
ournal of

Volcanol
ogy and

Geotherm
al

Res
earch
644

158,
333

354
.

645


646

Sarda
,

P
.
,

Graham
,

D
.
,
1990
.

Mid
-
ocean

ridge popping rocks: implication

for
647

degassin
g at ridge crests. Earth
and
Planet
ary

Sci
ence

Lett
ers

97
,
268

289.

648


649

Shin
,

H
.
, Lindq
uist
,

W
.
B
.
, Sahagian
,

D
.
L
.
, Song
,

S
-
R
.
, 2005
.

Analysis of the vesicular
650

structure of basalts. Computers
& Geosciences 31,
473

487
.

651


652

Stephenson
,

G
.
, 1992
.

Mathematical Methods for Science Students
,
2nd edn.,
653

Longman Scientific &
Technical
, London
, 528pp
.

654


655

Still
,

S
.
,

Bialek
,

W
.
, Bottou
,

L
.
, 2004
.

Geometric clustering using the Information
656

Bottleneck method. In: Thrun,
S.,
Saul
,

L.,

and Schölkopf
, B.

(
E
ds
.
)
,

Advances in
657

Neural Information Processing Systems, 16
,

MIT Press
,

Cambridge, MA
, pp. 1165
-
658

1172
.


659



660

S
zramek
,

L
.
, Gardner
,

J
.
E
.
, Larsen
,

J
.
, 2006
.

Degassing and microlite crystallization
661

of basaltic andesite

magma erupting at Arenal Volcano, Costa Rica
.

J
ournal of

662

Volcanol
ogy and
Geotherm
al

Res
earch

157
,
182

201
.

663


664

Wilkins
,

D.E., Ford
,

R.L., 2007. Nearest neighbor methods applied to dune field
665

organization: The Coral Pink Sand Dunes, Kane County, Utah, USA. Ge
omorphology
666

83, 48
-
57.

667

668


28

Appendix 1:

Additional nearest neighbor analysis examples


669

Explosive interactions
between

lava and
ground
water can generate volcanic rootless
670

cone
s

(VRC
s
)
. VRCs

are well

suited to
nearest neighbor (NN)

analyses because they
671

occur in

groups, which consist of a large number of landforms that share a
common
672

formation mechanism.

Geological Image Analysis Software (GIAS) includes
NN
673

analysis

tools that were used by Hamilton
et al.

(2010
b
) to investigate
self
-
674

organization

processes
within
VRC groups
and

quantitatively compare the spatial
675

distribution of VRCs in the Laki lava flow in Iceland

to analogous landforms in the
676

Tartarus Colles Region of Mars.

677


Figure 1 presents example
s

of rootless eruption sites within the Hnúta and
678

Hrossatungur
g
roups of the
Laki lava flow.
In the field,
Differential Global
679

Positioning
System
(DGPS) measurements
were used to delimit rootless crater floors
680

and the
ir

centroids were assumed to represent the surface projection of underlying
681

rootless eruption sites
(Ha
milton
et al.
, 2010a, 2010
b)
.

Hamilton
et al.

(2010
a) used
682

t
ephrochronology
and
s
tratigraphic

relationships

to divide the complete study area
683

(Figure 1A)

into
chronologically and spatially distinct
groups,
domains
,

and
684

subdomains
. Figure 1B provides an exa
mple of a VRC subdomain (Hnúta Subdomain
685

1.1).

Figures 1C and D show the
convex hull boundar
ies of the complete study area
686

and Subdomain 1.1

of the Hnúta Group
,

respectively. Figure 1 also shows the

687

locations of the convex hull vertices (
N
v
) and interior r
ootless eruption sites (
N
i
).



688


Identification of internal boundaries within the Hnúta and Hrossatungur
689

groups allowed Hamilton
et al.

(2010
b) to examine the effects of scale and resource
690

availability on the spatial distribution explosive lava
-
water intera
ctions.
Relative to
691

the Poisson NN model, rootless eruption sites in
both the

complete study area

(
N
i

=
692

2193) and Hnúta Subdomain 1.1 (
N
i

= 98) have |
c
| and |
R
| values that exceed their
693


29

respective
sample
-
size
-
dependent limits of confidence at 2 σ
.

Conseque
ntly, neither of
694

these

distributions
are constant with a Poisson random
model
. Within the complete
695

study area
,

R

is 0.
72
, which is

lower
than the
2 σ
confidence
limit of
R

(
0.9
8
) and

696

implies that the distribution is clustered with respect to
the

Poisson NN

model.

I
n
697

Hnúta Subdomain 1.1
,

R

is 1.23, which is above the upper 2 σ
confidence
limit of
698

1.16 and implies that the distribution is repelled relative to
a

Poisson NN
model
.
699

Hamilton
et al.

(2010
b) interpret
these
patterns of spatial distribution within t
he
700

context of resource
abundance

and

depletion

though

competiti
ve utilization
. O
n
701

regional scales
,

VRCs cluster in locations that contain sufficient

lava and groundwater

702

to initiate rootless eruptions
,
but on

local scales rootless eruption sites tend to se
lf
-
703

organize into distributions that maximize the utilization of limited groundwater

704

resources.
Hamilton
et al.

(2010
b) interpret topograph
y

to be the primary influence on
705

regional clustering because it

would control initial groundwater abundance and
706

region
s of lava inundation. In contrast, they attribute

local repelling
to
the
non
-
707

initiation
,

or early cessation
,

of rootless
eruptions

at
unfavorable

localities

due to
708

groundwater
depletion
in
a competitive
system

that is analogous to an aquifer
709

perforated by
a network of extraction wells
.

710


Baloga
et al.

(200
7
) show
ed

that r
ootless eruption sites in the
Tartarus Colles
711

Region of Mars

have

R

= 1.22

and
c

= 6.14. These N
N

statistics

are similar to those
712

observed within Hnúta Subdomain 1.1 (
R

= 1.23 and
c

= 4.45)
, which suggests that

a
713

similar formation

process in both environments
.
Thus
the statistical

NN analys
i
s
tools
714

provided within GIAS
can be combined with
field

observations and remote sensing to
715

quantify patterns of spatial distribution and infer underlying

mechanisms of formation
716

within a

diverse range of geological systems.

717


718


30

Appendix 1 Figure 1
:

719

A and B depict
rootless eruption sites
(red circles)
superimposed on a 0.5 m/pixel
720

color aerial photograph
showing

1783
-
1784
Laki
eruption
lava flow
s

in Iceland.
A:
721

Full e
xtent of Hnúta and Hrossatungur groups

and an inset
(white rectangle)
denoting
722

Hnúta Subdomain 1.1

(
shown in B
)
. C and D correspond to
identical

regions
shown in

723

A and B, respectively, and provide exa
mples of convex hull boundaries,
rootless
724

erupt
ion sites
that form

convex hull vertices (
N
v
)
,

and interior points (
N
i
).

725


726

Appendix 1
Figure 1
727


31

Appendix

2
:

Derivation for the Scavenged Nearest Neighbor (
k

= 2) case

728

The probability of finding exactly
k

points within a circle of radius
r

on a plane can b
e
729

found
using

the Poisson distribution:

730


2
0
!
)
(
)
,
(
2
0
r
k
e
k
r
k
r
P




,




[
1
]

731

where
ρ
0

is the spatial density

defined by the number

of
objects (
N
)
divided by the
732

area of the featur
e field

(
A
)
.
If the process of creating a point on the plane consumes
733

(
or

scavenges

)
,

resources that would have formed the
k
th

nearest neighbor, then the
734

process can be modelled as

(Baloga
et al.
, 2007)
:

735


)
(
)
1
(
)
0
(
1
)
scavenges

;
(
k
P
P
P
k
r
P





.




[
2
]

736

For
k

=
2
,
t
he

density distribution of the function can be written as:

737


2
0
5
3
0
)
(
)
(
r
e
r
r
p




.





[
3
]

738

For
convenience
, we write
0



, le
ading to:

739


2
5
3
)
(
r
e
r
r
p




.





[
4
]

740

To find the mean of this distribution, we multiply by
r
,
integrate

from 0 to ∞,

and
741

normalise the resulting equation, to
obtain

first moment of the distribution
r
:

742


743












0
5
0
5
3
2
2
dr
e
r
dr
e
r
r
r
r
r



.






[
5
]

744


745

The integrals can be solved using integration by parts and two mathematical
746

identities for the odd and even terms of
r

(
e.g.
,

Stevenson, 1992). F
or the even terms,
747

the
followin
g
identity

can be used,

748


749


32


750


2
1
2
1
0
2
2
)
1
2
(
5
.
3
.
1








n
n
r
n
n
dr
e
r




,






[
6
]

751

while odd integrals of the type,

752











0
1
2
2
dr
e
r
r
n







[
7
]

753

can be solved using the substitution
x
2

=

u

and solv
ing using integration by parts,

754

e.g.
,

755


du
ue
dr
e
r
u
r








0
0
3
2
1
2


.



[
8
]

756

Th
is

leads to the following solution for the mean distribution of the nearest neighbour
757

distances

for
k =

2:

758


0
3
0
3
2
7
2
7
0
3
2
7
4
16
15
1
16
15
1
2
5
3
1

















r

.





[
9
]

759

Th
e second moment
2
r
can be calculated in a similar manner:

760


0
0
5
0
3
2
2
2
3
2
2















dr
e
r
dr
e
r
r
r
r
r
r

.




[
10
]

761

The standard deviation is c
alculated using the identity:
2
2
r
r



:

762


0
0
2
2
0
27572
.
0
16
15
3









.


[
11
]

763

Thus, the expected Standard Error (
σ
e
) is:

764


0
27572
.
0




N
e

,





[
12
]

765


33

where
N

is the number of points in the area of interest.

The third and fourth moments
766

can be derived as above, leading to:

767


768


2
3
0
0
5
0
5
3
3
32
105
2
2














dr
e
r
dr
e
r
r
r
r
r

;



[
13
]

769



2
0
0
5
0
5
4
4
)
(
12
2
2














dr
e
r
dr
e
r
r
r
r
r

.



[14]

770

Using the moment formulae for skewness and kurtosis:

771



3
2
3
2
3
r
r
r
r
skew




;
4
2
2
3
4
3
6
4
r
r
r
r
r
r
kurt




,

[15]

772


we obtain:

773


2
3
0
3
006663
.
0



skew

;
2
0
4
0174838
.
0



kurt

.




[16; 17]

774

775


34

Appendix
3
:
Separation of population means

776

To examine the approximate number of samples required to find a statistically
777

significant separation of two Gaussian distributions, where the expected means and
778

standard deviation
s are known, a commonly
-
used test is to examine if
zero

lies within
779

the following confidence interval:

780


781


95
.
0
96
.
1
s
s
96
.
1
s
s
P
1
0
1
0
1
0
1
0
1
0




















N
N
N
N






,

[1]

782

where
s
0

and

s
1

are the sample means,
μ
0

and
μ
1

are the expected means and
σ
0

and
σ
1

783

are the expected standard deviations.
N

can be varied to estimate the number of
784

samples required to separate the distributions. In particular, if zero is within the
785

interval, then there is no discernible diffe
rence between the means of the distributions
786

at the 95% confidence level.

787

This method is applied to simulate two different means based upo
n the
788

expected values of the Poisson

and the Scavenged NN when
k

= 1 and
k

= 2. To
789

simplify the problem, an approximat
ion to a normal distribution is assumed.

790

Simulation from two normal distributions with increasing numbers of
N

was
791

coded in MATLAB.
We found

the number of samples
required
to statistically
792

differentiate between the distributions
by varying
ρ
0
.

793

For PNN and Scavenged NN,
k

= 1:

794


795

ρ
0

(objects / unit area)

Minimum
N

required to separate
Poisson NN and Scavenged NN (
k

= 1)

0.5

50

0.05

60

0.005

60

0.00
0
5

60


796

For Scavenged NN,
k

= 1 and Scavenged NN,
k

= 2:

797


35

ρ
0

(objects / unit area)

Minimum
N

required to separate
Poisson NN and Scavenged NN (
k

= 2)

0.5

100

0.05

120

0.005

130

0.0005

150


798

Given that

these simulations
assume

N
ormal distribution
s
, it would be wise to
799

regard them as indicative

of the minimum number of data
points
required to
separate
800

the competing models. A
much
larger number of samples
should be used

separate the
801

Scavenged NN,
k

= 1

and
k

= 2 models

because their

expected means are relatively
802

close together.

803

804


36

Electronic Supplements

805

(1)
MATLAB preparsed pseudo code and s
ta
n
dalone MATLAB executable file,
806

available to the public at

http://
www.geoanalysis.org
.


807

(2) GIAS help file

808


809

Figures and Tables

810

Tables

811

Table 1:

Notation
.

812


813

Table
2
:

Nearest neighbor model properties

(after Baloga
e
t al.
, 2007)
.


814


815

Table
3
:

“Image Analysis” module
: summary of results
.

816


817

Table
4
:

“Nearest Neighbor Analysis” module
: summary of results
.

818


819

Figures

820

Figure 1:

Image analysis program in
“Image Analysis”

mode
, with

maximum object
821

areas truncated to ~10
4

pix
els

for visualization purposes which
excludes very
large
822

(
and least circular
)

vesicles from displayed plots and histograms.
Note that o
bject
823

thresholding does not affect
results presented in Tables 3 and 4.

824


825

Figure 2:

Image analysis program in
“N
earest
N
eigh
bor
Analysis”
mode

with

outputs
826

display
ing

results for
a Poisson NN test
.


827

828


37

T
able
1
.

829

Variable

Definition

A

Area of a feature field

A
hull

Area of the convex hull

C

Statistic for comparing observed (actual) to expected mean NN distances

k

Poisson index

N

Number of features within a sample population

N
i

Number of features within the convex hull

N
v

Number of features
forming the vertices of

the convex hull

NN

Abbreviation for nearest neighbor

P

Probability

R

Statistic for comparing actual (
r
a
) to ex
pected (
r
e
) mean NN distances

R

Radial distance

r
o

Threshold distance used
to calculate Normalized Poisson NN distributions

r
a

Mean NN distance observed in the data

r
e

Mean NN distance calculated from a given NN model (
e.g.
, Poisson)

P

Probability den
sity

ρ
0

Population (spatial) density of the input features (
ρ
0

=
N
/
A
)

ρ
i

Population (spatial) density of the input features (
ρ
i

=
N
i

/
A
hull
)

Σ

Standard deviation

σ
e

S
tandard
error


830


38

Table 2.

831


832

Statistic

Poisson

Scavenge
k

= 1

Scavenge
k

= 2




Normalized NN



Logistic

Probability
Density
(
P
)

2
0
0
2
r
re





2
0
3
2
0
)
(
2
r
e
r





2
0
5
3
0
)
(
2
r
e
r





)
(
0
2
2
0
0
2
r
r
re







2
)
(
)
(
]
1
[
b
m
r
b
m
r
e
b
e






Mean
(
r
e
)

0
2
1


0
4
3


0
16
15






0
2
0
2
0
0
2
0
2
r
r
r
dr
e
r
e




m

Mode
(
r
mode
)

0
2
1


0
2
3


0
2
5


0
2
1


m

Standard Error
(
σ
e
)

0
26136
.
0

N

0
272249
.
0

N

0
27572
.
0

N

N
r
r
0
2
2
0
1




3

b

Skewness

2
3
0
3
4
)
3
(





2
3
0
3
008187
.
0



2
3
0
3
0066638
.
0




















2
0
0
2
0
3
2
2
3
)
3
(
1
r
r
r
r
r



0

Kurtosis

2
0
4
2
)
(
16
3
2





2
0
4
016807
.
0



2
0
4
0174838
.
0
























2
0
4
0
2
0
2
0
2
0
4
3
3
2
6
4
1



r
r
r
r
r
r

5
6


39

T
able 3.


833

Image Analysis


Magmatic vesicles

(base unit: μm)

Resolution (pixel size)

3.81 μm

Minimum object area

20 pixels = 76.2
μm
2

Total numb
er of objects

(excluding boundary objects)

537


Abundance (% of total)

65.48

Area (min. to max.)

2.90

×10
2

t
o
6
.40 ×10
6

Mean

a
rea (
±
1 σ)

7.11

×10
4

(
±3.73

×10
5
)

Eccentricity (min. to max.)

0.00 to 0.97

Mean e
ccentricity (
±
1 σ)

0.7
1

(
±
0.18)

Perimeter
(min. to max.)

56.50

to
3.55

×10
4

Mean

p
erimeter (
±
1 σ)

8.64
×10
2

(
±2.29

×10
3
)

Orientation (min. to max.)

-
89.3
2

to 90.00

Mean

o
rientation (
±
1 σ)

7.27

(
±
4
3
.
91
)

834


40

Table 4.

835

Nearest Neighbor (NN)

Magmatic vesicles

Analysis

(base unit: μm)

Input distri
bution properties


Resolution (pixel size)

3.81 μm

Minimum object area

20 pixels = 76.2 μm
2

Objects inside convex hull (
N
i
)

520

Interior object density (
ρ
i
)

7.79 ×10
-
6

Range of NN distances

28.05 to 1.53 ×10
3

Mean NN distance (
r
a
; ±1 σ)

1.75 ×10
2


1.15 ×10
2
)

Skewness

2.48

Kurtosis

16.13



Poisson model



Expected mean NN distance (
r
e
)

1.79 ×10
2

R

with thresholds at 2 σ

0.98 (0.97 and 1.07)

c

with thresholds at 2 σ

-
0.91 (
-
1.33 and 2.83)

Conclusions (
c

value)


Model supported (<2 σ)


Normal
ized Poisson model



Distance threshold

5 pixels = 19.05 μm

Expected mean NN distance (
r
e
)

1.50 ×10
2

R

with thresholds at 2 σ

1.17 (0.97 and 1.07)

c

with thresholds at 2 σ

3.51 (
-
1.34 and 2.88)

Conclusions (
c

value)


Model rejected (>2 σ)

Repelled vs.

Normalized

Scavenged (
k

= 1)

model


Expected mean NN distance (
r
e
)

2.69 ×10
2

R

with thresholds at 2 σ

0.65 (0.99 and 1.06)

c

with thresholds at 2 σ

-
21.82 (
-
0.69 and 3.91)

Conclusions (
c

value)


Model rejected (>2 σ)

Clustered vs.

k

= 1

Scaveng
ed (
k

= 2)

model


Expected mean NN distance (
r
e
)

3.36 ×10
2

R

with thresholds at 2 σ

0.52 (1.00 and 1.06)

c

with thresholds at 2 σ

-
37.05 (
-
0.14 and 4.78)

Conclusions (
c

value)


Model rejected (>2 σ)

Clustered vs.
k

= 2


836


41

Figure 1

837


838


42

Figure 2

839


840