
Evolving inversion methods in Geophysics with Cloud Computing
a case study of an eScience collaboration



Mudge, Chandrasekhar, Heinson, Thiel

Prof J Craig Mudge FTSE

University of Adelaide

Australia

School of Computer Science / School of Earth Sciences


7th IEEE eScience Conference, Stockholm, December 2011


Two South Australian successes in geology

1. Hot rocks for geothermal energy - 95% of investment is in South Australia

2. Olympic Dam - BHP Billiton -- world's fourth largest copper deposit, fifth largest gold deposit and the largest uranium deposit.

craig.mudge@adelaide.edu.au IEEE eScience 2011

Outline

1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal - EGS in South Australia
5. Results and Lessons learned
6. Future work


Cloud service provider

owns and operates the infrastructure and innovates to
- keep technology leading edge,
- handle software upgrades, and
- steadily reduce energy costs

Google, The Dalles, Oregon

Microsoft Azure, Chicago

Air flow

Massive scale of data centres delivers 4-7X cost reduction and energy efficiency


A no-machines Lab

eScience enabled by cloud computing

Seed funding from
-- Department of Mines (www.pir.sa.gov.au)
-- MSFT Research Jim Gray Seed Grant

Started June 2010


Our three cloud service providers

1. Amazon Web Services
2. Microsoft Azure

Now adding government funded eResearch clouds which will run OpenStack (NASA and Rackspace)


Magnetotelluric (MT) imaging

1. Using the magnetic and electric fields of the earth, MT imaging determines the resistivity structure of a sub-surface area of interest.
2. It goes deeper (a hundred or so km) than seismic (<2 km) but does not have the same resolution.
3. Applications
   1. mineral exploration,
   2. water management in mining,
   3. geothermal exploration,
   4. carbon storage,
   5. aquifer research and management,
   6. earthquake and volcano studies.

CO2 in depleted gas field (Heinson and Mudge, 2010)
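
The depth claim in point 2 can be illustrated with the standard skin-depth relation for a uniform half-space. The sketch below is not from the slides: it assumes a uniform 100 ohm-m earth, and the function names are hypothetical. It also shows how an apparent resistivity is recovered from the ratio of electric to magnetic field (the impedance), which is the quantity MT imaging works from.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # magnetic permeability of free space, H/m


def skin_depth_m(resistivity_ohm_m, period_s):
    """Depth (m) at which field amplitude falls to 1/e in a uniform half-space."""
    omega = 2 * math.pi / period_s
    return math.sqrt(2 * resistivity_ohm_m / (omega * MU0))


def halfspace_impedance(resistivity_ohm_m, period_s):
    """Complex surface impedance Z = E/H (ohms) of a uniform half-space."""
    omega = 2 * math.pi / period_s
    return cmath.sqrt(1j * omega * MU0 * resistivity_ohm_m)


def apparent_resistivity(impedance, period_s):
    """Apparent resistivity (ohm-m) recovered from a measured impedance."""
    omega = 2 * math.pi / period_s
    return abs(impedance) ** 2 / (omega * MU0)


if __name__ == "__main__":
    # Assumed 100 ohm-m half-space; longer periods sense deeper structure.
    for period in (0.01, 1.0, 100.0, 10000.0):
        depth_km = skin_depth_m(100.0, period) / 1000
        print(f"period {period:>8} s -> skin depth {depth_km:8.2f} km")
```

Because the skin depth grows with the square root of the period, recording long-period natural fields is what lets MT sense far deeper than a seismic survey, at the cost of resolution.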


Electrical resistivity

Electromagnetic methods

Data logging by University of Adelaide Geophysics, on a geothermal site - Paralana, SA, Australia


MT Processing steps


Inversion

[Flowchart. Process boxes: start; compute sensitivity matrix; compute model's MT response; compare model response to observed data; locally improve model misfit; locally improve model smoothness; finish. Decision boxes (yes/no): required misfit?; can locally improve misfit?; smooth enough?; can locally improve smoothness?; > max iterations?]


Inversion iterations: compute model response, compare with observed data




Searching the solution space
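
The iteration described above (compute the model response, compare it with observed data, update, stop at the required misfit or the iteration limit) can be sketched on a toy problem. This is a minimal illustration, not the inversion code used in the project: the forward model is an assumed uniform half-space with a single unknown (log10 resistivity), the sensitivity is taken by finite difference, the update is a plain Gauss-Newton step, and the smoothness branch is omitted. All function names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic permeability of free space, H/m


def forward(log_rho, periods):
    """Toy forward model: |Z| of a uniform half-space at each period."""
    rho = 10.0 ** log_rho
    return [math.sqrt(2 * math.pi / t * MU0 * rho) for t in periods]


def relative_rms(predicted, observed):
    """Root-mean-square of the relative data residuals."""
    terms = [((p - o) / o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(terms) / len(terms))


def invert(observed, periods, log_rho=0.0, target=1e-6, max_iter=50):
    """Iterate until the required misfit or the iteration limit is reached."""
    for iteration in range(max_iter + 1):
        predicted = forward(log_rho, periods)          # compute model response
        misfit = relative_rms(predicted, observed)     # compare with observed data
        if misfit <= target or iteration == max_iter:  # required misfit? max iterations?
            return log_rho, misfit, iteration
        # Sensitivity of each datum to the model parameter (finite difference).
        step = 1e-6
        shifted = forward(log_rho + step, periods)
        sens = [(s - p) / step for s, p in zip(shifted, predicted)]
        # Gauss-Newton update: least-squares fit of the residuals.
        numerator = sum(j * (o - p) for j, o, p in zip(sens, observed, predicted))
        denominator = sum(j * j for j in sens)
        log_rho += numerator / denominator


if __name__ == "__main__":
    periods = [0.1, 1.0, 10.0, 100.0]
    observed = forward(2.0, periods)  # synthetic data from a 100 ohm-m model
    print(invert(observed, periods))
```

A real MT inversion has thousands of model cells, so the sensitivity matrix and forward response dominate the run time, which is why the loop is a natural target for cloud parallelism.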



Setting up a new inversion - part 1


Setting up a new inversion - part 2


Dashboard


Results and Lessons learned


Speedup


[Chart: Sequential versus Parallel]

Performance analysis beyond speedup


[Chart: Sequential versus Parallel]

Examples of recent performance analysis

1. The effect of the FORTRAN compiler with different optimisations has been worth exploring: a factor of 3X speedup from the Intel Visual Fortran Composer XE 2011 for Windows.

2. “Steal time” - time lost due to the hypervisor’s management of a virtual machine. Netflix have analysed their Amazon experience extensively.
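
As an illustration of how steal time can be observed, the sketch below assumes a Linux guest, where the aggregate `cpu` line of `/proc/stat` reports steal as its eighth counter. The slides do not say how the authors measured it, so this is only one possible approach, and the function names are hypothetical.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into (total, steal) ticks."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Counter order: user nice system idle iowait irq softirq steal [guest guest_nice]
    steal = values[7] if len(values) > 7 else 0
    total = sum(values[:8])  # guest time is already counted inside user and nice
    return total, steal


def steal_fraction(before_line, after_line):
    """Fraction of CPU time stolen by the hypervisor between two samples."""
    total0, steal0 = cpu_times(before_line)
    total1, steal1 = cpu_times(after_line)
    elapsed = total1 - total0
    return (steal1 - steal0) / elapsed if elapsed > 0 else 0.0


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as handle:
        return handle.readline()
```

Sampling `read_cpu_line()` before and after a run and passing both lines to `steal_fraction` gives the share of the interval during which the virtual machine was ready to run but was not scheduled.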



Results and learnings

1. “No-machines” works
2. Speedup has led to 100% adoption in MT research
3. First results of monitoring fluid injection in EGS reservoirs using magnetotellurics (MT) - promising, since seismic does not indicate fluid flow, and MT is low cost
4. Taking chunks of FORTRAN is achievable in a timely manner
5. Capability building - a true eScience partnership
6. Our Web Services user interactions took the same amount of programming effort as parallelising



eScience in the cloud - observations of a veteran of the computer industry (but not my co-authors in this eScience paper)

1. Web Services (giving interoperability between disparate services of historic proportion) could have been adopted faster in eScience


(Mudge, 2002)


2. Cloud computing will speed up the use of web services, because cloud makes it natural to interact using web services (service orientation, discovery, interoperability)


Lessons learned - HPC programming

1. MapReduce (Hadoop) is the programming model that best matches the data centre as the computer. However, because it requires a rewrite of existing programs, the first wave of benefits comes from simpler parallelism - parameter sweeps, Monte Carlo simulation, job-level parallelism, etc.
2. The second wave of benefits will be new algorithms and rewrites using MapReduce.
3. Nevertheless, the first wave in cloud-based bioinformatics (matching short reads against a reference genome) did use MapReduce.
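
The "simpler parallelism" of point 1 can be sketched as a parameter sweep in which every job is independent, so existing code runs unchanged and only the launching logic is new. The example is illustrative only: `run_inversion_job` is a hypothetical stand-in with a toy misfit surface, and a thread pool stands in for what would be separate cloud instances in practice.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_inversion_job(smoothing, start_resistivity):
    """Hypothetical stand-in for one independent inversion run."""
    # Toy misfit surface with its minimum at smoothing=1.0, start_resistivity=100.0.
    misfit = (smoothing - 1.0) ** 2 + ((start_resistivity - 100.0) / 100.0) ** 2
    return {
        "smoothing": smoothing,
        "start_resistivity": start_resistivity,
        "misfit": misfit,
    }


def parameter_sweep(smoothings, start_resistivities, workers=4):
    """Run every parameter combination as its own job and keep the best result."""
    grid = list(product(smoothings, start_resistivities))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda params: run_inversion_job(*params), grid))
    best = min(results, key=lambda result: result["misfit"])
    return best, results


if __name__ == "__main__":
    best, results = parameter_sweep([0.1, 1.0, 10.0], [10.0, 100.0, 1000.0])
    print(len(results), "jobs; best:", best)
```

No job communicates with another, which is why this style needs no rewrite of the underlying program, unlike a port to MapReduce.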



Lessons learned - Azure

1. Why was Azure much harder to migrate to than predicted?

Answer:
- We came from a non-.Net environment
- Azure is younger than Amazon (2 years)
- Virtual Machine in Beta
- Deployment times of 20 minutes vs 20 seconds slow debugging
- Azure is designed for long-running applications, e.g., ecommerce, more than for scientific applications

2. However, we persist.
- Warehouse-sized data centre - operating system is robust and rich, e.g., hot swap of patches
- Benefits of PaaS






Future work


Future work 1 of 2

1. Inversion on demand, available to colleagues and explorers world-wide, wrapped in workflow (persistence, provenance, partial runs, ...)

2. National/international collaboration building on a national Geophysics Virtual Lab - access to disparate data (seismic, borehole images, gravity, magnetic, ...) built by Auscope using results of GeoSciML Interoperability Working Group


[Diagram, credited to Dr Robert Woodcock and Dr Lesley Wyborn]
Societal Need: Sustainable Energy Policy
Integrated Virtual Labs: Energy Exploration Integrated Virtual Laboratory; Environment Virtual Laboratory
Virtual Laboratories: Virtual Geophysical Laboratory; National Borehole Laboratory; Virtual Geodesy Laboratory; Virtual Earth Observation Laboratory; Virtual Oceans Laboratory
Virtual Libraries: Geophysics; Borehole; Geodesy; Land cover; Marine (each with Processing Services and Data)
Modelling & analytic tools


Future work 2 of 2

3. Explore statistical machine learning to detect interesting patterns
4. Exploring solution space using Evolutionary Algorithms implemented on thousands of processors in the cloud (Brad Alexander)
5. Promulgate security best practices
6. Following the success of speedup, model size has become the limiter for our geophysicists
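
To make item 4 concrete, here is a minimal sketch of a (mu + lambda) evolutionary search over a small model vector. It is not the planned implementation: the misfit function, the three-parameter model, and all names and settings are assumptions chosen for illustration. Each offspring is evaluated independently, which is what makes the approach a candidate for spreading across many cloud processors.

```python
import random


def misfit(model):
    """Hypothetical misfit: squared distance from an assumed target model."""
    target = [2.0, 1.0, 3.0]
    return sum((m - t) ** 2 for m, t in zip(model, target))


def evolve(generations=200, mu=10, lam=40, sigma=0.3, seed=0):
    """(mu + lambda) evolution strategy with fixed Gaussian mutation."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [[rng.uniform(0.0, 4.0) for _ in range(3)] for _ in range(mu)]
    initial_best = min(misfit(model) for model in population)
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)
            offspring.append([gene + rng.gauss(0.0, sigma) for gene in parent])
        # Parents compete with offspring, so the best misfit never gets worse.
        population = sorted(population + offspring, key=misfit)[:mu]
    best = population[0]
    return best, misfit(best), initial_best


if __name__ == "__main__":
    best, final_misfit, initial_misfit = evolve()
    print("best model:", best)
    print("misfit:", initial_misfit, "->", final_misfit)
```

In a cloud setting the inner loop that builds and scores `offspring` would be distributed, with one forward computation per worker.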




Acknowledgements

Brad Alexander

Gordon Bell

Pinaki Chandrasekhar

Dennis Gannon

Graham Heinson

Tony Hey

Ed Lazowska

Stephan Thiel





Summary

1. Cloud computing
2. Collaborative Cloud Computing Lab (C3L)
3. Inversion in magnetotelluric processing
4. Geothermal - EGS in South Australia
5. Lessons learned
6. Future work

Thanks and questions


craig.mudge@adelaide.edu.au


www.cloudinnovation.com.au


+61 417 679 266

+1 650 224 2111


Security best practices

1. Certifications
2. Physical security
3. Secure services
4. Data privacy via encryption
5. Backups
6. Constant monitoring
7. External review
8. Compare yours with Google, Amazon, Azure
