A Cloudy View on Computing


Dec 13, 2013


A Cloudy View on Computing Workshop and CReSIS Field Data Accessibility


Jerome E. Mitchell

Indiana University

A Cloudy View on Computing

Workshop

Workshop Details


Who: > 10-12 Association of Computer/Information Sciences and Engineering Departments at Minority Institutions (ADMI) faculty/students

Where: Elizabeth City State University (ECSU)

When: June 7 - July 5, 2011

What: A Teach-One-Teach-Many approach to cloud computing

Preliminary discussion on workshop at ADMI 2011 Conference (April 14-16, 2011)






Workshop Purpose


Introduce ADMI to the basics of the emerging Cloud Computing paradigm
    Learn how it came about
    Understand its enabling technologies
    Understand the computer systems constraints, tradeoffs, and techniques of setting up and using a cloud

Teach ADMI how to implement algorithms in the Cloud
    Gain competence in cloud programming models for distributed processing of large datasets
    Understand how different algorithms can be implemented and executed on cloud frameworks
    Evaluate the performance and identify bottlenecks when mapping applications to the clouds




Workshop Components

What are we trying to answer?

What are its challenges and opportunities?
Why Cloud Computing?
What is Cloud Computing?
How does Cloud Computing work?

Workshop Schedule

Timeline
    End of First Week: Now I understand Cloud Computing
    End of Third Week: Now I appreciate why Cloud Computing is important
    End of Fifth Week: Now I really understand Cloud Computing!

[Diagram: Functional Programming, Programming Model, Map/Reduce Algorithm, Parallel Processing; relations labeled "Used by" and "Parallelized by"; Hadoop (Apache’s implementation) and Twister (CGL’s implementation)]
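The relationship between functional programming and Map/Reduce named on this slide can be illustrated with a minimal single-process sketch in Python. This is not Hadoop or Twister code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, and a real framework would distribute the map and reduce calls across machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    # Each map_phase call touches only its own document, which is what
    # lets a framework run the calls in parallel.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Because the map and reduce functions have no shared state, the same logic can in principle be handed to a runtime that schedules the calls on many nodes.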

Experimenting With…

Compute Resources


FutureGrid


Virtual machines + virtual networking to create sandboxed modules
    Virtual “Grid” appliances: self-contained, pre-packaged execution environments
    Group VPNs: simple management of virtual clusters by students and educators




References


D. Wolinsky, A. Prakash, and R. Figueiredo. Experiences with Self-Organizing, Decentralized Grids Using the Grid Appliance.

J. Ekanayake, H. Li, B. Zhang, T. Gunarathne, S. Bae, J. Qiu, and G. Fox. Twister: A runtime for iterative MapReduce. In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, pages 810-818. ACM, 2010.

Apache Hadoop, Retrieved Aug 20, 2010, from ASF: http://hadoop.apache.org/core/

FutureGrid: futuregrid.org

CReSIS Field Data Accessibility

Current CReSIS Data Organization


CReSIS’s data products website lists direct download links for individual files

The data are organized by season
    Seasons are broken into data segments
    Data segments are arranged into frames
    Associated data for each frame are stored in different file formats
        CSV (flight path)
        MAT (depth sounder data)
        PDFs (image products)

File-based data system has no spatial data access support


Field Data Spatial Access Project

We are developing a solution based on the SpatiaLite database, which requires neither a server nor an Internet connection and is therefore ideal for field use


SpatiaLite Database


SpatiaLite
    Spatial extension to the lightweight SQLite database
    Manages both vector and raster data and supports a rich set of GIS analysis functions through SQL

The data can be directly accessed through GIS software and MATLAB

SpatiaLite Database Example


2009 Antarctic flight path data
    ~4 million entries, originally stored as 828 separate files and imported into one SpatiaLite database file

Field Data Access Service

Visual Crossover Analysis for Quality Control (project in development)

2009 Antarctica Season Vector Data

SpatiaLite Database Example


Flight path data stored as YYYYMMDD_segID_frameID.txt


SQLite command to create the segs table:


CREATE TABLE segs (
    UTCTime Number,
    Thickness Number,
    Elevation Number,
    FrameID VARCHAR(12),
    Surface Number,
    Bottom Number,
    QualityLevel Integer);

SELECT AddGeometryColumn('segs', 'geometry', 4326, 'POINT', 2);

*note: geometry: 2 -> xy (longitude, latitude); 4326 -> WGS84 coordinate system
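The import of per-frame files into a single table can be sketched in Python with the standard `sqlite3` module. This is a rough illustration under several assumptions, not the project's actual import script: the FrameID is assumed to be the date, segment, and frame parts of the file name joined together; the row layout passed to `import_frame` is hypothetical; and plain `lon`/`lat` columns stand in for the SpatiaLite geometry column, which needs the extension to be loaded.

```python
import re
import sqlite3

# Matches names of the form YYYYMMDD_segID_frameID.txt
FILENAME_RE = re.compile(r"^(\d{8})_(\d+)_(\d+)\.txt$")


def frame_id_from_filename(name):
    """Build a FrameID by joining date, segment and frame (an assumption)."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    return "".join(match.groups())


def create_segs_table(conn):
    """Create the segs table; lon/lat replace the geometry column here."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS segs (
               UTCTime NUMBER, Thickness NUMBER, Elevation NUMBER,
               FrameID VARCHAR(12), Surface NUMBER, Bottom NUMBER,
               QualityLevel INTEGER, lon REAL, lat REAL)"""
    )


def import_frame(conn, filename, rows):
    """Insert one frame's rows.

    Each row is a hypothetical tuple:
    (utc_time, thickness, elevation, surface, bottom, quality, lon, lat)
    """
    frame_id = frame_id_from_filename(filename)
    conn.executemany(
        "INSERT INTO segs VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [(r[0], r[1], r[2], frame_id, r[3], r[4], r[5], r[6], r[7]) for r in rows],
    )
    conn.commit()
```

Calling `import_frame` once per file inside a loop would consolidate many separate files into one database file, which is the pattern described for the 2009 Antarctic data.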

SpatiaLite: MATLAB Direct Access

Mksqlite package: a MEX-DLL to access SQLite databases from MATLAB http://mksqlite.berlios.de/

Add this flag to build.m to enable SQLite to load SpatiaLite as an extension:
    -DSQLITE_ENABLE_LOAD_EXTENSION=1


Testing in MATLAB:

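dbid = mksqlite(0, 'open', 'test.sqlite');  % open the database and keep its handle

% path_to_spatialite must already hold the path to the SpatiaLite library
sql = ['SELECT load_extension(''', path_to_spatialite, ''')'];
mksqlite(dbid, sql)  % load extension

mksqlite(dbid, 'SELECT sqlite_version()')
mksqlite(dbid, 'SELECT spatialite_version()')

% longitude and latitude of every point in one frame
mksqlite(dbid, ['SELECT X(geometry) AS lon, Y(geometry) AS lat ' ...
                'FROM segs WHERE FrameID=2009101601001']);

mksqlite(dbid, 'close')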
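For readers without MATLAB, the same session can be approximated in Python with the standard `sqlite3` module. This is a sketch, not part of the original workflow: the helper names `open_database` and `frame_coordinates` are hypothetical, the SpatiaLite library path must be supplied by the caller, and `enable_load_extension` is only available when the Python build was compiled with extension loading enabled.

```python
import sqlite3


def open_database(path, spatialite_path=None):
    """Open a SQLite file and optionally load SpatiaLite as an extension."""
    conn = sqlite3.connect(path)
    if spatialite_path is not None:
        # Extension loading is off by default and must be switched on first.
        conn.enable_load_extension(True)
        conn.load_extension(spatialite_path)
        conn.enable_load_extension(False)
    return conn


def frame_coordinates(conn, frame_id):
    """Return (lon, lat) pairs for one frame.

    X() and Y() are SpatiaLite SQL functions, so the extension must be
    loaded on this connection before calling this helper.
    """
    sql = (
        "SELECT X(geometry) AS lon, Y(geometry) AS lat "
        "FROM segs WHERE FrameID = ?"
    )
    return conn.execute(sql, (frame_id,)).fetchall()
```

A typical call would be `conn = open_database('test.sqlite', path_to_spatialite)` followed by `frame_coordinates(conn, '2009101601001')`, with `path_to_spatialite` set to wherever the library is installed on the local machine.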

References


PolarGrid data products: https://www.cresis.ku.edu/data

SpatiaLite: http://www.gaia-gis.it/spatialite/

Quantum GIS: http://www.qgis.org/

