Evaluation of Cloud Storage for Preservation and Distribution of Polar Data.


Nov 3, 2013


Nadirah Cogbill

Mentors: Marlon Pierce, Yu (Marie) Ma, Xiaoming Gao, and Jun Wang

Elizabeth City State University, Elizabeth City, North Carolina, 27909

Pervasive Technology Institute, PolarGrid, Bloomington, Indiana 47408

Abstract
The team's goal was to find a service that could both store the large amounts of data that PolarGrid has collected and ensure that the data are preserved for future researchers. For this reason, the team looked to a cloud storage service for the solution. Cloud storage is the storing of data that is accessible as a service over a network. In this case, the team decided to research online storage using Amazon Web Services (AWS): what AWS is, how reliable it is, how much data can be stored, and whether data would be lost over an extended period of time. AWS is a cloud computing platform offered by Amazon.com and made up of different computing services, also known as web services. Within AWS, the Simple Storage Service (S3) is a user-friendly way of storing data over the Internet. The project shifted to investigating what S3 is and whether it provides the services needed to aid PolarGrid. The group researched several questions pertaining to S3. One concerned the reliability guarantee stated in the S3 Service Level Agreement, the service terms promised to the user. Amazon also mentions a “durability” guarantee of 99.9999999% for the service. What did Amazon mean by “durability”? What does that percentage guarantee? Is it guaranteed over a lifetime or only a few days? What is the likelihood of losing irreplaceable field data over various time scales (years, decades, and longer)? Financially, the group was to investigate how cost-efficient it would be for PolarGrid to use this service. PolarGrid stores 26 terabytes in over 300,000 files, and it was the duty of the group to investigate how PolarGrid would be charged: by how much data is stored, by how long the data is stored, or both. The aim of the project was to have these questions answered so that PolarGrid may have a secure place to store its large volume of data.
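The billing question raised in the abstract can be made concrete with a small sketch. It assumes a storage charge proportional to both volume and duration (a rate per gigabyte per month). The function name, the 1 TB = 1024 GB conversion, and the rate used in the example call are illustrative assumptions only; the rate is a placeholder, not an actual AWS price, and request and data-transfer charges are ignored.

```python
def storage_cost(size_tb, months, price_per_gb_month):
    """Estimate a storage-only charge that scales with volume and time.

    size_tb            -- archive size in terabytes (assumes 1 TB = 1024 GB)
    months             -- how long the data is kept
    price_per_gb_month -- hypothetical rate, NOT an actual AWS price
    """
    size_gb = size_tb * 1024
    return size_gb * price_per_gb_month * months


if __name__ == "__main__":
    PLACEHOLDER_RATE = 0.10  # hypothetical dollars per GB per month
    for months in (1, 12, 120):
        cost = storage_cost(26, months, PLACEHOLDER_RATE)
        print(f"{months:>3} months: {cost:,.2f} (placeholder rate)")
```

Under this model, doubling either the archive size or the retention period doubles the charge, which is the "both" case among the billing possibilities the group set out to check.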



PolarGrid is an NSF MRI-funded partnership between Indiana University and Elizabeth City State University, tasked with obtaining information on rapid changes in glaciers in polar regions. PolarGrid provides information technology support for the Center for the Remote Sensing of Ice Sheets (CReSIS), based at the University of Kansas. PolarGrid must find a way to manage and preserve the large archives of data collected by CReSIS. These archives currently hold 26 TB of field-collected radar data and grow regularly through additional CReSIS expeditions.


References

Image: "Basic Architecture of a Scalable Service." Amazon Web Services. Web. 11 Jun 2010. <http://developer.amazonwebservices.com/connect/servlet/KbServlet/downloadImage/691-102-53/monster-diagram.png>.

S3: unknown. "Amazon Simple Storage Service (Amazon S3)." Amazon Web Services. Amazon Web Services LLC, 2010. Web. 21 Jul 2010. <http://aws.amazon.com/s3/>.

AWS: unknown. "What is AWS?" Amazon Web Services. Amazon Web Services LLC, 2010. Web. 21 Jul 2010. <http://aws.amazon.com/what-is-aws/>.

SLA: unknown. "Amazon S3 Service Level Agreement." Amazon Web Services. Amazon Web Services LLC, 2010. Web. 21 Jul 2010. <http://aws.amazon.com/s3-sla/>.

Cloud Storage: Sliwa, Carla. "What is cloud storage?" SearchStorage. TechTarget, 13 Feb 2009. Web. 21 Jul 2010. <http://searchstorage.techtarget.com.au/articles/29558-What-is-cloud-storage->.

Analysis:
With a 99.9999999% durability guarantee over a one-year span and 300,000 files, there is an estimated 30% chance of a file being lost within a decade. Over a century, the estimated percentage of files lost is 183%.
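How such an estimate depends on the reading of the durability figure can be checked with a short calculation. The sketch below is one possible model, not the poster's method: it assumes the stated percentage is a per-file, per-year survival probability (an annual loss probability of 1e-9) and that files are lost independently. The function names are hypothetical. It does not attempt to reproduce the figures above, whose derivation is not given, and a different reading of "durability" would give different numbers.

```python
import math


def prob_any_loss(annual_loss_prob, n_files, years):
    """Probability that at least one of n_files is lost within `years`.

    Assumes independent losses with a fixed per-file annual loss probability.
    log1p/expm1 keep precision when the probabilities are tiny.
    """
    log_all_survive = n_files * years * math.log1p(-annual_loss_prob)
    return -math.expm1(log_all_survive)


def expected_files_lost(annual_loss_prob, n_files, years):
    """Expected number of files lost within `years` under the same model."""
    per_file_loss = -math.expm1(years * math.log1p(-annual_loss_prob))
    return n_files * per_file_loss


if __name__ == "__main__":
    ANNUAL_LOSS = 1e-9   # assumed reading of "99.9999999% durability" per year
    N_FILES = 300_000    # file count given in the abstract
    for years in (1, 10, 100):
        p = prob_any_loss(ANNUAL_LOSS, N_FILES, years)
        e = expected_files_lost(ANNUAL_LOSS, N_FILES, years)
        print(f"{years:>3} yr: P(any loss) = {p:.6f}, expected lost = {e:.6f}")
```

A probability computed this way can never exceed 100%, whereas an expected count of lost files can exceed the number one would naively call "100%", so it matters which of the two quantities an estimate reports.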