INTO THE CLOUD

20.07.2010

An evaluation of the Google App Engine


Chornyi, Dmitry

Riediger, Julian

Wolfenstetter, Thomas





Abstract

Cloud Computing is often glorified as one of the most important IT paradigms to shape the next decade of IT. In this paper we illustrate the concepts as well as the technology behind Cloud Computing in general and analyze one of the currently most popular platforms, the Google App Engine. For this purpose we developed a sample application utilizing many of the platform’s features. Based on the experience gained during implementation and scalability testing, we evaluate the Google App Engine and discuss whether it satisfies the requirements of state-of-the-art software engineering. Finally, we present an outlook on the way ahead in Cloud Computing.

Keywords: Cloud Computing, Google App Engine, Evaluation, Scalability testing, PaaS




Contents

OUTLINE
CLOUD COMPUTING IN A NUTSHELL
    Motivation
    Definition of Cloud Computing
    Service delivery models
GOOGLE APP ENGINE
    Architecture
    Costs
    Features
        Runtime Environment
        Persistence and the datastore
        Services
    App Engine for Business
APPLICATION DEVELOPMENT USING GOOGLE APP ENGINE
    General Idea
    Requirements and functionality
    Implementation
        Development Environment
        Application Environment
        Application Architecture
    Platform Limitations
SCALABILITY TESTING
    Testing approach
    Application scalability
DISCUSSION
    Software Engineering Aspects
    Outlook: the way ahead in Cloud Computing
        New application opportunities and use cases
        Challenges of Cloud Computing
CONCLUSION
ABBREVIATIONS
TABLE OF FIGURES
LITERATURE






OUTLINE

In this paper we discuss the omnipresent topic of Cloud Computing with a special focus on application development using Google's platform-as-a-service (PaaS) environment, the Google App Engine. First we give a general introduction to Cloud Computing, in which we outline the motivation behind the hype, define the term and explain the three service delivery models of Cloud Computing. Next we draw a detailed picture of the Google App Engine, illustrating its architecture, services and API. This section is followed by a report on an exemplary application development project on the platform. Additionally, we sum up our experience from scalability testing, before we finally discuss whether Google App Engine and Cloud Computing in general meet the requirements of modern software engineering and whether there is a paradigm shift in the mindsets of software developers.

CLOUD COMPUTING IN A NUTSHELL

Motivation

In October 2008 the British weekly The Economist praised Cloud Computing as the formative technology for the IT world:

“The rise of the cloud is more than just another platform shift that gets geeks excited. It will undoubtedly transform the IT industry, but it will also profoundly change the way people work and companies operate. It will allow digital technology to penetrate every nook and cranny of the economy and of society, creating some tricky political problems along the way” (1).


Indeed, the Google Trends graph of worldwide search volume (see Figure 1) shows that since 2007 the term “Cloud Computing” has become increasingly popular, at least in terms of online search behavior.



FIGURE 1: GLOBAL SEARCH VOLUME INDEX FOR “CLOUD COMPUTING” (2)


The dream behind Cloud Computing is that, as long as users can connect to the Internet, they have the entire Web as their computing center. Compared to the infinitely powerful Internet Cloud, personal computers seem like lightweight terminals that merely allow users to utilize the cloud. From this perspective, Cloud Computing may seem like a “return” to the original mainframe paradigm of the 1960s and 1970s (3).

Critics therefore accuse Cloud Computing of being old wine in new bottles. They argue that it is just a temporary fashion term within the IT community. No matter whether The Economist’s optimistic scenario will occur or the critics will be right after all, the technology behind Cloud Computing is more than interesting enough to be studied in detail.


Definition of Cloud Computing

A recent literature review (4) has shown that Cloud Computing is still a fuzzy term and that the existing definitions have little in common. Moreover, distinguishing Clouds from Grids is not trivial, as both approaches are closely related. The most noticeable difference is that, in contrast to Grid Computing, which focuses on sharing distributed resources dynamically at runtime, Cloud Computing aims at virtualization (5). In this context there are several approaches that aim at combining the advantages of clouds and grids, which can also be seen as a combination of advanced networking with sophisticated virtualization (4).

When looking for a universal definition of Cloud Computing, one has to consider multiple aspects. The essential characteristics of Cloud Computing can be described as on-demand self-service, rapid elasticity, resource pooling, ubiquitous network access and measured service (6).

Furthermore, there are different deployment models. Cloud Computing can be operated as a private, public, community or hybrid Cloud. Public Clouds are available to everybody in a pay-as-you-go manner. Current examples of public Clouds include Amazon Web Services (7), Google App Engine (8) and Microsoft Azure (9). Private Clouds, on the other hand, refer to internal datacenters of a business or other organization that are not made available to the public. Community Clouds are shared by several organizations and are usually set up for their specific requirements. Hybrid Clouds are a mixture of the above three deployment models. Each Cloud in the hybrid model can be managed independently, but applications and data are allowed to move across the hybrid Cloud. Hybrid Clouds therefore allow cloud bursting, in which a private Cloud bursts out to a public Cloud when it requires more resources (10) (11).

Based on these observations we adopt the Cloud Computing definition by Vaquero et al. (4):

“Clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically re-configured to adjust to a variable load (scale), allowing also for an optimum resource utilization. This pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the Infrastructure Provider by means of customized SLAs.”

Service delivery models

In general, Cloud Computing services can be divided into three service delivery models: Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS). As shown in Figure 2, these delivery models form a hierarchy. Only SaaS is visible to the end user, while developers use PaaS and IaaS to deploy their applications. In the following, the three models are introduced individually.



FIGURE 2: SERVICE DELIVERY MODELS OF CLOUD COMPUTING (12)

Infrastructure-as-a-Service

IaaS products deliver a complete computer infrastructure remotely via the Internet. They provide machine instances to developers, which essentially behave like dedicated servers controlled by the developers. This means that the developer has full responsibility for server operation: once a machine reaches its performance limits, the developer has to instantiate another machine manually and scale the application out to it. To sum up, IaaS is intended for developers who want to write arbitrary software on top of the infrastructure with only small compromises in their development methodology (12).

Platform-as-a-Service

PaaS offerings are situated one level higher within the Cloud Computing hierarchy. They provide a full or partial application development environment that abstracts machine instances and other technical details from the developer. The applications are executed within data centers, without concerning the developers with matters of allocation. In exchange, the developers have to handle some constraints that the environment imposes on their application design, for example the use of special data stores instead of relational databases (12).

Software-as-a-Service

At the consumer-facing level there are the most popular examples of Cloud Computing: well-defined applications offering users online resources and storage. This differentiates SaaS from traditional websites or web applications, which do not interface with user information or do so only in a limited manner (12). SaaS offers complex applications such as CRM or ERM online (5).

Figure 3 illustrates the different service delivery models of Cloud Computing and lists major vendors of each domain as examples.



FIGURE 3: MAJOR TYPES OF CLOUD SERVICES (ADAPTED FROM (5))


GOOGLE APP ENGINE

Architecture

The Google App Engine (GAE) is Google's answer to the ongoing trend of Cloud Computing offerings within the industry. In the traditional sense, GAE is a web application hosting service, allowing for development and deployment of web-based applications within a pre-defined runtime environment. Unlike other cloud-based hosting offerings such as Amazon Web Services, which operate on an IaaS level, the GAE already provides an application infrastructure on the PaaS level. This means that the GAE abstracts from the underlying hardware and operating system layers by providing the hosted application with a set of application-oriented services. While this approach is very convenient for developers of such applications, the rationale behind the GAE is its focus on scalability and on usage-based infrastructure and payment.

Costs

Developing and deploying applications for the GAE is generally free of charge but restricted to a certain amount of traffic generated by the deployed application. Once this limit is reached within a certain time period, the application stops working. However, this limit can be lifted by switching to a billable quota, for which the developer enters a maximum budget that can be spent on an application per day. Once the free quota is used up, the application then continues to work until the maximum budget for this day is reached.

Table 1 summarizes the quotas that are, in our opinion, most important, together with the amount charged per unit when free resources are depleted and additional, billable quota is desired.



Content of Figure 3 (example vendors per service delivery model):

IaaS: Amazon EC2, Joyent, Sun Microsystems' Network.com, HP Flexible Computing Services, IBM Blue Cloud, 3tera, OpSource, Jamcracker
PaaS: Bungee Labs' Bungee Connect, Etelos, Coghead, Google App Engine, HP Adaptive Infrastructure as a Service, Salesforce.com, LongJump
SaaS: Oracle SaaS platform, Salesforce Sales Force Automation, NetSuite, Google Apps, Workday Human Capital Management




                     Free Default Quota                  Billing Enabled Default Quota
Resource             Daily Limit      Maximum Rate       Daily Limit      Maximum Rate       Cost

General Limits
Requests             1.3 mio          7,400 req/min      43 mio           30,000 req/min     n/a
Bandwidth In         1 GB             56 MB/min          1,046 GB         10 GB/min          $0.10/GB
Bandwidth Out        1 GB             56 MB/min          1,046 GB         10 GB/min          $0.12/GB
CPU Time             6.5 CPU-h        15 CPU-min/min     1,729 CPU-h      72 CPU-min/min     $0.10/CPU-h

Data Store
Stored Data          1 GB                                no maximum                          $0.15/GB/month
# of Indexes         100                                 200                                 n/a
Queries              10 mio           57,000/min         200 mio          129,000/min        n/a
CPU Time             60 CPU-h         20 CPU-min/min     1,200 CPU-h      50 CPU-min/min     n/a

Mail Service
Recipients           2,000            8/min              7.4 mio          5,100/min          $0.0001/recipient

URL Fetch Service
API Calls            657,000 calls    3,000/min          46 mio calls     32,000/min         n/a

Memcache Service
API Calls            8,600,000        48,000/min         96,000,000       108,000/min        n/a

Task Queue Service
API Calls            100,000          n/a                1 mio            n/a                n/a
Stored Tasks         1 mio            max                10 mio           max                n/a

TABLE 1: GAE QUOTA AND BILLING (ADAPTED FROM (8))
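The interplay of free quota, billable quota and daily budget described in the Costs section can be illustrated with a small calculation. The following is only a sketch in Python: the prices and free quotas are copied from Table 1, while the function name, the restriction to three resources and the usage figures are illustrative assumptions and not part of any GAE API.

```python
# Per-unit prices and free daily quotas taken from Table 1 (2010 figures).
PRICES = {"bandwidth_in_gb": 0.10, "bandwidth_out_gb": 0.12, "cpu_hours": 0.10}
FREE_QUOTA = {"bandwidth_in_gb": 1.0, "bandwidth_out_gb": 1.0, "cpu_hours": 6.5}


def daily_charge(usage, max_budget):
    """Return (charge, within_budget) for one day of resource usage.

    Only usage above the free quota is billed. If the bill would exceed
    the developer's daily budget, the charge is capped at the budget and
    within_budget is False (the application would stop serving).
    """
    charge = 0.0
    for resource, used in usage.items():
        billable = max(0.0, used - FREE_QUOTA[resource])
        charge += billable * PRICES[resource]
    if charge > max_budget:
        return max_budget, False
    return round(charge, 2), True


# Hypothetical day: 3 GB in, 11 GB out, 16.5 CPU hours, budget of $5.
usage = {"bandwidth_in_gb": 3, "bandwidth_out_gb": 11, "cpu_hours": 16.5}
print(daily_charge(usage, max_budget=5.0))
```

With these assumed figures, 2 GB of incoming traffic, 10 GB of outgoing traffic and 10 CPU hours are billable; everything below the free quota costs nothing.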


Features

With a Runtime Environment, the Datastore and the App Engine services, the GAE can be divided into three parts.

Runtime Environment

The GAE runtime environment is the place where the actual application is executed. However, the application is only invoked once an HTTP request is sent to the GAE via a web browser or some other interface, meaning that the application is not constantly running when there is nothing to process. When such an HTTP request arrives, the request handler forwards the request and the GAE selects one out of many possible Google servers, where the application is then instantly deployed and executed for a certain amount of time (8). The application may then do some computing and return the result to the GAE request handler, which forwards an HTTP response to the client. It is important to understand that the application runs completely embedded in this sandbox environment, but only as long as requests are still coming in or some processing is done within the application. The reason for this is simple: applications should only run when they are actually computing, otherwise they would occupy precious computing power and memory without need. This paradigm already shows the GAE’s potential in terms of scalability. Being able to run multiple instances of one application independently on different servers ensures a decent level of scalability. However, this highly flexible and stateless application execution paradigm has its limitations.

Requests are processed for no longer than 30 seconds, after which the response has to be returned to the client and the application is removed from the runtime environment again (8). Obviously, this method accepts that deploying and starting an application each time a request is processed requires an additional lead time until the application is finally up and running. The GAE tries to counter this problem by caching the application in the server memory as long as possible, optimizing for several subsequent requests to the same application. Furthermore, the stateless execution creates the need for a sophisticated solution for persistence, which is presented in detail in the following section.

The type of runtime environment on the Google servers depends on the programming language used. For Java and other languages that can be compiled for or interpreted on the Java platform (such as JRuby, Rhino and Groovy), a Java Virtual Machine (JVM) is provided. GAE also fully supports the Google Web Toolkit (GWT), a framework for rich web applications. For Python and related frameworks a Python-based environment is used.



FIGURE 4: STRUCTURE OF GOOGLE APP ENGINE (13)


Persistence and the datastore

As previously discussed, the stateless execution of applications creates the need for a datastore that provides a proper way for persistence. Traditionally, the most popular way of persisting data in web applications has been the use of relational databases. However, with its focus on high flexibility and scalability, the GAE uses a different approach for data persistence, called Bigtable (14). Instead of the rows found in a relational database, in Google’s Bigtable data is stored in entities. Entities are always associated with a certain kind. These entities have properties, resembling columns in relational database schemas. But in contrast to relational databases, entities are actually schemaless, as two entities of the same kind do not necessarily have to have the same properties or even the same type of value for a certain property.

The most important difference to relational databases, however, is the querying of entities within a Bigtable datastore. In relational databases queries are processed and executed against a database at application runtime. GAE uses a different approach here. Instead of processing a query entirely at application runtime, queries are pre-processed at compilation time, when a corresponding index is created. This index is later used at application runtime when the actual query is executed. Thanks to the index, each query is only a simple table scan in which only the exact filter value is searched. This method makes queries very fast compared to relational databases, while updating entities is a lot more expensive.
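The trade-off between cheap reads and expensive writes can be made concrete with a toy model. The sketch below is a plain in-memory Python class, not the Bigtable or GAE datastore API; the class name TinyDatastore, its methods and the sample entities are hypothetical and only illustrate schemaless entities and index-based lookups.

```python
from collections import defaultdict


class TinyDatastore:
    """In-memory toy model: schemaless entities plus a per-property index."""

    def __init__(self):
        self.entities = {}                 # key -> (kind, properties)
        self.index = defaultdict(set)      # (kind, property, value) -> keys

    def put(self, key, kind, **properties):
        # Writes are the expensive part: old index rows are removed and
        # new ones are written for every property of the entity.
        if key in self.entities:
            old_kind, old_props = self.entities[key]
            for name, value in old_props.items():
                self.index[(old_kind, name, value)].discard(key)
        self.entities[key] = (kind, dict(properties))
        for name, value in properties.items():
            self.index[(kind, name, value)].add(key)

    def query(self, kind, name, value):
        # Reads only look up the exact filter value in the index.
        return sorted(self.index[(kind, name, value)])


store = TinyDatastore()
store.put("b1", "Bookmark", url="http://example.com", tag="cloud")
store.put("b2", "Bookmark", tag="cloud", owner="alice")   # other properties
store.put("b3", "Bookmark", tag=42)                        # other value type
print(store.query("Bookmark", "tag", "cloud"))
```

The three sample entities share a kind but neither their property sets nor their value types, which is what "schemaless" means in the text above.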


Transactions are similar to those in relational databases. Each transaction is atomic, meaning that it either fully succeeds or fails. As described above, one of the advantages of the GAE is its scalability through concurrent instances of the same application. But what happens when two instances try to start transactions that alter the same entity? The answer is quite simple: only the first instance gets access to the entity and keeps it until the transaction has completed or failed. In this case the second instance receives a concurrency failure exception. The GAE handles such parallel transactions with a method called optimistic concurrency control. It simply denies more than one altering transaction on an entity at a time and implies that an application running within the GAE should have a mechanism that tries to get write access to an entity multiple times before finally giving up.

Heavily relying on indexes and optimistic concurrency control, the GAE allows performing queries very fast even at higher scales while assuring data consistency.
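The retry mechanism that an application is expected to bring along can be sketched as follows. This is a self-contained Python simulation, not GAE code: the version counter is one common way to realize optimistic concurrency control and is an assumption here, as are all class and function names and the before_commit hook that simulates a rival instance.

```python
class ConcurrencyError(Exception):
    """Raised when an entity was changed by another transaction."""


class VersionedEntity:
    def __init__(self, value):
        self.value = value
        self.version = 0

    def commit(self, expected_version, new_value):
        # The write succeeds only if nobody committed in the meantime.
        if self.version != expected_version:
            raise ConcurrencyError("entity was modified concurrently")
        self.value = new_value
        self.version += 1


def update_with_retry(entity, change, attempts=3, before_commit=None):
    """Apply change(value) to the entity, retrying on conflicts."""
    for _ in range(attempts):
        seen_version, seen_value = entity.version, entity.value
        if before_commit:
            before_commit()            # hook used to simulate a rival write
        try:
            entity.commit(seen_version, change(seen_value))
            return True
        except ConcurrencyError:
            continue                   # re-read the entity and try again
    return False                       # give up after the last attempt
```

If a rival instance commits between the read and the write, the first attempt fails, the entity is read again and the second attempt applies the change on top of the rival's value.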


FIGURE 5: BIGTABLE STRUCTURE (14)


Services

As mentioned earlier, the GAE serves as an abstraction of the underlying hardware and operating system layers. These abstractions are implemented as services that can be called directly from the actual application. In fact, the datastore itself is also a service that is controlled by the runtime environment of the application.

MEMCACHE

The platform's built-in memory cache service serves as short-term storage. As its name suggests, it stores data in a server’s memory, allowing for faster access compared to the datastore. Memcache is a non-persistent data store that should only be used to store temporary data within a series of computations. Probably the most common use case for Memcache is to store session-specific data (15). Persisting session information in the datastore and executing queries on every page interaction is highly inefficient over the application lifetime, since session-owner instances are unique per session (16). Moreover, Memcache is well suited to speed up common datastore queries (8). To interact with Memcache, GAE supports JCache, a proposed interface standard for memory caches (17).
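Using a memory cache to speed up common datastore queries usually follows a read-through pattern. The sketch below illustrates that pattern with a small in-memory stand-in written in Python; TinyCache and cached_query are hypothetical names and do not reproduce the Memcache or JCache interfaces.

```python
import time


class TinyCache:
    """Toy stand-in for a memory cache with per-entry expiry."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._items[key]       # expired entries simply vanish
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.monotonic() + ttl_seconds)


def cached_query(cache, key, run_query, ttl_seconds=60):
    """Return the cached result, falling back to the slow datastore query."""
    result = cache.get(key)
    if result is None:
        result = run_query()
        cache.set(key, result, ttl_seconds)
    return result
```

Because cached entries can disappear at any time, the application must always be able to recompute the value, which is why only temporary data belongs in such a cache.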

URL FETCH

Because the GAE restrictions do not allow opening sockets (18), a URL Fetch service can be used to send HTTP or HTTPS requests to other servers on the Internet. This service works asynchronously, giving the remote server some time to respond while the request handler can do other things in the meantime. After the server has answered, the URL Fetch service returns the response code as well as the header and body. Using the Google Secure Data Connector, an application can even access servers behind a company’s firewall (8).
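The asynchronous behaviour described for the URL Fetch service, starting a request and collecting the response only when it is needed, can be sketched with Python's standard library. The example does not use the GAE URL Fetch API and performs no network access: fake_fetch is a hypothetical stand-in that returns a response code, headers and body.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url):
    """Hypothetical stand-in for a remote call: (status, headers, body)."""
    return 200, {"Content-Type": "text/plain"}, "payload from " + url


def handle_request(url, fetch=fake_fetch):
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch, url)          # start the fetch, do not wait
        local_work = sum(range(1000))              # other work in the meantime
        status, headers, body = pending.result()   # block only when needed
    return local_work, status, body
```

The handler overlaps its own computation with the remote call and waits for the response only at the point where the result is actually required.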

MAIL

The GAE also offers a mail service that allows sending and receiving email messages. Mails can be sent out directly from the application, either on behalf of the application’s administrator or on behalf of users with Google Accounts. Moreover, an application can receive emails at multiple addresses in the form of HTTP requests initiated by the App Engine and posted to the app. In contrast to incoming emails, outgoing messages may also have an attachment of up to 1 MB (8).

XMPP

Analogous to the mail service, a similar service exists for instant messaging, allowing an application deployed to the GAE to send and receive instant messages. The service allows communication to and from any instant messaging service compatible with XMPP (8), a set of open technologies for instant messaging and related tasks (19).

IMAGES

Google also integrated a dedicated image manipulation service into the App Engine. Using this service, images can be resized, rotated, flipped or cropped (18). Additionally, it is able to combine several images into a single one, convert between several image formats and enhance photographs. The API also provides information about an image's format and dimensions as well as a histogram of color values (8).

USERS

User authentication with GAE comes in two flavors. Developers can roll their own authentication service using custom classes, tables and Memcache, or simply plug into Google’s Accounts service. Since for most applications the time and effort of creating a sign-up page and storing user passwords is not worth the trouble (18), the User service is a very convenient functionality that provides an easy method for authenticating users within applications. As a byproduct, thousands of Google Accounts are leveraged. The User service detects whether a user has signed in and otherwise redirects the user to a sign-in page. Furthermore, it can detect whether the current user is an administrator, which facilitates implementing admin-only areas within the application (8).
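The decisions the User service supports (signed in or not, administrator or not) boil down to a small piece of routing logic. The following Python sketch shows that logic in isolation; the function, the shape of the user dictionary and the URL paths are hypothetical and do not correspond to the GAE Users API.

```python
def route_request(user, path):
    """Decide how to answer a request based on the signed-in user.

    user is None for anonymous visitors, otherwise a dict with the
    hypothetical keys "email" and "is_admin".
    """
    if user is None:
        # Anonymous visitors are sent to the sign-in page first.
        return "redirect:/login?continue=" + path
    if path.startswith("/admin") and not user["is_admin"]:
        # Admin-only areas are closed to ordinary users.
        return "403 Forbidden"
    return "200 OK"
```

In a real application the user object would come from the platform's authentication service rather than being passed in by hand.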

OAUTH

The general idea behind OAuth is to allow a user to grant a third party limited permission to access protected data without sharing username and password with the third party. The OAuth specification distinguishes between a consumer, which is the application that seeks permission to access protected data, and the service provider, which stores protected data on its users' behalf (20). Using Google Accounts and the GAE API, applications can act as an OAuth service provider (8).

SCHEDULED TASKS AND TASK QUEUES

Because background processing is restricted on the GAE platform, Google introduced task queues as another built-in functionality (18). When a client requests an application to perform certain steps, the application might not be able to process them right away. This is where task queues come into play. Requests that cannot be executed right away are saved in a task queue that controls the correct sequence of execution. This way, the client gets a response to its request right away, possibly with the indication that the request will be executed later (13).
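The task queue idea of answering immediately and doing the work later can be sketched in a few lines. The example is a plain Python simulation with an in-process queue; the function names, the response text and the "import" task are illustrative assumptions, not the GAE Task Queue API.

```python
from collections import deque

task_queue = deque()      # tasks are executed in the order they were added
processed = []


def accept_import(bookmark_url):
    """Enqueue the slow work and answer the client immediately."""
    task_queue.append(bookmark_url)
    return "202 Accepted: import scheduled"


def run_worker():
    """Later invocation that drains the queue in FIFO order."""
    while task_queue:
        processed.append("imported " + task_queue.popleft())
```

The request handler returns before any import has run; the queue preserves the order in which the work was requested.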

Cron jobs are similar to the concept of task queues. Borrowed from the UNIX world, a GAE cron job is a scheduled job that can invoke a request handler at a pre-specified time (8).

BLOBSTORE

The general idea behind the blobstore is to allow applications to handle objects that are much larger than the size allowed for objects in the datastore service. Blob is short for binary large object, and the blobstore is designed to serve large files, such as video or high-quality images. Although blobs can have a size of up to 2 GB, they have to be processed in portions, one MB at a time. This restriction was introduced to smooth the curve of datastore traffic. To enable queries for blobs, e.g. for creating an image database, each blob has a corresponding blob info record which is persisted in the datastore (8).
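Processing a large object in portions of one MB amounts to a simple chunked read loop. The sketch below uses an in-memory byte stream from the Python standard library as a stand-in for a blob; it does not use the Blobstore API, and the helper name read_in_chunks is hypothetical.

```python
import io

CHUNK_SIZE = 1024 * 1024   # one MB per read, as described in the text


def read_in_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks of at most chunk_size bytes."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in blob of two full chunks plus five bytes.
blob = io.BytesIO(b"x" * (2 * CHUNK_SIZE + 5))
print([len(chunk) for chunk in read_in_chunks(blob)])
```

Only one chunk is held in memory at a time, regardless of how large the blob is.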


ADMINISTRATION CONSOLE

The administration console acts as a management cockpit for GAE applications. It gives the developer real-time data and information about the current performance of the deployed application and is used to upload new versions of the source code. In the console it is also possible to test new versions of the application and to switch the version presented to the user. Furthermore, access data and log files can be viewed. The console also enables analysis of traffic so that quotas can be adapted when needed. In addition, the status of scheduled tasks can be checked, and the administrator is able to browse the application's datastore and manage indices (8).


App Engine for Business

While the GAE is targeted more towards independent developers in need of a hosting platform for their medium-sized applications, Google's recently launched App Engine for Business tries to target the corporate market. Although it technically relies mostly on the GAE described above, Google added some enterprise features and a new pricing scheme to make its cloud computing platform more attractive for enterprise customers (21).

Regarding the features, App Engine for Business includes a central development manager that allows central administration of all applications deployed within one company, including access control lists. In addition, Google now offers a 99.9% service level agreement as well as premium developer support.

Google also adjusted the pricing scheme for its corporate customers by offering a fixed price of $8 per user per application, up to a maximum of $1000, per month. Interestingly, unlike the pricing scheme for the GAE, this offer includes unlimited processing power at that fixed price. From a technical point of view, Google tries to accommodate established industry standards by now offering SQL database support in addition to the existing Bigtable datastore described above (8).
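The pricing rule can be written down as a one-line formula. The sketch below assumes that the $1000 cap applies per application and month, which is one reading of the text; the function name and the example user counts are illustrative.

```python
PRICE_PER_USER = 8      # USD per user, application and month
MONTHLY_CAP = 1000      # USD maximum, assumed to apply per application


def monthly_fee(users_per_app):
    """Total monthly fee for a list of per-application user counts."""
    return sum(min(users * PRICE_PER_USER, MONTHLY_CAP) for users in users_per_app)


# One application with 10 users and one with 500 users.
print(monthly_fee([10, 500]))
```

Under this reading the cap is reached at 125 users per application, so every additional user beyond that point is free of charge.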



APPLICATION DEVELOPMENT USING GOOGLE APP ENGINE

General Idea

In order to evaluate the flexibility and scalability of the GAE we tried to come up with an application that relies heavily on scalability, i.e. one that collects large amounts of data from external sources. That way we hoped to be able to test both persistence and the gathering of data from external sources at large scale.

Our idea was therefore to develop an application that connects people's delicious bookmarks with their respective Facebook accounts. People using our application should be able to see their Facebook friends’ delicious bookmarks, provided their Facebook friends have a delicious account. This way a user can get a visualization of his friends’ latest topics by looking at a generated tag cloud, which gives him a clue about the most common and shared interests.
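The core of such a tag cloud is a frequency count over the tags of all friends' bookmarks. The following Python sketch shows this counting step only; the data layout (a dictionary of friends, each with a list of bookmarks carrying a "tags" list), the function name and the sample data are assumptions for illustration and are not taken from the application described here.

```python
from collections import Counter


def tag_cloud(friends_bookmarks, top=3):
    """Count tags over all friends' bookmarks and return the most common."""
    counts = Counter()
    for bookmarks in friends_bookmarks.values():
        for bookmark in bookmarks:
            counts.update(bookmark["tags"])
    return counts.most_common(top)


sample = {
    "alice": [{"url": "u1", "tags": ["cloud", "java"]},
              {"url": "u2", "tags": ["cloud"]}],
    "bob": [{"url": "u3", "tags": ["cloud", "java", "gwt"]}],
}
print(tag_cloud(sample))
```

In a rendered tag cloud, the resulting counts would typically determine the font size of each tag.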

In order to provide such a service within our application we had to integrate both Facebook and delicious and persist the fetched data in the GAE datastore. Although all data could just as well be fetched in real time every time, there are two reasons to persist bookmarks from delicious as well as personal details from the respective Facebook accounts. First, from a user perspective, reading the information from the GAE datastore is much faster than re-fetching everything at runtime. Second, and more important for this project, is the need to test the scalability of the datastore and the GAE in general. So in order to draw substantial conclusions about the scalability of the GAE, testing persistence remains essential.

Requirements and functionality

Before we started implementing the application we studied the relevant APIs of the affected service providers as well as similar applications. Furthermore, we asked a number of potential users for their opinion on what the user interface should look like and aligned their answers with our own design visions. The result was a collection of requirements, which is shown in Table 2.

Functional Requirements: User Interface
- Login to a delicious account
- Filter friends and their bookmarks
- Filter bookmarks by tags
- Visualize bookmarks in a tag cloud
- Show all bookmarks in a list

Functional Requirements: Backend
- Fetch bookmarks from delicious and persist them
- Fetch personal details and friends from Facebook and persist them
- Update and persist new bookmarks from delicious
- Update and persist new Facebook details

Non-functional Requirements
- Application has to be scalable up to 1000 parallel users
- Secure & reliable authentication with Facebook & delicious
- Application should run within Facebook using an iFrame

TABLE 2: APPLICATION REQUIREMENTS


Implementation

Development Environment

Because of our expertise in Java and familiarity with the Eclipse IDE, we decided to use both for developing our application. Furthermore, by using Eclipse we were able to use the GAE plugin, which significantly improves the development and debugging of a local GAE application. We also agreed on using the Google Web Toolkit (GWT), a Java framework that helps in developing rich web applications with AJAX-based user interfaces.

In order to develop our application supported by a proper source code versioning tool, we made use of a free SVN server provided by Assembla.com.

Application Environment

As stated in the requirements, the application should run embedded within the Facebook website. As depicted in Figure 6, a user can find and select the application via Facebook and send a request to start the application within Facebook. This triggers a request for an iFrame that is forwarded to the GAE, where the actual application is started. The GAE then calls the Facebook API and waits for a response. As soon as the GAE receives the response, it returns the iFrame with the application to the user. The application is then visible within an iFrame in the Facebook UI and is ready to be used. If the user is not yet logged in to delicious, he can now do so. The login to delicious is done via Yahoo authentication: the GAE sends a request for authentication to Yahoo’s authentication servers and receives an access token. With this access token our application running within the GAE can then access the user’s bookmarks by requesting them from the delicious servers.




FIGURE 6: APPLICATION ARCHITECTURE (OWN ILLUSTRATION ADAPTED FROM (21) (22) (23) (8))


Application Architecture

LAYER & COMPONENT OVERVIEW

Our application is based on a three-tier client/server architecture (24) incorporating a presentation tier, an application/business logic tier and a data tier. Figure 7 shows an overview of the components and layers involved. The presentation layer is represented by the web browser component. Based on the GWT, Java code is automatically compiled into AJAX code running within the client’s web browser.

The depicted business logic component includes various services which interact with external service providers’ APIs such as Facebook, Yahoo and delicious as well as with the data and presentation layers. The business logic layer interacts with the presentation layer via both HTTP requests and RPC calls. Additionally, the business logic layer uses an XML parser to process data received from the delicious API.

The data layer is fully represented by the GAE. Although it offers more services than data persistence, the GAE serves as our main data layer. The business logic layer uses the Memcache API to store data temporarily and the datastore API to persist data.
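The division of labour between the two APIs can be pictured as a read-through cache in front of persistent storage. The following is a minimal, self-contained sketch in plain Java. The two maps are hypothetical stand-ins for the Memcache and the datastore, and the class and method names are illustrative assumptions; it is neither the application's actual code nor the GAE SDK.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative read-through cache; both maps are in-memory stand-ins,
// NOT the real Memcache or datastore services.
class CacheAsideSketch {
    private final Map<String, String> cache = new HashMap<>();      // stand-in for Memcache (temporary)
    private final Map<String, String> datastore = new HashMap<>();  // stand-in for the datastore (persistent)
    int datastoreReads = 0;  // counts how often the slow path was taken

    // Persist a value and drop any stale cached copy.
    void save(String key, String value) {
        datastore.put(key, value);
        cache.remove(key);
    }

    // Try the cache first; fall back to the datastore and remember the result.
    String load(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        datastoreReads++;
        String stored = datastore.get(key);
        if (stored != null) {
            cache.put(key, stored);
        }
        return stored;
    }

    public static void main(String[] args) {
        CacheAsideSketch layer = new CacheAsideSketch();
        layer.save("user42", "http://example.org/bookmark");
        layer.load("user42");  // served from the datastore stand-in
        layer.load("user42");  // served from the cache stand-in
        System.out.println("datastore reads: " + layer.datastoreReads);
    }
}
```

Repeated reads of the same key reach the slower store only once, which is the effect the temporary Memcache storage is meant to achieve.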

In addition, the GAE offers a Logging API and a Task API, both of which we use from the business logic layer. These two features do not belong to the data layer, but have been added to the diagram to show the whole functionality leveraged from the GAE.



FIGURE 7: DIAGRAM OF LAYERS AND COMPONENTS (OWN ILLUSTRATION)


CLASS OVERVIEW

In the following paragraphs we present the application’s architecture by showing a class overview for each of the three layers, namely the business logic layer, the presentation layer and the data layer.

Business Logic Layer

Essentially, five classes make up the business logic layer. The class FacebookServiceImpl implements the access to the Facebook API by providing methods for retrieving the friends of a certain Facebook user. DeliciousServiceImpl and DeliciousFeedConnector both implement the access to the delicious backend. Access to the data layer via the datastore API is realized by the class PMF, which returns a handle to the GAE’s PersistenceManagerFactory. In order to communicate with the presentation layer, the class UserServiceImpl implements the server side of the GAE’s RPC service.
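A class like PMF is commonly written as a singleton wrapper, because creating a persistence manager factory is comparatively expensive and one instance per application suffices. The sketch below shows only that pattern. FactoryStandIn is a hypothetical placeholder for JDO's PersistenceManagerFactory so that the example compiles without the JDO libraries; both class names are assumptions and this is not the application's actual PMF class.

```java
// Singleton wrapper: the factory is built once when the class is loaded
// and every caller receives the same handle.
final class PmfSketch {
    private static final FactoryStandIn INSTANCE = new FactoryStandIn();

    private PmfSketch() { }  // no instances; callers use get()

    static FactoryStandIn get() {
        return INSTANCE;
    }

    public static void main(String[] args) {
        FactoryStandIn first = PmfSketch.get();
        FactoryStandIn second = PmfSketch.get();
        System.out.println("same handle: " + (first == second));
    }
}

// Hypothetical stand-in for javax.jdo.PersistenceManagerFactory, used only
// so that this sketch compiles without the JDO libraries.
class FactoryStandIn {
    static int created = 0;  // how many factories have been built

    FactoryStandIn() {
        created++;
    }
}
```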



FIGURE 8: CLASS DIAGRAM OF BUSINESS LAYER (OWN ILLUSTRATION)


Presentation Layer

The presentation layer is distributed over several sub-packages in order to separate the parts of the presentation layer that are fully server-based from those that are solely client-based. The classes BookmarkTable, BookmarkWidget, FilterWidget and TagCloudWidget represent the main functionality in terms of displaying bookmarks by user, filtering bookmarks and users, and creating a tag cloud navigation for accessing bookmarks within the web-based UI. Although these classes are written in pure Java, the GWT automatically compiles them into AJAX-enabled JavaScript code. The DeliciousLoginWidget and FacebookLoginWidget classes let the user log in to delicious and Facebook, respectively. The classes Gwtapp, Global and Modality represent GWT abstractions of the actual presentation layer runtime environment in the client’s web browser.
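The core of a tag cloud is a frequency count over all tags, which is then mapped to a display size. The following sketch illustrates that idea in plain Java. It is not the code of TagCloudWidget; the method names and the linear scaling between an assumed minimum and maximum font size are just one possible choice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TagCloudSketch {
    // Count how often each tag occurs across all bookmarks.
    static Map<String, Integer> countTags(List<List<String>> tagsPerBookmark) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> tags : tagsPerBookmark) {
            for (String tag : tags) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Map a count linearly onto [minSize, maxSize]; all sizes are assumed values.
    static int fontSize(int count, int maxCount, int minSize, int maxSize) {
        if (maxCount <= 1) {
            return minSize;
        }
        return minSize + (count - 1) * (maxSize - minSize) / (maxCount - 1);
    }

    public static void main(String[] args) {
        List<List<String>> bookmarks = Arrays.asList(
                Arrays.asList("cloud", "java"),
                Arrays.asList("cloud", "gae"),
                Arrays.asList("cloud"));
        Map<String, Integer> counts = countTags(bookmarks);
        int max = 0;
        for (int c : counts.values()) {
            max = Math.max(max, c);
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " -> " + fontSize(e.getValue(), max, 10, 30) + "px");
        }
    }
}
```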



FIGURE 9: CLASS DIAGRAM OF PRESENTATION LAYER (OWN ILLUSTRATION)


Data Layer

In order to persist data within the GAE, we use JDO classes. Although Google’s Bigtable is by definition a schema-less database, the JDO classes serve as a means to define the schema for kinds of entities (see GAE introduction). This way BookmarkJDO and UserJDO define the schema for bookmarks that are persisted in the datastore as well as for user data that is stored. Not depicted in the diagram, data access classes such as BookmarkDO are used for data representation within the application.



FIGURE 10: CLASS DIAGRAM OF DATA LAYER (OWN ILLUSTRATION)


Platform Limitations

At its core, the GAE restricts access to the physical infrastructure. This includes preventing the application from opening sockets, running background processes (except cron jobs) and using other common back-end routines that application developers normally take for granted (18).

The following section dwells on the limitations of the GAE that we directly encountered during our application development. We know that there are several more limitations of the GAE, especially in terms of enterprise-centric applications, but these will be discussed later.

BACKGROUND PROCESSING

Due to the character of the GAE, intense background processing is not possible within applications. As client requests are subject to a certain time limit, the ability to process large chunks of data is quite limited.

TRANSACTIONS

Another limitation is the inability to use the Memcache and the datastore within one transaction. This limitation is quite a problem when processing large amounts of data. In our scenario we wanted to fetch all user bookmarks from delicious and persist them in the datastore. As the number of entities that can be handled per transaction is limited as well, we tried to buffer all data in the Memcache temporarily. From there we wanted to persist small portions of the buffered data in the datastore. After successfully storing a portion of the buffered data in the datastore, however, there was no way of accessing the Memcache again within the same transaction to delete the stored data from it. As a result, when the subsequent transaction reads the next chunk of buffered data from the Memcache, it is unknown whether certain entities have already been stored. The only way to circumvent this problem is to delete the part of the data that is used by the current transaction from the Memcache before the transaction is actually triggered. However, using this method means risking the loss of already fetched data in case the transaction fails.
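The workaround and its risk can be made concrete with a small simulation. In the sketch below a deque stands in for the Memcache buffer and a list for the datastore; the chunk size, the class and the simulated failure are assumptions for illustration only and do not reproduce the application's code. A chunk is removed from the buffer before the simulated transaction runs, so a failed transaction loses that chunk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ChunkedPersistSketch {
    final Deque<String> buffer = new ArrayDeque<>();   // stand-in for the Memcache buffer
    final List<String> datastore = new ArrayList<>();  // stand-in for the datastore
    final List<String> lost = new ArrayList<>();       // entities dropped by failed transactions

    // Remove up to chunkSize entities from the buffer BEFORE the transaction starts.
    List<String> takeChunk(int chunkSize) {
        List<String> chunk = new ArrayList<>();
        while (chunk.size() < chunkSize && !buffer.isEmpty()) {
            chunk.add(buffer.poll());
        }
        return chunk;
    }

    // Simulated transaction: either the whole chunk is stored or nothing is.
    void persistChunk(List<String> chunk, boolean transactionFails) {
        if (transactionFails) {
            lost.addAll(chunk);  // already removed from the buffer, so it is gone
        } else {
            datastore.addAll(chunk);
        }
    }

    public static void main(String[] args) {
        ChunkedPersistSketch s = new ChunkedPersistSketch();
        for (int i = 1; i <= 5; i++) {
            s.buffer.add("bookmark" + i);
        }
        s.persistChunk(s.takeChunk(2), false);  // succeeds
        s.persistChunk(s.takeChunk(2), true);   // fails: two bookmarks are lost
        s.persistChunk(s.takeChunk(2), false);  // succeeds with the remaining one
        System.out.println("stored=" + s.datastore + " lost=" + s.lost);
    }
}
```

Putting a failed chunk back into the buffer would avoid the loss in this simulation, but on the platform that step is again a separate, non-transactional Memcache access.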



DATA STORE QUERYING LIMITATIONS

Although the underlying Bigtable data storage approach is quite different from traditional SQL approaches, the Java Data Objects Query Language (JDOQL) used here tries to provide an SQL-like query language (25). However, certain features available in traditional SQL approaches are not available in the GAE environment. By design, joins of tables are not possible within the GAE because it features a non-relational database. Also, certain combinations of filtering operators such as “<” and “>” cannot be used at the same time within a single query.
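The text above does not list the rejected combinations. One restriction documented for the App Engine datastore at the time is that inequality filters within a single query may refer to only one property. Assuming that rule, a common workaround is to let the query apply the inequality on one property and to filter the second property in application code. The sketch below imitates this with plain Java collections; the Bookmark fields and the method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InequalityFilterSketch {
    // Hypothetical entity with two numeric properties.
    static class Bookmark {
        final String url;
        final int savedDay;   // e.g. day of the year the bookmark was saved
        final int tagCount;

        Bookmark(String url, int savedDay, int tagCount) {
            this.url = url;
            this.savedDay = savedDay;
            this.tagCount = tagCount;
        }
    }

    // Stand-in for a datastore query with an inequality on ONE property only.
    static List<Bookmark> querySavedAfter(List<Bookmark> all, int day) {
        List<Bookmark> result = new ArrayList<>();
        for (Bookmark b : all) {
            if (b.savedDay > day) {
                result.add(b);
            }
        }
        return result;
    }

    // Second inequality, on another property, applied in application code.
    static List<Bookmark> keepFewerTagsThan(List<Bookmark> candidates, int limit) {
        List<Bookmark> result = new ArrayList<>();
        for (Bookmark b : candidates) {
            if (b.tagCount < limit) {
                result.add(b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Bookmark> all = Arrays.asList(
                new Bookmark("a", 10, 2),
                new Bookmark("b", 200, 9),
                new Bookmark("c", 250, 3));
        List<Bookmark> hits = keepFewerTagsThan(querySavedAfter(all, 100), 5);
        for (Bookmark b : hits) {
            System.out.println(b.url);
        }
    }
}
```

The price of this workaround is that the first step may return many more entities than are finally needed, which matters under the request time limit mentioned earlier.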



SCALABILITY TESTING

Good on-demand scalability is one of the key features that the GAE offers. A scalable system should be able to proportionally increase the amount of work it performs as available resources increase. In order to check whether the GAE lives up to its promise of effortless scalability, we modified the application described in the last chapter in order to gather and evaluate performance data.

Testing approach

With GAE application instances running on standardized, relatively low-power virtual hardware and with the response time limited to 30 seconds for any request, scalability translates into parallelization. By utilizing a concurrent algorithm and spreading work over multiple VM instances simultaneously, an application should ideally execute linearly faster than the sequential algorithm as the number of VM instances grows. This, of course, presumes that the algorithm it executes can be parallelized in the first place.
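The expectation of linear speed-up can be stated more precisely. With T(1) the run time of the sequential algorithm and T(n) the run time on n instances (symbols introduced here for illustration only; they are not used elsewhere in this paper), speed-up and efficiency are commonly defined as:

```latex
S(n) = \frac{T(1)}{T(n)}, \qquad
E(n) = \frac{S(n)}{n}, \qquad
\text{ideal case: } S(n) = n,\; E(n) = 1 .
```

An efficiency well below 1 for larger n indicates that additional instances no longer translate into proportionally shorter run times.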

One easily parallelizable task is crawling data sources, for instance downloading existing bookmarks from delicious.com in bulk. Based on the application described in the last chapter we developed a “Delicious Crawler”: an application that starts with a random user, then downloads and saves his bookmarks and friends. Then, in turn, it downloads each friend’s bookmarks and friends list, and so forth, effectively performing a breadth-first search. The Delicious Crawler saves a total of 100 users and 1600 bookmarks per run.

The operation of the Delicious Crawler is visualized in Figure 11. A request triggered by the task queue instructs the crawler to:

1. Take a UserJDO instance (represents a delicious user) from the FIFO queue
2. For this user, read bookmarks and friends using the delicious.com JSON feed API
3. Convert the received data to classes that can be persisted
4. Save the user and bookmarks to the Bigtable datastore
5. Enqueue the friends to the FIFO queue
6. Enqueue a task in the task queue
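The six steps amount to a breadth-first traversal of the friends graph, driven by a FIFO queue. The following plain Java sketch shows that control flow only. The friends map is a hypothetical stand-in for the delicious feed, a set stands in for the datastore, a simple loop replaces the task queue, and the visited check and the user limit are assumptions added to keep the example finite.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class CrawlerSketch {
    // Breadth-first crawl starting at 'start', stopping after maxUsers saved users.
    static Set<String> crawl(Map<String, List<String>> friendsOf, String start, int maxUsers) {
        Queue<String> fifo = new ArrayDeque<>();    // FIFO queue of users to process
        Set<String> saved = new LinkedHashSet<>();  // stand-in for the datastore
        Set<String> seen = new LinkedHashSet<>();   // avoids enqueuing a user twice
        fifo.add(start);
        seen.add(start);
        while (!fifo.isEmpty() && saved.size() < maxUsers) {
            String user = fifo.poll();                      // step 1: take a user
            List<String> friends = friendsOf.getOrDefault(  // steps 2-3: read and convert
                    user, Collections.<String>emptyList());
            saved.add(user);                                // step 4: save
            for (String friend : friends) {                 // step 5: enqueue friends
                if (seen.add(friend)) {
                    fifo.add(friend);
                }
            }
            // step 6 (enqueue the next task) is represented by the next loop iteration
        }
        return saved;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("anna", Arrays.asList("ben", "cara"));
        graph.put("ben", Arrays.asList("dave"));
        graph.put("cara", Arrays.asList("anna", "dave"));
        System.out.println(crawl(graph, "anna", 100));
    }
}
```

In the sketch the loop processes one user at a time; on the platform each iteration would be a separate task, which is what allows several users to be processed in parallel.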

As these operations are executed, performance data is logged. This includes the time to execute URLFetch and Persistence API requests, as well as the overall request processing duration.

Since the described workflow can be executed concurrently, thus processing multiple users simultaneously, the Delicious Crawler is a suitable candidate for our scalability testing. By setting the rate attribute of the task queue to 50/second (maximum) and varying the bucket-size between 1 (minimum) and 50 (maximum), we were able to test the performance of the Delicious Crawler with 1 to 50 concurrent threads. The results are discussed in the next section.
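App Engine task queues are documented as using a token bucket to pace execution: tokens are refilled at the configured rate, and the bucket size caps how many tasks can start in a burst. The simulation below is a simplified sketch of that mechanism, not Google's implementation. The class, the discrete ticks and the numbers in the example are assumptions chosen for readability.

```java
import java.util.Arrays;

class TokenBucketSketch {
    // Returns how many tasks start in each tick. The bucket starts full, is
    // refilled by refillPerTick tokens per tick and never exceeds bucketSize.
    static int[] startsPerTick(int queuedTasks, int refillPerTick, int bucketSize, int ticks) {
        int[] started = new int[ticks];
        int tokens = bucketSize;
        int remaining = queuedTasks;
        for (int t = 0; t < ticks; t++) {
            int n = Math.min(tokens, remaining);  // each started task consumes one token
            started[t] = n;
            tokens -= n;
            remaining -= n;
            tokens = Math.min(bucketSize, tokens + refillPerTick);
        }
        return started;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 20 queued tasks, one token per tick, four ticks.
        System.out.println(Arrays.toString(startsPerTick(20, 1, 1, 4)));
        System.out.println(Arrays.toString(startsPerTick(20, 1, 8, 4)));
    }
}
```

With a bucket size of 1, tasks start one at a time, whereas a larger bucket lets that many tasks start at once before the refill rate takes over. This is why the bucket size bounds the number of tasks that begin concurrently.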



FIGURE 11: CRAWLER (OWN ILLUSTRATION)


Application scalability

Figure 12 shows the total time it took the Delicious Crawler to process 100 users and 1600 bookmarks for various bucket sizes. As we see, performance increases up to eight parallel threads and then levels off.


FIGURE 12: TOTAL DURATION (OWN ILLUSTRATION)


One explanation for the disappointing results when using 16 or more threads could be temporary technical problems at the App Engine at the time of testing, or automatic throttling by Google, as the error message presented in Figure 13 may suggest. Such errors occurred a total of 53 times out of 700 task queue executions (about 7.6%). These errors possibly explain the extreme outliers seen in Figure 14, where detailed data for bucket size 16 is shown.


FIGURE 13: ERROR MESSAGE IN ADMINISTRATION CONSOLE (8)



FIGURE 14: TIME PER REQUEST AND SERVICE (OWN ILLUSTRATION)


Since we only did brief scalability testing, results may be affected by idiosyncrasies ranging from local network connection latencies to temporary technical problems at delicious or the App Engine itself. Thus, the results should not be generalized to usage scenarios beyond what we tested.

DISCUSSION

The following section discusses whether Cloud Computing in general and the GAE in particular are able to serve the needs and requirements of modern software engineering. Each aspect is first discussed in general terms, followed by a GAE-specific reflection.

Software Engineering Aspects

Functional Aspects

As discussed in the introduction, PaaS environments impose certain restrictions on the developer in terms of programming techniques, languages and other functionality available elsewhere. In contrast, IaaS environments such as Amazon Web Services give the developer more freedom and flexibility, however at the cost of having fewer features or less functionality pre-built. This shows the trade-off between developer flexibility and pre-built functionality. The more the application requirements match the GAE’s pre-built functionality, the easier and faster it will be to develop applications for the GAE compared to an IaaS platform such as Amazon Web Services. However, if the match is quite low, the GAE’s restrictions outweigh the benefits of the pre-built functionality and an IaaS provider might be the better choice in that case.

Usability

Cloud Computing adopts the concept of Utility Computing, which presents the very idea that users obtain and employ computing platforms in Clouds as easily as they access a traditional public utility infrastructure (such as electricity, water or the telephone network) (22). The same expectations apply to the GAE from a developer’s perspective. Usability-wise the GAE offers easy access to the domain of Cloud Computing by providing abstractions for important services such as persistence as well as an easy-to-use development environment.

Scalability

The central design goal of the GAE is to address concerns about scalability. The platform is built on the concept of horizontal scaling. In essence, this means that instead of running an application on more powerful hardware, the application is executed on more instances of less powerful hardware (16).

Being a PaaS solution, the GAE offers a wide portfolio of built-in services that can be easily integrated into a GAE-deployed application. This includes its built-in scalability feature as well as its persistence abstraction. At least in theory, built-in scalability and the persistence abstraction can be seen as among the most interesting USPs that the GAE has to offer. Using the distributed application deployment approach along with an extremely scalable Bigtable database approach, the GAE already delivers all the tools needed to build highly scalable applications. However, in reality certain purposely set restrictions limit the scalability of the GAE at the moment, as our tests have shown. Nevertheless, numerous Google services such as Gmail show the GAE’s technical potential in terms of scalability, as the underlying layers for scalability and especially for persistence are the same. The emergence of Google App Engine for Business shows that Google is currently trying to loosen the named restrictions and make the GAE attractive for enterprise applications that may then fully utilize its scalability features.

Integration

The need will arise for migrating and integrating applications and data from different clouds. This will bring a new form of cloud service, that is, cloud integration service (23). At the moment the GAE does not offer any direct support for this, although data from external clouds can be integrated using the tools and services offered within the GAE.

Availability

It is impossible to provide 100% availability unless a high-availability architecture is adopted and both the platform and applications are fully tested. Enterprise users should seek service level agreements (SLAs) that will motivate the vendors to ensure desired levels of availability. Besides the SLA, users who require 100% availability may take a combination of precautionary measures. With data, they may maintain a backup on on-premises storage, or use a backup cloud, or simply not store mission-critical data on the cloud. With applications, the users may keep an on-premises version of the application, so that they may work offline while the cloud is down (23).

In November 2007, RackSpace, Amazon’s competitor, stopped its service for 3 hours because of a power cut-off at its data center; in June 2008, the Google App Engine service broke off for 6 hours due to some bugs in the storage system; in March 2009, Microsoft Azure experienced 22 hours out of service caused by an OS system update. Currently, public cloud providers based on virtualization define the reliability of service as 99.9% in their SLAs (24).

Google itself does not guarantee any service level agreements for the basic version of the GAE. However, the recently launched GAE for Business specifically includes a 99.9% SLA. It remains to be seen if and how this promise will be met in the future.
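To put the 99.9% figure into perspective, the permitted downtime can be computed directly. The helper below is a plain arithmetic sketch; the 365-day year and the method name are assumptions for illustration, and real SLAs define their own measurement periods and exclusions.

```java
class SlaDowntimeSketch {
    // Hours of downtime allowed per period for a given availability percentage.
    static double allowedDowntimeHours(double availabilityPercent, double periodHours) {
        return periodHours * (100.0 - availabilityPercent) / 100.0;
    }

    public static void main(String[] args) {
        double hoursPerYear = 365 * 24;  // 8760 hours, ignoring leap years
        double perYear = allowedDowntimeHours(99.9, hoursPerYear);
        System.out.printf("99.9%% availability allows about %.2f hours of downtime per year%n", perYear);
    }
}
```

Under these assumptions the budget is roughly 8.76 hours per year, so a single six-hour outage like the one reported for June 2008 would consume most of it.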

Support

In fact, cloud services should be designed for easier usability than on-premises computing in the first place (23). PaaS platforms such as the GAE in particular have, by definition, a higher need for developer support, as the features and services provided are mostly non-standardized. In contrast to IaaS solutions, where the basic hardware and operating system layers are mostly the same as in traditional deployment approaches, PaaS platforms need a detailed description of the supported features and services. The GAE offers some documentation on how to use the platform, but in the basic version no professional support is available. The upcoming business edition of the GAE will, however, offer a real support option.

Privacy

Customers may be able to sue enterprises if their privacy rights are violated, and in any case the enterprises may face damage to their reputation. Current privacy concepts such as the Fair Information Principles are applicable to cloud computing scenarios and mitigate the risks. Tips for software engineering:

1. Minimize personal information sent to and stored in the cloud
2. Protect personal information in the cloud
3. Maximize user control
4. Allow user choice

Privacy should be built into every stage of the product development process: it is not adequate to try to bolt on privacy at a late stage in the design process (25).

Cloud computing vendors must adopt the most sophisticated and up-to-date tools and procedures, and strive to provide better security and privacy than is available for on-premises computing (8).

In terms of the GAE, Google’s general privacy notes are applicable. To evaluate the real level of privacy, especially of the data stored within the GAE, one would have to perform further privacy-related tests that can deliver meaningful insights.

User Authentication

For user authentication the GAE comes with the Google authentication service. This enables developers to easily integrate logins for Google accounts into their applications. Given the acceptance of Google accounts, this feature is really useful and a great advantage compared to IaaS solutions, where authentication has to be handled by the developers on their own.

Legal Issues and Compliance

Enterprise users must maintain business legal documents and assure their integrity in order to comply with various laws. Cloud computing vendors have to adopt technologies to ensure that their enterprise users’ data satisfy their compliance requirements. Again, this does not seem to have received much attention yet (23).

At this stage the GAE does not make any specifications about legal issues and compliance. For applications that are heavily dependent on such restrictions, the GAE might not be the right choice at this point in time. But once again, Google will have to look into these issues if it wants to become a serious PaaS provider on the enterprise level, as its launch of the GAE for Business suggests.

Cost

The third-party provider owns and manages all the computing resources (servers, software, storage and networking) and electricity needed for the services. The users only need to “plug into” the cloud. The users do not need to make a large upfront investment in computing resources; the space needed to house them; the electricity needed to run the computing resources; and the cost of maintaining staff for administering the system, network, and database (23).

In terms of costs, the GAE offers a usage-dependent pricing scheme that starts with a basic version which is free of charge but subject to certain limitations. The paid version of the GAE removes some of these limitations; however, other limitations caused by the GAE’s design still exist. As no upfront investments are necessary, the GAE is a good way to test new applications (even for free) and pay as the acceptance and spread of the application grow.

Interestingly, the recently launched GAE for Business goes in a different direction, as Google now offers a flat-rate pricing scheme for enterprise users.

Advantages for the software developer

In the next two decades, service-oriented distributed computing will emerge as a dominant factor in shaping the industry, changing the way business is conducted and how services are delivered and managed (32).

This example is not intended to discredit the paradigm, just the exaggerated and premature claims of end-user empowerment. On the contrary, even if lay end users won’t be able to whip up a serious enterprise application in a matter of days, cloud computing opens up exciting new possibilities based on a mix of old and new technologies for the next generation of software developers (33).

Outlook: the way ahead in Cloud Computing

New application opportunities and use cases

It is foreseeable that Cloud Computing will affect the world of IT in two ways. On the one hand it will fundamentally change the way existing applications are designed, and on the other hand it will create whole new use cases. Chun and Maniatis (31) describe one such use case, where cloud computing enables a technology which otherwise would not be possible: to overcome hardware limitations and enable more powerful mobile interactive applications, external resources are used by partially shifting computations from a smartphone into the Cloud.

Although Cloud Computing will play a major role as an enabler of new use cases, its impact on current applications is believed to be even bigger. Cloud Computing presents a unique opportunity for batch processing and business analytics. The rise of business analytics has manifested itself in a growing share of computing resources being spent on understanding customers, supply chains, buying habits and so on. Analyzing terabytes of data can take hours on a single computer. If computations can be parallelized, using hundreds of computers for a short time costs the same as using a few computers for a long time.

Another group of ideal candidates for the Cloud are compute-intensive desktop applications. Especially for new product development, moving simulations into the Cloud can mean enormous cost savings compared with the traditional approach of buying computation time from a data processing center. The latest versions of mathematics software packages such as Matlab or Mathematica are already capable of using Cloud Computing to perform expensive evaluations (10).



Challenges of Cloud Computing

Of course, where there is light, there is also shadow. Like every emerging technology, Cloud Computing still has obstacles on its path that have to be overcome. This is especially true for enterprise applications. Below we point out some major obstacles of current Cloud Computing from a business perspective and present corresponding solution opportunities (see Table 3).

Challenge: Availability of Service
Opportunity: Especially for enterprises, availability of certain applications is business critical. Therefore they still shrink from trusting Cloud providers with hosting critical software. A possible solution would be to use multiple Cloud providers to provide business continuity and to utilize Cloud elasticity to defend against DDOS attacks.

Challenge: Data Lock-In
Opportunity: Many businesses fear that choosing a certain Cloud provider also means losing a certain degree of freedom as their data gets locked in. If the APIs of different providers were built on an industry standard, the entry threshold would definitely be lower.

Challenge: Data Confidentiality and Auditability
Opportunity: Another important issue is confidentiality in the Cloud. Even if the provider is trustworthy, it is still unclear on which server data is located and which legislation applies. A possible workaround would be to deploy encryption, VLANs and firewalls. A really clean solution would store data geographically according to legal requirements.

Challenge: Data Transfer Bottlenecks
Opportunity: Just because high-speed broadband Internet is available in some regions does not necessarily mean that each branch can fall back on the same infrastructure quality. FedExing disks is still a common activity throughout the world. Even in the era of Cloud Computing, enterprises have to weigh which data to move entirely into the Cloud and where other solutions might be better suited.

Challenge: Performance Unpredictability
Opportunity: Especially for HPC applications it is essential that computation performance is stable and predictable. To solve this issue it is necessary to improve virtual machines, for example by implementing gang scheduling. For some data storage tasks, flash memory instead of hard drives might greatly improve speed.

Challenge: Scalable Storage
Opportunity: Although current relational database systems support multi-user access, Cloud Computing dimensions are in another league. Although CC providers (e.g. Google) have implemented special datastores, management of persistent data is still a major bottleneck. Fundamental research on database technology is therefore indispensable.

Challenge: Bugs in Large-Scale Distributed Systems
Opportunity: Current debugging technology is designed for traditional software. The laws that apply in distributed virtual Cloud systems are different from those in conventional systems. An opportunity to approach this issue is to invent special debuggers for distributed VMs.

Challenge: Dynamic Scaling
Opportunity: State-of-the-art scaling is mainly manual or at most semi-automatic. Although additional hardware is switched on in case of need, this does not happen until server load hits a certain level. Using machine learning algorithms to predict workload and dynamically allocate needed resources would improve Cloud efficiency substantially.

Challenge: Reputation Fate
Opportunity: For CC providers reputation might become a problem, because one customer’s bad behavior can affect the reputation of the cloud as a whole. As soon as the Cloud’s IP addresses become blacklisted, e.g. because of spam, all applications on the Cloud that send emails are negatively affected. This issue could be resolved by offering reputation-guarding services similar to “trusted email” services. Legal issues also have to be addressed, since Cloud Computing providers surely do not want to be held liable for the actions of their customers.

Challenge: Software Licensing
Opportunity: Another major issue is current licensing models, as they restrict the computers on which the software can run. In this context software vendors have to adapt their business models and offer pay-for-use licenses. Many vendors have already reacted and now offer SaaS themselves.

TABLE 3: OBSTACLES AND OPPORTUNITIES OF CLOUD COMPUTING (ADAPTED FROM (10))











CONCLUSION

Cloud Computing remains the number one hype topic within the IT industry at present. Our evaluation of the Google App Engine has shown both the functionality and the limitations of the platform. Developing and deploying an application within the GAE is in fact quite easy and in a way shows the progress that software development and deployment have made. Within our application we were able to use the abstractions provided by the GAE without problems, although the concept of Bigtable requires a big change in mindset when developing. Our scalability testing showed the limitations of the GAE at this point in time. Although it is an extremely helpful feature and a great USP for the GAE, the built-in scalability of the GAE currently suffers from both purposely set and technical restrictions. Coming back to our motivation of evaluating the GAE in terms of its sufficiency for serious large-scale applications in a professional environment, we have to conclude that the GAE does not (yet) fulfill business needs for enterprise applications. As the discussion showed, some of these needs are yet to be satisfied by Cloud Computing platforms in general; others are GAE-specific issues. However, given the benefits and potential of PaaS-based approaches such as the GAE, the question remains whether quite inflexible and non-standardized PaaS platforms can establish themselves in the market for serious large-scale applications or will remain platforms for small and simple applications as seen today.



ABBREVIATIONS


AJAX    Asynchronous JavaScript and XML
API     Application Programming Interface
Blob    Binary Large Object
CC      Cloud Computing
DDOS    Distributed Denial of Service
GAE     Google App Engine
GWT     Google Web Toolkit
HPC     High Performance Computing
HTTP    Hypertext Transfer Protocol
HTTPS   Hypertext Transfer Protocol Secure
IaaS    Infrastructure as a Service
JDO     Java Data Objects
JDOQL   Java Data Objects Query Language
JS      JavaScript
JSON    JavaScript Object Notation
JVM     Java Virtual Machine
PaaS    Platform as a Service
RPC     Remote Procedure Call
SaaS    Software as a Service
SLA     Service Level Agreement
SQL     Structured Query Language
SVN     Subversion (a revision control system)
VLAN    Virtual Local Area Network
VM      Virtual Machine
XML     Extensible Markup Language
XMPP    Extensible Messaging and Presence Protocol





TABLE OF FIGURES


Figure 1: Global search volume index for “Cloud Computing” (2)
Figure 2: Service delivery models of cloud computing (12)
Figure 3: Major types of cloud services (Adapted from (5))
Figure 4: Structure of Google App Engine (13)
Figure 5: Bigtable structure (14)
Figure 6: Application architecture (Own illustration adapted from (21) (22) (23) (8))
Figure 7: Diagram of layers and components (Own illustration)
Figure 8: Class diagram of business layer (Own illustration)
Figure 9: Class diagram of presentation layer (Own illustration)
Figure 10: Class diagram of data layer (Own illustration)
Figure 11: Crawler (Own illustration)
Figure 12: Total duration (Own illustration)
Figure 13: Error message in administration console (8)
Figure 14: Time per request and service (Own illustration)




LITERATURE


1. Let it Rise. The Economist. October 23, 2008.
2. Google Inc. Google Trends. [Online] 07 17, 2010. [Cited: 07 17, 2010.] http://www.google.com/trends?q=Cloud+Computing&ctab=0&geo=all&geor=all&date=all&sort=0.
3. Voas, J. and Zhang, J. Cloud Computing: New Wine or Just a New Bottle? IT Professional. 2009.
4. Vaquero, L, et al. A Break in the Clouds: Towards a Cloud Definition. ACM SIGCOMM Computer Communication Review. 01 2009.
5. Leavitt, N. Is Cloud Computing Really Ready for Prime Time? IEEE Technology News. 2009.
6. NIST. The NIST Cloud Computing Project. [Online] 2009. [Cited: 07 17, 2010.] http://csrc.nist.gov/cyber-md-summit/documents/posters/cloud-computing.pdf.
7. Amazon.com, Inc. amazon web services. [Online] 2010. [Cited: 07 17, 2010.] http://aws.amazon.com/.
8. Google Inc. Google App Engine. [Online] 2010. [Cited: 07 17, 2010.] http://code.google.com/intl/de-DE/appengine/.
9. Microsoft, Inc. Windows Azure Platform. [Online] 2010. [Cited: 07 17, 2010.] http://www.microsoft.com/windowsazure/.
10. Armbrust, M, et al. Above the Clouds: A Berkeley View of Cloud Computing. s.l.: UC Berkeley Reliable Adaptive Distributed Systems Laboratory, 2009.
11. Sriram, I and Khajeh-Hosseini, A. Research Agenda in Cloud Technologies. [Online] 10 2009. [Cited: 07 18, 2010.] http://arxiv.org/ftp/arxiv/papers/1001/1001.3259.pdf.
12. Marinos, A and Briscoe, G. Community Cloud Computing. 2009.
13. Sanderson, D. Programming Google App Engine. Sebastopol: O’Reilly Media, 2009.
14. Chang, F, et al. Bigtable: A Distributed Storage System for Structured Data. ACM Transactions on Computer Systems. s.l.: ACM, 2008. Vol. 26, 2.
15. Severance, C. Using Google App Engine. Sebastopol: O’Reilly Media, 2009.
16. Ciurana, E. Developing with Google App Engine. s.l.: firstPress, 2008.
17. Java Community Process. JCACHE - Java Temporary Caching API. [Online] 2001. [Cited: 07 18, 2010.] http://jcp.org/en/jsr/detail?id=107. JSR 107.
18. Roche, K. and Douglas, J. Beginning Java™ Google App Engine. s.l.: Apress, 2009.
19. XMPP Standards Foundation. The Extensible Messaging and Presence Protocol. [Online] 2010. [Cited: 07 17, 2010.] http://xmpp.org/.
20. Internet Engineering Task Force. Request for Comments: 5849. The OAuth 1.0 Protocol. [Online] 04 2010. [Cited: 07 19, 2010.] http://tools.ietf.org/html/rfc5849.
21. Yahoo! Inc. Yahoo! [Online] 2010. [Cited: 07 18, 2010.] http://www.yahoo.com/.


22. Facebook, Inc. facebook. [Online] 2010. [Cited: 07 18, 2010.] http://www.facebook.com/.
23. Yahoo! Inc. delicious.com. [Online] 2010. [Cited: 07 18, 2010.] http://www.delicious.com.
24. Bruegge, B. and Dutoit, A. Object-oriented software engineering: using UML, patterns, and Java. s.l.: Prentice Hall, 2009.
25. Tyagi, S., Vorburger, M. and McCammon, K. Core Java Data Objects. s.l.: Prentice Hall PTR, 2003.
26. Wang, L. and von Laszewski, G. Scientific Cloud Computing: Early Definition and Experience. 2008.
27. Won, K. Cloud Computing: Today and Tomorrow. Journal of Object Technology. 2009, Vol. 08, 01.
28. Qian, L., et al. Cloud Computing: An Overview. CloudCom 2009. Heidelberg: Springer, 2009.
29. Pearson, S. Taking Account of Privacy when Designing Cloud Computing Services. Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing. s.l.: IEEE Computer Society, 2009.
30. Deelman, E, et al. The Cost of Doing Science on the Cloud: The Montage Example. Proceedings of the 2008 ACM/IEEE conference on Supercomputing. 2008.
31. Chun, B. and Maniatis, P. Augmented Smart Phone Applications Through Clone Cloud Execution. Proceedings of the 12th Workshop on Hot Topics in Operating Systems. 2009.
32. Buyya, R, Pandey, S and Vecchiola, C. Cloudbus Toolkit for Market-Oriented Cloud Computing. Proceedings of the 1st International Conference on Cloud Computing. Beijing: Springer, 2009.
33. Erdogmus, H. Cloud Computing: Does Nirvana Hide behind the Nebula? IEEE Software. 2009, March/April.