3 Nov 2013

Storage service

1 Introduction

This document describes the envisaged generic storage service designed to be used in the Dicode project.

The objective of the Dicode project is to facilitate and augment collaboration and decision making in data-intensive and cognitively-complex settings. In such collaborative scenarios, users need to share and exchange information, files, reports, etc. This service arises as a solution to that need.

The main purpose of this service is to provide Dicode users with a permanent and reliable storage place to keep files in the cloud. The service will be as generic as possible to allow storing any kind of file (plain text, doc, pdf, html, xml, zip…). The service will provide mechanisms to upload files and retrieve them by using RESTful services.

Additionally, metadata about files will also be stored to facilitate their search and location by search engines or services. These metadata will contain information such as the type of file (pdf, html, xml, etc.) or the type of content (Dynamika report, DNA sequence, etc.). These types of files and contents should be enumerated in the Dicode ONtology (DON).

The storage service is envisaged as a centralized repository from the end-user perspective, but it will be designed following a mixed approach between centralized and distributed. When users want to use this service to share a file, they will have the option of uploading the file to the central repository or providing a URL accessible from the Internet where the file will be available. This mixed approach both maintains the privacy of files and provides a storage place “in the cloud” to those users that cannot directly provide access to files from their organizations.

This storage service is intended for use within the Dicode project, but the generic approach presented allows using it from outside.

2 Description of the service

All services will be implemented as RESTful services. The logical vision (end-user vision) of the storage service is depicted in figure 1.


Figure 1. Logical vision of the storage service (diagram labels: Storage service, Cloud, Databases)

In this general scenario, a user needs to share a file and decides to use the storage service. The file is uploaded and stored in a database in the cloud. The storage service assigns a unique identifier to the file. When another user, or the same one, wants to retrieve the file, s/he invokes the storage service using that identifier. This is what happens from the point of view of the end-user.

The storage service consists of three main components:

1. Local database, directly managed by the storage service, is used to physically store the files uploaded by the users.

2. Metadata registry, also directly managed by the storage service, is used to store the metadata about the files uploaded/published by the users. Such metadata contains information about the users, file annotations, creation and modification dates, etc.

3. Semantic services are used to annotate the files. The storage service only accesses these services, which are deployed over the Internet (cloud), to get the tags/classes of the DON that are useful for annotation purposes.
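A minimal sketch of what one entry in the metadata registry might look like, based only on the kinds of information listed above (users, annotations, creation and modification dates). The class name, field names, and the `annotate` helper are assumptions made for illustration; the document does not define a schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class MetadataRecord:
    """One hypothetical registry entry; all field names are illustrative."""
    file_id: str   # unique identifier assigned by the storage service
    owner: str     # user who uploaded/published the file
    annotations: List[str] = field(default_factory=list)  # DON tags/classes
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    modified: Optional[datetime] = None

    def annotate(self, tag: str) -> None:
        """Attach a DON tag (once) and refresh the modification date."""
        if tag not in self.annotations:
            self.annotations.append(tag)
        self.modified = datetime.now(timezone.utc)


record = MetadataRecord(file_id="f-001", owner="alice")
record.annotate("pdf")
record.annotate("pdf")  # a repeated tag is not stored twice
```

Keeping the annotations as plain tag strings mirrors the idea that the registry only references DON concepts, while the concepts themselves live in the semantic services.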

As mentioned before, the storage service follows a mixed approach. When users want to share/upload a file, two different scenarios are possible:

i. The user wants to store the file in a centralized repository directly managed by the storage service.

ii. The user wants to keep the file locally and provides a public and accessible URL/URI/reference to retrieve the file when required.


Figure 2. Uploading a file to the centralized repository (scenario 1). Double red arrows represent interactions between the user and the storage service, double blue arrows identify interactions between the storage service and the semantic services, single red arrows describe the data flow between the storage service and its internal components, and finally, double green dotted arrows represent that some references/information are shared by the components, i.e. the metadata registry contains some tags belonging to the DON to annotate the files and also contains references to the file in the database.

The first scenario is presented in figure 2. The user wants to share a file but cannot provide a permanent link to retrieve it, so s/he decides to use the storage service to upload the file. Before uploading the file, the user has to provide some extra information about it, such as file format or type of content. The different formats and types of content supported are retrieved from the semantic services and offered to the user. Then, the user sends the file and metadata to the storage service. The storage service creates a unique identifier for the new file and stores the file in its own local database and the metadata in the local registry. Finally, the user is provided with the complete URI where the file is available.

The second scenario is depicted in figure 3. In this case the user does not want to upload the file anywhere because of, for instance, privacy concerns or legal issues. So, s/he decides to provide only the URI where the file is available. The process is otherwise the same as described for scenario 1. The user sends the metadata to the storage service plus a reference/URI to the file. The storage service stores all this information in the metadata registry (including the reference to the file) and nothing is stored in the local database.
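The two scenarios can be sketched with an in-memory stand-in for the service. This is only an illustration under stated assumptions: the class name, the `publish` method, the dictionary-based "local database" and "metadata registry", and the use of `uuid` for identifiers are all hypothetical and not taken from the Dicode design.

```python
import uuid
from typing import Dict, Optional


class StorageService:
    """In-memory stand-in; names and structure are illustrative only."""

    def __init__(self) -> None:
        self.local_database: Dict[str, bytes] = {}    # scenario 1 file contents
        self.metadata_registry: Dict[str, dict] = {}  # metadata for both scenarios

    def publish(self, metadata: dict, content: Optional[bytes] = None,
                uri: Optional[str] = None) -> str:
        # Exactly one of `content` (scenario 1) or `uri` (scenario 2) is expected.
        if (content is None) == (uri is None):
            raise ValueError("provide exactly one of content or uri")
        file_id = uuid.uuid4().hex  # unique identifier for the new file
        entry = dict(metadata)
        if content is not None:
            # Scenario 1: the file itself goes into the local database.
            self.local_database[file_id] = content
        else:
            # Scenario 2: only the reference is kept, inside the registry.
            entry["reference"] = uri
        self.metadata_registry[file_id] = entry
        return file_id


service = StorageService()
uploaded_id = service.publish({"format": "pdf"}, content=b"%PDF-...")
referenced_id = service.publish({"format": "xml"}, uri="http://example.org/data.xml")
```

In the design described above the caller would receive a complete URI rather than a bare identifier; the sketch returns the identifier only because the URI layout is not specified.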


Figure 3. Uploading a reference to the file (scenario 2). The double green arrow represents that the file is accessible through the Internet/cloud. The rest of the arrows have the same meaning as in figure 2.

In both scenarios, metadata about files is stored in the same registry. Particularly relevant are the metadata concerning the file format and content type. This information will be stored as tags using concepts from the DON, so these concepts have to be defined in the DON beforehand.
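As a rough illustration of why the concepts must exist in the DON first, the sketch below rejects any format or content tag that is not in a known concept list. The `DON_CONCEPTS` dictionary is a hard-coded stand-in for whatever the semantic services would return, and the function name is hypothetical; the example values are the ones mentioned earlier in this document.

```python
from typing import Dict, Set

# Stand-in for concepts that would be fetched from the semantic services (DON).
DON_CONCEPTS: Dict[str, Set[str]] = {
    "file_format": {"pdf", "html", "xml"},
    "content_type": {"Dynamika report", "DNA sequence"},
}


def build_tags(file_format: str, content_type: str) -> Dict[str, str]:
    """Return the tags to store, refusing values not defined in the DON."""
    requested = (("file_format", file_format), ("content_type", content_type))
    for kind, value in requested:
        if value not in DON_CONCEPTS[kind]:
            raise ValueError(f"{value!r} is not a {kind} concept defined in the DON")
    return {"file_format": file_format, "content_type": content_type}


tags = build_tags("pdf", "DNA sequence")
```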

To retrieve any file from the storage service, the user only has to indicate the file identifier assigned by the storage service. The storage service will implement the logic needed to retrieve the file wherever it is stored. This will be transparent for the end-user.
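A possible shape for this location-transparent retrieval is sketched below. It assumes, purely for illustration, that a registry entry carries a `reference` key when the file lives at an external URI, and that fetching remote content is delegated to a caller-supplied function (replaced here by a fake so the example runs without network access).

```python
from typing import Callable, Dict


def retrieve(file_id: str,
             local_database: Dict[str, bytes],
             metadata_registry: Dict[str, dict],
             fetch_remote: Callable[[str], bytes]) -> bytes:
    """Return the file contents for `file_id`, wherever they are stored."""
    entry = metadata_registry[file_id]  # raises KeyError for unknown identifiers
    if "reference" in entry:
        # The file was published by reference: follow the stored URI.
        return fetch_remote(entry["reference"])
    # Otherwise the file was uploaded to the local database.
    return local_database[file_id]


def fake_fetch(uri: str) -> bytes:
    """Test double standing in for an HTTP download."""
    return ("fetched from " + uri).encode()


registry = {
    "a1": {"name": "Report 1"},
    "b2": {"name": "File 2", "reference": "http://example.org/file2"},
}
database = {"a1": b"local bytes"}

local_copy = retrieve("a1", database, registry, fake_fetch)
remote_copy = retrieve("b2", database, registry, fake_fetch)
```

The caller uses the same call in both cases, which is the transparency the paragraph above describes.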

The storage service might be useful both to store user files and to store result/intermediate/temporary files coming from the execution of other services. In the latter case, the user would be the application that uses the storage service. All kinds of users will be able to:

• Upload and annotate files

• Download files

• Update already existing files and annotations

• List the existing files in a repository

All these functionalities will be implemented as RESTful services.
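One way the four operations could map onto RESTful routes is sketched below. The document only states that the operations will be RESTful; the URL layout, the HTTP methods chosen, and the operation names are assumptions made for this example.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical route table: (HTTP method, path pattern, operation name).
ROUTES = [
    ("POST", r"^/repositories/(?P<repo>[^/]+)/files$", "upload_and_annotate"),
    ("GET", r"^/repositories/(?P<repo>[^/]+)/files/(?P<file_id>[^/]+)$", "download"),
    ("PUT", r"^/repositories/(?P<repo>[^/]+)/files/(?P<file_id>[^/]+)$", "update"),
    ("GET", r"^/repositories/(?P<repo>[^/]+)/files$", "list_files"),
]


def resolve(method: str, path: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (operation, path parameters) for a request, or None if unmatched."""
    for route_method, pattern, operation in ROUTES:
        match = re.match(pattern, path)
        if route_method == method and match:
            return operation, match.groupdict()
    return None


listing = resolve("GET", "/repositories/ws1/files")
update = resolve("PUT", "/repositories/ws1/files/abc")
```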


To ensure the privacy of files, the storage service will manage different repositories. For instance, in the Dicode Workbench different workspaces (working areas) will be defined, one per use case. Each workspace might use the storage service, but it will only have access to its own repository. Users of workspace 1 will not be able to see or access the repository of workspace 2. The Dicode Workbench will manage this issue.
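The isolation between repositories can be illustrated with a store keyed by workspace. This is a sketch under assumptions: the class and method names are hypothetical, and the real access control is stated above to be handled by the Dicode Workbench rather than by code like this.

```python
from typing import Dict, List


class RepositoryStore:
    """Keeps one file listing per workspace; names are illustrative only."""

    def __init__(self) -> None:
        # workspace name -> {file identifier -> display name}
        self._repositories: Dict[str, Dict[str, str]] = {}

    def add(self, workspace: str, file_id: str, name: str) -> None:
        self._repositories.setdefault(workspace, {})[file_id] = name

    def list_files(self, workspace: str) -> List[str]:
        """List only the files that belong to the given workspace."""
        return sorted(self._repositories.get(workspace, {}).values())

    def get(self, workspace: str, file_id: str) -> str:
        """Look a file up inside one workspace; other repositories stay hidden."""
        try:
            return self._repositories[workspace][file_id]
        except KeyError:
            raise LookupError(
                f"no file {file_id!r} in repository {workspace!r}") from None


store = RepositoryStore()
store.add("workspace-1", "f1", "Report 1")
store.add("workspace-2", "f2", "Document 1")
```

Because every lookup requires the workspace, a file identifier from workspace 2 simply does not resolve when used from workspace 1.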

3 Dicode user interface

In order to be integrated within the Dicode Workbench, a widget-based interface will be developed. A preliminary design of this interface is shown in figure 4.

Figure 4. Preliminary design of the widget-based interface of the storage service for the Dicode workbench

This widget will display at the top a generic label to identify the service (“Storage service”). There will be a menu with at least three options:

• “Upload file…”: this option will allow users to upload files to the storage service. A new window (figure 5) will be opened to capture the metadata from users and to select the file and its location. Once the upload has finished successfully, the new file will be shown within the tree view.

• “Configure”: this option will allow users to configure some parameters of the storage service.

• “About”: this option will display information about the developers, dates, licenses and other useful information about the service.

In the widget body, the list of available files will be shown using a tree view. Users will be able to retrieve the files just by clicking on the name. Additionally, the widget will allow dragging any file and dropping it over another service/widget within the workbench. Scroll bars will be shown whenever needed.

As mentioned before, when users want to upload a file, they will click on the “Upload file…” option and a new window will appear. This window will contain a form to capture the metadata about the file. A preliminary design of this form is presented in figure 5.

Figure 5. Preliminary design of the form to capture the information about the file

The form in figure 5 presents the following fields to be completed by the user:



• Name: textual identifier of the file, used to display it in the tree view.

• Description: textual description of the file and its contents.

• File format: this field allows users to specify the format of the file. The format will be selected from a fixed list. The list of supported file formats will be retrieved using the semantic services (DON).

• File contents: this field allows users to specify the contents of the file. The content type will be selected from a fixed list. The list of supported contents will be retrieved using the semantic services (DON).

A mechanism/protocol should be defined to update and maintain the list of formats and contents supported in the DON.

• File location: the user can decide whether the file will be uploaded and stored in the cloud or will remain in its original location, in which case a public URI will be provided to access it.

o When the user selects “In the cloud”, s/he has to select a file from his/her local machine by using the “Browse…” button.

o If the user selects “URI”, s/he must specify the public URI to the file.

Apart from this information, the Dicode workbench will send extra information to the storage service about the user uploading the file and the workspace/repository that the file belongs to.
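The form data and the extra workbench information could be combined into a single upload request roughly as follows. This is a sketch under assumptions: the key names, the idea that all four text fields are mandatory, and the function name are hypothetical, since the document does not define a request format.

```python
from typing import Dict, Optional

# Assumption: every text field of the form must be filled in.
REQUIRED_FIELDS = ("name", "description", "file_format", "file_contents")


def build_upload_request(form: Dict[str, str], user: str, workspace: str,
                         local_path: Optional[str] = None,
                         uri: Optional[str] = None) -> Dict[str, str]:
    """Validate the form and merge it with the workbench-supplied context."""
    missing = [name for name in REQUIRED_FIELDS if not form.get(name)]
    if missing:
        raise ValueError("missing form fields: " + ", ".join(missing))
    # "File location" offers two options; exactly one must be chosen.
    if (local_path is None) == (uri is None):
        raise ValueError("choose either a local file or a public URI")

    request = {name: form[name] for name in REQUIRED_FIELDS}
    if local_path is not None:
        request["file_location"] = "cloud"  # file picked with "Browse..."
        request["local_path"] = local_path
    else:
        request["file_location"] = "uri"    # file stays where it is
        request["uri"] = uri
    # Extra information added by the workbench, not typed by the user.
    request["user"] = user
    request["workspace"] = workspace
    return request


form = {
    "name": "Report 1",
    "description": "Quarterly summary",
    "file_format": "pdf",
    "file_contents": "Dynamika report",
}
request = build_upload_request(form, user="alice", workspace="workspace-1",
                               uri="http://example.org/report1.pdf")
```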

