FISHNet: Implementation Plan

Project Name: FISHNet

Partners: FBA FreshwaterLife / KCL Centre for e-Research






Prepared By: FISHNet project team.

Document Owner(s) | Project/Organization Role
Dr Mark Hedges | FISHNet Project Manager (KCL)
Dr Michael Haft | FISHNet Site Manager (FBA)

Project Status Report Version Control

Version | Date | Author | Change Description
0.1 | 23/07/2010 | Mark Hedges | Document created
0.2 | 26/07/2010 | Mark Hedges, Mike Haft, Kearon McNicol, Eric Liao, Simon Fox | Contributions by various project staff
1.0 | 29/07/2010 | Mark Hedges | First release to JISC





















































Contents

1 Purpose of Document
2 Summary of Requirements
3 Core Technologies
4 User Management Model
5 Development Methodology
5.1.1 Agile Development
5.1.2 Project Tracking
6 Detailed Schedule for Implementation (Workpackage 4)






1 Purpose of Document

The purpose of this document is to provide a detailed plan for the FISHNet implementation. Specifically, the document describes the following:

• The scope of the implementation, by identifying the key requirements that we will aim to support. These are based on the outputs from the requirements capture phase of the project.
• The technologies that will be used for the implementation.
• The implementation methodology to be followed.
• A detailed breakdown of the implementation workpackage (WP4 in the Project Plan).



2 Summary of Requirements

The following list of requirements was derived from a series of face-to-face interviews with the freshwater research community, followed up by a workshop at which the initial list was discussed, expanded and prioritized. Fuller information about the requirements, and about the process by which they were captured, can be found in FISHNet: User Requirements. The list was prioritized into the following categories: Must have; High priority; Would be nice; Don’t need. Requirements capture for FISHNet was intended to be wide-ranging, and not every requirement identified will be supported by the system delivered at the end of the project. However, all will be taken into consideration during development, so that the architecture is extensible without major reworking.

Broadly speaking, FISHNet’s agile development approach will follow this prioritisation: “Must have” followed by “High priority” followed by “Would be nice”. However, the feasibility and ease of implementing a requirement will also be taken into consideration. Thus a “Must have” may not be implemented during the project timescale if to do so would use up developer effort to the detriment of other requirements, and conversely lower priority requirements will be implemented if it is easy to do so.
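The prioritisation rule described above can be illustrated with a minimal sketch. It is not part of the FISHNet codebase: the `Requirement` class, the `effort` estimates and the two thresholds are hypothetical, chosen only to show how priority category and ease of implementation might be combined when ordering a backlog.

```python
from dataclasses import dataclass

# Priority categories from the requirements workshop, highest first.
PRIORITY_RANK = {
    "Must have": 0,
    "High priority": 1,
    "Would be nice": 2,
    "Don't need": 3,
}


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: str  # one of the keys of PRIORITY_RANK
    effort: int    # hypothetical estimate, in developer-days


def plan_backlog(requirements, easy_threshold=2, costly_threshold=20):
    """Order requirements by priority, adjusted for ease of implementation.

    Easy items move up one priority band and very costly items move down
    one band; both thresholds are illustrative assumptions.
    """
    def sort_key(req):
        rank = PRIORITY_RANK[req.priority]
        if req.effort <= easy_threshold:
            rank -= 1  # easy to do, so promote it
        elif req.effort >= costly_threshold:
            rank += 1  # would starve other requirements, so defer it
        return (rank, req.effort, req.name)

    # "Don't need" items are never scheduled.
    candidates = [r for r in requirements if r.priority != "Don't need"]
    return sorted(candidates, key=sort_key)


example = [
    Requirement("Peer-reviewed data journal", "Must have", 30),
    Requirement("Timestamp for data", "Must have", 1),
    Requirement("Usage statistics", "High priority", 2),
    Requirement("Research gap analysis", "Would be nice", 10),
    Requirement("Dataset public preview", "Don't need", 1),
]
ordered_names = [r.name for r in plan_backlog(example)]
```

With these made-up estimates, the costly “Must have” item drops behind an easy “High priority” item, which is the trade-off the paragraph above describes.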

Must have / Minimum requirement

• Georeferencing of data
• Timestamp for data
• Ability to restrict access to data
• Version control/notes on versions
• Support for multiple digital formats (Excel, Word, etc.)
• Ability to annotate maps
• Taxonomy consistency tool, e.g. synonym search, term resolution service
• Taxonomic picklist tool for new data
• Taxonomic spell checker for existing data
• Uploading research papers
• Peer-reviewed data journal
• Ability to add photos pertaining to data
• Browse data based on what has been downloaded by others (“people who liked this dataset also liked …”)
• Social/professional networking features
• Use of data requires citation of uploaders' published papers
• myExperiment-type workflow feature
• Top ten data sets
• Online access (i.e. Dropbox)
• Template IPR/licenses
• Capture related information, e.g. emails, presentations, posters
• Dataset commenting by other users
• DOIs as appendices to publications
• Export to EndNote
• Metadata-only entry that can point to an external data source
• Keyword and other search
• Export data from system into original format
• Must store in a standard non-proprietary/open format
• Must be sustainable over the long term, i.e. >20 years
• Owner/contact information
• Hierarchical taxonomic search
• Taxonomic spell checker
• Customizable reference/citation options
• Open search login to download
• Address IPR
• Option to request access to an indicated (restricted) dataset


High Priority

• If you use my data you must deposit yours
• View data in maps
• Search across systems outside of FISHNet as well as FISHNet itself
• Rich metadata, i.e. field notes etc., layers and supporting material
• Taxonomic picklist
• Usage statistics

Would be nice

• View data in time series
• Resolve taxonomic differences in data
• Calculation of simple statistics
• Ecological traits analysis
• Statistical analysis of datasets using R, a software package for statistical data analysis
• Interoperability with myExperiment-type tools
• Research gap analysis

Don't need

• Dataset public preview that restricts access to full dataset


3 Core Technologies

A variety of software technologies were investigated and reviewed as potential components for FISHNet. These technologies were evaluated in terms of the user requirements to be supported and the ease with which they could be integrated. We identified four key components in the FISHNet server-side technology stack (Web User Interface, Digital Repository, Security, and Integrated Search), which will be implemented using the following software components. The bullets following each component indicate the reasons for its selection.

Web User Interface: Liferay

• The existing FreshwaterLife infrastructure uses it, thus allowing FISHNet direct access to the >1000 FreshwaterLife users.
• It incorporates portal technology that allows generic functionality such as user management and security to be handled by the portal, leaving portlets to provide application-specific functionality.
• By using portlets, development work can be broken down easily into discrete parts that allow the end user to customise how they use the web tool.
• Liferay has excellent support for single sign-on tools, allowing it to integrate easily with the other technologies that FISHNet may use.


Digital Repository: Fedora

• The most appropriate choice for representing the complex information that we intend to capture, model and manage.
• It offers a flexible content model architecture that supports representation of compound digital objects and aggregations, and allows multiple metadata schemas to be associated with an object.
• It contains built-in support for semantically representing (as an RDF/OWL graph) the structure of compound objects and relationships between objects.
• Fedora’s architecture is essentially service-orientated, with all functionality, data and metadata being exposed as web services. We will follow this architectural approach in our development, enriching the functionality by building new services around the core repository.
• The Fedora API includes both SOAP and REST options, each of which has its own advantages; as part of the work we will investigate both SOAP and REST for our extensions, and if appropriate we may implement each as alternatives.
• The new DuraCloud software will simplify the implementation of a distributed storage layer for the content held in the repository.
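As a rough illustration of the REST option mentioned above, the sketch below builds the kind of URLs a portlet might use to read an object profile or a datastream from the repository. The path layout follows the Fedora 3.x REST API as we understand it and should be checked against the deployed version; the base URL, the `fishnet:1` identifier and the helper names are assumptions made for this example only. No network call is made.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of a local Fedora installation (illustrative only).
FEDORA_BASE = "http://localhost:8080/fedora"


def object_profile_url(pid, base=FEDORA_BASE):
    """URL for an object's profile, requested as XML."""
    query = urlencode({"format": "xml"})
    return f"{base}/objects/{quote(pid, safe='')}?{query}"


def datastream_content_url(pid, ds_id, base=FEDORA_BASE):
    """URL for the raw content of one datastream of an object."""
    return (
        f"{base}/objects/{quote(pid, safe='')}"
        f"/datastreams/{quote(ds_id, safe='')}/content"
    )


# "fishnet:1" is a hypothetical object identifier; "DC" is the Dublin Core
# datastream that Fedora objects normally carry.
profile_url = object_profile_url("fishnet:1")
dc_url = datastream_content_url("fishnet:1", "DC")
```

Keeping URL construction in small functions like these would let the same calling code be pointed at a SOAP-backed implementation later, should that option be preferred.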

Security: OpenSSO/OpenAM

• It is an open source Authentication, Authorization, Entitlement and Federation system.
• It provides core identity services to make it simple to implement transparent single sign-on (SSO) as a security component in a networked infrastructure.
• Fedlets make it simple to embed SSO capabilities in Java EE web applications (such as Liferay and Fedora).
• The preferred deployment container is GlassFish, which is already used in the FreshwaterLife infrastructure.

Integrated Search: Apache Solr

• A popular and very fast open source enterprise search platform.
• It supports full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling.
• Faceted search makes it easier for users to narrow down search results by classifying returned results into meaningful categories that are guaranteed to exist.
• Rich document handling will enable FISHNet to index and search within documents stored in the Fedora digital repository.
• Solr is highly scalable, providing distributed search and index replication.
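To make the faceted search point concrete, here is a small sketch that assembles a Solr query URL using Solr's standard request parameters (`q`, `hl`, `facet`, `facet.field`, `facet.mincount`). The Solr location and the field names `taxon` and `location` are assumptions for illustration; the actual FISHNet index schema is not defined in this document.

```python
from urllib.parse import urlencode


def build_search_url(text, facet_fields,
                     base="http://localhost:8983/solr/select"):
    """Build a Solr query URL with highlighting and faceting enabled.

    facet.mincount=1 asks Solr to return only facet values that match at
    least one document, so every category shown to the user has results.
    """
    params = [
        ("q", text),
        ("wt", "json"),
        ("hl", "true"),
        ("facet", "true"),
        ("facet.mincount", "1"),
    ]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", name) for name in facet_fields]
    return f"{base}?{urlencode(params)}"


search_url = build_search_url("brown trout", ["taxon", "location"])
```

The resulting URL can be requested with any HTTP client; the facet counts in the response are what a search portlet would render as clickable categories.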

4 User Management Model

Given the highly distributed nature of the intended FISHNet user community, key implementation issues are user management, access management and security. As described in Section 3, Liferay will be used as the user interface environment for FISHNet, and we intend to follow the same user management model for FISHNet, facilitating integration between FISHNet and FreshwaterLife.

The user management model used in Liferay is as follows. There are four distinct entities: organisations; communities; users; user groups.

• Organisations: these are hierarchical and can have sub-organisations, such as University X with a Department of Freshwater Science as a sub-organisation. Management of an organisation and its members can be delegated to a specific user. Users can also be recognised automatically as belonging to a particular organisation based on their email address (e.g. user@universityX.ac.uk). Permissions for individual organisation users can be managed by the organisation administrator.
• Communities: these are similar to organisations, but rather than being hierarchically structured, with organisations and sub-organisations, communities can contain users from a variety of different organisations, or users who don't belong to any particular organisation. An example would be a group of researchers working on a shared research project but based at several different universities or institutions. Permissions for individual community users can be managed by the community administrator.
• Users: essentially anyone with a registered user account.
• User groups: an independent collection of users who can be assigned specific permissions across the whole of the system. For example, if a moderator group is required for a forum that is available to all users regardless of community or organisation, then a user group called forum moderators could be created and anyone belonging to it could have the correct permissions regardless of their community or organisation.
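The organisation hierarchy and the email-based recognition of members can be sketched as follows. This is an illustrative model only, not Liferay's API: the `Organisation` class, its `email_domain` field and the two helper functions are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Organisation:
    name: str
    email_domain: Optional[str] = None        # e.g. "universityx.ac.uk"
    parent: Optional["Organisation"] = None   # None for a top-level organisation


def organisation_for(email, organisations):
    """Return the organisation whose email domain matches the address, if any."""
    domain = email.rsplit("@", 1)[-1].lower()
    for org in organisations:
        if org.email_domain and org.email_domain.lower() == domain:
            return org
    return None


def lineage(org):
    """Names from the top-level organisation down to the given one."""
    names = []
    while org is not None:
        names.append(org.name)
        org = org.parent
    return list(reversed(names))


university = Organisation("University X", email_domain="universityx.ac.uk")
department = Organisation("Department of Freshwater Science", parent=university)

matched = organisation_for("user@universityX.ac.uk", [department, university])
```

Here `matched` is the university, and `lineage(department)` walks up the hierarchy to give the university followed by its department.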

This user management model allows organisations and communities to have their own semi-independent section of a Liferay-based website (such as FreshwaterLife) which they can organise as they wish. Examples of communities that already exist within the new FreshwaterLife environment include www.windermere-science.org.uk/home, and the FISHNet community itself, at www.fishnetonline.org/home. Clearly, these communities have very different appearances and layouts, and contain different content. Crucially, all communities built within the FreshwaterLife environment have access to any of the portlets within FreshwaterLife, including those that will be developed as part of the FISHNet project.

Because the organisation/community/user model reflects to a great extent the way in which freshwater data is collected and managed, as identified during the FISHNet requirements gathering process, it was decided to use this as the basis for user management in FISHNet. Individual users and communities will have storage space for uploading data and associated metadata, which they can share with either a specific user group of their own choosing, a community or an organisation. Liferay will be combined with a digital repository (using Fedora), which will reflect the Liferay user management model. Thus a dataset uploaded by an individual user can be either completely private, shared with just one or two people they work with, or made available to an entire research community or organisation, depending on the choice of the user.
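The sharing options described above amount to a simple access check, sketched below. The `User` and `Dataset` classes and the `can_access` function are hypothetical illustrations of the model; in the real system the check would be enforced by Liferay and the repository security layer rather than by application code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    organisations: frozenset = frozenset()
    communities: frozenset = frozenset()
    user_groups: frozenset = frozenset()


@dataclass(frozen=True)
class Dataset:
    owner: str
    # An empty set everywhere means the dataset is completely private.
    shared_users: frozenset = frozenset()
    shared_groups: frozenset = frozenset()
    shared_communities: frozenset = frozenset()
    shared_organisations: frozenset = frozenset()


def can_access(user, dataset):
    """True if the user owns the dataset or falls within one of its shares."""
    if user.name == dataset.owner:
        return True
    return (
        user.name in dataset.shared_users
        or bool(user.user_groups & dataset.shared_groups)
        or bool(user.communities & dataset.shared_communities)
        or bool(user.organisations & dataset.shared_organisations)
    )


alice = User("alice")
bob = User("bob", communities=frozenset({"Windermere Science"}))
carol = User("carol")
dave = User("dave")

private_data = Dataset(owner="alice")
shared_data = Dataset(
    owner="alice",
    shared_users=frozenset({"carol"}),
    shared_communities=frozenset({"Windermere Science"}),
)
```

In this example only Alice can see `private_data`, while `shared_data` is visible to Alice, to Carol (named individually) and to Bob (through his community), but not to Dave.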

5 Development Methodology

5.1.1 Agile Development

The FISHNet project is using an agile approach to software development, specifically the Scrum approach. Scrum (in common with other agile development methods) involves working in short sprints comprising about 4-5 weeks of work, with a “working” software release at the end of each sprint. This doesn't mean that each monthly release is a perfect working piece of software, but rather it represents the best piece of software the development team was able to produce during the period.

The overall direction of the project is determined by creating what is termed the product backlog, which is a list of required, requested or desired features that the project stakeholders have for the software product. In our case, this was determined mainly by the project workshop held in March, as described in the FISHNet: User Requirements document (see also Section ?????). As each sprint is planned, features are taken from the product backlog and put on a separate list called the sprint backlog, and developers then work on adding these features to the software. They coordinate this work through a daily meeting of no more than 15 minutes (this is called the daily scrum, from which the name of the method is derived), at which issues relating to the day’s work are discussed. Because progress is reviewed on a daily basis, developers are able to respond quickly to any problems that may arise, and to change their detailed, short-term plan accordingly. This means that these problems do not impede their ability to release a product at the end of the sprint, and it is because of this agility in the face of events that the term agile is used to refer to this development methodology.




At the end of each sprint, the product is released either externally (to anyone who is interested) or internally (to particular stakeholders); we will be releasing externally. At this point project stakeholders can provide feedback on the new release, report bugs, test new features, request improvements to existing features, and so forth.

The FISHNet team will have monthly meetings to review the previous sprint and to plan the next one. Notes from the planning meetings will be made available so that those participating in the FISHNet project can see what is planned for the coming sprint.

5.1.2 Project Tracking

Keeping track of an agile software development project is a challenge in itself. Agile development is designed specifically to be able to respond efficiently to problems as they arise, and so it is imperative to keep track of these. There are various tools available for this; FISHNet is using a tool called Jira, an industry-leading issue-tracking tool used by many leading software companies and open source projects. It is normally somewhat expensive, but the FBA's charitable status means that the project can use the software package for free.

Jira allows users to post bug reports, request improvements and request new features. It also allows users to watch issues they report for any changes made by the development team. In addition to this it allows us to implement many different aspects of the Scrum development methodology that we've decided to use. We intend that those participating in FISHNet will use it to provide feedback on project outputs and help the development team to improve them. The FISHNet Jira installation has been used internally to the project since early July, and it will be made available to the public in early August, with guides to how it is being used by FISHNet.

6 Detailed Schedule for Implementation (Workpackage 4)

In this section we provide a detailed breakdown of the implementation workpackage for FISHNet (i.e. WP4 in the Project Plan). In accordance with the agile approach followed by FISHNet (see Section 5), the implementation of the system will take place in short phases of approximately one month in duration, involving iterative cycles of development and testing. Thus it is not possible to say precisely when each item of functionality will be implemented, as this will be a moving target.

Instead, we give a top-level breakdown into broad activities with their expected start and end dates, and a more fine-grained breakdown into sub-activities. The current release schedule, together with the functionality to be implemented within each release, can be found in the FISHNet Jira system at http://www.freshwaterlife.org/issues/.

Workpackage and activity | Earliest start date | Latest completion date | Outputs | Milestone | Responsibility

WP4: Implementation | 01/06/10 | 31/03/11 | (ii) e-Framework outputs; (iii) (iv) Test documentation; (v) template copyright and data-sharing agreements | |
Design and development of pilot in accordance with Implementation Plan. Note: an agile, incremental approach will be followed, so the steps below do not represent 3 successive phases of the WP, but rather activities that are repeated as part of a cycle.


1. Content triage | 01/06/10 | 31/08/10 | Content audits. | | KM, MHaft
• Undertake classification/audit of existing content in FreshwaterLife (FWL), and determine: which objects are most important for import into FISHNet; content types (format etc.).
• Repeat classification for Environment Agency systems (although may be metadata only for FISHNet).












2. Content Modelling | 01/07/10 | 30/09/10 | Content model documentation | | KM, MHaft, MHedges
• Define metadata requirements for each type of content.
  • Define a bare minimum that users will be forced to supply for their private data.
  • Define larger metadata set(s) for public release of data.
  • Identify support to be offered to depositors for creating metadata.
  • Liaise with EA and CEH regarding their metadata schemas. These will be used to facilitate data exchange/interoperability, although we may include additional fields needed for FISHNet.
• Define Fedora Content Models (CMs) for identified content.
• Define mapping from existing formats and management systems to Fedora CMs (Note: old FWL site can export content as XML).
• Determine preservation requirements for different content types.
  • Archival formats:
    • Spreadsheets (typically Microsoft Excel): convert to CSV.
    • Documents (e.g. Word): convert to OpenOffice XML.
    • PDFs: not welcome, but can offer to create PDF/A versions of Word etc. documents for dissemination.
    • Databases: take SQL dump and CSV copy of data. It would be desirable to have standardised MySQL dump plus live mount. Other possibility: SIARD.
  • Preservation metadata.
• Review of content model document for participant feedback.
• Define generic processes/workflows for repository ingest scenarios (using BPMN or UML, as appropriate). In particular, this will include:
  • Bulk ingest processes for existing material (e.g. FWL, EA, FA,-CDS, image banks). (These will for the most part be automated.)
  • Ingest of material (from the bare minimum scenario to more complex examples).














3. Copyright and IPR | 01/08/10 | 31/12/10 | Template copyright and data-sharing agreements. | | MHaft
• Identify requirements from EA/CEH, with respect to meeting public obligations, increasing public engagement and keeping track of the use of their data.
• Create template agreements for researchers and validate them by liaising with researchers.












4. Development Environment | 01/06/10 | 31/12/10 | Functioning environments | | EL, SF
• Set up development environments with Fedora, Solr, Liferay and OpenAM:
  • Local development (on desktop machines)
  • Test/Integration (on servers)
  • Live (on servers)
• Set up hardware for integration/live environments (largely in place, using existing hardware. Additional storage will be needed for newly ingested data).












5. Security and Access management | 01/06/10 | 31/12/10 | | | KM, MHaft (spec); EL, SF (impl.)
• Define access and security model.
• Investigate security aspects of Solr, in particular in terms of private/restricted files.
• Integration of Fedora with Liferay security/user model.












6. Design | 01/06/10 | 31/12/10 | Technical documentation, including architecture/class diagrams; e-Framework outputs | | EL, SF







7. Coding, testing and integration | 01/06/10 | 28/02/10 | Tested software components. Tested release of integrated environment. Test documentation (online). | | EL, SF








8. Desktop (non-web) functionality (Unlikely to be implemented within project timescale) | 01/01/11 | 28/02/10 | Feasibility documentation. | | EL, SF
• Data versioning using TortoiseSVN, Git etc., where researchers work on their data locally and press a commit button that synchronises their workspace to the repository.
• Other options to investigate: Google Gears (for those willing to use Google); Google Desktop to scavenge for files to commit to repository; Fascinator.
• Desktop tools for metadata creation.









User (public) testing | 01/09/10 | 31/03/11 | Test/evaluation documentation (online/blog) | | All (including FISHNet and FreshwaterLife community)

This covers public testing and evaluation of FISHNet by researchers. These will include not only those interviewed at the start of the project but also the wider FreshwaterLife community, who will assess the system more broadly and informally.

The description of this WP will be expanded later in the agile development process.

Note: Any given release of FISHNet will undergo internal testing followed by external (user) testing, becoming increasingly robust as it does so. More than one release may be under test at any one time in different environments, so close attention will be paid to issue tracking and version control, using Jira.