Towards online collaboratories for global data
gathering in social and economic history
IISG - Internationaal Instituut voor Sociale Geschiedenis
(Institute of the Royal Netherlands Academy of Arts and Sciences)
Table of contents
2. Project goals
3. The research collaboratories
3.1. Institutional setting
3.3. Desired functionalities
4. Comparing collaborative software
5. Installation and test of Liferay
6. User experience of Liferay
7. Lessons learned and prospects for future research
8. Project evaluation and financial report
Appendix 1: Questionnaire workflows and functionalities (Jan Kok
and Frans de Liagre Böhl)
Appendix 2: Technical report platforms comparison (Dutch) (Mario
Appendix 3: User guide Liferay platform (Frans de Liagre Böhl)
Appendix 4: Technical report Liferay (Dutch) (Frans de Liagre Böhl)
Appendix 5: Questionnaire user satisfaction (Jan Kok and Frans de
The Hublab pilot was unthinkable without the input of a large number of people, who
have brought their widely varying expertise into the project. On the technical side, I’d
like to mention Mario Mieldijk, Jip Borsje, Gordan Cupac, Ole Kerpel and Luciën van
Wouw. The participating collaboratories were led by Karin Hofmeester, Sjaak van der
Velden, Marco van Leeuwen, Richard Zijdeman, Kees Mandemakers and Tine de
Moor. For their advice and contributions to overall management I am grateful to Titia
van der Werf, Jef van Egmond and above all Frans de Liagre Böhl.
HubLab aimed to create a platform to support communication and data-sharing of
international collaboratories in social and economic history. The groups are
stimulated by the International Institute of Social History, but their members are not
affiliated to the Institute.
The international character of the groups required a user-friendly, ‘light’ tool that
would operate independently from the Institute’s internal system. In the first stage of
the project, leaders of four collaboratories were interviewed in order to gain insight into
software requirements with respect to internal workflows, communication, security,
version management of documents et cetera. Subsequently, three platforms were
tested (Sakai, Sharepoint and Liferay) on both the technical fit with the in-house ICT
knowledge and the needs specified by the collaboratory leaders. In this test, Liferay
emerged as the optimal candidate.
In the third stage of the project, Liferay was installed on five group sites. The leaders/
administrators were given a free hand to design their own platform and to stimulate
their group members to make use of it. Demonstrations of the tool were planned to
coincide with conferences or workshops.
The platform proved relatively easy to install and to use, but several programming
errors caused delay and, in the case of one or two leaders, reluctance to further
expose the research team to the experiment. Eventually, the user test of the platform
was largely limited to the administrators and the technical support team.
Given the high priority of the Institute to stimulate the creation of international data
‘hubs’, the study of best practices of both the collaborative research model and the
ICT support of it will continue. The platform itself will likely be improved by e.g.
adding demonstration videos, better navigation tools and integration with email. Even
more important is further exploration of data-sharing which is integral to the
collaboratory model. Thus, we will look into version management, a licence structure,
intellectual property rights and possibly online manipulation of data.
This final report on the pilot project Hublab serves several purposes. First, it
summarizes the original goals and planning as agreed with SURF Foundation
(section 2), describes the actual implementation of these goals (sections 4 and 5)
and evaluates the achievements in the light of the project’s proposal (section 8).
Second, it provides a detailed study of the collaboratories involved in the project. In
our view, the institutional and social aspects of the collaborative process deserve to be
taken fully into account when assessing the performance of technical solutions to
data-sharing and communication (section 3). Third, the report brings together a
number of scattered documents, such as technical tests and guidelines, which have
been produced in the course of the pilot test (Appendices). Last but not least, the text
offers a (self-)critical evaluation of the test, in the sense that it weighs and discusses
the relative merits of the selected platform Liferay and the input of the collaboratory
leaders and members, within the scope of a short-term SURF tender (sections 6, 7
2. Project goals
The project Hublab aimed to construct a supportive environment for a number of
collaboratories (both existing and new ones) geared at collecting and standardizing
research data in social and economic history. Three questions were central to the
pilot project. First, is it possible to optimize workflows within online teams with
respect to the creation and manipulation of research data? Second, to what extent
does the platform enable and stimulate active participation, effective communication
and joint decision-making within groups? Third, what platform is optimally suited to
international research teams whose members are not to be integrated in the hosting
Institute and whose leaders require a great deal of autonomy versus the Institute’s
ICT department? Thus, how do these platforms function in a heterogeneous
environment of distributed data ‘hubs’?
The combination of these questions required an extensive, initial test of various
applications. In the first stage of the project, three collaboratory platforms were
evaluated: SURFgroepen/Sharepoint, Sakai and Liferay/Alfresco. The chosen
platform was to be installed in a test-environment for four different collaboratories.
Testing the software in different groups allows us to assess the importance of
institutional contexts, differences in workflows as well as group dynamics in the
implementation of communication tools.
The collaboratories involved number between 20 and 40 participants scattered
around the globe. The groups are:
1) Global Collaboratory on the History of Labour Relations in the period 1500-2000,
a worldwide ‘census’ of labour relations;
2) History of Work Information System, a working group devoted to creating
uniform codes for historical occupational titles (also known as HISCO);
3) Towards a global history of life courses. Creating a network for the development of
data structures for standardized longitudinal historical data, a two-year project
involving an effort of managers of large databases to devise a joint structure for
4) Labour conflicts, a collaboratory building a central database on labor conflicts
(strikes and lockouts).
In January 2008 a fifth group requested to participate in the pilot project:
5) Data infrastructure for the study of guilds and other forms of corporate collective
action in pre-industrial times, a collaboratory around the systematic collection of
data on guilds (1300-1800) in countries such as Italy, China, Turkey, England and
The Hublab pilot project was planned as follows.
The first stage ran from November to December 2007 and consisted of two Work
Packages. WP1, ‘Inventory of workflows and functionalities’, aimed to report on
the workflows within collaboratories and to query their leaders on what they would
consider an ideal platform for their groups. The report, to be completed by late
November 2007, formed the input for Work Package 2 and was also disseminated to
the SURF tender community (Deliverable 1). WP2, ‘Comparison of collaboratory
systems’, aimed to find the most suitable platform, given the wishes of the teams’
leaders and the constraints of the Institute’s ICT department. The WP implied the
initial construction of a scorecard template with weighted test criteria. On the one
hand, the criteria related to the infrastructural environment (among others, openness
of the architecture and technical and administrative demands). An important criterion
was that the Institute’s programmers should be able to develop new applications within
the platform and guarantee lasting support. Thus, the platform should match the
in-house knowledge of PHP, Apache, Java, MySQL et cetera. On the other hand, the
criteria related to general functional demands for data-sharing collaboratories (among
others, communication, workflow, datafile sharing, version management, rights
management, repository characteristics). The ICT staff responsible for this WP installed and tested
three different applications: SURFgroepen/Sharepoint, Sakai and the Open Source
package Liferay, which was also tested for possible integration with the open
source content management system Alfresco. The result of the test, Deliverable 2, was
a technical report to be completed in December 2007.
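
The idea of a scorecard with weighted test criteria can be illustrated with a minimal sketch. The criterion names, weights and the 1-5 scale below are assumptions chosen for illustration only; they are not the scorecard or the scores actually used in WP2, and the platform names in the usage note are placeholders.

```python
from typing import Dict, List

# Hypothetical criteria and weights (assumption): the actual WP2 scorecard
# is documented in Appendix 2 and is not reproduced here.
WEIGHTS: Dict[str, float] = {
    "open_architecture": 3.0,
    "fit_with_inhouse_skills": 3.0,
    "communication": 2.0,
    "version_management": 2.0,
    "rights_management": 1.0,
}


def weighted_score(scores: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted average of per-criterion scores (e.g. on a 1-5 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def rank_platforms(all_scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Return platform names ordered from highest to lowest weighted score."""
    return sorted(all_scores,
                  key=lambda name: weighted_score(all_scores[name]),
                  reverse=True)
```

Calling `rank_platforms({"platform_a": {...}, "platform_b": {...}})` with one score per criterion returns the names in descending order of weighted score. A weighted average rather than a plain sum keeps the result on the same scale as the individual scores, which makes the outcome easier to discuss with non-technical hub leaders.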
In the second stage, January 2008, the actual pilot was to be prepared by installing
the selected software in four (later five) collaboratories (WP3). This also implied a
new round of discussions with the hub-leaders on what content should be migrated
from their existing sites to the new environment. The platform (Deliverable 3) was to
be accompanied by a user guide, a configuration management procedure and an
In the third stage, February-May 2008, the actual test (WP4) was to be performed in
the research practice of the collaboratories involved. The intention was that each
group would demonstrate the platform at a workshop and discuss its usability in
online questionnaires and on the sites’ forums. The leaders were supposed to provide
their own reports on the platforms’ performance, and to incorporate in their reports
the experiences of the group members. At the end of the project, their reports were to
be consolidated in a single report with conclusions and practical recommendations
Two final Work Packages covered the entire period of the project. WP5, Knowledge
dissemination, was intended to distribute the results of the project among interested
parties. One outlet was formed by the Surftender blog, to which contributions on
Hublab were sent. Another intended outlet was a Hublab webpage and contributions
to discussion lists and professional journals (Deliverable 5). Finally, WP6 covered
the project management, which implied ensuring the timely completion of tasks in
accordance with the controlling document, maintaining contacts with SURF and
contributing to the tender meetings, and preparing both an interim and final report
3. The research collaboratories
3.1 Institutional setting
The project Hublab is placed firmly in the research strategy of the International
Institute of Social History. As formulated in the Strategienota 2007-2010, the
institute aims to play a leading role in worldwide research on economic and labour
history, by supporting international teams of peer researchers collecting and
analyzing data. The immediate stimulus for Hublab came from a KNAW grant for the
project ‘Global Hubs for Global History’ (September 2007-December 2009). In this
project, the Institute aims to improve the digital infrastructure for collaborative
projects in its field. More specifically, the project entails upgrading existing databases
into data ‘hubs’. A datahub is seen as a virtual meeting place for researchers and a
repository for their data. It offers a platform for their cooperation, links to
documentation and to publications concerning the project and offers access to the
data that forms the core of the research. The datasets currently managed by the IISH
deal with wages and prices, life courses/life chances, guilds, labor organizations,
strikes and labour relations. Each of these datasets offers insights into specific
elements of global economic and labor history. Furthermore, in their combination
they allow for entirely new research into the dynamics of long-term global processes.
In order to stimulate the comparison and combination of these datasets, efforts are
directed towards improving their interoperability through standardization, improved
documentation, georeferencing et cetera. Building an infrastructure of related
research databases with global data presupposes optimizing cooperation between
peers on a global scale.
Improving the ‘hubs’ is a clear priority of the Institute. However, the project Hublab
came at rather short notice, which meant that scheduled work had to be shifted,
which was not always feasible. This has caused some delay in Hublab’s progress (see
also section 8).
In addition, three other organizational aspects are relevant for an evaluation of
(1) Selecting the platform and installing the pilot environment required a – for the
institute – unusually intensive cooperation between staff of the departments of ICT
and Digital Infrastructure and researchers. ICT staff were responsible for the
hardware involved and for installing the platform; the staff of Digital Infrastructure
had to support the platform, e.g. by writing the User Guide and by performing
helpdesk functions. Researchers, often not familiar with or interested in experimental
software, had to formulate wishes or give judgements on the performances of the tool.
The cooperation between these three groups was not always smooth and
miscommunications could not be avoided, in particular because of the short period
allowed for the project.
(2) Although the collaboratories are supported by IISH, it was an absolute
requirement that the pressure on the relatively small ICT department should be
minimal. Therefore, the platforms had to be installed on separate servers. Also, the
members of the collaboratories had to remain outside the IISH system with respect to
login and mail. The administration of the sites had to be performed in principle
entirely by the hub leaders. These demands presuppose a light, browser-independent
and user-friendly platform, in which the rights of users can be set easily by the hub
(3) Various collaboratories already had their own website and/or mailing list. This
implied that the hub leaders had to stimulate their colleagues to join the experiment,
while it was not clear that the new platform would definitely replace the old one. In
the meantime, the top priority for the leaders was that the members remained motivated
to contribute to the group’s research activities.
The surveyed groups all have in common that they consist of experts who join to
discuss and build datasets with historical information. However, they display a wide
variety in terms of sources, ways of handling the data from those sources and ways of
cooperating with one another. In this section, we will describe the collaboratories,
with an emphasis on the internal workflows. How is the interaction between the
members – and between the members and the hub ‘leader’ – organized? What are the
actual targets and time frames of the collaboratories? Is there leeway to change the
targets during the project? The following report on the organization and workflows of
the collaboratories involved is based on interviews held in November 2007, at the
start of the Hublab project.
(1) Global collaboratory on labor relations in the period 1500-2000
For the next ten years at least, the IISH is dedicated to pursuing the research strategy of
‘Global Labor History’. In this context, the Institute has taken the initiative to make a
worldwide inventory of labor relations at specified intervals in the period 1500-2000.
The project is carried out in cooperation with the Institut für Wirtschafts- und
Sozialgeschichte (WISO) of the University of Vienna. In the project, researchers from
almost all continents join to merge their data and expertise in order to reconstruct
(the development of) global labor relations. Currently, 22 persons are involved in this
project, which started on January 1, 2007 and will last until April 2009. The
project is supported by NWO Internationalisering Geesteswetenschappen. For this
report, we have interviewed the project leader prof dr Karin Hofmeester (IISH).
Aims. The goal of the project is to create a database in which historical occupational
censuses are ‘translated’ into predefined categories of labor relations. For each
‘sample year’ (1500, 1650, 1800, 1900 and 2000) and for each region/country the
researchers make an estimation of the quantitative importance of particular labour
relations. For what kind of ‘organization’ did people work and what was their position
within that ‘organization’? Thus, what part of the population was not active, what part
worked within households, what part worked for respectively the local community,
non-commercial organizations and commercial organizations? Within each category
a number of groups are distinguished: ‘free’ wage workers, indentured labourers, slaves
et cetera. The project director, prof dr Karin Hofmeester, is well aware that the
database will have a temporary character and will contain many missing values.
However, the main purpose of the database is to serve as a first step towards a much
larger international project, for which a grant application will be written after the
completion of the project. Another building block of that application is a parallel
project that investigates labor ideology and labor ethics. It is expected that the
current collaboratory will continue after April 2009 to prepare the grant application.
The website could serve as a permanent platform to present material and to exchange
ideas and knowledge on global labor relations as well as ideological notions
At the moment (November 2007), the website of the project consists of documents
(project description, codebook, list of participants et cetera). The discussion among
participants operates through a mailing list. In a first workshop in Amsterdam (May
2007) the targets were set and working definitions were discussed. The progress of
the project will be discussed at a second workshop in Vienna (March 2008).
Management. The collaboratory is managed by the project leader, prof dr Karin
Hofmeester. During the interview, she made it clear that the goals and planning of
the project were pre-defined and are (no longer) open for discussion within the
group. It is her task to monitor and control the planning; however, her means to do so
are rather limited. The participants have all volunteered to cooperate and are not
rewarded financially for their efforts. There are no sanctions for not living up to the
agreements. However, since the goal of this project is perceived as a temporary
product, defaulting by one or a few participants is not considered a serious problem.
Data collection. In order to translate the original source material on occupations into
the agreed format, the participants use a codebook. The codebook was the subject of
strong debate at the first workshop and still seems not to have found its final form.
Proposals for amendment are put forward through the mailing list. In the end,
however, Karin Hofmeester will decide whether and how the codebook is to be
changed. If necessary, changes in datafiles that have already been submitted will be
made at the end of the project. Of course, when an insurmountable problem arises, all
data may have to be changed during the course of the project.
Data submission. The project operates with Filezilla. The participants ‘ftp’ their data
and mail Hofmeester that their data have arrived. She unzips them, checks them and
uploads them in the participants’ directories.
Quality checks and peer review. As said, the members mail their zipped datafiles
(Access) to Karin Hofmeester. She checks whether the material meets the agreed
standards on file structure and annotation. Error-fraught material might be returned.
At this moment, it is not possible to check the content of the data itself. The
participants are seen as ‘the’ experts for their own region. Members are allowed to
look into each others’ subdirectories, but cannot alter anything. Discussion on each
other’s contribution is possible via the mailing list or by direct email. The latter
cannot be viewed by other members nor by the manager.
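
A check of this kind (does a submission meet the agreed standards on file structure and annotation?) lends itself to partial automation. The sketch below is only an illustration under stated assumptions: the field names are hypothetical and do not come from the project's codebook, and only the list of sample years is taken from the project description above. As in the manual procedure, it does not judge the content of the data itself.

```python
from typing import Dict, List

# Sample years as listed in the project description.
SAMPLE_YEARS = {1500, 1650, 1800, 1900, 2000}

# Hypothetical field names (assumption); the real codebook defines its own.
REQUIRED_FIELDS = ("region", "sample_year", "labour_relation", "source_note")


def check_record(record: Dict[str, object]) -> List[str]:
    """Return a list of structural problems found in one submitted record."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        # Treat absent and empty values alike as missing.
        if record.get(field) in (None, ""):
            problems.append(f"missing field: {field}")
    year = record.get("sample_year")
    if year is not None and year not in SAMPLE_YEARS:
        problems.append(f"unexpected sample year: {year}")
    return problems
```

An empty list means the record passes the structural check; a non-empty list could be returned to the contributor together with the file, in the same way that error-fraught material is returned now.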
Documentation. When collaboratory members have additional comments on their
submitted material, they are invited to make these comments in footnotes. When
someone wants or needs to deviate from the agreed format or codes, this has to be
documented in a separate file. References to the sources are put in textfiles and will
be collated in a central file.
Access to the data after the project. After the end of the project, the data will remain
available only to members of the collaboratory. At least, this is the case for the period
in which the new grant application will be prepared.
(2) History of Work
In the social and historical sciences, occupational titles and classifications of
occupational positions are very frequently used to indicate and measure differences
in social class or status. However, these differences are difficult to measure across
great geographical distances or periods in time. On the basis of the occupational title
alone, it is often difficult to say whether a particular job entailed supervisory tasks, or
whether a job was salaried or self-employed. Therefore, it is difficult to employ
historical occupational titles in internationally comparative research. To make
worldwide comparisons of labour relations, social stratification, social mobility and a
host of other social phenomena, a harmonization of titles is a necessary precondition.
A very influential effort in that direction is made by the ‘History of Work Information
System’. This hub functions as the central meeting place of researchers who work
with (historical) titles and is also serving as the virtual repository of their datasets.
The website contains occupational titles from all
countries and periods, standardized codes for the professional activities (HISCO, a
historical version of ISCO, which is developed by the International Labour
Organisation) and images of all forms of labour. At the moment, the coded titles
pertain mainly to Western Europe and North America, but the interest from Asia,
Latin America and Russia is growing and researchers from these areas are starting to
contribute to the ‘hub’. For this report, we have interviewed the project leader prof dr
Marco van Leeuwen (IISH and University of Utrecht) and drs Richard Zijdeman
(PhD student UU).
Aims. The History of Work ‘hub’ is the result of a long standing (more than fifteen
years) cooperation between professor Van Leeuwen and various other experts. In this
cooperation, the experts systematically and cumulatively build a central coding
system of all occupational titles in the world. The project can be termed ‘finished’
once all titles are coded and the codes are available to the research community via a
(semi-) automatic coding system. This user-interface is already available for various
countries and languages. For some countries, the work is nearly finished, but in
others it has not even begun properly. The occupational codes are themselves the
condition and crucial means to arrive at internationally accepted schemes of
historical class and status (respectively HISCLASS and HISCAM). The work on the
status stratification HISCAM is being financed by research grants (VIDI and NWO-
Internationalisering), that will expire in about a year. However, the targeted scheme
has already been developed. Apart from coding and classifying titles, the project
group collects images of historical activities and descriptions of all forms of work in
the past. In a sense, the website serves as a museum and library of historical
Management. Prof Van Leeuwen has the task to control the project, in terms of
targets and planning. However, this role has to be put into perspective since
participation is voluntary and pressure on people may be counterproductive. In his
view, it is therefore not always wise to bother people with intermediate goals and time
schedules. He sees his task as directing the field towards filling up existing lacunae.
This is complicated further by the fact that there exists no complete overview of the
extent of work still to be done. There exists no complete inventory of all sources
containing occupational titles in the world. In running the project, Van Leeuwen
works closely with Dr Maas (UU) and drs Zijdeman (UU). The project leans rather
heavily on their personal commitment, also in the sense that continuity in the future
is not (yet) institutionally secured. On the other hand, the IISH has committed itself
to continued support of the facility.
Data collection. There is no clearly circumscribed group of participants in this project.
Generally, experts are being invited via mail or call for papers for conferences or
edited volumes to code the occupational titles in their region on the basis of the
HISCO codebook or the online semi-automatic coding system. Apart from that, the
project leader(s) offer their assistance, by giving advice and/or checking the
submitted datasets. Actually, the mail address on the site is linked to Van Leeuwen’s
personal address, which means that in practice all correspondence goes through his
Data submission. The website is not very clear on how to submit datasets. In
principle, Excel files are preferred, but the data tend to arrive in different formats
that have to be converted by hand. Also, the number of records is stretching the limits
of Excel. The data are all submitted through email. The Excel-sheets are eventually
exported to a MySQL Database.
Quality checks and peer review. The submitted datafiles are visually checked by Van
Leeuwen, Maas and Zijdeman on the correct application of HISCO codes and where
necessary corrected. In case of doubt, they reach consensus among themselves on how
to deal with specific issues. They would welcome a form of automation of this
procedure, as well as making it more a collective concern (via a peer review system).
Currently, there is a considerable backlog of foreign datasets to be processed. Most
ideal would be an automatic coding system that links the data in submitted files to the
central database and returns them with the correct HISCO codes. As for peer-
reviewing: it is possible to attach the name of a given expert permanently to his/her
coded titles in the central database. This will enable a more efficient control and
possible change of titles coded by these experts. However, this might also discourage
people from participating. Currently, direct peer review is not possible. However, the link
with the original datasets is preserved and thus it is known who was responsible for
the coding. These original (Excel) files cannot be downloaded.
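
The automatic coding step wished for above (linking titles in a submitted file to the central database and returning them with codes) could look roughly like the following sketch. It is a simplification under stated assumptions: the codebook is represented as a plain dictionary instead of the MySQL database, matching is by exact title after normalising case and whitespace, and the codes in the usage note are placeholders, not real HISCO codes.

```python
from typing import Dict, List, Optional, Tuple


def normalise(title: str) -> str:
    """Lower-case a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def code_titles(titles: List[str],
                codebook: Dict[str, str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each submitted title with its code, or None if no match is found.

    `codebook` maps occupational titles to codes; it stands in for the
    central database (assumption).
    """
    lookup = {normalise(title): code for title, code in codebook.items()}
    return [(title, lookup.get(normalise(title))) for title in titles]
```

For example, with a placeholder codebook `{"Farmer": "X0001"}`, the call `code_titles(["  farmer ", "cooper"], codebook)` pairs the first title with `"X0001"` and the second with `None`. Titles that come back with `None` are exactly the ones that would still need a visual check by the project leaders, so the backlog is reduced to the unmatched cases.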
The HISCO coding itself is no (longer) subject to intense debate, contrary to the
extension of the project into global class and status-schemes. Generally, discussions
on coding have to do with differences in the interpretation of work by locality and
period. Although the project leaders welcome such discussions, they are also
reluctant to contextualise occupational codes (that is, make them place and period
specific), as that would diminish the value of HISCO, HISCLASS and HISCAM as
tools for comparative research. In the future, they do not expect drastic alterations in
the codes, which would imply changing the central database. The discussion on the
codes is all done through mail and is not visible for others.
Documentation. A detailed description of the sources is not required of researchers
who submit datafiles, but personal information and institutional affiliation is
generally provided. The credits for offering data are given in the field ‘provenance’,
and there is also a provenance section on the site. Remarks and other references to
the data are integrated in the database and can be retrieved per record.
(3) Towards a global history of life courses. Creating a network for the development
of data structures for standardized longitudinal historical data
This collaboratory is being constructed in the context of an NWO
internationalisation grant and is headed by Dr Kees Mandemakers (IISH). In this
international project, the managers of large databases in the field of historical
demography will develop a standard data model (both online and in workshops) for
longitudinal micro-demographic data with the aim of converting data from the
different databases into an interoperable file structure. Such a structure would
enhance the access and dissemination of the databases in question strongly, in
particular by allowing for international comparative research. Also, simplified
datastructures directed at specific fields of research will facilitate demonstration and
will increase the use by a new generation of researchers. The project runs from
January 2008 to June 2009. The project was initiated by Dr
Mandemakers (Historical Sample of the Netherlands) in close cooperation with Prof
dr George Alter (Interuniversity Consortium for Political and Social Research, Ann
Arbor, USA) and Prof dr Anders Brandström (Demographic Database Umeå, Sweden).
Their organizations contribute financially to the project. In fact, these three ‘leaders’
have set the main targets of the project, which is not open to further discussion in the
group at large. At the moment, working documents are being prepared to form the
basis for a workshop to be held on 1 May 2008 in Ann Arbor, which will be
attended by a core group of six important databases (from Europe, Japan and
Canada). A follow-up conference with more participants will be held on 22 October
2008 in Miami.
Aims. In the first stage of the project (January-March 2008), the initiative will be
made known to as many experts in this field as possible. People will be invited to join
the discussion group. In fact, already a lot of contacts with databases around the
world exist, since the project continues along the lines discussed in an earlier
workshop (March 2006). Subsequently, the project will work step by step towards an
ideal ‘intermediate structure’, by discussing drafts in workshops and in a discussion
list. The aim is to reach a consensus on a document describing a data model in which
one can upload selected information from the separate databases and that
restructures the complex dynamic information on individual life courses into a range
of relatively simple, comparable and well documented formats depending on the
research fields. Also, the project aims to collaborate in the writing of a grant proposal
to construct the software and documentation that implements the functional design
and data model laid out in the joint document.
The project will not assemble or disseminate data, but aims to remove the technical
reasons for not making complex longitudinal data publicly available in a comparative
The website of the collaboratory will be part of the IISH ‘cluster’ of sites, but will also
be placed in the more neutral environment of the International Commission for
Historical Demography. Actually, this site is
managed by Mandemakers and IISH as well. In the process, this public site will be
given more content and will hopefully attract more visitors. At this point, it is not
clear whether and how functionalities of the collaboratory platform can also be used
for this public site.
(4) Labour conflicts
This collaboratory is (like the one on Labour Relations) part of the IISH research
strategy of Global Labor History. In this case, the aim is to construct a standardized
collection of labour conflicts (strikes and lockouts). Labour unrest forms a good
indicator of tensions created by specific labour relations as well as their interaction
with conjunctural trends and societal developments. Already, the IISH has a database
containing standardized information on 16,000 Dutch strikes in the period 1372-
2006. The structure of this database will serve as the model to be adopted by the
collaboratory. This standard (based on the format of the International Labour
Organization) includes the number of workers involved in a specific action, the
duration of the action, the number of hours/days not worked, the number of
companies involved, the involvement of labor organizations, the demands of the
strikers, and coding according to international standards of occupational groups and
economic sector involved. Dr Sjaak van der Velden is the leader of this project, which
runs from October 1st, 2007 until October 1st, 2009. The project is sponsored by
KNAW (project Global hubs for global history). For this report, we have interviewed
Dr Van der Velden.
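The fields of the standard listed above can be pictured as a single record per conflict. The following is only a minimal sketch under stated assumptions: the class name, the field names, and the `country` and `year` fields are hypothetical and are not taken from the IISH database or the ILO format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LabourConflict:
    """One strike or lockout. All names here are illustrative assumptions."""
    country: str                                  # assumption: not named in the report
    year: int                                     # assumption: not named in the report
    kind: str                                     # "strike" or "lockout"
    workers_involved: Optional[int] = None
    duration_days: Optional[float] = None
    days_not_worked: Optional[float] = None
    companies_involved: Optional[int] = None
    labour_organizations: List[str] = field(default_factory=list)
    demands: List[str] = field(default_factory=list)
    occupation_code: Optional[str] = None         # code for the occupational group
    sector_code: Optional[str] = None             # code for the economic sector

    def missing_fields(self) -> List[str]:
        """Return the names of optional scalar fields that were left empty."""
        optional = [
            "workers_involved", "duration_days", "days_not_worked",
            "companies_involved", "occupation_code", "sector_code",
        ]
        return [name for name in optional if getattr(self, name) is None]


# A deliberately incomplete record, as a contributor with sparse sources might supply.
example = LabourConflict(country="NL", year=1903, kind="strike", workers_involved=1200)
```

Leaving most fields optional reflects the point made in the interview that the full format may be too detailed for many countries.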
Aims. Dr Van der Velden hopes to create a website that will function as the central
place for information and discussion on labor conflicts, and also as the central
repository for the standardized dataset. Clearly, he is not in a position to enforce the
standard for the intended dataset. In other words, if researchers offer their material
in another format, he will not refuse them. Also, since this is an effort to bring
together people on a voluntary basis, it is not feasible to draw up a detailed plan
with subtargets et cetera, let alone to impose sanctions for non-compliance. Basically, Van
der Velden is trying to motivate researchers in this field to join the project and to
meet in the spring at a workshop where the idea can be worked out in more detail.
Only then will it become clear if the group is motivated and able to work on a joint
task such as a publication or grant application. So far, the interest in the project is
encouraging. Already about 40 researchers have indicated their interest in this
Data collection. In the first stage of the project, Van der Velden wants to focus the
group on discussing the proposed codebook. This will be done through the mail and
through the discussion platform and will be completed at the targeted workshop (May
2008). Possibly, the codebook will have to be redefined at a later stage, but obviously,
Van der Velden hopes to avoid this.
Data submission. The data format proposed by Van der Velden is rather elaborate
and possibly too detailed for many countries. Again, he does not want to put up any
thresholds by demanding a specific format. Thus, he feels that other formats should
be allowed as well and sees as an example the hub on Prices and Wages,
in which many Excel files are collected that differ strongly
in structure. Currently, the files are submitted in Excel, but they will probably be
converted to a central relational database. It is not yet clear whether online data
entry should be enabled.
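The conversion of differently structured files into one central relational database could, for instance, be sketched as follows. This is an illustration only: the alias table, the column names, the table name `conflicts` and the file names are all hypothetical, and the rows are assumed to have been read from the Excel files already (as one dictionary per row).

```python
import sqlite3

# Hypothetical mapping from contributors' column headers to standard names.
COLUMN_ALIASES = {
    "workers": "workers_involved",
    "number of strikers": "workers_involved",
    "days": "duration_days",
    "duration": "duration_days",
}
STANDARD_COLUMNS = ["workers_involved", "duration_days"]


def normalize_row(row):
    """Rename known headers to standard names; unknown columns are dropped."""
    out = {name: None for name in STANDARD_COLUMNS}
    for header, value in row.items():
        key = header.strip().lower()
        standard = COLUMN_ALIASES.get(key, key)
        if standard in out:
            out[standard] = value
    return out


def load_rows(conn, source, rows):
    """Insert normalized rows, keeping the name of the submitted file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conflicts "
        "(source TEXT, workers_involved INTEGER, duration_days REAL)"
    )
    for row in rows:
        n = normalize_row(row)
        conn.execute(
            "INSERT INTO conflicts VALUES (?, ?, ?)",
            (source, n["workers_involved"], n["duration_days"]),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_rows(conn, "file_a.xls", [{"Workers": 300, "Days": 4}])
load_rows(conn, "file_b.xls", [{"Number of strikers": 50, "Duration": 1.5}])
```

Storing the source file name with every row keeps the individual, named contributions identifiable after conversion.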
Quality checks and peer review. Van der Velden will check the data and (probably)
convert them into the central file. However, it is not possible for him to check the
content of the submitted material. He aims to allow group-wide inspection of the
individual (and named) files. He supports discussion in the group but at the
same time fears a cluttered whole. The moderator will have to decide what shall be
open for discussion and what not.
Documentation. Van der Velden expects that each participant at least provides
references to the sources and describes how the strikes were collected.
Access to the data. At this moment, the continuation of the project after October
2009 is unclear. If the group finds this necessary (e.g. for joint publications or grant
applications), a temporary embargo may be laid on the assembled data.
Common to all four groups seems to be the lack of clearly specified workflows,
corresponding to the voluntary nature of the members’ contributions. Expectations
regarding members’ contributions are most explicit within the group on Labour
relations. In the next sections we report on the functionalities that the project leaders
found more or less desirable for their ‘hubs’.
3.3. Desired functionalities
Apart from internal organization and workflow, the interview with the hub leaders
focused on functionalities in a virtual research environment. The interview followed a
set of questions specified in Appendix 1. The answers for each collaboratory are
stored in the file ‘scorecard.xls’. Functionalities deemed important by most of the
hub leaders were:
1) the platform should be web-based, with guaranteed continuity
2) flexible setting of a variety of user rights by hub administrators
3) activities of users have to be visible in logs
4) a division on the site between a public and private part
5) integration with existing mailserver accounts
6) a webforum
7) a clear and flexible structure of directories
8) facilities for different data formats
9) version management
10) facilities for adding metadata, preferably automatic
11) flexible interface, to be adapted by users at wish
12) search options, both on metadata and content
13) users can change their own passwords
The following functionalities were considered less important:
1) user statistics
2) support for audio and video-conferencing
4) shared calendar
5) (automatic) visibility of individual contributions to joint datasets
6) interface in different languages
Finally, unimportant or even undesirable were:
1) importing or exporting data in xml format
2) registration of users of the public site; guestbook
3) instant messaging
4) online whiteboards
5) to-do lists and automatic alerts
6) synchronization of calendars with PDA’s
7) publication of blogs
8) online manipulation of data with plug-ins for SPSS and other programs
9) incorporation of RSS feeds
10) integration with search systems (Google, Picarta etc)
11) tagging metadata
12) security devices, such as encryption
The priorities reflect the voluntary character of most collaboratories. Functionalities
that may improve a workflow (to-do lists, integration with PDA’s), probably do not
match the egalitarian character of these groups. In addition, several hub leaders are
wary of ‘high tech’ solutions given their own dexterity (or lack of it) and their
expectations regarding their members. However, for the Guilds group that joined the
pilot project in January 2008, sophisticated functionalities such as online
manipulation of a central database and integrated mail were highly important.
4. Comparing collaborative software
In the comparison of Sharepoint, Sakai and Liferay by ICT staff, two sets of issues
were crucial. First, to what extent does the software meet the requirements indicated
by the collaboratory leaders? Second, what does the platform imply for the ICT
department itself: how quickly can it be installed, how efficient is the support by the
user community or the company providing the program, what is the fit between the
software and the experience and knowledge of developers at the Institute? In
Appendix 2 the results of the ICT-test are presented (in Dutch). Here, we can only
briefly summarize the outcomes.
Sakai was seen as a user-friendly, simple program. However, its use for collaborative
purposes seemed limited, probably because it was designed for the interaction
between teachers and students. This implies that a number of functionalities would
have to be developed. However, this is not supported by the creators of the software. In
case of upgrading, no guarantee is given that newly developed applications will still
function. In addition, Sakai has a relatively small community of developers and users
working on improving and developing the product. The first release was in 2005 and
although, for instance, whiteboards and blogs are added, the main target seems to be
the educational sector. Adding functionalities would imply a heavy investment from
the Institute’s ICT and DI departments, and would also require sustaining the
knowledge of specific programming languages. As many developers at the Institute
work on temporary contracts, this cannot be guaranteed. The alternative, a detailed
documentation for developers, is very time-consuming. In this case, the costs to
ensure the stability and safety of the program after adding patches are relatively high.
Renewed ‘building’ and testing of the package takes too much time.
Sharepoint proved to be the most complete environment, with guaranteed support
and patches. It looks familiar to people used to Microsoft products, and making new
web-parts via ‘Sharepoint designer’ is unproblematic. However, adding functionality
to existing Sharepoint web-parts is not possible. The program would tax the ICT
capacities within the Institute heavily, it would take a long time to implement, and
heavy training for administrators would be necessary. Finally, the vendor lock-in is
mentioned as a problem.
The open-source package Liferay was considered easy to implement, with sufficient
support from the extended Liferay community, easy to manage by the collaboratory
administrators and with sufficient options to store and share data. The Java ‘engine’
makes it possible to run Liferay on several operating platforms. The Service
Oriented Architecture makes it easy to build and add new functionalities. Already,
many plug-ins are available. During the test period, it was not possible to inspect
Liferay’s performance when combined with the open source content management
system Alfresco. Apparently, the announced cooperation between both is still in a
preliminary stage. Nevertheless, Liferay in itself already met most of the
requirements specified by the users and the ICT department, and it was therefore
decided to build the pilot environment on this system.
5. Installation and test of Liferay
The hub leaders had specified their preferences for a platform with flexible
rights management; a webforum; a good directory structure for data, documents and
discussions; version management; storing of metadata, preferably automatic;
visibility of intellectual property of data; and proper search facilities. Integration
with the existing mail system was also mentioned. For a single group (Guilds), tracking of
changes within datasets and online work on a central database were a high priority. In
retrospect, these wishes can be divided into two groups:
1) Wishes that could be implemented easily: rights management, simple directory
structure, webforum, version control and search facilities
2) Wishes that proved beyond the budget, the technical options of Liferay, or the
period allowed for this pilot: visibility of intellectual property; online adding to
datasets; tracking changes in datasets; integration with email and automatic storage
of metadata.
In principle, the latter functionalities could be developed in due course. However, to
some extent these wishes reflect a lack of knowledge of web-based environments as
well as unfounded expectations with respect to collaboratory members’ use of the
platform. Thus, further enrichment of the platform would require a renewed
inventory of the collaboratory leaders’ wishes.
An integral part of the installation of the test platform was the production of a User’s
Guide, added to this document as Appendix 3. The illustration below shows the portal
of the collaboratory site. One can visit each collaboratory’s public pages, or one can
login as a member and visit the private pages of the sites.
Liferay turned out not to be without problems of its own. A first serious problem was
that the platform did not function on a proxy server, which was considered necessary
to guarantee security. This problem, which could be fixed rather easily, frustrated the
demonstrations planned by two groups at an international conference in late
February 2008. The second ‘bug’, in version 4.4.2 of Liferay, meant that internal
links became confused when new pages were added. Therefore, several administrators
were hesitant to organize their sites and waited until the problem was solved with the
release of version 5.0.1. Clearly, this limited the period in which the environments
could be tested even further.
Appendix 4 offers a more detailed, technical description of the installation, support
and maintenance requirements of Liferay (in Dutch).
6. User experience of Liferay
Due to minor problems in the ICT department and the server problem mentioned
above (see also section 8), the proper installation of the environment by the
administrators started only in early March. Some groups waited until the release of version
5.0.1 in early April. All in all, the delays meant that the platforms
were hardly tested in a proper research environment. The general forum that
allowed reporting and discussion on bugs and new features was used quite intensively
for some time. In fact, a large number of the problems that were mentioned here were
solved during the pilot project. Integration with existing mail systems, however,
turned out to be too costly.
Responses to the questionnaire
In May, we sent out a questionnaire to make an inventory of the user experience with
Liferay (Appendix 5). The forms that were filled in and returned make it clear that
feedback from the users was not (yet) available. Thus, the experience described and
analyzed in this section is based on the administrators’ comments only.
Question 1 of the questionnaire related to Adding new members to the groups.
‘Currently, members are admitted first on the general list of users of the portal. Either
they can do this themselves or the administrators sign them on. Subsequently, they
can request permission to join a collaboratory, or the administrator can do so
In general, the administrators indicated satisfaction with the functionality, in
particular the ability to define roles, although complaints were voiced with respect to
not being able to change user data. It was also suggested that the system should
‘recognize’ registered users, making requests to join a second collaboratory more
simple. Finally, the notifications of requests to join a group are presently very
Question 2a was related to general Functionality with respect to rights. ‘The default
rights structure is as follows: administrators have all rights on all files as well as all
portlets. The members, on the other hand, can only modify or remove their own files.
They can only add new structures to folder and forums within the settings created by
This system was definitely appreciated, in particular the possibility to change these
defaults when required.
Question 2b specified: ‘Administrators can change the default rights (view, down-
and upload) of the members. This is done via update associations / update permissions,
but it has to be done for each folder or document separately.’
In this respect, the users requested a more generic solution, by changing the role of
Question 2c: ‘Users of the private pages of the platform can only set rights on their
own documents or folders.’
According to the administrators, this is how it should be done.
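The default rights structure described in questions 2a to 2c can be summarized in a few lines. This is an illustrative model of the rule as stated in the questionnaire, not Liferay's actual permission API; the class, function and role names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    name: str
    owner: str  # user name of the member who uploaded the file


def can_modify(user: str, role: str, doc: Document) -> bool:
    """Default rule: administrators may change everything,
    members may only modify or remove their own files."""
    if role == "administrator":
        return True
    if role == "member":
        return doc.owner == user
    # Assumption: guests and unknown roles get no write access.
    return False


codebook = Document(name="codebook_v2.doc", owner="alice")
```

With this rule, `can_modify("alice", "member", codebook)` is true, while another member, say `"bob"`, is refused unless an administrator changes the defaults.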
Question 3 went into Document management: ‘Liferay allows you to create folders, to
up- and download files, documents and images, to make annotations and to manage
One administrator had good experiences with making separate folders for each
member, which allowed for setting rights for these members simultaneously. Version
management was considered very handy. Up- and downloading of data proved
efficient. Problems mentioned had to do with browsers – not everything worked
properly in Firefox. Also, someone mentioned that adding metadata directly to
images is not possible; the metadata are stored in a separate directory.
Question 4 was related to Search facilities: ‘In the Liferay platform, we have installed
several search functionalities: On (text in) documents, on (content of) messages, on
users. Do they meet your requirements?’
Most of the respondents admitted to not having tested this facility and one of them
had noted that they did not function properly: search results did not match the
searched items. One respondent was pleased with the option to add tags to messages,
which increased the search functionality.
Question 5 dealt with Metadata: ‘Relatively few metadata are attached to documents
and files: document name, version, size, and file format. Are you satisfied with this
functionality or should it be extended?’
Most of the administrators had not used it. One commented on the unsatisfactory
integration with version management. Versions are only visible after clicking
“actions”, view / edit. Not all versions are visible. And when someone uploads a
wrong version and removes it, all versions of all files are removed, including all
Question 6 was on communication: ‘Because integration with the email system is
currently not possible, communication, and to some extent out-bound mail as well,
operates through the forum and the option to comment on separate pages. What is
your opinion on these options?’
This essential element of the platform was not appreciated by the hub leaders. Most
of them still use email or mailing lists and do not see the (subscription of users to) a
forum as a proper alternative.
Question 7 pertained to the calendar: ‘On the page ‘events’ a calendar has been
installed. Is this useful?’
Some administrators saw no use for this tool, whereas others pleaded for a longer
time period to be shown (currently, it focuses on the present day). One administrator
hoped that people would be able to subscribe to the calendar, and integrate it into
their Outlook/iCalendar/Sunbird/Google calendars.
Question 8 had to do with Content management of the website: ‘As administrators
you can extend your own webpages, change the content, add images and links et
This functionality was appreciated, although some complained about its limited user-
friendliness. A more experienced user noted that it functions much better than in
many other CMSs.
Question 9 dealt with General functionality: ‘The platform allows you to add new
portlets. Have you done this? Have you missed specific functionality during the actual
Most administrators mentioned here the lack of an integrated mail functionality. One
administrator has added the option to post comments directly on each specific page
of the platform.
Question 10 was on Ease of use: ‘An important criterion for the choice of a platform
has been ease of use: clear buttons, intuitive design of the site, good explanation
where necessary. In short: the ‘look and feel’.’
Most administrators were not positive with respect to the ‘look and feel’. They
mentioned strange names and unexpected functions for buttons, and difficulties with
creating navigation links and menus in the left-hand part of the screens. One user,
however, was very satisfied with the freedom to design the portal, add portlets at will et
Finally, question 11 was about the ‘Helpdesk’: ‘Frans and Luciën are subscribers to
the category Guest>private pages>make feature request and report bugs in order to
respond to questions and problems. Has this worked well and should it be continued?
A user guide can be found on the messageboard. Is the guide satisfactory? What
points deserve more attention?’
In general the service provided was considered very helpful. The user manual should
be extended and updated with respect to the setting of rights, creating menus and
customizing navigation. One administrator thought the guide should be
supplemented by videos demonstrating how to perform certain tasks.
How can we evaluate the ‘success’ of this pilot? The first goal, to create a workable
virtual research environment, has certainly been achieved. The second goal, to make
this environment into the lively meeting place and virtual laboratory of peers, is
proving more difficult to reach. As said, ordinary users have hardly used or been able
to use the platform, apart from the members of Labor Relations. Most administrators
admit that their members haven’t used the facilities, but they differ strongly in their
expectations on future use. Most of them seem willing to put more energy in urging
members to use the site, particularly when facilities such as email (both in- and
outbound) are added. Others however feel that this platform has no added value to
what they already offer: a site with downloadable data and a mailing list. One leader
claims that the momentum to motivate members was lost when the live
To understand these different reactions, we have to take into account some of the
institutional factors discussed in section 3. Firstly, the hub leaders themselves
became involved in the project in various ways. Some of them were urged
rather strongly to join the pilot, although they saw relatively few merits from the
outset (Labour conflicts). Others, on the other hand, had high expectations of the
collaborative software, which could not always be realized within the pilot (Guilds).
Clearly, the level of motivation of the leader has an impact on how he/she designed the
platform and, most importantly, stimulated members to use the environment.
Secondly, some of the collaboratories have already existed for some time (in the case of
History of Work/HISCO many years), others are new. In the case of new
collaboratories, leaders may be hesitant to expose their group to ‘experiments’, which
may weaken their own credibility and position. On the other hand, leaders of settled
groups need to motivate their group to change their routines.
Thirdly, it can be assumed that the more important workflows (of data to be inspected, or
documents to be discussed) are for the working of a group, the more
important a platform can be for the group. Functions such as a repository for data and
documents or a mere platform for exchanging ideas seem insufficient to motivate
members of a research team to participate intensely.
Finally, the success of the platform seems to depend to some extent on live
demonstrations. The planned demonstrations for HisWork and Guilds of the Liferay
tools in Lisbon in February 2008 failed due to the bug we already mentioned.
Successful demonstrations were held in March in Vienna (Labour Relations) and at the end
of April in Ann Arbor (Life Courses). The Labour Conflicts collaboratory will meet in
August for the first time. In the schedule below some of these aspects are
summarized. Again, the ‘success’ of the platform in terms of intensive use by the
groups cannot be determined at this stage. The final column indicates the present
(June 2008) situation of the Liferay platform, after having been operative for two
7. Lessons learned and prospects for future research
What can we learn from this short-term pilot of collaborative software to be tested by
researchers in their daily work?
1) Firstly, the pilot has demonstrated that communication between researchers and
IT developers demands more time and attention than anticipated. Researchers
(perhaps particularly in the humanities) tend to have unrealistic expectations of
applications, with respect to what they can achieve and the time in which they can be
developed. Conversely, developers (perhaps particularly those with a commercial
background) have difficulty grasping the division of labor within collaboratories.
Working groups of researchers are based on mutual trust and voluntariness. Tools
that may function well to speed up productivity within companies (workflow
management, integrated calendars) may have a contrary effect on collaboratories, as
they suggest a hierarchy of functions and lack of trust in timely contributions.
2) Secondly, although Liferay was specifically selected for its user-friendliness, even
the most motivated administrators became disheartened when navigation turned out
to be cumbersome, button labels were unclear, setting of rights was somewhat
complex, online Content Management proved more difficult than what one was used
to et cetera. In short, the product was not immediately attractive and intuitive.
Although most of these problems could be solved within the pilot project, a number
of leaders became hesitant to promote Liferay in their groups, although in practice
common members did not experience most of these problems which are related to the
administrator’s tasks. We can only conclude that more time and energy should have
been spent to make the Liferay platform more intuitively appealing for the user.
3) Live demonstrations seem vital for the acceptance of the platform, but they failed
in a number of groups, for several reasons. Obviously, the budget did not allow us to
convene these international groups specifically to demonstrate the platform.
Therefore, the demonstrations were planned to coincide with already organized
meetings, but this was not always possible. Apparently, the user guide in itself did not
sufficiently stimulate users to try out the program. As an alternative we are now thinking of
making video demonstrations for users with different roles. In fact, one of our
administrators has already made such a demonstration for the task ‘adding events to
4) As a general conclusion we can say that (new) collaboratories (in the humanities)
seem to be a rather vulnerable environment for testing new software, in particular in
such a short period. Furthermore, scientific group activities cannot be ‘rushed’: each
group has its own planning of activities and pace of work, which cannot be changed
for a pilot project. However, it has to be admitted that the pilot project functioned in
a sense as a ‘pressure cooker’, speeding up processes and decisions that otherwise
might have stalled.
The project has resulted in two kinds of knowledge. First, we have gained technical
expertise on collaborative software, not only by installing and improving the Liferay
package but by learning from the other groups in the SURF tender as well. For the
time being, the groups will continue using Liferay. In fact, a new group has been
added recently, a collaboratory on Migrant Organizations. Probably, a definite
decision regarding the continuation of this platform will be taken in the final stage of
the Global hubs project (December 2009). Second, we have gained a better
understanding of the interaction between the dynamics of research teams and the
(potential) role of communication software. This will help us to implement IISH’s
strategy to support and create data-hubs based on collaborative research. One
example of this is the European project CLIO-INFRA. In addition,
this kind of knowledge is integral to the KNAW research project ‘Socio-technological
aspects of cooperative research in social and economic history’ (S. Dormans). This
project aims to determine ‘best practices’ in history, but with relevance for research in
the much wider field of the humanities. Dissemination of these two kinds of
knowledge is foreseen in several ways. In the field of history, several presentations
(among others a session at the 15th World Economic History Congress in Utrecht, 3-7 August
2009) and publications are planned. In the field of e-science, several
demonstrations and presentations are already planned: at the summer school in Amsterdam
of the University of Washington (August 2008), the joint EASST and 4S conference
(Rotterdam, August 2008) and the Oxford e-Research conference (September 2008).
Finally, pending a decision on the continuation with Liferay, the proposed webpage
describing HubLab activities will be created.
For the IISH, future research on collaborative platforms – possibly within the context
of a new SURF tender project – will have to match the specific characteristics of our
research as well as to address key issues discussed in this report. With respect to the
first, all issues concerning data-sharing are of interest to us. How is intellectual
property of data to be understood, how can it be made visible? What kind of license
structure matches on the one hand the way data are gathered within collaboratories,
and on the other hand the way in which upgraded data are made available to the
wider public? How useful are the Creative Commons licenses in this respect? How
can the software support clear documentation, annotation and version control of
shared datasets? With respect to the latter, we need more research into user
motivation to make use of collaborative tools: to what extent is integrated email
necessary and possible, what is the added value of demonstration videos, what makes
a product immediately appealing to lay users?
8. Project evaluation and financial report
Planned and realized deliverables
In this section we recall and evaluate the original planning of the project as outlined
in the controlling document. In the first stage, the project went ahead as planned.
Deliverable 1, the inventory of workflows and desired platform functionalities of the
‘hub’ leaders, was ready by late November 2007 as planned, and has been put online
as scorecard Def.xls and Designing a platform for
collaboratories.pdf (Jan Kok and Frans de Liagre Böhl).
Planned deliverables and time table.
The second stage was the test by ICT staff of the relative merits of Sakai,
Sharepoint and Liferay in terms of technical specifications, anticipated learning curve
for administrators and users, open source versus company-based software, and match with
the requests of the hub leaders. This test was completed by mid-December, ahead of
schedule. Deliverable 2 was presented in the form of an Excel sheet (technische
test.xls), summarized in the Technische Rapportage Hublab project.pdf (see the SURF
tender site of Hublab and Appendix 2).
In early January, the selected platform Liferay was installed for five test sites, on the
basis of an (internal) configuration management procedure (WP3). However, due to
overburdening of the ICT department and the pressure caused by an international
application for the collaboratory project, WP3 was delayed
by several weeks. In addition to the original plan, a user guide,
‘Collaboratory portal IISH User Guide’, was produced by Frans de Liagre Böhl (see
Appendix 3). Deliverable 3, the actual test platform, was ready by late February.
The actual test of Liferay took place from March to May 2008. As mentioned earlier, an
error in Liferay frustrated logging in from outside the institute, which made it
impossible to demonstrate the platform as planned by several collaboratories at the
European Social Science History Conference, 26 February-1 March 2008 in Lisbon.
Successful demonstrations were held in Vienna in March 2008 and in Ann Arbor (USA) in
April 2008. Some further delay, however, was caused by a second software problem
that confused links to added pages in Liferay.
Deliverable 4, the inventory of user experiences (based on the questionnaire in
Appendix 5) was completed in early June, and discussed at the SURF tender meeting
on June 17th (see presentatieSURF3.ppt on the site). The findings have been
summarized in section 6 of this report.
Deliverable 5, a website with pages introducing the project, has not been realized yet,
pending the internal evaluation of Liferay as a tool for the Institute’s collaboratories.
However, the Liferay platform has extensive public pages for
guests which function as an introduction to the project.
Finally, Deliverable 6, an interim and a final report by the project leader, was
produced according to plan.
In the controlling document we have specified the anticipated workload per person
for each Work Package. Obviously, in the course of the project all kinds of changes
occurred, e.g. in terms of personnel involved. As the contribution of the Guilds
project (TDM, Tine de Moor) was not included in the controlling document, we leave
it outside the financial grand total. The same applies to the contribution of drs
Zijdeman, who has done much work for the HISCO site.
We will describe here the most important changes and adaptations with respect to the
original plan. What we see is that the input in hours from the hub leaders (KMA, SVV,
MVL and KHO) was less than anticipated. This has to do with the fact that they had
less time to actually experiment with the platform and to discuss findings with their
members than planned, due to delays described in previous sections. The technical
problems encountered in the beginning resulted in much more input from the ICT
department (MMI: 155 versus 57) than expected. Installing the platform for five
collaboratories, producing a user manual, answering questions and reports, and
developing improvements for the interface meant a huge increase in hours for staff
members of Digital Infrastructures (FLB 210 versus 120 hours and LVW 127 versus
80 hours). In fact, to support this crucial element of the project and to keep the
budget within limits, the decision was made to find alternative funding for travels to
international collaboratory meetings. Finally, the overall management by the project
leader was slightly more than planned (178 vs 158 hours). In his case, the work
continued well into June (preparing for the SURF meeting of June 17 and writing the
final report).
Final financial report
p/p planned Hours worked up to 1.7.08
TWE 4 8 60
JEG 4 5 56
JIB 32 32
RZI 56 32
TdM 10 50
Costs of writing the application are excluded.
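The planned and actual hours quoted in the text above (MMI, FLB, LVW and the project leader) can be compared in a few lines. This is only an illustrative sketch: the figures are taken from the text, while the function name `overrun` and the rounding to whole percentages are assumptions made for this example.

```python
# Planned and actual hours as quoted in the text: code -> (planned, actual).
hours = {
    "MMI": (57, 155),
    "FLB": (120, 210),
    "LVW": (80, 127),
    "project leader": (158, 178),
}


def overrun(planned, actual):
    """Return the extra hours and the overrun as a rounded percentage of the plan."""
    extra = actual - planned
    return extra, round(100 * extra / planned)


report = {who: overrun(planned, actual) for who, (planned, actual) in hours.items()}
for who, (extra, pct) in report.items():
    print(f"{who}: +{extra} hours ({pct}% over plan)")
```

Read this way, the largest relative overrun lies with the ICT department (MMI), which matches the explanation that the early technical problems demanded far more support than foreseen.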
A short experiment with new communication software in an environment of
emerging collaboratories in the humanities is not without risks. On the one hand, in
the short period in which the test has to take place there may simply be too little interaction
between group members to test the product properly. On the other hand, the
tool may not have reached a stage in its production that motivates and stimulates
researchers to work with it. In the case of Hublab, both problems occurred to some
extent. However, the effect was mitigated because we had installed platforms in no less
than five collaboratories, ensuring that, overall, the software was tested on most of its
functionalities. Moreover, although the learning curve for the Liferay tool for
administrators (hub leaders) was steeper than anticipated, it seems without problems
for common members of collaboratories. The latter, however, still awaits further
Having different collaboratories working with the tool allowed us to connect the
different institutional settings, workflows and group dynamics with the reception of
the Liferay tool. Which groups are more motivated than others and why is that?
Although the voluntariness of all collaborative research has to be emphasized
strongly, some groups have a more explicit workflow than others. Also, live
demonstrations seem a crucial precondition for a successful implementation. We will
continue this line of research in the context of Stefan Dorman’s project ‘Socio-
technological aspects of cooperative research in social and economic history’ (Virtual
Further efforts to improve the platform will concentrate on three topics. First, further
improvement of the ‘look and feel’, e.g. by adding demonstration videos, better
navigation, unequivocal button labels, et cetera. Second, improvement of
communication functionalities, in particular integration with email. Last, working on
various aspects related to data-sharing: version management, a licence structure,
intellectual property rights and possibly online manipulation of data.
Appendix 1. Interview on current workflows and desired
functionalities of the collaboratory platform.
In the interviews, the first part was devoted to gaining insight into the aims, planning and current
practices of the collaboratories, and expectations regarding cooperation and communication.
The questions involved are subsumed under A. In the second part, more specific questions
regarding functionalities of supporting software were raised (B).
1.1 Management
1. What is the final aim of the project?
2. What is the planning in terms of intermediate goals?
3. Can these intermediate goals be modified in discussion with the participants?
4. Is there leeway vis-à-vis the sponsors to modify the project’s targets during the project?
5. Is there a specific planning?
6. Who monitors the planning?
7. Is it possible to change the planning in discussion with the participants?
8. Is it possible to discuss change of planning with eventual sponsors? Is the planning tied to
9. Are there sanctions for not reaching (intermediate) goals in time?
Quality and progress control
How is the quality of the results assessed (by peer reviewing or only through a central
evaluation of the input)?
10. Who monitors progress and how? Are there sanctions for not observing agreements?
11. What happens once the central goal has been achieved: are all results made public, or only
specific results? Are the data placed under an embargo?
12. Who manages the final results? Does one anticipate permanent supplementing and
correcting of data and how is that to be conducted?
1.2 Handling individual data
1. Does one work with a central codebook to build the final product and is this codebook
the subject of internal discussion?
a. If so, is that discussion limited to a specific period?
b. How is that discussion organized, who collects and assesses the input?
c. How is the synthesizing conclusion communicated to the participants?
d. If the codebook were to change during the course of the project, does this
imply that already collected data will have to undergo changes as well?
2. If there is no codebook, how is the cohesion between the data submitted to the hub
1.3 Building the central database
1. How do the contributors actually submit data?
2. How are the individual contributions imported into the central database?
3. Who manages the central database?
4. Can all members inspect each other’s contributions (immediately)?
5. When and how is it possible to comment on each other’s contributions?
6. Is that internal discussion visible to outsiders as well?
7. How is central control of the contributions organized? By checking for compliance to
the codebook, by checking for internal consistency, are there procedures for finding
errors? Is it possible to locate missing data?
8. How is the feedback to the contributors organized?
9. If the data seem not fit to be put into the central database (see 7), who will
1.4 Producing documentation
1. What is expected of individual contributions with regard to references to source
material, explanation of procedures (if diverging from joint procedures), estimation
methods (if diverging from agreed methods), interpretation et cetera?
2. Are those remarks integrated in the database or submitted in separate text files?
3. Are the remarks summarized in central texts?
2. Expectations regarding the collaboratory platform
Indicate the desirability of each functionality on a scale of 1) not particularly important to 3)
1. The environment has to be web-based, that is, only a computer with internet access
and a browser is needed to gain access to the collaboratory platform.
2. The platform has to allow for developing simple applications or modules within the
community environment. Thus, RAD/Scripting has to be supported.
3. Is a proprietary script language acceptable or do you prefer standard languages
4. How important is ‘proven technology’ for you?
5. The API of the environment needs to be specified clearly.
6. Should it be possible that data from the hub can be offered in xml-format to other systems?
Should it be possible that data can be imported from other systems in xml-format into the
7. The firm/organisation supplying the platform needs to guarantee continuity of the
8. Is management able to set standards for software use?
1. Should there be a central interface only accessible to the hub manager?
2. Should one be able to create new users
3. Should user statistics be kept of visitors of the public website
4. Should user statistics be kept of who is online in the collab
5. Should actions of collab-members be logged?
6. Should there be logs of the manipulation of public data
7. Should there be logs of the manipulation of private data
8. Should authorization be given by means of roles, not individual users
9. Should everything be accessible to all members of the collab
10. Should there be a clear separation between private webpages and collaboratory pages?
11. Should there be a clear separation between private webpages, collaboratory pages and the
12. Should the manager be enabled to overrule rights set by users?
13. Do the participants in the collaboratory need to retain control on their own contributions
14. Complete control: the contribution can only be read by the author
15. Shared across the collaboratory: the contribution is shared by all members of the
16. ad hoc: the contributor decides each time with whom the contribution can be shared
17. open: the contribution is visible to the general public
18. closed: the contributor loses all rights after uploading.
19. Should users be allowed to change passwords?
20. What kind of rights should be set on (various) objects (read, write, update)?
21. Suppose a file is not meant for reading. Should it still be found by an (internal) search
22. Should users of the public site be registered
23. Should there be a content management system for the public site?
24. Indicate the number of hours per week that you as leader of the hub can spend on
managing the site (allowing access to new users, setting rights, creating workflows etc)
1. Is webmail needed?
2. Should it be possible to link this with existing accounts on the server?
3. Address book?
4. Should there be a mailing list
5. Public lists
6. Personal lists
7. Import mailinglists
8. Integration with MS Outlook
9. Integration with mail software
10. Is a webforum needed?
11. And instant messaging?
12 Should sessions be stored?
13 Is audio and/or videoconferencing desirable?
15 Online whiteboard
16 How detailed in terms of graphical options (diagrams, graphs etcetera)
17 Should one be able to make ToDo/Actionlists?
18 Should to-do lists be shared?
19 Granted to third parties?
20 Calendar functionality
21 Shared calendar
22 Planning of meetings
23 Integration with MS Outlook/Exchange or other (which one)
24 Synchronisation with PDA's
25 Automatic warnings
26 What kind of directory functionality is needed?
27 Public directories
28 Personal directories
29 Group directories
30. Document routing / workflow management options
Guestbook on public site
What actually is to be shared:
1. Text and spreadsheet files (e.g. as part of a dataset, documentation of sources, methods,
annotation of data or literature, research applications etcetera)
2 To be shared?
3 To be manipulated online? All types of files?
4 URLs/links (e.g. to literature)
5 To be shared?
6 Other file types
7 To be shared?
8 Is online reading/manipulation possible?
10 If online reading or writing is not possible, do you need plug-ins for these file types?
11 External databases (such as MS Access, MySQL)
12 Statistical/graphical software (SPSS)
13 Reference software (Endnote, Reference Manager)
14 Incorporation of RSS feeds?
15 To be shared?
16 Integration with search engines
21 Other, to wit …
E. Version control
1. How desirable is version management? In what form? One can think of an automatic
logging function that tracks all changes to a dataset, that archives the old versions, and that
sets version numbers.
2. How desirable is a functionality that tracks changes within a dataset? Should there be a
logbook with annotations of changed data?
1. What kinds of ‘administrative' metadata (author, day of creation, day of uploading, day of
last change, abstract) do you want to keep on the data collected within the collaboratory?
2. Is this to be done by hand or automatically?
3. Should it be possible to develop your own metadata scheme
4. Should the technical metadata (filesize, format type) be stored?
5. Should the metadata (e.g. authors) be protected through authorisation?
6. Should users be enabled to tag metadata?
7. Should those tags be shared?
8. Should they be individual
G Intellectual property
1. How important is it to make (and keep) visible individual contributions to the dataset?
What kind of citation of the data is foreseen? Do you want applications that allow for unique
digital identification of (parts of) the dataset? And should that be done automatically?
1. How desirable is the option of different languages for the interface (N.B. only of some
elements of the interface)?
2. How desirable is the option of changing the interface
5. Font sizes
6. How desirable is a search facility for the material collected in the collaboratory?
7. On metadata
8. Full text
9. Store search settings
1. Should scanning of material for viruses be installed?
2. Encryption on the line (SHTP, SSL)?
Technical report HUBSLAB project
(Digital Infrastructure subproject, IISG)
Prepared by: M. Mieldijk (on behalf of the test team)
Other members: Ole Kerpel
Luciën van Wouw
Date: 9 December 2007
1. Description of the aim of the DI subproject ‘Building and testing 3 different
3. Design of the test environment
4. Description of the test team's working method
7. Conclusion and recommendation
1. Aim of the DI subproject
The aim of this subproject is ‘to issue a recommendation for one of the three
The conditions set are:
• Testing of Sakai/Sharepoint/Liferay-Alfresco
• Maximum lead time until a choice is made: 2 weeks
Name: Jan Kok
Telephone: 020 6685866 or 020 8500483
Name: Frans de Liagre Böhl
Telephone: 020 6685866 or 020 8500421
Task: Consultant to the project leader
Name: Mario Mieldijk
Telephone: 020 6685866 or 020 8500403
Task: Subproject leader / point of contact for consultant & project leader
Installer of OS and platforms
Name: Mario Mieldijk
Telephone: 020 6685866 or 020 8500403
Task: ICT staff member for Sakai/Liferay
Name: Jip Borsje
Telephone: 020 6685866 or 020 8500403
Task: ICT staff member for Sharepoint
Name: Maarten Kroon
Telephone: 020 6685866 or 020 8500403
Task: External ICT staff member for Sakai
Name: Ole Kerpel
Telephone: 020 6685866 or 020 8500137
Task: Tester / developer
Name: Gordan Cupac
Telephone: 020 6685866 or 020 8500421
Task: Tester / developer
Name: Luciën van Wouw
Telephone: 020 6685866 or 020 8500421
Task: Tester / developer
The test environment consists of three virtual servers, of which Sakai and Liferay run
on a Linux operating system and SharePoint on a Windows 2003 operating system. These
virtual servers are placed on the current VMware ESX environment. Authentication for
the collaboratory systems is handled by the package or the operating system itself. Several
‘snapshots’ have been made of this virtual environment. These snapshots can be of use should
the environments stop functioning. We can then take a step back in time,
so that we can quickly resume testing. (Crisis management)
Graphical representation of the test setup
4. Description of the test team's working method
The test team consists of people with many years of experience in ICT. However,
none of us had previously worked with collaboration software such as
SAKAI/SHAREPOINT/LIFERAY. For the testers/developers this was
of course a ‘positive’ phenomenon: it allowed them to judge the packages better on
‘look and feel’. On the other hand, this ‘positive’ phenomenon was cancelled out because
everyone was thrown in at the ‘deep end’. In short, we faced a difficult task:
in a short time we had to build an environment and also give advice on it.
We therefore kept the design of our test plan as simple as possible:
- Estimate the collaboration software for planning purposes
- System administrators install the operating systems on the virtual machines
- System administrators install the collaboration software (best effort)
- Each tester/developer tests one package at length (about 18 hours) and keeps a
- Each tester/developer briefly tests a colleague's package as a cross-reference
(about 12 hours) and adjusts the scoreboard in consultation.
- The subproject leader writes a report of the findings.
Partly thanks to this simple design we were able, despite our limited experience and time,
to ensure that we would make a sound recommendation to the
B. What we did not test!
For reasons of time we chose not to include system-administration matters in the
tests or scoreboards. These are:
- Anti-spam / antivirus options
- Authentication is handled only via the package or via the authentication module
supplied with the operating system
We do consider these matters important during the pilot and/or implementation phase.
5. Account of hours and costs
Part A: Hours
Mmi 18 40 € 2000,00
Installation of extra memory in the current
Rbe 3 3 € 150,00
Installation/configuration of virtual machine
with Linux CentOS 4.x and Sakai
12 32 € 1400,00
Installation/configuration of virtual machine
with Linux CentOS 4.x and Liferay/Alfresco
12 13 € 2600,00
Installation/configuration of virtual machine
with Windows 2k3 and Sharepoint 2007
Jib 16 20 € 50,00
24 15 € 750,00
Consultation among all subproject members Mmi
30 35 € 1750,00
Testing of the Sakai environment Oke
30 18 € 900,00
Testing of Liferay/Alfresco Lwo
30 27 € 1350,00
Testing of Sharepoint Gcu
30 29 € 1450,00
Ordering hardware Mmi 0 1 € 50,00
Total: 205 233 € 12850,00
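As a quick arithmetic check on the table above, the figures can be summed per column. The sketch below assumes that the three numbers on each row are planned hours, actual hours and cost in euros; that reading of the columns is an assumption, since the column headers are not preserved in this copy.

```python
# Rows as printed in the table: (planned hours, actual hours, cost in euros).
# The column meanings are an assumption; the headers are missing in this copy.
rows = [
    (18, 40, 2000), (3, 3, 150), (12, 32, 1400), (12, 13, 2600),
    (16, 20, 50), (24, 15, 750), (30, 35, 1750), (30, 18, 900),
    (30, 27, 1350), (30, 29, 1450), (0, 1, 50),
]

planned_total = sum(row[0] for row in rows)
actual_total = sum(row[1] for row in rows)
cost_total = sum(row[2] for row in rows)

print(planned_total, actual_total, cost_total)
```

The two hour columns reproduce the stated totals of 205 and 233. The cost column as printed adds up to 12450 rather than the stated 12850, so at least one cost figure in this copy may be garbled; the sketch does not attempt to say which one.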
Part B: Costs