VIVO Semantic Technologies Workshop, March 2011

VIVO’s Semantic Extensibility for
Research Networking

Opportunities and Challenges


March 25, 2011

Jon Corson-Rikert and the VIVO Collaboration

First, it’s about the data


Consistent formatting, in a language of the Web


Self-describing


Via the ontology


Context associations inherent in the data


Distributed


Dereferenceable


Reusable without (or with) modification


Persistent independently of any application

VIVO is not just people, and not just a profiling system


Anything can become a type (and have
individuals)


All individuals are structured the same way


Inheritance


Varying property data attributes & relationships


Extend the ontology without modifying the app


Generality does not (yet) provide an optimal
interface

Semantics is more than tagging


Controlled vocabularies stay independent of the
data ontology


Multiple vocabularies can be used independently


Hierarchy can be used for query expansion (sketched after this slide)


Goals


Supplement more explicit relationships


Normalize research interests of faculty with the help
of relationships among controlled vocabularies
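
A rough sketch of the query-expansion idea: assuming research areas are tagged with concepts from a SKOS vocabulary, a single SPARQL 1.1 property path can match a concept plus all of its narrower descendants. The endpoint URL, concept URI, and use of vivo:hasResearchArea here are illustrative assumptions, not a prescribed VIVO configuration.

```python
# Sketch: SKOS hierarchy used for query expansion (assumptions noted above).
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX vivo: <http://vivoweb.org/ontology/core#>

SELECT DISTINCT ?person ?name WHERE {
  # Expand the search term to itself and every narrower concept.
  <http://example.org/vocab/immunology> skos:narrower* ?concept .
  ?person vivo:hasResearchArea ?concept ;
          rdfs:label ?name .
}
"""

endpoint = SPARQLWrapper("http://vivo.example.edu/sparql")  # hypothetical endpoint
endpoint.setQuery(QUERY)
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["name"]["value"])
```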

Semantic plumbing


SPARQL query language


Public endpoints


Query by relationships, not table structure


Linked open data


Can browse data without having to know the schema
or how to formulate a query (see the sketch after this slide)


Decide what can be integrated based on common
ontologies and/or shared vocabularies


Expanding tool sets


Triple stores, data conversion to RDF, search engines


Maturing fast
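
To make "browse without knowing the schema" concrete, here is a minimal sketch using Python's rdflib: dereference a resource URI and iterate over whatever triples the server returns. The individual URI is hypothetical.

```python
# Sketch: dereferencing a linked-data URI and walking the result.
from rdflib import Graph

g = Graph()
# A linked-data-aware server returns RDF describing this resource.
g.parse("http://vivo.example.edu/individual/n1234")  # hypothetical URI

# No prior schema knowledge needed -- just walk the triples.
for subject, predicate, obj in g:
    print(predicate, "->", obj)
```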

Semantic learning curve

Ontology interoperability


Several ontologies, not one


Ontologies often align with natural, fairly narrow
domains


Reuse existing ontologies, in whole or in part


Focus on faithful representation of the data you have


Avoid the need to create and maintain mappings


VIVO/eagle-i integration work to date


Direct import of each other's classes and properties


Shared import of BFO upper ontology


Foundational for long-term interoperability



Ontology-driven applications


Learning from the eagle-i and VIVO experience


Clean separation between the logical data
ontology and application management
ontologies


Access control and editing policies


Menu management


Display control


Contextual search support


Manage via the same interfaces

Open data and application ecosystem

Ecosystem maturity

Interoperating with VIVO


At the RDF level


Produce or consume data compatible with the VIVO ontology (see the sketch after this slide)


Linked open data requests


SPARQL endpoints


Special “documents”


E.g., all the RDF about a person, including linked grants and pubs,
awards, education, etc.


Other parameterized reports
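
As a minimal sketch of producing VIVO-compatible RDF with Python's rdflib: the FOAF and VIVO core namespaces below are the real public ones, but the individual URI and literal values are invented for illustration.

```python
# Sketch: building a small VIVO-compatible RDF document.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

VIVO = Namespace("http://vivoweb.org/ontology/core#")
FOAF = Namespace("http://xmlns.com/foaf/0.1/")

g = Graph()
g.bind("vivo", VIVO)
g.bind("foaf", FOAF)

person = URIRef("http://vivo.example.edu/individual/jane-doe")  # hypothetical
g.add((person, RDF.type, FOAF.Person))
g.add((person, RDFS.label, Literal("Doe, Jane")))

# Serialize for exchange with a VIVO instance or a harvester.
print(g.serialize(format="turtle"))
```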


Much of the work of VIVO development has been to make it easier to modify and extend VIVO


Local ontology extensions


Clean MVC separation of logic from
presentation


Easier customization and branding



Adding functionality in VIVO

Inside or outside: app vs. data interoperability


Implications for the user experience


Close coupling vs. greater freedom


VIVO widgets work at Duke


Some things are already done well outside


E.g., private groups (Google Docs)


Why re-invent the fully-featured CMS?


Drupal


Joomla!


Know the limits of your data

Building the open source community


vivo-dev-all and vivo-imp-issues listservs


High participation, healthy response level from developers
and implementation partners inside and outside the
project


SourceForge


Merged SVN repository


Mailing list and forum archives


Wiki


Public Jira for issue tracking


Sample SPARQL queries, Harvester transforms and
workflows


Mini-grants demonstrating what can be done


Directions already being explored


Chinese Academy of Sciences knowledge portals


Australian ANDS-VIVO


Fulfilling national data registry requirements


Code and ontology extensions


Linkages to institutional and data repositories


NSF DataStaR and Data Conservancy projects


Independent development in the US


Wellspring


American Psychological Association

Mini-grants address key areas


Controlled vocabularies (Stony Brook)


Direct output to biosketches and CVs (Pittsburgh)


Re-use of VIVO data in standard web pages (Duke)


Connection to a second major content management system and the HUBzero scientific simulation and grid services platform (IU)


Author IDs and disambiguation (ORCID)


Additional project using Google Refine for data cleanup and export (Weill Cornell)

How is VIVO contributing to research
networking?


Mini-grants are breaking new ground and fostering community development


Collaborations with other key contributors, including eagle-i and ORCID


VIVO encourages institutions to make data available in
a structured format, not just as web pages


Linked data already promotes the use of common, well-known ontologies


Fills a gap by providing consistent data about researchers
and their activities and institutions


Enables analysis and reporting across institutions


“National” search


Extending VIVO search to an aggregation of RDF from multiple sources, using Apache Solr (see the sketch after this slide)


Expects RDF expressed with VIVO ontology


From Harvard Profiles, Collexis, and likely others


U24 program mandates no reliance on
sustained centralized infrastructure


Aggregator and indexing software will be added to the VIVO SourceForge site


Configure to harvest desired set of sources
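
A rough sketch of the aggregation-and-indexing step, using Python's rdflib and pysolr: parse harvested RDF, pull out labeled resources, and post them to a Solr core. The core name, field name, and input file are illustrative assumptions, not the actual national-search configuration.

```python
# Sketch: indexing harvested RDF into Apache Solr (assumptions noted above).
import pysolr
from rdflib import Graph
from rdflib.namespace import RDFS

solr = pysolr.Solr("http://localhost:8983/solr/vivosearch")  # hypothetical core

g = Graph()
g.parse("harvested.rdf")  # RDF pulled from one configured source

# One Solr document per labeled resource; "name_t" is an assumed text field.
docs = [{"id": str(s), "name_t": str(o)}
        for s, _, o in g.triples((None, RDFS.label, None))]

solr.add(docs)
solr.commit()
```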



Configurability


Extensibility


Openness


Immediacy


Visual appeal


“Explain the power and utility of
VIVO’s semantic web architecture”


Integration


Connection


Discovery


Dissemination


Migration

Questions yet to address


What access points and services need to be provided
for national (or international) research networking to
succeed?


Providing rich, distributed data enables networking


How will people be able to integrate this data into their
daily workflow and research process?


How will boundaries between public and private data and
services work?


How do we reach beyond universities without losing the
core value of authoritative data?


How will distributed identity and authorization work?


What needs are not being addressed?


Thank you