Discovering Linked Government Data

Digital Enterprise Research Institute

www.deri.ie

HADA: An Access Controlled Application for Publishing and Discovering Linked Government Data

Owen Sacco

owen.sacco@deri.org

IESD 2012 - EKAW 2012

Galway, Ireland

Tuesday 9th October 2012

Enabling Networked Knowledge

HHS is the US Government’s principal agency for:

- Protecting the Health of all Americans
- Providing all essential Human Services

Promote the advancement of the Health, Safety, and Well-Being of the American People

HEALTH AND HUMAN SERVICES DOMAIN

IT PROGRAM MANAGEMENT OFFICE

HHS IT Asset Discovery Application (HADA)

Currently, data about HHS IT Investments exists:

- In different systems
- In different data models
- With different levels of access

HADA aims to provide intelligent:

- Aggregation of this data to support information discovery
- Interoperability amongst the different systems
- Fine-grained Access Control

Using Semantic Web principles

[Figure: HADA overview. Inputs: the WWW (public data), documents (EPLC and other docs), and enterprise repositories (data). Components: Semantic Database, Data Access Rules (“Who can see what?”), and Web Application.]

Example walkthrough:

1. IT asset information is pre-aggregated from multiple data sources and stored in a database.
2. A user searches for a specific IT Investment cost.
3. Access rules are checked to grant or restrict access to the IT Investment cost.
4. If the user has access, the user can view the Investment cost.
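
The pre-aggregation step can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the function names `aggregate` and `search`, the record layout, and the field names are hypothetical and are not taken from HADA itself.

```python
from collections import defaultdict


def aggregate(sources):
    """Merge records from every source into one store keyed by asset.

    Each stored entry remembers which source it came from.
    """
    store = defaultdict(list)
    for source_name, records in sources.items():
        for asset, field, value in records:
            store[asset].append(
                {"field": field, "value": value, "source": source_name}
            )
    return store


def search(store, asset, field):
    """Return every stored entry for one field of one asset."""
    return [entry for entry in store.get(asset, []) if entry["field"] == field]


# Illustrative records only; the field names are hypothetical.
sources = {
    "CPIC": [("HADA", "cost", 12345)],
    "HEAR": [("HADA", "acronym", "HADA")],
}
store = aggregate(sources)
cost_entries = search(store, "HADA", "cost")
```

Keeping the source name next to each value is what later allows an access rule to depend on where a value came from.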

[Figure: HADA architecture diagram; labels listed by layer.]

Data sources:
- CPIC Repositories
- EA Repositories (e.g. FEA)
- Code, Documentation, Etc. Repositories

Content Extraction Layer:
- System Content Extraction: instance data is extracted in XML format
- Metadata Extraction and Manual Clarification: docs, code, etc. are converted to XML

Semantic Layer:
- Semantic Transformation and Synthesis of the extracted XML
- Semantic Model Transformation, using Existing Ontologies

Data Layer:
- Semantic Database

Privacy Layer:
- Privacy Preference Manager
- Enforcement of Privacy Policies
- Privacy Preferences Repositories

Presentation Layer:
- Presentation and Navigation of Content

Publishing Linked Data using the Linked Data API

- A RESTful API over RDF graphs
- Acts as a proxy over SPARQL endpoints
- Easy-to-process representations of resources

Indexing and searching RDF data using SIREn

- “A Lucene plugin to efficiently index and query RDF, as well as any textual document with an arbitrary amount of metadata fields”

Storing RDF data using Sesame and ARC over MySQL
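
The "proxy over SPARQL endpoints" idea can be pictured as turning a resource path into a SPARQL query. The Python sketch below is a hypothetical illustration only: `describe_query` is an invented helper, and it does not show how the Linked Data API itself is configured or implemented.

```python
def describe_query(base_uri, path):
    """Translate a REST-style resource path into a SPARQL DESCRIBE query string."""
    # Reject characters that may not appear inside an IRI reference.
    if any(ch in path for ch in '<>"{}|^`\\ '):
        raise ValueError("path contains characters not allowed in an IRI")
    resource = base_uri.rstrip("/") + "/" + path.lstrip("/")
    return "DESCRIBE <" + resource + ">"


# The resource URI is the investment used in the privacy preference example.
query = describe_query("http://hprod.dyndns.org/", "/hada/Investment/90000001")
```

A real deployment would send the resulting query to the SPARQL endpoint and render the returned RDF in a simpler format for the client.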

Subject | Predicate    | Object                                                  | Context
HADA    | hasName      | “HHS IT Asset Discovery Application”                    | HEAR
HADA    | hasAcronym   | “HADA”                                                  | HEAR
HADA    | hasCost      | $12345                                                  | CPIC
HADA    | hasIPAddress | 107.20.137.210                                          | HEAR
HADA    | belongsTo    | HHS                                                     | HEAR
HADA    | hasLabel     | “Health and Human Services Asset Discovery Application” | ITDashboard
HADA    | hasAcronym   | “HADA”                                                  | ITDashboard
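
The subject, predicate, object, context rows can be held as simple quads. The following Python sketch is an assumption about representation, not HADA's storage format; `Quad` and `predicates_by_context` are hypothetical names. It groups the statements by the source (context) they came from.

```python
from collections import defaultdict, namedtuple

# One statement plus the source (context) it was taken from.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "context"])

QUADS = [
    Quad("HADA", "hasName", "HHS IT Asset Discovery Application", "HEAR"),
    Quad("HADA", "hasAcronym", "HADA", "HEAR"),
    Quad("HADA", "hasCost", "$12345", "CPIC"),
    Quad("HADA", "hasIPAddress", "107.20.137.210", "HEAR"),
    Quad("HADA", "belongsTo", "HHS", "HEAR"),
    Quad("HADA", "hasLabel",
         "Health and Human Services Asset Discovery Application", "ITDashboard"),
    Quad("HADA", "hasAcronym", "HADA", "ITDashboard"),
]


def predicates_by_context(quads):
    """List, for each context, the predicates it contributes (in table order)."""
    grouped = defaultdict(list)
    for quad in quads:
        grouped[quad.context].append(quad.predicate)
    return dict(grouped)


by_context = predicates_by_context(QUADS)
```

The same subject is described by three sources here, which is why the context column matters for access decisions.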

Attribute based access and fine grained access

More than one rule can be applied to each data element.

Rules based on:

- Where the data comes from: Context
- What the data is about: Subject
- What the data is describing: Predicate
- Properties of the data itself: Object
- Any combination of the above
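
A minimal Python sketch of rules that match on any combination of subject, predicate, object, and context is shown below. The `Rule` layout, the wildcard convention (`None` matches anything), and the rule names are hypothetical; the sketch only finds which rules apply to a data element and matches objects by exact value, and it does not say how conflicting rules are resolved.

```python
from collections import namedtuple

Quad = namedtuple("Quad", ["subject", "predicate", "obj", "context"])
# Hypothetical convention: a rule field set to None matches any value.
Rule = namedtuple("Rule", ["name", "subject", "predicate", "obj", "context"])


def matches(rule, quad):
    """True when every non-wildcard field of the rule equals the quad's field."""
    pairs = [
        (rule.subject, quad.subject),
        (rule.predicate, quad.predicate),
        (rule.obj, quad.obj),
        (rule.context, quad.context),
    ]
    return all(wanted is None or wanted == actual for wanted, actual in pairs)


def rules_for(quad, rules):
    """Names of all rules that apply to one data element."""
    return [rule.name for rule in rules if matches(rule, quad)]


RULES = [
    Rule("cpic-data", None, None, None, "CPIC"),             # by context
    Rule("cost-values", None, "hasCost", None, None),        # by predicate
    Rule("hada-ip", "HADA", "hasIPAddress", None, "HEAR"),   # combination
]
cost_quad = Quad("HADA", "hasCost", "$12345", "CPIC")
name_quad = Quad("HADA", "hasName", "HHS IT Asset Discovery Application", "HEAR")
applied = rules_for(cost_quad, RULES)
```

The cost element is matched by two rules at once, one on its context and one on its predicate.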

Privacy Preference Ontology

Namespace: http://vocab.deri.ie/ppo#

[Figure: ontology diagram centred on ppo:PrivacyPreference; terms listed by diagram section.]

Applies To
- Properties: ppo:appliesToStatement, ppo:appliesToNamedGraph, ppo:appliesToDataset, ppo:appliesToResource, ppo:appliesToContext
- Classes: rdfs:Resource, rdf:Statement, trix:Graph, void:Dataset

Conditions
- Properties: ppo:hasCondition, ppo:hasConditionOperator, ppo:conditionOperatorOf, ppo:hasChildConditionOperator, ppo:hasLogicalOperator, ppo:hasProperty, ppo:hasLiteral, ppo:classAsSubject, ppo:classAsObject, ppo:resourceAsSubject, ppo:resourceAsObject, ppo:hasPriority
- Classes: ppo:Condition, ppo:ConditionOperator, ppo:Operator, wo:Weight, rdfs:Resource, rdfs:Class, rdfs:Literal, rdf:Property

Access Test Queries
- Properties: ppo:hasAccessSpace, ppo:hasAccessQuery, ppo:hasAccessAgent
- Classes: ppo:AccessSpace, foaf:Agent, rdfs:Literal

Access Control Privileges
- Properties: ppo:hasAccess, ppo:hasNoAccess
- Classes: acl:Access

Privacy Preference Ontology

    @prefix ppo: <http://vocab.deri.ie/ppo#> .
    @prefix acl: <http://www.w3.org/ns/auth/acl#> .
    @prefix hada: <http://hprod.dyndns.org/> .

    # Grant read access to Investment 90000001 to any requester
    # for whom the ASK query below succeeds.
    hada:pp1 a ppo:PrivacyPreference ;
        ppo:appliesToResource <http://hprod.dyndns.org/hada/Investment/90000001> ;
        ppo:hasAccess acl:Read ;
        ppo:hasAccessSpace [
            ppo:hasAccessQuery "ASK {?x foaf:topic_interest <http://hprod.dyndns.org/hada/vocab/Asset>}"
        ] .
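
How such a preference might be evaluated can be sketched in Python. This is a simplified stand-in, not the Privacy Preference Manager's code: the ASK query is replaced by a direct check for a `foaf:topic_interest` triple in the requester's profile, and the dictionary layout, the function names, and the `example.org` profile URIs are hypothetical.

```python
FOAF_TOPIC_INTEREST = "http://xmlns.com/foaf/0.1/topic_interest"
ASSET = "http://hprod.dyndns.org/hada/vocab/Asset"
INVESTMENT = "http://hprod.dyndns.org/hada/Investment/90000001"

PREFERENCE = {
    "applies_to_resource": INVESTMENT,
    "access": "acl:Read",
    # Stands in for: ASK {?x foaf:topic_interest <.../hada/vocab/Asset>}
    "required": (FOAF_TOPIC_INTEREST, ASSET),
}


def satisfies_access_space(profile_triples, preference):
    """True when the profile holds the triple pattern the access query asks for."""
    predicate, obj = preference["required"]
    return any(p == predicate and o == obj for _s, p, o in profile_triples)


def granted_access(resource, profile_triples, preference):
    """Return the granted privilege, or None when the preference does not apply."""
    if resource != preference["applies_to_resource"]:
        return None
    if satisfies_access_space(profile_triples, preference):
        return preference["access"]
    return None


# Hypothetical requester profiles.
interested = [("http://example.org/john#me", FOAF_TOPIC_INTEREST, ASSET)]
not_interested = [("http://example.org/alex#me", FOAF_TOPIC_INTEREST,
                   "http://example.org/topics/Other")]
```

With these profiles, `granted_access(INVESTMENT, interested, PREFERENCE)` yields `"acl:Read"`, while the second profile is granted nothing.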

Privacy Preference Ontology

The same privacy preference as a graph:

Privacy Preference
    ppo:appliesToResource -> 90000001
    ppo:hasAccess -> acl:Read
    ppo:hasAccessQuery -> Who is interested in Asset

Privacy Preference Manager

[Figure: a User interacts with the Privacy Preference Manager, which connects to a SPARQL Endpoint, RDF Documents, and Privacy Preferences Repositories.]

The Privacy Preference Manager supports:

- Creating privacy preferences
- Enforcing privacy preferences

Enforcing Privacy Policies

[Figure: enforcement flow. Actors and data: John, John’s Profile, SPARQL Endpoint, RDF Documents, Privacy Preferences. Privacy Preference Manager components: RDF Data Retriever & Parser, Privacy Preferences Enforcer, Privacy Preferences Creator. Arrow labels: Logs In, Request RDF Data, Request John’s RDF Profile, Query Privacy Preference, Access Query Result, Query RDF Data, Filtered RDF Data.]
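
The final filtering step of this flow can be sketched in Python under explicit assumptions: the `enforce` function, the preference layout, the profile dictionary, and the second investment identifier are all hypothetical, and withholding anything without a satisfied preference is an assumed default rather than documented behaviour.

```python
def enforce(requested_triples, preferences, profile):
    """Return only the triples whose subject is covered by a satisfied preference."""
    readable = set()
    for preference in preferences:
        # Each access test stands in for running an access query on the profile.
        if preference["access_test"](profile):
            readable.add(preference["resource"])
    # Assumption: data without a satisfied preference is withheld.
    return [triple for triple in requested_triples if triple[0] in readable]


# Hypothetical data: the second investment has no preference attached.
data = [
    ("Investment/90000001", "hasCost", "$12345"),
    ("Investment/90000002", "hasCost", "$999"),
]
preferences = [
    {
        "resource": "Investment/90000001",
        "access_test": lambda profile: "Asset" in profile["interests"],
    },
]
visible = enforce(data, preferences, {"interests": {"Asset"}})
hidden = enforce(data, preferences, {"interests": set()})
```

Only the triples that survive this step would be returned to the requester as filtered RDF data.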

Towards Patient Controlled Privacy

[Figure: Alex and John each have a Privacy Preference Manager with its own Interface, Privacy Preferences, SPARQL Endpoint, and RDF Documents.]

HHS is exploring the use of the following on healthdata.gov:

- Linked Data API, for publishing Linked Data
- Privacy Preference Framework, to let patients control third-party access to their health data

Links

- HADA: http://hprod.dyndns.org/
- Linked Data API: http://code.google.com/p/linked-data-api/
- SIREn: http://siren.sindice.com/
- Sesame: http://www.openrdf.org/
- PPO Namespace URI: http://vocab.deri.ie/ppo#
- PPM Screencasts:
    - Creating Privacy Preferences: http://bit.ly/p0N1Vi
    - Viewing Filtered Triples: http://bit.ly/qiAdxT

Email: owen.sacco@deri.org