Build DoD Vocabularies in the Cloud

Uploaded by warbarnacle (Security), Nov 5, 2013

3rd Annual SOA & Semantic Technology Symposium:
Interoperable Business Operations Through Shared Understanding

Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community

July 13th Competency Track, 11:55am-12:30pm

July 13-14, 2011
Waterford, Springfield, Virginia

Semantic Community

So far in 2011, Semantic Community has built Knowledge-Centric Systems in the Cloud for:

- Data Science and Journalism:
  - Data.gov and Federal Computer Week, ongoing since January 2011.
  - 1105 Government Information Group/FOSE Institute’s KM 2011 Conference, May 4, 2011, and Geospatial Summit, September 13, 2011.
  - AOL Government “Show Me The Data”, due to launch July 11, 2011.
- The Open Group’s TOGAF and UDEF:
  - The Open Group San Diego Conference, February 7, 2011.
  - The Open Group London Conference, May 11, 2011.
- Semantic Interoperability:
  - Keynote at SEMIC.EU Annual Conference, May 18, 2011.
  - Conference Presentation at SemTech 2011, June 7, 2011.
  - Federal Data Architecture Subcommittee, June 9, 2011.
- “Big Health Data”:
  - One of the Top Submissions for HealthyPeople.gov Challenge, March 14, 2011.
  - Finalist in the Health Data Initiative Forum, June 9, 2011.
- DoD:
  - RFI for Data Analysis and Collaboration Tool to Support the DoD OIG, June 28, 2011.
  - 3rd Annual DoD SOA and Semantic Technology Symposium, July 13, 2011.

This presentation will show examples from simple (e.g. Air Force One Source) to complex (DOD Office of the Inspector General) DoD Vocabularies.

Take-Home Message

- Competency: Creating Competency for Shared Understanding and Interoperable Business Operations.
  - This track focuses on the development of knowledge and skills for SOA & Semantic projects, the handling of organizational change management, and the governance needed for and associated with such projects and initiatives.
- Semantic Community Knowledge-Centric Systems:
  - We take the data (and metadata) directly to information modeling and mashup tools, where we can then apply stronger semantic analytics tools. We keep the data (structured and unstructured) and metadata (ontology) together in the knowledgebase in cloud computing tools.
  - We use effective standards-based approaches for real-world case studies. This presentation could also be in the other two tracks!

Abstract

- Several DoD vocabularies have been harvested into the cloud computing tools used by the author to produce data science products. Those are Air Force OneSource and the DoD Common Vocabulary with two vocabularies, one for the HR community and one for UCORE-SL.
- The purpose of the Semantic Community’s data science products is to show when and where it is practical to insert semantic technologies in support of cross-domain process and analysis, and the value and ease of using other, more mature technologies for certain tasks. The practical boundaries we have found in supporting data fusion and analysis for information sharing, and the point in the process at which applying semantic technologies yields the most value, are discussed.

Note: Credit due to Robert Damashek for suggesting this topic to me.

Overview

1. Introductions
2. Background
3. Semantic Community Apps
4. DoD Common Vocabulary
5. Data Analysis and Collaboration Tool to Support the DoD OIG
6. Questions and Answers
7. Supplemental Slides

Recreating Other People’s App the Semantic Community Way!

2. Background

My Experience with “Handling of organizational change management, and the governance needed for and associated with such projects and initiatives”:

- I tried to change EPA from the inside (1980-1996).
- I served a detail to the Department of Interior where I was able to start a new organization (1997-2001).
- I tried to change the Federal Government in my Federal CIO Council (2002-2008) Roles.
- I also tried to change EPA from the outside at the same time.
- I am now enjoying being free to do what I think is best to support the Semantic Web/Linked Open Data and Semantic Technologies, but in an easier and simpler way!

2. Background

- Federal Semantic Interoperability Community of Practice (SICoP) 2003-2008:
  - Five Annual Conferences and Four Special Conferences.
- Federal SOA Community of Practice (SOA CoP) 2006-Present:
  - Eleven Semi-Annual Conferences. The 12th is on October 11th.
- Only Special Recognition for Outstanding Contributions to Both SICoP and SOA CoP:
  - Arun Majumdar, Cutter Consortium/VivoMind Intelligence, for Operationalizing SOA: Lessons Learned (Take-Home Message: Multi-Level Model-Driven Architecture & First Order Logic).
- Now from the pilots at these conferences come powerful new semantic analytics tools like VivoMind's Textrium and PrologIKS, and Semantic Insights Research Assistant (SIRA), that can be used to mine content to produce data science products that support data journalism!

2. Background

My Experience with “development of knowledge and skills for SOA & Semantic projects”.

| Program | Champion | CoP Leader | Standards |
|---|---|---|---|
| eForms for eGov | Mark Forman, OMB | Rick Rogers, Fenestra Technologies | eGrants XML Schema and Web Services |
| Federal SOA CoP | Roy Maybury, DoD | Cory Casanave, Model Driven Solutions | Web Services and Open Group MDA and SoAML |
| Federal Semantic Interoperability CoP | David Wennergren, Navy CIO | Rick Morris, US Army, and Mills Davis, Project10X | W3C Semantic Web in Semantic Technologies |
| Cloud Computing Desktop for OGD & Data.gov/semantic | Vivek Kundra, Federal CIO | Brand Niemann, US EPA and Semantic Community | Web Oriented Architecture (MindTouch) |
| Gov 2.0 Platform for Data Science Products and 5 Stars of LOD | Aneesh Chopra, Federal CTO; Tim Berners-Lee, W3C Director | Brand Niemann, US EPA and Semantic Community | Open and Quality Data Visualizations (Spotfire) |

3. Semantic Community Apps

| General | Web Site | Best Content-Centralized | Best Content-Distributed |
|---|---|---|---|
| US Federal Government (1) | Community Sandbox (2) | Annual Statistical Abstract (3) and EPA Report on the Environment (4) | FedStats.net (5) |
| TOGAF (6) | EA Principals, Inc. (7) | Training Materials (8) | Ecosystem of Frameworks (9) |
| SEMIC.EU (10) | Web Site (11) | EuroStats (12) and European Environment State and Outlook (13) | Global Data Catalog and Data Services (14) |

Key: See next slide for Key.

Source: http://semanticommunity.info/Build_SEMIC.EU_in_the_Cloud

Some Best Practice Examples of Semantic Interoperability Interfaces*

*The term "interoperable interface" comes from the recent Report to the President and Congress "Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology", Executive Office of the President and the President's Council of Advisors on Science and Technology, December 2010 (see excerpts in the wiki).


4. DoD Common Vocabulary

- The mission of the Enterprise Information Web (EIW) project is to create an extensible analytical capability built on top of a federation of information systems across the Department of Defense and provide information visibility and access:
  - Archives: All wikis and vocabularies relevant to the HR EIW project.
  - Business Process Area: Semantic models for the HRM Domain.
  - CHRIS Reference Ontology: ?.
  - Retirements and Separations: DIMHRS Ontology.
  - HR Analytics: Queries the HR Domain Ontology.
  - HR Domain Ontology: Central Knowledgebase for Concepts and Terminology within the DoD HR Domain.
  - Knowledge Center: EIW Training Materials.
  - ODSE Sample Database: Multiple Vocabularies.
  - Ontology Repository: An important contribution in the overall goal of data integration across the HR domain.

https://www.commonvocabulary.army.mil/ui/groups/HR_EIW

Sample Content Included in Next Section

5. Data Analysis and Collaboration Tool to Support the DoD OIG

- The mission of the Department of Defense, Office of the Inspector General (DOD OIG) is to promote integrity, accountability, and improvement of Department of Defense personnel, programs, and operations to support the Department’s mission and serve the public interest. Each goal of the DOD OIG requires personnel to perform analysis using structured and unstructured data, both government and non-government sources, and in a wide variety of file formats.
- Personnel and data sources are spread throughout the globe, requiring teams to acquire data in a remote access storage system for use.
- Personnel access analysis tools remotely using laptops running Windows XP (SP3) with dual core processors, 3GB RAM, and 50GB memory.
- The DOD OIG has recognized a need to improve the efficiency and effectiveness of how data is ingested, shared and analyzed across the organization.
- As well as the need to explore advanced analysis capabilities to better assist personnel in identifying fraud, waste, and abuse in the Department.

Source: http://semanticommunity.info/Build_DoD_Vocabularies_in_the_Cloud/Proposal_Demo#BACKGROUND

Note: Bolding is mine.

5. Data Analysis and Collaboration Tool to Support the DoD OIG

Semantic Community Workflow:

- 5.1 Information Architecture of Public Web Pages in Spreadsheets as Linked Open Data.
- 5.2 Public Reports (Web and PDF) in Wiki as Linked Open Data.
- 5.3 Desktop and Network Databases in Wiki and Spreadsheets in Linked Open Data Format.
- 5.4 Spreadsheets in Spotfire as Linked Open Data.
- 5.5 Spreadsheets in Semantic Insights Research Assistant for Semantic Search, Report Writing, and Ontology Development.
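As a rough illustration of what "spreadsheets as Linked Open Data" can mean in practice, the sketch below turns spreadsheet rows (exported as CSV) into N-Triples statements, one subject per row and one triple per non-empty cell. It is not the workflow used for the slides, which relied on MindTouch, Spotfire, and SIRA. The namespace `http://example.org/dodoig/`, the function names, and the sample rows are all assumptions made for this example.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace for illustration only; not taken from the slides.
BASE = "http://example.org/dodoig/"


def escape_literal(value: str) -> str:
    """Escape a string so it can be written as an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_ntriples(csv_text: str, sheet: str, key_column: str) -> list[str]:
    """Emit one subject URI per row and one triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{quote(sheet)}/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # The key column names the subject; empty cells carry no statement.
            if column == key_column or not value:
                continue
            predicate = f"<{BASE}property/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Made-up sample rows standing in for one spreadsheet tab.
SAMPLE = """id,title,url
1,Press Room,http://example.org/press
2,Publications 2011,
"""

for line in rows_to_ntriples(SAMPLE, sheet="Cover Page", key_column="id"):
    print(line)
```

Giving every row a stable URI and every column a predicate is what lets the same spreadsheet content be linked to from a wiki page or loaded into an analytics tool without losing its metadata.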

5. Data Analysis and Collaboration Tool to Support the DoD OIG

5.1 Information Architecture of Public Web Pages in Spreadsheets as Linked Open Data.

http://semanticommunity.info/@api/deki/files/12769/=DoDOIG.xlsx

Tabs (12):

- Cover Page
- Press Room
- Publications 2011
- DoD IG
- Appendices A, F, & I
- Report to Congress
- Statistical Highlights
- Table 3.1 & Figures 3.1 & 3.2


5. Data Analysis and Collaboration Tool to Support the DoD OIG

- MindTouch makes the world's most respected social knowledge base. They power purpose-built help 2.0 communities that connect companies with their customers. Millions use their software every day.
- Many of the world's most respected brands rely on MindTouch including NASA, SAIC, Booz Allen, Microsoft, Cisco, Washington Post, Viacom, the New York Times, AXA, Timberland and HCA.
- Innovative companies like RightScale, ExactTarget and Mozilla have standardized on MindTouch for their documentation strategy.
- The open source .NET Web Oriented Architecture Framework (WOAF) is redefining how enterprise software is built.
- MindTouch is a recognized expert in both open source and Enterprise 2.0 technologies.
- The MindTouch Productivity Tools bridge Microsoft office and your desktop for all Windows applications. Have your users continue to work with the applications they're familiar with, instead of forcing them to learn a new tool with our document management solution. With the MindTouch Desktop Suite, you'll save time and money by not having to train users on a new system.

http://www.mindtouch.com/

5. Data Analysis and Collaboration Tool to Support the DoD OIG

5.2 Public Reports (Web and PDF) in Wiki as Linked Open Data.

http://semanticommunity.info/Build_DoD_Vocabularies_in_the_Cloud/2011_DOD_IG_Semiannual_Report_to_Congress

5. Data Analysis and Collaboration Tool to Support the DoD OIG

5.3 Desktop and Network Databases in Wiki and Spreadsheets in Linked Open Data Format.

http://www.mindtouch.com/add-ons/desktop_suite?product-refer=desktop-suite


5. Data Analysis and Collaboration Tool to Support the DoD OIG

5.4 Spreadsheets in Spotfire as Linked Open Data.

PC Desktop Spotfire

5. Data Analysis and Collaboration Tool to Support the DoD OIG

http://www.semanticinsights.com/company/SI%20Fact%20Sheet.pdf

SIRA can be used to find similarity between current and past events that are expressed or hinted at in text. SIRA can be used to find relationships of people, places, things and activities that may be expressed or hinted at in text.

6. Questions and Answers

- Sound Bite: Bring the data and the metadata back together and do the data science first to accomplish a business need and lay a solid foundation for integration and application of semantic technologies.
- Questions about the steps I followed?
- Questions about the results I produced?
- See Supplemental Slides for the Data Science Approach to Semantic Web/Technology Pilots.

7. Supplemental Slides

- 7.1 Semantic Technology Training: Building Knowledge-Centric Systems
  - KM 2011
  - SemTech 2011
- 7.2 W3C Government Linked Data Working Group
  - Clinical Quality Linked Data on Health.data.gov
  - Build Clinical Quality Linked Data on Health.data.gov in the Cloud
  - Hospital Compare Downloadable Database Example of "5 Star Government Data"
- 7.3 Library of Congress Project Recollection and Digital Preservation Initiative
- 7.4 Elsevier/Tetherless World Health and Life Sciences Hackathon (27-28 June 2011)
  - Build TWC in the Cloud
  - Build NCI CLASS in the Cloud
  - Build the NYC Data Mine Health in the Cloud
  - Build SciVerse Apps in the Cloud (IN PROCESS)
- 7.5 Be Informed (IN PROCESS)