The Role of XML in Cloud Data Integration - Drrw.net

The Role of XML in Cloud Data Integration

Presenter: David RR Webber, Oracle Corporation

October 15th, 2010

Introduction


Cloud services introduce new challenges for information
sharing. While certain aspects are familiar and yet still
unresolved, new techniques can be utilized. A canonical
approach underpinning Cloud data exchanges is vital to ensure
consistent understanding and enhanced interoperability.


Developers face many challenges and complexities in using
today’s industry standards for information exchanges. How can
we simplify this and rapidly develop consistent and conforming
information exchanges in the Cloud?


Open source tools will be discussed and government and
industry example exchanges presented.

Agenda


Challenges, New and Old


Why a canonical approach?


Adaptive, agile and context aware infrastructure


Avoiding the n(n-1)/2 dilemma


Ensuring Simplicity at the Foundation


Never underestimate the ability of engineers to add
complexity


Open source and open standard solutions


Examples from emergency management domain with NIEM*


Summary and Q&A

* National Information Exchange Model (NIEM) approach

Challenges, New and Old




Why a Canonical Approach?



Adaptive, agile and context aware infrastructure



Avoiding the n(n-1)/2 dilemma

Why a Canonical Approach?


Traditional XML information exchange has been schema-driven,
which raises many issues for Cloud integration:


W3C XML Schema is inflexible, static, brittle, localized, expensive


Canonical dictionaries exploit Cloud approach with
distributed availability, flexible collaboration and dynamic
updating and referencing


The Amazon Web Services (AWS) catalog is an example of the dynamic approach


Canonical dictionaries provide the components that
underpin the exchanges while leaving the precise
exchange formatting open for implementers; the “what”
not the “how”; content can change hourly on AWS!


Neutral XML-based syntax is future proof

Canonical XML dictionary


A collection of distinct components that represent
discrete business information for an application domain


Includes singleton components and combinations of
related components together as sub-assemblies


Information is represented in a simple neutral
conceptual data format that captures the critical
concepts about the data e.g. name, description, content
type, contextual usage pattern, hierarchy


Wikipedia definition:

http://en.wikipedia.org/wiki/Canonical#Computer_science


Baking in Interoperability


Using consistent component definitions dramatically
improves interoperability and reuse


Having formal design methods makes development faster,
easier, predictable and repeatable


Aligning local practice to industry domain dictionary can
reduce complexity and reinforce best practices


Dictionary definitions can be automatically evaluated for
common mistakes and this reduces the opportunity for
errors during design phase


Generating software artifacts from neutral dictionary
definitions ensures reliable information exchange results
across user communities and their particular systems,
platforms and tools
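
The automatic evaluation of dictionary definitions mentioned above can be pictured with a minimal sketch. The two checks used here (UpperCamelCase names and no names that differ only by case) are illustrative assumptions, not the rule set of any particular NDR, and `check_names` is a hypothetical helper.

```python
import re

# Assumed naming rule for illustration: UpperCamelCase, letters and digits only.
UPPER_CAMEL = re.compile(r"^[A-Z][A-Za-z0-9]*$")

def check_names(names):
    """Return (name, problem) pairs for names that break the assumed rules."""
    problems = []
    seen = set()
    for name in names:
        if not UPPER_CAMEL.match(name):
            problems.append((name, "not UpperCamelCase"))
        # Names that differ only by case are likely accidental duplicates.
        if name.lower() in seen:
            problems.append((name, "duplicate (case-insensitive)"))
        seen.add(name.lower())
    return problems

report = check_names(["PersonName", "person_name", "Personname"])
for name, problem in report:
    print(name, "->", problem)
```

Running checks like these before any artifacts are generated is one way to catch mistakes during the design phase rather than at exchange time.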

Neutral Content Model Representation


Neutral representations allow business stakeholders to
participate in dictionary development without technology
barriers


Concise neutral formats can be viewed as simple
spreadsheets as they have no special syntax
dependencies


Based on open public standard specifications, semantic
concepts and leading knowledge domain techniques


Neutral representation prevents lock-in by vendor,
syntax, tooling or platforms


Maximizes flexibility and future proofing of dictionary
definitions


Linguistic and Semantic Alignment


Formal community domain naming and design rules
provide consistency of definitions


Consistency of definitions minimizes duplication and
overlapping of dictionary components


Dictionaries allow collaboration on component development
to improve the overall results


Formal component content detail drives alignment


Design best practices ensure logical self-contained
components that can be selected contextually


Avoids explosion of complexity and excessive over-definition
(e.g. “kitchen-sink” schema)

What is a Canonical Approach?


There are several flavors of canonical approaches; some
more complex than others


e.g. UBL vs. OAGi vs. CCTS


Avoid dependence on W3C XML Schema mechanisms


Core Components Technical Specification (CCTS) uses simple
components with a basic hierarchy


Parent components with child entities, and/or components


Associated attributes that denote context and related factors


In CCTS parlance these are ABIE, BBIE and ASBIE:

Parent = Aggregate Business Information Entity (ABIE)

Child = Basic Business Information Entity (BBIE)

Attribute = Association Business Information Entity (ASBIE)

[Diagram: Conceptual Information Model. Parent (ABIE) items and child (BBIE) items follow Naming and Design Rule (NDR) principles and guidelines and are held as XML in the canonical components dictionary.]
ebXML CCTS terms (ABIE, BBIE, ASBIE)


Parent = Aggregate Business Information Entity (ABIE): each compound component

Child = Basic Business Information Entity (BBIE): each atomic component

Attribute = Association Business Information Entity (ASBIE): optional attributes of a component

[Diagram: a series of parent (ABIE) items, each shown with child (BBIE) items and attribute (ASBIE) entries.]

* CCTS: Core Components Technical Specification

Example: Person Name


Person Name (ABIE)

Language Code (ASBIE)

Verified Details? (ASBIE)

Has Alias? (ASBIE)

First Name (BBIE)

Middle Name (BBIE)

Last Name (BBIE)

Previous Name? (ASBIE)


Language Code may exist independently of Person Name.

Verified Details and Previous Name are flags that denote additional
information about the entity they are associated with.

Each component item has three aspects: structure relationships,
content rules and definitions.

Naming and Design Rules (NDR) are also important in ensuring shorter
names that are not tied to a specific context, e.g. compare
PersonName to IncidentPersonName.
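
The Person Name example can be written down as a small data structure. This is a minimal sketch only: the `Component` class and `outline` function are hypothetical names, and placing every entry directly under Person Name is an assumption, since the original diagram's nesting is not recoverable here.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Component:
    name: str
    kind: str                      # "ABIE", "BBIE" or "ASBIE"
    optional: bool = False         # shown as "?" on the slide
    children: List["Component"] = field(default_factory=list)

def outline(component, depth=0):
    """Return indented outline lines for a component and its children."""
    marker = "?" if component.optional else ""
    lines = ["  " * depth + f"{component.name}{marker} ({component.kind})"]
    for child in component.children:
        lines.extend(outline(child, depth + 1))
    return lines

# Assumption: all entries hang directly off Person Name.
person_name = Component("Person Name", "ABIE", children=[
    Component("Language Code", "ASBIE"),
    Component("Verified Details", "ASBIE", optional=True),
    Component("Has Alias", "ASBIE", optional=True),
    Component("First Name", "BBIE"),
    Component("Middle Name", "BBIE"),
    Component("Last Name", "BBIE"),
    Component("Previous Name", "ASBIE", optional=True),
])

print("\n".join(outline(person_name)))
```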

Methods for creating Canonical Dictionary


Harvest from collection of domain exchange schema


Export from SQL database to schema; harvest; rename


Export from modelling tool to schema; harvest; rename


Create manually in XML or spreadsheet
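
As an illustration of the "harvest" step, the sketch below walks an XSD and collects every named element with its declared type. It uses only the Python standard library; the sample schema and the `harvest` function are made up for this example and do not reproduce how the CAM toolkit performs its import.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical exchange schema used only for this illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PersonName">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="LastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

def harvest(xsd_text):
    """Collect every named xs:element as a {name: type} entry."""
    root = ET.fromstring(xsd_text)
    found = {}
    for element in root.iter(XS + "element"):
        name = element.get("name")
        if name:
            # Elements with an inline complexType carry no type attribute.
            found[name] = element.get("type", "(complex)")
    return found

components = harvest(SAMPLE_XSD)
print(components)
```

A real harvest would also need to follow type references and imports, and the harvested names would then go through the renaming step listed above.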

Sample Dictionary Building Processes

[Diagram: two routes to an XML dictionary of exchange components, each ebXML CCTS compatible (ABIE, BBIE, ASBIE). Steps are numbered 1 to 5 and the legend distinguishes automated from manual steps. Each route also shows an analyst review.]

Option 1: From Enterprise Data Model

Export the collection of objects from the model (EDM, DDL) as components in XSD syntax

Import the XSD and refactor for use with OASIS CAM (CAM template)

Apply Naming and Design Rule (NDR) checks and edits with the NDR evaluation, refactor and renaming tool

Generate the standard components dictionary as XML

Option 2: Derive from existing exchange XSD schema

Import each exchange XSD schema and merge into the CAM dictionary (CAM templates)

Apply NDR checks and edits with the NDR evaluation, refactor and renaming tool

Merge and generate the dictionary as XML

Ensuring Simplicity at the
Foundation



Never underestimate the ability of
engineers to add complexity


Adaptive, agile and context aware infrastructure


XML validation framework that is configurable dynamically
through the use of XML templates and rules.


“In today's complex information exchanges with XML and associated
large XSD schema, coupled with an array of trading partners, it
becomes a significant challenge to support and maintain accurate
handling of all incoming transactions”.


“With a more adaptive and fault tolerant process, the application is
able to handle a wider variation in content and, hence, more easily
support a broad set of interaction partners with reduced support and
maintenance costs”.


http://www.ibm.com/developerworks/library/x-camval/index.html
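
The idea of validation driven by templates and context rules, rather than by a fixed schema, can be sketched generically. The template layout and the `validate` function below are assumptions made for illustration; they are not CAM template syntax or the CAM processor API.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: always-required elements plus context rules.
TEMPLATE = {
    "required": ["OrderId", "Country"],
    "context_rules": [
        # (condition element, condition value, element that then becomes required)
        ("Country", "US", "State"),
    ],
}

def validate(xml_text, template):
    """Return a list of error strings; an empty list means the message passed."""
    root = ET.fromstring(xml_text)
    errors = []
    for name in template["required"]:
        if root.find(name) is None:
            errors.append(f"missing {name}")
    for cond_name, cond_value, needed in template["context_rules"]:
        if root.findtext(cond_name) == cond_value and root.find(needed) is None:
            errors.append(f"missing {needed} when {cond_name}={cond_value}")
    return errors

us_order = "<Order><OrderId>1</OrderId><Country>US</Country></Order>"
ca_order = "<Order><OrderId>2</OrderId><Country>CA</Country></Order>"
print(validate(us_order, TEMPLATE))
print(validate(ca_order, TEMPLATE))
```

Because the rules are data, they can be changed per partner or per context without regenerating a schema, which is the property the quoted passages emphasise.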

Avoiding the n(n-1)/2 dilemma


New XML validation framework


Automotive parts repair with STARBOD example


Utilizing validation framework with singleton validation
templates that are context rule driven

Source: http://www.ibm.com/developerworks/library/x-camval/index.html#figure2
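
To make the scaling concern concrete, the sketch below assumes the slide's expression is the usual n(n-1)/2 count of pairwise links between n partners, and compares it with one mapping per partner when everyone maps to a shared canonical form. The function names are hypothetical.

```python
def point_to_point_links(n):
    """Pairwise links when every partner maps directly to every other."""
    return n * (n - 1) // 2

def canonical_links(n):
    """One mapping per partner when all map to a shared canonical form."""
    return n

for n in (3, 10, 50):
    print(n, point_to_point_links(n), canonical_links(n))
```

At three partners the two approaches cost the same; the gap opens quickly after that.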


Agile Solution Components

[Diagram: four numbered stages leading from canonical dictionaries to domain applications. The legend distinguishes automated from manual steps. Labels recoverable from the figure:]

Definitions repository (XML): industry dictionary and domain dictionary, formatted as XML

Exchange Designer Tool user interface: pick components, assembly, review structure

Exchange structure schema; templates; wantlist; WSDL actions (optional); business context rules; content hints

Build: XML Schema; realistic XML exchange test examples

Test: unit test harness; CAMV engine (agile validation engine)

Blueprint toolkit; interchanges; domain applications

Leveraging Cloud Deployment strengths


Collaboration tools for sharing canonical component
dictionaries


Repositories of templates and code lists


Fault-tolerant deployment architectures with redundancy


Machine-accessible APIs to allow real-time updates and
propagation of changes


Standards-based implementations that provide open access


Open source resources for shared implementation support

Open Source and Open
Standard solutions



Examples from Emergency
Management domain with NIEM,
OASIS EDXL, LEXS

Example Emergency Management Scenario


Emergency Response Services Workflow using OASIS EDXL exchanges

Haiti demonstrated the need for agile exchanges to rapidly cope with
unfamiliar scenario and environment changes

Cloud-based sharing of open adaptive common infrastructure components


Top Down Solution Approach

[Diagram: seven numbered stages leading from dictionaries to target applications. The legend distinguishes automated from manual steps. Labels recoverable from the figure:]

Components definition (XML): industry dictionaries and local domain dictionary, formatted as XML

Enterprise Data Model (EDM, DDL): import and refactor for use with CAM

Dictionary repository

Exchange Blueprint Designer user interface: pick components, structure outline blueprint, expand structure

Exchange structure; exchange generator tools (CAM); build

Exchange package; exchange components; target applications

Assembling Components from dictionaries


Determine your business information exchange components
at conceptual level


Search and locate candidate components from appropriate
domain dictionary collections


Catalogue the parts to be used


Dictionary components can be referenced individually or as collections by an assembly blueprint that puts them all together to create a complete information exchange


Components can be selected from multiple dictionaries


Note any new extension pieces as needed


Select components from multiple physical dictionary files


Blueprints themselves also have high re-use value


Can be sub-assemblies and patterns, not just exchange models
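
The assembly steps above can be sketched as a lookup across several dictionaries, with anything not found flagged as a candidate extension. The dictionary names, component names and the `assemble` function are all made up for illustration and do not reflect the contents of any published dictionary or the CAM blueprint format.

```python
# Hypothetical dictionaries: {dictionary name: {component name: component type}}.
DICTIONARIES = {
    "core": {"PersonName": "ABIE", "LanguageCode": "ASBIE"},
    "emergency": {"Facility": "ABIE", "BedCapacity": "BBIE"},
}

def assemble(blueprint, dictionaries):
    """Resolve each blueprint entry to the first dictionary that defines it."""
    resolved, extensions = [], []
    for name in blueprint:
        source = next(
            (d for d, items in dictionaries.items() if name in items), None
        )
        if source is None:
            extensions.append(name)      # new piece to note as an extension
        else:
            resolved.append((name, source))
    return resolved, extensions

blueprint = ["Facility", "BedCapacity", "PersonName", "TriageStatus"]
resolved, extensions = assemble(blueprint, DICTIONARIES)
print(resolved)
print(extensions)
```

Since the blueprint is just a list of references, the same list can be reused as a sub-assembly inside a larger exchange.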


Example Assembly Blueprint Outlines

LEXS messaging blueprint: reusable messaging envelope constructs

OASIS EDXL HAVE message: business functional components

[Screenshots of the two blueprint outlines. Callouts: message handling, delivery and control; payload goes here; top level sets of business information components; individual component.]

These examples are available from the CAM editor install package:

~ CAMeditor\eclipse\workspace\CAMEditor\dictionary\blueprints\

LEXS: Law Enforcement eXchange System, http://www.lexs.gov

Exchange Development Process Tools

[Diagram: five numbered stages. Labels recoverable from the figure:]

Industry dictionary and domain dictionary: component definitions

Search tools: web tool, Excel

Blueprint designer

Insert dictionary parent components

Expander tool

Completed exchange template

Summary and Q & A



Review


Resource links

Summary


Canonical XML component dictionaries


Neutral representation of components


Deployment to target environments and architectures


Collaborative development and open source


Uses open public standards and government guidelines (NIEM)


Available resources and tools


Illustrative use cases


Leverage strengths of cloud-based collaboration resources

Resources



Resource links


Supporting supplemental slides

Links and Resources


DOWNLOADS

CAM Toolkit download: https://sourceforge.net/projects/camprocessor


SUPPORTING MATERIALS

NIEM Naming and Design Rules (NDR) 1.3: http://www.niem.gov/pdf/NIEM-NDR-1-3.pdf


RESOURCES

UN/CEFACT Core Components Technical Specification: http://www.unece.org/cefact/ebxml/CCTS_V2-01_Final.pdf

Tutorials: wiki.oasis-open.org/cam/CAM_Tutorials

Specifications:

www.oasis-open.org/committees/cam

docs.oasis-open.org/cam

www.oasis-open.org/committees/emergency

NIEM site: www.niem.gov

LEXS site: www.lexs.gov


Available XML Dictionaries


LEXS 3.1.4 dictionary


OASIS EDXL dictionary


OASIS EML dictionary



NIEM 2.1 dictionaries


CBRN dictionary


Emergency dictionary


Family dictionary


Immigration dictionary


Infrastructure dictionary


Intelligence dictionary


Justice dictionary


Maritime dictionary


Screening dictionary


Trade dictionary


NIEM core dictionary


Immigration blueprint

Available from download site, direct link: http://sourceforge.net/projects/camprocessor/files (includes spreadsheets and sample blueprint)

Packaged with CAM editor, see dictionary folder of install (plus spreadsheet and blueprint samples)

Note: Those marked in bold are model style dictionaries with recursive components.

Conceptual Information View

[Diagram: domain data components are processed with the desktop CAM toolkit editor. The CAM template holds structure, rules and definitions; the dictionary components are a set of items.]

Each item (ABIE, BBIE, ASBIE) has these properties:

Name

Unique ID

Component Type

Cardinality

Content Type

Content Mask

Children

Group

Structure Context

Where from

Definition

Rules

Language, Label, Notes

* Required items are shown in blue on the original slide

XML View of Dictionary Content

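[Screenshot: XML view of dictionary content. Callouts mark the items and, for each item, the name, unique ID, component type, cardinality, content type and content mask, plus the parent / child linkage showing where the item is referenced.]

* See slide notes for explanation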

Excel Spreadsheet View

[Screenshot: spreadsheet with one item per row and the properties as columns, including the type (ABIE, BBIE) and children.]
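
The one-item-per-row layout can be sketched with the standard `csv` module. The column names follow the properties listed for dictionary items; the sample values, the `;` separator for children and the `to_csv` function are assumptions for illustration, not the toolkit's actual spreadsheet export.

```python
import csv
import io

COLUMNS = ["name", "unique_id", "component_type", "cardinality",
           "content_type", "content_mask", "children"]

def to_csv(items):
    """Write one dictionary item per row, with properties as columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()

# Made-up sample items.
items = [
    {"name": "PersonName", "unique_id": "ID-001", "component_type": "ABIE",
     "cardinality": "1..1", "content_type": "", "content_mask": "",
     "children": "FirstName;LastName"},
    {"name": "FirstName", "unique_id": "ID-002", "component_type": "BBIE",
     "cardinality": "0..1", "content_type": "string", "content_mask": "",
     "children": ""},
]

sheet = to_csv(items)
print(sheet)
```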

Mapping to Dictionaries


You can compare a template of components to a dictionary


check within a domain for alignment to dictionary


check between domains for interoperability


merge new/existing components with dictionary


Matches on physical names


Reports matching items and details


Reports statistics and percentages of matching


Generates crosswalk XML file


Compatible with Microsoft Excel


Report can be used to do spell checking
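
The comparison described above can be sketched as a simple match on physical names with summary statistics. The `crosswalk` function and the sample names are hypothetical; the real tool's report and crosswalk file formats are not reproduced here.

```python
def crosswalk(template_names, dictionary_names):
    """Match template component names against a dictionary by exact name."""
    dictionary = set(dictionary_names)
    matched = [n for n in template_names if n in dictionary]
    unmatched = [n for n in template_names if n not in dictionary]
    percent = 100.0 * len(matched) / len(template_names) if template_names else 0.0
    return {"matched": matched, "unmatched": unmatched, "percent": percent}

# Made-up names; "PersonNmae" is a deliberate misspelling.
template = ["PersonName", "FirstName", "LastName", "PersonNmae"]
dictionary = ["PersonName", "FirstName", "MiddleName", "LastName"]

report = crosswalk(template, dictionary)
print(report)
```

Because matching is on exact physical names, a misspelled name shows up in the unmatched list, which is how such a report can double as a spell check.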

Example cross-reference spreadsheet

[Screenshot: formatted view in Microsoft Excel of an import of the cross-reference report details (from the generated XML file), showing matched details: item and alignment, definition.]