The Earth System Curator

22 Oct 2013

Metadata Representations

Prototype Portal in Collaboration with ESMF and ESG


Rocky Dunlap

Spencer Rugaber

Georgia Tech

Who we are


Cecelia DeLuca, NCAR


V. Balaji, GFDL/Princeton University


Don Middleton, NCAR


Chris Hill, MIT


Serguei Nikonov, GFDL


Sylvia Murphy, NCAR


Luca Cinquini, NCAR


Julien Chastang, NCAR


Spencer Rugaber, Georgia Tech


Leo Mark, Georgia Tech


Rocky Dunlap, Georgia Tech

Plus other collaborators, including NMM, Metafor, and BFG2

What is the Earth System Curator?


The goal of Curator is to link climate datasets with a detailed description of the model that ran to produce the dataset


Transparent access to models and datasets


Use cases for climate model metadata


Provenance (history of what happened)


Archival and search (for models and datasets)


Model intercomparison


Compatibility checking


Generation of coupler components


Collaborations with Related Projects


Earth System Modeling Framework (ESMF)


Software infrastructure to facilitate building numerical Earth System models

Component-based model development

Built-in tools for managing common modeling tasks (coupling fields, calendars, grid creation, etc.)


Earth System Grid (ESG)


A large-scale distributed portal for hosting data produced by Earth System models

Services such as dataset ingest, faceted search, dataset browsing, metadata viewing, and dataset download

Representations of Curator Metadata


UML


RDF/OWL


XML/XML Schema


Relational DB - SQL

UML


Unified Modeling Language


What it is


A visual modeling language for representing software systems


Source


OMG Standard


Motivation


Conceptual modeling, human-to-human communication of the model, object-oriented representation

Of the 13 diagram types in UML 2.0, we use one: the class diagram

Static structure in terms of classes, attributes on classes, and relationships between classes

UML


Metamodel


Access to metamodel for creating UML Profiles


Ability to define a subset of UML used for building your own models


Tool support


Enterprise Architect (recommended)

Others: Rational Rose, Poseidon, ArgoUML, Microsoft Visio

Constraint + Query Language

Object Constraint Language (OCL)

http://swiki.cc.gatech.edu:8080/Curator/46

RDF/OWL


What it is


“Semantic web” ontology language


Primary modeling constructs are properties and classes

Conceptual implementation language (not low level like XML)


RDF


Resource Description Framework


Based on {subject, predicate, object} triples


OWL


Web Ontology Language (2.0 coming soon!)


Strong theoretical basis in Description Logics


Source


W3C standard

RDF/OWL


Motivations


Now a widely accepted standard


Simple data model, but OWL still allows complex class descriptions

Very “web friendly” for use with external systems, semantic mediation, URIs, XML format for interchange

“Non-experts” can build an ontology using Protégé


Architectural considerations: faceted search interface


Tool support


Protégé


Sesame Triple Store, Jena Java API


Example RDF Statements

“Balaji works at GFDL.”

{Balaji, worksAt, GFDL}

{Curator meeting, hasLocation, GFDL}

{Curator meeting, starts, “18 Oct 2007”}

{Curator meeting, ends, “19 Oct 2007”}

RDF XML Representation

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:esc="http://www.earthsystemcurator.org">

  <rdf:Description rdf:about="http://....#OctCuratorMeeting">
    <esc:hasLocation rdf:resource="http://....#GFDL"/>
    <esc:starts>18 Oct 2007</esc:starts>
    <esc:ends>19 Oct 2007</esc:ends>
  </rdf:Description>

  <rdf:Description rdf:about="http://....#Balaji">
    <esc:worksAt rdf:resource="http://....#GFDL"/>
  </rdf:Description>

</rdf:RDF>
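
To make the link between the XML serialization and the {subject, predicate, object} triples concrete, here is a minimal sketch in Python that reads a document shaped like the one above and lists its triples. It uses only the standard library and handles just this simple `rdf:Description` pattern; a real application would more likely rely on an RDF toolkit such as Jena. The base URI `http://example.org/curator` is a placeholder assumption, since the slide elides the real one, and the helper names `local_name` and `extract_triples` are illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical base URI: the slide elides the real one as "http://....".
BASE = "http://example.org/curator"

RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}"
         xmlns:esc="http://www.earthsystemcurator.org">
  <rdf:Description rdf:about="{BASE}#OctCuratorMeeting">
    <esc:hasLocation rdf:resource="{BASE}#GFDL"/>
    <esc:starts>18 Oct 2007</esc:starts>
    <esc:ends>19 Oct 2007</esc:ends>
  </rdf:Description>
  <rdf:Description rdf:about="{BASE}#Balaji">
    <esc:worksAt rdf:resource="{BASE}#GFDL"/>
  </rdf:Description>
</rdf:RDF>"""


def local_name(text):
    """Keep only the part after a '}' (ElementTree namespace) or '#' (URI fragment)."""
    return text.rsplit("}", 1)[-1].rsplit("#", 1)[-1]


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples from simple rdf:Description blocks."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = local_name(desc.get(f"{{{RDF_NS}}}about"))
        for prop in desc:
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # A property either points at another resource or holds a literal.
            obj = local_name(resource) if resource is not None else prop.text
            triples.append((subject, local_name(prop.tag), obj))
    return triples


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

Each `rdf:Description` element supplies the subject, each child element the predicate, and the `rdf:resource` attribute or element text the object.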

ESG Ontology with Curator Extensions

Protégé 4 beta: http://protege.stanford.edu/download/registered.html#p4

Updated Pizza Tutorial (HIGHLY RECOMMENDED)

http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial-p4.0.pdf

XML/XML Schema


What it is


Very widely accepted format for communication between applications, tag-based markup


Source


W3C Standards


Motivations


A standard implementation that modeling groups can adhere to (most will not be comfortable with RDF/OWL)

Can be output by modeling frameworks such as ESMF

“Use profiles” are small chunks of XML for specific purposes (part of the egg white?)


XML/XML Schema


Tool support


XMLSpy, oXygen, Notepad...


Query languages


XQuery, XPath


XSLT for transforming XML to other formats
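
As a small illustration of path-based querying, the sketch below runs two XPath-style expressions over a metadata fragment using Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath. The fragment is an abridged, illustrative copy of the kind of component metadata shown in the ESMF example elsewhere in these slides; it is not a complete or authoritative schema.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment modeled on the ESMF component metadata example.
XML = """<model_component name="Finite Volume Dynamical Core">
  <discipline_set>
    <discipline name="Atmosphere" />
  </discipline_set>
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
  </variable_set>
</model_component>"""

root = ET.fromstring(XML)

# Path expression: every <variable> element anywhere below the root.
shortnames = [v.get("shortname") for v in root.findall(".//variable")]

# Attribute predicate: the variable whose shortname is DUDT.
dudt = root.find(".//variable[@shortname='DUDT']")

print(shortnames)
print(dudt.get("longname"), "|", dudt.get("units"))
```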


SQL


Relational Databases (RDBMS)


ANSI standard


Motivations


Very mature technology


RDF/OWL and XML are likely NOT good solutions for long term storage


Fast querying


Large scale metadata storage
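
One way to picture relational storage of component metadata is sketched below with Python's built-in `sqlite3` module and an in-memory database. The two-table layout (`component`, `property`), the column names, and the `find_components` helper are assumptions made for illustration; they are not the schema used by Curator or ESG.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per component, one row per (facet, value) pair.
conn.executescript("""
CREATE TABLE component (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE property (
    component_id INTEGER NOT NULL REFERENCES component(id),
    facet        TEXT NOT NULL,
    value        TEXT NOT NULL
);
CREATE INDEX property_facet_value ON property (facet, value);
""")

cur = conn.execute(
    "INSERT INTO component (name) VALUES (?)",
    ("Finite Volume Dynamical Core",),
)
component_id = cur.lastrowid
conn.executemany(
    "INSERT INTO property (component_id, facet, value) VALUES (?, ?, ?)",
    [
        (component_id, "discipline", "Atmosphere"),
        (component_id, "agency", "NASA"),
        (component_id, "coding_language", "Fortran 90"),
    ],
)


def find_components(facet, value):
    """Return the names of components that carry the given facet value."""
    rows = conn.execute(
        "SELECT c.name FROM component c "
        "JOIN property p ON p.component_id = c.id "
        "WHERE p.facet = ? AND p.value = ? ORDER BY c.name",
        (facet, value),
    )
    return [name for (name,) in rows]


print(find_components("agency", "NASA"))
```

The index on `(facet, value)` is what lets this kind of lookup stay fast as the number of stored components grows.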

Representation Issues/Considerations


What kinds of constraints do we need to precisely model the domain?

Structural constraints vs. dynamic constraints

What kinds of reasoning and query capabilities do the applications require?

What role will the metamodel play?

How do you keep consistency among several representations/notations?

What is the role of auto-generation?

Putting it all together...


A prototype application developed this summer at NCAR in collaboration with ESMF and ESG:

ESMF modeling components become “self-describing”

Metadata is exported from an ESMF component in a standardized XML format (multiple conventions allowed)

The XML is ingested into ESG and exposed to the portal for users to search

Metadata Lifecycle

1. ESMF component exports XML metadata

2. The XML is validated and harvested into a Java object representation

3. The Java objects are persisted to a relational database (RDBMS)

4. Metadata in the RDBMS is then harvested into RDF, a Semantic Web ontology language

5. The RDF is accessed by the ESG web portal for faceted search of the metadata
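
The five steps above can be sketched end to end in a few lines. This is a toy Python stand-in, not the prototype itself, which uses Java objects and the ESG portal: the validation is a bare root-element check rather than schema validation, the single `property` table is a made-up schema, and the "triples" are plain tuples rather than real RDF. All function and class names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter
from dataclasses import dataclass, field

# Step 1: XML as a component might export it (abridged, illustrative).
EXPORTED_XML = """<model_component name="Finite Volume Dynamical Core">
  <discipline_set><discipline name="Atmosphere" /></discipline_set>
  <agency_set><agency name="NASA" /></agency_set>
  <coding_language_set><coding_language name="Fortran 90" /></coding_language_set>
</model_component>"""


@dataclass
class Component:
    name: str
    properties: list = field(default_factory=list)  # (facet, value) pairs


def harvest(xml_text):
    """Step 2: check the root element, then build an object from the XML."""
    root = ET.fromstring(xml_text)
    if root.tag != "model_component" or not root.get("name"):
        raise ValueError("expected a named <model_component> root element")
    comp = Component(root.get("name"))
    for group in root:        # e.g. <agency_set>
        for item in group:    # e.g. <agency name="NASA" />
            comp.properties.append((item.tag, item.get("name")))
    return comp


def persist(conn, comp):
    """Step 3: store the object in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS property (component TEXT, facet TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO property VALUES (?, ?, ?)",
        [(comp.name, facet, value) for facet, value in comp.properties],
    )


def to_triples(conn):
    """Step 4: read rows back as (subject, predicate, object) tuples."""
    return list(
        conn.execute("SELECT component, facet, value FROM property ORDER BY rowid")
    )


def facet_counts(triples):
    """Step 5: count components per facet value, as a faceted search would."""
    return Counter((predicate, obj) for _, predicate, obj in triples)


conn = sqlite3.connect(":memory:")
persist(conn, harvest(EXPORTED_XML))
triples = to_triples(conn)
print(triples)
print(facet_counts(triples))
```

Keeping each stage as its own function mirrors the lifecycle: the relational store is the long-term home of the metadata, while the triples and facet counts are derived views that can be regenerated from it.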



ESMF XML Output (example)

<model_component name="Finite Volume Dynamical Core">
  <discipline_set>
    <discipline name="Atmosphere" />
  </discipline_set>
  <physical_domain_set>
    <physical_domain name="Earth system" />
  </physical_domain_set>
  <agency_set>
    <agency name="NASA" />
  </agency_set>
  <institution_set>
    <institution name="Global Modeling and Assimilation Office (GMAO)" />
  </institution_set>
  ……

Viewed as a simple “use-profile”

ESMF XML Output (example)


  <author_set>
    <author name="Max Suarez" />
  </author_set>
  <coding_language_set>
    <coding_language name="Fortran 90" />
  </coding_language_set>
  <model_component_framework_set>
    <model_component_framework name="ESMF (Earth System Modeling Framework)" />
  </model_component_framework_set>
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
    ……
  </variable_set>
</model_component>

ESG Prototype Data Portal

Faceted search

Harvested component

Demo of Dycore Portal

http://dycore.ucar.edu/