SHARING WATER OBSERVATIONS DATA USING WEB SERVICES


Version 2.3

Nov 7, 2010



by:

David Maidment, Tim Whiteaker, James Seppi, Fernando Salas and Harish Sangireddy
Center for Research in Water Resources
The University of Texas at Austin

and

Ilya Zaslavsky and David Valentine
San Diego Supercomputer Center






Distribution

Copyright © 2010, Consortium of Universities for the Advancement of Hydrologic Science, Inc.

All rights reserved.

Acknowledgment

Funding for this document was provided by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) under National Science Foundation Grant No. EAR-0622374. In addition, much input and feedback has been received from the CUAHSI Hydrologic Information System development team. Their contribution is acknowledged here.

The authors also wish to acknowledge the significant contributions of Dean Djokic (ESRI) and David Briar (USGS) to this document.



Table of Contents

Services-Oriented Architecture for Water Observations Data
CUAHSI’s Very Large Scale Prototype
Implementation using OGC Services
Implementation using ArcGIS Services
Conclusions
Appendix 1: Water Observations Metadata Specification
Appendix 2: Version History of This Specification
Reference


Background

This document describes a procedure for cataloging water observations measurements in a way that is compatible with the aims of CUAHSI, a water science organization supporting academic research and education; the USGS, the nation’s largest producer of water science information; and ESRI, the makers of ArcGIS, a large publisher and consumer of geospatial information. It draws upon the several years of experience that the CUAHSI Hydrologic Information System project has had in interacting with the USGS and ESRI to produce, document and consume web services for water observations data. It is envisaged that the water observations metadata structure specified in this document, after review and amendment by interaction with CUAHSI’s water data partner organizations, will be proposed for adoption by water research centers and water agencies throughout the United States.

This document is prepared by authors from CUAHSI in consultation with personnel from the USGS and ESRI, but its contents have not been formally reviewed by the USGS or ESRI, and formal endorsement of its contents by those organizations should not be assumed. Likewise, the structure proposed here assumes implementation using the standards of the Open Geospatial Consortium (OGC), but this document has not been formally reviewed by the OGC and its endorsement should not be assumed.



SERVICES-ORIENTED ARCHITECTURE FOR WATER OBSERVATIONS DATA


Water observations data are series or sets of time-indexed values collected at gages and sampling sites concerning the quantity and quality of water, or the characteristics of weather and climate that influence water conditions. Such data are collected by many water agencies and also by water scientists. Three main types of data are involved:

1. Physical hydrology, groundwater levels, weather and climate, or water quality data collected continuously through time using gages or automated samplers;

2. Water quality data for surface water or groundwater collected intermittently at sampling sites and analyzed later in a laboratory;

3. Groundwater levels collected intermittently at wells.

The first category, containing continuous time series data, has the smallest number of spatial locations but the greatest number of data values associated with each location. The second category, containing water quality data analyzed in a laboratory, has a much larger number of spatial locations than the first but fewer data values associated with each location. These data also have many values arising from laboratory analysis of a single water sample. Groundwater levels have the largest number of locations (more than 1 million in the USGS groundwater database) but sparse measurements in time, because groundwater levels do not vary as quickly as surface water conditions do.


A services-oriented architecture for water observations data consists of three types of services:

• Catalog Services: which list water web services that can supply particular types of water data over particular geographic regions;

• Metadata Services: which identify collections or series of data associated with particular spatial locations that can be depicted on maps;

• Data Services: which convey the values of the water observations data through time, and can be depicted in graphs.

Thus, using these three services, a water agency or a water research center producing water observations data can publish a catalog of services that describe and allow web services access to its water observations data distributed through space and over time for a geographic region. This collection of water data services can be referred to collectively as a water data Server. There are two types of water data servers:



• Water research data servers are maintained by water research centers in universities to describe data collected in their research projects, and by organizations such as the USDA Agricultural Research Service to present the data collected at its experimental watersheds. These servers typically deal with local observation sites at the scale of a field or a small watershed.

• Water monitoring data servers are maintained by public water agencies that collect large amounts of data to support their water management responsibilities. These typically deal with a larger region, such as a river basin, an aquifer, a state, a city, a county, a water district, or even the entire extent of the United States, as is the case for the observations data maintained by some agencies of the US government, in particular the US Geological Survey, the EPA, and the National Climatic Data Center (NCDC). The NCDC weather and climate data set also includes observations covering the whole earth.



A User of water data, such as a water scientist or student, wants to be able to access the water data from these various servers seamlessly, that is, to simply search for water data on the internet, identify those servers producing information of the required type, obtain metadata about the particular collection at each server, and download observations data meeting the user’s needs. The capacity to do such a seamless search implies the existence of a Portal, that is, a facility which aggregates the catalogs of individual water data servers and permits search across their collections of information, rather as Google, Yahoo or Bing support searches for text-based information on the internet. Thus, we can consider a triangle of interaction among users, portals and servers, as shown in Figure 1.





Figure 1. A water information triangle connecting users with portals and servers


There may be several portals that provide access to the same water observations data through different technical mechanisms. This document assumes that the standards of the Open Geospatial Consortium (OGC) are used, in which case the catalog, metadata and data services described earlier are represented, respectively, by:

• Catalog Services for the Web: an OGC standard which provides a description for each data service and a single URL address for accessing a collection of such services;

• Web Feature Service: an OGC standard which describes map features and their associated attributes, used in this instance to document observation locations and the attributes of collections of data measured at each location;

• WaterML: a prototype time series data service developed by CUAHSI and adopted by the USGS for publishing some of its data.

Thus, the water scientist or student searches the Catalog Services for the Web through a Portal to obtain a listing of Web Feature Services for water data covering their geographic area of interest, and filters these to select only those services whose information is desired. The water scientist or student then queries these services for collections of time series whose data are of the required kind in general, and filters the query results to select those series that specifically meet the user’s needs. Finally, the water scientist or student downloads the observations data into a local database using WaterML data services.



An alternative pattern of implementation is provided by ESRI’s ArcGIS.com, which is a portal for online GIS where map services published by ArcGIS Server are indexed with descriptive information about their content. Map services providing particular kinds of information are searched for in ArcGIS.com using a text query, and the results are displayed on the web overlaid on standard base maps. The time series information for each observation point can be obtained by querying each map feature. A variant of this implementation pattern is one in which a search is made in ArcGIS.com for a layer package containing the water observations metadata. This package is downloaded to ArcGIS Desktop, and the WaterML web services are then queried to ingest the time series data into an Arc Hydro time series data structure or an ArcGIS time-enabled feature layer.


There are thus several patterns of implementation of the water information triangle comprising Servers, Portals and Users, but the content of the information is the same in each case.

OPEN GEOSPATIAL CONSORTIUM


The Open Geospatial Consortium (OGC) is a group of about four hundred companies and agencies representing the geospatial information community throughout the world, which has developed the most widely adopted standards for conveying geospatial information through the internet. These include the Web Map Service (WMS), which conveys a map as an image; the Web Feature Service (WFS), which conveys a collection of geospatial features (points, lines or areas) with their geometry and attributes; and the Web Coverage Service (WCS), which conveys a set of data values arranged on a grid, such as for climate or weather information. The OGC has also developed a standard for indexing these services called Catalog Services for the Web (CSW), which is rather like a traditional card catalog at a library. The Catalog Services for the Web provides a single web address through which a user can discover a set of services published at that location, and search through their contents for particular kinds of information.


The OGC has also developed a model called Observations and Measurements for describing the properties of geospatial features observed using particular procedures. Within this framework, a set of standards called Sensor Web Enablement has been established for synthesizing information drawn from a heterogeneous collection of sensors (Jirka, Broering and Walkowski, 2010). Sensor Web Enablement has five basic functions: accessing, eventing, tasking, discovery and integration. Accessing data is accomplished by the Sensor Observation Service, which enables a user to request a dataset based on thematic, temporal and spatial filter criteria. The dataset is returned using a customization of the Observations and Measurements model for the data, and a language called SensorML for the metadata about the sensor. Eventing means the detection of particular patterns in the sensor information that may be anomalies or extreme conditions, and Alerting means triggering actions that result from these events. Discovery of sensor web services uses the Catalog Services for the Web standard. Integration of sensor information streams is accomplished by a project-specific set of procedures that are not coded into a standard.


CUAHSI’s services-oriented architecture has a similar goal to Sensor Web Enablement, but is focused on observations data stored in archives rather than being obtained directly from sensors, and it may thus be termed an observations archive web enablement. This task differs from Sensor Web Enablement in that there are no functions for eventing and alerting, and the key focus is on discovery, access and integration of observations information.



In September 2008, CUAHSI proposed to the OGC the establishment of a Hydrology Domain Working Group to explore the harmonization of its WaterML language with OGC standards. It was apparent that OGC had few members specializing in hydrology or water resources, so CUAHSI initiated contact with the World Meteorological Organization’s Commission for Hydrology, and subsequently the OGC and WMO signed an agreement to support joint development of data standards for Hydrology, Oceanography, Climatology and Meteorology. The OGC/WMO Joint Domain Working Group in Hydrology contains representatives from the United States, Australia, Germany, France, Italy, the Netherlands, and other countries, and it convenes in person every three months and by teleconference most weeks. It is anticipated that in December 2010 this group will propose to the OGC the establishment of a new standard, called WaterML2, for conveying water observations data through the internet; it is a narrowed form of the similar functions of the Sensor Observation Service. After a Request for Comments period, the OGC may consider this standard for adoption some time during 2011, and it may subsequently be considered for adoption by the WMO and used by the hydrologic surveys in the more than one hundred countries represented in the Commission for Hydrology. In this manner, the knowledge and insight arising from CUAHSI’s research are in the process of being permanently institutionalized as an international standards protocol for conveying water observations information through the internet, the first that has ever existed.


CUAHSI’S VERY LARGE SCALE PROTOTYPE


CUAHSI has constructed a very large scale prototype of the services-oriented architecture for water observations data. The portal, called HIS Central, is located at the San Diego Supercomputer Center. There are three patterns of implementation:

• ASCII file approach: a water research center or water scientist creates an ASCII file of observations data and metadata and conveys that to the San Diego Supercomputer Center, where the information is stored in the CUAHSI Observations Data Model and served using WaterOneFlow web services based on the WaterML language. This is the pattern used by the six NSF Critical Zone Observatories (CZO) to store their observations data at a central location, which they call CZO Central. This approach is convenient for water scientists who want to store, maintain, and display their own data however they choose, and who also want to have the data published at a central location so that it can more readily be synthesized and compared with comparable information measured elsewhere.

• HydroServer approach: a water research center or a water agency mounts a server locally that stores observations data in the Observations Data Model and publishes it through WaterOneFlow web services. This pattern is implemented at about a dozen universities in the US for publishing water research data, and at some water agencies, such as the Texas Water Development Board, which is in the process of applying CUAHSI technology to publish the main state level water databases in Texas. Another variant of this approach is being implemented at the University of Texas at Arlington, to which the National Weather Service’s West Gulf River Forecast Center has supplied all of its history from 1995 to the present of hourly and daily Multisensor Precipitation Estimate data (mainly derived from Nexrad measurements). These data are stored in the CUAHSI Observations Data Model and published as WaterOneFlow web services of precipitation time series indexed to points on a regular array mesh, like a set of virtual rain gages distributed over the landscape. The size of this precipitation database is 5 TB.

• Agency archive approach: a water agency has an existing water data archive and wishes to retain its current structure. In this instance, as exemplified by the US Geological Survey, the agency programs a customized WaterOneFlow web service to provide access to its water observations data in the WaterML language and supplies HIS Central with a data dump of its observations metadata formatted according to the tabular structure of the National Water Information System. A variant of this approach is used by the EPA to provide access to its STORET water quality data: this information is published as a different kind of web service called WQX, which CUAHSI translates into WaterML, and EPA periodically provides a dump of the entire STORET database to HIS Central, where it is reformulated into an internally defined CUAHSI metadata catalog.

Harvesting the data and metadata into HIS Central using the ASCII file approach, and harvesting the metadata into HIS Central using the HydroServer approach, is feasible for small observation networks of the order of a few dozen observation sites and up to a hundred different variables measured at those sites. Beyond that scale, harvesting the observations metadata published through the HydroServer becomes tedious and can take several days to accomplish for a large water quality dataset from a state agency. Periodically processing large data dumps from the USGS and EPA at HIS Central is very time consuming and manually laborious, and a solution is needed in which this process is streamlined or eliminated by having the agencies maintain and publish their own observations metadata.


In CUAHSI’s large scale prototype, the information flow for the HydroServer pattern of implementation is shown in Figure 2.

Figure 2. Implementation of the water information triangle in CUAHSI’s prototype services architecture


The observations data are stored in the CUAHSI Observations Data Model (ODM) (Horsburgh et al., 2008), implemented using the SQL Server relational database (http://hydroserver.codeplex.com/). A set of four query functions called WaterOneFlow web services delivers information from the ODM in the WaterML language:

• GetSites: returns a list of observation sites with identification numbers and latitude, longitude coordinates;

• GetSiteInfo: returns for each site a record for each observation series measured there that specifies the period of record, the variable measured and the number of values recorded;

• GetVariableInfo: returns details about the observed variable such as method and units of measurement;

• GetValues: returns an observation series comprising a sequence of (value, time) pairs that may be regular or irregular in time.
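
As an illustration of what a client does with a GetValues response, the following minimal sketch parses a small, hand-written XML fragment into (value, time) pairs. The fragment is a simplified assumption modeled on WaterML 1.1 (a `value` element carrying a `dateTime` attribute); real responses contain much more site and variable metadata, and the function name `parse_values` is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hand-written, simplified fragment in the style of a WaterML 1.1 GetValues
# response. This is an assumption for illustration, not a captured response.
SAMPLE_RESPONSE = """<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <values>
      <value dateTime="2010-09-08T00:00:00">112.0</value>
      <value dateTime="2010-09-08T00:15:00">118.5</value>
      <value dateTime="2010-09-08T01:00:00">131.0</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""

# Namespace prefix map used for the element search below.
WML_NS = {"wml": "http://www.cuahsi.org/waterML/1.1/"}


def parse_values(xml_text):
    """Return the observation series as a list of (value, time) pairs."""
    root = ET.fromstring(xml_text)
    pairs = []
    for element in root.findall(".//wml:value", WML_NS):
        time = datetime.fromisoformat(element.get("dateTime"))
        pairs.append((float(element.text), time))
    return pairs


series = parse_values(SAMPLE_RESPONSE)
```

Note that the sample series is irregular in time (15-minute and 45-minute gaps), which the pair representation handles without any special treatment.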

The WSDL address of the WaterOneFlow web services provided by the HydroServer is registered at HIS Central, and a metadata harvest is carried out whereby a list of sites is obtained using the GetSites function and, for each site, a list of series is obtained using the GetSiteInfo function. This information is added to a large Series Catalog relational database at HIS Central, which is built up incrementally as each new service is added. There are 57 such services at present.
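
The harvest loop just described can be sketched as follows. This is a sketch under stated assumptions, not the HIS Central implementation: the two callables stand in for a service's GetSites and GetSiteInfo functions, and the record fields (`site_code`, `value_count`, and so on) are hypothetical names chosen for illustration.

```python
def harvest_service(service_id, get_sites, get_site_info):
    """Build Series Catalog rows (one per series) for one registered service."""
    rows = []
    for site in get_sites():
        for series in get_site_info(site["site_code"]):
            rows.append({
                "service_id": service_id,
                "site_code": site["site_code"],
                "latitude": site["latitude"],
                "longitude": site["longitude"],
                "variable": series["variable"],
                "begin_date": series["begin_date"],
                "end_date": series["end_date"],
                "value_count": series["value_count"],
            })
    return rows


# In-memory stand-ins for the responses of one HydroServer (hypothetical data).
_SITES = [
    {"site_code": "A1", "latitude": 30.27, "longitude": -97.74},
    {"site_code": "B2", "latitude": 30.51, "longitude": -97.68},
]
_SERIES = {
    "A1": [
        {"variable": "Discharge", "begin_date": "2009-01-01",
         "end_date": "2010-09-14", "value_count": 59520},
        {"variable": "Gage height", "begin_date": "2009-01-01",
         "end_date": "2010-09-14", "value_count": 59520},
    ],
    "B2": [
        {"variable": "Precipitation", "begin_date": "1995-01-01",
         "end_date": "2010-09-14", "value_count": 137000},
    ],
}


def fake_get_sites():
    return list(_SITES)


def fake_get_site_info(site_code):
    return list(_SERIES[site_code])


# The catalog grows incrementally as each service is harvested.
series_catalog = []
series_catalog.extend(
    harvest_service("demo-service", fake_get_sites, fake_get_site_info)
)
```

Because one request is made per site, the cost of a harvest grows with the number of sites, which is consistent with the scaling difficulty noted above for large networks.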

Once the metadata harvest is completed, the variables are associated with concepts in the CUAHSI Hydrologic Ontology (http://his.cuahsi.org/ontologyfiles.html), which is a list of concepts drawn mainly from the EPA Substance Registry system that is used to link water quality data between the USGS and EPA (Figure 3). The association between variables in each water web service and concepts in the ontology, for all such services registered in the CUAHSI master catalog at HIS Central, achieves semantic mediation, or unifying of the meaning of the information across different data descriptions of the same underlying quantities.


Figure 3. CUAHSI Hydrologic Ontology of physical, chemical and biological concepts describing types of water
observations data.


A desktop Hydrologic Information System called HydroDesktop (http://www.hydrodesktop.org) performs searches on HIS Central using a custom-built web service, GetSeriesCatalogForBox, through a “Who, What, When, Where” query (Figure 4), whose component criteria are:

• Who: the organization supplying the information, the name of the web service, and internet references to where the service can be accessed in various forms (SOAP, REST, Map);

• What: the character of the information required, defined by the concept and sample medium (air, water, soil, tissue);

• When: the BeginDate and EndDate of the period of interest for historical data, or PreviousNDays for specifying a number of days before present for real-time information;

• Where: a latitude-longitude box, and a type of site within that (Surface water, Groundwater, Spring, Atmosphere).

This query function is supported by two information web services: GetWaterOneFlowServiceInfo, which gives a list of the currently accessible WaterOneFlow services at HIS Central and some information about them, and GetOntologyTree, which provides the current version of the Hydrologic Ontology used to tag the observations variables at HIS Central.
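
The logic of a “Who, What, When, Where” query can be sketched as a filter over catalog records. The sketch below runs against an in-memory list and does not reproduce the actual signature of GetSeriesCatalogForBox, which is not given here; the record type, field names and function name are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SeriesRecord:
    service: str        # Who: the web service supplying the series
    concept: str        # What: the ontology concept of the variable
    begin_date: date    # When: start of the period of record
    end_date: date      # When: end of the period of record
    latitude: float     # Where
    longitude: float    # Where


def search_catalog(records, services, concept, begin, end, box):
    """Filter records by who, what, when and where.

    box is (west, south, east, north) in decimal degrees.
    """
    west, south, east, north = box
    hits = []
    for record in records:
        in_box = (west <= record.longitude <= east
                  and south <= record.latitude <= north)
        # A series qualifies if its period of record overlaps the query period.
        overlaps = record.begin_date <= end and record.end_date >= begin
        if (record.service in services and record.concept == concept
                and in_box and overlaps):
            hits.append(record)
    return hits


records = [
    SeriesRecord("usgs", "Streamflow", date(2000, 1, 1), date(2010, 9, 14),
                 30.27, -97.74),
    SeriesRecord("usgs", "Streamflow", date(1980, 1, 1), date(1990, 1, 1),
                 30.30, -97.70),
    SeriesRecord("univ", "Precipitation", date(1995, 1, 1), date(2010, 9, 14),
                 30.28, -97.73),
    SeriesRecord("usgs", "Streamflow", date(2000, 1, 1), date(2010, 9, 14),
                 35.00, -101.00),
]
hits = search_catalog(records, {"usgs", "univ"}, "Streamflow",
                      date(2010, 9, 1), date(2010, 9, 14),
                      (-98.5, 29.5, -97.0, 31.0))
```

Only the first record satisfies all four criteria: the second falls outside the time period, the third has a different concept, and the fourth lies outside the box.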






The observations metadata for the resulting list of series is manually interpreted and filtered in HydroDesktop to eliminate series that are not of the required type or have too few values to be useful. HydroDesktop also filters the results for the latitude-longitude box so that results appear only within the shape selected by the user. The web services references in the observations metadata are used to download the observations data in the WaterML language from the HydroServer using the GetValues function shown in Figure 2. The data are stored in a local database within HydroDesktop, inspected using graphs, and then exported to another application such as Excel, ArcGIS, or Matlab, analyzed internally in HydroDesktop using the R statistical language, or used to run hydrologic models in HydroDesktop through the OpenMI interface (http://www.openmi.org).


Figure 4. Specifying an observations metadata search using “who, what, when, where” criteria


Observations series from a large water agency, such as the US Geological Survey, are obtained in HydroDesktop by the same process, except that the GetValues request received at HIS Central is redirected to the appropriate USGS WaterML web service to supply the corresponding observations data. The observations metadata for USGS series are transferred from USGS to HIS Central and queried there in the same way as for ODM-based observation networks. Thus, the current large scale CUAHSI prototype can be described as a centralized metadata, distributed data services-oriented architecture for water observations data.


To summarize, a search in HydroDesktop has two parts: a metadata search, where a map of observations metadata is obtained, and a data download, where the observations series are downloaded into HydroDesktop for selected observations metadata records. Figure 5 shows the HydroDesktop user interface for search and mapping.


A water observations theme is a collection of time series, described in HydroDesktop by:

• An ESRI Shape File that contains the observations metadata with one record for each time series;

• A set of relational database tables in a SQLite database that stores the observations series.
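
The split between per-series metadata and stored values can be sketched with Python's built-in sqlite3 module. This is a minimal sketch, not HydroDesktop's actual storage layout: the table and column names are hypothetical, and the metadata records are held in a plain list here rather than in a shape file.

```python
import sqlite3

# Observations metadata: one record per time series (a plain list stands in
# for the shape file; field names are hypothetical).
theme_metadata = [
    {"series_id": 1, "site": "A1", "variable": "Discharge",
     "latitude": 30.27, "longitude": -97.74},
]

# Observations series: the values themselves, keyed by series.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_values (series_id INTEGER, date_time TEXT, value REAL)"
)


def save_series(connection, series_id, pairs):
    """Store (date_time, value) pairs for one series."""
    connection.executemany(
        "INSERT INTO data_values VALUES (?, ?, ?)",
        [(series_id, date_time, value) for date_time, value in pairs],
    )
    connection.commit()


def load_series(connection, series_id):
    """Return the stored pairs for one series in time order."""
    cursor = connection.execute(
        "SELECT date_time, value FROM data_values "
        "WHERE series_id = ? ORDER BY date_time",
        (series_id,),
    )
    return cursor.fetchall()


save_series(conn, 1, [("2010-09-08T00:15:00", 118.5),
                      ("2010-09-08T00:00:00", 112.0)])
stored = load_series(conn, 1)
```

Keeping the metadata separate means a theme can be mapped and filtered without touching the (much larger) table of values.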

[Figure 4 annotations: Select Region (where); Select Time Period (Start, End); Select Keyword(s); Select Service(s); Filter Results; Save Theme]





Figure 5. Map, search, and time series graphing in HydroDesktop



IMPLEMENTATION USING OGC SERVICES


The very large scale CUAHSI prototype just described has been developed using custom-designed and built web services. It is reasonable to ask whether the same functions could be performed using existing web service standards of the Open Geospatial Consortium (OGC), augmented, where necessary, by adjustments to those standards, or development of new standards. This subject is discussed within the OGC/WMO Joint Hydrology Domain Working Group (http://external.opengis.org/twiki_public/bin/view/HydrologyDWG/WebHome).


Given the imminent consideration of the new OGC WaterML standard for downloading water observations series, an investigation has been made to see how a search for water observations metadata could be executed using existing OGC standards. A small search application has been constructed independently of HydroDesktop for the OGC services scheme proposed here, and it has been verified that this search application yields the same information as do the search functions within the CUAHSI large scale prototype executed from HydroDesktop. In this OGC implementation scheme, the water information triangle is represented by the functions shown in Figure 6.





Figure 6. Implementation of the water information triangle using OGC web service standards


In this implementation, HIS Central is replaced by a HydroCatalog, of which many could exist and could interact with one another, rather than having a single central observations metadata catalog. The workflow is that a new web service is registered in the HydroCatalog using the reference URL address of an OGC Catalog Services for the Web (CSW). If necessary, the observations metadata can be harvested from the HydroServer to the HydroCatalog, but that is not required. HydroDesktop would then perform three queries:

• CSW:GetRecords: supplies a list of web services in the geographic region of interest that may have information of the required type;

• WFS:GetFeature: supplies a feature class of observations metadata with one feature for each time series;

• GetValues:WaterML1.1: supplies observations data. This is to be replaced by SOS:GetObservations in WaterML 2.0, the newly proposed standard.

This OGC web services model has the merit that it can support both centralized and distributed metadata as well as centralized or distributed data, which satisfies all three implementation patterns found in the CUAHSI HIS project.
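
The first two of these queries can be sketched as key-value-pair HTTP requests. The sketch below only builds request URLs and makes no network calls. The endpoints, the feature type name and the text constraint are hypothetical; the parameter names are believed to follow the CSW 2.0.2 and WFS 1.1.0 key-value conventions but should be checked against the server in use, including the coordinate order expected for `bbox`.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoints for a HydroCatalog (CSW) and a HydroServer (WFS).
CSW_URL = "http://hydrocatalog.example.org/csw"
WFS_URL = "http://hydroserver.example.org/wfs"


def csw_get_records_url(keyword):
    """Build a CSW GetRecords request that searches catalog records by text."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText LIKE '%{keyword}%'",
    }
    return CSW_URL + "?" + urlencode(params)


def wfs_get_feature_url(type_name, box):
    """Build a WFS GetFeature request for features inside a bounding box."""
    west, south, east, north = box
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "bbox": f"{west},{south},{east},{north}",
    }
    return WFS_URL + "?" + urlencode(params)


catalog_url = csw_get_records_url("streamflow")
metadata_url = wfs_get_feature_url("observations_metadata",
                                   (-98.5, 29.5, -97.0, 31.0))

# Decode the query strings again to show what each request carries.
catalog_query = parse_qs(urlparse(catalog_url).query)
metadata_query = parse_qs(urlparse(metadata_url).query)
```

The third query, for the data values themselves, depends on the WaterML service being called and is therefore left out of this sketch.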

IMPLEMENTATION USING ARCGIS SERVICES

The water information triangle can also be implemented using ArcGIS services, as shown in Figure 7. The observations metadata is published as an ArcGIS map service, as a feature class with one record per time series in the theme or data service. A user searches ArcGIS.com for relevant information using key words, and opens the resulting services using a web browser or ArcGIS Desktop to display a map of observation site locations over a background map for spatial interpretation. A query on any of these site locations yields a time series through a REST query to a WaterML time series service.


[Figure 6 diagram labels: HydroCatalog; HydroServer; HydroDesktop. Register services and pass metadata with WFS:GetCapabilities. Search the catalog for services with CSW:GetRecords. Get the metadata with WFS:GetFeature. Get the data with GetValues (WaterML 1.1) or SOS:GetObservations (WaterML 2.0).]



Figure 7. Water Information Triangle implemented using ArcGIS services


For a brief application example, consider Tropical Storm Hermine, which hit Central Texas during the night of 8 September 2010. This storm caused extensive flooding and seven people drowned. The National Weather Service West Gulf River Forecast Center publishes its Nexrad radar rainfall information using the CUAHSI Observations Data Model and WaterML web services mounted at UT Arlington. The US Geological Survey has a real-time WaterML web service for its instantaneous streamflow information, which it operates around the clock with redundant servers and REST access to water time series. This USGS service covers only the past 120 days and provides 15-minute water data. To capture the flows permanently for study of Tropical Storm Hermine, the USGS instantaneous flow data for 1-14 September have been downloaded and republished at the Center for Research in Water Resources (CRWR) of the University of Texas at Austin from the CUAHSI ODM as a WaterML web service. The Capital Area Council of Governments (CAPCOG) is the regional emergency services coordinator for the 10 counties around Austin. Searching ArcGIS.com for “CAPCOG” will yield a map service whose USGS gage points can be queried to get the flow up to the most recent hour. Searching ArcGIS.com for “Hermine” will yield a map service published by CRWR for observations metadata that can be opened in a web browser or in ArcGIS Desktop, as shown in Figure 8. Querying on any point in this map will yield the corresponding time series data using a REST request to the appropriate WaterML web server, as shown in Figure 9.
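
A REST request of the kind issued when a map point is queried can be sketched as a URL carrying the site, the variable and the time window. The sketch below builds such a URL without sending it. The base address and parameter names are assumptions based on the USGS public instantaneous values service as it is generally documented today, which may differ from the service that was running in 2010; the site number is only an example and should be replaced by the gage of interest.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed base address of the USGS instantaneous values REST service.
USGS_IV_URL = "https://waterservices.usgs.gov/nwis/iv/"


def instantaneous_flow_url(site, start, end):
    """Build a request for streamflow at one site over a date range."""
    params = {
        "format": "waterml,1.1",   # ask for a WaterML 1.1 response
        "sites": site,             # USGS site number
        "parameterCd": "00060",    # USGS parameter code for discharge
        "startDT": start,          # ISO dates bounding the request
        "endDT": end,
    }
    return USGS_IV_URL + "?" + urlencode(params)


# Example: the 1-14 September 2010 window discussed above. A live service
# limited to recent data would not return a period this old.
url = instantaneous_flow_url("08158000", "2010-09-01", "2010-09-14")
query = parse_qs(urlparse(url).query)
```

Republishing the downloaded series from a local ODM, as CRWR did for Hermine, is what keeps such a request answerable after the source service's retention window has passed.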


[Figure 7 diagram labels: ArcGIS.com; ArcGIS Server; Web browser; ArcGIS Desktop. Register ArcGIS Map Services. Search ArcGIS.com for type of information using keywords. Get the metadata with ArcGIS map services or layer packages. Get the data with GetValues (WaterML 1.1) or SOS:GetObservations (WaterML 2.0) REST services.]



Figure 8. Map services for rainfall and streamflow observations metadata from Tropical Storm Hermine in ArcGIS.com



Figure 9. Time series of streamflow delivered by USGS in WaterML using a REST service called by querying on a point in the map service shown in Figure 8.




Whether implemented as OGC services or ArcGIS services, a logical model is needed to specify the required observations metadata attributes. These are presented in Appendix 1, and the history of the development of this specification to date is given in Appendix 2.


CONCLUSIONS

A services-oriented architecture for water observations data in the United States has been developed and tested using a very large scale prototype by the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI). The insights gained from building this prototype have revealed that equivalent functions can be performed by a logical model that consists of catalog services, metadata services, and data services. This logical model can be implemented in different physical models, such as the OGC Web Services or ArcGIS.com map services. To achieve semantic mediation, or unifying of concepts in this information set, the observations variables in all data services must be associated with concepts in a common ontology, which CUAHSI has established, largely by building on existing work on semantic mediation between the USGS and EPA over the past several years. To achieve syntactic mediation, or uniformity of data format, the metadata and data responses must use a common specification. For water observations metadata, this is a standardized set of map attributes in which each observations series forms one feature symbolized by its point location in space. For the water data, it is a formally specified water markup language, WaterML.
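As a rough illustration of semantic mediation, the sketch below associates variable codes from different services with a shared concept and then groups series by that concept. The lookup table is a hypothetical stand-in for the CUAHSI ontology: the first two entries reuse example codes and concept names from Appendix 1, and the `ExampleNet` provider is invented for illustration.

```python
from collections import defaultdict

# Hypothetical lookup from (vocabulary prefix, variable code) to a shared
# concept. A real system would query the common ontology instead.
CONCEPT_BY_VARIABLE = {
    ("CCBay", "DOC"): "DissolvedOxygen",
    ("NWISDV", "00065"): "Stream Stage",
    ("ExampleNet", "DO_mgL"): "DissolvedOxygen",  # invented provider and code
}


def group_by_concept(series_keys):
    """Group (vocabulary, variable code) pairs under their shared concept."""
    groups = defaultdict(list)
    for vocabulary, var_code in series_keys:
        concept = CONCEPT_BY_VARIABLE.get((vocabulary, var_code))
        if concept is not None:  # unmapped variables are simply skipped here
            groups[concept].append((vocabulary, var_code))
    return dict(groups)


groups = group_by_concept([
    ("CCBay", "DOC"),
    ("ExampleNet", "DO_mgL"),
    ("NWISDV", "00065"),
])
print(groups["DissolvedOxygen"])
```

Once two providers' codes resolve to the same concept, a search on that concept can return series from both without the user knowing either provider's coding scheme.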


A new version of WaterML is being proposed for consideration as an OGC Standard in December, 2010, and after a period of comment and adjustment, will be considered for adoption as an OGC standard some time in 2011. This paper shows how a set of best practices using OGC web services or ArcGIS map services can be used to implement this services-oriented architecture for water observations data.





APPENDIX 1: WATER OBSERVATIONS METADATA SPECIFICATION


The specification that follows is for the fields of a feature class that contains one record for each time series. Required fields are written in bold text. The remaining fields are there to help you assess the time series before actually requesting the data. For example, you might only want to download time series for observations recorded using a specific measurement method. In order to get a better sense of the data, you may also want to perform some quick summaries on the fields in the observations metadata table to identify all of the unique methods used or to determine all of the unique site locations (handy when a site measures more than one variable and thus appears in the table more than once).
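The quick summaries mentioned above can be sketched in a few lines. The records below are hypothetical rows of an observations metadata table, using field names from this specification. The second row, with variable code `TEMP`, is invented to show a site that appears more than once.

```python
# Hypothetical rows of an observations metadata table (one row per series).
records = [
    {"ServCode": "CCBay", "SiteCode": "H1", "VarCode": "DOC",
     "Method": "Multiprobe measurement"},
    {"ServCode": "CCBay", "SiteCode": "H1", "VarCode": "TEMP",  # invented row
     "Method": "Multiprobe measurement"},
    {"ServCode": "NWISDV", "SiteCode": "USGS:02289050", "VarCode": "00065",
     "Method": None},
]


def unique_methods(rows):
    """Return the distinct, non-empty measurement methods, sorted."""
    return sorted({row["Method"] for row in rows if row.get("Method")})


def unique_sites(rows):
    """Return the distinct (ServCode, SiteCode) pairs, sorted."""
    return sorted({(row["ServCode"], row["SiteCode"]) for row in rows})


print(unique_methods(records))
print(unique_sites(records))
```

Sites are keyed on the pair of service code and site code, since a site code is only unique within its own service.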


Table 1. Observations Metadata: Core Fields

| Field Name (Field Type) | Definition | Examples (incl. USGS) |
| --- | --- | --- |
| RecordType | The type of information record. There are presently three: ObservationsCore, and two extensions for Generic ODM services and DataSet Services (Modis and DayMet) | Observations Brief; Observations ODM; Observations WOFDataSet |
| OrgHier | Organizational hierarchy, used for defining a hierarchy of services. Use the Java inverted domain | Gov.USGS; EDU.Texas; Gov.USGS |
| ServCode (Text-50) | Network prefix for site codes used by the WaterOneFlow service, giving the context within which the site code applies | CCBay; NWISDV; NWISDV |
| SiteCode (Text-50) | Unique text identifier for a site within a given WaterOneFlow service. For the USGS and EPA, an agency code is combined with the site number | H1; USGS:02289050; FL005:02289050 (different sites) |
| SiteName (Text-255) | Name of a site | Hypoxia_1 |
| SiteType | Site type, as defined by both the USGS and the EPA | Surface Water, Ground |
| Latitude (Double) | Latitude of the site location in decimal degrees (WGS_1984); for polygons can be NULL | 27.814 |
| Longitude (Double) | Longitude of the site location in decimal degrees (WGS_1984); for polygons can be NULL | -97.141 |
| Elevation | Elevation of site with units. This is needed for variables where the parameter is observed referenced to the ground surface | 1500 m; 37.9 ft |
| VarCode (Text-50) | Unique text identifier for a variable within a given WaterOneFlow service. Data providers should create distinct codes | DOC; 00065; 00060:00003 |
| VarName (Text-255) | Name of a variable | Dissolved Oxygen Concentration; Stream water level elevation above NAVD 1988, in feet, Upstream |
| VarUnits (Text-50) | Units of measure for the variable. Encouraged, but optional, since some providers for analytical chemistry may wish for all information | milligrams per liter; ft |



| Field Name (Field Type) | Definition | Examples (incl. USGS) |
| --- | --- | --- |
| DataType (Text-50) | Type of data. See the CUAHSI Controlled Vocabulary. If needed, request an additional DataType | Value; Average; Maximum; Minimum; StandardDeviation |
| Medium (Text-50) | Medium in which the variable applies. See the CUAHSI Controlled Vocabulary. If needed, request an additional DataType | Surface Water |
| Vocabulary (Text-50) | Vocabulary prefix for variable codes, giving the context within which the code applies | CCBay; NWISDV; NWISIID |
| Ontology (Text-50) | Unique name for the ontology containing the concept to which the given variable has been mapped. USGS and EPA are under mandate to create and use an SRS ontology | CUAHSI Variable Ontology v1.26; SRS |
| Concept (Text-50) | Leaf concept keyword from the ontology to which this variable applies | DissolvedOxygen; Stream Stage |
| SerStatus | Series status: Active, Inactive, or Sporadic. If Inactive, StartDate and EndDate are populated. If Active and all data is available, EndDate should be null. If Active and data is available for a limited time, both StartDate and EndDate are null | Inactive; Active |
| DataAvail | Limited data availability. Populated if the series information is only available for a limited period of time, e.g., 120 days | P120D |
| IsRegular (ShortInt) | 1 (TRUE) if the variable is measured/calculated regularly in time; 0 (FALSE) otherwise | 0; 1 |
| TimeStep | For regular data, the time step and time units give the length of time between measurements, e.g., 1 day, 6.5 hrs, 1 month. Represent as an ISO duration (e.g., P1D). An estimate is acceptable; USGS does not know the normal sampling interval | P1D (one day); P1M (one month); PT12H (12 hours); PT15M (15 minutes); P1D; PT15M |
| StartDate (Date) | Start date and time for the time period of the variable at the site. If data is available for a limited time, StartDate will be null or empty, and the value should be calculated as Now minus the DataAvail | 5/3/94 8:40 AM |
| EndDate (Date) | End date and time for the time period of the variable at the site. If the site is active, then this will be null or empty. The value of Now or LastUpdated is appropriate | 8/31/06 11:26 AM |
| ValueCount (LongInt) | Number of time series values for the variable at the site for the given time period | 270 |
| LastUpdated | Date the DataCart was last updated. This allows querying the WFS for when a record was last updated | 2010-11-03; 2010-11-03 |
| MaxRecords | Maximum number of records returned. If a service wishes to limit the number of data values returned, it should indicate so by populating this value. Zero or empty/null = unlimited. Optional | Null; null |



| Field Name (Field Type) | Definition | Examples (incl. USGS) |
| --- | --- | --- |
| ServType | Service protocol type: REST or SOAP | SOAP; REST |
| Location (Text-255) | Properly formatted location parameter to pass to a WaterOneFlow.GetValues SOAP request | CCBay:Hypoxia_1; GEOM:BOX(-97.141 27.814,-93.5 30.2); GEOM:POINT(-97.141 27.814); NWISDV:FL005:02289050; NWISDV:USGS:02289050 (future) |
| Variable (Text-255) | Properly formatted variable parameter to pass to a WaterOneFlow.GetValues SOAP request | CCBay:DOC; NWISDV:00060/DataType=Maximum; NWISDV:00060:0003; NWISUV:00060 |
| ReqsAuth (ShortInt) | Request authorization. 1 (TRUE) if authorization for download is required; 0 (FALSE) otherwise | 0; 0 |
| WaterMLURI (Text-255) | URI of the WaterOneFlow service WSDL or REST endpoint. For a REST service this will be a complete URL with tokens to indicate where the start and end times can be substituted: {time:start} {time:end} | http://data.com/WOF/…/cuahsi_1_0.asmx?WSDL; http://waterservices.usgs.gov/nwis/iv?sites=01646500&startDT={time:start}&endDT={time:end}&parameterCd=00060 |
| WofVersion (Text) | Version of the WaterOneFlow service | 1.1; 1.1 |
| SitesWFSURI (Text-255) | GetCapabilities URI of the web feature service showing site locations | http://data.com/WFSServer?service=WFS&request=GetCapabilities |
| SiteWMSURI (Text-255) | URI of the web mapping service related to the data | http://data.com/WMSServ?service=WMS&request=GetCapabilities |
| DCrtWFSUri (Text-255) | GetCapabilities URI of the WFS Data Cart Service that contains the series | http://data.com/DataServi?service=WFS&request=GetCapabilities |
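Two rules in Table 1 lend themselves to a short sketch: a null StartDate is computed as Now minus DataAvail, and a REST WaterMLURI carries `{time:start}` and `{time:end}` tokens to be substituted. The code below is a minimal sketch under stated assumptions. The duration parser handles only day, hour, and minute durations (not months such as P1M), the date format passed to the service is assumed, and the function names are hypothetical. The template is modelled on the USGS REST example given for WaterMLURI.

```python
import re
from datetime import datetime, timedelta

# Matches day/hour/minute durations only (e.g. P120D, P1D, PT12H, PT15M).
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?)?$")


def parse_duration(text):
    """Convert a simple ISO 8601 duration string to a timedelta."""
    match = _DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes = (int(part) if part else 0 for part in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)


def series_window(start_date, end_date, data_avail, now):
    """Resolve null StartDate/EndDate following the Table 1 definitions."""
    end = end_date if end_date is not None else now
    if start_date is None and data_avail:
        start = now - parse_duration(data_avail)  # Now minus DataAvail
    else:
        start = start_date
    return start, end


def fill_tokens(template, start, end, fmt="%Y-%m-%d"):
    """Substitute the {time:start} and {time:end} tokens in a REST template."""
    return (template
            .replace("{time:start}", start.strftime(fmt))
            .replace("{time:end}", end.strftime(fmt)))


TEMPLATE = ("http://waterservices.usgs.gov/nwis/iv?sites=01646500"
            "&startDT={time:start}&endDT={time:end}&parameterCd=00060")

now = datetime(2010, 11, 7)  # fixed "Now" so the example is repeatable
start, end = series_window(None, None, "P120D", now)
url = fill_tokens(TEMPLATE, start, end)
print(url)
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the window calculation easy to test.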







Table 2. Observations Metadata: Additional Fields for Services from CUAHSI ODM

| Field Name (Field Type) | Definition | Examples |
| --- | --- | --- |
| MethodID | Unique ID within a WaterOneFlow service for the method used to measure the variable | 1 |
| Method | Description of the method used to measure the variable | Multiprobe measurement |
| QCLevelID | Unique ID within a WaterOneFlow service for the quality control level of the time series | 0 |
| QCLevel | Description of the quality control level of the time series | Raw Data |
| SourceID | Unique ID within a WaterOneFlow service for the original source of the data | 9 |
| SourceName (Text-255) | Name of the original source of the data | Texas A&M University Corpus Christi |


Table 3. Additional Fields for DataCartWOFDataSet Record Type

| Field Name (Field Type) | Definition | Examples |
| --- | --- | --- |
| LocType (Text-255) | Type of service; indicates how the Location parameter of a WaterOneFlow.GetValues call should be formatted | LatLongBox; LatLongPoint |
| XLL (Double) | For point data, longitude of the point. For data defined by a lat/lon box, western longitude of the box | -97.141 |
| YLL (Double) | For point data, latitude of the point. For data defined by a lat/lon box, southern latitude of the box | 27.814 |
| XUR (Double) | For data defined by a lat/lon box, eastern longitude of the box; otherwise can be NULL | -93.5 |
| YUR (Double) | For data defined by a lat/lon box, northern latitude of the box; otherwise can be NULL | 30.2 |
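The LocType, XLL, YLL, XUR, and YUR fields together determine the Location string for a GetValues call. The sketch below assumes the `GEOM:POINT(x y)` and `GEOM:BOX(xll yll,xur yur)` patterns that appear in the Location examples of Table 1. The function name is hypothetical, and a real client should confirm the expected syntax against the service it calls.

```python
def format_location(loc_type, xll, yll, xur=None, yur=None):
    """Build a Location parameter string from the Table 3 fields."""
    if loc_type == "LatLongPoint":
        # For point data, XLL/YLL hold the point's longitude and latitude.
        return f"GEOM:POINT({xll} {yll})"
    if loc_type == "LatLongBox":
        if xur is None or yur is None:
            raise ValueError("a box needs XUR and YUR as well as XLL and YLL")
        # Lower-left corner first, then upper-right corner.
        return f"GEOM:BOX({xll} {yll},{xur} {yur})"
    raise ValueError(f"unknown LocType: {loc_type!r}")


print(format_location("LatLongPoint", -97.141, 27.814))
print(format_location("LatLongBox", -97.141, 27.814, -93.5, 30.2))
```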






APPENDIX 2: VERSION HISTORY OF THIS SPECIFICATION

April, 2009: “Thematic Dataset Tables” designed by HIS team.

July, 2009: Data Cart design generated from Thematic Dataset Tables by UT-Austin/ESRI.

March, 2010: Data Cart design refined by HIS team and ESRI.

May 19, 2010: After discussion on the HIS call, Location and Variable were added back in, along with a field that indicates the version of the WaterOneFlow service. We currently need Location and Variable to handle complex situations like NWIS, which has several “overloads” for those two fields.

June 2, 2010: Location and Variable were made “optional” fields. While these fields may be provided by the cart originator, clients are expected to be able to reassemble these fields by reading data from other fields in the cart. This is necessary if the user wants to amend the GetValues request for a time series, or if the syntax for WaterOneFlow web service method requests changes with the release of subsequent versions of WaterOneFlow or the WaterML specification.

November 1, 2010: A meeting of CUAHSI, ESRI and USGS personnel was held at the San Diego Supercomputer Center to reconsider the data cart specification in the context of its inclusion in a services-oriented architecture for water data in the United States based on standards of the Open Geospatial Consortium. This amended the observations metadata structure significantly and resulted in this version being labeled version 2.0 of the observations metadata design, now called the CUAHSI Water Observations Metadata Specification.

November 3, 2010: Revisions made in consultation with David Briar (USGS), after a conference with David Maidment, Dean Djokic, and SDSC. Split into a brief specification, an ODM specification, and a DataService specification. Added RecordType to allow for different types of cart records. Added OrgHier to allow for the definition of service hierarchies. Added SiteType and Elevation to site information. Added fields (SerStatus, DataAvail) to manage the status and data availability of a time series. Dropped TimeUnits; TimeStep is specified as an ISO TimeDuration, which incorporates a time unit.
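The June 2, 2010 entry expects clients to reassemble Location and Variable from other fields. A minimal sketch of one way to do that is given below. It assumes the simple `ServCode:SiteCode` and `Vocabulary:VarCode` patterns suggested by the Table 1 examples. The function names are hypothetical, and the NWIS “overloads” mentioned in the May 19, 2010 entry would need extra handling not shown here.

```python
def build_location(record):
    """Assumed pattern: network prefix, a colon, then the site code."""
    return f"{record['ServCode']}:{record['SiteCode']}"


def build_variable(record):
    """Assumed pattern: vocabulary prefix, a colon, then the variable code."""
    return f"{record['Vocabulary']}:{record['VarCode']}"


# Field values taken from the USGS examples in Table 1.
record = {
    "ServCode": "NWISDV",
    "SiteCode": "USGS:02289050",
    "Vocabulary": "NWISDV",
    "VarCode": "00060:00003",
}

print(build_location(record))
print(build_variable(record))
```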

