Statistical Support and Web Development for a Web-based Master Sample Management System for Integrating Aquatic Ecosystem Status and Trend Monitoring



Proposal submitted to
Bonneville Power Administration
Pacific Northwest Aquatic Monitoring Partnership
Integrated Status and Trend Monitoring Workgroup

By
Statistics Department
Oregon State University

Principal Investigator
Don L. Stevens, Jr.







Background:

Monitoring agencies throughout the Northwest are increasingly adopting the principles of survey sampling to design stream monitoring networks to track the status and trends in resource condition (stream habitat and chemistry, biota, riparian condition) for biological assessments or effectiveness of strategies. In survey sampling, sites are selected from a representation of the relevant stream networks (e.g., digital hydrographic traces) by incorporating randomization in the site selection process. Several algorithms have been developed that allow users to select sites that meet their design requirements. One algorithm, the generalized random-tessellation stratified (GRTS) design, is increasingly being used to generate a spatially-balanced set of sites (see Stevens and Olsen (2004) and Dobbie et al. (2008) for details about the advantages of a spatially-balanced sample compared with simple random or systematic samples). One consequence of the increasing interest in using GRTS is that a variety of spatially-balanced designs are being developed in overlapping geographic domains according to each user's specific interests. There is potential for redundancy, non-optimal designs, and lack of communication among agencies with overlapping responsibilities. To alleviate the potential for this type of problem and to facilitate the integration of the designs during the design process, the concept of a master sample has been developed and applied to stream networks in the Northwest (Larsen et al., 2008).

A master sample is a full list of sites that could potentially be sampled, structured so that a user could select a subset from the full list and retain the principle of randomization and spatial balance in the subset of sites selected (see Larsen et al., 2008). Statewide master samples covering stream networks in Oregon, Washington, and Idaho have been developed. A master sample file consists of a list of sites along with a set of attributes assigned to each site. Each site is identified by a unique site identifier, site latitude and longitude, and a set of design and classification attributes (e.g., initial selection weights, populations, USGS hydrologic unit code, ecoregion). Master samples can also be easily created for areas (polygons), such as estuaries, sounds, or near coastal regions.
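
The structure of a master sample file and the act of drawing a subset from it can be sketched in a few lines. The sketch below is illustrative only: the field names, site identifiers, coordinates, and the `order` field are assumptions, not the format of any existing master sample file. It also assumes that taking the first n in-domain sites in the master sample's selection order is what preserves spatial balance; the proposal defers those details to Larsen et al. (2008).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    site_id: str      # unique site identifier
    order: int        # position in the master sample's selection order (assumed field)
    lat: float
    lon: float
    huc: str          # USGS hydrologic unit code
    ecoregion: str
    weight: float     # initial selection weight


def draw_subsample(master, n, in_domain):
    """Return the first n in-domain sites, taken in master-sample order."""
    eligible = sorted((s for s in master if in_domain(s)), key=lambda s: s.order)
    if n > len(eligible):
        raise ValueError("requested more sites than the domain contains")
    return eligible[:n]


# Hypothetical records, for illustration only.
master = [
    Site("MS-0003", 3, 46.10, -123.20, "17080003", "Coast Range", 2.5),
    Site("MS-0001", 1, 45.60, -122.70, "17080001", "Willamette Valley", 2.5),
    Site("MS-0002", 2, 44.90, -123.00, "17090007", "Willamette Valley", 2.5),
    Site("MS-0004", 4, 46.20, -123.50, "17080006", "Coast Range", 2.5),
]

# A user interested only in hydrologic units whose code starts with "1708".
picked = draw_subsample(master, 2, lambda s: s.huc.startswith("1708"))
picked_ids = [s.site_id for s in picked]
```

Because the subset is defined by a domain filter plus a count, two users working in the same domain would start from the same sites, which is what makes later tracking and integration possible.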

As users become familiar with the use of a master sample, and as more and more users draw subsamples from the same master sample, a master sample tracking and management system will be necessary. Such a system will allow users to know who else has selected sites from the master sample covering stream networks in their domains; to design individual or integrated monitoring programs; to know how existing sites relate to a common master sample; and to know what others are collecting at the site over time. Such a management system could allow a user to select the part of the master sample that is relevant to his/her domain, to identify whether other users have selected subsets within their domains, and to upload information about their evaluation of the sites they selected, giving future users insight into the history of sites selected within their domains. Application of the master sample concept would facilitate data sharing and integration across multiple agencies in regions of common interest, given that agreement can be reached on common protocols for indicators of common interest.

The Pacific Northwest Aquatic Monitoring Partnership (PNAMP) is developing an Integrated Status and Trends Monitoring (ISTM) project to demonstrate the concept in the Lower Columbia region.

It is anticipated that the PNAMP ISTM project will increase familiarity with the concept and use of a master sample. As part of the ISTM effort, PNAMP is proposing to develop a prototype web-based master sample tracking and management system to support the interests of increasing numbers of users in drawing samples from the same population domain. This system would allow users to know who else has selected sites from the master sample covering stream networks in their domains; to design individual or integrated monitoring programs; to know how existing sites relate to a common master sample; and to know what is being collected at the site over time. In conjunction with the development and use of the web-based master sample management tool, there is a need for dedicated analytical support for design and utilization of results of the monitoring design based on the master sample. This proposal is to develop the prototype master sample management tool using the Lower Columbia region as a demonstration area and to provide the necessary statistical support.

The Lower Columbia was selected as a demonstration area because it is a manageable size, several monitoring programs using GRTS designs on stream networks in the Lower Columbia are already in place, and an example area-based sample of the Lower Columbia estuary was selected by the USEPA in 2007.

This project will develop the prototype Master Sample Management System, make it available to users, and provide statistical design and analysis support for the two years of the project. This system will be developed so that it is readily expandable to more extensive regions, e.g., to the entire Pacific Northwest. At the end of the two-year project period,


Tasks

1) Manage and administer project. This task covers all administrative and technical work to fulfill BPA's programmatic and contractual requirements, such as financial reporting, development of a Statement of Work (SOW) package (includes SOW and budget), and producing periodic and final reports.

2) Develop specifications of a web-based master sample management system, and develop an implementation plan. The project will develop a prototype of a web-based master sample management system. It will be necessary to explore with the web developer the various web-based systems and options to meet the desired capabilities. With the advice and input of PNAMP, a small workgroup will be established to define the details of the prototype and to ensure seamless integration with other PNAMP web-based applications. The workgroup will consist of a statistician, a web developer, and representatives from various federal agencies (e.g., PNAMP, USEPA, NOAA), state agencies (e.g., ODFW, WA ECY), and other interested parties (e.g., LCREP, Pacific States Marine Fisheries Commission). The group will define operational attributes of the master sample management system. As the project develops, continual interaction between the workgroup and the web developer will be necessary to evaluate progress, explore the draft web-based capabilities, and ensure that the project is proceeding in a desired direction.


3) Regional Coordination. The project will actively seek to establish partnerships to ensure compatibility of the system with existing state- or region-wide master sample management tools. For example, a Master Sample of streams already exists for Washington State, and the Washington State Department of Ecology (WA-ECY) has made it available on-line. However, the system, as currently configured, allows subset selection only by Water Resource Inventory Area, not by any other classification variable such as stream order. Moreover, there is no functionality that would allow users to submit information about their designs and the status of the master sample sites that they drew from the web. This project intends to coordinate with agencies like ECY to ensure that the web-based prototype is compatible with their systems.

4) Develop a prototype web-based master sample management system. Development of the web-based prototype will occur based on the specifications established in Task 2. We anticipate that this system will have at least the following capabilities:

• Store master samples and associated metadata (e.g., design documentation) on a readily accessible server;
• Allow users to download relevant parts of the master sample;
• Allow users to upload information about master sample sites that they have evaluated (to include design documentation, site evaluation, indicators measured, protocols used);
• Provide a tracking system that documents the history of sites selected from the master sample;
• Allow users to download histories of previously selected sites (to include design documentation, site evaluation, indicators measured, protocols used);
• Allow scaling up to a statewide or multi-state system (this primarily means the capability to manage substantially larger master samples than that used for the prototype);
• Ensure protection of data (via a secure system);
• Provide a mapping tool to display master sample sites selected by previous users.
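
As one way to picture the tracking capability listed above, the following sketch keeps an in-memory history of what each user did at each master sample site. It is an assumption-laden illustration, not the planned design: the class names, fields, agency names, and site identifiers are hypothetical, and a real system would sit on a database behind the web interface specified in Task 2.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SiteEvent:
    site_id: str
    agency: str
    event_date: date
    evaluation: str                                  # e.g. "sampled", "access denied"
    indicators: list = field(default_factory=list)   # indicators measured
    protocol: str = ""                               # protocol used


class SiteTracker:
    """Keeps the selection and evaluation history of master sample sites."""

    def __init__(self):
        self._events = defaultdict(list)

    def record(self, event):
        """Store one uploaded site evaluation."""
        self._events[event.site_id].append(event)

    def history(self, site_id):
        """Return all events for a site, oldest first."""
        return sorted(self._events.get(site_id, []), key=lambda e: e.event_date)

    def agencies_using(self, site_ids):
        """Return the set of agencies that have recorded events at any of the sites."""
        return {e.agency for sid in site_ids for e in self._events.get(sid, [])}


# Hypothetical uploads from two users of the same site.
tracker = SiteTracker()
tracker.record(SiteEvent("MS-0001", "Agency B", date(2010, 7, 1), "sampled",
                         ["habitat"], "hypothetical protocol"))
tracker.record(SiteEvent("MS-0001", "Agency A", date(2009, 8, 15), "access denied"))

who = tracker.agencies_using(["MS-0001", "MS-0002"])
timeline = [e.agency for e in tracker.history("MS-0001")]
```

Keying the history on the site identifier means a later user who draws the same site can retrieve its full record with one lookup.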

5) Create Master Sample for the Lower Columbia. As noted above, several GRTS sample designs already exist in the Lower Columbia region. Methodology for merging an existing sample with additional sites has been developed by the StatNat Group at Oregon State University and applied to the Oregon Department of Fish and Wildlife’s coastal coho monitoring program. This methodology will be used to integrate these existing sites with newly selected random sites to produce a high-density, spatially-balanced GRTS sample. In addition, the USEPA selected a high-density area-based sample of the Lower Columbia estuary in 2007. This sample will be reviewed and, if suitable, will be used for the area-based master sample of the estuarine portion of the Lower Columbia.


6) Provide statistical oversight as the web tool is developed. One of the benefits of a web-based Master Sample implementation is that it provides access to rigorous statistical sampling designs for any organization monitoring the Master Sample population. To maintain that statistical rigor, the web-based tool must be developed with close cooperation between the tool developers and statisticians familiar with the GRTS technique. The principles underlying the application of the Master Sample are well-understood, and Larsen et al. (2008) present some examples of selecting focused samples from the Master Sample using ancillary information. However, implementation of a variety of design options, for example, stratification, rotating panels, and oversamples, will require statistical oversight. Furthermore, the web tool should be developed to facilitate eventual analysis of the sample selected. There must be a clear path for users to follow from design specification to sample selection to data collection, input, and analysis. One of the critical elements in that path is the automatic production of a design documentation file. This file documents the selection process so that appropriate inclusion probability or weight and other sample structure (e.g., stratification, panels) will be available for analyzing the data collected at the sample sites. The content and format of the file will be developed jointly by the web developer and the statisticians with input from the workgroup. Additionally, efforts under PNAMP to develop and capture metadata related to statistical and monitoring design, data collection and analysis will inform the format of design documentation developed under this task.
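
To make the idea of an automatically produced design documentation file concrete, here is a minimal sketch that writes one as JSON. Everything in it is an assumption for illustration: the field names, the JSON format, the site identifiers, and in particular the weight adjustment, which simply scales each initial weight by (in-domain sites available) / (sites selected) as would apply to an equal-probability subsample. The actual content, format, and weighting rules are to be defined by the statisticians and web developer under this task.

```python
import json


def build_design_document(design_name, domain_sites, n, stratum=None, panel=None):
    """Describe a subsample of n sites drawn from the in-domain master sample sites.

    domain_sites: list of (site_id, initial_weight) pairs in master-sample order.
    """
    if not 0 < n <= len(domain_sites):
        raise ValueError("n must be between 1 and the number of in-domain sites")
    # Assumed adjustment: each selected site also represents the unselected ones.
    factor = len(domain_sites) / n
    sites = [
        {"site_id": sid, "initial_weight": w, "adjusted_weight": w * factor}
        for sid, w in domain_sites[:n]
    ]
    return {
        "design": design_name,
        "stratum": stratum,
        "panel": panel,
        "n_domain_sites": len(domain_sites),
        "n_selected": n,
        "sites": sites,
    }


# Hypothetical in-domain sites with equal initial weights.
domain = [("MS-0001", 2.0), ("MS-0004", 2.0), ("MS-0007", 2.0), ("MS-0009", 2.0)]
doc = build_design_document("demo-design", domain, n=2, stratum="example-stratum")
doc_text = json.dumps(doc, indent=2)   # the text that would be saved or downloaded
```

Storing the adjusted weight next to the initial weight keeps the selection process auditable, and the analysis step can read the weights directly from the file.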



7) Identify needs and develop web-based analysis tools. Another critical element is the development of tools that will link the design files with field data to provide easy access to analysis tools. It is important that these tools meet the analytical needs of ISTM major partners (e.g., OR and WA recovery plans, AREMP, etc.), and that they are commensurate with standards developed by PNAMP and the Integrated Status and Effectiveness Monitoring Program (ISEMP). For example, the R package spsurvey developed by the USEPA already has tools for routine analysis of GRTS survey data, but there is a need to develop an interface to interact with these. Moreover, additional analysis tools are available that have not been implemented in spsurvey, and others, notably analysis of trend from rotating panel studies, will become available shortly. These tools will be made available as a part of the overall Master Sample implementation.
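
The link between design weights and field data can be illustrated with a simple design-weighted estimate. The sketch below is not the spsurvey interface and does not reproduce its methods; it is a plain-Python ratio estimate, sum(w * y) / sum(w), with hypothetical weights and results, and it omits the variance estimation that a real analysis tool would need.

```python
def weighted_proportion(records):
    """Estimate the proportion of the resource that meets a condition.

    records: iterable of (design_weight, meets_condition) pairs, one per
    sampled site. Uses the ratio form sum(w * y) / sum(w).
    """
    records = list(records)
    total_weight = sum(w for w, _ in records)
    if total_weight <= 0:
        raise ValueError("design weights must sum to a positive value")
    return sum(w for w, ok in records if ok) / total_weight


# Hypothetical field results: (weight from the design file, condition met?).
field_data = [(4.0, True), (4.0, False), (2.0, True)]
estimate = weighted_proportion(field_data)
```

With these made-up values the weighted estimate is 0.6, whereas the unweighted share of sites meeting the condition is 2/3; the difference is why the design file must travel with the data.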

8) Provide statistical consultation support to assist users with complex sampling issues. The intent of the Master Sample web tool is to simplify application of rigorous statistical monitoring designs and analyses. Documentation will be provided that will facilitate most applications; however, the tool will have the capability to create designs to satisfy very complex requirements. For complex designs, users may need to consult with a statistician familiar with the Master Sample management system to meet design requirements.


9) Develop and implement methodology for combining data from non-probability monitoring (e.g., index sites) with data from statistical surveys. Historically, much monitoring data was collected without a formal probability design. Combining such non-probability data with data from a probability survey can be difficult because of the lack of an unambiguous link between the data and population representation. Several methods (Brus & De Gruijter, 2003; Overton et al., 1993; Stein & Bernstein, 2008) have been proposed; their applicability and feasibility will be evaluated, and suitable methods will be implemented accordingly.

10) Develop training materials and user guides. This task will provide detailed documentation of the steps required to select a design, annotated examples of particular applications, guidance for selecting analysis procedures, and annotated examples of applying the analysis tools.

11) Present seminars/workshops on use of Master Sample management system. The intent of the project is to make the management system user-friendly; nevertheless, we foresee the need to communicate and encourage its application by providing some introductory training via presentations, seminars, and/or workshops. This task will be coordinated through PNAMP.



Products/Deliverables:

The project will produce:

1) A prototype web-based master sample management system that is fully functional on the Lower Columbia region

2) Lower Columbia Master Sample

3) Statistical package in the R language interfaced with the management system that provides basic analysis of master samples

4) Statistical support to users to assist with meeting complex design requirements and subsequent analysis

5) Documentation and users guides for the system

6) Delivery of seminars/workshops on use of master sample management system.




Project Personnel

Senior Statistician: Don L. Stevens, Jr. is Senior Research Professor in the Statistics Department at Oregon State University (OSU). Dr. Stevens is an internationally known environmental statistician, particularly in the area of environmental sampling and monitoring. He is a Fellow of the American Statistical Association, an elected member of the International Statistical Institute, and President-elect of The International Environmetrics Society. He made fundamental contributions to developing the statistical sampling theory supporting EMAP’s spatially-balanced probability sampling, and to applying that theory to designing samples of a variety of aquatic resources. Dr. Stevens has supervisory and project management experience, both in academia and contract research. While at Eastern Oregon State University, he was Area Coordinator for Mathematics and Computer Science, and Principal Investigator on a cooperative agreement from USEPA to develop the sampling design for the Direct-Delayed Research Project. Subsequently, he held positions as a General Supervisor and Project Manager for two on-site contractors at the USEPA Laboratory in Corvallis. At OSU, he was the Program Director for the EPA-STAR-funded Program on Designs and Models for Aquatic Resource Surveys. He is a consultant on environmental monitoring design issues for the Warm Springs Indian Tribe, the National Park Service’s Great Lakes Monitoring Network, the San Francisco Estuary Regional Monitoring Program, California’s Surface Waters Ambient Monitoring Program, California’s Fish Mercury Program, the Oregon Department of Fish and Wildlife, the Oregon Watershed Enhancement Board, and Australia’s Commonwealth Scientific and Industrial Research Organisation Environmental Informatics group.

Statistician: Lisa Madsen is an Assistant Professor in the Statistics Department at OSU. Dr. Madsen’s research focuses on spatial data and problems in environmental and ecological statistics. Her dissertation addressed the problem of spatially misaligned data. Since then, she has been working with dependent, non-Gaussian ecological data problems, particularly count data with many zero counts. She has expertise in simulating ecological data. She is director and co-founder of StatNat (Statistics for Natural Resources), a group of statisticians at Oregon State University working on problems in natural resources monitoring. StatNat has close working relationships with the Oregon Department of Fish and Wildlife, the Oregon Department of Forestry, and the Oregon Watershed Enhancement Board.

Web Developer / Systems Engineer: Clifton Johnson has over twenty years’ experience in the IT field, including six in his current role at Oregon State University (OSU). While at OSU, Clifton has been involved in developing websites and specific applications (including online surveys and data processing systems that use the open source R application, PHP, and MySQL database backends), as well as providing support, custom programming, database design/management, and server administration (primarily focusing on the Linux operating system). Clifton has an interest in, and a preference for, the development and use of open source applications, and was instrumental in the adoption of Drupal (http://www.drupal.org) as a standard Content Management System (CMS) web framework, which has been used for many of the websites on campus. In addition to his web development and systems administration duties, Clifton helps researchers use a Beowulf cluster to process data or run simulations more efficiently.



References



Brus, D.J., and J.J. De Gruijter. 2003. A method to combine non-probability sample data with probability sample data in estimating spatial means of environmental variables. Environmental Monitoring and Assessment 83:303-317.

Dobbie, M.J., B.L. Henderson, and D.L. Stevens, Jr. 2008. Sparse sampling: Spatial design for monitoring stream networks. Statistics Surveys 2:113-153.

Larsen, D.P., A.R. Olsen, and D.L. Stevens, Jr. 2008. Using a master sample to integrate stream monitoring programs. Journal of Agricultural, Biological, and Environmental Statistics 13:243-254.

Overton, J., T. Young, and W.S. Overton. 1993. Using ‘found’ data to augment a probability sample: procedure and case study. Environmental Monitoring and Assessment 26:65-83.

Stein, E.D., and B. Bernstein. 2008. Integrating probabilistic and targeted compliance monitoring for comprehensive watershed assessment. Environmental Monitoring and Assessment 144:117-129.

Stevens, D.L., Jr., and A.R. Olsen. 2004. Spatially-balanced sampling of natural resources. Journal of the American Statistical Association 99:262-278.




Budget

Budget Period 1: 1 June 2009 - 30 Sept 2009

Personnel
  Senior Statistician (4 mo @ 0.9 FTE @ $10,525/mo)           $37,890
  OPE 37%                                                      14,019
  Statistician (2 mo @ 0.5 FTE @ $7,141/mo)                     7,141
  OPE 25% (note 1)                                              1,785
  Web Developer (560 hours @ $50/hour, OPE included)           28,000
  Sub-total personnel                                         $83,068
Office Supplies
  Miscellaneous paper, postage, computer supplies                $100
Travel
  POV (4 round trips to Portland @ 200 mi/trip @ $0.55/mi)       $440
Total Direct Cost                                             $88,375
Facilities & Administration
  46.2% TDC                                                   $41,291
Total Cost, Budget Period 1                                  $130,666

Notes: (1) summer rate for 9-mo faculty



Budget Period 2: 1 Oct 2009 - 30 Sept 2010

Personnel
  Senior Statistician (3 mo @ 0.8 FTE @ $10,525/mo)           $22,103
  OPE 37%                                                       8,178
  Senior Statistician (9 mo @ 0.05 FTE @ $10,525/mo)            4,736
  OPE 10% (note 1)                                                474
  Statistician (2 mo @ 0.1 FTE @ $7,855/mo)                     1,571
  OPE 25% (note 2)                                                393
  Web Developer (890 hours @ $52/hour, OPE included)           46,280
  Sub-total personnel                                         $83,735
Office Supplies
  Miscellaneous paper, postage, computer supplies                $200
Travel
  POV (8 round trips to Portland @ 200 mi/trip @ $0.55/mi)       $880
Total Direct Cost                                             $84,815
Facilities & Administration
  46.2% TDC                                                   $39,185
Total Cost, Budget Period 2                                  $124,000

Notes: (1) 1040 appointment rate; (2) summer rate for 9-mo faculty






Budget Period 3: 1 Oct 2010 - 30 May 2011

Personnel
  Senior Statistician (8 mo @ 0.05 FTE @ $10,946/mo)           $4,378
  OPE 10%                                                         438
  Web Developer (50 hours @ $54/hour, OPE included)             2,700
  Sub-total personnel                                          $7,516
Office Supplies
  Miscellaneous                                                  $100
Travel
  POV (4 round trips to Portland @ 200 mi/trip @ $0.55/mi)       $440
Total Direct Cost                                              $8,056
Facilities & Administration
  46.2% TDC                                                    $3,722
Total Cost, Budget Period 3                                   $11,778

Total Project Cost                                           $266,444





Budget Breakout by Task

Task                                                                   Cost
 1. Manage and Administer Project                                    $9,000
 2. Produce Plan                                                     24,000
 3. Regional Coordination                                             5,000
 4. Develop a prototype web-based master sample management system   112,544
 5. Create Master Sample for the Lower Columbia                      24,000
 6. Provide statistical oversight as the web tool is developed        9,900
 7. Identify needs and develop web-based analysis tools              30,000
 8. Provide statistical consultation support to assist users
    with complex sampling issues                                     15,000
 9. Develop and implement methodology for combining data from
    non-probability monitoring (e.g., index sites) with data
    from statistical surveys                                         22,000
10. Produce user’s guide and training materials                      10,000
11. Present seminars and/or workshops on use of master sample
    management system                                                 5,000
Total                                                              $266,444
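
As a quick consistency check, the two views of the budget can be summed and compared against the stated project total. The sketch below uses only figures printed in this proposal; the variable names are illustrative.

```python
# Total cost of each budget period, as printed in the budget tables.
period_totals = {1: 130_666, 2: 124_000, 3: 11_778}

# Cost of each task, as printed in the budget breakout by task.
task_costs = {
    1: 9_000, 2: 24_000, 3: 5_000, 4: 112_544, 5: 24_000, 6: 9_900,
    7: 30_000, 8: 15_000, 9: 22_000, 10: 10_000, 11: 5_000,
}

project_total = 266_444              # stated Total Project Cost

by_period = sum(period_totals.values())
by_task = sum(task_costs.values())
print(by_period == project_total, by_task == project_total)
```

Both sums equal the stated total, so the period view and the task view of the budget agree.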