Data Management Needs and Challenges for Telemetry Scientists

7 Nov 2013

Josh M London

Wildlife Biologist, Polar Ecosystems Program

National Marine Mammal Laboratory

NOAA NMFS Alaska Fisheries Science Center

Temptation to identify biologists as the source for the raw data

The Tip of a Complex Iceberg

hypothesis

agency needs/mandates

funding initiatives

opportunistic vs. planned

tag design/vendor

tag programming

Deployment of tags (location, age/sex, time)

Data Management

data quality control

synthesis

movement model

Publications

Contract reports

Status/Listing Review

derived products

Field Work and Study Design

Narrowing Bottleneck

Many biologists lack the skills and training for effective, scalable database design and data management practices

Field Work & Tag Deployment


When? Where?


Which Tag/Vendor?


Which Age? Which Sex? (Do we have a choice?)


Tag Programming


Deployment Length (attachment type)


Limited Tools for Managing Raw Telemetry Data

‘raw’ data


via Argos as CSV/Text


Process w/ Vendor Software (behavior data)


Typically output as CSV


Field data about animal (e.g. ID, species, sex, age, health)

needs


Explore ‘raw’ data


Address hypotheses


Visualize movement/use


Synthesize w/ dependent (e.g. health, age) and independent data (e.g. other animals, remote sensed)
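
The synthesis step above, attaching field data about each animal to its location records, can be sketched in a few lines. This is only an illustration under stated assumptions: the column names (`tag_id`, `timestamp`, `lat`, `lon`, `quality`, `animal_id`, `age_class`), the tag and animal IDs, and all values are made up and do not reflect any particular vendor's CSV layout.

```python
import csv
import io

# Hypothetical location records, shaped loosely like a CSV export.
LOCATIONS_CSV = """tag_id,timestamp,lat,lon,quality
T01,2013-05-01T00:10:00,58.10,-170.20,3
T01,2013-05-01T06:40:00,58.25,-170.05,B
T02,2013-05-01T01:15:00,60.02,-172.50,2
T99,2013-05-01T02:00:00,59.00,-171.00,1
"""

# Hypothetical field data recorded when each tag was deployed.
DEPLOYMENTS_CSV = """tag_id,animal_id,species,sex,age_class
T01,A-101,species_a,F,adult
T02,A-102,species_b,M,subadult
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_locations(locations, deployments):
    """Attach animal metadata to each location; collect unmatched tag ids."""
    by_tag = {row["tag_id"]: row for row in deployments}
    joined, unmatched = [], set()
    for loc in locations:
        animal = by_tag.get(loc["tag_id"])
        if animal is None:
            # A location with no deployment record is reported, not dropped silently.
            unmatched.add(loc["tag_id"])
            continue
        joined.append({**loc, **animal})
    return joined, sorted(unmatched)


joined, unmatched = join_locations(
    read_rows(LOCATIONS_CSV), read_rows(DEPLOYMENTS_CSV)
)
for row in joined:
    print(row["animal_id"], row["sex"], row["timestamp"], row["lat"], row["lon"])
print("tags with no deployment record:", unmatched)
```

Reporting the unmatched tags, rather than discarding them, is one way to surface gaps between the telemetry stream and the field records early.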

Biologists Not Trained in Large Scale Data Management

Biologists


Excel and/or Access


ESRI ArcMap (shapefiles)


Google Earth


Mouse Click Interaction


Programming (Visual Basic, R, Python) recipe driven … not developers


Data Manager


Postgres/PostGIS, Oracle, MySQL, SQL Server


Normalization and Efficient Design


Scripting, Jobs, Transactions


Data Integrity


Automation, Reproducibility
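
To make the data manager's vocabulary concrete, here is a minimal sketch of normalization, data integrity, and transactions. It uses Python's built-in `sqlite3` module purely as a stand-in for the server databases listed above. The table and column names (`animal`, `deployment`, `location`, `fix_time`, and so on) and all values are hypothetical, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Normalized layout: animal facts are stored once, not repeated on every location.
conn.executescript("""
CREATE TABLE animal (
    animal_id TEXT PRIMARY KEY,
    sex       TEXT CHECK (sex IN ('F', 'M', 'U')),
    age_class TEXT
);
CREATE TABLE deployment (
    tag_id    TEXT PRIMARY KEY,
    animal_id TEXT NOT NULL REFERENCES animal(animal_id),
    deployed  TEXT NOT NULL
);
CREATE TABLE location (
    tag_id   TEXT NOT NULL REFERENCES deployment(tag_id),
    fix_time TEXT NOT NULL,
    lat      REAL NOT NULL CHECK (lat BETWEEN -90 AND 90),
    lon      REAL NOT NULL CHECK (lon BETWEEN -180 AND 180),
    PRIMARY KEY (tag_id, fix_time)
);
""")


def load_batch(conn, rows):
    """Insert a batch of locations atomically; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO location (tag_id, fix_time, lat, lon) "
                "VALUES (?, ?, ?, ?)",
                rows,
            )
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    conn.execute("INSERT INTO animal VALUES ('A-101', 'F', 'adult')")
    conn.execute("INSERT INTO deployment VALUES ('T01', 'A-101', '2013-04-20')")

good = [("T01", "2013-05-01T00:10:00", 58.10, -170.20)]
bad = [
    ("T01", "2013-05-01T06:40:00", 58.25, -170.05),
    ("T99", "2013-05-01T02:00:00", 59.00, -171.00),  # no such deployment
]

ok_good = load_batch(conn, good)
ok_bad = load_batch(conn, bad)  # the whole batch is rolled back, including its valid row
n_locations = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
print(ok_good, ok_bad, n_locations)
```

The point of the second batch is that the database, not the analyst, rejects a location for an unknown tag, and the transaction keeps a half-loaded batch from lingering in the table.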

My Perspective

To address complex questions related to marine mammal telemetry and understanding animal ecology, I had to become more of a data manager
… And, in the process, I’ve become less of a biologist

Start (2006)


Argos Monthly CDs


SatPack

Access Database


Excel Files (limited to ~65k rows)


Large, Flat Tables


No Central Repository


Current System


Nightly FTP Argos Push


Nightly Data Processing


CSV/External Oracle Table


PL/SQL Procedures


Developed/Designed with Training via Google Search
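
One property a nightly load like this generally needs is that it can be rerun safely when a file is delivered twice or a job is restarted. The sketch below illustrates that idea only. It is not the Oracle external table and PL/SQL system described on this slide, and it uses Python's built-in `sqlite3` as a stand-in. The table layout, column names, and file contents are all hypothetical.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE location (
        tag_id   TEXT NOT NULL,
        fix_time TEXT NOT NULL,
        lat      REAL NOT NULL,
        lon      REAL NOT NULL,
        UNIQUE (tag_id, fix_time)
    )
""")


def load_csv(conn, text):
    """Load one delivered file; return the number of newly inserted rows."""
    rows = [
        (r["tag_id"], r["timestamp"], float(r["lat"]), float(r["lon"]))
        for r in csv.DictReader(io.StringIO(text))
    ]
    before = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
    with conn:  # one transaction per file
        # Rows that collide with the UNIQUE key are skipped instead of duplicated.
        conn.executemany(
            "INSERT OR IGNORE INTO location VALUES (?, ?, ?, ?)", rows
        )
    after = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
    return after - before


# Hypothetical deliveries: the second night overlaps the first by one record.
NIGHT_1 = """tag_id,timestamp,lat,lon
T01,2013-05-01T00:10:00,58.10,-170.20
T01,2013-05-01T06:40:00,58.25,-170.05
"""
NIGHT_2 = """tag_id,timestamp,lat,lon
T01,2013-05-01T06:40:00,58.25,-170.05
T01,2013-05-02T01:05:00,58.40,-169.90
"""

first = load_csv(conn, NIGHT_1)
rerun = load_csv(conn, NIGHT_1)  # same file processed again
second = load_csv(conn, NIGHT_2)
print(first, rerun, second)
```

Keying each record on tag and timestamp is an assumption made for this example. A real system would need to decide what identifies a unique record in its own data.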

Current Limitations


Data access requires a minimum level of technical skills (basic SQL, Oracle framework, Oracle APEX, R spatial tools, ArcMap)


Single Point of Access/Failure (me)


Limited Documentation of Design


Design May Not be Optimal/Appropriate


Main Objective to Provide Data to Analysts


Not necessarily designed for providing data to the public



Greatest Needs


Research Program


Data Management and Design Consultation


Data Design & Documentation Portal (user-friendly metadata)


Low Tech Exploration Tools


Database and Application Developers (data flow and data input)


Training Opportunities


External to Program?


Provide Meaningful Public Access to Data


A Clear Data Sharing Policy w/ Best Practices


Encourage/Facilitate Scientific Collaboration


Meet Agency Needs and Requirements


How to Communicate Scientific Knowledge in the Modern/Digital Age

sharing knowledge/expertise just as important as sharing data


Publish Data Once


Challenges / Road Blocks


Limited Funds and Priorities


Appropriate resources for doing the priority analysis and science are not available, let alone the resources to distribute data responsibly


Database design/management often in the hands of the least skilled users


IT Policies, Investments, and Infrastructure Varied Across Institutions


No standard(s) for communicating and sharing ‘raw’ animal telemetry data. What is ‘raw’ data?