Using Bayesian Networks to
Predict Plankton Production
from Satellite Data
By:
Rob Curtis, Richard Fenn, Damon Oberholster
Supervisors:
Anet Potgieter, John Field, Laurent Drapeau
Department of Computer Science
Overview
•
Introduction
•
Work Detail
•
Knowledge Acquisition
•
Knowledge Representation
•
Bayesian Learning and Inference
•
Topic Maps
Introduction
•
Aim to predict plankton primary
production using satellite data
•
Daily satellite data on surface
temperature, chlorophyll, winds, currents
•
Archive of ships’ sub-surface details
•
Predict likely subsurface plankton profile
from surface features
Current System
•
The current best solution uses Self-Organising
Maps (SOMs, a type of neural network)
to classify data
–
Resulting solution lacks accuracy
–
Difficult to interpret
Proposed System
•
Propose a system that uses Bayesian
Networks to predict plankton production
–
Use ships’ sub-surface profiles + satellite data
to draw cause-effect relationships
–
Will use Bayesian Inference and Learning
•
Use Topic Maps to visualize network
Work Detail
[Diagram: work breakdown across Knowledge Acquisition, Knowledge Representation, Inference Engine, Learning Engine, Topic Map and Requirements Elicitation; team members Rob Curtis, Richard Fenn and Damon Oberholster]
Knowledge Acquisition
•
“The process of analyzing, transforming,
classifying, organizing and integrating
knowledge and representing that knowledge in a
form that can be used in a computer system.
Typically the knowledge is based on what a
human expert does when solving problems”
www.centc251.org/Ginfo/Glossary/tcglosk.htm
•
Relating to this project:
–
Huge amounts of data
–
Data is poorly recorded in Excel spreadsheets
–
Gaps in current data
Knowledge Acquisition: Amount of Data
•
2500 ships’ sub-surface readings
–
Recorded over a 10-year period
•
Bayesian Network requires satellite data
for the same time period
•
Need to represent data in a form that can
be used by the Bayesian Network
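One common way to put continuous readings into a form a discrete Bayesian Network can use is to bin them into named states. The sketch below is only an illustration of that idea: the bin edges, the state labels and the sample readings are all hypothetical, and the slides do not say which discretisation the project uses.

```python
from bisect import bisect_right

# Hypothetical bin edges (degrees Celsius) and state labels.
# Real thresholds would come from the domain experts, not from this sketch.
SST_EDGES = [12.0, 16.0, 20.0]
SST_LABELS = ["cold", "cool", "warm", "hot"]


def discretise(value, edges, labels):
    """Map a continuous reading to the label of the bin it falls in."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than bin edges")
    # bisect_right counts how many edges are <= value, which is the bin index.
    return labels[bisect_right(edges, value)]


# Made-up surface temperature readings, for illustration only.
readings = [11.3, 14.8, 16.0, 22.5]
states = [discretise(r, SST_EDGES, SST_LABELS) for r in readings]
```

A reading that sits exactly on an edge goes into the upper bin here; that is a design choice of the sketch, not something the source specifies.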
Knowledge Acquisition: Current Data
Knowledge Acquisition: Gaps in Data
[Figure: ships’ sub-surface readings (discrete) alongside satellite data (continuous)]
Knowledge Acquisition: Challenges
•
Making sense of all the available data
(consultations with Dr John Field and Laurent
Drapeau)
•
Correlating the 2D continuous satellite data to the
3D discrete ships’ sub-surface profiles
•
Representing all the data in a form easily used
by the Bayesian Network
•
Integration of disparate data
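The correlation step can be pictured as a join: each ship station is matched to the satellite record for the same day and the same grid cell. The following is a minimal sketch under stated assumptions, namely a regular latitude/longitude grid with a hypothetical cell size, hypothetical field names (`sst`, `chl`, `profile`) and made-up values. The actual satellite products and their resolution are not given in the slides.

```python
from datetime import date

# Hypothetical grid resolution in degrees; an assumption of this sketch.
CELL_DEG = 0.25


def grid_cell(lat, lon, cell_deg=CELL_DEG):
    """Return the (row, col) index of the grid cell containing a position."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def join_station(station, satellite):
    """Attach same-day, same-cell satellite values to one ship station.

    Returns None when no satellite record exists (a gap in the data).
    """
    key = (station["date"], grid_cell(station["lat"], station["lon"]))
    surface = satellite.get(key)
    if surface is None:
        return None
    return {**station, **surface}


# Made-up records, for illustration only.
satellite = {
    (date(2000, 1, 15), grid_cell(-33.9, 18.1)): {"sst": 17.2, "chl": 1.4},
}
stations = [
    {"date": date(2000, 1, 15), "lat": -33.95, "lon": 18.20, "profile": [1.1, 2.3, 0.4]},
    {"date": date(2000, 1, 16), "lat": -33.95, "lon": 18.20, "profile": [0.9, 1.8, 0.3]},
]
joined = [join_station(s, satellite) for s in stations]
```

The second station has no matching satellite record, so it comes back as `None`; how such gaps should be handled (dropped, interpolated, or treated as missing evidence) is left open here.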
Knowledge Representation
•
“A search for formal ways to describe knowledge
presented in informal terms (a prerequisite for its
handling as computation)”
encyclopedia.laborlawtalk.com/Representation
•
Relating to this project:
–
Need to find causal relationships between environment variables
–
Represent those relationships in a Bayesian Network
–
Store the data in a database so that it is easy for the
Inference and Learning Engines of the Bayesian Network to
manipulate
–
Need to consider the temporal aspects of the data
Knowledge Representation: Causal
Relationships
Many variables influence primary plankton production:
•
Chlorophyll
•
Surface Temp
•
Wind
•
Current
[Diagram: Chlorophyll, Surface Temp, Wind and Primary Plankton Production nodes]
Knowledge Representation:
Bayesian Network
•
Directed graphical model
•
Each node represents an influencing variable
•
An edge from one node to another represents a causal
relationship between those nodes
•
Create the Bayesian network structure based on the most
relevant relationships found between the variables
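A directed graphical model can be held as a mapping from each node to its parents, with a check that the graph has no cycles. The sketch below assumes an illustrative edge set in which each listed influence points at primary production; it is not the structure the project arrived at, and the node names are hypothetical.

```python
# Parents of each node. The edge set mirrors the listed influences and is
# illustrative only, not a learned or validated structure.
PARENTS = {
    "chlorophyll": [],
    "surface_temp": [],
    "wind": [],
    "current": [],
    "primary_production": ["chlorophyll", "surface_temp", "wind", "current"],
}


def topological_order(parents):
    """Return nodes with every parent before its children; raise on a cycle."""
    order, done, in_progress = [], set(), set()

    def visit(node):
        if node in done:
            return
        if node in in_progress:
            raise ValueError(f"cycle through {node!r}")
        in_progress.add(node)
        for parent in parents[node]:
            visit(parent)
        in_progress.discard(node)
        done.add(node)
        order.append(node)

    for node in parents:
        visit(node)
    return order


order = topological_order(PARENTS)
```

A valid ordering exists only when the graph is acyclic, which is the property a Bayesian network structure must have.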
Knowledge Representation:
Temporal aspects
•
Need to divide data up into time steps
•
Each time step is dependent on the previous step
[Diagram: t → t + 1 → t + 2]
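The dependence of each time step on the previous one can be sketched as a transition table applied repeatedly. This is a simplified illustration with a single two-state variable; the state names and probabilities are made up, and the slides do not say how the project models time.

```python
# Hypothetical table P(state at t+1 | state at t); numbers are made up.
TRANSITION = {
    "low": {"low": 0.7, "high": 0.3},
    "high": {"low": 0.4, "high": 0.6},
}


def step(belief, transition):
    """Push a probability distribution over states one time step forward."""
    nxt = {state: 0.0 for state in transition}
    for prev, p_prev in belief.items():
        for cur, p_cur in transition[prev].items():
            nxt[cur] += p_prev * p_cur
    return nxt


belief_t = {"low": 1.0, "high": 0.0}       # state known at time t
belief_t1 = step(belief_t, TRANSITION)     # belief at t + 1
belief_t2 = step(belief_t1, TRANSITION)    # belief at t + 2
```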
Learning Engine
•
Each node of the Bayesian network will
have a Conditional Probability Table (CPT)
•
The learning engine will implement an
algorithm to update the probabilities in
each of these tables
–
Nine years of satellite and ship data will be
used to train the system
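The slides do not name the learning algorithm. One simple possibility, shown here purely as an assumption, is to estimate each CPT by counting how often each child state occurs for each combination of parent states, with add-one (Laplace) smoothing so that unseen states do not get probability zero. The variable names and training records are made up.

```python
from collections import Counter, defaultdict


def learn_cpt(records, child, parents, child_states, alpha=1.0):
    """Estimate P(child | parents) from discrete records by smoothed counting."""
    counts = defaultdict(Counter)
    for record in records:
        config = tuple(record[p] for p in parents)
        counts[config][record[child]] += 1

    cpt = {}
    for config, seen in counts.items():
        total = sum(seen.values()) + alpha * len(child_states)
        cpt[config] = {s: (seen[s] + alpha) / total for s in child_states}
    return cpt


# Made-up training records, for illustration only.
records = [
    {"chl": "high", "production": "high"},
    {"chl": "high", "production": "high"},
    {"chl": "high", "production": "low"},
    {"chl": "low", "production": "low"},
]
cpt = learn_cpt(records, "production", ["chl"], ["low", "high"])
```

With these four records, `cpt[("high",)]` gives 0.6 for high production and 0.4 for low. Parent configurations that never occur in the data get no row at all in this sketch, which a real system would need to handle.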
Inference Engine
•
The inference engine will be responsible
for calculating the probability of a certain
sequence of observations given certain
input parameters
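As a small illustration of inference, the sketch below computes the probability of high production given an observed chlorophyll state, summing over wind when wind is not observed. It assumes a toy three-node network in which chlorophyll and wind are independent parents of production. All names and probabilities are made up, and a real engine would use a general-purpose algorithm rather than this hand-written sum.

```python
# Made-up numbers throughout; chlorophyll and wind are assumed to be
# independent root nodes, both parents of production.
P_WIND = {"calm": 0.6, "strong": 0.4}
P_PROD_HIGH = {  # P(production = high | chlorophyll, wind)
    ("high", "calm"): 0.8,
    ("high", "strong"): 0.5,
    ("low", "calm"): 0.3,
    ("low", "strong"): 0.1,
}


def p_production_high(chl, wind=None):
    """P(production = high | chlorophyll[, wind]); sums out wind if unobserved."""
    if wind is not None:
        return P_PROD_HIGH[(chl, wind)]
    return sum(P_WIND[w] * P_PROD_HIGH[(chl, w)] for w in P_WIND)


p = p_production_high("high")
```

Here `p` is 0.6 × 0.8 + 0.4 × 0.5 = 0.68; observing strong wind as well would lower it to 0.5.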
Testing
•
Nine years of sub-surface data will
be used to train the system.
•
Compare the predicted results for
the tenth year against the recorded
results for that year.
•
The project will be a success if
the predictions closely match the
results that were recorded.
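The hold-out test described above can be sketched as a split by year followed by a comparison of predicted and recorded states. The fraction of matching predictions used here is only one possible measure of similarity; the slides do not define one. The year indices and the sample values are hypothetical.

```python
def split_by_year(records, test_year):
    """Hold out one year for testing and train on all the others."""
    train = [r for r in records if r["year"] != test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def accuracy(predicted, recorded):
    """Fraction of predictions that equal the recorded value."""
    if not recorded or len(predicted) != len(recorded):
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == r for p, r in zip(predicted, recorded))
    return hits / len(recorded)


# Years are numbered 1..10 here because the slides give no calendar years.
records = [{"year": y} for y in range(1, 11)]
train, test = split_by_year(records, test_year=10)

# Made-up predicted and recorded states for the held-out year.
score = accuracy(["high", "low", "low", "high"], ["high", "low", "high", "high"])
```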
Representing Bayesian
Networks using Topic Maps
Topic Maps: Overview
•
Brief introduction to topic maps and
hypergraphs
•
Applying topic maps to the system
•
Testing
•
Challenges
Topic Maps
•
Topic maps provide a means of indexing
data
•
An ISO standard (ISO/IEC 13250) for describing knowledge
structures and associating them with
information resources
Topic Map Structure
•
Topic
–
Anything: a subject, entity or concept
•
Occurrence
–
Link to information about the topic
•
Association
–
Relationships between topics
[Diagram: Topic, Occurrence and Association]
Representing Topic Maps
•
Hypergraphs
–
A hypergraph is a graph that can have smaller graphs
(subgraphs) embedded within it
Applying Topic Maps
•
Bayesian Network
–
Topics will represent nodes in the network
–
Associations will represent relationships between
nodes in the network
–
Occurrences will link to information about each node
•
Future System
–
Web application linking topic maps for
different regions of the ocean
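The mapping from network to topic map can be sketched with plain data classes: one topic per node, one association per edge, and occurrences holding links to information about a node. This is an in-memory illustration only. It does not produce standard topic map syntax, and the class names, the `influences` association type and the resource strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)  # links to info about the node


@dataclass(frozen=True)
class Association:
    kind: str
    source: str
    target: str


def to_topic_map(parents, resources):
    """Build topics and associations from a node -> parents mapping."""
    topics = {
        node: Topic(node, list(resources.get(node, [])))
        for node in parents
    }
    associations = [
        Association("influences", parent, child)
        for child, node_parents in parents.items()
        for parent in node_parents
    ]
    return topics, associations


# Illustrative network and placeholder resource identifiers.
parents = {
    "chlorophyll": [],
    "wind": [],
    "primary_production": ["chlorophyll", "wind"],
}
resources = {"primary_production": ["cpt/primary_production"]}
topics, associations = to_topic_map(parents, resources)
```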
Testing
•
Qualitative approach
•
Low-fi prototypes to test the intuitiveness of the
proposed interface to the Bayesian Network
•
Test with the intended users of the system
Challenges
•
Representing temporal information using
topic maps
•
Representing Bayesian Network
relationships using topic maps
Summary
•
Represent data in a formal way
using knowledge acquisition and
representation
•
Research the viability of using
Bayesian Networks as a prediction
mechanism
•
Research the viability of using topic
maps for intuitively representing
Bayesian Networks
References
•
Pepper, S. (2002), “The TAO of Topic
Maps, Finding the Way in the Age of
Infoglut”, retrieved 01/06/2005, URL:
http://www.ontopia.net/topicmaps/materials/tao.html