MATLAB Applications in Bioinformatics

© 2003 The MathWorks, Inc.

Developing and Deploying Bioinformatics
Applications with MATLAB


MATLAB for Bioinformatics

Kristen Amuzzini

Biotech, Pharmaceutical, & Medical Industry

The MathWorks, Inc.


Presentation Layout


MATLAB applications in Bioinformatics


Customer success stories


MATLAB & The Bioinformatics Toolbox


Sequence analysis


Microarray analysis


Integrating MATLAB with other tools


MATLAB as computational engine for Excel


Questions/Answers & Wrap-up

Bioinformatics Applications


Sequence analysis
    Base calling algorithm design, sequence alignment, sequence building algorithms

Microarray analysis
    Image processing, QA/QC, data normalization, data analysis

Proteomics
    Mass Spectrometry signal processing, protein marker identification and classification, peptide sequence identification, 2D-Gel image analysis

Systems Biology
    Interaction network identification, simulation of metabolic pathways, flux analysis

Bioinformatics teams supporting multiple constituencies with multiple tools.

Tools:
    C/C++, Java, Perl
    VB, Excel Macros
    SQL
    GUI-based tools
    Freeware
    SPLUS, R, SAS, Mathematica
    Web-based tools

Research Biologists
    Prefer UI/Web-based tools
    Want custom analyses

Bioinformatics Team
    Algorithm development
    Custom one-off analyses
    Programs for biologists

Software Engineers
    C++, Java
    Work off MATLAB prototypes

Using MATLAB, bioinformatics teams can support multiple constituencies.

Research Biologists
    Prefer UI/Web-based tools
    Want custom analyses

Bioinformatics Team
    Algorithm development
    Custom one-off analyses
    Programs for biologists

Software Engineers
    C++, Java
    Work off MATLAB prototypes

What the Bioinformatics Team delivers:
    MATLAB GUIs, analyses (for Research Biologists)
    MATLAB prototypes/applications (for Software Engineers)

User example: Genetic Sequence Base Calling

Complete draft of the human genome, accelerated by Applied Biosystems using MATLAB algorithms.

“Having one integrated package is a big advantage. Using MATLAB and the MATLAB Compiler reduced my development time by a factor of 4 or 5.”

“MATLAB has always been ideal as an algorithm prototyping tool,” Labrenz concludes, “but the MATLAB Compiler and C/C++ Math and Graphics Libraries add a whole new dimension, allowing rapid delivery of sophisticated solutions.”

Jim Labrenz, Applied Biosystems

User example: Breast Cancer Prognosis

Rosetta Inpharmatics recently developed a tool that enables clinicians to determine a breast cancer patient’s prognosis based on the gene expression profile of the primary tumor.

“Since MATLAB and the Image Processing Toolbox are fully integrated and the MATLAB platform is very good for matrix calculation, we did not have to spend time writing the low level image processing and the basic data analysis routines like vector and matrix calculations”

“Our research scientists are happy with the quick feedback,” Dr. Dai says. “Using MathWorks tools, we can respond to their requests very fast, and it’s easy for the scientists to use these tools. Using the GUIs that we develop in MATLAB, they can access functions without having to remember the underlying code.”

Dr. Hongyue Dai, Rosetta Inpharmatics/Merck & Company

Academic users



Bioinformatics Teaching
    MIT, Stanford, Cornell, Carnegie Mellon, …

Research
    Sequencing
        Base calling algorithm design
    Sequence analysis
        Computational biolinguistics
    Microarray analysis
        Statistical modeling of microarrays
    Proteomics
        Statistical modeling of protein-protein interaction
    Systems Biology
        Flux Analysis

More than 600 textbooks for education and professional use, in 19 languages

    Biosciences
    Controls
    Signal Processing
    Image Processing
    Mechanical Engineering
    Mathematics
    Natural Sciences
    Environmental Sciences

Thousands of universities teach students using MathWorks products.

Industry Issues & Solutions

Issue: Integrating tools from various programming languages is difficult, closed-source tools are not customizable, and freeware is often not supported.
Solution: MATLAB is a supported, open-architecture, user-friendly environment for data analysis across applications, algorithm development, and deployment.

Issue: There is no standard biological data format.
Solution: MATLAB and the Bioinformatics Toolbox provide file format support for common data sources (web-based, sequences, microarray, etc.).

Issue: Applications must be easily deployable within organizations.
Solution: MATLAB’s deployment tools and user-interface design environment allow easy deployment of MATLAB-based applications.

The Bioinformatics Toolbox

Robert Henson

The MathWorks, Inc.



MATLAB & The Bioinformatics Toolbox


The MathWorks Product Family

Integrated for:
    technical computing, data analysis and visualization
    system modeling and simulation
    implementation of real-time embedded software

[Diagram labels: Toolboxes, Blocksets, Stateflow, Code Generation, DAQ cards, Instruments, Databases and files, Financial Datafeeds, Desktop Applications, Automated Reports, PC-based real-time systems]


Bioinformatics Toolbox 1.0

File I/O
    FASTA, PDB, SCF, GPR, GAL

Web Connectivity
    GenBank, EMBL, PIR, PDB

Sequence Analysis & Alignment
    Needleman-Wunsch, Smith-Waterman
    DNA/RNA/AA conversions, pattern searching

Microarray Normalization & Visualization
    Lowess, global mean, MAD (median absolute deviation)

Protein Visualization
    Atomic composition, molecular weight, hydrophobicity profile
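The global mean and MAD options listed under microarray normalization can be illustrated with a few lines of arithmetic. The sketch below is written in Python purely as a language-neutral illustration; it is not the toolbox's code, and the function names (`global_mean_normalize`, `mad`, `mad_scale`) and the sample intensities are hypothetical. It assumes that global mean normalization means dividing one array's intensities by their mean, and that MAD scaling means centering on the median and dividing by the median absolute deviation; the toolbox's exact conventions may differ.

```python
from statistics import mean, median


def global_mean_normalize(values):
    """Scale one array's intensities so that their mean becomes 1.0."""
    m = mean(values)
    if m == 0:
        raise ValueError("cannot normalize: mean intensity is zero")
    return [v / m for v in values]


def mad(values):
    """Median absolute deviation: the median of |x - median(x)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


def mad_scale(values):
    """Center on the median and divide by the MAD (a robust z-score)."""
    center = median(values)
    spread = mad(values)
    if spread == 0:
        raise ValueError("cannot scale: MAD is zero")
    return [(v - center) / spread for v in values]


# Hypothetical spot intensities with one bright outlier.
intensities = [100.0, 200.0, 300.0, 400.0, 1000.0]
print(global_mean_normalize(intensities))  # [0.25, 0.5, 0.75, 1.0, 2.5]
print(mad(intensities))                    # 100.0
print(mad_scale(intensities))              # [-2.0, -1.0, 0.0, 1.0, 7.0]
```

The outlier pulls the mean up to 400 but leaves the median at 300, which is why median-based scaling is often preferred when a few spots are saturated.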

[Figure: example pairwise protein alignment. In the slide, a match line between the two sequences marks identical residues with "|" and similar residues with ":".]

212 PYESFTFPELMRKGSYNPVTHIYTAQDVKEVIEYARLRGIR
321 PYISRYYPELAVHGAYSE-SETYSEQDVREVAEFAKIYGVQ

MATLAB Desktop Tools

Command Window
Command History
Launchpad: Start other tools and demos
Workspace Browser: See your data

Sequence Alignment Tutorial Example


Get human and mouse genes from GenBank


Look for open reading frames (ORFs)


Convert DNA sequences to amino acid sequences


Create a dotplot of the two sequences


Perform global alignment


Perform local alignment
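The global alignment step can be sketched independently of any toolbox. The Python example below is a minimal illustration of the Needleman-Wunsch dynamic program with a simple match/mismatch/gap scheme; it is not the tutorial's MATLAB code. The function name `needleman_wunsch` and the scoring values are assumptions chosen for readability, whereas a production tool would typically use a substitution matrix and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap          # b[:j] aligned against gaps only

    def pair(i, j):
        """Score for aligning residue a[i-1] with residue b[j-1]."""
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```

Local (Smith-Waterman) alignment uses the same recurrence, except that cell scores are never allowed to drop below zero and the traceback starts from the highest-scoring cell rather than the corner.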


Microarray Data Analysis Tutorial Example


Plot expression profiles for genes


Filter genes based on information content of profile


Perform hierarchical clustering


Perform K-means clustering


Perform Principal Component Analysis
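As an illustration of the K-means step, the sketch below clusters a handful of expression profiles in plain Python. It is a simplified stand-in rather than the tutorial's MATLAB code: the names `kmeans` and `squared_distance` are hypothetical, the profiles are made-up numbers, and the centroids are seeded from the first k profiles to keep the run deterministic (practical implementations usually use randomized or repeated starts).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(profiles, k, max_iterations=100):
    """Cluster profiles into k groups; return (labels, centroids)."""
    if not 0 < k <= len(profiles):
        raise ValueError("k must be between 1 and the number of profiles")

    # Assumption for this sketch: seed centroids with the first k profiles.
    centroids = [list(p) for p in profiles[:k]]
    labels = None

    for _ in range(max_iterations):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return labels, centroids


# Hypothetical expression profiles (one row per gene, one column per time step).
profiles = [
    [0.0, 0.1, 0.0],
    [5.0, 5.1, 4.9],
    [0.1, 0.0, 0.1],
    [4.9, 5.0, 5.0],
]
labels, centroids = kmeans(profiles, k=2)
print(labels)  # [0, 1, 0, 1]
```

Genes with similar profiles end up with the same label, which is the grouping a cluster plot of the expression data would then display.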

Reference:

DeRisi, JL, Iyer, VR, Brown, PO. "Exploring the metabolic and genetic control of gene expression on a genomic scale." Science. 1997 Oct 24;278(5338):680-6.

Integrating and Deploying Bioinformatics Tools with
MATLAB

Robert Henson

The MathWorks, Inc.



Connecting to MATLAB

Excel / COM

File I/O

C/C++

Java

Perl

Deploying with MATLAB

Excel
COM

Push Data into MATLAB

Data I/O
    Import Excel ranges into MATLAB
    Export MATLAB data into Excel ranges

Evaluate MATLAB Statements in Excel

Computational Engine for Excel

Spreadsheet Applications

MATLAB Excel Link can be the computational engine behind your Excel applications

Fast scalable solution

MLPutMatrix("data",B2:H43)
MLPutMatrix("Genes",A2:A43)
MLPutMatrix("TimeSteps",B1:H1)
MLEvalString("clustergram(data,'RowLabels',…
    Genes,'ColLabels',TimeSteps)")

What else could you do?

Image Processing
Signal Processing
Neural Networks
Optimization
Statistics
Bioinformatics


Summary

Industry Issues & Solutions

Issue: Integrating tools from various programming languages is difficult, closed-source tools are not customizable, and freeware is often not supported.
Solution: MATLAB is a supported, open-architecture, user-friendly environment for data analysis across applications, algorithm development, and deployment.

Issue: There is no standard biological data format.
Solution: MATLAB and the Bioinformatics Toolbox provide file format support for common data sources (web-based, sequences, microarray, etc.).

Issue: Applications must be easily deployable within organizations.
Solution: MATLAB’s deployment tools and user-interface design environment allow easy deployment of MATLAB-based applications.

Further Information


Bioinformatics Toolbox Product page
    Demos, technical literature, trial information
    www.mathworks.com/products/bioinfo

MATLAB Central
    File exchange and newsgroup access for MATLAB and Simulink users
    www.mathworks.com/matlabcentral
    Access to comp.soft-sys.matlab