Aug 15, 2012

Bioinformatics for comparative quantitative LC-MS(2/E) proteomics data analysis

Joost de Groot, Twan America, Roeland van Ham

Introduction



Joost de Groot



Scientific Software Developer



Wageningen University and Research (WUR)


Introduction



WUR = WU + R


R = Research

DLO (research institutes)

Plant Research International

BioScience (business unit)

Applied BioInformatics (cluster)

Introduction

BioScience:

High-throughput analyses of DNA, RNA, proteins and metabolites

Genome analyses and bioinformatics

Research on bioactive and health-promoting compounds

Investigate the plant as factory, e.g. for the production of pharmaceutical proteins

Perform research on stress biology

Explore quality traits of plants, such as taste, flavour, insect resistance and plant architecture




Introduction


BioScience is (among others) involved in










Introduction


I am involved in Bioinformatics for Proteomics









Bioinformatics for label-free comparative quantitative LC-MS(2/E) proteomics data analysis



Introduction


Data from Waters Q-TOF, Synapt MS systems

PLGS software data acquisition/processing

+ other software (e.g. Mascot, Progenesis)

We focus on post alignment data quality control and data quality improvement


Several Proteomics experiments


Differential protein expression in fungi infected plants


Allergens in mother’s milk


Apoplast protein identification


etc.




Introduction to LC-MS/MS















- Qualitative LC-MS/MS -> peptide identity -> sequence


Introduction to LC-MS/MS













Monoisotopic residue masses (free amino acid minus H2O):

Threonine: CH3-CH(OH)-CH(NH2)-COOH, residue mass ~101.048 Da

Alanine: CH3-CH(NH2)-COOH, residue mass ~71.0371 Da

Leucine: (CH3)2-CH-CH2-CH(NH2)-COOH, residue mass ~113.084 Da
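
The masses on this slide correspond to monoisotopic residue masses, i.e. the free amino acid minus one water (H2O) that is lost when the peptide bond forms. The sketch below is not part of the original slides; it recomputes the three values from the elemental compositions, assuming standard monoisotopic atomic masses. The names MONOISOTOPIC, FREE_AMINO_ACIDS and residue_mass are made up for this example.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of the free amino acids shown on the slide.
FREE_AMINO_ACIDS = {
    "Thr": {"C": 4, "H": 9, "N": 1, "O": 3},   # CH3-CH(OH)-CH(NH2)-COOH
    "Ala": {"C": 3, "H": 7, "N": 1, "O": 2},   # CH3-CH(NH2)-COOH
    "Leu": {"C": 6, "H": 13, "N": 1, "O": 2},  # (CH3)2-CH-CH2-CH(NH2)-COOH
}

WATER = {"H": 2, "O": 1}


def composition_mass(composition):
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


def residue_mass(name):
    """Mass of the amino acid as a residue inside a peptide chain."""
    return composition_mass(FREE_AMINO_ACIDS[name]) - composition_mass(WATER)


for name in FREE_AMINO_ACIDS:
    print(f"{name}: {residue_mass(name):.4f} Da")
```

In an MS/MS spectrum, the mass difference between two consecutive fragment ions of the same series matches one of these residue masses, which is how a sequence can be read from the spectrum.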




Introduction to LC-MS













- Quantitative LC-MS -> peptide mass/rt/intensity

- Comparative -> alignment of multiple runs

Introduction to LC-MS
(what is (was) the problem?)













This simplified example shows one peak in three runs (replicates) of a single sample.

Chromatogram of a single peptide (present in every replicate).

Problem: data processing software can make ‘mistakes’ during peak detection. Result: split peaks.

Peaks of highly abundant peptides and tailing peaks are prone to fragmentation.
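
A minimal sketch of how split peaks within one run could be recombined, assuming that fragments of the same peptide lie close together in both mass and retention time. This is not necessarily what the authors implemented; the Peak class, the tolerances (30 ppm, 0.5 min) and the greedy merging strategy are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mass: float       # Da
    rt: float         # retention time, minutes
    intensity: float  # arbitrary units


def ppm_difference(m1, m2):
    """Relative mass difference in parts per million."""
    return abs(m1 - m2) / ((m1 + m2) / 2) * 1e6


def merge_split_peaks(peaks, ppm_tol=30.0, rt_tol=0.5):
    """Greedily merge peaks that agree in mass and retention time.

    Merged peaks get the summed intensity and intensity-weighted
    mass and retention time. Tolerances are assumed values.
    """
    merged = []
    for peak in sorted(peaks, key=lambda p: (p.mass, p.rt)):
        for i, m in enumerate(merged):
            same_mass = ppm_difference(m.mass, peak.mass) <= ppm_tol
            same_rt = abs(m.rt - peak.rt) <= rt_tol
            if same_mass and same_rt:
                total = m.intensity + peak.intensity
                merged[i] = Peak(
                    mass=(m.mass * m.intensity + peak.mass * peak.intensity) / total,
                    rt=(m.rt * m.intensity + peak.rt * peak.intensity) / total,
                    intensity=total,
                )
                break
        else:
            # No existing peak is close enough: keep it as a new peak.
            merged.append(peak)
    return merged


# Made-up example: the first two entries are fragments of one peak.
peaks = [
    Peak(1000.000, 20.0, 800.0),
    Peak(1000.026, 20.2, 200.0),
    Peak(1500.000, 35.0, 500.0),
]
result = merge_split_peaks(peaks)
for p in result:
    print(p)
```

Summing the intensities matters for quantification: if the fragments stay separate, the peptide appears less abundant in the runs where its peak was split.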



History
(how I got involved)


2006/2007 CBSG Ind3 bottleneck project


Bioinformatics solutions for urgent issues in comparative quantitative proteomics data analysis (Twan America).

Highest priority:

Solve LC-MS peak detection fragmentation over multiple chromatograms (which needs some explanation, I guess)



History
(split peaks in detail on data level)

[Figure: split peaks at the data level; annotation: ~26 ppm]

History
(split peaks in detail on data level)













- Quantitative -> peptide mass/rt/intensity

- Comparative -> multiple samples = runs

History
(implementation of PACP)
















































History
(PACP)


Procedure published in Proteomics















De Groot, J.C. et al. Post alignment clustering procedure for comparative quantitative proteomics LC-MS data. Proteomics 2008, 8(1), 32-36.


Future


We applied for additional Bioinformatics for Proteomics funding (Twan America (supervisor) and Joost de Groot (bioinformatics developer)).

Granted:

CBSG2 BB6 project:

Scientific programmer, 2 years (~0.5 fte = ~0.25 fte/y)

NBIC/NPC/BIOASSIST/NGI = NBPP (Netherlands Bioinformatics for Proteomics Platform)

Scientific programmer, 2 years (~1 fte = ~0.5 fte/y)



Issues to address


CBSG BB6


Retention time correction of LC-MS results.

Several effects can cause (small) drifts in retention time, which can result in less accurate alignments.

PACP and SEDMAT results are expected to improve with Rt correction methods.


Solution: retention time correction algorithm.
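
One simple way such a correction could work, sketched under the assumption that a few landmark peptides have been matched between a run and a reference run: observed retention times are mapped onto the reference scale by piecewise linear interpolation between those landmarks. The slides do not specify the algorithm; make_rt_corrector and the anchor values below are hypothetical.

```python
from bisect import bisect_right


def make_rt_corrector(anchors):
    """Build a function mapping observed Rt to the reference Rt scale.

    anchors: (observed_rt, reference_rt) pairs for landmark peptides
    found in both runs. At least two pairs with distinct observed
    retention times are required.
    """
    points = sorted(anchors)
    xs = [observed for observed, _ in points]
    ys = [reference for _, reference in points]
    if len(xs) < 2 or any(a == b for a, b in zip(xs, xs[1:])):
        raise ValueError("need at least two anchors with distinct observed Rt")

    def correct(rt):
        # Find the segment containing rt; values outside the anchor
        # range are extrapolated from the first or last segment.
        i = bisect_right(xs, rt) - 1
        i = max(0, min(i, len(xs) - 2))
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + (rt - xs[i]) * slope

    return correct


# Made-up landmarks: this run elutes slightly early compared to the reference.
correct = make_rt_corrector([(10.0, 10.2), (20.0, 20.5), (30.0, 30.6)])
for observed in (15.0, 20.0, 30.0):
    print(observed, "->", round(correct(observed), 3))
```

After correction, the same peptide in different runs should fall closer together on the time axis, so a fixed retention time window for alignment can be narrower.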

Issues to address (CBSG BB6)

Issues to address


NBPP (BioAssist)


Make tools available (via web services)

Wrap tools in web services

Enables workflow management systems (like Taverna)

Re-engineer PACP (Python -> Java WS)


Solution: build web service providers/consumers
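
The idea of wrapping a tool in a web service can be sketched as follows. The project itself targets Java web services; this Python sketch only shows the pattern, using a plain HTTP/JSON endpoint built on the standard library. cluster_peaks is a hypothetical stand-in for a real tool such as PACP, and the JSON payload format and port number are assumptions.

```python
import json
from wsgiref.simple_server import make_server


def cluster_peaks(peaks):
    """Hypothetical stand-in for an analysis tool; only counts its input."""
    return {"n_peaks": len(peaks)}


def application(environ, start_response):
    """WSGI app: accepts a JSON list of peaks via POST, returns JSON."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed",
                       [("Content-Type", "text/plain")])
        return [b"POST a JSON list of peaks\n"]

    length = int(environ.get("CONTENT_LENGTH") or 0)
    try:
        peaks = json.loads(environ["wsgi.input"].read(length))
    except ValueError:
        peaks = None
    if not isinstance(peaks, list):
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"expected a JSON list\n"]

    body = json.dumps(cluster_peaks(peaks)).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def serve(host="localhost", port=8000):
    """Run the service until interrupted."""
    with make_server(host, port, application) as server:
        server.serve_forever()
```

Calling serve() starts a local server; a workflow management system could then POST a JSON list of peaks to it and receive the result as JSON, without needing to know how the tool is implemented.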

Issues to address (CBSG BB6 / NBPP)

Issues to address (NBPP)

Issues (NBIC/NPC)

NetBeans / Java


SEDMAT








GlassFish Application Server









Thanks for your attention.

Feel free to give comments, remarks or suggestions.


© Wageningen UR