Bioinformatics III - Chair of Computational Biology

1. Lecture WS 2008/09 Bioinformatics III

Bioinformatics III
“Systems biology”
“Cellular networks”
“Computational cell biology”
The course will teach mathematical methods that are applied to systems ranging from protein complexes to interaction networks.

Organization of biological cells
(Top) Since the 1950s, a paradigm became established that information flows from DNA via RNA to protein synthesis, which then gives rise to particular phenotypes.
(Middle) The rise of structural biology (the first crystal structure of the protein myoglobin was determined in 1960) emphasized the importance of the three-dimensional structures of proteins in determining their function.
(Bottom) Today, we have realized the central role played by molecular interactions that influence all other elements.

Organization of transcriptional regulatory networks
(left) The 'basic unit' comprises the transcription factor, its target gene with DNA recognition site and the regulatory interaction between them.
(middle) Units are often organized into network 'motifs', which comprise specific patterns of inter-regulation that are over-represented in networks. Examples of motifs include single input multiple output (SIM), multiple input multiple output (MIM), and feed-forward loop (FFL) motifs.
(right) Network motifs can be interconnected to form semi-independent 'modules', many of which have been identified by integrating regulatory interaction data with gene expression data, and imposing evolutionary conservation. The next level consists of the entire network (not shown).

Major metabolic pathways

Content (ca.)
Protein interaction networks:
- different topologies (random networks, scale-free networks)
- introduction to protein complexes: experimental data
- computational analysis (mathematical graphs)
- graphical layout (force minimization)
- quality check (Bayesian analysis)
- modularity
- network flow
Transcriptional regulatory networks, motifs
Signal transduction networks
Metabolic networks: metabolic flux analysis, extreme pathways, elementary modes
FFT protein-protein docking, fitting into EM maps, tomography
Integration of interactome and regulome (Lichtenberg)
Integration of protein networks with metabolic pathways

Appetizer 1
Cell cycle proteins that are part of complexes or other physical interactions are shown within the circle.
For the dynamic proteins, the time of peak expression is shown by the node color; static proteins are represented as white nodes.
Outside the circle, the dynamic proteins without interactions are positioned and colored according to their peak time.
Lichtenberg et al. Science 307, 724 (2005)

Appetizer 2
a, Schematics and summary of properties for the endogenous and exogenous sub-networks.
b, Graphs of the static and condition-specific networks. Transcription factors and target genes are shown as nodes in the upper and lower sections of each graph respectively, and regulatory interactions are drawn as edges; they are coloured by the number of conditions in which they are active. Different conditions use distinct sections of the network.
c, Standard statistics (global topological measures and local network motifs) describing network structures. These vary between endogenous and exogenous conditions; those that are high compared with other conditions are shaded. (Note, the graph for the static state displays only sections that are active in at least one condition, but the table provides statistics for the entire network including inactive regions.)
Luscombe, Babu, … Teichmann, Gerstein, Nature 431, 308 (2004)

Mathematical techniques covered
Mathematical graphs
– classification of protein-protein interaction networks
– algorithms on graphs
– regulatory networks
Bayesian networks – protein interaction networks
Boolean networks – transcriptional networks
Fourier transformation – protein-protein docking, pattern matching
Linear and convex algebra – metabolic networks
Ordinary and stochastic differential equations – kinetic modelling of signal transduction pathways

Literature
Lecture slides are available already; final version 3 days before lecture.
Suggested reading: links will be put up on the course website
http://gepard.bioinformatik.uni-saarland.de/teaching
...
Textbooks
€55,- €37,- (Amazon) €54,-
All 3 books are available in the computer science library!

Assignments
10 weekly assignments are planned.
Homework assignments are handed out in the Thursday lectures and are available on the course website on the same day.
Homework will include many programming assignments. You can program in any popular programming language. We recommend powerful scripting languages such as Python or Perl that allow problems to be solved efficiently.
Solutions must be returned by 14.00 on Thursday of the following week to Peter Walter in room 0.11, building C71, ground floor, or handed in prior (!) to the lecture starting at 14.15. 2 students may submit one joint solution.
Also possible: submit the solution by e-mail as 1 printable PDF file to p.walter@bioinformatik.uni-saarland.de.
Tutorial: participation is recommended but not mandatory. Date: Wed 14-16?
Homework submitted on Thursdays will be discussed on the following Wednesday.
Each student needs to present his or her solution to one of the assignments on the blackboard once in the tutorial session.

Tests
4 tests are planned on: Nov. 11, Dec. 9, Jan. 13, Feb. 10.
Each will last 45 minutes.
Tests will cover the material from the lectures since the start of the course or since the last test.
You need to pass 3 out of 4 tests.
The grade for the tests will be computed from the best 3 tests.
If you miss one of the tests for a medical reason, please provide a medical certificate. You will be given a chance for an oral exam on the same subject.
If you have passed only 2 tests, you may choose one of the failed or missing tests for an oral re-exam.
If you have passed only 1 test, you have failed the class.

Schein = successful written exam
Successful participation in the lecture course (“Schein”) will be certified upon successful completion of the written exam that will take place on Feb. 20, 2009, 10–12 am.
Participation in the exam is open to those students who have
- received 50% of the credit points for the assignments and
- presented once during the tutorials and
- passed 3 tests.
The final mark of the “Schein” will be the average of the 3 best test results (50%) and of your result in the final exam (50%).
The final exam will take 120 min and will only cover the material of the assignments.
In case of illness please send an e-mail to:
kerstin.gronow-p@bioinformatik.uni-saarland.de
and provide a medical certificate.
A “second and final chance” exam will be offered in April 2009.

Tutors
Peter Walter, Sikander Hayat – assignments
Building C71, room 0.11
p.walter@bioinformatik.uni-saarland.de

Systems biology
Biological research in the 1900s followed a reductionist approach:
detect unusual phenotype → isolate/purify 1 protein/gene → determine its function
However, it is increasingly clear that a discrete biological function can only rarely be attributed to an individual molecule.
→ new task of understanding the structure and dynamics of the complex intercellular web of interactions that contribute to the structure and function of a living cell.

Systems biology
The development of high-throughput data-collection techniques, e.g. microarrays, protein chips and yeast two-hybrid screens, makes it possible to simultaneously interrogate all cell components at any given time.
There exist various types of interaction webs/networks:
- protein-protein interaction network
- metabolic network
- signalling network
- transcription/regulatory network ...
These networks are not independent but form a “network of networks”.

DOE initiative: Genomes to Life
a coordinated effort
Slides borrowed from a talk by Marvin Frazier, Life Sciences Division, U.S. Dept. of Energy

Facility I
Production and Characterization of Proteins
Estimating Microbial Genome Capability
• Computational Analysis
– Genome analysis of genes, proteins, and operons
– Metabolic pathways analysis from reference data
– Protein machines estimate from PM reference data
• Knowledge Captured
– Initial annotation of genome
– Initial perceptions of pathways and processes
– Recognized machines, function, and homology
– Novel proteins/machines (including prioritization)
– Production conditions and experience

Facility II
Whole Proteome Analysis
Modeling Proteome Expression, Regulation, and Pathways
• Analysis and Modeling
– Mass spectrometry expression analysis
– Metabolic and regulatory pathway/network analysis and modeling
• Knowledge Captured
– Expression data and conditions
– Novel pathways and processes
– Functional inferences about novel proteins/machines
– Genome super annotation: regulation, function, and processes (deep knowledge about cellular subsystems)

Facility III
Characterization and Imaging of Molecular Machines
Exploring Molecular Machine Geometry and Dynamics
• Computational Analysis, Modeling and Simulation
– Image analysis/cryoelectron microscopy
– Protein interaction analysis/mass spec
– Machine geometry and docking modeling
– Machine biophysical dynamic simulation
• Knowledge Captured
– Machine composition, organization, geometry, assembly and disassembly
– Component docking and dynamic simulations of machines

Facility IV
Analysis and Modeling of Cellular Systems
Simulating Cell and Community Dynamics
• Analysis, Modeling and Simulation
– Couple knowledge of pathways, networks, and machines to generate an understanding of cellular and multi-cellular systems
– Metabolism, regulation, and machine simulation
– Cell and multicell modeling and flux visualization
• Knowledge Captured
– Cell and community measurement data sets
– Protein machine assembly time-course data sets
– Dynamic models and simulations of cell processes

“Genomes To Life” Computing Roadmap
[Figure. Axis labels: Biological Complexity; Computing and Information Infrastructure Capabilities. Labels in the plot: Current U.S. Computing; Comparative Genomics; Genome-scale protein threading; Constrained rigid docking; Constraint-Based Flexible Docking; Protein machine Interactions; Molecular machine classical simulation; Cell, pathway, and network simulation; Community metabolic regulatory, signaling simulations; Molecule-based cell simulation.]

Are biological networks special?
Albert-Laszlo Barabasi
Statistical physics tries to find universal scaling laws of systems, e.g. how do the dynamics of a glass change when you lower the temperature?
Phase transition: “critical slowing down”.
“Relaxation times in spin-glasses or glasses are observed to grow to such an extent at low temperatures that these systems do not reach thermal equilibrium on experimentally accessible time-scales. Properties of such systems are then often found to depend on their history of preparation; such systems are said to age.
Similar observations are made in coarsening dynamics at first order phase transitions. Some properties of spin-glasses and glasses must therefore be studied via dynamical approaches which allow taking possible history dependence explicitly into account.”

Power laws
A power law relationship between two scalar quantities x and y is any such that the relationship can be written as
$$y = a x^{k}$$
where a (the constant of proportionality) and k (the exponent of the power law) are constants.
Power laws can be seen as a straight line on a log-log graph since, taking logs on both sides, the above equation is equal to
$$\log(y) = k \log(x) + \log(a)$$
which has the same form as the equation for a line
$$y = m x + c$$
Power laws are observed in many fields, including physics, biology, geography, sociology, economics, and war and terrorism. They are among the most frequent scaling laws that describe the scaling invariance found in many natural phenomena.
www.wikipedia.org
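As a minimal sketch of the log-log argument above (not part of the original slides), the following Python snippet recovers a and k from sample points by fitting a straight line to (log x, log y) with ordinary least squares. The function name `fit_power_law` and the sample data are illustrative assumptions.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = k*log(x) + log(a); returns (a, k)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    k = sxy / sxx                # slope m of the straight line
    log_a = mean_y - k * mean_x  # intercept c of the straight line
    return math.exp(log_a), k

# Hypothetical noise-free data generated from y = 3 * x^(-2)
xs = [1, 2, 4, 8, 16]
ys = [3.0 * x ** -2.0 for x in xs]
a, k = fit_power_law(xs, ys)
```

With noisy real data, a straight-line fit on log-log axes is only a rough first check of power-law behaviour.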

Degree
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
The most elementary characteristic of a node is its degree (or connectivity), k. It is defined as the number of links between this node and other nodes.
a In the undirected network, node A has k = 5.
b In networks in which each link has a selected direction there is an incoming degree, $k_{in}$, which denotes the number of links that point to a node, and an outgoing degree, $k_{out}$, which denotes the number of links that start from it.
E.g., node A in b has $k_{in} = 4$ and $k_{out} = 1$.
An undirected network with N nodes and L links is characterized by an average degree $\langle k \rangle = 2L/N$ (where $\langle \cdot \rangle$ denotes the average).
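The definitions above can be sketched in a few lines of Python. The edge lists below are hypothetical toy graphs, chosen only so that node A has k = 5 in the undirected case and k_in = 4, k_out = 1 in the directed case; they are not the networks drawn in the figure, and the function names are assumptions.

```python
from collections import Counter

def degrees(edges):
    """Degree k of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def in_out_degrees(arcs):
    """Incoming and outgoing degree of every node in a directed edge list."""
    k_in, k_out = Counter(), Counter()
    for src, dst in arcs:
        k_out[src] += 1
        k_in[dst] += 1
    return k_in, k_out

def average_degree(edges):
    """<k> = 2L/N, counting only nodes that appear in at least one link."""
    nodes = {n for e in edges for n in e}
    return 2 * len(edges) / len(nodes)

undirected = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "G"), ("B", "C")]
directed = [("B", "A"), ("C", "A"), ("D", "A"), ("E", "A"), ("A", "G")]

k = degrees(undirected)
k_in, k_out = in_out_degrees(directed)
mean_k = average_degree(undirected)  # L = 6 links, N = 6 nodes
```

Isolated nodes do not appear in an edge list, so a real data set would need the node count N supplied separately.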

Degree distribution
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
The degree distribution, P(k), gives the probability that a selected node has exactly k links.
P(k) is obtained by counting the number of nodes N(k) with k = 1, 2, ... links and dividing by the total number of nodes N.
The degree distribution allows us to distinguish between different classes of networks.
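A short sketch of this counting procedure, assuming the degrees are already available as a mapping from node to k. The function name and the example degrees are hypothetical.

```python
from collections import Counter

def degree_distribution(degree_by_node):
    """P(k) = N(k) / N for every degree k that occurs in the network."""
    n_total = len(degree_by_node)
    n_of_k = Counter(degree_by_node.values())
    return {k: n_of_k[k] / n_total for k in sorted(n_of_k)}

# Hypothetical degrees of a six-node network with one hub
example_degrees = {"A": 5, "B": 2, "C": 2, "D": 1, "E": 1, "G": 1}
P = degree_distribution(example_degrees)
```

Plotting P(k) against k on log-log axes is the usual way to inspect whether the tail looks exponential or power-law.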

Clustering coefficient
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
In many networks, if node A is connected to B, and B is connected to C, then it is highly probable that A also has a direct link to C. This phenomenon can be quantified using the clustering coefficient
$$C_I = \frac{2 n_I}{k_I (k_I - 1)}$$
where $n_I$ is the number of links connecting the $k_I$ neighbours of node I to each other.
In other words, $n_I$ gives the number of 'triangles' that go through node I, whereas $k_I (k_I - 1)/2$ is the total number of triangles that could pass through node I, should all of node I's neighbours be connected to each other. $C_I$ is therefore the fraction of possible triangles that are actually present.
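A minimal sketch of the clustering coefficient for an undirected network stored as adjacency sets. The toy graph is hypothetical (a hub A with five neighbours, of which only B and C are linked), and returning 0 for nodes with fewer than two neighbours is a convention assumed here, not something the slides specify.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def clustering_coefficient(adj, node):
    """C_I = 2 n_I / (k_I (k_I - 1)); taken as 0.0 when k_I < 2."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    # n_I: number of links among the neighbours of the node
    n_links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * n_links / (k * (k - 1))

edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "G"), ("B", "C")]
adj = build_adjacency(edges)
c_a = clustering_coefficient(adj, "A")  # 2 * 1 / (5 * 4) = 2/20
```

Averaging `clustering_coefficient` over all nodes with the same degree k would give the function C(k).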

Clustering coefficient
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
a Only 1 pair of node A's 5 neighbours are linked together (B and C), which gives $n_A = 1$ and $C_A = 2/20$.
By contrast, none of node F's neighbours link to each other, giving $C_F = 0$. The average clustering coefficient, $\langle C \rangle$, characterizes the overall tendency of nodes to form clusters.
An important measure of the network's structure is the function C(k), which is defined as the average clustering coefficient of all nodes with k links. For many real networks $C(k) \sim k^{-1}$, which is an indication of a network's hierarchical character.
The average degree $\langle k \rangle$, average path length $\langle \ell \rangle$ and average clustering coefficient $\langle C \rangle$ depend on the number of nodes and links (N and L) in the network.
By contrast, the P(k) and C(k) functions are independent of the network's size and they therefore capture a network's generic features, which allows them to be used to classify various networks.

Random networks
Barabasi & Oltvai, Nature Rev Gen 5, 101 (2004)
Aa The Erdős–Rényi (ER) model of a random network starts with N nodes and connects each pair of nodes with probability p, which creates a graph with approximately pN(N-1)/2 randomly placed links.
Ab The node degrees follow a Poisson distribution, where most nodes have approximately the same number of links (close to the average degree $\langle k \rangle$).
The tail (high k region) of the degree distribution P(k) decreases exponentially, which indicates that nodes that significantly deviate from the average are extremely rare.
Ac The clustering coefficient is independent of a node's degree, so C(k) appears as a horizontal line if plotted as a function of k.
The mean path length is proportional to the logarithm of the network size, $\ell \sim \log N$, which indicates that it is characterized by the small-world property.
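The ER construction can be sketched directly from its definition: visit every pair of nodes once and link it with probability p. The function name, parameters and seed are illustrative assumptions.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Return the links of a G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)
    # Each of the n(n-1)/2 node pairs is linked independently with probability p
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

n, p = 200, 0.05
links = erdos_renyi(n, p, seed=1)
expected_links = p * n * (n - 1) / 2  # pN(N-1)/2
```

The number of links in any single realisation fluctuates around `expected_links`, which is why the slide says "approximately".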

Scale-free networks
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
Scale-free networks are characterized by a power-law degree distribution; the probability that a node has k links follows $P(k) \sim k^{-\gamma}$, where $\gamma$ is the degree exponent.
The probability that a node is highly connected is statistically more significant than in a random graph, the network's properties often being determined by a relatively small number of highly connected nodes (“hubs”, see blue nodes in Ba).
In the Barabási–Albert model of a scale-free network, at each time point a node with M links is added to the network, which connects to an already existing node I with probability $\Pi_I = k_I / \sum_J k_J$, where $k_I$ is the degree of node I and J is the index denoting the sum over network nodes. The network that is generated by this growth process has a power-law degree distribution with $\gamma = 3$.
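A rough sketch of this growth process with preferential attachment. Two details are assumptions of the sketch rather than statements from the slides: the network starts from a small fully connected core of m + 1 nodes, and each new node picks m distinct targets by redrawing duplicates.

```python
import random
from collections import Counter

def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; each new node attaches with m links."""
    rng = random.Random(seed)
    # Assumed starting point: a fully connected core of m + 1 nodes
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # 'stubs' holds every node once per link end, so a uniform draw from it
    # selects node I with probability k_I / sum_J k_J
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

ba_edges = barabasi_albert(500, 2, seed=42)
ba_degree = Counter(node for edge in ba_edges for node in edge)
```

Early nodes accumulate links fastest, which is how the hubs mentioned above arise.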

Scale-free networks
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
(Bb) Power-law distributions are seen as a straight line on a log–log plot.
(Bc) The network that is created by the Barabási–Albert model does not have an inherent modularity, so C(k) is independent of k.
Scale-free networks with degree exponents $2 < \gamma < 3$, a range that is observed in most biological and non-biological networks, are ultra-small, with the average path length following $\ell \sim \log \log N$, which is significantly shorter than the $\log N$ that characterizes random small-world networks.

Importance of the degree exponent
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
The value of $\gamma$ in $P(k) \sim k^{-\gamma}$ determines many properties of the system. The smaller the value of $\gamma$, the more important the role of the hubs is in the network.
In general, the unusual properties of scale-free networks are valid only for $\gamma < 3$.
For $2 < \gamma < 3$ there is a hierarchy of hubs, with the most connected hub being in contact with a small fraction of all nodes.
For $\gamma = 2$ a hub-and-spoke network emerges, with the largest hub being in contact with a large fraction of all nodes.
For $\gamma < 3$, the dispersion of the P(k) distribution, defined as $\sigma^2 = \langle k^2 \rangle - \langle k \rangle^2$, increases with the number of nodes (that is, $\sigma$ diverges), resulting in a series of unexpected features, such as a high degree of robustness against accidental node failures.
For $\gamma > 3$, the hubs are not relevant, most unusual features are absent, and in many respects the scale-free network behaves like a random one.
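To make the dispersion concrete, the sketch below evaluates the variance of the degree, <k^2> - <k>^2, for two hypothetical degree lists: a homogeneous network and a small hub-and-spoke network. The function name and both lists are illustrative assumptions.

```python
def degree_dispersion(degree_list):
    """Variance of the degree distribution: <k^2> - <k>^2."""
    n = len(degree_list)
    mean_k = sum(degree_list) / n
    mean_k2 = sum(k * k for k in degree_list) / n
    return mean_k2 - mean_k ** 2

homogeneous = [3, 3, 3, 3, 3, 3]    # every node has the same degree
hub_and_spoke = [5, 1, 1, 1, 1, 1]  # one hub linked to five leaves

var_homogeneous = degree_dispersion(homogeneous)
var_hub = degree_dispersion(hub_and_spoke)
```

A single hub already dominates the second moment, which is the mechanism behind the growth of the dispersion with network size.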

Shortest path and mean path length
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)
The distance in networks is measured by the path length, which tells us how many links we need to pass through to travel between two nodes.
As there are many alternative paths between two nodes, the shortest path (the path with the smallest number of links between the selected nodes) has a special role.
In directed networks, the distance $\ell_{AB}$ from node A to node B is often different from the distance $\ell_{BA}$ from B to A. E.g. in b, $\ell_{BA} = 1$, whereas $\ell_{AB} = 3$.
Often there is no direct path between two nodes. As shown in b, although there is a path from C to A, there is no path from A to C. The mean path length, $\langle \ell \rangle$, represents the average over the shortest paths between all pairs of nodes and offers a measure of a network's overall navigability.
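Shortest path lengths in an unweighted network can be computed with breadth-first search. The directed toy graph below is hypothetical; it is built only to reproduce the relations quoted above (distance 1 from B to A, distance 3 from A to B, a path from C to A but none from A to C). Averaging only over ordered pairs that are actually connected is an assumption of this sketch.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: links on the shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_path_length(adj):
    """Average shortest path length over all ordered pairs joined by a path."""
    nodes = set(adj) | {v for targets in adj.values() for v in targets}
    total = pairs = 0
    for s in nodes:
        for t, d in shortest_path_lengths(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")

# Directed links: A->D, D->E, E->B, B->A, C->A
adj = {"A": ["D"], "D": ["E"], "E": ["B"], "B": ["A"], "C": ["A"]}
from_a = shortest_path_lengths(adj, "A")
from_b = shortest_path_lengths(adj, "B")
from_c = shortest_path_lengths(adj, "C")
mean_l = mean_path_length(adj)
```

Node C is absent from `from_a`, which is how an unreachable node shows up in this representation.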

First breakthrough: scale-free metabolic networks
(d) The degree distribution, P(k), of the metabolic network illustrates its scale-free topology.
(e) The scaling of the clustering coefficient C(k) with the degree k illustrates the hierarchical architecture of metabolism. (The data shown in d and e represent an average over 43 organisms.)
(f) The flux distribution in the central metabolism of Escherichia coli follows a power law, which indicates that most reactions have small metabolic flux, whereas a few reactions, with high fluxes, carry most of the metabolic activity.
Barabasi & Oltvai, Nature Reviews Genetics 5, 101 (2004)

Second breakthrough: Yeast protein interaction network: first example of a scale-free network
A map of protein–protein interactions in Saccharomyces cerevisiae, which is based on early yeast two-hybrid measurements, illustrates that a few highly connected nodes (which are also known as hubs) hold the network together.
The largest cluster, which contains 78% of all proteins, is shown. The colour of a node indicates the phenotypic effect of removing the corresponding protein (red = lethal, green = non-lethal, orange = slow growth, yellow = unknown).
Barabasi & Oltvai, Nature Rev Gen 5, 101 (2004)

Summary
Many cellular networks show properties of scale-free networks:
- protein-protein interaction networks
- metabolic networks
- genetic regulatory networks (where nodes are individual genes and links are derived from expression correlation, e.g. by microarray data)
- protein domain networks
However, not all cellular networks are scale-free.
E.g. the transcription regulatory networks of S. cerevisiae and E. coli are examples of mixed scale-free and exponential characteristics.
It is a topic of ongoing debate whether the analysis of subnetworks (available data is sparse) allows conclusions on the underlying topology of the entire network.
Next lecture:
- mathematical properties of networks
- origin of scale-free topology
- topological robustness
Barabasi & Oltvai, Nature Rev Gen 5, 101 (2004)