CRISPR Properties - LIFL

peaceevenBiotechnology

Oct 4, 2013 (3 years and 8 months ago)


Using CRISPRs in micro-evolution studies
Algorithmique, combinatoire du texte et applications en bio-informatique
Institut de Génétique et Microbiologie
GPMS: Génomes Polymorphisme et Minisatellites
http://minisatellites.u-psud.fr/
Supervised by: Christine POURCEL, Gilles VERGNAUD
Presented by: Ibtissem GRISSA
28/09/2007
Outline of the talk
- CRISPR properties
- Bioinformatics tools

Background
- CRISPR properties
- Bacterial defense system

Results
- Bioinformatics tools: CRISPRFinder, CRISPRdb
- Micro-evolution studies
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)
CASS: CRISPR + cas

Structure:
- Leader (AT-rich)
- DR (direct repeat, 24 to 47 bp)
- Spacers
- Degenerated DR
- cas genes
TTTGATTATTGCCTGTGCGGCAGTGAACTCAGGGGACTGGCGAACAATGTCTTTCATGATTTTCTAAGCTGCCTGTGCGGCAGTGAACGAAAAGGTAAGATGGGCAA
GCTTCTAGTAGTTTTTCTAAGCTGCCTGTGCAGCAGTGAACATTATCTGAATGGCATTTTCTTTGGCGCAGATTTTCTAAGCTGCCTGTGCGGCAGTGAACAGTAAG
ATAATACGATAACATCCTGTTTGTAAAA
TACTTAT

Observed in prokaryotic genomes:
- almost all archaea (29/31)
- 40% of eubacteria (156/391)
Examples of CRISPRs

Shewanella sp. ANA-3 (CRISPR_2), positions 4626121 to 4626448
No.  DR                            Spacer                            Spacer length (bp)
1    AGGTTTTGCTGCCTTTTCGGCGGGTATC  TCAAAGTCAACTTGTAAATGACGATTTTCACG  32
2    ATTTTCAGCTGCCTATTCGGCAGGTCAC  AGTTTGGGGCTGAGTTTGCCATTTTCCTAAAT  32
3    ATTTTCAGCTGCCTATTCGGCAGGTCAC  GATGAAGCAGACCACCTCGATTACCCCACGCT  32
4    ATTTTCAGCTGCCTATTCGGCAGGTCAC  ACTATTTATCAAGACCTTCTTTAAAATCAAAC  32
5    ATTTTCAGCTGCCTATTCGGCAGGTCAC  AGTTTGGGGCTGAGTTTGCCATTTTCCTAAAC  32
6    ATTTTCAGCTGCCTATTCGGCAGGTCAC
     .**..**.......*......*...**.  (* : DR positions that vary between repeats)

Yersinia pestis KIM (CRISPR_4)
Positions 2875721 to 2875928
No.  DR                            Spacer                            Spacer length (bp)
1    TTATTGGGCTGCCTGTGCGGCAGTGAAC  GTTATACCCCGCGCAGGGAGTGAAGCGTTGAC  32
2    TTTCTAAGCTGCCTGTGCGGCAGTGAAC  TTAAGTTCTTTTTGTCAGCATCTTTAATAAAT  32
3    TTTCTAAGCTGCCTGTGCGGCAGTGAAC  CTGAAATACAAATAAAATAAATCGTCGAACAT  32
4    TTTCTAAGCTGCCTGTGCGGCAGTGAAC
     ..**.**.....................  (* : DR positions that vary between repeats)

Sulfolobus tokodaii str. 7 (CRISPR_2), positions 32702 to 39896
No.  DR                         Spacer                                     Spacer length (bp)
7    GATGAATCCCAAAAGGAATTGAAAG  TGATTGATCACAATGAGAAGACTGTAAAGCTGATAAAC     38
8    GATGAATCCCAAAAGGAATTGAAAG  TGTTGAGGCATAAATTAATCTATCCTTAATGAAAAAT      37
9    GATGAATCCCAAAAGGAATTGAAAG  TTCTTCCTCAGCCTCCATTTTGTTTATGATTTGTAGTGCC   40
10   GATGAATCCCAAAAGGAATTGAAAG  TTCAATAATCTCTATCTTTCCAAAATCTGTAAATGAAGAC   40
...
109  GATGAATCCCAAAAGGAATTGAAAG  AAAGCACAGTCAATAACGTTATCTGGTATCATATTATCAAA  41
110  GATGAATCCCAAAAGGAATTGAAAG  CTTTCTCCTTCCCTCTGATCTCTCGCTGAATTGAAAAGA    39
111  GATGAATCCCAAAAGGAATTGAAAG  GTAAGTATTGATGCTAACATTGACTTCGCTGTCCCAGGGGC  41
112  GATGAATCCCAAAAGGAATTGAAGG  AAGTATAATAACGATAGTACTAAAATTAATTGATCC       36
113  GATGATTCTCAAAAGGAATTGATAA
     .....*..*.............***  (* : DR positions that vary between repeats)

1 to 16 CRISPRs
1 to 248 motifs
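
The DR/spacer layout shown in the examples above can be illustrated with a small sketch. It rebuilds the Yersinia pestis KIM (CRISPR_4) array from the slide and splits it back into repeats and spacers by sliding the DR consensus along the sequence, tolerating a few mismatches so that the degenerated DR is still recognised. The function names and the `max_mismatch` threshold are assumptions made for illustration; this is not how CRISPRFinder itself works.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def split_array(seq, dr, max_mismatch=4):
    """Split a CRISPR array into (repeats, spacers) using a DR consensus.

    A window counts as a repeat when it differs from the consensus at
    no more than max_mismatch positions (hypothetical threshold).
    """
    n = len(dr)
    starts, i = [], 0
    while i + n <= len(seq):
        if hamming(seq[i:i + n], dr) <= max_mismatch:
            starts.append(i)
            i += n  # repeats do not overlap, jump past this one
        else:
            i += 1
    repeats = [seq[s:s + n] for s in starts]
    spacers = [seq[a + n:b] for a, b in zip(starts, starts[1:])]
    return repeats, spacers


# Sequences copied from the Yersinia pestis KIM (CRISPR_4) example.
DR = "TTTCTAAGCTGCCTGTGCGGCAGTGAAC"
DEGENERATED_DR = "TTATTGGGCTGCCTGTGCGGCAGTGAAC"
SPACERS = [
    "GTTATACCCCGCGCAGGGAGTGAAGCGTTGAC",
    "TTAAGTTCTTTTTGTCAGCATCTTTAATAAAT",
    "CTGAAATACAAATAAAATAAATCGTCGAACAT",
]
array = DEGENERATED_DR + SPACERS[0] + DR + SPACERS[1] + DR + SPACERS[2] + DR

repeats, spacers = split_array(array, DR)
for number, repeat in enumerate(repeats, start=1):
    spacer = spacers[number - 1] if number <= len(spacers) else ""
    print(number, repeat, spacer, len(spacer) if spacer else "")
```

With an exact-match threshold (`max_mismatch=0`) the first repeat would be missed, which is the kind of difficulty with degenerated DRs that motivates a dedicated tool.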
CRISPRs: a bacterial defense system
- CRISPR spacers generally originate from mobile elements (plasmids, phages) (Y. pestis, S. thermophilus, S. solfataricus, S. pyogenes…)
- CRISPRs are transcribed and subsequently processed as micro RNAs (by the cas gene machinery): an RNA interference (RNAi) system that blocks phage reproduction.
Cas proteins and CRISPR spacer sequences constitute a bacterial immune system that works by a mechanism similar to that of RNAi in higher organisms.
Bacterial defense against phage invaders
"CRISPR Provides Acquired Resistance Against Viruses in Prokaryotes", Barrangou, Horvath et al., Science 2007
System: Streptococcus thermophilus (used to make yogurt and cheese)
- Infection with phage leads to the incorporation of phage-related spacers within CRISPR1
- Such bacteria become resistant to further infection by similar phage strains
- If the spacer is removed, the resistance is lost
- At least one cas gene is necessary for resistance to phage
- At least one cas gene is needed to generate phage-resistant bacteria
Producing more phage-resistant bacterial strains for industrial use?
CRISPRFinder tool
- CRISPRs can be found relatively easily using existing software tools
BUT
- Output not appropriate for this purpose
- Background (tandem repeats, ...): further postprocessing and manual curation needed
- Difficulty in defining the DR consensus endpoints, plus degenerated DRs
- Sensitivity (short repeats are generally neglected)
- Absence of an easy and intuitive Web tool
Hence a dedicated software tool for the identification and preliminary analysis of CRISPRs:
- Precision
- Intuitive and easy to use
- Web service
CRISPRFinder workflow
Input: sequence(s)
- Search for maximal repeats using Vmatch (REPuter) to find possible CRISPR localizations: DR of 23 to 55 bp, separated by 25 to 60 bp
- Identification of candidate DRs (DR and degenerated DR')
- CRISPR structure check: spacer length within [0.6 DR, 2.5 DR]
- Tandem repeats elimination
Output: confirmed CRISPRs, and questionable CRISPRs (?) with only 2 or 3 DRs
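
The structure check in the workflow can be sketched as a simple filter on a candidate locus. The numeric thresholds (DR of 23 to 55 bp, spacer length between 0.6 and 2.5 times the DR length) come from the slide; the function name, the return labels, the reading of "questionable" as "2 or 3 DRs", and the use of identical spacers as a stand-in for tandem repeat elimination are assumptions, not the CRISPRFinder implementation.

```python
def check_candidate(dr, spacers, min_dr=23, max_dr=55, low=0.6, high=2.5):
    """Classify a candidate CRISPR as 'rejected', 'questionable' or 'confirmed'.

    dr      : consensus direct repeat (string)
    spacers : list of spacer sequences found between consecutive DRs
    """
    dr_len = len(dr)
    if not spacers or not (min_dr <= dr_len <= max_dr):
        return "rejected"
    # Spacer sizes must stay within [0.6 DR, 2.5 DR].
    if any(not (low * dr_len <= len(s) <= high * dr_len) for s in spacers):
        return "rejected"
    # Assumption: identical "spacers" indicate a tandem repeat, not a CRISPR.
    if len(set(spacers)) < len(spacers):
        return "rejected"
    # Assumption: a locus with only 2 or 3 DRs is reported as questionable.
    n_repeats = len(spacers) + 1
    return "questionable" if n_repeats <= 3 else "confirmed"


dr = "TTTCTAAGCTGCCTGTGCGGCAGTGAAC"
spacers = [
    "GTTATACCCCGCGCAGGGAGTGAAGCGTTGAC",
    "TTAAGTTCTTTTTGTCAGCATCTTTAATAAAT",
    "CTGAAATACAAATAAAATAAATCGTCGAACAT",
]
print(check_candidate(dr, spacers))          # 4 DRs, 3 distinct spacers
print(check_candidate(dr, spacers[:2]))      # only 3 DRs
print(check_candidate(dr, [spacers[0]] * 3)) # identical spacers
```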
CRISPRFinder output
http://crispr.u-psud.fr/Server/CRISPRfinder.php

CRISPRdb
http://crispr.u-psud.fr/crispr/CRISPRdatabase.php
Spacer dictionary creator
Example of use: the micro-evolution of the Y. pestis species
Evolution of the CRISPR structure
- Gain of spacers: polarized insertion adjacent to the leader sequence
- Interstitial loss of spacers by recombination between 2 DRs
- Conservation of the spacer order
[Figure: leader, DRs and spacers a to e, illustrating spacer acquisition and last DR duplication]
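
The three rules above (polarized gain, interstitial loss, conserved order) can be mimicked on lists of spacer names. This is a toy model for illustration only: the function names are hypothetical, and the choice to put the leader-proximal end at the end of the list is an assumption made so that the oldest spacer comes first.

```python
def acquire(array, spacer):
    """Polarized gain: the new spacer is added at the leader-proximal end.

    Assumption: the leader-proximal end is the END of the list.
    """
    return array + [spacer]


def lose_interstitial(array, first, last):
    """Interstitial loss: drop spacers array[first..last] (inclusive),
    as a recombination between two DRs would."""
    return array[:first] + array[last + 1:]


def order_conserved(x, y):
    """True when the spacers shared by x and y appear in the same order."""
    shared = set(x) & set(y)
    return [s for s in x if s in shared] == [s for s in y if s in shared]


ancestor = list("abcdefgh")
gained = acquire(acquire(acquire(ancestor, "o"), "v"), "w")
lost = lose_interstitial(ancestor, 4, 4)  # loses spacer e

print("".join(gained))                 # abcdefghovw
print("".join(lost))                   # abcdfgh
print(order_conserved(gained, lost))   # True
```

Because gains are polarized and losses only remove spacers, two alleles derived from the same ancestor keep their shared spacers in the same order, which is what makes the spacer content usable for comparing strains.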
CRISPR YP1 in three different strains

In each strain the array starts with tttgattatTGCCTGTGCGGCAGTGAAC, the spacers are separated by the DR TTTCTAAGCTGCCTGTGCGGCAGTGAAC, and the last DR is followed by AGTAAGATAATACGATAACATCCTGTTTGTAA.

Spacers:
a  TCAGGGGACTGGCGAACAATGTCTTTCATGAT
b  GAAAAGGTAAGATGGGCAAGCTTCTAGTAGTT
c  ATTATCTGAATGGCATTTTCTTTGGCGCAGAT
d  TCGCCATTCCGTGAACCTGAGCGCGTTCGCGA
e  ATATTCTCGAGCGATAGCAATAGCCATTCCAC
f  TCGGTCAAACAAATTTAGGCGACGATTTAACA
g  AAAAAGAATTTGGGATTAAAGTTACCCATCAG
h  TCAATGCCTGAATCTCTGGCGTGATAGCTGCGG
o  ACGTCATCCTGAAGGCTAGGCAGCTCGGCTTC
v  GAAATTGTGGGTGTAGATGTTGCAGACGCCTC
w  TCTGACGTTGCCTGTGTTGCCGCTCTCGTATT

Spacer order per strain:
Strain Java9:   e f g h
Strain 02-449:  a b c d e f g h o
Strain 195P:    a b c d e f g h o v w
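[Figure: map of the Yersinia pestis CO92 chromosome (4,653,728 bp) showing Ori and Ter, the three CRISPR loci YP1 (2769), YP2 (2895) and YP3 (1773), a prophage (2363-2409), and the neighbouring genes metA, aceB, atpA, AspC, nuoM, phosphotransferase and usher protein]
http://crispr.u-psud.fr/crispr/MultipleAnalysis/CRISPRdetector1.php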
A phylogenetic tool??
Identical spacers point to a common ancestor
- Spoligotyping in M. tuberculosis (inactive CRISPR)
Spoligotyping, M. tuberculosis (1: spacer present, 0: spacer absent)
strain  s1-s10      s11-s20     s21-s30     s31-s40     s41-s43
199     1111110001  1111111111  1000111111  1111111101  111
18      1111110001  1111111111  1000111111  1111111101  111
314     1111110001  1111111111  1000111111  1111111101  111
310     1111110001  1111111111  1000111111  1111111101  111
312     1111110001  1111111111  1000111111  1111111100  111
207     1111110000  1111111111  1111111111  1111111101  111
307     1111110001  1111111111  1111111111  1111111101  111
171     1111110001  1111111111  1111111111  1111111101  111
318     1111110001  1111111111  1111000000  0000000000  111
304     1111110001  1111111111  1111111111  1111111101  111
306     1111110001  1011111111  1111111111  1111111101  111
303     1111110001  1111111101  1111111111  1111111101  111
334     1101110001  1111111111  1111111111  1111111101  111
57      1111110001  1111111111  1111111111  1111111101  111
Evolution of the Y. pestis CRISPR
CRISPR YP1 evolution (Pourcel et al. 2005)
Y. pestis:
- 109 sequenced alleles
- 29 spacers

Tentative evolutionary scenario for the YP1 CRISPR
[Figure: tree of genotypes, each labelled with its YP1 spacer content]
- ancestor: abcde
- prediction for Y. pestis ancestor: abcdef
- "91001": adf (pestoides, China; a2b2c2d2)
- geno 1: abcdemn (pestoides, Georgia; a2j2b2k2l2m2)
- geno 6, 7, 8, 52, 53, 55@60: abc (antiqua (Asia), medievalis (Iran); a2b2c2d2)
- geno 54: abci (medievalis)
- geno 61: abct (medievalis)
- intermediate: abcd (antiqua)
- geno 2: abcdj (antiqua (Africa))
- geno 3: abcdjk (antiqua (Africa))
- geno 9, 14@25, 27@32, 36, 39@43, 46, 51: abcdefgh (orientalis; a2b2c2d2e2)
- geno 10, 26, 33, 34, 35, 38, 45: abcdefgh l/q/r/s/u/x/y/z
- geno 11@13: efgh
- geno 37: abcdfgh
- geno 44, 48, 50: abcdefgho
- geno 49: abcdefghop
- geno 47: abcdefghovw

"CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies", Pourcel et al., Microbiology 2005
Spacers dictionary creator
Y. pestis evolution (Antiqua -> Medievalis -> Orientalis)?
Create a binary file of all the spacers entered (0: absent, 1: present)
CRISPR_DR29_2
[Figure: binary presence/absence matrix, columns s135 down to s31, rows FSLF2-515, FSLJ2-003, FSLN1-017 and J2818]

Spacer content per strain:
str1: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23.24.25.26.27.28.29.30.31.32.33.69.70.72.73.74.75.76.77.79.80.81.82.83.84.85.86.87.88.89.90.97.98.99.100.101.104.
str2: 34.35.39.40.41.42.43.44.45.46.47.48.49.50.59.60.61.63.64.65.66.69.72.77.91.92.93.94.97.98.101.102.103.104.
str3: 51.52.53.54.55.56.57.59.60.61.62.63.65.66.67.68.69.71.72.76.78.79.80.95.96.98.99.102.103.104.105.
str4: 36.37.38.39.40.41.42.49.50.58.59.60.61.63.64.65.67.72.77.103.

[Figure: resulting tree, leaves in the order Str2, Str1, Str4, str3]
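
The binary file (0: absent, 1: present) can be derived from dotted spacer lists such as those of str2 and str4. The sketch below is only an illustration of that encoding; the function names and the output layout are assumptions and do not describe the actual file format written by the tool.

```python
def parse_spacers(dotted):
    """Turn '36.37.38.' into [36, 37, 38], ignoring the trailing dot."""
    return [int(token) for token in dotted.split(".") if token]


def presence_matrix(strains):
    """Return (columns, rows): sorted spacer ids and one 0/1 row per strain."""
    columns = sorted({s for spacers in strains.values() for s in spacers})
    rows = {}
    for name, spacers in strains.items():
        present = set(spacers)
        rows[name] = [1 if c in present else 0 for c in columns]
    return columns, rows


# Spacer lists of str2 and str4 as given on the slide.
strains = {
    "str2": parse_spacers(
        "34.35.39.40.41.42.43.44.45.46.47.48.49.50.59.60.61.63.64.65.66."
        "69.72.77.91.92.93.94.97.98.101.102.103.104."
    ),
    "str4": parse_spacers(
        "36.37.38.39.40.41.42.49.50.58.59.60.61.63.64.65.67.72.77.103."
    ),
}

columns, rows = presence_matrix(strains)
for name, row in rows.items():
    print(name, "".join(str(bit) for bit in row))
```

Each row can then be compared column by column, in the same way as the spoligotyping patterns.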
SpacerPHYL
Comparison of two methods: parsimony and distance

INPUT (oriented sequences: leader on the left):
>Yersinia_pestis_CO92
b.c.e.h.i.j.k.l.m
>Yersinia_pestis_KIM
a.b.f.g.h.i.j.l
>Yersinia_pestis_Antiqua
a.b.c.d.j.k
>Yersinia_pestis_Microtus
a.d.f
>Yersinia_pestis_Nepal516
a

1) Pairwise alignment
a.b.-.-.f.g.h.i.&
-.b.c.d.f.-.h.i.j
-.*.-.-.*.-.*.*.&
Legend: & ending gap, - gap, * match
2) Distance matrix
3) Tree
4) Alignment and construction of a phylogenetic tree

Scoring parameters:
valueOpeningGap = 20
valueEndingGap = -10
valueManyGaps = 20
valueFirstMatch = 100
valueNextMatch(i) = 100 + i
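
The pairwise alignment step can be approximated with a longest-common-subsequence alignment of two spacer lists. This is a simplified stand-in, not SpacerPHYL's scoring scheme: the slide's parameters (valueOpeningGap, valueFirstMatch, ...) are not used, and the function name and the handling of the ending gap are assumptions. On the pair shown on the slide it gives the same three lines.

```python
def align_spacers(x, y):
    """Align two spacer lists; return (row_x, row_y, marks).

    Marks: '*' match, '-' gap, '&' ending (trailing) gap.
    """
    n, m = len(x), len(y)
    # lcs[i][j] = length of the longest common subsequence of x[i:] and y[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if x[i] == y[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    row_x, row_y, marks = [], [], []
    i = j = 0
    while i < n or j < m:
        if i < n and j < m and x[i] == y[j]:
            row_x.append(x[i]); row_y.append(y[j]); marks.append("*")
            i += 1; j += 1
        elif j == m or (i < n and lcs[i + 1][j] >= lcs[i][j + 1]):
            row_x.append(x[i]); row_y.append("-"); marks.append("-")
            i += 1
        else:
            row_x.append("-"); row_y.append(y[j]); marks.append("-")
            j += 1

    # Relabel gaps at the end of the alignment as ending gaps ('&').
    k = len(marks) - 1
    while k >= 0 and marks[k] == "-":
        marks[k] = "&"
        if row_x[k] == "-":
            row_x[k] = "&"
        if row_y[k] == "-":
            row_y[k] = "&"
        k -= 1
    return row_x, row_y, marks


row_x, row_y, marks = align_spacers("a.b.f.g.h.i".split("."),
                                    "b.c.d.f.h.i.j".split("."))
print(".".join(row_x))
print(".".join(row_y))
print(".".join(marks))
print("shared spacers:", marks.count("*"))
```

The number of matches could feed a distance matrix (step 2), but the exact distance used by SpacerPHYL is not given on the slide.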
CRISPR, a tool for micro-evolution analysis
- Intra-species analysis (evolutionary history)
- Strain identification
- Epidemiological studies
A good phylogenetic tool?
Ancient species:
- highly polymorphic in spacer composition
- absence of CRISPRs
- Strain differentiation: ok
- Phylogenetic relations: not sufficient
To sum up
- CRISPR properties
- CRISPR extraction
- CRISPRFinder for CRISPR identification
- CRISPR database and related tools
- Spacer dictionary creator
http://crispr.u-psud.fr/Server/CRISPRfinder.php
http://crispr.u-psud.fr/crispr/CRISPRHomePage.php