Why Rattlesnakes - MCBIOS-group-project


Dec 13, 2013


Why Rattlesnakes?


Model organisms for adaptation to low-energy lifestyles
Low metabolism: down-regulate whole organ systems
Starvation bouts up to two years
Pit vipers: rapid rates of evolution of the mitochondrial genome

Applications:


Transcriptomics of specific tissues using RNA-Seq during starvation
Reprogramming of the muscle transcriptome during prolonged starvation
Reactivation of the digestive system upon feeding
Comparative biology in medical research
Chicken was one of the first non-mammalian vertebrate genomes sequenced
Assemblies of green anole lizard and python

Progress


Began with pilot 454 sequencing for repeats
Collaboration with faculty at KAUST
Illumina sequencing
Call by MCBIOS for community project
Share in the assembly and annotation, for training of students and development of programs, publications, funding

Current Data

Reads source   Library                                                 Files   KB            Coverage
454            gDNA, single read                                       13      23,540,109    11.8X
Illumina PE    gDNA, 100 bp on 179 bp library                          4       104,700,982   52.4X
Illumina MP    gDNA, 100 bp on 6.6 kbp library                         2       42,937,388    21.5X
Illumina PE    Mixed tissue RNA, 265 bp (including adapters 130 bp)    2       10,534,064    5.3X
Total                                                                          181,712,543   90.9X
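The totals in the read table can be re-derived from its rows. The sketch below is only an arithmetic check, not part of the project's pipeline. It sums the KB and coverage columns and prints the KB-per-1X ratio for each read set. The names `READ_SETS`, `totals` and `kb_per_fold` are made up for illustration, and the slides do not state which genome size was used to compute coverage.

```python
# Rows copied from the Current Data table:
# (reads source, library, files, size in KB, coverage in X)
READ_SETS = [
    ("454", "gDNA single read", 13, 23_540_109, 11.8),
    ("Illumina PE", "gDNA 100 bp on 179 bp library", 4, 104_700_982, 52.4),
    ("Illumina MP", "gDNA 100 bp on 6.6 kbp library", 2, 42_937_388, 21.5),
    ("Illumina PE", "mixed tissue RNA 265 bp", 2, 10_534_064, 5.3),
]


def totals(rows):
    """Return (total KB, total coverage) summed over all read sets."""
    total_kb = sum(row[3] for row in rows)
    total_cov = sum(row[4] for row in rows)
    return total_kb, total_cov


def kb_per_fold(rows):
    """Return KB of data per 1X of reported coverage, one value per row."""
    return [round(row[3] / row[4]) for row in rows]


if __name__ == "__main__":
    total_kb, total_cov = totals(READ_SETS)
    print(f"Total: {total_kb:,} KB, {total_cov:.1f}X")
    for row, ratio in zip(READ_SETS, kb_per_fold(READ_SETS)):
        print(f"{row[0]:<12} {ratio:,} KB per 1X")
```

The KB column sums exactly to the reported 181,712,543. The rounded per-row coverages sum to 91.0X rather than the 90.9X shown, which is consistent with the total having been computed before rounding. Every row comes out close to 2,000,000 KB per 1X.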

Transcriptome Assembly

Transcriptome
50-mer assembly: 11,000 contigs
No BLAST data yet
Ptitsyn
10237 with significant BLAST scores
888 with no annotation

Trinity Assembly

Transcriptome
Number of Transcripts Produced                     77746
N50 of Transcriptome (bp)                          590
Largest Transcript (bp)                            35858
Number of Peptide Models Produced from Assembly    74109

BLAST Report (NR Database)                         Numerical Value   Percentage
Number of Peptides Analyzed (subset of total)      30515             41%
Average Peptide Length                             203
Number of Models with E-value of e-40 or less      13117             43%
Number of Models with E-value of e-5 or less       16155             53%

BLAST Species Distribution (e-40 or less)          Number of Hits    Percent of Total
Anolis carolinensis (Carolina anole)               4897              37%
Crotalus adamanteus (eastern diamondback)          3075              23%
Gallus gallus (chicken)                            904               7%
Homo sapiens                                       445               3%
Total Unique Phylogenetic Top Hits                 193
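The percentages in the BLAST tables follow from the counts. The sketch below recomputes them as a check. The denominators are an assumption inferred from the numbers, not stated on the slides: analyzed peptides are taken relative to all peptide models, E-value counts relative to analyzed peptides, and species hits relative to the e-40 models. The names `percent` and `SPECIES_HITS` are illustrative only.

```python
# Counts copied from the Trinity Assembly and BLAST tables.
PEPTIDE_MODELS = 74109   # peptide models produced from the assembly
ANALYZED = 30515         # peptides analyzed (subset of total)
HITS_E40 = 13117         # models with E-value of e-40 or less
HITS_E5 = 16155          # models with E-value of e-5 or less

SPECIES_HITS = {
    "Anolis carolinensis": 4897,
    "Crotalus adamanteus": 3075,
    "Gallus gallus": 904,
    "Homo sapiens": 445,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to a whole number."""
    return round(100 * part / whole)


if __name__ == "__main__":
    print("analyzed / peptide models:", percent(ANALYZED, PEPTIDE_MODELS), "%")
    print("e-40 / analyzed:", percent(HITS_E40, ANALYZED), "%")
    print("e-5 / analyzed:", percent(HITS_E5, ANALYZED), "%")
    # Assumed denominator for the species table: the e-40 models.
    for species, hits in SPECIES_HITS.items():
        print(species, percent(hits, HITS_E40), "%")
    other = HITS_E40 - sum(SPECIES_HITS.values())
    print("hits in all other species:", other)
```

Under these assumed denominators the recomputed values match the table (41%, 43%, 53%, then 37%, 23%, 7%, 3%). The four listed species account for 9,321 of the 13,117 top hits.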

Assembly data


Genome
Velvet on Blacklight (Jeff Pummill)
k-mer length 43; does not include 454 data

Genome Assembly Comparison Using Assemblathon Stats Perl Script   Crotalus horridus horridus   Boa constrictor constrictor
Number of scaffolds                                               533118                       373780
Total size of scaffolds                                           1769030264                   1753784632
Longest scaffold                                                  309993                       578805
Shortest scaffold                                                 300                          1
Number of scaffolds > 1K nt                                       218438                       143655
Number of scaffolds > 10K nt                                      38799                        23752
Number of scaffolds > 100K nt                                     432                          3780
Number of scaffolds > 1M nt                                       0                            0
Number of scaffolds > 10M nt                                      0                            0
Mean scaffold size                                                3318                         4692
Median scaffold size                                              801                          726
N50 scaffold length                                               13409                        65989
L50 scaffold count                                                32098                        7038
% of assembly in scaffolded contigs                               79.60%                       0.00%
% of assembly in unscaffolded contigs                             20.40%                       100.00%
Average number of contigs per scaffold                            1.5                          1
Average length of break (>25 Ns) between contigs in scaffold      2964                         0
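The N50 and L50 rows can be illustrated with a small function. This is a minimal sketch of the usual definition, not the Assemblathon script itself. It sorts the lengths from longest to shortest and accumulates them until at least half of the total is covered. The length reached at that point is the N50, and the number of sequences needed is the L50. The function name `n50_l50` and the toy lengths are made up for illustration.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    N50: length of the sequence at which the running total, taken from
    longest to shortest, first reaches half of the summed length.
    L50: how many sequences were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no sequence lengths given")


if __name__ == "__main__":
    # Toy scaffold lengths, not project data: total 290, half 145.
    toy = [80, 70, 50, 40, 30, 20]
    print(n50_l50(toy))  # 80 + 70 = 150 >= 145, so N50 = 70 and L50 = 2
```

Read this way, the Crotalus horridus figures say that the 32,098 longest scaffolds, each at least 13,409 nt, hold half of the assembled bases.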

Assembly data (cont’d)

Genome Assembly Comparison Using Assemblathon Stats Perl Script   Crotalus horridus horridus   Boa constrictor constrictor
Number of contigs                                                 795199                       373780
Number of contigs in scaffolds                                    379361                       0
Number of contigs not in scaffolds                                415838                       373780
Total size of contigs                                             992068015                    1753784632
Longest contig                                                    63158                        578805
Shortest contig                                                   51                           1
Number of contigs > 1K nt                                         290984                       143655
Number of contigs > 10K nt                                        5939                         23752
Number of contigs > 100K nt                                       0                            3780
Number of contigs > 1M nt                                         0                            0
Number of contigs > 10M nt                                        0                            0
Mean contig size                                                  1248                         4692
Median contig size                                                729                          726
N50 contig length                                                 1969                         65989
L50 contig count                                                  121600                       7038
Where are the data


Submitted to GenBank BioProject/SRA
Available on secure FTP site at uark.edu
See drhoads@uark.edu for ID and password