Illumina (PM session)


Oct 4, 2013


© 2009 Illumina, Inc. All rights reserved.

Illumina, illuminaDx, Solexa, Making Sense Out of Life, Oligator, Sentrix, GoldenGate, GoldenGate Indexing, DASL, BeadArray, Array of Arrays, Infinium, BeadXpress, VeraCode, IntelliHyb, iSelect, CSPro, and GenomeStudio are registered trademarks or trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.

The Illumina Analytical Model

Gordon L Spangler, PhD

Senior Field Applications Scientist

November 17, 2012

Today’s Presentation

I will NOT be talking about BioInformatics.

I will instead attempt to provide a background of the Illumina Analytical Platform that can enhance YOUR understanding of BioInformatics.


The Analytical Model

Platform Selection
Analytical Construct
Signal Generation
Data Acquisition
Data Interpretation

Illumina Speak

HiSeq, MiSeq, GAIIx
Sample Prep
Sequencing Run
1° Analysis (Basecalls)
2° Analysis (Alignment)

The Analytical Platform

Illumina Sequencing by Synthesis (SBS)

CCD Imaging


Platform Selection


Applications


Platform Selection

Single Base

Whole Genome

Individual

Species

Illumina Sequencing Spans the Spectrum

650G

2-10 Gb

Sample Size

Yield

Index (Multiplex)

Thousands

One

Illumina Platforms


The Analytical Construct



Sample Preparation

DNA

RNA

Gel size selection, if needed


Signal Acquisition

Flow Cell Design

Larger, dual-surface enabled
>5x increase in imaging area
Retains 8 lane format
25 mm wide x 75 mm long
Lanes 1.7 mm wide
Only compatible with cBot

Transfer Analytical Construct to a Solid Phase Substrate

Cluster Generation

Bind single DNA molecules to surface
Amplify on surface
~1000 molecules per ~1 µm cluster


Libraries are “Clustered” onto Solid Phase


Data Acquisition

Sequencing By Synthesis Read

Add 4 Fl-NTPs + polymerase
Incorporated Fl-NTP imaged
Terminator and fluorescent dye cleaved from Fl-NTP
Repeat x 36-151 cycles
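
The incorporate, image, cleave cycle listed above can be sketched as a toy simulation. This is only an illustration under idealized assumptions: every strand in a cluster extends by exactly one base per cycle, with no noise or phasing. The function name `sequence_by_synthesis` and the template string are hypothetical and are not part of any Illumina software.

```python
# Toy model of sequencing by synthesis (SBS). Idealized: no noise, no phasing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template, cycles):
    """Return the read produced by up to `cycles` rounds of SBS.

    `template` is given in the order the polymerase traverses it
    (an assumption made to keep the sketch simple).
    """
    read = []
    for cycle in range(min(cycles, len(template))):
        # 1. All four labelled, terminated nucleotides are offered; only the
        #    one complementary to the current template base is incorporated.
        incorporated = COMPLEMENT[template[cycle]]
        # 2. The incorporated nucleotide is imaged, which yields one base.
        read.append(incorporated)
        # 3. Terminator and dye are cleaved so the next cycle can extend.
    return "".join(read)


read = sequence_by_synthesis("TGCTACGAT", cycles=36)
print(read)
```

The read length is bounded by the number of cycles, which is why the slide quotes a cycle range (36 to 151) rather than a fixed length.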

From Library to Tag

[Figure: workflow from DNA (0.1-1.0 µg) through sample preparation, cluster generation (single molecule array), sequencing, image acquisition, and base calling]

Dual Surface Imaging

Operates in epi-illumination mode:
  Excitation and emission from the same side of the sample
Continuous Scanning:
  Cameras operate in TDI (Time Delay Integration):
    Fluorescence image read out continually
Dual surface scanning
  Top first, all lanes
  Bottom second, all lanes


Excitation and Scanning on HiSeq Sequencing Systems

Laser line
Allows two swaths (two columns) to be imaged in one lane/surface
Red laser line precedes green laser line
Stage moves flow cell in y direction to scan down the channel
~12.5 min/surface

[Figure labels: Flow Cell Lane, swaths, y]

Optics Maximize Image Capture Throughput

4 CCD cameras
  1 per channel
Capture image for each channel simultaneously
  Top surface scanned in first pass (all 4 colors)
  Bottom surface scanned in second pass (all 4 colors)

Channels: A, T, C, G


Flow Cell Swath/Tile

Flow Cell
Swath
Tile
8 Tiles (one per processor)

HiSeq Control Software (HCS) divides 1 swath (image) into 8 portions (tiles) for image analysis optimization

HiSeq tile is not the same as GA tile
  HiSeq tile (5.5 mm²) is 10x GA tile (0.55 mm²)


HiSeq Sequencing Systems Tiles

TOP: 1 2 3 4 5 6 7 8 | 21 22 23 24 25 26 27 28
BOTTOM: 41 42 43 44 45 46 47 48 | 61 62 63 64 65 66 67 68

HiSeq tile numbers do not correlate to GA tile numbers
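
The tile numbering shown on this slide can be enumerated with a short sketch. It assumes the layout as read from the slide (two swaths of 8 tiles per surface, starting at 1 and 21 on the top surface and at 41 and 61 on the bottom); the names `SWATH_STARTS` and `lane_tiles` are hypothetical, and the starting numbers are not taken from HCS documentation.

```python
# Tile numbers as listed on the slide: two swaths of 8 tiles per surface.
# The starting numbers are read from the slide, not from HCS documentation.
SWATH_STARTS = {"TOP": (1, 21), "BOTTOM": (41, 61)}
TILES_PER_SWATH = 8


def lane_tiles():
    """Return {surface: [tiles of swath 1, tiles of swath 2]} for one lane."""
    return {
        surface: [list(range(start, start + TILES_PER_SWATH)) for start in starts]
        for surface, starts in SWATH_STARTS.items()
    }


tiles = lane_tiles()
for surface, swaths in tiles.items():
    print(surface, swaths)

# Total tiles per lane under this layout: 2 surfaces x 2 swaths x 8 tiles.
tile_count = sum(len(swath) for swaths in tiles.values() for swath in swaths)
print(tile_count)
```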

Overview

Image Capture (.tif)
Template Generation (.clocs)
Intensities (.cif)
Base Calls (.bcl)

X:Y | A G C T
1400:1750 | 40 50 20 80
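
The intensity row above (cluster at 1400:1750 with A, G, C, T intensities of 40, 50, 20, 80) suggests how a base call follows from per-channel intensities. The sketch below simply picks the brightest channel; a production base caller also corrects the raw intensities and assigns quality scores, so treat `call_base` as a hypothetical simplification rather than the RTA algorithm.

```python
def call_base(intensities):
    """Return the base whose channel has the highest intensity.

    Simplification: no intensity correction and no quality scoring.
    """
    return max(intensities, key=intensities.get)


# Values taken from the example row on the slide.
cluster = {
    "x": 1400,
    "y": 1750,
    "intensities": {"A": 40, "G": 50, "C": 20, "T": 80},
}

base = call_base(cluster["intensities"])
print(f'{cluster["x"]}:{cluster["y"]} -> {base}')
```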


Data Acquisition and Sequence Generation

RTA Processing

Sequence Analysis Viewer (SAV)

Primary Analysis v1.9

by Jeremy Peirce, PhD
Senior Staff Scientist
Illumina, Inc

http://support.illumina.com/sequencing/sequencing_software/real-time_analysis_rta/training.ilmn


Data Interpretation


Overview

Alignment of Tags

Assessment of Variance

# ** CASAVA depth-filtered snp calls **
#$ CMDLINE /illumina/CASAVA-1.8a10/libexec/CASAVA-1.8.0a10/filterSmallVariants.pl --projectDir=/data/pipeline_in/Runs/110120_P20_0994_B809UWABXX_RNA-Index-8x1tile/Build_Project_RNA-PE-Dmx-Rta110_Sample_human- chrom=chr20.fa
#$ SEQ_MAX_DEPTH chr20.fa 1.85296888423124
#
#$ COLUMNS seq_name pos bcalls_used bcalls_filt ref Q(snp) max_gt Q(max_gt) max_gt|poly_site Q(max_gt|poly_site) A_used C_used G_used T_used
chr20.fa 259156 1 0 G 10 AG 3 AG 3 1 0 0 0
chr20.fa 261183 1 0 A 4 AA 3 AG 3 0 0 1 0
chr20.fa 266241 1 0 T 10 GT 3 GT 3 0 0 1 0
chr20.fa 286838 1 0 A 3 AA 4 AC 3 0 1 0 0
chr20.fa 287058 1 0 T 3 TT 4 CT 3 0 1 0 0
chr20.fa 290833 1 0 T 10 CT 3 CT 3 0 1 0 0
chr20.fa 294503 1 0 A 8 AG 2 AG 3 0 0 1 0
chr20.fa 295451 1 0 T 8 CT 2 CT 3 0 1 0 0
chr20.fa 299331 1 0 T 7 CT 2 CT 3 0 1 0 0
chr20.fa 299693 1 0 C 2 CC 5 CG 3 0 0 1 0
chr20.fa 299759 1 0 G 10 GT 3 GT 3 0 0 0 1
chr20.fa 299984 1 0 C 3 CC 3 AC 3 1 0 0 0
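
A report like the one above can be read into records keyed by the names in its `#$ COLUMNS` header. The following is a minimal sketch, assuming whitespace-separated fields and integer values in every column except the sequence name, reference base, and genotypes; the function name `parse_snp_calls` is hypothetical and is not part of CASAVA.

```python
# Column names copied from the "#$ COLUMNS" header line of the report.
COLUMNS = [
    "seq_name", "pos", "bcalls_used", "bcalls_filt", "ref", "Q(snp)",
    "max_gt", "Q(max_gt)", "max_gt|poly_site", "Q(max_gt|poly_site)",
    "A_used", "C_used", "G_used", "T_used",
]
# Columns kept as text; all others are assumed to hold integers.
TEXT_COLUMNS = {"seq_name", "ref", "max_gt", "max_gt|poly_site"}


def parse_snp_calls(text):
    """Parse data rows of a depth-filtered snp report into dictionaries."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        records.append({
            name: value if name in TEXT_COLUMNS else int(value)
            for name, value in zip(COLUMNS, fields)
        })
    return records


# Two rows copied from the slide, preceded by one header line.
sample = """\
#$ SEQ_MAX_DEPTH chr20.fa 1.85296888423124
chr20.fa 259156 1 0 G 10 AG 3 AG 3 1 0 0 0
chr20.fa 261183 1 0 A 4 AA 3 AG 3 0 0 1 0
"""

records = parse_snp_calls(sample)
# Positions whose most likely genotype (max_gt) is heterozygous.
het_positions = [r["pos"] for r in records if r["max_gt"][0] != r["max_gt"][1]]
print(het_positions)
```

Keeping the header names as dictionary keys makes it easy to filter on `Q(snp)` or on the per-base `*_used` counts without counting columns by hand.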






Alignment to a Reference Genome

Variant Calling Summary

reference:
Gapped ELAND alignments
Poor ELAND alignments re-aligned by small variant caller
Unmapped ‘shadow’ reads recovered by GROUPER

Local read realignment and incorporation of GROUPER results improves sensitivity of both SNP and indel calls

Data Analysis Options for MiSeq Data

MiSeq Sequencing Results
  .bcl files

Illumina Experiment Manager:
  Define analysis parameters

Local Analysis: MiSeq Reporter
  Amplicon
  De Novo Assembly
  Library QC
  Metagenomics
  Resequencing
  Small RNA

Cloud-Based Analysis: BaseSpace
  Amplicon
  De Novo Assembly
  Library QC
  Metagenomics
  Resequencing
  Small RNA

Third Party Software
Secure Data Storage
Data Sharing

BaseSpace: Illumina and the Cloud

https://basespace.illumina.com/project/2/appsession/108/details


Conclusion

The Human Genome Project

One person. Billions of Dollars. Ten Years.


Personal Genomes

STANFORD, Calif.--(BUSINESS WIRE)--The first few times that scientists mapped out all the DNA in a human being in 2001, each effort cost hundreds of millions of dollars and involved more than 250 people. Even last year, when the lowest reported cost was $250,000, genome sequencing still required almost 200 people. In a paper to be published online Aug. 9 by Nature Biotechnology, a Stanford University professor reports sequencing his entire genome for less than $50,000 and with a team of just two other people.

One person. $10,000. Two Weeks.

One person. $1,000. One Day.



Moore’s Law


Questions