Essential Bioinformatics and Biocomputing (LSM2104: Section I ...

Oct 4, 2013

Gene Expression

Chapter 9

What is Gene Expression?

The process of transcribing and translating a gene to yield a protein product.

Why are we interested in gene expression?

Tells us which genes are involved in which functions.

Gene Expression


Cells are different because of differential gene expression (proteome).

About 40% of human genes are expressed at one time.

A gene is expressed by transcribing DNA into single-stranded mRNA (transcriptome).

The mRNA is later translated into a protein.

Molecular Biology Overview

[Figure: cell, nucleus, chromosome, gene (DNA), gene (mRNA, single strand), cDNA, protein]

Gene Expression

Genes control cell behavior by controlling which proteins are made by a cell.

Housekeeping genes vs. cell/tissue-specific genes

Regulation:

Transcriptional (promoters and enhancers)

Post-transcriptional (RNA splicing, stability, localization; small non-coding RNAs)

Gene Expression

Regulation:

Translational (3' UTR repressors, poly(A) tail)

Post-transcriptional (RNA splicing, stability, localization; small non-coding RNAs)

Post-translational (protein modification: carbohydrates, lipids, phosphorylation, hydroxylation, methylation, precursor protein)

How do you measure Gene Expression?


Traditional Methods

Northern Blotting: single RNA isolated; probed with labeled cDNA

Western Blotting: multiple proteins; probed with antibodies to a specific protein

RT-PCR: primers amplify specific cDNA transcripts

How do Microarrays work?


Microarray:

New technology (first paper: 1995)

Allows study of thousands of genes at the same time

Glass slide of DNA molecules

Molecule: string of bases (25 bp to 500 bp)

Uniquely identifies the gene or unit to be studied

Gene Expression Microarrays

The main types of gene expression microarrays:

Short oligonucleotide arrays (Affymetrix)

cDNA or spotted arrays (Brown/Botstein)

Long oligonucleotide arrays (Agilent Inkjet)

Fiber-optic arrays

...

Fabrication of Microarrays

Size of a microscope slide

Images: http://www.affymetrix.com/

Differing Conditions


Ultimate goal:

Understand the expression level of genes under different conditions

Helps to:

Determine genes involved in a disease

Pathways to a disease

Used as a screening tool

Gene Conditions


Cell types (brain vs. liver)


Developmental (fetal vs. adult)


Response to stimulus


Gene activity (wild type vs. mutant)

Disease states (healthy vs. diseased)

Expressed Genes


Genes under a given condition


mRNA extracted from cells


mRNA labeled


The labeled mRNA is the mRNA present in a given condition

Labeled mRNA will hybridize (base pair) with the corresponding sequence on the slide
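The base-pairing idea behind hybridization can be illustrated with a small sketch. It assumes perfect Watson-Crick complementarity and treats the labeled sample as a DNA string, which is a simplification; the sequences and the function names `reverse_complement` and `hybridizes` are hypothetical and only serve the illustration.

```python
# Watson-Crick pairing rules for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes(probe: str, target: str) -> bool:
    """True if the target contains a stretch perfectly complementary to the probe."""
    return reverse_complement(probe) in target.upper()


# Hypothetical 8-base probe (real probes are much longer) and two labeled targets.
probe = "ATGCGTAC"
matching_target = "CCGTACGCATGG"   # contains GTACGCAT, the reverse complement of the probe
unrelated_target = "GGGGAAAACCCC"  # no complementary stretch

print(hybridizes(probe, matching_target))
print(hybridizes(probe, unrelated_target))
```

Real hybridization tolerates some mismatches and depends on temperature and salt conditions, so an exact substring test is only a rough model.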

Two Different Types of Microarrays


Custom spotted arrays (up to 20,000 sequences): cDNA or oligonucleotide

High-density (up to 100,000 sequences) synthetic oligonucleotide arrays: Affymetrix (25 bases)

Microarray Technology

Microarray Image Analysis

Microarrays detect gene interactions: 4 colors:

Green: high control

Red: high sample

Yellow: equal

Black: none

The problem is to quantify the image signals
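One way to turn the four-color reading into numbers is to compare the red (sample) and green (control) intensities of each spot. The sketch below is only an illustration: the function name `spot_color`, the background level of 100, and the two-fold threshold are assumptions, not values from the slides.

```python
import math


def spot_color(red: float, green: float, background: float = 100.0,
               fold: float = 2.0) -> str:
    """Classify a two-color spot from its red (sample) and green (control) intensities.

    `background` and `fold` are illustrative thresholds chosen for this sketch.
    """
    if red <= background and green <= background:
        return "black"    # no signal in either channel
    # Clamp to the background level so the ratio is always defined.
    ratio = math.log2(max(red, background) / max(green, background))
    if ratio >= math.log2(fold):
        return "red"      # higher in the sample
    if ratio <= -math.log2(fold):
        return "green"    # higher in the control
    return "yellow"       # roughly equal in both channels


# Hypothetical (red, green) intensity pairs for four spots.
for red, green in [(5000, 400), (300, 4000), (2500, 2600), (40, 60)]:
    print(red, green, spot_color(red, green))
```

In practice the thresholds would be chosen from the scanner's noise characteristics and after normalization between the two channels.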

Microarray Animations


Davidson College: http://www.bio.davidson.edu/courses/genomics/chip/chip.html

Imagecyte: http://www.imagecyte.com/array2.html

Microarray analysis

Operation principle:

Samples are tagged with fluorescent material to show the pattern of sample-probe interaction (hybridization)

A microarray may have 60K probes

Microarray Processing sequence

Gene Expression Data

Gene expression data on p genes for n samples

Rows: genes; columns: mRNA samples

Gene   sample1  sample2  sample3  sample4  sample5
1         0.46     0.30     0.80     1.51     0.90  ...
2        -0.10     0.49     0.24     0.06     0.46  ...
3         0.15     0.74     0.04     0.10     0.20  ...
4        -0.45    -1.03    -0.79    -0.56    -0.32  ...
5        -0.06     1.06     1.35     1.09    -1.09  ...

Gene expression level of gene i in mRNA sample j
= Log(Red intensity / Green intensity), or
= Log(Avg. PM - Avg. MM)
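The red/green definition of the expression level can be sketched as follows. The slides write "Log" without a base; base 2 is assumed here because it makes a two-fold change equal to 1. The intensity values and the function name `log_ratio_matrix` are hypothetical.

```python
import math

# Hypothetical background-corrected intensities: one row per gene, one column per sample.
red = [[1200.0, 800.0], [450.0, 900.0]]
green = [[600.0, 800.0], [900.0, 300.0]]


def log_ratio_matrix(red, green):
    """Return the genes x samples matrix of log2(red / green)."""
    return [
        [math.log2(r / g) for r, g in zip(red_row, green_row)]
        for red_row, green_row in zip(red, green)
    ]


expression = log_ratio_matrix(red, green)
for gene, row in enumerate(expression, start=1):
    print(gene, [round(x, 2) for x in row])
```

A positive entry means the gene is higher in the red channel, a negative entry means it is higher in the green channel, and zero means equal intensity.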

Some possible applications?

Sample from a specific organ to show which genes are expressed

Compare samples from healthy and sick hosts to find gene-disease connections

Probes are sets of human pathogens for disease detection

Huge amount of data from single microarray


If just two colors, then the amount of data on an array with N probes is 2N

Cannot analyze pixel by pixel

Analyze by pattern: cluster analysis

Major Data Mining Techniques


Link Analysis


Associations Discovery


Sequential Pattern Discovery


Similar Time Series Discovery



Predictive Modeling


Classification


Clustering



Some clustering methods and software


Partitioning: K-Means, K-Medoids, PAM, CLARA ...

Hierarchical: Cluster, HAC, BIRCH, CURE, ROCK

Density-based: CAST, DBSCAN, OPTICS, CLIQUE ...

Grid-based: STING, CLIQUE, WaveCluster ...

Model-based: SOM (self-organizing map), COBWEB, CLASSIT, AutoClass ...

Two-way Clustering

Block clustering

Eisen et al., Proc. Natl. Acad. Sci. USA 95 (1998)

[Figure: clustered data compared with data randomized by row, by column, and by both; axis: time]

A dendrogram (tree) for clustered genes

Let p = number of genes.

1. Calculate within class correlation.

2. Perform hierarchical clustering which will produce (2p-1) clusters of genes.

3. Average within clusters of genes.

4. Perform testing on averages of clusters of genes as if they were single genes.

E.g. p=5

[Dendrogram with leaves 1, 2, 3, 4, 5]

Cluster 6=(1,2)

Cluster 7=(1,2,3)

Cluster 8=(4,5)

Cluster 9=(1,2,3,4,5)
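The count of 2p-1 clusters can be checked with a minimal sketch of agglomerative clustering. It assumes average linkage and a distance of 1 minus the Pearson correlation, neither of which is specified in the slides, and it uses the five-gene example values from the expression table. The function names are hypothetical, and the merge order it produces need not match the schematic example above.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def hierarchical_clusters(profiles):
    """Average-linkage agglomerative clustering with distance 1 - correlation.

    Returns every cluster formed along the way as a tuple of 1-based gene
    numbers: the p single genes followed by the p - 1 merged clusters.
    """
    p = len(profiles)
    dist = {
        (i, j): 1.0 - pearson(profiles[i], profiles[j])
        for i, j in combinations(range(p), 2)
    }
    active = [(i,) for i in range(p)]
    all_clusters = list(active)

    def linkage(a, b):
        # Average distance over all gene pairs drawn from the two clusters.
        return mean(dist[tuple(sorted((i, j)))] for i in a for j in b)

    while len(active) > 1:
        # Merge the two closest active clusters.
        a, b = min(combinations(active, 2), key=lambda pair: linkage(*pair))
        merged = tuple(sorted(a + b))
        active = [c for c in active if c not in (a, b)] + [merged]
        all_clusters.append(merged)
    return [tuple(g + 1 for g in c) for c in all_clusters]


# Five genes by five samples, taken from the example expression table.
profiles = [
    [0.46, 0.30, 0.80, 1.51, 0.90],
    [-0.10, 0.49, 0.24, 0.06, 0.46],
    [0.15, 0.74, 0.04, 0.10, 0.20],
    [-0.45, -1.03, -0.79, -0.56, -0.32],
    [-0.06, 1.06, 1.35, 1.09, -1.09],
]
clusters = hierarchical_clusters(profiles)
print(len(clusters))  # p singletons plus p - 1 merges
for number, members in enumerate(clusters, start=1):
    print(f"Cluster {number} = {members}")
```

Each merge removes two clusters and adds one, so p genes always yield p - 1 merged clusters on top of the p singletons, whatever linkage or distance is chosen.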

A real case


Nature, Feb 2000

Paper by Alizadeh, A. et al.

Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling

Discovering sub-groups

Time Course Data

Gene Expression is Time-Dependent

Sample of time course of clustered genes

[Figure: expression of clustered genes plotted against time]

Limitations


Cluster analyses:

Usually outside the normal framework of statistical inference

Less appropriate when only a few genes are likely to change

Needs lots of experiments

Single gene tests:

May be too noisy in general to show much

May not reveal coordinated effects of positively correlated genes

Hard to relate to pathways

A few links

Affymetrix: www.affymetrix.com

Stanford MicroArray Database: http://smd.stanford.edu/resources/restech.shtml

Yale Microarray Database: http://www.med.yale.edu/microarray/

NCBI Gene Expression Omnibus: http://www.ncbi.nlm.nih.gov/geo/

University of North Carolina Database: https://genome.unc.edu/