Genetic Epidemiological Strategies in the Search for Genes


Tuan V. Nguyen


University of New South Wales

Faculty of Medicine

Genes and Diseases


Many diseases have their roots in genes and the
environment.



Currently, >4000 diseases, including sickle cell
anemia and cystic fibrosis, are known to be
genetic and are passed on in families.


Genes and Medical Sciences


The central question for the medical sciences is
the extent to which it will be possible to relate
events at the molecular level with the clinical
findings or phenotypes of patients with
particular diseases.

Contents


Genes and DNA


Detection of genetic effects


Search for specific genes

Chromosomes

Each human cell contains 23 pairs of chromosomes
(distinguished by size and banding pattern). Males have one X and
one Y chromosome; females have two X chromosomes.

DNA and Genes


DNA carries the instructions
that allow cells to make
proteins.


DNA is made up of 4 chemical bases (A, T, G, C).


The bases make “words”: AGT CTC GAA TAA


Words make “sentences” = genes:


< AGT CTC GAA TAA>


Genes, Alleles, and Genotypes


The location of a gene is called its locus.


Alleles are alternate forms of a gene. Example: A, a.


Genotype: the maternal and paternal alleles of an
individual at a locus define the genotype of the
individual at that locus. Example: AA, Aa, aa.


How Do Genes Work?


Genes tell cells how to make
molecules called proteins.


Proteins allow cells to perform
specific functions.


If the instructions are fine, things
will be normal. If the instructions
are changed (mutated),
an abnormality will result.



Inheritance


The passing of genes from parents to child is the
basis of inheritance.


We are not identical to our parents: half of our
genes are from our mothers and half from our
fathers.


Each brother and sister inherits a different
combination of chromosomes. Each parent can pass on
N = 2^23 = 8,388,608 combinations.


Identical twins receive exactly the same
combination of genes from their parents.
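
The figure of 2^23 follows from each of the 23 chromosome pairs contributing one of its two members to a gamete. A minimal sketch of that arithmetic, assuming independent assortment and ignoring crossing-over (which makes the real number far larger); the function name is illustrative only:

```python
def gamete_combinations(n_pairs: int = 23) -> int:
    """Distinct chromosome sets one parent can transmit,
    assuming independent assortment and no crossing-over."""
    return 2 ** n_pairs

per_parent = gamete_combinations()   # the 2^23 quoted on the slide
per_couple = per_parent ** 2         # both parents taken together
print(f"{per_parent:,} per parent; {per_couple:,} per couple")
```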

Genetic effects


Three types of gene action: additive, dominant, and epistatic.


Additive effect: AA = 9, Aa = 7, aa = 5.


Dominant effect: AA = 9, Aa = 9, aa = 5.


Epistasis: interaction of alleles at 2 loci.


For locus 1: AA = 9, Aa = 7, aa = 5.


For locus 2: AA = 5, Aa = 5, aa = 9.
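
The additive and dominant examples can be told apart numerically. The sketch below uses the conventional quantitative-genetics parameterisation (midpoint, additive effect a, dominance deviation d), which the slide does not spell out; the function and variable names are assumptions made for illustration.

```python
def decompose(value_AA: float, value_Aa: float, value_aa: float):
    """Return (midpoint, a, d) from the three genotypic values."""
    midpoint = (value_AA + value_aa) / 2
    a = (value_AA - value_aa) / 2    # half the distance between homozygotes
    d = value_Aa - midpoint          # departure of the heterozygote from the midpoint
    return midpoint, a, d

additive = decompose(9, 7, 5)   # d is 0: purely additive
dominant = decompose(9, 9, 5)   # d equals a: A is completely dominant
print(additive, dominant)
```

With d = 0 the heterozygote sits exactly halfway between the homozygotes; with d = a it is indistinguishable from AA.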

How to detect genetic effects?


Clues to Genetics and Environment

Epidemiological characteristic   Genetics   Environment

Geographic variation             +          +
Ethnic variation                 +          +
Temporal variation               -          +
Epidemics                        +/-        +
Social class variation           -          +
Gender variation                 +          +
Age                              +/-        +
Family variables
  History of disease             +          +
  Birth order                    +/-        +
  Birth interval                 -          +
  Cohabitation                   -          +

Methods of Investigation of Genetic Traits


Family studies. Examine phenotypes (diseases) in the
relatives of affected subjects (probands).


Twin studies. Compare the intraclass correlations of
MZ twins (who share 100% of their genes) and DZ twins (who share
on average 50%).


Adoption studies. Seek to distinguish genetic from
environmental effects by testing whether adopted children
more closely resemble their biological or their adoptive parents.


Offspring of discordant MZ twins. Control for
environmental effects; test for a large genetic contribution to
etiology.

Basic Genetic-Environmental Model

Phenotype (P) = Genetics + Environment

Genetics = Additive (A) + Dominant (D)

Environment = Common (C) + Specific (E)

=> P = A + D + C + E

$\mathrm{Cov}(Y_i, Y_j) = 2\Phi_{ij}\,\sigma_a^2 + \Delta_{ij}\,\sigma_d^2 + \gamma_{ij}\,\sigma_c^2 + \delta_{ij}\,\sigma_e^2$

$\Phi_{ij}$: kinship coefficient

$\Delta_{ij}$: Jacquard’s coefficient of identity by descent

$\gamma_{ij}$: probability of sharing environmental factors

$\delta_{ij}$: residual coefficient


$V_P = V_A + V_D + V_C + V_E$


Statistical Genetic Model

V = variance; P = phenotype; A, D, C, E = as defined

Kinship coefficients

                      Expected coefficient for
Relative              $\sigma_a^2$   $\sigma_d^2$   $\sigma_c^2$

Spouse-spouse         0              0              1
Parent-offspring      1/2            0              1
Full sibs             1/2            1/4            1
Half-sibs             1/4            0              1
Aunt-niece            1/4            0              1
First cousins         1/8            0              0
Dizygotic twins       1/2            1/4            1
Monozygotic twins     1              1              1
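
The table can be read as a recipe for the expected phenotypic covariance between two relatives: multiply each variance component by its coefficient and add. A minimal sketch, with the coefficients copied from the table and the variance components chosen as hypothetical values for illustration:

```python
# (coefficient on var_a, coefficient on var_d, coefficient on var_c), from the table
COEFFICIENTS = {
    "spouse-spouse":     (0.0,   0.0,  1.0),
    "parent-offspring":  (0.5,   0.0,  1.0),
    "full sibs":         (0.5,   0.25, 1.0),
    "half-sibs":         (0.25,  0.0,  1.0),
    "aunt-niece":        (0.25,  0.0,  1.0),
    "first cousins":     (0.125, 0.0,  0.0),
    "dizygotic twins":   (0.5,   0.25, 1.0),
    "monozygotic twins": (1.0,   1.0,  1.0),
}

def expected_covariance(relative: str, var_a: float, var_d: float, var_c: float) -> float:
    """Expected covariance between a pair of relatives of the given type."""
    c_a, c_d, c_c = COEFFICIENTS[relative]
    return c_a * var_a + c_d * var_d + c_c * var_c

# Hypothetical variance components, not taken from the slides
for pair in ("monozygotic twins", "full sibs", "first cousins"):
    print(pair, expected_covariance(pair, var_a=0.5, var_d=0.1, var_c=0.2))
```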

Heritability ($H^2$)

Broad-sense heritability: $H^2 = (V_A + V_D) / V_P$


Narrow-sense heritability: $h^2 = V_A / V_P$

$\mathrm{Cov}(Y_i, Y_j) = 2\Phi_{ij}\,\sigma_a^2 + \Delta_{ij}\,\sigma_d^2 + \gamma_{ij}\,\sigma_c^2 + \delta_{ij}\,\sigma_e^2$


$V_P = V_A + V_D + V_C + V_E$
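
As a small worked illustration of the two heritability definitions, the sketch below computes both from a set of variance components. The numbers are hypothetical and the function name is an assumption; only the two ratios come from the slide.

```python
def heritability(v_a: float, v_d: float, v_c: float, v_e: float):
    """Return (broad-sense, narrow-sense) heritability."""
    v_p = v_a + v_d + v_c + v_e      # total phenotypic variance
    broad = (v_a + v_d) / v_p        # all genetic variance over total
    narrow = v_a / v_p               # additive genetic variance over total
    return broad, narrow

# Hypothetical components chosen so that V_P = 100
broad, narrow = heritability(v_a=40.0, v_d=10.0, v_c=20.0, v_e=30.0)
print(broad, narrow)
```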

Statistical Methods for Estimating Heritability


Simple linear regression

$Y_{\mathrm{offspring}} = \beta\,Y_{\mathrm{parent}} + e$

$H^2 = 2\beta$


Twin concordance

Intraclass correlation: rMZ and rDZ

$H^2 = 2(r_{MZ} - r_{DZ})$


Path analysis and variance component model
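
The regression approach in the list above can be sketched in a few lines. The parent and offspring values below are hypothetical, made up purely to show the arithmetic; doubling the slope applies when offspring are regressed on a single parent, as in the formula on the slide.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical trait values (for example BMD in g/cm^2); not real data
parent = [0.80, 0.90, 1.00, 1.10, 1.20]
offspring = [0.93, 0.96, 1.00, 1.03, 1.08]

b = ols_slope(parent, offspring)
h2_estimate = 2 * b     # heritability estimate from a single-parent regression
print(round(b, 3), round(h2_estimate, 3))
```

A real analysis would also report a standard error for the slope; this sketch shows only the point estimate.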


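Path Model for Twin Data

[Figure: path diagram. Latent factors E1, C1, D1, A1 act on Twin 1 and A2, D2, C2, E2 on Twin 2 through paths e, c, d, a. Correlations marked between the twins' corresponding factors: r = 1; r = .5 / .25; r = 1 / .5.]

A=additive; D=dominant; C=common environment; E=specific environment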

Intraclass Correlation: Femoral neck bone mass

[Figure: two panels, MZ and DZ twin pairs.]

MZ: rMZ = 0.73

DZ: rDZ = 0.47
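
Plugging these two correlations into the twin formula H^2 = 2(rMZ - rDZ) gives a quick heritability estimate for femoral neck bone mass. This is a sketch that reads the value listed under DZ (0.47) as rDZ; the function name is an assumption, and the formula is the simple twin-difference estimator rather than the path-analytic model.

```python
def twin_heritability(r_mz: float, r_dz: float) -> float:
    """Heritability estimate from MZ and DZ intraclass correlations."""
    return 2 * (r_mz - r_dz)

h2_femoral_neck = twin_heritability(r_mz=0.73, r_dz=0.47)
print(round(h2_femoral_neck, 2))
```

On these figures roughly half of the variance in femoral neck bone mass would be attributed to genetic factors.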

Genetic Determination of Lean, Fat and Bone Mass

rMZ, rDZ : Intraclass correlation for MZ and DZ twins

Multivariate Analysis: The Cholesky Decomposition Model

[Figure: path diagram linking genetic factors G1-G5 and environmental factors E1-E5 to five traits: lean mass, fat mass, LS BMD, FN BMD and TB BMD.]

LS=lumbar spine, FN=femoral neck, TB=total body, BMD = bone mineral density

Genetic and Environmental Correlation between
Lean, Fat and Bone Mass

Strategies for finding genes

How many genes?


Initial estimate: 120,000.


DNA sequence: 60,000-70,000.


HGP: 32,000-39,000 (including non-functional genes = inactive genes).

Distribution of the number of genes

[Figure: number of genes plotted against effect size, with regions labelled major genes, oligogenes and polygenes.]

Finding genes: a challenge

One of the most difficult challenges ahead is to
find genes involved in diseases that have a
complex pattern of inheritance, such as those
that contribute to osteoporosis, diabetes,
asthma, cancer and mental illness.


Why Search for Genes?


Scientific value



Study genes’ actions at the molecular level


Therapeutic value


Gene product and development of new drugs;


Gene therapy


Public health


Identification of “high-risk” individuals


Interaction between genes and environment

Genomewide screening vs Candidate gene approach


Genomewide screening


No physiological assumption


Systematic screening for chromosomal regions of
interest in the entire genome


Candidate gene


Proven or hypothetical physiological mechanism


Direct test for individual genes

Linkage vs Association


Linkage


Transmission of genes within pedigrees


Association


Difference in allele frequencies between cases and
unrelated controls

Statistical models


Linkage analysis traces cosegregation and
recombination between observed markers and an
unobserved putative trait locus. Significance is expressed as a LOD
(log-odds) score.


Association analysis compares the frequencies of
alleles between unrelated cases (diseased) and controls.


Transmission disequilibrium test (TDT)
examines the transmission of alleles from heterozygous
parents to those children exhibiting the phenotype of
interest.

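Two-point linkage analysis: an example

[Figure: pedigree. Marker genotypes shown: ??, 138/142, 134/142, 146/154, 142/146, 142/154, 134/146, 142/154, 134/146, 134/154, 134/146, 134/154. Eight meioses are scored Non, Rec, Non, Non, Non, Non, Rec, Non. Disease allele D is carried with marker allele 142 and normal allele d with marker allele 134.]

Non = non-recombination; Rec = recombination

Probability of each gamete from the doubly heterozygous parent:

Gamete     No linkage   Complete linkage   Incomplete linkage

134 - D    1/4          0                  $\theta/2$
134 - d    1/4          1/2                $(1-\theta)/2$
142 - D    1/4          1/2                $(1-\theta)/2$
142 - d    1/4          0                  $\theta/2$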

Estimation of $\theta$

[Figure: LOD score (vertical axis, -6 to +6) plotted against the estimated value of $\theta$ (horizontal axis, 0 to 0.5); the peak of the curve marks the maximum LOD score.]

Basic linkage model

LR: likelihood ratio

$LR(\theta) = L(\mathrm{data} \mid \theta) \,/\, L(\mathrm{data} \mid \theta = 0.5)$

$LOD = \log_{10} \max_{\theta} \, [LR(\theta)]$
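
To make the likelihood ratio concrete, the sketch below evaluates the LOD score over a grid of recombination fractions for 2 recombinant and 6 non-recombinant meioses, the counts scored in the two-point example. It assumes phase-known, fully informative meioses, so that the likelihood is simply theta^R (1 - theta)^(N - R); the function name and the grid are illustrative choices.

```python
import math

def lod(theta: float, n_rec: int, n_nonrec: int) -> float:
    """log10 of L(data | theta) / L(data | theta = 0.5) for phase-known meioses."""
    n = n_rec + n_nonrec
    return (n_rec * math.log10(theta)
            + n_nonrec * math.log10(1 - theta)
            - n * math.log10(0.5))

grid = [i / 100 for i in range(1, 51)]              # theta = 0.01 ... 0.50
theta_hat = max(grid, key=lambda t: lod(t, 2, 6))   # value that maximises the LOD
max_lod = lod(theta_hat, 2, 6)
print(theta_hat, round(max_lod, 3))
```

The maximum falls at the observed recombinant fraction, 2/8 = 0.25, and the LOD there is well below the conventional threshold of 3, so a single small family of this kind would not establish linkage on its own.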

Haseman-Elston model (allele sharing method)

$X_{i1}$ = value of sib 1; $X_{i2}$ = value of sib 2

$D_i = (X_{i1} - X_{i2})^2$

$\pi_i$ = probability of genes shared identical-by-descent

$E(D_i \mid \pi_i) = \alpha + \beta\,\pi_i$

If $\beta = 0$ => $\sigma_g^2 = 0$ or $\theta = 0.5$, i.e. no linkage

If $\beta < 0$ => $\sigma_g^2 > 0$ and $\theta \ne 0.5$, i.e. linkage


Behav Genet 1972; 2:3-19
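
A minimal sketch of the Haseman-Elston regression: the squared trait difference of each sib pair is regressed on the pair's IBD sharing, and a negative slope points towards linkage. The six sib pairs below are hypothetical and the function name is an assumption; a real analysis would also test whether the slope differs significantly from zero.

```python
def haseman_elston(sib1, sib2, pi_ibd):
    """Regress squared sib-pair differences on IBD sharing; return (alpha, beta)."""
    d = [(x1 - x2) ** 2 for x1, x2 in zip(sib1, sib2)]
    n = len(d)
    mean_pi = sum(pi_ibd) / n
    mean_d = sum(d) / n
    sxy = sum((p - mean_pi) * (di - mean_d) for p, di in zip(pi_ibd, d))
    sxx = sum((p - mean_pi) ** 2 for p in pi_ibd)
    beta = sxy / sxx
    alpha = mean_d - beta * mean_pi
    return alpha, beta

# Hypothetical trait values and IBD proportions for six sib pairs
sib1 = [1.0, 0.8, 1.2, 1.1, 0.9, 1.0]
sib2 = [1.4, 1.1, 1.0, 1.0, 0.9, 1.05]
pi_ibd = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]

alpha, beta = haseman_elston(sib1, sib2, pi_ibd)
print(round(alpha, 4), round(beta, 4))
```

In this toy data set the pairs sharing more alleles IBD are more alike, so the slope comes out negative.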

Identical-by-descent (IBD)

Parents: 126 / 130 and 134 / 138

Offspring: A = 126 / 134, B = 126 / 138, C = 130 / 134, D = 130 / 138, E = 126 / 138


A and D share no alleles


A shares 1 allele (126) ibd with B and with E; likewise C and D share 130, A and C share 134, and D shares 138 with B and with E


B and E share 2 alleles (126 and 138) ibd

Alleles are ibd if they are identical and descended from the same ancestral allele
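
The sharing pattern in this family can be checked mechanically. The sketch below counts the alleles each pair of offspring has in common; this equals the IBD count here only because the four parental alleles (126, 130, 134, 138) are all different, which is an assumption that does not hold in general. Names are illustrative.

```python
from itertools import combinations

# Offspring genotypes from the example
offspring = {
    "A": (126, 134),
    "B": (126, 138),
    "C": (130, 134),
    "D": (130, 138),
    "E": (126, 138),
}

def shared_alleles(g1, g2) -> int:
    """Number of alleles two genotypes have in common (0, 1 or 2).
    Equals the IBD count only when all parental alleles are distinct."""
    return len(set(g1) & set(g2))

sharing = {
    (a, b): shared_alleles(offspring[a], offspring[b])
    for a, b in combinations(offspring, 2)
}
for pair, n in sharing.items():
    print(pair, n)
```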

Identical-by-state (IBS)

Parents: 126 / 126 and 126 / 138

Offspring: A = 126 / 126, B = 126 / 138, C = 126 / 138, D = 126 / 126


A and D share 1 allele (126) ibs


B and C share 126 ibs, 138 ibd

Alleles are ibs if they are identical, but their ancestral derivation is unclear

Sibpair linkage analysis: allele-sharing method

[Figure: scatter plot of the squared difference in BMD among siblings (vertical axis) against the number of alleles shared IBD (0, 1, 2; horizontal axis).]

Linkage between VDR gene and lumbar spine bone mineral density in a sample of 78 DZ twin pairs.

Nature 1994; 367:284-287

Association analysis


Presence/absence of an allele in a phenotype.

Genotype Fx No Fx

BB 50 10

Bb 30 30

bb 20 60

Total 100 100

Frequency of allele B among fx: (50x2 + 30) / (100x2) = 0.65

Freq. of allele B among no fx: (10x2 + 30) / (100x2) = 0.25
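
The allele-frequency arithmetic above generalises to any genotype table. The sketch below reproduces the two frequencies from the genotype counts; the allelic odds ratio at the end is an extra summary that is not on the slide, and the function name is an assumption.

```python
def freq_B(n_BB: int, n_Bb: int, n_bb: int) -> float:
    """Frequency of allele B: each BB carries two copies, each Bb one."""
    total_alleles = 2 * (n_BB + n_Bb + n_bb)
    return (2 * n_BB + n_Bb) / total_alleles

freq_fx = freq_B(50, 30, 20)       # subjects with fracture
freq_no_fx = freq_B(10, 30, 60)    # subjects without fracture

# Allelic odds ratio (not on the slide): odds of B with fracture vs without
odds_ratio = (freq_fx / (1 - freq_fx)) / (freq_no_fx / (1 - freq_no_fx))
print(freq_fx, freq_no_fx, round(odds_ratio, 2))
```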


Association analysis: an example

Association between vitamin D receptor gene and bone mineral density

Association analysis


Three conditions of association


The genetic marker is the putative gene


The marker is in linkage disequilibrium (association)
with the putative gene or with a nearby locus


Random artefact, population admixture

Linkage and association


Linkage without association


Many trait-causing loci


Association between a marker and a locus can be weak or
absent


Association without linkage


A minor effect of the genetic marker


Poor discriminant power for phenotype within a pedigree

Statistical issues

Diagnostic reasoning

                 Disease is really
Test             Present          Absent
______________________________________________

+ve              True +ve         False +ve
-ve              False -ve        True -ve
______________________________________________

Statistical reasoning

                 Null hypothesis (Ho) is
Stat test        Not true         True
______________________________________________

Reject Ho        No error         Type I ($\alpha$)
Accept Ho        Type II ($\beta$)   No error
______________________________________________

Study design: minimize type I and type II errors



No. of sibpairs required to establish linkage
for a single gene and recombination = 0

$\lambda$      LOD = 3     LOD = ?

1.1    7460        8931
1.2    2048        2566
1.3    1033        1299
1.5    489         615
2.0    199         242
1.5    191         154
3.0    88          115

$\lambda$ = familial relative risk

Strategies for improvement of power


Population and sampling


Phenotypes


Statistical analysis

Population and sampling


Population


Homogeneous populations


Sampling units


Related members


Large, multigenerational families (rather than
sibpairs)


Phenotypes


Low-level, intermediate


Well-defined and highly reproducible


Statistical analyses


Multivariate analysis vs. univariate analysis


Variance component model


Power


Locus-specific power: probability of detecting an
individual locus associated with the trait, e.g. $1 - \beta_i$


Genomewide power: probability of detecting any of the k loci, e.g.
$1 - \beta_1 \times \beta_2 \times \beta_3 \times \dots \times \beta_k$


Studywise power: probability of detecting all k loci, e.g.
$(1 - \beta_1) \times (1 - \beta_2) \times (1 - \beta_3) \times \dots \times (1 - \beta_k)$
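
The difference between the last two definitions is easiest to see with numbers. The sketch below assumes the k loci are detected independently, which is what the product formulas imply, and uses three hypothetical loci each with a type II error of 0.2; the function names are illustrative.

```python
import math

def genomewide_power(betas) -> float:
    """Probability of detecting at least one locus: 1 minus P(all are missed)."""
    return 1 - math.prod(betas)

def studywise_power(betas) -> float:
    """Probability of detecting every locus."""
    return math.prod(1 - b for b in betas)

betas = [0.2, 0.2, 0.2]   # hypothetical type II error per locus (80% power each)
print(round(genomewide_power(betas), 3), round(studywise_power(betas), 3))
```

Even with 80% power at every locus, the chance of finding all three is only about one half, while the chance of finding at least one is close to certain.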

Summary


Most diseases are regulated by genes and
environment.


Genetic dissection of multifactorial diseases
is a challenge.


Gene
-
hunting is a major endeavour in
epidemiological research.


Substantial progress in statistical models.


Perspective


Can genes be found?


The Human Genome Project


Influences of biotechnology


Should “epidemiology” become “genetic
epidemiology”?



Further readings


BMJ 2001; 322: 28 April. Special issue on genetics.


Nguyen TV, Eisman JA. Genetics of fracture: challenges and opportunities. J Bone Miner Res 2000; 15:1253-1256.


Nguyen TV, Blangero J, Eisman JA. Genetic epidemiological approaches to the search for osteoporosis genes. J Bone Miner Res 2000; 15:392-401.


Nguyen TV, et al. Bone mass, lean mass and fat mass: same genes or same environment. Amer J Epidemiol 1998; 147:3-16.