CIS 595 Bioinformatics


Oct 2, 2013



Lecture 2

Based on the book chapter:

Hunter, L., Molecular Biology for Computer Scientists. In: Artificial Intelligence and Molecular Biology, Ed. L. Hunter, pp. 1-46, AAAI Press, 1993.

Contains figures taken from:

Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K., Walter, P., Molecular Biology of the Cell, 4th ed. New York: Garland Publishing; 2002.


All of the genetic information in any living creature is stored in deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)

DNA and RNA are polymers of four simple nucleic acid units, called nucleotides.

Four nucleotides (or bases):

Purines: adenine (A) and guanine (G)

Pyrimidines: cytosine (C) and thymine (T); in RNA, uracil (U) takes the place of thymine

Each nucleotide consists of

one of two types of base molecule (a purine or a pyrimidine)

a sugar (ribose in RNA and deoxyribose in DNA)

one or more phosphate groups.

Nucleic Acids

Figure 4-3. DNA and its building blocks.

DNA is made of four types of nucleotides, which are linked covalently into a polynucleotide chain (a DNA strand) with a sugar-phosphate backbone from which the bases (A, C, G, and T) extend. A DNA molecule is composed of two DNA strands held together by hydrogen bonds between the paired bases. The arrowheads at the ends of the DNA strands indicate the polarities of the two strands, which run antiparallel to each other in the DNA molecule. In the diagram at the bottom left of the figure, the DNA molecule is shown straightened out; in reality, it is twisted into a double helix, as shown on the right. For details, see Figure 4
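The base pairing and antiparallel strands described in the caption can be illustrated with a few lines of Python. This is a minimal sketch, not part of the original slides; the function name `reverse_complement` and the example sequence are illustrative assumptions.

```python
# Watson-Crick pairing: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand of `strand`, also written 5'->3'.

    Because the two strands run antiparallel, the partner is read in the
    opposite direction, hence the reversal before complementing.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


print(reverse_complement("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.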

Genes and Genetic Code

Genetic information can be stored in several
DNA molecules (chromosomes)

In diploids, each chromosome is present in two similar copies, one from each parent

Bacteria have a single circular chromosome

Primary role of DNA: carry the encoding of proteins

Each triplet of nucleotides (called a codon) codes for a single amino acid (AA)

Stop codons define the end of a protein

Open reading frame (ORF): a sufficiently long stretch of codons without intervening stop codons

There are 6 different ways to start parsing a segment of DNA: 3 codon offsets on each of the 2 strands
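The six ways of parsing a DNA segment, and the idea of an open reading frame as a run of codons with no stop codon, can be sketched as follows. This is an illustrative example rather than a gene finder; the function names and the test sequence are assumptions, and what counts as "long enough" is left to the caller.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three stop codons, written as DNA


def reverse_complement(dna):
    return "".join(COMPLEMENT[b] for b in reversed(dna))


def six_frames(dna):
    """Return the codon lists for all six reading frames of `dna`.

    Frames 0-2 read the given strand at offsets 0, 1 and 2;
    frames 3-5 do the same on the reverse-complement strand.
    """
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            codons = [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]
            frames.append(codons)
    return frames


def longest_open_stretch(codons):
    """Length, in codons, of the longest run that contains no stop codon."""
    best = run = 0
    for codon in codons:
        run = 0 if codon in STOP_CODONS else run + 1
        best = max(best, run)
    return best


frames = six_frames("ATGAAATAGCCC")
print(frames[0])                        # ['ATG', 'AAA', 'TAG', 'CCC']
print(longest_open_stretch(frames[0]))  # 2
```

A frame whose longest open stretch exceeds a chosen threshold is a candidate ORF; real sequences would need a much longer threshold than this toy example.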


Figure 4-14. Two closely related species of deer with very different chromosome numbers.

In the evolution of the Indian muntjac, initially separate chromosomes fused, without having a major effect on the animal. These two species have roughly the same number of genes. (Adapted from M.W. Strickberger, Evolution, 3rd edition, 2000, Sudbury, MA: Jones & Bartlett.)

Genetic Code

Figure 6-50. The genetic code.

The standard one-letter abbreviation for each amino acid is presented below its three-letter abbreviation (see Panel 3, pp. 132-133, for the full name of each amino acid and its structure). By convention, codons are always written with the 5'-terminal nucleotide to the left. Note that most amino acids are represented by more than one codon, and that there are some regularities in the set of codons that specifies each amino acid. Codons for the same amino acid tend to contain the same nucleotides at the first and second positions, and vary at the third position. Three codons do not specify an amino acid but act as termination sites (stop codons), signaling the end of the protein-coding sequence. One codon, AUG, acts both as an initiation codon, signaling the start of a protein-coding message, and also as the codon that specifies methionine.
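The properties listed in the caption (64 codons, three stops, AUG as the start codon, several codons per amino acid) can be checked with a small translation sketch. The compact table layout below, with bases ordered U, C, A, G and one-letter amino acid codes, is a common convention and an assumption of this example; the function name `translate` and the sample mRNA are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the RNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# How many codons specify each amino acid (the redundancy noted above).
degeneracy = Counter(CODON_TABLE.values())

print(translate("GGAUGGCUUGGUAAGC"))                      # MAW
print(degeneracy["L"], degeneracy["M"], degeneracy["*"])  # 6 1 3
```

Leucine has six codons while methionine has only AUG, which matches the caption's remark that most, but not all, amino acids are represented by more than one codon.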

Organization of Genes

Figure 4-15. The organization of genes on a human chromosome.

(A) Chromosome 22, one of the smallest human chromosomes, contains 48 × 10^6 nucleotide pairs and makes up approximately 1.5% of the entire human genome. Most of the left arm of chromosome 22 consists of short repeated sequences of DNA that are packaged in a particularly compact form of chromatin (heterochromatin), which is discussed later in this chapter. (B) A tenfold expansion of a portion of chromosome 22, with about 40 genes indicated. Those in dark brown are known genes and those in light brown are predicted genes. (C) An expanded portion of (B) shows the entire length of several genes. (D) The intron-exon arrangement of a typical gene is shown after a further tenfold expansion. Each exon codes for a portion of the protein, while the DNA sequence of the introns is relatively unimportant. The entire human genome (3.2 × 10^9 nucleotide pairs) is distributed over 22 autosomes and 2 sex chromosomes (see Figures 4). The term human genome sequence refers to the complete nucleotide sequence of DNA in these 24 chromosomes. Being diploid, a human somatic cell therefore contains roughly twice this amount of DNA. Humans differ from one another by an average of one nucleotide in every thousand, and a wide variety of humans contributed DNA for the genome sequencing project. The published human genome sequence is therefore a composite of many individual sequences. (Adapted from International Human Genome Sequencing Consortium, Nature 409:860-921, 2001.)

Reading Frames

Figure 6-51. The three possible reading frames in protein synthesis.

In the process of translating a nucleotide sequence into an amino acid sequence, the sequence of nucleotides in an mRNA molecule is read from the 5' to the 3' end in sequential sets of three nucleotides. In principle, therefore, the same RNA sequence can specify three completely different amino acid sequences, depending on the reading frame. In reality, however, only one of these reading frames contains the actual message.

Transcription and Translation

Steps in mapping from DNA sequence to
folded protein:

Transcription: from DNA segment to messenger RNA (mRNA)

Translation: from mRNA to protein

Post-translational modification: transformation of a protein into its functional form (e.g. by attaching various chemical groups)

From DNA to Protein

Figure 6-21. Summary of the steps leading from gene to protein in eucaryotes and bacteria.

The final level of a protein in the cell depends on the efficiency of each step and on the rates of degradation of the RNA and protein molecules. (A) In eucaryotic cells the RNA molecule produced by transcription alone (sometimes referred to as the primary transcript) would contain both coding (exon) and noncoding (intron) sequences. Before it can be translated into protein, the two ends of the RNA are modified, the introns are removed by an enzymatically catalyzed RNA splicing reaction, and the resulting mRNA is transported from the nucleus to the cytoplasm. Although these steps are depicted as occurring one at a time, in a sequence, in reality they are coupled and different steps can occur simultaneously. For example, the RNA cap is added and splicing typically begins before transcription has been completed. Because of this coupling, complete primary RNA transcripts do not typically exist in the cell. (B) In procaryotes the production of mRNA molecules is much simpler. The 5' end of an mRNA molecule is produced by the initiation of transcription by RNA polymerase, and the 3' end is produced by the termination of transcription. Since procaryotic cells lack a nucleus, transcription and translation take place in a common compartment. In fact, translation of a bacterial mRNA often begins before its synthesis has been completed.

E. coli Example

Figure 7-33. The clustered genes in E. coli that code for enzymes that manufacture the amino acid tryptophan.

The five genes are transcribed as a single mRNA molecule, a feature that allows their expression to be controlled coordinately. Clusters of genes transcribed as a single mRNA molecule are common in bacteria. Each such cluster is called an operon.



Binding of RNA polymerase to the promoter region of the DNA

Promoter: DNA region upstream of and near a protein coding region

Creation of a complementary strand of RNA

Contains introns and exons

Ends with 250 or more adenosines (poly-A tail)

Splicing out introns and connecting exons using the spliceosome to create mRNA

mRNA is transported to the cytoplasm to bind with a ribosome

DNA Transcription

Figure 6-8. DNA is transcribed by the enzyme RNA polymerase.

The RNA polymerase (pale blue) moves stepwise along the DNA, unwinding the DNA helix at its active site. As it progresses, the polymerase adds nucleotides (here, small "T" shapes) one by one to the RNA chain at the polymerization site using an exposed DNA strand as a template. The RNA transcript is thus a single-stranded complementary copy of one of the two DNA strands. The polymerase has a rudder (see Figure 6) that displaces the newly formed RNA, allowing the two strands of DNA behind the polymerase to rewind. A short region of DNA/RNA helix (approximately nine nucleotides in length) is therefore formed only transiently, and a "window" of DNA/RNA helix therefore moves along the DNA with the polymerase. The incoming nucleotides are in the form of ribonucleoside triphosphates (ATP, UTP, CTP, and GTP), and the energy stored in their phosphate-phosphate bonds provides the driving force for the polymerization reaction. (Adapted from a figure kindly supplied by Robert Landick.)
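Since the RNA transcript is a complementary copy of the template strand, transcription can be mimicked by a base-by-base lookup. The sketch below is illustrative only: the function name `transcribe` and the example template are assumptions, and the template is taken to be written 3'->5' so that the RNA comes out 5'->3'.

```python
# Template DNA base -> RNA base added opposite it (U replaces T in RNA).
PAIRING = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5):
    """Return the RNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)


print(transcribe("TACCGAACC"))  # AUGGCUUGG
```

The result is identical to the non-template (coding) DNA strand with every T replaced by U.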

Transcription Initiation

Figure 6-16. Initiation of transcription of a eucaryotic gene by RNA polymerase II.

To begin transcription, RNA polymerase requires a number of general transcription factors (called TFIIA, TFIIB, and so on). (A) The promoter contains a DNA sequence called the TATA box, which is located 25 nucleotides away from the site at which transcription is initiated. (B) The TATA box is recognized and bound by transcription factor TFIID, which then enables the adjacent binding of TFIIB (C). For simplicity the DNA distortion produced by the binding of TFIID (see Figure 6) is not shown. (D) The rest of the general transcription factors, as well as the RNA polymerase itself, assemble at the promoter. (E) TFIIH then uses ATP to pry apart the DNA double helix at the transcription start point, allowing transcription to begin. TFIIH also phosphorylates RNA polymerase II, changing its conformation so that the polymerase is released from the general factors and can begin the elongation phase of transcription. As shown, the site of phosphorylation is a long C-terminal polypeptide tail that extends from the polymerase molecule. The assembly scheme shown in the figure was deduced from experiments performed in vitro, and the exact order in which the general transcription factors assemble on promoters in cells is not known with certainty. In some cases, the general factors are thought to first assemble with the polymerase, with the whole assembly subsequently binding to the DNA in a single step. The general transcription factors have been highly conserved in evolution; some of those from human cells can be replaced in biochemical experiments by the corresponding factors from simple yeasts.


Figure 7-43. Activation of transcription initiation in eucaryotes by recruitment of the eucaryotic RNA polymerase II holoenzyme complex.

(A) An activator protein bound in proximity to a promoter attracts the holoenzyme complex to the promoter. According to this model, the holoenzyme (which contains over 100 protein subunits) is brought to the promoter separately from the general transcription factors TFIID and TFIIA. The "broken" DNA in this and subsequent figures indicates that this portion of the DNA molecule can be very long and of variable length. (B) Diagram of an in vivo experiment whose outcome supports the holoenzyme recruitment model for gene activator proteins. The DNA-binding domain of a protein has been fused directly to a protein component of the mediator, a 20-subunit protein complex which is part of the holoenzyme complex, but which is easily dissociable from the remainder of the holoenzyme. When the binding site for the hybrid protein is experimentally inserted near a promoter, transcription initiation is strongly increased. In this experiment, the "activation domain" of the activator (see Figure 7) has been omitted, suggesting that an important function of the activation domain is simply to interact with the RNA polymerase holoenzyme complex and thereby aid in its assembly at the promoter. The ability of gene activator proteins to recruit the transcription machinery to promoters has also been demonstrated directly, using chromatin immunoprecipitation (see Figure 7). DNA-bound activator proteins typically increase the rate of transcription by up to 1000-fold, which is consistent with a relatively weak and nonspecific interaction between the activator and the holoenzyme (a 1000-fold change in affinity corresponds to a change in ΔG of ~4 kcal/mole, which could be accounted for by just a few weak, noncovalent bonds).
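The ~4 kcal/mole figure quoted in the caption can be checked with a short worked equation. This is a sketch that assumes the standard relation between binding free energy and the affinity constant, ΔG = -RT ln K, and a temperature of 310 K (body temperature); neither assumption is stated on the slide.

```latex
\Delta\Delta G \;=\; RT \,\ln(1000)
\;\approx\; \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)
            \left(310\ \mathrm{K}\right)\left(6.91\right)
\;\approx\; 4.3\ \mathrm{kcal/mole}
```

At 298 K the same calculation gives about 4.1 kcal/mole, so the rounded value of ~4 kcal/mole holds across the physiologically relevant range.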

RNA Splicing

Figure 6-26. The RNA splicing reaction.

(A) In the first step, a specific adenine nucleotide in the intron sequence (highlighted in the figure) attacks the 5' splice site and cuts the sugar-phosphate backbone of the RNA at this point. The cut 5' end of the intron becomes covalently linked to the adenine nucleotide, as shown in detail in (B), thereby creating a loop in the RNA molecule. The released free 3'-OH end of the exon sequence then reacts with the start of the next exon sequence, joining the two exons together and releasing the intron sequence in the shape of a lariat. The two exon sequences thereby become joined into a continuous coding sequence; the released intron sequence is degraded in due course.
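At the sequence level, the net effect of splicing is to delete each intron and join the flanking exons. The sketch below shows only that bookkeeping, not the chemistry; the function name `splice`, the coordinate convention (half-open index pairs) and the toy sequences are all assumptions made for illustration.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) half-open index pairs.

    The exon segments between the introns are joined in their original order.
    """
    exons, position = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


exon1, intron, exon2 = "AUGGC", "GUAAGUACAG", "UUCUAA"
pre_mrna = exon1 + intron + exon2
intron_span = (len(exon1), len(exon1) + len(intron))

print(splice(pre_mrna, [intron_span]))  # AUGGCUUCUAA
```

Passing different sets of intron spans for the same primary transcript yields different mature mRNAs, which is the bookkeeping view of alternative splicing.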

Alternative Splicing

Figure 6-27. Alternative splicing of the α-tropomyosin gene from rat.

α-tropomyosin is a coiled-coil protein (see Figure 3) that regulates contraction in muscle cells. The primary transcript can be spliced in different ways, as indicated in the figure, to produce distinct mRNAs, which then give rise to variant proteins. Some of the splicing patterns are specific for certain types of cells. For example, the α-tropomyosin made in striated muscle is different from that made from the same gene in smooth muscle. The arrowheads in the top part of the figure demark the sites where cleavage and poly-A addition can occur.



Ribosome: a complex combination of RNAs and proteins that produces a protein from mRNA

Transfer RNAs (tRNAs): critical for translation. They have

an anti-codon (that binds to a codon) near one end

the corresponding AA on the other end

After translation and post-translational modification, the protein is transported by

passive diffusion

and active transport mechanisms

mRNA Translation

Figure 6-65. Translating an mRNA molecule.

Each amino acid added to the growing end of a polypeptide chain is selected by complementary base-pairing between the anticodon on its attached tRNA molecule and the next codon on the mRNA chain. Because only one of the many types of tRNA molecules in a cell can base-pair with each codon, the codon determines the specific amino acid to be added to the growing polypeptide chain. The three-step cycle shown is repeated over and over during the synthesis of a protein. An aminoacyl-tRNA molecule binds to a vacant A-site on the ribosome in step 1, a new peptide bond is formed in step 2, and the mRNA moves a distance of three nucleotides through the small-subunit chain in step 3, ejecting the spent tRNA molecule and "resetting" the ribosome so that the next aminoacyl-tRNA molecule can bind. Although the figure shows a large movement of the small ribosome subunit relative to the large subunit, the conformational changes that actually take place in the ribosome during translation are more subtle. It is likely that they involve a series of small rearrangements within each subunit as well as several small shifts between the two subunits. As indicated, the mRNA is translated in the 5'-to-3' direction, and the N-terminal end of a protein is made first, with each cycle adding one amino acid to the C-terminal end of the polypeptide chain. The position at which the growing peptide chain is attached to a tRNA does not change during the elongation cycle: it is always linked to the tRNA present in the P site of the large subunit.

Genetic Regulation

At any particular time, a cell is producing only a small fraction of the proteins coded for in its genome

The amount of each protein must be precisely
regulated in order for the cell to function properly

The cell regulates protein production in
response to the environment or other cues

The production process is controlled by a
complex collection of proteins that influence
which genes are expressed.

Expression Efficiency

Figure 6-3. Genes can be expressed with different efficiencies.

Gene A is transcribed and translated much more efficiently than gene B. This allows the amount of protein A in the cell to be much greater than that of protein B.

Gene Expression

Figure 7-3. Differences in mRNA expression patterns among different types of human cancer cells.

This figure summarizes a very large set of measurements in which the mRNA levels of 1800 selected genes (arranged along one axis) were determined for 142 different human tumors (arranged along the other axis), each from a different patient. Each small bar is colored according to expression level: one color indicates that the given gene in the given tumor is transcribed at a level significantly higher than the average across all the cell lines, a second color indicates a lower-than-average expression level, and a third denotes an expression level that is close to average across the different tumors. The procedure used to generate these data, mRNA isolation followed by hybridization to DNA microarrays, is described in Chapter 8 (see pp. 533-535). The figure shows that the relative expression levels of each of the 1800 genes analyzed vary among the different tumors (seen by following a given gene across the figure). This analysis also shows that each type of tumor has a characteristic gene expression pattern. This information can be used to "type" cancer cells of unknown tissue origin by matching the gene expression profiles to those of known tumors. For example, the unknown sample in the figure has been identified as a lung cancer. (Courtesy of Patrick O. Brown, David Botstein, and the Stanford Expression Collaboration.)
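The three-way coloring described in the caption (above average, below average, close to average) amounts to comparing each measurement with the gene's mean across tumors. The sketch below shows one simple way to do that; the function name `classify`, the fixed `tolerance` threshold and the sample values are hypothetical, and the figure's own criterion for "significantly higher" is not given on the slide.

```python
from statistics import mean


def classify(levels, tolerance=0.5):
    """Label each tumor's expression level relative to the gene's average.

    `levels` holds one gene's mRNA measurements across tumors; `tolerance`
    is an arbitrary illustrative cutoff, not a statistical test.
    """
    average = mean(levels)
    labels = []
    for level in levels:
        if level > average + tolerance:
            labels.append("high")
        elif level < average - tolerance:
            labels.append("low")
        else:
            labels.append("average")
    return labels


print(classify([1.0, 2.0, 3.0, 6.0]))  # ['low', 'low', 'average', 'high']
```

Repeating this for every gene produces one row of labels per gene, which is the kind of table the figure renders as colored bars.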

6 Steps of Expression Control

Figure 7-5. Six steps at which eucaryotic gene expression can be controlled.

Controls that operate at steps 1 through 5 are discussed in this chapter. Step 6, the regulation of protein activity, includes reversible activation or inactivation by protein phosphorylation (discussed in Chapter 3) as well as irreversible inactivation by proteolytic degradation (discussed in Chapter 6).

Nucleosome Disruption

Figure 4-34. A cyclic mechanism for nucleosome disruption and re-formation.

According to this model, different chromatin remodeling complexes disrupt and re-form nucleosomes, although, in principle, the same complex might catalyze both reactions. The DNA-binding proteins could function in gene expression, DNA replication, or DNA repair, and in some cases their binding could lead to the dissociation of the histone core to form nucleosome-free regions of DNA like those illustrated in Figure 4. (Adapted from A. Travers, Cell 96:311-314, 1999.)

Control Regions

The control mechanisms are responsive to the level of a
product already in the cell as well as to a tremendous
variety of extracellular signals.

Regulatory proteins recognize and bind to very specific sequences in the DNA (control regions)

Control regions tend to border the protein coding regions of genes

Control regions just upstream of the coding region that encourage expression are called promoters

Similar regions either downstream of the coding region or relatively far upstream are called enhancers

Sequences that tend to prevent the production of a protein are called silencers (repressor sequences)

Switching Tryptophan Genes

Figure 7-34. Switching the tryptophan genes on and off.

If the level of tryptophan inside the cell is low, RNA polymerase binds to the promoter and transcribes the five genes of the tryptophan operon. If the level of tryptophan is high, however, the tryptophan repressor is activated to bind to the operator, where it blocks the binding of RNA polymerase to the promoter. Whenever the level of intracellular tryptophan drops, the repressor releases its tryptophan and becomes inactive, allowing the polymerase to begin transcribing these genes. The promoter includes two key blocks of DNA sequence information, the -35 and -10 regions (see Figure 6).

Repressor Proteins

Figure 7-49. Five ways in which eucaryotic gene repressor proteins can operate.

(A) Gene activator proteins and gene repressor proteins compete for binding to the same regulatory DNA sequence. (B) Both proteins can bind DNA, but the repressor binds to the activation domain of the activator protein, thereby preventing it from carrying out its activation functions. In a variation of this strategy, the repressor binds tightly to the activator without having to be bound to DNA directly. (C) The repressor interacts with an early stage of the assembling complex of general transcription factors, blocking further assembly. Some repressors also act at late stages in transcription initiation, for example, by preventing the release of the RNA polymerase from the general transcription factors. (D) The repressor recruits a chromatin remodeling complex which returns the nucleosomal state of the promoter region to its pre-transcriptional form. Certain types of remodeling complexes appear dedicated to restoring the repressed nucleosomal state of a promoter, whereas others (for example, those recruited by activator proteins) render DNA packaged in nucleosomes more accessible (see Figure 4). However, the same remodeling complex could in principle be used either to activate or repress transcription: depending on the concentration of other proteins in the nucleus, either the remodeled state or the repressed state could be stabilized. According to this view, the remodeling complex simply allows chromatin structure to change. (E) The repressor attracts a histone deacetylase to the promoter. Local histone deacetylation reduces the affinity of TFIID for the promoter (see Figure 7) and decreases the accessibility of DNA in the affected chromatin. A sixth mechanism of negative control, inactivation of a transcriptional activator by heterodimerization, was illustrated in Figure 7. For simplicity, nucleosomes have been omitted from (A)-(C), and the scale of (D) and (E) has been reduced relative to (A)-(C).

Complex Example

Figure 7-50. Eucaryotic gene regulatory proteins often assemble into complexes on DNA.

Seven gene regulatory proteins are shown in (A). The nature and function of the complex they form depends on the specific DNA sequence that seeds their assembly. In (B), some assembled complexes activate gene transcription, while another represses transcription. Note that one protein is shared by both activating and repressing complexes.

Computational Challenge:

Consensus Sequences

Figure 6-17. Consensus sequences found in the vicinity of eucaryotic RNA polymerase II start points.

The name given to each consensus sequence (first column) and the general transcription factor that recognizes it (last column) are indicated. N indicates any nucleotide, and two nucleotides separated by a slash indicate an equal probability of either nucleotide at the indicated position. In reality, each consensus sequence is a shorthand representation of a histogram similar to that of Figure 6. For most RNA polymerase II transcription start points, only two or three of the four sequences are present. For example, most polymerase II promoters have a TATA box sequence, and those that do not typically have a "strong" INR sequence. Although most of the DNA sequences that influence transcription initiation are located "upstream" of the transcription start point, a few, such as the DPE shown in the figure, are located in the transcribed region.
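The consensus notation described in the caption (N for any nucleotide, a slash for a choice between two nucleotides) maps naturally onto a regular expression, which is one simple way to approach the computational challenge of locating such sites. In the sketch below, the function name `consensus_to_regex`, the space-separated input format and the TATA-like pattern "T A T A A/T A A/T" are assumptions chosen for illustration; the actual consensus sequences are in the figure, not in the slide text.

```python
import re


def consensus_to_regex(consensus):
    """Convert a space-separated consensus sequence into a compiled regex.

    Each position is a single base ("T"), a choice ("A/T"), or "N" (any base).
    """
    parts = []
    for position in consensus.split():
        if position == "N":
            parts.append("[ACGT]")
        elif "/" in position:
            parts.append("[" + position.replace("/", "") + "]")
        else:
            parts.append(position)
    return re.compile("".join(parts))


tata_like = consensus_to_regex("T A T A A/T A A/T")  # assumed example pattern
dna = "GGCTATAAAAGGGTATATATCC"
print([match.start() for match in tata_like.finditer(dna)])  # [3, 13]
```

A regular expression treats every allowed base as equally acceptable, so it discards the histogram information that the caption says a consensus sequence summarizes; a position weight matrix would retain it.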

Computational Challenge:

Consensus Sequences

Figure 6-28. The consensus nucleotide sequences in an RNA molecule that signal the beginning and the end of most introns in humans.

Only the three blocks of nucleotide sequences shown are required to remove an intron sequence; the rest of the intron can be occupied by any nucleotide. Here A, G, U, and C are the standard RNA nucleotides; R stands for either A or G; Y stands for either C or U. The A highlighted in the figure forms the branch point of the lariat produced by splicing. Only the GU at the start of the intron and the AG at its end are invariant nucleotides in the splicing consensus sequences. The remaining positions (even the branch point A) can be occupied by a variety of nucleotides, although the indicated nucleotides are preferred. The distances along the RNA between the three splicing consensus sequences are highly variable; however, the distance between the branch point and 3' splice junction is typically much shorter than that between the 5' splice junction and the branch point.
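Because only the GU at the start and the AG at the end of an intron are invariant, a scan based on those two dinucleotides alone produces many candidates, which is part of what makes intron finding a computational challenge. The sketch below enumerates such candidates; the function name `candidate_introns`, the `min_length` cutoff and the toy RNA are hypothetical, and the branch point and the preferred flanking nucleotides are deliberately ignored.

```python
def candidate_introns(rna, min_length=6):
    """List (start, end) spans that begin with GU and end with AG.

    Spans are half-open index pairs into `rna`; `min_length` is an
    arbitrary illustrative lower bound on intron length.
    """
    donors = [i for i in range(len(rna) - 1) if rna[i:i + 2] == "GU"]
    acceptors = [j + 2 for j in range(len(rna) - 1) if rna[j:j + 2] == "AG"]
    return [(i, j) for i in donors for j in acceptors if j - i >= min_length]


rna = "AUGGC" + "GUAAGUACAG" + "UUCUAA"  # exon + intron + exon (toy example)
print(candidate_introns(rna))  # [(5, 15), (9, 15)]
```

Even this 21-nucleotide example yields two candidates for one true intron, so additional evidence such as the branch point consensus or the reading frame is needed to choose between them.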

Mechanisms of Evolution

The similarities and differences among closely related
molecules provide important information about their
structure and function

Molecules (or their sequences) which are related to one
another are said to be homologous

Sources of variation:

Point mutation (insertion, deletion, change)

Gene duplication (create pseudogenes)

Crossover (sexual reproduction)

Effects of mutations:

Neutral (in introns, third codon position); basis of genetic drift

Lethal (very often)

Advantageous (rarely); get fixed in the population
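The remark that changes at the third codon position are often neutral can be made concrete by checking whether a single-base substitution alters the encoded amino acid. The sketch below uses the standard genetic code in a compact layout (bases ordered T, C, A, G); the function name `classify_substitution` and the labels "synonymous", "missense" and "nonsense" are standard terminology but do not appear on the slide, and they describe only the effect on the protein sequence, not on fitness.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per DNA codon in TCAG order; '*' is a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(codon, position, new_base):
    """Classify a single-base change at `position` (0, 1 or 2) of a codon."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == after:
        return "synonymous"  # same amino acid: a candidate neutral change
    if after == "*":
        return "nonsense"    # creates a premature stop codon
    return "missense"        # changes the amino acid


print(classify_substitution("GGT", 2, "C"))  # synonymous (Gly -> Gly)
print(classify_substitution("TGG", 2, "A"))  # nonsense   (Trp -> stop)
print(classify_substitution("GAG", 1, "T"))  # missense   (Glu -> Val)
```

Insertions and deletions are not covered here: they shift the reading frame and therefore change every downstream codon.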

4 Types of Innovation

Figure 1-23. Four modes of genetic innovation and their effects on the DNA sequence of an organism.

Effect of Mutation

Figure 1-29. A mutant phenotype reflecting the function of a gene.

A normal yeast (of the species Schizosaccharomyces pombe) is compared with a mutant in which a change in a single gene has converted the cell from a cigar shape to a T shape. The mutant gene therefore has a function in the control of cell shape. But how, in molecular terms, does the gene product perform that function? That is a harder question, and needs biochemical analysis to answer it. (Courtesy of Kenneth Sawin and Paul Nurse.)

Gene Duplication

Figure 1-51. The consequences of gene duplication for mutational analyses of gene function.

In this hypothetical example, an ancestral multicellular organism has a genome containing a single copy of gene G, which performs its function at several sites in the body, as indicated in the figure. (A) Through gene duplication, a modern descendant of the ancestral organism has two copies of gene G, called G1 and G2. These have diverged somewhat in their patterns of expression and in their activities at the sites where they are expressed, but they still retain important similarities. At some sites, they are expressed together, and each independently performs the same old function as the ancestral gene G (alternating yellow stripes); at other sites, they are expressed alone and serve new purposes. (B) Because of a functional overlap, the loss of one of the two genes (red cross) reveals only a part of its role; only the loss of both genes in the double mutant reveals the full range of processes for which these genes are responsible. Analogous principles apply to duplicated genes that operate in the same place (for example, in a single-celled organism) but are called into action together or individually in response to varying circumstances. Thus, gene duplications complicate genetic analyses in all organisms.

Human vs. Mouse

Figure 1-53. Human and mouse: similar genes and similar development.

The human baby and the mouse shown here have similar white patches on their foreheads because both have mutations in the same gene, which is required for the development and maintenance of pigment cells. (From R.A. Fleischman, Proc. Natl. Acad. Sci. USA 88:10885-10889, 1991. © National Academy of Sciences.)

Life as a Biochemical Process

The revolution in biology over the last three decades resulted from understanding cells in terms of their chemistry

The functions of life (e.g. cell division, immune reaction) are the interactions of complex, self-regulating networks of chemical reactions

The substances that carry out and regulate these activities:

macromolecules (proteins, carbohydrates, lipids)

a variety of small molecules.

Life as a Biochemical Process

Genetic material: specifies how to create proteins, and when and how much to create

Proteins: control the flow of energy and materials through the cell, including the creation and transformation of carbohydrates, lipids and other molecules

Proteins ultimately accomplish all of the
functions that the cell carries out.

Life as a Biochemical Process

In even the simplest cell, there are thousands of
kinds of biomolecules interacting with each other

The amount and concentration of molecules in
various cell compartments determines what
influence they have

The concentrations vary over time, on scales of
seconds to decades.

Interactions among biomolecules are highly nonlinear, as are their interactions with molecules from outside the cell.

Catalysis and Metabolic Pathways

Intermediary metabolism: most essential chemical processes within a cell

catabolic processes: breaking down proteins, fats and carbohydrates

anabolic processes: building new materials.

Processes in intermediary metabolism are almost all catalyzed reactions

require special compounds (catalysts or enzymes) that facilitate the reaction

enzymes are mostly proteins

materials transformed by catalysts are called substrates

catalysts are not changed by the reactions they
participate in

Catalysis and Metabolic Pathways

Metabolic pathways: combinations of reactions which accomplish tasks like turning foods into useable energy or compounds

Even the basic transformations of intermediary
metabolism can involve dozens or hundreds of
catalyzed reactions.

Allosteric regulation:

the presence or absence of substrates can affect the
behavior of the enzymes themselves

Competitive inhibition:

a substance other than the usual substrate of the enzyme binds to the active site of the enzyme

this prevents the enzyme from acting on its usual substrate

How Enzymes Work

Figure 2-47. How enzymes work.

Each enzyme has an active site to which one or two substrate molecules bind, forming an enzyme-substrate complex. A reaction occurs at the active site, producing an enzyme-product complex. The product is then released, allowing the enzyme to bind additional substrate molecules.

How Enzymes Work

Figure 3-52. Some general strategies of enzyme catalysis.

(A) Holding substrates together in a precise alignment. (B) Charge stabilization of reaction intermediates. (C) Altering bond angles in the substrate to increase the rate of a particular reaction.

Positive regulation

Figure 3-57. Positive regulation caused by conformational coupling between two distant binding sites.

In this example, both glucose and molecule X bind best to the closed conformation of a protein with two domains. Because both glucose and molecule X drive the protein toward its closed conformation, each ligand helps the other to bind. Glucose and molecule X are therefore said to bind cooperatively to the protein.

Negative regulation

Figure 3-58. Negative regulation caused by conformational coupling between two distant binding sites.

The scheme here resembles that in the previous figure, but here molecule X prefers the open conformation, while glucose prefers the closed conformation. Because glucose and molecule X drive the protein toward opposite conformations (closed and open, respectively), the presence of either ligand interferes with the binding of the other.

Metabolic chain

Figure 2-34. How a set of enzyme-catalyzed reactions generates a metabolic pathway.

Each enzyme catalyzes a particular chemical reaction, leaving the enzyme unchanged. In this example, a set of enzymes acting in series converts molecule A to molecule F, forming a metabolic pathway.


Figure 2-35. Some of the metabolic pathways and their interconnections in a typical cell.

About 500 common metabolic reactions are shown diagrammatically, with each molecule in a metabolic pathway represented by a filled circle, as in the box in Figure 2

Model Organisms

Most valuable methods in biological research

are invasive

require organisms to be sacrificed

require many generations of observation

require observations on large populations

Much of this work is impractical or unethical to
carry out on humans

research involving humans can be done ethically

in some areas of biomedical research, such as final
drug testing, it is obligatory

Biologists have selected a variety of model
organisms for experimentation.

Model Organisms

E. coli (the ubiquitous intestinal bacterium Escherichia coli)

fast reproduction time, safe and easy to work with

common target for genetic engineering (genes from other organisms are inserted into its genome and produced in quantity)

Saccharomyces cerevisiae (brewer's yeast)

safe, easy to grow, short generation time

yeasts are eukaryotes

useful in sequencing large amounts of DNA

Arabidopsis thaliana (a common weed)

useful in agriculture research

very little repetitive DNA (its genome has 30 times less DNA than that of corn)

Model Organisms

Caenorhabditis elegans (the nematode worm)

has exactly 959 cells, each doing exactly the same thing in every individual

one of the simplest creatures with a nervous system (about 1/3 of its cells are neurons)

useful for gaining insight into the mechanisms of aging and development

Drosophila melanogaster (the fruit fly)

short generation times

many different genetically determined morphological characteristics

a staple of classical genetics research

Mus musculus (laboratory mouse)

genetically very similar to people

Experimental Methods

Important for understanding the strengths and
limitations of various sources of data


Imaging/microscopy; new technologies (e.g. Atomic
Force Microscope) offer the potential to create
images of individual molecules.

Gel electrophoresis; a method for separation of a
protein mixture by their charge and size

The resulting individual spots or bands can be removed from
the gel for further processing, in a procedure called blotting

Experimental Methods


Cloning: a clone is a group of cells with identical genomes (also called a cell line)

individual genes can be cloned by inserting a piece of DNA containing a gene of interest into the bacterial genome (creating recombinant DNA)

how: put the new gene into a vector (a virus that infects bacteria, or a plasmid) and add antibiotic resistance genes

Genetic engineering: inserting foreign genes into plants,
animals, even people.

Hybridization / Immunological staining; to identify the
presence or absence of a particular molecule (or even
a region of a molecule) in vanishingly small amounts

Application in microarrays to measure expression of all
genes in a genome

Experimental Methods

Gene Mapping and Sequencing; to identify the
location and size of all of the genes

Polymerase chain reaction (PCR); to rapidly produce huge amounts of a specific region of DNA, simply by knowing a little bit of the sequence around the desired region
Crystallography and NMR; to determine 3D structure
of biomolecules

Mass spectrometry; to identify sequence of protein,
RNA, or DNA fragments

Hybrid systems; to assess protein