DNA Properties

CSE, Marmara University

mimoza.marmara.edu.tr/~m.sakalli/cse546


Oct/19/09

[Diagram: Computational Molecular Biology, Bioinformatics, Genomics, Proteomics, Functional genomics, Structural bioinformatics]

No simple definition of being alive (life).

Reproducing itself: a default mechanism for every living being.

But how about computer programs, crystals, and self-building, self-learning robots and computers?


Life on earth is the result of an evolutionary process, and the idea is that all living things have a common ancestor and are related through…


Basic components of evolution:

Inheritance

Variation: defined legal moves in genotype space.

Selection: a probabilistic evaluation function


In Computer Science: DNA is a string of symbols from the alphabet {A,C,G,T}.

Evolution is a search through a very large space of possible organism characteristics.


The words built from this four-letter alphabet cover all the inherited characteristics (called the genotype) of all organisms.
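The string view of DNA can be made concrete with a minimal sketch. The function names `is_dna` and `genotype_space_size` are hypothetical and chosen only for illustration; the sketch simply checks membership in the four-letter alphabet and counts how many strings of a given length exist, which hints at how large the search space is.

```python
DNA_ALPHABET = frozenset("ACGT")


def is_dna(sequence: str) -> bool:
    """Return True if the string is non-empty and uses only A, C, G, T."""
    return bool(sequence) and set(sequence) <= DNA_ALPHABET


def genotype_space_size(length: int) -> int:
    """Number of distinct DNA strings of the given length: 4 ** length."""
    return len(DNA_ALPHABET) ** length


print(is_dna("GATTACA"))        # True
print(is_dna("GAUUACA"))        # False: U belongs to the RNA alphabet
print(genotype_space_size(10))  # 1048576
```

Even a string of only 10 symbols already has about a million possible values, so the space grows exponentially with sequence length.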

The Central Dogma in molecular biology

http://proquestcombo.safaribooksonline.com/0596002998/blast-CHP-2

3 processes: Replication, Transcription, and Translation.

Almost every cell in our body has 23 pairs of chromosomes in the nucleus, and the genes in these chromosomes are responsible for almost all of our characteristics (not merely the physical ones).

http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.600, by Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and Peter Walter

Figure 4-5. The DNA double helix.

(A) A space-filling model of 1.5 turns of the DNA double helix. Each turn of DNA is made up of 10.4 nucleotide pairs and the center-to-center distance between adjacent nucleotide pairs is 0.34 nm. The coiling of the two strands around each other creates two grooves in the double helix. As indicated in the figure, the wider groove is called the major groove, and the smaller the minor groove. (B) A short section of the double helix viewed from its side, showing four base pairs. The nucleotides are linked together covalently by phosphodiester bonds through the 3′-hydroxyl (-OH) group of one sugar and the 5′-phosphate (P) of the next. Thus, each polynucleotide strand has a chemical polarity; that is, its two ends are chemically different. The 3′ end carries an unlinked -OH group attached to the 3′ position on the sugar ring; the 5′ end carries a free phosphate group attached to the 5′ position on the sugar ring.


Polymer of:

Sugar (deoxyribose in DNA, ribose in RNA)

Phosphate

Nitrogenous base

Bases

A, C, G, T (and uracil, U, in place of T in RNA)

Pairing rule

A (R) pairs with T (Y)

G (R) pairs with C (Y)

R = puRine, Y = pYrimidine

DNA structure and base pairing
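The pairing rule can be written as a small lookup table. This is a minimal sketch, with hypothetical helper names, that derives the complementary strand and checks that every pair joins a purine (R) with a pyrimidine (Y).

```python
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = frozenset("AG")      # R
PYRIMIDINES = frozenset("CT")  # Y


def complement(strand: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIRING[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Opposite strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


def every_pair_is_purine_pyrimidine() -> bool:
    """True if each pair joins exactly one purine with one pyrimidine."""
    return all((a in PURINES) != (b in PURINES) for a, b in PAIRING.items())


print(complement("ATGC"))                 # TACG
print(reverse_complement("GAATTC"))       # GAATTC (reads the same on both strands)
print(every_pair_is_purine_pyrimidine())  # True
```

Because the two strands run in opposite directions, the reverse complement, not the plain complement, is what one reads off the second strand.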

Why double-stranded?

Chemically and biophysically more stable; allows some error correction (a backup copy) if one strand is accidentally damaged, e.g. by UV irradiation.

Genes (less than 5% of all the DNA) provide the coding information:

instructions for protein synthesis,

regulatory functions.

Redundancy translates to robustness:

Synonymous codons

Dual strands

Diploid


In translation the information now encoded in RNA is deciphered (translated) into instructions for making a protein.

Codon: a set of three nucleotides. A codon determines which amino acid is added next in the protein chain.

For example, GCU, the first codon in the figure, codes for alanine.

RNA - Translation

The table of nucleotide triplets (codons) and their corresponding amino acids (aa). In RNA, a uracil (U) is substituted for a thymine (T). The code is nearly universal across organisms.

The RNA alphabet is A, C, G, and U, e.g. GAAUUC.

The third position of a codon is often insignificant.

ATG (AUG in RNA): start codon of a protein (methionine).

T (U in RNA) in the middle position: hydrophobic aa.

64 possible codons but only 20 aa in total, plus start and stop signals, or regulatory functions.


[Codon table. Axes: 1st nt position (U, C, A, G); 2nd nt position (U, C, A, G); 3rd nt position (U, C, A, G)]
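As a rough sketch, the codon table can be rebuilt in code using the same U, C, A, G ordering as the table axes. The one-letter amino acid string below is the standard genetic code written in that order, with `*` marking stop codons; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons map to each amino acid (the redundancy of the code).
degeneracy = Counter(CODON_TABLE.values())

# Amino acids whose codon has U in the middle position.
middle_u = {aa for codon, aa in CODON_TABLE.items() if codon[1] == "U"}

print(len(CODON_TABLE))             # 64
print(len(set(AMINO_ACIDS) - {"*"}))  # 20
print(degeneracy["*"])              # 3 stop codons
print(CODON_TABLE["GCU"], CODON_TABLE["AUG"])  # A M
print(sorted(middle_u))             # ['F', 'I', 'L', 'M', 'V']
```

The last line lists phenylalanine, isoleucine, leucine, methionine and valine, which are all hydrophobic, in line with the remark about the middle position.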

SNP (single nucleotide polymorphism), wobble in the code, neutral synonymous mutations.

Some changes at the third position of a codon, for example a point mutation such as that shown below, do not change the amino acid sequence or the protein produced. Alanine is encoded whether the third base is U or C, so a point mutation from U to C makes no difference.

GCUAGGAUCUCAGGCUCA

GCCAGGAUCUCAGGCUCA

Point mutation
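The two sequences above can be compared directly. The following sketch assumes the standard genetic code and a hypothetical `translate` helper; it locates the single differing base and confirms that both RNAs translate to the same amino acid chain.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Translate codon by codon from position 0, ignoring a trailing partial codon."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


original = "GCUAGGAUCUCAGGCUCA"
mutated = "GCCAGGAUCUCAGGCUCA"

# 0-based positions where the two sequences differ.
differences = [i for i, (a, b) in enumerate(zip(original, mutated)) if a != b]

print(differences)          # [2], the third base of the first codon
print(translate(original))  # ARISGS
print(translate(mutated))   # ARISGS
```

The mutation falls on the third position of the first codon, and GCU and GCC both encode alanine, so the protein is unchanged.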

Protein-coding sequences are called exons. The non-coding parts are introns, intervening DNA segments. Both introns and exons are transcribed into the primary RNA transcript (see next slide), but only exons remain in the final mRNA.

Reading frames: a sequence can be read in 6 possible frames (3 per strand), and an insertion or deletion shifts the frame. Therefore it is important to know which codon to start translation with, and where to stop.


http://en.wikipedia.org/wiki/Gene

Splicing of the RNA transcript to eliminate introns

A Science Primer http://www.ncbi.nlm.nih.gov/About/primer/est.html

A protein-coding region framed by Met (ATG) and any stop codon (TAA, TAG, or TGA) is called an open reading frame (ORF). An example of an ORF:

….TCGA ATG GCATTCGCAGTC…………..TACTTGCACGCTTGACCGTCA TAA GCA….
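A simple ORF scan over all six reading frames might look like the sketch below. It is not a production gene finder: it reports only the first ATG before each in-frame stop, ignores minimum-length filters, and the input is a hypothetical toy sequence modeled on the example above with the elided middle left out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq: str, frame: int):
    """Yield (start, end) of each ATG...stop stretch read in one frame."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3
            start = None


def find_orfs(seq: str):
    """Scan 3 frames on each strand; return (strand, frame, orf) tuples."""
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                found.append((strand, frame, s[start:end]))
    return found


toy = "TCGAATGGCATTCGCAGTCTAAGCA"  # hypothetical toy input
print(find_orfs(toy))  # [('+', 1, 'ATGGCATTCGCAGTCTAA')]
```

Only one of the six frames contains a start codon followed by an in-frame stop, which is why choosing the right frame matters.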

In addition, each of the 20 amino acids has different chemical properties, which cause the protein chains to fold into different 3D shapes and determine their particular functions in the cell.

For example, certain folding patterns (called tertiary structures) make it possible for specific enzymes to bind in a particular place. One change in the DNA sequence could change the amino acid, which could change the protein structure, and in turn the enzymes that can bind to it.

Levels and types of genome variations

Plant genomes may differ from one another in different ways:

http://www.igd.cornell.edu/Comparative%20Genomics


1. Amount of DNA in the nucleus. Quantified in picograms (also called the C-value); varies over 1000-fold.

2. Number and size of chromosomes.

3. Differences at the sequence level, both in the absolute order of the bases and in the type and number of different classes of sequences.


Organisms that originated millions of years ago from the same ancestral sequence should share the same sequence structures: a family tree, or phylogeny.

Some of the mechanisms of genetic variation:

Point mutations

Insertions and deletions

Translocations

Transposons (mobile or jumping genes) and retrotransposons, which copy themselves from RNA back to DNA using reverse transcriptase

Splicing, transcription and translation errors

Finding genes: cDNA: The genetic sequence could be analyzed directly from the DNA, but DNA contains too much non-genic junk material. An alternative is to study just the mRNA; however, mRNA and protein are very unstable and therefore difficult to work with.

Instead, scientists use special enzymes (reverse transcriptases) to convert RNA into complementary DNA (cDNA), which is a much more stable compound. Because it is generated from an mRNA in which the introns have been removed, cDNA represents only the transcribed DNA sequence, the genes.
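The path from gene to cDNA can be sketched as three string operations: transcribe, splice, reverse-transcribe. The toy gene and the exon coordinates below are hypothetical, made up purely to illustrate the steps; real exon boundaries come from annotation or alignment.

```python
def transcribe(coding_strand: str) -> str:
    """Primary transcript: same as the coding strand, with U instead of T."""
    return coding_strand.replace("T", "U")


def splice(pre_mrna: str, exons) -> str:
    """Keep only the exon intervals, given as 0-based (start, end) pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def reverse_transcribe(mrna: str) -> str:
    """First-strand cDNA: the DNA reverse complement of the mRNA."""
    return mrna.translate(str.maketrans("ACGU", "TGCA"))[::-1]


# Hypothetical toy gene: exon 1, a 12-base intron, exon 2.
gene = "ATGGCA" + "GTAAGTTTTCAG" + "TTCTAA"
exons = [(0, 6), (18, 24)]

mrna = splice(transcribe(gene), exons)
cdna = reverse_transcribe(mrna)

print(mrna)  # AUGGCAUUCUAA
print(cdna)  # TTAGAATGCCAT
```

The cDNA carries no trace of the intron, which is the reason it represents only the expressed part of the gene.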

Genetic Mapping: Also called linkage mapping. It uses the concepts of Mendelian inheritance and recombination frequencies to determine the chromosomal location of markers by analyzing their inheritance patterns. Done by either Southern blot (electrophoresis-separated fragments subsequently detected by probe hybridization) or, more recently, polymerase chain reaction (PCR, using thermal cycling) based methods.

A tomato F2 population is used to calculate recombination frequencies, and genetic distances, between a selection of SSRs (simple sequence repeats, or microsatellites) and other molecular markers.
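As a minimal illustration of how a recombination frequency becomes a genetic distance, the sketch below counts recombinants directly and then applies Haldane's mapping function. Both choices are assumptions made for the example: the slides do not say which mapping function was used, direct counting fits a simple backcross-style design, and F2 data such as the tomato population are usually handled with maximum-likelihood estimation instead. The counts are invented.

```python
import math


def recombination_frequency(recombinants: int, total: int) -> float:
    """Fraction of offspring that are recombinant for a pair of markers."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


def haldane_distance_cm(r: float) -> float:
    """Map distance in centimorgans from Haldane's function: -50 * ln(1 - 2r)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


r = recombination_frequency(20, 200)      # invented counts
print(r)                                  # 0.1
print(round(haldane_distance_cm(r), 2))   # 11.16
```

For small frequencies the distance in centimorgans is close to the percentage of recombinants; the correction grows as markers lie further apart.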


DNA: contains non-genic material

RNA: unstable

cDNA: stable and mainly genes

Comparative mapping: Among related but sexually incompatible species, heterologous (between-species) DNA markers can be used to generate comparative maps and to infer linkage conservation and the position of orthologous loci (homologous loci that diverged through speciation). This requires a minimal amount of similarity between the target and probe species and so cannot be used with more distantly related species. Most Gramineae genomes (i.e. grass species: maize, rice, wheat, barley, millet, etc.) are connected through comparative genetic maps. While genome size varies dramatically among grass species, the gene content and gene order remain more highly conserved.


Packing of DNA in the nucleus

http://employees.csbsju.edu/hjakubowski/classes/ch331/DNA/oldnastructure.html

Figure 1-38. Genome sizes compared.

Genome size is measured in nucleotide pairs of DNA per haploid genome, that is, per single copy of the genome. (The cells of sexually reproducing organisms such as ourselves are generally diploid: they contain two copies of the genome, one inherited from the mother, the other from the father.) Closely related organisms can vary widely in the quantity of DNA in their genomes, even though they contain similar numbers of functionally distinct genes. (Data from W.-H. Li, Molecular Evolution, pp. 380-383. Sunderland, MA: Sinauer, 1997.)

Archaebacterium living in a superheated sulphur vent at the bottom of the ocean

A two-ton polar bear roaming the Arctic Circle


Genome size (length of DNA) varies from 5,000 (SV40 virus) to 3*10^9 (humans) and 10^11 (higher plants) nucleotide pairs
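To get a feel for why DNA must be packed, the genome sizes above can be converted into physical lengths. This sketch assumes the commonly quoted B-DNA rise of about 0.34 nm per nucleotide pair together with the 10.4 pairs per turn from the double-helix figure; the function names are illustrative.

```python
RISE_PER_PAIR_NM = 0.34   # assumed B-DNA rise per nucleotide pair
PAIRS_PER_TURN = 10.4     # from the double-helix figure caption


def contour_length_m(nucleotide_pairs: float) -> float:
    """Length of the fully stretched double helix, in metres."""
    return nucleotide_pairs * RISE_PER_PAIR_NM * 1e-9


def helical_turns(nucleotide_pairs: float) -> float:
    """Number of full turns of the helix."""
    return nucleotide_pairs / PAIRS_PER_TURN


for name, pairs in [("SV40 virus", 5e3), ("human", 3e9), ("higher plant", 1e11)]:
    print(f"{name}: {contour_length_m(pairs):.3g} m, {helical_turns(pairs):.3g} turns")
```

Under these assumptions a single haploid human genome is roughly one metre of DNA, which has to fit inside a nucleus only a few micrometres across.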


All organisms share basic properties


Made of cells (membrane-enclosed sacs of chemicals)


Carry out basic reactions (e.g. core metabolic and developmental pathways)

Three major groups:

Archaea (recently discovered)

Bacteria (germs, blue-green algae, symbiotic organisms)

Eukaryotes

Animals

Green Plants

Fungi

Protists

Viruses

Figure 1-21. The three major divisions (domains) of the living world.

Note that traditionally the word bacteria has been used to refer to procaryotes in general, but more recently has been redefined to refer to eubacteria specifically. Where there might be ambiguity, we use the term eubacteria when the narrow meaning is intended. The tree is based on comparisons of the nucleotide sequence of a ribosomal RNA subunit in the different species. The lengths of the lines represent the numbers of evolutionary changes that have occurred in this molecule in each lineage.

Tree of Life