Decoding the rockfish genome: An introduction to modern genomics
and marine biology
Target Audience: AP high school biology students or lower division undergraduate college students.
How do eukaryotic genes work?
What can comparison of DNA sequences tell us about evolution?
How can I use bioinformatics to annotate genes?
What does a typical rockfish gene look like?
This investigation focuses on how genes are found in a newly sequenced genome. This process is called
structural gene annotation. Concepts explored include the cell, the molecular basis of heredity, and
evolution. Students will use online analytical tools to explore the fine structure of genes in the flag
rockfish genome, the connection between gene structure and cellular functions, and the connection
between function and evolutionary conservation of gene sequences.
Genomics and bioinformatics are dynamic fields well suited for capturing the imagination of students in
inquiry-driven classroom efforts. Genomic studies provide a comprehensive catalog of basic genetic
information in a genome, including the structures and functions responsible for an organism’s survival,
evolution, and interactions with other organisms of the same or different species. New genomes are being
sequenced at an increasing rate, leaving vast quantities of orphaned data that can be explored in
authentic research.
In this investigation, students will be given raw segments of DNA (i.e. genomic scaffolds) from the
flag rockfish genome project. In order to inform their gene annotations, students will search for
evidence available online in the form of similar proteins and RNAs from closely related organisms, as
well as humans. This “extrinsic” evidence will be combined with “intrinsic” signals in the DNA itself
(signals that direct the cellular apparatus through transcription and translation) to devise gene models
from raw DNA. Students will be answering the basic question: what does a rockfish gene look like? The
answer may depend on the extrinsic evidence the student is able to find. Students will perform
structural gene annotation by hand, examine alignments of gene sections from many different
vertebrates using the UCSC genome browser, use a gene annotation pipeline to perform structural gene
annotation, assign a putative function to the gene, and describe how the gene is important to the
organism’s development, reproduction, and/or survival.
This example highlights some elements of the 9-12 National Science Education Content Standards A and C,
Science as Inquiry and Life Science. Students will obtain the means necessary to perform and
understand scientific inquiry. The primary life science standards covered include the cell, the molecular
basis of heredity, and biological evolution. This lab manual assumes that each student has some
familiarity with cell and molecular biology and access to a computer and the World Wide Web. The
background information covered is not meant to be an exhaustive treatment of these topics, but rather
reviews the specific information necessary to perform and understand the exercises.
Lab 1. How do eukaryotic genes work?
Goal: To give a basic understanding of genes and their functions.
Review of Some Basic Molecular Biology:
The “Central Dogma of Biology” describes information flow in biological systems. It states that DNA
makes RNA, which makes proteins. Transcription is the process of copying DNA into RNA. Translation is
the process of decoding RNA into proteins. We will focus our analysis on predicting protein coding genes
from raw genome sequence in a marine fish that has recently been sequenced. Only messenger RNA
(mRNA) genes code for proteins. Ribosomal genes are transcribed into ribosomal RNAs (rRNA), transfer
RNA genes produce tRNA molecules, and many other RNA genes do not code for proteins. Most DNA is found
in the nucleus, where it is transcribed. The resulting mRNA is then exported from the nucleus to the
cytoplasm for translation.
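The transcription step described above can be sketched in a few lines of Python. This is only an illustration, assuming the input is the coding (sense) strand written 5’ to 3’; the function name and the example sequence are made up for this sketch and do not come from the rockfish data.

```python
def transcribe(coding_strand):
    """Return the mRNA matching a DNA coding strand.

    Assumption: the input is the coding (sense) strand, 5' to 3',
    so the mRNA has the same letters with every T replaced by U.
    """
    return coding_strand.upper().replace("T", "U")


# Made-up example sequence, not taken from any real gene.
print(transcribe("ATGGCCTAA"))
```

Translation, the second step, needs the genetic code and is covered later in this lab.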
The flag rockfish is a member of a diverse marine fish assemblage with an estimated 102 species native
to the west coast of North America. Rockfishes of the genus Sebastes support important commercial and
recreational fisheries on the west coast of North America and are the dominant assemblage on most cold
temperate reefs. These live-bearers have a low intrinsic rate of population increase and highly sporadic
recruitment, releasing large numbers of pelagic larvae into a variable coastal environment. Their slow
growth rates render them vulnerable to overfishing. Uncertainty in the success of any particular year
class has favored an evolutionary strategy whereby some species of the genus have extremely long
lifespans and do not show signs of aging (negligible senescence), while others demonstrate typical aging
patterns and have short lifespans. The flag rockfish has a maximum lifespan of only 18 years, but it is
closely related to a species that has a maximum age of at least 116 years. Researchers are expecting to
gain insight into the genetic mechanism for negligible senescence by sequencing and comparing the
genomes of these two species.
Transcription and Eukaryotic Gene Structure:
Most cell functions involve chemical reactions. Food molecules taken into cells react to provide the chemical
constituents needed to synthesize other molecules. Both breakdown and synthesis are made possible by a
large set of protein catalysts, called enzymes. The breakdown of some of the food molecules enables the cell to
store energy in specific chemicals that are used to carry out the many functions of the cell. Cells store and use
information to guide their functions. The genetic information stored in DNA is used to direct the synthesis of
thousands of proteins that each cell requires. Cell functions are regulated. Regulation occurs both through
changes in the activity of the functions performed by proteins and through the selective expression of
individual genes. This regulation allows cells to respond to their environment and to control and coordinate cell
growth and division. (National Science Education Standards pg. 184)
In all organisms, the instructions for specifying the characteristics of organisms are carried in DNA, a large
polymer formed from subunits of four kinds (A, G, C, and T). The chemical and structural properties of DNA
explain how the genetic information that underlies heredity is both encoded in genes (as a string of molecular
“letters”) and replicated (by a templating mechanism). Each DNA molecule in a cell forms a single
chromosome. (National Science Education Standards pg. 185)
The structure of a gene dictates how cellular proteins will interact with it to transcribe a messenger
RNA and translate that mRNA into a protein. Some of the important details of DNA and a gene are:
Each cell contains the same genome sequence, but different genes are transcribed into mRNAs
by different cells and by the same cells at different times.
DNA is double stranded, and has 5’ and 3’ ends, pronounced “five prime” and “three prime.” It is
read from 5’ to 3’, and the two strands run in opposing directions: the 5’ end of one is paired with
the 3’ end of its complement.
Eukaryotic genes have several different parts that orchestrate the cellular processes of
transcription and translation (see Figure below):
The promoter is a region of DNA up to a few hundred bp immediately upstream of the
transcription start point to which transcription factors (proteins) bind and recruit RNA
polymerases for transcription.
Exons are the sections of transcribed DNA that are exported from the nucleus after introns have
been removed and the ends of the transcript stabilized to form the mature mRNA. A gene may
have one exon or several exons separated by intervening sequences called introns. The
beginning of the first exon and the end of the last exon are untranslated.
The Five Prime Untranslated Region (5’ UTR) is the part of the mature mRNA immediately
upstream of the coding sequence. The corresponding DNA may be interrupted by introns. Before
translation can start, the ribosome binds to the modified 5’ end of the 5’ UTR after export to the
cytoplasm.
Coding sequences (CDS) of DNA are ultimately the sections of exons that are translated by
ribosomes into amino acid sequences once the mature mRNA is exported to the cytoplasm. The
protein coding region of DNA begins with the start codon ATG and ends with one of the stop codons
TAG, TGA, or TAA. Note that the process of transcription copies Ts (thymines) as Us (uracils).
Introns are non-coding regions between exons. Introns usually begin with GT and end in AG (or
the reverse complements on the opposite strand) in what is known as the “GT/AG rule”. They are
excised from pre-mRNA in the nucleus by proteins that recognize these sequences. Their excision is
part of the pre-mRNA processing step that also includes adding a 5’ cap (a modified guanine) to the
mRNA and a poly-A tail (a long stretch of As) for message stability.
The Three Prime Untranslated Region (3’ UTR) is the region of DNA after the stop codon in a
gene. Once it is transcribed, a poly-A tail is added that helps to stabilize the mRNA. These
regions sometimes have binding sites for microRNAs that, when present, signal the mRNA for
degradation.
Intergenic DNA is the DNA between genes. There are still recognizable DNA elements in
intergenic DNA:
Enhancers and silencers are sequence elements in the DNA that are more distant from
the transcription start point than promoters yet still regulate transcription by
determining what kind of molecules will be able to bind to the DNA. “Regulation”
refers to turning transcription “up or down”.
Tandem repeats (a.k.a. microsatellites) are stretches of repeated nucleotide
motifs from one to 30 bp. For example, the sequence ACACACACACACAC contains the
motif AC repeated seven times.
Dispersed repeats are repeats that do not occur in tandem. These are often ancient
viral DNA elements that have incorporated themselves into other organisms’ genomes,
have made copies of themselves over the course of evolutionary time (tens of millions of
years), and have lost their ability to become full viruses and infect their host. Dispersed
repeats can also be found within introns of genes and must be identified to keep gene
finders from characterizing them as exons belonging to native genes.
Tandem repetitive DNA can serve important purposes such as protecting the ends of
chromosomes, called telomeres, and serving as binding sites during cell division at
the constriction point of a chromosome (centromere).
Many genome sequences are also contaminated by bacterial and human DNA.
This illustrates a major point of scientific research. Researchers must be constantly
“on guard” for a myriad of problems that obscure the truth.
One gene can encode different proteins in different cell types by including or excluding different
exons in a process known as alternative splicing. Different cell types are formed in a process
called differentiation, during development of the organism. After differentiation, cell types have
different proteins present that direct the cell to use the DNA in a way that suits the function of
the cell. Exons are mixed to form alternative mRNA products.
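Several of the details listed above (antiparallel strands, the GT/AG rule, and the removal of introns) can be tried out on a toy sequence. The sketch below is illustrative only: the gene is invented, the coordinates are 0-based with an exclusive end (a convention chosen for this example), and real splicing depends on more signals than the first and last two bases of an intron.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def follows_gt_ag(intron):
    """True if the intron starts with GT and ends with AG."""
    intron = intron.upper()
    return intron.startswith("GT") and intron.endswith("AG")


def splice(gene, introns):
    """Remove introns given as (start, end) pairs, 0-based, end exclusive."""
    pieces = []
    position = 0
    for start, end in sorted(introns):
        pieces.append(gene[position:start])  # keep the exon before this intron
        position = end                       # skip over the intron
    pieces.append(gene[position:])           # keep the final exon
    return "".join(pieces)


# Invented toy gene: exon + intron + exon.
gene = "ATGGCC" + "GTAAGTTTCAG" + "GGATAA"
intron = (6, 17)

print(follows_gt_ag(gene[6:17]))
print(splice(gene, [intron]))
print(reverse_complement(gene[6:17]))
```

On the opposite strand the same intron reads CT...AC, which is why the rule mentions reverse complements.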
Figure: How DNA becomes a protein. RNA processing: removal of introns, addition of 5’ cap and poly-A tail.
mRNA codes for amino acids (the building blocks of proteins) in stretches of three nucleotides called
codons. Each codon specifies either an amino acid or a stop signal that tells the ribosome
to stop translation. tRNAs carry specific amino acids and interact with ribosomes to deliver the specified
amino acid to the growing chain of amino acids (called a polypeptide). The translation of nucleotide to
amino acid sequences follows a standard genetic code.
Use the genetic code to translate the following DNA: ATG TTG CGA TGA
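After working the exercise by hand, the same lookup can be checked with a short program. The sketch below builds the standard genetic code as a Python dictionary and translates a coding sequence codon by codon until a stop codon. The example sequence is deliberately different from the exercise above, and the function and variable names are choices made for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence (DNA or mRNA letters) until the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon does not add an amino acid
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))
```

The output uses one-letter amino acid abbreviations (M for methionine, A for alanine, K for lysine).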
Long stretches of RNA that translate into amino acids without a stop codon (known as open reading
frames), start codons, intron/exon boundaries, and other gene specific features form clues as to where
genes are located in eukaryotic genomes. Finding genes in prokaryotic genomes is much easier because
there are no introns and there is little intergenic DNA. For eukaryotes, annotating tens of thousands of
genes by hand is an extremely time consuming process, in particular because DNA has to be examined
forwards and backwards, and for different starting points. Initial gene prediction is now accomplished
by computer programs, but these predictions still need to be hand-checked for quality, especially for
genes of highest interest.
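To see why DNA must be examined "forwards and backwards, and for different starting points," here is a minimal sketch of a six-frame open reading frame scan. It is a teaching toy under simple assumptions (an ORF is an ATG followed in the same frame by a stop codon, with no introns), not how a real gene finder works. Coordinates on the "-" strand are positions in the reverse-complemented sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def find_orfs(dna, min_codons=3):
    """List (strand, frame, start, end, sequence) for ATG...stop ORFs in all six frames.

    min_codons counts the start and stop codons too.
    """
    orfs = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, seq[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


# Invented example: the only ORF is in frame 2 of the forward strand.
print(find_orfs("AAATGGCCTGATT"))
```

Real scaffolds are thousands of bases long, so a scan like this returns many short ORFs by chance, which is one reason additional evidence is needed.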
Gene Annotation Lab Name: ________________________
The following is the nucleotide sequence of the human β-globin gene. Gene regions, such as those
transcribed but not translated, are indicated by text formatting. Nucleotide positions in the DNA
are shown in the right hand margin.
ATATCTTAGA GGGAGGGCTG AGGGTTTGAA GTCCAACTCC TAAGCCAGTG
CCAGAAGAGC CAAGGACAGG TACGGCTGTC ATCACTTA
CTCCCAGGAG CAGGGAGGGC 101
GCTTCT GACACAACTG TGTTCACTAG CAACCTCAA
TGCACCTG ACTCCTGAGG AGAAGTCTGC CGTTACTGCC CTGTGGGGCA
GGATGAAGTT GGTGGTGAGG CCCTGGGCAG
AGGTTACAAG ACAGGTTTAA GGAGACCAAT AGAAACTGGG CATGTGGAGA 351
CAGAGAAGAC TCTTGGGTTT CTGATAGG
GCTGCTGGTG GTCTACCCTT GGACCCAGAG
TTGGGG ATCTGTCCAC TCCTGATGCT GTTATGGGCA
ACCCTAAGGT GAAGGCTCAT GGCAAGAAAG TGCTCGGTGC CTTTAGTGAT
GGCCTGGCTC ACCTGGACAA CCTCAAGGGC ACCTTTGCCA CACTGAGTGA
GCTGCACTGT GACAAGCTGC ACGTGGATCC TGAGAACTTC AGG
TT CTTTCCCCTT CTTTTCTATG GTTAAGTTCA 701
TGTCATAGGA AGGGGAGAAG TAACAGGGTA CAGTTTAGAA TGGGAAACAG 751
ACGAATGATT GCATCAGTGT GGAAGTCTCA GGATCGTTTT AGTTTCTTTT 801
ATTTGCTGTT CATAACAATT GTTTTCTTTT GTTTATTCTT GCTTTCTTTT 851
TTTTTCTTCT CCGCAATTTT T
ACTATTATA CTTAATGCCT TAACATTGTG 901
TATAACAAAA GGAAATATCT CTGAGATACA TTAAGTAACT TAAAAAAAAA 951
CTTACACAGT CTGCCTAGTA CATTACTATT TGGAATATAT GTGTGCTTAT 1001
TTGCATATTC ATAATCTCCC TACTTTATTT TCTTTTATTT TTAATTGATA 1051
ATGGGTTAAA GTGTAATGTT TTAATATGTG 1101
TACACATATT GACCAAATCA GGGTAATTTT GCATTTGTAA TTTTAAAAAA 1151
TGCTTTCTTC TTTTAATATA CTTTTTGTTT ATCTTATTTC TAATACTTTC 1201
CCTAATCTCT TTCTTTCAGG GCAATAATGA TACAATGTAT CATGCCTCTT 1251
ATAA CAGTGATAAT TTCTGGGTTA AGGCAATAGC 1301
AATATTTCTG CATATAAATA TTTCTGCATA TAAATTGTAA CTGATGTAAG 1351
AGGTTTCATA TTGCTAATAG CAGCTACAAT CCAGCTACCA TTCTGCTTTT 1401
ATTTTATGGT TGGGATAAGG CTGGATTATT CTGAGTCCAA GCTAGGCCCT 1451
ACGTGCTGGT CTGTGTGCTG GCCCATCACT TTGGCAAAGA ATTCATCCCA
CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG CTAATGCCCT
G CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 1651
T GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA 1701
GGGCCTTGAG CATCTGGATT CTGCCT
CAATGATGTA TTTAAATTAT TTCTGAATAT TTTACTAAAA AGGGAATGTG 1801
GGAGGTCAGT GCATTTAAAA CATAAAGAAA TGATGAGCTG TTCAAACCTT 1851
AATAC ACTATATCTT AAACTCCATG AAAGAA
1. Without looking, redraw the sketch of a eukaryotic gene, pre-mRNA, and mature mRNA in the space
below. Then correct your own work.
2. Circle and label the following in the sequence of β-globin (opposite page) and in the sketch of the
gene above.
A. Transcription start point
B. Transcription end point
C. Translation start point
D. Translation end point
3. Explain what evidence (i.e. signals in the DNA sequence) you used to support your answers to (2).
4. Based on the sequence information provided above, draw a sketch of the human β-globin gene below,
labeling the promoter, 5’ UTR, coding sequences, introns, and 3’ UTR. Make sizes roughly proportional to
their lengths.
5. What would be the first three and the last three amino acids produced from the mRNA transcribed from this
gene? Note: a stop codon does not produce an amino acid.
6. Is a mutation more likely to disrupt the function of the protein produced if it occurs in an intron or
in a coding sequence? Explain.
What can comparison of DNA sequences tell us about evolution?
Goal: To give an understanding of how DNA sequences shed light on evolution.
Species evolve over time. Evolution is the consequence of the interactions of 1) the potential for a species
to increase its numbers, 2) the genetic variability of offspring due to mutation and recombination of genes,
3) a finite supply of the resources required for life, and 4) the ensuing selection by the environment of
those offspring better able to survive and leave offspring. The great diversity of organisms is the result of
more than 3.5 billion years of evolution that has filled every available niche with life forms. Natural
selection and its evolutionary consequences provide a scientific explanation for the fossil record of ancient
life forms as well as for the striking molecular similarities observed among the diverse species of living
organisms. The millions of different species of plants, animals, and microorganisms that live on earth
today are related by descent from common ancestors. Biological classifications are based on how
organisms are related. Organisms are classified into a hierarchy of groups and subgroups based on
similarities which reflect their evolutionary relationships.
National Science Education Standards, p. 185
Whole genome sequences from many different species have been aligned by researchers. Now that
you’ve gained some experience in understanding the fine structure of a gene, let’s see what the same
gene, β-globin, looks like when compared among many different species. Do you think it is possible
for mutations at a single gene to reveal phylogenetic relationships among species?
Go to the UCSC genome browser web page at genome.ucsc.edu, and select Genomes.
Select the clade “Mammal”, genome “Human”, clear all text from “position or search term,” and
enter the abbreviation for human beta globin “HBB.”
You will get a list of search results; choose HBB: Homo Sapiens Hemoglobin Beta. You will now
see a close up view of the HBB gene in humans. Some sections on your screen may look
different than below, but the top should be similar. Find the browser navigation tools, exact
location of the gene on chromosome 11, the sketch of the gene, and miscellaneous information
tracks below the gene sketch.
The beta globin gene is in the reverse orientation in this view of the chromosome (arrows on the
sketch point to the left). Before looking deeper, scroll down and reverse your orientation of the
gene so that it matches the orientation of the gene from the hand annotation exercise you
performed earlier (Lab 1, Question 4).
Scroll up to the picture and see if you can reconcile the genome browser’s gene sketch with your
hand sketch of this gene. How are UTRs, coding sequences, and introns depicted in the
browser? Roughly redraw the sketch below, and label the major pieces.
Scroll down to the Comparative Genomics track controls and adjust the settings so that
“conservation” is set to “full.” This will give your browser the most expanded view of the
alignment among many different species, which allows you to see how similar (i.e. conserved)
each nucleotide is among many different vertebrates. Then hit “refresh” and scroll back up to
view the changes.
Under “Multiz alignment of 46 (your number might differ) species” you will see vertical bars
corresponding to each nucleotide location in the genome. The taller the bars, the more
conserved the DNA sequence is at that location in pairwise comparison of the human versus the
species identified on the left of the screen.
To complete the lab, continue from this point to answer the “Evolution” questions below.
Which species on your screen looks most similar to the human sequence? Does that make
phylogenetic sense? Explain.
Does it look like some regions of the gene are more conserved than others?
What evidence supports your answer?
Which regions of the gene appear more conserved? Why might that be the case?
Zoom in all the way to the “base level” so that the view shows
the location of the first exon/intron boundary.
Does the intron follow the GT/AG “rule”? Explain.
Based on the species you see in the browser viewer, what percent similar are the sequences
at each of the first four nucleotides of the intron?
What could explain the variation in conservation you see in (b)?
Write down the amino acid sequence for the six amino acids before the intron begins.
Are more phylogenetically similar species, like mammals, more similar to each other than they
are to fishes?
At what % of sites are all species in the group identical? What about mammals? Humans vs.
What could explain this?
Do you think this pattern is something unique to beta globin, or is it a general property of these
species’ genomes? Support your idea by going to the HBD (Homo sapiens hemoglobin delta)
gene and repeating the analysis.
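The "percent similar" and "% of sites identical" questions above can also be answered by counting. The sketch below shows one way to do the arithmetic on a small alignment typed in by hand. The four sequences and their labels are invented placeholders, not real browser data, and treating a gap as a mismatch is an assumption of this example.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions where two sequences have the same base.

    Assumption: gaps ("-") never count as matches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(a == b and a != "-" for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def percent_fully_conserved(sequences):
    """Percent of alignment columns where every sequence has the same character."""
    columns = list(zip(*sequences))
    identical = sum(len(set(column)) == 1 for column in columns)
    return 100.0 * identical / len(columns)


# Invented placeholder alignment; copy real bases from the browser to use this.
alignment = {
    "species_1": "GTAAGT",
    "species_2": "GTAAGC",
    "species_3": "GTGAGT",
    "species_4": "GTCAGA",
}

print(round(percent_identity(alignment["species_1"], alignment["species_2"]), 1))
print(round(percent_fully_conserved(list(alignment.values())), 1))
```

Running the second function on only the mammals, and then on all species, gives the comparison the questions ask for.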
Lab 3: How can I use bioinformatics to annotate genes?
Goal: To give an understanding of how raw data and research help us predict genes in a genome.
Bioinformatics is a branch of biological science that uses computers to analyze biological data.
Bioinformatics normally deals with very large datasets (sometimes in excess of 500 GB) that would be
nearly impossible to generate and manage without the use of supercomputers. Recent advances in
sequencing technology are allowing more genomes to be sequenced each year. However, it takes ten
times as long to find genes and describe their functions (i.e. annotate a genome) as it does to
sequence it. As a result, there is a widening gap between sequenced genomes and annotated,
searchable genomes in publicly available databases.
The Maker Annotation Pipeline
(Note: the website does not work anymore.)
Maker is a genome annotation pipeline that seeks to close this gap. A pipeline is a series of programs
working together, like when you follow the different steps in a protocol to dissect a frog and label its
organs. Maker was optimized for use on non-model organisms.
Much has been learned about biology in the last fifty years from model species like the fruit fly
(Drosophila melanogaster), nematode worm (Caenorhabditis elegans), baker’s yeast (Saccharomyces
cerevisiae), and bacteria (Escherichia coli) that have small genomes, are easy to raise, have short
life cycles, make it easy to observe the connection between genotype and phenotype, and can be easily
manipulated genetically. However, because sequencing costs have dramatically decreased in the last
decade, genomes from species with interesting phenotypes are now being sequenced at increasing rates.
The Maker pipeline uses two kinds of information to make structural gene annotations from raw DNA:
intrinsic signals in the organism’s DNA, which are found by gene finders, and extrinsic evidence,
which is evidence supplied to Maker based on similarity of genomic regions to other organisms’ mRNA
(also known as expressed sequence tags, or ESTs) and protein sequences.
The first step of the multi-step pipeline involves finding repetitive DNA and
labeling (i.e. “masking”) the genomic DNA using the program RepeatMasker. Repeat masking is
important in order to prevent inserted viral sequences from being counted as fish exons.
A repeat library was developed specifically from the rockfish genome.
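The idea of masking can be illustrated with a toy example. The sketch below is not RepeatMasker and does not use a repeat library. As a stand-in, it finds simple tandem repeats with a regular expression (a motif of 1 to 6 bases repeated at least four times, thresholds chosen arbitrarily for this example) and replaces them with N so that later steps ignore them.

```python
import re

# Stand-in for a repeat library: a motif of 1-6 bases followed by
# at least three more copies of itself (four or more copies in total).
TANDEM_REPEAT = re.compile(r"([ACGT]{1,6}?)\1{3,}")


def find_tandem_repeats(dna):
    """Return (start, end) intervals, 0-based and end exclusive, of simple tandem repeats."""
    return [match.span() for match in TANDEM_REPEAT.finditer(dna.upper())]


def mask_intervals(dna, intervals, mask_char="N"):
    """Replace every base inside the given intervals with mask_char."""
    masked = list(dna)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(dna))):
            masked[i] = mask_char
    return "".join(masked)


# Invented example containing the microsatellite ACACACAC.
scaffold = "ATGACACACACGGT"
repeats = find_tandem_repeats(scaffold)
print(repeats)
print(mask_intervals(scaffold, repeats))
```

Masked bases stay in place, so the coordinates of everything else on the scaffold are unchanged.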
Gene finders such as SNAP can detect the intrinsic signals in the DNA and model genes. Because “what
a gene looks like” differs among genomes, gene finders must be “trained” to know what the signals look
like in each new species sequenced. The program BLAST (Basic Local Alignment Search Tool) is used to
find similarity of genome sequence to public mRNA and protein sequences. There are different kinds of
BLAST searches depending on what kind of sequences (DNA, RNA, protein) are being compared. A further
alignment program helps to polish up the BLAST alignments, since “local” alignments of sequences end
wherever similarity between sequences begins to decrease. The final annotations predicted by Maker are
those that are supported by both kinds of information, intrinsic signals and extrinsic evidence.
The object of this exercise is to use Maker, a cutting edge research tool, to predict genes from a
small section of the rockfish genome using current methodologies.
1) Retrieve Scaffold folder from public drive.
2) Go to
3) Click on the link to BLASTx.
4) Upload the scaffold.
5) Change the database to UniProtKB/Swiss-Prot.
6) Click the BLAST button. BLAST will take several minutes to run. It will search the Swiss-Prot
database for matches to your query based on local sequence similarity.
7) Select all of the sequences, then click “Get selected sequences.”
8) Select view as FASTA and select the first 10 results.
9) Copy the results to a file and save as ProteinEvidence.fasta.
Thought Question: What protein appears most in the search results? Do a quick internet search. What
does this protein seem to do?
10) Go to
11) Click on the link to t
12) Upload the scaffold.
13) Change the database to expressed sequence tags (EST). Enter “Sebastes” in the Organism line.
14) Click the BLAST button. This may take several minutes.
15) Select 10 of the sequences, 2 from each “column” of alignments in the picture at the top. Clicking on
any of the lines under the long bar will take you straight to that entry, where you can check the box.
Then click “Get selected sequences.”
16) Select view as FASTA and select the first 15 results.
17) Copy the results to a file and save as ESTEvidence.fasta.
18) Go to
19) Click new guest account. Remember to write your guest number down.
20) Click on the manage files link next to new job.
21) Upload the scaffold file, the protein evidence, and the EST evidence as FASTA files.
22) Upload the HMM file (on public drive) as a SNAP HMM file.
23) Go back to the new jobs tab.
24) Upload the Sequence file in the “Choose a genome fasta file” menu.
25) Upload the EST file as ESTs from a related organism.
26) Upload the Protein file.
27) Upload the SNAP file.
28) Set “Consider single exon EST evidence when generating annotations” to yes.
29) Click “Add Job to Queue”.
30) Once Maker has finished, click the icon in the view results tab.
32) Click the “View in Apollo” button.
33) Select “open with Java Web Start Launcher”.
34) Select run to launch Apollo.
35) Right click the Maker annotated gene and select “Sequence”.
36) Select Peptide Sequence and highlight the sequence.
37) Go to
38) Select protein BLAST.
39) Paste in the sequence, change the database to Uniprot/Swissprot, and click BLAST.
40) Determine the identity of the gene based on the most similar, significant BLAST hit. The E-value
of the “query” sequence you entered against a database “subject” is a measure of the probability
that the hit was random. Values less than 10 x 10
are considered reliable indicators of significant
41) Complete the worksheet below to finish the lab.
Have the students annotate the genes with and without the gene finder, the EST
evidence, or the protein evidence to see which most strongly affects the resulting protein sequence.
Measure percent overlap of different gene annotations to see how similar they are.
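One possible way to measure percent overlap is sketched below. The definition used here is an assumption made for this example: the number of bases covered by exons in both annotations, divided by the number of bases covered by exons in either one. Other definitions are equally reasonable. Exon coordinates are 0-based with an exclusive end, and the example coordinates are invented.

```python
def covered_positions(exons):
    """Set of base positions covered by a list of (start, end) exons."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end))
    return positions


def percent_overlap(annotation_a, annotation_b):
    """Shared exon bases as a percent of all exon bases in either annotation."""
    a = covered_positions(annotation_a)
    b = covered_positions(annotation_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Invented exon coordinates for the same gene annotated two ways.
with_est_evidence = [(0, 100), (200, 300)]
without_est_evidence = [(50, 100), (200, 300)]

print(percent_overlap(with_est_evidence, without_est_evidence))
```

Storing every position in a set is simple but slow for very long genes; it is adequate for the short scaffolds used in this lab.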
1) Paste a screen shot of your Apollo result in the space below:
2) How many genes were annotated by MAKER in the scaffold?
3) If there are any predicted genes, which appear complete, that is, beginning with ATG and ending with
a stop codon?
4) How many exons are in each gene supported by MAKER?
5) How long are the introns on average, and how many are there per gene?
6) Do all introns follow the GT/AG rule?
7) For which genes were UTRs predicted? Name the genes and UTR types.
8) Sometimes assembly of genomes from many small pieces results in chimeric sequences (recall from
mythology that a chimera is a monster composed of pieces of different animals). Do your BLAST results
suggest that the gene MAKER predicted is such a monster?
9) Describe the protein from the strongest protein BLAST hit. Can you assign a putative function to the
gene based on this information?
10) Research how the gene is important to the organism’s development, reproduction, and/or survival.
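The counting questions on this worksheet (number of exons, number and average length of introns, GT/AG rule) can be double-checked with a few lines of code once you have written down the exon coordinates. The sketch below assumes a gene on the forward strand, exon coordinates that are 0-based with an exclusive end, and an invented toy scaffold; it does not read Apollo or MAKER output files.

```python
def introns_from_exons(exons):
    """Introns are the gaps between consecutive exons of one gene."""
    ordered = sorted(exons)
    return [
        (previous_end, next_start)
        for (_, previous_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def summarize_gene(scaffold, exons):
    """Count exons and introns, average intron length, and check the GT/AG rule."""
    introns = introns_from_exons(exons)
    lengths = [end - start for start, end in introns]
    return {
        "exon_count": len(exons),
        "intron_count": len(introns),
        "mean_intron_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "all_gt_ag": all(
            scaffold[start:start + 2] == "GT" and scaffold[end - 2:end] == "AG"
            for start, end in introns
        ),
    }


# Invented toy scaffold: three exons separated by two introns.
scaffold = "ATGGCC" + "GTAAGTTTCAG" + "GGATCC" + "GTCCCAG" + "TTTTAA"
exons = [(0, 6), (17, 23), (30, 36)]

print(summarize_gene(scaffold, exons))
```

For a gene on the reverse strand, take the reverse complement of the scaffold region first so that introns again read GT...AG.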
What does a typical rockfish gene look like?
Goal: Use skills learned in previous labs and classes to complete a gene annotation using an
Each student should take a scaffold and repeat the annotation process above. There are 50
random scaffolds provided (ask your instructor for the file location). Two students
should independently annotate the same scaffold. Each individual should complete the worksheet with
their scaffold. Compare your results to another student who annotated the same scaffold.
Each pair of students with the same scaffold will give a 15 minute presentation on
their annotations. Discuss your worksheet results, annotations, the features of those annotations, what
the predicted genes are and why they are important. Be sure to support your points with citations and
evidence and to include tables and figures where appropriate. The scaffolds may contain single exon
genes, multi exon genes, one gene per scaffold, more than one gene per scaffold, or simply no genes.
As a group, discuss what the general features of all of the scaffolds say about eukaryotic gene structure.
Can you classify different kinds of rockfish genes based on the results you’ve discovered?
As of the writing of this manual, scaffolds containing age-related genes have not yet been identified in
the flag rockfish or its sister species that lives 10X longer than it. Inquire to see if these are now
available. There are hundreds of candidate aging genes that are found in
both humans and rockfishes. Students could annotate the same gene from the two species to see if
there are differences that represent a “smoking gun” that might explain negligible senescence in the
longer-lived species.