Genomics (BIO 294) Laboratory, Week
, Spring 2011
GOAL: To get an introduction to the Ensembl database.
Revisit the Entrez Gene website before the lab, and look over your documentation of your
exploration of last week, to remind yourself how that database is structured and what
information you found there.
At the end of class, email me documentation of your exploration.
The Ensembl database is the central genome database maintained by the European Bioinformatics
Institute and the Wellcome Trust Sanger Institute. The two main entry points are by
(1) searching for specific genes, which takes you to the Ensembl equivalents of the Entrez Gene
page, or by (2) browsing genomes, which takes you to the Ensembl equivalents of the Entrez Map
Viewer. In this exploration, we will be accessing it by searching for specific genes.
Just like last week, you will work with your lab partner to familiarize yourself with the
information about genes available in Ensembl. And just like Entrez Gene, the Ensembl pages for
a gene can be overwhelming at first, given the amount and diversity of information spread across
an organized series of pages. While you and your partner introduce yourselves to this database,
one of you should document the steps by which you have navigated through the Ensembl database
and what information has been displayed on each page.
After we have all explored different
genes, we will get together as a group and exchange notes on what we have found.
You may be interested in watching a ten-minute video introduction to Ensembl.
The simplest way to find your gene is to select the same organism’s genome that you studied last
week, and then enter the gene’s name in the search box (for example, the name for the human
phosphofructokinase gene that is active in liver cells is PFKL). Clicking Go will take you to the
Results Summary, and then clicking on the appropriate Gene result will take you to the Result in
Detail page, where clicking on the Ensembl protein_coding Gene link will take you to the page
for that gene.
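
For anyone who would like a scripted version of the same lookup, the sketch below builds the
address for a gene-symbol query against Ensembl's public REST service. It is an illustration
only, not part of the lab procedure: the host rest.ensembl.org and the lookup/symbol endpoint
are assumptions about the present-day service rather than anything described in this handout,
and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Ensembl's public REST service is reachable at this host.
ENSEMBL_REST = "https://rest.ensembl.org"


def build_lookup_url(species, symbol):
    """Return the REST address that looks up a gene by species and symbol."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(symbol)
    )
    # Ask for a JSON reply instead of the default format.
    query = urllib.parse.urlencode({"content-type": "application/json"})
    return ENSEMBL_REST + path + "?" + query


def fetch_gene(species, symbol):
    """Fetch the gene record as a dict (requires a network connection)."""
    with urllib.request.urlopen(build_lookup_url(species, symbol)) as response:
        return json.loads(response.read().decode("utf-8"))


# Example: the human liver phosphofructokinase gene mentioned above.
url = build_lookup_url("homo_sapiens", "PFKL")
print(url)
```

Calling fetch_gene("homo_sapiens", "PFKL") on a networked computer should return a record for
the gene, but the web pages remain the focus of this lab.
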
ENSEMBL INFORMATION ON YOUR GENE
Working with your partner, explore the Ensembl database listing for your gene. You may find it
helpful to have one partner handle the mouse and the other type up the documentation in a Word
document on an adjacent computer.
While the Ensembl database listing for a gene has some of the same information that is found in
Entrez Gene, it is organized completely differently, and has different strengths and weaknesses.
Note the tabs at the top of the page. When you first get to the page for your gene, there
should be three tabs. The leftmost tab is for the human genome taken as a whole. The
next tab is a map of part of the chromosome on which the gene is found. Depending on
how much time we have, you may explore that later. The third tab is for the gene itself.
Depending on what links you click on, additional tabs (for example, one for an individual
transcript) may appear later.
Within each tab, there is a hierarchical table of contents on the left. You can navigate
through the pages for a gene, transcript, or protein by clicking on items within this table
of contents, or by clicking on the forward and back buttons within each page (see below).
The pages within this table of contents differ from gene to gene, depending on
how much information is available for that gene.
In comparison to Entrez Gene, the Ensembl database has (a) more pages with less
information on each page, (b) more graphical representations, and (c) fewer links to other
databases. The graphical nature of the pages can make it fun to navigate, but it may take
time to make sense of all of the graphics. Clicking on many of the items within any given
graphic will bring up an infobox with information on that item and/or links to further
information.
At the top of the Gene page is a table containing lists of transcripts and proteins.
Clicking on any of these will bring up the corresponding Transcript tab (the protein
information is partway through the Transcript tab table of contents). But before you go
there, explore the Gene pages.
If you scroll down below that table, you will find the Gene Summary. Here and
elsewhere, you can click on the he!p button for more information. Try clicking on it
whenever you find yourself perplexed by what you’re looking at.
Just as in the case of the Entrez Gene database, the Ensembl database has a wide
array of information, some of which you may be able to make some sense of (don’t forget
the he!p button), and some of which will just perplex you. Don’t give up right away when
something is unclear, but also don’t spend forever wrestling with a single page. I may be
able to help you if you’re interested in something but can’t make sense of it, but frankly I
haven’t yet explored all the dark corners of the Ensembl database.
To the right of the he!p button is the button to navigate forward to the next page in the
hierarchical table of contents (in this case, it is Splice variants).
You can customize the information on any page by clicking on the Configure this page
button on the left, just below the table of contents, and clicking on any of the listed items to
remove them from or add them to the display. In some cases (such as External Data and
Genomic Alignments), you have to first configure the page before it will display anything.
Clicking on the graphics will often bring up infoboxes with links that can take you to
more information on that feature. Try doing this a lot.
Often, the easiest way to get back to where you were is to use the back arrow in your
browser.
Once you’ve explored the Gene-based displays to your heart’s content, have a look at one
or more individual transcripts. In the table at the top of each Gene page, transcripts are
listed by length in both base pairs and amino acids (if applicable), and by biotype. There
should be at least one protein-coding transcript for your gene. These are the most relevant.
If there is more than one protein-coding transcript, make a note of how they differ. Are
there skipped or alternative introns? Alternative start or stop codons?
The best place to get this information is on the Gene Summary or Splice Variants pages in
the Gene tab.
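
If you copy exon coordinates for two transcripts into your notes, the comparison described
above can also be done mechanically. The sketch below is a hypothetical illustration: the
coordinates are made up for the example rather than taken from Ensembl, and the function name
is an assumption.

```python
def compare_exons(exons_a, exons_b):
    """Split two transcripts' exons into shared and transcript-specific lists.

    Each exon is a (start, end) pair of genomic coordinates.
    """
    set_a, set_b = set(exons_a), set(exons_b)
    return {
        "shared": sorted(set_a & set_b),
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
    }


# Made-up coordinates for two hypothetical transcripts of one gene.
transcript_1 = [(100, 200), (300, 400), (500, 600)]
transcript_2 = [(100, 200), (500, 600), (700, 800)]

differences = compare_exons(transcript_1, transcript_2)
for label, exons in differences.items():
    print(label, exons)
```

Exons that fall under only_a or only_b mark the regions worth a closer look in the graphic
when you note how the transcripts differ.
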
You can click through to Transcript-based displays either by clicking on the Transcript
ID for one of the transcripts in the table, or by clicking on that transcript in the graphic,
and then clicking on the Transcript number in the infobox that appears. Once you have
done so, the table of contents to the left of the page will show the available
Transcript-based display pages. You can navigate through these either by using the table of
contents or by clicking the forward and back buttons on each of the display pages.
You may sometimes get error messages instead of an image when you click on a
particular page. If so, try clicking to that page a few more times and you might succeed,
or you may keep getting the same error message. You could give up, or move on to
other things and try coming back to that page later.
Eventually, clicking through the transcript pages will bring you to the Protein Summary
page, which you could also have gotten to directly by clicking on the Protein ID for a
protein-coding transcript instead of the Transcript ID. If your gene has a Domains &
Features page in the Protein Information set of pages, be sure to click through to and
explore the InterPro page for one or more of your protein’s domains. We’ll be coming
back to InterPro and analogous protein databases later this semester, to explore them in
more detail.