Putting TAIR to work for you: Hands-on workshop for beginning and advanced users

Oct 28, 2013

Part I: Introduction to getting information at TAIR

--------------------------------------------------------------------------------

1. How do you find a desired gene / gene model / protein in TAIR?

A. You are interested in a specific locus (based on candidate gene search, a paper, etc.) - how do you get to this Locus page?

1. Quick Search (this is a "contains" search!). Enter locus name in search box.

If you want to do an exact name search, choose 'Exact name search' from the Quick Search dropdown menu.

2. For a more advanced search, choose 'Genes' from the dropdown SEARCH menu on the TAIR homepage:

http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene

B. You have a sequence from another species - how do you find a related gene at TAIR?

1. BLAST (protein or DNA sequence) (http://www.arabidopsis.org/Blast/index.jsp) (Tools -> BLAST)

- WuBLAST or Fasta searches are also available in Tools menu

2. Click on the blue locus link in the list of results to go to the locus page.

C. How do you learn about specific gene models and proteins?

1. Start at a locus page (e.g. http://www.arabidopsis.org/servlets/TairObject?id=128061&type=locus)
a. To get to a specific gene model page, e.g. AT4G35790.1, click on the AT4G35790.1 link near the top of the locus page. (http://www.arabidopsis.org/servlets/TairObject?id=129396&type=gene)

2. To get to a specific protein model page, e.g. AT4G35790.1, scroll down the locus page to the Protein Data section and click on the AT4G35790.1 link. You can also get to the protein from the gene model page. (http://www.arabidopsis.org/servlets/TairObject?type=aa_sequence&id=1009124182)
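
The examples above use two kinds of identifier: a locus name (AT4G35790) and a gene model name (AT4G35790.1). As a rough sketch, assuming the usual AGI naming pattern ("AT", a chromosome code, "G", five digits, and an optional ".N" gene model suffix), a small Python helper can split one from the other. The function name `split_agi` is hypothetical and is not part of any TAIR tool.

```python
import re

# Assumed AGI pattern: "AT", chromosome (1-5, C or M), "G", five digits,
# and an optional ".N" suffix that marks a gene model.
AGI_PATTERN = re.compile(r"^(AT[1-5CM]G\d{5})(?:\.(\d+))?$", re.IGNORECASE)


def split_agi(identifier):
    """Return (locus, model_number); model_number is None for a bare locus."""
    match = AGI_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not an AGI-style identifier: {identifier!r}")
    locus = match.group(1).upper()
    model = int(match.group(2)) if match.group(2) else None
    return locus, model


if __name__ == "__main__":
    print(split_agi("AT4G35790.1"))  # gene model name
    print(split_agi("AT4g35790"))    # locus name, mixed case
```

Normalizing to upper case is a design choice made here so that AT4g35790 and AT4G35790 compare equal in later steps.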

--------------------------------------------------------------------------------

2. How do you find the important information about a gene?

A. How do you find the function of the gene?



1. Locus page: (e.g.: http://www.arabidopsis.org/servlets/TairObject?type=locus&name=AT4g35790)
a. Curator description
b. Annotation section -> Annotation detail page (* find information in other species using the "GO database" link on the keyword detail page)
c. AraCyc link in External links section
d. Phenotypes of mutant germplasms
e. Interpro domains for representative gene model (* use protein pages to see domains associated with each gene model)
f. Publications

2. Textpresso: (http://www.textpresso.org/arabidopsis/)
a. Use quick search bar (choose "Textpresso full text")
b. Accessible under "Search" menu

3. AraCyc / PlantCyc:
a. Use quick search bar (choose "Metabolic Pathways")
b. AraCyc / PlantCyc search page (http://www.plantcyc.org:1555/ARA/server.html)
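
The locus page example in step 1 addresses a locus by name rather than by numeric id. As a minimal sketch, assuming that URL pattern (taken from this handout's example and possibly no longer served in this form), a locus page link can be built for any locus name. The helper name `locus_page_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL copied from the handout's locus page example.
TAIR_OBJECT_URL = "http://www.arabidopsis.org/servlets/TairObject"


def locus_page_url(locus_name):
    """Build a locus page URL of the form ...?type=locus&name=<locus_name>."""
    query = urlencode({"type": "locus", "name": locus_name})
    return f"{TAIR_OBJECT_URL}?{query}"


if __name__ == "__main__":
    for name in ("AT4g35790", "AT1G01010"):
        print(locus_page_url(name))
```

Using `urlencode` rather than string concatenation keeps the link valid if a name ever contains characters that need escaping.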


B. How do you find out where the gene is expressed and where the protein might be located in the cell?

1. Locus page - External links:
a. e-FP browser (expression and location)
b. Genevestigator (expression)
c. NASCarrays (expression)
d. AtGenExpress (expression)

2. Protein page
a. SUBA (location)

3. How do you get information from GBrowse?

A. What types of data can I find in GBrowse?

http://www.arabidopsis.org/cgi-bin/gbrowse/arabidopsis/

1. Many tracks for different data types:
a. ESTs and cDNAs
b. homologs of other model organisms
c. homologs of other plants
d. Brassica ESTs
e. peptides from Mass Spec experiments
f. T-DNAs
g. promoters
h. Vista Plot (sequence conservation)
i. gene structures predicted by other programs
j. polymorphisms
k. methylation data
l. sequence or GC content (depending on level of zoom -> DNA/GC track)
m. markers
n. 6-frame translation
o. many more . . . and the list keeps growing!

2. You can upload your own data, too!
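
To make the "6-frame translation" track in the list above more concrete, here is a small Python sketch of the idea: translate the sequence in three reading frames on the forward strand and three on the reverse complement. It assumes the standard genetic code and is not GBrowse's own implementation; the function names and the frame labels (+1 to -3) are illustrative choices.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Map frame labels (+1..+3, -1..-3) to translated peptides."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for label, peptide in six_frame_translation("ATGGCCTAA").items():
        print(label, peptide)
```

Stop codons are shown as `*`, so an open reading frame appears as a run of letters between two asterisks.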



B. How can I get the DNA sequence information?

1. *** Download decorated fasta files
2. Use the DNA/GC track under “DNA” and zoom in
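
Once a FASTA file has been downloaded, the sequence can be read and its GC content computed locally. The sketch below assumes a plain FASTA layout (a ">" header line followed by sequence lines); the exact markup of a "decorated" file is not described here, so any case changes in the sequence are simply normalized. The function names are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_content(sequence):
    """Fraction of G and C among the A/C/G/T bases (0.0 for no bases)."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        return 0.0
    return sum(base in "GC" for base in bases) / len(bases)


if __name__ == "__main__":
    demo = [">seq1 demo", "ATGC", "ggcc", "", ">seq2", "AATT"]
    for name, seq in read_fasta(demo):
        print(name, len(seq), gc_content(seq))
```

To use it on a real download, pass an open file object instead of the demo list, for example `read_fasta(open("my_region.fasta"))`, where the file name is a placeholder.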

--------------------------------------------------------------------------------

4. How do you find and/or work with data sets in TAIR and the PMN? (kate)

A. How do you generate data sets based on: ______________?

1. Sequence
a. BLAST: http://www.arabidopsis.org/Blast/
b. WU-BLAST2: http://www.arabidopsis.org/wublast/index2.jsp
c. FASTA alignment tool: http://www.arabidopsis.org/cgi-bin/fasta/nph-TAIRfasta.pl

2. DNA or protein sequence motif (e.g. consensus binding sequence for a transcription factor; known amino acid motif in a new protein-protein interaction domain . . .)
a. Patmatch: http://www.arabidopsis.org/cgi-bin/patmatch/nph-patmatch.pl
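
To illustrate what a motif search does, the following Python sketch expands a degenerate nucleotide motif written with IUPAC codes into a regular expression and reports every match, including overlapping ones. This is not Patmatch's own pattern syntax or implementation; it only searches the forward strand, and the motif and sequence in the demo are made-up examples.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'CANNTG' into a regex string."""
    return "".join(IUPAC[symbol] for symbol in motif.upper())


def find_motif(motif, sequence):
    """Return (0-based start, matched text) for every hit, overlaps included."""
    # A lookahead lets the scan advance one base at a time.
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    for start, hit in find_motif("CANNTG", "TTCACGTGAACAGCTGTT"):
        print(start, hit)
```

Searching the reverse strand as well would mean running the same function on the reverse complement of the sequence.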


3. Protein domain or biochemical property:
a. Protein Search (limit by domain or by pI, MW, etc.): http://www.arabidopsis.org/servlets/Search?action=new_search&type=protein

4. Function, location, involvement in a biological process (GO term):
a. Gene search (limit by GO terms): http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene
b. Keyword search: http://www.arabidopsis.org/servlets/Search?action=new_search&type=keyword

5. Expression Pattern:
a. Gene search (limit by PO terms): http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene
b. Keyword search: http://www.arabidopsis.org/servlets/Search?action=new_search&type=keyword

6. Biochemical pathway
a. Advanced Query: http://www.plantcyc.org:1555/ARA/query.html
b. Pathway page
i. Search using Quick Search or use search page: http://www.plantcyc.org:1555/ARA/server.html?
ii. On Pathway page, use "Download Genes" button (e.g.: http://www.plantcyc.org:1555/ARA/NEW-IMAGE?type=PATHWAY&object=PWY-3561)

7. Mapping region (candidate genes)
a. Gene search (limit by location in genome based on marker or map position): http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene
b. Seqviewer: In "Close-up View" click the button for "List Genes in Range." (e.g. http://www.arabidopsis.org/servlets/sv?click.x=378&click.y=45&action=click&zooms=3&box0=on&box1=on&box2=on&box3=on&box4=on&box5=on)

8. Phenotype:
a. Seeds/Germplasm search (limit by phenotype): http://www.arabidopsis.org/servlets/Search?action=new_search&type=germplasm
b. Gene search (search by phenotype, use "contains"): http://www.arabidopsis.org/servlets/Search?action=new_search&type=gene

9. Gene families:
a. Gene Family pages (under "Browse" menu -> "Gene Families"): http://www.arabidopsis.org/browse/genefamily/index.jsp
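
Most of the search pages listed in this section share one URL pattern and differ only in the `type` parameter (gene, protein, keyword, germplasm). As a sketch under that assumption, drawn from the links in this handout and not from any documented API, the links can be generated rather than typed. The names `SEARCH_TYPES` and `search_page_url` are hypothetical.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://www.arabidopsis.org/servlets/Search"
# Search types that appear in this handout's links.
SEARCH_TYPES = ("gene", "protein", "keyword", "germplasm")


def search_page_url(search_type):
    """Return the new-search page URL for one of the known search types."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search_type!r}")
    query = urlencode({"action": "new_search", "type": search_type})
    return f"{SEARCH_URL}?{query}"


if __name__ == "__main__":
    for search_type in SEARCH_TYPES:
        print(search_type, search_page_url(search_type))
```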


B. How do you get more information about your data set?

1. Download information for a set of genes or proteins (Under "Downloads" -> "Bulk Data Retrieval"): http://www.arabidopsis.org/tools/bulk/index.jsp
a. Nucleotide and Protein Sequences (coding region, promoters, UTRs): http://www.arabidopsis.org/tools/bulk/sequences/index.jsp
b. Protein domains, MWs, Uniprot IDs, predicted # of transmembrane domains, etc.: http://www.arabidopsis.org/tools/bulk/protein/index.jsp
c. Gene descriptions: http://www.arabidopsis.org/tools/bulk/genes/index.jsp
d. GO annotations: http://www.arabidopsis.org/tools/bulk/go/index.jsp
e. Microarray elements: http://www.arabidopsis.org/tools/bulk/microarray/index.jsp
f. Locus history: http://www.arabidopsis.org/tools/bulk/locushistory/index.jsp
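
Gene lists copied from spreadsheets or papers often contain duplicates, mixed case, and gene model suffixes. The sketch below tidies such a list before it is pasted into a bulk tool. It assumes the tool accepts one locus identifier per line, which is not stated in this handout, and the function name `prepare_locus_list` is hypothetical.

```python
import re

# Assumed AGI pattern, with an optional gene model suffix that is dropped.
AGI = re.compile(r"^(AT[1-5CM]G\d{5})(?:\.\d+)?$")


def prepare_locus_list(raw_text):
    """Return (valid_loci, rejected_tokens) from free-form pasted text."""
    valid, rejected, seen = [], [], set()
    for token in re.split(r"[\s,;]+", raw_text.strip()):
        if not token:
            continue
        match = AGI.match(token.upper())
        if match is None:
            rejected.append(token)  # keep for manual inspection
            continue
        locus = match.group(1)
        if locus not in seen:  # preserve first-seen order, drop repeats
            seen.add(locus)
            valid.append(locus)
    return valid, rejected


if __name__ == "__main__":
    loci, bad = prepare_locus_list("at4g35790, AT4G35790.1\nAT1G01010 foo")
    print("\n".join(loci))
    print("rejected:", bad)
```

Returning the rejected tokens separately, instead of silently dropping them, makes typos in the input list easy to spot.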


2. Mine FTP data sets (Under “Downloads”): ftp://ftp.arabidopsis.org/home/tair/
a. Genes
b. GO and PO Annotations
c. Maps
d. Metabolic Pathways
e. Proteins
f. Microarray Data
g. Sequences
h. User requests
i. Many more!



C. How do you “analyze” data sets or experimental data at TAIR and the PMN?

1. GO categorization
a. Create charts showing functional categorization of genes/proteins: http://www.arabidopsis.org/tools/bulk/go/index.jsp (Use: Functional Categorization)
- This can be compared to the whole genome in Arabidopsis (Use "Whole Genome Categorization")
b. AmiGO provides a tool for identifying over-represented terms: http://amigo.geneontology.org/cgi-bin/amigo/term_enrichment
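
"Over-represented" means a term annotates more genes in your set than chance would predict from the whole genome. One common way to quantify this is a one-sided hypergeometric test, sketched below in Python. This illustrates the general idea only; it is not necessarily the method the enrichment tool uses, it applies no multiple-testing correction, and the numbers in the demo are invented.

```python
from math import comb


def enrichment_p_value(genome_size, genome_with_term, set_size, set_with_term):
    """P(at least set_with_term annotated genes in a random set of set_size).

    Hypergeometric upper tail: the gene set is treated as drawn without
    replacement from a genome in which genome_with_term genes carry the term.
    """
    upper = min(genome_with_term, set_size)
    total = comb(genome_size, set_size)
    tail = sum(
        comb(genome_with_term, k)
        * comb(genome_size - genome_with_term, set_size - k)
        for k in range(set_with_term, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Invented toy numbers: 10 genes in total, 5 carry the term,
    # and 3 of the 4 genes in the set carry it.
    print(enrichment_p_value(10, 5, 4, 3))
```

A small p-value suggests the term is over-represented in the set; with many terms tested at once, a correction for multiple testing would normally follow.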


2. OMICs viewer
a. Use OMICs viewer to overlay transcript, proteomic, and metabolomic data on a metabolic pathway map: http://www.plantcyc.org:1555/ARA/expression.html





Thank you for coming!




*** If you have any questions ***

Please stay for Part II, come back tomorrow, schedule a meeting with me today . . . or send an e-mail to curator@arabidopsis.org or curator@plantcyc.org