Arabidopsis Co-express: A tool for large-scale mining of expression data


Arabidopsis Co-express
Srinivasan, et al.




Vinodh Srinivasan [1], Grier Page [1], and Ann E. Loraine [1,2,*]

[1] Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham

[2] Department of Genetics, Department of Computer and Information Sciences, and Comprehensive Cancer Center, University of Alabama at Birmingham

[*] To whom correspondence should be addressed


Abstract

Ever-expanding collections of publicly available microarray expression data offer researchers unprecedented opportunities for investigating genome-scale questions related to gene function and regulation, as well as more focused questions addressing specific biological processes and pathways. We have implemented a novel gene expression data-mining tool (http://www.ssg.uab.edu/coexpression) that compares patterns of correlated expression in experiments from the same array platform for sets of up to fifty query genes. The tool delivers a set of base results describing correlated expression between all genes represented on the array and a set of user-supplied query genes or probe sets. In addition, it performs pathway-level co-expression analysis, identifying genes connected to query genes in the co-expression network. Users receive data in tabular, XML, and Web page formats, allowing rapid exploration of results using a Web browser, Cytoscape, or other data-mining or visualization tools. To aid in downstream analysis by computational biologists and statisticians, we provide a suite of custom Python modules and tutorials demonstrating their use. A simple ReST-style Web service provides direct access to expression values, including expression values computed using three popular array processing and normalization methods. Two example analyses focusing on cellulose and glucosinolate biosynthesis pathways demonstrate biological applications for the tool.


Introduction

Availability of abundant, high-quality data sets from microarray expression experiments utilizing the ATH1 microarray from Affymetrix has generated a renaissance in the field of gene network analysis in Arabidopsis, an important model plant species. These data are making it possible to explore correlated expression patterns for the entire genome, as well as to answer focused questions regarding specific pathways and processes. The data are available from several different sources, notably the Nottingham Arabidopsis Stock Center AffyWatch service. For a small fee relative to the cost of performing the original experiments, researchers can obtain copies of the original, unprocessed “raw” data files (CEL files) on DVD or CD media. The ability to access the data in their unprocessed form has been a boon to the field; it has allowed numerous groups to investigate specific questions as well as build Web-accessible tools for data mining, visualization, and analysis.

Several groups have developed on-line, Web-based analysis tools that offer a variety of methods and approaches for analyzing and visualizing publicly available Arabidopsis expression data. GeneVestigator offers a client-side Java “applet” that features a number of different analysis methods, including a “digital northern” visualization showing how expression levels differ across diverse tissue and sample types (Zimmermann, 2004). The Leeds Arabidopsis Co-expression Tool (ACT) serves results from an all-versus-all comparison of expression patterns in Arabidopsis (Manfield, 2006). To use the tool, Web site visitors enter pairs of genes and then view an interactive, “clickable” plot showing correlation coefficients between the query genes and all genes represented on the ATH1 array. The ATTED tool generates a list of genes that are most highly correlated (in terms of expression) with a single user-entered query gene (Obayashi, 2007). Top-ranking genes appear in a results Web page that also shows a graphic depicting the local co-expression network surrounding the query gene. The Expression Angler hosted at the University of Toronto also provides lists of highly co-expressed genes for a single input query gene, and allows users to view heatmaps and other visualizations depicting patterns of co-expression across multiple experimental conditions (Toufighi, 2005).

These and many other tools typically offer outstanding on-line visualization capability, but they tend to de-emphasize “bulk” distribution of entire results sets, focusing instead on graphical or HTML-based presentation of data. This strategy for results dissemination sometimes makes it difficult to perform further downstream data mining using the analysis results as inputs. For example, it is difficult to merge results from different tools or even “runs” from the same tool. To support downstream data-mining applications, it is necessary to provide access to results in “bulk” and/or machine-readable formats.

Our approach complements these tools by providing extensive analysis results in simple, machine-readable formats accessible and useful for multiple user groups, including “bench” biologists, methodologists, and computational biologists. We focus on providing the data in ways that allow researchers to visualize and explore results using third-party visualization programs (e.g., TableView (Johnson, 2003), Cytoscape (Shannon, 2003)), programs that excel in the presentation and manipulation of data in tabular formats (e.g., Microsoft Excel), statistical analysis environments (e.g., Bioconductor (Gentleman, 2004)), or their own custom scripts and software. To support the latter group, we also provide simple file parsing and data manipulation tools using the (relatively) easy-to-learn Python programming language. Lastly, we provide ReST-style and SOAP-based Web services that allow access to “raw” expression and co-expression values from our database, and demonstrate the functionality of these tools via analysis of two highly co-expressed pathways in Arabidopsis: cellulose biosynthesis and glucosinolate biosynthesis from tryptophan.

Results

Analyses

The co-expression tool performs simple linear regression comparing a set of query (“bait”) genes entered by the user to the rest of the genes represented on the selected array, where options currently include the ATH1 (22,810 probe sets) (Redman, 2004) or AG (~8,000 probe sets) Arabidopsis microarray designs from Affymetrix. When users input two or more query genes (or probe set ids), the tool also performs pathway-level co-expression (PLC) analysis, in which genes that are co-expressed with two or more members of the query group are ranked and reported. We described the PLC method in detail previously; the current incarnation of the PLC method as implemented in the co-expression tool operates as described there, with some differences in how results are ranked (Wei, 2006).

Briefly, the PLC analysis examines the co-expression results for each of the query genes and then builds a list of genes that are co-expressed with two or more members of the query group, where co-expression is defined as an r-squared regression result greater than a user-specified threshold. (Note that the r-squared value obtained from the regression is also the square of Pearson’s correlation coefficient.) Genes co-expressed with two or more members of the query gene group are ranked first according to the number of genes within the query group with which they are co-expressed, and second by the average r-squared value. Thus, genes that are co-expressed with many members of the query group appear higher in the list than genes co-expressed with fewer members of the query group. In essence, genes identified by PLC comprise a set of genes neighboring the query genes in the larger co-expression network.
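
The ranking rule just described can be illustrated with a short Python sketch. This is not the tool's own implementation; the function name `plc_rank`, the input layout (a dictionary of r-squared values keyed by query gene), and the choice to leave the query genes themselves out of the ranked list are assumptions made for illustration, and the gene names and values are invented.

```python
from collections import defaultdict


def plc_rank(r_squared, threshold=0.36):
    """Rank genes co-expressed with two or more query genes.

    r_squared maps each query gene to {other_gene: r-squared value}.
    (Hypothetical input layout, chosen for this sketch.)
    """
    hits = defaultdict(list)
    for query, results in r_squared.items():
        for gene, r2 in results.items():
            # Assumption: query genes are not reported as their own neighbors.
            if gene in r_squared:
                continue
            # Co-expression: r-squared greater than the chosen threshold.
            if r2 > threshold:
                hits[gene].append(r2)

    ranked = [
        (gene, len(values), sum(values) / len(values))
        for gene, values in hits.items()
        if len(values) >= 2  # co-expressed with two or more query genes
    ]
    # First key: number of query genes matched; second: average r-squared.
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked


# Invented example values, not results from the tool.
example = {
    "Q1": {"G1": 0.81, "G2": 0.50, "G3": 0.20},
    "Q2": {"G1": 0.64, "G2": 0.40, "G3": 0.70},
    "Q3": {"G1": 0.49, "G2": 0.10, "G3": 0.90},
}
ranked = plc_rank(example)
for gene, n_queries, mean_r2 in ranked:
    print(gene, n_queries, round(mean_r2, 3))
```

Here G1 ranks first because it passes the threshold for all three query genes; G3 and G2 each match two query genes and are ordered by their average r-squared.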

Data sets and quality control

The co-expression system incorporates data from over 1,700 array hybridizations harvested from the NASC AffyWatch CD subscription service (Craigon, 2004). The tool currently includes four data releases, which differ in the number and types of arrays offered as well as in the types of pre-processing methods used to generate expression values (Table 1). We performed a quality-control procedure on each array, applying a method based on deleted residuals that identifies arrays whose expression values deviate significantly from other arrays in the same group, where “group” is defined as all the arrays in a single experiment, which includes both control and experimental samples. [Ann’s note: was K-S/deleted residuals test done on each set of expression values from each array processing method?] Each array receives a quality control statistic based on a Kolmogorov-Smirnov (K-S) test statistic that quantifies this deviation: arrays with large K-S statistics are of lower quality (Persson, 2005; Trivedi, 2005). In general, we find that only around 3% of the arrays fail to achieve a K-S quality control statistic D <= 0.15 (Figure 1).

Co-expression tool operation

No registration is required to access the co-expression tool. To begin the analysis, users visit the co-expression tool “home page” (http://www.ssg.uab.edu/coexpression) and click the tab labeled “Run the Tool.” Users then enter analysis parameters in this and subsequent screens. In the Step One tab, users select a data release and specify the K-S quality control metric described above. In Step Two, users enter a comma-separated list of one or more AGI (Arabidopsis Genome Initiative) ids or Affymetrix probe set ids. Internally, the system stores expression values using probe set ids; user-entered AGI codes are mapped onto probe set ids using probe set-to-gene id annotations from Affymetrix. In Step Three, users select sample types (by tissue) and designate the array model (ATH1 or AG) to use in the analysis. The tool does not allow co-expression analyses across array models. In Step Four, the tool generates a table of experiment ids and corresponding text for each experiment for each of the tissue types selected in Step Three. Users then may select all or a subset of individual experiments to include in the analysis. In Step Five, users enter an r-squared threshold the tool will use in performing pathway-level co-expression analysis (Wei, 2006). In general, we find that an r-squared threshold of 0.36, corresponding to a Pearson’s correlation coefficient of 0.6, provides reasonable results. In Step Six, users enter a comma-separated list of one or more email addresses that will receive two email messages: the first notifies the recipients that the analysis has begun, and the second that the analysis has completed.
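
As an illustration of the Step Two id handling, the following Python sketch parses a comma-separated query string and maps AGI codes onto probe set ids. It is a simplified stand-in, not the tool's code: the lookup table holds only the six AGI/probe set pairs listed in Table 4, the function name `resolve_queries` is hypothetical, and the sketch assumes one probe set per gene, which the full Affymetrix annotation does not guarantee.

```python
import re

# AGI-to-probe-set pairs copied from Table 4 (ATH1 array). The real tool
# uses the complete Affymetrix annotation rather than this small excerpt.
AGI_TO_PROBE_SET = {
    "AT2G22330": "264052_at",
    "AT4G39950": "252827_at",
    "AT4G31500": "253534_at",
    "AT2G20610": "263714_at",
    "AT1G24100": "264873_at",
    "AT1G74100": "260387_at",
}

# AGI locus codes look like At2g22330 (chromosomes 1-5, C, or M).
AGI_PATTERN = re.compile(r"^AT[1-5CM]G\d{5}$")


def resolve_queries(text, mapping=AGI_TO_PROBE_SET):
    """Return (probe_set_ids, unmapped_agi_codes) for a comma-separated query."""
    probe_sets, unmapped = [], []
    for token in (part.strip() for part in text.split(",")):
        if not token:
            continue
        if AGI_PATTERN.match(token.upper()):
            probe = mapping.get(token.upper())
            if probe is None:
                unmapped.append(token)
                continue
        else:
            # Assumption: anything that is not an AGI code is a probe set id.
            probe = token
        if probe not in probe_sets:  # drop duplicates, keep input order
            probe_sets.append(probe)
    return probe_sets, unmapped


probe_sets, unmapped = resolve_queries("At2g22330, 252827_at, At4g39950, At5g00000")
print(probe_sets)
print(unmapped)
```

In this example At4g39950 and 252827_at refer to the same probe set, so it is reported once, and At5g00000 is returned as unmapped because it is absent from the excerpted table.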

Co-expression analysis results

Upon successful completion of the analysis, the co-expression system sends a “Job Completion” message to the email addresses listed in Step Six. The “Job Completion” message contains links to results files stored on the co-expression tool server, including a “zip” file that contains several plain text and HTML-format files. These include lists of slides and experiments used in the analysis; a report on analysis parameters; files in comma-separated formats containing regression results for each of the user-entered query genes; outputs from PLC analysis, and so on. Table 2 provides a complete list of the output files created during a single analysis run and describes the content of each.

Web services [in progress]

State what Web services are. Describe the difference between the ReST (representational state transfer) and SOAP (simple object access protocol) approaches. State that we provide both. Explain how users would use the ReST-style approach to access expression values from inside TableView, or using a Web browser to download and save the plain text results, which they can then import into Excel or other spreadsheet programs. Point out that R, Python, and most languages have mechanisms that make it easy to access URLs. Be sure to mention that the Web site provides a tutorial explaining how to use R, Python, and TableView to access expression values.
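
A minimal Python sketch of the ReST-style access pattern follows. The service's actual query parameters and response layout are not specified in this draft, so the parameter names (`probesets`, `release`), the placeholder base URL, and the assumed CSV layout (a header row of array names, then one row per probe set) are all hypothetical; substitute the service URL listed in Table 3 and the parameters given in the Web site tutorial.

```python
import csv
import io
from urllib.parse import urlencode

# Placeholder only; replace with the service URL listed in Table 3.
BASE_URL = "http://example.org/getExpVals"


def build_query_url(base_url, probe_sets, release):
    """Build a request URL; the parameter names are hypothetical."""
    query = urlencode({"probesets": ",".join(probe_sets), "release": release})
    return base_url + "?" + query


def parse_expression_csv(text):
    """Parse CSV text into {probe_set: {array_name: value}}.

    Assumes a header row (first cell a label, then array names) followed by
    one row per probe set. This layout is an assumption for illustration.
    """
    rows = list(csv.reader(io.StringIO(text)))
    arrays = rows[0][1:]
    return {
        row[0]: {name: float(value) for name, value in zip(arrays, row[1:])}
        for row in rows[1:]
        if row
    }


url = build_query_url(BASE_URL, ["264052_at", "252827_at"], "3.1")
print(url)

# To fetch from a live service one could use the standard library, e.g.:
#   from urllib.request import urlopen
#   text = urlopen(url).read().decode("utf-8")
# Here an invented response stands in for the network call.
text = "probe_set,array1,array2\n264052_at,7.5,8.25\n252827_at,6.0,6.5\n"
values = parse_expression_csv(text)
print(values["264052_at"]["array2"])
```

Because the result is ordinary comma-separated text, the same URL can also be opened in a Web browser and the saved file imported into a spreadsheet program.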

Data-mining code [in progress]

Remind the reader that the co-expression tool is different from other tools in that we provide “bulk” access to machine-readable data files users would need for downstream analyses. Describe how we also provide some Python modules that can read in the data files output by the co-expression tool. Describe that these include a Python implementation of PLC analysis, some functions for generating the Cytoscape files, and functions for working with GO and other annotation files. Point out that Python is an interpreted language and therefore is well-suited to interactive data-mining sessions, in which users build up data structures “live” in a session, much as they do with R.
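
To give a flavor of such a session, the sketch below reads regression results in comma-separated form and writes them out as lines in Cytoscape's simple interaction format (SIF), where each line names a source node, an interaction type, and a target node. It does not reproduce the modules distributed with the tool: the column names (`query`, `target`, `slope`, `r_squared`), the function name `csv_to_sif`, and the `pos`/`neg` interaction labels are assumptions, and the ids and values are invented.

```python
import csv
import io


def csv_to_sif(csv_text, threshold=0.36):
    """Convert regression results to SIF lines (source, interaction, target).

    Column names are hypothetical; adjust them to match the actual files.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Keep only pairs whose r-squared exceeds the threshold.
        if float(row["r_squared"]) <= threshold:
            continue
        # Label the edge by the sign of the regression slope.
        kind = "pos" if float(row["slope"]) > 0 else "neg"
        lines.append("\t".join([row["query"], kind, row["target"]]))
    return "\n".join(lines)


# Invented example rows, not output from the tool.
sample = (
    "query,target,slope,r_squared\n"
    "probeA,probeB,0.9,0.81\n"
    "probeA,probeC,-0.4,0.50\n"
    "probeA,probeD,0.1,0.04\n"
)
sif_text = csv_to_sif(sample)
print(sif_text)
```

Saving the returned text to a file with a `.sif` extension produces a network file that Cytoscape can import; the third example pair is dropped because its r-squared falls below the threshold.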

Biological application of the co-expression tool: example analyses

We studied the co-expression patterns of our candidate genes, which are involved in cellulose biosynthesis (CESA1, 3, 4, 6, 7, and 8), using data from a recent release of the co-expression tool, v_3_1. The RMA image-processed expression level for each of the 6 genes was regressed against 22,800 other genes for the 1755 chips. Genes were then sorted by the average rank of P values and the sign of b1, the slope. Genes were analyzed as two groups, a primary cell wall group and a secondary cell wall group, based on the stage of cell wall formation in which they are involved. Table 3 shows the 40 most highly ranked genes, which were identified as highly co-regulated with the genes in their cell wall group. In the primary cell wall formation group, 14 genes from the current study were also among the 40 co-regulated genes from the previous study. In the secondary cell wall formation group, 32 genes from the current study were also among the 40 co-regulated genes from the previous study.

Example analysis: Glucosinolate biosynthesis from tryptophan

Glucosinolates comprise a family of amino-acid derived sulfur- and nitrogen-containing compounds that contribute to plant defense against pathogens and/or herbivory. Glucosinolates are stored in the vacuole, and when the plant is injured and cellular integrity is compromised, they become available for conversion into highly reactive, potentially toxic breakdown products via the action of β-thioglucosidases known as myrosinases. This defense mechanism, sometimes called the “mustard oil bomb,” has been widely studied in Arabidopsis and other crucifers; as a result, many of the enzymes required to synthesize glucosinolate compounds have been identified. Recently, Hirai, et al. used a co-expression-based methodology to identify Myb-family transcription factors required for biosynthesis of cysteine-derived glucosinolates [ref]. Similarly, Gachon et al. used hierarchical clustering of expression profiles to expose patterns of correlated expression among genes involved in synthesis of glucosinolates from tryptophan [ref: Gachon CM, PMB 2005]. Co-expression analysis of glucosinolate biosynthesis provides an excellent example application for the co-expression tool system described here.

The AraCyc tool hosted at The Arabidopsis Information Resource reports six genes whose products catalyze reactions that convert the amino acid tryptophan to indolic glucosinolate precursors in Arabidopsis. We used the Co-expression Tool to identify genes that are co-expressed with one or more of these six “bait” genes. All six genes are highly co-expressed with each other. Figure 4 depicts a multiple scatter plot generated using the TableView tool, with expression values imported from the co-expression ReST-style Web service, which allows users to access expression values in our database via a simple URL-based scheme. We found that each gene was co-expressed with approximately 200 to 300 other genes with Pearson’s r >= 0.6. Consistent with previous observations, most of the co-expression relationships were positive. Figure 5 shows a screen capture from Cytoscape depicting the pattern of co-expression of genes closely linked to the pathway. By examining the most closely connected genes, users can develop lists of candidate genes for experimental analysis.

Example analysis: Secondary cell wall biosynthesis

[VS - to add]


Discussion

Materials and Methods

Co-expression Analysis

The co-expression tool is based on large-scale linear regression analysis of expression values between genes of interest and the rest of the genes on a selected array using the methodology described previously (Persson, 2005; Wei, 2006). Each regression analysis yields three values useful for evaluating co-expression relationships: a slope parameter that indicates the direction (positive or negative) of co-expression, and p and R-squared values that indicate the strength of the co-expression relationship. The R-squared value, also known as the coefficient of determination, is the square of the correlation coefficient (R) and is the fraction of variance in one variable that can be explained by variation in the other. Thus, R-squared values closer to one indicate higher correlation and a stronger linear relationship between the compared variables. The p value quantifies the confidence in the correlation; it is the probability that an R-squared value at least as large as the observed one could have been obtained by chance under the null hypothesis that the two variables being compared are not linearly related.
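
The relationship between these quantities can be checked with a small, self-contained Python sketch of simple linear regression for one pair of expression profiles. It is an illustration rather than the tool's code: the function name `simple_regression` and the example values are invented, and the p value is omitted because it requires a t distribution (a statistics library routine such as SciPy's `linregress` reports it alongside the slope and correlation).

```python
import math


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x.

    Returns (slope, r_squared, pearson_r).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Fraction of variance in y explained by the fitted line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1.0 - ss_res / syy

    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, r_squared, pearson_r


# Invented expression values for two probe sets across five arrays.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 8.1, 9.8]

slope, r_squared, pearson_r = simple_regression(gene_a, gene_b)
print(slope > 0)                                 # direction of co-expression
print(abs(r_squared - pearson_r ** 2) < 1e-9)    # R-squared equals R squared
```

The sign of the slope matches the sign of Pearson's R, which is why the slope can be used to report the direction of co-expression while R-squared reports its strength.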


Figures

Figure 1. Distribution of Kolmogorov-Smirnov test statistics for 1,290 AffyWatch array hybridizations corresponding to a subset of arrays from the co-expression tool data release 3. [Ann’s note: I got these data from the coexpression_dummy database, arrays version 3. Versions 4,5,6 all had KS values of 0.1, and Version 2 had some missing values, possibly corresponding to duplicate arrays.] Array “CEL” files were processed using the RMA algorithm as implemented in Bioconductor.



Figure 2. Screen capture depicting co-expression tool operation. [insert here]



Figure 3. Venn diagrams depicting overlapping co-expression patterns for cellulose synthesis genes involved in (A) primary cell wall and (B) secondary cell wall biosynthesis.








Figure 4. Multi-scatter plot visualization from TableView reveals that genes for indolic glucosinolate biosynthesis are highly co-expressed. A. The TableView application was used to access the getExpVals Web service, which delivers data in comma-separated, plain text formats. [to-do: show TV interface w/ table loading function] B. Clicking the multi-scatter plot icon revealed that all six query genes are highly co-expressed. Probe set to gene mappings are reported in Table 4.




Figure 5. Network visualizations of PLC analysis for (A) secondary cell wall biosynthesis and (B) glucosinolate biosynthesis from tryptophan. The PLC analysis performed by default during a co-expression analysis run generates several files that can be loaded into Cytoscape. The resulting images depict interactions among the “bait” or query genes, as well as their interactions with neighboring genes. Neighboring genes connecting with two or more of the query genes appear in the visualization, thus allowing users to focus in on certain subsets of the co-expression network that are likely to be most relevant to shared function (if any) of the bait genes provided in the beginning of the analysis.

[insert here - Cytoscape visualizations for CESA & Gluco PLC runs]


Tables

Table 1. Co-expression tool data releases. Available data sets are divided into major releases, involving addition of new slides (e.g., 2.x, 3.x), with minor releases referring to different methods of processing the same set of slides. For example, major release 3 has three minor releases (3.0, 3.1, & 3.2), each comprising sets of slides that were processed using different algorithms.

Data Release  Data Source         Number of arrays (slides)  Processing Methods                            Published example analyses
2.0           Affywatch I,II      486 ATH1, 80 AG            RMA                                           (Cui, 2006; Wei, 2006)
3.0           Affywatch I,II,III  1769 ATH1, 80 AG           RMA                                           -
3.1           Affywatch I,II,III  1769 ATH1                  GCRMA                                         -
3.2           Affywatch I,II,III  1769 ATH1                  MAS5, log2 transformation, divide by average  -

Table 2. Data output files. When an analysis completes, users receive an email message reporting a URL where they can download a “zip” file containing several results files.

Output Files
    Description

HTML Outputs

CoexpInitialSummaryReport.html
    Summarizes input queries and the genes of interest annotation report.

CoexpCompleteSummaryReport.html
    Summarizes the Initial Summary Report and lists analysis-level output files.

IDXSummaryReport.html
    Summarizes genes of interest annotation information.

ErrorLogFile.txt
    Contains log messages when analyses encounter run-time errors and exceptions.

StatusLogFile.txt
    Contains log messages related to the analysis run. This file can be used to check the status of your job.

Analysis Outputs

NOTE: Every input AGI Id is mapped to its corresponding Affymetrix array probeset. Since the co-expression tool performs the analyses at the probeset level, the analysis output files are named after probeset Ids.

261279_at58544.csv
    Probe set level result file. Every Probeset Id mapped from an AGI Id has its own output file. For more information on AGI <=> Probeset Id mapping, see the IDXSummaryReport.html file. Recorded statistics are computed from large-scale linear regression between probes of interest and the selected array's regressor probes. In this example, the probe set of interest, '261279_at', is regressed against 22810 ATH1 probesets.

All_ATH158545.csv
    A consolidated result file with output statistics recorded for all query probe sets.

ProbesOfInterest_ATH158546.csv
    Records result statistics from large-scale linear regression comparing query probe sets to each other.


Table 3. Co-expression tool Web services. These Web services provide machine-readable access to expression values stored in the co-expression tool database.

Service Name: ReST-style Web service(s)
    URL: http://obiwan.ssg.uab.edu:8080/coexpression/cgi-bin/getCoexpVals.py
    Description: Accepts one or more probe set ids and data release version and returns expression values across all arrays in the data set.

Service Name: SOAP-style (BioMoby) Web Services
    [VS - fill in]
















Table 4. Genes involved in glucosinolate biosynthesis from tryptophan as reported in AraCyc version 3.5. Values in the column labeled “Other names” are from TAIR or are reported in a review of glucosinolate biosynthesis [ref: Grubb, C.D. 2006].

AGI code   Other Names                                   Probe set (ATH1)  Enzyme
At2g22330  CYP79B3, CYTOCHROME P450, FAMILY 79,          264052_at         cytochrome p450
           SUBFAMILY B, POLYPEPTIDE 3, T26C19.1
At4g39950  CYP79B2, CYTOCHROME P450, FAMILY 79,          252827_at         cytochrome p450
           SUBFAMILY B, POLYPEPTIDE 2, T5J17.120,
           T5J17_120
At4g31500  CYP83B1, ATR4, CYTOCHROME P450                253534_at         cytochrome p450
           MONOOXYGENASE 83B1, F3L17.70, F3L17_70,
           RED1, RNT1, SUR2
At2g20610  SUR1, ALF1, F23N11.7, F23N11_7, HLS3, RTY,    263714_at         transaminase activity
           RTY1, SUPERROOT 1
At1g24100  F3I6.2, F3I6_2, UDP-GLUCOSYL TRANSFERASE      264873_at         UDP-glucosyl transferase
           74B1, UGT74B1
At1g74100  F2P9.3, F2P9_3, AtST5a*                       260387_at         sulfotransferase

Table 5. Genes encoding cellulose synthase enzymes involved in secondary cell wall biosynthesis.

[VS - please add table. Can you find a good review of CESA gene function that we could cite? It should be as recent as possible.]

Acknowledgements

NSF Plant Genome Award number 0217651 partly funded this work.

References

Craigon, D. J., James, N., Okyere, J., Higgins, J., Jotham, J., and May, S. (2004). NASCArrays: a repository for microarray data generated by NASC's transcriptomics service. Nucleic Acids Res 32, D575-577.

Cui, X., and Loraine, A. (2006). Global correlation analysis between redundant probe sets using a large collection of Arabidopsis ath1 expression profiling data. Comput Syst Bioinformatics Conf, 223-226.

Gentleman, R. C., Carey, V. J., Bates, D. M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., et al. (2004). Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5, R80.

Johnson, J. E., Stromvik, M. V., Silverstein, K. A., Crow, J. A., Shoop, E., and Retzel, E. F. (2003). TableView: portable genomic data visualization. Bioinformatics 19, 1292-1293.

Manfield, I. W., Jen, C. H., Pinney, J. W., Michalopoulos, I., Bradford, J. R., Gilmartin, P. M., and Westhead, D. R. (2006). Arabidopsis Co-expression Tool (ACT): web server tools for microarray-based gene expression analysis. Nucleic Acids Res 34, W504-509.

Obayashi, T., Kinoshita, K., Nakai, K., Shibaoka, M., Hayashi, S., Saeki, M., Shibata, D., Saito, K., and Ohta, H. (2007). ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis. Nucleic Acids Res 35, D863-869.

Persson, S., Wei, H., Milne, J., Page, G. P., and Somerville, C. R. (2005). Identification of genes required for cellulose synthesis by regression analysis of public microarray data sets. Proc Natl Acad Sci U S A 102, 8633-8638.

Redman, J. C., Haas, B. J., Tanimoto, G., and Town, C. D. (2004). Development and evaluation of an Arabidopsis whole genome Affymetrix probe array. Plant J 38, 545-561.

Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N., Schwikowski, B., and Ideker, T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13, 2498-2504.

Toufighi, K., Brady, S. M., Austin, R., Ly, E., and Provart, N. J. (2005). The Botany Array Resource: e-Northerns, Expression Angling, and promoter analyses. Plant J 43, 153-163.

Trivedi, P., Edwards, J. W., Wang, J., Gadbury, G. L., Srinivasasainagendra, V., Zakharkin, S. O., Kim, K., Mehta, T., Brand, J. P., Patki, A., et al. (2005). HDBStat!: a platform-independent software suite for statistical analysis of high dimensional biology data. BMC Bioinformatics 6, 86.

Wei, H., Persson, S., Mehta, T., Srinivasasainagendra, V., Chen, L., Page, G. P., Somerville, C., and Loraine, A. (2006). Transcriptional coordination of the metabolic network in Arabidopsis. Plant Physiol 142, 762-774.

Zimmermann, P., Hirsch-Hoffmann, M., Hennig, L., and Gruissem, W. (2004). GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol 136, 2621-2632.