Microarray Workshop: Analyzing Agilent Gene Expression data in GeneSpring GX9 using the VSMC data

Nov 25, 2013


Overview of sections and how later sections depend upon earlier ones.

Section 1: Importing Data into GeneSpring GX, Preprocessing, and Creating an “Experiment”

Exercise 1: Import data and create an experiment


Section 2: Setting up the Experiment (Experiment Setup)

Exercise 1: Define Experiment Parameters and assign parameter values to each sample.

Skip the section on manual entry.

Exercise 2: Create experimental interpretations to group replicate samples into conditions

Allows for averaging over replicates
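
As a rough illustration of what averaging over replicates means, the sketch below groups samples by condition and takes the mean for one probe. The sample names, condition labels, and intensity values are invented for this example, and the code is not GeneSpring's own implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical interpretation: which condition each sample belongs to.
# Names and labels are made up for illustration only.
sample_condition = {
    "S1": "artery_control", "S2": "artery_control",
    "S3": "artery_IL1b", "S4": "artery_IL1b",
}

# Hypothetical normalized (log2) intensities for a single probe, per sample.
probe_values = {"S1": 7.0, "S2": 7.4, "S3": 9.1, "S4": 8.9}

def average_by_condition(values, conditions):
    """Group sample values by condition and return the mean per condition."""
    grouped = defaultdict(list)
    for sample, value in values.items():
        grouped[conditions[sample]].append(value)
    return {cond: mean(vals) for cond, vals in grouped.items()}

averages = average_by_condition(probe_values, sample_condition)
print(averages)  # one averaged value per condition instead of one per sample
```

Changing the mapping in sample_condition changes how many averaged values come out, which is the effect of choosing a different interpretation.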


Section 3: Viewing Expression Data in GeneSpring GX

Exercise 1: View expression data in a Profile Plot

Defining the order in which you want groupings displayed

Exercise 2: View expression data in Spreadsheet View

Exercise 3: View expression data in Scatterplot View

Exercise 4: View data for a single entity


Section 4:

Exercise 1: Perform quality control assessment on samples. Skip.

Exercise 2: Use the hierarchical clustering algorithm to create a condition tree. Skip.

The sample data you receive has already been assessed for quality.



Section 5: Perform quality control on probes (individual entities)

Exercise 1: Filter for probes with reliable intensity measurements

Assembles QC probes list.



Section 6: Identifying probes of interest.

These steps are required in order to complete Sections 7 and 8.

Exercise 1: Find candidates for differential expression using statistics.

Requires QC probes list.

Saves 3 ANOVA lists

Saves similar effects of IL1β in artery and vein

Exercise 2: Filter probes based on fold-change

Requires an ANOVA list

Saves FC greater than 1.5 in probes with significant interaction p-values
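
The fold-change filter can be pictured with a short sketch. It assumes the condition averages are log2-transformed, so a fold change is 2 raised to the absolute difference between two conditions. The probe names and values are hypothetical, and this is one common way to compute the filter, not necessarily what GeneSpring does internally.

```python
FC_THRESHOLD = 1.5  # fold-change cutoff named in the workshop outline

# Hypothetical condition-averaged log2 intensities: probe -> (control, treated).
log2_means = {
    "probe_A": (7.0, 8.0),   # 2-fold up
    "probe_B": (9.0, 9.2),   # about 1.15-fold up
    "probe_C": (10.0, 9.0),  # 2-fold down
}

def fold_change(control_log2, treated_log2):
    """Return the unsigned fold change between two log2 values (always >= 1)."""
    return 2 ** abs(treated_log2 - control_log2)

def filter_by_fold_change(means, threshold=FC_THRESHOLD):
    """Keep probes whose fold change, up or down, exceeds the threshold."""
    return [
        probe
        for probe, (control, treated) in means.items()
        if fold_change(control, treated) > threshold
    ]

passing = filter_by_fold_change(log2_means)
print(passing)  # ['probe_A', 'probe_C']
```

Taking the absolute difference treats up- and down-regulation the same way, so a probe that drops 2-fold passes just like one that rises 2-fold.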

Exercise 3: Find other genes with similar expression profiles to a target gene. Skip.



Section 7: Clustering Gene Expression Data

Exercise 1: Use the Hierarchical clustering algorithm to build an entity and condition tree.

Requires FC (fold-change) list

Saves Hierarchical cluster output

Exercise 2: Use the K-means clustering algorithm to group probes with similar expression profiles.

Requires FC list

Saves K-means clustering output

Creates lists of clustered entities
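
For readers who want to see the idea behind K-means, here is a minimal sketch of the plain algorithm using Euclidean distance. The probe names, profile values, and choice of starting centroids are all hypothetical, and GeneSpring's distance metric and initialization may differ.

```python
import math

def kmeans(profiles, centroids, iterations=10):
    """Plain K-means: assign each profile to its nearest centroid (Euclidean
    distance), then recompute each centroid as the mean of its members."""
    assignments = {}
    for _ in range(iterations):
        # Assignment step: nearest centroid for every profile.
        for name, profile in profiles.items():
            distances = [math.dist(profile, c) for c in centroids]
            assignments[name] = distances.index(min(distances))
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [profiles[n] for n, a in assignments.items() if a == k]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid as is
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return assignments, centroids

# Hypothetical expression profiles across four conditions.
profiles = {
    "probe_A": [0.0, 1.0, 0.0, 1.1],
    "probe_B": [0.1, 0.9, 0.0, 1.0],
    "probe_C": [1.0, 0.0, 1.2, 0.1],
    "probe_D": [0.9, 0.1, 1.0, 0.0],
}
# Starting centroids are picked by hand here; real tools usually pick them randomly.
start = [list(profiles["probe_A"]), list(profiles["probe_C"])]

assignments, final_centroids = kmeans(profiles, start)
print(assignments)
```

The number of clusters is fixed in advance by the number of starting centroids, which is why the tutorial asks you to choose it before running the algorithm.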


Section 8: Biological Queries

Exercise 1: Perform Gene Ontology (GO) analysis to determine the biological functions of your genes of interest.
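
GO analysis asks whether a category appears in your gene list more often than chance would predict. A common way to score this is a hypergeometric test, sketched below with made-up counts. Whether GeneSpring uses exactly this test, and how it corrects for multiple testing, is not covered here and should be checked in the tutorial.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, genes_in_category, list_size, hits):
    """Probability of seeing at least `hits` category genes when drawing
    `list_size` genes without replacement from `total_genes`, of which
    `genes_in_category` carry the GO annotation."""
    upper = min(genes_in_category, list_size)
    numerator = sum(
        comb(genes_in_category, k)
        * comb(total_genes - genes_in_category, list_size - k)
        for k in range(hits, upper + 1)
    )
    return numerator / comb(total_genes, list_size)

# Hypothetical counts: 1000 genes on the array, 50 in the GO category,
# 20 genes in the list of interest, 5 of which fall in the category.
p_value = hypergeom_enrichment_p(
    total_genes=1000, genes_in_category=50, list_size=20, hits=5
)
print(p_value)
```

With these counts only about 1 category gene would be expected in the list by chance, so seeing 5 gives a small p-value.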


Exercises 2 and 3: Skip.


Workshop Questions

As you work through the tutorial, remember the biological problem you are attempting to solve and the experimental objective.

You might enjoy working in pairs. That is fine!

Please complete Section 1 before we start lecture this morning.

For your reference, the table on p. 18 of the tutorial shows what each sample is.

There are 9 questions for you to answer as you progress through the tutorial this afternoon. See how far you can get. You are responsible for as far as you can get with diligent use of your time. Type your answers directly on the worksheet and email to me at the end of the period today.

We will stop at ~3:15 for a short lecture and discussion on RNA-Seq.



Section 3: Viewing Expression Data

Exercise 1, when you have reached p. 28.

1. Each line in the 3 profile plots you have just generated represents the data from one entity or spot (= gene) on the arrays, averaged over the replicate arrays for a specific condition. In these plots, what is the significance of the slope of any given line?


After Spreadsheet view:

Each row represents data at one location (feature, entity) on the microarray.

2.

a) Look at the spreadsheet view as described in the tutorial, p. 28. There were 12 samples in the experiment, but in the spreadsheet view, there are only 4 columns with numerical values. Explain.

b) How would you expect the number of columns with numerical values to change if you change your interpretation to Tissue Type?

c) To all samples?


After Scatterplot view, bottom of page 31.

3.

a) You selected genes from the scatterplot view that met specific criteria. You are now viewing the results with those genes in Profile view. Describe how the Profile view agrees with your expectations given how you selected genes from the scatterplot view.

b) In general, how does the effect of IL1β on these same genes in veins compare to the effect in arteries?

c) Does this result lend support to your hypothesis that some genes respond differently to IL1β in arteries than in veins?


Page 32, last bullet has an error. Change the settings on the right in the search box to be identical to those shown in Figure 24.


Examining a single gene.

4. Page 35. Does the profile plot for VCAM 1 suggest that this gene should be among those in your entity list for genes down-regulated in arteries by IL1β?


Section 5: QC of probes

5. One biological reason why an entity (probe) might be eliminated from consideration in this step is that the gene it represents is not expressed in either arterial or venous smooth muscle cells in either the presence or absence of IL1β.

What technical reasons can you think of that could produce a result leading to the elimination you have just done?


Section 6: Differential expression

6. In this section you have generated lists of genes that show differential expression across conditions, and that show changes in expression level of greater than 1.5-fold. In the process, you have made it possible to eliminate genes that don’t meet these criteria from further consideration. Explain why having software that performs these steps is important to the successful solution of the biological problem under investigation.


Section 7: Clustering

7. Hierarchical clustering. To make this cluster tree, you used the list of genes that show at least a 1.5 fold change in expression level.

a) What 3 samples clustered together at the far right of the tree? (Move the cursor over the sample number and compare to the table on p. 18 of the tutorial.)

b) What is the evidence from the tree that some genes are highly expressed under the far right conditions relative to others, and that other genes are expressed at very low levels under these conditions relative to other conditions? (You can see this best if you compress the rows multiple times.)

c) Is the general effect of IL1β the same on all genes?


8. K-means clustering. If you have followed the instructions in the tutorial to the letter, the samples in the K-means clusters are displayed from left to right in each cluster in the same order as they are presented in the table on p. 18.

a) What conditions result in the highest and lowest peaks?

b) Is this result (a) in agreement with the hierarchical clustering?

c) If you made a Venn diagram of the 4 clusters, would you expect any overlaps?

d) No matter how many individual gene profiles you look at in any of the 4 clusters, you will never see a flat line. Explain why.


Section 8: Biological Queries

9. Run the GO analysis on Cluster 1 from K-Means, as described in the tutorial. Then repeat the analysis with a different cluster. What cluster did you use, and are the biological categories the same or different?