Supplemental Material

Oct 1, 2013

Supplemental Experimental Procedures


Antibodies for ChIP experiments and western blots

For the ChIP experiments we used anti-ERα (sc-543), anti-GATA3 (sc-248) and anti-p300 (sc-585) from Santa Cruz Biotechnology. Antibodies against H3K4me1 (ab8895) and H3K27Ac (ab4729) were from Abcam. For the western blots we used anti-ERα (6F11/2, Novocastra) and anti-total H3 (ab1791, Abcam).


ChIP-Seq library preparation, Illumina Sequencing, and enrichment analysis

For every ChIP-Seq library the starting material was four 15cm culture dishes. ChIP-Seq of histone marks was performed from three 15cm plates. For the ChIP-Seq libraries in siGATA3 conditions, a pool of the siRNAs described in the previous section was used at a 50nM final concentration. Sequences generated by the Illumina GAIIx genome analyzer were aligned against NCBI Build 36.3 of the human genome using BWA version 0.5.5 (Li and Durbin, 2010).

For transcription factor ChIP-Seq libraries we generated 36bp reads, and 50bp reads for the histone mark libraries. Reads were filtered by removing those with a BWA alignment quality score less than 15. Enriched regions of the genome were identified by comparing the ChIP samples to input samples using the MACS peak caller (Zhang et al., 2008) version 1.3.7.1. Raw ChIP-Seq data were visualized using the UCSC genome browser.
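The read-filtering step above can be illustrated with a minimal Python sketch. It assumes the alignments are available as SAM-formatted text lines and that the "BWA alignment quality score" corresponds to the MAPQ column; the function name and the example reads are hypothetical, and this is not the pipeline actually used for the libraries.

```python
def filter_by_mapq(sam_lines, min_mapq=15):
    """Yield header lines and mapped alignments with MAPQ >= min_mapq.

    Assumption: input is SAM text, where column 2 is the bitwise FLAG
    and column 5 is the mapping quality (MAPQ).
    """
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x4:
            # FLAG bit 0x4 marks an unmapped read; drop it.
            continue
        if mapq >= min_mapq:
            yield line


# Hypothetical example records (36bp reads, as for the transcription factor libraries).
seq, qual = "A" * 36, "I" * 36
example = [
    "@HD\tVN:1.0",
    f"r1\t0\tchr1\t100\t37\t36M\t*\t0\t0\t{seq}\t{qual}",
    f"r2\t0\tchr1\t200\t10\t36M\t*\t0\t0\t{seq}\t{qual}",
    f"r3\t4\t*\t0\t0\t*\t*\t0\t0\t{seq}\t{qual}",
]
kept = list(filter_by_mapq(example))
```

In this example only the header and read r1 are retained: r2 falls below the quality threshold of 15 and r3 is unmapped.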

Heatmaps were generated as described in Heintzman et al., 2007. The input file included the centers of the binding events, sorted by binding signal (reads in peak) from strongest to weakest. Reads were normalized over input and to 10 million reads.
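As a rough illustration of two parts of this procedure, the Python sketch below scales read counts to a depth of 10 million reads and orders binding events from strongest to weakest. The function names and example values are hypothetical. The normalization over input is deliberately not shown, because the text does not state which operation (ratio or subtraction) was applied.

```python
def scale_to_depth(counts, total_reads, target=10_000_000):
    """Rescale raw read counts so that the library corresponds to `target` reads."""
    factor = target / total_reads
    return [c * factor for c in counts]


def sort_peaks_by_signal(peaks):
    """Order binding events from strongest to weakest.

    peaks: list of (chrom, center, reads_in_peak) tuples (hypothetical layout).
    """
    return sorted(peaks, key=lambda p: p[2], reverse=True)


# Hypothetical example: a library of 20 million reads and three binding events.
scaled = scale_to_depth([5, 10], total_reads=20_000_000)
ordered = sort_peaks_by_signal([
    ("chr1", 1_000, 40),
    ("chr2", 5_000, 120),
    ("chr1", 9_000, 75),
])
```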





Differential binding analysis (DBA)

Significantly differentially bound sites were identified using the Bioconductor package DiffBind (Stark et al., 2011) in the manner described in Ross-Innes et al., 2012. For GATA3 (Fig. 1B and Supplemental Fig. S1A), peaks that were identified in at least two of the ten samples (five replicates each of Vehicle and E2) were included in the DBA. For baseline ER binding sites (Fig. 1B), all peaks identified in at least two of the three replicates of the ER ChIP for the siControl condition were considered. For ER binding sites in the siGATA3 condition (Fig. 1C), peaks that were identified in at least three of the six samples (three replicates each of siControl and siGATA3) were included in the DBA. In all cases, an FDR cutoff of 0.10 was used to determine significantly differentially bound sites.
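The "at least k of n samples" inclusion rule can be sketched in Python as follows. This is only an illustration under stated assumptions: peaks are half-open (chrom, start, end) intervals, overlapping peaks from any sample are merged into one region, and a region is kept when it is supported by at least `min_samples` distinct samples. It is not DiffBind's implementation, and the function name and example coordinates are hypothetical.

```python
from collections import defaultdict


def consensus_peaks(samples, min_samples):
    """Return merged regions supported by at least `min_samples` samples.

    samples: list of peak lists, one per sample; each peak is a
             (chrom, start, end) tuple with a half-open interval.
    """
    by_chrom = defaultdict(list)
    for idx, peaks in enumerate(samples):
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, idx))

    result = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end, support = None, None, set()
        for start, end, idx in sorted(by_chrom[chrom]):
            if cur_start is not None and start < cur_end:
                # Overlaps the current region: extend it and record the sample.
                cur_end = max(cur_end, end)
                support.add(idx)
            else:
                # Close the previous region if enough samples support it.
                if cur_start is not None and len(support) >= min_samples:
                    result.append((chrom, cur_start, cur_end))
                cur_start, cur_end, support = start, end, {idx}
        if cur_start is not None and len(support) >= min_samples:
            result.append((chrom, cur_start, cur_end))
    return result


# Hypothetical example with three samples and a threshold of two.
samples = [
    [("chr1", 100, 200), ("chr1", 500, 600)],
    [("chr1", 150, 250)],
    [("chr2", 10, 20)],
]
consensus = consensus_peaks(samples, min_samples=2)
```

Here only the chr1 region spanning 100 to 250 is retained, since it is the only one seen in two samples.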



Microarray expression experiment

Gene expression analysis was carried out on Illumina Human HT12 version 3 arrays. All data analysis was carried out in R using Bioconductor packages (Gentleman et al., 2004). Raw intensity data from the array scanner were processed using the BASH (Cairns et al., 2008) and HULK algorithms as implemented in the beadarray package (Dunning et al., 2007). Log2 transformation and quantile normalisation of the data were performed. Differential expression analysis was carried out using the limma package (Smyth, 2005). Differentially expressed genes (DEGs) were selected using a p-value cut-off of <0.01 after FDR correction for multiple testing, applied globally to correct for multiple contrasts.
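To make the selection step concrete, the following Python sketch computes FDR-adjusted p-values and applies the 0.01 cut-off. It assumes the Benjamini-Hochberg procedure, which the text does not name, and it reads "applied globally" as pooling the p-values from all contrasts into one vector before adjustment. The function name and the example p-values are hypothetical; the analysis itself was done with limma in R.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone with respect to the raw p-value ranking.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values pooled across all contrasts.
pooled = [0.001, 0.008, 0.039, 0.041, 0.5]
adjusted = benjamini_hochberg(pooled)
selected = [i for i, q in enumerate(adjusted) if q < 0.01]
```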


Integration of ChIP-Seq datasets with expression data

We used the UCSC March 2006 (hg18) version of the human genome to generate a BED file of 50kb windows centered on TSSs, which we subsequently overlapped with the ChIP-Seq binding events. Enrichment of the 'peaks to genes' lists was assessed in a threshold-free manner using R-GSEA, an implementation of Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005) in R. Enrichment of these 'peaks to genes' lists with the DEG lists was further assessed independently using the hypergeometric test for over-representation. In Supplemental Fig. S3C we used a significance p-value <0.01 for E2-Veh siControl genes from our microarray.
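The window overlap and the over-representation test can be sketched in Python as below. The sketch assumes that "50kb window centered on TSSs" means a total width of 50kb (25kb on each side), that binding events are represented by their center positions, and that the hypergeometric test is the one-sided upper-tail test. All function names and example values are hypothetical.

```python
from math import comb


def tss_windows(tss_list, window=50_000):
    """Build (gene, chrom, start, end) windows centered on each TSS.

    tss_list: list of (gene, chrom, tss_position) tuples.
    """
    half = window // 2
    return [(gene, chrom, max(0, pos - half), pos + half)
            for gene, chrom, pos in tss_list]


def genes_with_peaks(windows, peak_centers):
    """Return the genes whose window contains at least one peak center."""
    hits = set()
    for gene, chrom, start, end in windows:
        if any(c == chrom and start <= p < end for c, p in peak_centers):
            hits.add(gene)
    return hits


def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: all genes tested; successes: DEGs among them;
    draws: genes with a nearby peak; observed: DEGs with a nearby peak.
    """
    upper = min(successes, draws)
    tail = sum(comb(successes, k) * comb(population - successes, draws - k)
               for k in range(observed, upper + 1))
    return tail / comb(population, draws)


# Hypothetical example: two genes, two binding events.
windows = tss_windows([("G1", "chr1", 100_000), ("G2", "chr1", 500_000)])
hits = genes_with_peaks(windows, [("chr1", 110_000), ("chr2", 500_000)])
p_value = hypergeom_enrichment_p(population=10, successes=5, draws=5, observed=5)
```

A linear scan over peaks is used for clarity; for genome-wide data an interval index or a sorted sweep would be the more practical choice.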


Transcription factor motif analysis

De novo motif discovery was done using Weeder (Pavesi et al., 2004). The motif analysis for identifying enrichment of known motifs (Supplemental Fig. S1C and Fig. S2C) was performed using the program CLOVER (Frith et al., 2004), which compares sets of DNA sequences against the JASPAR CORE vertebrates Human collection of transcription factor-binding patterns (Sandelin et al., 2004) and calculates enrichment probabilities.


Chromosome-Conformational-Capture (3C) Assays

Briefly, cells from a 15cm² dish were fixed for 10 min with 1% formaldehyde, scraped, pelleted and washed twice with PBS. Cells were lysed in 0.5% SDS-Lysis buffer for 20 min, assisted by syringing with a 26G needle in order to efficiently remove the cytoplasm and isolate single nuclei. Nuclei were pelleted and subsequently digested with BamHI (2400 units) overnight. Two thirds of the digest was ligated with T4 DNA Ligase (3200 units) at 16°C for 6 hours and further digested with EcoRI (1000 units) for 2 hours. The remaining one third was used as a minus-ligase control. Subsequent proteinase K treatment was performed overnight at 65°C. Two rounds of phenol-chloroform extraction were performed, followed by one round of chloroform extraction. DNA was ethanol precipitated and eluted in a final volume of 120µl of H2O. All enzymes were from New England Biolabs (NEB).

Each interaction was normalized over the copy number, the minus-ligase value and lastly over the negative control interaction.



Primer sequences:

1: TTCAAAAGGGAACAGATAGCTCAGA, 2: CCTGTGTGAGTACTGCCCTGACT,

3: TCAAGCCTTTATCACCAAGTGATTC, 4: AGCAGGAGCTGTACTATCGGTAACA,

5: TTGTGAGCCTTAATCCTTTTTCCTC, 6: TTCTCTTGTCTGCTTTGCTATCAGG,

7: CTCTGTCTATCAGCAAATCCTTCCA, 8: AGAGAGGAGGCTGTGGAAGTTAGTG,

9: CTTCCCCAGAGCAATAAAGTGTGAT, 10: GGGGAGACCACATAAGCAATAAGAT,

11: GCGAGGAGAGTGAAGACTGTAAAA.

Copy number RP-11 Forward: ACCCCCTCTCCATAACTATGAGAA,

Reverse: GCTGTCTACGTCTACTCAATCCTTACTG


Supplemental References

Cairns, J.M., Dunning, M.J., Ritchie, M.E., Russell, R., and Lynch, A.G. (2008). BASH: a tool for managing BeadArray spatial artefacts. Bioinformatics 24, 2921-2922.

Dunning, M.J., Smith, M.L., Ritchie, M.E., and Tavare, S. (2007). beadarray: R classes and methods for Illumina bead-based data. Bioinformatics 23, 2183-2184.

Frith, M.C., Hansen, U., Spouge, J.L., and Weng, Z. (2004). Finding functional sequence elements by multiple local alignment. Nucleic Acids Res 32, 189-200.

Gentleman, R.C., Carey, V.J., Bates, D.M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., et al. (2004). Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5, R80.

Li, H., and Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589-595.

Pavesi, G., Mereghetti, P., Mauri, G., and Pesole, G. (2004). Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes. Nucleic Acids Res 32, W199-203.

Robinson, M.D., and Oshlack, A. (2010). A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol 11, R25.

Smyth, G.K. (2005). Limma: linear models for microarray data in R. In Bioinformatics and Computational Biology Solutions using R and Bioconductor, R. Gentleman, V. Carey, W. Huber, R. Irizarry, S. Dudoit, eds. (New York: Springer), pp. 397-420.

Subramanian, A., Tamayo, P., Mootha, V.K., Mukherjee, S., Ebert, B.L., Gillette, M.A., Paulovich, A., Pomeroy, S.L., Golub, T.R., Lander, E.S., et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545-15550.

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., et al. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137.