lncRNA Conference Call Minutes – 23rd August 2011





Present:

Jen Harrow, Mark Thomas, Charlie Steward, Sarah Grubb, Jose Gonzalez, Electra Tapanari, Ben Brown, Leonard Lipovich, Emily Wood, Roderic Guigo, Andrea Tanzer, Balazs Banfai, Jainab Khatun, Rory Johnson, Thomas Derrien, Tejaswini Mishra, Cedric Howald


Notes from last minutes

Last minutes were not distributed before the meeting


Analysis of lncRNA translation using machine learning (BB)

Toward understanding the expression patterns of lncRNA
Do lncRNA have a fundamentally different expression pattern across GENCODE 7 expression quantification samples than mRNA?
Can we predict the label “mRNA” or “lncRNA” from expression data alone?
Long RNA expression patterns are predictive of lnc/mRNA status
87% correct classification
Expression patterns are predictive of MS data
Wrong by ~1 peptide
Restricting to only GM12878 expression data improves fit
Expression data is predictive of binarized MS detectability
Covariate importance
Conclusion
Our models predict that the vast majority (95-99%) of lncRNA are non-coding and should not produce peptides given their expression patterns
Questions/Discussion regarding method used, absolute error, significance of H1 cells being the best predictor, stem cells, biclustering collaboration
LL and RG to send out citations



Mining MS data for lncRNA translation: the ectopic ORFeome and cryptic mRNA (GENCODE 7 update) (LL/EW)

GENCODE 7 lncRNA ORF in MS data: Overview
Happy to post Excel spreadsheet on wiki if requested
Nonsingleton peptides matching lncRNA ORF
Definition: nonsingleton peptide hit
Only a minority of lncRNA ORF MS matches were nonsingleton peptides
lncRNA transcripts of protein-coding transcriptional units (TU) - overview
  o No evidence of unique peptide to transcript association
  o Peptides are in-frame matches of the known gene’s protein in every case - examples
  o Visual examples: EMG1 locus, RHOF locus
  o JH commented that all would be revealed in the next presentation
lncRNA transcripts of bona fide lncRNA TU: 2 lncRNA genes with peptide MS?
  o Example: Pseudogene - AFG3L1P and summary
  o Example: Suspicious lncRNA - CIPLAFQRASK
GENCODE 3c lncRNA and non-GENCODE Jia et al 2010 lncRNA with nonsingleton peptide hits in GENCODE 7
Conclusions
Zero bona fide GENCODE 7 lncRNA TU with nonsingleton peptide support





Analysis of GENCODE 7 lncRNA using Mike Lin’s PhyloCSF (comparative) analysis (JH)

Method
429 transcripts with high PhyloCSF score separated into 2 lists: obvious QC errors with HGNC names and anonymous gene name errors
Obvious QC error examples
SULT1C2P1 (transcribed unprocessed pseudogene)
SRD5A2 (indel in first exon, unresolved GRC ticket)
NCRNA00238 (no believable conserved CDS in any variant, kept as lincRNA)
OST4 (ultra-conserved (rodents) 37aa translation)
Anonymous gene name examples
Transcribed unitary pseudogene: DB1L5P
Unitary pseudogene: PRORSD1P
Transcribed unprocessed pseudogene
Known antisense loci: AATK antisense
Antisense to CHD3
Example of annotation made into a coding locus
Will update GENCODE 7 file once analysis is complete
Questions/Discussion regarding recalculation of mass spec, mass spec method, candidates for mass spec validation, PhyloCSF scores/alignments, annotation overlap, ambiguous CCDS/ORF
Create list of candidates for mass spec validation
BB to email Mike regarding changing alignments
AT/TD to send JH list to check annotation overlap examples
AT to send list to BB of ambiguous CCDS/ORF


GENCODE lncRNA companion paper (RG/TD)

Draft by the end of the month
Would like to include some of the analysis presented (unless the data is going to be used elsewhere)
Discussion regarding what is included
Send current list of GENCODE 7 lncRNA to mailing list


For more detail regarding this call please refer to the presentations and/or WebEx recording available on the GENCODE wiki


Next Teleconference will be held during the GENCODE meeting on 15/16th September 2011 (date and time to be confirmed)