Nicolas Delhomme | 12 Oct 09:36 2012

Re: get synthetic exon dataset with easyRNASeq

Hi Meritxell,

I've CC'd the Bioc mailing list in case this is of interest to others.

On Oct 11, 2012, at 6:15 PM, Meritxell Oliva wrote:

> Hi Nicolas,
> I am an easyRNASeq "newbie" user.
> First of all, congratulations for the development of the pipeline: so far it's one of the best R libraries I
> have found to deal with RNASeq data, as it tries to tackle problematic issues such as unique read-exon
> count assignment and also wraps the normalization packages (DESeq, edgeR), so you get all you need in one
> go. Thanks!

Thanks, that's nice to hear. Let me know as well whenever you encounter problems or think of new features!

> I do have a question: I would like to create a non-redundant, synthetic exon dataset, using the Ensembl68
> gene models. From what I understand from the manual, when using easyRNASeq() if you summarize your counts
> by counts=gene,summarization=geneModels, this synthetic exon dataset is generated in order to create
> unique read-exon correspondences. This is what I do, and I store the object as RNASeq object, to preserve
> the genomic annotation. However, the annotation that I get if I apply the function genomicAnnotation()
> to this object, is the original one from Ensembl, with redundant exons shared between transcripts. I
> would like to get the synthetic exon dataset, to select unique coding regions for each gene transcript.
> How can I get this dataset? My ultimate goal is to perform gene expression differential analysis at gene,
> transcript and exon level. First one is solved, and I want to find the best way to perform the latter ones.

At the moment it's still a two-step process, but I plan on making that easier. You first need to run
easyRNASeq with the gene model summarization and ask for the "RNAseq" outputFormat:

rnaSeq <- easyRNASeq(counts=gene,summarization=geneModels,outputFormat="RNAseq",etc...)

This will give you an object of the class RNAseq that contains the gene model annotation, accessible
through geneModel(rnaSeq). That's a RangedData object containing the synthetic exons, although it is
still redundant for genes located on opposite strands. So if you're not using stranded RNAseq data, you
need to do some more filtering.
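
As a rough illustration of that extra filtering, here is a minimal base R sketch. It assumes the synthetic
exons have been turned into a plain data.frame (for example via as.data.frame(geneModel(rnaSeq))) with
columns named space, start, end and gene. Those column names, the toy values and the helper
overlaps_other_gene() are assumptions made for this example, not part of easyRNASeq, so check them against
your own object. The sketch simply drops every synthetic exon that overlaps one from a different gene on
the same chromosome, whatever the strand.

```r
# Toy stand-in for as.data.frame(geneModel(rnaSeq)).
# Column names (space, start, end, gene) are assumed for illustration.
synth <- data.frame(
  space = c("chr1", "chr1", "chr1", "chr2"),
  start = c(100L, 300L, 350L, 100L),
  end   = c(200L, 400L, 450L, 200L),
  gene  = c("geneA", "geneA", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns TRUE for each synthetic exon that overlaps
# a synthetic exon of a different gene on the same chromosome.
# Strand is deliberately ignored, as for unstranded RNA-Seq data.
overlaps_other_gene <- function(x) {
  vapply(seq_len(nrow(x)), function(i) {
    any(x$space == x$space[i] &
        x$gene  != x$gene[i]  &
        x$start <= x$end[i]   &
        x$end   >= x$start[i])
  }, logical(1))
}

# Flag the ambiguous synthetic exons and keep only the unambiguous ones.
ambiguous    <- overlaps_other_gene(synth)
unique_exons <- synth[!ambiguous, ]
print(unique_exons)
```

The pairwise comparison is quadratic, so it is only meant to show the idea. On a full Ensembl annotation
an interval-based approach such as findOverlaps from IRanges would be the more sensible choice. Dropping
whole exons is also the bluntest option: trimming them to their non-overlapping part would keep more
sequence, at the cost of more bookkeeping.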

> Can you help me?

Hope this helps; let me know if not.


> Thanks
> Meritxell Oliva
> PhD student 
> IBB (Biotechnology and Biomedicine Institute)
> Comparative and Functional Genomics group
> Campus Universitari - 08193 Bellaterra Cerdanyola del Vallès - Barcelona
