Nicolas Delhomme | 6 Mar 12:46 2013

Re: EasyRNASeq - gff file is not recognized

Dear Gabriela,

If you look at the vignette of the package:

vignette("easyRNASeq")

You'll see a short description of the format in section 4.4. More precisely, read the format description in
the "genomeIntervals" section on page 16, which describes what your gff3 file should look like. Given the error
message you get, either your gff file does not contain the ID key among the attributes (the ninth column) or
the ID key is incorrectly formatted.
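
To inspect the attributes column yourself, something along these lines may help. It is only a sketch in base R and does not reproduce the check that easyRNASeq performs internally. The helper name check_gff_ids and the three example records are made up for illustration. The 'geneA:1' ID merely imitates the 'gene:exon-number' format named in the error message.

```r
# Sketch only: check_gff_ids() is a hypothetical helper, not part of
# easyRNASeq or genomeIntervals, and it does not mirror their validation.
check_gff_ids <- function(gff_lines) {
  # Keep only records: drop empty lines and header/comment lines ('#')
  records <- gff_lines[nzchar(gff_lines) & !startsWith(gff_lines, "#")]
  # GFF columns are tab-separated; the attributes are the ninth column
  fields <- strsplit(records, "\t", fixed = TRUE)

  # Return the value of the ID key in the attributes, or NA if absent
  get_id <- function(f) {
    if (length(f) < 9) return(NA_character_)
    kv <- trimws(strsplit(f[9], ";", fixed = TRUE)[[1]])
    hit <- kv[startsWith(kv, "ID=")]
    if (length(hit) == 0) NA_character_ else sub("^ID=", "", hit[1])
  }

  ids <- vapply(fields, get_id, character(1))
  # The feature type (gene, exon, ...) is the third column
  types <- vapply(fields,
                  function(f) if (length(f) >= 3) f[3] else NA_character_,
                  character(1))
  data.frame(type = types, id = ids, has_id = !is.na(ids),
             stringsAsFactors = FALSE)
}

# Made-up records: the first exon has an ID key, the second does not
example_lines <- c(
  "##gff-version 3",
  "chr1\texample\texon\t100\t200\t.\t+\t.\tID=geneA:1;Parent=geneA",
  "chr1\texample\texon\t300\t400\t.\t+\t.\tParent=geneA"
)
res <- check_gff_ids(example_lines)
print(res)
```

On a real annotation you would pass readLines() of your gff file instead of example_lines, then look at the rows where has_id is FALSE and at a few of the id values for the exon records.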

HTH,

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme@...
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------

On Mar 6, 2013, at 11:39 AM, Maria Gabriela RL wrote:

> Dear Nico, 
> 
> Many thanks for your response. The gff3 file that I provided could now be read. However, a new error came up. It seems to me that there is something wrong with my gff file. Could you recommend something?
> 
> Again, many thanks for your help,
> 
> Gabriela
> 
> > genes_FGS1 <- easyRNASeq(filesDirectory="/projects/irg/grp_stich/personal_folders/Gabby/NGS_R2/cluster/write/EASYRNASeq/",
> +  gapped=F,
> + validity.check=TRUE,
> + chr.map=chr.map,
> + organism="custom",
> + annotationMethod="gff",
> + annotationFile="/projects/irg/grp_stich/personal_folders/Gabby/NGS_R2/cluster/write/ZmB73_5b_FGS.gff",
> + count="genes",
> + filenames=files,
> + summarization="geneModels",
> + outputFormat="RNAseq")
> Checking arguments...
> Fetching annotations...
> Read 994386 records
> Error in .getGffRange(organismName(obj), filename = filename, ignoreWarnings = ignoreWarnings,  :
>   You gff file misses the ID key defining the exon ID in the gff attributes. The format should be 'gene:exon-number'.
> 
> 
> 
> 
> On Wed, Mar 6, 2013 at 11:09 AM, Nicolas Delhomme <delhomme@...> wrote:
> Dear Gabriela,
> 
> Given that error:
> 
> > Your file: /projects/ZmB73_5b_FGS.gff3 does not contain a gff header: '##gff-version 3' as first line. Is that really a gff3 file?
> 
> 
> your gff3 appears not to contain a header.
> 
> Add the following line:
> 
> ##gff-version 3
> 
> to the beginning of your gff3 file and that should solve the problem.
> 
> Cheers,
> 
> Nico
> 
> ---------------------------------------------------------------
> Nicolas Delhomme
> 
> Genome Biology Computational Support
> 
> European Molecular Biology Laboratory
> 
> Tel: +49 6221 387 8310
> Email: nicolas.delhomme@...
> Meyerhofstrasse 1 - Postfach 10.2209
> 69102 Heidelberg, Germany
> ---------------------------------------------------------------
> 
> 
> 
> 
> 
> On 6 Mar 2013, at 10:57, Gabriela [guest] wrote:
> 
> >
> > Hello,
> >
> > I am trying to generate a table of gene counts to use later with DESeq. However, I got an error message saying that the maize gff file that I am using is wrong. I downloaded this file directly from the Ensembl Plants website.
> >
> > I should mention that I tried both a .gff file and a .gff3 file, and I get the same issue with both. Any hint on how to solve my problem?
> >
> > Many thanks for your help in advance,
> >
> > Gabriela
> >
> > -- output of sessionInfo():
> >
> >> genes_FGS1 <- easyRNASeq(filesDirectory="/projects/EASYRNASeq/",
> > +  gapped=F,
> > + validity.check=TRUE,
> > + chr.map=chr.map,
> > + organism="custom",
> > + annotationMethod="gff",
> > + annotationFile="/projects/ZmB73_5b_FGS.gff3",
> > + count="genes",
> > + filenames=files,
> > + summarization="geneModels",
> > + outputFormat="RNAseq")
> > Checking arguments...
> > Fetching annotations...
> > Error in .readGffGtf(filename = filename, ignoreWarnings = ignoreWarnings,  :
> >
> >>
> >
> >
> >
> >
> >
> >
> >
> >
> >> sessionInfo()
> > R version 2.15.2 (2012-10-26)
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> > [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=C                 LC_NAME=C
> > [9] LC_ADDRESS=C               LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] grid      parallel  stats     graphics  grDevices utils     datasets
> > [8] methods   base
> >
> > other attached packages:
> > [1] VennDiagram_1.5.1      easyRNASeq_1.4.2       ShortRead_1.16.1
> > [4] latticeExtra_0.6-24    RColorBrewer_1.0-5     BSgenome_1.26.1
> > [7] biomaRt_2.14.0         genomeIntervals_1.14.0 intervals_0.13.3
> > [10] Rsamtools_1.10.1       Biostrings_2.26.2      GenomicRanges_1.10.4
> > [13] IRanges_1.16.4         edgeR_3.0.2            limma_3.14.1
> > [16] pasilla_0.2.13         DESeq_1.10.1           lattice_0.20-10
> > [19] locfit_1.5-8           DEXSeq_1.2.1           Biobase_2.18.0
> > [22] BiocGenerics_0.4.0     pasillaBamSubset_0.0.2
> >
> > loaded via a namespace (and not attached):
> > [1] annotate_1.34.1      AnnotationDbi_1.18.1 bitops_1.0-4.2
> > [4] DBI_0.2-5            genefilter_1.38.0    geneplotter_1.34.0
> > [7] hwriter_1.3          plyr_1.7.1           RCurl_1.91-1
> > [10] RSQLite_0.11.1       splines_2.15.2       statmod_1.4.15
> > [13] stats4_2.15.2        stringr_0.6.1        survival_2.36-14
> > [16] tools_2.15.2         XML_3.9-4            xtable_1.7-0
> > [19] zlibbioc_1.4.0
> >
> >
> > --
> > Sent via the guest posting facility at bioconductor.org.
> 
> 

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

