15 May 19:46
Re: GSEABase error in parsing msigdb_v2.5.xml
From: Martin Morgan <mtmorgan@...>
Subject: Re: GSEABase error in parsing msigdb_v2.5.xml
Newsgroups: gmane.science.biology.informatics.conductor
Date: 2008-05-15 17:46:26 GMT
Thanks Vladimir for the report, more below...

"Vladimir Morozov" <vmorozov@...> writes:

> Hi,
>
> I get error reading the last version of Broad msigdb . Is it supposed
> to work?
>
>> gss <- getBroadSets('/data/PathDB/msigdb_v2.5.xml')
> Error: 'getBroadSets' failed to create gene sets:
>   invalid BroadCollection category: 'c5'

The Broad added a category; I've updated GSEABase in both the devel and
release branches. The update should be available with biocLite after 12
noon Friday; look for GSEABase 1.2.1 in the release.

One aspect that is a little unsatisfactory is that the subcategories
(CC/BP/MF for c5, for instance) are not encoded in the XML, and so are
not present in the gene sets.

Martin

>> traceback()
> 6: stop("'getBroadSets' failed to create gene sets:\n ",
>        conditionMessage(err),
>        call. = FALSE)
> 5: value[[3]](cond)
> 4: tryCatchOne(expr, names, parentenv, handlers[[1]])
> 3: tryCatchList(expr, classes, parentenv, handlers)
> 2: tryCatch({
>        geneSets <- unlist(mapply(.fromXML, uri, "//GENESET", factories,
>            SIMPLIFY = FALSE, USE.NAMES = FALSE))
>    }, error = function(err) {
>        stop("'getBroadSets' failed to create gene sets:\n ",
>            conditionMessage(err),
>            call. = FALSE)
>    })
> 1: getBroadSets("/data/PathDB/msigdb_v2.5.xml")
>> packageDescription('GSEABase')
> Package: GSEABase
> Type: Package
> Title: Gene set enrichment data structures and methods
> Version: 1.2.0
> Author: Martin Morgan, Seth Falcon, Robert Gentleman
> Maintainer: Biocore Team c/o BioC user list
>     <bioconductor@...>
> Description: This package provides classes and methods to support Gene
>     Set Enrichment Analysis (GSEA).
> License: Artistic-2.0
> Depends: R (>= 2.6.0), methods, AnnotationDbi, Biobase, annotate
> Suggests: Ruuid, hgu95av2.db, GO.db, org.Hs.eg.db
> Imports: methods, XML, graph
> LazyLoad: yes
> biocViews: Infrastructure, Statistics
> Collate: utilities.R AAA.R AllClasses.R AllGenerics.R getObjects.R
>     methods-CollectionType.R methods-ExpressionSet.R
>     methods-GeneColorSet.R methods-GeneIdentifierType.R
>     methods-GeneSet.R methods-GeneSetCollection.R
>     methods-OBOCollection.R zzz.R
> Packaged: Wed Apr 30 02:43:40 2008; biocbuild
> Built: R 2.7.0; ; 2008-05-14 16:18:51; unix
>
> -- File: /usr/local/lib64/R/library/GSEABase/Meta/package.rds
>
>
> Although
> getBroadSets('/data/PathDB/msigdb_v2.1.xml')
> works. I don't see obvious signs of corruption in the 2.5.xml
> [rstats:GeneLogic070523] head -n 2 /data/PathDB/*.xml
> ==> /data/PathDB/msigdb_v2.1.xml <==
> <?xml version="1.0" encoding="UTF-8"?>
>
>
> ==> /data/PathDB/msigdb_v2.5.xml <==
> <?xml version="1.0" encoding="UTF-8"?>
>
> tail -n 2 /data/PathDB/*.xml
> ==> /data/PathDB/msigdb_v2.1.xml <==
> <GENESET STANDARD_NAME="GNF2_ZAP70" SYSTEMATIC_NAME="c4:526"
> ORGANISM="Human" CHIP="GENE_SYMBOL" CATEGORY_CODE="c4"
> CONTRIBUTOR="Broad Institute" CONTRIBUTOR_ORG="Broad Institute"
> DESCRIPTION_BRIEF="Neighborhood of ZAP70" DESCRIPTION_FULL="Neighborhood
> of ZAP70 zeta-chain (TCR) associated protein kinase 70kDa in the GNF2
> expression compendium" TAGS=""
> MEMBERS="ZAP70,PTPN4,UNC84B,TUSC4,CTSW,RARRES3,BTN3A2,NKG7,PRKCH,KLRK1,BTN3A3,MYBL1,GZMA,ARL4C,SH2D1A,TXK,CD7,RORA,CD247,IL18RAP,CD96,RASGRP1,GZMM,TRD@,MATK,ITGAL,KLRB1"
> MEMBERS_SYMBOLIZED="ZAP70,PTPN4,UNC84B,TUSC4,CTSW,RARRES3,BTN3A2,NKG7,PRKCH,KLRK1,BTN3A3,MYBL1,GZMA,ARL4C,SH2D1A,TXK,CD7,RORA,CD247,IL18RAP,CD96,RASGRP1,GZMM,TRD@,MATK,ITGAL,KLRB1"/>
> </MSIGDB>
>
> ==> /data/PathDB/msigdb_v2.5.xml <==
> <GENESET
> STANDARD_NAME="INOSITOL_OR_PHOSPHATIDYLINOSITOL_KINASE_ACTIVITY"
> SYSTEMATIC_NAME="c5:1203" ORGANISM="Homo sapiens" AUTHORS="Ashburner M,
> Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski
> K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A,
> Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G."
> EXTERNAL_DETAILS_URL="http://amigo.geneontology.org/cgi-bin/amigo/go.cgi?view=details&search_constraint=terms&depth=0&query=GO:0004428"
> CHIP="GENE_SYMBOL" CATEGORY_CODE="c5" CONTRIBUTOR="Gene Ontology"
> CONTRIBUTOR_ORG="Gene Ontology" DESCRIPTION_BRIEF="Genes annotated by
> the GO term GO:0004428. Catalysis of the phosphorylation of myo-inositol
> (1,2,3,5/4,6-cyclohexanehexol) or a phosphatidylinositol."
> DESCRIPTION_FULL="" TAGS="Molecular function"
> MEMBERS="FXN,SMG1,PIP4K2B,PIP5K3,ATM,PIK3C2A,PIK3C3,PIK3CA,PIK3CB,PIK3CG,PIK3R2,PIK3R3,IPPK,PI4KA,PI4KB,PI4K2A,ITPKA,ITPKB"
> MEMBERS_SYMBOLIZED="FXN,SMG1,PIP4K2B,PIP5K3,ATM,PIK3C2A,PIK3C3,PIK3CA,PIK3CB,PIK3CG,PIK3R2,PIK3R3,IPPK,PI4KA,PI4KB,PI4K2A,ITPKA,ITPKB"/>
> </MSIGDB>
>
>
>
> Best
> Vlad
>
>
>
> Vladimir Morozov
>
> ALS Therapy Development Institute
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
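
The error in this thread came down to a CATEGORY_CODE value ('c5') that the installed GSEABase did not recognise. As a rough sketch in the same spirit as the head/tail checks quoted above, the following shell snippet tallies the category codes present in an MSigDB XML file, which makes a newly added category easy to spot. The function name list_categories and the two-record sample file are hypothetical stand-ins, abbreviated from the records quoted in the message. The snippet assumes each attribute is written as CATEGORY_CODE="..." on a single line, and that grep supports -o (GNU and BSD grep do).

```shell
#!/bin/sh
# Count how often each CATEGORY_CODE value occurs in an MSigDB XML file.
# Assumption: attributes appear as CATEGORY_CODE="..." and are not split
# across lines.
list_categories() {
    grep -o 'CATEGORY_CODE="[^"]*"' "$1" | cut -d'"' -f2 | sort | uniq -c
}

# Hypothetical stand-in for the real file, abbreviated from the quoted records.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
<MSIGDB>
<GENESET STANDARD_NAME="GNF2_ZAP70" CATEGORY_CODE="c4" TAGS=""/>
<GENESET STANDARD_NAME="INOSITOL_OR_PHOSPHATIDYLINOSITOL_KINASE_ACTIVITY" CATEGORY_CODE="c5" TAGS="Molecular function"/>
</MSIGDB>
EOF

# Prints one line per category code, each preceded by its count.
list_categories "$sample"
```

Run against the real files, one would call list_categories /data/PathDB/msigdb_v2.5.xml and compare the result with the same call on msigdb_v2.1.xml; any code present only in the newer file is a candidate cause of an "invalid BroadCollection category" error.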