12 Jan 2013 00:10
Re: xps: hugene11 chip gives problems
Dear Philip,

I am glad to hear that using 'celnames' could solve your problem.

It is interesting to hear that you have never had problems with names of CEL-files. Personally I prefer to change the names, especially the names of the CEL-files from GEO which are simply numbers with a prefix.

Have a nice weekend, too.

Christian

On 1/11/13 10:34 PM, Groot, Philip de wrote:
> Dear Christian,
>
> Thank you very much! I was thinking that it must have been something in the CEL-file itself, but it turns out to be the filename! I'll adapt the script on our production server to fix the issue. I have to mention that we have been using xps for quite some years now. We never encountered this issue before!
>
> I worked through your recommendations from yesterday. I could indeed properly load the affymetrix sample data. And changing the location of the root-scheme did not fix the issue either! Fortunately, we do understand this now!
>
> And you are right: if xps is updated, I need to recreate the schemes too. This needs only to be done once every 6 months (usually) and is not a big problem. And it also forces me to check the Affymetrix site for updated annotations etc. I just feel more comfortable if the schemes are created by the current running version of xps.
>
> Have a nice weekend.
>
> Regards,
>
>
> Dr. Philip de Groot Ph.D.
> Bioinformatics Researcher
>
> Wageningen University / TIFN
> Nutrigenomics Consortium
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> PO Box 8129, 6700 EV Wageningen
> Visiting Address: Erfelijkheidsleer: De Valk, Building 304
> Dreijenweg 2, 6703 HA Wageningen
> Room: 0052a
> T: +31-317-485786
> F: +31-317-483342
> E-mail: Philip.deGroot@...
> Internet: http://www.nutrigenomicsconsortium.nl
> http://humannutrition.wur.nl/
> https://madmax.bioinformatics.nl/
> ________________________________________
> From: cstrato [cstrato@...]
> Sent: 11 January 2013 21:05
> To: Groot, Philip de
> Cc: bioconductor@...
> Subject: Re: [BioC] xps: hugene11 chip gives problems
>
> Dear Philip,
>
> Meanwhile I did another test and renamed my CEL-files to mimic your
> names. This is what I get:
>
> > celfiles <- c("Brain_01_1.1.CEL","Prostate_01_1.1.CEL")
> > data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr",
>     filedir=datdir, celdir=celdir, celfiles=celfiles)
> Opening file </Volumes/MitziData/CRAN/Workspaces/hugene11/na33/hugene11stv1.root> in <READ> mode...
> Creating new temporary file </Volumes/MitziData/CRAN/Workspaces/hugene11/tmp_HuBrPr_cel.root>...
> Importing </Volumes/MitziData/CRAN/Workspaces/hugene11/celtest/Brain_01_1.1.CEL> as <Brain_01_1.1.cel>...
>    hybridization statistics:
>       1 cells with minimal intensity 17.5
>       1 cells with maximal intensity 22402.1
> New dataset <DataSet> is added to Content...
> Importing </Volumes/MitziData/CRAN/Workspaces/hugene11/celtest/Prostate_01_1.1.CEL> as <Prostate_01_1.1.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 14.5
>       1 cells with maximal intensity 23266.3
> > for (i in 1:length(rawCELName(data.genome11, fullpath = FALSE)))
> + cat(sprintf("%s\n", rawCELName(data.genome11, fullpath = FALSE)[i]))
> Error: Tree set <> could not be found in file content
> Error: Tree set <> could not be found in file content
>
>
> As you can see I can now replicate your error.
>
> The solution is simple, i.e. use parameter 'celnames'. Now the result is:
>
> > celfiles <- c("Brain_01_1.1.CEL","Prostate_01_1.1.CEL")
> > celnames <- c("Brain01","Prostate01")
> > data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr",
>     filedir=datdir, celdir=celdir, celfiles=celfiles, celnames=celnames)
> Opening file </Volumes/MitziData/CRAN/Workspaces/hugene11/na33/hugene11stv1.root> in <READ> mode...
> Creating new temporary file </Volumes/MitziData/CRAN/Workspaces/hugene11/tmp_HuBrPr_cel.root>...
> Importing </Volumes/MitziData/CRAN/Workspaces/hugene11/celtest/Brain_01_1.1.CEL> as <Brain01.cel>...
>    hybridization statistics:
>       1 cells with minimal intensity 17.5
>       1 cells with maximal intensity 22402.1
> New dataset <DataSet> is added to Content...
> Importing </Volumes/MitziData/CRAN/Workspaces/hugene11/celtest/Prostate_01_1.1.CEL> as <Prostate01.cel>...
>    hybridization statistics:
>       2 cells with minimal intensity 14.5
>       1 cells with maximal intensity 23266.3
> > for (i in 1:length(rawCELName(data.genome11, fullpath = FALSE)))
> + cat(sprintf("%s\n", rawCELName(data.genome11, fullpath = FALSE)[i]))
> Brain_01_1.1.CEL
> Prostate_01_1.1.CEL
>
> As you can see, now everything works fine. The reason for introducing
> parameter 'celnames' was from the beginning to allow alternative names
> w/o the need to change the names of the original CEL-files, since often
> CEL-files had names such as 'Breast_tissue;24/08/1999;batch-1,lot-2.1.CEL'.
>
> I hope that using parameter 'celnames' does solve your problem.
>
> Best regards,
> Christian
>
>
> On 1/10/13 9:10 PM, cstrato wrote:
>> Dear Philip,
>>
>> I have just tried a subset of CEL-files from the Affymetrix
>> "gene_1_1_st_ap_tissue_sample_data" for HuGene_1.1 array, but I cannot
>> repeat the error you get. Here is my output for one CEL-file only:
>>
>> > library(xps)
>>
>> Welcome to xps version 1.19.1
>>    an R wrapper for XPS - eXpression Profiling System
>>    (c) Copyright 2001-2013 by Christian Stratowa
>>
>> > scheme <- root.scheme("./na33/hugene11stv1.root")
>> > x.xps <- import.data(scheme, "tmp_x", celdir = "./cel", celfiles =
>>     "HumanBrain_1.CEL", verbose = TRUE)
>> Opening file <./na33/hugene11stv1.root> in <READ> mode...
>> Creating new temporary file </Volumes/MitziData/CRAN/Workspaces/hugene11/tmp_x_cel.root>...
>> Importing <./cel/HumanBrain_1.CEL> as <HumanBrain_1.cel>...
>>    hybridization statistics:
>>       1 cells with minimal intensity 17.5
>>       1 cells with maximal intensity 22402.1
>> New dataset <DataSet> is added to Content...
>> > cat("The loaded .CEL-files are:\n");
>> The loaded .CEL-files are:
>> > for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
>> + cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));
>> HumanBrain_1.CEL
>> >
>> > sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] xps_1.19.1
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.15.0
>> >
>>
>>
>> As you see everything is ok. I did also run the triplicates of the Brain
>> and Prostate samples and could do RMA w/o problems.
>>
>> Could you please try the following two options:
>>
>> 1, Could you try to use the CEL-files from the Affymetrix dataset to
>> make sure that there is no problem with the CEL-files.
>>
>> 2, I see that you did create the ROOT scheme files in directory:
>> scmdir <- paste(.path.package("xps"), "schemes/", sep = "/")
>>
>> I must admit that I have never tried to store the scheme files in the
>> package directory, since I have the feeling that this may cause
>> troubles, especially when you update R and/or the xps package to a new
>> version.
>> Could you please try to save your file "hugene11stv1.root" in a
>> different directory such as '/home/degroot/schemes' or better to create
>> this file in this directory, and then try if you still get the problem.
>>
>> Best regards,
>> Christian
>>
>>
>> On 1/10/13 1:03 PM, Groot, Philip de wrote:
>>> Hi Christian,
>>>
>>> I am trying to do an analysis using xps and the hugene11 chip. However,
>>> I run into problems for which I need your help.
>>>
>>> I created a small test-script to demonstrate the problem:
>>>
>>> library(xps)
>>> scheme <- root.scheme("/local2/R-2.15.2/library/xps/schemes/hugene11stv1.root")
>>> x.xps <- import.data(scheme, "tmp_x", celdir = ".", celfiles =
>>>     "G092_A05_01_1.1.CEL", verbose = TRUE)
>>> cat("The loaded .CEL-files are:\n");
>>> for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
>>>     cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));
>>>
>>> Upon execution, I get:
>>>
>>> > library(xps)
>>>
>>> Welcome to xps version 1.18.1
>>>    an R wrapper for XPS - eXpression Profiling System
>>>    (c) Copyright 2001-2012 by Christian Stratowa
>>>
>>> > scheme <- root.scheme("/local2/R-2.15.2/library/xps/schemes/hugene11stv1.root")
>>> > x.xps <- import.data(scheme, "tmp_x", celdir = ".", celfiles =
>>>     "G092_A05_01_1.1.CEL", verbose = TRUE)
>>> Opening file </local2/R-2.15.2/library/xps/schemes/hugene11stv1.root> in <READ> mode...
>>> Creating new temporary file </mnt/geninf16/home/guests/pdegroot/dataanalysis/PHILIPG/tmp_x_cel.root>...
>>> Importing <./G092_A05_01_1.1.CEL> as <G092_A05_01_1.1.cel>...
>>>    hybridization statistics:
>>>       1 cells with minimal intensity 19
>>>       1 cells with maximal intensity 21364.4
>>> New dataset <DataSet> is added to Content...
>>> >
>>> > cat("The loaded .CEL-files are:\n");
>>> The loaded .CEL-files are:
>>> > for (i in 1:length(rawCELName(x.xps, fullpath = FALSE)))
>>> + cat(sprintf("%s\n", rawCELName(x.xps, fullpath = FALSE)[i]));
>>> Error: Tree set <> could not be found in file content
>>> Error: Tree set <> could not be found in file content
>>> NA
>>>
>>> The weird thing is: I only have this problem with the hugene11 chip. As
>>> far as I can see, all other chips work properly (still na32 based).
>>>
>>> This affects all other steps, because there is no “content” to normalise
>>> etc.
>>>
>>> I created the root-scheme as follows:
>>>
>>> scmdir <- paste(.path.package("xps"), "schemes/", sep = "/")
>>>
>>> scheme <- import.exon.scheme("hugene11stv1",filedir=scmdir,
>>>     layoutfile=paste(libdir, "HuGene-1_1-st-v1.r4.clf", sep="/"),
>>>     schemefile=paste(libdir,"HuGene-1_1-st-v1.r4.pgf", sep="/"),
>>>     probeset=paste(anndir,"HuGene-1_1-st-v1.na33.1.hg19.probeset.csv", sep="/"),
>>>     transcript=paste(anndir,"HuGene-1_1-st-v1.na33.1.hg19.transcript.csv", sep="/"),
>>>     add.mask = TRUE)
>>>
>>> (libdir and anndir are also defined of course).
>>>
>>> I even updated the na32 annotation to the latest Affymetrix version
>>> (na33) to exclude a problem there. It does not fix the issue.
>>>
>>> Please note that I am running root version 5.32/04 as version 5.32/01 is
>>> no longer available for download. Root works properly as far as I can
>>> see.
>>>
>>> Do you have any clue where this problem originates from? Thank you!
>>>
>>> sessionInfo():
>>>
>>> > sessionInfo()
>>> R version 2.15.2 (2012-10-26)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=C LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] xps_1.18.1
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.15.2
>>>
>>> Regards,
>>>
>>> *Dr. Philip de Groot
>>> Bioinformatician / Microarray analysis expert*
>>>
>>> Wageningen University / TIFN
>>> Netherlands Nutrigenomics Center (NNC)
>>>
>>> Nutrition, Metabolism & Genomics Group
>>> Division of Human Nutrition
>>> PO Box 8129, 6700 EV Wageningen
>>> Visiting Address:
>>> "De Valk" ("Erfelijkheidsleer"),
>>> Building 304,
>>> Verbindingsweg 4, 6703 HC Wageningen
>>> Room: 0052a
>>> T: 0317 485786
>>> F: 0317 483342
>>> E-mail: Philip.deGroot@... <mailto:Philip.deGroot@...>
>>> I: http://humannutrition.wur.nl <http://humannutrition.wur.nl/>
>>> https://madmax.bioinformatics.nl
>>> http://www.nutrigenomicsconsortium.nl
>>> <http://www.nutrigenomicsconsortium.nl/>
>>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@...
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
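
Note on the 'celnames' workaround discussed in this thread: the names could also be derived from the CEL-file names instead of being typed by hand. The following is only a minimal sketch in base R. It assumes that the extra dot in names such as Brain_01_1.1.CEL is what triggers the "Tree set <> could not be found" error, which the renamed test files suggest but the thread does not state outright. `make_celnames` is a hypothetical helper and not part of xps; the `import.data` call is left commented out because it needs xps and an existing scheme.

```r
# Hypothetical helper (not part of xps): derive plain names for the
# 'celnames' parameter from a vector of CEL-file names.
make_celnames <- function(celfiles) {
  # Drop any directory part and the .CEL extension (case-insensitive).
  base <- sub("\\.cel$", "", basename(celfiles), ignore.case = TRUE)
  # Assumption: only letters, digits and underscores are safe, so every
  # other character (dots, semicolons, slashes, ...) becomes "_".
  clean <- gsub("[^A-Za-z0-9_]", "_", base)
  # Avoid two CEL-files ending up with the same name.
  make.unique(clean, sep = "_")
}

celfiles <- c("Brain_01_1.1.CEL", "Prostate_01_1.1.CEL")
celnames <- make_celnames(celfiles)
print(celnames)

# Usage as in the thread (requires xps, a scheme and the CEL-files):
# data.genome11 <- import.data(scheme.hugene11, "tmp_HuBrPr",
#     filedir = datdir, celdir = celdir,
#     celfiles = celfiles, celnames = celnames)
```

The original CEL-files stay untouched, which is the stated purpose of 'celnames'; only the names used inside the ROOT file change.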