2 Sep 15:41
Re: question regarding removing probes/probe sets before normalization
Jenny, Thanks for your email. The reason you provided makes perfect sense. However, I've looked into this and found that it may not be the root cause of the error. I tried just masking 25 probes and 1 probe set (U133A array) as below: >maskedprobes [1] "222152_at1" "222152_at2" "222152_at3" "222152_at4" "222152_at5" "222152_at6" [7] "222152_at7" "222152_at8" "222152_at9" "222152_at10" "222152_at11" "204659_s_at1" [13] "204659_s_at2" "204659_s_at3" "204659_s_at4" "204659_s_at5" "204659_s_at6" "204659_s_at7" [19] "204659_s_at8" "204659_s_at9" "204659_s_at10" "204659_s_at11" "210893_at1" "210893_at2" [25] "210893_at3" > maskedprobeSets [1] "222152_at" Presumably it should run into error since all the probes within the probe set have been taken out. However, the code still runs through and the dimension works out correctly. A <- ReadAffy() B <- justMAS(A,tgt = 500,scale = TRUE) > dim(pm(A)) [1] 247940 54 # the dimension without masking is 247965, which is equal to 247940 + 25 > dim(B) Features Samples 22281 54 # the dimension without masking is 22283, which is equal to 22281 + 2 (the codes knows that all the probes in 204659_s_at have been taken out, so the total # of masked probe sets is 2). In addition, for one of the dataset which the code does not run into problem, I found that there are ~ 4000 probe sets having this kind of problem: all the probes within the probe sets have been taken out. Yet it still runs through and the dimension works out correctly. Sorry for bothering you so much on this, I really appreciate it. Best, -Jack On Wed, Sep 1, 2010 at 4:57 PM, Jenny Drnevich <drnevich@...>wrote: > Hi Jack, > > There shouldn't be a limit on the number of probes/probesets that you can > remove. However, based on the error message, my guess is that there's some > redundancy between your list of probe sets to remove and probes to remove. > Probably you have all the probes of a probe set in your probe list, and that > probe set in your probe set list as well. The code first removes all the > probes, so when it goes to remove that probe set, it won't find anything and > could throw an error. I don't know for sure that this is your problem, but > I've seen it before. > > Jenny > > > At 02:54 PM 9/1/2010, Jack Luo wrote: > > Hi Jenny, > > I run the code through 3 different datasets (each consists of 50-60 U133a > cel files), the first 2 datasets run pretty smoothly, the last one has the > error msg: > Error in ans[[i]][, i.probes] : incorrect number of dimensions > > Not sure what's causing the problem, but the last one has the largest > number of masked probes/probesets: ~179k masked probes (out of 247965 in > total) and ~15k masked probe sets (out of 22283 in total). Does the code > have a limit on the number of masked probe/probesets? or there are certain > probes/probe sets that can not be filtered? > > Thanks a lot, > > -Jack > > > On Tue, Aug 31, 2010 at 5:57 PM, Jenny Drnevich <drnevich@...> > wrote: > Hi Jack, > > It's best to also post your question to the BioC list. First of all, you > link was to the September 2006 code, but there was an update to the code in > September 2008 (I think it still should work): > > https://stat.ethz.ch/pipermail/bioconductor/2008-September/024296.html > > > Did you see this more recent post on the BioC list about removing > individual probes? The individual probe names have to be in the correct > format, and your's aren't: > https://stat.ethz.ch/pipermail/bioconductor/2010-January/031463.html > > > HTH, > Jenny > > > At 11:46 AM 8/31/2010, you wrote: > > Hi Jenny, > > I am a BioC user and find the following link: > https://stat.ethz.ch/pipermail/bioconductor/2006-September/014242.html > > which is related to removing probes/probe sets from the cdf. It's highly > related to my work, which needs to perform normalization after filtering out > some probe pairs whose PM is not significantly higher than MM. > > I played around with the code in the above link by following the > instruction, it works perfectly when some given probe sets need to be taken > out, but have some problem when some probes (not whole probe sets) need to > be taken out. I am wondering do you have an example for taking some probes > as well as probe sets? I have not figured out what's the correct input for > probes. > > I tried the following inputs and neither of them works. > > maskedprobeSets = rownames(exprs(justRMA(filenames = celfiles)))[1:100] # > this format works > > maskedprobes = rownames(pm(AffyBatch))[1:500] # doesn't work > maskedprobes = as.character(1:500) # doesn't work > > RemoveProbes(listOutProbes=maskedprobes, listOutProbeSets=maskedprobeSets, > cleancdf) > > > Thanks a lot! Your help will be highly appreciated. > > -Jack > > > [[alternative HTML version deleted]] _______________________________________________ Bioconductor mailing list Bioconductor@... https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
RSS Feed