3 Feb 2013 12:24
Re: Log transformation and left censoring
Hi Paul,

given your description, one possibility to explore might be a variance-stabilising transformation. E.g. DESeq provides one that smoothly interpolates between the square-root function for low counts and the log transformation for higher counts; see Section 6 (and 7) of the vignette.

Best wishes
Wolfgang

On Jan 31, 2013, at 8:57 AM, Paul Harrison <Paul.Harrison@...> wrote:

> Hello,
>
> We have been using voom and limma for some time now, and while we're
> fairly happy with it, it seems to produce significance levels that are
> on the conservative side. We also use edgeR to produce more optimistic
> results, but don't entirely trust the significance levels that it
> reports. I am looking for something in between these extremes, and
> want to run an idea past this list as a sanity check. I would
> especially value Gordon and Charity's comments if they have time.
>
> The voom log transformation is essentially:
>
>   log2( (count+0.5) / library.size )
>
> It then does some clever things with weights. What I'm considering instead is
>
>   log2( count / library.size + moderation.amount / mean.library.size )
>
> where moderation.amount is much larger than 0.5, say 5. A couple of things here:
>
> - Instead of down-weighting low counts, I'm trying to get rid of the
>   extra variation from low counts by artificially left censoring the
>   data.
>
> - I'm using the mean of the library sizes because I want the left
>   censor to be in the same place for each sample even if the library
>   sizes are different, so that if a gene is entirely switched off in one
>   condition it won't look variable just because there is a different
>   left censor in each sample.
>
> I'm also using this transformation to create heatmaps.
>
> This seems to be working with the data set I am working with: I get
> more significant results and they seem reasonable by eye. It seems to
> me that even if this approach isn't ideal it should at least be safe;
> at worst it will cause limma to reduce the df.prior and produce less
> significant results. Anything I've missed?
>
> --
> Paul Harrison
>
> Victorian Bioinformatics Consortium / Monash University
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
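
For readers who want to see the effect of the two formulas quoted above, here is a minimal sketch in plain Python. It only mirrors the two expressions as they are written in the thread; it is not voom's actual implementation and uses no limma or DESeq code. The function names, the two library sizes, and the zero-count gene are all made up for illustration.

```python
import math


def voom_like_log(count, library_size):
    # log2( (count+0.5) / library.size ), as written in the thread.
    return math.log2((count + 0.5) / library_size)


def moderated_log(count, library_size, mean_library_size, moderation_amount=5.0):
    # log2( count / library.size + moderation.amount / mean.library.size ),
    # the proposed alternative; moderation_amount=5 is the value suggested there.
    return math.log2(count / library_size + moderation_amount / mean_library_size)


# Hypothetical library sizes for two samples (not taken from the thread).
library_sizes = [1_000_000, 4_000_000]
mean_size = sum(library_sizes) / len(library_sizes)

# A gene that is entirely switched off: zero counts in both samples.
voom_zero = [voom_like_log(0, n) for n in library_sizes]
moderated_zero = [moderated_log(0, n, mean_size) for n in library_sizes]

# Spread of the transformed values across the two samples.
voom_spread = max(voom_zero) - min(voom_zero)
moderated_spread = max(moderated_zero) - min(moderated_zero)

print("spread with the (count+0.5) form:", voom_spread)
print("spread with the moderated form:  ", moderated_spread)
```

With the first form, the floor for a zero count depends on each sample's library size, so a silent gene still differs between samples by the log2 ratio of the library sizes. With the second form, the floor is moderation.amount / mean.library.size in every sample, which is the "same left censor" property described in the second bullet point.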