5 Oct 2012 09:41
EdgeR condition-specific dispersion
Dear Thomas,

It does make sense to estimate condition-specific dispersions, but most of the time it isn't worthwhile to do so, and the only penalty for not doing so when you could have is some loss of statistical power (fewer DE genes).

It makes sense when a perturbed condition is more variable than a 'normal' condition, for example cancer tumour vs normal tissue, or knockout vs wildtype. For it to be worthwhile, there must be a substantial difference in variability between the conditions and a relatively large number of replicate samples in each group. It is almost certainly not worthwhile if you only have 2-3 replicates in each condition.

I wonder how you have established that the dispersion varies with the combination of cues? By running edgeR separately on different conditions? Otherwise you might be examining standard deviations rather than dispersions, and they are not the same thing. Is the sequencing depth similar between the different conditions? If the library sizes are different, then edgeR will assign different variances to different observations, even though the dispersions might be the same.

Anyway, edgeR is limited to estimating the dispersion at the gene level. It cannot be easily modified to estimate the dispersion on a condition-specific basis. On the other hand, voom (a function in the limma package) estimates observation-specific dispersions, and can be easily modified to do so in a condition-specific manner. This is part of the work of Charity Law, who is currently writing up her PhD thesis.

If you really need to go in this direction, I can show you how to do so using voom.

Best wishes
Gordon

> Date: Tue, 2 Oct 2012 17:15:47 +0000
> From: Thomas Frederick Willems <twillems@...>
> To: "bioconductor@..." <bioconductor@...>
> Subject: [BioC] EdgeR condition-specific dispersion
>
> I'm dealing with a factorial RNA-seq data set in which cells have been
> stimulated with various combinations of extra-cellular cues. As such, I
> was interested in applying the GLM framework in edgeR to assess the
> contribution of each extra-cellular cue to the differential expression
> of certain genes. My concern, however, is that both the expression level
> and the dispersion of each gene vary greatly with the combination of
> cues. EdgeR doesn't seem to estimate condition-specific dispersion but
> rather one dispersion per gene (if the tagwise option is used). My
> question is therefore two-fold:
> 1) Does it make sense to want to estimate condition-specific
> dispersions?
> 2) Is there a way to modify the edgeR framework so that it does this?
>
> Thanks
> Thomas
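
The point in the reply about library sizes can be illustrated with a small numerical sketch. It assumes the usual negative binomial mean-variance relationship, variance = mu + dispersion * mu^2, where mu is the expected count (library size times the gene's relative abundance). The dispersion, abundance and library sizes below are hypothetical values chosen only for illustration, and the function names are not part of edgeR.

```python
# Sketch: one gene, one shared dispersion, two different sequencing depths.
# All numbers are hypothetical and only illustrate the relationship.

def expected_count(library_size, proportion):
    """Expected read count for a gene: library size times relative abundance."""
    return library_size * proportion


def nb_variance(mean, dispersion):
    """Negative binomial variance: Poisson part plus dispersion * mean^2."""
    return mean + dispersion * mean ** 2


dispersion = 0.1     # hypothetical dispersion shared by both conditions
proportion = 1e-5    # hypothetical fraction of all reads from this gene
library_sizes = {"condition_A": 5_000_000, "condition_B": 20_000_000}

results = {}
for name, size in library_sizes.items():
    mu = expected_count(size, proportion)
    var = nb_variance(mu, dispersion)
    # Squared coefficient of variation: 1/mu + dispersion
    results[name] = {"mean": mu, "variance": var, "cv2": var / mu ** 2}
    print(f"{name}: mean={mu:.1f} variance={var:.1f} sd={var ** 0.5:.1f}")
```

The two conditions end up with very different variances and standard deviations although the dispersion is identical, which is why raw standard deviations computed per condition are not evidence of condition-specific dispersion.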