5 Dec 00:35 2012

## edgeR: Differences in results between two different versions of edgeR

Dear Dorota,

The important settings are prior.df and trend.

prior.n and prior.df are related through prior.df = prior.n * residual.df, and your experiment has residual.df = 36 - 12 = 24. So the old setting of prior.n=10 is equivalent for your data to prior.df = 240, a very large value. Going the other way, the new setting of prior.df=10 is equivalent to prior.n=10/24.

To recover old results with the current software you would use

    estimateTagwiseDisp(object, prior.df=240, trend="none")

To get the new default from old software you would use

    estimateTagwiseDisp(object, prior.n=10/24, trend=TRUE)

Actually the old trend method is equivalent to trend="loess" in the new software. You should use plotBCV(object) to see whether a trend is required.

Note you could also use

    prior.n <- getPriorN(object, prior.df=10)

to map between prior.df and prior.n.

There has also been a change in the default behaviour of exactTest(). To make the new exactTest() behave like the old version, you would use

    exactTest(object, rejection.region="smallp")

The new default gives much more reliable results than the old when the dispersion is very large.

Best wishes
Gordon

> Date: Mon, 03 Dec 2012 19:36:58 +0100
> From: "Dorota Herman" <dorota.herman@...>
> To: Bioconductor mailing list <bioconductor@...>
> Subject: [BioC] edgeR:Differences in results between two different
> versions of edgeR
>
> Dear list,
>
> when I run the same code for RNA-seq data to find differentially
> expressed genes using exactTest() in two different versions of edgeR, I
> obtain considerably different results. The data set contains 36
> libraries divided into 12 groups, where each library consists of
> 24 000 genes (none of them has all zero counts). While the older version
> (edgeR_2.0.5) gives me 97 significantly differentially expressed genes
> between two selected groups, the newer version (edgeR_3.0.4) does not
> find any significantly differentially expressed genes; moreover, FDR is
> less than 1 for only 13 genes. I realize these two versions are far from
> each other in their development. However, I would still be
> interested in the reasons for such a difference.
>
> Running the same code in parallel in the two versions of edgeR, I
> found that the difference is most likely attributable to the
> estimateTagwiseDisp() function, whose signatures are
>
> estimateTagwiseDisp(object, prior.n=10, trend=FALSE, prop.used=NULL,
> tol=1e-06, grid=TRUE, grid.length=200, verbose=TRUE) in edgeR_2.0.5
>
> and
>
> estimateTagwiseDisp(object, prior.df=20, trend="movingave", span=NULL,
> method="grid", grid.length=11, grid.range=c(-6,6), tol=1e-06,
> verbose=FALSE) in edgeR_3.0.4
>
> The parameters prior.n and prior.df seem to have the greatest impact, as
> their settings say how much we want our tagwise dispersion to be
> influenced by a common dispersion. Although setting prior.df very low
> (that would be an equivalent of a high prior.n) makes a difference in
> FDR values, the results from the two edgeR versions are still very
> distinct, and so are the estimated $tagwise.dispersion parameters.
> Another candidate parameter seems to be prop.used, but I am not sure
> whether its equivalent in edgeR_3.0.4 is the "span" parameter. Is it?
> On the other hand, there are parameters related to the estimation
> algorithm that I would not expect to cause such a difference in the
> further outcome. Could they?
>
> What am I missing here? Which parameter settings would make the outcomes
> of DE gene analyses more comparable between the two edgeR
> versions?
>
> Best wishes
> Dorota
>
>
> ==================================================================
> Dorota Herman, PhD
> VIB Department of Plant Systems Biology, Ghent University
> Technologiepark 927
> 9052 Gent, Belgium
> Tel: +32 (0)9 3313692
> Email: dorota.herman@...
> Web: http://www.psb.ugent.be
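
The relation given in the reply, prior.df = prior.n * residual.df, can be checked with a few lines of base R. This is only a minimal sketch of the arithmetic for this experiment (36 libraries, 12 groups); the helper names `residual_df`, `prior_n_to_df` and `prior_df_to_n` are hypothetical and are not part of edgeR.

```r
# Conversion stated in the reply: prior.df = prior.n * residual.df,
# where residual.df = (number of libraries) - (number of groups).
# The helper names below are hypothetical, not edgeR functions.
residual_df <- function(n_libraries, n_groups) n_libraries - n_groups

prior_n_to_df <- function(prior_n, res_df) prior_n * res_df
prior_df_to_n <- function(prior_df, res_df) prior_df / res_df

res_df    <- residual_df(36, 12)          # 24 for this experiment
old_as_df <- prior_n_to_df(10, res_df)    # old prior.n = 10 expressed as prior.df
new_as_n  <- prior_df_to_n(10, res_df)    # prior.df = 10 expressed as prior.n

print(c(residual.df = res_df, prior.df = old_as_df, prior.n = new_as_n))
```

The two converted values correspond to the arguments used in the reply's estimateTagwiseDisp() calls (prior.df=240 and prior.n=10/24). With edgeR installed, getPriorN(object, prior.df=10), as mentioned in the reply, would be the way to obtain the same mapping from the data object itself.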