Mark Robinson | 10 Oct 22:40 2012
Re: EdgeR for proteomics

Hi Fabricio,

I suggest you check (at least) 2 things:

1.
> disp <- estimateCommonDisp(b)
> disp$common.dispersion = 0.0001004979

> disp$common.dispersion =  3.999943

Your example only makes 1 call to estimateCommonDisp(), but you have 2 drastically different values.  Are
you reporting these as the estimated values, or are you actually running this command and *setting* the
common dispersion?  It's not clear from your message.

You may also want to study some of the GLM-based case studies in:
http://www.bioconductor.org/packages/2.11/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf

For example, the standard GLM work flow would be similar to that on Page 7.
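As a rough sketch only (not the exact code from the guide), a GLM work flow along those lines could look like the following. The counts are simulated purely for illustration, the object names are my own, and I am assuming an edgeR release in which glmLRT() takes the fit as its first argument; older releases took the DGEList first, as in your code. The point is that the dispersion is estimated from the data and stored in the object, never assigned by hand.

```r
library(edgeR)

# Simulated counts, purely for illustration: 200 "proteins" x 63 samples.
set.seed(1)
group <- factor(rep(c("La6h", "La24h", "Lm6h", "Lm24h", "MO6h", "MO24h"),
                    c(10, 11, 10, 11, 10, 11)),
                levels = c("La6h", "La24h", "Lm6h", "Lm24h", "MO6h", "MO24h"))
counts <- matrix(rnbinom(200 * 63, mu = 20, size = 5), nrow = 200,
                 dimnames = list(paste0("prot", 1:200), NULL))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# One column per coefficient: intercept (La6h) plus 5 group differences.
design <- model.matrix(~ group)

# Dispersions are estimated from the data and stored inside y.
y <- estimateGLMCommonDisp(y, design)
y <- estimateGLMTagwiseDisp(y, design)
y$common.dispersion   # read the estimate; do not overwrite it

fit <- glmFit(y, design)        # uses the dispersions stored in y
lrt <- glmLRT(fit, coef = 2:6)  # older releases: glmLRT(y, fit, coef = 2:6)
topTags(lrt)
```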

2. 
> lrt <- glmLRT(b,fit,coef=fit$design)

The docs for the 'coef' argument (?glmLRT) say:
----
   coef: integer or character vector indicating which coefficients of
         the linear model are to be tested equal to zero.  Values must
         be columns or column names of ‘design’. Defaults to the last
         coefficient.  Ignored if ‘contrast’ is specified.
----
As you can see, the function is expecting something very different from what you give as your 'coef' argument.
Maybe you want 'coef=2:6', if you are looking for any difference between your 6 groups.  Of course, maybe
you actually want to split your factors into 2 (one of ("La","Lm","MO") and one of ("6h","24h")) and
construct a design matrix accordingly.  But this is also not clear from your message.
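To make the two options concrete, here is a small base-R sketch of both design matrices. The factor names 'condition' and 'time' are placeholders of mine, and the additive model is only one possible choice; whether it suits your experiment is for you to judge.

```r
# Group labels as in the original post: 63 samples in 6 groups.
group <- rep(c("La6h", "La24h", "Lm6h", "Lm24h", "MO6h", "MO24h"),
             c(10, 11, 10, 11, 10, 11))
group <- factor(group, levels = unique(group))

# Option 1: one factor with 6 levels.
# Columns 2:6 are the differences of each group from the first (La6h),
# so coef = 2:6 tests for any difference between the 6 groups.
design1 <- model.matrix(~ group)
dim(design1)

# Option 2: split the labels into two factors (names are hypothetical).
condition <- factor(sub("(6h|24h)$", "", as.character(group)),
                    levels = c("La", "Lm", "MO"))
time <- factor(sub("^(La|Lm|MO)", "", as.character(group)),
               levels = c("6h", "24h"))

# Additive model: intercept, 2 condition effects, 1 time effect.
design2 <- model.matrix(~ condition + time)
colnames(design2)
# With this design, coef = 2:3 would test the condition effect
# and coef = 4 (or "time24h") the time effect.
```

With an interaction term (~ condition * time) the second design has 6 columns again and spans the same model as the first, just parameterised differently.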

Hope that helps,
Mark

----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland

v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson@...
o: Y11-J-16
w: http://tiny.cc/mrobin

----------
http://www.fgcz.ch/Bioconductor2012

On 10.10.2012, at 06:41, Fabricio Marchini wrote:

> Hi,
> 
> I'm using EdgeR to analyse proteomic data with peptide counting. I have
> limited experience with R/EdgeR/statistics, so I would appreciate some help.
> Using the following code:
> 
> a=file[,2:64]
> 
> b=DGEList(counts=a,group=rep(c("La6h","La24h","Lm6h","Lm24h","MO6h","MO24h"
> ),c(10,11,10,11,10,11)), lib.size=colSums(a))
> 
> b <- calcNormFactors(b)
> 
> times <- rep(c("La6h","La24h","Lm6h","Lm24h","MO6h","MO24h"),c(10,11,10,11,
> 10,11))
> 
> times <- factor(times,levels=c("La6h","La24h","Lm6h","Lm24h","MO6h","MO24h"
> ))
> 
> design <- model.matrix(~factor(times))
> 
> disp <- estimateCommonDisp(b)
> 
> fit <- glmFit(b,design,dispersion=disp$common.dispersion)
> 
> lrt <- glmLRT(b,fit,coef=fit$design)
> disp$common.dispersion = 0.0001004979
> 
> All proteins (3430) had a p.value of 0.
> 
> I tried also with
> 
> fit <- glmFit(b,design,dispersion=disp$common.dispersion)
> 
> lrt <- glmLRT(b,fit,coef=fit$design)
> disp$common.dispersion =  3.999943
> 
> and that gave me all the proteins with p.value lower than 6.29E-05.
> 
> That suggests either that I'm doing something wrong or that, given both common
> dispersions, my data is not appropriate for the analysis.
> 
> Any suggestions or corrections?
> 
> --
> Fabricio K. Marchini
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@...
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
