17 Nov 2012 07:42
Experimental design with edgeR and DESeq packages (RNA-seq)
> Date: Thu, 15 Nov 2012 12:09:10 +0100
> From: Yvan Wenger <yvan.wenger@...>
> To: bioconductor@...
> Subject: [BioC] Experimental design with edgeR and DESeq packages (RNA-seq)
>
> Hi everybody,
>
> I just started using edgeR and DESeq and am looking for a confirmation
> that I am not doing a silly thing.
>
> Basically, we have 7 conditions and for only 2 of these samples we have
> biological triplicates. Let us say that the samples are "A", "A", "A",
> "B", "C" (most of the genes are NOT regulated in my experiment).
> Finally, let us say we just want to compare "B" to "C", but using all
> the information available. Can we use the whole dataset for estimating the
> common and tagwise dispersion? Typically using the commands (note that I
> compare here "B" to "C", thus samples without replicates).
>
> edgeR:
> library(edgeR)
> countTable <- read.table('mytable', header=FALSE, row.names=1)
> dge <- DGEList(counts=countTable, group=c("A","A","A","B","C"))
> dge <- calcNormFactors(dge)
> dge <- estimateCommonDisp(dge)
> dge <- estimateTagwiseDisp(dge)
> et <- exactTest(dge, pair=c("B","C"))

Yes, this is a perfectly standard analysis. edgeR estimates the genewise dispersion values from the three replicates for Group A and uses these dispersions even though you are comparing B to C. The assumption here is obviously that A, B and C are similar populations, so that genes with higher biological coefficient of variation (BCV) in condition A also tend to have higher BCV in conditions B and C.
Gordon

> or
>
> DESeq:
> library(DESeq)
> countTable = read.table('mytable.csv', header=FALSE, row.names=1)
> design = data.frame(row.names = colnames(countTable),
>                     condition = c("A","A","A","B","C"))
> condition = design$condition
> cds = newCountDataSet(countTable, condition)
> cds = estimateSizeFactors(cds)
> cds = estimateDispersions(cds)
> res = nbinomTest(cds, "B", "C")
>
> Is it ok to do so (to use samples not compared in the end to estimate
> the dispersion)? Does this correspond to the example "working partially
> without replicates" from the DESeq manual? Or should I just consider
> that there are no replicates for samples B and C and proceed by ignoring
> the other samples completely?
>
> Many thanks!
>
> Yvan
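The reasoning in the reply (dispersion learned from the replicated group A, then reused for the unreplicated B vs C comparison) can be illustrated with a small simulation. This is only a toy sketch in base R: it uses a method-of-moments dispersion estimate and a normal-approximation test, which are stand-ins and not the conditional-likelihood estimators or the exact test that edgeR implements. The gene count, the dispersion of 0.1, the range of expected counts, and all variable names are assumptions made for illustration.

```r
# Toy base-R sketch (assumption: NOT edgeR's estimator). Dispersion is learned
# from the replicated group A only, then reused for the unreplicated B vs C.
set.seed(1)

n_genes   <- 2000
true_disp <- 0.1                      # assumed dispersion shared by A, B and C
mu        <- runif(n_genes, 50, 500)  # assumed per-gene expected counts

# Simulated counts with no true differences between conditions.
group  <- c("A", "A", "A", "B", "C")
counts <- matrix(rnbinom(n_genes * length(group), mu = mu, size = 1 / true_disp),
                 nrow = n_genes)

# Method-of-moments common dispersion from the A replicates,
# using the negative binomial relation var = mean + disp * mean^2.
a <- counts[, group == "A"]
m <- rowMeans(a)
v <- apply(a, 1, var)
common_disp <- sum(v - m) / sum(m^2)
bcv <- sqrt(common_disp)

# Approximate Wald test of B vs C on the log scale (delta method).
b  <- counts[, group == "B"] + 0.5    # 0.5 guards against log(0)
cc <- counts[, group == "C"] + 0.5
log_ratio <- log(b) - log(cc)

p_value <- function(disp) {
  se <- sqrt(1 / b + 1 / cc + 2 * disp)
  2 * pnorm(-abs(log_ratio / se))
}

frac_borrowed <- mean(p_value(common_disp) < 0.05)  # dispersion borrowed from A
frac_poisson  <- mean(p_value(0) < 0.05)            # no biological variation assumed

cat("common dispersion:", common_disp, " BCV:", bcv, "\n")
cat("fraction called at p < 0.05, borrowed dispersion:", frac_borrowed, "\n")
cat("fraction called at p < 0.05, Poisson assumption: ", frac_poisson, "\n")
```

Because the simulated data contain no real B vs C differences, every call at p < 0.05 is a false positive. Comparing the two fractions shows what is at stake: ignoring the A replicates leaves no estimate of biological variation for B and C, and treating the counts as Poisson then flags far more genes than borrowing the dispersion from A does.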