18 Jun 2012 07:24
Re: down-expression and high-expression in single cell + amplification
Hi Tim,

Thanks for the response!
No, I didn't.
Is there any recommended tool for removing the PCR duplicates from raw reads?

Thanks,
Pap

On Sun, Jun 17, 2012 at 7:29 PM, Tim Triche, Jr. <tim.triche@...> wrote:

> you say you are "new to this field" yet you seem to have done almost
> everything right.
>
> have you tried removing PCR dupes that might be skewing your results,
> before transcript assembly with RSEM and DE testing?
>
> On Sun, Jun 17, 2012 at 9:23 AM, papori [guest] <guest@...> wrote:
>
>> Hi all,
>> First of all, I am new to this field, so I am sorry if I am not clear.
>>
>> I will try to explain what my aim is, and what I did before DESeq.
>>
>> I am trying to do differential expression analysis using DESeq for a
>> de novo assembled invertebrate transcriptome.
>>
>> We had an experiment of 3 conditions with 3 biological replicates for
>> each (9 samples in total).
>> We used HiSeq 2000 50bp single-end reads.
>> We had a different library size for each sample (it was a single-cell
>> experiment, so we had an amplification step, which produced the variance
>> in library sizes).
>>
>> We reconstructed the transcriptome using Trinity.
>> We estimated counts with RSEM.
>>
>> And then I used DESeq.
>>
>> I see weird behavior in the data, and I don't know if it is because of
>> something I did wrong.
>>
>> I am always getting down-expression from condition 1 to condition 2 and
>> high-expression from condition 2 to condition 3 (for all the transcripts,
>> no outliers).
>>
>> The total counts against the reference transcriptome for each condition
>> were 32M, 27M and 40M, respectively.
>> That made me think that, because condition 2 has the lowest count, it
>> shows down-expression from 1 to 2 and high-expression from 2 to 3.
>>
>> If my conclusion is right, I am in a big mess. (Normalization??)
>>
>> My DESeq script is:
>>
>> # One condition label per column of the count matrix (9 samples)
>> Conditions = c("C1", "C2", "C3", "C1", "C2", "C3", "C1", "C2", "C3")
>> # RSEM expected counts are fractional; DESeq needs integers
>> Counts <- round(MultiGeneMat, 0)
>> cds <- newCountDataSet(Counts, Conditions)
>> cds <- estimateSizeFactors(cds)
>> cds <- estimateDispersions(cds, method="per-condition", sharingMode="maximum", fitType="local")
>>
>> # Pairwise tests; keep transcripts with adjusted p-value <= 0.1
>> res_1vs2 <- nbinomTest(cds, condA="C1", condB="C2")
>> sigDESeq_1vs2 <- res_1vs2[res_1vs2$padj <= 0.1, ]
>> sigDESeq_1vs2 <- na.omit(sigDESeq_1vs2)
>>
>> res_2vs3 <- nbinomTest(cds, condA="C2", condB="C3")
>> sigDESeq_2vs3 <- res_2vs3[res_2vs3$padj <= 0.1, ]
>> sigDESeq_2vs3 <- na.omit(sigDESeq_2vs3)
>>
>> # condA must match a label in Conditions ("C1", not "1")
>> res_1vs3 <- nbinomTest(cds, condA="C1", condB="C3")
>> sigDESeq_1vs3 <- res_1vs3[res_1vs3$padj <= 0.1, ]
>> sigDESeq_1vs3 <- na.omit(sigDESeq_1vs3)
>>
>> Is there anything wrong here, or anywhere else?
>> If I wasn't clear enough, tell me where and I will try to explain.
>> Any help will be appreciated!
>> Thanks,
>> Pap
>>
>> -- output of sessionInfo():
>>
>> R version 2.14.0 (2011-10-31)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=C
>> LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] edgeR_2.4.6 limma_3.10.3 DESeq_1.6.1 locfit_1.5-8
>> Biobase_2.14.0
>>
>> loaded via a namespace (and not attached):
>> [1] annotate_1.32.3 AnnotationDbi_1.16.19 DBI_0.2-5
>> genefilter_1.36.0 geneplotter_1.32.1
>> [6] grid_2.14.0 IRanges_1.12.6 lattice_0.20-6
>> RColorBrewer_1.0-5 RSQLite_0.11.1
>> [11] splines_2.14.0 survival_2.36-14 tools_2.14.0
>> xtable_1.7-0
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@...
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> *A model is a lie that helps you see the truth.*
>
> Howard Skipper<http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>

--
Dror Hibsh
0507-669599
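
A note on the normalization worry in the quoted message: the estimateSizeFactors step in the script is meant to correct for exactly this kind of difference in library size, so a lower total count in one condition should not by itself make every transcript look down-expressed. The sketch below shows the median-of-ratios idea behind that step in base R, so it runs without DESeq installed. The count matrix is made-up toy data, not the poster's, and the function name median_ratio_size_factors is a hypothetical helper for illustration, not part of DESeq.

```r
# Toy count matrix (made-up numbers, NOT the data from this thread):
# rows are transcripts, columns are samples.
# By construction the columns differ only in depth, in the ratio 2 : 1 : 4.
counts <- matrix(
  c(100,  50, 200,
    300, 150, 600,
     10,   5,  20,
     40,  20,  80),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste("t", 1:4, sep = ""), c("C1", "C2", "C3"))
)

# Hypothetical helper: median-of-ratios size factors.
# For each sample, take the median over transcripts of
# count / (geometric mean of that transcript across all samples).
median_ratio_size_factors <- function(m) {
  log_geo_means <- rowMeans(log(m))      # log geometric mean per transcript
  usable <- is.finite(log_geo_means)     # skip transcripts with any zero count
  apply(m, 2, function(col) {
    exp(median(log(col[usable]) - log_geo_means[usable]))
  })
}

sf <- median_ratio_size_factors(counts)

# Divide each column by its size factor to put samples on a common scale.
normalized <- sweep(counts, 2, sf, "/")

print(sf)
print(normalized)
```

In this toy case the size factors absorb the depth differences, and every transcript ends up with the same normalized value in all three samples, i.e. no apparent change. On the real data, it may be worth looking at sizeFactors(cds) after estimateSizeFactors(cds) to see whether they track the 32M / 27M / 40M totals. If they do, and the one-directional shift remains after normalization, library size alone is probably not the explanation.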