Dror Hibsh | 18 Jun 2012 07:24

Re: down-expression and high-expression in single cell + amplification

Hi Tim, thanks for the response!
No, I didn't.
Is there a recommended tool for removing PCR duplicates from raw
reads?
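
To show what I mean by removing duplicates from raw reads, here is a minimal sketch in base R. It is only a toy illustration, not a recommended tool: the `reads` vector and all variable names are made up, and real reads would first have to be parsed from the FASTQ files. It collapses reads with exactly identical sequences, which I assume is the simplest definition of a PCR duplicate for single-end data.

```r
# Toy reads (hypothetical); real input would come from a FASTQ file.
reads <- c("ACGTACGT", "ACGTACGT", "TTGACCAA",
           "ACGTACGT", "GGCATCGA", "TTGACCAA")

# Count how often each distinct sequence occurs.
copies <- table(reads)

# Keep only the first occurrence of every sequence, preserving input order.
dedup_reads <- reads[!duplicated(reads)]

# Fraction of reads that were exact copies of an earlier read.
dup_rate <- 1 - length(dedup_reads) / length(reads)
```

I realize that for highly expressed transcripts, identical 50 bp single-end reads can also come from independent fragments, so collapsing like this may remove real signal along with the PCR copies.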

Thanks,
Pap
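
P.S. About my normalization worry in the original post: as far as I understand, estimateSizeFactors uses a median-of-ratios approach, so a pure difference in sequencing depth should be corrected. Below is a minimal base R sketch of that idea under my own assumptions. The function `size_factors` and the toy matrix are hypothetical and are not the DESeq code itself.

```r
# Median-of-ratios size factors: for each sample, the median over transcripts
# of (count / geometric mean of that transcript across samples).
size_factors <- function(counts) {
  # Log of the per-transcript geometric mean across samples.
  log_geo_means <- rowMeans(log(counts))
  # Transcripts with a zero count have a log geometric mean of -Inf; skip them.
  usable <- is.finite(log_geo_means)
  apply(counts, 2, function(col) {
    exp(median(log(col[usable]) - log_geo_means[usable]))
  })
}

# Toy matrix (hypothetical): S2 is S1 at half the depth, S3 at double the depth.
base <- c(100, 20, 300, 50, 0)
toy <- cbind(S1 = base, S2 = base / 2, S3 = base * 2)
rownames(toy) <- paste("t", 1:5, sep = "")

sf <- size_factors(toy)

# Divide every column by its size factor to put samples on a common scale.
normalized <- sweep(toy, 2, sf, "/")
```

In this toy case the three normalized columns come out identical, so depth alone (like my 32M, 27M and 40M) would not produce a down-then-up pattern after normalization. Whether that also holds for my amplified single-cell libraries is exactly what I am unsure about.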

On Sun, Jun 17, 2012 at 7:29 PM, Tim Triche, Jr. <tim.triche@...> wrote:

> you say you are "new to this field" yet you seem to have done almost
> everything right.
>
> have you tried removing PCR dupes that might be skewing your results,
> before transcript assembly with RSEM and DE testing?
>
>
>
> On Sun, Jun 17, 2012 at 9:23 AM, papori [guest] <guest@...> wrote:
>
>>
>> Hi all,
>> First of all, I am new to this field, so I am sorry if I am not clear.
>>
>> I will try to explain what my aim is and what I did before DESeq.
>>
>> I am trying to do differential expression analysis with DESeq for a
>> de novo assembled invertebrate transcriptome.
>>
>> We had an experiment with 3 conditions and 3 biological replicates for
>> each (9 samples in total).
>> We used HiSeq 2000 50 bp single-end reads.
>> The library size was different for each sample (this was a single-cell
>> experiment, so we had an amplification step, which produced variance in the
>> library sizes).
>>
>> We reconstructed the transcriptome using Trinity and
>> estimated counts with RSEM.
>>
>> Then I used DESeq.
>>
>> The data behave strangely, and I don't know whether that is because
>> I did something wrong.
>>
>> I always get down-expression from condition 1 to condition 2 and
>> high-expression from condition 2 to condition 3 (for all the transcripts,
>> no outliers).
>>
>> The number of counts assigned to the reference transcriptome for each
>> condition was
>> 32M, 27M and 40M, respectively.
>> This made me think that, because condition 2 has the lowest count, it shows
>> down-expression from 1 to 2 and high-expression from 2 to 3.
>>
>> If my conclusion is right, I am in a big mess. (Normalization??)
>>
>> my DESeq script is:
>> Conditions = c("C1", "C2", "C3", "C1", "C2", "C3","C1", "C2", "C3")
>> Counts<-round(MultiGeneMat,0)
>> cds <- newCountDataSet(Counts,Conditions)
>> cds <- estimateSizeFactors(cds)
>> cds <- estimateDispersions(cds, method="per-condition",
>>                            sharingMode="maximum", fitType="local")
>>
>> res_1vs2 <- nbinomTest(cds,condA="C1",condB="C2")
>> sigDESeq_1vs2 <- res_1vs2[res_1vs2$padj <= 0.1, ]
>> sigDESeq_1vs2 <- na.omit(sigDESeq_1vs2)
>>
>> res_2vs3 <- nbinomTest(cds,condA="C2",condB="C3")
>> sigDESeq_2vs3 <- res_2vs3[res_2vs3$padj <= 0.1, ]
>> sigDESeq_2vs3 <- na.omit(sigDESeq_2vs3)
>>
>> res_1vs3 <- nbinomTest(cds,condA="C1",condB="C3")
>> sigDESeq_1vs3 <- res_1vs3[res_1vs3$padj <= 0.1, ]
>> sigDESeq_1vs3 <- na.omit(sigDESeq_1vs3)
>>
>>
>>
>> Is there anything wrong here, or anywhere else?
>> If I wasn't clear enough, tell me where and I will try to explain.
>> Any help will be appreciated!
>> Thanks,
>> Pap
>>
>>  -- output of sessionInfo():
>>
>> R version 2.14.0 (2011-10-31)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=C                 LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] edgeR_2.4.6    limma_3.10.3   DESeq_1.6.1    locfit_1.5-8   Biobase_2.14.0
>>
>> loaded via a namespace (and not attached):
>>  [1] annotate_1.32.3       AnnotationDbi_1.16.19 DBI_0.2-5             genefilter_1.36.0     geneplotter_1.32.1
>>  [6] grid_2.14.0           IRanges_1.12.6        lattice_0.20-6        RColorBrewer_1.0-5    RSQLite_0.11.1
>> [11] splines_2.14.0        survival_2.36-14      tools_2.14.0          xtable_1.7-0
>>
>>
>> --
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@...
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
>
>
> --
> *A model is a lie that helps you see the truth.*
> Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
>
>

-- 
-----------------
Dror Hibsh
0507-669599
------------------
