Re: HTqPCR normalization issues - third posting

Thanks, much appreciated!

It would be important for us to understand whether we are doing something 
fundamentally wrong, or whether there actually is a bug in the software 
(it happens), because we rely heavily on this package for validating NGS 
gene expression analysis findings.
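
To make the question concrete, here is a minimal sketch in base R of what we suspect may be going on. It is only an assumption on our side, not HTqPCR's actual code: a toy geometric-mean scaling that uses the first column as the reference. The matrix values, the variable names and the choice `ref <- 1` are all made up for illustration.

```r
# Toy Ct matrix: rows are genes, columns are samples (made-up values,
# filled column by column, so s1 = 30.0, 31.3, 29.8).
ct <- matrix(c(30.0, 31.3, 29.8,
               28.1, 31.6, 29.3,
               27.0, 29.1, 27.2),
             nrow = 3,
             dimnames = list(c("geneA", "geneB", "geneC"),
                             c("s1", "s2", "s3")))

# Geometric mean of the Ct values in each sample (column).
geo_mean <- apply(ct, 2, function(x) exp(mean(log(x))))

# Hypothetical reference sample: the first column.
# This is an assumption for illustration, not confirmed HTqPCR behaviour.
ref <- 1
scale_factor <- geo_mean[ref] / geo_mean

# Multiply every column by its own scale factor.
ct_norm <- sweep(ct, 2, scale_factor, "*")

# The reference column has a scale factor of exactly 1,
# so it comes back unchanged; all other columns are rescaled.
print(all.equal(ct_norm[, ref], ct[, ref]))
print(round(ct_norm, 3))
```

If the package scales every sample towards a reference column in a similar way, an unchanged first column would be expected behaviour rather than a bug, but this is only our guess.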

Thank you so much for the excellent work!

Keep in touch

Alessandro & Elena

On 10/10/2013 3:56 PM, James W. MacDonald wrote:
> Hi Alessandro,
>
> I believe this package is still maintained, and it is unfortunate that 
> you have not received a reply. The expectation is that package 
> maintainers will subscribe (and pay attention) to the Bioc listserv, 
> but the list is fairly high traffic, so it never hurts to add a CC to 
> the maintainer as well (which I have done for you).
>
> Best,
>
> Jim
>
>
>
> On Thursday, October 10, 2013 8:35:06 AM, Alessandro Guffanti [guest] 
> wrote:
>>
>> Dear all, this is our third posting without a real reply, so we wonder 
>> whether this package is no longer maintained. If so, it would be 
>> useful for us to know.
>>
>>
>> We are using HTqPCR to analyze a set of cards, which we transformed into 
>> the following format, which is accepted by HTqPCR:
>>
>>   2    Run05    41    Passed    sample 41    ABCC5    Target 30
>>   3    Run05    41    Passed    sample 41    ADM    Target 31.3
>>   4    Run05    41    Passed    sample 41    CEBPB    Target 29.8
>>   5    Run05    41    Passed    sample 41    CSF1R    Target 31.2
>>   6    Run05    41    Passed    sample 41    CXCL16    Target 26.9
>>   7    Run05    41    Passed    sample 41    CYC1    Target 25.7
>>
>>   [...]
>>
>>   The total number of files and groups is as follows - summarized in 
>> the file "Elenco_1.txt" which is used below:
>>
>>   File    Group
>>   41.txt    Sano
>>   39.txt    Sano
>>   37.txt    Sano
>>   35.txt    Sano
>>   43.txt    Sano
>>   34.txt    Sano
>>   44.txt    Sano
>>   38.txt    Sano
>>   48.txt    Sano
>>   40.txt    Sano
>>   47.txt    Sano
>>   6.txt    Non Responder DISEASE
>>   26.txt    Non Responder DISEASE
>>   2.txt    Non Responder DISEASE
>>   69.txt    Non Responder DISEASE
>>   68.txt    Non Responder DISEASE
>>   5.txt    Non Responder DISEASE
>>   71.txt    Responder DISEASE
>>   3.txt    Responder DISEASE
>>   17.txt    Responder DISEASE
>>   1.txt    Responder DISEASE
>>   19.txt    Responder DISEASE
>>
>>   The comparison is DISEASE vs non DISEASE, but what puzzles us is the 
>> normalization part.
>>   Note that sample 41 is the *first* of the list.
>>
>>   Here is the code up to the dump of the normalized values matrices:
>>
>>   library("HTqPCR")
>>   path <- ("whatever/")
>>   files <- read.delim (file.path(path, "Elenco_1.txt"))
>>   files
>>   filelist <- as.character(files$File)
>>   filelist
>>   raw <- readCtData(files = filelist, path = path, n.features = 46,
>>                     type = 7, flag = NULL, feature = 6, Ct = 8,
>>                     header = FALSE, n.data = 1)
>>   featureNames(raw)
>>   raw.cat <- setCategory(raw, Ct.max = 36, Ct.min = 9, replicates = FALSE,
>>                          quantile = 0.9, groups = files$Group, verbose = TRUE)
>>
>>   s.norm <- normalizeCtData(raw.cat, norm="scale.rank")
>>   exprs(s.norm)
>>   write.table(exprs(s.norm),file="Ct norm scaling.txt")
>>
>>   g.norm <- normalizeCtData(raw.cat, norm="geometric.mean")
>>   exprs(g.norm)
>>   write.table(exprs(g.norm),file="Ct norm media geometrica.txt")
>>
>>   Now if we look at the content of the two expression value files, it 
>> looks as if the first column (corresponding to the first sample) is 
>> always unchanged, while all the others have been normalized.
>>
>>   In this case the first dataset is sample 41, so you can see what is 
>> happening by comparing its column in the input above with the 
>> corresponding column in the output below.
>>
>>   We do not include here all the columns; however, you can see that 
>> all the samples *except the first (number 41)* have all their values 
>> normalized
>>
>>   Ct norm scaling:
>>
>>           41    39           37           35           43           34           44           38
>>   ABCC5   30    27.37706161  26.47393365  29.7721327   31.20189573  26.39260664  26.32436019  27.54274882
>>   ADM     31.3  30.36540284  28.51753555  32.31241706  34.40473934  26.29800948  29.82796209  28.60208531
>>   CEBPB   29.8  28.53383886  26.65971564  27.84151659  30.06540284  27.3385782   27.36597156  26.29080569
>>   CSF1R   31.2  27.66625592  28.05308057  37.18976303  36.98767773  31.0278673   34.56255924  29.75772512
>>   CXCL16  26.9  27.56985782  24.15165877  30.28018957  28.82559242  25.91962085  26.89251185  26.96492891
>>
>>   Ct norm geometric:
>>
>>           41    39           37           35           43           34           44           38
>>   ABCC5   30    27.73443878  26.93934246  29.88113261  30.76352197  26.51166676  26.8989347   27.49219508
>>   ADM     31.3  30.76178949  29.01887064  32.4307173   33.92136694  26.41664286  30.47900874  28.5495872
>>   CEBPB   29.8  28.90631647  27.12839047  27.94344824  29.64299633  27.46190571  27.96328103  26.24254985
>>   CSF1R   31.2  28.0274082   28.5462506   37.32591991  36.46801611  31.16783762  35.31694663  29.70310587
>>   CXCL16  26.9  27.92975172  24.57624224  30.39104955  28.42060473  26.03654728  27.47948724  26.91543574
>>
>>   This looks odd: why does the first sample seem to be taken as a 
>> 'reference' for both normalization methods, and hence left unchanged?
>>
>>   This happens with ANY normalization procedure selected.
>>
>>   Another (related ?) oddity is that in the final differential 
>> analysis result the same sample ID is always reported
>>   in the feature.pos field, as you can see below:
>>
>>         genes   feature.pos  t.test        p.value      adj.p.value
>>   22    NUCB1   41           -1.998838921  0.077900837  0.251381346
>>   8     ERH     41           -1.958143348  0.091329532  0.251381346
>>   16    MAFB    41           -1.887142703  0.09421993   0.251381346
>>   28    RNF130  41           -1.904866754  0.099644523  0.251381346
>>   3     CEBPB   41           -1.853176708  0.103563968  0.251381346
>>   18    MSR1    41           -1.80887129   0.10432619   0.251381346
>>
>>   Are we doing something wrong in the data input or in the subsequent 
>> processing here? Can we actually trust these normalizations?
>>
>>   Many thanks in advance - kind regards
>>
>>   Alessandro & Elena
>>
>>
>>
>>
>>   -- output of sessionInfo():
>>
>>
>> R version 3.0.1 (2013-05-16)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] parallel  stats     graphics  grDevices utils     datasets methods
>> [8] base
>>
>> other attached packages:
>> [1] HTqPCR_1.14.0      limma_3.16.8       RColorBrewer_1.0-5 Biobase_2.20.1
>> [5] BiocGenerics_0.6.0
>>
>> loaded via a namespace (and not attached):
>> [1] affy_1.38.1           affyio_1.28.0         BiocInstaller_1.10.3
>> [4] gdata_2.13.2          gplots_2.11.3         gtools_3.0.0
>> [7] preprocessCore_1.22.0 stats4_3.0.1          zlibbioc_1.6.0
>>
>> -- 
>> Sent via the guest posting facility at bioconductor.org.
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@...
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: 
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099

-- 
Alessandro Guffanti

Head, Bioinformatics

*Genomnia srl*

Via Nerviano, 31/B – 20020 Lainate (MI)

Tel. +39-0293305.702 / Fax +39-0293305.777

www.genomnia.com <http://www.genomnia.com>

alessandro.guffanti@... <mailto:alessandro.guffanti@...>

