Wei Shi | 12 Jan 06:59 2013
Picon
Picon

Re: normalization of illumina bead array data

Dear Michele,

Their data on ArrayExpress are not in beadstudio format, therefore they could not be loaded into
beadstudio to test whether there are errors or not.

The link below points to the Illumina user guide which describes how detection p values are calculated
(page 106).

http://support.illumina.com/documents/MyIllumina/c94519f7-9348-4308-a32f-b66ff3959e99/GenomeStudio_GX_Module_v1.0_UG_11319121_RevA.pdf

Hope this helps.

Cheers,
Wei

On Jan 12, 2013, at 7:59 AM, michele caseposta wrote:

> Dear Wei,
> I am sorry to bother you again.
> I contacted the authors that produced the data to ask help with the dataset, but they were not collaborative.
> All they told me was "we used beadstudio and it gave us no errors".
> This aside, I would like to know more about this relation between intensity and detection score. Is it a
fixed relation as you point out? If so, is there a place where I can read more about it? Is it possible that a
probe A, composed by more beads is more reliable than a probe B with less beads, even though the intensity of
A is lower than the intensity of B?
> Thanks,
> Michele
> 
> On Jan 2, 2013, at 1:03 AM, Wei Shi wrote:
> 
>> Dear Michele,
>> 
>> I had a close look at the data used in your analysis and found that the detection data for some arrays seem to
be wrong.
>> 
>> With illumina bead array data, probes with larger intensities should have a equal or higher detection
score (or equal or lower detection p value) than probes with lower intensities. However, this is not the
case for some of the arrays in this dataset. The second column in your 'maqc' object is one of such arrays. My
code below found 325 probes which had larger intensities but smaller detection scores:
>> 
>> > tmp_sel <- !duplicated(maqc$E[,2])
>> > d2e <- maqc$E[tmp_sel,2]
>> > d2d <- maqc$other$Detection[tmp_sel,2]
>> > d2ds <- d2d[order(d2e)]
>> > sum(d2ds[-1] - d2ds[-c(length(d2ds))] < 0 )
>> [1] 325
>> 
>> This is the reason why negative values were calculated for sigma. It is not the problem of
normexp.fit.detection.p function, but the problem of the data.
>> 
>> You can contact the data submitter to let him/her correct this.
>> 
>> Let me know if we could be of any further assistance.
>> 
>> 
>> Cheers,
>> Wei
>> 
>> 
>> On Dec 23, 2012, at 12:45 PM, Michele wrote:
>> 
>>> I am trying to process the raw data downloaded from: 
>>> http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-380
>>> 
>>> At the moment of using the function neqc I get the following error:
>>> 
>>> Error in if (sigma <= 0) stop("sigma must be positive") : 
>>> missing value where TRUE/FALSE needed
>>> 
>>> The problem seems to be in this line:
>>> 
>>> In sqrt(weighted.mean(v, freq) * n/(n - 1))
>>> 
>>> of the function normexp.fit.detection.p
>>> 
>>> This is generated by the fact that in this function, the difference among p-values is computed, and some
of those differences turn out to be negative.
>>> 
>>> Following is the code with which I am trying to process the data.
>>> 
>>> library(rstudio)
>>> library(beadarray)
>>> library(limma)
>>> 
>>> sample.name <- strsplit(dir("~/Workspaces/data/E-MTAB-380/E-MTAB-380.raw.1/"), ".txt")
>>> group <- sapply(sample.name, function(x) ifelse(length(grep("RR",x))>0,"MT","WT"))
>>> 
>>> setwd("~/Workspaces/data/E-MTAB-380/E-MTAB-380.raw.1/")
>>> maqc <- read.ilmn(files=dir("~/Workspaces/data/E-MTAB-380/E-MTAB-380.raw.1/"), probeid =
"Reporter name", other.columns = c("Detection", "Avg_NBEADS"))
>>> 
>>> colnames(maqc$E) <- sample.name
>>> colnames(maqc$other$Detection) <- sample.name
>>> colnames(maqc$other$Avg_NBEADS) <- sample.name
>>> maqc$targets <- unlist(sample.name)
>>> 
>>> maqc.norm <- neqc(maqc, detection.p='Detection')
>>> 
>>> How can I overcome this?
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor@...
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>> 
>> 
>> ______________________________________________________________________
>> The information in this email is confidential and intended solely for the addressee.
>> You must not disclose, forward, print or use it without the permission of the sender.
>> ______________________________________________________________________
> 

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:8}}

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


Gmane