7 Jul 2012 19:55
Re: Rsamtools: Read a bam line by line
On 07/07/2012 10:10 AM, Alex Gutteridge wrote: > Hi, > > I have an application where I have mapped reads to two different > reference sequences and I have these two mappings as bam files. I now > want to see for each read whether it mapped to one, both or neither of > the two references (and where the mapping is to both see if one was > better than the other). All my reading of the samtools docs suggested > that there is no facility for indexing by read name as opposed to > genomic coordinates, so the first way that occured to me to do this was > to sort each bam by read name (which is possible using samtools) and > then step through the two bams in parallel checking the alignment > information for each read. There are ~120M reads, so reading the two > bams into memory all in one go is not practical, but from what I can see > Rsamtools doesn't have functions for reading bams either line by line or > by chunks of lines. All the ScanBamParam arguments seem to refer to see the development version and the yieldSize argument to BamFile. Martin > reading chunks by genomic ranges - this isn't appropriate for my > application as the genomic ranges between the two references are not > necessarily comparable. > > So my question is really two-fold: > > 1) Am I right in my reading of the Rsamtools docs? Are there any lower > level functions for reading bams I am ignoring that could do what I want? > 2) My thought now is to just make sams from the sorted bams and step > through using readLines() and my own sam parsing code. Is there an > alternative strategy I have overlooked? > -- -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 _______________________________________________ Bioconductor mailing list Bioconductor@... https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
RSS Feed