From: Scott Markel <Scott.Markel <at> accelrys.com>
Subject: problems parsing XML results from BLAST+ version of psiblast running in batch mode
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Tuesday 18th June 2013 15:54:24 UTC
Short version -

How do I use Bio::Search::* modules to parse the XML results from the
BLAST+ version of psiblast running in batch mode?  Only one set of
iteration numbers is used, so I can't tell which iteration goes with which
query sequence.

Long version -

I'm running NCBI BLAST+ psiblast (version 2.2.27+) in batch mode with XML
output.  Unlike the BLAST version, which creates a
<BlastOutput>...</BlastOutput> tag pair for each query sequence, the BLAST+
version creates a single <BlastOutput>...</BlastOutput> tag pair containing
all iterations for all query sequences.  The iteration numbers run across
the query sequences, i.e., the iteration numbers don't restart for a new
query sequence.

So, how to know which iteration goes with which query sequence?

There are <Iteration_query-ID>...</Iteration_query-ID> and
<Iteration_query-def>...</Iteration_query-def> tag pairs that could be
used to inspect the iterations, but there are no subroutines in
Bio::Search::Iteration::GenericIteration providing access to these values.
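
As a stopgap outside Bio::Search, the grouping those tags allow can be
sketched by reading the XML directly.  This is a minimal Python sketch
using only the standard library, not a BioPerl solution.  The function name
group_iterations_by_query and the SAMPLE string are illustrative
assumptions; the element names are the ones defined in
NCBI_BlastOutput.dtd.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down sample in the shape of the BLAST+ psiblast
# batch output: one <BlastOutput> holding iterations for several queries.
SAMPLE = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-ID>Query_1</Iteration_query-ID>
      <Iteration_query-def>lcl|1 no description available</Iteration_query-def>
    </Iteration>
    <Iteration>
      <Iteration_iter-num>2</Iteration_iter-num>
      <Iteration_query-ID>Query_2</Iteration_query-ID>
      <Iteration_query-def>lcl|2 no description available</Iteration_query-def>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def group_iterations_by_query(xml_text):
    """Map each Iteration_query-ID to the iteration numbers that carry it."""
    root = ET.fromstring(xml_text)
    groups = {}
    for iteration in root.iter("Iteration"):
        query_id = iteration.findtext("Iteration_query-ID")
        iter_num = int(iteration.findtext("Iteration_iter-num"))
        # Iteration numbers run across queries, so the query ID is the
        # only reliable grouping key.
        groups.setdefault(query_id, []).append(iter_num)
    return groups


if __name__ == "__main__":
    for query_id, iter_nums in group_iterations_by_query(SAMPLE).items():
        print(query_id, iter_nums)
```

The resulting map could then be used to decide which query a given
iteration number belongs to, but it does not replace proper accessors in
the Bio::Search iteration objects.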

An XML output file fragment showing the tag pairs is pasted below.

Any suggestions on workarounds or a pointer to something obvious that I'm
missing would be greatly appreciated.

Scott

#########################


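<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">
<BlastOutput>
  <BlastOutput_program>psiblast</BlastOutput_program>
...
  <BlastOutput_query-ID>Query_1</BlastOutput_query-ID>
  <BlastOutput_query-def>lcl|1 no description available</BlastOutput_query-def>
  <BlastOutput_query-len>100</BlastOutput_query-len>
  <BlastOutput_param>
...
  </BlastOutput_param>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-ID>Query_1</Iteration_query-ID>
      <Iteration_query-def>lcl|1 no description available</Iteration_query-def>
      <Iteration_query-len>100</Iteration_query-len>
      <Iteration_hits>
...
      </Iteration_hits>
      <Iteration_stat>
...
      </Iteration_stat>
    </Iteration>
    <Iteration>
      <Iteration_iter-num>2</Iteration_iter-num>
      <Iteration_query-ID>Query_2</Iteration_query-ID>
      <Iteration_query-def>lcl|2 no description available</Iteration_query-def>
      <Iteration_query-len>100</Iteration_query-len>
      <Iteration_hits>
...
      </Iteration_hits>
      <Iteration_stat>
...
      </Iteration_stat>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>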


Scott Markel, Ph.D.
Principal Bioinformatics Architect  email:  Scott.Markel <at> accelrys.com
Accelrys (Pipeline Pilot R&D)       mobile: +1 858 205 3653
10188 Telesis Court, Suite 100      voice:  +1 858 799 5603
San Diego, CA 92121                 fax:    +1 858 799 5222
USA                                 web:    http://www.accelrys.com

http://www.linkedin.com/in/smarkel
Secretary, Board of Directors:
    International Society for Computational Biology
Chair: ISCB Publications and Communications Committee
Associate Editor: PLOS Computational Biology
Editorial Board: Briefings in Bioinformatics
 