Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: Adam Witney <awitney <at> sgul.ac.uk>
Subject: Re: Problem Parsing BLAST output to annotate FASTA file
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Wednesday 20th February 2013 15:22:51 UTC (over 3 years ago)
Hi Ann,

On 20/02/2013 05:20, Ann Gregory wrote:
> Hi BioPerl,
> 
> I am having issues with a BioPerl script. I have a blastxml file from a
> blastx blast and the original multifasta file containing the original
> nucleotides sequences.
> 
> I want to take the blast result (ie. the blast description) and annotate
my
> multifasta file.
> 
> I have written 2 while loops that extract the blast descriptions as well
as
> the nucleotide sequence from the multifasta file.
> 
> My problem is that I cannot incorporate one of the while loops into the
> other without loosing the loop property of one of the loops. I would like
> to take the 1st blast description, then the 1st nucleotide sequence, then
> the 2nd blast description, then the 2nd nucleotide sequence and so
> on...just can figure out how to alternate the results.
> 
> See script below:
> 
> 
> use warnings;
> use strict;
> use Bio::SearchIO;
> use Bio::SeqIO;
> 
> 
> my $search_in = Bio::SearchIO->new(-format => 'blastxml', -file =>
> "$ARGV[0]");
> while (my $result = $search_in->next_result) {
> while (my $hit = $result->next_hit) {
> while (my $hsp = $hit->next_hsp) {
> my $qd = $hit->description;
> print $qd, "\n";
> }
> }
> }
> 
> my $seqio = Bio::SeqIO->new(-format => 'fasta', -file => "$ARGV[1]");
> while (my $seqobj = $seqio->next_seq) {
> my $nuc = $seqobj->seq();
> print $nuc, "\n";
> }--

I think what you are proposing assumes that the loop over the BLAST
results will come back in the same order as the loop over the Fasta
file, this may be the case, but I'm not sure its something I would rely on.

Anyway, I would loop over the BLAST results, storing the relevant data
to an array or hash and then loop over the fasta file to put the two
together. eg:

my $blast_data;

while ( ... blast data ... ) {
	...
	$blast_data->{$qd} = 
	...
}

while ( my $seqobj = $seqio->next_seq ) {
	my $id = $seqobj->id;
	print $blast_data->{$id}."\n";
}

something along those lines... or have i misunderstood you? if so can
you provide some more details, like what do you want your output to look
like?

HTH

Adam
 
CD: 2ms