Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: Fields, Christopher J <cjfields <at> illinois.edu>
Subject: Re: SimpleAlign Query
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Wednesday 19th June 2013 13:15:58 UTC (over 3 years ago)
The alignments from BLAST reports are generally parsed via Bio::SearchIO. 
There is a HOWTO on SearchIO:

http://www.bioperl.org/wiki/HOWTO:SearchIO

Tiling in BioPerl uses these objects, but you can get Bio::SimpleAlign from
the HSPs.

chris

On Jun 18, 2013, at 10:49 AM, "Yin, Jun" <[email protected]>
 wrote:

> Hi, Chris,
> 
> No problem. I can help to look at what is the problem. Yes, usually in
> Bio::SimpleAlign even if the sequence is reverse complementary, the start
> coordinate should be smaller than the end coordinate. But it seems
> Bio::AlignIO::bl2seq is a wrapper for Bio::SearchIO::blast. It is a bit
> different from the other Bio::AlignIO modules, so there may be some
> inconsistency.
> 
> Can you send me the input file for your script? So I can try to figure
out
> where the problem is.
> 
> I would like to recommend you to ask in the bioperl mail list. There will
> be more experts in BioPerl there.
> 
> 
> Cheers,
> -- 
> Jun Yin, Ph.D.
> Postdoc Associate
> 
> James Noonan Group
> Department of Genetics
> Yale University School of Medicine
> Sterling Hall of Medicine, I-120A
> 333 Cedar Street 
> New Haven, CT 06510
> http://www.yale.edu/noonanlab/Jun.html
> 
> 
> 
> 
> On 6/18/13 11:23 AM, "Christopher Field"  wrote:
> 
>> Dear Dr. Yin,
>> 
>> I've got a problem with SimpleAlign.pm, and since you are responsible
for
>> the most recent updates on this module, I hope it's ok to email and ask
>> for your help.
>> To summarise my short program, it uses bl2seq to throw some sanger reads
>> against a genome, and then I'm tiling the reported hits into a visually
>> accessible format.
>> The problem has arisen, that sometimes the alignment is with the reverse
>> complement, but the start and end of both sequences in the Align object
>> are such that the start is always lower than the end. The strand flag
>> also appears uninitialised. Is this a straightforward bug, or a result
of
>> something I'm doing in my code?
>> Any help would be greatly appreciated.
>> 
>> Chris Field
>> Group van Nimwegen
>> Biozentrum der Universit├Ąt Basel
>> 4056-CH Basel
>> 
>> Code:
>> 
>> #!/usr/bin/env perl;
>> 
>> use v5.8.1;
>> use warnings;
>> use strict;
>> 
>> use Bio::Seq;
>> use Bio::SeqIO;
>> use Bio::AlignIO;
>> use Bio::SimpleAlign;
>> use Bio::Tools::Run::StandAloneBlast;
>> 
>> my $usage = "align.pl reference query\n";
>> 
>> open (my $fo,'>',"align.out",) or die $!;
>> 
>> my $rfile = shift or die $usage;
>> my $qfile = shift or die $usage;
>> 
>> my $rstream = Bio::SeqIO->new(-file => $rfile);
>> my $qstream = Bio::SeqIO->new(-file => $qfile);
>> 
>> my $rseq = $rstream->next_seq;
>> 
>> my $align = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastn',
>> -outfile => 'bl2seq.out');
>> 
>> while (my $qseq = $qstream->next_seq){
>>       $align->bl2seq($qseq,$rseq);
>>       my $astream = Bio::AlignIO->new(-file => 'bl2seq.out', -format =>
>> 'bl2seq');
>>       my @seq = ("n") x $qseq->length;
>>       my @match = (" ") x $qseq->length;
>>       my @pos = (" ") x $qseq->length;
>>       while (my $align = $astream->next_aln){
>>               my $qhit = $align->get_seq_by_pos(1);
>>               my $rhit = $align->get_seq_by_pos(2);
>>               if(!grep(/[acgt]/,@seq[$qhit->start-1 .. $qhit->end-1])){
>>                       print $rhit->start,":",$rhit->end,"\n";
>>                       @seq[$qhit->start-1 .. $qhit->end-1] =
>> split("",$rhit->seq);
>>                       @match[$qhit->start-1 .. $qhit->end-1] =
>> split("",$align->match_line);
>>                       @pos[$qhit->start-1 .. $qhit->end-1] =
>> ("<","-","-",split("",$rhit->start),("-") x
>>
($qhit->length-(length($rhit->start)+length($rhit->end)+6)),split("",$rhit
>> ->end),"-","-",">");
>>               }
>>       }
>>       print $fo $qseq->display_id,"\n";
>>       print $fo $qseq->seq,"\n";
>>       print $fo @match,"\n";
>>       print $fo @seq,"\n";
>>       print $fo @pos,"\n\n";
>> }
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> [email protected]
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
 
CD: 3ms