Yin, Jun | 18 Jun 17:49 2013

Re: SimpleAlign Query

Hi, Chris,

No problem. I can help to look at what is the problem. Yes, usually in
Bio::SimpleAlign even if the sequence is reverse complementary, the start
coordinate should be smaller than the end coordinate. But it seems
Bio::AlignIO::bl2seq is a wrapper for Bio::SearchIO::blast. It is a bit
different from the other Bio::AlignIO modules, so there may be some

Can you send me the input file for your script? So I can try to figure out
where the problem is.

I would like to recommend you to ask in the bioperl mail list. There will
be more experts in BioPerl there.

Jun Yin, Ph.D.
Postdoc Associate

James Noonan Group
Department of Genetics
Yale University School of Medicine
Sterling Hall of Medicine, I-120A
333 Cedar Street 
New Haven, CT 06510

On 6/18/13 11:23 AM, "Christopher Field" <chris.field <at> unibas.ch> wrote:

>Dear Dr. Yin,
>I've got a problem with SimpleAlign.pm, and since you are responsible for
>the most recent updates on this module, I hope it's ok to email and ask
>for your help.
>To summarise my short program, it uses bl2seq to throw some sanger reads
>against a genome, and then I'm tiling the reported hits into a visually
>accessible format.
>The problem has arisen, that sometimes the alignment is with the reverse
>complement, but the start and end of both sequences in the Align object
>are such that the start is always lower than the end. The strand flag
>also appears uninitialised. Is this a straightforward bug, or a result of
>something I'm doing in my code?
>Any help would be greatly appreciated.
>Chris Field
>Group van Nimwegen
>Biozentrum der Universit├Ąt Basel
>4056-CH Basel
>#!/usr/bin/env perl;
>use v5.8.1;
>use warnings;
>use strict;
>use Bio::Seq;
>use Bio::SeqIO;
>use Bio::AlignIO;
>use Bio::SimpleAlign;
>use Bio::Tools::Run::StandAloneBlast;
>my $usage = "align.pl reference query\n";
>open (my $fo,'>',"align.out",) or die $!;
>my $rfile = shift or die $usage;
>my $qfile = shift or die $usage;
>my $rstream = Bio::SeqIO->new(-file => $rfile);
>my $qstream = Bio::SeqIO->new(-file => $qfile);
>my $rseq = $rstream->next_seq;
>my $align = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastn',
>-outfile => 'bl2seq.out');
>while (my $qseq = $qstream->next_seq){
>        $align->bl2seq($qseq,$rseq);
>        my $astream = Bio::AlignIO->new(-file => 'bl2seq.out', -format =>
>        my  <at> seq = ("n") x $qseq->length;
>        my  <at> match = (" ") x $qseq->length;
>        my  <at> pos = (" ") x $qseq->length;
>        while (my $align = $astream->next_aln){
>                my $qhit = $align->get_seq_by_pos(1);
>                my $rhit = $align->get_seq_by_pos(2);
>                if(!grep(/[acgt]/, <at> seq[$qhit->start-1 .. $qhit->end-1])){
>                        print $rhit->start,":",$rhit->end,"\n";
>                         <at> seq[$qhit->start-1 .. $qhit->end-1] =
>                         <at> match[$qhit->start-1 .. $qhit->end-1] =
>                         <at> pos[$qhit->start-1 .. $qhit->end-1] =
>("<","-","-",split("",$rhit->start),("-") x
>                }
>        }
>        print $fo $qseq->display_id,"\n";
>        print $fo $qseq->seq,"\n";
>        print $fo  <at> match,"\n";
>        print $fo  <at> seq,"\n";
>        print $fo  <at> pos,"\n\n";