Yin, Jun | 18 Jun 17:49 2013
Picon

Re: SimpleAlign Query

Hi, Chris,

No problem. I can help to look at what is the problem. Yes, usually in
Bio::SimpleAlign even if the sequence is reverse complementary, the start
coordinate should be smaller than the end coordinate. But it seems
Bio::AlignIO::bl2seq is a wrapper for Bio::SearchIO::blast. It is a bit
different from the other Bio::AlignIO modules, so there may be some
inconsistency.

Can you send me the input file for your script? So I can try to figure out
where the problem is.

I would like to recommend you to ask in the bioperl mail list. There will
be more experts in BioPerl there.

Cheers,
-- 
Jun Yin, Ph.D.
Postdoc Associate

James Noonan Group
Department of Genetics
Yale University School of Medicine
Sterling Hall of Medicine, I-120A
333 Cedar Street 
New Haven, CT 06510
http://www.yale.edu/noonanlab/Jun.html

On 6/18/13 11:23 AM, "Christopher Field" <chris.field <at> unibas.ch> wrote:

>Dear Dr. Yin,
>
>I've got a problem with SimpleAlign.pm, and since you are responsible for
>the most recent updates on this module, I hope it's ok to email and ask
>for your help.
>To summarise my short program, it uses bl2seq to throw some sanger reads
>against a genome, and then I'm tiling the reported hits into a visually
>accessible format.
>The problem has arisen, that sometimes the alignment is with the reverse
>complement, but the start and end of both sequences in the Align object
>are such that the start is always lower than the end. The strand flag
>also appears uninitialised. Is this a straightforward bug, or a result of
>something I'm doing in my code?
>Any help would be greatly appreciated.
>
>Chris Field
>Group van Nimwegen
>Biozentrum der Universit├Ąt Basel
>4056-CH Basel
>
>Code:
>
>#!/usr/bin/env perl;
>
>use v5.8.1;
>use warnings;
>use strict;
>
>use Bio::Seq;
>use Bio::SeqIO;
>use Bio::AlignIO;
>use Bio::SimpleAlign;
>use Bio::Tools::Run::StandAloneBlast;
>
>my $usage = "align.pl reference query\n";
>
>open (my $fo,'>',"align.out",) or die $!;
>
>my $rfile = shift or die $usage;
>my $qfile = shift or die $usage;
>
>my $rstream = Bio::SeqIO->new(-file => $rfile);
>my $qstream = Bio::SeqIO->new(-file => $qfile);
>
>my $rseq = $rstream->next_seq;
>
>my $align = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastn',
>-outfile => 'bl2seq.out');
>
>while (my $qseq = $qstream->next_seq){
>        $align->bl2seq($qseq,$rseq);
>        my $astream = Bio::AlignIO->new(-file => 'bl2seq.out', -format =>
>'bl2seq');
>        my  <at> seq = ("n") x $qseq->length;
>        my  <at> match = (" ") x $qseq->length;
>        my  <at> pos = (" ") x $qseq->length;
>        while (my $align = $astream->next_aln){
>                my $qhit = $align->get_seq_by_pos(1);
>                my $rhit = $align->get_seq_by_pos(2);
>                if(!grep(/[acgt]/, <at> seq[$qhit->start-1 .. $qhit->end-1])){
>                        print $rhit->start,":",$rhit->end,"\n";
>                         <at> seq[$qhit->start-1 .. $qhit->end-1] =
>split("",$rhit->seq);
>                         <at> match[$qhit->start-1 .. $qhit->end-1] =
>split("",$align->match_line);
>                         <at> pos[$qhit->start-1 .. $qhit->end-1] =
>("<","-","-",split("",$rhit->start),("-") x
>($qhit->length-(length($rhit->start)+length($rhit->end)+6)),split("",$rhit
>->end),"-","-",">");
>                }
>        }
>        print $fo $qseq->display_id,"\n";
>        print $fo $qseq->seq,"\n";
>        print $fo  <at> match,"\n";
>        print $fo  <at> seq,"\n";
>        print $fo  <at> pos,"\n\n";
>}

Gmane