4 Dec 2012 22:52
Re: Problem with BIO::DB::FASTA and Colon in Fasta Header
Florent Angly <florent.angly <at> gmail.com>
2012-12-04 21:52:41 GMT
Hi Jason,

See the documentation for seq() at
http://search.cpan.org/~cjfields/BioPerl-1.6.901/Bio/DB/Fasta.pm#OBJECT_METHODS

When you call seq() with a single argument, e.g. $db->seq('C7047455:0-100'), Bio::DB::Fasta interprets it as a compound ID and looks for positions 0 to 100 of a sequence called C7047455. This feature has been in Bio::DB::Fasta since the dawn of time. In this form, seq() expects a colon as part of the compound ID, which is problematic because your sequence ID actually contains a colon.

I think that when you call $db->seq($id,$start,$end), Bio::DB::Fasta does not attempt to parse your ID. This is why your code works with this form.

Note that if you want to get the entirety of a sequence called 'C7047455:0-100', the easiest approach when your sequence names contain a colon is to use $db->get_Seq_by_id('C7047455:0-100'), since get_Seq_by_id() only takes a regular ID (not a compound one).

Florent

On 05/12/12 06:23, Jason Gallant wrote:
> Hello,
>
> I'm trying to retrieve fasta sequences that contain a colon in their
> header. However, I cannot get my BioPerl script to do this!!
>
> It works as expected when the header does not contain the colon, however
> doesn't return anything when it does. Weirdly, when I ask it to return the
> parsed IDs (see below), it returns the appropriate IDs, which include the
> colon! Very confusing, would appreciate any help!!
>
> Many Thanks,
> Jason Gallant
>
>
> use strict;
> use Bio::SearchIO;
> use Bio::DB::Fasta;
>
>
> my ($file,$id,$start,$end) =
> ("secondround_merged_expanded.fasta","C7047455:0-100",1,10);
>
>
> my $db = Bio::DB::Fasta->new($file, -reindex=>1);
> my $seq = $db->seq($id,$start,$end);
>
> print $db->ids;
>
> print $seq,"\n";
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l <at> lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
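
To make the compound-ID ambiguity described in the reply concrete, here is a minimal pure-Perl sketch. It does not use or reproduce Bio::DB::Fasta's internals: split_compound_id() is a hypothetical helper and its regex is an assumption, chosen only to mimic the behaviour described above (a lone "name:start-end" argument is split into a name and a range, while explicit coordinates leave the ID untouched).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT Bio::DB::Fasta's real code: it only mimics the
# idea that a lone "name:start-end" argument is treated as a compound ID.
sub split_compound_id {
    my ($id, $start, $end) = @_;

    # Explicit coordinates given: take the ID literally, colon and all.
    return ($id, $start, $end) if defined $start || defined $end;

    # Single argument: a trailing ":start-end" is read as a subsequence range.
    if ($id =~ /^(.+):(\d+)-(\d+)$/) {
        return ($1, $2, $3);
    }

    # No range suffix: plain ID, whole sequence.
    return ($id, undef, undef);
}

my @single = split_compound_id('C7047455:0-100');
my @triple = split_compound_id('C7047455:0-100', 1, 10);

print "one argument:    name=$single[0] start=$single[1] end=$single[2]\n";
print "three arguments: name=$triple[0] start=$triple[1] end=$triple[2]\n";
```

With one argument, the sketch looks for a sequence named C7047455, which does not exist in a file whose header is literally C7047455:0-100. With the real library, $db->get_Seq_by_id('C7047455:0-100') avoids the question because, as noted in the reply, it only takes a regular ID.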