Alexey Morozov | 13 Sep 10:06 2013

GI to taxonomy

Dear colleagues,
I have a set of FASTA genomes annotated only by their GI numbers, and I
need to get the taxonomy for each of them. So I rewrote the get_tree method from

 12 open GIS, GI_FILE;
 13 my $tax_db=Bio::DB::Taxonomy->new(source=>'entrez');
 14 my $tree;
 15 while (<GIS>)
 16         {
 17         print "Requesting data for GI $_\n";
 18         my $taxon=$tax_db->get_taxon(-gi=>"$_",-db=>'nucleotide');
 19         if ($@){die $@;}#Catch exception and die, just in case
 20                 # or die "Cannot get taxonomy data for GI$_:$!\n";
 21         if ($tree)
 22                 {
 23                 $tree->merge_lineage($taxon);
 24                 }else
 25                 {
 26                 $tree=Bio::Tree::Tree->new(-node=>$taxon);
 27                 }
 28         }

This way of invoking Bio::DB::Taxonomy->get_taxon (line 18) is documented, so
I expect to get taxon objects back. Yet all I get is:

--------------------- WARNING ---------------------
MSG: Must have provided a valid HASHref for -params
Requesting data for GI 15604717

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Can't query website: 400 URL must be absolute
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/
STACK: Bio::DB::Taxonomy::entrez::get_taxon
STACK: ./get_tax:18

Using Bio::DB::Taxonomy::entrez directly doesn't help either. The GI numbers
load OK, so I have no clue what is going wrong.
Any help would be appreciated.

Alexey Morozov,
LIN SB RAS, bioinformatics group.
Irkutsk, Russia.