Jason Stajich | 1 Oct 03:09 2012

Re: Genbank query problem

Are they organized in the bioprojects at least?

I've been working on something related: dumping genomes based on what is in the BioProject part of NCBI.

It isn't documented yet since it's still in development, but you can try these three scripts. You need to
give it a place to write with the -b option, and you'll want to change the query for the first script with
the -q option.

- fix the query to the taxon you want bioprojects from:
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_eutils_bioproject.pl
- then run this to download the sequences:
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_sequences_from_bioproject_staging.pl
- then run this to get the assemblies that for some reason aren't available as nucleotide IDs in the
bioprojects file but can be gleaned from the GenBank file (maybe not needed for your MT genome project anyway):
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_sequences_from_bioproject_cleanup_missing.pl
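
On the query string itself: the one in the quoted message below repeats the same clause twice, differing
only in the title keyword. A minimal sketch of building it from a taxon ID follows. The helper name
build_mito_query is hypothetical (it is not part of mitobank.pl or BioPerl), and this does not address the
change on the NCBI side; it only makes the string easier to edit while trying other databases.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of mitobank.pl or BioPerl: rebuild the
# Entrez query from the quoted message for any NCBI taxonomy ID.
sub build_mito_query {
    my ($taxid) = @_;
    die "taxonomy ID must be numeric\n"
        unless defined $taxid && $taxid =~ /^\d+$/;

    # Terms shared by both clauses of the original query.
    my $organism = sprintf 'txid%d[Organism:exp]', $taxid;
    my $filters  = 'genome[ti] NOT plasmid[title] NOT chromosome NOT chloroplast';

    # One clause per title keyword, joined with OR as in the original.
    my @clauses = map {
        sprintf '(%s AND %s[title] AND %s)', $organism, $_, $filters
    } qw(mitochondrial mitochondrion);

    return join ' OR ', @clauses;
}

print build_mito_query(314147), "\n";
```

The returned string can be passed as the -query argument to Bio::DB::Query::GenBank->new, with -db set to
whichever database you are testing against.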

Jason
On Sep 29, 2012, at 10:30 AM, Federico Abascal <fedeabascal <at> yahoo.es> wrote:

> Dear colleagues,
> 
> I have a script (mitobank.pl) that is used by some people. It is meant to retrieve mitochondrial genomes
> for a given taxonomic id. The problem arose some months ago, when the NCBI reorganized the way genomes are
> queried and the script stopped working. I have tried modifying the query string, with no success.
> 
> The script's query looked like this:
> 
> 
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
> my $seq;
> my $gb = Bio::DB::GenBank->new;
> my $query = Bio::DB::Query::GenBank->new(
>     -query => '(txid314147[Organism:exp] AND mitochondrial[title] AND genome[ti] '
>             . 'NOT plasmid[title] NOT chromosome NOT chloroplast) OR '
>             . '(txid314147[Organism:exp] AND mitochondrion[title] AND genome[ti] '
>             . 'NOT plasmid[title] NOT chromosome NOT chloroplast)',
>     -db    => 'genome',
> );
> 
> It used to return the list of genomes available for that taxonomic id. However, the NCBI now returns a
> different kind of result.
> I tried to modify the script to query the "nucleotide" database, but this does not work properly.
> 
> Could anyone help me, please?
> 
> Thanks in advance,
> Federico
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l <at> lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

Jason Stajich
jason.stajich <at> gmail.com
jason <at> bioperl.org
