From: Jason Stajich <jason.stajich <at> gmail.com>
Subject: Re: Genbank query problem
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Monday 1st October 2012 01:09:30 UTC
Are they organized in the bioprojects at least?

I've been working on something related, dumping genomes based on what is
in the bioprojects part of NCBI.

It isn't documented yet since it's still in development, but you can try
these three scripts. You need to specify a place to write with the -b
option, and you'll want to change the query for the first script with the
-q option.

- Set the query to the taxon you want bioprojects from:
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_eutils_bioproject.pl
- Then run this to download the sequences:
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_sequences_from_bioproject_staging.pl
- Then run this to get the assemblies that for some reason aren't listed
as nucleotide IDs in the bioprojects file but can be gleaned from the
GenBank file (maybe not needed for your MT genome project anyway):
https://github.com/hyphaltip/mobedac-fungi/blob/master/scripts/download_sequences_from_bioproject_cleanup_missing.pl
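
For changing the taxon in a query, a small helper can assemble the Entrez term from a taxonomy ID instead of hand-editing the string. This is only a sketch: build_mito_term is a hypothetical name, the field tags are copied from the query quoted below, and whether the -q option of the first script accepts the same fields is an assumption.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl or the scripts above):
# build an Entrez search term for mitochondrial genomes under one
# NCBI taxonomy ID. Field tags mirror the query quoted in this thread.
sub build_mito_term {
    my ($taxid) = @_;
    die "taxon ID must be numeric\n"
        unless defined $taxid && $taxid =~ /^\d+$/;

    # Exclusions shared by both clauses of the original query.
    my $exclude = 'NOT plasmid[title] NOT chromosome NOT chloroplast';

    # One clause per spelling used in record titles.
    my @clauses = map {
        sprintf '(txid%d[Organism:exp] AND %s[title] AND genome[ti] %s)',
            $taxid, $_, $exclude
    } qw(mitochondrial mitochondrion);

    return join ' OR ', @clauses;
}

# Print the term for the taxon from the quoted message when run directly.
print build_mito_term(314147), "\n" unless caller;
```

The resulting string could then be passed as the -query value to Bio::DB::Query::GenBank->new, or tried with -q after checking what that script expects.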

Jason
On Sep 29, 2012, at 10:30 AM, Federico Abascal wrote:

> Dear colleagues,
> 
> I have a script (mitobank.pl) that is used by some people. It is meant to
> retrieve mitochondrial genomes for a given taxonomic ID. The problem arose
> some months ago, when the NCBI reorganized the way genomes are queried and
> the script stopped working. I have tried modifying the query string, with
> no success.
> 
> The script's query looked like this:
> 
> use Bio::DB::GenBank;
> use Bio::DB::Query::GenBank;
> 
> my $seq;
> my $gb = Bio::DB::GenBank->new;
> # The Entrez term must be a quoted string; it is split here for width.
> my $query = Bio::DB::Query::GenBank->new(
>     -query => '(txid314147[Organism:exp] AND mitochondrial[title] AND '
>             . 'genome[ti] NOT plasmid[title] NOT chromosome NOT chloroplast) OR '
>             . '(txid314147[Organism:exp] AND mitochondrion[title] AND '
>             . 'genome[ti] NOT plasmid[title] NOT chromosome NOT chloroplast)',
>     -db    => 'genome',
> );
> 
> It used to return the list of genomes available for that taxonomic ID.
> However, the NCBI now returns a different kind of result.
> I tried modifying the script to query the "nucleotide" database, but
> this does not work properly.
> 
> Could anyone help me, please?
> 
> Thanks in advance,
> Federico
> 

Jason Stajich