From: Warren Gallin <wgallin <at> ualberta.ca>
Subject: Protein Records without Sequence
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Wednesday 5th June 2013 18:16:57 UTC (over 3 years ago)
Hi,

I am encountering a problem with a number of protein records.

An HMMER search of the nr database returns a gi number and an associated
sequence.

When I use that gi number to try to retrieve the full GENBANK record,
however, there is no sequence returned with the record.

When I use the NCBI web interface with that gi number, the GENPEPT record
is returned with no sequence, but when I select FASTA format the sequence
is returned.

Examples of gi numbers for which this occurs are:

23099847
21224301
68536697
46580017
77359109

Is this a flaw in the individual GENPEPT records?  If so, should I report
it to NCBI?

Or are these some kind of "special" record that needs different parameters
passed to the eutils search?

There is a workaround, I guess, where if the sequence comes back empty then
a new retrieval of FASTA-formatted records can be run and the empty field
in the GENPEPT record repopulated, but this seems inelegant.
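
The workaround described above might be sketched roughly as follows. This is
only an illustration in Python using the standard library, not BioPerl code.
It assumes the NCBI efetch endpoint with db=protein and rettype=gp or
rettype=fasta, and assumes the service still accepts gi numbers as ids. The
function names (efetch_text, genpept_sequence, fasta_sequence,
sequence_with_fallback) are hypothetical helpers made up for this sketch.

```python
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (assumed to accept gi numbers as ids).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_text(gi: str, rettype: str) -> str:
    """Fetch one protein record as plain text (performs a network call)."""
    query = urlencode(
        {"db": "protein", "id": gi, "rettype": rettype, "retmode": "text"}
    )
    with urlopen(f"{EFETCH}?{query}") as resp:
        return resp.read().decode("utf-8")


def genpept_sequence(record: str) -> str:
    """Return the residues listed after ORIGIN in a GenPept flat file, or ''."""
    residues = []
    in_origin = False
    for line in record.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            break
        if in_origin:
            # Sequence lines look like "        1 mktaal ...": keep letters only.
            residues.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(residues).upper()


def fasta_sequence(fasta: str) -> str:
    """Join the sequence lines of a single FASTA record, skipping the header."""
    return "".join(
        line.strip()
        for line in fasta.splitlines()
        if line and not line.startswith(">")
    ).upper()


def sequence_with_fallback(
    gi: str, fetch: Callable[[str, str], str] = efetch_text
) -> str:
    """Try the GenPept record first; if it has no sequence, retry as FASTA."""
    seq = genpept_sequence(fetch(gi, "gp"))
    if seq:
        return seq
    return fasta_sequence(fetch(gi, "fasta"))
```

Calling sequence_with_fallback("23099847") would make one request, or two if
the GenPept record comes back without a sequence. The fetch argument is there
so the fallback logic can be checked without touching the network.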

All advice and/or commentary appreciated.

Warren Gallin
 