[Bioperl-l] Download sequence annotations without sequence ??
sgoegel at gmail.com
Fri Jul 1 18:27:44 EDT 2005
I have scripts and modules set up to, for a given blast report, go through and
download sequences (when not available locally) for certain subjects (hits)
and extract information such as db_xref fields, geneontology annotations,
taxon ID, and features.
The one thing I am not using is the actual DNA or amino acid sequence itself.
For large sequences such as genomic DNA, which can be several megabases in
size or more, it is impractical to download the entire sequence, which I do
not need.
My question is, does Bioperl currently have a way to download only the
annotations/features associated with a sequence (in GenBank format, for
example), but not the sequence itself? If NCBI does not currently offer a way
to do that, it would be enough to terminate the connection with the server
when the ORIGIN line is reached.
Of course, that would limit downloads to one sequence per query, which is perfectly
fine under the circumstances.
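
As a rough illustration of the "stop at ORIGIN" idea (written in Python rather than Perl, and not using any Bioperl or NCBI API), the sketch below reads a GenBank flat-file stream line by line and stops as soon as the ORIGIN line appears. The function name annotation_lines and the sample record are made up for the example. With a real network stream, the caller would close the connection once the generator finishes.

```python
import io
from typing import Iterable, Iterator


def annotation_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield GenBank flat-file lines up to, but not including, ORIGIN."""
    for line in lines:
        if line.startswith("ORIGIN"):
            # Everything after this point is sequence data; stop reading.
            return
        yield line


# Hypothetical miniature record standing in for a server response.
sample = io.StringIO(
    "LOCUS       TEST0001   12 bp    DNA     linear   UNA 01-JUL-2005\n"
    "FEATURES             Location/Qualifiers\n"
    "     source          1..12\n"
    '                     /db_xref="taxon:9606"\n'
    "ORIGIN\n"
    "        1 acgtacgtac gt\n"
    "//\n"
)

header = list(annotation_lines(sample))
print("".join(header), end="")
```

Because the generator returns at ORIGIN, it handles only the first record in the stream, which matches the one-sequence-per-query limitation described above.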
For pipelined downloads (the default), Perl's $/ input record separator would
have to be modified accordingly. I have done this, but I want to make sure it
is not already a standard function of any part of Bioperl. Also, if Bioperl does not
currently do this, is there interest in a patch to add this functionality
(assuming I get around to making one)?
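
For a stream that carries several records, one possible approach is to split on the "//" terminator line and drop everything between ORIGIN and that terminator. The sketch below is in Python, not Perl, and does not touch $/ or any Bioperl internals; the name records_without_sequence and the sample data are hypothetical. Unlike closing the connection, this still transfers the sequence bytes and only discards them on the client side.

```python
import io
from typing import Iterable, Iterator, List


def records_without_sequence(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each GenBank record as a list of lines, minus the ORIGIN block."""
    record: List[str] = []
    in_sequence = False
    for line in lines:
        if line.startswith("//"):
            # End of record: emit what was collected and reset state.
            if record:
                yield record
            record, in_sequence = [], False
        elif line.startswith("ORIGIN"):
            # Start of sequence data: skip lines until the terminator.
            in_sequence = True
        elif not in_sequence:
            record.append(line)
    if record:
        # Tolerate a final record with no terminator line.
        yield record


# Hypothetical two-record stream.
stream = io.StringIO(
    "LOCUS       TEST0001   8 bp    DNA\n"
    "FEATURES             Location/Qualifiers\n"
    "ORIGIN\n"
    "        1 acgtacgt\n"
    "//\n"
    "LOCUS       TEST0002   4 bp    DNA\n"
    "ORIGIN\n"
    "        1 ttaa\n"
    "//\n"
)

records = list(records_without_sequence(stream))
for rec in records:
    print(rec[0].split()[1], len(rec), "annotation lines")
```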