[Bioperl-l] genome position mapping of RefSeq IDs
robert.citek at gmail.com
Wed Feb 25 15:40:00 EST 2009
I have a list of RefSeq IDs for which I can parse out all the
annotation (exons, SNPs, etc.). For this one project, I need the
same coordinate information relative to the genome rather than the
transcript. Is such mapping information available? Or are the pieces
available so that I can string them together?
A simple use case: I query a dataset with a RefSeq ID and it returns
the genomic coordinates of all of that transcript's introns.
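As a rough illustration of that use case: if the genomic coordinates
of a transcript's exons were already in hand, the introns would just be
the gaps between consecutive exons. A minimal sketch in Python (not
BioPerl); the function name introns_from_exons and the 1-based,
inclusive coordinate convention are assumptions made for illustration,
not part of any existing dataset or library:

```python
from typing import List, Tuple

# Assumed convention: 1-based, inclusive (start, end) positions on one
# genomic sequence, with start <= end regardless of strand.
Interval = Tuple[int, int]


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons as intron intervals.

    Results are in ascending genomic order, which is the reverse of
    transcript order for a minus-strand transcript.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Skip exons that touch or overlap; they leave no gap between them.
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns
```

The missing piece is still the first step, getting exon positions on
the genome rather than on the transcript.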
I've looked at the mapping information at
ftp://ftp.ncbi.nih.gov/gene/DATA, which gets me close but seems to be
missing some parts. Or is that what I'm looking for and I just don't
see how the pieces fit?
Thanks in advance for any pointers in the right direction.
On Wed, Oct 22, 2008 at 2:54 PM, Chris Fields <cjfields at illinois.edu> wrote:
> You can 'epost' in increments if you have more IDs, up to 1000-2000 I think.
> Beyond that, you should probably use one of the mapping files located in
> the ftp.ncbi.nih.gov/gene/DATA folder and just use it locally (initially
> index the data with DB_File, search using a tied hash, etc).
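
For reference, a minimal sketch of the local-index approach quoted
above, written in Python rather than Perl: the standard dbm module
stands in for DB_File and a tied hash. The function names build_index
and lookup are hypothetical, and the column numbers are parameters
because they depend on which mapping file is used; check them against
the header of the file you download.

```python
import dbm


def build_index(mapping_path, index_path, key_column, value_columns):
    """Index a tab-delimited mapping file into an on-disk key/value store.

    key_column and value_columns are 0-based column numbers (assumptions
    supplied by the caller). Lines starting with '#' are treated as
    header or comment lines. If a key occurs on several rows, the last
    row wins.
    """
    with open(mapping_path) as handle, dbm.open(index_path, "n") as db:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            key = fields[key_column]
            value = "\t".join(fields[i] for i in value_columns)
            db[key.encode("utf-8")] = value.encode("utf-8")


def lookup(index_path, key):
    """Return the stored columns for key as a list, or None if absent."""
    with dbm.open(index_path, "r") as db:
        try:
            return db[key.encode("utf-8")].decode("utf-8").split("\t")
        except KeyError:
            return None
```

The index is built once; after that each lookup reads only the on-disk
store and never rescans the mapping file.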