[Bioperl-l] to convert cDNA id of nucleotide database to geneacc.id of gene database of ncbi

Cui, Wenwu (NIH/NLM/NCBI) [C] cuiw at ncbi.nlm.nih.gov
Mon Nov 20 10:45:13 EST 2006

This script only works for 1 ACC to 1 Gene ID. 

perl -MLWP::Simple -e '($id =
erm=AK070197[acc]")) =~ s/(\A.*<Id>)|(<\/Id>.*\Z)//gxms; getprint
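
For accessions that map to more than one Gene ID, the same idea can be written as a short script that collects every <Id> element instead of keeping only one. This is a rough sketch, not the one-liner above restored: the ESearch endpoint URL and db=gene are assumptions supplied here, the function names extract_ids and gene_ids_for are hypothetical, and the sample XML and its ID are made up for illustration.

```perl
use strict;
use warnings;

# Return every <Id> value found in an ESearch XML response.
sub extract_ids {
    my ($xml) = @_;
    return $xml =~ m{<Id>(\d+)</Id>}g;
}

# Fetch Gene IDs for one accession (needs network access and LWP::Simple).
# ASSUMPTION: the endpoint URL and db=gene are not taken from the original
# post; only the "[acc]" field qualifier appears there.
sub gene_ids_for {
    my ($acc) = @_;
    require LWP::Simple;
    my $url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi'
            . '?db=gene&term=' . $acc . '[acc]';
    my $xml = LWP::Simple::get($url);
    return defined $xml ? extract_ids($xml) : ();
}

# Offline demonstration on a made-up response (the IDs are not real data).
my $sample = '<eSearchResult><IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>';
my @ids = extract_ids($sample);
print join(',', @ids), "\n";
```

Calling gene_ids_for('AK070197') would then return a list that may hold zero, one, or several Gene IDs, so the caller can decide how to handle ambiguous accessions.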

Wenwu Cui

-----Original Message-----
From: Davis, Sean (NIH/NCI) [E] 
Sent: Monday, November 20, 2006 8:32 AM
To: bioperl-l at lists.open-bio.org
Cc: bikash lohia; bioperl1
Subject: Re: [Bioperl-l] to convert cDNA id of nucleotide database to
geneacc.id of gene database of ncbi

On Monday 20 November 2006 07:45, bikash lohia wrote:
> hello group, I am new to this group and want a help. i have list of
> id of rice (oryza sativa) such as AK070197, AK105331 etc. i have to
> manually search gene database of NCBI for converting this accession no. of
> cDNA (eg. AK070197) to gene id of oryza sativa to get os******* gene id. i
> want to do it through perl programming where the program directly takes the
> list of id (such as AK105331, AK070197) from notepad file and searches
> gene database of ncbi to give results in accession id starting with
> OS******. i want only the accession id of corresponding AK***** id.
> example - AK070197 of nucleotide database = Os02g0669100 of gene
> i want to convert all this AK***** ids to OS***** ids through
> in perl/bioperl as manually not possible for long list. please help.
> have no idea how can the code be. with thanks in advance from Bikash

There is some useful data at:


The README file contains details of each file.  The gene2accession.gz file
contains GenBank accession numbers and maps them back to Entrez Gene IDs.
The gene_info.gz file contains the Entrez Gene summary information for all
Genes.  With these two files (loaded into appropriate perl hashes), your
task can be completed relatively easily.
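
A minimal sketch of the two-hash approach follows. It assumes both files are tab-delimited with GeneID in column 2, the RNA nucleotide accession.version in column 4 of gene2accession, and the locus tag in column 4 of gene_info; check the README before relying on that layout. The function names are hypothetical, and the sample rows (including the Gene ID 1234567) are made up so the script runs without downloading anything.

```perl
use strict;
use warnings;

# Build accession -> { GeneID => 1 } from gene2accession-style lines.
sub load_acc2gene {
    my ($fh) = @_;
    my %acc2gene;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;              # skip header line
        chomp $line;
        my @f = split /\t/, $line;
        my ($gene_id, $acc) = @f[1, 3];     # ASSUMED column positions
        next unless defined $acc && $acc ne '-';
        $acc =~ s/\.\d+$//;                 # drop version suffix (AK070197.1)
        $acc2gene{ uc($acc) }{$gene_id} = 1;
    }
    return \%acc2gene;
}

# Build GeneID -> locus tag from gene_info-style lines.
sub load_gene2locus {
    my ($fh) = @_;
    my %gene2locus;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;
        chomp $line;
        my @f = split /\t/, $line;
        $gene2locus{ $f[1] } = $f[3] if defined $f[3];  # ASSUMED columns
    }
    return \%gene2locus;
}

# Return the locus tags (or bare Gene IDs) for one accession; empty if unknown.
sub convert {
    my ($acc, $acc2gene, $gene2locus) = @_;
    my $genes = $acc2gene->{ uc($acc) } or return;
    return map { $gene2locus->{$_} // "GeneID:$_" } sort keys %$genes;
}

# Made-up sample rows standing in for the real (gunzipped) files.
my $g2a = "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version\n"
        . "39947\t1234567\t-\tAK070197.1\n";
my $gi  = "#tax_id\tGeneID\tSymbol\tLocusTag\n"
        . "39947\t1234567\tOs02g0669100\tOs02g0669100\n";
open my $fh1, '<', \$g2a or die "cannot open sample: $!";
open my $fh2, '<', \$gi  or die "cannot open sample: $!";
my $acc2gene   = load_acc2gene($fh1);
my $gene2locus = load_gene2locus($fh2);

# One output line per input accession; unmapped accessions print alone.
print join("\t", $_, convert($_, $acc2gene, $gene2locus)), "\n"
    for qw(AK070197 AK105331);
```

With the real files, the two in-memory filehandles would be replaced by handles on the gunzipped downloads, and the accession list would be read from the text file one ID per line.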

Alternatively, you could use the eUtils modules, which I think are
available only via CVS.

Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
