[Bioperl-l] to convert cDNA id of nucleotide database to geneacc.id of gene database of ncbi
Cui, Wenwu (NIH/NLM/NCBI) [C]
cuiw at ncbi.nlm.nih.gov
Mon Nov 20 10:45:13 EST 2006
This one-liner only works when one accession maps to exactly one Gene ID.
perl -MLWP::Simple -e '($id =
erm=AK070197[acc]")) =~ s/(\A.*<Id>)|(<\/Id>.*\Z)//gxms; getprint
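The one-accession-to-one-ID limitation comes from stripping everything around a single <Id> element. A minimal sketch of a variant that collects every <Id> instead is below. It assumes the search result is XML with one <Id>...</Id> element per hit, as the regex above implies. The name extract_ids is hypothetical, and fetching the XML is left out.

```perl
use strict;
use warnings;

# Return every numeric <Id> value found in an ESearch-style XML string.
# Assumption: each hit appears as <Id>digits</Id>, as in the one-liner above.
sub extract_ids {
    my ($xml) = @_;
    return () unless defined $xml;      # e.g. the fetch failed
    my @ids = $xml =~ m{<Id>\s*(\d+)\s*</Id>}g;
    return @ids;
}
```

With this, an accession that maps to several genes gives a list of Gene IDs, and an accession with no hit gives an empty list.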
From: Davis, Sean (NIH/NCI) [E]
Sent: Monday, November 20, 2006 8:32 AM
To: bioperl-l at lists.open-bio.org
Cc: bikash lohia; bioperl1
Subject: Re: [Bioperl-l] to convert cDNA id of nucleotide database to
geneacc.id of gene database of ncbi
On Monday 20 November 2006 07:45, bikash lohia wrote:
> Hello group, I am new to this group and would like some help. I have a list of
> rice (Oryza sativa) cDNA accession IDs such as AK070197, AK105331, etc. At
> present I have to manually search the NCBI Gene database to convert each cDNA
> accession (e.g. AK070197) to the Oryza sativa gene ID (Os*******). I
> want to do this through Perl programming, where the program reads the
> list of IDs (such as AK105331, AK070197) directly from a Notepad file and
> searches the NCBI Gene database, giving results as IDs starting with
> Os******. I want only the ID corresponding to each AK***** ID.
> Example: AK070197 in the nucleotide database = Os02g0669100 in the Gene database.
> I want to convert all these AK***** IDs to Os***** IDs with a program
> in Perl/BioPerl, as doing it manually is not possible for a long list. Please
> help; I have no idea what the code should look like. With thanks in advance, Bikash
There is some useful data at:
The README file contains details of each file. The gene2accession.gz file
contains GenBank accession numbers and maps them back to Entrez Gene IDs. The
gene_info.gz file contains the Entrez Gene summary information for all
genes. With these two files (loaded into appropriate Perl hashes), your
task can be completed relatively easily.
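A rough sketch of the two-hash approach follows. It rests on assumptions to check against the README: both files are tab-delimited, gene2accession has the GeneID in column 2 and the RNA accession.version in column 4, gene_info has the GeneID in column 2 and the locus tag in column 4, and the Os******* identifier is stored as that locus tag. The subroutine names are made up for illustration.

```perl
use strict;
use warnings;

# Build accession => { GeneID => 1 } from gene2accession-style lines.
# Assumed columns: [1] GeneID, [3] RNA nucleotide accession.version.
sub load_accession_map {
    my ($fh) = @_;
    my %gene_ids_for;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;              # skip the header line
        chomp $line;
        my @f = split /\t/, $line;
        next if @f < 4;
        my ($gene_id, $acc) = @f[1, 3];
        next if $acc eq '-';                # '-' marks an empty field
        $acc =~ s/\.\d+$//;                 # drop the version suffix
        $gene_ids_for{ uc($acc) }{$gene_id} = 1;
    }
    return \%gene_ids_for;
}

# Build GeneID => locus tag from gene_info-style lines.
# Assumed columns: [1] GeneID, [3] LocusTag.
sub load_locus_tags {
    my ($fh) = @_;
    my %locus_tag_for;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;
        chomp $line;
        my @f = split /\t/, $line;
        next if @f < 4;
        $locus_tag_for{ $f[1] } = $f[3];
    }
    return \%locus_tag_for;
}

# Return a list of [GeneID, locus tag] pairs for one accession,
# or an empty list if the accession is not in the map.
sub lookup {
    my ($acc, $gene_ids_for, $locus_tag_for) = @_;
    $acc =~ s/^\s+|\s+$//g;                 # trim whitespace from the input line
    $acc =~ s/\.\d+$//;                     # accept versioned accessions too
    my $hits = $gene_ids_for->{ uc($acc) } or return;
    return map {
        my $tag = $locus_tag_for->{$_};
        [ $_, defined $tag ? $tag : '-' ]
    } sort { $a <=> $b } keys %$hits;
}
```

To run it on the real files, one option is to open each through a gzip pipe, for example open(my $fh, '-|', 'gzip', '-dc', 'gene2accession.gz'), then call lookup once per line of the ID list. Because gene2accession covers all organisms, it is probably worth filtering on the tax_id column first to keep the hash small.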
Alternatively, you could use the eUtils modules, which I think are available
only via CVS.