[Bioperl-l] Getting chromosome from GenPept?

Barry Moore barry.moore at genetics.utah.edu
Tue Jul 27 17:55:40 EDT 2004

Well, how about reading loc2acc 
(ftp://ftp.ncbi.nih.gov/refseq/LocusLink/loc2acc) into a hash with the 
accession as key and LocusLink ID as value.  Use that to convert from gi 
to LL ID; then, having read LL.out (or the LL.out.??.gz for your organism) 
into a hash with LL ID as key and the other fields (or just the 
chromosome number in column 5) as a hash of values, you use the LL ID to 
look up the chromosome.  Better yet, combine the two files above into one 
hash with accession as key, and do a one-step lookup.  Sexy...very sexy - 
got to run take a cold shower.
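
Roughly, the two-hash idea could look like the sketch below. It is only a
sketch: the column positions passed to read_map() are assumptions (apart
from the chromosome being in column 5 of LL.out), so check them against the
real files first. The names read_map and gi_to_chromosome are made up for
illustration, and the two in-memory "files" are hypothetical one-line
stand-ins built from the gi, LocusID and chromosome mentioned in this
thread. The hash is keyed on gi here because that is what the original
question starts from.

```perl
use strict;
use warnings;

# Build a hash from one column of a tab-delimited file to another.
# Column indexes are 0-based and are ASSUMPTIONS about the file layout;
# verify them against the real loc2acc and LL.out files before use.
sub read_map {
    my ($fh, $key_col, $val_col) = @_;
    my %map;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip comment and blank lines
        my @f = split /\t/, $line, -1;     # -1 keeps trailing empty fields
        next unless defined $f[$key_col] && defined $f[$val_col];
        next unless length $f[$key_col];
        $map{ $f[$key_col] } = $f[$val_col];
    }
    return \%map;
}

# Combine the two maps into a single gi -> chromosome hash,
# so that later lookups take one step instead of two.
sub gi_to_chromosome {
    my ($gi2ll, $ll2chr) = @_;
    my %gi2chr;
    for my $gi (keys %$gi2ll) {
        my $chr = $ll2chr->{ $gi2ll->{$gi} };
        $gi2chr{$gi} = $chr if defined $chr && length $chr;
    }
    return \%gi2chr;
}

# Hypothetical one-line stand-ins for loc2acc and LL.out.
# Only the values 23272966, 11947 and 10 come from this thread;
# the accession and the "x" filler fields are placeholders.
my $loc2acc = "11947\tHYPOTHETICAL_ACC\t23272966\n";
my $ll_out  = "11947\tx\tx\tx\t10\n";

open my $acc_fh, '<', \$loc2acc or die "cannot open in-memory loc2acc: $!";
open my $ll_fh,  '<', \$ll_out  or die "cannot open in-memory LL.out: $!";

my $gi2ll  = read_map($acc_fh, 2, 0);  # assumed: gi in column 3, LL ID in column 1
my $ll2chr = read_map($ll_fh,  0, 4);  # assumed: LL ID in column 1; chromosome in column 5
my $gi2chr = gi_to_chromosome($gi2ll, $ll2chr);

print "gi 23272966 -> chromosome $gi2chr->{23272966}\n";
```

For the real files you would open loc2acc and the (gunzipped) LL.out with
ordinary filehandles instead of the in-memory strings; read_map() does not
care where the lines come from.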


Leo, Paul (NIH/NHGRI) wrote:

>I have a bunch of proteins denoted by their gi number from a Mass Spect.
>Expt. which I want to organize. I get the protein "details" from
>Bio::DB::GenPept using the GI number (want to find Gene names and other
>properties  etc ... ).
>I also want the chromosome if it is available. I usually get these from
>"primary_tag='source'  tag='chromosome'" in the sequence object but this is
>not always present eg
>Gi= 23272966  has no chromosome info  (see
>But if I follow the link through the LocusID (CDS primary tag) to
>LocusLink ( http://www.ncbi.nlm.nih.gov/LocusLink/LocRpt.cgi?l=11947 )
>I find that it is chromosome 10....
>What is the "Bio-Perl" way to get this info??? Otherwise I was just going to
>get the $LocusID from the $seq object and get the page 
><http://www.ncbi.nlm.nih.gov/LocusLink/LocRpt.cgi?l=$LocusID>  ... strip the
>HTML and then get the chromosome.
>Is there a more sexy way to do this?

Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT
