[Bioperl-l] How to get from gi/ref/gb to genomic coordinates ?

Jason Stajich jason at bioperl.org
Thu Feb 1 13:36:02 EST 2007

On Feb 1, 2007, at 9:55 AM, Chris Fields wrote:

> On Feb 1, 2007, at 6:54 AM, Rainer Machne wrote:
>> Barry and Jason,
>> thanks for your quick and very helpful replies.
>> I guess we should have done (or repeat) our blast search at
>> http://fungal.genome.duke.edu/
>> to get better mapping from proteins to genomes ?

Well, I'm not quite sure of your exact goals: to find upstream  
regions of known genes, or to look at upstream regions of orthologous  
genes?

You can first figure out orthologs based on protein similarities,  
then go in and extract upstream regions for the orthologous genes (I  
provide a link to a big all-vs-all FASTA result at the bottom of the  
page if you want those results, as well as some pairwise orthology  
assignments, although you may want more or less stringent parameters).

All the GFF and AA data is freely available for download on the site  
for each genome we've annotated or for annotation we've re-formatted  
so you can do things locally and/or modify it to your liking.
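
As a rough illustration of the "extract upstream regions" step once you
have the GFF and the genome sequence locally, here is a minimal sketch in
plain Python. It is not code from the site or from Bioperl: the function
names, the toy chromosome, and the sample GFF line are all made up for the
example. It assumes standard 9-column GFF with 1-based inclusive
coordinates, and that the chromosome sequence is already in memory as a
string.

```python
# Sketch only: strand-aware upstream extraction for one GFF gene feature.
# All names and the sample data below are hypothetical.

def revcomp(seq):
    """Reverse-complement a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def parse_gff_line(line):
    """Return (seqid, type, start, end, strand) from one 9-column GFF line."""
    cols = line.rstrip("\n").split("\t")
    # Columns: seqid, source, type, start, end, score, strand, phase, attributes
    return cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6]

def upstream_region(chrom_seq, start, end, strand, length=500):
    """Return up to `length` bases upstream of a feature, in the gene's orientation."""
    if strand == "-":
        # For a minus-strand gene, upstream lies past the end coordinate
        # on the reference, so slice there and reverse-complement.
        return revcomp(chrom_seq[end:end + length])
    # Plus strand: bases immediately before the 1-based start,
    # clipped at the beginning of the sequence.
    lo = max(0, start - 1 - length)
    return chrom_seq[lo:start - 1]

chrom = "AAACCCGGGTTTACGTACGT"                       # toy chromosome
gff = "chr1\tannot\tgene\t7\t12\t.\t+\t.\tID=gene1"  # toy gene feature

seqid, ftype, start, end, strand = parse_gff_line(gff)
print(upstream_region(chrom, start, end, strand, length=4))  # prints ACCC
```

For real data you would loop over the gene features of each ortholog and
pick a window size that suits your analysis; the 500 bp default here is
arbitrary.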

>> As I retrieved all my proteins via whole genome blasts we should find
>> (most of) them in the genbank files ... a good opportunity for me to
>> learn some Bioperl and the other packages you mentioned in case we  
>> want
>> to do more complex analysis later :-)
>> Thank you very much!
>> Rainer
> If the data is available in GenBank you could run the BLAST  
> searches at NCBI and limit the search with an Entrez query:
> http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
> Most (all?) genome files are tagged as complete.
> I'm not sure but there might be a way of doing this via  
> Bio::Tools::Run::RemoteBlast.  Jason, any ideas?
> chris

Jason Stajich
Miller Research Fellow
University of California, Berkeley
lab: 510.642.8441
