[Bioperl-l] getting proteins matching GO
Davis, Sean (NIH/NHGRI)
sdavis2 at mail.nih.gov
Fri Nov 5 23:37:36 EST 2004
If I'm not mistaken, the GO database contains mappings from GOA.
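
As a rough illustration of using those GOA mappings directly, here is a minimal Python sketch (not BioPerl) that scans a GOA-style tab-separated association file and collects the accessions annotated to one GO term. It assumes the usual gene association column order (database, object ID, symbol, qualifier, GO ID, ...). The function name, the sample rows, and the GO IDs are made up for illustration.

```python
import io


def proteins_for_go_term(handle, go_id):
    """Return sorted (database, accession) pairs annotated to go_id.

    Assumes tab-separated rows where column 1 is the database, column 2
    the object ID, column 4 the qualifier and column 5 the GO ID.
    """
    hits = set()
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and '!' comment lines.
        if not line or line.startswith("!"):
            continue
        cols = line.split("\t")
        if len(cols) < 5:
            continue  # malformed row, ignore it
        db, accession, qualifier, term = cols[0], cols[1], cols[3], cols[4]
        # A NOT qualifier means the protein is explicitly *not* annotated.
        if term == go_id and "NOT" not in qualifier.split("|"):
            hits.add((db, accession))
    return sorted(hits)


# Made-up sample rows; GO:9999999 and the ACC… accessions are placeholders.
SAMPLE = (
    "!made-up association rows for illustration\n"
    "UniProt\tACC0001\tgeneA\t\tGO:9999999\tREF:x\tIEA\t\tP\n"
    "UniProt\tACC0002\tgeneB\t\tGO:8888888\tREF:x\tIEA\t\tP\n"
    "UniProt\tACC0003\tgeneC\t\tGO:9999999\tREF:x\tIDA\t\tP\n"
    "UniProt\tACC0004\tgeneD\tNOT\tGO:9999999\tREF:x\tIDA\t\tP\n"
    "UniProt\tACC0001\tgeneA\t\tGO:9999999\tREF:y\tIDA\t\tP\n"
)

if __name__ == "__main__":
    for db, acc in proteins_for_go_term(io.StringIO(SAMPLE), "GO:9999999"):
        print(db, acc)
```

Note that this matches only the exact GO ID. Proteins annotated to child terms would need the ontology itself to expand the term first.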
From: Stefan Kirov
To: Pedro Antonio Reche
Sent: 11/5/2004 8:29 PM
Subject: Re: [Bioperl-l] getting proteins matching GO
You may want to check Bio::Ontology and especially Bio::OntologyIO.
These are pretty cool modules, but you will have to install bioperl-live
or wait for BioPerl 1.5 (which, as I understand, should be released soon).
You will have to download the GO DB locally and parse it with
Bio::OntologyIO. I am not sure whether anybody is working on remote
access (I don't know if it is possible at the moment). By the way, if you
are not familiar with MySQL but are OK with Perl, Bio::OntologyIO might
be the easiest route for you. It will also include anything you are able
to get from the GO website, but you will have to keep a local database
(or flat file). Hope this helps.
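
For the local-database route, the two-table lookup quoted further down (ll_go joined to ll_refseq_nm) can be tried out without a GeneKeyDB install. The following is only a sketch in Python with an in-memory SQLite database: the table and column names are taken from the quoted query, while the column types, the sample rows, the GO IDs, and the helper names are assumptions. The real GeneKeyDB schema may differ.

```python
import sqlite3

# Same join as the quoted query, written with an explicit JOIN.
QUERY = """
SELECT DISTINCT r.np_accn
FROM ll_go g
JOIN ll_refseq_nm r ON r.ll_id = g.ll_id
WHERE g.go_term = ?
ORDER BY r.np_accn
"""


def build_demo_db():
    """Create an in-memory database with made-up rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ll_go (ll_id INTEGER, go_term TEXT);
        CREATE TABLE ll_refseq_nm (ll_id INTEGER, np_accn TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO ll_go VALUES (?, ?)",
        [(1, "GO:9999999"), (2, "GO:9999999"), (3, "GO:8888888")],
    )
    conn.executemany(
        "INSERT INTO ll_refseq_nm VALUES (?, ?)",
        [(1, "NP_DEMO_1"), (1, "NP_DEMO_2"), (2, "NP_DEMO_3"), (3, "NP_DEMO_4")],
    )
    conn.commit()
    return conn


def accessions_for_term(conn, go_term):
    """Return protein accessions for genes annotated with go_term."""
    return [row[0] for row in conn.execute(QUERY, (go_term,))]


if __name__ == "__main__":
    conn = build_demo_db()
    print(accessions_for_term(conn, "GO:9999999"))
```

The accessions returned this way would then be used to fetch the sequences from RefSeq, as described in the quoted message.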
Pedro Antonio Reche wrote:
> Dear Stefan, thanks a lot for your e-mail. Actually, I am interested
> in getting all proteins from all organisms that are tagged with, let's
> say, the go_process cell signaling. I will try the sites that you
> indicated to see if they can do the job. Do you know if BioPerl can
> also do this?
> On Nov 5, 2004, at 12:27 PM, Stefan Kirov wrote:
>> What organism? You can either use EnsMart (for human, for example,
>> there is a table called hsapiens_gene_ensembl__xref_go__dm) or use
>> GeneKeyDB if you install it locally (genereg.ornl.gov/gkdb). There is
>> a table called ll_go, which you can search for the gene identifier
>> (LocusLink) associated with a particular GO term, and then get the
>> protein accession from another table (something like:
>> "select r.np_accn from ll_go g, ll_refseq_nm r where r.ll_id=g.ll_id
>> and g.go_term=?") and fetch the seq from RefSeq, etc. Both Ensembl
>> and GeneKeyDB are restricted to certain eukaryotes, so it all depends
>> on what kind of organisms you expect to work with.
>> Pedro Antonio Reche wrote:
>>> I am interested in getting all the protein sequences matching a
>>> specific GO term and I wonder if someone would know how to do this.
>>> Thanks in advance for any help.
>>> Bioperl-l mailing list
>>> Bioperl-l at portal.open-bio.org
>> Stefan Kirov, Ph.D.
>> University of Tennessee/Oak Ridge National Laboratory
>> 5700 bldg, PO BOX 2008 MS6164
>> Oak Ridge TN 37831-6164
>> tel +865 576 5120
>> fax +865-576-5332
>> e-mail: skirov at utk.edu
>> sao at ornl.gov
>> "And the wars go on with brainwashed pride
>> For the love of God and our human rights
>> And all these things are swept aside"