[Bioperl-l] getting proteins matching GO

Pedro Antonio Reche reche at research.dfci.harvard.edu
Mon Nov 8 08:24:16 EST 2004

Dear Nathan, thanks a lot for your help. As you mention, I wish to
collect all proteins annotated with a given term or with any term
subordinate to it. I am interested in retrieving the proteins for
several terms (all related to the immune system), which I have not yet
fully defined. Therefore, I guess it will be easier if you could send
me the file you indicated. I have just retrieved
gene_association.goa_uniprot.gz. Thanks for the tip.
I am looking forward to hearing from you.

> Hi Pedro
>> Pedro Antonio Reche wrote:
>> Dear Stefan, thanks a lot for your e-mail. Actually, I am interested
>> in getting all proteins from all organisms that are tagged with,
>> let's say, the go_process cell signaling...
> The tricky part of working with GO annotations is that they are
> arranged in a hierarchical ontology. When you talk about wanting
> proteins that are tagged with a particular term, e.g., cell-cell
> signaling (GO:0007267), you probably also want proteins tagged with
> terms subordinate to the given term. There happen to be 93 such terms.
> I don't know if any of the sites mentioned by Stephan have this
> information at hand, but I have produced a table which I'm happy to
> share. It has 168,071 rows. If there are just a few terms that you're
> interested in, like cell-cell signaling, I can do the query for you
> and send you just that part of the table if that would be easier for
> you.
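
For readers following the thread, the idea of "a term plus everything
subordinate to it" can be sketched in a few lines of Python. This is
only an illustration, not the query used to build the table mentioned
above. It assumes the ontology is available as a list of (child,
parent) pairs; the function name and every term ID other than
GO:0007267 are hypothetical placeholders.

```python
from collections import defaultdict, deque


def descendants(root, child_parent_edges):
    """Return the root term plus every term reachable below it."""
    # Invert the (child, parent) pairs into a parent -> children map.
    children = defaultdict(set)
    for child, parent in child_parent_edges:
        children[parent].add(child)

    # Breadth-first walk downward; 'seen' guards against revisiting
    # terms that have more than one parent.
    seen = {root}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children[term]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Toy edges for illustration only. GO:0007267 comes from the message;
# all other IDs are made-up placeholders, not real GO terms.
TOY_EDGES = [
    ("GO:EXAMPLE_A", "GO:0007267"),
    ("GO:EXAMPLE_B", "GO:EXAMPLE_A"),
    ("GO:EXAMPLE_C", "GO:EXAMPLE_B"),
    ("GO:EXAMPLE_B", "GO:0007267"),   # a term may have two parents
    ("GO:UNRELATED", "GO:OTHER_ROOT"),
]

cell_cell_terms = descendants("GO:0007267", TOY_EDGES)
```

Because a GO term can sit under more than one parent, the walk tracks
visited terms in a set so that each term is collected only once.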
> The next step is to connect proteins to GO terms. I think the file
> you want is gene_association.goa_uniprot.gz at
> ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/. Perhaps other
> readers can comment on whether there are better sources for the
> protein-GO connections you need. It's a flatfile that's easy to parse.
> A good way to proceed is to load the data into a relational database
> and then join with the GO defs from the paragraph above. You can also
> do the processing in Perl.
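
As a rough sketch of the flatfile route, the Python below scans gene
association lines and keeps the accessions whose GO ID is in a wanted
set of terms. It assumes the usual tab-delimited layout, with the
protein accession in column 2, the qualifier in column 4, and the GO
ID in column 5, and it assumes lines starting with "!" are comments.
Skipping annotations qualified with NOT is a choice made here, not
something stated in the message. The function names, the sample
accessions, and GO:EXAMPLE_A are hypothetical placeholders.

```python
import gzip


def proteins_for_terms(lines, wanted_terms):
    """Collect accessions annotated with any GO ID in wanted_terms."""
    hits = set()
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # too short to hold a GO ID column
        accession, qualifier, go_id = cols[1], cols[3], cols[4]
        # Assumption: a NOT qualifier negates the annotation, so skip it.
        if "NOT" in qualifier.split("|"):
            continue
        if go_id in wanted_terms:
            hits.add(accession)
    return hits


def proteins_from_gzip(path, wanted_terms):
    """Same filter, reading a gzip-compressed association file as text."""
    with gzip.open(path, "rt") as handle:
        return proteins_for_terms(handle, wanted_terms)


# Made-up sample lines; accessions and GO:EXAMPLE_A are placeholders.
SAMPLE_LINES = [
    "!hypothetical header line\n",
    "UniProt\tP00001\tSYM1\t\tGO:0007267\tPMID:0\tIEA\t\tP\n",
    "UniProt\tP00002\tSYM2\tNOT\tGO:0007267\tPMID:0\tIEA\t\tP\n",
    "UniProt\tP00003\tSYM3\t\tGO:EXAMPLE_A\tPMID:0\tIEA\t\tP\n",
    "UniProt\tP00004\tSYM4\t\tGO:OTHER\tPMID:0\tIEA\t\tP\n",
]

wanted = {"GO:0007267", "GO:EXAMPLE_A"}
matching = proteins_for_terms(SAMPLE_LINES, wanted)
```

For the real file, something like
proteins_from_gzip("gene_association.goa_uniprot.gz", wanted) would
stream the download line by line rather than loading it into memory.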
> Good luck,
> Nat
> ----------------------------------------------------------------------
> Nathan (Nat) Goodman
> Senior Research Scientist
> Institute for Systems Biology
> 1441 North 34th Street
> Seattle, WA 98103-8904
> 206-331-0077
> 206-363-0431 (fax)
> natg at shore.net
> http://home.comcast.net/~natgoodman/