[Bioperl-l] Locuslink parser

Law, Annie Annie.Law at nrc-cnrc.gc.ca
Fri Feb 13 11:53:04 EST 2004

Hi Hilmar,

Thanks for your response.  

From what you're saying, I think that my existing code would be able to access
the GO identifier.  If I look up the tagname molecular function, I get a value
such as: Molecular function|ATP binding|GO:0005524.
The method that I can think of is to take this value and write some code to
parse the GO identifier out.  Is there a more direct method?
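
The parsing approach described above could be sketched roughly as follows.
This is only a minimal illustration in plain Perl: extract_go_id is a
hypothetical helper name, and it assumes the value always has the
pipe-delimited form shown above, with the identifier written as "GO:"
followed by seven digits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pull the GO identifier out of
# a pipe-delimited annotation string such as
# "Molecular function|ATP binding|GO:0005524".
sub extract_go_id {
    my ($text) = @_;
    return unless defined $text;
    # Assumption: a GO identifier is "GO:" followed by exactly seven digits.
    my ($go_id) = $text =~ /\b(GO:\d{7})\b/;
    return $go_id;    # undef when no identifier is present
}

my $value = "Molecular function|ATP binding|GO:0005524";
my $go_id = extract_go_id($value);
print defined $go_id ? "GO id: $go_id\n" : "no GO id found\n";
```

Matching on the identifier pattern rather than on the field position means
the sketch would still work if the number of fields before the identifier
varied.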

I used the test for term annotation or dbxref, and when it was a dbxref
I was able to get the primary id and the database name.  Thanks!  I am
learning more about the objects I am using.  Do you know if there is some
documentation with figures showing the relationships between objects and
the Bio::Seq class, e.g. the relationship of Bio::Seq and
Bio::Annotation::Collection, among others?

However, I am still unable to get all of the fields, for example SUMFUNC (a
brief summary of the function of the products of this locus), ORGANISM, OMIM,
etc.  I am not sure how to access these.
It also seems that if I use
	foreach my $ann (@annotations) {
		if ($ann->isa("Bio::Ontology::TermI")) {
			# this is an ontology term as annotation
		}
		if ($ann->isa("Bio::Annotation::DBLink")) {
			# this is a dbxref annotation
		}
	}
I am filtering out some of the annotation types such as OFFICIAL_GENE_NAME,
CHR, OFFICIAL_SYMBOL, etc.  I only get GO information and DBLINK.
If I use the following, I get the maximum number of annotation and
dbxref fields I have been able to extract so far.  Is there another category
I am missing?  Better yet, how do I find out what the other missing
categories are, i.e. other than Bio::Ontology::TermI or Bio::Annotation::DBLink?

while (my $seq_obj = $io->next_seq()) {
  my $anno_collection = $seq_obj->annotation;

  foreach my $key ($anno_collection->get_all_annotation_keys) {
    my @annotations = $anno_collection->get_Annotations($key);
    foreach my $value (@annotations) {
      print "tagname: ", $value->tagname, "\n";
      # $value is an annotation object and has an "as_text" method
      print " annotation value: ", $value->as_text, "\n";
    }
  }
}


In the example you provided below, I can see that all of the annotations of
type Bio::Ontology::TermI are being grouped and put in @term_annotations,
but what is the $_-> for?  And why do you need the line
$seq->get_Annotations(); below it?
@term_annotations = map { $_->isa("Bio::Ontology::TermI"); }
$seq->get_Annotations();

Thanks very much,

-----Original Message-----
From: Hilmar Lapp [mailto:hlapp at gnf.org] 
Sent: Thursday, February 12, 2004 2:10 PM
To: Law, Annie
Cc: 'bioperl-l at bioperl.org'
Subject: Re: [Bioperl-l] Locuslink parser

On Thursday, February 12, 2004, at 09:46  AM, Law, Annie wrote:

> I am most interested in obtaining the fields locuslink id, GO id,
> accession number, unigene id.

The locuslink ID is the $seq->accession_number. GO should be there as 
term annotations, unigene ID and other accessions should be present as 
dbxref annotations.

You can test for an annotation being a term annotation or a dbxref:

	foreach my $ann (@annotations) {
		if ($ann->isa("Bio::Ontology::TermI")) {
			# this is an ontology term as annotation
		}
		if ($ann->isa("Bio::Annotation::DBLink")) {
			# this is a dbxref annotation
		}
	}

Using the map function you can easily filter for annotation types, for
example:

	@term_annotations = map { $_->isa("Bio::Ontology::TermI"); }
	                    $seq->get_Annotations();

BTW if you want to get all annotations from a seq object, you can just 
say $seq->get_Annotations() and omit the key.
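
As a hedged illustration of the filtering idea above: in Perl, grep is the
built-in that keeps the list elements for which the block is true, whereas
map returns the block's result for each element, so a map over isa()
would yield true/false flags rather than the annotation objects. Inside
either block, $_ is the current list element. The sketch below uses
stand-in classes (Demo::TermI, Demo::TermAnn, Demo::DBLink) that are
hypothetical and only mimic the inheritance being tested, so that it runs
without BioPerl installed; with real data the list would come from
$seq->get_Annotations() and the class would be "Bio::Ontology::TermI".

```perl
use strict;
use warnings;

# Hypothetical stand-in classes, not BioPerl: they only provide a
# constructor and an inheritance relationship for isa() to test.
package Demo::TermI;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package Demo::TermAnn;
our @ISA = ('Demo::TermI');    # a term annotation "is a" TermI

package Demo::DBLink;
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;

# Placeholder annotations standing in for $seq->get_Annotations().
my @annotations = (
    Demo::TermAnn->new(label => 'term-1'),
    Demo::DBLink->new(label  => 'dbxref-1'),
    Demo::TermAnn->new(label => 'term-2'),
);

# grep keeps the objects for which the block is true; $_ is each object.
my @term_annotations = grep { $_->isa('Demo::TermI') } @annotations;

# map returns the block's value per element: here, one flag per object.
my @flags = map { $_->isa('Demo::TermI') ? 1 : 0 } @annotations;

print "kept: ", join(", ", map { $_->{label} } @term_annotations), "\n";
print "flags: @flags\n";
```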


Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
