[Bioperl-l] Parsing entrezgene with Bio::SeqIO

Stefan Kirov skirov at utk.edu
Fri Mar 17 17:32:26 EST 2006


OK, this is fixed now. Update to bioperl-live and call optional_id on the 
KEGG dblink annotation to get the text. Let me know how it goes.
Stefan
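
A minimal sketch of what reading the KEGG text through optional_id might 
look like. It assumes the KEGG entries come back under the 'dblink' 
annotation key as Bio::Annotation::DBLink objects (as the output quoted 
below suggests) and that bioperl-live is installed. The helper name 
kegg_texts and the command-line file argument are illustrative only, not 
part of BioPerl.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: collect the free-text part of every KEGG dblink
# in an annotation collection. Assumes the text is stored in optional_id.
sub kegg_texts {
    my ($ann) = @_;
    my @texts;
    foreach my $link ( $ann->get_Annotations('dblink') ) {
        # Skip anything that is not a DBLink or does not point at KEGG.
        next unless $link->isa('Bio::Annotation::DBLink');
        next unless defined $link->database && $link->database eq 'KEGG';
        my $text = $link->optional_id;
        push @texts, $text if defined $text && length $text;
    }
    return @texts;
}

# Driver: runs only when an uncompressed EntrezGene ASN.1 text file is
# given on the command line, e.g. "perl kegg.pl Homo_sapiens".
if ( my $file = shift @ARGV ) {
    my $seqio = Bio::SeqIO->new( -format => 'entrezgene', -file => $file );
    while ( my $gene = $seqio->next_seq ) {
        foreach my $text ( kegg_texts( $gene->annotation ) ) {
            print join( "\t", $gene->id, 'KEGG', $text ), "\n";
        }
    }
}
```

Filtering on database rather than printing as_text sidesteps the empty 
"to  in" gap, since as_text is built from the primary id and the 
database name.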

Liisa Koski wrote:

>Hi,
>I'm using Bio::SeqIO to parse the EntrezGene file Homo_sapiens (from 
>ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ASN_OLD/Mammalia/Homo_sapiens.gz).
>
>I'm using bioperl-1.5.1.
>
>I want to extract the KEGG annotations.
>See code below.
>
>use Bio::SeqIO;
>use Bio::ASN1::EntrezGene;
>
>my $seqio = Bio::SeqIO->new(-format => 'entrezgene',
>                            -file   => 'Homo_sapiens');
>while (my $gene = $seqio->next_seq){
>    print "\n",$gene->id, "\t", $gene->accession_number, "\n";
>    my $ann = $gene->annotation();
>    foreach my $key ( $ann->get_all_annotation_keys() ) {
>        my @values = $ann->get_Annotations($key);
>        foreach my $value ( @values ) {
>            print $key, "\t", "=", "\t", $value->as_text,"\n";
>        }
>    }
>}
>
>Unfortunately the only KEGG annotation I see in the results looks like:
>dblink  =       Direct database link to  in database KEGG 
>(Notice the space between 'to  in')
>
>Anyone have any ideas how to get the KEGG annotation results?
>
>Note: I also tried parsing the file 
>ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/ASN_BINARY/Mammalia/Homo_sapiens.ags.gz
>but I got the error below:
>
>./entrez_gene_seqio.pl Homo_sapiens.ags
>Data Error: none conforming data found on line 1 in Homo_sapiens.ags!
>first 20 (or till end of input) characters including the non-conforming data:
>00
> at /netshare/home/koski/perl_modules/bioperl-live/Bio/SeqIO/entrezgene.pm 
>line 138
>
>
>Thanks,
>Liisa
