[Bioperl-l] Parsing entrezgene with Bio::SeqIO
mingyi.liu at gpc-biotech.com
Thu Mar 16 10:59:32 EST 2006
Liisa Koski wrote:
> Unfortunately the only KEGG annotation I see in the results looks like:
> dblink = Direct database link to in database KEGG
> (Notice the space between 'to in')
> Anyone have any ideas how to get the KEGG annotation results?
Stefan's the person maintaining the SeqIO::entrezgene module, so he'd be
able to answer this part of your question.
> Note: I also tried parsing the file
> but I got the below error:
> ./entrez_gene_seqio.pl Homo_sapiens.ags
> Data Error: none conforming data found on line 1 in Homo_sapiens.ags!
> first 20 (or till end of input) characters including the non-conforming data:
> at /netshare/home/koski/perl_modules/bioperl-live/Bio/SeqIO/entrezgene.pm
> line 138
The error was thrown by my Bio::ASN1::EntrezGene module because it
expects a text file, while you fed it a binary file. To use the
gzipped binary ASN.1 file from NCBI, download NCBI's gene2xml tool,
then use this syntax to run my parser on the binary file:
# Homo_sapiens.ags.gz is the gzipped binary file downloaded directly from NCBI
my $parser = Bio::ASN1::EntrezGene->new(
    'file' => "gene2xml -i Homo_sapiens.ags.gz -c -x -b |"
);
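For context, here is a minimal sketch of a complete script built around that constructor call. It assumes that gene2xml is installed and on your PATH and that Homo_sapiens.ags.gz sits in the current directory. The record counter is only there for illustration, and the next_seq loop follows the usage shown in the Bio::ASN1::EntrezGene documentation.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::ASN1::EntrezGene;

# Assumption: gene2xml is on PATH and the gzipped binary file is in the
# current directory. The trailing "|" makes the parser read the text
# ASN.1 that gene2xml writes to standard output.
my $source = 'gene2xml -i Homo_sapiens.ags.gz -c -x -b |';

my $parser = Bio::ASN1::EntrezGene->new('file' => $source);

# Walk through the records one at a time; here we only count them.
my $count = 0;
while (my $result = $parser->next_seq) {
    $count++;
}

print "Parsed $count Entrez Gene records\n";
```

If gene2xml is missing, or is handed a file it cannot read, the parser receives no usable input, so it is worth running the gene2xml command on its own first to confirm that it prints text ASN.1.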
The same syntax should be used when you're using SeqIO (and thus SeqIO::entrezgene).