[Bioperl-l] parse EMBL Feature Table only

Frank Schwach fs5 at sanger.ac.uk
Mon Dec 14 07:18:17 EST 2009


Maybe I'm really missing something here but I can't find how to parse a
file that is basically just the Feature Table from an EMBL file, looking
like this:

FT                   /colour=7
FT                   /product="RNA-binding protein, putative"
FT   CDS             213199..214812
FT                   /colour=7
FT                   /product="eukaryotic translation initiation factor
FT                   subunit 7, putative"
...[more of the same]

So the file has no header and no actual sequence and it is used simply
to annotate a chromosome in a genome assembly. I've always used GFF for
that purpose but have been given this file now.
Bio::SeqIO->new(-format=>"EMBL") complains about the missing header and if
I stick in a fake ID line, it warns about the missing sequence and the
fact that the features don't fit on the sequence (of length 0).
Of course it's not difficult to write my own parser but I'm sure there
must be a BioPerl way of doing that that I have just overlooked. Thanks
for your help.
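
For illustration, here is a minimal sketch of what such a hand-written
fallback parser could look like in plain Perl. Nothing in it is a BioPerl
API: the name parse_ft_lines and the hash layout are hypothetical, and it
assumes that feature keys start in column 6, that qualifiers start with "/",
and that continuation lines of a quoted value never begin with "/".
Qualifier lines that appear before the first feature key (as in the excerpt
above) are skipped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): turn bare "FT" lines into a
# list of hashes with the keys 'key', 'location' and 'qualifiers'.
sub parse_ft_lines {
    my @lines = @_;
    my (@features, $current, $last_value);

    for my $line (@lines) {
        if ($line =~ /^FT {3}(\S+)\s+(\S.*?)\s*$/) {
            # Feature key in the key column, location after it
            $current = { key => $1, location => $2, qualifiers => {} };
            push @features, $current;
            undef $last_value;
        }
        elsif ($line =~ m{^FT\s+/([^=\s]+)(?:=(.*?))?\s*$}) {
            next unless $current;    # qualifier with no feature to attach to
            my ($name, $value) = ($1, defined $2 ? $2 : 1);
            push @{ $current->{qualifiers}{$name} }, $value;
            # Remember the stored value so continuation lines can extend it
            $last_value = \$current->{qualifiers}{$name}[-1];
        }
        elsif ($line =~ /^FT\s+(\S.*?)\s*$/) {
            # Continuation of the previous qualifier value or of the location
            if    ($last_value) { $$last_value .= " $1" }
            elsif ($current)    { $current->{location} .= $1 }
        }
    }

    # Strip the surrounding double quotes from qualifier values
    for my $feature (@features) {
        for my $values (values %{ $feature->{qualifiers} }) {
            for my $v (@$values) { $v =~ s/^"|"$//g }
        }
    }
    return @features;
}

my @sample = (
    'FT   CDS             213199..214812',
    'FT                   /colour=7',
    'FT                   /product="eukaryotic translation initiation factor',
    'FT                   subunit 7, putative"',
);

for my $feature (parse_ft_lines(@sample)) {
    print "$feature->{key} $feature->{location}\n";
    for my $name (sort keys %{ $feature->{qualifiers} }) {
        print "  $name = $_\n" for @{ $feature->{qualifiers}{$name} };
    }
}
```

This does not validate locations or build Bio::SeqFeature objects; it only
collects the key, the location string and the qualifier values.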

