[Bioperl-l] Parsing EMBL records

Smithies, Russell Russell.Smithies at agresearch.co.nz
Mon Oct 15 23:51:40 EDT 2007

I've got some output from "newcpgreport" that looks like EMBL format, but
I'm having trouble parsing it.
This warning might be one of the reasons, but I'm not sure:
	-------------------- WARNING ---------------------
	MSG: Got a sequence with no letters in it cannot guess alphabet

I need to get it into GFF or tab-delimited format and don't want to have to
resort to Perl regexes to do it.
Has anyone got some example code for parsing EMBL or Swiss-Prot output that I
could borrow?

The "newcpgreport" output looks like this:
ID   BTA28  46084206 BP.
DE   CpG Island report.
CC   Obs/Exp ratio > 0.60.
CC   % C + % G > 50.00.
CC   Length > 200.
FH   Key              Location/Qualifiers
FT   CpG island       181953..182223
FT                    /size=271
FT                    /Sum C+G=162
FT                    /Percent CG=59.78
FT                    /ObsExp=0.75
FT   CpG island       222609..223040
FT                    /size=432
FT                    /Sum C+G=290
FT                    /Percent CG=67.13
FT                    /ObsExp=0.97
FT   CpG island       288537..288741
FT                    /size=205
FT                    /Sum C+G=112
FT                    /Percent CG=54.63
FT                    /ObsExp=0.85 
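
For illustration, the kind of conversion I'm after could be sketched as below. This is only a minimal sketch in Python using plain string splitting (no regexes and no BioPerl). The function name cpg_report_to_gff, the "newcpgreport" source label, and the choice to put the qualifiers in the ninth GFF column are assumptions made for the example, not anything newcpgreport or BioPerl defines. Score, strand and phase are left as ".".

```python
def cpg_report_to_gff(text, source="newcpgreport"):
    """Turn newcpgreport feature-table text into GFF-like lines.

    Hypothetical helper: assumes each feature line ends in "start..end"
    and that qualifier lines start with "/" and contain "key=value".
    """
    seq_id = None
    features = []
    for line in text.splitlines():
        tag, rest = line[:2], line[2:].strip()
        if tag == "ID" and rest:
            # First token of the ID line is taken as the sequence name.
            seq_id = rest.split()[0]
        elif tag == "FT" and rest:
            if rest.startswith("/"):
                # Qualifier line, e.g. "/Percent CG=59.78".
                if features:
                    key, _, value = rest[1:].partition("=")
                    features[-1]["attrs"].append((key.strip(), value.strip()))
            else:
                # Feature line: key (may contain spaces), then the location.
                key, location = rest.rsplit(None, 1)
                start, end = location.split("..")
                features.append({
                    "seq": seq_id,
                    "key": key,
                    "start": int(start),
                    "end": int(end),
                    "attrs": [],
                })

    lines = []
    for f in features:
        # Spaces are replaced with underscores so columns stay unambiguous.
        attrs = ";".join(
            "{}={}".format(k.replace(" ", "_"), v) for k, v in f["attrs"]
        )
        lines.append("\t".join([
            f["seq"] or ".", source, f["key"].replace(" ", "_"),
            str(f["start"]), str(f["end"]), ".", ".", ".", attrs,
        ]))
    return lines
```

Something like print("\n".join(cpg_report_to_gff(open("report.txt").read()))) would then write one tab-separated line per CpG island, where report.txt is a placeholder file name.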

Any help greatly appreciated  :-)

Russell Smithies

