[Bioperl-l] Fw: bad entries in interpro again (fwd)

Jared Fox jaredfox at ucla.edu
Mon Dec 6 22:59:41 EST 2004

The problem with Interpro XML is that there are entries like:

 <match id="SSF46785" name=""Winged helix" DNA-binding domain"

 <match id="SSF55486" name="Metalloproteases ("zincins"), catalytic domain"

The double quotes are supposed to mark the beginning and end of the name
attribute, but the XML is not valid because unescaped double quotes appear
inside the
attribute itself. I believe this also happens with other illegal xml 
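
To make the failure concrete, here is a minimal sketch in Python (not BioPerl, just because its standard library also sits on top of expat). The closing "/>" is my assumption, since the snippets above are cut off before the end of the tag, and try_parse is a made-up helper name. It shows that the entry as shipped is rejected, while the same entry with the inner quotes written as &quot; parses fine:

```python
import xml.etree.ElementTree as ET

# As found in the InterPro file: literal double quotes inside a
# double-quoted attribute value. The trailing "/>" is assumed.
bad = '<match id="SSF46785" name=""Winged helix" DNA-binding domain"/>'

# The same entry with the inner quotes escaped as &quot;
good = '<match id="SSF46785" name="&quot;Winged helix&quot; DNA-binding domain"/>'


def try_parse(xml_text):
    """Return the name attribute, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text).get("name")
    except ET.ParseError:
        return None


bad_name = try_parse(bad)    # parser rejects the entry
good_name = try_parse(good)  # parser returns the unescaped name
```

Any conforming XML parser should behave the same way on the first string, so this does not look like something a parser-side patch can cleanly fix.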

If Interpro were to start producing valid XML, everything should work 
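
On the producing side, the fix could be as small as escaping the attribute value before it is written out. A rough sketch of the idea, again in Python; this is not InterPro's export code (which I have not seen), match_element is a hypothetical helper, and the "/>" ending is assumed:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def match_element(match_id, name):
    """Build a <match .../> element string with an escaped name (hypothetical helper)."""
    # escape() handles &, < and > by default; the extra mapping also
    # covers double quotes, which is what a double-quoted attribute needs.
    # match_id is assumed to be a plain accession with no special characters.
    safe_name = escape(name, {'"': "&quot;"})
    return '<match id="%s" name="%s"/>' % (match_id, safe_name)


element = match_element("SSF55486", 'Metalloproteases ("zincins"), catalytic domain')

# Parsing the generated element gives back the original name unchanged.
round_trip = ET.fromstring(element).get("name")
```

With output like this, the name survives a round trip through the parser with its quotes intact.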

> ---------- Forwarded message ----------
> Date: Wed, 01 Dec 2004 16:16:46 +0000
> From: Mikko Arvas <Mikko.Arvas at vtt.fi>
> To: bioperl-l at portal.open-bio.org, Hilmar Lapp <hlapp at gmx.net>,
>     Allen Day <allenday at ucla.edu>
> Subject: bad entries in interpro again
> Hi,
> we've been discussing the problems of interpro parsing. I have a friend who
> is going to the interpro consortium meeting next week and I could send some
> regards through him. After reading your e-mails, I am (being quite a
> newbie) a little bit confused about what kind of regards you would like to
> send, if any.
> Is the &apos the source of the problem? Is it really a problem in BioPerl
> or in expat? Is somebody trying to solve the problem for Bioperl now
> and is there any sensible thing that the interpro team could do to help?
> Cheers,
> mikko
> Mikko Arvas
> VTT Biotechnology
> e-mail:            mikko.arvas at vtt.fi
> tel:                 +358-(0)9-456 5827
> mobile:           +358-(0)44-381 0502
> fax:                +358-(0)9-455 2103
> mail:               Tietotie 2, Espoo
>                       P.O. Box 1500
>                       FIN-02044 VTT, Finland
