heikki at ebi.ac.uk
Wed Jan 28 09:54:56 EST 2004
The more I use XML, the more pragmatic I get. When things get tight, I've used
Perl directly to modify XML files and gained a 10x speed increase. Validation
is not practical in most cases, so I would not worry too much about it.
BTW, I am now convinced that structure definitions for any database-related
format should be constructed from several definitions; in this case, one for
the top level and another for the entry level.
Thanks for pointing out that newline character in my code.
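A minimal sketch of the entry-splitting idea without the newline assumption:
set the record separator to the closing tag alone and trim whatever precedes
the opening tag. The helper name for_each_entry, the <root> wrapper and the
sample data are made up for illustration. It still assumes that seqDiff
elements are not nested and that the string "</seqDiff>" never appears inside
a comment or CDATA section.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $callback once per <seqDiff>...</seqDiff> entry,
# reading one record at a time so memory use stays at one entry.
sub for_each_entry {
    my ($fh, $callback) = @_;
    # Split on the closing tag only; no trailing newline is assumed.
    local $/ = '</seqDiff>';
    while (my $chunk = <$fh>) {
        # Drop anything before the opening tag (root start tag, whitespace).
        # The last record (closing root tag) has no entry and is skipped.
        next unless $chunk =~ m{(<seqDiff[\s>].*</seqDiff>)\z}s;
        $callback->($1);
    }
    return;
}

# Made-up sample with no newline between entries.
my $xml = '<root><seqDiff id="a">x</seqDiff><seqDiff id="b">y</seqDiff></root>';
open my $fh, '<', \$xml or die "Cannot open in-memory file: $!";
my $count = 0;
for_each_entry($fh, sub {
    my ($entry) = @_;
    $count++;
    print "$entry\n";    # here the entry XML would go to the parser
});
close $fh;
print "$count entries\n";
```

Each entry string can then be handed to whatever XML parser is in use, so only
one entry is held in memory at a time.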
On Wednesday 28 Jan 2004 11:24, Dave Howorth wrote:
> Heikki Lehvaslaiho wrote:
> > The best way to do this is to ignore the root level of the XML, use Perl
> > to parse entries out of it, and pass only the entry XML to the parser. This
> > keeps the memory usage down and you can parse as large a file as you want.
> Hmmm, it may be pragmatic but I'm sure there are other meanings of
> 'best'. By throwing away the root level you're losing any chance to
> validate the document or use other XML tools and you run the risk of
> making assumptions ...
> > local $/ = "</seqDiff>\n";
> ... There's no reason to expect there will ALWAYS be a newline following
> the tag in a valid XML file (suppose it's created by some XSLT tool that
> cares nothing about readability). You're making an assumption above and
> beyond the specification about how the XML is represented.
> Cheers, Dave
Heikki Lehvaslaiho    heikki_at_ebi ac uk
http://www.ebi.ac.uk/mutations/
EMBL Outstation, European Bioinformatics Institute
Wellcome Trust Genome Campus, Hinxton
Cambs. CB10 1SD, United Kingdom
Phone: +44 (0)1223 494 644    FAX: +44 (0)1223 494 468