[Bioperl-l] xml sequence download from ncbi

Lapointe, David David.Lapointe@umassmed.edu
Thu, 24 Aug 2000 10:22:05 -0400


Great stuff!!  I had a problem with two things. First, IE5 wanted to
download to viewer.cgi, so I wonder whether the MIME type is not set (is
there an XML MIME type? hmmm?). I downloaded it anyway,
and the file (entrez.xml) had some errors.

Here are the first few lines as returned:
<--?xml version="1.0"?>
<!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd">

The first line should be
<?xml version="1.0"?>

In the second line there are too many '-' in Seq---entry, which should be
<!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd">

to match the root element
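
Until the output is fixed on the server side, the two corrections above can
be applied locally before parsing. What follows is only a minimal sketch in
Python: repair_prolog is a hypothetical helper name, and the sample document
body is a made-up stand-in for the real entrez.xml, reduced to an empty
Seq-entry root element.

```python
import re
import xml.etree.ElementTree as ET


def repair_prolog(text: str) -> str:
    """Apply the two prolog fixes described above (hypothetical helper)."""
    # '<--?xml' at the very start of the file becomes '<?xml'.
    text = re.sub(r'^\s*<--\?xml', '<?xml', text, count=1)
    # 'Seq---entry' (any run of hyphens) in the DOCTYPE becomes 'Seq-entry'.
    text = re.sub(r'<!DOCTYPE\s+Seq-+entry', '<!DOCTYPE Seq-entry', text, count=1)
    return text


# Made-up stand-in for the downloaded file; only the first two lines
# are taken from the message above.
broken = (
    '<--?xml version="1.0"?>\n'
    '<!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd">\n'
    '<Seq-entry></Seq-entry>\n'
)

fixed = repair_prolog(broken)

# The repaired text is well-formed; the parser does not fetch the DTD.
root = ET.fromstring(fixed)
print(root.tag)
```

Parsing the unrepaired text with ET.fromstring raises a ParseError, since
'<--' cannot start any XML construct.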

Also, I had a problem resolving "NCBI_Seqset.dtd". Shouldn't there be a
DTD URL, something like

<!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN"  

/../ being some appropriate path.
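
To illustrate why the bare file name is hard to resolve, here is a small
Python sketch that reads the DOCTYPE with the standard expat bindings and
checks whether the system identifier carries a URL scheme. read_doctype is
a hypothetical helper name and the one-element document is made up; only the
DOCTYPE line comes from the file discussed above.

```python
import xml.parsers.expat
from urllib.parse import urlparse


def read_doctype(xml_text: str) -> dict:
    """Return the DOCTYPE name, public id and system id (hypothetical helper)."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found.update(name=name, system_id=system_id, public_id=public_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return found


# Made-up minimal document carrying the corrected DOCTYPE line.
doc = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd">\n'
    '<Seq-entry/>\n'
)

info = read_doctype(doc)

# An empty scheme means the system identifier is a relative reference,
# so a parser can only look for it next to the document itself.
is_relative = urlparse(info["system_id"]).scheme == ""
print(info["system_id"], "relative:", is_relative)
```

With a relative system identifier, a validating parser looks for the DTD
relative to wherever the document was loaded from, which fails for a file
saved to a local directory that has no copy of the DTD.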

> -----Original Message-----
> From: Geer, Lewis (NLM) [mailto:lewisg@mail.nih.gov]
> Sent: Thursday, August 24, 2000 9:08 AM
> To: Bioperl
> Subject: [Bioperl-l] xml sequence download from ncbi
> Hi, 
> Sequence download using an xml format derived from our asn.1 
> standard format
> is now available from Entrez.  For an example, try
> where val is the sequence gi number.  Note that this xml output is based
> on our asn.1 records which are both complete and complex -- we may end up
> making a genbank flatfile-like version, especially since there are small
> mismatches between the asn.1 and xml languages that make the xml a bit more
> complex than if xml was our native format.
>
> We'd be interested in seeing comments!

Bioperl-l mailing list