[Bioperl-l] Re: Bio::DB::WebSeqDB and Bio::DB::GenBank

Francis Ouellette <francis@cmmt.ubc.ca>
Mon, 11 Dec 2000 23:16:48 -0800 (PST)

On Mon, 11 Dec 2000, Hilmar Lapp wrote:

> ASN.1 has been NCBI's format of choice for years, long before they even
> thought about adopting XML. Their ASN.1 schema is probably also much
> stabler than the XML equivalent. If we only had a parser for it. There
> is even a module out on CPAN:
> http://search.cpan.org/doc/GBARR/Convert-ASN1-0.07/lib/Convert/ASN1.pod
> I don't know how useful this could be. Any people on the list with
> experience or feelings in this regard?

It should be noted that ASN.1 is a much richer format than GB/EMBL,
and it can hold many types of annotations not present in the GB/EMBL
format, for example 1) alignments or 2) base-call quality values
(from Ace/phred output).

Here we store all of our data in house in binary ASN.1 (it saves
space) and can then write it out in whatever format we need
(typically GB or FASTA, but we can also invent our own formats, as we
are doing for SNPs, which also come into our system as ASN.1).
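
To make the "rich record in, flat format out" idea concrete, here is a
minimal sketch in Python. It is only an illustration: the RichRecord
class and the to_fasta function are made-up names, not part of the
NCBI toolkit or its data model, and the accession and quality values
are invented. The point is that the quality values live in the stored
record but do not survive the trip to FASTA.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RichRecord:
    """Hypothetical in-house record; NOT the NCBI ASN.1 data model."""
    accession: str
    description: str
    sequence: str
    # Per-base quality values (e.g. phred scores). Flat formats such as
    # FASTA have no place for these, so they are dropped on export.
    qualities: List[int] = field(default_factory=list)


def to_fasta(record: RichRecord, width: int = 60) -> str:
    """Write the record as FASTA: only ID, description and bases survive."""
    header = f">{record.accession} {record.description}"
    # Wrap the sequence into fixed-width lines.
    lines = [
        record.sequence[i:i + width]
        for i in range(0, len(record.sequence), width)
    ]
    return "\n".join([header] + lines) + "\n"


# Invented example data, for illustration only.
record = RichRecord(
    accession="X00001",
    description="example sequence",
    sequence="ACGTACGTAC",
    qualities=[40, 38, 35, 40, 40, 12, 30, 40, 39, 40],
)
fasta = to_fasta(record, width=4)
print(fasta)
```

Going the other way (FASTA back to the rich record) cannot recover the
quality values, which is why keeping the richer format as the stored
copy makes sense.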

There is obviously a cost to doing this (the major one is that you
need to work with the NCBI toolkit), but you gain by inheriting 12
years of code developed by pretty good programmers (like using
bioperl, I guess :-)

There are converters out there (asn<->xml), and one need not delve
into the ASN.1 world if you don't want to, but understanding it and
working with it will give you access to a richer data format and a
richer data model ...

my Can$0.02

| B.F. Francis Ouellette                      Tel: (604) 875-3815 | 
| Director, Bioinformatics Core Facility      Fax: (425) 740-6978 | 
| CMMT, UBC, Canada                        http://www.cmmt.ubc.ca | 
| francis@cmmt.ubc.ca                http://www.bioinformatics.ca |