[Bioperl-l] Re: Bio::DB::WebSeqDB and Bio::DB::GenBank

Jason Stajich jason@chg.mc.duke.edu
Tue, 12 Dec 2000 11:50:36 -0500 (EST)

On Mon, 11 Dec 2000, Francis Ouellette wrote:

> It should be noted that ASN.1 is a much richer format than GB/EMBL,
> and it can hold many types of annotations not present in the GB/EMBL
> format ... for example, things like 1) alignments or 2) quality of
> base call (from Ace/phred output).
> Here we store all of our data in binary asn.1 in house
> (saves space) and can then write out anything to whatever format ...
> (typically GB or FASTA, but we can invent our formats as well, like we
> are working on for SNPs, which also come into our system in ASN.1)
> There is obviously a cost to doing this (having to work with the
> NCBI toolkit is the major one), but you gain from inheriting 12 years
> of code developed by pretty good programmers (like using bioperl I
> guess :-)
> There are converters out there (asn<->xml) and one need not delve
> into the asn.1 world if you don't want to, but understanding it, and
> working with it will give you access to a richer data format and
> richer data model ...

Francis - thanks for the insight.  

I think we should try in earnest to add functionality for reading/writing
NCBI XML and/or ASN.1 in bioperl.  There are some obvious advantages and
we will be able to provide a useful platform for people with ASN.1
databases as well as cleaner data retrieval from GenBank, etc.  But I
think it will have to be post 0.7 since it represents a fair amount of
work.  I volunteer for investigating feasibility once we have 0.7 out the

> f. 
> --
> | B.F. Francis Ouellette                      Tel: (604) 875-3815 | 
> | Director, Bioinformatics Core Facility      Fax: (425) 740-6978 | 
> | CMMT, UBC, Canada                        http://www.cmmt.ubc.ca | 
> | francis@cmmt.ubc.ca                http://www.bioinformatics.ca |

Jason Stajich
Center for Human Genetics
Duke University Medical Center