[Bioperl-l] ASN.1 and BioPerl ?

Chris Mungall cjm at fruitfly.org
Fri Feb 11 16:24:32 EST 2005

Hi Pierre

If this is a generic ASN.1 parser then my opinion is that the appropriate
place for this is as an independent module on CPAN. Then people can wrap
this with bio-specific parsers in the appropriate Bio::*IO namespace. I
would choose your namespace in such a way that you do not preclude other
ASN.1 parsers - propose this on the Perl module-authors mailing list first.

Have you seen this?

It seems that this is quite similar to yours, although it has some
'excess' functionality.

I think returning nested hashes is preferable to objects for the
bare-bones ASN.1 parser. For large files a push/pull architecture would be
preferable; or some kind of indexing scheme. One approach would be to
leverage all the existing CPAN technology for this by providing an
ASN.1->XML mapping - I imagine this can be done generically. Perhaps NCBI
even have such a tool?
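
As a rough illustration of the nested-hash idea and of a generic
ASN.1->XML mapping, here is a minimal sketch (in Python rather than Perl,
purely to keep it short). It handles only a small subset of the ASN.1 text
value notation. The function names tokenize, parse and to_xml, the <item>
element used for unnamed list members, and the toy input are all
assumptions of mine, not anything taken from Pierre's code or from NCBI.

```python
import re
import xml.etree.ElementTree as ET

# Token kinds: quoted string ("" escapes a quote), "--" comment,
# punctuation, and bare words (identifiers, numbers, "::=").
TOKEN = re.compile(r'"((?:[^"]|"")*)"|(--[^\n]*)|([{},])|([^\s{},"]+)')

def tokenize(text):
    """Yield (kind, value) pairs; comments are skipped."""
    for m in TOKEN.finditer(text):
        if m.group(1) is not None:
            yield ("str", m.group(1).replace('""', '"'))
        elif m.group(2):
            continue
        elif m.group(3):
            yield ("punct", m.group(3))
        else:
            yield ("word", m.group(4))

def parse_value(tokens, pos):
    """Parse one value at tokens[pos]; return (value, next position)."""
    kind, val = tokens[pos]
    if kind == "str":
        return val, pos + 1
    if kind == "word":
        return (int(val) if re.fullmatch(r"-?\d+", val) else val), pos + 1
    if val == "{":
        return parse_block(tokens, pos + 1)
    raise ValueError(f"unexpected token {val!r}")

def parse_block(tokens, pos):
    """Parse the inside of { ... } into a dict (named fields) or a list."""
    named, items = {}, []
    while tokens[pos] != ("punct", "}"):
        kind, val = tokens[pos]
        is_field = kind == "word" and tokens[pos + 1] not in (
            ("punct", ","), ("punct", "}"))
        if is_field:
            value, pos = parse_value(tokens, pos + 1)
            # A field may repeat, so every name maps to a list of values.
            named.setdefault(val, []).append(value)
        else:
            value, pos = parse_value(tokens, pos)
            items.append(value)
        if tokens[pos] == ("punct", ","):
            pos += 1
    if named and items:
        raise ValueError("mixed named/unnamed block is not handled here")
    return (named if named else items), pos + 1

def parse(text):
    """Parse 'Type-name ::= value' into (type name, nested structure)."""
    tokens = list(tokenize(text))
    if tokens[1] != ("word", "::="):
        raise ValueError("expected '::=' after the type name")
    value, _ = parse_value(tokens, 2)
    return tokens[0][1], value

def to_xml(tag, value):
    """Generic mapping: dict keys become child tags, list members <item>."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for name, occurrences in value.items():
            for one in occurrences:
                elem.append(to_xml(name, one))
    elif isinstance(value, list):
        for one in value:
            elem.append(to_xml("item", one))
    else:
        elem.text = str(value)
    return elem

# Toy input in the general style of a .prt file (abbreviated, not real data).
SAMPLE = '''
-- toy example only
Genetic-code-table ::= {
 { name "Standard" , name "SGC0" , id 1 , ncbieaa "FFLLSSSS" } ,
 { name "Vertebrate Mitochondrial" , id 2 , ncbieaa "FFLLSSSS" }
}
'''

type_name, tables = parse(SAMPLE)
print(tables[0]["name"])
print(ET.tostring(to_xml(type_name, tables), encoding="unicode"))
```

Every named field maps to a list of occurrences because a block may repeat
a field, as the toy input does with name. This version reads the whole
file into memory; for large files the same tokenizer could instead feed a
pull parser that hands back one top-level block at a time.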

I think the codon tables idea is a good one. I don't think there's any
philosophical opposition - a generic format such as ASN.1 is preferable to
a multitude of ad hoc formats. However, it would be nice if NCBI were to
provide all their datasets as XML, which would make an ASN.1 parser fairly
pointless for those outside the NCBI. Until that time, I for one would
have uses for a generic ASN.1 parser.
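
The fallback Pierre describes could look roughly like the sketch below
(again in Python for brevity). It assumes the environment variable is
literally named NCBI and holds a directory containing gc.prt, which is my
reading of his message rather than something I have checked. The names
load_codon_tables and BUILTIN_TABLES are hypothetical, and the parser is
passed in as a function so that any ASN.1 reader could be plugged in.

```python
import os
from pathlib import Path

# Stand-in for the tables hard-coded in the library; the value is an
# abbreviated placeholder, not a complete codon table.
BUILTIN_TABLES = {1: "FFLLSSSS"}

def load_codon_tables(parse, environ=os.environ, filename="gc.prt"):
    """Return (tables, source).

    If the directory named by the NCBI variable (an assumed name) contains
    a readable, parseable file, use it; otherwise use the built-in tables.
    """
    data_dir = environ.get("NCBI")
    if data_dir:
        path = Path(data_dir) / filename
        try:
            return parse(path.read_text(encoding="utf-8")), str(path)
        except (OSError, ValueError):
            pass  # missing, unreadable or malformed file: fall back
    return dict(BUILTIN_TABLES), "builtin"
```

Returning the source alongside the tables lets the caller report which set
is in use, which matters once the local file and the built-in copy can
disagree.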


On Thu, 10 Feb 2005, Pierre Rioux wrote:

> Hello BioPerl users and coders,
> I haven't seen much (or any) support for parsing ASN.1
> documents in BioPerl; I assume it's either because the
> community doesn't need it much, or there are already some
> Perl ASN.1 libraries available elsewhere (but I've not
> looked very hard, I have to admit). There's some code I
> could contribute about that, but before I do anything
> I'd like to know what other developers think. The
> code I have is basically a pure Perl text parser that
> can read ASN.1 text (in .prt format, like NCBI publishes)
> and builds a data structure (a hierarchy of hash tables)
> to represent it. The object model is quite primitive right
> now, but eventually it could be extended into a nice API.
> My questions are:
>     1) Would this be useful to anyone?
>     2) If so, where (roughly) in the BioPerl code
>        tree would such a thing go?
>     3) Any other comments or recommendations welcome.
> I am using this library in my applications to read the file
> 'gc.prt' as published regularly by NCBI; this file describes
> a set of genetic codes, and NCBI updates it from time to time.
> In BioPerl, I noticed that the same information is hardcoded
> in the class Bio::Tools::CodonTable, which means that whenever
> NCBI updates the gc.prt file the bioperl class also needs
> to be modified. An alternative I suggest is that if gc.prt
> can be found on the local system (using the NCBI environment
> variable) then Bio::Tools::CodonTable could read it and keep up
> to date automatically (otherwise, it could keep using the
> hardcoded codon tables it currently has).
> I am sure other parts of BioPerl could probably benefit
> a little from being able to read NCBI's ASN.1 data files,
> unless there's some kind of philosophical opposition to
> the idea?
> I'm willing to do all the work, AND design a proper and
> clean OO interface for the code, provided the answer to
> question #1 above is 'yes'...
> Pierre Rioux
> pierre_rioux {round symbol} yahoo {smallest char} com