[Bioperl-l] ASN1 and BioPerl (and XML too)

Pierre Rioux pierre_rioux at yahoo.com
Thu Feb 17 16:47:44 EST 2005

Hello everyone,

Thanks for all the replies. Here's my plan, based on your
comments. There won't be much arguing happening in this
thread, by the way, because I agree with everything that
was said!

- You've pointed out that the ASN.1 parser I was proposing
  should probably be an independent CPAN module; I agree.
  I am so used to manipulating ASN.1 files containing
  biological data that I'd forgotten it's not a data format
  specific to biology.

- The code I have right now works but it's quite primitive; it
  used to be Perl 4 (FOUR!) code I wrote back in 1996, and
  I spent some time last week upgrading it somewhat, but
  its origins still show. I'll work a little bit more on it
  before I think it's suitable for release. The parser is custom,
  and so is the API; the ideal solution, of course, would be to look
  at what exists in the Java world and implement an object model
  with identical method names. But for sure, the first release
  will be nowhere near complete: it will JUST read ASN.1 text and
  provide a way to access the data fields. ASN.1 purists will
  dislike it, I'm sure. :-)

- Eventual integration with BioPerl will probably wait until
  some more design and cleanup work is performed. I am
  willing to do this in my spare time (it's a nice project), but
  I'm not sure how quickly I'll be able to progress.

- Hilmar asked me for the code I have right now; I'm about to
  send it to him personally (with some doc and examples). If anyone
  else is interested, don't hesitate to write to me.

- Some of you mentioned XML. Personally I like XML a whole
  lot better than ASN.1. NCBI *does* provide a way to transform
  any ASN.1 record into XML automatically. Their tool "asntool"
  can be used as a data transformer, and it's quite straightforward.
  For the curious, here's a way a filehandle can be opened to return
  an XML version of an ASN.1 document (here, 'gc.prt'):

    use IO::File;
    my $fh = IO::File->new("asntool -m asn.all -v gc.prt -x stdout |")
        or die "Can't open pipe to asntool: $!\n";

- Some of your replies mentioned the great number of .asn module
  files that NCBI provides and that are often needed to deal with ASN.1
  documents and applications. I'd like to point out that NCBI
  also provides all the module definitions in a single file, which makes
  handling ASN.1 documents much easier. As shown in the code snippet
  above, the file with all the module definitions is called "asn.all".
  I personally never bother figuring out which one I need for
  the -m argument of asntool; I always supply "asn.all" and concentrate
  on the other args.

- Yet another XML comment. Actually a quick plug. I worked for a
  genomics company that has since shut down. There was some XML-handling
  software I wrote while there that I felt would be useful to the
  scientific community, but management would not release it because
  they were afraid of liabilities. I rewrote the whole library (it's
  a Perl module) and it's available on SourceForge. It's not a CPAN
  module because I don't know how to package a CPAN module (yet). If
  you like to design applications which create, load and manipulate
  XML data, have a look: http://pirobject.sourceforge.net
  It's fast and, unlike most XML software layers out there, the basic
  data model philosophy aims for simplicity and elegance. The XML
  looks good (IMNSHO), unlike NCBI's XML (which is awful).
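
Coming back to the asntool pipe shown earlier: here is a minimal sketch
of how it could be wrapped in a small helper. The asntool flags
(-m, -v, -x) and the "asn.all" default are taken from the snippet above;
the function names and their parameters are hypothetical, made up for
illustration. It uses Perl's list form of a piped open (Perl 5.8 or
later, Unix-like systems) so the file name is not interpreted by a shell.

```perl
use strict;
use warnings;

# Build the asntool argument list from the example above.
# Hypothetical helper: the name and the parameters are assumptions;
# only the flags themselves come from the original snippet.
sub asntool_xml_command {
    my ($asn_file, $modules) = @_;
    $modules = 'asn.all' unless defined $modules;
    return ('asntool', '-m', $modules, '-v', $asn_file, '-x', 'stdout');
}

# Open a read pipe to asntool and return the filehandle.
# The list form of open() passes each argument to the program directly,
# so file names with spaces or shell metacharacters stay intact.
sub open_asn_as_xml {
    my ($asn_file, $modules) = @_;
    my @cmd = asntool_xml_command($asn_file, $modules);
    open(my $fh, '-|', @cmd)
        or die "Can't open pipe to asntool: $!\n";
    return $fh;
}
```

Calling open_asn_as_xml('gc.prt') would return a filehandle from which
the XML can be read line by line. When the handle is closed, close()
returns false if the command exited with a non-zero status, and the
status is then available in $?.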

Thanks everyone. Have a great day!

