torsten.seemann at infotech.monash.edu.au
Tue Sep 19 21:13:43 EDT 2006
I have added example output files for all 4 flavours of Glimmer to
bioperl-live CVS as t/data/Glimmer*, described below:
> The initial checkin comments (circa '03) for Bio::Tools::Glimmer
> describe it as a 'GlimmerM 3.0' parser. The POD says '...a module for
> parsing Glimmer predictions (currently GlimmerM
> 3.0 is all that has been tested)...'. However, the latest version of
> GlimmerM looks to be 2.5.1 (ftp://ftp.tigr.org/pub/software/GlimmerM),
> and there are multiple versions/flavors of Glimmer besides GlimmerM:
> Glimmer 2.X ( bacteria, archaea, and viruses):
A single two-part output file. The first part has detailed information
regarding all ORFs, while the second part has the putative genes.
> Glimmer 3.X ( bacteria, archaea, and viruses):
Glimmer3 produces two separate files: XXX.detail and XXX.predict.
The Glimmer3 .detail file is similar to the first part of a Glimmer 2.x
file. The Glimmer3 .predict file conveys the same information as the
second part of a Glimmer2 file, but in a totally different format!
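
For illustration only, here is a minimal sketch of reading the final gene
predictions from a Glimmer3 .predict file. It is written in Python rather
than Perl and is not BioPerl code. It assumes the layout described in the
Glimmer3 release notes: a FASTA-style ">" header line per sequence,
followed by one whitespace-separated line per gene with identifier, start,
end, reading frame and score. The names Prediction and parse_predict are
hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Prediction(NamedTuple):
    """One predicted gene (hypothetical record type for this sketch)."""
    seq_id: str
    gene_id: str
    start: int
    end: int
    frame: int    # sign gives the strand, e.g. +1 or -3
    score: float


def parse_predict(lines: Iterable[str]) -> Iterator[Prediction]:
    """Yield predictions from the lines of a .predict file.

    Assumed layout (not taken from the post itself):
      >seq_id optional description
      gene_id  start  end  frame  score
    """
    seq_id = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # New sequence: keep only the first word as the identifier.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            continue
        fields = line.split()
        if len(fields) < 5:
            continue  # skip lines that do not look like a gene record
        gene_id, start, end, frame, score = fields[:5]
        yield Prediction(seq_id, gene_id, int(start), int(end),
                         int(frame), float(score))
```

Coordinates are kept exactly as written in the file, so a reverse-strand
gene may have start greater than end; the sketch does not reorder them.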
> GlimmerM ( eukaryotes ):
I used GlimmerM 2.5.1. The output matches the original
"t/data/glimmer.out" test file in CVS.
> GlimmerHMM ( eukaryotes ):
This format is nearly identical to GlimmerM, only the first line header
is different. I used version 2.2.0.
> I suspect Bio::Tools::Glimmer only parses GlimmerM, *maybe*
> GlimmerHMM, but not Glimmer 2.X or Glimmer 3.X.
It doesn't currently work with my GlimmerHMM output, as the module
expects a version number, which my output does not have - but I will fix
that in CVS today.
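
The kind of fix I have in mind is simply to make the version number
optional when matching the header line. Below is a rough sketch of that
idea in Python (not the actual change to Bio::Tools::Glimmer). The header
strings "GlimmerM (Version 3.0)" and "GlimmerHMM" are assumed examples of
what the first line may look like, and HEADER_RE and parse_header are
hypothetical names.

```python
import re

# Program name, optionally followed by "(Version X.Y)".
# The exact header wording is an assumption made for this sketch.
HEADER_RE = re.compile(r"^(Glimmer\w*)(?:\s+\(Version\s+([\d.]+)\))?\s*$")


def parse_header(line):
    """Return (program, version) for a header line, or None if no match.

    version is None when the header carries no version number.
    """
    m = HEADER_RE.match(line.strip())
    if m is None:
        return None
    return m.group(1), m.group(2)
```

With this pattern a header without a version is still recognised, and the
caller can decide what to report when the version is missing.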
However, it won't work with Glimmer 2.x and 3.x, and it probably
shouldn't, as the eukaryotic stuff isn't relevant. New code has to be
written. Most people only want the final gene predictions, for which 2.x
and 3.x use different formats and files.
I'm not sure whether to
1. parse them all under the same module, perhaps with a
2. create a single new module for both Glimmer2 and Glimmer3
3. create two new modules, one for Glimmer2 and one for Glimmer3, given
they are different outputs both in syntax and number of output files
Any advice from Bioperl 'old timers' appreciated ;-)
Dr Torsten Seemann http://www.vicbioinformatics.com
Victorian Bioinformatics Consortium, Monash University, Australia