David.Messina at sbc.su.se
Thu Apr 8 15:39:46 EDT 2010
> if $mol is not in the fixed list of genbank molecule types it should
> be set to the default value of 'DNA', or some other smarter way of
> forcing the molecule type into the fixed vocabulary would be a help.
Sounds good to me. Did you modify your local copy of Bio::SeqIO::genbank and try it out?
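For what it's worth, here is a minimal sketch of the fallback you describe. It is written in Python purely for illustration and is not a patch to Bio::SeqIO::genbank. The function name, the constants, and especially the list of allowed types are my own assumptions: the list is only an illustrative subset, and the real vocabulary should come from the GenBank spec. The "smarter" part is just a case-insensitive match before falling back to 'DNA'.

```python
# Hypothetical, illustrative subset of molecule types. The authoritative
# list should be taken from the GenBank flat-file specification.
ALLOWED_MOLECULE_TYPES = ("DNA", "RNA", "mRNA", "rRNA", "tRNA")
DEFAULT_MOLECULE_TYPE = "DNA"


def normalize_molecule_type(mol, allowed=ALLOWED_MOLECULE_TYPES,
                            default=DEFAULT_MOLECULE_TYPE):
    """Return the canonical spelling of mol if it is in the allowed
    vocabulary (ignoring case and surrounding whitespace), else default."""
    if mol is None:
        return default
    # Map lower-cased names back to their canonical spelling.
    canonical = {name.lower(): name for name in allowed}
    return canonical.get(mol.strip().lower(), default)


if __name__ == "__main__":
    for value in ("mrna", "genomic DNA", None):
        print(repr(value), "->", normalize_molecule_type(value))
```

Anything the lookup does not recognise, such as a free-text value like 'genomic DNA', silently becomes the default. If that is too lossy, the writer could warn before substituting.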
I will say, though, that GenBank is a tricky format, both to read and to write. Even if BioPerl wrote GenBank records that were fully compliant with the spec, I'm pretty sure they would not be round-trippable*. That is, if you read a GenBank record into BioPerl and then wrote it back out, the output wouldn't exactly match the input.
I think that NCBI is trying to nudge people toward their XML format. I know it won't help this particular situation, but it might be an option to consider for the future.
Speaking of which, what is the current status of the BioPerl Genbank XML parser? Jay, did you ever release that?
* Not that they were designed to be round-trippable; see http://www.bioperl.org/wiki/HOWTO:SeqIO#Caveats