[Bioperl-l] GeneStructure interfaces

Alan Robinson (EBI) alan@ebi.ac.uk
Sun, 18 Feb 2001 10:40:57 +0000 (GMT Standard Time)


I was having a look at the GeneStructure modules, which look interesting,
and I have some quick questions and observations.

Most of my comments relate to possible documentation inconsistencies,
rather than technical issues with the interfaces (I also appreciate this
is a work in progress and you may just not have got round to these yet!).

1) There are three 'GeneStructure' files: two implementations,
   'Bio::SeqFeature::GeneStructure' and
   'Bio::SeqFeature::Gene::GeneStructure', and an interface
   ('GeneStructureI').

   There are differences between the implementations (e.g. the 'Gene'
   one has no cds() methods) and it appears that neither implements
   the GeneStructureI interface?

   Which is the definitive one? Is one of them cruft?

   Depending on the above, the next 5 comments may be invalid.

2) The 'GeneStructureI' interface has no 'introns()' method.

3) The 'GeneStructureI' interface is documented as returning an array
   of 'ExonI' objects from the 'utrs()' method, but both the
   'GeneStructure' implementations are documented as returning arrays
   of 'SeqFeatureI' objects.

4) For the 'exons()' method, 'Bio::SeqFeature::GeneStructure' is
   documented as returning an array of 'SeqFeatureI' objects, whereas
   'GeneStructureI' and 'Bio::SeqFeature::Gene::GeneStructure' are
   documented as returning an array of 'ExonI' objects.

5) The 'utrs()' method of 'Bio::SeqFeature::Gene::GeneStructure' has
   optional arguments for specifying the type of UTR to be returned;
   these are not documented in the 'GeneStructureI' interface.

6) The 'Bio::SeqFeature::GeneStructure' implementation has an optional
   argument for the 'cds()' method to specify whether the returned CDS
   should be corrected for the phase; however, the 'cds()' methods of
   other objects (e.g. Transcript, Exon and ExonI) specify that the
   CDS returned must be in phase by the addition of N's at the

7) The 'Transcript' implementation includes 'utr5prime', 'utr3prime'
   and 'polyA' as exon types in the sort order, whilst the 'Exon'
   implementation only includes 'utr' as a valid exon type. Is there a
   particular reason not to have the prime-ness included for the
   'Exon' valid types?

Alan J. Robinson, D.Phil.             Tel:+44-(0)1223 494444
European Bioinformatics Institute     Fax:+44-(0)1223 494468
EMBL Outstation - Hinxton             Email:  alan@ebi.ac.uk
Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD, UK                http://industry.ebi.ac.uk/~alan/