[Bioperl-l] Bio::Species, Bio::Taxonomy::Node overhaul

Chris Fields cjfields at uiuc.edu
Sun Aug 6 19:44:14 EDT 2006

Sendu, I feel this needs to be posted to the main list for further  
responses from anyone interested in making a point, one way or  
another.  I'm dropping out of this; you can have the last word.

This is in response to Sendu's proposal to have $species->species  
return the binomial name for that rank, as documented on Bugzilla.   
Any other responses would be appreciated.

(In reply to comment #5)
 > (In reply to comment #4)
 > See also http://en.wikipedia.org/wiki/Species and
 > http://en.wikipedia.org/wiki/Binomial_nomenclature. "The name of the
 > species is the whole binomial, not just the second term (which may be
 > called epithet, for plants, or specific name, for animals)".
 > We can't have a method for the 'specific name' because we have no way
 > of always correctly working out what that is. The NCBI taxonomy
 > database doesn't tell us, and neither do the various sequence file
 > formats.

You present a single definition of 'species' as though it were the
only correct one.  But in quoting the Wikipedia articles you leave out
a plethora of other definitions, including one used by taxonomists:
the second name in a binomial, a.k.a. the species descriptor, or what
your quote calls the 'epithet' or 'specific name'.  This is also
explicitly stated in the second link you provide, for 'binomial
nomenclature':

"As the word "binomial" suggests, the scientific name of a species is  
formed by the combination of two terms: the genus name and the  
species descriptor."

The previous use of species() in Bio::Species fits that definition,  
in that the species() method originally gave only the species  
descriptor (one name), NOT the binomial name, which is given by  
binomial().  Similarly, genus() gave only the genus name.  Why have a  
genus() or binomial() at all if you get the entire name via species()?
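
To make the previous behaviour concrete, here is a minimal sketch.  Note
that ToySpecies is a hypothetical stand-in written only for
illustration, not the real Bio::Species class, and its constructor
arguments are my own assumption.  Only the method names genus(),
species() and binomial() come from the discussion above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ToySpecies is a hypothetical stand-in, NOT the real Bio::Species.
# It only mimics the accessor semantics described above.
package ToySpecies;

sub new {
    my ($class, %args) = @_;
    # The 'genus' and 'species' keys are assumed names for this sketch.
    my $self = { genus => $args{genus}, species => $args{species} };
    return bless $self, $class;
}

# One name each: the genus and the species descriptor.
sub genus   { my ($self) = @_; return $self->{genus} }
sub species { my ($self) = @_; return $self->{species} }

# The two-part name is built from the two single names.
sub binomial {
    my ($self) = @_;
    return join ' ', $self->genus, $self->species;
}

package main;

my $human = ToySpecies->new(genus => 'Homo', species => 'sapiens');

print "genus:    ", $human->genus,    "\n";   # genus:    Homo
print "species:  ", $human->species,  "\n";   # species:  sapiens
print "binomial: ", $human->binomial, "\n";   # binomial: Homo sapiens
```

With this split each accessor has exactly one job.  If species()
returned 'Homo sapiens' instead, it would duplicate binomial() and
there would be no accessor left for the descriptor alone.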

So, is there a correct definition of 'species'?  The same Wikipedia
pages you use to bolster your case for using a binomial species name
actually indicate otherwise:

"Since the advent of the theory of evolution, the conception of  
species has undergone vast changes in biology; however no consensus  
on the definition of the word has yet been reached."

Seems ambiguous to me.  Is there another way?

Our proposal (actually Hilmar's) was to let Bio::Species hold the  
data as parsed in the SeqIO modules as is, but also have the same  
data contained in a Bio::Taxon object for I/O.  Then, slowly  
deprecate Bio::Species in favor of Bio::Taxon.  No confusion as to  
the data returned, no redundant methods, and the change is gradual,  
not sudden.  So, you could get the name ('Homo sapiens') as a  
Bio::Taxon object scientific name:

# returns NCBI TaxID scientific name from Bio::Taxon object

which doesn't carry the ambiguity of what would be returned like

# returns species name from Bio::Species object
$seq->species->species(); # what is it?

Is it a single name?  The binomial?  Both definitions could be  
correct (but only the first one is used).  At least with the first  
version (again proposed by Hilmar), you can state that this  
explicitly returns the scientific name as defined by NCBI (and have  
something from the NCBI server to point to).  No tainting of  
Bio::Taxon with odd useless methods which can be misconstrued five ways.

I'm not going to get drawn into another long-winded argument about  
this.  My point is made.  It's your baby.  I feel that we sometimes  
get too impassioned trying to defend our views when coding is the  
best course of action.  And I feel that not making concise arguments  
can be wasteful and, ultimately, pointless.

It's my firm belief, though, that using species() in this way will
generate more confusion than it's worth.  I'll leave it to you to
answer the confused emails from bioperl users who don't expect this.

