[Bioperl-l] Species name validation problem

David Waner dwaner at scitegic.com
Mon Mar 27 13:24:12 EST 2006

Yes, I meant to type Bio::Species, not Bio::Seq. Sorry for the confusion.

My problem is that I am not calling $species->classification() directly;
I am calling Bio::Species->new(), which in turn calls classification()
which calls validate_species_name(), which then throws an exception on
some species names.  As far as I can see, there is no way to turn off
this (over-aggressive) validation in the Species constructor. 

I guess that instead of this:

	$species = Bio::Species->new(-classification =>

I could do this:

	$species = Bio::Species->new();
	$species->classification(\@classificationArray, 'no

but it would make a nicer interface to have a validation option in the
Species constructor.
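
For reference, the two-step workaround might look like the sketch below. It
assumes a BioPerl release in which classification() accepts an array ref plus
a true second argument to skip validation, as described in the reply quoted
below. The lineage values are only illustrative, and the behaviour may differ
in other BioPerl versions.

```perl
use strict;
use warnings;
use Bio::Species;

# Illustrative lineage, most specific rank first (species, genus, family).
my @classification = ("O'nyong-nyong virus", 'Alphavirus', 'Togaviridae');

# Build an empty object first, so the constructor never sees the name.
my $species = Bio::Species->new();

# Pass an array ref plus any true value as the second argument; per the
# reply quoted below, this is what bypasses validate_species_name().
$species->classification(\@classification, 1);

print join('; ', $species->classification), "\n";
```

Wrapping the constructor call in eval { ... } and falling back to this form
only when it throws would keep validation on for names that already pass.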

- David

-----Original Message-----
From: Hilmar Lapp [mailto:hlapp at gmx.net] 
Sent: Friday, March 24, 2006 9:42 PM
To: David Waner
Cc: Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Species name validation problem

The option would be in Bio::Species, not Bio::Seq. You can circumvent
the name validation by passing an array ref to
$species->classification() and anything that evaluates to true as the
second argument. This is for instance what the genbank parser does
(which doesn't mean that it is always correct); supposedly the swissprot
parser ought to do the same.


On 3/24/06, David Waner <dwaner at scitegic.com> wrote:
> I have found that Bio::Seq->new() throws exceptions on some "species" 
> names containing special characters, or consisting of a single letter,
> e.g:
>         SwissProt: POLN_ONNVG   O'nyong-nyong virus
>         SwissProt: FIBP_ADE1H   Human adenovirus 15/H9
>         SwissProt: POLG_FMDVZ   Foot-and-mouth disease virus (strain
>                                 A22/550 Azerbaijan 65)
>         SwissProt: RIR1_BHV1C   Bovine herpesvirus 1.1
>         SwissProt: SODF_METJ    Methylomonas J
>         GenBank: AJ416726               Stylosanthes aff. calcicola
> It seems that the regex in validate_species_name() is too restrictive,
> but I can't find a way to turn off validation without editing bioperl
> modules.  There has been some recent discussion of this issue on the
> mailing list (see below).  Does anyone know if or when a
> -validate_species option to Bio::Seq->new() will be added? Or should I
> just propose the code change?
> Thanks,
>   David Waner
> > Stefan Kirov skirov at utk.edu
> > Wed Sep 21 08:46:05 EDT 2005
> >
> > ------------------------------------------------------------------------
> >
> > Thanks for the great answer Hilmar!
> > I would prefer to have some kind of a check if the user wishes so.
> > For example Entrezgene file contains some HTML tags in some entries
> > species names which is good to know.
> > I will put an option -validate_species in the constructor to turn 
> > the check on and off. Maybe a species filter can be of some use as 
> > well. though you can just select the correct file from the NCBI 
> > site.... Thanks again! Stefan
> >
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org 
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
