[Bioperl-l] Species name validation problem

Hilmar Lapp hlapp at gmx.net
Sat Mar 25 00:42:18 EST 2006


The option would be in Bio::Species, not Bio::Seq. You can circumvent
the name validation by passing an array ref to
$species->classification() with any true value as the second
argument. This is, for instance, what the GenBank parser does (which
doesn't mean that it is always correct); the SwissProt parser
presumably ought to do the same.
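
A minimal sketch of that workaround, assuming the Bio::Species API as
it stands at the time of this thread (other BioPerl releases may
behave differently). The two-element lineage is only illustrative; a
real record would carry the full lineage, most specific rank first:

```perl
use strict;
use warnings;
use Bio::Species;

# Illustrative lineage only: species first, then genus (most specific
# rank first). "Methylomonas J" is one of the names reported below.
my @lineage = ('J', 'Methylomonas');

my $species = Bio::Species->new();

# Array ref plus a true second argument: as described above, this
# bypasses the name validation that a plain list would go through.
$species->classification(\@lineage, 1);

# Read the stored classification back.
print join(' | ', $species->classification()), "\n";
```

Note that this only skips the check; it does not make the name any
more correct, so the caller is responsible for what ends up in the
object.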

   -hilmar

On 3/24/06, David Waner <dwaner at scitegic.com> wrote:
> I have found that Bio::Seq->new() throws exceptions on some "species"
> names containing special characters, or consisting of a single letter,
> e.g.:
>
>         SwissProt: POLN_ONNVG   O'nyong-nyong virus
>         SwissProt: FIBP_ADE1H   Human adenovirus 15/H9
>         SwissProt: POLG_FMDVZ   Foot-and-mouth disease virus (strain A22/550 Azerbaijan 65)
>         SwissProt: RIR1_BHV1C   Bovine herpesvirus 1.1
>         SwissProt: SODF_METJ    Methylomonas J
>         GenBank: AJ416726               Stylosanthes aff. calcicola
>
> It seems that the regex in validate_species_name() is too restrictive,
> but I can't find a way to turn off validation without editing bioperl
> modules.  There has been some recent discussion of this issue on the
> mailing list (see below).  Does anyone know if or when a
> -validate_species option to Bio::Seq->new() will be added? Or should I
> just propose the code change?
>
> Thanks,
>   David Waner
>
>
> > Stefan Kirov skirov at utk.edu
> > Wed Sep 21 08:46:05 EDT 2005
> >
> >
> >
> > Thanks for the great answer Hilmar!
> > I would prefer to have some kind of a check if the user wishes so. For
> > example Entrezgene file contains some HTML tags in some entries
> > species names which is good to know.
> > I will put an option -validate_species in the constructor to turn the
> > check on and off. Maybe a species filter can be of some use as well,
> > though you can just select the correct file from the NCBI site.
> > Thanks again!
> > Stefan
> >


--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------


