[Bioperl-l] Validation of files using BioPerl

Chris Fields cjfields at uiuc.edu
Wed Jun 6 08:37:28 EDT 2007

It has been discussed but never coded.  I believe if it passes  
through the Bio::SeqIO parser it's generally considered validly  
formatted (spacing, balanced quotes), though it doesn't specifically  
check FT keys and qualifiers for invalid ones, look for missing  
annotation, check taxonomy, etc.

As long as the end-of-record marker (//) is present for each record,
you could try splitting the file into chunks (read with
"local $/ = '//';") and passing each chunk as a filehandle (via
IO::String) to a Bio::SeqIO object wrapped in an eval block (the
parser resets $/, so it should work).  Follow the eval with a check
of $@ for caught errors.  It might get tedious for big sequences...
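
A rough, untested-against-real-data sketch of the chunk-and-eval idea
follows. The sub names (split_records, parse_with_seqio,
validate_records) are made up for illustration and are not part of
BioPerl. It assumes GenBank/GenPept-style records that end with a line
containing only "//", and it uses Perl's built-in in-memory filehandles
in place of IO::String. It also treats a next_seq() call that returns
nothing as a failure, which is my own assumption. Bio::SeqIO does not
throw on every malformed record, so a clean run only means the parser
did not complain.

```perl
use strict;
use warnings;

# Split text into records, each ending with a '//' line.
sub split_records {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = "//\n";                  # one record per read
    my @records = grep { /\S/ } <$in>;  # drop whitespace-only leftovers
    close $in;
    return @records;
}

# Default parser: hand a single record to Bio::SeqIO.
# Dies if the parser throws or returns no sequence object.
sub parse_with_seqio {
    my ($record, $format) = @_;
    require Bio::SeqIO;                 # loaded only when actually used
    open my $fh, '<', \$record or die "Cannot open in-memory handle: $!";
    my $seqio = Bio::SeqIO->new(-fh => $fh, -format => $format);
    my $seq   = $seqio->next_seq;
    die "no sequence parsed\n" unless defined $seq;
    return $seq;
}

# Run every record through the parser inside an eval and collect the
# error messages; an empty list means nothing was caught.
sub validate_records {
    my ($text, $format, $parser) = @_;
    $parser ||= \&parse_with_seqio;
    my @errors;
    my $n = 0;
    for my $record (split_records($text)) {
        $n++;
        my $ok = eval { $parser->($record, $format); 1 };
        push @errors, "record $n: " . ($@ || 'unknown error') unless $ok;
    }
    return @errors;
}
```

To use it, slurp the uploaded file into a scalar and call
validate_records($text, 'genbank'); any returned strings identify the
records that failed. The optional third argument is a code reference
that replaces the default parser, which makes it possible to plug in
stricter checks (feature keys, taxonomy) that Bio::SeqIO does not do.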


On Jun 6, 2007, at 1:27 AM, Shameer Khadar wrote:

> Dear All,
> How to validate an input file in fasta/PIR/GenPept/PDB format using
> Bioperl? (This is to avoid unnecessary files to be submitted to
> servers by new users.) Any module available?
> Many thanks in advance,
> -- 
> Shameer Khadar

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
