[Bioperl-l] Validate Fasta

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Wed Mar 3 05:52:58 EST 2004

Thanks for your help, but I am afraid not....

-----Original Message-----
From: john herbert
[mailto:john.herbert at clinical-pharmacology.oxford.ac.uk]
Sent: 03 March 2004 10:45
To: michael.watson at bbsrc.ac.uk; bioperl-l at portal.open-bio.org
Subject: Re: [Bioperl-l] Validate Fasta

Hello Michael.
I'm not a BioPerl programmer extraordinaire (so anyone correct me if I
am wrong), but I think the -format flag should help here.


my $in = Bio::SeqIO->new(-file => "rubbish.fasta", -format =>
my $out = Bio::SeqIO->new(-file => ">rubbish2.fasta", -format =>

I am pretty sure if you put this change in your code and run it on your
very nice Perl fasta sequence, it will complain. 

Kind regards,


>>> "michael watson (IAH-C)" <michael.watson at bbsrc.ac.uk> 03/03/2004
10:16:04 >>>

I have searched the archives and only come up with one answer, and it
didn't work - I want to validate a FASTA sequence (DNA).  What I mean is
that if I am given a perfect FASTA sequence, then that's OK, but if there
are ANY whitespace characters, or any other characters that really
shouldn't be there, I want it to throw an error.  The script below was
suggested by Jason in 2002:

use Bio::SeqIO;

my $in = Bio::SeqIO->new(-file => "rubbish.fasta");
my $out = Bio::SeqIO->new(-file => ">rubbish2.fasta");

eval {
	LOOP: while( my $seq = $in->next_seq ) {
		$out->write_seq($seq);
	}
};
if( $@ ) {
	print "There's an Error!\n";
	goto LOOP;
}

I actually fired this at one of my scripts, a Perl script that clearly
wasn't a FASTA sequence - it has #s, tabs, newlines and all sorts of
non-DNA sequence characters.  Here is the result:

/mick/backups";my$date=`date`;my@date=split(/\s+/,$date);my$
date=join("_",@date[0..2],$date[$#date]);print"$date\n";#whi

This is undoubtedly a wonderfully FASTA-formatted Perl script, but...

Anyone?  Any ideas?

Thanks in advance for the help!
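
For what it's worth, the strict check described above (reject ANY stray
whitespace or non-DNA character) can be sketched in plain Perl without
going through Bio::SeqIO at all. This is only a minimal sketch under
some assumptions: validate_fasta_dna is a hypothetical helper, not a
BioPerl function; the allowed alphabet is assumed to be A, C, G, T and N
in either case (IUPAC ambiguity codes would be rejected); and blank
lines, carriage returns and records with no sequence all count as
errors.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns a list of error
# messages for the FASTA text in $text. An empty list means it passed.
sub validate_fasta_dna {
    my ($text) = @_;
    my @errors;
    my $seen_header = 0;    # true once any '>' line has been read
    my $header_line = 0;    # line number of the current record's header
    my $residues    = 0;    # sequence characters seen in the current record
    my $line_no     = 0;

    $text =~ s/\n\z//;      # a single trailing newline is not an error

    for my $line (split /\n/, $text, -1) {
        $line_no++;
        if ($line =~ /^>/) {
            # Starting a new record: the previous one must have had sequence.
            if ($seen_header && !$residues) {
                push @errors, "line $header_line: record has no sequence";
            }
            push @errors, "line $line_no: header has no identifier"
                unless $line =~ /^>\S/;
            ($seen_header, $header_line, $residues) = (1, $line_no, 0);
        }
        elsif (!$seen_header) {
            push @errors, "line $line_no: data before first '>' header";
        }
        elsif ($line =~ /^[ACGTN]+\z/i) {
            # Assumed alphabet: A, C, G, T, N only, no whitespace.
            $residues += length $line;
        }
        else {
            push @errors, "line $line_no: invalid or empty sequence line";
        }
    }

    if ($seen_header && !$residues) {
        push @errors, "line $header_line: record has no sequence";
    }
    push @errors, "no FASTA records found" unless $seen_header || @errors;
    return @errors;
}

# Usage: perl validate.pl rubbish.fasta
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
    my $text = do { local $/; <$fh> };
    close $fh;
    my @errors = validate_fasta_dna(defined $text ? $text : '');
    die join("\n", @errors), "\n" if @errors;
    print "OK\n";
}
```

Run against a Perl script like the one above, every line before the
first '>' would be reported, so the script dies instead of quietly
writing out a "sequence". If the file passes, it could then be handed
to Bio::SeqIO as usual.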

Bioperl-l mailing list
Bioperl-l at portal.open-bio.org 
