[Bioperl-l] fasta file parser

ste.ghi at libero.it ste.ghi at libero.it
Tue Jul 22 07:28:24 EDT 2008

Dear all,
I'm trying to write a script which, given a file containing a list of 
IDs, parses a big FASTA file and returns only the sequences NOT in that list.

To do so, I first create an array with the IDs to be excluded:


#Load LIST content in an array; avoids duplicates
while (my $line = <LIST>) {
    push(@array1, $line);
    foreach my $uniq (@array1) {
        next if $seen{ $uniq }++;
        push @unique, $uniq;
    }
}
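
A minimal sketch of the same ID-loading step using a hash instead of two arrays, so duplicates collapse on their own and each ID can later be tested with a single lookup. The sub name `load_ids` and the sample IDs are hypothetical, and it assumes the list file holds one ID per line.

```perl
use strict;
use warnings;

# Read IDs from an open filehandle (one per line) and return a lookup hash.
# load_ids is a hypothetical helper name, not part of BioPerl.
sub load_ids {
    my ($fh) = @_;
    my %ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;   # trim whitespace, including newline and \r
        next if $line eq '';       # skip blank lines
        $ids{$line} = 1;           # a repeated ID just overwrites itself
    }
    return %ids;
}

# Demo on an in-memory "file" with a duplicate and a blank line (made-up IDs).
my $list_text = "seqA\nseqC\nseqA\n\n";
open my $fh, '<', \$list_text or die "Cannot open in-memory list: $!";
my %exclude = load_ids($fh);
close $fh;

print join(',', sort keys %exclude), "\n";   # prints seqA,seqC
```

Trimming each line once while loading also removes the need to chomp every ID again for each sequence later on.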


Then I process the FASTA file in this way (NOT WORKING):

#Fasta file processing
my $newSeqFileName = Bio::SeqIO->new(-file => ">>INFILE", -format => 'fasta');
while (my $query = $SeqFileName->next_seq()) {
    foreach my $elem (@unique) {
        chomp $elem;
        if ($elem eq $query->id) {
            print $query->id." matched with $elem listed in $ARGV[1]: skipped!\n";
        }
        elsif ($elem ne $query->id) {
            next if $seen2{ $query->id }++;



This way I only get an exact copy of the input file... where am I wrong?
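
One possible explanation, sketched in plain Perl with made-up IDs: if the part of the loop not shown above writes the sequence inside the `elsif` branch (an assumption, since that code is cut off), then a record is kept as soon as any one list entry differs from its ID. With two or more distinct IDs in the list that is true for every record. The sub names `kept_by_loop` and `kept_by_lookup` are hypothetical and only contrast the two tests.

```perl
use strict;
use warnings;

# Made-up data standing in for the ID list and the IDs of the FASTA records.
my @exclude_ids = ('seqA', 'seqC');
my @record_ids  = ('seqA', 'seqB', 'seqC', 'seqD');

# Loop-style test as in the post: a record is kept the first time
# some list entry is different from its ID.
sub kept_by_loop {
    my ($ids, $records) = @_;
    my (%seen, @kept);
    for my $id (@$records) {
        for my $elem (@$ids) {
            if ($elem ne $id) {
                next if $seen{$id}++;   # keep each record only once
                push @kept, $id;
            }
        }
    }
    return @kept;
}

# Hash-style test: a record is kept only if its ID is absent from the list.
sub kept_by_lookup {
    my ($ids, $records) = @_;
    my %exclude = map { $_ => 1 } @$ids;
    return grep { !$exclude{$_} } @$records;
}

print join(',', kept_by_loop(\@exclude_ids, \@record_ids)), "\n";     # prints seqA,seqB,seqC,seqD
print join(',', kept_by_lookup(\@exclude_ids, \@record_ids)), "\n";   # prints seqB,seqD
```

In the Bio::SeqIO loop the same idea would reduce the inner `foreach` to a single test, for example `$out->write_seq($query) unless $exclude{ $query->id };`, where `$out` is assumed to be a Bio::SeqIO stream opened for writing and `%exclude` a hash built from the ID list.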

Thanks a lot for your kind help!
