PDoc Bio::SeqIO
metaCPAN Bio::SeqIO



Bio::SeqIO provides a factory interface for parsing sequence files. The system is designed to be pluggable so that new formats can be added easily. Additional documentation is provided by the SeqIO HOWTO. This module can parse many different sequence file formats.

Format    Bio::SeqIO module     Comments
FASTA     Bio::SeqIO::fasta
FASTQ     Bio::SeqIO::fastq
NEXML     Bio::SeqIO::nexml
SEQXML    Bio::SeqIO::seqxml
BSML      Bio::SeqIO::bsml
GenBank   Bio::SeqIO::genbank
EMBL      Bio::SeqIO::embl
(This list is incomplete; please add further formats.)
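
As a minimal sketch of the factory interface described above, the following reads one record through Bio::SeqIO and reports which parser class the factory returned. The FASTA record is made up for illustration, and an in-memory filehandle is passed via -fh so that no input file is needed.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Made-up FASTA record, held in memory so that no input file is needed.
my $fasta = ">seq1 example record\nACGTACGT\n";
open(my $fh, '<', \$fasta) or die "Cannot open in-memory FASTA data: $!";

# The factory loads the parser named by -format and returns an instance of it.
my $in = Bio::SeqIO->new(-fh => $fh, -format => 'fasta');
my $parser_class = ref $in;

# Read the single record and summarise it.
my $seq = $in->next_seq;
my $report = join(' ', $parser_class, $seq->display_id, $seq->length);
print "$report\n";
```

Calling code only ever names a format; the choice of parser module is left to Bio::SeqIO, which is what makes the system pluggable.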

Implementing new sequence parsers and writers

A new Bio::SeqIO subclass must:

  1. have an all-lowercase name, e.g. simpleseq (Bio::SeqIO lowercases the requested format name before loading the matching module)
  2. be in the Bio::SeqIO package namespace, e.g. Bio::SeqIO::simpleseq
  3. reside in the Bio/SeqIO directory
  4. implement the next_seq method (for reading)
  5. implement the write_seq method (for writing)
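
The first three rules can be pictured with a small sketch. The helper format_to_module below is hypothetical and is not part of BioPerl; it only illustrates how a format name, the package name and the file location relate to each other under these rules, and should not be read as Bio::SeqIO's actual loader code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): maps a user-supplied format
# name to the module name and file path implied by rules 1 to 3.
sub format_to_module {
    my ($format) = @_;
    my $name   = lc $format;                 # rule 1: all-lowercase name
    my $module = "Bio::SeqIO::$name";        # rule 2: Bio::SeqIO namespace
    (my $path = "$module.pm") =~ s{::}{/}g;  # rule 3: Bio/SeqIO directory
    return ($module, $path);
}

my ($module, $path) = format_to_module('SimpleSeq');
print "$module lives in $path\n";
```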


The next_seq method should read data using the $self->_readline method, which is available because all Bio::SeqIO modules inherit from Bio::Root::IO. It should return a new Bio::PrimarySeqI object. If the file or stream contains more than one sequence, repeated calls to next_seq should each return the next sequence until the end of the stream is reached, at which point an undefined value should be returned. If the sequence data is rich, meaning it contains features and annotations, then Bio::SeqI or Bio::Seq::RichSeqI objects should be returned instead.
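
From the caller's side, this contract can be sketched as follows, using the existing FASTA parser as a stand-in for any Bio::SeqIO subclass. The two records are made up for illustration, and the variable names are arbitrary.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Two made-up FASTA records, held in memory so that no input file is needed.
my $data = ">a\nACGT\n>b\nTTGG\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory FASTA data: $!";
my $in = Bio::SeqIO->new(-fh => $fh, -format => 'fasta');

my $count       = 0;
my $all_primary = 1;

# Each call returns the next sequence; the loop ends on an undefined value.
while (defined(my $seq = $in->next_seq)) {
    $count++;
    $all_primary = 0 unless $seq->isa('Bio::PrimarySeqI');
}

# The stream is exhausted, so a further call is expected to return undef.
my $after_end = $in->next_seq;

print "read $count sequences\n";
```

A new parser that follows the same contract can be dropped into loops like this one without changes to the calling code.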


The write_seq method should accept an array of one or more Bio::PrimarySeqI objects and write them out in the desired format. The data should be written to the stream using the $self->_print method.
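
As a sketch of how write_seq is typically called, the following passes two sequence objects to the existing FASTA writer in a single call. The sequences are made up for illustration, and the output goes to an in-memory buffer rather than a file so that the result can be inspected directly.

```perl
use strict;
use warnings;
use Bio::PrimarySeq;
use Bio::SeqIO;

# Two made-up sequences to write out.
my @seqs = (
    Bio::PrimarySeq->new(-display_id => 'seq1', -seq => 'ACGTACGT'),
    Bio::PrimarySeq->new(-display_id => 'seq2', -seq => 'GGCCTTAA'),
);

# Write to an in-memory buffer instead of a file.
my $buffer = '';
open(my $out_fh, '>', \$buffer) or die "Cannot open in-memory output: $!";
my $out = Bio::SeqIO->new(-fh => $out_fh, -format => 'fasta');

# write_seq accepts one or more sequence objects in a single call.
my $ok = $out->write_seq(@seqs);
$out->close;

print $buffer;
```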

Example module

 package Bio::SeqIO::simpleseq;
 use strict;
 use warnings;
 use Bio::PrimarySeq;
 use base qw(Bio::SeqIO);

 # default field separator, used when no -sep argument is given
 our $SEP = "\t";

 # only needed if this module has its own special initialization options
 sub _initialize {
   my ($self,@args) = @_;
   # let Bio::SeqIO set up the underlying file or filehandle first
   $self->SUPER::_initialize(@args);
   my ($sep) = $self->_rearrange([qw(SEP)], @args);
   $sep && $self->sep($sep);
 }

 # method to write a sequence out

 =head2 write_seq

  Title   : write_seq
  Usage   : $stream->write_seq(@seq)
  Function: writes the $seq object into the stream
  Returns : 1 for success and 0 for error
  Args    : array of 1 to n Bio::PrimarySeqI objects

 =cut

 sub write_seq {
   my ($self,@args) = @_;
   my $sep = $self->sep;
   for my $seq ( @args ) {
     # one record per line: identifier, separator, sequence
     $self->_print(join($sep, $seq->display_id, $seq->seq), "\n");
   }
   return 1;
 }

 # method to read a sequence in

 =head2 next_seq

  Title   : next_seq
  Usage   : my $seq = $stream->next_seq
  Function: reads a $seq object from the stream
  Returns : Bio::PrimarySeqI
  Args    : None

 =cut

 sub next_seq {
   my ($self) = @_;
   my $line = $self->_readline;
   # end of stream (or an empty line): signal that there is nothing more to read
   return undef unless defined $line && $line =~ /\S/;
   # drop the trailing newline so it does not end up in the sequence
   chomp $line;
   my $sep = $self->sep;
   # quote the separator so that characters such as "|" are matched literally
   my ($id, $seq) = split(/\Q$sep\E/, $line, 2);
   return Bio::PrimarySeq->new(-seq => $seq, -display_id => $id);
 }

 =head2 sep

  Title   : sep
  Usage   : $obj->sep($newval)
  Function: Get/Set the field separator
  Returns : value of separator
  Args    : newvalue (optional)

 =cut

 sub sep {
   my ($self,$value) = @_;
   if( defined $value) {
      $self->{'_sep'} = $value;
   }
   # fall back to the package default when no separator has been set
   return $self->{'_sep'} || $SEP;
 }

 1;
