[Bioperl-l] Bio::SeqIO::scf header/comments handling
bosborne11 at verizon.net
Tue Oct 31 21:31:49 EST 2006
It looks like a good place to start would be the get_header() and
_get_header() methods in Bio::SeqIO::scf. If you read t/scf.t you can see that
the author, at some point, wanted get_header() to return meaningful
information, but stepping through the test shows it returning a lot of undef
values. I don't know whether this is due to the method or to the source SCF
file, but you might be able to get these methods to work yourself.
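One way to tell those two cases apart, independent of BioPerl, is to read the
header and comment block straight from the file and see whether the values are
actually there. What follows is only a minimal Python sketch, assuming the
standard SCF layout (a 128-byte big-endian header, with comments stored as
NAME=value lines at comments_offset). The function names parse_scf_header and
parse_scf_comments are made up for illustration and are not part of
Bio::SeqIO::scf.

```python
import struct

# Assumed SCF header layout: 128 bytes, big-endian.
# 4-byte magic, 8 unsigned ints, 4-byte version string,
# 4 unsigned ints, then 18 spare unsigned ints.
HEADER_FORMAT = ">4s8I4s4I18I"
HEADER_FIELDS = (
    "magic", "samples", "samples_offset", "bases", "bases_left_clip",
    "bases_right_clip", "bases_offset", "comments_size", "comments_offset",
    "version", "sample_size", "code_set", "private_size", "private_offset",
)


def parse_scf_header(data):
    """Return the named header fields of an SCF file held in `data` (bytes)."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("data is shorter than an SCF header")
    values = struct.unpack_from(HEADER_FORMAT, data, 0)
    # zip() stops after the 14 named fields, so the spare words are dropped.
    header = dict(zip(HEADER_FIELDS, values))
    if header["magic"] != b".scf":
        raise ValueError("missing .scf magic number")
    header["version"] = header["version"].decode("ascii")
    return header


def parse_scf_comments(data, header):
    """Return the comment block as a dict of NAME -> value strings."""
    start = header["comments_offset"]
    raw = data[start:start + header["comments_size"]]
    # The block is NUL-terminated; strip the terminator before decoding.
    text = raw.rstrip(b"\0").decode("latin-1")
    comments = {}
    for line in text.splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            comments[name] = value  # a repeated NAME keeps its last value
    return comments
```

To use it, read the whole trace file in binary mode, pass the bytes to
parse_scf_header, then pass the bytes and the resulting header to
parse_scf_comments. If the comments show up here but get_header() still
returns undef, the problem is more likely in the method than in the file.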
But to answer your questions, yes, it certainly sounds reasonable that these
values would be extracted by Bio::SeqIO::scf.
On 10/31/06 3:51 PM, "Nancy Hansen" <nhansen at nhgri.nih.gov> wrote:
> As sequencing centers begin to deposit trace data from "Medical
> Sequencing" projects into the public archives, there is now the need to
> "anonymize" sequence trace files by removing embedded information which
> might be used to identify the individual who was the original source of
> the DNA being sequenced.
> I was hoping I might be able to use Bio::SeqIO to manipulate the
> comments contained in an SCF-formatted trace file, but I'm finding that
> Bio::SeqIO/Bio::Seq::SequenceTrace doesn't seem to store this information.
> Since SCF is a widely-accepted standard for trace files, would it be
> reasonable to include fields like "scf_comments" and "scf_header" in a
> Bio::Seq::SequenceTrace object and have Bio::SeqIO::scf populate them?
> Likewise, it would be great if write_seq could pull these values right
> from a SequenceTrace object rather than requiring them as arguments.
> I'd be happy to help in this effort if necessary.
> Nancy F. Hansen, PhD nhansen at nhgri.nih.gov
> Bioinformatics Group
> NIH Intramural Sequencing Center (NISC)
> 5625 Fishers Lane
> Rockville, MD 20852
> Phone: (301) 435-1560 Fax: (301) 435-6170