[Bioperl-guts-l] [Bug 3061] New: AlignIO hash sequence storage
bugzilla-daemon at portal.open-bio.org
Thu Apr 22 07:55:11 EDT 2010
http://bugzilla.open-bio.org/show_bug.cgi?id=3061
Summary: AlignIO hash sequence storage
Product: BioPerl
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Core Components
AssignedTo: bioperl-guts-l at bioperl.org
ReportedBy: bernd at bio.vu.nl
Hi
Something I stumble on from time to time: the storage of sequences in AlignIO
is based on SeqIDs. This complicates reading alignments with duplicate IDs,
which actually occur quite often (e.g. CDD of NCBI). Usually I try to
"uniqfy" the IDs, but this is not straightforward for all alignment formats.
Actually, this is where BioPerl is really useful ;-)
I'd propose to store the sequences in a hash in AlignIO using unique keys,
possibly as an option, so that all sequences in the alignment can be read,
even when they all have the same ID.
This would also get rid of the "Replacing one sequence" warnings:
-------------------- WARNING ---------------------
MSG: Replacing one sequence [10/1-214]
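
To illustrate the idea, here is a minimal sketch of a store with unique
internal keys. It is written in Python rather than Perl and does not use any
BioPerl API; the names AlignmentStore, add_seq and each_seq are hypothetical,
and the (ID, counter) key is just one possible way to make the keys unique.

```python
class AlignmentStore:
    """Hypothetical container that keeps every sequence, even if IDs repeat."""

    def __init__(self):
        # unique internal key -> (display_id, sequence); dicts keep insertion order
        self._seqs = {}
        # display_id -> how many times this ID has been added so far
        self._counts = {}

    def add_seq(self, display_id, sequence):
        # Build the key from the ID plus a per-ID counter, so a repeated ID
        # gets its own slot instead of replacing the earlier sequence.
        n = self._counts.get(display_id, 0) + 1
        self._counts[display_id] = n
        key = (display_id, n)
        self._seqs[key] = (display_id, sequence)
        return key

    def each_seq(self):
        # Return all sequences in insertion order with their original IDs.
        return list(self._seqs.values())


store = AlignmentStore()
store.add_seq("10", "MKV-LA")
store.add_seq("10", "MKVQLA")  # same ID as above, stored separately
store.add_seq("11", "MRV-LA")
```

With a hash keyed on the ID alone, the second "10" would replace the first.
Here both are kept, and the original IDs are left untouched for writing the
alignment back out.
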
Possibly this can be addressed as part of the AlignIO refactoring.
Regards,
Bernd
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.