[Bioperl-l] PDB sequence from ATOM records
dhoworth at mrc-lmb.cam.ac.uk
Tue Jul 6 10:54:38 EDT 2004
It seems to me that many people who parse PDB files write their own
code. This is a shame, because it wastes effort, it makes things more
difficult for beginners, and it leads to differences in results.
This practice stems, I believe, both from the complexity of the PDB data
and from the multitude of use cases. It is well known that there are
exceptions to almost every rule about the content of PDB files. It is
also clear that people's needs vary: sometimes they care about every
character in the coordinates, sometimes just about the sequence, and
sometimes just about specific parts of the header.
I think it might be useful to have a session on this subject at BOSC.
We can try to capture different people's requirements. We can list
examples of PDB entries that demonstrate specific problems. We can
consider existing code possibilities. We can drink beer.
Afterwards, perhaps there will be a better chance of building some
software that will be widely used.
What do you think?
MRC Centre for Protein Engineering
Hills Road, Cambridge, CB2 2QH