[Bioperl-l] A perl regex query
stefan.kirov at bms.com
Tue Sep 18 14:54:49 EDT 2007
Actually, SMILES can be tricky too: you can easily generate
non-canonical keys, whereas InChI is unique (as I understand it at least).
It is promoted by IUPAC:
and can be generated by OpenBabel.
My take is that if you need to map between small molecules, InChI might
be the best way.
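
To make the mapping idea concrete, here is a minimal sketch in Python (not Perl, purely for illustration). It assumes that an InChI string has already been generated for every compound in both sources, for example with OpenBabel; the sample records, the (name, InChI) layout, and the function names index_by_inchi and match_sources are all hypothetical. The compounds are then matched on the InChI string instead of on the name:

```python
# Hypothetical records: (name used by the source, precomputed InChI).
# The InChI strings are assumed to come from a tool such as OpenBabel;
# nothing here calls OpenBabel itself.
source_a = [
    ("ethyl alcohol", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
    ("acetic acid", "InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)"),
    ("benzol", "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"),
]
source_b = [
    ("ethanol", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"),
    ("ethanoic acid", "InChI=1S/C2H4O2/c1-2(3)4/h1H3,(H,3,4)"),
    ("methane", "InChI=1S/CH4/h1H4"),
]


def index_by_inchi(records):
    """Group compound names by their InChI string."""
    index = {}
    for name, inchi in records:
        # A list per key, in case one source has several names for a compound.
        index.setdefault(inchi.strip(), []).append(name)
    return index


def match_sources(a, b):
    """Return (names in a, names in b) for every InChI present in both."""
    index_a = index_by_inchi(a)
    index_b = index_by_inchi(b)
    shared = sorted(index_a.keys() & index_b.keys())
    return [(index_a[key], index_b[key]) for key in shared]


matches = match_sources(source_a, source_b)
for names_a, names_b in matches:
    print(", ".join(names_a), "<->", ", ".join(names_b))
```

Names whose InChI appears in only one source (benzol and methane here) simply drop out of the result, so they can be reviewed by hand or looked up in a synonym database.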
Chris Fields wrote:
> On Sep 18, 2007, at 8:26 AM, Roy Chaudhuri wrote:
>>> My actual problem is a bit more complicated.
>>> It is not just one string, but lakhs of them; they are actually
>>> names of
>>> chemical compounds.
>>> The problem is there are 2 different data sources. I need to match
>>> compound names between them, but the problem is that though the compound may
>>> be the same in the two, they use different naming formats for them.
>> Unless you can define in simple and precise terms exactly which
>> parts of
>> the string you need then there is no way that you will be able to
>> code a
>> solution in Perl.
>> Maybe you could look for a database that contains the synonyms for each
>> molecule? A quick Google finds ChEBI (http://www.ebi.ac.uk/chebi), which
>> is available to download as flat files.
>> Dr. Roy Chaudhuri
>> Department of Veterinary Medicine
>> University of Cambridge, U.K.
> D'oh! Roy beat me to it; that's what I was going to suggest. I
> agree; don't trust simple word munging to always get you the correct
> answer in this case. It's just too complicated to try and catch every
> case.
> ChEBI is a good choice; Stefan's suggestion of OpenBabel is also a
> good one. I would also try not to reinvent the wheel; there may be
> some modules available via CPAN which do what you need, such as these:
> or this: