Chemical Space Navigation in Lead Discovery
Figure 1. Structures (a), (b) and (c).
(Topomer-based similarity techniques are available from Tripos Inc., St Louis, Missouri, http://www.tripos.com), or by surfing across the ‘scaffold space’ with SORT&gen [39] (SORT&gen is available from SPECS and BioSPECS, Rijswijk, The Netherlands, http://www.specs.net). On the experimental side, one could resort to high-energy gamma-ray radiations (Kessler U, Pilger BD, Zerbe O, Scapozza L, Folkers G, personal communication) or to high-speed microwave chemistry [40].

Chemical property space navigation
Using the analogy of an intercity distances table, in contrast to a geographical map, Martin and Critchlow [41•] pointed out the advantage of having a chemical space map, rather than mere distance-based ‘diversity’ in combinatorial library design. Having the right inter-object distances is clearly not enough, as one is likely to be successful in finding a list of, for example, five cities in Western Europe that have identical (or close) distances to cities on the East Coast of the United States. In the absence of a proper map, a sixth city in Eastern Europe could, in the wrong context, be placed somewhere in the Atlantic Ocean. In other words, context-sensitive information is required while evaluating chemical spaces, even though appropriate measures may have been taken with respect to distance-based (dis)similarity. Thus, an effective chemical space mapping effort requires a property-based system, besides an efficient similarity/diversity metric.

Early attempts to map physical properties described the two-dimensional BC(DEF), ‘bulk’ and ‘cohesiveness’ parameters [42], derived from six physical properties (aqueous solvation energy, partition coefficient, boiling point, molecular refractivity, volume and vaporization enthalpy) for a set of 114 pure liquids. This scheme was shown to work quite well for a set of 139 diverse structures [43]. By analogy to the Mercator convention in geography, we recently suggested chemography, a combination of rules (not unlike the BC(DEF) dimensions) and objects (chemical structures), to provide a consistent, global chemical space map [44,45]. Chemographic rules included, initially, general properties such as size, lipophilicity, and hydrogen bond capacity, while objects include ‘satellites’, intentionally placed outside the drug-like space, as well as ‘core’ objects, selected mostly from a list of orally available drugs. ChemGPS, the chemical global positioning system, comprises both the ‘core’ and ‘satellite’ molecules. Chemographic map coordinates are extracted, in ChemGPS, by principal component analysis (PCA) [46], from a (fixed) list of molecular descriptors that evaluate the above-mentioned rules on a single set of molecules. PCA-score prediction is used, then, to project new molecules on the
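
The idea of PCA-score prediction against a fixed reference set can be illustrated with a minimal sketch. It is not the ChemGPS implementation: the descriptor columns (size, lipophilicity, hydrogen bond capacity), all numerical values, and the function names `fit_reference_map` and `project` are hypothetical, chosen only to show how a map is fitted once on reference ('core' and 'satellite') molecules and then reused, unchanged, to place new molecules.

```python
import numpy as np

def fit_reference_map(X_ref, n_components=2):
    """Fit a PCA map on a fixed reference set (rows = molecules, columns = descriptors)."""
    mean = X_ref.mean(axis=0)
    scale = X_ref.std(axis=0, ddof=1)
    Z = (X_ref - mean) / scale                  # autoscale each descriptor
    # SVD of the autoscaled matrix; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T              # shape: (n_descriptors, n_components)
    return mean, scale, loadings

def project(X, mean, scale, loadings):
    """Predict PCA scores using the stored reference mean, scale and loadings."""
    return ((X - mean) / scale) @ loadings

# Hypothetical descriptor table (illustrative numbers only).
# Columns: size, lipophilicity, hydrogen bond capacity.
X_ref = np.array([
    [150.0,  0.5,  2.0],   # 'core'-like object
    [300.0,  2.5,  5.0],   # 'core'-like object
    [450.0,  4.0,  8.0],   # 'core'-like object
    [350.0,  3.0,  4.0],   # 'core'-like object
    [900.0,  9.0,  1.0],   # 'satellite': deliberately extreme
    [ 60.0, -3.0, 12.0],   # 'satellite': deliberately extreme
])

mean, scale, loadings = fit_reference_map(X_ref)
ref_scores = project(X_ref, mean, scale, loadings)

# A new molecule is placed on the existing map; the map itself is not refitted.
X_new = np.array([[320.0, 2.8, 6.0]])
new_scores = project(X_new, mean, scale, loadings)
print(new_scores)
```

Because the mean, scale and loadings are frozen after fitting, the coordinates of any molecule depend only on its own descriptors and on the reference set, so maps produced at different times remain comparable.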
Chemical space navigation in lead discovery Oprea 387
4. • Horrobin DF: Innovation in the pharmaceutical industry. J Royal Soc Med 2000, 93:341-345.
This paper articulates some of the issues related to marketing-driven preclinical research in the pharmaceutical industry.

5. • Drews J: Drug discovery: a historical perspective. Science 2000, 287:1960-1964.
A synopsis of the drug-discovery process is provided.

6. • Gaudillière B, Bernardelli P, Berna P: To market, to market – 2000. Annu Rep Med Chem 2000, 36:293-318.
Each year, Annual Reports in Medicinal Chemistry summarizes the NCEs introduced into the world market for the first time in the previous year. 35 NCEs were introduced into their first markets in 2000. Previous years are as follows, with the number of NCEs first introduced in the market that year given in brackets: 1991 (36), 1992 (36), 1993 (43), 1994 (a record 44 NCEs), 1995 (35), 1996 (38), 1997 (39), 1998 (27), and 1999 (35). The average for this decade is 37 NCEs.

7. Kennedy T: Managing the drug discovery/development interface. Drug Discov Today 1997, 2:436-444.

8. Willett P: Similarity and Clustering Techniques in Chemical Information Systems. Letchworth: Research Studies Press; 1987.

9. Johnson MA, Maggiora GM: Concepts and Applications of Molecular Similarity. New York: Wiley; 1990.

10. •• Willett P: Chemoinformatics – similarity and diversity in chemical libraries. Curr Opin Biotechnol 2000, 11:85-88.
Good overview of cluster-based, partition-based, dissimilarity-based and optimization-based selection techniques for combinatorial synthesis planning.

11. •• Lewis RA, Pickett SD, Clark DE: Computer-aided molecular diversity analysis and combinatorial library design. Rev Comput Chem 2000, 16:1-51.
This is an exhaustive review related to molecular similarity and diversity. Methods to probe chemical diversity, including molecular descriptor validation, are discussed in the context of combinatorial library design.

12. •• Martin YC: Diverse viewpoints on computational aspects of molecular diversity. J Comb Chem 2001, 3:231-250.
This review provides first-hand accounts and key references from authors involved early on in the field of molecular similarity and diversity.

13. • Todeschini R, Consonni V: Handbook of Molecular Descriptors. Weinheim: Wiley-VCH; 2000.
This book covers ca. 3000 molecular descriptors.

14. Gund P: Three-dimensional pharmacophoric pattern searching. Prog Mol Subcell Biol 1977, 11:117-143.

15. Olender R, Rosenfeld R: A fast algorithm for searching for molecules containing a pharmacophore in very large virtual combinatorial libraries. J Chem Inf Comput Sci 2001, 41:731-738.

16. Daylight Chemical Information System, Santa Fe, New Mexico. http://www.daylight.com

17. • Dixon SL, Merz KM: One-dimensional molecular representations and similarity calculations: methodology and validation. J Med Chem 2001, 44:3795-3809.
This paper provides a simplified, one-dimensional projection of molecular structures, that can be generated from 2D or 3D structures.

18. Leach AR, Bradshaw J, Green DVS, Hann MM: Implementation of a system for reagent selection and library enumeration, profiling, and design. J Chem Inf Comput Sci 1999, 39:1161-1172.

19. Shi S, Peng Z, Kostrowicki J, Paderes G, Kuki A: Efficient combinatorial filtering for desired molecular properties of reaction products. J Mol Graph Model 2000, 18:478-496.

20. Lobanov VS, Agrafiotis DK: Stochastic similarity selections from large combinatorial libraries. J Chem Inf Comput Sci 2000, 40:460-470.

21. •• Lipinski CA: Drug-like properties and the causes of poor solubility and poor permeability. J Pharmacol Toxicol Methods 2000, 44:235-249.
Analysis of clinical candidates at Merck indicates that the ‘rational design approach’ leads to poorer permeability, whereas Pfizer’s ‘HTS approach’ appears to result in clinical candidates with poorer solubility.

22. • Agrafiotis DK, Rassokhin DN: Design and prioritization of plates for high-throughput screening. J Chem Inf Comput Sci 2001, 41:798-805.
The issue of plate selection for HTS is discussed in view of plate-based molecular diversity and similarity to known lead(s). User-defined selection criteria (e.g. purity), can also be included in these experimental design schemes.

23. •• Kubinyi H: Chance favors the prepared mind – from serendipity to rational drug design. J Rec Signal Transduction Res 1999, 19:15-39.
Fast-paced review of serendipitous drug discoveries, emphasizing structure-based design aspects. “Screening, especially…HTS, can be considered as a systematic approach to benefit from mere chance”.

24. Brown F: Chemoinformatics: what is it and how does it impact drug discovery. Annu Rep Med Chem 1998, 33:375-384.

25. • Hahn MM, Green R: Cheminformatics – a new name for an old problem? Curr Opin Chem Biol 1999, 3:379-383.
Both software and hardware are discussed from a practitioner’s perspective.

26. Olsson T, Oprea TI: Cheminformatics: a tool for decision-makers in drug discovery. Curr Opin Drug Discov Dev 2001, 4:308-313.

27. Li J, Murray CW, Waszkowycz B, Young SC: Targeted molecular diversity in drug discovery – integration of structure-based design and combinatorial chemistry. Drug Discov Today 1998, 3:105-112.

28. Lewis RA: The design of small- and medium-sized focused combinatorial libraries. In Molecular Diversity in Drug Design. Edited by Dean PM, Lewis RA. Dordrecht: Kluwer Academic Publishers; 1999:221-248.

29. Horvath D: High throughput conformational sampling and fuzzy similarity metrics: a novel approach to similarity searching and focused combinatorial library design and its role in the drug discovery laboratory. In Combinatorial Library Design and Evaluation for Drug Design. Edited by Ghose AK, Viswanadhan VN. New York: Marcel Dekker Inc.; 2001:429-472.

30. Patterson DE, Cramer RD, Ferguson AM, Clark RD, Weinberger LE: Neighborhood behavior: a useful concept for validation of ‘molecular diversity’ descriptors. J Med Chem 1996, 39:3049-3059.

31. Pastor M, Cruciani G, McLay I, Pickett S, Clementi S: GRid-INdependent Descriptors (GRIND): a novel class of alignment-independent three-dimensional molecular descriptors. J Med Chem 2000, 43:3233-3243.

32. • Teague SJ, Davis AM, Leeson PD, Oprea TI: The design of leadlike combinatorial libraries. Angew Chem Int Ed Engl 1999, 38:3743-3748. German version: Angew Chem 1999, 111:3962-3967.
Low-molecular weight and low-hydrophobicity are discussed as key properties when designing combinatorial libraries.

33. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ: Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 1997, 23:3-25.

34. Hann MM, Leach AR, Harper G: Molecular complexity and its impact on the probability of finding leads for drug discovery. J Chem Inf Comput Sci 2001, 41:856-864.

35. Oprea TI, Davis AM, Teague SJ, Leeson PD: Is there a difference between leads and drugs? A historical perspective. J Chem Inf Comput Sci 2001, 41:1308-1315.

36. Cramer RD, Poss MA, Hermsmeier MA, Caulfield TJ, Kowala MC, Valentine MT: Prospective identification of biologically active structures by topomer shape similarity searching. J Med Chem 1999, 42:3919-3933.

37. Cramer RD, Jilek RJ, Andrews KM: Dbtop: topomer similarity searching of conventional structure databases. J Mol Graph Model 2002, in press.

38. Andrews KM, Cramer RD: Toward general methods of targeted library design: topomer shape similarity searching with diverse structures as queries. J Med Chem 2000, 43:1723-1740.

39. De Laet A, Hehenkamp JJJ, Wife RL: Finding drug candidates in virtual and lost/emerging chemistry. J Heterocyclic Chem 2000, 37:669-674.

40. Kappe CO: High-speed combinatorial synthesis utilizing microwave irradiation. Curr Opin Chem Biol 2002, 6:this issue.

41. • Martin EJ, Critchlow RE: Beyond mere diversity: tailoring combinatorial libraries for drug discovery. J Comb Chem 1999, 1:32-45.
Chemical property binning, via multi-dimensional scaling, could facilitate chemical diversity void identification.

42. Cramer RD: BC(DEF) parameters: 1. The intrinsic dimensionality of intermolecular interactions in the liquid state. J Am Chem Soc 1980, 102:1837-1849.
43. Cramer RD: BC(DEF) parameters: 2. An empirical structure-based scheme for the prediction of some physical properties. J Am Chem Soc 1980, 102:1849-1859.

44. Oprea TI, Gottfries J: Chemography: the art of chemical space navigation. J Comb Chem 2001, 3:157-166.

45. Oprea TI, Gottfries J: ChemGPS: a chemical space navigation tool. In Rational Approaches to Drug Design. Edited by Höltje HD, Sippl W. Barcelona: Prous Science Press; 2001:437-446.

46. Jackson JE: A Users Guide to Principal Components. New York: Wiley; 1991.

47. Clementi S, Cruciani G, Fifi P, Riganelli D, Valigi R, Musumarra G: A new set of principal properties for heteroaromatics obtained by GRID. Quant Struct Act Relat 1996, 15:108-120.

48. Goodford PJ: Computational procedure for determining energetically favourable binding sites on biologically important macromolecules. J Med Chem 1985, 28:849-857.

49. Sandberg M, Eriksson L, Jonsson J, Sjöström M, Wold S: New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem 1998, 41:2481-2491.

50. Darvas F, Dorman G: Early integration of ADME/Tox parameters into the design process of combinatorial libraries. Chim Oggi 1999, 17:10-13.

51. Pickett SD, McLay IM, Clark DE: Enhancing the hit-to-lead properties of lead optimization libraries. J Chem Inf Comput Sci 2000, 40:263-272.

52. Oprea TI, Zamora I, Svensson P: Quo vadis, scoring functions? Toward an integrated pharmacokinetic and binding affinity prediction framework. In Combinatorial Library Design and Evaluation for Drug Design. Edited by Ghose AK, Viswanadhan VN. New York: Marcel Dekker Inc.; 2001:233-266.

53. Oprea TI: Virtual screening in lead discovery: a viewpoint. Molecules 2002, 7:51-62.

54. Cruciani G, Crivori P, Carrupt PA, Testa B: Molecular fields in quantitative structure-permeation relationships: the VolSurf approach. J Mol Struct (Theochem) 2000, 503:17-30.

55. Anonymous: Waiver of in vivo bioavailability and bioequivalence studies for immediate-release solid oral dosage forms based on a biopharmaceutics classification system. 2000. Available from http://www.fda.gov/cder/OPS/BCS_guidance.htm.

56. Guba W, Cruciani G: Molecular field-derived descriptors for the multivariate modeling of pharmacokinetic data. In Molecular Modeling and Prediction of Bioactivity. Edited by Gundertofte K, Jørgensen FS. New York: Kluwer Academic/Plenum Publishers; 2000:89-94.

57. Zamora I, Oprea TI, Ungell AL: Prediction of oral drug permeability. In Rational Approaches to Drug Design. Edited by Höltje HD, Sippl W. Barcelona: Prous Science Press; 2001:271-280.

58. • Crivori P, Cruciani G, Carrupt PA, Testa B: Predicting blood-brain barrier permeation from three-dimensional molecular structure. J Med Chem 2000, 43:2204-2216.
A comprehensive list of drugs that penetrate (or not) the blood-brain-barrier.

59. Oprea TI, Zamora I, Ungell AL: A pharmacokinetically based mapping device for chemical space navigation. J Comb Chem 2002, in press.