Blundell - Structure-Based Drug Design


PROGRESS

Structure-based drug design


Tom L. Blundell

The three-dimensional structures of more than 4,000 macromolecules have already been solved, and the number
will continue to increase steadily. Many of these macromolecules are important drug targets and it is now
possible to use the knowledge of their three-dimensional structure as a good basis for drug design.

THE revolution in biology over the past two decades has resulted in radical new opportunities for drug discovery. Most importantly it has defined major drug targets in the form of molecular components of disease processes, which are now being developed for use in automated assays. Once lead compounds have been identified by screening natural compounds, chemical databanks or combinatorial libraries, the target macromolecules can provide a starting point for structure-based approaches (see ref. 1 for a review). These involve definition of the topographies of the complementary surfaces of ligands and their macromolecular targets. Here I describe recent progress in using such knowledge of the three-dimensional structures of receptor or target proteins as a basis for drug design.

Three-dimensional structures and their accuracy

Although the number of macromolecular three-dimensional structures increased linearly for about 30 years2, more powerful synchrotrons, better X-ray detectors, faster computers and graphics and new multi-dimensional nuclear magnetic resonance (NMR) methods have changed the position radically in the past decade3,4. X-ray and NMR approaches have both taken advantage of the expression and purification of stable domains, substructures and mutants of often complex proteins such as receptor tyrosyl kinases, oncogenes and repressors (Fig. 1). As a consequence, the number of protein structures has begun to rise exponentially, like that of sequences over the previous decade. Now there are more than 4,000 macromolecular three-dimensional structures in the Protein Data Bank (refs 2, 5) and many of these are key drug targets (Figs 2 and 3).
The accuracy required of a macromolecular structure reflects the use to which it will be put. If the design is predicated on the assumption that a lead molecule will complement a known binding site precisely, an accurate model will be required at the highest resolution possible, although designers must remember that proteins are flexible and can easily accommodate small changes. However, if the designer wishes only to know the general availability of space, essential hydrogen bonds, key electrostatic interactions or where to cyclize ligand groups, a rough model may be adequate.
Of course, the accuracy of a three-dimensional structure depends on the refinement, the resolution and the restraints introduced in the structure analysis3,6. However, much structure-based design appears to assume that the structure is correct, precise and rigid. Modelling software should perhaps oblige the user to know more about the experimental approach, the statistical parameters indicating the agreement between model and data and the thermal parameters giving clues about disorder, which are available in the original Protein Data Bank files.
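
As an illustration of the thermal parameters mentioned above, the following minimal sketch reads the temperature factor (B-factor) from the fixed-column ATOM records of a Protein Data Bank file and averages it per residue. The column positions follow the standard PDB ATOM record layout; the sample coordinates and B-factors, the function names and the cutoff of 40 are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def format_atom(serial, name, res_name, chain, res_seq, x, y, z, occ, b):
    """Build one fixed-column PDB ATOM record (used here only to make sample data)."""
    return ("ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            % (serial, name, res_name, chain, res_seq, x, y, z, occ, b))


def mean_b_by_residue(lines):
    """Average the temperature factor (columns 61-66) over the atoms of each residue."""
    per_residue = defaultdict(list)
    for line in lines:
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        # Key: chain identifier (column 22), residue number (23-26), residue name (18-20).
        key = (line[21], int(line[22:26]), line[17:20].strip())
        per_residue[key].append(float(line[60:66]))
    return {key: mean(values) for key, values in per_residue.items()}


def disordered_residues(lines, threshold=40.0):
    """Return residues whose mean B-factor exceeds an assumed, arbitrary cutoff."""
    return sorted(k for k, b in mean_b_by_residue(lines).items() if b > threshold)


# Made-up coordinates and B-factors, for illustration only.
SAMPLE = [
    format_atom(1, " N  ", "ALA", "A", 1, 11.104, 6.134, -6.504, 1.00, 12.50),
    format_atom(2, " CA ", "ALA", "A", 1, 11.639, 6.071, -5.147, 1.00, 14.00),
    format_atom(3, " N  ", "GLY", "A", 2, 12.500, 7.000, -4.000, 1.00, 45.00),
    format_atom(4, " CA ", "GLY", "A", 2, 13.100, 7.900, -3.100, 0.50, 55.00),
]

for residue in disordered_residues(SAMPLE):
    print("possibly disordered:", residue)
```

A modelling tool could surface such a per-residue summary before design begins, so that poorly ordered regions of the binding site are not treated as precise and rigid.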

Interactive graphics and lead development


Once the three-dimensional structure of a target protein has been defined, then computational procedures are required to suggest ligands that will bind at the active site. This can be approached either by elaborating a known ligand, preferably where the protein-ligand complex has been defined by X-ray analysis, or by searching for ligands (database approach) or molecular fragments (construction approach) that complement the receptor topography7.
Structure-based design begins with the graphical display of hydrogen bonds, molecular surfaces8,9 and electrostatic fields10,11. Traditionally, key interactions have been identified visually from three-dimensional structures of macromolecule-ligand complexes, as illustrated in Fig. 2. New ligand designs are then explored that optimize a transition-state isostere, modify groups to improve complementarity and cyclize side groups to increase the rigidity of the ligand. Other important objectives include modification of bonds susceptible to hydrolysis, such as peptide bonds, decrease in size of ligand to assist nasal or oral absorption and identification of sites for modulation of physical-chemical properties to improve bioavailability.

Making the new molecules


Although some attempts have been made at interactive docking of a putative ligand molecule into a receptor site12-14, it is more effective to evaluate the electrostatic, steric or more complex energy terms during a systematic search of rotational and translational space for the two molecules15,16. Computational time can be reduced by precalculating terms for each point on a grid using electrostatic terms for probes17 or by using pseudo-energies calculated from pairwise distributions of atoms in protein complexes or crystals of small molecules18,19. Probe molecules are fitted to these potentials and ranked according to energy.

FIG. 1 A schematic representation of the nerve growth factor (NGF) complex with its receptor tyrosyl kinase, TrkA. Figure devised by Judith Murray Rust on the basis of the crystal structure of NGF and comparative models of the receptor domains. C1 and C2, cystine-rich regions; Ig1 and Ig2, immunoglobulin-like domains; LLR, leucine-rich region.
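
The grid precalculation described in the text can be sketched as follows. This is a simplified illustration and not the energy function of any published program: it assumes a Coulomb-only potential with a constant dielectric of 4.0, a nearest-grid-point lookup instead of interpolation, and invented receptor charges and probe poses. All names (`potential`, `build_grid`, `grid_energy`, `rank_poses`) are hypothetical.

```python
import math
from itertools import product

COULOMB = 332.06  # conventional factor giving kcal/mol for charges in e and distances in angstroms


def potential(point, atoms, dielectric=4.0, min_dist=0.5):
    """Electrostatic potential at a point from receptor atoms given as (x, y, z, charge)."""
    total = 0.0
    for x, y, z, q in atoms:
        r = max(math.dist(point, (x, y, z)), min_dist)  # clamp to avoid division by zero
        total += COULOMB * q / (dielectric * r)
    return total


def build_grid(atoms, origin, spacing, shape):
    """Precalculate the potential once for every grid point."""
    grid = {}
    for idx in product(range(shape[0]), range(shape[1]), range(shape[2])):
        point = tuple(o + n * spacing for o, n in zip(origin, idx))
        grid[idx] = potential(point, atoms)
    return grid


def grid_energy(ligand, grid, origin, spacing):
    """Score a pose by summing charge times potential at the nearest grid point."""
    energy = 0.0
    for x, y, z, q in ligand:
        idx = tuple(round((c - o) / spacing) for c, o in zip((x, y, z), origin))
        energy += q * grid[idx]
    return energy


def rank_poses(poses, grid, origin, spacing):
    """Return pose names ordered from most to least favourable energy."""
    return sorted(poses, key=lambda name: grid_energy(poses[name], grid, origin, spacing))


# Invented receptor: one negative and one positive partial charge.
RECEPTOR = [(0.0, 0.0, 0.0, -0.5), (3.0, 0.0, 0.0, 0.5)]
ORIGIN, SPACING, SHAPE = (-2.0, -2.0, -2.0), 1.0, (8, 5, 5)

# Two placements of a single positively charged probe atom.
POSES = {
    "A": [(0.0, 0.0, 1.0, 1.0)],  # near the negative receptor atom
    "B": [(3.0, 0.0, 1.0, 1.0)],  # near the positive receptor atom
}

GRID = build_grid(RECEPTOR, ORIGIN, SPACING, SHAPE)
print(rank_poses(POSES, GRID, ORIGIN, SPACING))
```

The cost of building the grid is paid once per receptor; each pose is then scored by table lookups, which is what makes a systematic search over many orientations affordable.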
NATURE VOL 384 SUPP . 7 NOVEMBER 1996 23

FIG. 2 The binding of the hydroxylamine inhibitor U24522 to the stromelysin catalytic domain. a, Schematic of the metal-binding, hydrogen-bonding interactions and the major binding pockets; stromelysin catalytic domain active site, black; U24522 inhibitor, red. b, The S1' specificity pocket of stromelysin catalytic domain (blue) compared to that of another matrix metalloproteinase, fibroblast collagenase (yellow)55.

For example, the program DOCK20 creates a negative image of the target site and selects and ranks putative ligands on the basis of a comparison of internal distances. Up to 100,000 compounds can be examined in a week. However, difficulties arise in finding the proper conformation and in discriminating between putative interaction modes of similar energy. Procedures for matching involving genetic algorithms21 and graph theory22 can also be used to generate molecular structures within constraints of an enzyme active site or a receptor binding site.
Alternatively, fragments can be positioned in the binding cleft of macromolecules and then 'grown' to fill the space available, exploring the electrostatic, van der Waals or hydrogen-bonding interactions involved in molecular recognition7. For example, GROWMOL23 gives multiple highly diverse structures complementary to active sites, GenStar24 generates chemically reasonable structures from sp3 carbons to fill the binding site, whereas the multiple-copy simultaneous search (MCSS) method25 maps out the structure by determining energetically favourable positions and orientations of functional groups on the receptor surface. LUDI26 positions molecules or new substituents into clefts so that hydrogen bonds are formed and hydrophobic pockets are filled with hydrocarbon groups. A standard library of 1,000 fragments is used to fit the interaction sites and a further library of 1,200 link fragments is used to connect these into a single molecule. Such methods depend on the existence of large databases of small molecule structures such as the Cambridge Structure

Data Base, which contains 100,000 crystal structures27, or the Fine Chemicals Directory where molecular formulae can be automatically processed into a useful three-dimensional representation by CONCORD28.

FIG. 3 Structure of a, HIV proteinase and b, human renin complexed with inhibitors viewed along the active site. Figures derived from coordinates of a, 4hvp (ref. 52) and b, human renin CP85339 (ref. 56).

Comparative modelling on a common fold

Where there is no three-dimensional structure of the target, a protein with a similar fold can provide the basis for constructing a useful model29-31. For homologous proteins with sequence identities >30%, the common fold can be recognized by sequence searches. For more distantly related proteins, profiles or templates are useful in the search for the common fold and alignment of the sequences32-38. Once a related fold is identified, this can be used to model the three-dimensional structure. Most methods depend on the assembly of rigid fragments39-41, which are used in programs such as COMPOSER to define first the framework, second, the structurally variable, mainly loop regions and, third, the side chains29,42-44. An alternative approach, encoded in MODELLER45, seeks to satisfy structural restraints derived from homologues and other proteins and expressed as probability density functions. These modelling procedures are most successful where the percentage sequence identity to the unknown is high (greater than 40%)46. In the absence of a common fold, combinatorial approaches47, which bring together prediction of secondary and supersecondary structures with procedures for docking these together, can be used to predict the tertiary structure.

Successes, contributions and problems

Structure-based approaches have already played a role in the discovery of several drugs now in clinical use. One of the most notable examples has been the development of HIV inhibitors as AIDS antivirals48. An early report49 that retroviruses code a proteinase that is related to the aspartic proteinases was reminiscent of our earlier suggestion50 that aspartic proteinases are evolved from a symmetrical dimer. We suggested that HIV proteinase might be an analogous symmetrical dimer, and this was supported by several approximate three-dimensional models40,51 and later by X-ray structures48,52,53. This gave important clues about inhibitors. Structures of several hundred inhibitor complexes have been experimentally defined, providing a previously unparalleled structural database for design (see, for example, Fig. 3a). These complexes have exploited a range of different structural features, including 2-fold symmetry in the ligand, replacement of a bound water molecule, cyclization and replacement of scissile peptide bonds; see, for example, the cyclic, symmetrical inhibitor of the Dupont-Merck team54. Several studies have involved the use of programs such as DOCK or fragment searching to identify non-peptidic structures. Useful molecules are now exploited as cocktails in the treatment of AIDS, with encouraging results, although it is evident that mutation in HIV allows the virus to escape quickly if challenged with a single antiviral agent.
Similar approaches have been used to design inhibitors of a number of other drug targets. These include antihypertensives that inhibit human renin (Fig. 3b), anticancer and antiarthritis inhibitors of matrix metalloproteinases, such as collagenase and stromelysin (Fig. 2), selective immunosuppressants that target purine nucleotide phosphorylase, agents for treatment of the common cold that bind the rhinovirus canyon, and antiproliferative agents that inhibit thymidylate synthase (see ref. 1 for review). Most of these have been carried out in pharmaceutical companies where thousands of crystal structure analyses have been carried out in recent years. New compounds have been developed that would never have arisen from conventional drug discovery techniques.
One major challenge for drug discovery is a consequence of the very large surfaces that characterize many of the protein complexes involved in receptor recognition and signal transduction. This is illustrated by the diagram of the nerve growth factor interaction with its receptor tyrosyl kinase, Trk (Fig. 1). Not only would it appear difficult to bind a small molecule to the large, relatively flat surfaces of many proteins involved in protein interactions, in contrast to the deep clefts of many

enzymes, but it would also be difficult to disrupt the interaction entirely even if one did.

Future of structure-based design

Iterative optimization of lead compounds usually involves multidisciplinary design cycles (Fig. 4) starting from the cloning, expression, characterization and definition of the three-dimensional structure of the protein or nucleic acid, preferably as a complex with a ligand or a pseudo-substrate. This structure is the basis for suggesting modifications either to the ligand (to be introduced by the chemists) or to the macromolecule (to be introduced by the genetic engineer). The latter cycle will be of value when the molecule of interest is itself a protein, for example for 'humanizing' monoclonal antibodies, for engineering enzymes and for modifying polypeptide hormones, growth factors or cytokines. It is increasingly used to test hypotheses about drug-receptor interactions, by mutating key residues on the receptor topography. In a perfect world a single cycle should produce an improved molecule. In practice, designs are developed iteratively using several cycles, each providing small improvements.
Structure-based approaches will undoubtedly be important in the design of new proteins, drugs and vaccines. The structure-based approaches have provided many surprises and the subsequent experimental stages have been a constant reminder of the fragility of our predictive ability. Indeed the sceptical chemist will rightly still want to test some of the structurally less-likely options and the imaginative chemist will no doubt have his own intuition as well. Furthermore, there is one major shortcoming of a rational approach to design. If one company can arrive at a more effective drug through a rational approach, then so can a competitor. Most drug companies will feel happier with a lead that they have chanced upon randomly in a screen or by combinatorial chemistry as others are less likely to have found the same molecule. However, this may be just the start of a rational process in which the interactions of the lead molecule with its target receptor are defined as I have described and an improved molecule is designed using a more structure-based approach.

FIG. 4 A multidisciplinary design bi-cycle. CD, circular dichroism; NMR, nuclear magnetic resonance; EM, energy minimization; DG, distance geometry; NMA, normal mode analysis; MD, molecular dynamics.

Tom L. Blundell is in the Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QW, UK, and the Imperial Cancer Research Fund Unit of Structural Molecular Biology, Department of Crystallography, Birkbeck College, Malet Street, London WC1E 7HX, UK.

1. Whittle, P. J. & Blundell, T. L. A. Rev. Biophys. biomolec. Struct. 23, 349-375 (1994).
2. Bernstein, F. C. et al. J. molec. Biol. 112, 535-542 (1977).
3. Wuthrich, K. Les Cahiers Fondation Louis Jeantet 8, 1-16 (1993).
4. Clore, M. & Gronenborn, A. Prog. Biophys. molec. Biol. 62, 153-184 (1995).
5. Sussman, J. Prot. Databank Q. Newslett. 76 (April 1996).
6. Blundell, T. L. & Johnson, M. S. Protein Crystallography (Academic, London, 1976).
7. Cohen, N. C. & Tschinke, N. Progr. Drug Res. 45, 205-235 (1995).
8. Langridge, R. et al. Science 211, 661-686 (1981).
9. Connolly, M. L. Science 221, 709-713 (1983).
10. Gilson, M. K., Sharp, K. A. & Honig, B. H. J. comput. Chem. 9, 327-333 (1988).
11. Goodsell, D. S., Mian, I. S. & Olson, A. J. J. molec. Graphics 7, 41-47 (1989).
12. Busetta, B., Tickle, I. J. & Blundell, T. L. J. appl. Crystallogr. 16, 432-438 (1983).
13. Pattabiraman, N. et al. J. comput. Chem. 6, 432-439 (1985).
14. Tomioka, N., Itai, A. & Iitaka, Y. J. Comput. Aided molec. Design 1, 197-203 (1987).
15. Wodak, S. J. & Janin, J. J. molec. Biol. 124, 323-329 (1983).
16. Kuntz, I. D. et al. J. molec. Biol. 161, 269-278 (1982).
17. Goodford, P. J. J. med. Chem. 28, 849-857 (1985).
18. Cruciani, G. & Goodford, P. J. J. molec. Graphics 12, 116-129 (1994).
19. Pastor, M. & Cruciani, G. J. med. Chem. 38, 4637-4647 (1995).
20. Leach, A. R. & Kuntz, I. D. J. comput. Chem. 13, 730-748 (1992).
21. Payne, A. W. R. & Glen, R. C. J. molec. Graphics 11, 76-83 (1993).
22. Lewis, R. A. J. molec. Graphics 10, 31-38 (1993).
23. Bohacek, R. S. & McMartin, C. J. Am. chem. Soc. 116, 5560-5565 (1994).
24. Rotstein, S. H. & Murcko, M. A. J. Comput. Aided molec. Design 7, 23-43 (1993).
25. Miranker, A. & Karplus, M. Proteins 11, 29-34 (1991).
26. Bohm, H. J. J. Comput. Aided molec. Design 6, 61-78; 593-606 (1992).
27. Allen, F. H. et al. Acta crystallogr. B35, 2331-2339 (1979).
28. Rusinko, A. et al. J. chem. Inf. Comput. Sci. 29, 327-333 (1989).
29. Blundell, T. L. et al. Nature 326, 347-352 (1987).
30. Sali, A. et al. Trends biochem. Sci. 15, 235-240 (1990).
31. Johnson, M. S. et al. Crit. Rev. Biochem. molec. Biol. 29, 1-70 (1994).
32. Taylor, W. R. J. molec. Biol. 188, 233-258 (1986).
33. Gribskov, M. et al. Proc. natn. Acad. Sci. U.S.A. 84, 4355-4358 (1987).
34. Ponder, J. W. & Richards, F. M. J. molec. Biol. 193, 775-791 (1987).
35. Sippl, M. J. J. molec. Biol. 213, 859-883 (1990).
36. Jones, D. T., Taylor, W. R. & Thornton, J. M. Nature 358, 86-89 (1992).
37. Johnson, M. S., Overington, J. P. & Blundell, T. L. J. molec. Biol. 231, 735-752 (1993).
38. Bowie, J. U., Luthy, R. & Eisenberg, D. Science 253, 164-170 (1991).
39. Jones, T. H. & Thirup, S. EMBO J. 5, 819-822 (1986).
40. Blundell, T. L. et al. Eur. J. Biochem. 172, 513-520 (1988).
41. Claessens, M. et al. Prot. Engng 2, 335-345 (1989).
42. Sutcliffe, M. J., Hayes, F. R. F. & Blundell, T. L. Prot. Engng 1, 385-392 (1987).
43. Summers, N. L., Carlson, W. D. & Karplus, M. J. molec. Biol. 196, 175-198 (1987).
44. Topham, C. et al. J. molec. Biol. 229, 194-220 (1993).
45. Sali, A. & Blundell, T. L. J. molec. Biol. 234, 779-815 (1993).
46. Srinivasan, N. & Blundell, T. L. Prot. Engng 6, 501-512 (1993).
47. Presnell, S. R., Cohen, B. I. & Cohen, F. E. Biochemistry 31, 983-988 (1992).
48. Wlodawer, A. & Erickson, J. A. Rev. Biochem. 62, 543-585 (1993).
49. Toh, H., Ono, M., Saigo, K. & Miyata, T. Nature 315, 691 (1985).
50. Tang, J., James, J., Jenkins, J. A. & Blundell, T. L. Nature 271, 618-621 (1978).
51. Pearl, L. H. & Taylor, W. R. Nature 329, 351-364 (1987).
52. Miller, M. et al. Science 246, 1149-1152 (1989).
53. Lapatto, R. et al. Nature 342, 299-302 (1989).
54. Lam, P. Y. S. et al. Science 263, 380-384 (1994).
55. Dhanaraj, V. et al. Structure 4, 375-386 (1996).
56. Dhanaraj, V. et al. Nature 357, 466-472 (1992).

ACKNOWLEDGEMENTS. I thank J. Murray Rust, V. Dhanaraj and K. Guruprasad for preparation of figures and for helpful comments on the manuscript. I am grateful to J. Overington for providing an 'industry' view.

