Protein Modelling

Comparative Protein Modelling
Gajigan | Lopez | Palmario | Tan | Sotelo
THE PROBLEM
Can we predict the 3-dimensional shape of a protein given its amino acid sequence alone?
Generally, no.
But some methods give a partial description of the 3D structure of proteins.
PROTEINS
Amino acids + peptide bonds = proteins
R groups distinguish different amino acids

Fig. 1 General structure of an amino acid
PROTEINS
DIFFERENT AMINO ACIDS
WHAT DETERMINES PROTEIN FOLDING?
Generally, the amino acid (aa) sequence determines the 3D shape
Exceptions:
- Protein denaturation
- Multiple conformations
- Chaperones
What determines protein fold?
- Rigidity of the backbone
- Interactions among amino acids
- Amino acid interactions with water
LEVELS OF PROTEIN STRUCTURE
SECONDARY STRUCTURE
Folding of the linear sequence of proteins into regular repeating patterns:
- α helix
- β sheets
- Coil or loop
SECONDARY STRUCTURES
Beta pleated sheet conformation
Alpha helix conformation
DETERMINING PROTEIN STRUCTURE

- X-ray crystallography
- NMR
- Prediction by computational means

PDB Yearly Growth of Protein Structures
CATH TAXONOMY
Database containing hierarchical domain classifications of protein structures from the PDB
- C: Class (C-level). Defined by secondary structure composition
- A: Architecture (A-level). Defined by overall shape of domain structure
- T: Topology or fold family (T-level). Defined by overall shape and connectivity of domain structures
- H: Homologous superfamily (H-level)
PROTEIN STRUCTURE PREDICTION
Prediction in 1D
- Secondary structure
- Solvent accessibility
- Transmembrane helices
Prediction in 2D
- Inter-residue/strand contacts
Prediction in 3D
- Homology modeling
- Fold recognition
- Ab initio prediction
1D SECONDARY STRUCTURE
Given: amino acid sequence
What to do? Predict the secondary structure conformation of each amino acid: α (helix), β (strand), or c (coil)
SECONDARY STRUCTURE PREDICTION : ANOTHER APPROACH
- Make a prediction for a given residue by considering a window of n neighboring residues
- Determine a model that performs the mapping from a window of residues to a secondary structure state
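The window idea can be sketched in a few lines of Python. This is only a toy illustration: the residue groupings, the 0.6 threshold, and the state letters "H", "E" and "c" (for α, β and coil) are assumptions made for the example, not a trained or published model.

```python
# Toy sliding-window secondary structure predictor (illustrative only).
# The residue groupings are rough assumed tendencies, not fitted parameters.
HELIX_FORMERS = set("AELMQKRH")
STRAND_FORMERS = set("VIYFWT")


def windows(sequence, n):
    """Yield (index, window); each window holds up to n residues centred on index."""
    half = n // 2
    for i in range(len(sequence)):
        start = max(0, i - half)
        end = min(len(sequence), i + half + 1)
        yield i, sequence[start:end]


def predict_state(window):
    """Map a window of residues to 'H' (helix), 'E' (strand) or 'c' (coil)."""
    helix = sum(res in HELIX_FORMERS for res in window) / len(window)
    strand = sum(res in STRAND_FORMERS for res in window) / len(window)
    if helix >= 0.6 and helix > strand:
        return "H"
    if strand >= 0.6 and strand > helix:
        return "E"
    return "c"


def predict_secondary_structure(sequence, n=5):
    """Predict one state per residue from its window of n neighbours."""
    return "".join(predict_state(w) for _, w in windows(sequence, n))
```

In practice the hand-written `predict_state` rule would be replaced by a model learned from proteins of known structure; the windowing step stays the same.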
Homology Modelling
Protein Structure Prediction
Protein Homology Modelling
Based on the observation that protein structure is more conserved than protein sequence
Assumption: homologous protein sequences --> very similar 3D structures
Most accurate when the target and template have similar sequences
Template: a homologous protein whose structure was determined using high-resolution experimental methods (e.g., X-ray crystallography or NMR)
Steps in Homology Modelling
Selection of Template with known tertiary structure
Use sequence alignment search programs (e.g. BLAST) to identify homologous sequences in protein structure databases like PDB
Selection of template can be:
- Select the template with the highest sequence identity
- Select a potentially different template for each similar segment of the protein sequence
Other factors in selecting a template
- Resolution of the template structure: better to use high-resolution structures as the model template
- Other sources of similarity: function, ligands, environment
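A minimal sketch of the "highest sequence identity" rule, with resolution as a tie-breaker. The template names, aligned sequences and resolutions are made up for illustration, and the identity definition (identical residues over non-gap columns) is one of several in common use.

```python
from dataclasses import dataclass


@dataclass
class Template:
    name: str              # hypothetical identifier, not a real PDB entry
    aligned_target: str    # target row of the pairwise alignment ('-' = gap)
    aligned_template: str  # template row of the same alignment
    resolution: float      # in angstroms; lower is better


def percent_identity(a, b):
    """Identical residues divided by aligned (non-gap) columns, as a percentage."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def rank_templates(templates):
    """Highest identity first; ties broken by better (lower) resolution."""
    return sorted(
        templates,
        key=lambda t: (-percent_identity(t.aligned_target, t.aligned_template),
                       t.resolution),
    )


candidates = [
    Template("tmplA", "MAAGYAVLS", "MAAGYSVLT", 2.8),
    Template("tmplB", "MAAGYAVLS", "MKAG-AILS", 1.9),
    Template("tmplC", "MAAGYAVLS", "MAAGYSVLT", 1.5),
]
best = rank_templates(candidates)[0]
```

Here `tmplA` and `tmplC` tie on identity, so the better-resolved `tmplC` is ranked first.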

Aligning protein sequence with templates
Target and template are aligned using:
- Pair-wise alignment (e.g. Smith-Waterman)
- Multiple sequence alignment (e.g. CLUSTAL)
Accuracy of the alignment --> critical parameter for successful homology modelling
Alignment methods can:
- maximize sequence similarity (typical)
- maximize structural similarity (others)
The alignment defines structurally equivalent positions
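As a rough illustration of pair-wise alignment, here is a compact Smith-Waterman local alignment. The scoring values (match 2, mismatch -1, linear gap -2) are assumptions for the example; real tools use substitution matrices such as BLOSUM and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # dynamic programming score matrix
    best, best_pos = 0, (0, 0)

    # Fill the matrix; local alignment never lets a cell drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two returned strings pair up residues column by column, which is what the modelling step treats as structurally equivalent positions.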
Building of model structures
Build a model using the known structures of the homologous template proteins
Common modelling methods:
- by assembly of rigid bodies (e.g., COMPOSER, SWISS-MODEL)
- by segment matching or coordinate reconstruction (e.g., SEGMOD)
- by satisfaction of spatial restraints (e.g., MODELLER)
Modelling by assembly of rigid bodies
The model is assembled from a small number of rigid bodies obtained from the aligned protein structures
Proteins can be dissected into:
- conserved core regions
- variable loops that connect the conserved core regions
- side chains that decorate the backbone
Procedure:
- Template structures are selected and superposed
- The framework is obtained by averaging the coordinates of the atoms of the structurally conserved regions
- Loops are generated so that they fit the anchor core regions and have a compatible sequence
- Side chains are modelled based on their intrinsic conformational preferences
Programs: COMPOSER and SWISS-MODEL
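The framework-averaging step can be sketched as follows, assuming the templates have already been superposed and that each one lists the same structurally equivalent atoms in the same order. The function name and data layout are hypothetical; this is not the actual procedure of COMPOSER or SWISS-MODEL.

```python
def average_framework(superposed_templates):
    """Average per-atom coordinates across already superposed templates.

    Each template is a list of (x, y, z) tuples for the same equivalent
    atoms of the conserved core, given in the same order.
    """
    n_atoms = len(superposed_templates[0])
    if any(len(t) != n_atoms for t in superposed_templates):
        raise ValueError("all templates must list the same equivalent atoms")
    n = len(superposed_templates)
    return [
        tuple(sum(t[i][k] for t in superposed_templates) / n for k in range(3))
        for i in range(n_atoms)
    ]


# Two made-up templates with two equivalent atoms each.
framework = average_framework([
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(2.0, 2.0, 0.0), (4.0, 2.0, 0.0)],
])
```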
Modelling by segment matching or coordinate reconstruction
Based on the finding that most hexapeptide segments of protein structure can be clustered into only about 100 structurally different classes
- Segments on the template, usually the conserved segments, serve as guiding positions
- Segments of the target protein that fit these guiding positions are identified and assembled
- The protein model is constructed
Program: SEGMOD
Modelling by satisfaction of spatial restraints
Starts by generating many constraints or restraints on the structure of the target sequence
Restraints are obtained by:
- assuming that the corresponding distances between aligned residues in the template and the target structures are similar
- considering stereochemical restraints on bond lengths, bond angles, dihedral angles, and non-bonded atom-atom contacts
The model is then derived by minimizing the violations of all the restraints, which is achieved either by distance geometry or real-space optimization
Software used: MODELLER
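A minimal sketch of "minimizing the violations of all the restraints", using only distance restraints and plain gradient descent in real space. The starting coordinates, target distances, step size and step count are all made up for the example, and gradient descent is swapped in for simplicity; this does not reproduce MODELLER's restraint functions or optimizer.

```python
import math


def violation(coords, restraints):
    """Sum of squared differences between current and target distances."""
    return sum((math.dist(coords[i], coords[j]) - target) ** 2
               for i, j, target in restraints)


def minimize(coords, restraints, steps=2000, rate=0.05):
    """Move points downhill on the violation function by gradient descent."""
    coords = [list(p) for p in coords]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, target in restraints:
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue  # direction undefined for coincident points
            factor = 2.0 * (d - target) / d
            for k in range(3):
                g = factor * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= rate * g[k]
    return [tuple(p) for p in coords]


# Hypothetical restraints (i, j, target distance in angstroms) on four points,
# as might be read off aligned template residues.
restraints = [(0, 1, 3.8), (1, 2, 3.8), (2, 3, 3.8),
              (0, 2, 5.5), (1, 3, 5.5), (0, 3, 6.0)]
start = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, 0.0, 0.5), (3.0, 0.5, 0.5)]
model = minimize(start, restraints)
before, after = violation(start, restraints), violation(model, restraints)
```

Comparing `before` and `after` shows how far the optimization reduced the total restraint violation.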
Evaluation of the constructed model
The validity of the constructed model must be checked
Evaluate the stereochemistry and other structural features of the model (e.g., bond lengths, dihedral angles, side chain rotamers, etc.)
Examples of programs: PROCHECK and WHATCHECK
Checking of spatial features of the model:
hydrophobic core, solvent accessibility, distribution of charged groups, atom-atom distances, atomic volumes and main-chain hydrogen bonding
A number of online servers are available to evaluate 3D models, including PSVS, Eval123D and JCSG.
The final model must be consistent with experimental observations, such as:
- site-directed mutagenesis
- cross-linking data
- ligand binding
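One very simple stereochemical check can be sketched directly: consecutive Cα atoms joined by a trans peptide bond sit about 3.8 Å apart, so large deviations hint at chain breaks or modelling errors. The tolerance and the coordinates below are assumptions for illustration, and this is in no way a substitute for PROCHECK or WHATCHECK.

```python
import math


def check_ca_distances(ca_coords, expected=3.8, tolerance=0.2):
    """Return (index, distance) for consecutive CA pairs outside expected +/- tolerance.

    ca_coords is a list of (x, y, z) tuples in chain order, in angstroms.
    Cis peptide bonds (shorter CA-CA distance) are also flagged by this check.
    """
    problems = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - expected) > tolerance:
            problems.append((i, round(d, 2)))
    return problems


# Made-up trace: the last residue is placed far too far from its neighbour.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 9.0)]
flagged = check_ca_distances(trace)
```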
Common Errors in Homology Modelling
An inaccurate or incorrect model may arise from:
- mistakes in aligning the sequence to the template
- selecting the wrong template
- errors in modelling side chains
- errors in modelling sequence segments without a template
Limitations of Homology Modelling
- Large bias towards the template
- Can't study conformational changes
- Can't find new catalytic or active sites
- Can't explain the activity or lack of activity of the protein
Protein Threading:
What It Is, When To Do It and How It Is Done
Homology modeling has its limitations, so protein threading makes up for them.
So, when should we do it?
1. We have a sequence of unknown structure.
2. The sequence has no detectable homology to anything of known structure.
3. There are no functional clues as to the structural class of the unknown.
But these situations aren't always recognised.
The ideas behind protein threading

- There is a limited number of basic folds found in nature (1,000 to 10,000).
- Amino acid preferences for different structural environments provide sufficient information to choose among folds.
In other words, Protein Threading is a knowledge-based technique.
But what exactly is it?
So, protein threading is this: since there are only a limited number of folds in nature, we can find candidate folds, thread a protein through them, and score the protein's fit.
It's complicated and going obsolete, but here's how it works.
First, we have a sequence: M A A G Y A V L S
Second, we run it through candidate folds.
It won't work without math.
Third, we perform complicated math.
Finally, we get the highest score!
Here's our structural model.
How the scoring works
Score function: measures the match between the unknown sequence and the target sequence.
- Unknown sequence: the number of amino acids of type i in environment m
- Target sequence: the score for amino acids of type i in environment m
Let's see how it works.

Here's a sequence with unknown structure.
But the good thing is, we know the characteristics of the amino acids present.
Legend: H bond donor, H bond acceptor, Glycine, Hydrophobic
Let's run the sequence through a library of folded proteins.

Candidate #1: S = 20
Candidate #2: S = 5
Candidate #3: S = -3
And the winner is CANDIDATE #1.
Here's a sample scoreboard.
[Scoreboard figure: amino acid type against position on the sequence]
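The "thread, score, pick the winner" loop can be sketched like this. The environment names, the score table and the candidate folds are all invented for illustration, and the sketch threads without gaps and ignores interactions between neighboring residues, which real threading programs do take into account.

```python
# Hypothetical environment-specific scores: SCORES[environment][amino_acid].
SCORES = {
    "buried":  {"V": 2, "L": 2, "A": 1, "G": 0, "S": -1, "K": -2},
    "exposed": {"V": -1, "L": -1, "A": 0, "G": 0, "S": 1, "K": 2},
}


def thread_score(sequence, fold_environments, default=0):
    """Sum the score of each residue in the environment its position falls in."""
    if len(sequence) != len(fold_environments):
        raise ValueError("this ungapped sketch needs one environment per residue")
    return sum(SCORES[env].get(res, default)
               for res, env in zip(sequence, fold_environments))


def best_fold(sequence, fold_library):
    """Score the sequence against every candidate fold and return the winner."""
    scores = {name: thread_score(sequence, envs)
              for name, envs in fold_library.items()}
    winner = max(scores, key=scores.get)
    return winner, scores


library = {
    "candidate_1": ["buried", "buried", "exposed", "exposed"],
    "candidate_2": ["exposed", "exposed", "buried", "buried"],
}
winner, scores = best_fold("VLSK", library)
```

With these made-up numbers the hydrophobic residues V and L score well when buried and the polar S and K score well when exposed, so `candidate_1` wins.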
The scores do have a basis

So, when do we get a high score?
We get it when the sequence of amino acids in the unknown corresponds closely to that of the target sequence.
The factors that account for the correspondence are as follows:
- amino acid preferences for solvent accessibility
- amino acid preferences for particular secondary structures
- interactions among spatially neighboring amino acids
Protein threading will be obsolesced without ever really having had a phase of glory. (Torda, 2003)
Less than 30% of the predicted first hits are true remote homologues.
So, let's not waste time on protein threading when:
- The sequence already has a very high homology with a known structure.
- The protein has unusual characteristics.
- It doesn't have a structure in the presence of a cofactor or a prosthetic group.
- The protein is membrane-bound (the scoring function assumes that water serves as the solvent).
But then again, times have changed. Tools now exist to get reliable scores.
Applications and Innovations
Application
Usually, homology modeling is applied in the following fields:
1. Drug design
2. Analysis of protein function
   a) Protein interaction
   b) Antigenic behavior of proteins
   c) Protein stability studies
3. Alternative path to experimental design
Innovation
MODELLER
- Freely available for academic use
- Can be used for protein modelling and docking
- Produces output models that do not include H atoms
- Flexible
Thank you!
