A better understanding of the properties of enzyme active sites and their catalytic mechanisms is vital for novel protein design and for predicting protein function from structure.
Crystallographic and NMR studies of enzymes have shed light on the relationship between an enzyme’s three-dimensional structure and the chemical reaction it performs.
However, extrapolating a catalytic mechanism from a structure alone is a challenging task.
The following properties of catalytic residues are examined: frequency distribution of residue type, function, secondary structure environment, solvent accessibility, flexibility, conservation, hydrogen bonding and quaternary structure.
Definition of catalytic residues:
1. Direct involvement in the catalytic mechanism, e.g. as a nucleophile.
2. Exerting an effect that aids catalysis on another residue or water molecule that is directly involved in the catalytic mechanism (e.g. by electrostatic or acid–base action).
3. Stabilisation of a proposed transition-state intermediate.
4. Exerting an effect on a substrate or cofactor which aids catalysis, e.g. by polarising a bond which is to be broken. Includes steric and electrostatic effects.
Each enzyme has an average of 3.5 catalytic residues.
Frequency distribution and secondary structure
Charged residues (H, R, K, E, D) provide 65% of catalytic residues, polar residues (Q, T, S, N, C, Y, W) provide 27%, and hydrophobic residues provide just 8%.
This is as expected: catalysis involves the movement of protons and electrons and the stabilisation of charge, which require electrostatic forces provided by charged and/or polar residues. There is no correlation between percentage abundance in the dataset and contribution to catalysis.
Table: Catalytic residue types and their secondary structure compared with all residues in the dataset
Columns: catalytic residue type (charged, polar, hydrophobic; %) and secondary structure environment (alpha helix, beta sheet, coil; %).
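The grouping used above can be sketched in a few lines of Python. The charged and polar sets are taken from the text; the hydrophobic set is an assumption (the remaining eight standard residues), and the function names and the example residue list are hypothetical, not taken from the paper.

```python
from collections import Counter

# Residue groups as one-letter codes. Charged and polar sets follow the text.
CHARGED = set("HRKED")
POLAR = set("QTSNCYW")
# Assumption: the hydrophobic group is whatever remains of the 20 standard residues.
HYDROPHOBIC = set("GFLMAIPV")


def residue_group(code: str) -> str:
    """Return 'charged', 'polar' or 'hydrophobic' for a one-letter residue code."""
    code = code.upper()
    if code in CHARGED:
        return "charged"
    if code in POLAR:
        return "polar"
    if code in HYDROPHOBIC:
        return "hydrophobic"
    raise ValueError(f"not a standard residue code: {code!r}")


def group_percentages(residues) -> dict:
    """Percentage of residues falling in each group."""
    counts = Counter(residue_group(r) for r in residues)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues given")
    return {g: 100.0 * counts[g] / total for g in ("charged", "polar", "hydrophobic")}


# Made-up list of catalytic residues, for illustration only.
print(group_percentages("HDSHEKGC"))
```

Applied to a real set of annotated catalytic residues, the same tally would give the kind of charged/polar/hydrophobic split quoted above.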
Histidine constitutes 18% of all catalytic residues in proteins, although it has a low overall percentage abundance (2.7%). Histidine is particularly suitable for carrying out catalytic reaction steps, as it can be either charged or neutral at physiological pH and can act as a nucleophile, an acid or a base, or help stabilise the transition state of a reaction.
Aspartate and glutamate residues constitute 15% and 11% of catalytic residues, respectively. Their natural abundance is almost identical (5.7% and 5.9%, respectively). Aspartate may be slightly favoured over glutamate because its side-chain is shorter by one methylene group, which makes it less flexible and easier to hold in place, aiding catalysis.
Arginine and lysine constitute 11% and 9% of catalytic residues, respectively. Arginine occurs more frequently in spite of its lower natural abundance in the dataset (4.9% for arginine and 5.8% for lysine). This preference may be due to the three nitrogen groups in the side-chain, all of which can perform electrostatic interactions, compared with just one in the side-chain of lysine.
Additionally, since the side-chain of arginine can make more electrostatic interactions, it can be positioned more accurately to facilitate catalysis. The arginine side-chain also has a good geometry to stabilise a pair of oxygen atoms on a phosphate group, a common biological moiety.
Cysteine (“free” cysteine residues) constitutes 5.6% of catalytic residues, while its natural abundance is only 1.2%.
Histidine and cysteine residues have the highest propensities, followed by the rest of the charged residues. Glutamate moves down in the order due to its higher abundance compared with arginine. The charged residues are followed by the polar residues. Tryptophan is ninth out of the 20 residues, an unusually high position.
After the polar residues come the rest of the hydrophobic and aromatic residues, as expected.
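The propensity ordering can be checked against the percentages quoted above. The sketch below assumes that propensity means the share of catalytic residues divided by the overall abundance in the dataset; the text does not state the formula, so this definition and the function names are assumptions. Only the six residues for which both figures are quoted are included.

```python
# (share of catalytic residues %, abundance in dataset %) as quoted in the text.
QUOTED = {
    "His": (18.0, 2.7),
    "Asp": (15.0, 5.7),
    "Glu": (11.0, 5.9),
    "Arg": (11.0, 4.9),
    "Lys": (9.0, 5.8),
    "Cys": (5.6, 1.2),
}


def propensity(catalytic_pct: float, abundance_pct: float) -> float:
    """Assumed definition: catalytic share divided by overall abundance."""
    if abundance_pct <= 0:
        raise ValueError("abundance must be positive")
    return catalytic_pct / abundance_pct


def rank_by_propensity(table: dict) -> list:
    """Residue names sorted from highest to lowest propensity."""
    return sorted(table, key=lambda name: propensity(*table[name]), reverse=True)


for name in rank_by_propensity(QUOTED):
    print(f"{name}: {propensity(*QUOTED[name]):.2f}")
```

Under this definition histidine and cysteine come first, and arginine ranks above glutamate even though both supply 11% of catalytic residues, which matches the ordering described in the text.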
Side-chain and main-chain interactions
Of the catalytic residues, 92% use their side-chain, while 8% use the main-chain. Of those using the main-chain, 82% use the N–H group and 18% use the C=O group. Main-chain groups often stabilise transition state intermediates, e.g. Gly30 in phospholipase A2.
Glycine constitutes by far the highest proportion of catalytic residues using the main-chain (44%).
Solvent accessibility
Of the catalytic residues, 89% have a relative solvent accessibility (%RSA, measured relative to a fully exposed residue) of less than 30%.
Approximately 50% of all catalytic residues fall in the 0–10% bracket, and approximately 25% in the 10–20% bracket. 5% of all catalytic residues have 0% RSA and are totally buried. One might expect to find all catalytic residues fully exposed on the surface of the protein, but the results show that this is not the case. Most catalytic residues have very little exposure to solvent. The major factor could be the need for correct positioning and restricted mobility of catalytic residues.
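The bracket statistics above can be reproduced from per-residue accessibility values with a short script. This is a minimal sketch under stated assumptions: RSA is taken as 100 × (accessible surface area) / (accessible surface area of the fully exposed residue), brackets are half-open intervals 10% wide, and values of 100% or more are placed in the top bracket. The function names and the example values are hypothetical.

```python
from collections import Counter


def relative_accessibility(asa: float, max_asa: float) -> float:
    """%RSA: accessible surface area as a percentage of the fully exposed value."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return 100.0 * asa / max_asa


def bracket(rsa: float, width: int = 10) -> tuple:
    """Half-open bracket [lo, lo + width) containing rsa; values >= 100 go in the top one."""
    if rsa < 0:
        raise ValueError("RSA cannot be negative")
    lo = min(int(rsa // width) * width, 100 - width)
    return (lo, lo + width)


def summarise(rsa_values: list) -> dict:
    """Share of totally buried residues, share below 30% RSA, and a bracket histogram."""
    n = len(rsa_values)
    if n == 0:
        raise ValueError("no RSA values given")
    hist = Counter(bracket(v) for v in rsa_values)
    return {
        "buried_pct": 100.0 * sum(v == 0 for v in rsa_values) / n,
        "below_30_pct": 100.0 * sum(v < 30 for v in rsa_values) / n,
        "histogram_pct": {b: 100.0 * c / n for b, c in sorted(hist.items())},
    }


# Made-up %RSA values, for illustration only.
example = [0.0, 4.0, 8.5, 12.0, 19.9, 25.0, 31.0, 55.0]
print(summarise(example))
```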
Flexibility
B-factors in the crystal structures were used as a measure of residue flexibility. Catalytic residues tend to have lower B-factors than all residues, suggesting that they have to be more rigidly held in place than the average residue. Catalytic residues become slightly more “fixed” only when the substrate or cofactor is bound.
Catalytic arginine, lysine, aspartate and glutamate residues all have much lower B-factors than average. Arginine could have one or two nitrogen groups tethered while the others perform the catalytic function. Lysine, which normally has a very flexible side-chain, has to be tethered for catalysis. For glutamate and aspartate residues, one of the oxygen atoms of the carboxylic acid group can be tethered whilst the other performs its catalytic function. The distribution of B-factors for catalytic histidine and cysteine residues is more similar to that of all histidine and cysteine residues. This could be because a higher proportion of these residues are catalytic.
Conservation
Catalytic residues are clearly more conserved than the average residue. The conservation of residues falls steadily as the distance from the catalytic residues increases. This highlights the strong selection pressures on catalytic residues compared with other residues in the vicinity of the active site, which will be important for substrate recognition.
Efficient catalysis depends on exquisite positioning of critical atoms, which can often only be achieved by using specific amino acid residues (e.g. aspartate instead of glutamate). Additionally, residues structurally close to catalytic residues are more conserved than those close to them in the amino acid sequence. One caveat is that enzyme active sites are not necessarily spherical, so a sphere drawn around the catalytic residues may also pick up some buried core residues which are conserved because they are essential for maintaining the structural integrity of the protein.
Hydrogen bonding
Hydrogen bonds to water molecules were excluded from this part of the analysis, although these are often critical components of catalysis. Of 598 catalytic residues considered, the majority (93%) enter into at least one hydrogen bond interaction, whether as a donor or an acceptor. This shows that catalytic residues have limited conformational freedom. Overall, 84% of residues make at least one hydrogen bond via either their N–H or C=O group, while 75% of residues make at least one hydrogen bond via a side-chain atom. This suggests that the residue conformation is usually strongly tethered for both the main-chain and the side-chain.
Of the residues making hydrogen bonds via the N–H or C=O groups, almost all (97%) hydrogen bond to another residue in the protein, and a very small proportion (8%) hydrogen bond to a ligand. Most of these hydrogen bonds will probably be necessary to maintain positioning of the catalytic residues.
Of the residues making hydrogen bonds via side-chain atoms, almost all (94%) hydrogen bond with other amino acid residues in the protein. A relatively small proportion form a hydrogen bond with a ligand (19%).
All residues taking part in catalysis via their main-chain groups form hydrogen bonds with the protein (94%) or with the ligand (41%). Again, these residues have tethered conformations. Only 21% form side-chain hydrogen bonds, reflecting in part the high percentage of glycine residues, but also the non-involvement of the side-chain in catalysis.
Quaternary structure/domain usage
Almost all enzymes in our dataset (159) have their active site contained within just one subunit, with only 19 out of 178 enzymes (11%) having catalytic residues in more than one subunit of the enzyme, i.e. the active site is at the interface of two subunits. Of these 19 enzymes, 17 have catalytic residues split between two subunits, while just two have catalytic residues split between three subunits.
Bartlett GJ, Porter CT, Borkakoti N, Thornton JM. Analysis of catalytic residues in enzyme active sites. J Mol Biol. 2002;324(1):105-21
Source: NovoPro 2019-05-22