

The dipeptide conformations of all twenty amino acid types in the context of biosynthesis



There have been many studies of dipeptide structure at a high level of accuracy using quantum chemical methods. Such calculations are resource-intensive (in terms of memory, CPU time and other computational demands), which is why most previous studies were restricted to the two simplest amino acid residue types, glycine and alanine. We improve on this by extending the scope to include all 20 naturally occurring residue types. Our results reveal differences in secondary structure preferences across the residue types. In most cases there are very deep energy troughs corresponding to the polyproline II (collagen) helix, the α-helix, or both. The β-strand is not strongly favoured energetically: its depression in the energy surface is not as deep as those of the other two types of secondary structure, but it covers a wider area. There is currently great interest in the question of cotranslational folding, that is, the extent to which the nascent polypeptide begins to fold before emerging from the ribosome exit tunnel. Accordingly, while most previous quantum studies of dipeptides were carried out in the (simulated) gas or aqueous phase, we wished to consider the first step in polypeptide biosynthesis on the ribosome, where neither gas nor aqueous conditions apply. We used a dielectric constant compatible with the water-poor macromolecular (ribosome) environment.


There are many reasons for the strong interest in calculating peptide conformations (Gould et al. 1994; Wu et al. 2010; Bellesia et al. 2010; Hovmoller et al. 2002; Bywater and Veryazov 2013; Carrascoza et al. 2014). These include the need to understand the preferred conformations of physiologically active peptides, the way peptides are incorporated into polypeptide and protein structures, and the conformation of peptides formed de novo in the ribosome. Most previous studies (Gould et al. 1994; Wu et al. 2010) were concerned with small peptides per se, in the gas or aqueous phase, whereas we address the last of these questions, that of peptide biosynthesis.

Throughout this and our previous work, and in keeping with the usage adopted by previous authors (Gould et al. 1994; Bywater and Veryazov 2013; Carrascoza et al. 2014), we study constructs that we refer to as primitive dipeptides, with N-acetyl-(XXX)(2)-N′-methylamine as the generic structure, in which XXX represents the defining amino acid residue type for the particular dipeptide. In this context, N-acetyl is employed as a surrogate for the first amino acid residue in the dipeptide. Although not a true amino acid residue as such, it is needed, together with the C-terminal amide group, to provide the correct electronic arrangement for a dipeptide and to block zwitterion formation. We chose to study all twenty members of the canonical set of amino acid types. A previous publication (Carrascoza et al. 2014) also reported studies of the entire set of amino acids, with somewhat different results, as discussed below.

As noted in earlier papers (Bywater et al. 2001; Bellesia et al. 2010; Hovmoller et al. 2002; Bywater and Veryazov 2013; Carrascoza et al. 2014), different residue types have different propensities to adopt one or other of the regularly repeating polypeptide structures: α-helix, 3₁₀ helix, polyproline II helix (here abbreviated as PP-helix) or β-strand (Bywater et al. 2001; Liljas et al. 2009). These preferences are, however, not absolute. They can vary according to context: both near neighbours and internal 3D contacts can affect the outcome. In many different areas of protein science it is of interest to know the energetic differences between these conformations. In the area we wish to investigate, that of the conformation adopted by newly synthesized peptides on the ribosome, it has previously been proposed (Lim and Spirin 1984, 1986) that the α-helix is the predominant structure. Our earlier results (Bywater and Veryazov 2013) support this prediction, but an alternative, the PP-helix, emerged as an equally likely and in some cases stronger contender. Furthermore, any extended α-helix would be vulnerable to disruption upon the appearance of a proline residue (Bywater et al. 2001). These reflections added to the importance of studying all twenty amino acid types so as to see how these preferences are distributed throughout the entire set. It is important to note that while the α-helix and the β-strand are, either singly or in combination, by far the most common secondary structure types found in globular proteins (membrane proteins are either all-α-helix or all-β-strand), for fibrous proteins the converse is true: these are typically proline- and glycine-rich structures similar to the PP-helix, which plays a prominent role. The protein biosynthesis machinery must be able to cater for both classes of protein.


We constructed the starting structures for each of the many thousands of calculations in the same way as before (Bywater and Veryazov 2013) using the Yasara protein modelling program package (Krieger et al. 2002). A complete set of conformers was constructed for each amino acid type, whereby the Ci-1–Ni–CAi–Ci angle (the φ angle) was stepped through at intervals of 9° (40 steps) while for each φ rotamer the Ni–CAi–Ci–Ni+1 angle (ψ) was stepped through 40 steps of 9°. This produced a total of 1681 (41 × 41) structures for each amino acid type (41 for the special case of proline). In contrast to certain other studies (e.g. Carrascoza et al. 2014) there was no attempt to optimize these input structures. Instead a so-called rigid scan regime was imposed whereby, for each amino acid type, the rotameric state of the side chain was maintained while the φ,ψ angles were changed. This was considered essential in order to be able to make like-for-like comparisons for each amino acid type at these different backbone angles. If the side chain rotameric state were allowed to relax for each backbone geometry, that would produce an energy-minimized structure, but it would be a rather uninteresting object of study because it could not be compared with the thousands of other backbone geometries. Furthermore, the minimum side chain energy state may or may not be relevant at all. For all but the “smallest” side chain types there are multiple rotameric states that are accessible (Ponder and Richards 1987; Pupo and Moreno 2009). It would be impossible to cater for all of them.
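
The scan described above can be sketched in a few lines. The following is a minimal illustration, not the authors' actual workflow (which used Yasara to build the structures): `dihedral` computes a torsion angle from four atom positions, such as Ci-1, Ni, CAi, Ci for φ, and `scan_grid` enumerates the φ,ψ pairs. The function names, the 9° default step (chosen because 41 × 41 values reproduce the stated total of 1681) and the placeholder φ of −66° for proline are assumptions made for the example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def scan_grid(step=9, fixed_phi=None):
    """List of (phi, psi) pairs in degrees; fixed_phi restricts phi to one value."""
    angles = list(range(-180, 181, step))  # both end points included
    phis = [fixed_phi] if fixed_phi is not None else angles
    return [(phi, psi) for phi in phis for psi in angles]


full = scan_grid()             # general case: every phi with every psi
pro = scan_grid(fixed_phi=-66)  # proline-like case: phi held at a placeholder value
print(len(full), len(pro))
```

In a rigid scan of this kind each grid point would be turned into one input structure by setting the two backbone torsions and leaving every other internal coordinate untouched.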

For each of these conformers, DFT calculations with the B3LYP functional and the ANO-L-VDZP basis set were performed using Molcas 7.8 (Aquilante et al. 2010). The PCM model was used to simulate solvation effects (Karlström et al. 2003; Pomelli and Tomasi 1997). As before (Bywater and Veryazov 2013), we selected a dielectric constant of 2.5 to reflect the water-poor environment of the peptidyltransferase site and the extremely slow tumbling rate of an object as large as a ribosome.
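
To give a feel for what a dielectric constant of 2.5 implies, the sketch below compares the Coulomb interaction of two point charges in media of different permittivity. This is a deliberately crude illustration and not the PCM model used in the calculations; the function name, the unit charges, the 4 Å separation and the value of 78.4 taken for bulk water are assumptions chosen for the example.

```python
COULOMB_KCAL = 332.06  # Coulomb constant in kcal*angstrom/(mol*e^2)


def coulomb_energy(q1, q2, r_angstrom, epsilon):
    """Interaction energy (kcal/mol) of two point charges in a uniform dielectric."""
    return COULOMB_KCAL * q1 * q2 / (epsilon * r_angstrom)


# Gas phase, the water-poor value used in the text, and (assumed) bulk water.
for eps in (1.0, 2.5, 78.4):
    energy = coulomb_energy(+1.0, -1.0, 4.0, eps)
    print(f"epsilon = {eps:5.1f}: {energy:8.2f} kcal/mol")
```

In this simple picture ε = 2.5 scales the interaction to 40 % of its gas-phase value, which is far less screening than bulk water would provide.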


The results of our calculations for the 20 residue types are presented in the form of Ramachandran-style energy surface plots for each residue type and a table that summarizes the salient features of each of these plots. Some auxiliary information is required as a preliminary; this is provided in the first two figures. Figure 1 is a graphical overview Ramachandran plot showing the φ,ψ positions of the 50 lowest-energy conformers for all amino acid types except G and P. The locations of the three classical secondary structure types α-helix, 3₁₀ helix and PP-helix are shown by colored triangles (see caption to Fig. 1). Figure 2 focuses on the forbidden regions. This is intended to highlight some characteristics of certain residue types (in particular I, V, T and D) and to explain some features that turn up in Figs. 3, 4, 5 and 6. Further details are given in the caption to the figure. The grid and axis markings of Fig. 1 can be used for scaling the 20 plots in Figs. 3, 4, 5 and 6 [the β-strand region (not marked in the figure) covers a very wide range 100° < φ < 180°, 90° < ψ < 180°]. We note, however, that the large central forbidden region in our plots is almost absent in those of Carrascoza et al. 2014. The full set of results is displayed in Figs. 3, 4, 5 and 6 as Ramachandran-style plots showing the φ,ψ distributions separately for each residue type (20 plots) with energy contours shown. There are 1681 data points for each plot (except P), and in order to give a better representation of these data a scaling factor \(\tanh \sqrt{(e^{2} - e^{2}_{\mathrm{max}})/10}\) was applied. For residue type P, only the region −72° < Φ < −60° is shown (41 data points). Because of the cyclic structure of its side chain, which involves the atoms that form the φ torsion angle (C′-N-CA-C), there are essentially no structures outside that range.
As stated above, the key findings from a perusal of these plots are provided in Table 1, which describes the topography of the energy surface in φ,ψ space and provides remarks concerning the secondary structure preferences for each residue type.
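
Picking out the lowest-energy conformers from a scan, as done for Fig. 1, amounts to a partial sort of the grid by energy. The sketch below shows one way to do this; it is not the authors' code, and `toy_energy` is a made-up single-well surface that merely stands in for the computed DFT energies.

```python
import heapq


def lowest_energy_conformers(energies, n=50):
    """Return the n (phi, psi) grid points of lowest energy, lowest first."""
    return heapq.nsmallest(n, energies, key=energies.get)


def toy_energy(phi, psi):
    # Hypothetical stand-in for a computed energy: one well near (-60, -45).
    return ((phi + 60) / 30.0) ** 2 + ((psi + 45) / 30.0) ** 2


angles = range(-180, 181, 9)
energies = {(phi, psi): toy_energy(phi, psi) for phi in angles for psi in angles}

best = lowest_energy_conformers(energies, n=50)
print(best[0], len(best))
```

With real data the dictionary would hold one entry per φ,ψ grid point of a residue type, and the returned points could be overlaid on a single map.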

Fig. 1

Generic φ,ψ map showing commonly populated areas. This figure is intended to be used as a template for labelling axes and determining values for the dihedral angles in Figs. 3, 4, 5 and 6. Blue triangle: α-helix; red triangle: 3₁₀ helix; green triangle: PP-helix

Fig. 2

Generic φ,ψ map showing commonly forbidden areas. For this plot, “forbidden areas” are defined as those representing structures in which there is a close contact (“collision”) between atoms. The contact distance was set at 0.93 Å. These forbidden areas are generally less interesting than the “valleys” of the energy surface but they explain why certain residue types behave the way they do (in particular V, I and T). Residue types are shown in lower-case single-letter code. This figure also explains the black regions in some of the members (especially V and D) of Figs. 3, 4, 5 and 6: the energy gradients are too steep to be properly rendered by the graphics
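
The collision criterion in the caption above can be expressed as a simple pairwise distance test. The sketch below is illustrative only: the function name is hypothetical, all atom pairs are tested (whether the original analysis excluded bonded pairs is not stated), and coordinates are assumed to be in ångströms.

```python
import itertools
import math


def has_collision(coords, cutoff=0.93):
    """True if any two atoms lie closer together than `cutoff` angstroms."""
    return any(math.dist(a, b) < cutoff
               for a, b in itertools.combinations(coords, 2))


# Three well-separated atoms, then the same set with an overlapping fourth atom.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
print(has_collision(atoms))
print(has_collision(atoms + [(0.5, 0.0, 0.0)]))
```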

Fig. 3

Φ,ψ energy surfaces for dipeptides. Residue types are shown in this order: A (Ala), C (Cys), D (Asp), E (Glu), F (Phe). Note that only the −72° ≤ Φ ≤ −60° region is relevant for Pro because of its cyclic structure involving the –N–CA–CB–CG–CD– atoms, which restricts rotation around the Φ dihedral bond. A diagram showing the chemical structure of Pro is provided in order to illustrate this. Some members of this set of dipeptides appear to show defects (black colour) in certain regions. This has been anticipated and explained above (caption to Fig. 2)

Fig. 4

Φ,ψ energy surfaces for dipeptides. Residue types are shown in this order: G (Gly), H (His), I (Ile), K (Lys), L (Leu). For details see the caption to Fig. 3

Fig. 5

Φ,ψ energy surfaces for dipeptides. Residue types are shown in this order: M (Met), N (Asn), P (Pro), Q (Gln), R (Arg). For P (Pro) the cyclic structure of the side chain locks the Φ torsion angle. For details see the caption to Fig. 3

Fig. 6

Φ,ψ energy surfaces for dipeptides. Residue types are shown in this order: S (Ser), T (Thr), V (Val), W (Trp), Y (Tyr). For details see the caption to Fig. 3

Table 1 Description of the topography of the energy surface in φ, ψ space with remarks concerning the secondary structure preferences for each residue type


Our previous results for residue types G, A, I and L provided support for established ideas (Lim and Spirin 1984, 1986) that the α-helix is a “default” conformation for the de novo generation of polypeptides on the ribosome, but they also demonstrated a clear alternative or rival: the PP-helix was given comparable, if not in some cases greater, prominence. We see further examples of that here, in the now extended repertoire of residue types. This is important because any species has only a single class of ribosome, which has to cater for both globular (requiring α-helix and/or β-strand) and fibrous (strongly PP-helix preferring) proteins. Recent DFT studies on a restricted set of GXG model peptides (Ilawe et al. 2015) confirm the prominence of the PP-helix, while also finding a preference for β-strand. The latter is understandable since these authors were focusing on X = I/V/L, and the first two of these residue types are known (and shown here) to be β-strand preferring. All of these “preferential” states (α-helix, β-strand, PP-helix) must be regarded as at least potentially accessible for most amino acid types. I and V do turn up in α-helices, albeit less frequently than in β-strands. It should be noted that while the α- and PP-helix occupy a relatively small area of φ,ψ space, these two structural types are characterised by very deep depressions, which render them enthalpically favored. The β-strand in contrast covers a wide area (alternatively: there is greater tolerance to distortions) although the depression is not as deep. Located between the α-helix and β-strand zones is a region that corresponds to the 2.2₇ ribbon structure. This was discussed at length in Carrascoza et al. 2014 and indeed, our results do not rule out that some of the amino acid types might dwell in that region. But it is not normally found in proteins and it is an unlikely contender as part of a biosynthesis process.
Concerning the apparent propensities for an α-helix geometry, this has to be viewed in the light of the fact that we are considering dipeptides, and a true α-helix will not actually form in stretches shorter than four residues, the minimum length at which the first of the hydrogen bonds that stabilize the helix can be established. This suggests that there is something that intrinsically favours this helix regardless of the assistance provided by hydrogen bonds. The answer almost certainly resides in the need to “remove bumps”, i.e., steric repulsions between the atoms at certain key side chain torsion angles. Similar remarks might be made about the β-strand. There is a very wide range of backbone torsion angles available to this geometry. In this case too there are no stabilising hydrogen bonds in the dipeptide, but in proteins β-strands are always incorporated into β-sheets, held together by hydrogen bonds. These β-sheets exhibit, as mentioned above, a very large variety of “shapes” and contortions, which are allowed because of the very wide range of torsion angles accessible to the constituent β-strands. Lastly, mention should be made of 3₁₀ helices. There are clear hints of distinct differences in their prevalence between different amino acid residue types and this can have repercussions for how protein folding takes place. Now that we have energy calculations for the entire set of 20 residue types, it is easier to survey the whole family and see what patterns of secondary structure preferences might emerge.
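
The interplay between the depth and the width of a basin discussed above can be made concrete with a Boltzmann-weighted count of grid points. The sketch below uses invented numbers (a narrow basin of 4 grid points at 0 kcal/mol and a wide basin of 40 grid points at 1 kcal/mol) purely to illustrate the arithmetic; none of these values comes from the calculations reported here, and the function name is hypothetical.

```python
import math

K_T = 0.593  # kT in kcal/mol at roughly 298 K


def basin_populations(basins, kt=K_T):
    """Map basin name -> Boltzmann population, given per-grid-point energies."""
    weights = {name: sum(math.exp(-e / kt) for e in es)
               for name, es in basins.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Toy surface: few deep points versus many shallower ones (energies in kcal/mol).
toy = {"narrow_deep": [0.0] * 4, "wide_shallow": [1.0] * 40}
pops = basin_populations(toy)
for name, p in pops.items():
    print(f"{name}: {p:.2f}")
```

With these toy numbers the wide, shallow basin carries the larger share of the population even though each of its points is higher in energy, which is the sense in which a broad region can compete with a deep but narrow one.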

The results presented here can be used by protein chemists as a guide to the most likely secondary structure propensities for each of the amino acid types. But certain caveats need to be issued. Firstly, the structures studied are not in the strict chemical sense “correct” structures for the dipeptides in the gas phase or in solution. This is in any case not an endeavor of compelling interest. Here, we have attempted to mimic an environment that the incipient polypeptide chain might encounter in the interstices of the ribosome, or indeed anywhere inside the cell, which is known to be very “crowded”, but we can only do that with a very primitive solvation model. We do not know which neighboring residues are in contact with the newly synthesized peptide or what the precise geometric arrangement is. We only allow the two backbone angles φ and ψ to change. Given the uncertainties about the environment, it does not make sense to allow all other angles to relax and to conduct energy minimizations of these structures. We think that proceeding in this way has at least thrown some light on the question of how each residue type behaves in comparison with the others, and has provided some information concerning secondary structure propensities. Obtaining structural information about longer peptides is of course also of great interest, but different methodologies are needed for that (molecular dynamics rather than quantum chemical methods), and recent work (Nilsson et al. 2015) reports the results of such cotranslational folding studies. These data do not in any way contradict our results, quite the converse, but the example given was of a small protein with a tendency to form α-helical structure. It would be interesting to see whether any attempt is made to detect cotranslational folding of a fibrous protein, in which case the collagen PP-helix would come into play.


There has been much interest in determining the structure of dipeptides. Usually these efforts have been restricted to the case of primitive dipeptides where the central residue type is glycine or alanine, and no account was taken of the effect of solvent: gas-phase conditions were assumed. Our previous work extended this coverage of the residue type repertoire to two further cases, that of leucine and its positional isomer isoleucine. Simulated solvent conditions corresponding approximately to the water-poor environment and large particle size of a ribosome (or elsewhere in the crowded interstices of the cell) were applied. Already at that stage, major differences were seen between the four residue types, particularly between the two isomers. This encouraged further research into the entire set of 20 standard residue types. We have produced a compendium that protein chemists can use as a guide to the most likely secondary structure propensities for each of the amino acid residue types. Most amino acid residue types can access all three of the major secondary structures (α-helix, β-strand, PP-helix), but there are individual preferences which were known from experimental and bioinformatics studies. Our plots map out these preferences. With reference to ribosomes, we recall that the same ribosomes have to cater for all 20 amino acid types and also enable both globular and fibrous proteins to be formed within, and emerge from, the peptide synthesis tunnel. We have not considered cotranslational folding as such, but our work should be helpful as a starting point for such studies.


  1. Aquilante F, De Vico L, Ferré N, Ghigo G, Malmqvist PÅ, Neogrády P, Pedersen TB, Pitoňàk M, Reiher M, Roos BO, Serrano-Andrés L, Urban M, Veryazov V, Lindh R (2010) MOLCAS 7: the next generation. J Comput Chem 31:224–247. doi:10.1002/jcc.21318

  2. Bellesia G, Jewett AI, Shea JE (2010) Sequence periodicity and secondary structure propensity in model proteins. Protein Sci 19:141–154. doi:10.1002/pro.288

  3. Bywater RP, Veryazov V (2013) The preferred conformation of dipeptides in the context of biosynthesis. Naturwissenschaften 100:853–859. doi:10.1007/s00114-013-1085-7

  4. Bywater RP, Thomas D, Vriend G (2001) A sequence and structural study of transmembrane helices. J Comput Aided Mol Des 15:533–552. doi:10.1023/A:1011197908960

  5. Carrascoza F, Zaric S, Silaghi-Dumitrescu R (2014) Computational study of protein secondary structure elements: Ramachandran plots revisited. J Mol Graph Model 50:125–133. doi:10.1016/j.jmgm.2014.04.001

  6. Gould R, Cornell WD, Hillier IH (1994) A quantum mechanical investigation of the conformational energetics of the alanine and glycine dipeptides in the gas phase and in aqueous solution. J Am Chem Soc 116:9250–9256. doi:10.1021/ja00099a048

  7. Hollingsworth SA, Karplus PA (2010) A fresh look at the Ramachandran plot and the occurrence of standard structures in proteins. Biomol Concepts 1:271–283. doi:10.1515/BMC.2010.022

  8. Hovmoller S, Zhou TP, Ohlson T (2002) Conformations of amino acids in proteins. Acta Cryst D58:768–776

  9. Ilawe NV, Raeber AE, Schweitzer-Stenner R, Toal SE, Wong BM (2015) Assessing backbone solvation effects in the conformational propensities of amino acid residues in unfolded peptides. PCCP 17(38):24917–24924. doi:10.1039/c5cp03646a

  10. Karlström G, Lindh R, Malmqvist P-Å, Roos BO, Ryde U, Veryazov V, Widmark P-O, Cossi M, Schimmelpfennig B, Neogrady P, Seijo L (2003) MOLCAS: a program package for computational chemistry. Comput Mat Sci 28:222

  11. Krieger E, Koraimann G, Vriend G (2002) Increasing the precision of comparative models with YASARA NOVA—a self-parameterizing force field. Proteins 47:393–402. doi:10.1002/prot.10104

  12. Liljas A, Liljas L, Piskur J, Lindblom G, Nissen P, Kjeldgaard M (2009) Textbook of structural biology. World Scientific, Singapore. ISBN 978-981-277-207-7

  13. Lim VI, Spirin AS (1984) Stereochemistry of the transpeptidation reaction in the ribosome: the ribosome generates the α-helix during synthesis of the polypeptide chain of the protein. Doklady Akad Nauk 280:235–238

  14. Lim VI, Spirin AS (1986) Stereochemical analysis of ribosome conformation of nascent peptide. J Mol Biol 188:565–577

  15. Nilsson OB, Hedman R, Marino J, Wickles S, Bischoff L, Johansson M, Müller-Lucks A, Trovato F, Puglisi JD, O’Brien EP, Beckmann R, Von Heijne G (2015) Cotranslational protein folding inside the ribosome exit tunnel. Cell Reports 12:1533–1540. doi:10.1016/j.celrep.2015.07.065

  16. Pomelli CS, Tomasi J (1997) A new formulation of the PCM solvation method. Theor Chem Accounts 96:39–43. doi:10.1007/s002140050201

  17. Ponder JW, Richards FM (1987) Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 193(4):775–791. doi:10.1016/0022-2836(87)90358-5

  18. Pupo A, Moreno E (2009) Do rotamer libraries reproduce the side-chain conformations of peptidic ligands from the PDB? J Mol Graph Model 27:611–619. doi:10.1016/j.jmgm.2008.10.002

  19. Wu H, Canfield A, Adhikari J, Huo S (2010) Quantum mechanical studies on model alpha-pleated sheets. J Comput Chem 31:1216–1223. doi:10.1002/jcc.21408


Authors’ contributions

Both authors made distinct but equivalent contributions to this work. RPB conducted bioinformatics work and constructed the many thousands of input structures. VV conducted the quantum mechanical calculations and produced the dihedral angle plots. The manuscript was largely written by RPB but the authors had joint control over its content. Both authors read and approved the final manuscript.


Elmar Krieger and Gert Vriend are thanked for kindly making the Yasara modelling program available under an academic license. The computations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) at LUNARC supercomputer center.

Data availability

All coordinates are available from authors on application.

Competing interests

The authors declare that they have no competing interests.

Author information

Correspondence to Valera Veryazov.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.



Cite this article

Bywater, R.P., Veryazov, V. The dipeptide conformations of all twenty amino acid types in the context of biosynthesis. SpringerPlus 4, 668 (2015) doi:10.1186/s40064-015-1430-8



  • Dipeptide
  • Amino Acid Type
  • Residue Type
  • Secondary Structure Type
  • Rotameric State