Molecular Biophysics

A slice through the electric potential of NADH

1. WHAT IS BIOPHYSICS?

All processes in nature result from the interplay between mass and energy. Biophysics is a field in biological sciences that studies the relation between the behavior of biological systems and the physical forces acting on them. Molecular biophysics focuses on how these forces make atoms and molecules interact with each other, as well as how they make complex molecules fold into their active form.

2. INTER-ATOMIC FORCES

I. Electrostatic forces

The best-described physical force is probably the electrostatic attraction or repulsion between charged atoms or molecules, which results from the electric force acting on them:

Electrostatic interaction between two point charges. The field lines between the charges are shown in red.

The energy of pairwise electrostatic interactions is described by Coulomb’s Law:
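A standard pairwise form of Coulomb's law, written with the symbols defined in the note that follows, is sketched here. The prefactor 332 is an assumption: it is the usual conversion constant that yields kcal/mol when the charges are given in units of the elementary charge and the distance in Å.

```latex
E_{ij} = 332 \, \frac{q_i \, q_j}{\varepsilon \, r_{ij}}
```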

(where E is the electrostatic energy in kcal/mol, i and j are the two interacting charges, q is the magnitude of each charge, r is the distance between them, and ε is the dielectric constant of the medium in which they exist. Note that a positive sign for E represents an unfavorable (repulsive) interaction, whereas a negative sign represents a favorable (attractive) interaction.)
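The sign convention and the role of the dielectric can be checked with a short numerical sketch. The conversion constant 332.06 kcal·Å/(mol·e²) and the example charges and distance are assumptions chosen for illustration.

```python
# Assumed conversion constant: gives kcal/mol for charges in units of e and distances in Angstrom.
COULOMB_CONSTANT = 332.06


def coulomb_energy(q_i, q_j, r, epsilon=1.0):
    """Pairwise electrostatic energy (kcal/mol) of two point charges.

    q_i, q_j: charges in units of the elementary charge
    r:        distance in Angstrom (must be positive)
    epsilon:  dielectric constant of the medium (must be positive)
    """
    if r <= 0 or epsilon <= 0:
        raise ValueError("r and epsilon must be positive")
    return COULOMB_CONSTANT * q_i * q_j / (epsilon * r)


# Opposite unit charges, 3 Angstrom apart, in vacuum and in a water-like medium.
in_vacuum = coulomb_energy(+1.0, -1.0, 3.0, epsilon=1.0)
in_water = coulomb_energy(+1.0, -1.0, 3.0, epsilon=80.0)
like_charges = coulomb_energy(+1.0, +1.0, 3.0, epsilon=1.0)

print(f"attraction in vacuum: {in_vacuum:.2f} kcal/mol")
print(f"attraction in water:  {in_water:.2f} kcal/mol")
print(f"repulsion in vacuum:  {like_charges:.2f} kcal/mol")
```

The attractive pair gives a negative energy, the like-charged pair a positive one, and the water-like dielectric weakens the interaction by a factor of 80.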

Electrostatic interactions also occur between fixed electric dipoles within molecules. A common type of dipole-dipole interaction in biological systems is the “hydrogen bond”. This interaction happens between two electronegative atoms, one of which is covalently bound to a hydrogen atom. The hydrogen, which is much less electronegative than its covalently bound partner, carries a partial positive charge, and is therefore attracted to the other electronegative atom. The two electronegative atoms are said to ‘share’ the hydrogen atom, hence the name:

Hydrogen bonds between a hydroxyl group and two amino groups. The partial charges are signified by the δ signs.

The energy of dipole-dipole interactions depends on the magnitude of the dipoles, falls off with the 3rd power of the distance between them, and also depends on the angle between the interacting species.
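These dependencies can be seen in the textbook expression for two fixed point dipoles, given here as a sketch in Gaussian-style units (a unit-conversion constant is omitted). The symbols μ₁ and μ₂ for the dipole vectors and r̂ for the unit vector joining them are notation introduced for this illustration.

```latex
E_{dd} = \frac{\vec{\mu}_1 \cdot \vec{\mu}_2 \;-\; 3\,(\vec{\mu}_1 \cdot \hat{r})\,(\vec{\mu}_2 \cdot \hat{r})}{\varepsilon \, r^{3}}
```

The dot products carry the angular dependence, and the denominator carries the inverse-cube dependence on distance.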

II. Non-polar ‘forces’

Uncharged atoms also interact with each other due to attractive non-polar forces (i.e. the “hydrophobic effect”). This interaction does not result from an actual force between the interacting species, but rather from their tendency to avoid water. Thus, water ‘pushes’ the non-polar atoms towards each other:

Non-polar interaction between hydrophobic entities (red). Water molecules are represented by the blue spheres

The energy of non-polar interactions (Enp, in kcal/mol) has been found empirically to depend on the water-accessible surface area that is lost upon the interaction (ΔSA):
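A commonly used linear form of this relation is sketched below. The coefficient γ (a positive, empirically fitted constant in kcal/(mol·Å²)) and the sign convention (ΔSA is the change in accessible surface area, negative when surface is buried) are assumptions made for this illustration.

```latex
E_{np} = \gamma \, \Delta SA
```

With this convention, burying non-polar surface gives a negative, favorable energy.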

III. Van der Waals forces

In fact, any two atoms can be attracted to each other (provided they are close enough) thanks to London (dispersion) forces, a.k.a. van der Waals forces. These result from opposite electronic dipoles, induced within the interacting atoms by each other:

vdW interaction between two neon atoms.

Van der Waals interactions actually have two components: an attractive interaction between the atoms, which results from the induced dipoles, and a repulsive interaction, which results from the overlap of the electron clouds of the two atoms when they get too close to each other. The total energy of van der Waals interactions can be approximated by the Lennard-Jones expression:
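The widely used 12-6 form of the Lennard-Jones expression, in the notation of the note that follows, can be written as below. The exponents 12 and 6 are the conventional choice and are assumed here.

```latex
E_{vdW} = \frac{A}{r^{12}} - \frac{B}{r^{6}}
```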

(where A and B are the experimentally obtained constants of the repulsive and attractive interactions (respectively) and r is the distance between the interacting atoms.)

Notice that, compared to electrostatic interactions, vdW interactions are short-ranged.
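The short range of vdW interactions can be illustrated numerically. The sketch below assumes the 12-6 Lennard-Jones form (A/r¹² − B/r⁶) and builds A and B from a hypothetical well depth of 0.2 kcal/mol at 3.5 Å; these numbers are illustrative, not fitted parameters.

```python
def lennard_jones(r, a, b):
    """12-6 Lennard-Jones energy: repulsive A/r^12 minus attractive B/r^6."""
    return a / r**12 - b / r**6


def lj_minimum(a, b):
    """Distance and energy of the minimum, obtained from dE/dr = 0."""
    r_min = (2.0 * a / b) ** (1.0 / 6.0)
    return r_min, lennard_jones(r_min, a, b)


# Hypothetical parameters: a well of depth 0.2 kcal/mol located at 3.5 Angstrom.
depth, r0 = 0.2, 3.5
A = depth * r0**12
B = 2.0 * depth * r0**6

r_min, e_min = lj_minimum(A, B)

# Doubling the distance: the attractive vdW term drops 64-fold, a Coulomb term only 2-fold.
vdw_ratio = (B / (2 * r0) ** 6) / (B / r0**6)
coulomb_ratio = (1.0 / (2 * r0)) / (1.0 / r0)

print(f"minimum at r = {r_min:.2f} A, E = {e_min:.2f} kcal/mol")
print(f"attraction left after doubling r: vdW {vdw_ratio:.4f}, Coulomb {coulomb_ratio:.2f}")
```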

3. FORCES ACTING ON PROTEINS

Proteins are complex molecules, built from thousands, sometimes hundreds of thousands, of atoms. Those atoms are subjected to the same physical forces that work on individual atoms in solution (1, 2). However, because of the large number of atoms in proteins, the acting forces induce complex behavior in the protein, such as folding to a precise 3D structure (1, 2), or binding specifically to another molecule (1, 2). In other words, the function (behavior) of large molecules depends tightly on their structure. This means that if one could accurately characterize both the structure of a protein and the physical forces acting on each of its atoms (as well as the atoms of its water-like environment inside/outside the cell), one would be able to predict its function. That is what biophysical studies are trying to achieve, directly or indirectly.

4. COMPUTATIONAL BIOPHYSICAL METHODS

Biophysical methods can be regarded as either experimental or computational. The former include spectroscopic methods such as Nuclear Magnetic Resonance (NMR) and methods that rely on the scattering of subatomic particles hitting the proteins (e.g. Neutron Diffraction), which are used to determine protein structure and function. Such methods are accurate, but they are also very expensive and take a long time to yield good results.

Computational biophysical methods combine mathematical representations of physical forces with computer algorithms to predict key properties of proteins, which in turn may affect their behavior. For example, the electrostatic potential of a protein (which is important for its function (1, 2), see more below) can be calculated and projected on the surface of the protein. To understand what the potential is, imagine a molecule as a collection of (atomic) charges in 3D space, which create an electrostatic field around the entire protein. If an electrically charged probe is put somewhere inside that field, it will acquire energy as a result. This energy, per unit of probe charge, is the electrostatic potential at that point. The image below shows the potential (right) and electrostatic field (left) of a short protein segment with a helical (spring-like) structure. The blue and red colors represent positive and negative potentials, respectively:

Electrostatic properties of alpha helices.
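The idea of a probe charge sensing a collection of atomic charges can be sketched with a plain Coulomb sum. This is a simplification that assumes a single uniform dielectric everywhere, which is not how the potential of a real protein in water is obtained; the conversion constant 332.06 and the two-charge "molecule" are assumptions for illustration.

```python
import math

# Assumed conversion constant: kcal*Angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.06


def potential_at(point, charges, epsilon=80.0):
    """Electrostatic potential (kcal/mol per unit probe charge) at `point`.

    charges: iterable of (q, (x, y, z)) with q in units of e and coordinates in Angstrom.
    Assumes one uniform dielectric constant `epsilon` for the whole space.
    """
    total = 0.0
    for q, position in charges:
        r = math.dist(point, position)
        if r == 0.0:
            raise ValueError("probe point coincides with a charge")
        total += COULOMB_CONSTANT * q / (epsilon * r)
    return total


# Hypothetical two-charge 'molecule': +1 at the origin and -1 at x = 4 Angstrom.
toy_charges = [(+1.0, (0.0, 0.0, 0.0)), (-1.0, (4.0, 0.0, 0.0))]

near_positive = potential_at((-2.0, 0.0, 0.0), toy_charges)
near_negative = potential_at((6.0, 0.0, 0.0), toy_charges)
midpoint = potential_at((2.0, 0.0, 0.0), toy_charges)

print(f"near the positive charge: {near_positive:+.3f}")
print(f"near the negative charge: {near_negative:+.3f}")
print(f"midway between them:      {midpoint:+.3f}")
```

Points with positive values would be colored blue and points with negative values red in a potential map of the kind described above.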

The electrostatic potential is calculated by solving the Poisson equation:
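One common way of writing the Poisson equation for a medium whose dielectric varies in space is sketched below, in Gaussian units. The symbols φ (potential), ρ (charge density) and ε(r) (position-dependent dielectric) are standard notation assumed here.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right] = -4\pi \, \rho(\mathbf{r})
```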

5. THE CONTINUUM-SOLVENT MODEL

The solution of the Poisson equation relies on a model of the protein and its environment, called “The Continuum-Solvent Model”. In this model, the protein is described explicitly (i.e. all atoms are accounted for), but its water environment is described implicitly, as a uniform body. To account for the electrostatic properties of water without having to describe each of its atoms, the water is represented as a high-dielectric body. The dielectric value represents the ability of a medium (water, oil, etc.) to mask electrostatic interactions inside it. This ability results from the polarity of the molecules constituting the medium. Water molecules are polar, and therefore bulk water has a high dielectric value (~80). Representing water as a uniform body of dielectric 80 saves the computational scientist a lot of time and computer resources. One of the popular programs for calculating the electrostatic potential is DelPhi, designed by the Barry Honig Group at Columbia University. Another is APBS, originally written by Nathan Baker (then part of the J. Andrew McCammon group at UCSD). The picture below shows the continuum-solvent model depiction of a protein partially embedded in a membrane. Notice the low dielectric value (ε) assigned to the lipid membrane.

The continuum-solvent model of a protein partially embedded in a lipid membrane.

Since biomolecules contain many fully and partially charged groups, their electrostatic properties are important for their function (1, 2). For example, the electrostatic potential may, in some cases, be used to predict the tendency of a protein to bind other molecules or small ions, such as DNA (a negatively charged molecule), calcium ions, etc. This is important, as virtually all the biological functions of proteins involve their binding to other components of the cell (although the binding is not necessarily electrostatic in all cases). The image below shows the potential on the surface of Topoisomerase, a DNA-binding enzyme. Notice the intense blue patch at the center of the protein (left), where the DNA molecule is about to bind (right). As expected, the blue patch represents a strong positive potential in this area. The image is taken from Taneja et al. (2006) EMBO Journal 25: 398.

Electrostatic potential on the surface of Topoisomerase (Taneja et al., 2006).

The potential maps of proteins may give us a qualitative measure of their binding propensity. To get a more quantitative measure, the electrostatic binding free energy (ΔG) between the protein and its ligand (binding partner) must be calculated. The free energy is an important thermodynamic parameter that measures the spontaneity of any process: a process is spontaneous only if it lowers the free energy of the system, and the larger the decrease, the more probable the process is. To get the free energy of protein-ligand binding, the sum of the free energies of the protein (P) and the ligand (L) is subtracted from the free energy of the protein-ligand complex (PL) (see top equation in the figure below). The free energy in each of these states is calculated by multiplying the charges by the potential at their positions, and integrating over the entire protein-ligand-solvent system (see bottom equation in the figure below).
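The two relations described in this paragraph can be written out as a sketch. The subscript "el", the symbols ρ (charge density) and φ (potential), and the factor of 1/2 (standard for a linearly responding dielectric) are assumptions; the discrete sum is the point-charge version of the integral.

```latex
\Delta G_{el} = G_{PL} - \left( G_{P} + G_{L} \right)

G = \frac{1}{2} \int \rho(\mathbf{r}) \, \phi(\mathbf{r}) \, dV \;\approx\; \frac{1}{2} \sum_{i} q_i \, \phi_i
```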

Equations for the electrostatic binding free energy (top) and for the free energy of each state (bottom).

6. EXPLICIT MODELS: MOLECULAR MECHANICS & DYNAMICS

The calculations above give us the electrostatic binding free energy. However, as mentioned above, binding may also involve other physical forces, such as the hydrophobic effect, van der Waals interactions, and entropic factors. These are collectively called “nonbonded energy”, since they do not involve the covalent bonds between atoms inside the protein. The latter, called “bonded energy”, must also be taken into account. One way to account for the chemical and physical forces in the system is to consider all the atoms in it (protein, ligand, and the surrounding water), and then use mathematical expressions to describe the chemical-physical forces acting on them. This approach is called “Molecular Mechanics” (MM), because it applies the principles of classical mechanics to molecular systems. Each of the mathematical expressions used in MM calculations represents a different physical force acting upon the protein’s atoms. In each MM calculation, all of those expressions are combined into one large expression called a “Force Field” (FF) (see image below of a typical FF). Two well-known FFs are CHARMM and AMBER. In some cases, we also want to simulate the movements of atoms in our system (i.e. dynamics). In such cases, the standard force field describing the static forces in the system is supplemented with expressions describing how those forces affect atomic movement. This implementation of the MM approach is called “Molecular Dynamics” (MD).

The general form of a force field with its different components
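A typical functional form, of the kind used by CHARMM- and AMBER-style force fields, is sketched below. It is not necessarily the exact expression shown in the figure; the force constants (k_b, k_θ, k_φ), the reference values (b₀, θ₀), the dihedral multiplicity n and phase δ, and the constant 332 are standard notation assumed for this illustration.

```latex
E_{total} =
  \sum_{bonds} k_b \,(b - b_0)^2
+ \sum_{angles} k_\theta \,(\theta - \theta_0)^2
+ \sum_{dihedrals} k_\phi \left[ 1 + \cos(n\phi - \delta) \right]
+ \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} + 332 \, \frac{q_i \, q_j}{\varepsilon \, r_{ij}} \right]
```

The first three sums are the bonded terms; the last sum collects the nonbonded van der Waals and electrostatic terms over all atom pairs.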

A molecular dynamics simulation of spontaneous insertion of the antimicrobial peptide PGLa into a lipid bilayer (source: http://ins.sjtu.edu.cn/people/jakob/)
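How forces are turned into atomic motion can be sketched with the velocity Verlet scheme, one common integrator for Newton's equations of motion. The example below is a toy: a single particle on a harmonic "bond" in one dimension, in arbitrary reduced units, with all parameter values chosen for illustration only.

```python
def harmonic_force(x, k, x0):
    """Force from a harmonic bond term E = 0.5 * k * (x - x0)^2."""
    return -k * (x - x0)


def total_energy(x, v, mass, k, x0):
    """Kinetic plus potential energy of the particle."""
    return 0.5 * mass * v * v + 0.5 * k * (x - x0) ** 2


def velocity_verlet(x, v, mass, k, x0, dt, n_steps):
    """Propagate position and velocity; returns a list of (x, v) pairs."""
    f = harmonic_force(x, k, x0)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass      # half-step velocity update
        x = x + dt * v_half                   # full-step position update
        f = harmonic_force(x, k, x0)          # force at the new position
        v = v_half + 0.5 * dt * f / mass      # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical reduced-unit parameters: bond stretched from 1.0 to 1.2 and released.
MASS, K, X0, DT, N_STEPS = 1.0, 1.0, 1.0, 0.01, 1000
trajectory = velocity_verlet(1.2, 0.0, MASS, K, X0, DT, N_STEPS)
energies = [total_energy(x, v, MASS, K, X0) for x, v in trajectory]

print(f"final position: {trajectory[-1][0]:.4f}")
print(f"energy drift:   {max(energies) - min(energies):.2e}")
```

A real MD program applies the same kind of update to every atom in three dimensions, with forces derived from the full force field.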

One of the interesting implementations of the MM/MD approach is Molecular Docking, in which the configuration, path, and energy of binding between a protein and a ligand are simulated. The method places the ligand in numerous different positions near or inside the protein, and in each of them the binding energy is calculated using the MM force field. The change of position is itself a function of the calculated energy: when changing the position lowers the energy (i.e. makes the system more stable), the new position is adopted. In contrast, when the new position increases the calculated energy, it is rejected and another position is tested (in reality, the adoption/rejection criteria are much more complex, and involve statistical weighting). The search is stopped when the energy reaches a minimum that does not change upon further positional changes. This way, the path that leads to the most stable protein-ligand configuration is found. It is assumed that this lowest-energy configuration represents the real one because, as thermodynamics teaches us, physical systems tend toward their lowest free-energy state. The docking procedure can be made more realistic by allowing the protein to adapt to the different ligand binding configurations by changing conformation (as happens in real life). This, however, makes the calculations much heavier, resulting in long simulations.
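The simple accept-if-lower rule described above can be sketched as a random search. This is a toy under stated assumptions: the "ligand" is a point in two dimensions, and `toy_binding_energy` is a hypothetical stand-in for a force-field energy with a single well. It does not include the statistically weighted acceptance criteria that real docking programs use.

```python
import random


def toy_binding_energy(position):
    """Hypothetical energy surface with one minimum of -5.0 at (1.0, -2.0)."""
    x, y = position
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 - 5.0


def greedy_search(energy, start, step=0.5, n_trials=5000, seed=0):
    """Try random moves; adopt a move only when it lowers the energy."""
    rng = random.Random(seed)
    position = start
    best = energy(position)
    for _ in range(n_trials):
        trial = (
            position[0] + rng.uniform(-step, step),
            position[1] + rng.uniform(-step, step),
        )
        e_trial = energy(trial)
        if e_trial < best:          # lower energy: adopt the new position
            position, best = trial, e_trial
        # otherwise: reject the move and test another position
    return position, best


start = (5.0, 5.0)
start_energy = toy_binding_energy(start)
final_position, final_energy = greedy_search(toy_binding_energy, start)

print(f"start energy: {start_energy:.2f}")
print(f"final energy: {final_energy:.3f} at ({final_position[0]:.2f}, {final_position[1]:.2f})")
```

On a surface with several wells, a purely greedy rule like this one can get trapped in a local minimum, which is one reason practical methods occasionally accept uphill moves.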

Molecular docking simulation created with the PELE server

One of the most popular uses of molecular docking is virtual screening, that is, the docking of many small molecules onto a known protein binding site to find the one with the most favorable binding energy. This procedure is often used in drug discovery projects. In such projects, the goal is to find a molecule whose binding to the protein changes the protein's activity, usually decreasing it. The activity change may result from competition between the drug molecule and the natural substrate/ligand of the protein for the binding site. Alternatively, the drug molecule may bind to a different site in the protein (an allosteric site), where the binding stabilizes a conformation of the protein that has high or low activity. In any case, the drug molecule has to bind the protein binding site with significant affinity, and when the drug acts by competition, its binding affinity has to be significantly higher than that of the natural substrate/ligand. Docking simulations are used to place each of the candidate molecules in the binding site, sample different orientations of the small molecule inside the site as well as different conformations of the molecule, and in each case calculate the binding energy:

A virtual screening simulation of 17 different (yet similar) small molecules onto a protein binding site. The site is between two protein subunits (cyan and dark green helices). The protein residues interacting with the small molecules are shown as lines. Hydrogen bonds and pi-interactions that are formed between each molecule and protein residues are shown as black and orange dashed lines, respectively. These interactions contribute to the total binding energy. Created by Elon Yariv.

MM and MD are very popular, and their results keep improving as computational power grows. However, they have serious difficulty producing reliable free-energy values for biomolecular processes, such as protein folding and binding. To understand why, recall that the free energy is composed of an enthalpic term (H, related to the potential energy E) and an entropic term (S, related to the number of configurations the atoms in the system can adopt):

ΔG = ΔH – TΔS ≈ ΔE – TΔS

(T is temperature)
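A short worked example shows how the two terms compete. The enthalpy and entropy values below are hypothetical numbers chosen for illustration, in kcal/mol and kcal/(mol·K).

```python
def free_energy_change(delta_h, temperature, delta_s):
    """Return dG = dH - T*dS (kcal/mol), with T in kelvin and dS in kcal/(mol*K)."""
    return delta_h - temperature * delta_s


T = 298.15  # room temperature in kelvin

# Hypothetical enthalpy-driven process: favorable dH, unfavorable dS.
enthalpy_driven = free_energy_change(-10.0, T, -0.02)

# Hypothetical entropy-driven process: unfavorable dH, favorable dS.
entropy_driven = free_energy_change(+2.0, T, +0.01)

print(f"enthalpy-driven: dG = {enthalpy_driven:.3f} kcal/mol")
print(f"entropy-driven:  dG = {entropy_driven:.3f} kcal/mol")
```

Both processes come out spontaneous (negative ΔG), but in each case the entropic term changes the answer by several kcal/mol, so an estimate based on the potential energy alone would be misleading.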

The force-field-based calculations give us the potential energy. However, in order to get the entropy-related energy, all the possible atomic configurations in the system must be sampled (using MD). Computationally, this task is hard enough when the protein’s atoms are considered. When the solvent (water) atoms are considered as well, the task is extremely hard, since water molecules are numerous. As a result, the force-field MM/MD approach does not seem to be able to provide good solvation (i.e. water-based) free-energies.

In addition, when dynamic processes are simulated by MD, the heavy calculations can cover only sub-microsecond time frames, which are much shorter than the time frames corresponding to biological processes.

7. COMBINED MODELS

As mentioned above, the Continuum-Solvent (CS) model provides a relatively good description of the solvent, because it gives the free energy of the system instead of just the potential energy. This is thanks to the average dielectric parameter incorporated into the Poisson equation. The CS model could, at least in theory, be added to the other mathematical expressions of the force field in MM/MD calculations. However, whereas the other expressions are simple (e.g. Coulomb's law), the Poisson equation is complex and can only be solved numerically. Integrating this equation into the force field would therefore make things even worse, as the computer would not be able to solve the equation for each of the millions of configurations sampled during an MD simulation (at least not in reasonable time). A solution to this problem came during the last decade in the form of an approximation: if the protein is assumed to be built from simple shapes instead of the complex ones formed by atoms (the Born Model), the Poisson equation can be solved analytically, and can therefore be incorporated into the force field for describing solvent effects. This approach is still being perfected, but already shows promise.
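The simplest case of an analytical solution is the classic Born expression for a single spherical ion, sketched below. It illustrates the idea only and is not the many-atom scheme used inside force fields; the conversion constant 332.06 and the example charge and radius are assumptions.

```python
# Assumed conversion constant: kcal*Angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.06


def born_solvation_energy(q, radius, eps_solvent=80.0, eps_reference=1.0):
    """Free energy (kcal/mol) of moving a spherical ion between two dielectrics.

    q:             ion charge in units of e
    radius:        ion radius in Angstrom
    eps_reference: dielectric of the starting medium (1.0 = vacuum)
    eps_solvent:   dielectric of the final medium (80.0 = water-like)
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    return -0.5 * COULOMB_CONSTANT * q * q / radius * (
        1.0 / eps_reference - 1.0 / eps_solvent
    )


# Hypothetical monovalent and divalent ions of radius 2 Angstrom, vacuum to water.
monovalent = born_solvation_energy(1.0, 2.0)
divalent = born_solvation_energy(2.0, 2.0)

print(f"monovalent ion: {monovalent:.2f} kcal/mol")
print(f"divalent ion:   {divalent:.2f} kcal/mol")
```

The closed-form result needs no numerical grid, which is why approximations of this kind are cheap enough to evaluate at every step of a simulation.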

8. QUANTUM-MECHANICAL MODELS

The models described in the above sections are all approximations; a true description of atoms, molecules, and the forces acting on them can only be achieved using quantum mechanical (QM) calculations. This is because interatomic forces result from the distribution of electrons around each atom in the system, and only QM calculations can characterize this distribution. Unfortunately, QM calculations are computationally very heavy and cannot be applied to systems that include many atoms (e.g. a protein, DNA, etc.). One solution that has been adopted by scientists is to perform QM calculations only on a small region of the system that is of special interest (e.g. the active site of an enzyme) and treat the rest of the system with ‘regular’ molecular mechanical calculations (see section 6 above). This approach, called QM/MM (1), is used today to understand small-scale but biologically important processes, such as enzymatic catalysis (1) or electron/proton transfer (1). Its importance is also demonstrated by the discovery of quantum phenomena in proteins, such as hydrogen tunneling (1), which can only be investigated using QM tools.
