
Ben-Naim, in three articles, dismissed and “answered” Levinthal’s paradox. He announces that there are pitfalls caused by the “misinterpretation” of the thermodynamic hypothesis. He claims that there is no Gibbs free energy formula in which the variable is a protein’s conformation **X**. His Gibbs energy functional is *G*(*T*, *P*, *N*, *P*(**R**)), where the variable is the probability distribution *P*(**R**) of the conformations. His “minimum distribution *P*_{eq}(**R**)” is wrong. By carefully establishing thermodynamic systems, we demonstrate how to apply quantum statistics to derive the Gibbs free energy formula *G*(**X**). The formula of the folding force is given.

In [

But Ben-Naim invents a new “pitfall”: “This misinterpretation (of thermodynamic hypothesis) has inspired many scientists to search for a global minimum in the Gibbs energy as a function of the conformation of the protein, sometimes referred to as the Gibbs energy landscape. Such a minimum in the Gibbs energy is different from the minimum required by the Second Law of Thermodynamics” [

Trying to answer the so-called Levinthal’s paradox in [

“The following two statements are true:

a) The native stable structure of the protein must be at a minimum of the GEL (Gibbs Energy Landscape).

b) Upon releasing a constraint within the system, specified by the variables: T, P, N, the Gibbs energy of the system will reach a single absolute minimum”.

Ben-Naim’s conclusion is: “From the two true statements a) and b), people have concluded that the stable state of the protein must be in a global minimum in the GEL. Unfortunately, this conclusion is invalid... The reason so many people fell into this pitfall is that in making statements a) and b), we have not specified the variables with respect to which the Gibbs energy has a minimum”.

Here Ben-Naim implies that the conformation of a protein should not be the variable of the Gibbs energy. To answer the question of what the variable in the Gibbs energy is, Ben-Naim states in [

”.

So Ben-Naim confirms here that the variable of the Gibbs energy is not conformation R. In [

Unfortunately, Ben-Naim’s solution of the single minimum (maximum) distribution at equilibrium is wrong, either for the Gibbs energy functional or for the entropy function in [

But even if someone could obtain a correct minimum distribution for Ben-Naim, his shift from statement (a) to statement (b) would still be misleading, a real pitfall, because it shifts the study of protein structure to the study of the probability distribution of conformations. The two are different problems, and an answer to one does not automatically solve the other. For example, even knowing Ben-Naim’s minimum distribution, we still do not know the three-dimensional shape of the native structure.

In this article, we analyze why Ben-Naim falls into a pitfall. We also demonstrate how to derive the Gibbs free energy formula *G*(**X**) from quantum statistics, to show how to get out of Ben-Naim’s “pitfall”. Here we have omitted the environment parameters *T* and *P*, since they do not vary in the natural protein folding process; **X** = (**x**_{1}, …, **x**_{M}) is a conformation of the protein, equivalent to Ben-Naim’s R, and **x**_{i} is the atomic center of the *i*-th atom, supposing that the molecule has *M* atoms in total. Denying the existence of such a *G*(**X**) (it is equivalent to Ben-Naim’s Gibbs energy landscape) is one of the reasons that Ben-Naim claims a “pitfall”. The negative gradient, −∇*G*(**X**), is the force that drives the protein to fold. Formulas for ∇*G*(**X**) are given. The details of the derivation of *G*(**X**) are given in Section 7.

To analyze Ben-Naim’s “pitfall” and look for the reason why there is a “pitfall”, we should recall what the thermodynamic principle is (Anfinsen modestly called it the thermodynamic hypothesis in [

All previous attempts to derive the Gibbs free energy formula, including Ben-Naim’s, missed the goal of identifying “the three-dimensional structure of a native protein” that Anfinsen had emphasized in the above quotation. In their derivations, the whole system consists of *N* conformations of the same protein molecule, each of which is only a point in the Euclidean space ℝ^{3M}, supposing that the protein has *M* atoms. Each microstate of the system, the *N* points in ℝ^{3M}, is structureless as far as the three-dimensional conformation is concerned. In this kind of treatment, statistical mechanics cannot tell us anything about “the three-dimensional conformation of a native protein”. Once this is realized, one should stop using such systems and start to look for systems that can answer the question of what the three-dimensional shape of the native structure is.

But many just followed the standard setting of statistical mechanics that successfully treated objects such as the ideal gas. Instead of identifying “the three-dimensional conformation of a native protein”, they shift the problem to that of the share of the native structure in the probability distribution of conformations. This problem is also interesting and important, but it is a different problem and, as mentioned above, its resolution tells us nothing about “the three-dimensional conformation of a native protein”. One has to be careful when making inferences between these two different problems. Ben-Naim’s “pitfall” comes exactly from this misplaced inference: even knowing the correct “minimum distribution” (Ben-Naim’s is wrong) would not help us, not one iota, to know “the three-dimensional conformation of a native protein”.

Our understanding of the thermodynamic principle is that under the physiological environment, for each conformation **X** of the peptide chain of the protein molecule there is a Gibbs free energy *G*(**X**). The native structure has the minimum value of this Gibbs free energy function. The only uncertainty is that the native structure might just correspond to a local minimum of *G*(**X**), as asserted by Levinthal in [

But to answer the question of what “the three dimensional conformation of a native protein” is, as Anfinsen emphasized, we have to make the transition from conformations in ℝ^{3M} to conformations in ℝ^{3}. Based on the three-dimensional geometry of each conformation **X**, a thermodynamic system should be established which, among other particles, contains exactly one protein molecule with the conformation **X**. Then one can apply statistical mechanics, classical or quantum, to get the Gibbs free energy of the system, denoted as *G*(**X**).

Kinetically, in the physiological environment, an individual protein molecule takes an initial conformation with a Gibbs free energy. Driven by the totality of interatomic interactions of the protein molecule (to which we have to add the interaction with its immediate environment), the conformation changes through a series of conformations, each with its own Gibbs free energy. At last the conformation changes to the native structure. The “whole” system is the series of systems in (time) series. Searching for the native structure then becomes the mathematical problem of solving the minimization problem

The solution of (1) will tell us not only the minimum value of *G* (which is not important) but also the minimizing conformation (which is the most important). This is one way to answer the question of what “the three dimensional conformation of a native protein” is, i.e., of making a protein structure prediction.
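Written out, the minimization referred to as (1) can be sketched as follows. The symbol $\mathbf{X}_{N}$ for the native structure and the set $\mathfrak{X}$ of admissible conformations are notational assumptions made for this sketch only.

```latex
\begin{equation*}
  G(\mathbf{X}_{N}) \;=\; \min_{\mathbf{X} \in \mathfrak{X}} G(\mathbf{X}),
  \qquad
  \mathbf{X}_{N} \;=\; \operatorname*{arg\,min}_{\mathbf{X} \in \mathfrak{X}} G(\mathbf{X}).
\end{equation*}
```

The first identity gives the minimal value; the second gives the minimizer, which is the three-dimensional structure being sought.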

So if we want to resolve the protein folding problem (PFP), for any individual conformation **X** we should create a tailored thermodynamic system and derive from it the Gibbs free energy formula *G*(**X**). Given a native protein’s amino acid sequence, searching for the global minimum of *G*(**X**) truly follows the thermodynamic hypothesis as Anfinsen stated it. The inability to derive such a *G*(**X**) does not justify labeling this search a “misinterpretation of the (thermodynamic) hypothesis” [

On the other hand, since the 1990s many techniques for probing individual molecules have been developed, and experimentally observing and testing single molecules is now common practice; see [7,8] for example. Theory should not lag too far behind experiment in the study of single-molecule protein folding.

The thermodynamic system occupies a region in ℝ^{3}. Given **X**, how do we put it into a region of space? And what, actually, is this system? To resolve this we have to use the three-dimensional structure of **X**. Assume that each atom has the shape of a ball with its van der Waals radius; the three-dimensional structure of **X** is then the union of these balls. ℝ^{3} is the real space, or behavior space, while ℝ^{3M} is only the control space of the protein conformation [

To establish the thermodynamic system we need some geometric preparation. Although this may sound too mathematical, it should come as no surprise. In fact, Anfinsen stated as early as 1973 that “biological function appears to be more a correlate of macromolecular geometry than of chemical detail” [

Although the shape of each atom in is well defined by the theory of atoms in molecules [9,10], what concerns us here is the overall shape of the structure. The cutoff of electron density au [9,10] gives an overall shape of a molecular structure that is just like a bunch of overlapping balls. Moreover, the boundary of the au cutoff is almost the same as the molecular surface (

In mathematics, for any closed surface (compact and connected), there are a bounded domain and an unbounded domain such that

Let be the diameter of a water molecule and be the molecular surface of with the probe radius. If is connected, then we can use in (2). If has multiple connected components, , such that is the largest component, i.e., all other components of are contained in. Then denote and. Thus, we always have

Let be the distance from a point to a subset. Define

as our thermodynamic system. While

is the first hydration shell surrounding.
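As a rough illustration of this geometric setup, the sketch below classifies a point of space as lying in the protein body, in a shell of thickness `d_w` around it, or outside. It is a simplification under stated assumptions: the body is taken as the union of van der Waals balls rather than the region bounded by the molecular surface, distances are measured to that union, and the function names and numerical values are hypothetical.

```python
import math

def distance_to_body(point, centers, radii):
    """Distance from `point` to the union of balls (0.0 if inside any ball)."""
    nearest = min(math.dist(point, c) - r for c, r in zip(centers, radii))
    return max(0.0, nearest)

def classify_point(point, centers, radii, d_w):
    """Return 'protein', 'shell' or 'outside' for a point in 3D space.

    Assumption: 'shell' means within distance d_w of the union of balls,
    a stand-in for the first hydration shell described in the text.
    """
    d = distance_to_body(point, centers, radii)
    if d == 0.0:
        return "protein"
    if d <= d_w:
        return "shell"
    return "outside"

# Hypothetical two-atom "molecule": coordinates and radii in angstroms.
centers = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
radii = [1.7, 1.7]
d_w = 2.8  # assumed diameter of a water molecule

for p in [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (10.0, 0.0, 0.0)]:
    print(p, classify_point(p, centers, radii, d_w))
```

A faithful implementation would replace `distance_to_body` by the distance to the molecular surface with the chosen probe radius.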

For simplicity, we only consider single peptide chain, self-folding globular proteins here. Hence in the system, except, there are only water molecules and electrons. We have and all nuclear centers of water molecules in are contained in. Moreover, since is bounded, it has a finite volume.

The thermodynamic system will be an open system, i.e., electrons and water molecules can enter and leave. Therefore, the numbers and, of water molecules and electrons in are variables. According to Anfinsen [

Ben-Naim claims that “in the author’s opinion, the main hindrance to finding a solution to the protein folding problem has been the adherence to the hydrophobic (HOO) dogma, which states that various HOO effects (both solvation and interaction) are the dominant forces in protein folding” and “an exhaustive analysis of all the solvent induced effects on protein folding reveals that the hydrophilic (HOI) effects are much more important than the corresponding HOO effects” [

In [

But whether the driving force of protein folding is HOO or HOI, a common essence is that in a protein there are many different moieties or atom groups with different levels of ability to form hydrogen bonds (hydrophobic levels). Simply classifying amino acids as hydrophobic or hydrophilic is an oversimplification [^{–}, N^{+}, S. If a hydrogen atom is bonded with an atom in, we will put it in.

Let be the subset such that if and only if. Define and as shown in

Let be the volume of, then

Define the hydrophobicity subsurface, , as

Let be the area of a surface, then

In our open thermodynamic system, there will be water molecules in, and electrons in, thus we will denote the variable as a vector

.

After statistical treatment, the mean number of and will be denoted as, ,. Each water molecule in will contact. The chemical potential reflecting the contacting energy will be denoted as. Similarly, the energy for an electron kept in will be the chemical potential. With these preparations, arguing in quantum statistics via the grand canonical ensemble, we derive the Gibbs free energy of protein folding as follows (see Section 7 for the detailed derivation; see also [17,18] for further discussion):

Note that in the folding process, each intermediate structure is not in a stationary state, it is rather a system of quasi-equilibrium states of the folding. So that it is not the case that, as in equilibrium state. Rather, the chemical potentials will be constants during the folding, as the environment is kept unchanged.

Formula (10) is not easy to calculate, but we can convert it into a geometric form that is not only calculable but also coincides with a mathematically derived formula that appeared in [15,19].

Since every water molecule in has contact with the surface and the curvature of is uniformly bounded, is proportional to the area. That is, there are, independent of, such that

Similarly, there will be a, independent of, such that.

By the definition of and, we have roughly. Thus

Substituting (11) and (12) into (10), we get
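The substitution can be sketched as follows. Since the displayed formulas are not reproduced above, all symbols here are assumptions chosen for this sketch: $N_i$ and $N_e$ are the mean numbers of water molecules (in contact with the $i$-th hydrophobicity subsurface) and of electrons, $\mu_i$ and $\mu_e$ the corresponding chemical potentials, $A_i$ the area of the $i$-th subsurface, $A$ the area of the whole molecular surface, $V_P$ the volume it encloses, $V$ the volume of the thermodynamic system, $d_w$ the thickness of the hydration shell, and $\nu_i$, $a$ the proportionality constants. The starting expression $G = \mu_e N_e + \sum_i \mu_i N_i$ is assumed to be the content of (10).

```latex
\begin{align*}
  N_i &= \nu_i A_i, \qquad N_e = a V, \qquad V \approx V_P + d_w A, \\
  G &= \mu_e N_e + \sum_i \mu_i N_i \\
    &\approx a \mu_e V_P + a d_w \mu_e A + \sum_i \nu_i \mu_i A_i .
\end{align*}
```

Under these assumptions the free energy becomes a combination of one volume and several surface areas, all of which are computable from the geometry of the conformation.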

This Gibbs free energy function really should be written as, where is the environment, whose parameters include the temperature and pressure, which affect the values of the chemical potentials and. Since protein folding takes place in a fixed physiological environment, we can omit at this stage.

It should be emphasized here that we assumed the proteins to be single peptide chain, self-folding globular proteins, so that the first hydration shell of contains only water molecules and electrons, with no other components present at all. This Gibbs free energy function is therefore suitable only for these proteins. For other kinds of proteins, the presence of other components such as chaperonins must be considered in the thermodynamic system. Then, the geometry of will become more complicated.

Anfinsen had shown that protein folding is a spontaneous process [

There are, such that for nuclear centers and in,

We will denote all conformations satisfying (14) as. Then the minimization will become:

or, at least, within, corresponds to a local minimum of.

With the steric conditions we avoided the collapsing problem. But the steric conditions turn the minimization problem (1) into a constrained minimization problem (15). Mathematically the latter is much more difficult to solve. To avoid the constraint in minimization for nonbonding atoms, we can use the van der Waals force to modify the formula as:

(16)

where is the corresponding energy and and the ideal distance between the atoms and. Before using to eliminate the constraint of, we take a more convenient coordinate of the conformation. We require that all bond lengths and angles (denoted as one angle-length pattern) are kept as obtained from a conformation, and from calculate the values of all rotatable dihedral angles (including all the main chain, s). In fact, new conformations obtained by changing will keep the same angle-length pattern, and all conformations with the same angle-length pattern as are obtained by choosing suitable values. The function then can be written as

Let have the dihedral angles, then the constraint in (15) will be relaxed and we will have a minimization problem without any constraint:

Ben-Naim correctly emphasizes that protein folding is a cause-based process, “One can imagine that at each stage of the folding process, there are strong solvent-induced forces exerted on the various groups along the protein. These forces will force the protein to fold along a narrow range of pathways...” [

However, with only a “minimum distribution” Ben-Naim cannot tell what is the garden. With formula (13), it is easy to write down mathematical formula of. For example, in the coordinates, the folding force is

Before giving the formula of, we point out that if it is calculable, then we can apply the steepest descent method to pursue the minimum value of. That is, starting from a, the immediate next conformation will be chosen such that

where is a suitable step length. When is small, it is guaranteed that. Any (local) minimum would have that.
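The descent iteration can be sketched as below. The function `toy_gibbs` is a hypothetical smooth stand-in for the free energy in dihedral coordinates, not the formula of this article, and the gradient is approximated by central differences rather than by the analytic formula.

```python
import math

def toy_gibbs(theta):
    """Hypothetical stand-in for G(theta); local minimum at (0.5, -1.0)."""
    return (1.0 - math.cos(theta[0] - 0.5)) + (1.0 - math.cos(theta[1] + 1.0))

def numerical_gradient(f, theta, eps=1e-6):
    """Central-difference approximation of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[i] += eps
        dn[i] -= eps
        grad.append((f(up) - f(dn)) / (2.0 * eps))
    return grad

def steepest_descent(f, theta0, step=0.1, tol=1e-6, max_iter=10000):
    """Repeat theta <- theta - step * grad f(theta) until the gradient is small."""
    theta = list(theta0)
    for _ in range(max_iter):
        g = numerical_gradient(f, theta)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break  # a (local) minimum has vanishing gradient
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta

start = [0.0, 0.0]
result = steepest_descent(toy_gibbs, start)
print(result, toy_gibbs(result))
```

A fixed step length is used for brevity; a line search or an adaptive step would be the more careful choice when the energy surface is rugged.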

We will give the analytic formula of here without mathematical proof. It is:

(21)

It should be mentioned here that bond in is rotatable if it is a single bond and if we cut this bond, all nuclear centers in can be divided into two (nonempty) groups, such that we can fix one group and rotate around the bond axis the other group. Let be the outer product in. Let be the bond, then will be the rotation axis and the rotation vector field, i.e., if is a rotated nuclear center; and if is a fixed nuclear center. Furthermore,

and

where and are the outer unit normal and the mean curvature of, and the Hausdorff measures of dimensions 2 and 1. Let be the family of conformations such that and,. Define as, and denote

Finally, if is rotated and is fixed, then

if and are both rotated or both fixed, then we have.
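The rotation of one atom group about a rotatable bond, and the associated rotation vector field, can be sketched as below. The function names are hypothetical, the bond is given by its two end points `b0` and `b1`, the sign convention of the field is an assumption, and the finite rotation uses the Rodrigues formula, which is a standard choice rather than something prescribed in the text.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def _scale(s, u):
    return tuple(s * a for a in u)

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit_axis(b0, b1):
    """Unit vector along the bond from b0 to b1 (the rotation axis)."""
    d = _sub(b1, b0)
    return _scale(1.0 / math.sqrt(_dot(d, d)), d)

def rotation_field(x, b0, b1, rotated):
    """Velocity of nuclear center x per unit change of the dihedral angle.

    Zero for a fixed center, axis x (x - b0) for a rotated one.
    """
    if not rotated:
        return (0.0, 0.0, 0.0)
    return _cross(unit_axis(b0, b1), _sub(x, b0))

def rotate_about_bond(x, b0, b1, angle):
    """Rotate point x by `angle` (radians) about the bond axis (Rodrigues)."""
    k = unit_axis(b0, b1)
    v = _sub(x, b0)
    c, s = math.cos(angle), math.sin(angle)
    rotated = _add(_add(_scale(c, v), _scale(s, _cross(k, v))),
                   _scale((1.0 - c) * _dot(k, v), k))
    return _add(b0, rotated)

b0, b1 = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
print(rotate_about_bond((1.0, 0.0, 0.0), b0, b1, math.pi / 2))
print(rotation_field((1.0, 0.0, 0.0), b0, b1, rotated=True))
```

The vector field is the derivative of the rotated position with respect to the angle at angle zero, which is why it enters the derivative of the free energy with respect to a dihedral angle.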

The integrations of the above formulae over the molecular surface are given in [

Ben-Naim’s pitfall of the “misinterpretation of the thermodynamic hypothesis” is dismissed as a Don Quixote’s windmill by demonstrating the existence of the Gibbs free energy formulas (10) and (13), the pursuit of which Ben-Naim claimed to be falling into a pitfall. The formulae require a detailed geometric formulation of the thermodynamic system to present them, which is a realization of Anfinsen’s insight that “biological function appears to be more a correlate of macromolecular geometry than of chemical detail” [

In Section 7, the quantum statistical derivation of formula (10) is given; the conversion of (10) to (13) is demonstrated in Section 3.2.

Ben-Naim’s minimization at is analyzed and dismissed because it predicts that at equilibrium every possible conformation will have the same probability of being the structure of a native protein. That is, Ben-Naim claims that for any conformation. On the contrary, in the physiological environment the native structure is dominant.

The reason why calculable formulas such as (10) and (13) have not appeared so far is discussed: blindly imitating successful classical examples of applying statistical mechanics and ignoring Anfinsen’s insight are the two main reasons.

The force that drives the protein to fold is identified as −∇*G*(**X**) by general physical law, as Ben-Naim has correctly pointed out. A calculable formula for ∇*G*(**X**) is given.

For any conformation, let be the nuclear centers of oxygen atoms in water molecules in and be electronic positions of all electrons in. Then the Hamiltonian for the system is:

where is the nuclear mass of atom in, and the masses of water molecule and electron, the Laplacian in corresponding, and V the potential.

Depending on the shape of, for each, , the maximum numbers of water molecules contained in vary. Theoretically we consider all cases, i.e., there are water molecules in,. Let and and, , and denote the nuclear positions of water molecules in. Likewise, there will be all possible numbers of electrons in. Let denote their positions. For each fixed and, the Born-Oppenheimer approximation has the Hamiltonian

The eigenfunctions, , comprise an orthonormal basis of. Denote their eigenvalues (energy levels) as, then

.

In the following we will use the notations and definitions in [21, Chapter 10]. Let be the Boltzmann constant, set. Since the numbers and vary, we should adopt the grand canonical ensemble. Let be the chemical potentials, that is, the Gibbs free energy per water molecule in. Let be the electron chemical potential. The grand canonical density operator is ([21, 22])

where the grand partition function is

According to [21, p. 273], under the grand canonic ensemble the entropy of the system is

(27)

Here we denote the mean numbers of water molecules in, , and the mean number of electrons in. The internal energy of the system is denoted as:

.

The term is a state function with variables, and, and is called the grand canonic potential ([21, p. 27]) or the thermodynamic potential ([22, p. 33]). By the general thermodynamic equations [22, pp. 5-6]:

we see that

where is the volume of the thermodynamic system. Thus by (27) we obtain the Gibbs free energy in (10):
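The chain of identities behind this step can be sketched as follows, using standard grand canonical relations. The symbols are assumptions made for this sketch: $U$ is the internal energy, $S$ the entropy, $T$ the temperature, $P$ the pressure, $V$ the volume, $k$ the Boltzmann constant, $\Xi$ the grand partition function, $N_i$ and $N_e$ the mean numbers of water molecules and electrons, and $\mu_i$, $\mu_e$ their chemical potentials.

```latex
\begin{align*}
  S &= k \ln \Xi + \frac{1}{T}\Bigl(U - \sum_i \mu_i N_i - \mu_e N_e\Bigr), \\
  -kT \ln \Xi &= U - TS - \sum_i \mu_i N_i - \mu_e N_e \;=\; -PV, \\
  G &= U - TS + PV \;=\; \sum_i \mu_i N_i + \mu_e N_e .
\end{align*}
```

The first line is the grand canonical entropy, the second identifies the grand canonical potential with $-PV$, and the third combines them with the definition of the Gibbs free energy.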