Journal of Modern Physics
Vol. 4  No. 10 (2013) , Article ID: 38865 , 7 pages DOI:10.4236/jmp.2013.410167

Entropy Rate of Thermal Diffusion

John Laurence Haller Jr.

Predictive Analytics, CCC Information Services, Chicago, USA


Copyright © 2013 John Laurence Haller Jr. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received July 13, 2013; revised August 15, 2013; accepted September 18, 2013

Keywords: Entropy Rate; Entropy; Information; Diffusion; Temperature


The thermal diffusion of a free particle is a random process and generates entropy at a rate equal to twice the particle’s temperature, (in natural units of information per second). The rate is calculated using a Gaussian process with a variance of which is a combination of quantum and classical diffusion. The solution to the quantum diffusion of a free particle is derived from the equation for kinetic energy and its associated imaginary diffusion constant; a real diffusion constant (representing classical diffusion) is shown to be . We find the entropy of the initial state is one natural unit, which is the same amount of entropy the process generates after the de-coherence time,.

1. Introduction

When a free particle is at a non-zero temperature, it is composed of a spectrum of frequencies that evolve at different rates which causes the probability distribution of where one can find the particle to spread. We will show that the entropy rate, associated with the probability distribution diffusing, is equal to twice the particle’s temperature.


The rate, R, is calculated below using the natural logarithm, and thus the units for the rate are natural units of information per second, when the temperature (T) is expressed in degrees Kelvin, Boltzmann’s constant  is expressed in Joules per Kelvin, and hbar (ħ) is Planck’s constant divided by 2π in Joule-seconds.

This equation tells us the minimum amount of information we need, each second, in order to track a diffusing free particle to the highest precision that nature requires. By quantifying the amount of information needed to follow a free particle for a certain time, and showing it is finite, we are able to guarantee that a computer (or other discrete state-space machine with finite memory) can store a particle’s initial state and trajectory.

What is unique about this result is that there is no dependence on the mass of the particle or any other variable except the temperature.

2. Assumptions

We prove this primary result by making the following three assumptions:

1) The continuous diffusion of a free particle can be modeled as a discrete process with a time step that is much smaller than the de-coherence time ,

where T is the temperature.

2) Knowing the particle’s location at time step n+1 allows one to determine the location of the particle at the previous time step n; i.e., conditional entropy is zero,

where is the random variable that represents where the particle can be found at time step n.

3) At each time step the minimum uncertainty wavepacket is localized around its new location, and thus the conditional entropy of the  step, given all previous steps, is the same as the conditional entropy of the 1st step given its initial state,


These three assumptions taken together are reasonable and give insight into the behavior of the system.

Assumption 1 is aided by the analysis found in [1] which shows the time step of discrete diffusion is

and thus for non-relativistic particles, the assumption holds.

Assumption 2 says that there is no entropy beyond the minimum uncertainty wave-packet after a measurement of the particle’s location was made.

Assumption 3 says that the vacuum localizes the diffusing particle up to the minimum uncertainty wavepacket at each step in the process. Even though an undisrupted particle’s wave packet will spread, at the vacuum level the particle is re-initialized at each step, like in quantum nondemolition measurements [2].

3. Setup

At , a free particle in vacuum is initialized into a minimum uncertainty Gaussian wave-packet with a spatial variance equal to . As time increases so does its variance and thus its entropy.

To calculate the entropy rate of this process, it is helpful to think of time as occurring in discrete units of a small size dt (assumption 1).

We can look at a Venn diagram of this process, figure 1. (or X0 in the figure) is a random variable, drawn from , that describes the location of where the particle can be found at time . (X1) is a random variable, drawn from , that describes the location of where the particle can be found at time .  (X2) is drawn from  and so on up to  which is drawn from where .

As hinted to in the diagram (but explicitly stated here as assumption 2 and assumption 3), we will assume that the conditional entropy of each step is constant;

and , where h is the differential entropy  and where  is the distribution which determines . This essentially means that knowing the location of the particle at any time allows one to calculate where it was in the previous time step and that the minimum uncertainty wave-packet maintains its coherence as its first moment (or average value) diffuses via a process with a variance as given by equation (2).

In Section 5, we show that as time increases, a free particle diffuses such that the variance of where the particle can be found (if localized) is


Thus  (or simply ) is a Gaussian random variable with variance


4. Entropy Rate

We can calculate the entropy rate of this process using the definition of the entropy rate. We will use the entropy rate, R, as calculated by taking the limit as the number of steps goes to infinity of the conditional entropy of the last step given all previous steps divided by the time step [3].


To solve for R, we first notice that since

(assumption 2) we can show by induction that


Due to the symmetric nature of mutual information, we can prove the equation below [2].


Bringing us to the equation for R below


Next, we use assumption 3 to re-write the difference in entropy at time step n and n-1 as equal to the difference in entropy at time step 1 and the initial state.


Since the Xn’s are Gaussian, we can easily calculate the differential entropy of each step using equation (2) and the differential entropy of the Gaussian distribution [2] (see Equation (8) below)


Figure 1. Venn diagram of the conditional entropies of the diffusion process.


Using equations (31) and (32) this is re-written


We are assured by assumption 1 that. Thus, we can Taylor expand the logarithm giving the first term plus the terms that are O(dt) or smaller.


Ignoring the terms of O(dt) or smaller, we get our primary result


The other method to calculate the entropy rate is , which equals the limit as n goes to infinity of the entropy of all the Xn’s divided by n times dt [3]. Since we are looking at the rate of generation of the entropy (not the initial conditions), we subtract the entropy of the initial state h(X0). This also assures that R is in the correct units.


Since (assumption 2), we know that



We can insert zero into the limit




Assumption 3 now lets us rewrite this as


We see that


We can safely conclude that


In this view, the temperature acts as an average energy and generates information (or entropy) at a rate equal to twice the average energy divided by ħ.

5. The Variance of Xn

Given the wave particle duality, which states that a free particle is both a wave and a particle, we see that our free particle undergoes both quantum mechanical diffusion of the wave and classical diffusion of the particle.

Introducing , ,  and  makes this more clear. is a random variable drawn from

the probability distribution associated with the quantum mechanical wave-function, which is the solution to the quantum diffusion equation, equation (33). is a random variable drawn from  and is the solution to real diffusion equation, equation (42).

If  were an observation of where the particle is located, it would be the sum of a sample  drawn from  and the uncorrelated sample , drawn from .


Thus the action of  is to translate the center of the wavefunction, , by a sample of .

As we know from probability theory, the resulting distribution, is equal to the convolution of  and  over the x variable (30) [4].


Since both  and  are Gaussian distributions, it is easy to show that the convolution of the two is again a Gaussian distribution with an expected value being equal to the sum of the two expected values (which in this case is zero) and a variance that is equal to the sum of the variances of the individual distributions.



Shown in Equation (40) the variance of is .


In this equation t is the amount of time that has passed since the particle was initialized in the minimum uncertainty state,  is the standard deviation of the minimum uncertainty state,  is the standard deviation of the minimum uncertainty state in the momentum domain and m is the mass of the particle.

Shown in Equation (48), the variance of is.


Thus we get .


Inserting into the last term the Heisenberg Uncertainty principle (32), , we can group.


To understand the model, it is helpful to look at Equation (26). is the sum of three variances. The first is from the Heisenberg Uncertainty Principle of the initialized state, the second is from the thermal drift of the center of the minimum uncertainty wavepacket moving with a group momentum taken as a sample of the momentum domain, and the third is from the classical diffusion of the center of the wave-function on top of the other two.

It is also possible to derive Equation (27) by assuming no force on the particle, which lets you deduce


Squaring and taking the ensemble average is all you need [5].

6. The Imaginary Diffusion Equation

The Kinetic Energy Hamiltonian characterizes the wave packet of a free particle in one dimension, where H is the Hamiltonian, p is the momentum along the x direction, and m is the mass of the particle [6].


Given that the momentum commutes with the Hamiltonian,

each eigenvalue of the momentum is a constant of motion and thus the variance in momentum space does not grow with time. It is possible to learn the width of the variance of the momentum by looking at the equipartition of energy [7]. Using the equipartition of energy we know to equate the degree of freedom associated with the average Kinetic Energy to one half the temperature times Boltzmann’s constant.


Since we will assume that the average momentum is zero, we can solve for the variance of the momentum.



Also from the Heisenberg Uncertainty Principal, we can solve for the standard deviation of the wave-function in the spatial domain in terms of its width in the momentum space.


With these dependencies stated, we can move onto the imaginary diffusion equation, which takes the original Hamiltonian and rewrites it in terms of operators.  Interpreting the Hamiltonian as the imaginary time derivative operator and the momentum as the negative imaginary spatial derivative operator we can take equation (28) and arrive at the imaginary diffusion equation


Don’t forget that we still have the eigenvalue Equations (34), (35) where H and p are the operators and ω and k are the eigenvalues.



We can calculate the different eigenvalues, and , through equations (28), (34), (35) and as we should expect arrive at the equation for kinetic energy.


To solve equation (33), we will begin in the momentum domain  and take the inverse Fourier Transform to observe how  evolves over time [8]. We use  (the wavenumber divided by ) as the independent variable because we want both  and  to be normalizable to one.


Our assumption that the wave-function of the free particle in the momentum space is a Gaussian wave-packet is quite reasonable given the nice properties of the Gaussian. Similarly, this assumption is already implicit in the equipartition of energy which was used to find the width of the initial wave-packet. Because the equipartition theorem is derived from the perfect gas law (where particles are modeled using the binomial distribution, of which the Gaussian is the limit), the Gaussian is the right distribution to start with.

To properly account for the evolution of  governed by equation (33),  is used as the kernel for the inverse Fourier Transform.


Using equation (36) to substitute in for ω you can solve for equation (38) by completing the squares to get  [8]. is in Gaussian form; to calculate the variance, we need to take the magnitude squared of the wave-function and get the distribution of the particle.






This is, of course, the well know result from quantum mechanics where the variance of the particle is the sum of the initial variance from the Heisenberg Uncertainty Principal and the associated variance of the momentum domain imparting a thermal group velocity  [9].

7. The Real Diffusion Equation

When the diffusion constant of a diffusion process is real and does not vary with position, the resulting diffusion equation is as below [10].


Of course the solution to this real diffusion equation is the Gaussian with variance equal to 2Dt [11].



To find D, we will start with the imaginary diffusion operator and using analytical continuation, perform a Minkowski transformation [6]. The imaginary diffusion operator (33) is


Upon applying the Minkowski transformation, imaginary time is replaced with real time, [12]. Applied on the imaginary diffusion operator, the Minkowski transformation brings out the real diffusion constant we are looking for.


By observation we see that


We can also derive  from kinematic arguments as was shown in [1]. We can calculate the variance of f(x,t).


8. Entropy at

It is important to ask the entropy of the initial state. We find that at  the entropy is 1 natural unit. Since the wavefunction and associated probability distribution are continuous, we calculate the entropy using the equation for differential entropy. One might object that the differential entropy is only accurate up to a scale factor. However I argue (and so did Hirshman [13]) that if you add the differential entropy in the dual domain, the scale factor cancels out because of the scale property of the Fourier Transform and the result is an absolute measure. As before we will use the position and the wavenumber divided by  as the dual domains.


Hirschman [13] showed that this entropy for any wavefunction is  and  when the wavefunction is Gaussian.

While we are working with a Gaussian initial state, the answer appears to be a little more complex than just . We learn from [1] that when solving for the quantum and relativistic length scales of dark particles, particles come in pairs. With only one particle and no reference frame there is no way of knowing the position or the momentum, even if there were universal measuring sticks.

We get around this with two particles and a measuring stick/clock by determining the relative displacement and speed. Thus we need to look at the entropy of the relative difference of position and momentum of the two particles.


as the probability distribution on the location of particle 1 at and similarly for  for particle 2. For the momentum space define

as the probability distribution on the wavenumber divided by  for the first particle and  for the second particle.

The probability distribution on the relative displacement and wavenumber, and are and , respectively, and will be Gaussian assuming both the reference particle and the initial particle have Gaussian wave-functions. Since differential entropy is invariant to the first moment we can assume without loss of generality, the first moment of the reference wave-function is zero. The second moment of  and  will be the sum of the respective second moments of the particle and reference particle if the two are not correlated.

We can go even further and show that the reference particle should have the same second moments as the particle we are measuring if we minimize the entropy. Thus, we arrive at the distributions for both domains for  and



Thus, the total absolute entropy of the initial state, , is




There are 2 things of note relative to the rate, , calculated above. First we see that since the rate, , from above, is the difference between the entropy at two times, the impact of the wider distribution of  vs. is negated. Thus we could have done the analysis above using  and  instead  and and the result would be the same. Second, we see that the entropy of the initial state is equal to the additional entropy generated by the diffusion process during the de-coherence time,

9. Conclusions

We have seen that by making three assumptions about the thermal diffusion of a free particle, we are able to show that entropy is generated at a rate equal to twice the particle’s temperature (when expressed in the correct units).

This result will be applicable to all studies on free particles and other environments that are governed by similar equations. Also a myriad of applications exist in computer modeling, including but not limited to the following: finite difference time domain methods, Block’s equations for nuclear magnetic resonance imaging, and plasma and semiconductor physics.

To check the primary result, one would perform a quantum non-demolition measurement on the quantum state of an ensemble of free particles. The minimum bit rate needed to describe the resulting string of numbers that describe the trajectory would be the entropy rate and should be equal to twice the temperature.

However, even before an experiment can be conducted, this result is useful by suggesting the use of different information theoretical techniques to examine problems with de-coherence and might give a different perspective on the meaning of temperature.

This result is interesting as a stand-alone data point, that the entropy rate is equal to twice the temperature. However, if we could go further and more generally say that temperature is the same as entropy rate, it would change the way we view temperature and entropy.

10. Acknowledgements

JLH thanks Thomas Cover for sharing his passion for the Elements of Information Theory, and McKinsey & Co. for the amazing environment where this article was written.


  1. J. Haller Jr., Journal of Modern Physics, Vol. 4 2013, pp. 85-95.
  2. Yamamoto and Imamoglu, “Mesoscopic Quantum Optics,” John Wiley & Sons, New York, 1999.
  3. Cover and Thomas, “Elements of Information Theory,” John Wiley & Sons, New York, 1991.
  4. Bracewell, “The Fourier Transform and Its Applications,” 2nd Edition, McGraw Hill, New York, 1986.
  5. C. W. Gardiner and P. Zoller, “Quantum Noise,” Springer, Berlin, 2004.
  6. Shankar, “Principles of Quantum Mechanics,” Plenum Press, New York, 1994.
  7. Feynman, “Lectures on Physics,” Addison-Wesley Publishing, Reading, 1965.
  8. Bohm, “Quantum Theory,” Dover Publications, Mineola, 1989.
  9. Shankar, “Principles of Quantum Mechanics,” Plenum Press, New York, 1994.
  10. Bittencourt, “Fundamentals of Plasma Physics,” 2nd Edition, Sao Jose dos Campos, 1995.
  11. Einstein, “Investigation on the Theory of, the Brownian Movement,” Translated by Cowper, Dover, 1956.
  12. Einstein, “The Meaning of Relativity,” 5th Edition, Princeton University Press, Princeton, 1956.
  13. I. I. Hirshman, American Journal of Mathematics, Vol. 79, 1957, p. 152.