Risk-neutral pricing of European call options is investigated from a mathematical point-of-view and is found to be a specious concept
1. Risk-neutral pricing of European call options is an approximation in which all terms of order
The concept of risk-neutral pricing of European call options is investigated from a mathematical approach. It is found that risk-neutral pricing used in the pricing of European call options is a specious concept [
First some notation and background information. Let the probability density function (pdf) for a random variable $\mathbf{x}$ be $f_{\mathbf{x}}(x)$. The probability that a measurement of the random variable $\mathbf{x}$ takes a value between $x$ and $x + dx$, $P\{x < \mathbf{x} \le x + dx\}$, is given by $f_{\mathbf{x}}(x)\,dx$. The cumulative distribution function (CDF) is $F_{\mathbf{x}}(a) = P\{\mathbf{x} \le a\} = \int_{-\infty}^{a} f_{\mathbf{x}}(\xi)\,d\xi$.
Let $S_t$ be the value of an asset at time $t$. Let the 1-day return be $R_t = \ln(S_t/S_{t-1})$ and the $n$-day return be $R_{t,n} = \ln(S_t/S_{t-n})$. Note that
$$R_{t,n} = \ln\left(\frac{S_t}{S_{t-n}}\right) = \ln\left(\frac{S_t}{S_{t-1}} \times \frac{S_{t-1}}{S_{t-2}} \times \cdots \times \frac{S_{t-n+1}}{S_{t-n}}\right) = R_t + R_{t-1} + R_{t-2} + \cdots + R_{t-n+1} \tag{1}$$
and, for simplicity in notation, that $R_t = R_{t,1}$.
When returns over non-overlapping time periods are independent, the pdf for the $n$-day return is the $n$-fold convolution of the pdf of the 1-day return, since the $n$-day return is the sum of $n$ independent 1-day returns. Variances add under convolution, and hence the variance of $n$-day returns is $n$ times the variance that describes the distribution of the 1-day returns. Since the normal distribution is stable under self-convolution, if the 1-day return distribution is a normal distribution with a mean $\mu$ and a variance $\sigma^2$, then the pdf for an $n$-day return is a normal distribution with a mean $= n\mu$ and with a variance $= n\sigma^2$ [
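The additivity of means and variances under convolution can be checked with a small simulation. The sketch below is only illustrative: the function name `simulate_n_day_returns` and the values of `n`, `mu`, `sigma`, and the number of trials are assumptions chosen for the example, not values taken from the text.

```python
import random
import statistics

def simulate_n_day_returns(n, mu, sigma, trials, seed=0):
    """Draw `trials` n-day returns; each is the sum of n independent
    N(mu, sigma^2) 1-day returns."""
    rng = random.Random(seed)
    return [sum(rng.gauss(mu, sigma) for _ in range(n)) for _ in range(trials)]

# Illustrative (assumed) daily parameters.
n, mu, sigma = 20, 0.0005, 0.01
returns = simulate_n_day_returns(n, mu, sigma, trials=20000)

sample_mean = statistics.fmean(returns)
sample_var = statistics.variance(returns)

print(f"mean:     sample {sample_mean:.5f}   theory n*mu      = {n * mu:.5f}")
print(f"variance: sample {sample_var:.6f}  theory n*sigma^2 = {n * sigma**2:.6f}")
```

The sample mean and variance should agree with $n\mu$ and $n\sigma^2$ to within sampling error.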
The returns of assets are better described by Student’s t-distributions than by normal distributions. However, Student’s t-distributions are not stable under self-convolution and have fat tails. The fat tails lead to integrals that diverge and these integrals are needed to price options. Both of these characteristics make pricing with Student’s t-distributions difficult [
Following [
The expression for C T follows from the arbitrage theorem [
The Black-Scholes option pricing formula gives prices for European call options and is obtained under the constraints of: 1) no arbitrage; 2) the price of an asset is described by a geometric Brownian motion process with support [ − ∞ , + ∞ ] ; 3) risk-neutral pricing; and, 4) the future price is a martingale. Constraint 1) requires, from the arbitrage theorem [
$$E\{S_t\} = S_0 \exp\left(\mu t + \sigma^2 t/2\right) \tag{2}$$
for the time origin chosen such that $t = n$.
Since by assumption $S_t$ is a geometric Brownian motion process and since the distribution of $R_{t,n}$ is an $n$-fold convolution of the 1-day return distribution, the constraint $E\{S_t\} = \exp(rt)\,S_0$ specifies the drift rate for $S_t$, and the $n$-fold convolution dictates the shape and the variance of the distribution of $R_{t,n}$, which then dictates the shape and variance of $S_t$. From Equation (2) and constraints 3) and 4), $S_t$ must have a drift rate $\mu = r - \sigma^2/2$, and from the $n$-fold convolution, $R_{t,n}$ must be normally distributed (by assumption the 1-day return is normally distributed) with a variance of $\sigma^2 n$, and $\sigma^2 n = \sigma^2 t$ when the time origin is chosen such that $t = n$.
The value of a stock is expected to drift at the rate of $\alpha = r + (\alpha - r)$, where $\alpha - r$ is called the risk premium and $r$ is the risk-free rate.
Assume that $S_T = S_0 \exp(\xi)$, where $\xi = R_{T,T}$ is the $T$-day return, and that the $T$-day return is normally distributed with mean $\alpha' T$ and variance $\sigma^2 T$. Thus $E\{S_T\} = \exp(\alpha T)\,S_0$, where $\alpha = \alpha' + \sigma^2/2$.
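The relation $\alpha = \alpha' + \sigma^2/2$ between the mean of the log return and the growth rate of the mean price can be verified by direct numerical integration. The following is a minimal sketch; the helper name `lognormal_mean_numeric` and all parameter values are assumptions made for illustration.

```python
import math

def lognormal_mean_numeric(s0, alpha_prime, sigma, T, steps=20000, width=10.0):
    """Trapezoid-rule estimate of E{S0 * exp(xi)} for
    xi ~ N(alpha_prime * T, sigma^2 * T), integrated over +/- `width` std devs."""
    m = alpha_prime * T
    sd = sigma * math.sqrt(T)
    lo, hi = m - width * sd, m + width * sd
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        xi = lo + i * h
        pdf = math.exp(-0.5 * ((xi - m) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * s0 * math.exp(xi) * pdf
    return total * h

# Illustrative (assumed) values in per-day units.
s0, alpha_prime, sigma, T = 50.0, 0.0003, 0.01, 30
alpha = alpha_prime + 0.5 * sigma**2

numeric = lognormal_mean_numeric(s0, alpha_prime, sigma, T)
closed_form = s0 * math.exp(alpha * T)

print(f"numerical E{{S_T}} = {numeric:.6f}")
print(f"S0*exp(alpha*T)   = {closed_form:.6f}")
```

The mean price grows at $\alpha$, not at the mean log-return rate $\alpha'$; the difference is the $\sigma^2/2$ term.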
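The price of a European call option with strike price $K_T$ is then [
$$C_0 = S_0\,e^{(\alpha - r)T} \int_{\ln(K_T/S_0)}^{\infty} \exp\left(-\frac{\left(\xi - (\alpha + 0.5\sigma^2)T\right)^2}{2\sigma^2 T}\right) \frac{d\xi}{\sigma\sqrt{2\pi T}} - e^{-rT} K_T \int_{\ln(K_T/S_0)}^{\infty} \exp\left(-\frac{\left(\xi - (\alpha - 0.5\sigma^2)T\right)^2}{2\sigma^2 T}\right) \frac{d\xi}{\sigma\sqrt{2\pi T}}. \tag{3}$$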
The difference between the Black-Scholes option pricing formula and Equation (3) rests in the $\alpha$ in Equation (3). If one sets $\alpha = r$ in Equation (3), then one obtains the Black-Scholes formula. The Black-Scholes formula is obtained by using the concept of risk-neutral pricing, wherein one essentially sets the risk premium, $\alpha - r$, equal to zero. Mathematically this approach cannot be correct. Setting $\alpha - r = 0$ violates the no-arbitrage condition that $C_T = E\{(S_T - K_T)^+\}$. The distribution for $S_t$ is centred about $S_0 \exp(\alpha t)$ (i.e., the mean of the pdf for the $n$-day return is $\alpha' n = (\alpha - \sigma^2/2)\,n$). Arbitrarily setting the mean of the distribution will change the value of the expectation $E\{(S_T - K_T)^+\}$ and hence misprice European call options. In general, for small $T$, $(\alpha - r)T \ll \sigma\sqrt{T}$ and the error introduced by arbitrarily setting the mean of the distribution will be small. Application of Girsanov’s theorem in risk-neutral pricing is discussed in Sec. 3.
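Each integral in Equation (3) is an upper-tail probability of a normal distribution, so the equation can be evaluated in closed form with the error function. The sketch below does this and compares the result at $\alpha = r$ with the textbook Black-Scholes expression. The function names and the numerical values ($S_0 = K_T = 50$, $T = 30$ days, daily $r$ and $\sigma$) are assumptions for illustration only and are not claimed to be the parameters behind any table in the text.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_price(s0, k, T, r, sigma, alpha):
    """Equation (3) in closed form: the discounted E{(S_T - K_T)^+} when the
    T-day return is N((alpha - sigma^2/2) T, sigma^2 T)."""
    sd = sigma * math.sqrt(T)
    log_strike = math.log(k / s0)
    # Upper-tail probabilities corresponding to the two integrals of Eq. (3).
    tail_1 = 1.0 - norm_cdf((log_strike - (alpha + 0.5 * sigma**2) * T) / sd)
    tail_2 = 1.0 - norm_cdf((log_strike - (alpha - 0.5 * sigma**2) * T) / sd)
    return s0 * math.exp((alpha - r) * T) * tail_1 - math.exp(-r * T) * k * tail_2

def black_scholes_call(s0, k, T, r, sigma):
    """Textbook Black-Scholes call price, for comparison."""
    sd = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / sd
    d_minus = d_plus - sd
    return s0 * norm_cdf(d_plus) - k * math.exp(-r * T) * norm_cdf(d_minus)

# Illustrative (assumed) values in per-day units.
s0, k, T, r, sigma = 50.0, 50.0, 30.0, 1e-4, 0.01

price_risk_neutral = call_price(s0, k, T, r, sigma, alpha=r)
price_bs = black_scholes_call(s0, k, T, r, sigma)
price_with_premium = call_price(s0, k, T, r, sigma, alpha=2 * r)

print(f"Eq. (3) with alpha = r : {price_risk_neutral:.4f}")
print(f"Black-Scholes          : {price_bs:.4f}")
print(f"Eq. (3) with alpha = 2r: {price_with_premium:.4f}")
```

With $\alpha = r$ the two functions agree to rounding error, and a positive risk premium raises the value returned by Equation (3).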
A series expansion in $\alpha$ of $C_0$ about the risk-neutral value $\alpha = r$ shows that risk-neutral pricing underestimates the price of a call option when $\alpha > r$. $C_0|_{\alpha}$ is the cost at time $t = 0$ of a European call option, whereas $C_0|_{\alpha = r}$ is the cost of a European call option using risk-neutral pricing (i.e., using the same assumptions that yield the Black-Scholes option pricing formula).
$$C_0|_{\alpha} = C_0|_{\alpha = r} + \left(S_0\left(f(d_1) + 1 - F(d_1)\right) - e^{-rT} K_T\,f(d_2)\right)(\alpha - r)\,T + O\left((\alpha - r)^2 T^2\right) \tag{4}$$
with
$$d_1 = \frac{\ln(K_T/S_0) - (r + \sigma^2/2)\,T}{\sqrt{T}} \tag{5}$$
$$d_2 = d_1 + \sigma^2\sqrt{T} \tag{6}$$
and $f(a)$ and $F(a)$ the pdf and CDF for a normal distribution with a mean of zero and a variance of $\sigma^2$. See Equation (8) for definitions of $f(a)$ and $F(a)$.
$C_0|_{\alpha = r} < C_0|_{\alpha}$ when $\alpha > r$ since the expansion is only valid for $S_T > K_T$, which follows from the definition $C_T = E\{(S_T - K_T)^+\}$. Thus risk-neutral pricing underestimates the value of the call option and gives, on average, an advantage to the option buyer when $\alpha > r$. The converse also holds: the seller has, on average, an advantage under risk-neutral pricing when $\alpha < r$. Neither party has, on average, an advantage when the true value of $\alpha$ is used to price an option. The best estimate of a true value is typically the sample mean.
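The first-order term of Equation (4) can be compared with the exact change in price obtained from the closed form of Equation (3). The sketch below is a numerical check under assumed, illustrative parameters; the helper names `pdf`, `cdf`, and `call_price` are introduced here and are not from the text.

```python
import math

def pdf(a, sigma):
    """f(a): zero-mean normal pdf with variance sigma^2, as in Eq. (8)."""
    return math.exp(-a * a / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def cdf(a, sigma):
    """F(a): zero-mean normal CDF with variance sigma^2, as in Eq. (8)."""
    return 0.5 * (1.0 + math.erf(a / (sigma * math.sqrt(2.0))))

def call_price(s0, k, T, r, sigma, alpha):
    """Closed form of Eq. (3), written with the variance-sigma^2 CDF F."""
    root_T = math.sqrt(T)
    a1 = (math.log(k / s0) - (alpha + 0.5 * sigma**2) * T) / root_T
    a2 = a1 + sigma**2 * root_T
    return (s0 * math.exp((alpha - r) * T) * (1.0 - cdf(a1, sigma))
            - math.exp(-r * T) * k * (1.0 - cdf(a2, sigma)))

# Illustrative (assumed) values in per-day units.
s0, k, T, r, sigma = 50.0, 50.0, 30.0, 1e-4, 0.01
alpha = 2e-4

root_T = math.sqrt(T)
d1 = (math.log(k / s0) - (r + 0.5 * sigma**2) * T) / root_T   # Eq. (5)
d2 = d1 + sigma**2 * root_T                                   # Eq. (6)

base = call_price(s0, k, T, r, sigma, alpha=r)
first_order = (s0 * (pdf(d1, sigma) + 1.0 - cdf(d1, sigma))
               - math.exp(-r * T) * k * pdf(d2, sigma)) * (alpha - r) * T
exact = call_price(s0, k, T, r, sigma, alpha=alpha)
residual = exact - (base + first_order)

print(f"risk-neutral price   : {base:.5f}")
print(f"first-order term     : {first_order:.5f}")
print(f"exact price at alpha : {exact:.5f}")
print(f"residual (2nd order) : {residual:.5f}")
```

For these definitions of $d_1$ and $d_2$, $S_0 f(d_1) = e^{-rT} K_T f(d_2)$, so the correction reduces to $S_0\,(1 - F(d_1))(\alpha - r)T$, which is positive whenever $\alpha > r$.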
The “success” of risk-neutral pricing owes to the fact that the magnitude of the random fluctuations is typically significantly greater than the magnitude of the risk premium, i.e., $\sigma\sqrt{T} \gg |(\alpha - r)T|$, and thus the random fluctuations obscure the drift. Presumably fluctuations about the mean value are described well by the shape and scale parameters of the distribution and one should use the best available estimate of the location parameter of the distribution (i.e., the mean drift rate, $\alpha$) to price an option.

$K_T$ | $\alpha = r/2$ | $\alpha = r$ | $\alpha = 4r$ | $\alpha = 8r$ |
---|---|---|---|---|
46.00 | 4.089 | 4.108 | 4.225 | 4.382 |
48.00 | 2.369 | 2.385 | 2.483 | 2.616 |
50.00 | 1.102 | 1.113 | 1.178 | 1.268 |
52.00 | 0.393 | 0.398 | 0.430 | 0.475 |
54.00 | 0.105 | 0.107 | 0.118 | 0.134 |
Consider a normal distribution with mean $\mu$ and variance $\sigma^2$. If $\mu \ll \sigma$, then to an error of $< \mu/\sigma$, $\mu$ can be ignored. This can be verified from a series expansion of the CDF
$$F(a - \mu) \approx F(a) - \mu f(a)\left(1 + \frac{a\mu}{2\sigma^2} + \frac{(a^2 - \sigma^2)\mu^2}{6\sigma^4}\right). \tag{7}$$
Note that $f(a)$ has a factor of $\sigma^{-1}$, that $F(a)$ is the cumulative distribution function (CDF), and that $f(x)$ is the zero-mean pdf with variance $\sigma^2$:
$$F(a) = \int_{-\infty}^{a} f(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{a} \exp\left(-x^2/(2\sigma^2)\right) dx. \tag{8}$$
The standard deviation σ is a measure of the width of the distribution whereas the mean μ shifts the curve left or right. For a broad curve, small shifts left or right make a small difference.
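The series of Equation (7) can be tested against the exact shifted CDF. The sketch below uses an assumed ratio $\mu/\sigma = 0.1$ and a few arbitrary evaluation points; the function names are introduced here for illustration.

```python
import math

def pdf(a, sigma):
    """Zero-mean normal pdf with variance sigma^2."""
    return math.exp(-a * a / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def cdf(a, sigma):
    """Zero-mean normal CDF with variance sigma^2, Eq. (8)."""
    return 0.5 * (1.0 + math.erf(a / (sigma * math.sqrt(2.0))))

def shifted_cdf_series(a, mu, sigma):
    """Right-hand side of Eq. (7)."""
    bracket = (1.0 + a * mu / (2.0 * sigma**2)
               + (a * a - sigma**2) * mu**2 / (6.0 * sigma**4))
    return cdf(a, sigma) - mu * pdf(a, sigma) * bracket

sigma = 0.01
mu = 0.001                      # assumed, so that mu / sigma = 0.1
points = (-0.01, 0.0, 0.015)    # arbitrary evaluation points

rows = []
for a in points:
    exact = cdf(a - mu, sigma)
    series = shifted_cdf_series(a, mu, sigma)
    ignored = cdf(a, sigma)     # value obtained when mu is ignored altogether
    rows.append((a, exact, series, ignored))
    print(f"a = {a:+.3f}  exact = {exact:.6f}  series = {series:.6f}  mu ignored = {ignored:.6f}")
```

Ignoring $\mu$ entirely changes the CDF by less than $\mu/\sigma$ at every point, while the three-term series is accurate to a much smaller error.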
In an Ito calculus formalism, risk-neutral pricing is explained as
$$\begin{aligned} dS_t &= (\alpha - \sigma^2/2)\,S_t\,dt + \sigma S_t\,dW_t \\ &= (r - \sigma^2/2)\,S_t\,dt + (\alpha - r)\,S_t\,dt + \sigma S_t\,dW_t \\ &= (r - \sigma^2/2)\,S_t\,dt + \sigma S_t\left(dW_t + \frac{\alpha - r}{\sigma}\,dt\right) \\ &= (r - \sigma^2/2)\,S_t\,dt + \sigma S_t\,dW'_t \end{aligned} \tag{9}$$
where $dW_t = W(t + dt) - W(t)$ is an increment of Brownian motion (or Wiener process $W(t)$ ( [
The solution to Equation (9) is not $S_t = S_0 \exp(rt)$ since $E\{dW'_t\} \ne 0$. The solution to Equation (9) is, with $\alpha' = \alpha - \sigma^2/2$,
$$S_t = S_0 \exp\left(\int_0^t \left(\alpha' + \sigma w(\tau)\right) d\tau\right) \tag{10}$$
which follows from the solution, in a Langevin formalism [
$$\frac{dS(t)}{dt} = \alpha' S(t) + \sigma S(t)\,w(t) \tag{11}$$
where $S_t = S(t)$ and $w(t)$ is a zero-mean stochastic process that in a limit is delta-function correlated. In the limit, $w(t)$ is a white noise and the Wiener process $W(t) = \int_0^t w(\tau)\,d\tau$, $dW_t = W(t + dt) - W(t) = w(t)\,dt$ ( [
$$\frac{d\,\overline{S(t)}}{dt} = \left(\alpha' + \frac{\sigma^2}{2}\right)\overline{S(t)} = \alpha\,\overline{S(t)} \tag{12}$$
with solution
$$\overline{S(t)} = S_0 \exp\left(\int_0^t \alpha'\,d\tau + \frac{\sigma^2}{2}\,t\right) = S_0 \exp\left(\int_0^t \alpha\,d\tau\right). \tag{13}$$
In the event that $\alpha$ is a constant, $\overline{S(t)} = S_0 \exp(\alpha t)$. The mean value of $S_t$ (or $S(t)$, $S(t) = S_t$) drifts at the rate $\alpha$ when the time development of $S_t$ (or $S(t)$) is described by Equation (9) in an Ito formulation or equivalently by Equation (11) in a Langevin formulation.
Note that the $\sigma^2/2$ contribution to the drift arises from averaging over an ensemble of realizations of the stochastic process $w(t)$ (cf. Equations (2), (10), and (13)). Care must be employed in obtaining and in interpreting results within the Ito formalism. One could attempt, in the Langevin picture, to hide the risk premium $\alpha - r$ in $w(t)$, as was attempted in Equation (9) to justify risk-neutral pricing. However, one would experience a similar difficulty. The transformed $w(t)$ would not be zero mean and would not be delta-function correlated in the limit, and knowledge of $\alpha - r$ would still be required to price the option unless $(\alpha - r)/\sigma$ is negligible.
Langevin equations are first order differential equations with noise driving terms. The Langevin equations should be interpreted as integral equations ( [
Girsanov’s theorem [
A pdf defines a probability measure $P$. If the pdf for a random variable $\mathbf{x}$ is $f_{\mathbf{x}}(x)$, then
$$P_{\mathbf{x}}(A) = \int_A f_{\mathbf{x}}(\xi)\,d\xi. \tag{14}$$
If $A(x)$ is a small neighbourhood of a specific outcome $x$, then $P_{\mathbf{x}}(A(x)) = dP_{\mathbf{x}}(x) = f_{\mathbf{x}}(x)\,dx$. For $G(\mathbf{x})$ a function of the random variable $\mathbf{x}$, the expectation of $G(\mathbf{x})$ over $\mathbf{x}$ is
$$E_{\mathbf{x}}\{G(\mathbf{x})\} = \int G(\xi)\,dP_{\mathbf{x}}(\xi) = \int G(\xi)\,f_{\mathbf{x}}(\xi)\,d\xi. \tag{15}$$
Two probability measures $P$ and $Q$ are said to be equivalent if, for any set $A$ in the probability space, $P(A) > 0$ if and only if $Q(A) > 0$ ( [
Consider a stochastic process $x(t)$ (for a stochastic process, $x(t)$ is a random variable for each point in time $t$) that has a drift rate $\mu$ and is driven by a Wiener process $W(t)$, such that, e.g., $dx(t) = \mu x(t)\,dt + \sigma x(t)\,dW(t)$. Define ( [
$$Z = \exp\left(-(\mu - r)\,x(t) - \tfrac{1}{2}(\mu - r)^2\right). \tag{16}$$
$\tilde{P}$ and $P$ are equivalent probability measures and are related by ( [
$$\tilde{P} = \int Z\,dP \tag{17}$$
where $\tilde{P}$ is the equivalent risk-neutral measure. $Z = d\tilde{P}/dP$ is called the Radon-Nikodým derivative.
If $dP$ is the unit-normal distribution (i.e., the underlying pdf is normally distributed with zero mean and standard deviation of unity, $dP \sim N(0,1)$), then
$$\tilde{P}(A) = \int_A \frac{1}{\sqrt{2\pi}} \exp\left(-(\mu - r)\,x - \tfrac{1}{2}(\mu - r)^2\right) \exp\left(-x^2/2\right) dx \tag{18}$$
and the equivalent risk-neutral measure for this example, for a small neighbourhood $A(x)$ near $x$, is
$$\lim_{A(x) \to dx} \tilde{P}(A) = d\tilde{P}(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\left(x + (\mu - r)\right)^2/2\right) dx. \tag{19}$$
Clearly $Z$ is selected to give the desired mean to the equivalent measure. In the example given here, $d\tilde{P}(x)$ is a unit-variance normal distribution with a mean of $-(\mu - r)$, as follows from completing the square in the exponent of Equation (18). A change of variable starts the transformation of the process $dW(t) + (\mu - r)\,dt/\sigma$ to a zero-mean process $d\tilde{W}(t)$.
The transformed process for the example given is $d\tilde{x}(t) = r\,\tilde{x}(t)\,dt + \sigma\,\tilde{x}(t)\,d\tilde{W}(t)$, where $r' = r + \sigma^2/2$ could be the risk-free rate and $d\tilde{W}(t)$ is a Gaussian increment at time $t$ of a Brownian motion. In the equivalent measure $d\tilde{P}$, $\tilde{x}(t)$ is a geometric Brownian motion that increases on average at the risk-free rate $r'$: $E\{\tilde{x}(t)\} = \tilde{x}(0)\exp(r' t)$.
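The effect of the Radon-Nikodým factor $Z$ of Equation (16) on a unit-normal density can be checked pointwise. Completing the square in the exponent of Equation (18) gives $-\left(x + (\mu - r)\right)^2/2$, so the re-weighted density should coincide with a unit-variance normal centred at $-(\mu - r)$. The sketch below verifies this numerically; the function names and the exaggerated values of `mu` and `r` are assumptions chosen so that the shift is visible.

```python
import math

def unit_normal_pdf(x, mean=0.0):
    """Normal pdf with unit variance and the given mean."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def radon_nikodym(x, mu, r):
    """Z of Eq. (16), evaluated at an outcome x."""
    theta = mu - r
    return math.exp(-theta * x - 0.5 * theta**2)

mu, r = 0.8, 0.3          # exaggerated, assumed values
theta = mu - r
xs = [-3.0 + 0.5 * i for i in range(13)]   # grid from -3 to 3

# Compare Z(x) * N(0,1) density with the N(-(mu - r), 1) density on the grid.
max_gap = max(
    abs(radon_nikodym(x, mu, r) * unit_normal_pdf(x) - unit_normal_pdf(x, mean=-theta))
    for x in xs
)
print(f"largest pointwise difference: {max_gap:.2e}")
```

The change of measure keeps the same set of outcomes and the same variance; only the location of the distribution is moved.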
Gardiner ( [
Gardiner ( [
Gardiner ( [
$$\begin{aligned} dx(t) &= a(t)\,dt + b(t)\,dW(t) \\ dy(t) &= f(t)\,dt + g(t)\,dW(t) \end{aligned} \tag{20}$$
are equivalent if $b(t) = g(t)$. This result is Girsanov’s theorem.”
Gardiner ( [
This result is not surprising. The solution to Equation (9) or to Equation (11) in a Langevin approach is given by Equation (10) and can be recast as
$$\frac{S_t}{S_0 \exp\left(\int_0^t \alpha'\,d\tau\right)} = \exp\left(\int_0^t \sigma w(\tau)\,d\tau\right) = S'_t. \tag{21}$$
$S'_t$ is a scaled version of $S_t$ and is determined solely by the noise $w(t)$, $w(t)\,dt = dW(t)$. Provided that the drift $\alpha'$ is not too large compared to $\sigma$, any $S_t$ with the same $\sigma$ but different drift should look similar. In this case, one is using the higher frequency noise as a fiducial to compare observations of $S_t$ for $0 \le t \le T$ for different $w(t)$ and/or $\alpha'$.
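Equation (21) can be illustrated with a discrete-time sketch: two paths are driven by one shared noise realization but different drifts, and dividing out the deterministic drift leaves the same scaled path $S'_t$. The daily time step, the two drift values, the path length, and the function names are assumptions for the example; $\sigma = 0.01$ is used as a representative daily value.

```python
import math
import random

def simulate_path(alpha_prime, sigma, noise, s0=1.0):
    """Daily-step path S_t = S_0 exp(sum_k (alpha' + sigma * w_k)),
    a discrete form of Eq. (10)."""
    path, log_s = [s0], math.log(s0)
    for w in noise:
        log_s += alpha_prime + sigma * w
        path.append(math.exp(log_s))
    return path

def remove_drift(path, alpha_prime):
    """S'_t of Eq. (21): divide S_t by S_0 exp(alpha' t)."""
    s0 = path[0]
    return [s / (s0 * math.exp(alpha_prime * t)) for t, s in enumerate(path)]

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(250)]   # one shared noise realization
sigma = 0.01

path_low = simulate_path(0.0001, sigma, noise)      # assumed drift values
path_high = simulate_path(0.0004, sigma, noise)

scaled_low = remove_drift(path_low, 0.0001)
scaled_high = remove_drift(path_high, 0.0004)
max_gap = max(abs(a - b) for a, b in zip(scaled_low, scaled_high))

print(f"final values: {path_low[-1]:.4f} (low drift), {path_high[-1]:.4f} (high drift)")
print(f"largest difference between the scaled paths: {max_gap:.2e}")
```

The scaled paths agree to rounding error, while the unscaled final values differ by the factor $\exp\left((\alpha'_{\text{high}} - \alpha'_{\text{low}})\,t\right)$, which is exactly the information that the drift carries.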
[Figure caption, partially recovered: ... expected values. All simulations used $\sigma = 0.01$.]
The simulations for the same noise but different drifts look qualitatively indistinguishable, as pointed out by Gardiner ( [
Gardiner ( [
“Girsanov’s theorem is now the justification for use of the drift rate r instead of μ in the valuation of options using the risk-neutral procedure. The noise term is identical for both cases, and in the case we can say that the two processes can be seen as arising from the choice of a different probability measure to the same set of sample paths. In some sense it can be shown that this is a rigorously justifiable procedure [10.11], although not everyone would accept that. However, the use of change of measure is now an accepted part of the procedure for valuing options and other derivatives when one goes beyond the simple geometric Brownian motion picture.”
Gardiner does not seem to be a true believer, and seems resigned to a deeply engrained status quo.
From [
“The relation (5) has been called by Cox-Ingersoll-Ross (1981) the “Local Expectation Hypothesis”, a terminology which has led to some confusion. Note that the equilibrium process has not been changed, it is the same under both measures P and P ˜ . Girsanov’s Theorem allows us to replace the relation (4) through the equivalent simpler relation (5). In particular, no assumption has been made about the existence of risk-neutral investors. In a real economy neither a “representative” nor a “risk-neutral” investor will exist, since both assumptions would prevent the existence of a (stable) equilibrium. The great advantage of the representation (5) under the (martingale) measure P ˜ is that we do not have to know anything about the individual expectations P and the investors’ attitude towards risk. In summary: x t has not been changed. It is the same equilibrium price process as under P, but in simpler representation under P ˜ . P ˜ is called the “equivalent risk-neutral measure” or the “P-equivalent martingale measure”. ( [
It would appear that x t has been changed. The possible paths are the same, but more weight (probability) has been placed on lower yielding paths (assuming the risk premium is greater than zero). From [
“After the change of measure, we are still considering the same set of stock price paths, but we have shifted the probability on them. If α > r , as it normally is, then the change of measure puts more probability on the paths with lower return so that the overall mean rate of return is reduced from α to r.” ( [
Altering the distribution changes the problem, unless the alteration is undone by an inverse transformation to return to the original frame of reference. Essentially, the risk-neutral approach appears to be to multiply one term on one side of an equation by $Z$. This is not a valid mathematical approach. Consider $x = -\alpha x + a$ with solution $x = a/(1 + \alpha)$. Now solve $x = -\alpha x + Za$. The solutions for $x$ are not the same unless $Z = 1$.
In the development of the risk-neutral measure $dW'_t$ in Equation (9), it is considered that $dW'_t$ is a Wiener process (Brownian motion). Strictly speaking, Brownian motion is a zero-mean process. $dW'_t$ is not a zero-mean process. One might wish to apply a coordinate transformation such that $dW'_t$ is a zero-mean Brownian motion in the transformed frame of reference. However, one must remember that one is working in a transformed coordinate system, and provide a reverse transformation at the end to obtain the answer in the original frame of reference. This is similar to the problem of relative motion.
Consider an airplane that can cruise at v km/hour with respect to the air mass in which the airplane is embedded (i.e., the local air) and assume that the local air is moving relative to the ground. If one is interested in the location of the plane after a given time, one can solve the problem by using a coordinate system that is embedded in the moving air. Relative to this coordinate system the plane has travelled a distance d = v t in time t. The choice of coordinate system makes the problem look simpler. However, to know the location of the plane relative to a reference coordinate system on the ground such as an airport, one must know the relationship between the reference coordinate system and the moving coordinate system. In a similar manner, to find the value of an asset or an option, one must know the risk premium.
Consider pricing a very long-lived option, one so long-lived that the mean value $\alpha t$ is $> rt + 4\sigma\sqrt{t}$. With 99.99% certainty (normal statistics are assumed in this work), the return on the underlying $S_t$ will be $> rt$. Does it make sense to use risk-neutral pricing and force $E\{S_t\} = S_0 \exp(rt)$? In this case, the approximation that $\alpha t \ll \sigma\sqrt{t}$ does not hold, and it appears that risk-neutral pricing would not be a reasonable approach.
Let us examine the justification for risk-neutral pricing, which is presented as Equation (9). As before, define a risk premium, move it to the noise term $W_t$, and absorb the risk premium in $W'_t$.
$$\begin{aligned} dS_t &= (\alpha - \sigma^2/2)\,S_t\,dt + \sigma S_t\,dW_t \\ &= (r - \sigma^2/2)\,S_t\,dt + (\alpha - r)\,S_t\,dt + \sigma S_t\,dW_t \\ &= (r - \sigma^2/2)\,S_t\,dt + \sigma S_t\left(dW_t + \frac{\alpha - r}{\sigma}\,dt\right) \\ &= (r - \sigma^2/2)\,S_t\,dt + \sigma S_t\,dW'_t. \end{aligned} \tag{22}$$
Now $W'_t$ is a non-zero-mean noise term, except in the special case that $\alpha - r = 0$. Multiply the noise term by $Z/Z$, cf. Equation (16), to obtain
$$dS_t = (r - \sigma^2/2)\,S_t\,dt + \frac{\sigma}{Z}\,S_t\,Z\,dW'_t = (r - \sigma^2/2)\,S_t\,dt + \frac{\sigma}{Z}\,S_t\,dW''_t. \tag{23}$$
$Z$ is chosen such that the mean value of $Z\,dW'_t$ is zero, i.e., $E\{Z\,dW'_t\} = 0$.
One could redefine $\sigma/Z$ on the right-hand side of Equation (23) to obtain
$$dS_t = (r - \sigma^2/2)\,S_t\,dt + \sigma_G\,S_t\,dW''_t, \tag{24}$$
which is a simple-looking equation, but it must be remembered that $\sigma_G = \sigma/Z$ is no longer a constant. The dependence of $\sigma_G$ on $Z^{-1}$ undoes the work to create a zero-mean noise term $dW''_t$. Alternatively, one could multiply both sides of Equation (23) by $Z$, follow the rules for transformation of stochastic differential equations and variables ( [
If in Equation (9) or in Equation (22) $(\alpha - r)/\sigma$ can be ignored, then under this approximation risk-neutral pricing would be accurate.
Equation (21) suggests an approach to understand risk-neutral pricing of a European call option. Start with the expression for the value of the call option and manipulate to remove the drift owing to the risk premium in S T :
$$\begin{aligned} C_0 &= \exp(-rT)\,E\{(S_T - K_T)^+\} \\ &= \exp(-rT)\,\frac{\exp\left(\int_0^T (\alpha - r)\,d\tau\right)}{\exp\left(\int_0^T (\alpha - r)\,d\tau\right)}\,E\{(S_T - K_T)^+\} \\ &= \exp(-rT)\,\exp\left(\int_0^T (\alpha - r)\,d\tau\right) E\{(S'_T - K'_T)^+\} \end{aligned} \tag{25}$$
where both $S_T$ and $K_T$ have been scaled by the same factor to remove drift in $S_T$ owing to the risk premium. The expectation in the last line might be anticipated at first look to be risk neutral. However, $K'_T$ is a function of the risk premium: $K'_T = K_T \times \exp\left(\int_0^T -(\alpha - r)\,d\tau\right)$. In addition, the inverse scaling factor remains in the expression for $C_0$ as an overall multiplicative factor. The risk premium is thus required to price the option. If the risk premium is negligible, then the scaling factor is approximately unity and risk-neutral pricing of the European call option would be sufficiently accurate. The approach presented in this section is silent on the magnitude of the risk premium that is negligible; one would need to examine the expectation, as was done in Sec. 2.2, to determine the magnitude. The rms fluctuation of the one-day returns, $\sigma$, is the relevant metric. It is the magnitude of the ratio $(\alpha - r)/\sigma$ that determines what is negligible or not.
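The scaling argument of Equation (25) can be checked numerically for constant $\alpha$ and $r$: pricing with the true drift should equal $\exp\left((\alpha - r)T\right)$ times a risk-neutral price evaluated at the rescaled strike $K'_T = K_T \exp\left(-(\alpha - r)T\right)$. The sketch below assumes the closed form of Equation (3) and illustrative parameter values; the function names are introduced here and are not from the text.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_price(s0, k, T, r, sigma, alpha):
    """Closed form of Eq. (3): exp(-rT) E{(S_T - K_T)^+} with mean drift alpha."""
    sd = sigma * math.sqrt(T)
    log_strike = math.log(k / s0)
    tail_1 = 1.0 - norm_cdf((log_strike - (alpha + 0.5 * sigma**2) * T) / sd)
    tail_2 = 1.0 - norm_cdf((log_strike - (alpha - 0.5 * sigma**2) * T) / sd)
    return s0 * math.exp((alpha - r) * T) * tail_1 - math.exp(-r * T) * k * tail_2

# Illustrative (assumed) values in per-day units.
s0, k, T, r, sigma = 50.0, 52.0, 30.0, 1e-4, 0.01
alpha = 4e-4

scale = math.exp((alpha - r) * T)     # exp(integral of (alpha - r) dtau), constant rates
k_scaled = k / scale                  # K'_T of Eq. (25)

direct = call_price(s0, k, T, r, sigma, alpha)              # true drift, original strike
rescaled = scale * call_price(s0, k_scaled, T, r, sigma, r) # drift r, strike K'_T, times scale
naive = call_price(s0, k, T, r, sigma, r)                   # risk-neutral, original strike

print(f"direct (drift alpha)           : {direct:.5f}")
print(f"scale * risk-neutral at K'_T   : {rescaled:.5f}")
print(f"risk-neutral at K_T (no scale) : {naive:.5f}")
```

The first two numbers agree, whereas the third differs: the risk premium enters both through the rescaled strike and through the overall factor.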
Risk-neutral pricing of European call options is mathematically an approximation. Provided that $(\alpha - r)/\sigma$ is small, then the drift rate is obscured over short time intervals by random fluctuations and one is justified in ignoring the risk premium $\alpha - r$ in the drift rate. It would seem honest to state this as the rationale behind risk-neutral pricing, rather than appealing to theorems and pretending that risk-neutral pricing is exact.
Risk-neutral pricing underestimates the price of a European call option when $\alpha > r$ and overestimates the price of a European call option when $\alpha < r$.
It is interesting to note that risk-neutral pricing of European call options ignores $\alpha - r$ but takes great care to include $\sigma^2/2$ in the average drift rate. If the risk premium can be ignored, then likely $\sigma^2/2$ can also be ignored. For the S&P 500, on average $\sigma^2/2 \approx 5 \times 10^{-5}$ whereas the risk premium $\alpha - r \approx 1 \times 10^{-4}$.
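The S&P 500 figures quoted above give a feel for the horizon over which the risk premium stays hidden by the fluctuations. The sketch below assumes these are daily values, so that $\sigma^2/2 \approx 5 \times 10^{-5}$ corresponds to $\sigma \approx 0.01$, and compares the accumulated premium $(\alpha - r)T$ with the rms fluctuation $\sigma\sqrt{T}$. The horizons listed and the function name are illustrative assumptions.

```python
import math

sigma = math.sqrt(2 * 5e-5)   # daily rms return implied by sigma^2 / 2 = 5e-5
risk_premium = 1e-4           # alpha - r per day, as quoted for the S&P 500

def drift_to_noise(T):
    """Ratio of accumulated risk premium (alpha - r) T to sigma * sqrt(T)."""
    return risk_premium * T / (sigma * math.sqrt(T))

for T in (1, 21, 252, 2520):  # roughly a day, a month, a year, ten years of trading days
    print(f"T = {T:5d} days   (alpha - r) T / (sigma sqrt(T)) = {drift_to_noise(T):.3f}")

# Horizon at which the accumulated premium equals one standard deviation.
crossover_days = (sigma / risk_premium) ** 2
print(f"ratio reaches 1 at T = {crossover_days:.0f} days")
```

Under these assumptions the ratio grows as $\sqrt{T}$, so the approximation behind risk-neutral pricing degrades steadily as the life of the option increases.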
Cassidy, D.T. (2018) Risk-Neutral Pricing of European Call Options: A Specious Concept. Journal of Mathematical Finance, 8, 335-348. https://doi.org/10.4236/jmf.2018.82022