2752075 height=32.4900007247925 />, These parameters control the duration and the frequency of the spikes, that is they control the expected lifetime of the spike periods and the expected lifetime of the periods between spikes. It is easy to see that in the case of a time-homogeneous Markov chain, , the transition probability matrix, is a function only of the difference and that we have :
where and are non negative real parameters. Moreover we assume that are not both zero.
From now on in our model for spiky prices we always assume that the two-state Markov chain, is time-homogeneous and, as a consequence, we assume that its transition probability matrix is given by (4).
The process that describes the asset prices in absence of spikes is a diffusion process defined through the following stochastic differential equation:
with the initial condition:
where denotes the random variable that describes the asset price in absence of spikes at time are constants, is the standard Wiener process, is its stochastic differential and the random variable is a given initial condition. For simplicity we assume that the random variable is concentrated in a point with probability one and, with abuse of notation, we denote this point with. Recall that is the drift coefficient and is the volatility coefficient.
Equation (5) defines the asset price dynamics of the celebrated Black-Scholes model. Note that the solution of (5), (6) is a Markov process known as geometric Brownian motion. Note that several other Markov processes different from the one defined in (5), (6) can be used to model asset price dynamics in absence of spikes. For example we can use one of the so called stochastic volatility models that have been introduced recently in mathematical finance to refine the Black-Scholes asset dynamics model.
Let us define the spike process that models the amplitude of the spikes in the asset prices. Let be a stochastic process made of independent positive random variables with given probability density functions We assume that if the Markov chain is in the regular state then the spike process is equal to one. If the Markov chain transits from the regular state to the spike state at time then the spike process is equal to a value sampled from the random variable during the entire time period beginning with that the Markov chain remains in the spike state. We assume that is in the regular state at time with probability one, so that the spike process starts with with probability one. Let us observe that in the case of spikes with constant amplitude the probability density function is the Dirac delta function with support in that is we have:
Finally we say that the spike process is in the spike state or in the regular state if the Markov chain is in the spike state or in the regular state respectively. In our model for spiky prices we assume that the spikes have constant amplitude that is we assume (7).
Let us define now the process that describes the asset prices with spikes. Let us denote with the random variable that models the price (eventually) with spikes of the asset at time. We assume that the spike process and the process that describes asset prices in absence of spikes, are independent. Following [1,2] we define the process, as the product of the spike process and of the process that is:
Note that the process for spiky asset prices is in the spike state or in the regular state depending from the fact that the spike process is in the spike state or in the regular state respectively, or equivalently, depending from the fact that the Markov chain is in the spike state or in the regular state respectively.
In  it has been shown that the process is able to reproduce spikes in asset prices. This can be seen in Figure 1 where synthetic data sampled from are shown. These synthetic data are those used in the numerical examples presented in Section 4. Note that depending on the values of the parameters defining the model, the trajectories of can exhibit a variety of behaviours. In particular, when is time-homogeneous and the expected times and spent by the process in “spike mode” and in “regular (i.e. inter-spike) mode”, respectively, are given by (see ):
If is small in comparison with the characteristic time of change of the process then the process exhibits spikes. For example, if is the diffusion process defined in (5) then the previous condition can be stated as follows:
We interpret as the expected lifetime of a spike, and as the expected time elapsed between two consecutive spikes. That is the parameters and control the duration and the frequency of the spikes through (9). For example, the relations (9) suggest that in order to model short-lived spikes the parameter must be large, while to model rare spikes the parameter must
Figure 1. The synthetic data.
Using some heuristic rules to recognize spikes on the data time series considered thank to their relation with the spike duration and frequency a rough estimate of the parameters and can be obtained directly from observed spiky prices (market data). For example, the relations (9) suggest that initial estimates and of and respectively, can be obtained as follows:
where and are computed (using some heuristic rules) directly from the data available as the observed average lifetime of the spikes and the observed average time between two consecutive spikes, respectively.
Finally it is easy to see that the asymptotic probabilities of the time-homogeneous Markov chain of being in the spike and in the regular states are respectively:
With the previous choices the model for spiky asset prices introduced in [1,2] depends on five parameters, that is: The parameters and are those appearing in (5) and are relative to the asset price model in absence of spikes, the remaining parameters are relative to the spike process (i.e.) and to the Markov chain that regulates the transitions between spike and regular states (i.e.).
3. The Filtering and the Calibration Problems
The parameters are grouped in the following vector:
where the superscript denotes the transpose operator. The vector is the unknown of the calibration problem that must be determined from the data available. The vectors, describing admissible sets of parameter values, must satisfy some constraints, that is we must have where:
and is the five dimensional real Euclidean space.
The constraints contained in express some elementary properties of the model introduced in Section 2. Note that, when necessary to take care of special circumstances, more constraints can be added to.
The data of the calibration problem are the observation times: and the spiky asset price observed at time where The observed price is a positive real number that is understood as being sampled from the random variable Starting from these data we want to solve the following problems:
Calibration Problem: Find an estimate of the vector.
Filtering Problem (Forecasting Problem): Given the value of the vector estimated forecast the forward asset prices.
Note that the prices are called spot asset prices and that with forward asset prices we mean prices “in the future”, that is, in our setting, we mean the forward prices at time associated to the spot price, These are the prices at time of the asset with delivery time Let us point out that in correspondence of a spot price we can forecast several forward prices associated to different delivery periods. The delivery periods of interest are those actually used in real markets, in fact for these delivery periods observed forward prices are available and a comparison between observed forward prices and forecasted forward prices is possible. In Section 4 in the analysis of real data we consider (some of) the delivery periods used in the UK electricity market in the years 2004-2009 and we make this comparison.
Let us mention that the forecasted forward asset prices associated to the spiky price model introduced in Section 2 do not exhibit spikes while, of course, the corresponding (spot) asset prices exhibit spikes (see [1,2] for more details).
The calibration problem consists in estimating the value of the vector that makes “most likely” the observations. The observations available at time are denoted with:
Note that for simplicity we assume that the transitions from regular state to spike state or viceversa happen at the observation times.
Let be the probability density function of the stochastic process at time conditioned to the observations and let
be the probability density function of the stochastic process conditioned to the observations made up to time, when where for convenience we define.
The probability density functions , are the solutions of the following initial value problems for the Fokker-Planck equation associated to the Black-Scholes model (5):
where and are, respectively, the probabilities defined through the time-homogeneous Markov chain of being in the regular and in the spike state at time (see (3)).
The probability density functions, solutions of the initial value problems for the FokkerPlanck equations (16)-(19), can be written as follows:
where is the fundamental solution of the FokkerPlanck equation (16) associated to the Black-Scholes model (5), that is:
In order to measure the likelihood of the vector we introduce the following (log-)likelihood function:
It is worthwhile to note that definition (22) contains an important simplification. In fact a more correct definition of the (log-)likelihood function should be:
where or depending on whether at time the asset price is in the regular state or in the spike state respectively, However, since when dealing with real financial data the decision about the character of the state (regular or spike state) of the observed prices is dubious, we prefer to adopt the definition (22) for the (log-)likelihood function. In fact the choice made in (22), at the price of introducing some inaccuracy, avoids the necessity of defining a (dubious) criterion to recognize regular and spike states in order to evaluate the (log-)likelihood function. The validity of this choice is supported by the fact that in the numerical experience shown in Section 4 the simplification introduced in (22) using instead of is sufficient to obtain satisfactory results.
The solution of the calibration problem is given by the vector that solves the following optimization problem:
Problem (24) is called maximum likelihood problem. In fact the vector solution of (24) is the vector that makes “most likely” the observations actually made. Problem (24) is an optimization problem with nonlinear objective function and linear inequality constraints.
Note that (22), (24) is one possible way of formulating the calibration problem considered using the maximum likelihood method. Many other formulations of the calibration problem are possible and legitimate. Moreover the formulation of the calibration problem (22), (24) can be easily extended to handle situations where we consider calibration problems associated to data set different from the one considered here, such as, for example, data set containing asset and option prices or only option prices.
The optimization algorithm used to solve the maximum likelihood problem (22), (24) is based on a variable metric steepest ascent method. The variable metric technique is used to handle the constraints.
Let be the absolute value of and be the Euclidean norm of the vector Beginning from an initial guess:
we update at every iteration the current approximation of the solution of the optimization problem (24) with a step in the direction of the gradient with respect to computed in a suitable variable metric of the (log-)likelihood function (22).
In particular let us fix a tolerance value and a maximum number of iterations. Given the optimization procedure can be summarized in the following steps:
1) Set and initialize
2) Evaluate If and
go to item
3) Evaluate the gradient of the (log-)likelihood function in that is:
If go to item
4) Perform the steepest ascent step, evaluating where is a quantity that is used to define the the step made. The quantity can be chosen as a scalar or, more in general, as a matrix of suitable dimensions. The choice of involves the use of the “variable metric”. When is chosen to be a scalar we have a “classical” steepest ascent method;
5) If go to item
6) Set if go to item
7) Set and stop.
The vector obtained in step 7 is the (numerically computed) approximation of the maximizer of the (log-) likelihood function.
Numerical experiments have shown that the (log-) likelihood function defined in (22) is a flat function. That is there are many different vectors in that make likely the data. In the optimization of objective functions with flat regions when local methods, such as the steepest ascent method, are used to solve the corresponding optimization problems (i.e.: (24)), special attention must be paid to the choice of the initial guess of the optimization procedure. That is in actual computations a “good” initial guess is important to find a “good” solution of the optimization problem (24). More specifically in the problem considered here the use of good initial guesses of the volatility and of the drift improves substantially the quality of the estimates of all the parameters contained in the vector. That is in the solution of problem (24) the parameters and are the most “sensitive” parameters of the vector.
For these reasons in order to improve the robustness of the solution of problem (24) that is obtained with the optimization method described above it is useful to introduce some ad hoc steps that lead to a simple reformulation of the (log-)likelihood function (22), of the calibration problem (24) and of the method used to solve problem (24). First of all an ad hoc preliminary simplified optimization problem is solved to produce a high quality initial guess for the steepest ascent method summarized in steps 1) - 7). We proceed as follows. First we estimate two initial values of and let us say and directly from the data available, that is we compute the historical volatility and the historical drift of the data available (see, for example, ). These estimates are used as initial guesses to maximize the objective function (22) as a two variables function of the parameters and with the constraint Note that in this preliminary step we consider as data used to define the (log-)likelihood function (22) only the “most regular portion of the input data”. The most regular portion of the input data is chosen using elementary empirical rules that try to find one or more regular periods in the data considered. In this preliminary optimization problem the remaining components of the vector are initialized as follows and and these initial values are kept constants in the optimization procedure with respect to and. Let be the maximizer found at the end of this preliminary step. In order to keep memory in the maximum likelihood problem (24) of the values and found in the preliminary step we add to the objective function (22) the following penalization term:
where are positive penalization constants. That is we consider the following modified (log-)likelihood function:
The variable metric steepest ascent method 1) - 7) is now applied to solve problem (24) when the (five variables) (log-)likelihood function defined in (28) replaces the (log-)likelihood function defined in (22) starting from the initial guess:
where the initial values and are estimated directly from the input data through (11) and, finally, the initial value is chosen as a rough estimate of the average amplitude of the spikes appearing in the input data. That is problem (24) is solved replacing with starting from an initial guess that uses the results obtained in the preliminary step of the optimization procedure and tries to exploit the data using the “physical” meaning of the model parameters.
These ad hoc steps used to reformulate problem (24), when applied to the analysis of synthetic data, lead to a substantial improvement in the accuracy of the estimates of the model parameters obtained when compared to the estimates obtained solving directly (22), (24) and to a great saving in the computational cost of solving the calibration problem. This reformulation of problem (24) is used to analyze the data considered in Section 4.
As a byproduct of the solution of the calibration problem we obtain a technique to forecast forward asset prices. Let us consider a filtering problem. We assume that the vector solution of the calibration problem associated to the data is known. From the knowledge of at time we can forecast the forward asset prices with delivery period deep in the future and delivery time as follows:
where denotes the expected value of.
In the numerical experiments presented in Section 4 we use the following approximation:
Note that using formula (31) we have implicitly assumed. In fact in the data time series considered in Section 4 the average in time of the (spiky) observations appearing in (31) gives a better approximation of the “spatial” average of the price without spikes than the individual observation made at time of the spiky price. However the average in time of the observations approximates the “spatial” average only if short time periods are used. This is the reason why we limit the mean contained in (31) to the data corresponding to ten consecutive observation times, that is corresponding to a period of ten days when we process daily data as done in Section 4. Note that in (31) an average of spiky prices is used to approximate the expected value of non spiky prices.
4. Numerical Results
The first numerical experiment presented consists in solving the calibration problem discussed in Section 3 using synthetic data. This experiment does the analysis of a time series of daily data of the spiky asset price during a period of two years. The time series studied is made of 730 daily data of the spiky asset prices, that is the prices at time, These data have been obtained computing one trajectory of the stochastic dynamical system used to model spiky asset prices defined in Section 2 looking at the computed trajectory at time, with where we have chosen and. We choose the vector that specifies the model used to generate the data equal to in the first year of data (i.e. when,), and equal to in the second year of data (i.e. when ). The data are generated using as initial value of the second year the last datum of the first year. The synthetic data generated in this way are shown in Figure 1. These data are spiky data and the fact that the first year of data is generated using a different choice of than the choice made in the second year of data can be seen simply looking at Figure 1.
We solve problem (24) with the ad hoc procedure described in Section 3 using the data associated to a time window made of 365 consecutive observation times, that is 365 days (one year), and we move this window across the two years of data discarding the datum corresponding to the first observation time of the window and inserting the datum corresponding to the next observation time after the window. Note that numerical experiments suggest that it is necessary to take a large window of observations to obtain a good estimate the parameters and. The calibration problem is solved for each choice of the data time window applying the procedure described in Section 3, that is it is solved 365 times. The 365 vectors constructed in this way are compared to the two choices of the model parameters used to generate the data. Moreover the time when changes from being to being is reconstructed.
To represent the 365 vectors obtained solving the calibration problem in correspondence to the 365 data time windows considered, we associate to each reconstructed vector a point on a straight line (see Figure 2). Let us explain how this correspondence is established. We first represent the vectors and that generate the data as two points on the straight line mentioned above having a distance proportional to measured in units, where
Figure 2. Reconstruction of the parameter vectors and
. We choose the origin of the straight line to be the mid point of the segment that joins and In Figure 2 the diamond represents the vector and the square represents the vector The unit length is The vectors (points) solution of the 365 calibration problems are represented as (green or red) stars. A point is plotted around when the quantity is smaller than the quantity
, otherwise the point is plotted around
The distance of the point from when is plotted around (or from when is plotted around
) is measured in units (or
measured in units). The point plotted around is plotted to the right or to the left of according to the sign of the second component of (negative second component of is plotted to the left of). Remind that the second component of is the volatility coefficient A similar statement holds for the points plotted around The results obtained in this experiment are shown in Figure 2. In this figure the green stars represent the solutions of the calibration problems associated to the first 183 data time windows, that is the first “half” of data time windows, while the red stars represent those associated to the second “half” of the data time windows (that is the second 182 time windows). Figure 2 shows that the points (vectors) obtained solving the calibration problems are concentrated on two disjoint segments one to the left and one to the right of the origin and that they form two disjoint clusters around and That is, the solution of the 365 calibration problems corresponding to the 365 time windows described previously shows that two sets of parameters seems to generate the data studied. This is really the case. Moreover, as expected, the points of the cluster around are in majority red stars, that is they are in majority the points obtained by the analysis of data time windows containing a majority of observations made in the second year (the second “half” of the data time windows), and a similar statement holds for the points of the cluster around (in majority green stars).
The second numerical experiment is performed using real data. The real data studied are electric power price data taken from the UK electricity market. These data are “spiky” asset prices. The data time series studied is made of the observation times (days), and of the spiky asset price observed at time, where is the daily electric power spot price (GBP/MWh), named Day-Ahead price, Excluding week-end days and holidays this data time series corresponds to more than 5 years of daily observations going from January 5, 2004 to July 10, 2009. Remind that GBP means Great Britain Pound and that MWh means Mega-Watt/hour.
Moreover the data time series studied in correspondence of each spot price contains a series of forward prices associated to it for a variety of delivery periods. These prices include: forward price 1 month deep in the future (Month-Ahead price), forward price 3 months deep in the future (Quarter-Ahead price), forward price 4 months deep in the future (Season-Ahead price), forward price 1 year deep in the future (One Year-Ahead price). These forward prices are observed each day and the forward prices observed at time are associated to the spot price The spot and the forward prices contained in the data time series mentioned above are shown in Figure 3.
The observed electric power prices generate data time series with a complicated structure. The stochastic dynamical system studied in this paper does not pretend to fully describe the properties of the electric power prices. Indeed it is able to model only one property of these prices: the presence of spikes. In addition the electricity prices have many other properties, for example, they are mean reverting and have well defined periodicity, that is they have diurnal, weekly and seasonal cycles. A lot of specific models incorporating (some of) these features are discussed in the literature, see for example .
Let us begin performing the analysis of the data using the model introduced in Section 2. The first question to answer is: the model for spiky prices presented in Section 2 is an adequate model to analyze the time series of the spot prices? We answer to this question analyzing the relationship between the data and the model parameters established through the solution of the calibration problem. Let us begin showing that the relation between the data and the model parameters established through the calibration problem is a stable relationship. We proceed as follows. We have more than 5 years of daily observations. We apply the calibration procedure to the data corresponding to a window of 257 consecutive observation times. Note that 257 is approximately the number of working days contained in a year and remind that we have data only in the working days. We move this window through the data time series discarding the datum corresponding to the first observation time of the window and adding the datum corresponding to the next observation time after the window. In this way we have 1396 − 257 = 1139 data windows and for each one of these data windows we solve the corresponding calibration problem. The calibration problems are solved using the optimization procedure (including the ad hoc steps) described in Section 3. The reconstructions of the parameters obtained
Figure 3. The UK electric power price data.
moving the window along the data are shown in Figures 4 and 5. In Figures 4 and 5 the abscissa represents the data windows used to reconstruct the model parameters numbered in ascending order according to the order in time of the first day of the window. Figures 4 and 5 show that the model parameters, with the exception of are approximately constant functions of the data window. The parameter reconstructed shown in Figure 5 is a piecewise constant function. These findings support the idea that the model and the formulation of the calibration problem presented respectively in Sections 2 and 3 are adequate to interpret the data. In fact they establish a stable relationship between the data and the model parameters as shown in Figures 4 and 5.
In the analysis of the real data time series the second question to answer is: the solution of the calibration problem and the tracking procedure introduced in Section 3 can be used to forecast forward prices? To answer this question we compare the observed and the forecasted forward prices. We apply the calibration procedure to a data window made of the first three years of consecutive observations of the spot price taken from the data time series shown in Figure 3 and we use the solution of the calibration problem found and formulae (30), (31) to
Figure 4. Reconstruction of the parameters of the BlackScholes model: μ, σ.
Figure 5. Reconstruction of the parameters used to model the spikes: a, b, λ.
calculate the forecasted forward prices associated to the spot prices of the data time series not included in the data window mentioned above used in the calibration problem. In Figures 6-8 the forward electric power prices forecasted are shown and compared to the observed forward electric power prices. In Figures 6-8 the abscissa is the day of the spot price associated to the forward prices computed. The abscissa of Figures 6-8 is coherent with the abscissa of Figure 3. Table 1 summarizes quantitatively the results shown in Figures 6-8 and gives the
Figure 6. Month-ahead prices.
Figure 7. Quarter-ahead prices.
average relative error committed using the forecasted forward prices, that is the average relative error committed approximating the observed forward prices with the forecasted forward prices. Table 1 and Figures 6-8 show the high quality of the forecasted forward prices answering the second question posed about the analysis of the data time series in the affirmative.
We can conclude that the data analysis presented shows that the model introduced to describe spiky prices, the formulation of the calibration problem and its numerical
Figure 8. One year-ahead prices.
Table 1. Average relative errors of the forecasted forward electric power prices when compared to the observed forward prices.
solution presented in this paper have the potential of being tools of practical value in the analysis of data time series of spiky prices.
*The research reported in this paper is partially supported by MUR— Ministero Università e Ricerca (Roma, Italy), 40%, 2007, under grant: “The impact of population ageing on financial markets, intermediaries and financial stability”. The support and sponsorship of MUR are gratefully acknowledged.
#The numerical experience reported in this paper has been obtained using the computing grid of ENEA (Roma, Italy). The support and sponsorship of ENEA are gratefully acknowledged.