Considering the overnight effect on the stock market, we construct a daily volatility measure as a linear combination of three components, namely overnight volatility, morning realized volatility and afternoon realized volatility, and derive the theoretically optimal weights. Using this measure, we carry out an empirical study of the daily volatility structure of the Shanghai and Shenzhen stock indexes in China’s stock market. The empirical results show that the daily volatility measure that accounts for overnight variance and time segments performs better than the original realized volatility measure.
The modeling and forecasting of volatility is the basis of financial asset portfolio allocation, capital asset pricing and risk management, and has always been a hot topic in the financial field. With the improvement of high frequency data accessibility and the deepening of the research on high frequency data field, the use of high frequency data to estimate and model volatility has become a new trend in financial research [
With the deepening of research, many scholars have found that the information in the non-trading period of stock markets which is the overnight period has a very important influence on the volatility (Hansen and Lunde [
In general, the existing methods lack not only the analysis of the impact of asset volatility at different periods of the day, but also the research on the correlation and combination of different intraday volatility. As pointed out by Ahoniemi and Lanne [
Let $\{p^*(t), t \ge 0\}$ represent the logarithmic price process of a financial asset, whose generating mechanism can be expressed as a stochastic differential equation:
$$dp^*(t) = \mu(t)\,dt + \sigma(t)\,dw(t) \tag{1}$$
where $\mu(t)$ denotes the drift rate, $\sigma(t)$ denotes the volatility, and $w(t)$ denotes a standard Brownian motion.
The true volatility of $p^*$ on the $t$-th day can be defined as $IV_t \equiv \int_{t-1}^{t} \sigma^2(s)\,ds$. As the integral of volatility, $IV_t$ is also called the Integrated Volatility of the $t$-th day, and $r_t \equiv p^*(t) - p^*(t-1)$ is defined as the intra-day return of the $t$-th day.
Andersen and Bollerslev proposed the definition of Realized Volatility (RV). Let $M + 1$ denote the number of price observations at equal intervals within the day, that is, $p_i(t-1), p_i(t-1+\frac{1}{M}), \cdots, p_i(t)$. Then there are a total of $M$ returns in a day, and the intra-day return of the $j$-th observation period is defined as $r_{i,t,j} \equiv p_i(t-1+\frac{j}{M}) - p_i(t-1+\frac{j-1}{M})$, $j = 1, 2, \cdots, M$. The realized volatility is defined as:
$$RV_{i,t} \equiv \sum_{j=1}^{M} r_{i,t,j}^2. \tag{2}$$
It has been proved by Barndorff-Nielsen and Shephard that, in the absence of jumps in asset prices, quadratic variation theory implies that $RV_{i,t}$ converges in probability to the integrated volatility as $M \to \infty$, that is, $\operatorname{plim}_{M \to \infty} RV_{i,t} = \int_{t-1}^{t} \sigma^2(s)\,ds$. In other words, if the sampling frequency of intra-day returns is high enough, $RV_{i,t}$ can be regarded as a consistent estimator of the true volatility.
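As a minimal sketch of definition (2), assuming the log prices of one day are available as a plain Python list, realized volatility is simply the sum of squared log-price increments. The function name and the sample prices are illustrative and do not come from the paper.

```python
import math


def realized_volatility(log_prices):
    """Sum of squared increments of a log-price series: RV = sum_j r_j^2."""
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    return sum(r * r for r in returns)


# Hypothetical intraday prices: M + 1 = 5 observations, hence M = 4 returns.
prices = [100.0, 100.5, 100.2, 100.8, 100.6]
log_prices = [math.log(p) for p in prices]
rv = realized_volatility(log_prices)
```

A finer sampling grid simply means a longer `log_prices` list; the estimator itself does not change.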
Transactions in the Chinese stock market are concentrated on the Shanghai Stock Exchange (SSE) and the Shenzhen Stock Exchange (SZSE). Trading hours on each day run from 9:30 to 11:30 and from 13:00 to 15:00, four hours in total. However, since asset prices are changing all the time, describing the price change of the whole day using only the information observed in these four hours is inaccurate, so the price changes during non-trading hours must also be considered.
We define the intra-day return as the difference between the logarithm of the daily closing price and the logarithm of the previous daily closing price. A day therefore runs from the closing time of the previous day to the closing time of the current day. Hence, we can divide the intra-day return into four periods:
Phase I. Overnight period from 15:00 on the previous day to 9:30 on the day. The overnight return of a stock is defined as the difference between the logarithm of the day’s opening price at 9:30 and the logarithm of the previous day’s closing price at 15:00, represented by $r_{1t}$;
Phase II. Opening hours from 9:30 to 11:30. The morning return of a stock can be obtained in the same way as calculating the intra-day return, represented by $r_{2t}$;
Phase III. Lunch break from 11:30 to 13:00. The stock’s midday return is defined as the difference between the logarithm of the day’s opening price at 13:00 and the logarithm of the day’s closing price at 11:30, represented by $r_{3t}$;
Phase IV. Opening hours from 13:00 to 15:00. The afternoon return of a stock can be obtained in the same way as calculating the intra-day return, represented by $r_{4t}$.
The volatilities of overnight and midday returns can be expressed as the square of the returns, that is, $r_{1t}^2$ and $r_{3t}^2$. The volatilities of morning and afternoon returns are represented by realized volatility, that is, $RV_{2t}$ and $RV_{4t}$. The partition results of the four time periods are shown in
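The four-phase split above can be sketched as follows. This is an illustration under assumed inputs: the function names, the argument layout (one list of prices per trading session) and the sample prices are hypothetical and not taken from the paper.

```python
import math


def realized_volatility(log_prices):
    """Sum of squared increments of a log-price series."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))


def period_volatilities(prev_close, morning, afternoon):
    """Return the volatility of each of the four phases of one day.

    prev_close: previous day's 15:00 closing price
    morning:    prices from 9:30 to 11:30 (first = open, last = 11:30 close)
    afternoon:  prices from 13:00 to 15:00 (first = 13:00 open, last = close)
    """
    log_m = [math.log(p) for p in morning]
    log_a = [math.log(p) for p in afternoon]
    r1 = log_m[0] - math.log(prev_close)  # Phase I: overnight return
    r3 = log_a[0] - log_m[-1]             # Phase III: lunch-break return
    return {
        "r1_sq": r1 ** 2,                    # overnight volatility
        "rv2": realized_volatility(log_m),   # Phase II: morning RV
        "r3_sq": r3 ** 2,                    # midday volatility
        "rv4": realized_volatility(log_a),   # Phase IV: afternoon RV
    }


# Hypothetical prices for a single day.
parts = period_volatilities(
    prev_close=100.0,
    morning=[100.4, 100.6, 100.3],
    afternoon=[100.3, 100.1, 100.5],
)
```

In this made-up example the 13:00 opening price equals the 11:30 closing price, so the midday component is zero, mirroring the very small lunch-break volatility reported in the paper.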
To compare daily volatility across the different periods, we use 1-minute high-frequency data of the Shanghai Composite Index and the Shenzhen Component Index from January 2, 2014 to November 2, 2015, comprising 429 days of valid data for the Shanghai Composite Index and 437 days of valid data for the Shenzhen Component Index. Because some of the 1-minute observations of the Shanghai Composite Index are missing or abnormal, we first preprocess the data by interpolation so that the analysis reflects the actual situation. To reduce the influence of microstructure noise in the high-frequency data, we carry out the analysis at sampling intervals of 5, 10, 15, 20 and 30 minutes. The results show that the statistical characteristics of the volatilities display roughly the same pattern across these sampling intervals.
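The paper does not state which interpolation scheme is used in the preprocessing step. As one possible reading, the sketch below fills missing 1-minute prices by linear interpolation between the nearest known neighbours; the function name and the use of `None` to mark a missing value are assumptions made for illustration.

```python
def interpolate_missing(prices):
    """Fill None entries by linear interpolation between known neighbours.

    Leading or trailing None values are left untouched, since they have
    only one known neighbour.
    """
    filled = list(prices)
    known = [i for i, p in enumerate(filled) if p is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for k in range(1, gap):
            weight = k / gap
            filled[left + k] = filled[left] + weight * (filled[right] - filled[left])
    return filled


# Hypothetical 1-minute prices with two missing observations.
raw = [100.0, None, 101.0, 101.5, None, 102.5]
clean = interpolate_missing(raw)
```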
It can be seen from
Statistical Characteristics | Mean | Variance |
---|---|---|
$r_{1t}^2$ | 1.30e−04 | 1.43e−07 |
$RV_{2t}$ | 1.85e−04 | 1.02e−07 |
$r_{3t}^2$ | 4.82e−06 | 5.95e−10 |
$RV_{4t}$ | 1.46e−04 | 7.21e−08 |
Source: Data from Shanghai Stock Exchange.
Statistical Characteristics | Mean | Variance |
---|---|---|
$r_{1t}^2$ | 1.13e−04 | 2.41e−07 |
$RV_{2t}$ | 1.70e−04 | 1.02e−07 |
$r_{3t}^2$ | 1.93e−07 | 3.00e−12 |
$RV_{4t}$ | 1.45e−04 | 7.71e−08 |
Source: Data from Shenzhen Stock Exchange.
which is basically consistent with the reality of stock markets. The overnight volatility is slightly lower than the afternoon volatility, but it is of the same order of magnitude as the average volatilities over the trading periods. Hence the overnight effect is evident and should not be ignored. Because the volatility of midday returns is extremely small compared with that of the other three periods, in the following analysis we take overnight volatility, morning realized volatility and afternoon realized volatility as the main components of daily volatility.
The intra-day return of the stock market is defined as the difference between the logarithm of the day’s closing price and the logarithm of the previous day’s closing price. According to the division of the four time periods in
To simplify notation, we define $E_t(\cdot) = E[\,\cdot \mid IV_t]$ to represent conditional expectations. The basic assumptions discussed below are:
Assumption I. $E_t[IV_{it}] = \delta_i IV_t$, $i = 1, 2, 3, 4$;
Assumption II. $E_t[\omega_i RV_{it} - IV_{it}] = 0$, $i = 2, 4$;
Assumption III. $E_t[\omega_i r_{it}^2 - IV_{it}] = 0$, $i = 1, 3$;
Assumption IV. $\operatorname{Cov}(IV_{it}, IV_{jt}) = 0$, $i \ne j$, $i, j = 1, 2, 3, 4$;
where $\delta_i$ and $\omega_i$ are constants.
Assumption I states that the conditional expectation of the integrated volatility of each period is a fixed proportion of the integrated volatility of the whole day. Assumption II requires that, in conditional expectation, the realized volatilities of the morning and afternoon returns are proportional to the integrated volatilities of the corresponding periods. Assumption III requires the same of the squared overnight and midday returns. Assumption IV means that the integrated volatilities of different periods are uncorrelated; it is imposed to simplify the problem and generally does not hold in practice.
To give a more accurate description of the volatility in stock market, based on the model of Hansen and Lunde, we subdivide the opening hours into morning and afternoon and add in volatility of overnight returns. Hence, we define the daily volatility measure of the Chinese stock market as a linear combination of overnight volatility, morning realized volatility and afternoon realized volatility:
$$RV_t \equiv \theta_1 r_{1t}^2 + \theta_2 RV_{2t} + \theta_4 RV_{4t} \tag{3}$$
where $\theta \equiv (\theta_1, \theta_2, \theta_4)'$ is the parameter to be estimated.
To facilitate this discussion, we define the unconditional expectations $\mu \equiv E[IV_t]$, $\mu_1 \equiv E[r_{1t}^2]$, $\mu_2 \equiv E[RV_{2t}]$, $\mu_4 \equiv E[RV_{4t}]$. Then we have the following conclusions.
Theorem 1. For all $\theta \equiv (\theta_1, \theta_2, \theta_4)'$ that satisfy $\theta_1\mu_1 + \theta_2\mu_2 + \theta_4\mu_4 = \mu$, we have $E_t[RV_t] = IV_t$.
Proof of Theorem 1: see Appendix.
The purpose of the established volatility measure R V t is to better describe integrated volatility I V t , and the difference between them can be measured by the mean square error. So this problem turns into the following optimization problem.
$$\min_{\theta \in \Theta} \left\{ E\left[(RV_t - IV_t)^2\right] \right\} \tag{4}$$
where $\Theta$ is the set of admissible values of the parameter $\theta = (\theta_1, \theta_2, \theta_4)'$.
In practice, however, $IV_t$ is the integral of instantaneous volatility and is therefore an unobservable latent variable. Moreover, the asset price itself is contaminated by microstructure noise, so an accurate measurement of the actual $IV_t$ is not available and another approach is needed. We therefore give the following theorem.
Theorem 2. For $\theta \in \Theta$, if $E_t[RV_t - IV_t] = 0$, then $\min_{\theta \in \Theta} \{E[(RV_t - IV_t)^2]\}$ is equivalent to $\min_{\theta \in \Theta} D[RV_t]$, where $D[RV_t]$ is the variance of $RV_t$.
Proof of Theorem 2: see Appendix.
From Theorem 2, the original problem can be transformed into the constrained optimization problem:
$$\min_{\theta \in \Theta} D[RV_t] = \min_{\theta \in \Theta} D(\theta_1 r_{1t}^2 + \theta_2 RV_{2t} + \theta_4 RV_{4t}) \quad \text{s.t.} \quad \theta_1\mu_1 + \theta_2\mu_2 + \theta_4\mu_4 = \mu$$
Hence, we have the following conclusions.
Theorem 3. The solution of the optimization problem $\min_{\theta \in \Theta} D(\theta_1 r_{1t}^2 + \theta_2 RV_{2t} + \theta_4 RV_{4t})$ s.t. $\theta_1\mu_1 + \theta_2\mu_2 + \theta_4\mu_4 = \mu$ is
$$\hat\theta_1 = \alpha\frac{\mu}{\mu_1}; \quad \hat\theta_2 = \beta\frac{\mu}{\mu_2}; \quad \hat\theta_4 = (1-\alpha-\beta)\frac{\mu}{\mu_4} \tag{5}$$
where, writing $c_{11} \equiv \sigma_{11}/\mu_1^2$, $c_{22} \equiv \sigma_{22}/\mu_2^2$, $c_{33} \equiv \sigma_{33}/\mu_4^2$, $c_{12} \equiv \sigma_{12}/(\mu_1\mu_2)$, $c_{13} \equiv \sigma_{13}/(\mu_1\mu_4)$ and $c_{23} \equiv \sigma_{23}/(\mu_2\mu_4)$ for brevity,
$$\alpha = \frac{c_{33}-c_{13}}{c_{11}+c_{33}-2c_{13}} + \frac{c_{13}+c_{23}-c_{12}-c_{33}}{c_{11}+c_{33}-2c_{13}}\,\beta$$
$$\beta = \frac{(c_{23}-c_{13})(c_{11}+c_{33}-2c_{13}) + (c_{11}-c_{13}-c_{12}+c_{23})(c_{13}-c_{33})}{(c_{13}+c_{23}-c_{12}-c_{33})(c_{11}-c_{13}-c_{12}+c_{23}) - (c_{11}+c_{33}-2c_{13})(c_{22}-c_{12}-c_{23}+c_{13})}$$
and $\sigma_{11} \equiv D(r_{1t}^2)$, $\sigma_{22} \equiv D(RV_{2t})$, $\sigma_{33} \equiv D(RV_{4t})$, $\sigma_{12} \equiv \operatorname{Cov}(r_{1t}^2, RV_{2t})$, $\sigma_{13} \equiv \operatorname{Cov}(r_{1t}^2, RV_{4t})$, $\sigma_{23} \equiv \operatorname{Cov}(RV_{2t}, RV_{4t})$.
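In particular, when Assumption IV is true, that is, $\sigma_{12} \equiv \operatorname{Cov}(r_{1t}^2, RV_{2t}) = 0$, $\sigma_{13} \equiv \operatorname{Cov}(r_{1t}^2, RV_{4t}) = 0$, $\sigma_{23} \equiv \operatorname{Cov}(RV_{2t}, RV_{4t}) = 0$, then we have
$$\hat\theta_1 = \frac{\mu\mu_1\sigma_{22}\sigma_{33}}{\mu_4^2\sigma_{11}\sigma_{22} + \mu_1^2\sigma_{22}\sigma_{33} + \mu_2^2\sigma_{11}\sigma_{33}}, \quad \hat\theta_2 = \frac{\mu\mu_2\sigma_{11}\sigma_{33}}{\mu_4^2\sigma_{11}\sigma_{22} + \mu_1^2\sigma_{22}\sigma_{33} + \mu_2^2\sigma_{11}\sigma_{33}}, \quad \hat\theta_4 = \frac{\mu - \hat\theta_1\mu_1 - \hat\theta_2\mu_2}{\mu_4} \tag{6}$$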
Proof of Theorem 3: see Appendix.
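The constrained problem of Theorem 3 can also be solved numerically. The sketch below does not implement the closed form (5); it uses the standard Lagrangian solution of a minimum-variance problem with one linear constraint, $\theta = \mu\,\Sigma^{-1}m / (m'\Sigma^{-1}m)$, where $m = (\mu_1, \mu_2, \mu_4)'$ and $\Sigma$ is the covariance matrix of $(r_{1t}^2, RV_{2t}, RV_{4t})$, assumed positive definite. The function names and all numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def optimal_weights(means, cov, mu):
    """Minimise theta' cov theta subject to sum(theta_i * means_i) = mu."""
    x = solve_linear(cov, means)  # x = cov^{-1} means
    scale = mu / sum(m * xi for m, xi in zip(means, x))
    return [scale * xi for xi in x]


# Hypothetical inputs with zero covariances, so the result can be
# compared by hand with the special case in equation (6).
means = [1.0, 2.0, 4.0]          # stands in for (mu_1, mu_2, mu_4)
cov = [[2.0, 0.0, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 4.0]]          # diag(sigma_11, sigma_22, sigma_33)
theta = optimal_weights(means, cov, mu=7.0)
```

With these inputs, equation (6) gives $\hat\theta_1 = 28/68$, which is what the solver returns up to rounding. Off-diagonal covariances can be supplied in `cov` in the same way.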
The empirical analysis again uses the 1-minute closing price data of China’s stock market indexes from January 2, 2014 to November 2, 2015. Since most studies have confirmed that 5-minute data are almost unaffected by the microstructure noise of high-frequency data, and our analysis of the 5, 10, 15, 20 and 30-minute intervals shows roughly the same pattern, the empirical part below focuses on the Shanghai Composite Index and the Shenzhen Component Index at a sampling interval of 5 minutes. At this interval, 50 high-frequency observations are recorded on each trading day, giving a total of 21,450 observations over 429 trading days for the Shanghai Composite Index and 21,850 observations over 437 trading days for the Shenzhen Component Index.
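The observation counts above can be reproduced with a short sketch, under the assumption that each two-hour session is recorded as 121 one-minute prices with both endpoints included, so that 5-minute sampling keeps 25 prices per session. The function name and the placeholder series are illustrative only.

```python
def subsample(prices, step):
    """Keep every `step`-th observation, starting from the first."""
    return prices[::step]


# Assumption: 121 one-minute prices per session (9:30..11:30 and 13:00..15:00).
# Integers stand in for prices; only the counts matter here.
morning_1min = list(range(121))
afternoon_1min = list(range(121))

per_day = len(subsample(morning_1min, 5)) + len(subsample(afternoon_1min, 5))
total_shanghai = per_day * 429   # trading days of the Shanghai Composite Index
total_shenzhen = per_day * 437   # trading days of the Shenzhen Component Index
```

Under this assumption `per_day` is 50, matching the figure quoted in the text.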
In Assumption IV, we assume that the volatilities of different periods are uncorrelated. Since this is a very strong assumption, in practical applications we first need to test the correlation between returns and volatilities.
We conduct an autocorrelation analysis of the 5, 10, 15, 20 and 30-minute return series of the two stock indexes. The results show no autocorrelation in the intraday return series at any of these intervals. We then conduct correlation tests on the series of the overnight volatility $r_1^2$, the morning realized volatility $RV_2$ and the afternoon realized volatility $RV_4$. The results are shown in
As can be seen from the sample correlation coefficients in
Using the result of Theorem 3, we calculate the parameters for the 5, 10, 15, 20 and 30-minute sampling intervals. As the final results obtained from the different intervals do not differ much, we choose the 5-minute interval as the representative for the following analysis and explanation. Substituting the return series into the formula in Theorem 3, we obtain the parameters shown in
From the formula $RV_t = \theta_1 r_{1t}^2 + \theta_2 RV_{2t} + \theta_4 RV_{4t}$, we obtain the daily volatility measure that we construct (denoted XRV to distinguish it). We take the square of the intra-day return, $r^2$, as a proxy for the real volatility, and use RV to denote the daily volatility measure constructed only from high-frequency data of trading hours, without taking the overnight effect into account. We calculate the mean and variance of these three volatility measures for the Shanghai Composite Index and the Shenzhen Component Index respectively, and show the statistical characteristics in
Sample Correlation Coefficient | $r_1^2$ | $RV_2$ | $RV_4$ |
---|---|---|---|
$r_1^2$ | 1.0000 | | |
$RV_2$ | 0.3675 | 1.0000 | |
$RV_4$ | 0.2319 | 0.7187 | 1.0000 |
Source: Data from Shanghai Stock Exchange.
Sample Correlation Coefficient | $r_1^2$ | $RV_2$ | $RV_4$ |
---|---|---|---|
$r_1^2$ | 1.0000 | | |
$RV_2$ | 0.4034 | 1.0000 | |
$RV_4$ | 0.2840 | 0.6942 | 1.0000 |
Source: Data from Shenzhen Stock Exchange.
Parameter | Shanghai Composite Index | Shenzhen Component Index |
---|---|---|
$\mu_1$ | 1.13e−04 | 1.30e−04 |
$\mu_2$ | 1.70e−04 | 1.85e−04 |
$\mu_4$ | 1.45e−04 | 1.46e−04 |
$\sigma_{11}$ | 2.41e−07 | 1.43e−07 |
$\sigma_{22}$ | 1.02e−07 | 1.02e−07 |
$\sigma_{44}$ | 7.71e−08 | 7.21e−08 |
$\sigma_{12}$ | 0.3675 | 0.4034 |
$\sigma_{14}$ | 0.2319 | 0.2840 |
$\sigma_{24}$ | 0.7187 | 0.6942 |
$\hat\theta_1$ | 0.4163 | 0.5803 |
$\hat\theta_2$ | 1.1420 | 1.0562 |
$\hat\theta_4$ | 1.2902 | 1.3006 |
It can be seen that the daily volatility measure XRV constructed by us is closer to the real volatility value than the realized volatility measure RV which does not consider the overnight effect.
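As a quick consistency check on the reported numbers, the mean of XRV implied by equation (3) is $\hat\theta_1\mu_1 + \hat\theta_2\mu_2 + \hat\theta_4\mu_4$. The sketch below recomputes it from the values printed in the parameter table and compares it with the XRV means printed in the summary table; the variable names are illustrative, and agreement is only expected up to the rounding of the published figures.

```python
# Values as printed in the parameter table (5-minute sampling).
params = {
    "Shanghai Composite Index": {
        "mu": (1.13e-4, 1.70e-4, 1.45e-4),
        "theta": (0.4163, 1.1420, 1.2902),
    },
    "Shenzhen Component Index": {
        "mu": (1.30e-4, 1.85e-4, 1.46e-4),
        "theta": (0.5803, 1.0562, 1.3006),
    },
}

# Mean of XRV as printed in the summary table of the three measures.
reported_xrv_mean = {
    "Shanghai Composite Index": 4.29e-4,
    "Shenzhen Component Index": 4.62e-4,
}

# Implied mean: theta_1 * mu_1 + theta_2 * mu_2 + theta_4 * mu_4.
implied_mean = {
    name: sum(t * m for t, m in zip(p["theta"], p["mu"]))
    for name, p in params.items()
}
gap = {name: abs(implied_mean[name] - reported_xrv_mean[name]) for name in params}
```

For both indexes the gap is on the order of $10^{-6}$, i.e. the implied and reported means agree to about two significant figures.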
Finally, we use the mean square error (MSE), the most commonly used loss function for this kind of comparison, as the error criterion. By the least squares principle, the smaller the expected squared difference between the estimated value and the real value, the more accurate the model is. It is defined as $MSE = \frac{1}{n}\sum_{i=1}^{n}(\tilde{Y}_i - Y_i)^2$, and the MSEs between the volatility measures and the real volatility are shown in
As we can see from
Based on the above results, we conclude that the daily volatility measure considering the impact of overnight variance and time segment (XRV) is superior to the realized volatility (RV) and reflects the real volatility more comprehensively.
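The MSE comparison described above can be sketched as follows. The daily series are made-up numbers, the helper names are hypothetical, and the trading-hours-only RV is taken here as the sum of the two session realized volatilities, which is an assumption about how that benchmark is built. Only the weights are the Shanghai estimates reported in the paper.

```python
def mse(estimates, targets):
    """Mean square error between an estimated series and a target series."""
    n = len(targets)
    return sum((e - y) ** 2 for e, y in zip(estimates, targets)) / n


def xrv_series(theta, r1_sq, rv2, rv4):
    """Daily measure of equation (3): theta_1 r1^2 + theta_2 RV2 + theta_4 RV4."""
    t1, t2, t4 = theta
    return [t1 * a + t2 * b + t4 * c for a, b, c in zip(r1_sq, rv2, rv4)]


# Hypothetical three-day sample of the components and of the proxy r^2.
r1_sq = [1.0e-4, 2.0e-4, 0.5e-4]
rv2 = [1.5e-4, 2.5e-4, 1.0e-4]
rv4 = [1.2e-4, 1.8e-4, 0.9e-4]
proxy = [4.0e-4, 6.5e-4, 2.5e-4]       # squared close-to-close returns

theta = (0.4163, 1.1420, 1.2902)       # reported Shanghai weights
xrv = xrv_series(theta, r1_sq, rv2, rv4)
rv = [b + c for b, c in zip(rv2, rv4)]  # assumed trading-hours-only benchmark

mse_xrv = mse(xrv, proxy)
mse_rv = mse(rv, proxy)
```

Whichever measure has the smaller MSE against the proxy is judged the better one; with real data the two values would be compared as in the MSE table.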
Stock Index | Volatility Measure | Mean | Variance |
---|---|---|---|
Shanghai Composite Index | XRV | 4.29e−04 | 5.81e−07 |
Shanghai Composite Index | RV | 3.15e−04 | 3.07e−07 |
Shanghai Composite Index | $r^2$ | 3.90e−04 | 9.17e−07 |
Shenzhen Component Index | XRV | 4.62e−04 | 5.53e−07 |
Shenzhen Component Index | RV | 3.36e−04 | 3.09e−07 |
Shenzhen Component Index | $r^2$ | 4.57e−04 | 9.32e−06 |
Stock Index | Volatility Measure | MSE |
---|---|---|
Shanghai Composite Index | XRV | 7.99e−07 |
Shanghai Composite Index | RV | 8.16e−07 |
Shenzhen Component Index | XRV | 7.14e−07 |
Shenzhen Component Index | RV | 7.77e−07 |
Because it uses more intra-day data, the realized volatility measure based on high-frequency return series shows better statistical properties than parametric models in characterizing historical volatility. However, asset prices change continuously, whereas the trading hours of stock markets, divided into a morning and an afternoon session, account for only a small part of the day. Hence the realized volatility built from returns during trading hours cannot fully characterize daily volatility. Based on this observation, the main work of this paper is to establish an optimized realized volatility statistic by analyzing and processing high-frequency trading data of China’s stock market, and to compare it with the original measure through the mean square error. The main empirical results show that, for both the Shanghai Composite Index and the Shenzhen Component Index, the daily volatility measure considering the impact of overnight variance and time segment is superior to the realized volatility measure that ignores them.
This paper proposes a daily volatility measure for the Chinese stock market that considers the impact of overnight variance and time segment. This approach helps us to better understand the volatility structure of the Chinese stock market and gives a more accurate measure of volatility. However, high-frequency return series are affected by microstructure noise, and asset prices exhibit jumps in addition to continuous changes; both factors bias the realized volatility measure to some extent. Improving the volatility measure to account for microstructure noise and price jumps is therefore a direction for future work.
The authors declare no conflicts of interest regarding the publication of this paper.
Shi, Y. and Li, H.D. (2018) Research on the Daily Volatility Measure Considering the Impact of Overnight Variance and Time Segment in Chinese Stock Market. Journal of Mathematical Finance, 8, 549-561. https://doi.org/10.4236/jmf.2018.83035
1) Proof of Theorem 1:
By Assumptions I to III, we have
$$E_t[RV_t] = E_t[\theta_1 r_{1t}^2 + \theta_2 RV_{2t} + \theta_4 RV_{4t}] = \theta_1\frac{\delta_1}{\omega_1} IV_t + \theta_2\frac{\delta_2}{\omega_2} IV_t + \theta_4\frac{\delta_4}{\omega_4} IV_t = \left[\theta_1\frac{\delta_1}{\omega_1} + \theta_2\frac{\delta_2}{\omega_2} + \theta_4\frac{\delta_4}{\omega_4}\right] IV_t$$
and by the law of total expectation, we have
$$E[RV_t] = E[E_t(RV_t)] = \left[\theta_1\frac{\delta_1}{\omega_1} + \theta_2\frac{\delta_2}{\omega_2} + \theta_4\frac{\delta_4}{\omega_4}\right]\mu$$
and
$$E[RV_t] = E[\theta_1 r_{1t}^2 + \theta_2 RV_{2t} + \theta_4 RV_{4t}] = \theta_1\mu_1 + \theta_2\mu_2 + \theta_4\mu_4 = \mu$$
then we have
$$\theta_1\frac{\delta_1}{\omega_1} + \theta_2\frac{\delta_2}{\omega_2} + \theta_4\frac{\delta_4}{\omega_4} = 1$$
so $E_t[RV_t] = IV_t$.
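2) Proof of Theorem 2:
We define $D_t(\cdot) = D[\,\cdot \mid IV_t]$. Since $IV_t$ is fixed under the conditioning and $E_t[RV_t - IV_t] = 0$, decomposing the conditional second moment gives
$$E_t[(RV_t - IV_t)^2] = D_t[RV_t - IV_t] + [E_t(RV_t - IV_t)]^2 = D_t[RV_t]$$
Therefore, by the law of total expectation and the law of total variance, we have
$$E[(RV_t - IV_t)^2] = E\{E_t[(RV_t - IV_t)^2]\} = E\{D_t[RV_t]\} = D[RV_t] - D[E_t(RV_t)] = D[RV_t] - D[IV_t]$$
Because $D[IV_t]$ does not depend on $\theta$, minimizing $E[(RV_t - IV_t)^2]$ over $\theta$ is equivalent to minimizing $D[RV_t]$.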
3) Proof of Theorem 3:
Consider the minimum variance of the linear combination $\alpha X_1 + \beta X_2 + (1-\alpha-\beta)X_3$ of random variables $X_1, X_2, X_3$, that is, $\min_{\alpha,\beta} D[\alpha X_1 + \beta X_2 + (1-\alpha-\beta)X_3]$.
Let
$$F(\alpha,\beta) = D[\alpha X_1 + \beta X_2 + (1-\alpha-\beta)X_3] = \alpha^2 DX_1 + \beta^2 DX_2 + (1-\alpha-\beta)^2 DX_3 + 2\alpha\beta\operatorname{Cov}(X_1,X_2) + 2\alpha(1-\alpha-\beta)\operatorname{Cov}(X_1,X_3) + 2\beta(1-\alpha-\beta)\operatorname{Cov}(X_2,X_3)$$
To simplify the calculation, let $c_{11} \equiv DX_1$, $c_{22} \equiv DX_2$, $c_{33} \equiv DX_3$, $c_{12} \equiv \operatorname{Cov}(X_1,X_2)$, $c_{13} \equiv \operatorname{Cov}(X_1,X_3)$, $c_{23} \equiv \operatorname{Cov}(X_2,X_3)$. Taking the partial derivatives with respect to $\alpha$ and $\beta$ and setting them to 0, we have
$$\alpha = \frac{c_{33}-c_{13}}{c_{11}+c_{33}-2c_{13}} + \frac{c_{13}+c_{23}-c_{12}-c_{33}}{c_{11}+c_{33}-2c_{13}}\,\beta$$
$$\beta = \frac{(c_{23}-c_{13})(c_{11}+c_{33}-2c_{13}) + (c_{11}-c_{13}-c_{12}+c_{23})(c_{13}-c_{33})}{(c_{13}+c_{23}-c_{12}-c_{33})(c_{11}-c_{13}-c_{12}+c_{23}) - (c_{11}+c_{33}-2c_{13})(c_{22}-c_{12}-c_{23}+c_{13})}$$
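As a worked intermediate step, the two first-order conditions behind the expressions for $\alpha$ and $\beta$ can be written as a pair of linear equations. The block below is a reconstruction from the definition of $F(\alpha,\beta)$ and the $c_{ij}$ given in this proof, not text from the original derivation.

```latex
\begin{aligned}
\tfrac{1}{2}\,\frac{\partial F}{\partial \alpha} = 0 :\quad
  & \alpha\,(c_{11} + c_{33} - 2c_{13})
    - \beta\,(c_{13} + c_{23} - c_{12} - c_{33}) = c_{33} - c_{13}, \\
\tfrac{1}{2}\,\frac{\partial F}{\partial \beta} = 0 :\quad
  & -\alpha\,(c_{13} + c_{23} - c_{12} - c_{33})
    + \beta\,(c_{22} + c_{33} - 2c_{23}) = c_{33} - c_{23}.
\end{aligned}
```

Solving the first equation for $\alpha$ gives $\alpha$ as a linear function of $\beta$, and substituting it into the second equation yields $\beta$ in terms of the $c_{ij}$ alone.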
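By the condition $\theta_1\mu_1 + \theta_2\mu_2 + \theta_4\mu_4 = \mu$ and normalizing, that is, setting $\alpha = \theta_1\mu_1/\mu$ and $\beta = \theta_2\mu_2/\mu$ so that $\theta_4\mu_4/\mu = 1-\alpha-\beta$, we have
$$RV_t = \mu\left[\alpha\frac{r_{1t}^2}{\mu_1} + \beta\frac{RV_{2t}}{\mu_2} + (1-\alpha-\beta)\frac{RV_{4t}}{\mu_4}\right]$$
and, applying the result above with $X_1 = r_{1t}^2/\mu_1$, $X_2 = RV_{2t}/\mu_2$ and $X_3 = RV_{4t}/\mu_4$, we can get $\alpha$ and $\beta$.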