Journal of Mathematical Finance, 2012, 2, 13-30
http://dx.doi.org/10.4236/jmf.2012.21002 Published Online February 2012 (http://www.SciRP.org/journal/jmf)

A Comparison of VaR Estimation Procedures for Leptokurtic Equity Index Returns*

Malay Bhattacharyya1, Siddarth Madhav R2
1Indian Institute of Management Bangalore, Bangalore, India
2Barclays Capital, New York, USA
Email: malayb@iimb.ernet.in, rsmadhav@gmail.com
Received July 11, 2011; revised August 1, 2011; accepted August 29, 2011

ABSTRACT

The paper presents and tests Dynamic Value at Risk (VaR) estimation procedures for equity index returns. Volatility clustering and leptokurtosis are well-documented characteristics of such time series. An ARMA (1, 1)-GARCH (1, 1) approach models the inherent autocorrelation and dynamic volatility. Fat-tailed behavior is modeled in two ways. In the first approach, the ARMA-GARCH process is run assuming alternatively that the standardized residuals are distributed with Pearson Type IV, Johnson SU, Manly's exponential transformation, normal and t-distributions. In the second approach, the ARMA-GARCH process is run with the pseudo-normal assumption, the parameters calculated with the pseudo maximum likelihood procedure, and the standardized residuals are later alternatively modeled with Mixture of Normal distributions, Extreme Value Theory and other power transformations such as John-Draper, Bickel-Doksum, Manly, Yeo-Johnson and certain combinations of the above. The first approach yields five models, and the second approach yields nine. These are tested with six equity index return time series using rolling windows. These models are compared by computing the 99%, 97.5% and 95% VaR violations and contrasting them with the expected number of violations.

Keywords: Dynamic VaR; GARCH; EVT; Johnson SU; Pearson Type IV; Mixture of Normal Distributions; Manly; John-Draper; Yeo-Johnson Transformations

1. Introduction

VALUE AT RISK (VaR) is a popular measure of risk in a portfolio of assets.
It represents a high quantile of the loss distribution for a particular horizon, providing a loss threshold that is exceeded only a small percentage of the time.

Traditional methods of calculating VaR include historical simulation and the analytic variance-covariance approach. However, these models fall short when tested against actual market conditions. The historical simulation approach assumes constant volatility of stocks over an extended period of time. It fails to account for the phenomenon of volatility clustering, in which periods of high volatility and periods of low volatility each tend to occur together. This leads to underestimation of VaR during periods of high volatility, and overestimation in times of calm. The analytic variance-covariance approach assumes that returns are jointly normally distributed. However, the fat-tailed non-normal behaviour of returns means that this methodology tends to underestimate VaR as well.

Fama [1] and Mandelbrot [2] report the failure of the normal distribution to model asset returns, sparking a slew of papers addressing the issue of accurately modeling leptokurtic time series with volatility clustering. The approaches can be roughly divided in two, the first assuming that returns are independent and modeling the unconditional distribution of returns. In this approach, numerous distributions have been proposed: Fama [1] and Mandelbrot [2] use the stable Paretian distribution, and Blattberg and Gonedes [3] suggest the use of the Student t-distribution. The mixture of normal distributions is used by Ball and Torous [4] and Kon [5], and the logistic distribution, the empirical power distribution and the Student t-distributions have been compared by Gray and French [6]. The Pearson Type IV distribution is used by Bhattacharyya, Chaudhary and Yadav [7] for dynamic VaR estimation and by Bhattacharyya, Misra and Kodase [8] for dynamic MaxVaR estimation. Bhattacharyya and Ritolia [9] use EVT for dynamic VaR estimation.
The second approach considers returns to be serially correlated and uses conditional variance models or stochastic volatility models to model asset returns. Engle [10] and Bollerslev [11] use ARCH and GARCH models to account for volatility clustering. GARCH models have been shown to be more suited to this purpose by various studies such as Poon and Granger [12]. The GARCH (1, 1) model performs well for most stock returns and this paper adopts this approach.

*This work was carried out when Siddarth Madhav R was a graduate student at the Indian Institute of Management Bangalore.
Copyright © 2012 SciRes. JMF

The following model has been extensively used to model dynamism in forecasts of returns and volatility of returns.

$$X_t = \mu_t + \sigma_t Z_t \tag{1}$$

where $X_t$ is the actual return on day t, $\mu_t$ is the expected return on day t, $\sigma_t$ is the volatility estimate on day t and $Z_t$ is the standardized residual, having a normal distribution with zero mean and unit standard deviation. ARMA processes are useful for modeling $\mu_t$, the predicted mean of the time series data, and GARCH processes are good models for $\sigma_t$, the predicted volatility. However, the inherent leptokurtic behaviour of asset returns makes the ARMA-GARCH model insufficient for the purpose of calculating VaR.

In this paper, an ARMA (1, 1) model is used for the calculation of the predicted mean and a GARCH (1, 1) model is used for modeling the observed volatility clustering. Models are developed using two approaches. In the first one, consisting of five models, ARMA-GARCH model parameters are calculated assuming that standardized residuals alternatively follow the Pearson Type IV distribution, the Johnson $S_U$ distribution, Manly's exponential transformation, normal and Student t-distributions. In the second approach, the ARMA-GARCH parameters are calculated using the pseudo-normal assumption, i.e., assuming that standardized residuals are normally distributed, and they are later modeled using the mixture of normal distributions, Extreme Value Theory, and other power transformations such as John-Draper, Bickel-Doksum, Manly, Yeo-Johnson and certain combinations of the above. The second approach yields nine models.

While developing and testing VaR models, the authors find it important to develop those that are applicable in real world scenarios.
This translates to a certain simplicity in execution and fast runtimes for calculations, as time can be a critical issue. At the same time, the importance of creating an accurate measure of risk cannot be overstated, given how the stock market crash of 2008 bankrupted firms and individuals alike, and sent the world spiraling into recession.

2. Leptokurtic Density Functions

2.1. Pearson Type IV Distribution

The Pearson family of curves, a generalized family of frequency curves developed by Karl Pearson, embodies a wide range of commonly observed distributions. The Pearson curves are a solution to the differential equation

$$\frac{1}{f(x)}\frac{\mathrm{d}f(x)}{\mathrm{d}x} = \frac{x - \alpha}{c_0 + c_1 x + c_2 x^2} \tag{2}$$

The system of curves which arise from the above differential equation covers a wide spectrum of skewness and kurtosis (Figure 1). The type of distribution obtained post-integration is dictated by the roots of the quadratic equation $c_0 + c_1 x + c_2 x^2 = 0$.

Figure 1. The diagram of the Pearson curve family. It shows the type of curve to be used for each range of skewness and kurtosis. The x-axis is β1 = skewness², and the y-axis is β2, the traditional kurtosis.

The Type IV curve is obtained when the roots of the quadratic equation $c_0 + c_1 x + c_2 x^2 = 0$ are complex, i.e., when $c_1^2 < 4 c_0 c_2$. It is suitable for those distributions which have high excess kurtosis and moderate skewness. Financial return data fall in this category. The probability density function (PDF) of the Type IV curve (Heinrich, [13]) is

$$f(x) = k\left[1 + \left(\frac{x - \lambda}{a}\right)^2\right]^{-m}\exp\left[-\nu\tan^{-1}\left(\frac{x - \lambda}{a}\right)\right] \tag{3}$$

where λ, a, ν and m are real parameters (functions of α, $c_0$, $c_1$ and $c_2$), m > 1/2, and k is a normalizing constant, dependent on λ, a, ν and m.

The PDF gives rise to a bell shaped curve, where λ is the location parameter, a is the scale parameter, and ν and m can be interpreted as the skewness and kurtosis parameters respectively.

The type of Pearson curve to use for a particular situation is dictated by the skewness and kurtosis. Table 1 shows the observed skewness and excess kurtosis for the six equity indices. Cross-referencing them with Figure 1, we can see that the Pearson Type IV curve is the model to be used.

For a standardized Pearson Type IV curve, i.e., with zero mean and unit standard deviation, we need to add the following constraints, where $r = 2(m - 1)$.

$$a = \sqrt{\frac{r^2 (r - 1)}{r^2 + \nu^2}} \tag{4}$$

$$\lambda = \frac{a\nu}{r} \tag{5}$$

2.2. Johnson SU Distribution

The Johnson family of distributions (Johnson, [14]) consists of three distributions, which cover all possible average, standard deviation, skewness and kurtosis values, excluding the impossible region. These consist of the $S_U$, the $S_B$ and the lognormal curves. The transformations have the general form

$$Z = \gamma + \delta\, g\!\left(\frac{X - \xi}{\lambda}\right) \tag{6}$$

where, among the transformation parameters, ξ is the location, λ is the scale, and γ and δ are shape parameters. Z is the resulting normal variable. $g(\cdot)$ is one of the following functions:

$$g(y) = \begin{cases} \ln y & \text{Lognormal distribution} \\ \sinh^{-1} y & S_U \text{ distribution} \\ \ln\left(\dfrac{y}{1 - y}\right) & S_B \text{ distribution} \\ y & \text{Normal distribution} \end{cases}$$

Since we are modeling the innovations of the ARMA-GARCH process, we require a transformation function which can accept arguments that may be positive or negative. Hence we need to use the Johnson $S_U$ distribution, as the inverse hyperbolic sine function has the whole real line as its domain. So we have

$$Z = \gamma + \delta \sinh^{-1}\!\left(\frac{X - \xi}{\lambda}\right) \tag{7}$$

where δ and λ are assumed to be positive.

The density function of the Johnson $S_U$ distribution can be easily found in closed form from variable transformation:

$$f(x; \xi, \lambda, \gamma, \delta) = \frac{\delta}{\lambda\sqrt{1 + \left(\frac{x - \xi}{\lambda}\right)^2}}\;\phi\!\left[\gamma + \delta\sinh^{-1}\!\left(\frac{x - \xi}{\lambda}\right)\right] \tag{8}$$

where $x \in R$, $\phi$ is the density function of $N(0, 1)$, ξ and λ > 0 are location and scale parameters respectively, γ can be interpreted as a skewness parameter, and δ > 0 can be interpreted as a kurtosis parameter. The distribution is positively or negatively skewed according to whether γ is negative or positive. Holding γ constant and increasing δ reduces the kurtosis. However, γ and δ cannot be viewed purely as skewness or kurtosis parameters, respectively. The mean and the variance of the Johnson SU distribution are given as:

$$\mu = \xi - \lambda\,\omega^{1/2}\sinh\Omega \tag{9}$$

$$\sigma^2 = \frac{\lambda^2}{2}\,(\omega - 1)\,(\omega\cosh 2\Omega + 1) \tag{10}$$

where $\omega = \exp(\delta^{-2})$ and $\Omega = \gamma/\delta$.

2.3. Extreme Value Theory

Extreme value theory provides a framework to formalize the study of behavior in the tails of a distribution. According to the Fisher-Tippet theorem, there can be three possible extreme value distributions for the standardized variable.

Table 1. Comparison of moments for each stock index return series.

Index            Sensex           NIFTY            DJI              FTSE             HSI              Nikkei
Observations     1500             1500             1500             1500             1500             1500
Dates            Mar 03 - Feb 09  Mar 03 - Feb 09  Mar 03 - Feb 09  Mar 03 - Feb 09  Mar 03 - Feb 09  Mar 03 - Feb 09
Mean             0.0009           0.0009           0.0001           0.0002           0.0004           0.0001
Std. Deviation   0.0178           0.0181           0.0124           0.0127           0.0169           0.0163
Skewness         −0.4276          −0.5130          0.2624           0.1409           0.3876           −0.2730
Kurtosis         7.2358           8.6112           17.3926          14.5248          15.5344          12.8085
2.3.1. Gumbel Distribution

As with the normal and gamma distributions, the tail can be unbounded, have finite moments and decay exponentially. The distribution function is given by:

$$G(x) = \exp\left(-e^{-x}\right) \quad \text{for } x \in R \tag{11}$$

2.3.2. Frechet Distribution

The tail can be unbounded, and decay by a power as with the Cauchy and Student t-distribution. The distribution function is given by

$$G(x) = \begin{cases} 0 & \text{for } x \le 0 \\ \exp\left(-x^{-\alpha}\right) & \text{for } x > 0 \end{cases} \tag{12}$$

Moments exist only up to the integer part of α. Higher moments do not exist because the tails are fat and are not integrable when weighted by tail probabilities.

2.3.3. Weibull Distribution

The tails are constant-declining, and all moments exist. They are thin, and have upper bounds. The distribution function is:

$$G(x) = \begin{cases} \exp\left(-(-x)^{\alpha}\right) & \text{for } x < 0 \\ 1 & \text{for } x \ge 0 \end{cases} \tag{13}$$

Now, since the financial returns data are fat-tailed and unbounded, we must clearly use the Frechet distribution for modeling extreme value distributions.

2.3.4. Generalized Extreme Value Distribution

The Generalized Extreme Value Distribution (GEVD) unifies the above three distributions. Here the tail index (τ) is the inverse of the shape parameter (α). In the equation given below, if τ = 0, it is a Gumbel distribution, if τ > 0, it is a Frechet distribution, else if τ < 0 it is a Weibull distribution.

$$F(x) = \begin{cases} \exp\left(-(1 + \tau x)^{-1/\tau}\right) & \text{for } \tau \ne 0 \\ \exp\left(-e^{-x}\right) & \text{for } \tau = 0 \end{cases} \tag{14}$$

To build the series of maxima or minima, there are two methods:

2.3.5. Block Maxima

This approach consists of splitting the series into equal non-overlapping blocks. The maximum from each block is extracted and used to model the extreme value distribution. As volatility clustering is a well observed phenomenon in financial data, very high or very low observations tend to occur together. Thus, this technique runs the risk of losing extreme observations.

2.3.6. Peak over Threshold

The second approach consists of sampling maxima by selecting those that exceed a chosen threshold. A low threshold would give rise to a larger number of observations, running the risk of including central observations in the extremes data. The tail index computed has lesser variance but is subject to bias. A high threshold has few observations, and the tail index is more imprecise, but unbiased. The choice of the threshold is thus a tradeoff between variance and bias. For the analysis in this paper, we use the Peak over Threshold method.

2.4. Mixture of Normal Distributions

The mixture of normal distributions, used to model fat-tailed distributions, assumes that each observation is generated from one of N normal distributions. The probability that it is generated from a distribution "i" is $p_i$, with $\sum_{i=1}^{N} p_i = 1$. The resultant density function is

$$f(x; \theta) = p_1\phi_1(x) + p_2\phi_2(x) + \cdots + p_N\phi_N(x) = \sum_{i=1}^{N} p_i\,\phi_i(x) \tag{15}$$

where $\phi_i$ is a normal distribution with mean $\mu_i$ and standard deviation $\sigma_i$. For the special case of N = 2, we have

$$f(x; \theta) = p\,\phi_1(x) + (1 - p)\,\phi_2(x) \tag{16}$$

where $\theta = (p, \mu_1, \mu_2, \sigma_1, \sigma_2)$ is the parameter vector.

For a mixture of N normal distributions, the first four moments (mean μ, variance σ², skewness S and kurtosis K) are:

$$\mu = \sum_{i=1}^{N} p_i\,\mu_i \tag{17}$$

$$\sigma^2 = \sum_{i=1}^{N} p_i\,\sigma_i^2 + \sum_{i=1}^{N} p_i\,\mu_i^2 - \mu^2 \tag{18}$$

$$S = \frac{1}{\sigma^3}\left[\sum_{i=1}^{N} p_i\left(3\mu_i\sigma_i^2 + \mu_i^3\right) - 3\mu\sigma^2 - \mu^3\right] \tag{19}$$

$$K = \frac{1}{\sigma^4}\left[\sum_{i=1}^{N} p_i\left(3\sigma_i^4 + 6\mu_i^2\sigma_i^2 + \mu_i^4\right) - 4\mu S\sigma^3 - 6\mu^2\sigma^2 - \mu^4\right] \tag{20}$$

A mixture of more than two normal distributions may provide a better fit to the series, but Tucker [15] reports that the improvement from increasing the number of normal distributions in the mixture beyond two is not too significant. Estimation of parameters for the mixture of normal distributions is problematic. This is because, although we have a well defined distribution function in a closed form, using maximum likelihood techniques for parameter estimation leads to convergence issues (Hamilton, [16]).
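The moment formulas of the mixture, Equations (17) to (20), can be evaluated directly. The following is a minimal sketch under the reading that the skewness and kurtosis are the standardized third and fourth central moments; the function name and the example mixture are hypothetical. It shows that a single normal component gives skewness 0 and kurtosis 3, while a small weight on a high-variance component is enough to produce excess kurtosis.

```python
import math


def mixture_moments(probs, means, sigmas):
    """Mean, variance, skewness and kurtosis of a mixture of normals (Equations (17)-(20))."""
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("mixing probabilities must sum to one")
    mu = sum(p * m for p, m in zip(probs, means))
    var = sum(p * (s ** 2 + m ** 2) for p, m, s in zip(probs, means, sigmas)) - mu ** 2
    sd = math.sqrt(var)
    third_raw = sum(p * (3.0 * m * s ** 2 + m ** 3) for p, m, s in zip(probs, means, sigmas))
    skew = (third_raw - 3.0 * mu * var - mu ** 3) / sd ** 3
    fourth_raw = sum(
        p * (3.0 * s ** 4 + 6.0 * m ** 2 * s ** 2 + m ** 4)
        for p, m, s in zip(probs, means, sigmas)
    )
    kurt = (fourth_raw - 4.0 * mu * skew * sd ** 3 - 6.0 * mu ** 2 * var - mu ** 4) / sd ** 4
    return mu, var, skew, kurt


# A single normal component: skewness 0 and kurtosis 3.
single = mixture_moments([1.0], [1.0], [2.0])

# Illustrative two-component mixture (assumed values): 95% calm, 5% turbulent.
fat = mixture_moments([0.95, 0.05], [0.0, 0.0], [0.9, 2.2])

print("single normal:", single)
print("two-component mixture:", fat)
```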
Using the method of moments is another option, but even for the simplest case of N = 2, we need five moment equations to find the five parameters $(p, \mu_1, \mu_2, \sigma_1, \sigma_2)$, and there may not be a solution at all (Titterington, Smith and Makov, [17]). Alternate methods have been suggested, such as fractile-to-fractile comparisons (Hull and White, [18]) and Bayesian updating schemes (Zangari, [19]). This paper uses the fractile-to-fractile comparison technique along with a simplifying assumption that one of the means of the mixture of normal distributions is zero. This is a reasonable assumption for the data set, as most observations (about 95%) lie in the zero-mean normal distribution, and it simplifies calculations considerably.

2.5. Power Transformations

Box and Cox [20] propose one of the first power transformations converting a non-normal distribution into a normal one. In its original form, the transformation function is:

$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \text{if } \lambda \ne 0 \\ \log y, & \text{if } \lambda = 0 \end{cases} \tag{21}$$

However, as can be seen, the power transformation cannot be applied to negative values of y. Since then, many modifications of the original Box-Cox power transformation have been proposed.

2.5.1. Manly's Exponential Distribution

Manly [21] proposed the exponential distribution given below.

$$y^{(\lambda)} = \begin{cases} \dfrac{e^{\lambda y} - 1}{\lambda}, & \text{if } \lambda \ne 0 \\ y, & \text{if } \lambda = 0 \end{cases} \tag{22}$$

Negative values of y are permitted. This transformation is useful for transforming skewed distributions to normal (Li, [22]).

2.5.2. Bickel-Doksum Transformation

Bickel and Doksum [23] transform the original Box-Cox transformation to

$$y^{(\lambda)} = \frac{|y|^{\lambda}\,\mathrm{sign}(y) - 1}{\lambda}, \quad \text{for } \lambda > 0 \tag{23}$$

where

$$\mathrm{sign}(y) = \begin{cases} 1, & \text{if } y \ge 0 \\ -1, & \text{if } y < 0 \end{cases} \tag{24}$$

The addition of the sign function makes this transformation compatible with negative values of y as well.

2.5.3. John-Draper Modulus Transformation

John and Draper [24] propose the modulus transformation given below:

$$y^{(\lambda)} = \begin{cases} \mathrm{sign}(y)\,\dfrac{(|y| + 1)^{\lambda} - 1}{\lambda}, & \text{if } \lambda \ne 0 \\ \mathrm{sign}(y)\,\log(|y| + 1), & \text{if } \lambda = 0 \end{cases} \tag{25}$$

where

$$\mathrm{sign}(y) = \begin{cases} 1, & \text{if } y \ge 0 \\ -1, & \text{if } y < 0 \end{cases} \tag{26}$$

The modulus transformation works best on those distributions which are approximately symmetric about some central point (Li, [22]). It reduces the kurtosis of the series, while introducing some degree of skewness to a symmetric distribution.

2.5.4. Yeo-Johnson Transformation

Yeo and Johnson [25] propose the following transformation in 2000:

$$y^{(\lambda)} = \begin{cases} \dfrac{(y + 1)^{\lambda} - 1}{\lambda}, & \lambda \ne 0,\ y \ge 0 \\ \log(y + 1), & \lambda = 0,\ y \ge 0 \\ -\dfrac{(1 - y)^{2 - \lambda} - 1}{2 - \lambda}, & \lambda \ne 2,\ y < 0 \\ -\log(1 - y), & \lambda = 2,\ y < 0 \end{cases} \tag{27}$$

In their original paper, Yeo and Johnson [25] find the value of λ by minimizing the Kullback-Leibler distance between the normal and transformed distributions. In this paper, however, we have found λ by maximizing log likelihoods. This transformation, like Manly's, reduces the skewness of the distribution and makes the transformed variable more symmetric.

3. Dynamic VaR Models

This section describes the methods used to calculate dynamic Value at Risk for equity index returns.

3.1. Model for Conditional Mean and Variance

To calculate the conditional mean $\mu_t$ given the time series data until time t − 1, we use an ARMA (1, 1) process.

$$X_t = C + \phi_1 X_{t-1} + \theta_1 \varepsilon_{t-1} + \varepsilon_t \tag{28}$$

We use the GARCH (1, 1) process to model the volatility of the innovation term $\varepsilon_t$.

$$\sigma_t^2 = K + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \tag{29}$$
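The recursions in Equations (28) and (29) can be run through a return series once parameter values are given. The sketch below is a minimal illustration, not the estimation procedure of the paper: the coefficient names (phi, theta, alpha, beta) follow the usual ARMA-GARCH convention, and the start-up values (unconditional mean and unconditional variance) are assumptions made here so that the recursion has something to begin with.

```python
import math


def arma_garch_filter(x, c, phi, theta, k, alpha, beta):
    """Conditional means, innovations and conditional variances for an
    ARMA(1,1)-GARCH(1,1) model with given parameters (Equations (28), (29))."""
    if not (abs(phi) < 1.0 and alpha + beta < 1.0):
        raise ValueError("start-up values need |phi| < 1 and alpha + beta < 1")
    # Assumed start-up values: unconditional mean and unconditional variance.
    mu_prev = c / (1.0 - phi)
    h_prev = k / (1.0 - alpha - beta)
    eps_prev = x[0] - mu_prev
    means, innovations, variances = [mu_prev], [eps_prev], [h_prev]
    for t in range(1, len(x)):
        mu_t = c + phi * x[t - 1] + theta * eps_prev          # Equation (28)
        h_t = k + alpha * eps_prev ** 2 + beta * h_prev       # Equation (29)
        eps_t = x[t] - mu_t
        means.append(mu_t)
        innovations.append(eps_t)
        variances.append(h_t)
        eps_prev, h_prev = eps_t, h_t
    return means, innovations, variances


# Illustrative returns and parameter values (assumed, not estimates from the paper).
returns = [0.004, -0.012, 0.020, -0.031, 0.007, 0.001]
mu, eps, h = arma_garch_filter(returns, c=0.0005, phi=0.1, theta=0.05,
                               k=1e-5, alpha=0.08, beta=0.90)
standardized = [e / math.sqrt(v) for e, v in zip(eps, h)]
print(standardized)
```

The standardized residuals printed at the end are the quantities whose distribution the models of the next section try to capture.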
3.2. Models for Innovations

In Equation (1), the forecasted mean and variance are calculated by an ARMA (1, 1)-GARCH (1, 1) model. As mentioned in the introduction, there are two approaches followed to model innovations. In the first approach, ARMA (1, 1)-GARCH (1, 1) model parameters are calculated assuming that standardized residuals alternatively follow the Pearson Type IV distribution, the Johnson $S_U$ distribution, Manly's exponential transformation, normal and Student t-distributions. In the second approach, ARMA (1, 1)-GARCH (1, 1) parameters are calculated assuming that standardized residuals are normally distributed. The extracted standardized residuals are then modeled using the mixture of normal distributions, Extreme Value Theory, and other power transformations such as John-Draper, Bickel-Doksum, Manly, Yeo-Johnson and certain combinations of the above.

Method 1

The first approach consists of five models, whose designs are outlined below.

Model 1.1 GARCH-N Model

In Equation (1), $Z_t$ is assumed to be a standard normal distribution. Therefore, the innovations term, $\varepsilon_t$, has zero mean and a standard deviation of $\sqrt{h_t}$, where $h_t = \sigma_t^2$ is the conditional variance from Equation (29).

$$Z_t \sim N(0, 1) \;\Rightarrow\; \varepsilon_t \sim N(0, h_t) \tag{30}$$

$$f(\varepsilon_t \mid F_{t-1}) = \frac{1}{\sqrt{2\pi h_t}}\exp\left(-\frac{\varepsilon_t^2}{2 h_t}\right) \tag{31}$$

Therefore, the log likelihood function, which is maximized to find the parameters of the ARMA-GARCH model for the series of length T, is given by

$$LLF = \sum_{t=1}^{T}\left[-\frac{1}{2}\log(2\pi h_t) - \frac{\varepsilon_t^2}{2 h_t}\right] \tag{32}$$

The maximum likelihood estimates for the ARMA (1, 1)-GARCH (1, 1) parameters are found by minimizing the negative of the above function using the fmincon function in MATLAB.

Model 1.2 GARCH-t Model

In Equation (1), $Z_t$ is assumed to be a Student t-distribution with zero mean and unit standard deviation. Therefore, the log likelihood function, the logarithm of the density function of the innovations term, $\varepsilon_t$, for the series of length T is given by

$$LLF = \sum_{t=1}^{T}\left[\log\frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi(\nu - 2)}} - \frac{1}{2}\log h_t - \frac{\nu + 1}{2}\log\left(1 + \frac{\varepsilon_t^2}{h_t(\nu - 2)}\right)\right] \tag{33}$$

where ν represents the degrees of freedom in the t-distribution.
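The two log likelihood functions above can be written down in a few lines. This is a minimal sketch under the reading of Equations (32) and (33) as sums over t of the log density of each innovation given its conditional variance; the function names and input values are hypothetical, and the optimizer that would search over the ARMA-GARCH parameters is not shown.

```python
import math


def llf_normal(eps, h):
    """Gaussian log likelihood of innovations eps with conditional variances h (Equation (32))."""
    return sum(-0.5 * math.log(2.0 * math.pi * ht) - e * e / (2.0 * ht)
               for e, ht in zip(eps, h))


def llf_student_t(eps, h, nu):
    """Log likelihood under a unit-variance Student t with nu > 2 degrees of freedom (Equation (33))."""
    if nu <= 2.0:
        raise ValueError("nu must exceed 2 for the variance to exist")
    const = (math.lgamma(0.5 * (nu + 1.0)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(math.pi * (nu - 2.0)))
    return sum(const - 0.5 * math.log(ht)
               - 0.5 * (nu + 1.0) * math.log(1.0 + e * e / (ht * (nu - 2.0)))
               for e, ht in zip(eps, h))


# Illustrative innovations and conditional variances (assumed values).
eps = [0.01, -0.03, 0.02]
h = [0.0004, 0.0005, 0.0003]
print(llf_normal(eps, h), llf_student_t(eps, h, nu=6.0))

# A five-standard-deviation shock is far more plausible under the t model.
print(llf_normal([0.1], [0.0004]), llf_student_t([0.1], [0.0004], nu=5.0))
```

As ν grows large the t log likelihood approaches the Gaussian one, which is a convenient sanity check on the constants in Equation (33).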
The maximum likelihood estimates for the ARMA (1, 1)-GARCH (1, 1) parameters are found by minimizing the negative of the above function using the fmincon function in MATLAB.

Model 1.3 GARCH-PIV Model

In Equation (1), $Z_t$ is assumed to be a Pearson Type IV distribution. The standardized innovations series has unit variance, but not necessarily a zero mean. This was justified by Newey and Steigerwald [26], who proved that an additional location parameter is needed to satisfy the identification condition for the consistency of parameter estimates when the conditional innovation distribution in the GARCH model is asymmetric. Hence Equation (4) holds, but Equation (5) does not. Therefore

$$E(X_t \mid F_{t-1}) = \mu_t + \sqrt{h_t}\left(\lambda - \frac{a\nu}{r}\right) \tag{34}$$

Hence, for modeling innovations, we need to change the location and scale parameters to $\lambda\sqrt{h_t}$ and $a\sqrt{h_t}$ respectively. The normalizing parameter is inversely proportional to the scale parameter, so it changes to $k/\sqrt{h_t}$.

$$Z_t \sim PIV(k, m, \nu, a, \lambda) \;\Rightarrow\; \varepsilon_t \sim PIV\left(\frac{k}{\sqrt{h_t}}, m, \nu, a\sqrt{h_t}, \lambda\sqrt{h_t}\right) \tag{35}$$

The distribution function of the innovation series is given by

$$f(\varepsilon_t \mid F_{t-1}) = \frac{k}{\sqrt{h_t}}\left[1 + \left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)^2\right]^{-m}\exp\left[-\nu\tan^{-1}\left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)\right] \tag{36}$$

The log likelihood function to be maximized is given by

$$LLF = \sum_{t=1}^{T}\left[\log k - \frac{1}{2}\log h_t - \frac{r + 2}{2}\log\left(1 + \frac{\left(\varepsilon_t - \lambda\sqrt{h_t}\right)^2}{a^2 h_t}\right) - \nu\tan^{-1}\left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)\right] \tag{37}$$

We use Equation (4) and the relation $r = 2(m - 1)$ to write a and m in terms of r and ν. The log likelihood function is maximized (by minimizing –LLF) using the fmincon function in MATLAB. The maximum likelihood estimates from the GARCH-N model and the Pearson Type IV parameters calculated from the first four moments of the resulting standardized innovations series (under the pseudo-normal assumption) are used as initial estimates for the optimization function. The normalizing constant k is computed by the technique used by Heinrich [13].

Model 1.4 GARCH-JSU Model

In this model, the standardized innovations in Equation (1), $Z_t$, are assumed to follow a Johnson $S_U$ distribution. As with the GARCH-PIV model, the standardized innovations have unit variance, but not necessarily zero mean. Therefore, from Equation (10), the scale parameter λ is constrained.

$$\lambda = \sqrt{\frac{2}{(\omega - 1)(\omega\cosh 2\Omega + 1)}} \tag{38}$$

where $\omega = \exp(\delta^{-2})$ and $\Omega = \gamma/\delta$.

Note that Equation (9) does not hold, and the parameter ξ has to be estimated during optimization. The predicted future value of the time series is given by

$$E(X_t \mid F_{t-1}) = \mu_t + \sqrt{h_t}\left(\xi - \lambda\,\omega^{1/2}\sinh\Omega\right) \tag{39}$$

Now, for modeling the innovations series $\varepsilon_t$, the location and scale parameters must be changed to $\xi\sqrt{h_t}$ and $\lambda\sqrt{h_t}$.

$$Z_t \sim JSU(\xi, \lambda, \gamma, \delta) \;\Rightarrow\; \varepsilon_t \sim JSU\left(\xi\sqrt{h_t}, \lambda\sqrt{h_t}, \gamma, \delta\right) \tag{40}$$

$$f(\varepsilon_t \mid F_{t-1}) = \frac{\delta}{\lambda\sqrt{h_t}\sqrt{1 + \left(\frac{\varepsilon_t - \xi\sqrt{h_t}}{\lambda\sqrt{h_t}}\right)^2}}\;\phi\!\left[\gamma + \delta\sinh^{-1}\!\left(\frac{\varepsilon_t - \xi\sqrt{h_t}}{\lambda\sqrt{h_t}}\right)\right] \tag{41}$$

where $\phi$ is the density function of $N(0, 1)$.

The log likelihood function to be maximized is given by

$$LLF = \sum_{t=1}^{T}\left[\log\delta - \log\lambda - \frac{1}{2}\log h_t - \frac{1}{2}\log(2\pi) - \frac{1}{2}\log\left(1 + \left(\frac{\varepsilon_t - \xi\sqrt{h_t}}{\lambda\sqrt{h_t}}\right)^2\right) - \frac{1}{2}\left(\gamma + \delta\sinh^{-1}\!\left(\frac{\varepsilon_t - \xi\sqrt{h_t}}{\lambda\sqrt{h_t}}\right)\right)^2\right] \tag{42}$$

The maximum likelihood estimates are calculated by minimizing the negative of the above function using the fmincon function in MATLAB.

Model 1.5 GARCH-Manly Model

In this model, it is assumed that when the standardized innovation $Z_t$ in Equation (1) is put through Manly's exponential transformation (Equation (22)), it becomes normally distributed. Assuming that the transformed normal function has zero mean and unit standard deviation, $Z_t$ has the following closed form probability distribution function

$$f(Z_t \mid F_{t-1}) = \frac{\dfrac{1}{\sqrt{2\pi}}\exp(\lambda Z_t)\exp\left(-\dfrac{M(Z_t, \lambda)^2}{2}\right)}{\dfrac{1}{2}\left[1 + \mathrm{erf}\left(\dfrac{1}{\sqrt{2}\,|\lambda|}\right)\right]} \tag{43}$$

where $M(Z_t, \lambda)$ is the exponentially transformed (Equation (22)) value of $Z_t$ and erf is the error function.
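One way to sanity-check a density of this kind is to confirm numerically that it integrates to one. The sketch below assumes a particular reading of Equation (43): a standard normal density evaluated at the Manly-transformed value, multiplied by the Jacobian exp(λz), and renormalized by Φ(1/|λ|) because the transformed variable is bounded on one side. The absolute value on λ is an assumption made here so that negative λ is also covered; the function names are hypothetical.

```python
import math


def manly(z, lam):
    """Manly's exponential transformation (Equation (22))."""
    return z if lam == 0.0 else (math.exp(lam * z) - 1.0) / lam


def manly_density(z, lam):
    """Density of z when manly(z, lam) is standard normal truncated to its attainable range."""
    m = manly(z, lam)
    phi = math.exp(-0.5 * m * m) / math.sqrt(2.0 * math.pi)
    if lam == 0.0:
        return phi
    # Normalizing mass Phi(1 / |lam|), written with the error function.
    norm = 0.5 * (1.0 + math.erf(1.0 / (math.sqrt(2.0) * abs(lam))))
    return math.exp(lam * z) * phi / norm


def integrate(f, lo, hi, steps):
    """Composite trapezoidal rule on [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * width)
    return total * width


for lam in (0.3, -0.3):
    area = integrate(lambda z: manly_density(z, lam), -80.0, 80.0, 16_000)
    print("lambda =", lam, "integral =", area)
```

If the normalizing term were omitted, the integral would fall short of one by the probability mass that the normal distribution places outside the range of the transformation.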
Therefore the standardized innovations ($\varepsilon_t$) have the following distribution

$$f(\varepsilon_t \mid F_{t-1}) = \frac{\dfrac{1}{\sqrt{2\pi h_t}}\exp\left(\dfrac{\lambda\varepsilon_t}{\sqrt{h_t}}\right)\exp\left(-\dfrac{M\left(\varepsilon_t/\sqrt{h_t}, \lambda\right)^2}{2}\right)}{\dfrac{1}{2}\left[1 + \mathrm{erf}\left(\dfrac{1}{\sqrt{2}\,|\lambda|}\right)\right]} \tag{44}$$

The log likelihood function to be maximized is given by

$$LLF = \sum_{t=1}^{T}\left[-\frac{1}{2}\log(2\pi h_t) - \frac{M\left(\varepsilon_t/\sqrt{h_t}, \lambda\right)^2}{2} + \frac{\lambda\varepsilon_t}{\sqrt{h_t}} - \log\left(\frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{1}{\sqrt{2}\,|\lambda|}\right)\right]\right)\right] \tag{45}$$

The maximum likelihood estimates are calculated by minimizing the negative of the above function using the fmincon function in MATLAB. The above equation is derived in detail in the Appendix.

Method 2

The second approach consists of nine models, whose designs are outlined below.

Model 2.1 GARCH-EVT Model

In this model, the ARMA (1, 1)-GARCH (1, 1) parameters are found under the pseudo-normal assumption, i.e., that the standardized innovations $Z_t$ in Equation (1) are generated by a standard normal function. Now, the assumption made is that the values of $Z_t$ considered for calculation of VaR, i.e., the 99th, 97.5th and 95th percentiles, are part of an extreme value distribution. This assumption is theoretically justified, as the ARMA-GARCH process gets rid of the serial correlation between terms, and the Fisher-Tippet theorem is applicable.

We use the Peak over Threshold (POT) method to observe the number of values which exceed a high threshold. The distribution of conditional excess losses over a certain high threshold follows a Generalized Pareto Distribution (GPD).

$$G_{\xi,\beta}(x) = \begin{cases} 1 - \left(1 + \dfrac{\xi x}{\beta}\right)^{-1/\xi}, & \xi \ne 0 \\ 1 - \exp\left(-\dfrac{x}{\beta}\right), & \xi = 0 \end{cases} \tag{46}$$

where ξ is the shape parameter (positive in our specific case, as this yields a heavy tailed GPD) and β is the scaling parameter.

The formula for conditional excess losses above the threshold u is given by Equations (47) and (48). (We consider the negative of the return series, thereby ensuring that the threshold is positive, and the mean excess return is positive.)

$$F_u(y) = P(X - u \le y \mid X > u) \tag{47}$$

$$F_u(y) = \frac{F(y + u) - F(u)}{1 - F(u)} \tag{48}$$

Since $F_u(y)$ is a GPD with positive ξ, we need to back-calculate $F(y + u)$. $F(u)$ is given by $(N - N_u)/N$, where N is the total number of observations and $N_u$ is the number of observations above the threshold u. Therefore, the tail estimator becomes

$$\hat{F}(x) = 1 - \frac{N_u}{N}\left(1 + \hat{\xi}\,\frac{x - u}{\hat{\beta}}\right)^{-1/\hat{\xi}}, \quad \text{for } x > u \tag{49}$$

Inverting Equation (49) at probability q gives the quantile of the standardized residuals:

$$VaR^q(Z_t) = u + \frac{\hat{\beta}}{\hat{\xi}}\left[\left(\frac{N}{N_u}(1 - q)\right)^{-\hat{\xi}} - 1\right] \tag{50}$$

The Value at Risk is now calculated by the formula

$$VaR_t^q = \mu_t + \sigma_t\,VaR^q(Z_t) \tag{51}$$

Choosing the threshold to be used in the calculations is a subjective process. In this paper, we calculate the mean excess returns for various values of thresholds and plot them. For a GPD, the mean excess return is given by:

$$e(u) = \frac{\beta + \xi u}{1 - \xi} \tag{52}$$

The threshold is calculated by observing the graphs and identifying the point from which the conditional excess return increases linearly with the threshold values. It is possible to consider any larger value as a threshold as well, but this way, the maximum number of data points gets accommodated in the extreme value distribution, thus reducing the variance of the obtained parameters.

In Figures 2(a) and (b), we observe that the threshold value for Sensex returns is at 1.4, and for Nifty, it is at 1.5. Note that in the graphs, we consider the negative of the return series, which is why the threshold values are positive. For certain time series, the graph obtained is not very useful for finding the threshold. Consider the mean excess return for DJI in Figure 2 for instance. In such cases, we consider an appropriately high value for the threshold, such as the 95th percentile of negative returns.

Figure 2. The optimal threshold is calculated by plotting the mean excess function of the six time series. The point is chosen at the point where the graph begins to slope upwards. As can be seen, the DJI graph is an anomaly, where no such clear point is present.

Model 2.2 GARCH-MixNorm Model

This model also makes use of the pseudo-normal assumption to calculate the ARMA (1, 1)-GARCH (1, 1) parameters. The standardized innovations are assumed to have a mixture of two normal distributions. We calculate the mean, standard deviation, skewness and kurtosis of these standardized innovations. The mean of one of the two normal distributions in the mixture is assumed to be zero. This assumption is reasonable, as results show that the probability that the standardized residuals lie in this normal distribution is very high. A small percentage lies in the other distribution, with the non-zero mean and higher variance; these yield the very high and very low values observed in the data. Thus, the parameter vector is of size four: $\theta = (p, \mu_1, \sigma_1, \sigma_2)$. Here p is the probability that the data point lies in the first (non-zero mean) distribution, $\mu_1$ is the mean of the first distribution, and $\sigma_1$ and $\sigma_2$ are the standard deviations of the first and second distributions respectively. The mean of the second distribution is assumed to be zero.

The parameter vector components must satisfy the four moment constraints, where $\mu_e$, $\sigma_e^2$, $S_e$ and $K_e$ denote the empirical mean, variance, skewness and kurtosis of the standardized innovations.

$$\mu_e = p\,\mu_1 \tag{53}$$

$$\sigma_e^2 = p\left(\sigma_1^2 + \mu_1^2\right) + (1 - p)\,\sigma_2^2 - \mu_e^2 \tag{54}$$

$$S_e\,\sigma_e^3 = p\left(3\mu_1\sigma_1^2 + \mu_1^3\right) - 3\mu_e\sigma_e^2 - \mu_e^3 \tag{55}$$

$$K_e\,\sigma_e^4 = p\left(3\sigma_1^4 + 6\mu_1^2\sigma_1^2 + \mu_1^4\right) + 3(1 - p)\,\sigma_2^4 - 4\mu_e S_e\sigma_e^3 - 6\mu_e^2\sigma_e^2 - \mu_e^4 \tag{56}$$

An obtained solution is feasible if it satisfies the constraints $\sigma_1^2 > 0$, $\sigma_2^2 > 0$ and $0 \le p \le 1$.

To calculate the parameters through the method of moments, we need five moment equations. It is possible that there may not be a solution even if the first five moments were calculated. So we employ a fractile-to-fractile comparison test in addition to using certain moment equations. We employ a modified version of the technique used by Perez [27].

The data (standardized residuals) are divided into seven sets: less than 0.5 standard deviations, 0.5 – 1, 1 – 1.5, 1.5 – 2, 2 – 2.5, 2.5 – 3, and greater than 3 standard deviations. The actual number of residuals in each category k, $n_k$, is compared with the predicted number of residuals $\hat{n}_k$ for each solution obtained from the moment equations. The solution considered is the one obtained by maximizing the log likelihood function

$$L(n, \hat{n}) = \sum_{k=1}^{7} n_k \log \hat{n}_k \tag{57}$$

and satisfying the constraint Equations (53) to (56). As it turns out, in most cases there is no solution which satisfies all of them; in such cases, constraint Equation (55) is dropped. The optimization is carried out using the fmincon function in MATLAB. It turns out that the optimum values of the parameters are dependent on the initial values considered, so the parameters obtained for the previous data point are used as initial values in the optimization for the next one.

The Value at Risk is now calculated by the formula in Equation (51), where $VaR^q(Z_t)$ is calculated by inserting the calculated parameters in the mixture of normals probability density function given by Equation (16) and cumulating it by numerical methods.

Model 2.3 GARCH-Bickel-Doksum Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The standardized residuals obtained, $Z_t$, are put through the transformation suggested by Bickel and Doksum [23] to normalize them (Equations (23) and (24)). We assume that for some value of the parameter λ, the transformed observations $T_{\lambda,i}$ are normally distributed with mean μ and standard deviation σ.
The parameter $\lambda$ is estimated by maximizing the log likelihood function

$$l(\lambda, \mu, \sigma^2) = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(T(\eta_i, \lambda) - \mu\right)^2 + (\lambda - 1)\sum_{i=1}^{n}\log\left|\eta_i\right| \qquad (58)$$

where $\eta_i$ are the standardized residuals, $T(\eta_i, \lambda)$ is their transformed value and $n$ is the number of residuals. The maximum likelihood estimates for the mean and variance are given by

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} T(\eta_i, \lambda) \qquad (59)$$

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\left(T(\eta_i, \lambda) - \hat{\mu}\right)^2 \qquad (60)$$

The estimate for $\lambda$ can, therefore, be obtained by simply maximizing the likelihood function, which up to an additive constant is

$$l(\lambda) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2\right) + (\lambda - 1)\sum_{i=1}^{n}\log\left|\eta_i\right| \qquad (61)$$

As shown in Table 2(a), the Bickel-Doksum transformation does not handle skewed distributions well, as it only reduces kurtosis. Hence, this model must be modified to fix this drawback.

The Value at Risk is now calculated by the formula in Equation (48), where $VaR_t^q$ is calculated from the inverse Bickel-Doksum formula

$$VaR_t^q = \mathrm{sign}(u)\,\left|u\right|^{1/\lambda}, \qquad u = 1 + \lambda N^{-1}\left(1 - q, \hat{\mu}, \hat{\sigma}^2\right) \qquad (62)$$

where $N^{-1}(1 - q, \hat{\mu}, \hat{\sigma}^2)$ is the inverse normal function for probability $1 - q$, mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.

Model 2.4 GARCH-John-Draper Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The standardized residuals obtained, $\eta_t$, are transformed with the modulus transformation proposed by John and Draper [24] (Equations (25) & (26)). By using similar arguments as the previous model, the parameter $\lambda$ is estimated by maximizing the log likelihood function

$$l(\lambda) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2\right) + (\lambda - 1)\sum_{i=1}^{n}\log\left(1 + \left|\eta_i\right|\right) \qquad (63)$$

where $\hat{\sigma}^2$ is given by Equation (60) with $T(\eta_i, \lambda)$ representing the modulus transformation.

As with the Bickel-Doksum transformation, Table 2(a) shows that the modulus transformation is not a skew corrector; it only reduces kurtosis. Hence, this model must be modified to correct this.

The Value at Risk is now calculated by the formula in Equation (48), where $VaR_t^q$ is calculated from the inverse John-Draper formula, written for the negative branch of the transformation that applies to a lower-tail quantile,

$$VaR_t^q = 1 - \left(1 - \lambda N^{-1}\left(1 - q, \hat{\mu}, \hat{\sigma}^2\right)\right)^{1/\lambda} \qquad (64)$$

where $N^{-1}(1 - q, \hat{\mu}, \hat{\sigma}^2)$ is the inverse normal function for probability $1 - q$, mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.

Model 2.5 GARCH-Yeo-Johnson Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The standardized residuals obtained, $\eta_t$, are transformed with the Yeo-Johnson [25] transformation (Equation (27)). By using similar arguments as the previous models, the parameter $\lambda$ is estimated by maximizing the log likelihood function

$$l(\lambda) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2\right) + (\lambda - 1)\sum_{i=1}^{n}\mathrm{sign}(\eta_i)\log\left(1 + \left|\eta_i\right|\right) \qquad (65)$$

where $\hat{\sigma}^2$ is given by Equation (60) with $T(\eta_i, \lambda)$ representing the Yeo-Johnson transformation.

Tables 2(a) and (b) show that the Yeo-Johnson transformation is a skew-correcting transformation. The model must be modified to enable kurtosis-handling as well.

The Value at Risk is now calculated by the formula in Equation (48), where $VaR_t^q$ is calculated from the inverse Yeo-Johnson formula, again written for the negative branch,

$$VaR_t^q = 1 - \left(1 - (2 - \lambda) N^{-1}\left(1 - q, \hat{\mu}, \hat{\sigma}^2\right)\right)^{1/(2 - \lambda)} \qquad (66)$$

where $N^{-1}(1 - q, \hat{\mu}, \hat{\sigma}^2)$ is the inverse normal function for probability $1 - q$, mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.

Model 2.6 GARCH-Manly-John-Draper Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Manly exponential transformation to rid them of skewness. The symmetric data is now transformed with the John-Draper modulus transformation, which reduces kurtosis. The doubly-transformed data obtained is now roughly normally distributed (Tables 2(a) and (b)). To obtain the parameter for the Manly transformation, the following log-likelihood function is maximized.

$$l(\lambda) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2\right) + \lambda\sum_{i=1}^{n}\eta_i \qquad (67)$$

The parameter for the John-Draper transformation is obtained by maximizing the log-likelihood function in Equation (63).

The inverse Manly transformation is given by

$$VaR_t^q = \frac{1}{\lambda}\log\left(1 + \lambda N^{-1}\left(1 - q, \hat{\mu}, \hat{\sigma}^2\right)\right) \qquad (68)$$

The Value at Risk is calculated in two steps. First, the low quantile value is subjected to the inverse John-Draper transformation in Equation (64), and this value is then back-transformed with the inverse Manly transformation in Equation (68).

Model 2.7 GARCH-Manly-Bickel-Doksum Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Manly exponential transformation to remove skewness, and then with the Bickel-Doksum transformation, which reduces kurtosis. The skewness and kurtosis of the doubly-transformed in-sample data are given in Tables 2(a) and (b). The parameters for the Manly and Bickel-Doksum transformations are calculated by maximizing the log-likelihoods in Equations (67) and (61). After the two parameters are obtained, the VaR is calculated from the inverse transformations in Equations (62) and (68), carried out serially in that order.
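The estimation and back-transformation steps described for the power-transformation models can be sketched in a few lines of Python. This is a minimal illustration for the John-Draper case only (Equations (63) and (64)), not the authors' MATLAB implementation. The function names, the grid search over $\lambda$ and the restriction to the lower tail are assumptions made for the example.

```python
import math
from statistics import NormalDist


def john_draper(x, lam):
    """Modulus transformation: sign(x) * ((|x| + 1)^lam - 1) / lam."""
    s = 1.0 if x >= 0 else -1.0
    if abs(lam) < 1e-12:
        return s * math.log(abs(x) + 1.0)
    return s * ((abs(x) + 1.0) ** lam - 1.0) / lam


def profile_loglik(resid, lam):
    """Profile log-likelihood of Equation (63); also returns mu-hat, sigma2-hat."""
    z = [john_draper(x, lam) for x in resid]
    n = len(z)
    mu = sum(z) / n
    var = sum((v - mu) ** 2 for v in z) / n
    jacobian = (lam - 1.0) * sum(math.log(1.0 + abs(x)) for x in resid)
    return -0.5 * n * math.log(2.0 * math.pi * var) + jacobian, mu, var


def fit_lambda(resid, grid):
    """Hypothetical helper: pick lambda on a grid (an optimizer could be used instead)."""
    best = max(grid, key=lambda lam: profile_loglik(resid, lam)[0])
    _, mu, var = profile_loglik(resid, best)
    return best, mu, var


def var_quantile(q, lam, mu, var):
    """Inverse John-Draper of the normal (1 - q) quantile, Equation (64).

    Only the negative branch is handled, which is the case for a lower-tail
    quantile; a non-negative normal quantile raises an error.
    """
    y = NormalDist(mu, math.sqrt(var)).inv_cdf(1.0 - q)
    if y >= 0:
        raise ValueError("negative-branch formula needs a negative quantile")
    return 1.0 - (1.0 - lam * y) ** (1.0 / lam)
```

The quantile returned by `var_quantile` is on the scale of the standardized residuals; it would still have to be combined with the conditional mean and volatility forecasts, as in Equation (48), to obtain the VaR of the return itself.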
Table 2. (a) Skewness comparison of std. residuals after power transformation; (b) Kurtosis comparison of std. residuals after power transformation.

(a)
                             Sensex    NIFTY     DJI       FTSE      HSI       Nikkei
Initial skewness             −0.4764   −0.5311   −0.1024   −0.3660   −0.1836   −0.3209
Manly                         0.0134    0.0615    0.0035    0.0162    0.0126    0.0072
John-Draper                  −0.3694   −0.2842   −0.0801   −0.2763   −0.0815   −0.2299
Yeo-Johnson                   0.0037    0.0442    0.0094    0.0055   −0.0175   −0.0306
Bickel-Doksum                −0.4379   −0.3769   −0.0858   −0.3045   −0.1069   −0.2638
Manly-Yeo-Johnson             …
Manly-John-Draper             …
Manly-Bickel-Doksum           …
John-Draper-Yeo-Johnson       …
Yeo-Johnson-John-Draper       …
Yeo-Johnson-Bickel-Doksum     …

The standardized residuals for the in-sample data are transformed with various power transformations. The skewness of each transformed output is compared to check their normalizing effect. For double-transformations, the data is first transformed with the transformation mentioned first, and then subjected to the second transformation.

(b)
                             Sensex    NIFTY     DJI       FTSE      HSI       Nikkei
Initial kurtosis              3.7840    5.0195    3.3459    3.8574    3.9326    3.5752
Manly                         …
John-Draper                   …
Yeo-Johnson                   …
Bickel-Doksum                 …
Manly-Yeo-Johnson             …
Manly-John-Draper             …
Manly-Bickel-Doksum           …
John-Draper-Yeo-Johnson       …
Yeo-Johnson-John-Draper       …
Yeo-Johnson-Bickel-Doksum     …

The standardized residuals for the in-sample data are transformed with various power transformations. The kurtosis of each transformed output is compared to check their normalizing effect. For double-transformations, the data is first transformed with the transformation mentioned first, and then subjected to the second transformation.

Model 2.8 GARCH-Yeo-Johnson-John-Draper Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Yeo-Johnson transformation to rid them of skewness. The symmetric data is now transformed with the John-Draper modulus transformation, which reduces kurtosis. The doubly-transformed data obtained is now roughly normally distributed (Tables 2(a) and (b)). After the parameters for the two transformations are obtained from Equations (65) and (63), the VaR is computed from the inverse transformations in Equations (64) and (66).

Model 2.9 GARCH-Yeo-Johnson-Bickel-Doksum Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Yeo-Johnson transformation to remove skewness, and then transformed with the Bickel-Doksum transformation, which removes excess kurtosis. The skewness and kurtosis of the doubly-transformed data obtained are given in Tables 2(a) and (b). Parameters for the Yeo-Johnson and Bickel-Doksum transformations are calculated by maximizing the log-likelihoods in Equations (65) and (61). The VaR is calculated from the inverse transformations in Equations (62) and (66), carried out serially in that order.

4. Testing

The data series are of length 1500; these are divided into the in-sample series (length 1000) and the out-of-sample series (length 500). For each data point in the out-of-sample region, we estimate model parameters using the previous 1000 data points, i.e., for finding VaR on day $t$, we consider data points from day $t - 1000$ to day $t - 1$.

The tests are run in MATLAB version 7.2 on a Windows XP operating system with 1.6 GHz processing speed. While running the program to calculate VaR for a single day, the results are generated well within 30 seconds for most cases.

Table 3. (a) 99% VaR violations comparisons for model 1 series; (b) 97.5% VaR violations comparisons for model 1 series; (c) 95% VaR violations comparisons for model 1 series.

(a) 99% VaR
          Model 1.1   Model 1.2   Model 1.3         Model 1.4     Model 1.5   Expected
          Normal      T           Pearson Type IV   Johnson SU    Manly       Violations
Sensex    16          16          7                 7             8           5
Nifty     16          14          8                 8             13          5
DJI       22          20          9                 9             11          5
FTSE      19          21          13                13            17          5
HSI       15          13          6                 6             10          5
Nikkei    11          13          7                 7             9           5

This table shows the VaR violation comparisons for the Model 1 series. The expected number of violations is given in the last column; 99% VaR is expected to be violated 5 times for a 500 point out-of-sample data set. As can be seen, Models 1.3 and 1.4 are the best performing ones.

(b) 97.5% VaR
          Model 1.1   Model 1.2   Model 1.3         Model 1.4     Model 1.5   Expected
          Normal      T           Pearson Type IV   Johnson SU    Manly       Violations
Sensex    28          27          16                15            21          12.5
Nifty     24          24          16                15            21          12.5
DJI       34          27          22                22            23          12.5
FTSE      29          30          25                25            24          12.5
HSI       24          23          20                19            22          12.5
Nikkei    29          29          14                14            18          12.5

This table shows the VaR violation comparisons for the Model 1 series. The expected number of violations is given in the last column; 97.5% VaR is expected to be violated 12.5 times for a 500 point out-of-sample data set. As can be seen, Models 1.3 and 1.4 are the best performing ones.

(c) 95% VaR
          Model 1.1   Model 1.2   Model 1.3         Model 1.4     Model 1.5   Expected
          Normal      T           Pearson Type IV   Johnson SU    Manly       Violations
Sensex    38          38          29                28            33          25
Nifty     40          38          29                32            36          25
DJI       55          49          42                41            42          25
FTSE      38          39          36                36            41          25
HSI       37          34          33                33            33          25
Nikkei    46          43          33                33            38          25

This table shows the VaR violation comparisons for the Model 1 series. The expected number of violations is given in the last column; 95% VaR is expected to be violated 25 times for a 500 point out-of-sample data set. As can be seen, Model 1.3 is the best performing one.

Table 4. (a) 99% VaR violations comparisons for model 2 series; (b) 97.5% VaR violations comparisons for model 2 series; (c) 95% VaR violations comparisons for model 2 series.

Columns: Model 2.1 EVT, Model 2.2 Mixture of Normals, Model 2.3 Bickel-Doksum, Model 2.4 John-Draper, Model 2.5 Yeo-Johnson, Model 2.6 Manly-John-Draper, Model 2.7 Manly-Bickel-Doksum, Model 2.8 Yeo-Johnson-John-Draper, Model 2.9 Yeo-Johnson-Bickel-Doksum, Expected Violations. Rows: Sensex, Nifty, DJI, FTSE, HSI, Nikkei.

(a) 99% VaR. This table shows the VaR violation comparisons for the Model 2 series. The expected number of violations is given in the last column; 99% VaR is expected to be violated 5 times for a 500 point out-of-sample data set. As can be seen, Model 2.6 is the best performing one.

(b) 97.5% VaR. This table shows the VaR violation comparisons for the Model 2 series. The expected number of violations is given in the last column; 97.5% VaR is expected to be violated 12.5 times for a 500 point out-of-sample data set. As can be seen, Models 2.6 and 2.7 are the best performing ones.

(c) 95% VaR. This table shows the VaR violation comparisons for the Model 2 series. The expected number of violations is given in the last column; 95% VaR is expected to be violated 25 times for a 500 point out-of-sample data set. As can be seen, Model 2.9 is the best performing one.

5. Results

5.1. Data and Model Parameters

The models are tested on six equity indices, Sensex, Nifty,
DJI, FTSE, HSI and Nikkei. The data used is the closing value of these indices from the period March 2003 to February 2009. The data was obtained from www.finance.yahoo.com, and the time period includes the stock market crash of 2008. The details regarding the returns of the series, and the first four moments are given in Table 1.

Table 5. (a) LR Test for 99% VaR violations for model 1 series; (b) LR Test for 97.5% VaR violations for model 1 series; (c) LR Test for 95% VaR violations for model 1 series.

(a) 99% VaR
          Model 1.1   Model 1.2   Model 1.3         Model 1.4     Model 1.5
          Normal      T           Pearson Type IV   Johnson SU    Manly
Sensex    15.47       15.47       0.72              0.72          1.54
Nifty     15.47       10.99       1.54              1.54          8.97
DJI       31.78       25.91       2.61              2.61          5.42
FTSE      23.13       28.80       8.97              8.97          17.90
HSI       13.16       8.97        0.19              0.19          3.91
Nikkei    5.42        8.97        0.72              0.72          2.61

This table shows the LR test statistic for the Model 1, 99% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

(b) 97.5% VaR
          Model 1.1   Model 1.2   Model 1.3         Model 1.4     Model 1.5
          Normal      T           Pearson Type IV   Johnson SU    Manly
Sensex    14.66       13.02       0.92              0.48          4.94
Nifty     8.59        8.59        0.92              0.48          4.94
DJI       26.01       13.02       6.06              6.06          7.28
FTSE      16.38       18.16       9.98              9.98          8.59
HSI       8.59        7.28        3.92              3.00          6.06
Nikkei    16.38       16.38       0.18              0.18          2.19

This table shows the LR test statistic for the Model 1, 97.5% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

(c) 95% VaR
          Model 1.1   Model 1.2   Model 1.3         Model 1.4     Model 1.5
          Normal      T           Pearson Type IV   Johnson SU    Manly
Sensex    6.18        6.18        0.64              0.37          2.46
Nifty     8.08        6.18        0.64              1.90          4.51
DJI       28.67       19.18       10.19             9.11          10.19
FTSE      6.18        7.10        4.51              4.51          9.11
HSI       5.32        3.08        2.46              2.46          2.46
Nikkei    15.04       11.33       2.46              2.46          6.18

This table shows the LR test statistic for the Model 1, 95% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

5.2. VaR Violations and Comparison of Models

We test the effectiveness of each model by calculating the number of times the calculated VaR has been violated. The expected number of violations for a $q$ percentile VaR estimate is given by

$$\text{Expected } q\% \text{ VaR violations} = (1 - q\%)\,N \qquad (69)$$

where $N$ is the total number of VaR measurements.

We measure VaR for each out-of-sample data point, therefore, $N = 500$. We calculate 95%, 97.5% and 99% VaR for each data point. Therefore, the expected violations for each would be 25, 12.5 and 5 respectively. Tables 3(a)-(c) compare the five models of the Model 1 series, comparing VaR violations for the six equity
indices. Tables 4(a)-(c) compare the same for the nine models of the Model 2 series. The expected violations for 99%, 97.5% and 95% VaR are given in the last column of each table. The mean violation for each model is computed, and the best model for each percentile VaR is found. It can be seen that Models 1.3 and 1.4 are the best performing models across all indices. Amongst those of the Model 2 series (where ARMA (1, 1)-GARCH (1, 1) parameters are calculated with the pseudo-normal assumption), however, Models 2.6, 2.7, 2.8 and 2.9 perform best. This is expected from the skewness-kurtosis Table 2, where the most normalized transformations are shown to be Manly-John-Draper, Manly-Bickel-Doksum, Yeo-Johnson-John-Draper and Yeo-Johnson-Bickel-Doksum. Model 2.1 performs well too, especially for higher VaR estimation.

In order to test the observed VaR numbers, we use Kupiec's test to determine if the observed VaR violations are significantly different from their expected values. The test is based on the fact that the number of violations $N$ in a sample of size $T$ is binomially distributed as $N \sim B(T, p)$. Thus, the probability of $N$ excesses occurring over a $T$ day period is proportional to $(1 - p)^{T - N} p^{N}$, where $p$ is the probability of exceeding VaR on a given day. Under the null hypothesis that $N/T = p$, we calculate the Likelihood Ratio (LR) test statistic

$$LR = -2\ln\left[(1 - p)^{T - N} p^{N}\right] + 2\ln\left[\left(1 - \frac{N}{T}\right)^{T - N}\left(\frac{N}{T}\right)^{N}\right] \sim \chi^2(1) \qquad (70)$$

The test statistics for the VaR violation observations are given in Tables 5(a)-(c) for the Model 1 series, and in Tables 6(a)-(c) for the Model 2 series. The values in bold are those where the observed VaR violations are significantly different from the expected ones.

Table 6. (a) LR Test for 99% VaR violations for model 2 series; (b) LR Test for 97.5% VaR violations for model 2 series; (c) LR Test for 95% VaR violations for model 2 series.

Columns: Model 2.1 EVT, Model 2.2 Mixture of Normals, Model 2.3 Bickel-Doksum, Model 2.4 John-Draper, Model 2.5 Yeo-Johnson, Model 2.6 Manly-John-Draper, Model 2.7 Manly-Bickel-Doksum, Model 2.8 Yeo-Johnson-John-Draper, Model 2.9 Yeo-Johnson-Bickel-Doksum. Rows: Sensex, Nifty, DJI, FTSE, HSI, Nikkei.

(a) 99% VaR. This table shows the LR test statistic for the Model 2, 99% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

(b) 97.5% VaR. This table shows the LR test statistic for the Model 2, 97.5% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

(c) 95% VaR. This table shows the LR test statistic for the Model 2, 95% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

6. Conclusions

In this work, we build different models for accurate measurement of Value at Risk. We use an ARMA (1, 1) process to model conditional expectation, and a GARCH (1, 1) process to model conditional variance. Models 1.x calculate parameters for the above processes without the pseudo-normal assumption, while Models 2.x calculate them with the pseudo-normal assumption. The following conclusions can be made from the results.

Models 1.3 and 1.4 (GARCH-PIV and GARCH-JSU) are far and away the best performers among all models. Their consistency can be seen across all the indices and VaR percentiles.

Among the models which use the pseudo-normal assumption, Models 2.6, 2.7, 2.8 and 2.9 perform the best. These use two transformations to normalize the standardized innovations: the first one makes the distribution symmetric, while the second one reduces the kurtosis.

Model 2.1 (GARCH-EVT) performs well for high percentile VaR estimates.

Computationally, the Model 2.x series are slightly faster than Models 1.3 and 1.4, but the difference of a few seconds does not mandate using them in the place of the more accurate GARCH-PIV and GARCH-JSU models.

REFERENCES

[1] E. Fama, “The Behavior of Stock-Market Prices,” Journal of Business, Vol. 38, No. 1, 1965, pp. 34-105.
[2] B. B. Mandelbrot, “The Variation of Certain Speculative Prices,” Journal of Business, Vol. 36, No. 4, 1963, pp. 394-419. doi:10.1086/294632
[3] R. Blattberg and N. Gonedes, “A Comparison of Stable and Student Distributions as Statistical Models of Stock Prices,” Journal of Business, Vol. 47, 1974, pp. 244-280. doi:10.1086/295634
[4] C. A. Ball and W. N. Torous, “A Simplified Jump Process for Common Stock Returns,” Journal of Financial and Quantitative Analysis, Vol. 18, No. 1, 1983, pp. 53-65. doi:10.2307/2330804
[5] S. J. Kon, “Models of Stock Returns: A Comparison,” Journal of Finance, Vol. 39, No. 1, 1984, pp. 147-165. doi:10.2307/2327673
[6] J. B. Gray and D. W. French, “Empirical Comparisons of Distributional Models for Stock Index Returns,” Journal of Business Finance and Accounting, Vol. 17, No. 3, 1990, pp. 451-459. doi:10.1111/j.1468-5957.1990.tb01197.x
[7] M. Bhattacharyya, A. Chaudhary and G. Yadav, “Conditional VaR Estimation Using Pearson Type IV Distribution,” European Journal of Operational Research, Vol. 191, No. 1, 2008, pp. 386-397. doi:10.1016/j.ejor.2007.07.021
[8] M. Bhattacharyya, N. Misra and B. Kodase, “MaxVaR for Non-Normal and Heteroskedastic Returns,” Quantitative Finance, Vol. 9, No. 8, 2009, pp. 925-935. doi:10.1080/14697680802595684
[9] M. Bhattacharyya and G. Ritolia, “Conditional VaR Using EVT: Towards a Planned Margin Scheme,” International Review of Financial Analysis, Vol. 17, No. 2, 2008, pp. 382-395. doi:10.1016/j.irfa.2006.08.004
[10] R. F. Engle, “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation,” Econometrica, Vol. 50, No. 4, 1982, pp. 987-1007. doi:10.2307/1912773
[11] T. Bollerslev, “Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics, Vol. 31, No. 3, 1986, pp. 307-327. doi:10.1016/0304-4076(86)90063-1
[12] S. Poon and C. Granger, “Forecasting Volatility in Financial Markets,” Journal of Economic Literature, Vol. 41, No. 2, 2003, pp. 478-539. doi:10.1257/002205103765762743
[13] J. Heinrich, “A Guide to the Pearson Type IV Distribution,” 2004. http://www-cdf.fnal.gov/publications/cdf6820_pearson4.pdf
[14] N. L. Johnson, “Systems of Frequency Curves Generated by Methods of Translation,” Biometrika, Vol. 36, No. 1-2, 1949, pp. 149-176. doi:10.1093/biomet/36.1-2.149
[15] A. L. Tucker, “A Reexamination of Finite and Infinite Variance Distributions As Models of Daily Stock Returns,” Journal of Business & Economic Statistics, Vol. 10, No. 1, 1992, pp. 73-81. doi:10.2307/1391806
[16] J. D. Hamilton, “A Quasi-Bayesian Approach to Estimating Parameters for Mixtures of Normal Distributions,” Journal of Business and Economic Statistics, Vol. 9, No. 1, 1991, pp. 27-39. doi:10.2307/1391937
[17] D. M. Titterington, A. F. M. Smith and U. E. Makov, “Statistical Analysis of Finite Mixture Distributions,” John Wiley & Sons, Chichester, 1992.
[18] J. Hull and A. White, “Value at Risk When Daily Changes in Market Variables Are Not Normally Distributed,” Journal of Derivatives, Vol. 5, No. 3, 1998, pp. 9-19. doi:10.3905/jod.1998.407998
[19] P. Zangari, “An Improved Methodology for Measuring VaR,” RiskMetrics Monitor, Reuters/JP Morgan, 1996.
[20] G. E. P. Box and D. R. Cox, “An Analysis of Transformations,” Journal of the Royal Statistical Society, Vol. 26, No. 2, 1964, pp. 211-252.
[21] B. F. J. Manly, “Exponential Data Transformations,” The Statistician, Vol. 25, No. 1, 1976, pp. 37-42. doi:10.2307/2988129
[22] P. Li, “Box Cox Transformations: An Overview,” University of Connecticut, Storrs, 2005.
[23] P. J. Bickel and K. A. Doksum, “An Analysis of Transformations Revisited,” Journal of the American Statistical Association, Vol. 76, 1981, pp. 296-311. doi:10.2307/2287831
[24] J. A. John and N. R. Draper, “An Alternative Family of Transformations,” Applied Statistics, Vol. 29, No. 2, 1980, pp. 190-197. doi:10.2307/2986305
[25] I.-K. Yeo and R. Johnson, “A New Family of Power Transformations to Improve Normality or Symmetry,” Biometrika, Vol. 87, No. 4, 2000, pp. 954-959. doi:10.1093/biomet/87.4.954
[26] W. K. Newey and D. G. Steigerwald, “Asymptotic Bias for Quasi-Maximum-Likelihood Estimators in Conditional Heteroskedasticity Models,” Econometrica, Vol. 65, No. 3, 1997, pp. 587-599. doi:10.2307/2171754
[27] P. G. Perez, “Capturing Fat Tail Risk in Exchange Rate Returns Using SU Curves: A Comparison with Normal Mixture and Skewed Student Distributions,” Journal of Risk, Vol. 10, No. 2, 2007-2008, pp. 73-100.
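As an illustration of the backtesting arithmetic in Equations (69) and (70), the expected violation count and the Kupiec LR statistic can be computed as in the following Python sketch. It is not the authors' MATLAB code; the function names are hypothetical, and the chi-square(1) tail probability via `erfc` is added here only for convenience.

```python
import math


def expected_violations(q, n_obs):
    """Equation (69): expected number of violations of a q-level VaR."""
    return (1.0 - q) * n_obs


def _xlogy(x, y):
    # x * log(y) with the convention 0 * log(0) = 0
    return 0.0 if x == 0 else x * math.log(y)


def kupiec_lr(n_viol, n_obs, p):
    """Equation (70): Kupiec likelihood-ratio statistic (0 < p < 1)."""
    rate = n_viol / n_obs
    ll_null = _xlogy(n_obs - n_viol, 1.0 - p) + _xlogy(n_viol, p)
    ll_alt = _xlogy(n_obs - n_viol, 1.0 - rate) + _xlogy(n_viol, rate)
    return max(-2.0 * (ll_null - ll_alt), 0.0)


def chi2_1_pvalue(lr):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(lr / 2.0))


# Sensex, Model 1.3, 99% VaR: 7 violations in 500 out-of-sample days (Table 3(a)).
lr = kupiec_lr(7, 500, 0.01)
print(round(lr, 2), round(chi2_1_pvalue(lr), 3))
```

For 7 violations the statistic is about 0.72 and for 16 violations about 15.47, which agrees with the corresponding entries of Table 5(a).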
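Appendix

In the model 1.5 used, the returns series $r_t$ is modeled as follows

$$r_t = \mu_t + \sigma_t X_t \qquad (71)$$

We assume that $X_t$ follows a distribution such that, when transformed through Manly's exponential transformation (Equation (22)), it becomes normal.

$$Z = T(X, \lambda), \qquad Z \sim N(\mu, \sigma^2) \qquad (72)$$

$$P(z \le a) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-1/\lambda}^{a}\exp\left(-\frac{1}{2}\left(\frac{m - \mu}{\sigma}\right)^2\right)\mathrm{d}m \qquad (73)$$

The lower limit is given by $-1/\lambda$ since $Z = T(X, \lambda)$. As $X$ varies from $-\infty$ to $\infty$, $Z$ varies from $-1/\lambda$ to $\infty$. In other words, it is impossible for $Z$ to take on a value less than $-1/\lambda$.

$$P(z \le a) = P\left(x \le T^{-1}(a)\right) \qquad (74)$$

This arises since Manly's transformation is one-to-one and monotonically increasing. We name $b = T^{-1}(a)$ and proceed

$$P(x \le b) = \frac{1}{\sigma\sqrt{2\pi}}\int_{T(-\infty)}^{T(b)}\exp\left(-\frac{1}{2}\left(\frac{m - \mu}{\sigma}\right)^2\right)\mathrm{d}m \qquad (75)$$

We name $m = T(n)$, and $\mathrm{d}m = T'(n)\,\mathrm{d}n$ follows.

$$P(x \le b) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{b}\exp\left(-\frac{1}{2}\left(\frac{T(n) - \mu}{\sigma}\right)^2\right)T'(n)\,\mathrm{d}n \qquad (76)$$

We need to add a normalizing constant $k$ to the equation, such that $P(x \le \infty) = 1$.

$$P(x \le \infty) = \frac{k}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp\left(-\frac{1}{2}\left(\frac{T(n) - \mu}{\sigma}\right)^2\right)T'(n)\,\mathrm{d}n \qquad (77)$$

Substituting $w = \dfrac{T(n) - \mu}{\sigma\sqrt{2}}$, $\mathrm{d}w = \dfrac{T'(n)}{\sigma\sqrt{2}}\,\mathrm{d}n$, and changing limits from $(-\infty, \infty)$ to $\left(-\dfrac{1/\lambda + \mu}{\sigma\sqrt{2}}, \infty\right)$,

$$P(x \le \infty) = \frac{k}{\sqrt{\pi}}\int_{-\frac{1/\lambda + \mu}{\sigma\sqrt{2}}}^{\infty}\exp\left(-w^2\right)\mathrm{d}w \qquad (78)$$

$$P(x \le \infty) = \frac{k}{\sqrt{\pi}}\left[\int_{-\frac{1/\lambda + \mu}{\sigma\sqrt{2}}}^{0}\exp\left(-w^2\right)\mathrm{d}w + \int_{0}^{\infty}\exp\left(-w^2\right)\mathrm{d}w\right] = \frac{k}{2}\left[1 + \mathrm{erf}\left(\frac{1/\lambda + \mu}{\sigma\sqrt{2}}\right)\right] \qquad (79)$$

$$k = \frac{2}{1 + \mathrm{erf}\left(\dfrac{1/\lambda + \mu}{\sigma\sqrt{2}}\right)} \qquad (80)$$

The innovations $\varepsilon_t$ are related to the standardized innovations $\eta_t$ by $\varepsilon_t = \eta_t\sqrt{h_t}$. We also assume that the transformed standardized residuals have zero mean and unit standard deviation. Therefore, the density of the standardized innovations is

$$f_{\eta_t}(x) = F'_{\eta_t}(x) = \sqrt{\frac{2}{\pi}}\,\frac{1}{1 + \mathrm{erf}\left(\dfrac{1}{\lambda\sqrt{2}}\right)}\exp(\lambda x)\exp\left(-\frac{1}{2}\left(\frac{\exp(\lambda x) - 1}{\lambda}\right)^2\right) \qquad (81)$$

Since $P(\varepsilon_t \le a) = P\left(\eta_t \le a/\sqrt{h_t}\right)$,

$$f_{\varepsilon_t}(x) = \sqrt{\frac{2}{\pi}}\,\frac{1}{1 + \mathrm{erf}\left(\dfrac{1}{\lambda\sqrt{2}}\right)}\,\frac{1}{\sqrt{h_t}}\exp\left(\frac{\lambda x}{\sqrt{h_t}}\right)\exp\left(-\frac{1}{2}\left(\frac{\exp\left(\lambda x/\sqrt{h_t}\right) - 1}{\lambda}\right)^2\right) \qquad (82)$$

The log likelihood function to be maximized is, therefore,

$$LLF = \sum_{t=1}^{T}\left[\log 2 - \frac{1}{2}\log 2\pi - \frac{1}{2}\log h_t - \log\left(1 + \mathrm{erf}\left(\frac{1}{\lambda\sqrt{2}}\right)\right) + \frac{\lambda\varepsilon_t}{\sqrt{h_t}} - \frac{1}{2}M\left(\frac{\varepsilon_t}{\sqrt{h_t}}, \lambda\right)^2\right] \qquad (83)$$

where $M(\cdot, \lambda)$ denotes Manly's exponential transformation.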
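The role of the normalizing constant in the Appendix can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it takes $\lambda > 0$, zero mean and unit variance for the transformed variable, uses hypothetical function names and a simple trapezoidal rule, and confirms that the density of the standardized innovations integrates to approximately one.

```python
import math


def manly(x, lam):
    """Manly's exponential transformation (exp(lam * x) - 1) / lam, lam != 0."""
    return (math.exp(lam * x) - 1.0) / lam


def manly_density(x, lam):
    """Density of a variable whose Manly transform is N(0, 1) truncated at -1/lam.

    Assumes lam > 0. The constant k is the normalizing constant of the
    Appendix evaluated at mu = 0, sigma = 1.
    """
    k = 2.0 / (1.0 + math.erf(1.0 / (lam * math.sqrt(2.0))))
    z = manly(x, lam)
    return k / math.sqrt(2.0 * math.pi) * math.exp(lam * x - 0.5 * z * z)


def trapezoid(f, a, b, steps):
    """Plain trapezoidal rule on [a, b] with the given number of steps."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


lam = 0.5  # illustrative value only
area = trapezoid(lambda x: manly_density(x, lam), -60.0, 10.0, 70000)
print(area)
```

Without the factor `k` the same integral would equal the normal probability of exceeding $-1/\lambda$, which is below one; the constant restores a proper density.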
