A Comparison of VaR Estimation Procedures for Leptokurtic Equity Index Returns

doi:10.4236/jmf.2012.21002


Journal of Mathematical Finance, 2012, 2, 13-30

http://dx.doi.org/10.4236/jmf.2012.21002 Published Online February 2012 (http://www.SciRP.org/journal/jmf)

A Comparison of VaR Estimation Procedures for

Leptokurtic Equity Index Returns*

Malay Bhattacharyya1, Siddarth Madhav R2

1Indian Institute of Management Bangalore, Bangalore, India

2Barclays Capital, New York, USA

Email: malayb@iimb.ernet.in, rsmadhav@gmail.com

Received July 11, 2011; revised August 1, 2011; accepted August 29, 2011

ABSTRACT

The paper presents and tests Dynamic Value at Risk (VaR) estimation procedures for equity index returns. Volatility

clustering and leptokurtosis are well-documented characteristics of such time series. An ARMA (1, 1)-GARCH (1, 1) ap-

proach models the inherent autocorrelation and dynamic volatility. Fat-tailed behavior is modeled in two ways. In the

first approach, the ARMA-GARCH process is run assuming alternatively that the standardized residuals are distributed

with Pearson Type IV, Johnson SU, Manly’s exponential transformation, normal and t-distributions. In the second ap-

proach, the ARMA-GARCH process is run with the pseudo-normal assumption, the parameters calculated with the

pseudo maximum likelihood procedure, and the standardized residuals are later alternatively modeled with Mixture of

Normal distributions, Extreme Value Theory and other power transformations such as John-Draper, Bickel-Doksum,

Manly, Yeo-Johnson and certain combinations of the above. The first approach yields five models, and the second ap-

proach yields nine. These are tested with six equity index return time series using rolling windows. These models are

compared by computing the 99%, 97.5% and 95% VaR violations and contrasting them with the expected number of

violations.

Keywords: Dynamic VaR; GARCH; EVT; Johnson SU; Pearson Type IV; Mixture of Normal Distributions; Manly; John-Draper; Yeo-Johnson Transformations

1. Introduction

VALUE AT RISK (VaR) is a popular measure of risk in

a portfolio of assets. It represents a high quantile of loss

distribution for a particular horizon, providing a loss thresh-

old that is exceeded only a small percentage of the time.

Traditional methods of calculating VaR include his-

torical simulation and the analytic variance-covariance

approach. However, these models fall short when tested

against actual market conditions. The historical simulation

approach assumes constant volatility of stocks over an

extended period of time. It fails to account for the phenomenon of volatility clustering, in which periods of high volatility and periods of low volatility each tend to persist. This leads to underestimation of VaR during periods of high volatility, and overestimation in times of calm. The analytic variance-

covariance approach assumes that returns are jointly nor-

mally distributed. However, the fat-tailed non-normal be-

haviour of returns would mean that this methodology tends

to underestimate VaR as well.

Fama [1] and Mandelbrot [2] report the failure of the

normal distribution to model asset returns, sparking a slew

of papers addressing the issue of accurately modeling leptokurtic time series with volatility clustering. The approaches can be roughly divided in two. The first assumes that returns are independent and models the unconditional distribution of returns. In this approach, numerous distributions have been proposed: Fama [1] and Mandelbrot [2] use the stable Paretian distribution, and Blattberg and Gonedes [3] suggest the use of Student t-distribution. The mixture

of normal distributions is used by Ball and Torous [4]

and Kon [5] and the logistic distribution, the empirical

power distribution and the Student t-distributions have

been compared by Gray and French [6]. The Pearson type

IV distribution is used by Bhattacharyya, Chaudhary and

Yadav [7] for dynamic VaR estimation and by Bhat-

tacharyya, Misra and Kodase [8] for dynamic MaxVaR

estimation. Bhattacharyya and Ritolia [9] use EVT for

dynamic VaR estimation.

The second approach considers returns to be serially

correlated and uses conditional variance models or sto-

chastic volatility models to model asset returns. Engle

[10] and Bollerslev [11] use ARCH and GARCH models

to account for volatility clustering. GARCH models have been shown to be more suited to this purpose by various studies such as Poon and Granger [12]. The GARCH (1, 1) model performs well for most stock returns and this paper adopts this approach.

*This work was carried out when Siddarth Madhav R was a graduate student at the Indian Institute of Management Bangalore.

The following model has been extensively used to

model dynamism in forecasts of returns and volatility of

returns.

$$X_t = \mu_t + \sigma_t Z_t \qquad (1)$$

where $X_t$ is the actual return on day t, $\mu_t$ is the expected return on day t, $\sigma_t$ is the volatility estimate on day t and $Z_t$ is the standardized residual, having a normal distribution with zero mean and unit standard deviation.

ARMA processes are useful for modeling $\mu_t$, the predicted mean of the time series data, and GARCH processes are good models for $\sigma_t$, the predicted volatility. However, the inherent leptokurtic behaviour of asset returns makes the ARMA-GARCH model insufficient for the purpose of calculating VaR.

In this paper, ARMA (1, 1) model is used for the cal-

culation of predicted mean and GARCH (1, 1) model is

used for modeling the observed volatility clustering.

Models are developed using two approaches. In the first

one, consisting of five models, ARMA-GARCH model

parameters are calculated assuming that standardized

residuals alternatively follow Pearson Type IV distribution, Johnson SU distribution, Manly’s exponential transformation, normal and Student t-distributions. In the

second approach, the ARMA-GARCH parameters are

calculated using the pseudo-normal assumption, i.e., as-

suming that standardized residuals are normally distrib-

uted, and they are later modeled using the mixture of

normal distributions, Extreme Value Theory, and other

power transformations such as John-Draper, Bickel-

Doksum, Manly, Yeo-Johnson and certain combinations

of the above. The second approach yields nine models.

While developing and testing VaR models, the authors

find it important to develop those that are applicable in

real world scenarios. This translates to certain simplicity

in execution and fast run-times for calculations, as time

can be a critical issue. At the same time, the importance

of creating an accurate measure of risk cannot be overstated, given how the stock market crash of 2008 bankrupted firms and individuals alike, and sent the world spiraling into recession.

2. Leptokurtic Density Functions

2.1. Pearson Type IV Distribution

The Pearson family of curves, a generalized family of fre-

quency curves developed by Karl Pearson, embodies a

wide range of commonly observed distributions. The Pear-

son curves are a solution to the differential equation







$$\frac{f'(x)}{f(x)} = \frac{x - \alpha}{c_0 + c_1 x + c_2 x^2} \qquad (2)$$

The system of curves which arise from the above dif-

ferential equation cover a wide spectrum of skewness

and kurtosis (Figure 1). The type of distribution obtained

post-integration is dictated by the roots of the quadratic

equation $c_0 + c_1 x + c_2 x^2 = 0$.

The Type IV curve is obtained when the roots of the quadratic equation $c_0 + c_1 x + c_2 x^2 = 0$ are complex, i.e., when $c_1^2 < 4 c_0 c_2$. It is suitable for those distributions which have high excess kurtosis and moderate skewness. Financial return data fall in this category. The probability density function (PDF) of the Type IV curve (Heinrich, [13]) is

$$f(x) = k\left[1 + \left(\frac{x-\lambda}{a}\right)^2\right]^{-m} \exp\left[-\nu \tan^{-1}\left(\frac{x-\lambda}{a}\right)\right] \qquad (3)$$

where λ, a, ν and m are real parameters (functions of α, $c_0$, $c_1$ and $c_2$), m > 1/2, and k is a normalizing constant, dependent on λ, a, ν and m.

Figure 1. The diagram of the Pearson curve family. It shows the type of curve to be used for each range of skewness and kurtosis. The x-axis is β1 = skewness², and the y-axis is β2, the traditional kurtosis.

The PDF gives rise to a bell shaped curve, where λ is the location parameter, a is the scale parameter, ν and m can be interpreted as the skewness and kurtosis parameters respectively.

The type of Pearson curve to use for a particular situation is dictated by the skewness and kurtosis. Table 1 shows the observed skewness and excess kurtosis for the six equity indices. Cross-referencing them with Figure 1, we can see that Pearson Type IV curve is the model to be used.

For a standardized Pearson Type IV curve, i.e., with zero mean and unit standard deviation, we need to add the following constraints, with $r = 2(m-1)$.

$$\frac{a^2\left(r^2 + \nu^2\right)}{r^2 (r - 1)} = 1 \qquad (4)$$

$$\lambda - \frac{a\nu}{r} = 0 \qquad (5)$$

2.2. Johnson SU Distribution

The Johnson family of distributions (Johnson, [14]) consists of three distributions, which cover all possible average, standard deviation, skewness and kurtosis values, excluding the impossible region. These consist of the $S_U$, $S_B$ and the lognormal curves. The transformations have the general form

$$Z = \gamma + \delta\, g\!\left(\frac{X - \xi}{\lambda}\right) \qquad (6)$$

where the transformation parameters ξ is the location, λ is the scale and γ and δ are shape parameters. Z is the resulting normal distribution. $g(\cdot)$ is one of the following functions:

$$g(y) = \begin{cases} \ln(y) & \text{Lognormal distribution} \\ \sinh^{-1}(y) & S_U \text{ distribution} \\ \ln\left(\dfrac{y}{1-y}\right) & S_B \text{ distribution} \\ y & \text{Normal distribution} \end{cases}$$

Since we are modeling the innovations of the ARMA-GARCH process, we require a transformation function which can accept arguments that may be positive or negative. Hence we need to use the Johnson SU distribution, as the sine hyperbolic inverse function has a domain all over the real line.

So we have

$$Z = \gamma + \delta \sinh^{-1}\!\left(\frac{X - \xi}{\lambda}\right) \qquad (7)$$

where δ and λ are assumed to be positive.

The density function of Johnson SU distribution can be easily found in closed-form from variable transformation:

$$f(x; \xi, \lambda, \gamma, \delta) = \frac{\delta}{\lambda\sqrt{1 + \left(\frac{x-\xi}{\lambda}\right)^2}}\; \phi\!\left[\gamma + \delta \sinh^{-1}\!\left(\frac{x-\xi}{\lambda}\right)\right] \qquad (8)$$

where $\phi(\cdot)$ is the density function of N(0,1), ξ and λ > 0 are location and scale parameters respectively, γ can be interpreted as a skewness parameter, and δ > 0 can be interpreted as a kurtosis parameter. The distribution is positively or negatively skewed according to whether γ is negative or positive. Holding γ constant and increasing δ reduces the kurtosis. However, γ and δ cannot be viewed purely as skewness or kurtosis parameters, respectively. The mean and the variance of Johnson SU distribution are given as:

$$\mu = \xi - \lambda\, \omega^{1/2} \sinh\Omega \qquad (9)$$

$$\sigma^2 = \frac{\lambda^2}{2}\,(\omega - 1)\left(\omega\cosh 2\Omega + 1\right) \qquad (10)$$

where $\omega = \exp\left(\delta^{-2}\right)$ and $\Omega = \gamma/\delta$.

2.3. Extreme Value Theory

Extreme value theory provides a framework to formalize the study of behavior in the tails of a distribution. According to the Fisher-Tippet theorem, there can be three possible extreme value distributions for the standardized variable.

Table 1. Comparison of moments for each stock index return series.

Index            Sensex            NIFTY             DJI               FTSE              HSI               Nikkei
Observations     1500              1500              1500              1500              1500              1500
Dates            Mar 03 - Feb 09   Mar 03 - Feb 09   Mar 03 - Feb 09   Mar 03 - Feb 09   Mar 03 - Feb 09   Mar 03 - Feb 09
Mean             0.0009            0.0009            0.0001            0.0002            0.0004            0.0001
Std. Deviation   0.0178            0.0181            0.0124            0.0127            0.0169            0.0163
Skewness         −0.4276           −0.5130           0.2624            0.1409            0.3876            −0.2730
Kurtosis         7.2358            8.6112            17.3926           14.5248           15.5344           12.8085


2.3.1. Gumbel Distribution

As with the normal and gamma distributions, the tail can be unbounded, have finite moments and decay exponentially. The distribution function is given by:

$$G(x) = \exp\left(-e^{-x}\right) \quad \text{for } x \in \mathbb{R} \qquad (11)$$

2.3.2. Frechet Distribution

The tail can be unbounded, and decay by a power as with the Cauchy and Student t-distribution. The distribution function is given by

$$G(x) = \begin{cases} 0 & \text{for } x \le 0 \\ \exp\left(-x^{-\alpha}\right) & \text{for } x > 0 \end{cases} \qquad (12)$$

Moments exist only up to the integer part of α; higher moments do not exist. Because the tails are fat, they are not integrable when weighted by tail probabilities.

2.3.3. Weibull Distribution

The tails are constant-declining, and all moments exist. They are thin, and have upper bounds. The distribution function is:

$$G(x) = \begin{cases} \exp\left(-(-x)^{\alpha}\right) & \text{for } x \le 0 \\ 1 & \text{for } x > 0 \end{cases} \qquad (13)$$

Now, since the financial returns data are fat-tailed and

unbounded, we must clearly use the Frechet distribution

for modeling extreme value distributions.

2.3.4. Generalized Extreme Value Distribution

The Generalized Extreme Value Distribution (GEVD) unifies the above three distributions. Here the tail index (τ) is the inverse of the shape parameter (α). In the equation given below, if τ = 0, it is a Gumbel distribution, if τ > 0, it is a Frechet distribution, else if τ < 0 it is a Weibull distribution.

$$G(x) = \begin{cases} \exp\left(-(1 + \tau x)^{-1/\tau}\right) & \text{for } \tau \ne 0,\; 1 + \tau x > 0 \\ \exp\left(-e^{-x}\right) & \text{for } \tau = 0 \end{cases} \qquad (14)$$
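As an illustration of how Equation (14) nests the three cases, the following is a minimal Python sketch of the standardized GEVD distribution function. The function name `gev_cdf` and the handling of points outside the support are assumptions made for this sketch and are not taken from the paper.

```python
import math


def gev_cdf(x: float, tau: float) -> float:
    """Standardized GEVD distribution function G(x) for tail index tau.

    tau = 0 gives the Gumbel case, tau > 0 the Frechet case and
    tau < 0 the Weibull case (hypothetical helper, not from the paper).
    """
    if tau == 0.0:
        return math.exp(-math.exp(-x))  # Gumbel limit
    arg = 1.0 + tau * x
    if arg <= 0.0:
        # Outside the support: below the lower endpoint when tau > 0,
        # above the upper endpoint when tau < 0.
        return 0.0 if tau > 0 else 1.0
    return math.exp(-arg ** (-1.0 / tau))
```

For small positive or negative values of `tau`, the function approaches the Gumbel value at the same `x`, which is the sense in which the GEVD unifies the three families.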

To build the series of maxima or minima, there are two

methods:

2.3.5. Block Maxima

This approach consists of splitting the series into equal non-overlapping blocks. The maximum from each block is extracted and used to model the extreme value distribution. As volatility clustering is a well observed phenomenon in financial data, very high or very low observations tend to occur together. Thus, this technique runs the risk of losing extreme observations.

2.3.6. Peak over Threshold

The second approach consists of sampling maxima by selecting those that exceed a chosen threshold. A low threshold would give rise to a larger number of observations, running the risk of including central observations in the extremes data. The tail index computed has lesser variance but is subject to bias. A high threshold has few observations, and the tail index is more imprecise, but unbiased. The choice of the threshold is thus a trade-off between variance and bias. For the analysis in this paper, we use the Peak over Threshold method.

2.4. Mixture of Normal Distributions

The mixture of normal distributions, used to model fat-tailed distributions, assumes that each observation is generated from one of N normal distributions. The probability that it is generated from a distribution “i” is $p_i$, with $\sum_{i=1}^{N} p_i = 1$.

The resultant density function is

$$f(x) = p_1 \phi_1(x) + p_2 \phi_2(x) + \cdots + p_N \phi_N(x) = \sum_{i=1}^{N} p_i\, \phi_i(x) \qquad (15)$$

where $\phi_i$ is a normal distribution with mean $\mu_i$ and standard deviation $\sigma_i$. For the special case of N = 2 we have

$$f(x; \theta) = p\, \phi_1(x) + (1 - p)\, \phi_2(x) \qquad (16)$$

where $\theta = (p, \mu_1, \mu_2, \sigma_1, \sigma_2)$ is the parameter vector.

For a mixture of N normal distributions, the first four moments (mean, variance, skewness and kurtosis) are:

$$\mu = \sum_{i=1}^{N} p_i \mu_i \qquad (17)$$

$$\sigma^2 = \sum_{i=1}^{N} p_i \left[\sigma_i^2 + (\mu_i - \mu)^2\right] \qquad (18)$$

$$\text{skewness} = \frac{1}{\sigma^3} \sum_{i=1}^{N} p_i \left[3\sigma_i^2 (\mu_i - \mu) + (\mu_i - \mu)^3\right] \qquad (19)$$

$$\text{kurtosis} = \frac{1}{\sigma^4} \sum_{i=1}^{N} p_i \left[3\sigma_i^4 + 6\sigma_i^2 (\mu_i - \mu)^2 + (\mu_i - \mu)^4\right] \qquad (20)$$
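The moment formulas for a normal mixture can be checked numerically. The sketch below is an assumption-labeled illustration in Python: the function name `mixture_moments` and the example weights are hypothetical, and the formulas are the standard central-moment expressions for a finite mixture of normals.

```python
def mixture_moments(p, mu, sigma):
    """Mean, variance, skewness and kurtosis of a finite normal mixture.

    p, mu and sigma are equal-length sequences of component weights,
    means and standard deviations (hypothetical helper for illustration).
    """
    mean = sum(pi * mi for pi, mi in zip(p, mu))
    var = sum(pi * (si ** 2 + (mi - mean) ** 2)
              for pi, mi, si in zip(p, mu, sigma))
    third = sum(pi * (3 * si ** 2 * (mi - mean) + (mi - mean) ** 3)
                for pi, mi, si in zip(p, mu, sigma))
    fourth = sum(pi * (3 * si ** 4 + 6 * si ** 2 * (mi - mean) ** 2
                       + (mi - mean) ** 4)
                 for pi, mi, si in zip(p, mu, sigma))
    return mean, var, third / var ** 1.5, fourth / var ** 2
```

For example, `mixture_moments([0.95, 0.05], [0.0, 0.0], [1.0, 3.0])` describes a symmetric mixture whose kurtosis is well above the normal value of 3, which is how a small high-variance component produces fat tails.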

A mixture of more than two normal distributions may

provide a better fit to the series, but Tucker [15] reports

that the improvement by increasing the number of nor-

mal distributions in the mixture from two is not too sig-

nificant. Estimation of parameters for the mixture of nor-

mal distribution is problematic. This is because, although

we have a well defined distribution function in a closed

form, using maximum likelihood techniques for parame-

ter estimation leads to convergence issues (Hamilton, [16]).

Using method of moments is another option, but even for the simplest case of N = 2, we need five moment equations to find the five parameters, $(p, \mu_1, \mu_2, \sigma_1, \sigma_2)$, and there may not be a solution at all (Titterington, Smith and Makov, [17]). Alternate methods have been suggested, such as fractile-to-fractile comparisons (Hull and White, [18]) and Bayesian updating schemes (Zangari, [19]).

This paper uses the fractile-to-fractile comparison technique along with a simplifying assumption that one of the means of the mixture of normal distributions is zero. This is a reasonable assumption, in the data set, as most observations (about 95%) lie in the zero-mean normal distribution, and it simplifies calculations considerably.

2.5. Power Transformations

Box and Cox [20] propose one of the first power transformations converting a non-normal distribution into a normal one. In its original form, the transformation function is:

$$y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \text{if } \lambda \ne 0 \\ \log y, & \text{if } \lambda = 0 \end{cases} \qquad (21)$$

However, as it can be seen, the power transformation cannot be applied to negative values of y. Since then, many modifications of the original Box-Cox power transformation have been proposed.

2.5.1. Manly’s Exponential Distribution

Manly [21] proposed the exponential distribution given below.

$$y^{(\lambda)} = \begin{cases} \dfrac{e^{\lambda y} - 1}{\lambda}, & \text{if } \lambda \ne 0 \\ y, & \text{if } \lambda = 0 \end{cases} \qquad (22)$$

Negative values of y are permitted. This transformation is useful for transforming skewed distributions to normal (Li, [22]).

2.5.2. Bickel-Doksum Transformation

Bickel and Doksum [23] transform the original Box-Cox transformation to

$$y^{(\lambda)} = \frac{|y|^{\lambda}\, \mathrm{sign}(y) - 1}{\lambda}, \quad \text{for } \lambda > 0 \qquad (23)$$

where

$$\mathrm{sign}(y) = \begin{cases} 1, & \text{if } y \ge 0 \\ -1, & \text{if } y < 0 \end{cases} \qquad (24)$$

The addition of the sign function makes this transformation compatible for negative values of y as well.

2.5.3. John-Draper Modulus Transformation

John and Draper [24] propose the modulus transformation given below:

$$y^{(\lambda)} = \begin{cases} \mathrm{sign}(y)\, \dfrac{(|y| + 1)^{\lambda} - 1}{\lambda}, & \text{if } \lambda \ne 0 \\ \mathrm{sign}(y)\, \log(|y| + 1), & \text{if } \lambda = 0 \end{cases} \qquad (25)$$

where

$$\mathrm{sign}(y) = \begin{cases} 1, & \text{if } y \ge 0 \\ -1, & \text{if } y < 0 \end{cases} \qquad (26)$$

The modulus transformation works best on those distributions which are approximately symmetric about some central point (Li, [22]). It reduces the kurtosis of the series, while introducing some degree of skewness to a symmetric distribution.

2.5.4. Yeo-Johnson Transformation

Yeo and Johnson [25] propose the following transformation in 2000:

$$\psi(\lambda, y) = \begin{cases} \dfrac{(y + 1)^{\lambda} - 1}{\lambda}, & \lambda \ne 0,\; y \ge 0 \\ \log(y + 1), & \lambda = 0,\; y \ge 0 \\ -\dfrac{(-y + 1)^{2 - \lambda} - 1}{2 - \lambda}, & \lambda \ne 2,\; y < 0 \\ -\log(-y + 1), & \lambda = 2,\; y < 0 \end{cases} \qquad (27)$$

In their original paper, Yeo and Johnson [25] find the value of λ by minimizing the Kullback-Leibler distance between the normal and transformed distributions. In this paper however, we have found λ by maximizing log-likelihoods. This transformation, like Manly, reduces skewness of the distribution and makes the transformed variable more symmetric.
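The four transformations of this section that accept negative arguments can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and each function follows the standard published form of the corresponding transformation, with `lam` standing for the transformation parameter λ.

```python
import math


def sign(y: float) -> float:
    """Sign function used by the Bickel-Doksum and John-Draper forms."""
    return 1.0 if y >= 0 else -1.0


def manly(y: float, lam: float) -> float:
    """Manly's exponential transformation."""
    return y if lam == 0 else (math.exp(lam * y) - 1.0) / lam


def bickel_doksum(y: float, lam: float) -> float:
    """Bickel-Doksum transformation, defined for lam > 0."""
    if lam <= 0:
        raise ValueError("Bickel-Doksum requires lam > 0")
    return (abs(y) ** lam * sign(y) - 1.0) / lam


def john_draper(y: float, lam: float) -> float:
    """John-Draper modulus transformation."""
    if lam == 0:
        return sign(y) * math.log(abs(y) + 1.0)
    return sign(y) * ((abs(y) + 1.0) ** lam - 1.0) / lam


def yeo_johnson(y: float, lam: float) -> float:
    """Yeo-Johnson transformation."""
    if y >= 0:
        return math.log(y + 1.0) if lam == 0 else ((y + 1.0) ** lam - 1.0) / lam
    if lam == 2:
        return -math.log(-y + 1.0)
    return -((-y + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)
```

With `lam = 1` the John-Draper and Yeo-Johnson functions return their input unchanged, which is a convenient sanity check before the parameter is estimated by maximum likelihood.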

3. Dynamic VaR Models

This section describes the methods used to calculate dynamic Value at Risk for equity index returns.

3.1. Model for Conditional Mean and Variance

To calculate conditional mean $\mu_t$ given the time series data until time t − 1, we use an ARMA (1, 1) process.

$$X_t = C + \phi_1 X_{t-1} + \theta_1 \varepsilon_{t-1} + \varepsilon_t \qquad (28)$$

We use the GARCH (1,1) process to model the volatility of the innovation term.

$$\sigma_t^2 = K + \beta_1 \sigma_{t-1}^2 + \alpha_1 \varepsilon_{t-1}^2 \qquad (29)$$
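The ARMA (1, 1) and GARCH (1, 1) recursions can be run as a simple filter over the return series. The Python sketch below is an illustration under stated assumptions: the parameter names (`c`, `phi`, `theta`, `k`, `alpha`, `beta`) are hypothetical, `alpha` multiplies the lagged squared innovation and `beta` the lagged variance, and the recursion is started from the sample mean and sample variance, a start-up choice the paper does not specify.

```python
def arma_garch_filter(x, c, phi, theta, k, alpha, beta):
    """Return (innovations, conditional variances) for a return series x.

    Conditional mean:     mu_t = c + phi * x_{t-1} + theta * eps_{t-1}
    Conditional variance: h_t  = k + beta * h_{t-1} + alpha * eps_{t-1}^2
    Start-up values are an assumption of this sketch.
    """
    n = len(x)
    mean_x = sum(x) / n
    h0 = sum((xi - mean_x) ** 2 for xi in x) / n  # assumed initial variance
    eps, h = [], []
    prev_x, prev_eps, prev_h = mean_x, 0.0, h0
    for xt in x:
        mu_t = c + phi * prev_x + theta * prev_eps
        h_t = k + beta * prev_h + alpha * prev_eps ** 2
        e_t = xt - mu_t
        eps.append(e_t)
        h.append(h_t)
        prev_x, prev_eps, prev_h = xt, e_t, h_t
    return eps, h
```

Dividing each innovation by the square root of its conditional variance gives the standardized residuals that the later models fit with fat-tailed distributions.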


3.2. Models for Innovations

In Equation (1), the forecasted mean and variance are calculated by an ARMA (1, 1)-GARCH (1, 1) model. As mentioned in the introduction, there are two approaches followed to model innovations. In the first approach, ARMA (1, 1)-GARCH (1, 1) model parameters are calculated assuming that standardized residuals alternatively follow Pearson Type IV distribution, Johnson SU distribution, Manly’s exponential transformation, normal and Student t-distributions. In the second approach,

ARMA (1, 1)-GARCH (1, 1) parameters are calculated

assuming that standardized residuals are normally dis-

tributed. The extracted standardized residuals are then

modeled using the mixture of normal distributions, Ex-

treme Value Theory, and other power transformations such

as John-Draper, Bickel-Doksum, Manly, Yeo-Johnson and

certain combinations of the above.

Method 1

The first approach consists of five models, whose designs are outlined below.

Model 1.1 GARCH-N Model

In Equation (1), the standardized residual $Z_t$ is assumed to be a standard normal distribution. Therefore, the innovations term, $\varepsilon_t = \sqrt{h_t}\, Z_t$, has zero mean and the standard deviation of $\sqrt{h_t}$, where $h_t = \sigma_t^2$ is the conditional variance.

$$Z_t \sim N(0, 1) \;\Rightarrow\; \varepsilon_t \sim N(0, h_t) \qquad (30)$$

$$f(\varepsilon_t) = \frac{1}{\sqrt{2\pi h_t}} \exp\left(-\frac{\varepsilon_t^2}{2 h_t}\right) \qquad (31)$$

Therefore, the log likelihood function, which is maximized to find the parameters of the ARMA-GARCH model for the series of length T is given by

$$LLF = -\frac{1}{2} \sum_{t=1}^{T} \left[\log(2\pi) + \log h_t + \frac{\varepsilon_t^2}{h_t}\right] \qquad (32)$$

The maximum likelihood estimates for the ARMA (1, 1)-GARCH (1, 1) parameters are found by minimizing the negative of the above function using the fmincon function in MATLAB.
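The objective handed to the optimizer in the GARCH-N model is the negative Gaussian log likelihood of the innovations. A minimal Python sketch follows; the function name is hypothetical, and the inputs are assumed to be the innovations and conditional variances produced by an ARMA-GARCH filter for a given parameter vector.

```python
import math


def gaussian_negative_llf(eps, h):
    """Negative Gaussian log likelihood of innovations eps with variances h.

    eps and h are equal-length sequences; every entry of h must be positive.
    """
    return 0.5 * sum(math.log(2.0 * math.pi) + math.log(ht) + et * et / ht
                     for et, ht in zip(eps, h))
```

A numerical optimizer would evaluate this function repeatedly, re-running the filter for each candidate parameter vector, which is the role fmincon plays in the paper.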

Model 1.2 GARCH-t Model

In Equation (1), $Z_t$ is assumed to be a Student t-distribution with zero mean and unit standard deviation. Therefore, the log likelihood function, the logarithm of the density function of the innovations term, $\varepsilon_t$, for the series of length T is given by

$$LLF = T \log\left[\frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi(\nu - 2)}}\right] - \frac{1}{2}\sum_{t=1}^{T} \log h_t - \frac{\nu + 1}{2} \sum_{t=1}^{T} \log\left[1 + \frac{\varepsilon_t^2}{h_t(\nu - 2)}\right] \qquad (33)$$

where ν represents the degrees of freedom in the t-distribution.

The maximum likelihood estimates for the ARMA (1, 1)-GARCH (1, 1) parameters are found by minimizing the negative of the above function using the fmincon function in MATLAB.

Model 1.3 GARCH-PIV Model

In Equation (1), $Z_t$ is assumed to be a Pearson Type IV distribution. The standardized innovations series has unit variance, but not necessarily a zero mean. This was justified by Newey and Steigerwald [26], who proved that an additional location parameter is needed to satisfy the identification condition for the consistency of parameter estimates when conditional innovation distribution in the GARCH model is asymmetric. Hence Equation (4) holds, but Equation (5) does not. Therefore

$$E(X_t) = F_t + \sqrt{h_t}\left(\lambda - \frac{a\nu}{r}\right) \qquad (34)$$

where $F_t$ is the forecasted mean. Hence, for modeling innovations, we need to change the location and scale parameters to $\lambda\sqrt{h_t}$ and $a\sqrt{h_t}$ respectively. The normalizing parameter is inversely proportional to the scale parameter, so it changes to $k/\sqrt{h_t}$.

$$Z_t \sim PIV(k, m, \nu, a, \lambda) \;\Rightarrow\; \varepsilon_t \sim PIV\left(\frac{k}{\sqrt{h_t}}, m, \nu, a\sqrt{h_t}, \lambda\sqrt{h_t}\right) \qquad (35)$$

The distribution function of the innovation series is given by

$$f(\varepsilon_t) = \frac{k}{\sqrt{h_t}} \left[1 + \left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)^2\right]^{-m} \exp\left[-\nu \tan^{-1}\left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)\right] \qquad (36)$$

The log likelihood function to be maximized is given by

$$LLF = T \log k - \frac{1}{2}\sum_{t=1}^{T} \log h_t - m \sum_{t=1}^{T} \log\left[1 + \left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right)^2\right] - \nu \sum_{t=1}^{T} \tan^{-1}\left(\frac{\varepsilon_t - \lambda\sqrt{h_t}}{a\sqrt{h_t}}\right) \qquad (37)$$

We use Equation (4) and the relation $r = 2(m - 1)$ to write a and m in terms of r and ν. The log likelihood function is maximized (by minimizing –LLF) using the fmincon function in MATLAB. The maximum likelihood estimates from the GARCH-N model and the Pearson Type IV parameters calculated from the first four moments of the resulting standardized innovations series (under the pseudo-normal assumption) are used as initial estimates for the optimization function. The normalizing constant k is computed by the technique used by Heinrich [13].

Model 1.4 GARCH-JSU Model

In this model, the standardized innovations in Equation (1), $Z_t$, are assumed to follow a Johnson SU distribution. As with the GARCH-PIV model, the standardized innovations have unit variance, but not necessarily zero mean. Therefore, from Equation (10), the scale parameter λ is constrained.

$$\lambda = \left[\frac{1}{2}(\omega - 1)\left(\omega \cosh 2\Omega + 1\right)\right]^{-1/2} \qquad (38)$$

where $\omega = \exp\left(\delta^{-2}\right)$ and $\Omega = \gamma/\delta$.

Note that the mean in Equation (9) is not constrained to be zero, and the parameter ξ has to be estimated during optimization. The predicted future value of the time series is given by

$$E(X_t) = F_t + \sqrt{h_t}\left(\xi - \lambda\, \omega^{1/2} \sinh \Omega\right) \qquad (39)$$

Now, for modeling the innovations series $\varepsilon_t$, the location and scale parameters must be changed to $\xi\sqrt{h_t}$ and $\lambda\sqrt{h_t}$.

$$Z_t \sim JSU(\xi, \lambda, \gamma, \delta) \;\Rightarrow\; \varepsilon_t \sim JSU\left(\xi\sqrt{h_t}, \lambda\sqrt{h_t}, \gamma, \delta\right) \qquad (40)$$

$$f(\varepsilon_t) = \frac{\delta}{\lambda\sqrt{h_t}\sqrt{1 + y_t^2}}\; \phi\left(\gamma + \delta \sinh^{-1} y_t\right), \quad y_t = \frac{\varepsilon_t - \xi\sqrt{h_t}}{\lambda\sqrt{h_t}} \qquad (41)$$

where $\phi(\cdot)$ is the density function of N(0, 1).

The log likelihood function to be maximized is given by

$$LLF = T \log \delta - T \log \lambda - \frac{T}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{T} \log h_t - \frac{1}{2}\sum_{t=1}^{T} \log\left(1 + y_t^2\right) - \frac{1}{2}\sum_{t=1}^{T} \left(\gamma + \delta \sinh^{-1} y_t\right)^2 \qquad (42)$$

The maximum likelihood estimates are calculated by minimizing the negative of the above function using the fmincon function in MATLAB.
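The unit-variance constraint on the Johnson SU scale parameter can be checked numerically. The Python sketch below is an illustration, with hypothetical function names, of the standard Johnson SU formulas: the scale that gives unit variance for chosen shape parameters, the implied mean, and the density obtained from the transformation $Z = \gamma + \delta \sinh^{-1}((X - \xi)/\lambda)$.

```python
import math


def jsu_unit_variance_scale(gamma: float, delta: float) -> float:
    """Scale lam for which a Johnson SU variable has unit variance."""
    omega = math.exp(delta ** -2)
    big_omega = gamma / delta
    return (0.5 * (omega - 1.0) * (omega * math.cosh(2.0 * big_omega) + 1.0)) ** -0.5


def jsu_mean(xi: float, lam: float, gamma: float, delta: float) -> float:
    """Mean of a Johnson SU variable with location xi and scale lam."""
    omega = math.exp(delta ** -2)
    return xi - lam * math.sqrt(omega) * math.sinh(gamma / delta)


def jsu_pdf(x: float, xi: float, lam: float, gamma: float, delta: float) -> float:
    """Johnson SU density obtained by change of variables from N(0, 1)."""
    y = (x - xi) / lam
    z = gamma + delta * math.asinh(y)
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return delta / (lam * math.sqrt(1.0 + y * y)) * phi
```

Summing `jsu_pdf` over a fine grid with the scale returned by `jsu_unit_variance_scale` should give a total mass close to one and a variance close to one, while the mean stays free and is governed by `xi`.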

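Model 1.5 GARCH-Manly Model

In this model, it is assumed that when the standardized innovation $Z_t$ in Equation (1) is put through Manly’s exponential transformation (Equation (22)), it becomes normally distributed. Assuming that the transformed normal function has zero mean and unit standard deviation, $Z_t$ has the following closed form probability distribution function

$$f(Z_t) = \sqrt{\frac{2}{\pi}}\; \frac{\exp\left(\lambda Z_t - \frac{1}{2}\left(Z_t^{(\lambda)}\right)^2\right)}{1 + \mathrm{erf}\left(\dfrac{1}{\sqrt{2}\,|\lambda|}\right)} \qquad (43)$$

where $Z_t^{(\lambda)}$ is the exponentially transformed (Equation (22)) value of $Z_t$ and erf is the error function.

Therefore the standardized innovations ($\varepsilon_t = \sqrt{h_t}\, Z_t$) have the following distribution

$$f(\varepsilon_t) = \sqrt{\frac{2}{\pi h_t}}\; \frac{\exp\left(\dfrac{\lambda \varepsilon_t}{\sqrt{h_t}} - \dfrac{1}{2}\left[\dfrac{\exp\left(\lambda \varepsilon_t / \sqrt{h_t}\right) - 1}{\lambda}\right]^2\right)}{1 + \mathrm{erf}\left(\dfrac{1}{\sqrt{2}\,|\lambda|}\right)} \qquad (44)$$

The log likelihood function to be maximized is given by

$$LLF = \sum_{t=1}^{T} \left[\frac{1}{2}\log\frac{2}{\pi} - \frac{1}{2}\log h_t + \frac{\lambda \varepsilon_t}{\sqrt{h_t}} - \frac{1}{2}\left(\frac{\exp\left(\lambda \varepsilon_t / \sqrt{h_t}\right) - 1}{\lambda}\right)^2 - \log\left(1 + \mathrm{erf}\left(\frac{1}{\sqrt{2}\,|\lambda|}\right)\right)\right] \qquad (45)$$

The maximum likelihood estimates are calculated by minimizing the negative of the above function using the fmincon function in MATLAB. The above equation is derived in detail in the Appendix.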

Method 2

The second approach consists of nine models, whose de-

signs are outlined below.

Model 2.1 GARCH-EVT Model

In this model, the ARMA (1, 1)-GARCH (1, 1) parameters are found under the pseudo-normal assumption, i.e., that the standardized innovations in Equation (1) are standard normal. Now, the assumption made is that the values of $Z_t$ considered for calculation of VaR, i.e., the 99th, 97.5th and 95th percentiles are part of an extreme value distribution. This assumption is theoretically justified, as the ARMA-GARCH process gets rid of the serial correlation between terms, and the Fisher-Tippet theorem is applicable.

We use the Peak over Threshold (POT) method to observe the number of values which exceed a high threshold. The distribution of conditional excess losses over a certain high threshold follows a Generalized Pareto Distribution (GPD).

$$G_{\xi,\beta}(y) = \begin{cases} 1 - \left(1 + \dfrac{\xi y}{\beta}\right)^{-1/\xi}, & \xi \ne 0 \\ 1 - \exp\left(-\dfrac{y}{\beta}\right), & \xi = 0 \end{cases} \qquad (46)$$

where ξ is the shape parameter (positive in our specific case, as this yields a heavy tailed GPD) and β is the scaling parameter.

The formula for conditional excess losses above the threshold u (we consider the negative of the return series, thereby ensuring that the threshold is positive, and mean excess return is positive) is given by

$$F_u(y) = P(X - u \le y \mid X > u) \qquad (47)$$

$$F_u(y) = \frac{F(y + u) - F(u)}{1 - F(u)} \qquad (48)$$

Since $F_u(y)$ is a GPD with positive ξ, we need to back-calculate $F(y + u)$. $F(u)$ is given by $(N - N_u)/N$, where N is the total number of observations and $N_u$ is the number of observations above the threshold u. Therefore, the tail estimator becomes

$$F(x) = 1 - \frac{N_u}{N}\left(1 + \frac{\xi (x - u)}{\beta}\right)^{-1/\xi}, \quad \text{for } x > u \qquad (49)$$

$$VaR_q = u + \frac{\beta}{\xi}\left[\left(\frac{N}{N_u}\, q\right)^{-\xi} - 1\right] \qquad (50)$$

where q is the tail probability (0.01, 0.025 or 0.05 for the 99%, 97.5% and 95% VaR).

The Value at Risk is now calculated by the formula

$$VaR_t = \mu_t + \sigma_t\, VaR_q \qquad (51)$$

Choosing the threshold to be used in the calculations is a subjective process. In this paper, we calculate the mean excess returns for various values of thresholds and plot them. For a GPD, the mean excess return is given by:

$$e(u) = E(X - u \mid X > u) = \frac{\beta + \xi u}{1 - \xi} \qquad (52)$$

The threshold is calculated by observing the graphs and identifying the point from which the conditional excess return increases linearly with the threshold values. It is possible to consider any larger value as a threshold as well, but this way, the maximum number of data points gets accommodated in the extreme value distribution, thus reducing the variance of the obtained parameters. In Figures 2(a) and (b), we observe that the threshold value for Sensex returns is at 1.4, and for Nifty, it is at 1.5. Note that in the graphs, we consider the negative of the return series, which is why the threshold values are positive.

Figure 2. The optimal threshold is calculated by plotting the mean excess function of the six time series. The point is chosen at the point where the graph begins to slope upwards. As can be seen, the DJI graph is an anomaly, where no such clear point is present.

For certain time series, the graph obtained is not very useful for finding the threshold. Consider the mean excess return for DJI in Figure 2 for instance. In such cases, we consider an appropriately high value for the threshold, such as the 95th percentile of negative returns.
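The Peak over Threshold steps described for this model can be sketched in Python. The sketch is an illustration under stated assumptions: all function and argument names are hypothetical, the GPD shape `xi` and scale `beta` are taken as already estimated, `tail_prob` is the VaR tail probability (for example 0.01 for the 99% level) and must be smaller than `n_exceed / n_total`, and the formulas are the standard POT tail estimator and its inverse for a nonzero shape parameter.

```python
def mean_excess(losses, u):
    """Sample mean excess over threshold u, used to choose the threshold."""
    excesses = [x - u for x in losses if x > u]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)


def pot_tail_cdf(x, u, xi, beta, n_total, n_exceed):
    """POT tail estimator F(x) for x > u and xi != 0."""
    return 1.0 - (n_exceed / n_total) * (1.0 + xi * (x - u) / beta) ** (-1.0 / xi)


def pot_quantile(tail_prob, u, xi, beta, n_total, n_exceed):
    """Quantile of the standardized losses implied by the tail estimator."""
    ratio = n_total * tail_prob / n_exceed
    return u + (beta / xi) * (ratio ** (-xi) - 1.0)


def dynamic_var(mu_t, sigma_t, quantile):
    """Scale a standardized quantile by the ARMA-GARCH forecasts."""
    return mu_t + sigma_t * quantile
```

As a usage example, with a threshold of 1.5, 75 exceedances out of 1500 standardized losses and assumed GPD parameters `xi = 0.2` and `beta = 0.6`, `pot_quantile(0.01, 1.5, 0.2, 0.6, 1500, 75)` returns the standardized 99% quantile, and feeding it back into `pot_tail_cdf` recovers 0.99.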

Model 2.2 GARCH-MixNorm Model

This model also makes use of the pseudo-normal assumption to calculate the ARMA (1, 1)-GARCH (1, 1) parameters. The standardized innovations are assumed to have a mixture of two normal distributions. We calculate the mean, standard deviation, skewness and kurtosis of these standardized innovations.

The mean of one of the two normal distributions in the mixture is assumed to be zero. This assumption is reasonable, as results show that the probability that the standardized residuals lie in this normal distribution is very high. A small percentage lies in the other distribution, with the non-zero mean and higher variance; these yield the very high and very low values observed in the data.

Thus, the parameter vector is of size four: $\theta = (p, \mu_1, \sigma_1, \sigma_2)$. p is the probability that the data point lies in the first (non-zero mean) distribution, $\mu_1$ the mean of the first distribution, and $\sigma_1$ and $\sigma_2$ are the standard deviations of the first and second distributions respectively. The mean of the second distribution is assumed to be zero.

The parameter vector components must satisfy the four moment constraints, where $\mu_e$, $\sigma_e$, $s_e$ and $k_e$ are the mean, standard deviation, skewness and kurtosis of the standardized innovations.

$$\mu_e = p\, \mu_1 \qquad (53)$$

$$\sigma_e^2 = p\left[\sigma_1^2 + (\mu_1 - \mu_e)^2\right] + (1 - p)\left[\sigma_2^2 + \mu_e^2\right] \qquad (54)$$

$$s_e\, \sigma_e^3 = p\left[3\sigma_1^2 (\mu_1 - \mu_e) + (\mu_1 - \mu_e)^3\right] - (1 - p)\left[3\sigma_2^2 \mu_e + \mu_e^3\right] \qquad (55)$$

$$k_e\, \sigma_e^4 = p\left[3\sigma_1^4 + 6\sigma_1^2 (\mu_1 - \mu_e)^2 + (\mu_1 - \mu_e)^4\right] + (1 - p)\left[3\sigma_2^4 + 6\sigma_2^2 \mu_e^2 + \mu_e^4\right] \qquad (56)$$

An obtained solution is feasible if it satisfies the constraints $\sigma_1^2 > 0$, $\sigma_2^2 > 0$ and $0 < p < 1$.

To calculate the parameters through the method of moments, we need five moment equations. It is possible that there may not be a solution even if the first five moments were calculated. So we employ a fractile-to-fractile comparison test in addition to using certain moment equations.

We employ a modified version of the technique used by Perez [27]. The data (standardized residuals) is divided into seven sets; less than 0.5 standard deviations, 0.5 – 1, 1 – 1.5, 1.5 – 2, 2 – 2.5, 2.5 – 3, and greater than 3 standard deviations. The actual number of residuals in each category ($a_i$) is compared with the predicted number of residuals for each solution obtained from the moment equations ($b_i$). The solution considered is the one obtained by maximizing the log likelihood function

$$LLF = \sum_{i=1}^{7} a_i \log b_i \qquad (57)$$
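The fractile-to-fractile comparison can be illustrated with a short Python sketch. Several details are assumptions of this sketch rather than statements from the paper: the seven categories are applied to the absolute value of the standardized residual, the predicted quantity for each category is the mixture probability of that category, and all function names are hypothetical. The second mixture component has zero mean, as in the text.

```python
import math

EDGES = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # category boundaries in standard deviations


def norm_cdf(x, mu, sigma):
    """Normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, p, mu1, s1, s2):
    """Two-normal mixture CDF; the second component has zero mean."""
    return p * norm_cdf(x, mu1, s1) + (1.0 - p) * norm_cdf(x, 0.0, s2)


def bucket_probabilities(p, mu1, s1, s2):
    """Mixture probability of |Z| falling in each of the seven categories."""
    cumulative = [mixture_cdf(c, p, mu1, s1, s2) - mixture_cdf(-c, p, mu1, s1, s2)
                  for c in EDGES] + [1.0]
    probs, prev = [], 0.0
    for c in cumulative:
        probs.append(c - prev)
        prev = c
    return probs


def bucket_counts(residuals):
    """Observed number of residuals in each category."""
    counts = [0] * (len(EDGES) + 1)
    for z in residuals:
        counts[sum(1 for c in EDGES if abs(z) >= c)] += 1
    return counts


def fractile_llf(counts, probs):
    """Sum of observed counts times log predicted probabilities."""
    return sum(a * math.log(b) for a, b in zip(counts, probs) if a > 0)
```

A candidate parameter vector that places more probability in the heavily populated categories obtains a higher value of `fractile_llf`, which is how competing moment-based solutions would be ranked in this sketch.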

and satisfying the constraint Equations (49), (50), (52) and (53). In most cases no solution satisfies all of them; in such cases, constraint Equation (52) is dropped. The optimization is carried out using the fmincon function in MATLAB. The optimum values of the parameters depend on the initial values considered, so the parameters obtained for the previous data point are used as initial values in the optimization for the next one.

The Value at Risk is now calculated by the formula in Equation (48), where $\mathrm{VaR}_q$ is calculated by inserting the calculated parameters in the mixture of normals probability density function given by Equation (16) and cumulating it by numerical methods.
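The quantile step can be illustrated with a short sketch. It assumes a two-component mixture whose second component has zero mean, as described above. Instead of cumulating the density numerically it uses the closed-form normal CDFs and a bracketing root search, which is a substitute technique and not the MATLAB routine used in the paper. The function names and the parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm


def mixture_cdf(x, p, mu1, sigma1, sigma2):
    """CDF of p*N(mu1, sigma1^2) + (1-p)*N(0, sigma2^2)."""
    return p * norm.cdf(x, mu1, sigma1) + (1.0 - p) * norm.cdf(x, 0.0, sigma2)


def mixture_quantile(alpha, p, mu1, sigma1, sigma2):
    """Solve mixture_cdf(x) = alpha for x with a bracketing root search."""
    width = 20.0 * max(sigma1, sigma2)
    lo = min(mu1, 0.0) - width   # CDF is essentially 0 here
    hi = max(mu1, 0.0) + width   # CDF is essentially 1 here
    return brentq(lambda x: mixture_cdf(x, p, mu1, sigma1, sigma2) - alpha, lo, hi)


# Hypothetical parameter values, chosen only for illustration.
q01 = mixture_quantile(0.01, p=0.1, mu1=-0.5, sigma1=2.0, sigma2=0.8)
print(q01, norm.ppf(0.01))  # the mixture quantile lies further out than the normal one
```

With these illustrative parameters the mixture has roughly unit variance, yet its 1% quantile is more negative than the standard normal quantile, which is the fat-tail effect the model is meant to capture.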

Model 2.3 GARCH-Bickel-Doksum Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The standardized residuals obtained are put through the transformation suggested by Bickel and Doksum [23] to normalize them (Equations (23) and (24)). We assume that, for some value of the parameter $\lambda$, the transformed observations are normally distributed with mean $\mu$ and standard deviation $\sigma$. The parameter is estimated by maximizing the log likelihood function







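$$L(\lambda, \mu, \sigma \mid z) = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{t=1}^{n}\left(z_t^{(\lambda)} - \mu\right)^2 + (\lambda - 1)\sum_{t=1}^{n}\log\left|z_t\right| \tag{58}$$

where $z_t$ are the standardized residuals, $z_t^{(\lambda)}$ their Bickel-Doksum transforms and $n$ the number of observations. The maximum likelihood estimate for the mean and variance is given by

$$\hat{\mu}(\lambda) = \frac{1}{n}\sum_{t=1}^{n} z_t^{(\lambda)} \tag{59}$$

$$\hat{\sigma}^2(\lambda) = \frac{1}{n}\sum_{t=1}^{n}\left(z_t^{(\lambda)} - \hat{\mu}(\lambda)\right)^2 \tag{60}$$

The estimate for $\lambda$ can, therefore, be obtained by simply maximizing the likelihood function

$$L(\lambda \mid z) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2(\lambda)\right) - \frac{n}{2} + (\lambda - 1)\sum_{t=1}^{n}\log\left|z_t\right| \tag{61}$$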
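The estimation of the Bickel-Doksum parameter can be sketched as follows. The sketch assumes the transform $(\operatorname{sign}(z)|z|^{\lambda} - 1)/\lambda$ with $\lambda > 0$ and a profile log likelihood in which the mean and variance are replaced by their sample estimates. It uses SciPy's bounded scalar optimizer rather than the MATLAB routine used in the paper, and the function names, search bounds and simulated data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm


def bickel_doksum(z, lam):
    """Bickel-Doksum transform (sign(z)*|z|**lam - 1) / lam, assuming lam > 0."""
    return (np.sign(z) * np.abs(z) ** lam - 1.0) / lam


def inverse_bickel_doksum(y, lam):
    """Inverse of bickel_doksum, valid on both branches."""
    u = lam * np.asarray(y, dtype=float) + 1.0
    return np.sign(u) * np.abs(u) ** (1.0 / lam)


def profile_loglik(lam, z):
    """Normal log likelihood of the transformed data with mean and variance profiled out."""
    y = bickel_doksum(z, lam)
    n = z.size
    sigma2_hat = y.var()  # maximum likelihood variance (divides by n)
    jacobian = (lam - 1.0) * np.sum(np.log(np.abs(z)))
    return -0.5 * n * np.log(2.0 * np.pi * sigma2_hat) - 0.5 * n + jacobian


def fit_bickel_doksum(z, bounds=(0.05, 3.0)):
    """Return (lambda, mean, std) of the fitted transformation; bounds are an assumption."""
    res = minimize_scalar(lambda lam: -profile_loglik(lam, z),
                          bounds=bounds, method="bounded")
    y = bickel_doksum(z, res.x)
    return res.x, y.mean(), y.std()


# Simulated fat-tailed "standardized residuals", for illustration only.
rng = np.random.default_rng(0)
z = rng.standard_t(df=5, size=2000)
z = (z - z.mean()) / z.std()

lam_hat, mu_hat, sigma_hat = fit_bickel_doksum(z)
y_q = norm.ppf(0.01, loc=mu_hat, scale=sigma_hat)   # low quantile in transformed space
z_q = inverse_bickel_doksum(y_q, lam_hat)           # back-transformed residual quantile
```

The back-transformed quantile `z_q` plays the role of the residual quantile that enters the VaR formula; how it is scaled by the conditional mean and volatility is left to the formula referenced in the text.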

As shown in Table 2(a), the Bickel-Doksum transfor-

mation does not handle skewed distributions well, as it

only reduces kurtosis. Hence, this model must be modi-

fied to fix this drawback.

The Value at Risk is now calculated by the formula in Equation (48), where $\mathrm{VaR}_q$ is calculated from the inverse Bickel-Doksum formula

$$\mathrm{VaR}_q = -\left[-\left(1 + \lambda N(\hat{\mu}, \hat{\sigma}, q)\right)\right]^{1/\lambda} \tag{62}$$

where $N(\hat{\mu}, \hat{\sigma}, q)$ is the inverse normal function for probability $1 - q$, mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.

Model 2.4 GARCH-John-Draper Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The standardized residuals obtained are transformed with the modulus transformation proposed by John and Draper [24] (Equations (25) & (26)). By using similar arguments as the previous model, the parameter $\lambda$ is estimated by maximizing the log likelihood function

$$L(\lambda \mid z) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2(\lambda)\right) - \frac{n}{2} + (\lambda - 1)\sum_{t=1}^{n}\log\left(1 + \left|z_t\right|\right) \tag{63}$$

where $\hat{\sigma}^2$ is given by Equation (56) with $z_t^{(\lambda)}$ representing the modulus transformation of the standardized residual $z_t$.

As with the Bickel-Doksum transformation, Table 2(a) shows that the modulus transformation is not a skew-corrector; it only reduces kurtosis. Hence, this model must be modified to correct this.

The Value at Risk is now calculated by the formula in Equation (48), where $\mathrm{VaR}_q$ is calculated from the inverse John-Draper formula

$$\mathrm{VaR}_q = 1 - \left[1 - \lambda N(\hat{\mu}, \hat{\sigma}, q)\right]^{1/\lambda} \tag{64}$$

where $N(\hat{\mu}, \hat{\sigma}, q)$ is the inverse normal function for probability $1 - q$, mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.







Model 2.5 GARCH-Yeo-Johnson Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The standardized residuals obtained are transformed with the Yeo-Johnson [25] transformation (Equation (27)). By using similar arguments as the previous models, the parameter $\lambda$ is estimated by maximizing the log likelihood function

$$L(\lambda \mid z) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2(\lambda)\right) - \frac{n}{2} + (\lambda - 1)\sum_{t=1}^{n}\operatorname{sign}(z_t)\log\left(1 + \left|z_t\right|\right) \tag{65}$$

where $\hat{\sigma}^2$ is given by Equation (56) with $z_t^{(\lambda)}$ representing the Yeo-Johnson transformation of the standardized residual $z_t$.

Tables 2(a) and (b) show that the Yeo-Johnson transformation is a skew-correcting transformation. The model must be modified to enable kurtosis-handling as well.
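The skew-correcting behaviour can be checked on simulated data with a short sketch. It relies on `scipy.stats.yeojohnson`, which estimates the transformation parameter by maximum likelihood when no parameter is supplied. The left-skewed sample below is an illustrative assumption and not the paper's data.

```python
import numpy as np
from scipy import stats

# Simulated left-skewed "standardized residuals", for illustration only.
rng = np.random.default_rng(1)
x = -rng.gamma(shape=4.0, size=2000)
z = (x - x.mean()) / x.std()

# With lmbda omitted, SciPy returns the transformed data and the ML estimate of lambda.
z_yj, lam_hat = stats.yeojohnson(z)

skew_before = stats.skew(z)
skew_after = stats.skew(z_yj)
kurt_before = stats.kurtosis(z, fisher=False)
kurt_after = stats.kurtosis(z_yj, fisher=False)

print(f"lambda = {lam_hat:.3f}")
print(f"skewness: {skew_before:.3f} -> {skew_after:.3f}")
print(f"kurtosis: {kurt_before:.3f} -> {kurt_after:.3f}")
```

Comparing skewness and kurtosis before and after the transformation mirrors the kind of check reported in Tables 2(a) and (b).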

The Value at Risk is now calculated by the formula in Equation (48), where $\mathrm{VaR}_q$ is calculated from the inverse Yeo-Johnson formula

$$\mathrm{VaR}_q = 1 - \left[1 - (2 - \lambda) N(\hat{\mu}, \hat{\sigma}, q)\right]^{1/(2 - \lambda)} \tag{66}$$

where $N(\hat{\mu}, \hat{\sigma}, q)$ is the inverse normal function for probability $1 - q$, mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.

Model 2.6 GARCH-Manly-John-Draper Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Manly exponential transformation to rid them of skewness. The symmetric data is now transformed with the John-Draper modulus transformation, which reduces kurtosis. The doubly-transformed data obtained is now roughly normally distributed (Tables 2(a) and (b)).

To obtain the parameter for the Manly transformation, the following log-likelihood function is maximized.

$$L(\lambda \mid z) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2(\lambda)\right) - \frac{n}{2} + \lambda\sum_{t=1}^{n} z_t \tag{67}$$

The parameter for the John-Draper transformation is obtained by maximizing the log-likelihood function in Equation (60).

The inverse Manly transformation is given by

$$\mathrm{VaR}_q = \frac{1}{\lambda}\log\left[1 + \lambda N(\hat{\mu}, \hat{\sigma}, q)\right] \tag{68}$$

The Value at Risk is calculated in two steps. First, the low quantile value is subjected to the inverse John-Draper transformation in Equation (61) and this value is back-transformed with the inverse Manly transformation in Equation (65).
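The two-step back-transformation can be sketched as follows. The sketch assumes the Manly transform $(e^{\lambda z} - 1)/\lambda$ and the John-Draper modulus transform $\operatorname{sign}(x)\left[(1 + |x|)^{\lambda} - 1\right]/\lambda$ with $\lambda > 0$. The function names and the numerical parameter values are hypothetical and only illustrate the order of operations.

```python
import numpy as np
from scipy.stats import norm


def manly(z, lam):
    """Manly exponential transform (exp(lam*z) - 1) / lam; identity when lam is 0."""
    z = np.asarray(z, dtype=float)
    return z if abs(lam) < 1e-12 else np.expm1(lam * z) / lam


def inverse_manly(y, lam):
    """Inverse Manly transform log(1 + lam*y) / lam; requires 1 + lam*y > 0."""
    y = np.asarray(y, dtype=float)
    return y if abs(lam) < 1e-12 else np.log1p(lam * y) / lam


def john_draper(x, lam):
    """Modulus transform sign(x) * ((1 + |x|)**lam - 1) / lam, assuming lam > 0."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * ((1.0 + np.abs(x)) ** lam - 1.0) / lam


def inverse_john_draper(y, lam):
    """Inverse modulus transform sign(y) * ((1 + lam*|y|)**(1/lam) - 1)."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * ((1.0 + lam * np.abs(y)) ** (1.0 / lam) - 1.0)


def two_step_quantile(q, mu_hat, sigma_hat, lam_manly, lam_jd):
    """Normal quantile of the doubly-transformed data, undone in reverse order."""
    y_q = norm.ppf(q, loc=mu_hat, scale=sigma_hat)   # quantile in transformed space
    x_q = inverse_john_draper(y_q, lam_jd)           # step 1: undo the modulus transform
    return float(inverse_manly(x_q, lam_manly))      # step 2: undo the Manly transform


# Hypothetical fitted values, for illustration only.
z_q = two_step_quantile(0.01, mu_hat=0.0, sigma_hat=1.0, lam_manly=0.05, lam_jd=0.8)
print(z_q)
```

Because the forward pass applies Manly first and John-Draper second, the inverse pass has to undo John-Draper first, which is the ordering described in the text.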

Model 2.7 GARCH-Manly-Bickel-Doksum Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Manly exponential transformation to remove skewness, and then with the Bickel-Doksum transformation, which reduces kurtosis. The skewness and kurtosis of the doubly-transformed in-sample data are given in Tables 2(a) and (b).

The parameters for the Manly and Bickel-Doksum transformations are calculated by maximizing the log-likelihoods in Equations (64) and (58). After the two parameters are obtained, the VaR is calculated from the inverse transformations in Equations (59) and (65) carried out serially in that order.

Table 2. (a) Skewness comparison of std. residualser transformation; (b) Kurtosis comparison of std. residuals

after power transformation.

NIFTY

after pow

(a)

SensexDJI FTSE HSI Nikkei

Initial skewness −0.4764 −0.5311 −0.1024 −0.3660 −0.1836 −0.3209

Manly 0.0134 0.0615 0.0035 0.0162 0.0126 0.0072

−0.0801 −0.2763 −0.0815 −0.2299 John-Draper −0.3694 −0.2842

Yeo-Johnson 0.0037 0.0442

Bickel-Doksum

0.0094 0.0055 −0.0175 −0.0306

−0.0858 −0.3045 −0.1069 −0.2638−0.4379 −0.3769

Manly-Yeo-Johnson 0.0052 00.0093 0.0068 −0.0221

Manly-John-Draper −0.0026 0.0052 0.0226 0.0150

m 126

−0.0087

0130

0.0041 −0.0003 0.0026 −0.0193

.0356 −0.0287

0.0070 −0.0026

Manly-Bickel-Doksu0.00.0281 −0.0015 0.0068 0.0208 0.0123

0.0051 −0.0022 −0.0151 −0.0284 John-Draper-Yeo-Johnson −0.0063

Yeo-Johnson-John-Draper −0.0002 0.0028

Yeo-Johnson-Bickel-Doksum 0.0029 0.0233

0.0028 −0.0002 0.0088 −0.

The standardized residuals for the in-sample data are transformed with various

check their normalizing effect. For double-transformations, the data is first tra

ond transformation.

power transformations. The skewness of each transformed output is co

nsformed with the transformation mentioned first, and then subjected to the sec-

)

Sensex NIFTY

mpared to

DJI FTSE HSI Nikkei

Initial kurtosis 3.7840 5.0195 3.3459 3.8574 3.9326 3.5752

Manly 3.3038 4.93

3.17

4.7952

3.2771 3.8375

John-Draper-Yeo-Johnson 2.9229 3.0349

Yeo-Johnson-John-Draper 3.1005 3.18

Johnson-Bickel-Doksum 3.1907 3.0305 3.0275

18 3.3380 3.4958 3.8097 3.1979

John-Draper 3.1862 18

Yeo-Johnson 3.3475

2.8817 3.1754 2.8691 3.0347

3.3385 3.5389 3.8324 3.2611

2.9518 3.3488 3.0716 3.2118

3.3388 3.4952 3.8182 3.2126

Bickel-Doksum 3.5532 3.8147

Manly-Yeo-Johnson 3.3032 4.89

Manly-John-Draper 3.0932 3.2107

Manly-Bickel-Doksum

2.8838 3.0976 2.8498 2.9224

2.9502 3.1660 3.0251 2.9903

2.8775 3.0082 2.8458 2.8674

2.8840 3.1073 2.8531 2.9431

2.9565

Yeo-3.2998 3.7688

The standardized residuals f-sample datansformed with variousor the in are tra power transformations. The kurtosis of each transformed output is compared to

check their normalizing effect. For double-transformations, the data is first transformed with the transformation mentioned first, and then subjected to the sec-

ond transformation.

Model 2.8 GARCH-Yeo-Johnson-John-Draper Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Yeo-Johnson transformation to rid them of skewness. The symmetric data is now transformed with the John-Draper modulus transformation, which reduces kurtosis. The doubly-transformed data obtained is now roughly normally distributed (Tables 2(a) and (b)).

After the parameters for the two transformations are obtained from Equations (62) and (60), the VaR is computed from the inverse transformations in Equations (61) and (63).

Model 2.9 GARCH-Yeo-Johnson-Bickel-Doksum Model

We calculate the ARMA (1, 1)-GARCH (1, 1) parameters under the pseudo-normal assumption. The innovations are initially transformed through the Yeo-Johnson transformation to remove skewness, and then transformed with the Bickel-Doksum transformation, which removes excess kurtosis. The skewness and kurtosis of the doubly-transformed data obtained are given in Tables 2(a) and (b).

Parameters for the Yeo-Johnson and Bickel-Doksum transformations are calculated by maximizing the log-likelihoods in Equations (62) and (58). The VaR is calculated from the inverse transformations in Equations (59) and (63) carried out serially in that order.

Table 3. (a) 99% VaR violations comparisons for model 1 series; (b) 97.5% VaR violations comparisons for model 1 series; (c) 95% VaR violations comparisons for model 1 series.

(a)

99% VaR    Model 1.1 Normal    Model 1.2 T    Model 1.3 Pearson Type IV    Model 1.4 Johnson SU    Model 1.5 Manly    Expected Violations
Sensex     16                  16             7                            7                       8                  5
Nifty      16                  14             8                            8                       13                 5
DJI        22                  20             9                            9                       11                 5
FTSE       19                  21             13                           13                      17                 5
HSI        15                  13             6                            6                       10                 5
Nikkei     11                  13             7                            7                       9                  5

This table shows the VaR violation comparisons for the Model 1 series. The expected number of violations is given in the last column; 99% VaR is expected to be violated 5 times for a 500 point out-of-sample data set. As can be seen, Models 1.3 and 1.4 are the best performing ones.

(b)

97.5% VaR  Model 1.1 Normal    Model 1.2 T    Model 1.3 Pearson Type IV    Model 1.4 Johnson SU    Model 1.5 Manly    Expected Violations
Sensex     28                  27             16                           15                      21                 12.5
Nifty      24                  24             16                           15                      21                 12.5
DJI        34                  27             22                           22                      23                 12.5
FTSE       29                  30             25                           25                      24                 12.5
HSI        24                  23             20                           19                      22                 12.5
Nikkei     29                  29             14                           14                      18                 12.5

This table shows the VaR violation comparisons for the Model 1 series. The expected number of violations is given in the last column; 97.5% VaR is expected to be violated 12.5 times for a 500 point out-of-sample data set. As can be seen, Models 1.3 and 1.4 are the best performing ones.

(c)

95% VaR    Model 1.1 Normal    Model 1.2 T    Model 1.3 Pearson Type IV    Model 1.4 Johnson SU    Model 1.5 Manly    Expected Violations
Sensex     38                  38             29                           28                      33                 25
Nifty      40                  38             29                           32                      36                 25
DJI        55                  49             42                           41                      42                 25
FTSE       38                  39             36                           36                      41                 25
HSI        37                  34             33                           33                      33                 25
Nikkei     46                  43             33                           33                      38                 25

This table shows the VaR violation comparisons for the Model 1 series. The expected number of violations is given in the last column; 95% VaR is expected to be violated 25 times for a 500 point out-of-sample data set. As can be seen, Model 1.3 is the best performing one.

4. Testing

The data series are of length 1500; these are divided into the in-sample series (length 1000) and the out-of-sample series (length 500). For each data point in the out-of-sample region, we estimate model parameters using the previous 1000 data points, i.e., for finding VaR on day t, we consider data points from day t − 1000 to day t − 1.

The tests are run in MATLAB version 7.2 on a Windows XP operating system with 1.6 GHz processing speed. While running the program to calculate VaR for a single day, the results are generated well within 30 seconds for most cases.
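The rolling-window testing procedure (1000 points of history for each of 500 out-of-sample forecasts) can be sketched as below. The `normal_var` function is a deliberately simple placeholder and is not one of the paper's models; any function mapping a history window and a confidence level to a lower-tail return quantile could be passed in its place. All names and the simulated returns are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm


def rolling_var_violations(returns, window, q, var_fn):
    """Count days on which the return falls below the VaR forecast
    made from the previous `window` observations."""
    violations = 0
    n_forecasts = 0
    for t in range(window, len(returns)):
        history = returns[t - window:t]      # days t-window .. t-1
        var_t = var_fn(history, q)           # forecast lower-tail quantile for day t
        violations += int(returns[t] < var_t)
        n_forecasts += 1
    return violations, n_forecasts


def normal_var(history, q):
    """Placeholder model: unconditional normal quantile at level 1 - q."""
    return norm.ppf(1.0 - q, loc=history.mean(), scale=history.std())


def expected_violations(q, n_forecasts):
    """Expected number of violations for a q-level VaR over n_forecasts days."""
    return (1.0 - q) * n_forecasts


# Simulated returns, for illustration only.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=1500)

observed, n = rolling_var_violations(returns, window=1000, q=0.99, var_fn=normal_var)
print(observed, expected_violations(0.99, n))
```

With a series of length 1500 and a window of 1000 the loop produces 500 forecasts, so the expected counts at 99%, 97.5% and 95% are 5, 12.5 and 25.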

Table 4. (a) 99% VaR violations comparisons for model 2 ser 99% VaR violations comparisons for model 2 series; (c)

95% VaR violations comparisons for model 2 series.

99% VaR odel 2.1

EVT

Model 2.2

Mixtur

Nors

Model 2.3

Bickel-Doksum

Draper

Model 2.5

Yeo-Johnson

Model 2.6

Manly-John-

Draper

Model 2.7

Manly-Bickel-

Doksum

Model 2.8

Yeo- ohnson-

John-Draper

Model 2.9

Yeo-Johnson-

Bickel-Doksu

Expected

Violations

The tests a

dows XP oper

day, the results are generated well within 30 seconds for The models are tested on six equity indices, Sensex, Nifty,

most cases.

5. Results

5.1. Data and Model Parameters

ies; (b)

(a)

Me of

mal John-

Model 2.J

Sensex 7 15 13 13 7 7 7 5 7 7

N9 13 12 9 10 9 10 5

D11 15 10 11 0 11 5

FT14 16 14 15 15 5

8 5

ifty 9 13

JI 15 17 14 1

SE 16 18 18 13

H S I 6 13 9 8 10 6 6 6 6 5

Nikkei 8 9 10 12 10 8 8 8

This table shows the VaR violation comparisons for the Model 2 series. The eted number of violations is given in the last column, 99% VaR is expected to

be violated 5 times for a 500 point out-of-sample data set. As can be seehe bene.

97.5% odel 2.1

EVT

Model 2.2

Mixtf

Normals

Model 2.3

Bickel-Doksu

odel 2.4

hn-Draper

Model 2.5

Yeo-Johnson

Model 2.6

Manly-John-

Draper

Model 2.7

anly- Bickel-

Doksum

Model 2.8

Yhnson-

John-Draper

Model 2.

Yeo-Johns

Bickel-Doksumiolations

xpec

n, Model 2.6 is t

(b)

st performing o

VaR Mure om

Jo Meo-Jo

on- Expected

Sensex 22 20 24 23 22 19 19 19 12.5 19

Nifty18 22 21 21 18 18 18 12.5

DJI 25 28 28 25 22 22 22 12.5

FTSE23 24 24 24 23 23 23 12.5

19 12.5

Nikkei 24 19 23 23 19 17 17 18 18 12.5

22 18

33 22

25 23

H S I 23 19 21 21 23 19 19 19

This table shows the VaR violation comparisons for the Model 2 serie numb is given in the last column, 97.5% VaR is expected

to be5 tim-of- As 2.6 aes

95% V

aR odel 2.1

EVT

Mod2

Mixture of

Nor

Model 2.3

Bickel-Doksum

odel 2.4

hn-Draper

Model 2.5

Yeo-Johnson

Model 2.6

Manly-John-

Draper

Model 2.7

Manly-

ckel-Doksum

Yeo-Johnson-

Japer

Model 2.

Yeo-Johnson-

Bickel-Doks

Expected

Violations

s. The expected

can be seen, Models

er of violations

nd 2.7 are the b violated 12.es for a 500 point outsample data set.t performing ones.

(c)

Mel 2.

mals Jo

Model 2.

ohn-Dr

Sensex 34 38 36 36 32 32 25 34 34 32

Ni34 43 36 36 33 32 25

D51 57 52 52 45 43 25

fty 34 34 33

JI 49 47 45

FTSE 35 38 37 37 34 34 34 33 32 25

H S I 31 37 35 37 34

Nikkei 37 44 43 3

34 33 34 33 25

35 34 34 34 25

Thclations is given in the last column, 95% VaR is expected to

del 2.9 is the best performing one.

is table shows the VaR violation comparisons for the Model 2 series. The expeted number of vio

be violated 25 times for a 500 point out-of-sample data set. As can be seen, Mo

The models are tested on six equity indices, Sensex, Nifty, DJI, FTSE, HSI and Nikkei. The data used is the closing value of these indices from the period March 2003 to February 2009. The data was obtained from www.finance.yahoo.com, and the time period includes the stock market crash of 2008. The details regarding the returns of the series, and the first four moments are given in Table 1.

Table 5. (a) LR Test for 99% VaR violations for model 1 series; (b) LR Test for 97.5% VaR violations for model 1 series; (c) LR Test for 95% VaR violations for model 1 series.

(a)

99% VaR    Model 1.1 Normal    Model 1.2 T    Model 1.3 Pearson Type IV    Model 1.4 Johnson SU    Model 1.5 Manly
Sensex     15.47               15.47          0.72                         0.72                    1.54
Nifty      15.47               10.99          1.54                         1.54                    8.97
DJI        31.78               25.91          2.61                         2.61                    5.42
FTSE       23.13               28.80          8.97                         8.97                    17.90
HSI        13.16               8.97           0.19                         0.19                    3.91
Nikkei     5.42                8.97           0.72                         0.72                    2.61

This table shows the LR test statistic for the Model 1, 99% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

(b)

97.5% VaR  Model 1.1 Normal    Model 1.2 T    Model 1.3 Pearson Type IV    Model 1.4 Johnson SU    Model 1.5 Manly
Sensex     14.66               13.02          0.92                         0.48                    4.94
Nifty      8.59                8.59           0.92                         0.48                    4.94
DJI        26.01               13.02          6.06                         6.06                    7.28
FTSE       16.38               18.16          9.98                         9.98                    8.59
HSI        8.59                7.28           3.92                         3.00                    6.06
Nikkei     16.38               16.38          0.18                         0.18                    2.19

This table shows the LR test statistic for the Model 1, 97.5% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

(c)

95% VaR    Model 1.1 Normal    Model 1.2 T    Model 1.3 Pearson Type IV    Model 1.4 Johnson SU    Model 1.5 Manly
Sensex     6.18                6.18           0.64                         0.37                    2.46
Nifty      8.08                6.18           0.64                         1.90                    4.51
DJI        28.67               19.18          10.19                        9.11                    10.19
FTSE       6.18                7.10           4.51                         4.51                    9.11
HSI        5.32                3.08           2.46                         2.46                    2.46
Nikkei     15.04               11.33          2.46                         2.46                    6.18

This table shows the LR test statistic for the Model 1, 95% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e. the observed violations is equal to the predicted one, is rejected.

5.2. VaR Violations and Comparison of Models

We test the effectiveness of each model by calculating the number of times the calculated VaR has been violated. The expected number of violations for a q-percentile VaR is given by

$$\text{Expected } q\%\ \text{VaR violations} = (1 - q\%)\,N \tag{69}$$

where N is the total number of VaR measurements.

We measure VaR for each out-of-sample data point, therefore, N = 500. We calculate 95%, 97.5% and 99% VaR for each data point. Therefore, the expected violations for each would be 25, 12.5 and 5 respectively. Tables 3(a)-(c) compare the five models of the Model 1 series, comparing VaR violations for the six equity

indices. Tables 4(a)-(c) compare the same for the nine models of the Model 2 series. The expected violations for 99%, 97.5% and 95% VaR are given in the last column of each table. The mean violation for each model is computed, and the best model for each percentile VaR is found.

It can be seen that Models 1.3 and 1.4 are best performing models across all indices. Amongst those of the Model 2 series (where ARMA (1, 1)-GARCH (1, 1) parameters are calculated with the pseudo-normal assumption) however, Models 2.6, 2.7, 2.8 and 2.9 perform best. This is expected from the skewness-kurtosis Table 2, where the most normalized transformations are shown to be Manly-John-Draper, Manly-Bickel-Doksum, Yeo-John-

Table 6. (a) LR Test for 99% VaR violations for model 2 series; (b) LR Test for 97.5% VaR violations for model 2 series; (c) LR Test for 95% VaR violations for model 2 series.

(a)

99% VaR    Model 2.1 EVT    Model 2.2 Mixture of Normals    Model 2.3 Bickel-Doksum    Model 2.4 John-Draper    Model 2.5 Yeo-Johnson    Model 2.6 Manly-John-Draper    Model 2.7 Manly-Bickel-Doksum    Model 2.8 Yeo-Johnson-John-Draper    Model 2.9 Yeo-Johnson-Bickel-Doksum

Sensex 0.72 0.72 13.16 8.97 8.97 0.72 0.72 0.72 0.72

Nifty 2.61 2.61 8.97 7.11 8.97 2.61 3.91 2.61 3.91

DJI42 90 10.99 13.3.91 5.42 3.91

FTSE 10.99 15.20.46 20.46 15.10.99 13.16 8.97

H S I0.19 2.61 1.54 3.0.19 0.19 0.19

Nikkei 1.54 3.91 7.11 3.54 1.54 1.54

5.13.16 17.16 5.42

47 47 13.16

8.97 91 0.19

2.61 91 1.1.54

This tas the LR test sta for the Model 2, VaR violation obser The numbers in bold indications where the null hysis, i.e.

the obiolations is eque predicted one,cted.

aR Model 2. 1

EVT Mixture of

Normals

Model 2.3

Bickel-Doksum

Model 2.4

John-Draper

Model 2.5

Ye nson

Model 2.6

Manly-

John-Draper

Model 2.7

Manly-

Bickel-Doksum

Model 2.8

Yeo-Johnson-

John-Draper

Model 2.9

Yeo-Johnson-

Bickel-Doksum

ble showtistic 99%vations.te situapothe

served val to th is reje

(b)

Model 2.2

99% Vo-Joh

Se92 828 00 nsex 3.6.06 .59 7.6.06

3.00 3.00 3.3.00

Nif2.19 6.06 4.94 4.94 2.19 2.19 2.19 19

DJI 9.98 214.66 14.66 9.98 6.06 6.06 6.06 06

FTSE 7.28 8.59 8.59 8.59 7.28 7.28 7.28 28

H S3.00 4.94 4.94

7.2 3.00 3.00 3.00 00

Nikkei 3.00 7.28 7.28

3.00 1.50 1.50 2.19 19

ty 6.06 2.

3.95 6.

9.98 7.

I 7.28 8 3.

8.59 2.

This tas the LR test sr the Model 2,VaR violation obser The numbers in bold indtuations where the nuthesis, i.e.

(c)

99% VaR Model 2.1 Model 2.2

Mixture of Model 2.3

el-

Model 2.4

-Dra

Model 2.5 Model 2.6

Manly-

Model 2.7

Manly-Bickel-

Model 2.8

Yeo-Johnson-

aper

Model 2.9

Yeo-Johnson-

ble showtatistic fo 97.5% vations.icate sill hypo

the observed violations is equal to the predicted one, is rejected.

EVT Normals BickDoksum JohnperYeo-Johnson John-Drap Doksum John-Dr Bickel-Doksum

Sensex 3.08 64.51 4.51

3.08 3.08 1.90 1.90

.18 1.90

Nif3.08 14.51 4.51

3.08 3.08 2.46 2.46

DJI 22.17 323.7323.73 19.16.37 13.75 13.75

FTS3.77 65.32 5.32

3.08 3.08 3.08 2.46

H S1.41 53.77 5.32 3.08 3.08 2.46 3.08

Nik 5.32 111.3311.33 6.18 3.77 3.08 3.08

ty 1.33 1.90

2.16 18 11.33

E .18 1.90

I .32 2.46

kei 2.52 3.08

This table shows the LR test statistic for the Model 2, 95% VaR violation observations. The numbers in bold indicate situations where the null hypothesis, i.e.

e observed violations is equal to the predicted one, is rejected. th

son-John-Draper and Yeo-Johnson-Bickel-Doksum. Model 2.1 performs well too, especially for higher VaR estimation.

In order to test the observed VaR numbers, we use Kupiec’s test to determine if the observed VaR violations are significantly different from their expected values. The test is based on the fact that the number of violations $N$ in a sample of size $T$ is binomially distributed as $N \sim B(T, p)$. Thus, the probability of $N$ excesses occurring over a $T$ day period is given by

$$\binom{T}{N} p^N (1 - p)^{T - N}$$

where $p$ is the probability of exceeding VaR on a given day. Under the null hypothesis that $N/T = p$, we calculate the Likelihood Ratio (LR) test statistic

$$LR = -2\ln\left[(1 - p)^{T - N} p^N\right] + 2\ln\left[\left(1 - \frac{N}{T}\right)^{T - N}\left(\frac{N}{T}\right)^N\right] \tag{70}$$

The test statistics for the VaR violation observations are given in Tables 5(a)-(c) for the Model 1 series, and in Tables 6(a)-(c) for the Model 2 series. The values in bold are those where the observed VaR violations are significantly different from the expected values.
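The Kupiec likelihood ratio statistic is simple to compute directly, as the sketch below shows. The function name is illustrative. The critical value 3.841 is the 95% point of a chi-square distribution with one degree of freedom; the significance level is an assumption here, since it is not stated in this section.

```python
import math


def kupiec_lr(n_violations, n_obs, p):
    """Kupiec proportion-of-failures LR statistic for n_violations
    exceedances in n_obs forecasts, with expected exceedance probability p."""
    def loglik(prob):
        # Binomial log likelihood without the combinatorial term,
        # using the convention 0 * log(0) = 0.
        ll = 0.0
        if n_violations > 0:
            ll += n_violations * math.log(prob)
        if n_obs - n_violations > 0:
            ll += (n_obs - n_violations) * math.log(1.0 - prob)
        return ll

    return -2.0 * (loglik(p) - loglik(n_violations / n_obs))


CHI2_1DF_95 = 3.841  # assumed 5% significance level

# 16 and 7 violations of 99% VaR in 500 forecasts (the Sensex counts in Table 3(a)).
for violations in (16, 7):
    lr = kupiec_lr(violations, 500, 0.01)
    print(violations, round(lr, 2), "reject" if lr > CHI2_1DF_95 else "do not reject")
```

For 16 and 7 violations the statistic evaluates to about 15.47 and 0.72, matching the Sensex entries reported for the 99% VaR LR test of the Model 1 series.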

6. Conclusions

In this work, we build different models for accurate measurement of Value at Risk. We use an ARMA (1, 1) process to model conditional expectation, and a GARCH (1, 1) process to model conditional variance. Models 1.x calculate parameters for the above processes without the pseudo-normal assumption, while Models 2.x calculate them with the pseudo-normal assumption. The following conclusions can be made from the results.

• Models 1.3 and 1.4 (GARCH-PIV and GARCH-JSU) are far and away the best performers among all the models. Their consistency can be seen across indices and VaR percentiles.

• Among the models which use the pseudo-normal assumption, Models 2.6, 2.7, 2.8 and 2.9 perform the best. These use two transformations to normalize the standardized innovations: the first one makes the distribution symmetric, while the second one reduces the kurtosis.

• Model 2.1 (GARCH-EVT) performs well for high percentile VaR estimates.

• Computationally, the Model 2.x series are slightly faster than Models 1.3 and 1.4, but the difference of a few seconds does not mandate using them in the place of the more accurate GARCH-PIV and GARCH-JSU models.

REFERENCES

[1] E. Fama, “The Behavior of Stock Prices,” Journal of Business, Vol. 47, No. 1, 1965, pp. 244-280.

[2] B. B. Mandelbrot, “The Variation of Certain Speculative Prices,” Journal of Business, Vol. 36, No. 4, 1963, pp. 394-419. doi:10.1086/294632

[3] R. Blattberg and N. Gonedes, “A Comparison of Stable and Student Distributions as Statistical Models of Stock Prices,” Journal of Business, Vol. 47, 1974, pp. 244-280. doi:10.1086/295634

[4] C. A. Ball and W. N. Torous, “A Simplified Jump Process

for Common Stock Returns,” Journal of Financial and Qu-

antitative Analysis, Vol. 18, No. 1, 1983, pp. 53-65.

doi:10.2307/2330804

[5] S. J. Kon, “Models of Stock Returns: A Comparison,” Journal of Finance, Vol. 39, No. 1, 1984, pp. 147-165. doi:10.2307/2327673

[6] J. B. Gray and D. W. French, “Empirical Comparisons of Distributional Models for Stock Index Returns,” Journal of Business Finance and Accounting, Vol. 17, No. 3, 1990, pp. 451-459. doi:10.1111/j.1468-5957.1990.tb01197.x

[7] M. Bhattacharyya, A. Chaudhary and G. Yadav, “Conditional VaR Estimation Using Pearson Type IV Distribution,” European Journal of Operational Research, Vol. 191, No. 1, 2008, pp. 386-397. doi:10.1016/j.ejor.2007.07.021

[8] M. Bhattacharyya, N. Misra and B. Kodase, “Max VaR for Non-Normal and Heteroskedastic Returns,” Quantitative Finance, Vol. 9, No. 8, 2009, pp. 925-935. doi:10.1080/14697680802595684

[9] M. Bhattacharyya and G. Ritolia, “Conditional VaR using EVT—Towards a Planned Margin Scheme,” International Review of Financial Analysis, Vol. 17, No. 2, 2008, pp. 382-395. doi:10.1016/j.irfa.2006.08.004

[10] R. F. Engle, “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation,” Econometrica, Vol. 50, No. 4, 1982, pp. 987-1007. doi:10.2307/1912773

[11] T. Bollerslev, “Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics, Vol. 31, No. 3, 1986, pp. 307-327. doi:10.1016/0304-4076(86)90063-1

[12] S. Poon and C. Granger, “Forecasting Volatility in Financial Markets,” Journal of Economic Literature, Vol. 41, No. 2, 2003, pp. 478-539. doi:10.1257/002205103765762743

[13] J. Heinrich, “A Guide to the Pearson Type IV Distribution,” 2004. http://www-cdf.fnal.gov/publications/cdf6820_pearson4.pdf

[14] N. L. Johnson, “Systems of Frequency Curves Generated by Methods of Translation,” Biometrika, Vol. 36, No. 1-2, 1949, pp. 149-176. doi:10.1093/biomet/36.1-2.149

[15] A. L. Tucker, “A Reexamination of Finite and Infinite Variance Distributions As Models of Daily Stock Returns,” Journal of Business & Economic Statistics, Vol. 10, No. 1, 1992, pp. 73-81. doi:10.2307/1391806

[16] J. D. Hamilton, “A Quasi-Bayesian Approach to Estimating Parameters for Mixtures of Normal Distributions,” Journal of Business and Economic Statistics, Vol. 9, No. 1, 1991, pp. 27-39. doi:10.2307/1391937

[17] D. M. Titterington, A. F. M. Smith and U. E. Makov, “Statistical Analysis of Finite Mixture Distributions,” John Wiley & Sons, Chichester, 1992.

[18] J. Hull and A. White, “Value at Risk When Daily Changes in Market Variables Are Not Normally Distributed,” Journal of Derivatives, Vol. 5, No. 3, 1998, pp. 9-19. doi:10.3905/jod.1998.407998

[19] P. Zangari, “An Improved Methodology for Measuring

VaR,” Risk Metrics Monitor, Reuters/JP Morgan, 1996.

[20] G. E. P. Box and D. R. Cox, “An Analysis of Transformations,” Journal of the Royal Statistical Society, Vol. 26, No. 2, 1964, pp. 211-252.

[21] B. F. J. Manly, “Exponential Data Transformations,” The Statistician, Vol. 25, No. 1, 1976, pp. 37-42. doi:10.2307/2988129

[22] P. Li, “Box Cox Transformations: An Overview,” Uni-

versity of Connecticut, Storrs, 2005.

[23] P. J. Bickel and K. A. Doksum, “An Analysis of Transformations Revisited,” Journal of the American Statistical Association, Vol. 76, 1981, pp. 296-311. doi:10.2307/2287831

[24] J. A. John and N. R. Draper, “An Alternative Family of

Transformations,” Applied Statistics, Vol. 29, No. 2, 1980,

pp. 190-197. doi:10.2307/2986305

[25] I.-K. Yeo and R. Johnson, “A New Family of Power Trans-

formations to Improve Normality or Symmetry,” Biome-

trika, Vol. 87, No. 4, 2000, pp. 954-959.

doi:10.1093/biomet/87.4.954

[26] W. K. Newey and D. G. Steigerwald, “Asymptotic Bias

for Quasi-Maximum-Likelihood Estimators in Condi-

tional Heteroskedasticity Models,” Econometrica, Vol. 65,

No. 3, 1997, pp. 587-599. doi:10.2307/2171754

[27] P. G. Perez, “Capturing Fat Tail Risk in Exchange Rate Returns Using SU-Curves: A Comparison with Normal Mixture and Skewed Student Distributions,” Journal of Risk, Vol. 10, No. 2, 2007-2008, pp. 73-100.

Appendix

In the Model 1.5 used, the returns series $r_t$ is modeled as follows

tttt





 (71)

We assume that t

is a distribution such that when

transformed through Manly’s exponential transformation

(Equation (22)) it becomes normal.

 

,~ZTXZN,





 (72)



1/ 2

2exp d

z am











  (73

















)

The lower limit is given by $-1/\lambda$ since, as $X$ varies from $-\infty$ to $\infty$, $T(X)$ varies from $-1/\lambda$ to $\infty$. In other words, it is impossible for $Z$ to take on a value less than $-1/\lambda$.

 



PzaPxTa



  (74)

This arises since the Manly’s transformation is one-to-

one and monotonically increasing. We name





bTa





and proceed



()

exp d

2π

Px bm































 (75)

We name , and follows.



mTn



ddmTnn





  

exp d

2π

bTn

PxbT nn





























(76)

We need to add a normalizing constant to the equ-

ation, such that



1Px .

  

exp d

2π

PxT nn

















 











(77)

Substituting









,







, and

changing limits from to



,1,























exp d

Pxw w















 







(78)



exp d

Pxw w





















 



expd

kww









k



(79)

erf















(80)





The innovations are related to the standardized inno-

vations by ttt







. We also assume that the trans-

formed standardized residuals have zero mean and unit

standard deviation. Therefore



 

1erf2

exp 1

expexp





























(81)

Since





()

PaPah



 

,



 

1erf 2

exp 1

expexp







































(82)

The log likelihood function to be minimized, is there-

fore

loglog 2π

LLF h

erf









 



























 









(83)