J. Biomedical Science and Engineering, 2009, 2, 294-303
doi: 10.4236/jbise.2009.25044 Published Online September 2009 (http://www.SciRP.org/journal/jbise/
JBiSE
).
Published Online September 2009 in SciRes. http://www.scirp.org/journal/jbise
Sleep spindles detection from human sleep EEG
signals using autoregressive (AR) model: a surrogate
data approach
Venkatakrishnan Perumalsamy1, Sangeetha Sankaranarayanan2, Sukanesh Rajamony3
1Department of Information Technology, Thiagarajar College of Engineering, Madurai, India; 2Department of Electrical and Elec-
tronics Engineering, Sethu Instituite of Technology, Madurai, India; 3Department of Electronics and Communication Engineering,
Thiagarajar College of Engineering, Madurai, India.
Email: pvkit@tce.edu; viggee_tce2000@yahoo.co.in; rshece@tce.edu
Received 5 May 2009; revised 19 July 2009; accepted 29 July 2009.
ABSTRACT
A new algorithm for the detection of sleep
spindles from human sleep EEG with surrogate
data approach is presented. Surrogate data ap-
proach is the state of the art technique for
nonlinear spectral analysis. In this paper, by
developing autoregressive (AR) models on
short segment of the EEG is described as a
superposition of harmonic oscillating with
damping and frequency in time. Sleep spindle
events are detected, whenever the damping of
one or more frequencies falls below a prede-
fined threshold. Based on a surrogate data, a
method was proposed to test the hypothesis
that the original data were generated by a linear
Gaussian process. This method was tested on
human sleep EEG signal. The algorithm work
well for the detection of sleep spindles and in
addition the analysis reveals the alpha and beta
band activities in EEG. The rigorous statistical
framework proves essential in establishing
these results.
Keywords: AR Model; LPC; Sleep Spindles; Sur-
rogate Data
1. INTRODUCTION
Oscillatory signal activities are ubiquitous in the bio-
medical signals [1]. Multielectrode recordings provide
the opportunity to study signal oscillations from a net-
work perspective. To assess signal interactions in the
frequency domain, one often applies methods, such as
ordinary coherence and Granger causality spectra [2]
that are formulated within the frame work of linear sto-
chastic process. Electroencephalogram (EEG) is one of
the most important electrophysiological techniques used
in human clinical and basic sleep research. In 1979 Bar-
low proposed linear modeling system which has a long-
lasting history in EEG analysis [3]. The models are
mainly considered as a mathematical description of the
signal and less as a biophysical model of the underlying
neuronal mechanisms.
In 1985, Frannaszczua et al. [4] proposed a model to
interpret linear models as damped harmonic oscillators
generating EEG activity based on the equivalence be-
tween stochastically driven harmonic oscillators and
autoregressive (AR) models. There is a unique transfor-
mation between the AR coefficients and the frequencies
and damping coefficients of the corresponding oscilla-
tors. In particular at times when the EEG is dominated
by a certain rhythmic activity e.g. in the case of sleep
spindles or alpha activity, on might expect, that this ac-
tivity will be rejected by a pole with a corresponding
frequency and low damping. This idea was the staring
point of our analysis [5].
The sleep EEG is always not stationary. However, we
demonstrated that the effects of non stationary become
relevant only with scales longer than 1s [6]. Therefore,
short segments with duration of around 1s are suffi-
ciently described by linear models. The non stationary in
longer time scales might be rejected by the variation of
the AR-coefficients and thus by the corresponding fre-
quencies and damping coefficients. Based on the above
considerations we propose an easy way to define oscil-
latory events. They are detected, whenever the damping
of one of the poles of a 1s AR model is below a prede-
fined threshold.
The method of surrogate data is a tool to test whether
data were generated by some class of model. In 1992 the
method of surrogate data proposed by Theiler et al. [7] is a
general procedure to test whether data are consistent with
some class of models. In order to test the hypothesis that
the data are consistent with being generated by a linear
system, the Fourier Transform (FT) algorithm is applied.
Based on a example and the theory of linear stochastic
systems we will show that this algorithm produce correct
V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303 295
SciRes Copyright © 2009 JBiSE
distribution of time series and therefore might not gener-
ally yield the correct distribution of the test statistics. The
surrogate data method differs from a simple Monte Carlo
implementation of a hypothesis test in that it tests not
against a single model, but a class of models, i.e. linear
systems driven by Gaussian noise. The idea is to select
single model from the class on the basis of the measured
data x, and then do a Monte Carlo hypothesis test for the
selected model and the original data.
The statistical properties of a time series generated by
a linear process are specified by the autocovariance
function (ACF) or equivalently by its Fourier Transform,
the power spectrum X (ω). The purpose of this study is
described as even without a rigorous mathematical
foundation it is possible to detect sleep spindles as well
as alpha and beta activities from human sleep EEG. The
theoretical framework of AR model and the procedure to
generate the surrogate data are given in Section 2. The
method is then tested on one simulation data set in Sec-
tion 3. Section 4 describes sleep spindles detection using
AR model with surrogate data approach. Results are
discussed and summarized in Section 5.
2. METHODS
In this section, we start with AR model of order p and
then proceed to outline the procedure for generating
surrogate data.
2.1. AR Model
The Parametric description of the EEG signal by means
of the AR model makes possible estimation of the trans-
fer function of the system in the straight forward way.
From the transfer function it is easy to find the differen-
tial equation describing the investigated process. Our
detection algorithm is based on modeling 1s segments of
the EEG time series using autoregressive (AR) models
of order p. From the AR (p)-model

p
jnjnj xa
0 (1)
where
j
a - Coefficients of the model (=1)
0
a
n
x - The value of the sampled signal at the moment n
n
- Zero mean uncorrelated white noise process.
Applying the Z-transform to Eq.1 we obtain:
)()()( zEzXzA
(2)
where
p
j
j
jzazA 0
)( (3)
X(z)-the Z-transform of the signal x
E(z)-the Z-transform of the noise.
If the system is stable, there exists A-1 (z) and we get
)()()( 1zEzAzX
(4)
In the z domain this filter is expressed by the Eq.4 where
A-1 (z) is the transfer function. Denoting it by H(z) and
writing if explicitly we obtain:
  p
j
j
jzaZAzH 0
1/1)()( (5)
Multiplying numerator and denominator p
z
we get
p
j
jp
j
pzazzH 0
/)( (6)
Factorizing the denominator gives the formula
 p
jj
pzzzzH 1)(/)( (7)
where
ki
kj erz
Using the above formula to estimate the frequencies
)2/(
kk
f and damping coefficients
(
kk rln
1

Denotes the sampling interval).
We assume that there are only single poles of H (Z)
which can be written in the form
 p
jjj zzzczH 1)/()( (8)
For the single pole coefficients can be found ac-
cording to the formula:
j
c
z
zHzz
cj
zzj j
)()(
lim
(9)
By means of the inverse transform –z-1 from the Eq.8
the impulse response of the system: h (n) can be found.
Since from the properties of the z-1 transform we know
that we obtain:
).exp())/(
1
ij zlnnzzzz 
 p
jjj zlnnczHznh 1
1).exp())(()( (10)
If the sampling interval t was chosen according to the
Nyquist theorem we can express the impulse response as
continuous function and write it in the form:
p
jjj zln
t
t
cth1).exp()(
p
jjj tac
1)exp( (11)
where
t = t*n t
zln
aj
j
Laplace transform of the Eq.11 which corresponds to the
transfer function H(s) of the continuous system is given
by the formula:
p
j
j
jas
csH 1
1
)( (12)
The above expression can be obtained directly from Eq.8
296 V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303
SciRes Copyright © 2009 JBiSE
by means of the integral transform z. Eq.12 can be written
as a ratio of two polynomials of the order p-1 and p
o
p
p
o
p
p
cscsc
bsbsb
sH 

1
1
1
1
)(
(13)
The polynomial coefficients can be readily calculated
from and . This form of the transfer function was
found also by Freeman 1975. It leads directly to the dif-
ferential equations describing the system. The transfer
function is the ratio of the Laplace transform of input y(t)
and output x(t) functions:
j
cj
a
))((
))((
)(
)(
)( tyL
txL
sY
sX
sH  (14)
Since variable s corresponds to the operator d
d, from
Eq.13 and 14 we get:
)()()( 0
1
1
1txctx
dt
d
ctx
dt
d
cp
p
p
)()()( 0
1
1
1
1
1
tybty
dt
d
bty
dt
d
p
p

(15)
In this way we have obtained the differential equation
describing the system which is free of the arbitrary pa-
rameters. Its order is determined by the characteristic of
the signal and may be found from the criteria based on
the principle of the maximum of entropy.
2.2. Surrogate Data Method
Generally, a surrogate data testing method involves three
ingredients: 1) a null hypothesis; 2) a method to generate
surrogate data; and 3) testing statistics for significance
evaluation. The null hypothesis in the present study is
that investigated data from a linear Gaussian process. If
the null hypothesis is rejected, then we conclude that the
data are either non-Gaussian or come from nonlinear
process. The surrogate data are generated in such a way
that it is Gaussian distributed but has the same second
order spectral properties (in the bivariate, auto spectra)
as the original data. The testing statistics the amplitude
of the peridogram we will give the steps for generation
the surrogate data and the theoretical rationale behind
the steps [8].
Consider two zero mean stationary random processes
x(t) and y(t). These processes may or may not be linear
Gaussian processes. Their one-sided auto spectra Sxx and
Syy can be estimated [9].
]|),([|
1
lim2)( 2
TfXE
T
fS kTxx 
]|),([|
1
lim2)( 2
TfYE
T
fS kTyy
(16)
where T is the duration of the data, the expectation is
taken over multiple realizations and Xk and Yk are the
Fourier Transform of x(t) and y(t). Now we consider
how to generate two zero-mean linear Gaussian proc-
esses )(tx
and )(ty
which have the same second order
statistical properties as x(t) and y (t). Let k
X
and k
Y
expressed in real and imaginary parts, ,
R
XI
X
, R
Y
and I
Y
, these Fourier Transforms are
)(.)()( fXjfXfX IRk
)(.)()( fYjfYfYIRk
(17)
Since and
)(
'tx )(ty
are normally distributed with zero
mean values, R
X
, I
X
, and are also nor-
mally distributed with zero means [9]. The covariance
matrix () of
R
Y
R
X
I
Y
, I
X
, and should be se-
lected such that the auto-spectra and cross spectra of
R
YI
Y
)(tx
and )t(y
are the same as those of the original data,
i.e, xxxx SS
and yyyySS
In practice, Sxx and Syy estimated from the data. The
question is how to draw Gaussian variables for R
X
,
I
X
, R
Y
and I
Y
such that and
xxxx SS
yy
S
yy
S
.
It can be shown that the relation between R
X
, I
X
,
R
Y
and I
Y
and xx
S
and are as follows [10].
yy
S
0][][

IRIR YYEXXE
xxIIRR S
T
YYEXXE

 4
][][
yyIIRR S
T
YYEYYE

 4
][][
(18)
According to these relations the covariance matrix ()
for the real and imaginary parts of and
k
Xk
Y
for
each frequency is


xx
xx
xIR
I
R
S
S
T
XX
X
X
E0
0
4


yy
yy
yIR
I
R
S
S
T
YY
Y
Y
E0
0
4
(19)
Choosing xxxx SS
and and
yy
S
yy
S
R
X
, I
X
,
R
Y
and I
Y
are then generated by sampling from a
Gaussian distribution with zero mean and covariance ma-
trix. Once R
X
, I
X
, and are generated for
each frequency, the surrogate data and
R
Y
x
I
Y
)(t
)(ty
are the
inverse Fourier Transform of )(.)( fXjfXIR
)f(
k
X
and )(.)()( fXjfXfXIRk
V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303 297
SciRes Copyright © 2009
3.1. Linear Bivariate AR Model Driven by
Gaussian White Noise
For easy implementation, we summarize the earlier
analysis as follows.
Step 1) Estimate and for original data ac-
cording to (16)
xx
Syy
SThe model is written as
)()5(6.0)(
)()2(5.0)1(*8.0)(
ttxty
ttxtxtx


(21)
Step 2) Calculate Covariance matrix according to (19)
Step 3) Draw values (realizations) of R, I
XX
, R
Y
and I from the Gaussian processes with zero mean
and covariance matrix () for each frequency. This can
be done for example Matlab function based on the zig-
gurat method [11].
Ywhere )(t
and )(t
are uncorrelated Gaussian white
noise with zero means and unit variances. The data set
consists of M=50 realizations where each realization is
of length K=512. For sampling frequency of 128Hz,
each realization has the duration of 4s. To perform the
test P =1500 surrogate data sets were generated follow-
ing the procedure in Section 2.
Step 4) Take the inverse Fourier Transform of k
X
and k to obtain the surrogate data andY)(tx )(ty
. To
ensure surrogate data are real valued, the negative fre-
quency parts of k and k
Y are taken as the complex
conjugate of the positive frequency parts.
X The one sided power spectra xx and yy for both
original (solid curve) and one set of surrogate data (dot-
ted curve) are shown in Figure 1. It is seen that the sur-
rogate data’s spectra nearly match well with that of the
original data. This is an expected result. The maximum
amplitude for x(t) original power spectra is 30.1891
[Figure 1(a)] and that for surrogate data x(t) power
spectra is 32.6881. Similarly the maximum amplitude
for y(t) original power spectra is 28.5673 [Figure 1(b)]
and that for surrogate data y(t) power spectra is 29.8921.
Notice that the power spectra amplitude is higher be-
tween 0 to 30 Hz than other regions. Namely, the esti-
mation variances are proportional to the power spectral
amplitudes as those frequencies. Considering that the
surrogate data contour plots are computed based on a
randomly selected data set among P=1500 available,
these maximum value comparisons suggests the known
fact that there are no nonlinear or non-Gaussian compo-
nents in the original data. The histogram and Gaussian
fit of input data x(t) and y(t) and surrogate data x(t) and
y(t) are generated by ziggurat algorithm shown in Figure
2. The PDFs are obtained from these parameters
S S
Step 5) Repeat Steps 3) and 4) to generate multiple
realizations of the surrogate data.
R and I, could be drawn from Gaussian distribu-
tion with zero mean and covariance matrix x. This sim-
plified method is similar to the method proposed by
Timmer [12]. The probability density function for the
extreme value distribution (type I) with location pa-
rameter µ and scale parameter is σ [13].
XY
))exp(exp(
1
 xx
y (20)
The exact PDF is determined once µ and σ are known.
From this distribution, one can determine the threshold
for any desired significance level (i.e. -p value).
3. SIMULATION EXAMPLE
We have performed simulation studies to test the effec-
tiveness of the method proposed before. Matlab “Signal
processing and Spectral analysis toolbox” is used in our
analysis.
Figure 1. One sided power spectra of a) x(t) and x(t) b) y(t) and y(t).The solid curve indicates the result from the original data and
the dotted curve indicates the result from one of the 1500 surrogate data sets.
JBiSE
298 V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303
SciRes Copyright © 2009 JBiSE
according to Eq.20 and are plotted along with the histo-
gram in blue color in Figure 3. Notice that the PDF here
are multiplied by the total number of surrogate data sets
(1500) to match the histogram. From the PDFs, the
threshold for a significance level of p < 0.005. By com-
paring the original data’s maximum power spectra val-
ues with the thresholds, it can be seen from the null hy-
pothesis that the original data coming from a linear
Gaussian processes cannot be rejected, a theoretically
expected result.
(a)
(b)
Figure 2. Histogram and Gaussian fit of the a) original and b) surrogate data.
010 2030405060 7080 90 100
0
5
10
15
20
25
PDF
Gaussian Fit
020 4060 80100120140
0
5
10
15
20
25
30
35
PDF
Gaussian Fit
Figure 3. PDF of original power spectra and the surrogate data power spectra with Gaussian Fit.
V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303 299
SciRes Copyright © 2009 JBiSE
Figure 4. µ and σ versus the number of surrogate data sets.
We have also performed a study to examine whether
fitting an extreme value distribution to the empirical
histogram is a viable approach. The fitted model pa-
rameters µ and σ versus the number of surrogate data
sets used are plotted in Figure 4. It can be seen after
around 500 surrogate data sets, the estimation becomes
linear.
4. SLEEP SPINDLES DETECTION
A seven minutes recording of 9 channels of EEG (C3,
A2, O1, O2, C4, P3, P4, F3, and F4) was used in our
test (16,207 trials) shown in Figure 5. Precise numbers
of sleep stages and standard characteristics derived
form hypnograms are in Table 1. Number of sleep
stages (17s records) are in the first column; data in the
second column are the average values over 5 subjects
(C3A2, C4P4, F3C3, P3O1, P4O2) related to total sleep
time, the beginning of sleep is set as the first appear-
ance of the sleep Stage 2. Sleep efficiencies is the ratio
of time spent in Stages 1-4 or REM sleep to the whole
sleep time, where also movement time and some
awakenings during the night are included; in sleep
medicine this parameter discriminates some sleep dis-
orders. Surrogate data were generated following the
procedure in Section 2.
The EEG derivation C3A2 and C4P4 was analyzed
shown in Figure 6. Oscillatory events are detected, if the
damping coefficients rk falls above a pre defined thresh-
old and hence rk, exceeds the corresponding threshold. In
practice, we use two thresholds, a lower one, ra, to detect
candidate events scanning the EEG with non-overlap-
ping 1-s segments. When ra is crossed we go back to the
previous segment and use a smaller step size of 1/16s
(overlapping 1-s segments). If rk exceeds a second
threshold rb > ra the beginning of an oscillatory events is
detected.
Figure 5. Human sleep EEG signals (9 channels C3, A2, O1, O2, C4, P3, P4, F3, and F4).
300 V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303
SciRes Copyright © 2009 JBiSE
Table 1. Number of sleep stages (first column) and sleep efficiency and average percentage of sleep stages during the
sleep (mean ± standard deviation).
Total 16,207 Sleep Efficiency[%]: 92.6 ± 5.3
Waking 761 % Waking 8.0 ± 3.2
Stage 1 809 % Stage 1 6.9 ± 1.4
Stage 2 8024 % Stage 2 46 ± 7.3
Stage 3 1536 % Stage 3 9.3 ± 4.2
Stage 4 1753 % Stage 4 10.9 ± 2.6
REM Sleep 3217 % REM Sleep 17.9 ± 3.6
Movement time 107 % Movement time 0.7 ± 0.6
(a) (b)
Figure 6. Sleep spindles detection for a) C3A2 and b) C4P4 channel (arrow mark representation).
The oscillatory events are terminated by the time rk is
lower than rb for the last time before it falls below ra.
The frequency and time at the position of the maximal
value rmax are considered as the frequency and occur-
rence of the event respectively. Spindles or Sigma waves
are a poor indicator for sleep onset. Some subjects do
not have discernable sigma waves. These spindles are
more prominent in Stage 2 than Stage 1. In the present
analysis the order of the AR model was set to p=8 and
the threshold \value set to ra=0.75 and rb=0.85 parameter
were chosen in such a way, that clearly visible sleep
spindles were reliable detected by the algorithm. Sleep
stages were visually scored according to standard criteria
in Figure 7.
In Figure 7 shows the detected events from 2 re-
cordings. The distributions show modes in four stages:
in the delta (±1:0-2 Hz), in the fast delta (±2: 2-4 Hz), in
the alpha (®: 8-12 Hz) and sigma (sleep spindles:
11.5-16 Hz) bands. These modes are also evident in the
distribution of the events in the different sleep stages
(Figure 7). Alpha waves predominate in waking, spin-
dles in Stage 2 while the occurrence of delta waves de-
creases from Stage 2 to Stage 4 (deepening of sleep).
Delta waves are the prevalent events in REM sleep and
Stage 1. Note that they also occur in Stage 2 with almost
the same incidence. Events in the alpha frequency range
correspond to continuous alpha activity during waking
and to small amplitude, sometimes spindle-like activity
in NREM sleep. However, in particular during NREM
sleep Stages 3 and 4, slow waves are often not detected
as oscillatory events because they yield a relaxatory pole
(frequency zero) in the AR-model. Table 2 provides de-
tection of different frequency bands and their amplitude
calculated from the Fast Fourier Transform.
Table 2. Different frequency band detection and the corresponding true value determined from one sided Power spectra.
S.No Wave Bandwidth (Hz) Time range (sec) Amplitude (µV)
1 Delta 0, 1-4 0-5 10-30
2 Theta 4-8 5-9 30-70
3 Alpha 8-12 9-13 > 70
4 Beta > 12 13 – 16 < 100
5 Sleep Spindles (sigma) 11.5 – 16 16-20 < 50
V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303 301
SciRes Copyright © 2009 JBiSE
Figure 7. Sleep stages from Stage 1-Stage 4.
The two most important characteristics of EEG ele-
ments are frequency and amplitude. Frequency is in-
verse value of duration of EEG segment. The frequency
range is divided in to four bands: beta (12Hz and
higher), alpha (8-12Hz), theta (4-8Hz) and delta (0,
1-4Hz). Amplitude of EEG is taken as the peak to peak
value. After the magnitude of amplitude EEG signal is
divided into low-voltage, middle and high- voltage
EEG. For delta, theta and alpha bands these values are
following: 10-30 µV for low voltage, 30-70 µV for
middle voltage and above 70 µV for high voltage EEG.
For beta band, the values are lower: below 10 µV low
voltage, 10-25 µV middle voltage, and above 25 µV
high voltage EEG.
Beta activity is typical during wakefulness. Alpha oc-
curs in relaxed state with eyes closed. Theta waves are
dominant in normal wake state in children; in adult peo-
ple theta activity appears only in small amount espe-
cially in drowsiness. Delta waves are present in deep
sleep; in healthy people do not exist in wake state. Sleep
spindles are rhythmic activity in 12-14Hz frequency
range and amplitude below 50 µV. The power spectra
for the original (solid line) and the surrogate (dotted line)
data are shown in Figure 8. Three peaks around 1-14Hz
can be seen, indicating the presence of synchronized
activities at these frequencies. According to the tradi-
tional classification first peak belongs to the delta band
(1-4Hz) and the second peak belongs to the alpha band
(8-12Hz) and third peak belongs to the beta band
(>12Hz). Past research has examined whether such os-
cillatory neural activities self-couple and couple each
other in a nonlinear fashion [17]. We study this issue
with the new surrogate test method.
Figure 8. One sided power spectra estimated recording from
P3O1 electrode on 1s period. The solid line is for the original
data and dotted line is for surrogate data.
302 V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303
SciRes Copyright © 2009 JBiSE
Table 3 provides comparison between E. Olbrich et al.
and our study. However, a few spindles are not detected
by the algorithm with the chosen parameters, i.e. the
maximum of rk remains smaller than the threshold rb.
Therefore, in our method proposed is using a surrogate
data approach needs to be further optimized for more
sensitive spindle detection.
5
. DISCUSSION AND SUMMARY
In this study sleep spindles were detected in 1s over-
lapped segments, wide 0.5-16Hz band containing delta
and theta was used for better eye movement separation.
Alpha activity was excluded as it would increase auto
covariance. With consensus scoring we would expect the
agreement to improve [5]. The cost benefit of this labo-
rious task would have been low. There was no manual
artifact handling and it is possible that some misclassifi-
cations are the results of the artifacts undetected by the
corresponding threshold rk and were generated surrogate
data with ziggurat method. In this algorithm there are
two different thresholds but as it can be seen from Table
3, the amplitude criterion was most important (i.e.
mostly amplitude in REM sleep is less than 50 µV and
NREM sleep is greater than 100 µV). Schwilden et al.
[14] reported that 90% of the EEG did not show any
nonlinearity. They suggested that only under some
pathological conditions, such as epilepsy, the brain sig-
nals manifest nonlinear effects. Other studies [15,16],
have shown that non linear characteristics exist between
theta and gamma EEG signals during short term memory
processing. To settle these debates, one needs carefully
constructed statistical tests. The method proposed here
represents our effort in this direction. We will contrast
our method with other often applied techniques in this
area.
5.1. Comparison with other Surrogate Data
Method
Multiple ways are currently in use to generate surrogate
data sets to test the significance of the sleep spindle de-
tection. In one method, the Fourier Transform (FT) is
estimated for each segment and the phase of the Fourier
Transform is randomized without changing the Fourier
amplitude. The result is then inverse Fourier Trans-
formed to generate a surrogate time series [14,17]. While
the phase randomization destroyed nonlinearity and
leads to linear Gaussian process [18], the Fourier ampli-
tudes are random variables for a stationary process as
well. By keeping them constant, one loses an important
degree of freedom [12]. In contrast, our method, in
which both amplitudes and phases of the Fourier Trans-
form are randomly generated, produces surrogate data
sets that are explicitly Gaussian, linear and share the
same second order statistics with the original data. An-
other method, called amplitude adjusted Fourier Trans-
form (AAFT), has been used in recent studies [19].
AAFT was designed to test the null hypothesis that the
observed time series is a monotonic nonlinear transfor-
mation of a linear Gaussian process. This method has the
same problems as the previous phase randomization
method. In addition, the generated surrogate data sets
usually do not have the same power spectrum as the
original data, leading to false rejections when the dis-
criminating statistics are sensitive to second-order statis-
tical properties [20].
5.2. Final Remarks
We make several additional remarks regarding our
method. First, generating the null hypothesis distribution
for the test statistic is a very time-consuming process.
The application of the extreme value theory mitigates
this problem. Our examples show that the probability
distribution function based on the extreme value theory
fits the empirical distribution well. In agrees with that
from the empirical histogram. Secondly, when the origi-
nal data with length L need to be zero padded to length
N (N > L), the surrogate data we generated have length
N. To make sure that the surrogate data still have the
same power spectra with the original data, the surrogate
data need to be normalized by L, instead of the real data
length N. Third, the periodogram method is used to es-
timate the power spectra. Consequently, all of the limita-
tions associated with the periodogram method were in-
herent in our method, including poor frequency resolu-
tion for short data, fourth when we generated surrogate
data set using ziggurat method does not generate random
numbers effectively in tail regions because Matlab does
not execute greater than 500 function calls [11]. Finally,
T
able 3. Comparison between AR model and AR model with surrogate data approach.
AR-Model (E. Olbrich et al. 2003) This study
S. No. Name of the
Electrode Bandwidth (Hz) Amplitude (µV) Bandwidth (Hz) Amplitude (µV)
1 C3A2 9-15 32 10-16 27
2 C4P4 11-15 25-42 12-14 22-46
3 F3C3
Relaxatory pole in the AR model
(zero frequency) --------- 11-16 118
4 P3O1
Relaxatory pole in the AR model
(zero frequency) ----------- 10.5-15 36
5 P4O2 10-16 <15 12-15 <10
V. Perumalsamy et al. / J. Biomedical Science and Engineering 2 (2009) 294-303 303
SciRes Copyright © 2009 JBiSE
the discrimination power of our statistical test is unclear.
This can be determined empirically by repeating the test
many times on different realization for the data [20]. We
will consider this issue as part of our future research.
In summary, detection of sleep spindles using the sur-
rogate method proposed in this paper is shown to give
accurate results when applied to test the significance of
power spectral amplitudes. It is based on solid statistical
principles and overcomes some weaknesses in previous
methods for the same purpose. It is expected to become
a useful addition to the repertoire of nonlinear analysis
methods for neuroscience and other biomedical signal
processing applications.
6. ACKNOWLEDGEMENTS
The authors wish to express their kind thanks to Ms. Kristina Sus-
makova for providing EEG sleep data and invaluable help and discus-
sions.
REFERENCES
[1] Buzsaki, G., (2006) Rhythms of the brain, Oxford, U. K:
Oxford University Press.
[2] Chen, Y. H., Bressler, S. L., Knuth, K. H., Truccolo, W.
A., and Ding, M. Z., (2006) Stochastic modeling of
neurobiological time series: Power, coherence, granger
causality, and separation of evoked response from ongo-
ing activity, Chaos, 16.
[3] Barlow, J. S., (1979) Computerized clinical electroe-
ncephalography in perspective, IEEE Transactions Bio-
med. Eng., 26 (7), 377–391..
[4] Franaszczuk, P. J. and Blinowska, K. J., (1985) Linear
model of brain electric activity EEG as a superposition of
damped oscillatory models, Biol. Cybern., 53, 19–25.
[5] Olbrich, E. and Achermann, P., (2003) Oscillatory events
in the human sleep EEG detection and properties, El-
sevier Science March., 1–6.
[6] Olbrich, E., Achermann, P., and Meier, P. F., (2002)
Dynamics of human sleep EEG, Neuro computing CNS.
[7] Theiler, J., Eubank, S., Longtin, A., Galdrikian, B., and
Farmer, J. D., (1992) Testing for linearity in time series:
The method of surrogate data, Phys. D., 58, 77–94,.
[8] Wang, X., Chen, Y. H., and Ding, M. Z., (2007) Testing
for statistical significance in Bispectra: A surrogate data
approach and application to neuroscience, IEEE Transac-
tions on Biomedical Engineering, 54(11), 1974–1982.
[9] Wu, W. B., (2005) Fourier transform of stationary proc-
esses, Proc. Amer. Math. Soc., 133, 285–293.
[10] Bendat, J. and Persol, A., (1986) Random data analysis
and measurement procedures, 2nd ed. New York: Wiley.
[11] Marsagli, G. and Tsang, W. W., (2000) The ziggurat
method for generating random variables, J. Stat. Softw.,
5, 1–6.
[12] Timmer J., “On Surrogate data testing for linearity based
on the periodogram”, eprints arxiv: comp-gas/9509003,
1995.
[13] Evans, M., Hastings, N., and Peacock, B., (1993) Statis-
tical distributions, 2nd ed, New York: Wiley.
[14] Schwilden, H. and Jeleazcov, C., (2002) Does the EEG
during isoflurane/alfentanil anesthesia differ from linear
random data, J. Clin. Monit, Comput., 17, 449–457.
[15] Schack, B., Vath, N., Petsche, H., Geissler, H. G., and
Moller E., (2002) Phase coupling of theta-gamma EEG
rhythms during short term memory processing, Int. J.
Psychophys., 44, 143–163,.
[16] Shils, J. L., Litt, M., Skolnick, B. E., and Stecker, M. M.,
(1996) Bispectral analysis of visual interactions in hu-
mans, Electroencephalography Clinical Neurophys., l98,
113–125.
[17] Schanze, T. and Eckhorn, R., (1997) Phase correlation
among rhythms present at different frequencies: Spectral
methods, application to microelectrode recording from
visual cortex and functional implications, Int. J. Psycho-
phyus., 26, 171–189.
[18] Venema, V., Bachenr, S., Rust, H. W., and Simmer, C.,
Statistical characteristics of surrogate data based on
geophysical measurements, Non linear Processes Geo-
phys., 13, 449–466.
[19] Chon, K. H., Raghavan, R., Chen, Y. M., Raghavan, D.
J., and Yip, K. P, (2005) Interactions of TGF-dependent
and myogeneic oscillations in tubular pressure, Amer. J.
Physiol., 288, F298–FF307.
[20] Schreiber, T. and Schmitz, A., (1997) Discrimination
power of measures for nonlinearity in a time series, Phys,
Rev. E, 55, 5443–5447.