J. Biomedical Science and Engineering, 2011, 4, 552-561
doi:10.4236/jbise.2011.48071
Published Online August 2011 (http://www.scirp.org/journal/JBiSE)
Segmentation of MS lesions using entropy-based EM
algorithm and Markov random fields
Ahmad Bijar1, Mahdi Mohamad Khanloo1, Antonio Peñalver Benavent2, Rasoul Khayati1
1Department of Biomedical Engineering, Shahed University, Tehran, Iran;
2Departamento de Estadística, Matemáticas e Informática, Universidad Miguel Hernández, Elche, Spain.
Email: ahmad_bijar@live.com; mahdi.khanlo@gmail.com; a.penalver@umh.es; khayati@shahed.ac.ir
Received 25 June 2011; revised 28 July 2011; accepted 8 August 2011.
ABSTRACT
This paper presents an approach for fully automatic segmentation of MS lesions in fluid attenuated inversion recovery (FLAIR) magnetic resonance (MR) images. The proposed method estimates a Gaussian mixture model with three kernels representing cerebrospinal fluid (CSF), normal tissue, and multiple sclerosis lesions. To estimate this model, an automatic entropy-based EM algorithm is used to find the best estimated model. Then, a Markov random field (MRF) model and the EM algorithm are utilized to obtain and update the class conditional probability density function and the a priori probability of each class. After estimation of the model parameters and the a priori probabilities, brain tissues are classified using Bayesian classification. To evaluate the proposed method, similarity criteria for different slices related to 20 MS patients are calculated and compared with those of other methods, with manual segmentation as the reference. Also, the volumes of the segmented lesions are computed and compared with the gold standard using the correlation coefficient. The proposed method shows better performance than the previous works reported here.
Keywords: Gaussian Mixture Model; EM; Entropy;
Markov Random Field; Multiple Sclerosis
1. INTRODUCTION
Magnetic resonance imaging (MRI) permits the non-invasive, detailed visualization of internal anatomical structures. The segmentation of MR images into anatomical tissues, fluids, and structures is an area of interest in MRI and is increasingly being used to assess the progression of disease and to evaluate the effect of drug therapy, supplementing traditional neurological disability scales such as the extended disability status scale (EDSS) [1]. Multiple sclerosis (MS) is a chronic, inflammatory, demyelinating disease of the central nervous system (CNS) that primarily affects the white matter (WM). MS lesions appear hyperintense in T2-weighted and hypointense in T1-weighted images with respect to normal white matter intensities. Previous research has shown that the FLAIR (fluid attenuated inversion recovery) sequence provides the most distinctive lesion-healthy tissue differentiation for the segmentation of white matter lesions. The radiological criteria for MS include the number of lesions on the MRI, their locations, and their sizes, and this quantitative information is also crucial for studying the progression of MS lesions and the effect of drug treatments. For a quantitative analysis of a patient's brain scans, human experts are required to identify the multiple sclerosis lesions present in those scans by manual segmentation, which is extremely time consuming. Hence, automatic and semi-automatic methods for segmentation of MS lesions are recommended and have been investigated extensively. Many automatic and semi-automatic approaches have been proposed for segmenting the brain into different tissues, including MS lesions; they employ a variety of methods such as statistical models, fuzzy techniques, neural networks, and fuzzy neural networks. In this paper, we develop a fully automatic method for segmentation of MS lesions that requires no training, atlas, or thresholding steps. Our method can be divided into two main sections. First, the intensities of brain pixels are modeled using a Gaussian mixture model consisting of three kernels representing the CSF, normal tissue, and MS lesion classes. This step starts with only one kernel and uses an entropy-based EM algorithm to estimate the three kernels corresponding to the three mentioned classes in an automatic manner that does not need initial values for parameter estimation. After estimation of the model, we are able to classify brain pixels once the a priori probabilities of the classes are known. The next step is obtaining these a priori probabilities with no training or atlas. Hence, an MRF model and the EM algorithm are applied to update and attain the a priori probabilities as well as the means and variances of each class.
2. METHODS
2.1. Patients and MR Imaging
The proposed method is evaluated on the dataset obtained and used in [2]. The dataset contains 16 female and 4 male patients with an average age of 29 ± 8 years, selected according to the revised McDonald criteria of 2005 [3]. All images were acquired according to the full field MRI criteria of MS [3] in T2-weighted (T2-w), T1-weighted (T1-w), gadolinium-enhanced T1-weighted, and FLAIR sequences in the axial, sagittal, and coronal planes. We selected the FLAIR images, especially the axial ones, with lesions in the deep, periventricular, subcortical, and cortical white matter (supratentorial lesions). The larger lesion load and higher accuracy of FLAIR in revealing these MS lesions were the reasons for this selection [4]. Each image volume (patient data) consisted of 40 slices on average with a 256 × 256 scan matrix. The pixel size was 1 mm², and the slice thickness was 3 mm without any gap.
2.2. Manual Segmentation of MS Lesions and
Brain Segmentation
The segmentation of MS lesions was performed manually by a neurologist and a radiologist on FLAIR images with visual inspection of the corresponding T1-w and T2-w images. These manually segmented images were used as the gold standard [5] to evaluate the performance of the proposed method; images with different lesion volumes were used for this purpose. The brain segmentation was performed using a fully automatic object-oriented approach [6] based on the regional-spatial characteristics of the brain in MR images. First, the original image is converted to a binary image. Then, morphological opening is performed on the binary image and tiny regions are eliminated. Three rectangular masks covering the cerebral regions are produced, and the regions in the binary image which overlap these rectangles are preserved while the rest are eliminated. The final mask is generated by dilating the selected regions and filling tiny holes. Finally, an image which includes only cerebral tissues is obtained by applying the resulting mask to the original image.
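A minimal sketch of this pipeline, assuming scikit-image and SciPy are available; the three rectangular cerebral masks of the original method are simplified here to keeping the largest connected region, so this illustrates the idea rather than the authors' exact procedure:

```python
import numpy as np
from scipy import ndimage
from skimage import filters, morphology

def extract_brain(image: np.ndarray) -> np.ndarray:
    """Sketch of the object-oriented brain extraction in [6] (simplified)."""
    # 1) Convert the original image to a binary image (Otsu threshold assumed).
    binary = image > filters.threshold_otsu(image)
    # 2) Morphological opening and removal of tiny regions.
    binary = morphology.binary_opening(binary, morphology.disk(2))
    binary = morphology.remove_small_objects(binary, min_size=100)
    # 3) Keep the cerebral region (largest connected component here, as an
    #    assumption standing in for the paper's three rectangular masks).
    labels, n = ndimage.label(binary)
    if n == 0:
        return np.zeros_like(image)
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    mask = labels == (1 + int(np.argmax(sizes)))
    # 4) Final mask: dilation of the selected region and hole filling.
    mask = morphology.binary_dilation(mask, morphology.disk(2))
    mask = ndimage.binary_fill_holes(mask)
    # 5) Apply the resulting mask to the original image.
    return np.where(mask, image, 0)
```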
3. PROBLEM FORMULATION
3.1. Gaussian Mixture Model
A finite mixture model is the weighted sum of $M > 1$ component densities $p_m(\mathbf{x})$ in $\mathbb{R}^n$:

$$p(\mathbf{x}) = \sum_{m=1}^{M} \pi_m\, p_m(\mathbf{x}) \qquad (1)$$

where $\pi_m$ corresponds to the weight of each component and satisfies $\pi_m \in [0, 1]$, $m = 1, 2, \ldots, M$, and $\sum_{m=1}^{M} \pi_m = 1$.
For the Gaussian mixture model, each component density $p_m(\mathbf{x})$ is a Gaussian probability density given by

$$p_m(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\, \det(\Sigma_m)^{1/2}} \exp\!\left(-\frac{1}{2} (\mathbf{x} - \mu_m)^T \Sigma_m^{-1} (\mathbf{x} - \mu_m)\right) \qquad (2)$$

where $T$ denotes the transpose operation, $\mu_m$ is the mean vector, and $\Sigma_m$ is the covariance matrix, which is assumed positive definite. Here we encapsulate these parameters into a parameter vector, writing the parameters of each component as $\theta_m = (\mu_m, \Sigma_m)$ to get $\Theta = (\pi_1, \pi_2, \ldots, \pi_M, \theta_1, \theta_2, \ldots, \theta_M)$.
Eq.1 can be rewritten as

$$p(\mathbf{x} \mid \Theta) = \sum_{m=1}^{M} \pi_m\, p(\mathbf{x} \mid \theta_m) \qquad (3)$$
If we knew the component from which $\mathbf{x}$ came, it would be simple to determine the parameters $\Theta$. Similarly, if we knew the parameters $\Theta$, we could determine the component most likely to have produced $\mathbf{x}$. The difficulty is that we know neither.
3.2. Bayesian Classification
Bayesian classification is a probabilistic technique of pattern recognition based on the principle of Bayes decision theory [7], given in Eq.4 below:

$$P(\omega_j \mid x) = \frac{p(x \mid \omega_j)\, P(\omega_j)}{p(x)} \qquad (4)$$

where $x$ is a given feature vector, $\omega_j$ denotes a class, or state of nature, $P(\omega_j)$ is the prior probability of class $\omega_j$, $p(x)$ is the prior probability of the feature vector $x$, $P(\omega_j \mid x)$ is the a posteriori probability that a feature vector should be classified as belonging to class $\omega_j$, and $p(x \mid \omega_j)$ is the conditional probability that a feature vector occurs in a given class $\omega_j$. For the approach here, the feature $x$ consists of one component, the intensity of brain pixels. The quantity $p(x)$ is known as the evidence and serves only as a scale factor that makes the quantity in Eq.4 a true probability, with values between zero and one. So, the maximum a posteriori (MAP) estimate of Eq.4 is used:

$$P(\omega_j \mid x) \propto p(x \mid \omega_j)\, P(\omega_j) \qquad (5)$$
According to Bayesian theory [8], the feature vector $x$ is classified to the class $\omega_j$ whose a posteriori probability given $x$ is the largest among the classes:

$$x \in \omega_j \quad \text{if} \quad p(x \mid \omega_j)\, P(\omega_j) = \max_i\, p(x \mid \omega_i)\, P(\omega_i) \qquad (6)$$
The Bayes decision rule is optimal in the sense of minimizing the probability of error. Such an ideal Bayesian solution can be used only if the distributions $p(x \mid \omega_j)$ and the a priori probabilities $P(\omega_j)$ are known. In the context of classification of brain tissue, the probability models are not known and must therefore be approximated. The performance of the Bayesian classifier is directly related to how well these distributions can be modeled.
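As a concrete illustration, the resulting MAP classifier over scalar intensities takes only a few lines; this sketch assumes the per-class means, standard deviations, and priors have already been estimated:

```python
import numpy as np
from scipy.stats import norm

def map_classify(intensities, means, stds, priors):
    """Assign each pixel intensity to the class maximizing p(x|w_j)P(w_j).

    intensities: (N,) array of brain-pixel gray levels.
    means, stds, priors: length-M arrays of class parameters (assumed known).
    """
    x = np.asarray(intensities, dtype=float)[:, None]   # (N, 1)
    # Class-conditional likelihoods p(x|w_j), Eq.2 with one dimension.
    like = norm.pdf(x, loc=means, scale=stds)           # (N, M)
    # MAP rule, Eqs.5-6: the evidence p(x) is a common factor and drops out.
    return np.argmax(like * priors, axis=1)             # (N,) class labels
```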
3.3. EM Algorithm
The expectation-maximization (EM) algorithm [9] is a general-purpose iterative algorithm for maximum likelihood (ML) estimation. The EM algorithm consists of two steps, the E-step (expectation) and the M-step (maximization), which are alternated until $\sum_{i=1}^{n} \log p(x_i \mid \Theta)$ converges to a local optimum. Let $X = \{x_i,\ i = 1, \ldots, n\}$ be a set of $n$ realizations of a random $d$-dimensional vector $\chi$ with pdf $p(x \mid \Theta)$. The parameterized pdf can be written as a combination of the pdfs of the $M$ components $\theta_m$, $m = 1, \ldots, M$, characterizing the finite mixture model:

$$p(x_i \mid \Theta) = \sum_{m=1}^{M} \pi_m\, p(x_i \mid \theta_m) \qquad (7)$$
The likelihood can be expressed as

$$L(\Theta) = \prod_{i=1}^{n} p(x_i \mid \Theta) = \prod_{i=1}^{n} \sum_{m=1}^{M} \pi_m\, p(x_i \mid \theta_m) \qquad (8)$$
E-step: the data $\chi$ are assumed to be incomplete, and the complete data set $Y = (X, Z)$ is determined by estimating the set of variables $Z = \{Z_1, Z_2, \ldots, Z_M\}$, where each $Z_m$ is an $N$-dimensional vector $(z_m^1, z_m^2, \ldots, z_m^N)^T$. The log-likelihood of the complete data $Y$ is

$$\log p(Y \mid \Theta) = \sum_{i=1}^{N} \sum_{m=1}^{M} z_m^i \log\big(\pi_m\, p(x_i \mid \theta_m)\big) \qquad (9)$$

where

$$z_m^i = P(m \mid x_i, \Theta^t) = \frac{\pi_m^t\, p(x_i \mid \theta_m^t)}{\sum_{l=1}^{M} \pi_l^t\, p(x_i \mid \theta_l^t)} \qquad (10)$$

is the posterior probability and $\Theta^t$ is the parameter estimate obtained after $t$ iterations.
M-step: in this step, the parameters $\Theta^{t+1}$ are determined according to the estimates of the variables $z_m^i$. For Gaussian mixture models this corresponds to re-estimating $\pi_m^{t+1}$, $\mu_m^{t+1}$, and $\Sigma_m^{t+1}$ for each $m$ according to:

$$\pi_m^{t+1} = \frac{1}{N} \sum_{i=1}^{N} z_m^i \qquad (11)$$

$$\mu_m^{t+1} = \frac{\sum_{i=1}^{N} z_m^i\, x_i}{\sum_{i=1}^{N} z_m^i} \qquad (12)$$

$$\Sigma_m^{t+1} = \frac{\sum_{i=1}^{N} z_m^i\, (x_i - \mu_m^{t+1})(x_i - \mu_m^{t+1})^T}{\sum_{i=1}^{N} z_m^i} \qquad (13)$$
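For reference, a compact implementation of Eqs.10-13 for the one-dimensional (intensity) case; this is a plain EM pass and does not include the entropy-based initialization described in Section 7:

```python
import numpy as np
from scipy.stats import norm

def em_gmm(x, pi, mu, var, n_iter=100, tol=1e-6):
    """EM updates (Eqs.10-13) for a 1-D Gaussian mixture.

    x: (N,) samples; pi, mu, var: (M,) initial weights, means, variances.
    """
    x = np.asarray(x, dtype=float)[:, None]                # (N, 1)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step, Eq.10: posterior responsibilities z_m^i.
        w = pi * norm.pdf(x, mu, np.sqrt(var))             # (N, M)
        z = w / w.sum(axis=1, keepdims=True)
        # M-step, Eqs.11-13.
        nm = z.sum(axis=0)                                 # effective counts
        pi = nm / len(x)
        mu = (z * x).sum(axis=0) / nm
        var = (z * (x - mu) ** 2).sum(axis=0) / nm
        # Stop when the log-likelihood converges to a local optimum.
        ll = np.log(w.sum(axis=1)).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, mu, var
```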
4. ENTROPY
In information theory, entropy is a measure of a system's degree of uncertainty. The term entropy was first introduced into the theory of information by Shannon in 1948 [10].
4.1. Shannon Entropy
Let $X = \{x_1, \ldots, x_n\}$ be a discrete random variable with values in $S$ and probability mass function $p(x)$. The information (or uncertainty) of each possible event is

$$I(x_i) = -\log p(x_i) \qquad (14)$$

The mean of $I(X)$ is introduced as the Shannon entropy and is denoted by $H(X)$:

$$H(X) = E[I(X)] = -\sum_{x \in S} p(x) \log p(x) \qquad (15)$$

$H(X)$ varies from zero to $\log |S|$: zero means that there is no uncertainty, whereas $\log |S|$ is reached when all elements of $X$ have equal probabilities, in which case the uncertainty is at its maximum [11].
4.2. Rényi Entropy
Rényi further extended the concept of Shannon entropy and defined the $q$-order ($q > 0$, $q \neq 1$) Rényi entropy [12] of a probability distribution $p(x)$ as

$$H_q(X) = \frac{1}{1 - q} \log \sum_{x \in S} p(x)^q, \quad q \neq 1 \qquad (16)$$

Rényi entropy is a non-increasing function of $q$, and it can be verified, by applying l'Hôpital's rule, that Shannon's entropy is the Rényi entropy of order 1.
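A quick numerical check of this limit on a small, assumed discrete distribution:

```python
import numpy as np

def shannon(p):
    return -np.sum(p * np.log(p))                 # Eq.15

def renyi(p, q):
    return np.log(np.sum(p ** q)) / (1.0 - q)     # Eq.16, q > 0, q != 1

p = np.array([0.5, 0.3, 0.2])
for q in (0.5, 0.9, 0.999, 2.0):
    print(q, renyi(p, q))        # non-increasing in q; approaches Shannon as q -> 1
print("Shannon:", shannon(p))    # ~1.0297
```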
5. ENTROPY ESTIMATION
In this paper, the Rényi entropy [12] is selected as the entropy measure, and its estimation, described below, is based on Leonenko et al. [13]. Let $X \in \mathbb{R}^n$ be a random vector with probability measure $\mu$ having density $p$. The method proposed by Leonenko estimates $H_q$ from a sample of $N$ independent and identically distributed (i.i.d.) random variables $X_1, X_2, \ldots, X_N$ based on the nearest-neighbor distances in the sample. This work extends the method for estimating $H_1$ proposed by Kozachenko and Leonenko [14].
Leonenko's method estimates $I_q$, Eq.17, for $q \neq 1$ through the computation of conditional moments of nearest-neighbor distances:

$$I_q = I_q(p) = \int p^{q-1}(x)\, p(x)\, \mathrm{d}x, \quad q \neq 1 \qquad (17)$$
5.1. Estimators
Suppose that $X_1, X_2, \ldots, X_N$, $N \geq 2$, are i.i.d. with a probability measure $\mu$ having a density $p$ with respect to the Lebesgue measure. Let $\lVert x - y \rVert$ denote the Euclidean distance between two points $x, y$ of $\mathbb{R}^m$. For a given sample $X_1, \ldots, X_N$ and a given $X_i$ in the sample, from the distances $\lVert X_i - X_j \rVert$, $j = 1, \ldots, N$, $j \neq i$, the order statistics $\rho_{1, N-1}^{(i)} \leq \rho_{2, N-1}^{(i)} \leq \cdots \leq \rho_{N-1, N-1}^{(i)}$ are formed, so that $\rho_{k, N-1}^{(i)}$ is the $k$th nearest-neighbor distance from $X_i$ to some other $X_j$ in the sample, $j \neq i$. $I_q$ in Eq.17 is estimated for $q \neq 1$ by

$$\hat{I}_{N,k,q} = \frac{1}{N} \sum_{i=1}^{N} \left(\zeta_{N,i,k}\right)^{1-q} \qquad (18)$$
where

$$\zeta_{N,i,k} = (N - 1)\, C_k\, V_m \left(\rho_{k, N-1}^{(i)}\right)^m$$

$V_m = \pi^{m/2} / \Gamma(m/2 + 1)$ is the volume of the unit ball in $\mathbb{R}^m$, and

$$C_k = \left[\frac{\Gamma(k)}{\Gamma(k + 1 - q)}\right]^{\frac{1}{1-q}}$$
So, the entropy $H_q$ is estimated as

$$\hat{H}_{N,k,q} = \frac{\log \hat{I}_{N,k,q}}{1 - q} \qquad (19)$$
5.2. Maximum Entropy
Suppose that we have a group of variables obtained from different types of distributions with equal variance. According to the second Gibbs theorem [15], among all of them, Gaussian variables have maximum entropy. Therefore, by using this theorem, we are able to determine whether modeling a variable with a Gaussian kernel is acceptable or not. This is done by comparing the theoretical maximum entropy with the real entropy of the underlying data. Let $X$ be a $d$-dimensional variable; the theoretical maximum entropy is computed as

$$H_{\max}(X) = \frac{1}{2} \log\big((2\pi e)^d \det \Sigma\big) \qquad (20)$$

where $\Sigma$ is the covariance of the observations. There would be no difference between the theoretical maximum entropy and the real one if the data were truly Gaussian. We have used this property to determine the worst estimated kernel in the mixture.
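In practice the check compares Eq.20 against the estimate of Section 5.1; a sketch, with the k-NN estimator above (at $q$ close to 1) standing in for the "real" entropy:

```python
import numpy as np

def max_gaussian_entropy(samples):
    """Theoretical maximum entropy (Eq.20) of data with the same covariance."""
    x = np.atleast_2d(np.asarray(samples, dtype=float))
    if x.shape[0] == 1:
        x = x.T                                  # treat 1-D input as (N, 1)
    d = x.shape[1]
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    return 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(cov))

# If the data are truly Gaussian, max_gaussian_entropy(x) and the k-NN
# estimate of H(x) (Section 5.1) agree up to estimation error; a large
# positive gap flags a badly modeled (non-Gaussian) kernel.
```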
6. SELECTION AND SPLIT OF THE
WORST ESTIMATED KERNEL
For every kernel in the mixture, the theoretical maximum entropy and the real entropy of the underlying data are computed and evaluated in the normalized weighted form [16] to find the worst estimated kernel. The worst estimated kernel is selected through

$$k^{*} = \arg\max_{k}\ \pi_k\, \frac{H_{\max}(k) - H_{\mathrm{real}}(k)}{H_{\max}(k)} \qquad (21)$$
where $k^{*}$ indicates the worst estimated kernel, $H_{\mathrm{real}}(k)$ is the real entropy of the data under the $k$th kernel, and $H_{\max}(k)$ is the maximum entropy of the $k$th kernel. Suppose that a particular kernel has been selected to split. During the split step, the overall dispersion has to remain constant and the two new covariance matrices have to be positive definite. If the $k$th component is split into the $k_1$th and $k_2$th components, the first two moments, the mean vectors, and the covariance matrices should satisfy the following split equations, proposed by Richardson and Green [17]:
$$\pi_k = \pi_{k_1} + \pi_{k_2}$$
$$\pi_k \mu_k = \pi_{k_1} \mu_{k_1} + \pi_{k_2} \mu_{k_2}$$
$$\pi_k\big(\Sigma_k + \mu_k \mu_k^T\big) = \pi_{k_1}\big(\Sigma_{k_1} + \mu_{k_1} \mu_{k_1}^T\big) + \pi_{k_2}\big(\Sigma_{k_2} + \mu_{k_2} \mu_{k_2}^T\big) \qquad (22)$$
Solving the split equations is an ill-posed problem because the number of equations is less than the number of unknowns. To solve it, we have used the method proposed by Dellaportas and Papageorgiou in [18]. They used the spectral decomposition of the current covariance matrix, replacing the original problem by the estimation of the new eigenvalues and eigenvectors of the new covariance matrices without the previously cited constraint. Let $\Sigma_k = V_k \Lambda_k V_k^T$ be the spectral decomposition of the covariance matrix $\Sigma_k$, with $V_k$ the eigenvector matrix, $\Lambda_k = \mathrm{diag}(\lambda_k^1, \ldots, \lambda_k^d)$ a diagonal matrix containing the eigenvalues of $\Sigma_k$ in increasing order, and $d$ the problem dimension. Let also $D$ be a $d \times d$ rotation matrix with orthonormal unit vectors as columns; $D$ is built by generating its lower triangular elements independently from $d(d-1)/2$ different uniform $U(0,1)$ densities. The proposed split operation is given by

$$\pi_{k_1} = u_1 \pi_k, \qquad \pi_{k_2} = (1 - u_1)\, \pi_k$$
$$\mu_{k_1} = \mu_k - \left(\sum_{i=1}^{d} u_2^i \sqrt{\lambda_k^i}\, V_k^i\right) \sqrt{\frac{\pi_{k_2}}{\pi_{k_1}}}$$
$$\mu_{k_2} = \mu_k + \left(\sum_{i=1}^{d} u_2^i \sqrt{\lambda_k^i}\, V_k^i\right) \sqrt{\frac{\pi_{k_1}}{\pi_{k_2}}}$$
$$\Lambda_{k_1} = \mathrm{diag}(u_3)\, \mathrm{diag}(\iota - u_2)\, \mathrm{diag}(\iota + u_2)\, \Lambda_k\, \frac{\pi_k}{\pi_{k_1}}$$
$$\Lambda_{k_2} = \mathrm{diag}(\iota - u_3)\, \mathrm{diag}(\iota - u_2)\, \mathrm{diag}(\iota + u_2)\, \Lambda_k\, \frac{\pi_k}{\pi_{k_2}}$$
$$V_{k_1} = D\, V_k, \qquad V_{k_2} = D^T V_k \qquad (23)$$
where $\iota$ is a $d \times 1$ vector of ones, and $u_1$, $u_2 = (u_2^1, u_2^2, \ldots, u_2^d)^T$, and $u_3 = (u_3^1, u_3^2, \ldots, u_3^d)^T$ are the $2d + 1$ random variables needed to build the priors, means, and eigenvalues of the new components in the mixture. They are calculated as

$$u_1 \sim \beta(2, 2), \qquad u_2^1 \sim \beta(1, 2d), \qquad u_2^j \sim U(-1, 1),$$
$$u_3^1 \sim \beta(1, d), \qquad u_3^j \sim U(0, 1) \qquad (24)$$
with $j = 2, \ldots, d$, and $\beta(\cdot,\cdot)$ and $U(\cdot,\cdot)$ denoting the beta and uniform distributions, respectively. The expressions for $\mu_{k_1}$ and $\mu_{k_2}$ establish that the new means are obtained by moving across the axes formed by the eigenvectors with sizes $u_2^1 \sqrt{\lambda_k^1}$ and $u_2^2 \sqrt{\lambda_k^2}$. We must keep $u_2^1$ positive but let $u_2^2$ vary in the interval $[-1, 1]$, because the moves in both directions across the largest eigenvector are achieved via the expressions for $\mu_{k_1}$ and $\mu_{k_2}$, whereas the moves across the shorter axis require both positive and negative $u_2^2$.
7. ESTIMATION OF PROBABILITY
DENSITY FUNCTION
This section describes the proposed method for fitting Gaussian mixture models in depth. The algorithm starts with only one kernel, whose initial parameters are given by the whole set of samples. As shown in [19], if we denote by $y_1, y_2, \ldots, y_N$ a set of samples of a pdf, the ML hypotheses for the mean and covariance matrix are

$$\mu_1 = \frac{1}{N} \sum_{i=1}^{N} y_i, \qquad \Sigma_1 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \mu_1)(y_i - \mu_1)^T \qquad (25)$$
Furthermore, the initial value of the first prior $\pi_1$ is 1. The EM algorithm computes a fitted model consisting of only one kernel from these initial values. Since the model has only one kernel, this kernel is split and replaced by two new ones via Eq.23. The new model with two kernels then has to be fitted to the samples; the EM algorithm, with initial values obtained from the previous split, is applied for this purpose. Now the fitted model has two kernels as two classes, which is fewer than our desired number of kernels, namely three: CSF, normal tissue, and MS lesions. So, the worst fitted kernel among these two kernels has to be selected and split. As mentioned before, the worst fitted kernel has the largest gap between the theoretical maximum entropy and the real one relative to the other kernels, so we can select it by comparing the maximum entropy and the real one using Eq.21. The worst fitted kernel is replaced by two new kernels through the split (Eq.23) and the other kernel remains unchanged. To find the final fitted model with three kernels, another EM run is applied; its initial parameters are obtained from the split (of the worst fitted kernel) and from the previous EM run (for the remaining kernel). Figure 1 shows the entropy-based EM algorithm used to fit the best mixture to the brain samples; a sketch of this driver loop is given below.
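Putting Sections 5-7 together, a sketch of the driver loop of Figure 1; it assumes the helper functions sketched earlier (`em_gmm`, `renyi_entropy_knn`, `max_gaussian_entropy`, `split_kernel_1d`) are in scope, and uses hard assignment of samples to kernels as a simplification of the mixture responsibilities:

```python
import numpy as np
from scipy.stats import norm

def assign(x, pi, mu, var):
    """Hard-assign each sample to the kernel with the largest weighted density."""
    like = pi * norm.pdf(x[:, None], mu, np.sqrt(var))
    return np.argmax(like, axis=1)

def entropy_based_em(x, target_kernels=3):
    """Grow a 1-D Gaussian mixture from 1 to `target_kernels` kernels
    (Figure 1), splitting the worst-estimated kernel at each level."""
    # Single kernel fitted to all samples (Eq.25); first prior is 1.
    pi, mu, var = np.array([1.0]), np.array([x.mean()]), np.array([x.var()])
    while len(pi) < target_kernels:
        # Worst kernel: largest normalized entropy gap, weighted by pi (Eq.21).
        scores = []
        for k in range(len(pi)):
            xk = x[assign(x, pi, mu, var) == k]
            h_max = max_gaussian_entropy(xk)
            h_real = renyi_entropy_knn(xk, q=0.999)   # near-Shannon estimate
            scores.append(pi[k] * (h_max - h_real) / h_max)
        k_star = int(np.argmax(scores))
        # Replace kernel k_star by its two children (Eqs.23-24) ...
        (p1, m1, v1), (p2, m2, v2) = split_kernel_1d(pi[k_star], mu[k_star],
                                                     var[k_star])
        pi = np.append(np.delete(pi, k_star), [p1, p2])
        mu = np.append(np.delete(mu, k_star), [m1, m2])
        var = np.append(np.delete(var, k_star), [v1, v2])
        # ... and refit the whole mixture with EM (Eqs.10-13).
        pi, mu, var = em_gmm(x, pi, mu, var)
    return pi, mu, var
```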
8. MARKOV RANDOM FIELDS
An advantage of MRF models [20] is the use of neighborhood information to improve the a priori probabilities $P(\omega)$ [21]. The intuition behind the MRF model is that most pixels belong to the same class as their neighbors; the MRF is a powerful tool to describe the class-assignment (labeling) dependence between adjacent pixels. Suppose a digital image is based on an $M \times N$ lattice, so the image is described as

$$S = \{s(x, y) \mid 1 \leq x \leq M,\ 1 \leq y \leq N\} \qquad (26)$$
Figure 1. The entropy-based EM algorithm to find the best fitted model to brain samples. [Flowchart: brain image → fit a kernel to all brain samples through the EM algorithm → split the obtained kernel into two new kernels and use the result as initial values to fit a two-kernel model to all samples through the EM algorithm → find the worst estimated kernel in the two-kernel mixture by comparing the maximum and real entropies, split it into two new kernels, and use the result as initial values for the next level (the other kernel is unchanged) → fit a three-kernel model to the samples through the EM algorithm with the initial values prepared in the previous step.]
Assuming that the unobserved random field $\omega(x, y)$ is a Markov random field, the probability density function of $\omega(x, y)$ of a segmented image depends only on its finite neighboring region $N(x, y)$:

$$p\big(\omega(x,y) \mid \omega(x_i, y_i),\ (x_i, y_i) \neq (x, y)\big) = p\big(\omega(x,y) \mid N(x,y)\big) \qquad (27)$$

where $N(x, y)$ is the set of all labeled neighbors. So, the segmentation problem, posed as a pixel classification problem using MAP, can be estimated using a Gibbs distribution [22], which is easier to estimate. The Gibbs distribution can be expressed as

$$p\big(\omega(x,y) \mid N(x,y)\big) = \frac{1}{Z} \exp\left(-\frac{U(x,y)}{T}\right) \qquad (28)$$
where $Z = \sum_{(x,y)} \exp\big(-U(x,y)/T\big)$ is a normalization constant which guarantees that $p\big(\omega(x,y)\big)$ is always smaller than or equal to one, and $T$ is the temperature constant, which is normally assumed to be one. The energy function is

$$U(x, y) = \sum_{(x_i, y_i) \in C(x,y)} V\big(\omega(x_i, y_i)\big) \qquad (29)$$
where the cliques $C(x,y)$ are subsets of $N(x,y)$ or of $(x,y)$ itself, and $V\big(\omega(x_i, y_i)\big)$ is an arbitrary function of $\omega(x, y)$. The equivalence between an MRF and a Gibbs distribution is expressed by the Hammersley-Clifford theorem, which states that $\omega(x,y)$ is an MRF with neighborhoods $N(x,y)$ if and only if $\omega(x,y)$ is a Gibbs distribution field with the cliques $C(x,y)$ induced by the neighborhood $N(x,y)$. This theorem provides an easy way to construct the MRF in an explicit manner, i.e., one can explicitly estimate the conditional probability distribution of the MRF by choosing specific kinds of cliques $C(x,y)$ and an appropriate energy function $U(x,y)$ that is specific to the practical problem. In our model we used the simple energy function proposed by Therrien [23] and utilized by Nett et al. [24]. This equation is a linear combination of products of elements in the cliques:

$$U(x, y) = \alpha\, \omega(x, y) + \beta_1\, \omega(x, y)\big[\omega(x-1, y) + \omega(x+1, y)\big] + \beta_2\, \omega(x, y)\big[\omega(x, y-1) + \omega(x, y+1)\big] \qquad (30)$$
9. COMPUTATION OF THE APRIORI
PROBABILITY BY MRF MODEL
First, the brain image is classified through the Bayesian classification method using the model obtained from the entropy-based EM algorithm. The gray-level values for each tissue class resulting from the previous step are extracted. Then, the class conditional probability density function of each tissue class (i.e., $\hat{\mu}_k$ and $\hat{\sigma}_k^2$) is estimated through the EM algorithm. Also, the a priori probabilities $P(\omega_k)$ of the tissue classes are estimated by the MRF model stated in Eqs.28-30. The parameters of the MRF model, i.e., $\alpha$, $\beta_1$, and $\beta_2$, were experimentally set to 0.1, 0.01, and 0.01 for the best result. Next, the new value of the termination tolerance is calculated and evaluated from

$$\mathrm{tol} = \max\left\{\begin{array}{l} \max\big(\mathrm{abs}(\hat{P}^{k+1} - \hat{P}^{k})\big) \\ \max\big(\mathrm{abs}(\hat{\mu}^{k+1} - \hat{\mu}^{k})\big) \\ \max\big(\mathrm{abs}(\hat{\sigma}^{k+1} - \hat{\sigma}^{k})\big) \end{array}\right\} \qquad (31)$$
in which $\hat{P}^{k}$, $\hat{\mu}^{k}$, and $\hat{\sigma}^{k}$ are the row vectors of the a priori probabilities, means, and variances of the tissue classes in the $k$th iteration, respectively. The algorithm that computes the a priori probabilities by the MRF model is shown in Figure 2; Eq.31 translates directly into code, as sketched below.
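A sketch of the tolerance computation (the argument names are illustrative):

```python
import numpy as np

def termination_tol(P_new, P_old, mu_new, mu_old, var_new, var_old):
    """Termination tolerance of Eq.31: the largest absolute change in any
    prior, mean, or variance between successive iterations."""
    return max(np.max(np.abs(np.asarray(a) - np.asarray(b)))
               for a, b in ((P_new, P_old), (mu_new, mu_old),
                            (var_new, var_old)))
```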
Figure 2. Computation of the a priori probabilities by the MRF model and update of the means and variances of each class. [Flowchart: the model obtained from the entropy-based EM algorithm, together with an initialized maximum iteration count and tolerance, feeds a Bayesian classification step; the gray-level values for each tissue class are extracted; the class conditional probability density function of each class is estimated through the EM algorithm and the a priori probabilities through the MRF model; the new value of the termination tolerance is calculated and evaluated; finally, post-processing yields the segmented brain.]
10. POSTPROCESSING
In manual segmentation, possible lesions as small as one or two pixels are not usually considered MS lesions by experts. However, as the gray levels of neighboring pixels in an image are usually highly correlated, in our approach a one-pixel object is considered possible noise if its gray level is higher than the average gray level of its neighboring pixels (in a 3 × 3 neighborhood) plus a margin of 25 (selected by trial). Such an object is recognized as noise and removed in the next step. Regarding two-pixel objects, in contrast with the manual procedure, these objects may be identified as seed points of MS lesions, as recommended by the neurologist.
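A sketch of the one-pixel noise rule, assuming a boolean lesion mask and the 8-neighbor mean as the neighborhood average:

```python
import numpy as np
from scipy import ndimage

def remove_single_pixel_noise(lesion_mask, image, margin=25):
    """Drop one-pixel lesion objects whose gray level exceeds the mean of
    their 3x3 neighborhood by more than `margin` (Section 10).

    Two-pixel objects are kept as possible lesion seed points.
    """
    labels, _ = ndimage.label(lesion_mask)
    out = lesion_mask.astype(bool).copy()
    # Mean of the 8 neighbors of every pixel (3x3 average minus the center).
    kernel = np.ones((3, 3)); kernel[1, 1] = 0
    neigh_mean = ndimage.convolve(image.astype(float), kernel / 8.0,
                                  mode="nearest")
    for idx, obj in enumerate(ndimage.find_objects(labels), start=1):
        region = labels[obj] == idx
        if region.sum() == 1:                          # one-pixel object
            y, x = np.argwhere(region)[0]
            yy, xx = obj[0].start + y, obj[1].start + x
            if image[yy, xx] > neigh_mean[yy, xx] + margin:
                out[yy, xx] = False                    # recognized as noise
    return out
```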
11. SIMULATION RESULTS
The results of the proposed method for two slices containing different lesion loads are shown in Figures 3 and 4. Segmented brains of typical original FLAIR images are shown in Figures 3(a) and 4(a). The multiple sclerosis pixels and the resulting segmented slices are shown in Figures 3(b) and 4(b). Figures 3(c) and 4(c) show the histogram of brain pixels and the distribution estimated by the proposed method. Also, the three estimated kernels constituting the Gaussian mixture model are shown in Figures 3(d) and 4(d).
12. EVALUATION
To evaluate the proposed method, SI [25], OF and EF
[26] criteria are considered and computed over all 20
patients.
$$\mathrm{SI} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} \qquad (32)$$

$$\mathrm{OF} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \qquad (33)$$

$$\mathrm{EF} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FN}} \qquad (34)$$

Figure 3. Result of applying the proposed algorithm to the image of a patient: (a) brain image; (b) result of the proposed method; (c) histogram of brain, overlaid.

Figure 4. Result of applying the proposed algorithm to the image of a patient: (a) brain image; (b) result of the proposed method; (c) histogram of brain, overlaid.
where TP stands for true positive voxels, FP for false positive voxels, and FN for false negative voxels. For a good segmentation, SI and OF should be close to 1 and EF should be close to 0. The patients are categorized into three groups according to lesion load. The proposed method has been applied to all patients, the SI, OF, and EF criteria have been computed, and the mean values for each group are reported in Table 1. Notably, SI, OF, and EF improve as the lesion load increases. Table 1 also shows the volumetric comparison of lesions between the proposed method and the gold standard using the correlation coefficient (CC) for each patient group. The correlation coefficient values show that the accuracy of the proposed method increases for patients with large lesion loads.
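These criteria translate directly into code; a sketch over boolean lesion masks:

```python
import numpy as np

def similarity_criteria(seg, gold):
    """SI, OF, EF (Eqs.32-34) from boolean segmentation and gold-standard masks."""
    seg, gold = np.asarray(seg, bool), np.asarray(gold, bool)
    TP = np.sum(seg & gold)        # true positives
    FP = np.sum(seg & ~gold)       # false positives
    FN = np.sum(~seg & gold)       # false negatives
    SI = 2 * TP / (2 * TP + FP + FN)
    OF = TP / (TP + FN)
    EF = FP / (TP + FN)
    return SI, OF, EF
```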
13. CONCLUSIONS
In this paper, we have described and validated a fully automatic method for classification of brain tissues in MR FLAIR images of MS patients. A Gaussian mixture model with three kernels corresponding to brain tissues is obtained through the entropy-based EM algorithm. This task starts with only one kernel and finally finds three kernels with no initially supplied training samples. Then, the a priori probabilities of the classes are estimated using the MRF. Finally, brain tissues are classified through Bayesian classification. This framework has been validated on a data set of
Table 1. Similarity criteria and volumetric comparison of lesions for each patient group.

Patient category       N    SI      OF      EF      M_GS ± SD_GS (cc)   M ± SD (cc)    CC
Small lesion load      7    0.7262  0.7277  0.2723  1.89 ± 1.13         1.87 ± 1.11    0.93
Moderate lesion load   10   0.7531  0.7395  0.2201  11.59 ± 3.10        11.58 ± 3.08   0.95
Large lesion load      3    0.8101  0.7857  0.1541  22.21 ± 3.54        22.23 ± 3.52   0.98

N: number of patients in each group; M: mean; SD: standard deviation; GS: gold standard; CC: correlation coefficient between segmented and gold-standard lesion volumes.
Table 2. Similarity index (SI) values for the proposed method and the other methods.

Method                   Similarity index (SI)
Johnston et al. [27]     0.65
Boudraa et al. [28]      0.62
Leemput et al. [29]      0.51
Zijdenbos et al. [30]    0.68
Proposed method          0.75
MR FLAIR images of 20 MS patients via similarity criteria (i.e., SI, OF, and EF). Other segmentation methods have been evaluated in similar ways (i.e., via SI) using manual segmentation as the reference, such as Johnston et al. [27], Boudraa et al. [28], Leemput et al. [29], and Zijdenbos et al. [30]. A comparison of the proposed method with these methods is shown in Table 2. Johnston et al. [27] presented a semi-automatic segmentation approach based on a stochastic relaxation method (SRM) and iterated conditional modes. Boudraa et al. [28] performed automatic detection of MS lesions using the fuzzy C-means (FCM) algorithm. Leemput et al. [29] proposed an intensity-based, stochastic model-aided segmentation approach for fully automatic segmentation of MS lesions. Zijdenbos et al. [30] developed an automatic pipeline based on a supervised artificial neural network (ANN) classifier. As seen in Table 2, the proposed method improves on the results reported in [27-30]. Also, in [31] it is mentioned that an SI value greater than 0.7 represents a very good segmentation in this field. Future work will include improving the accuracy and the correlation value.
REFERENCES
[1] Kurtzke, J.F. (1983) Rating neurologic impairment in
multiple sclerosis: An expanded disability status scale
(EDSS). Neurology, 33, 1444-1452.
[2] Khayati, R., Vafadust, M., Towhidkhah, F. and Nabavi, S.M. (2008) Fully automatic segmentation of multiple sclerosis lesions in brain MR FLAIR images using adaptive mixtures method and Markov random field model. Computers in Biology and Medicine, 38, 379-390. doi:10.1016/j.compbiomed.2007.12.005
[3] Polman, C.H., Reingold, S.C., Edan, G., Filippi, M., Hartung, H.P. and Kappos, L. (2005) Diagnostic criteria for MS: 2005 revisions to the McDonald criteria. Annals of Neurology, 58, 840-846. doi:10.1002/ana.20703
[4] Edelman, R.R., Hesselink, J.R. and Zlatkin, M.B. (2006)
Clinical magnetic resonance imaging. 3rd Edition, Saun-
ders, Philadelphia, 1571-1615.
[5] Anbeek, P., Vincken, K.L., van Osch, M.J.P., Bisschops,
R.H.C. and Van der Grond, J. (2004) Probabilistic seg-
mentation of white matter lesions in MR imaging,
NeuroImage, 21, 1037-1044.
doi:10.1016/j.neuroimage.2003.10.012
[6] Khayati, R. (2006) Quantification of multiple sclerosis
lesions based on fractal analysis. PhD Thesis, Technical
Report No. 1: Fully automatic object oriented brain seg-
mentation in MRI. Amirkabir University of Technology,
Tehran.
[7] Duda, R.O., Hart, P.E. and Stork, D.G. (2001) Pattern
classification. 2nd Edition, Wiley, New York.
[8] Bernardo, J.M. and Smith, A.F.M. (1994) Bayesian the-
ory. Wiley, New York. doi:10.1002/9780470316870
[9] Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977) Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39, 1-38.
[10] Shannon, C. (1948) A mathematical theory of communi-
cation. Bell System Technical Journal, 27, 379-423.
[11] Cover, T.M. and Thomas, J.A. (1991) Elements of infor-
mation theory. Wiley & Sons, New York.
doi:10.1002/0471200611
[12] Rényi, A. (1961) On measures of entropy and information. University of California Press, Berkeley, 547-561.
[13] Leonenko, N., Pronzato, L. and Savani, V. (2008) A class
of Rényi information estimators for multidimensional
densities. Annals of Statistics, 36, 2153-2182.
doi:10.1214/07-AOS539
[14] Kozachenko, L. and Leonenko, N. (1987) On statistical estimation of entropy of a random vector. Problems of Information Transmission, 23, 95-101.
[15] Jones, M. and Sibson, R. (1987) What is projection pursuit? Journal of the Royal Statistical Society: Series A, 150, 1-36. http://www.jstor.org/stable/2981662
[16] Benavent, A.P., Ruiz, F.E. and Sáez, J.M. (2009) Learning
Gaussian mixture models with entropy-based criteria.
IEEE Transactions on Neural Networks, 20, 1756-1771.
doi:10.1109/TNN.2009.2030190
[17] Richardson, S. and Green, P. (1997) On Bayesian analysis of mixtures with an unknown number of components (with discussion). Journal of the Royal Statistical Society: Series B, 59, 731-792. doi:10.1111/1467-9868.00095
[18] Dellaportas, P. and Papageorgiou, I. (2006) Multivariate
mixtures of normals with unknown number of compo-
nents. Statistics and Computing, 16, 57-68.
doi:10.1007/s11222-006-5338-6
[19] Mitchell, T.M. (1997) Machine Learning. McGraw-Hill,
Boston. doi:10.1036/0070428077
[20] Li, S.Z. (2001) Markov random field modeling in image
analysis. Springer, Tokyo.
[21] Held, K., Kops, E.R., Krause, B.J., Wells, W.M., Kikinis,
R. and Müller-Gärtner, H.W. (1997) Markov random field
segmentation of brain MR images. IEEE Transactions on
Medical Imaging, 16, 878-886. doi:10.1109/42.650883
[22] Geman, D., Geman, S., Graffigne, C. and Dong, P. (1990)
Boundary detection by constrained optimization, IEEE
Transactions on Pattern Analysis and Machine Intelli-
gence, PAMI-12, 609-628. doi:10.1109/34.56204
[23] Therrien, C.W. (1989) Decision, estimation, and classifi-
cation. Wiley, New York.
[24] Nett, J.M. (2001) The study of MS using MRI, image
processing, and visualization. Master’s Thesis, Univer-
sity of Louisville, Louisville.
[25] Zijdenbos, A.P., Dawant, B.M., Margolin, R.A. and Palmer,
A.C. (1994) Morphometric analysis of white matter lesions
in MR images: Method and validation. IEEE Transactions
on Medical Imaging, 13, 716-724.
doi:10.1109/42.363096
[26] Stokking, R., Vincken, K.L. and Viergever, M.A. (2000) Automatic morphology-based brain segmentation (MBRASE) from MRI-T1 data. NeuroImage, 12, 726-738.
doi:10.1006/nimg.2000.0661
[27] Johnston, B., Atkins, M.S., Mackiewich, B. and Ander-
son, M. (1996) Segmentation of multiple sclerosis lesions
in intensity corrected multispectral MRI. IEEE Transac-
tions on Medical Imaging, 15, 154-169.
doi:10.1109/42.491417
[28] Boudraa, A.O., Dehakb, S.M.R., Zhu, Y.M., Pachai, C.,
Bao, Y.G. and Grimaud, J. (2000) Automated segmenta-
tion of multiple sclerosis lesions in multispectral MR
imaging using fuzzy clustering. Computers in Biology
and Medicine, 30, 23-40.
doi:10.1016/S0010-4825(99)00019-0
[29] Leemput, K.V., Maes, F., Vandermeulen, D., Colchester,
A. and Suetens, P. (2001) Automated segmentation of
multiple sclerosis lesions by model outlier detection.
IEEE Transactions on Medical Imaging, 20, 677-688.
doi:10.1109/42.938237
[30] Zijdenbos, A.P., Forghani, R. and Evans, A.C. (2002)
Automatic pipeline analysis of 3-D MRI data for clinical
trials: Application to multiple sclerosis. IEEE Transac-
tions on Medical Imaging, 21, 1280-1291.
doi:10.1109/TMI.2002.806283
[31] Bartko, J.J. (1991) Measurement and reliability: statisti-
cal thinking considerations. Schizophrenia Bulletin, 17,
483-489. doi:10.1093/schbul/17.3.483