Adaptive Motion Segmentation for Changing Background

doi:10.4236/jsea.2009.22014

Paper Menu >>

Journal Menu >>

J. Software Engineering & Applications, 2009, 2: 96-102

doi:10.4236/jsea.2009.22014 Published Online July 2009 (www.SciRP.org/journal/jsea)

Adaptive Motion Segmentation for Changing

Background

Yepeng Guan1,2

1School of Communication and Information Engineering, Shanghai University, Shanghai, China; 2Key Laboratory of Advanced Dis-

plays and System Application, Ministry of Education, 149 Yanchang Rd., Shanghai, China.

Email: ypguan@shu.edu.cn

Received December 25th, 2008; revised April 12th, 2009; accepted May 4th, 2009.

ABSTRACT

Segmentation of moving objects efficiently from video sequence is very important for many applications. Background

subtraction is a common method typically used to segment moving objects in image sequences taken from a statistic

camera. Some existing algorithms cannot adapt to changing circumstances and require manual calibration in terms of

specification of parameters or some hypotheses for changing background. An adaptive motion segmentation method is

developed according to motion variation and chromatic characteristics, which prevents undesired corruption of the

background model and does not consid er the adaptation coefficient. RGB co lor space is selected in stead of introdu cing

complex color models to segment moving objects and suppress shadows. A color ratio for 4-connected neighbors of a

pixel and multi-scale wavelet transformation are combined to suppress shadows. The mentioned approach is

scene-independent and high correct segmentation. It ha s been shown that th e approach is robust and efficient to detect

moving objects by experiments.

Keywords: Motion Segmentat ion, Background Update, Background Subtracti on, Motion Variation, Shadow Suppression

1. Introduction

Moving objects segmentation is an important topic in co-

mputer vision applications, including video conferences,

vehicle tracking, and three-dimensional object identifica-

tion, and has been actively investigated in recent years

[1]. The most widely adopted approach for moving ob-

ject segmentation with a fixed camera is based on back-

ground subtraction. A background (called as background

model also) is computed and evolved frame by frame.

A reliable background model has to account for back-

ground at each time instant. Mistake in labeling fore-

ground and background points could cause wrong update

of the background model. A particularly critical situation

occurs whenever moving objects stop for a long time and

become a part of the background. When these objects

start again, a ghost is detected in the area where they

stopped. This will persist for all the following frames,

preventing the area to be updated in the background for-

ever.

In addition, moving object segmentation is easily af-

fected by shadow problem. Researchers try to select an

optimal color space for shadow eliminating among a set

of color spaces, such as HSV, YCrCb, XY Z , L*a*b*,

L*u*v*,C1C2C3, l1l2l3, normalized rgb and so on, how-

ever, it remains open-ended how important is the appro-

priate color space, and which color space is the most

effective [2]. Many approaches in literature have been

developed so far. Some existing methods require manual

calibration in terms of the specification of parameters

which are related to the environment and the lighting

conditions or make some hypotheses.

An approach to adaptive background updating and

shadow suppressing is developed. RGB color space is

selected instead of introducing complex color models to

segment moving objects. Motion evaluation is intro-

duced to prevent giving erroneous segmentation in those

corresponding with an un-updated background model. A

color ratio and multi-scale wavelet transformation are

combined to suppress shadows. The main contribution of

the proposal is that the developed approach is scene-

independent and automatic background updating ac-

cording to motion variations caused by moving objects.

The second contribution is that when segmenting motion

objects it does not require any complex supervised train-

ing or manual calibration in terms of the specification of

parameters or makes any hypotheses. Experimental re-

sults from indoor and outdoor environments have shown

*Tel: +86 21 56331967; Fax: +86 21 56336908.

Adaptive Motion Segmentation for Changing Background 97

the developed approach is efficient and flexible during

segmenting moving objects and suppressing shadows in

applications.

The remainder paper is organized as follows. Section 2

briefly reviews some related previous works. In the next

section, background model update would be discussed.

Shadow suppression is described in Section 4. Experi-

mental results from indoor and outdoor environments are

given in Section 5 and followed by conclusions in Sec-

tion 6.

2. Related Works

Many works have been put forward in literature for mo-

ving objects detection. Background subtraction based

moving objects in the scene are detected by the differ-

ence between the current frame and the background

model. When the deviation is greater than some critical

value, the pixel is considered as foreground (moving

object) [3]. The most simple background model is the

previous frame. The difference between the observed

frame and the previous frame is thresholded to determine

which pixel is background and which pixel is the fore-

ground. Another way to model the background is to take

the mean, median, or minimum and maximum values of

the previous N pixel [4]. It needs to keep track of the

pixel value history. To avoid this problem a first order

recursive filter is used to update the background model

[5]. It easily causes ‘tailing’ or ‘ghosting’. A particularly

critical situation occurs whenever the moving object

stopped for a long time and became a part of the back-

ground. When these objects start again, a ghost is de-

tected in the area where they stopped [6]. This will per-

sist for all the following frames, preventing the area to be

updated in the background image forever [7]. One

method is by modeling each pixel as unimodal Gaussian

distribution [8]. It fails to model background pixels that

are subject to repetitive motions which have multiple

background colors. To overcome these difficulties, a

parametric background modeling is done by modeling

each background pixel value as a mixture Gaussian dis-

tribution [9,10]. The parametric background model still

lacks flexibility when dealing with non-static back-

grounds, a highly flexible non-parametric technique is

proposed for estimating background probabilities from

recent samples over time using Kernel density estimation

[11]. False detections due to fluctuating backgrounds are

still not covered in the algorithm until now. Codebook

technique is a different approach proposed for the back-

ground subtraction in [12]. One of drawbacks is that the

algorithm cannot adapt to changing circumstance when

the environment was not present in the training phase.

Moving objects that stop moving and should be adopted

into the background will get difficulties in the algorithm.

Shadows cause serious problems while segmenting

moving objects, due to the misclassification of shadow-

points as a foreground. Many works have been devel-

oped to suppress shadow [1,2,6,13,14,15,16,17,19,21,22].

By shadow suppression, the major problem is how to

distinguish moving cast shadows from moving object

points [15]. Cucchiara et al. [6] defined a shadow mask

for each point resulting from motion segmentation.

However, it often makes additional assumptions such as

small changes in hue and saturation, necessary a prior

knowledge of the bands of the changes in the value

channel. Moreover choice of some parameters is less

straightforward and for now is done empirically. Tatter-

sall et al. [16] proposed adaptive shadow identification

through automatic parameter estimation based on the

above method. The single variable parameter is only

used. However, much additional assumptions must be

made also in [16]. Salvador et al. [17] proposed invariant

color features to detect cast shadows through using

chrominance color components. It is found that several

assumptions are needed regarding the reflecting surfaces

and the lightings. In outdoor scene, shadows will have a

blue color cast due to the sky, while the lighting regions

have a yellow cast (sunlight), hence the chrominance

color values corresponding to the same surface point

may be significantly different in shadow and sunlit re-

gions [18]. Cavallaro et al. [19] proposed the normalized

rgb space to detect shadows. It is known that the practi-

cal application of normalized rgb suffers from a problem

inherent to the noise at low intensities which would re-

sult in unstable chromatic components [20]. Texture

analysis can be potentially effective in solving the prob-

lem. Heikkila et al. [21] proposed a texture-based me-

thod for modeling the background and detecting moving

objects from a video sequence. Each pixel is modeled as

a group of adaptive local binary pattern histograms that

are calculated over a circular region around the pixel.

Because of the huge amount of different combinations, it

must be done more or less empirically to find a good set

of parameter values. Spagnolo et al. [22] proposed a ra-

tio-based algorithm to detect shadows with an empiri-

cally assigned ratio threshold. This ratio-based algorithm

considering the ratio between only two adjacent pixels

considerably shortens the computation time, but it easily

misclassifies shadows as objects, because ratio magni-

tude of shadows may have a similar magnitude value.

3. Background Model Update

The main idea of the proposed approach is to update pix-

els according to motion variations caused by moving

objects, which has been proven to be more reliable and

less sensitive to noise.

Assuming the background model Bt+1(x, y) at t+1 time,

extract possible moving regions P based on background

subtraction (seen from Figure 1(c)). According to the

fact that the variation of sensible motion target in the

scene can be found out from the sequence, the difference

98 Adaptive Motion Segmentation for Changing Background

region P1 may be incorrect (seen from Figure 2(e)). In

order to overcome the problem, chromatic information is

used to further extract moving region P2 from P1 accord-

ing to moving objects with the same chromatic in P1 as in

R (seen from Figure 2(f)). According to the mentioned

above, the background model is updated as follows,

between the two adjacent frames is used to extract mov-

ing regions R (seen from Figure 1(d)). Extract moving

object regions P1 from the regions P according to mov-

ing objects with the same connective characteristic as

regions R (seen from Figure 1(e)).

When the object moves slowly in situ, the extracted

(, )(, ),arg( ,)

(, )

(, ),

xyB xyifXORPPT

Bxy Bxy otherwise















(1)

where

(, ),( , )PconnectR PPchrom PR

(2)

In (1), It+1 is a current frame at t+1 time, “→” repre-

sents updating, Connect (R, P) represents selecting a

region with the same connective characteristic as R from

P, chrom (P1, R) represents selecting a region with the

same chromatic information in R as in P1, XOR is a logi-

cal exclusive-or operator, T is a threshold value.

The proposed background update approach reveals

some advantages. Firstly it smoothes and reduces the

effects of noise in the image since sudden variations of a

single pixel are not included in the background model.

Secondly it does not consider the adaptation coefficient

or the learning rate used in the existing literatures. Fi-

nally it does not depend on static or moving objects in

the image. In order to demonstrate the last point, some

results of two sequences where lena walks, and she bend-

s in two scenes from http://www.tele.ucl.ac.be/~gaitanis/

results/Human_Action_Video_Database/2Feet/ are given

in Figures 1 and 2, respectively.

Figure 1. Lena walks in the scene and the extracted results. (a) First frame in the sequence. (b) The 25th frame. (c) Segmenta-

tion based on subtraction of (a) and (b). (d) Segmentation based on difference between the 24th and 25th frames. (e) Ex-

tracted object region by connectivity. (f) Extracted object region by chromatic consistency. (g) Detected varied background

region. (h) Extracted final foreground re gion

Figure 2. Lena bends in situ and the ex tracted results. (a) First frame in the sequence. (b) The 25th frame. (c) Segmentation

based on subtraction of (a) and (b). (d) Segmentation based on difference between the 24th and 25th frames. (e) Extracted

object region by connectivity. (f) Extr acte d objec t region by chromatic c onsistenc y. (g) Detec ted varie d backgr ound re gion. (h)

Extracted final foreground region

Adaptive Motion Segmentation for Changing Background 99

In the second example, some existing algorithms in

the literature are difficult to correctly extract moving

objects. In Figures 1 and 2, the ghosts are removed since

their connective pixels do not contain any frame-differe-

nce pixels. From Figures 1(g) and 2(g), one can see that

some varied background pixels are detected in the region

no matter how lena walks or bends in situ. The results

obtained with the approach show correct foreground

segmentation and effective background updating from

Figures 1 (h) and 2(h). It highlights that the aperture

problem resulting from the adjacent frame-difference is

overcome by combining region connectivity and the

chromatic consistency.

4. Shadow Suppression

Since RGB color camera system is one of the most pop-

ular color spaces, and all colors are seen as a variable

combination of the three primaries in the RGB color

space, RGB color space is selected to eliminate shadows

in the paper. Before segmenting image, start by applying

a smoothing operator both to the background image B(x,

y) and the current frame F(x, y).

Calculate a color ratio difference between B(x, y) and

F(x, y) for 4-connected neighbors of a pixel and for all

three color components as:

(, )(, )

(,, ,)(1,1,,1,2,3,4)

(,) (,)

kRGB FB

Ixy Ixy

pxyij ijijn

Ixiyj Ixiyj

 

  (3)

where (, )

xy is the intensity of F(x, y) for the kth

color component at location (x, y), (,

xiy j

the intensity of its neighbor color components; )

is the intensity of B(x, y) for the kth color component at

location (x, y), and (,

xiy j is the intensity of its

neighbor color components.

Since moving shadow makes the region covered by

itself darker than the background and it has similar

chromaticity, a suitable threshold Th is applied to the

color ratio difference (3) to obtain a candidate fore-

ground region as:

(, )

SPx y

(, )0

nif pTh

SPx yotherwise







 (4)

If the distortion distribution of is assumed to be a

Gaussian one, we can threshold the distortion by n



where n



is a standard deviation of . However, it

is found from experiments that the distribution of is

not a Gaussian one, but is has a very sharp peak at zero.

The standard deviation of is used to select the thre-

shold Th, and (4) can be rewritten as:

(, )0

if p

SPx yotherwise









 (5)

The value of color ratio mentioned above may not

uniquely highlight the property of a particular material,

and there may be instable for particular values. To obtain

a robust segmentation result, the results from the color

ratio and multi-scale wavelet transformation method

proposed in [23] are combined.

5. Experimental Results

To confirm the effectiveness of the above proposed

method, we have conducted experiments with different

indoor and outdoor video sequences.

Figure 3. Detected foregrounds for indoor circumstances. The first c olumn is the first frame of the sequence s, the second col-

umn is the current frames (67#, and 300#, from top to bottom, respectively), and the third column is the detected foreground

masks with light grey pixels superimposed on the original image

100 Adaptive Motion Segmentation for Changing Background

Figure 4. Detected foregrounds in outdoor circumstances. The first column is the first frame of the sequences, the second

column is the current frames (69#, and 84#, from top to bottom, respectively), and the third column is the detected fore-

ground masks with light grey pixels supe r imposed on the original image

In the experiments, the threshold T used in (1) is cho-

sen as 100, which represents the minimum number of

expected background updating in consecutive frames.

The threshold T can be chosen lower in which more

background pixels can be updated. The selection of

lower T means more time to process background updat-

ing and some noises may drift into updating. The back-

ground reference image used is the first frame of se-

quence no matter objects enter the field of view before

captured or not.

The sequences with different indoor conditions in-

cluding the MPEG-4 test sequence Hall Monitor, aton

project test sequence intelligent room from http://cvrr.

ucsd.edu/aton/shadow have been used to test the devel-

oped algorithm. Some results are given in Figure 3.

In Figure 3, the results obtained by the developed ap-

proach are quite good, considering that the contrast be-

tween the object and background is very low.

In order to test the proposed approach further in out-

door conditions, the sequences with different outdoor

conditions including campus and highway from the

ATON project are used. The choice of such different

scenes is made to emphasize the reliability and robust-

ness of the proposed approach in outdoor circumstances.

Some results are shown in Figure 4.

In Figure 4, the results show the robustness of the

proposed algorithm to cope with different outdoor cir-

cumstances.

The above results show qualitative information about

the effectiveness of the developed approach. It is neces-

sary to quantitatively evaluate the performance of the

method with a ground-truth.

The main goal of the proposed method is not accurate

detection or discrimination of shadow pixels, but the

improvement of moving object detection because accu-

rate object detection is crucial for further applications.

The performance of moving object detection is measured

in term of correct segmentation rate (csr) and false seg-

mentation rate (fsr) as:

TPGTFP FN

csr fsr

TP GTGT







(6)

where TP (true positive) is the number of correctly de-

tected foreground pixels, GT (ground-truth) is the

ground-truth for the foreground, FP (false positive) is the

number of background falsely marked as foreground

pixels, FN (false negative) is the number of foreground

pixels falsely classified as background ones.

Using the csr and fsr measurements, the total correct

segmentation will have 100% by csr while the total

agreement with the ground-truth will have 0% by fsr.

The available sequence intelligent room and its ground-

truth data are available from http://cvrr.ucsd.edu/aton/

shadow. In our work, two recent proposed methods in-

cluding the invariant color features (ICF) proposed in [17]

and the ratio map (RM) developed in [22] are quantita-

tively evaluated together, and the results are shown in

Figure 5 for a comparison.

The symbols in the legend of Figure 5 refer to object

extraction results: ICF [17], RM [22], and the proposed

method. The mean values of csr corresponding to the

plots of Figure 5 are the following: RM 0.84, ICF 0.86,

proposed 0.98. The mean values of fsr are the following:

RM 3.42, ICF 4.11, proposed 1.61.The proposed method

outperforms the investigated methods with the best ob-

ject detection over time.

6. Conclusions

This paper has presented an approach to adaptive motion

segmentation and shadow suppression. The ghosts are

detected and removed by the developed background up-

date function, which prevents undesired corruption of the

background model and does not consider the adaptation

coefficient or the learning rate used in the literature. By

Adaptive Motion Segmentation for Changing Background 101

Figure 5. Comparison of video object segmentation rate for the test sequence intelligent room

comparison, it has been shown that the proposed method

outperforms the investigated methods and is robust and

efficient to detect moving objects during coping with

different indoor or outdoor circumstances.

7. Acknowledgements

This work in part is supported by the National Natural

Science Foundation of China (Grant No. 60872117).

REFERENCES

[1] W. Zhang, X. Z. Fang, X. K. Yang, and Q. M. J. Wu,

“Moving cast shadows detection using ratio edge,” IEEE

Transactions on Multimedia, Vol. 9, No. 6, pp. 1202-

1214, September 2007.

[2] C. Benedek and T. Sziranyi, “Study on color space selec-

tion for detecting cast shadows in video surveillance,”

International Journal of Imaging Systems and Technol-

ogy, Vol. 17, No. 3, pp. 190-201, October 2007.

[3] T. Horprasert, D. Harwood, and L. S. Davis, “A statistical

approach for real-time robust background subtraction and

shadow detection,” Proceedings of IEEE International

Conference on Computer Vision Frame-Rate Workshop,

pp. 1-19, September 1999.

[4] I. Haritaoglu, D. Harwood, and L. S. Davis, “W4: real

time surveillance of people and their activities,” IEEE

Transactions on Pattern Analysis and Machine Intelli-

gence, Vol. 22, No. 8, pp. 809-830, August 2000.

[5] J. Heikkila and O. Silven, “A real-time system for moni-

toring of cyclists and pedestrians,” Proceedings of 2nd

IEEE Workshop on Visual Surveillance, pp. 74-81, July

1999.

[6] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, and S.

Sirotti, “Improving shadow suppression in moving object

detection with HSV color information,” Proceedings of

IEEE International Conference on Intelligent Transporta-

tion Systems, pp. 334-339, August 2001.

[7] P. Spagnolo, T. D'Orazio, M. Leo, and A. Distante,

“Moving object segmentation by background subtraction

and temporal analysis,” Image and Vision Computing,

Vol. 24, No. 5, pp. 411-423, May 2006.

[8] C. Wren, A. Azabayejani, T. Darrell, and A. Pentland,

“Pfinder: Real-time tracking of the human body,” IEEE

Transactions on Pattern Analysis and Machine Intelli-

gence, Vol. 19, No. 7, pp. 780-785, July 1997.

[9] C. Stauffer and W. E. L. Grimson, “Adaptive background

mixture models for real-time tracking,” Proceedings of

Computer Vision and Pattern Recognition, Vol. 2, pp.

246-252, June 1999.

[10] Z. Zivkovic and F. van der Heijden, “Efficient adaptive

density estimation per image pixel for the task of back-

ground subtraction,” Pattern Recognition Letters, Vol. 27,

No. 7, pp. 773-780, May 2006.

[11] A. Elgammal, D. Harwood, and L. S. Davis, “Nonpara-

metric model for background subtraction,” Proceedings

of 6th European Conference on Computer Vision, Vol. 2,

pp. 751-767, June 2000.

[12] K. Kim, T. H. Chalidabhongse, D. Harwood, and L.

Davis, “Real-time foreground-background segmentation

using codebook model,” Real-Time Imaging, Vol. 11, No. 3,

pp. 172-185, June 2005.

[13] J. Stander, R. Mech, and J. Ostermann, “Detection of

moving cast shadows for object segmentation,” IEEE

Transactions on Multimedia, Vol. 1, No. 1, pp. 65-76,

March 1999.

[14] I. Mikic, P. C. Cosman, G. T. Kogut, and M. M. Trivedi,

“Moving shadow and object detection in traffic scenes,”

Proceedings of International Conference on Pattern Rec-

ognition, Vol. 1, pp. 321-324, September 2000.

[15] A. Prati, I. Mikic, M. M. Trivedi, and R. Cucchiara, “De-

tecting moving shadows: Formulation, algorithms and

evaluation,” IEEE Transactions on Pattern Analysis and

Machine Intelligence, Vol. 25, No. 7, pp. 918-924, July

2003.

[16] S. Tattersall and K. Dawson-Howe, “Adaptive shadow

identification through automatic parameter estimation in

102 Adaptive Motion Segmentation for Changing Background

video sequences,” Proceedings of Irish Machine Vision

and Image Processing, pp. 57-64, September 2003.

[17] E. Salvador, A. Cavallaro, and T. Ebrahimi, “Cast shad-

ow segmentation using invariant color features,” Com-

puter Vision and Image Understanding, Vol. 95, No. 2,

pp. 238-259, August 2004.

[18] E. A. Khan and E. Reinhard, “Evaluation of color spaces

for edge classification in outdoor scenes,” Proceedings of

International Conference on Image Processing, Vol. 3, pp.

952-955, September 2005.

[19] A. Cavallaro, E. Salvador, and T. Ebrahimi, “Detecting

shadows in image sequences,” Proceedings of IEEE

Conference on Visual Media Production, pp. 165-174,

March 2004.

[20] M. Kampel, H. Wildenauer, P. Blauensteiner, and A.

Hanbury, “Improved motion segmentation based on sha-

dow detection,” Electronic Letters on Computer Vision

and Image Analysis, Vol. 6, No. 3, pp. 1-12, December

2007.

[21] M. Heikkila and M. Pietikainen, “A texture-based method

for modeling the background and detecting moving ob-

jects,” IEEE Transactions on Pattern Analysis and Ma-

chine Intelligence, Vol. 28, No. 4, pp. 657-662, April

2006.

[22] P. Spagnolo, T. D. Orazio, M. Leo, and A. Distante, “Ad-

vances in shadow removing for motion detection algo-

rithms,” Proceedings of 2nd International Conference on

Vision Video Graphics, pp. 69-75, July 2005.

[23] Y. P. Guan, “Wavelet multi-scale transform based fore-

ground segmentation and shadow elimination,” The Open

Signal Processing Journal, Vol. 1, No. 6, pp. 1-6, Nove-

mber 2008.