
Arbitrary cylinder color model for codebook-based background subtraction

Open Access

Abstract

The codebook background subtraction approach is widely used in computer vision applications. One of its distinguishing features is the cylinder color model used to cope with illumination changes. The performance of the approach depends strongly on this color model. However, we have found that the model is valid only if the spectrum components of the light source change in the same proportion, which is not the case in many practical situations. In those cases, the performance of the approach degrades significantly. To tackle this problem, we propose an arbitrary cylinder color model with a highly efficient updating strategy. This model uses cylinders whose axes need not pass through the origin, so the cylinder color model is extended to much more general cases. Experimental results show that, with no loss of real-time performance, the proposed model reduces the wrong classification rate of the cylinder color model by more than fifty percent.

© 2014 Optical Society of America

1. Introduction

Background subtraction is a fundamental task in many computer vision applications, such as video surveillance, activity recognition, human motion analysis, and object tracking. The purpose of a background subtraction algorithm is to distinguish moving objects from the scene. Its basic principle can be formulated as building a model of the background and comparing this model with the current frame in order to identify non-stationary or new objects.

Natural scenes pose many challenges to background modeling, including light changes, waving trees, shadows, rippling water, snow, and rain. A large number of research papers dealing with these challenges have been published in the past decade (e.g., [1–21]). Among these, the codebook model [1] and its enhancements (e.g., [2, 14–21]) perform better than many state-of-the-art methods in terms of both segmentation accuracy and processing speed.

The codebook is a compact and compressed model capable of background modeling over long periods. One of its distinguishing features is the cylindrical color (CY) model used to cope with illumination changes. This model was developed based on the observation that, under lighting variation, pixel values are mostly distributed in an elongated shape along an axis going toward the origin in RGB color space. The CY model, or its variants (e.g., [18, 19, 21]), has been widely used.

However, we have found that the CY model is valid only if the spectrum components of the light source change in the same proportion. This is not true in many practical cases, where the spectrum components of the light source do not vary in proportion. In these cases, the CY model is inaccurate and much less efficient.

To tackle this problem, we propose an arbitrary cylinder color (AC) model with a highly efficient updating strategy. This model uses cylinders whose axes need not pass through the origin, so the CY model is extended to more general cases. With no loss of real-time performance, the AC model reduces the wrong classification rate of the CY model by more than 50%. Furthermore, it should be pointed out that the CY model is only a special case of the AC model.

Another, minor, contribution of this paper lies in a modified forgetting and strengthening strategy. The original codebook adopts a conservative update policy that does not allow the creation of new codewords outside the learning stage. To deal with new backgrounds, many researchers have introduced the concept of a foreground model (e.g., [2, 18, 20]). In their works, the foreground model and the background model are treated separately. Since the background and foreground models have similar representations, we merge the two models and remove some redundant operations to accelerate the calculations.

The original codebook background subtraction algorithm is reviewed in Section 2. The proposed AC model is introduced in Section 3. Its construction details are given in Section 4. The AC model based codebook background subtraction is summarized in Section 5. In Section 6, experiments are conducted to compare the performance of the proposed AC model with that of the CY model. Comparisons with two other typical generalizations of the CY model [18, 21] are also given. Finally, the paper is concluded in Section 7.

2. Original codebook

Original codebook background subtraction is a pixel-based approach. Each pixel is modelled by a codebook containing one or more codewords. Each codeword is represented by an RGB vector $\mathbf{v}_i = (\bar{R}_i, \bar{G}_i, \bar{B}_i)$ and a 6-tuple $aux_i = \langle \check{I}_i, \hat{I}_i, f_i, \lambda_i, p_i, q_i \rangle$, where $\check{I}$ and $\hat{I}$ are the minimum and maximum brightness, respectively; $f$ is the frequency with which the codeword has occurred; $\lambda$ is the maximum negative run-length, i.e., the longest time interval during which the codeword is not accessed; and $p$ and $q$ are the first and last access times of the codeword, respectively. A codebook for each pixel can be constructed as follows:

  • 1) Find a codeword in the codebook matching the current pixel value based on two conditions a) and b).
    • a) The Euclidean distance of the pixel value $\mathbf{x} = (R, G, B)$ from the line represented by the vector $\mathbf{v}_i = (\bar{R}_i, \bar{G}_i, \bar{B}_i)$ is less than a color threshold $T_C$.
    • b) The intensity $I$ of the pixel value lies in the range $[\check{I}_i - T_I,\ \hat{I}_i + T_I]$, where $T_I$ is an intensity threshold.
  • 2) If no match is found, label the pixel as foreground and create a new codeword by setting
    • $\mathbf{v}_{new} = (R, G, B)$,
    • $aux_{new} = \langle I, I, 1, t-1, t, t \rangle$, where $t$ is the current frame number.
  • 3) Otherwise, label the pixel as background and update the corresponding codeword as follows:
    • $\mathbf{v}_i = \left( \dfrac{f_i \bar{R}_i + R}{f_i + 1},\ \dfrac{f_i \bar{G}_i + G}{f_i + 1},\ \dfrac{f_i \bar{B}_i + B}{f_i + 1} \right)$,
    • $aux_i = \langle \min\{\check{I}_i, I\},\ \max\{\hat{I}_i, I\},\ f_i + 1,\ \max\{\lambda_i, t - q_i\},\ p_i,\ t \rangle$.

For each codeword in the codebook, if $\max\{\lambda_i,\ t - q_i + p_i - 1\} \ge t/2$, delete the codeword from the codebook.
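To make these rules concrete, the following minimal Python (NumPy) sketch implements the matching and update steps above for a single pixel. It is our own illustration, not the authors' implementation: the class and function names are ours, and taking the Euclidean norm of the RGB vector as the brightness $I$ is an assumption.

```python
import numpy as np

class Codeword:
    """Illustrative codeword: RGB mean v, brightness range, and the
    bookkeeping fields (f, lambda, p, q) of the 6-tuple in Section 2."""
    def __init__(self, rgb, t):
        self.v = np.asarray(rgb, dtype=float)
        self.I_min = self.I_max = float(np.linalg.norm(rgb))
        self.f, self.lam, self.p, self.q = 1, t - 1, t, t

def color_distance(x, v):
    # Perpendicular distance from pixel x to the line through the origin
    # along v (condition a)).
    x, v = np.asarray(x, float), np.asarray(v, float)
    return np.linalg.norm(x - np.dot(x, v) / np.dot(v, v) * v)

def match_and_update(codebook, x, t, T_C=10.0, T_I=20.0):
    """Return True (background) if some codeword matches and update it;
    otherwise append a new codeword and return False (foreground)."""
    I = np.linalg.norm(x)
    for cw in codebook:
        if (color_distance(x, cw.v) < T_C
                and cw.I_min - T_I <= I <= cw.I_max + T_I):
            cw.v = (cw.f * cw.v + np.asarray(x, float)) / (cw.f + 1)
            cw.I_min, cw.I_max = min(cw.I_min, I), max(cw.I_max, I)
            cw.lam = max(cw.lam, t - cw.q)
            cw.f, cw.q = cw.f + 1, t
            return True
    codebook.append(Codeword(x, t))
    return False
```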

3. Arbitrary cylinder color model

It was first reported in [1] that, under lighting variation, pixel values are mostly distributed in an elongated shape along an axis going toward the origin in RGB color space. In that experiment, pixels were sampled from an image sequence of a color chart, and the illumination was changed by decreasing or increasing the light strength to make the pixel values darker or brighter. Based on this observation, the authors developed the CY model. This color model, or its variants (e.g., [18, 19, 21]), has been widely used.

However, this model implies that the spectrum components of the light source change in the same proportion, which is not true in many practical cases. For example, suppose there are three light sources: a pure red one, a pure green one, and a pure blue one. If we only decrease or increase the brightness of the red light, the pixel values can never move along an axis going toward the origin.

Such complex illumination conditions exist in many environments, as illustrated in Fig. 1. The test sequence in the first row was captured by an IP camera at a junction of three roads. The test sequence in the second row was captured by a camera near a harbor. The test sequence in the third row was downloaded from the PETS2007 website. Pictures in the first column are screenshots. Pixels chosen for observation are marked by small crosses centered in red boxes.

Fig. 1 Pixel value distribution under complex illumination changes. (The figure in the second row, first column is from Innovative Security Designs (ISD), printed with permission.)

In these scenarios, both direct light (e.g., from the sun or light-bulbs) and diffuse light contribute to the background illumination. But the variations of each light source would not be in proportion. The sampled pixel value distributions are shown in the second column of Fig. 1. It can be seen that the distributions are not along axes through the origin.

If we use the CY model to represent the background pixel value distribution, as illustrated in the third column of Fig. 1, many small cylinders must be used. This makes the computation much slower. In addition, since those cylinders cover much unwanted space, inaccuracy would occur. If we instead use cylinders whose axes do not pass through the origin, as shown in the last column of Fig. 1, a single cylinder is enough to model the background, and the result of background subtraction is much more accurate.

Based on the discussion above, to extend the CY model to more general cases, we developed the AC color model. As shown in Fig. 2, this model is represented by its two controlling points $A = (A_r, A_g, A_b)$ and $B = (B_r, B_g, B_b)$.

Fig. 2 Arbitrary cylinder color model.

For an observed pixel value $P = (P_r, P_g, P_b)$, the color distortion is described by two parameters $\{x, y\}$. Parameter $x$ is the distance from $P$ to $A$ along $\overline{AB}$, which can be calculated by

$$x = \frac{\langle B - A,\ P - A \rangle}{\| B - A \|} \tag{1}$$

in which $\langle \cdot, \cdot \rangle$ is the dot product and $\| \cdot \|$ is the norm operator. Parameter $y$ is the perpendicular distance from $P$ to $\overline{AB}$, which can be calculated by

$$y = \left\| (P - A) - x\,\frac{B - A}{\| B - A \|} \right\| \tag{2}$$

The incoming pixel $P$ is said to be within the AC model (i.e., it is classified as background) if it meets the following conditions

$$\begin{cases} y \le T_C \\ -T_I \le x \le \| B - A \| + T_I \end{cases} \tag{3}$$

where $T_C$ (the color threshold) and $T_I$ (the intensity threshold) are positive numbers.
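A minimal Python (NumPy) sketch of Eqs. (1)–(3) is given below; the function names and the default threshold values are ours, not the authors'.

```python
import numpy as np

def ac_distortion(P, A, B):
    # Eqs. (1)-(2): x is the distance of P's projection from A along AB;
    # y is the perpendicular distance from P to the line through A and B.
    P, A, B = (np.asarray(v, float) for v in (P, A, B))
    n = (B - A) / np.linalg.norm(B - A)   # unit axis direction
    x = np.dot(n, P - A)
    y = np.linalg.norm((P - A) - x * n)
    return x, y

def in_ac_model(P, A, B, T_C=10.0, T_I=20.0):
    # Eq. (3): membership test; threshold defaults are illustrative only.
    x, y = ac_distortion(P, A, B)
    L = np.linalg.norm(np.asarray(B, float) - np.asarray(A, float))
    return y <= T_C and -T_I <= x <= L + T_I
```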

4. AC model construction

In this section, we deal with how the proposed AC model is initialized, updated, and forgotten (or strengthened).

4.1. AC model initialization

Suppose $P$ is the observed pixel value. If a new AC model has to be established, it is initialized as follows (a short code sketch follows the list).

  • 1) $A = \dfrac{\| P \| - T_I}{\| P \|}\, P$.
  • 2) $B = \dfrac{\| P \| + T_I}{\| P \|}\, P$.
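A corresponding sketch (function name ours): both controlling points are placed on the ray from the origin through $P$, at brightness $\|P\| - T_I$ and $\|P\| + T_I$.

```python
import numpy as np

def init_ac_model(P, T_I=20.0):
    # Steps 1)-2): the initial axis points toward the origin and has
    # length 2 * T_I, centered on the observed brightness ||P||.
    P = np.asarray(P, float)
    n = np.linalg.norm(P)
    return P * (n - T_I) / n, P * (n + T_I) / n
```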

4.2. Updating the AC model over time

Suppose the background pixel values lie in the neighborhood of the line segment $\overline{AB}$. We could obtain the equation of $\overline{AB}$ by least squares estimation from the sampled pixel values. However, this approach requires storing a large number of pixel values, and the calculation is time-consuming. To simplify the calculations, we make the approximation that the sampled pixel values are evenly distributed along the axis of the AC model.

Let the new pixel value be $P$. Then the updated line segment $\overline{A^*B^*}$ must lie in the plane defined by $\overline{AB}$ and $P$. Hence, the updating problem can be reduced from 3D space to 2D space. The 2D coordinate system is defined as follows and shown in Fig. 3.

Fig. 3 AC model updating scheme.

  • 1) Let the point $A$ be the origin.
  • 2) Let the direction of the x-axis be along the vector $B - A$.
  • 3) Let the direction of the y-axis be along the vector $(P - A) - x\,(B - A)/\| B - A \|$, where $x$ is the distance from $P$ to $A$ along $\overline{AB}$.

The points $A$, $B$, and $P$ can be represented in the 2D system as

$$a = \begin{bmatrix} 0 & 0 \end{bmatrix}^T \tag{4}$$

$$b = \begin{bmatrix} \| B - A \| & 0 \end{bmatrix}^T \tag{5}$$

$$p = \begin{bmatrix} x & y \end{bmatrix}^T \tag{6}$$

where $y$ is the distance from $P$ to $\overline{AB}$. In the 2D coordinate system, the new line segment $\overline{a^*b^*}$ can be estimated from the original $m$ points and the new point $p$ together.

Let the new line equation be

$$y = kx + t \tag{7}$$

The parameters $\{k, t\}$ can be obtained as the least squares solution of Eq. (8)

$$\begin{bmatrix} 0 & 1 \\ \frac{b}{m-1} & 1 \\ \vdots & \vdots \\ \frac{(m-1)b}{m-1} & 1 \\ x & 1 \end{bmatrix} \begin{bmatrix} k \\ t \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ y \end{bmatrix} \tag{8}$$

where $b = \| B - A \|$ and the $m$ existing points are assumed evenly spaced along the x-axis. The solution is

$$\begin{bmatrix} k \\ t \end{bmatrix} = \begin{bmatrix} \dfrac{m(2m-1)b^2}{6(m-1)} + x^2 & \dfrac{mb}{2} + x \\ \dfrac{mb}{2} + x & m + 1 \end{bmatrix}^{-1} \begin{bmatrix} xy \\ y \end{bmatrix} \tag{9}$$
To determine the endpoints of the new line segment, we calculate the projections of the points $a$, $b$, and $p$ onto the newly estimated line and choose the two most distant projections as the endpoints $a^*$ and $b^*$. The formula for the projection of a point $\{x, y\}$ onto a line $y = kx + t$ is given in Eq. (10)

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} k & -1 \\ 1 & k \end{bmatrix}^{-1} \begin{bmatrix} -t \\ x + ky \end{bmatrix} \tag{10}$$

where $\{x', y'\}$ is the projection.

Finally, we must transform $a^*$ and $b^*$ from the 2D coordinate system back to the 3D coordinate system. The unit vector in 3D space representing the direction of the x-axis in 2D space is

$$\mathbf{n}_x = \frac{B - A}{\| B - A \|} \tag{11}$$

The unit vector in 3D space representing the direction of the y-axis in 2D space is

$$\mathbf{n}_y = \frac{(P - A) - x\,\mathbf{n}_x}{\| (P - A) - x\,\mathbf{n}_x \|} \tag{12}$$

Hence, the endpoints of the new line segment $\overline{A^*B^*}$ in the 3D coordinate system can be calculated by

$$A^* = \begin{bmatrix} \mathbf{n}_x & \mathbf{n}_y \end{bmatrix} a^* \tag{13}$$

$$B^* = \begin{bmatrix} \mathbf{n}_x & \mathbf{n}_y \end{bmatrix} b^* \tag{14}$$
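The whole update step is sketched below in Python (NumPy). The function names are ours and $m \ge 2$ is assumed; since the 2D frame of Fig. 3 is anchored at $A$, the sketch translates the recovered endpoints back by $A$ when returning to RGB space.

```python
import numpy as np

def solve_2d(x, y, b, m):
    # Eq. (9): closed-form least squares fit of y = k*u + t, assuming m
    # previous samples evenly spaced on [0, b] plus the new point (x, y).
    M = np.array([[m * (2*m - 1) * b**2 / (6*(m - 1)) + x**2, m*b/2 + x],
                  [m*b/2 + x,                                 m + 1.0]])
    k, t = np.linalg.solve(M, np.array([x*y, y]))
    # Eq. (10): project a = (0,0), b = (b,0), p = (x,y) onto the new line.
    Q = np.array([[k, -1.0], [1.0, k]])
    proj = [np.linalg.solve(Q, np.array([-t, u + k*v]))
            for u, v in ((0.0, 0.0), (b, 0.0), (x, y))]
    # Keep the two most distant projections as the new 2D endpoints
    # (no ordering of the endpoints is enforced in this sketch).
    pairs = [(np.linalg.norm(p1 - p2), p1, p2)
             for i, p1 in enumerate(proj) for p2 in proj[i+1:]]
    _, a_star, b_star = max(pairs, key=lambda e: e[0])
    return a_star, b_star

def update_ac_model(A, B, P, m=10):
    # Full update step of Subsection 4.2: build the 2D frame (Eqs. (4)-(6),
    # (11)-(12)), solve in 2D, then map back to RGB space (Eqs. (13)-(14)).
    A, B, P = (np.asarray(v, float) for v in (A, B, P))
    b = np.linalg.norm(B - A)
    n_x = (B - A) / b
    x = np.dot(n_x, P - A)
    r = (P - A) - x * n_x
    y = np.linalg.norm(r)
    if y < 1e-9:              # P lies on the axis; the segment is unchanged
        return A, B
    n_y = r / y
    a_star, b_star = solve_2d(x, y, b, m)
    basis = np.column_stack([n_x, n_y])   # 3x2 matrix [n_x  n_y]
    # The 2D frame is anchored at A, so we translate back by A.
    return A + basis @ a_star, A + basis @ b_star
```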

This updating process can be accelerated by adopting a look-up table method. That is, for each $\{x, y, \| B - A \|\}$, we pre-compute $a^*$ and $b^*$ and store them in a table. Looking the values up in the table substantially reduces the time needed to calculate $a^*$ and $b^*$. The ranges of $\{x, y, \| B - A \|\}$ are set as follows:

$$\begin{cases} 2T_I \le \| B - A \| \le 200 \\ -T_I \le x \le l_{\max} + T_I \\ 0 \le y \le T_C \end{cases} \tag{15}$$
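One possible realization (ours, not necessarily the authors' table layout) is a memoized cache over a quantized grid of the triple, reusing `solve_2d` from the sketch above; the grid step and the fixed update rate `m` are assumed parameters.

```python
_lut = {}  # keyed by the quantized triple (x, y, ||B - A||)

def solve_2d_cached(x, y, b, m=10, step=1.0):
    # Quantize the continuous inputs to a grid so that updates with similar
    # geometry reuse one precomputed (a*, b*) pair instead of re-solving.
    key = (round(x / step), round(y / step), round(b / step))
    if key not in _lut:
        qx, qy, qb = (k * step for k in key)
        _lut[key] = solve_2d(qx, qy, qb, m)  # solve_2d from the sketch above
    return _lut[key]
```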

It should be noted that if only one light source exists, the axis of the AC model points to the origin. This is the same as the CY model. Therefore, the CY model could be considered as a special case of the AC model.

4.3. Forgetting and strengthening process

If a codeword was rarely used, it should be discarded (or forgotten). On the contrary, if a codeword was used constantly, it should be strengthened.

The original codebook adopts a conservative update policy that does not allow the creation of new codewords outside the learning stage. To deal with new backgrounds, researchers have introduced the concept of a foreground model (e.g., [2, 18, 20]). In their works, the foreground model and the background model are treated separately. However, the representations of the two models are identical, so it is reasonable to merge them and remove some redundant operations in order to accelerate the calculations.

In the proposed algorithm, we use only two parameters $\{oc, no\}$ to control the forgetting and strengthening processes. The parameter $oc$ represents the number of times a model has been updated since its establishment, and $no$ represents the number of frames for which a model has not been updated since its latest update. The corresponding forgetting and strengthening criteria are given below, followed by a short code sketch.

  • 1) If a codeword is added to the codebook for the first time, let $oc = 1$ and $no = 0$.
  • 2) If a codeword is matched by an observed pixel value, let $oc = oc + 1$ and $no = 0$.
    • ● If $oc \ge T_{oc}$, consider this pixel as background.
    • ● If $oc < T_{oc}$, consider this pixel as foreground.
  • 3) For each codeword, let $no = no + 1$.
    • ● If $no \ge oc \cdot T_{no}$, delete this codeword.
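A minimal sketch of these rules; the class name and the threshold defaults are ours (placeholders, not the values of Table 1).

```python
class ACState:
    def __init__(self):
        self.oc, self.no = 1, 0      # rule 1): a freshly created codeword

def on_match(cw, T_oc=10):
    # Rule 2): strengthen the matched codeword and classify the pixel.
    cw.oc += 1
    cw.no = 0
    return cw.oc >= T_oc             # True -> background, False -> foreground

def age_and_prune(codebook, T_no=5):
    # Rule 3): age every codeword, then forget the rarely used ones.
    for cw in codebook:
        cw.no += 1
    return [cw for cw in codebook if cw.no < cw.oc * T_no]
```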

5. AC model based codebook

A summary of the proposed AC model based codebook background subtraction algorithm is given below; a consolidated code sketch follows the list. The codeword used in this paper has the form $\{A, B, oc, no\}$, where $A = (A_r, A_g, A_b)$ and $B = (B_r, B_g, B_b)$.

  • 1) For an observed pixel value, find a matching codeword in the codebook meeting the following conditions.
    • $-T_I \le x \le \| B - A \| + T_I$.
    • $y \le T_C$.
  • 2) If no matching codeword is found, label the pixel as foreground and create a new codeword by setting:
    • $A = \dfrac{\| P \| - T_I}{\| P \|}\, P$.
    • $B = \dfrac{\| P \| + T_I}{\| P \|}\, P$.
    • $oc = 1$.
    • $no = 0$.
  • 3) Otherwise, update the corresponding codeword as follows:
    • $A = \begin{bmatrix} \mathbf{n}_x & \mathbf{n}_y \end{bmatrix} a^*$.
    • $B = \begin{bmatrix} \mathbf{n}_x & \mathbf{n}_y \end{bmatrix} b^*$.
    • $oc = oc + 1$.
    • $no = 0$.
    • ● If $oc \ge T_{oc}$, consider this pixel as background.
    • ● If $oc < T_{oc}$, consider this pixel as foreground.
  • 4) For each codeword, let $no = no + 1$.
    • ● If $no \ge oc \cdot T_{no}$, delete this codeword from the codebook.
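Putting the pieces together, a per-frame driver might look like the following sketch. It relies on `in_ac_model`, `init_ac_model`, and `update_ac_model` from the earlier snippets; all names, the dict-based codeword layout, and the threshold defaults are our assumptions, not the authors' implementation.

```python
import numpy as np

def segment_frame(frame, codebooks, m=10, T_C=10.0, T_I=20.0,
                  T_oc=10, T_no=5):
    """One AC-codebook pass over an H x W x 3 RGB frame. `codebooks` is an
    H x W nested list of codeword dicts {'A', 'B', 'oc', 'no'}; returns a
    boolean foreground mask."""
    H, W, _ = frame.shape
    fg = np.zeros((H, W), dtype=bool)
    for i in range(H):
        for j in range(W):
            P = frame[i, j].astype(float)
            cb = codebooks[i][j]
            hit = None
            for cw in cb:                                      # step 1)
                if in_ac_model(P, cw['A'], cw['B'], T_C, T_I):
                    hit = cw
                    break
            if hit is None:                                    # step 2)
                A, B = init_ac_model(P, T_I)
                hit = {'A': A, 'B': B, 'oc': 1, 'no': 0}
                cb.append(hit)
                fg[i, j] = True
            else:                                              # step 3)
                hit['A'], hit['B'] = update_ac_model(hit['A'], hit['B'], P, m)
                hit['oc'] += 1
                hit['no'] = 0
                fg[i, j] = hit['oc'] < T_oc
            for cw in cb:                                      # step 4)
                if cw is not hit:    # the touched codeword was just refreshed
                    cw['no'] += 1
            codebooks[i][j] = [cw for cw in cb
                               if cw['no'] < cw['oc'] * T_no]
    return fg
```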

6. Experiments

In this section, we compare the performance of the proposed AC model with that of the CY model and of two other generalizations of the CY model [18, 21]: the cone (CO) color model and the box (BO) color model. All color models are implemented in the same codebook framework. In addition, the thresholds used for color, intensity, and recurrence are identical across all color models.

All models are tested on a variety of challenging indoor and outdoor environments, including a subway hall, a bridge, a harbor, a road junction, and a city center. The first sequence (subway hall) was extracted from the PETS2007 public data set (data set 8, camera 2). The second and third sequences (bridge and harbor) were captured by HD cameras. The fourth sequence was grabbed with an IP camera. The last sequence (city center) was extracted from the PETS2009 public data set (city center, view 2). All test sequences were resampled to a resolution of 500×300 pixels.

The difficulties for foreground detection in these scenes are caused by dynamic backgrounds, local and global illumination changes, training without a clear background, camouflaged foreground objects, and noise due to video recording.

Several metrics are used to assess performance: the true positive count (TP), the false positive count (FP), the true negative count (TN), the false negative count (FN), and the time cost (TC) in milliseconds. All ground truth segmentations were produced manually.
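For reference, the per-frame counts can be computed from boolean foreground masks as in this short sketch (names ours):

```python
import numpy as np

def confusion_counts(pred_fg, gt_fg):
    # Pixel-wise TP/FN/FP/TN against a manually produced ground-truth mask.
    pred_fg, gt_fg = np.asarray(pred_fg, bool), np.asarray(gt_fg, bool)
    tp = int(np.sum(pred_fg & gt_fg))
    fn = int(np.sum(~pred_fg & gt_fg))
    fp = int(np.sum(pred_fg & ~gt_fg))
    tn = int(np.sum(~pred_fg & ~gt_fg))
    return tp, fn, fp, tn   # FN + FP = number of wrong classifications
```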

6.1. Parameter settings

The matching and updating schemes for the CY and the AC models can be found in Section 2 and Subsection 4.2, respectively. The matching and updating schemes for the CO and the BO models can be found in [18, 21]. All color models adopt the same forgetting and strengthening strategy given in Subsection 4.3. The parameters used in the tests are given in Table 1 and Table 2.

Table 1. Parameter settings for the AC and the CY models

Table 2. Parameter settings for the CO and the BO models

In Table 1, $T_C$ is the color threshold, $T_I$ is the intensity threshold, $T_{oc}$ is the occurrence threshold, $T_{no}$ is the non-occurrence threshold, and $m$ (or $f$) is the updating rate.

In Table 2, $\alpha$ is the learning rate, $\theta_{color}$ is the threshold for the color direction error, $\beta$ and $\gamma$ together determine the upper and lower bounds of the intensity variations, and $\varepsilon$ is the detection threshold.

6.2. Qualitative comparisons

Figure 4 shows background subtraction examples for one typical frame of each sequence. Pictures in the first column are screenshots; pictures in the second column are ground truths. The segmentation results using the CY model and the AC model are shown in the third and fourth columns, respectively. The segmentation results using the CO model and the BO model are shown in the fifth and sixth columns, respectively. It can be seen that the AC model performs much better than the CY model and the other models, especially in outdoor environments with complex illumination conditions.

Fig. 4 Comparative background segmentation for typical frames taken from five sequences. (The figures in the second and third rows, first column are from Innovative Security Designs (ISD), printed with permission.)

6.3. Quantitative evaluations

Calculations were performed on a 3.2 GHz i5 processor, and the algorithms were implemented in the LabVIEW environment. We selected 5 frames from each sequence, giving 25 frames for evaluation.

Table 3 records the TP, the FN, the FP, the TN, and the TC for each of the 25 frames. The first column indicates the frame number of each chosen frame. The scores in columns 2-6 are for the AC model and the scores in columns 7-11 are for the CY model. The last row of this table shows the average value for each metric.

Table 3. Performance metrics for the AC and the CY color models

Table 4 records the TP, the FN, the FP, the TN, and the TC for the CO and the BO color models. The scores in columns 2-6 are for the CO model and the scores in columns 7-11 are for the BO model. The last row of this table shows the average value for each metric.

Table 4. Performance metrics for the CO and the BO color models

The sum of the FN and the FP indicates the number of wrong classifications. It is evident that the AC model reduces the wrong classification rate of the CY model by more than 50%, while the time costs of the two models are nearly identical.

The detection rate (indicated by TN) of the CO color model (or the BO model) is only slightly higher than that of the AC model. However, the wrong classification rate of the CO model is more than three times that of the AC model, and the wrong classification rate of the BO model is more than five times that of the AC model.

7. Conclusions

We have found that the CY model used in codebook background subtraction is inaccurate in many practical cases. In fact, the CY model is valid only if the spectrum components of the light source change in the same proportion; both theoretical and experimental results support this point. To extend the CY model to more general cases, we proposed the AC model.

First, we represent the AC model by two endpoints of its axis. Then, a highly efficient updating scheme is developed by considering a linear regression problem. To accelerate the updating process, a look-up table method is proposed. We also simplify the forgetting and strengthening process to reduce time costs.

With no loss of real-time performance, the proposed model reduces the wrong classification rate of the CY model by more than 50%. It should be pointed out that the CY model is only a special case of the AC model. Hence, we expect that the proposed AC model could be a better substitute for the CY model in all codebook background subtraction applications.

Acknowledgments

We are grateful to the reviewers for their constructive suggestions. We are also very grateful for helpful discussions with P. P. Shi (Xidian University) on the AC model updating strategy used in this paper.

References and links

1. K. Kim, T. H. Chalidabhongse, D. Harwood, and L. Davis, “Background modeling and subtraction by codebook construction,” in Proceedings of IEEE Conference on Image Processing (IEEE, 2004), pp. 3061–3064.

2. K. Kim, T. H. Chalidabhongse, D. Harwood, and L. Davis, “Real-time foreground-background segmentation using codebook model,” Real-time Imaging 11(3), 172–185 (2005). [CrossRef]  

3. A. Elgammal, R. Duraiswami, D. Harwood, and L. S. Davis, “Background and foreground modeling using nonparametric kernel density estimation for visual surveillance,” Proc. IEEE 90(7), 1151–1163 (2002). [CrossRef]  

4. L. Y. Li, W. M. Huang, I. Y. H. Gu, and Q. Tian, “Statistical modeling of complex backgrounds for foreground object detection,” IEEE Trans. Image Process. 13(11), 1459–1472 (2004). [CrossRef]   [PubMed]  

5. D. S. Lee, “Effective Gaussian mixture learning for video background subtraction,” IEEE Trans. Pattern Anal. Mach. Intell. 27(5), 827–832 (2005). [CrossRef]   [PubMed]  

6. Z. Zivkovic and F. Heijden, “Efficient adaptive density estimation per image pixel for the task of background subtraction,” Pattern Recognit. Lett. 27(7), 773–780 (2006). [CrossRef]  

7. M. Heikkilä and M. Pietikäinen, “A texture-based method for modeling the background and detecting moving objects,” IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 657–662 (2006). [CrossRef] [PubMed]

8. L. Maddalena and A. Petrosino, “A self-organizing approach to background subtraction for visual surveillance applications,” IEEE Trans. Image Process. 17(7), 1168–1177 (2008). [CrossRef]   [PubMed]  

9. D. M. Tsai and S. C. Lai, “Independent component analysis-based background subtraction for indoor surveillance,” IEEE Trans. Image Process. 18(1), 158–167 (2009). [CrossRef]   [PubMed]  

10. O. Barnich and M. Van Droogenbroeck, “ViBe: a universal background subtraction algorithm for video sequences,” IEEE Trans. Image Process. 20(6), 1709–1724 (2011). [CrossRef]   [PubMed]  

11. S. Kwak, G. Bae, and H. Byun, “Moving-object segmentation using a foreground history map,” J. Opt. Soc. Am. A 27(2), 180–187 (2010). [CrossRef]   [PubMed]  

12. C. Cuevas, R. Mohedano, and N. García, “Adaptable Bayesian classifier for spatiotemporal nonparametric moving object detection strategies,” Opt. Lett. 37(15), 3159–3161 (2012). [CrossRef]   [PubMed]  

13. A. Elkabetz and Y. Yitzhaky, “Background modeling for moving object detection in long-distance imaging through turbulent medium,” Appl. Opt. 53(6), 1132–1141 (2014). [CrossRef]   [PubMed]  

14. Y. B. Li, F. Chen, W. L. Xu, and Y. T. Du, “Gaussian-based codebook model for video background subtraction,” Adv. Nat. Comput. 2, 762–765 (2006). [CrossRef]  

15. M. H. Sigari and M. Fathy, “Real-time background modeling/subtraction using two-layer codebook model,” in Proceedings of the International Multi-Conference of Engineers and Computer Scientists (IAENG, 2008), pp. 717–720.

16. A. Ilyas, M. Scuturici, and S. Miguet, “Real time foreground-background segmentation using a modified codebook model,” in Proceedings of IEEE Conference on Advanced Video and Signal Based Surveillance (IEEE, 2009), pp. 454–459. [CrossRef]  

17. M. J. Wu and X. R. Peng, “Spatio-temporal context for codebook-based dynamic background subtraction,” AEU, Int. J. Electron. Commun. 64(8), 739–747 (2010). [CrossRef]  

18. J. M. Guo, Y. F. Liu, C. H. Hsia, M. H. Shih, and C. S. Hsu, “Hierarchical method for foreground detection using codebook model,” IEEE Trans. Circ. Syst. Video Tech. 21(6), 804–815 (2011). [CrossRef]  

19. I. T. Sun, S. C. Hsu, and C. L. Huang, “A hybrid codebook background model for background subtraction,” in Workshop of IEEE Conference on Signal Processing Systems (IEEE, 2011), pp. 96–101. [CrossRef]  

20. M. Shah, J. Deng, and B. Woodford, “Enhanced codebook model for real-time background subtraction,” Neural Information Processing 3, 449–458 (2011).

21. Q. Tu, Y. Xu, and M. Zhou, “Box-based codebook model for real-time objects detection,” in Proceedings of IEEE Conference on Intelligent Control and Automation (IEEE, 2008), pp. 7621–7625.
