Wireless Sensor Network, 2010, 2, 328-336
doi:10.4236/wsn.2010.24044 Published Online April 2010 (http://www.SciRP.org/journal/wsn)
Copyright © 2010 SciRes. WSN

Very Low Bit-Rate Video Coding by Combining H.264/AVC Standard and 2D Discrete Wavelet Transform

Ali Aghagolzadeh¹,², Saeed Meshgini¹, Mehdi Nooshyar¹, Mehdi Aghagolzadeh¹
¹Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
²Iranian Telecommunication Research Center (ITRC), Tehran, Iran
Email: aghagol@tabrizu.ac.ir, saeed_meshgini@tabrizu.ac.ir, nooshyar@tabrizu.ac.ir
Received October 25, 2009; revised November 11, 2009; accepted February 16, 2010

Abstract

In this paper, we propose a new method for very low bit-rate video coding that combines the H.264/AVC standard and the two-dimensional discrete wavelet transform. In this method, a two-dimensional wavelet transform is first applied to each video frame independently to extract its low frequency components, and the low frequency parts of all frames are then coded using the H.264/AVC codec. The high frequency parts of the video frames are coded by a Run Length Coding algorithm after a threshold has been applied to discard the low-valued coefficients. Experiments show that the proposed method achieves better rate-distortion performance at very low bit rates, below 16 kbits/s, than applying the H.264/AVC standard directly to all frames. Applications of the proposed video coding technique include video telephony, videoconferencing, and transmitting or receiving video over half-rate traffic channels of GSM networks.

Keywords: Video Coding, H.264/AVC Standard, Run Length Coding, Two-Dimensional Wavelet Transform

1. Introduction

The demand for video transmission and delivery over both high and low bandwidth channels has been accelerating. High bandwidth applications include digital video by satellite (DVS) and high-definition television (HDTV).
Low bandwidth applications are dominated by transmission over the Internet, where the majority of modems work at speeds below 56 kbits/s [1]. On the other hand, representing video material in digital form requires a large number of bits. The volume of data generated by digitising a video signal is too large for most transmission systems, which means that compression is essential for most digital video applications. An efficient and well-designed video compression system gives very significant performance advantages for visual communication at both low and high transmission bandwidths. At low bandwidths, compression enables applications that would not otherwise be possible, such as basic-quality video telephony over a standard telephone connection. At high bandwidths, compression can support a much higher visual quality. Video compression and video codecs will therefore remain a vital part of emerging multimedia applications for the foreseeable future, allowing designers to make the most efficient use of the available transmission capacity.

The development of video coding technology since 1980 has been bound up with a series of international standards for video compression. Each of these standards supports a particular application of video coding (or a set of applications), such as videoconferencing and digital television [2]. H.264/AVC is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The goals of this standardization effort were enhanced compression efficiency and a network-friendly video representation for both interactive (video telephony) and non-interactive (broadcast, streaming, storage and video on demand) applications [3]. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to the previous standards [4].
However, the H.264/AVC standard, like the previous video coding standards, produces a number of unacceptable artifacts, such as blockiness, when operated at very low bit rates. Hence, there is a need for new techniques that improve the coding efficiency and produce acceptable video quality in very low bit-rate applications.

In this paper, a new video compression method for very low bit-rate coding is proposed. The main goal of this paper is to enhance the compression efficiency (rate-distortion performance) in very low bit-rate applications such as videoconferencing and video telephony. This is achieved by combining the H.264/AVC standard and the two-dimensional discrete wavelet transform. Experiments show that the H.264/AVC standard, like the other video coding standards, codes the low frequency components (the general structure) of video frames well, but has difficulty encoding the details of objects in video streams, such as boundaries and edges. Since the techniques employed in this standard use only the statistical dependencies in the video signal at a block level and do not consider the semantic content of the video, artifacts are introduced at the block boundaries at very low bit rates (high quantization factors). Usually these block boundaries do not correspond to the physical boundaries of the moving objects and hence visually annoying artifacts are introduced [5]. This problem is aggravated when objects in the video frame move rapidly, i.e. when fast motion occurs in the video stream. Depending on the number of quantization levels used in the coding procedure, some details of an object are eliminated: the fewer the quantization levels, the more details vanish. Fast and sudden motion in a video stream can also lead to the loss of important information over a channel of limited capacity. The underlying idea of this paper is to combat these problems by extracting the details from a video sequence and then coding them with a scheme other than the H.264/AVC standard.

This paper is organized as follows. Section 2 gives an analytic discussion of the wavelet transform. The architecture of the proposed video coding system is then presented in Section 3.
In Section 4, the experimental results obtained by the proposed method are compared with those of the original H.264 codec, and the possible advantages of the proposed method in different applications are discussed. Conclusions are given in Section 5.

2. Wavelet Transform

Although the Fourier transform has been the mainstay of transform-based image and video processing since the late 1950s, a more recent transformation, called the wavelet transform, is now making it even easier to compress, transmit, and analyze many images and videos. Unlike the Fourier transform, whose basis functions are sinusoids, wavelet transforms are based on small waves, called wavelets, of varying frequency and limited duration.

The goal of modern wavelet research is to create a set of basis functions (or general expansion functions) and transforms that give an informative, efficient, and useful description of a function or signal. Another central idea is that of multiresolution analysis, where a signal is decomposed in terms of different resolutions of detail.

Both the mathematics and the practical interpretation of the wavelet transform seem to be best served by using the concept of resolution to define the effects of changing scale. To do this, we start with a scaling function $\varphi(x)$ rather than directly with the wavelet $\psi(x)$. After the scaling function is defined from the concept of resolution, the wavelet functions are derived from it. Good reviews of the wavelet transform are given in [6] and [7]; a short review with the mathematical interpretation follows.

We define a set of scaling functions in terms of integer translates of the basic scaling function by

$$\varphi_k(x) = \varphi(x - k), \quad k \in \mathbb{Z}, \quad \varphi \in L^2. \tag{1}$$

The subspace of $L^2$ spanned by these functions is defined as

$$V_0 = \overline{\operatorname{Span}_k \{\varphi_k(x)\}} \tag{2}$$

for all integers $k$, $-\infty < k < \infty$. This means that

$$f(x) = \sum_k a_k \varphi_k(x) \quad \text{for any } f(x) \in V_0. \tag{3}$$

One can generally increase the size of the subspace spanned by changing the spatial scale of the scaling functions. A two-dimensional family of functions is generated from the basic scaling function by scaling and translation:

$$\varphi_{j,k}(x) = 2^{j/2} \varphi(2^j x - k) \tag{4}$$

whose span over $k$ is

$$V_j = \overline{\operatorname{Span}_k \{\varphi_k(2^j x)\}} = \overline{\operatorname{Span}_k \{\varphi_{j,k}(x)\}} \tag{5}$$

for all integers $k \in \mathbb{Z}$. This means that if $f(x) \in V_j$, then it can be expressed as

$$f(x) = \sum_k a_k \varphi(2^j x - k). \tag{6}$$

For $j > 0$, the span can be larger since $\varphi_{j,k}(x)$ becomes narrower and is translated in smaller steps. It can therefore represent finer details. For $j < 0$, $\varphi_{j,k}(x)$ is wider and is translated in larger steps, so these wider scaling functions can represent only coarse information, and the space they span is smaller.

In order to follow our intuitive ideas of scale or resolution, we formulate the basic requirement of multiresolution analysis (MRA) by requiring nested spanned spaces:

$$\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \subset L^2 \tag{7}$$

or

$$V_j \subset V_{j+1} \quad \text{for all } j \in \mathbb{Z} \tag{8}$$

with

$$V_{-\infty} = \{0\}, \quad V_{\infty} = L^2. \tag{9}$$

The space that contains high resolution signals also contains those of lower resolution. Because of the definition of $V_j$, all spaces have to satisfy a natural scaling condition:

$$f(x) \in V_j \iff f(2x) \in V_{j+1} \tag{10}$$

which ensures that all elements in a space are simply scaled versions of the elements in the next space. This relationship of the spanned spaces is illustrated in Figure 1.

Figure 1. Nested vector spaces spanned by the scaling functions.

The nesting of the spans of $\varphi(2^j x - k)$, denoted by $V_j$ and illustrated in Figure 1, is achieved by requiring that $\varphi(x) \in V_1$. This means that if $\varphi(x)$ is in $V_0$, it is also in $V_1$, the space spanned by $\varphi(2x)$. Hence $\varphi(x)$ can be expressed as a weighted sum of the shifted $\varphi(2x)$:

$$\varphi(x) = \sum_n h(n) \sqrt{2}\, \varphi(2x - n), \quad n \in \mathbb{Z} \tag{11}$$

where the coefficients $h(n)$ are a sequence of real or possibly complex numbers called the scaling function coefficients (or the scaling filter or the scaling vector), and the $\sqrt{2}$ maintains the norm of the scaling function with the scale of two. This recursive equation is fundamental to the theory of scaling functions and is known by different names, such as the refinement equation, the multiresolution analysis (MRA) equation, or the dilation equation.

Figure 2. "Haar", "db5", and "Sym5" scaling and wavelet functions: (a) Haar scaling function; (b) Haar wavelet function; (c) db5 scaling function; (d) db5 wavelet function; (e) Sym5 scaling function; (f) Sym5 wavelet function.

The Haar scaling function is the simple unit-width, unit-height pulse function shown in Figure 2(a). It is obvious that $\varphi(2x)$ can be used to construct $\varphi(x)$ by

$$\varphi(x) = \varphi(2x) + \varphi(2x - 1) \tag{12}$$

which means that relation (11) is satisfied for the coefficients $h(0) = 1/\sqrt{2}$, $h(1) = 1/\sqrt{2}$. The fifth-order Daubechies scaling function shown in Figure 2(c) satisfies relation (11) for $h(0) = 0.1601$, $h(1) = 0.6038$, $h(2) = 0.7243$, $h(3) = 0.1384$, …, $h(9) = 0.0033$.
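As a quick numerical illustration of the refinement equation (11), the following minimal sketch checks that the Haar coefficients quoted above reproduce the Haar pulse at a few sample points. The function names `haar_phi` and `refinement_rhs` and the choice of sample points are assumptions made for this example only; they do not come from the paper.

```python
import math


def haar_phi(x: float) -> float:
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


# Haar scaling coefficients h(0), h(1) as quoted in the text.
H_HAAR = [1 / math.sqrt(2), 1 / math.sqrt(2)]


def refinement_rhs(x: float, h=H_HAAR) -> float:
    """Right-hand side of Equation (11): sum_n h(n) * sqrt(2) * phi(2x - n)."""
    return sum(h_n * math.sqrt(2) * haar_phi(2 * x - n) for n, h_n in enumerate(h))


# Hypothetical sample points inside and outside the support [0, 1).
SAMPLES = [-0.5, 0.0, 0.25, 0.5, 0.75, 0.999, 1.0, 1.5]
max_error = max(abs(haar_phi(x) - refinement_rhs(x)) for x in SAMPLES)
print(f"largest deviation between both sides of (11): {max_error:.2e}")
```

The deviation is at the level of floating-point rounding, which is what Equation (12) predicts for the Haar case.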
The fifth-order Symlet scaling function shown in Figure 2(e) likewise satisfies Equation (11), for $h(0) = 0.0195$, $h(1) = -0.0211$, $h(2) = -0.1753$, $h(3) = 0.0166$, …, $h(9) = 0.0273$. Indeed, designing a wavelet system amounts to choosing the coefficients $h(n)$.

The important features of a signal can better be described or parameterized, not by using $\varphi_{j,k}(x)$ and increasing $j$ to increase the size of the subspace spanned by the scaling functions, but by defining a slightly different set of functions $\psi_{j,k}(x)$ that span the differences between the spaces spanned by the various scales of the scaling function. These functions are called the wavelet functions. There are several advantages to requiring that the scaling and wavelet functions be orthogonal. Orthogonal basis functions allow simple calculation of the expansion coefficients, and Parseval's theorem holds, which allows the signal's energy to be partitioned in the wavelet transform domain. The orthogonal complement of $V_j$ in $V_{j+1}$ is defined as $W_j$. This means that all members of $V_j$ are orthogonal to all members of $W_j$. We require

$$\langle \varphi_{j,k}(x), \psi_{j,l}(x) \rangle = \int \varphi_{j,k}(x)\, \psi_{j,l}(x)\, dx = 0 \tag{13}$$

for all corresponding $j, k, l \in \mathbb{Z}$. The relationship between the various subspaces can be seen from the following expansions. From (7), we may start at any $V_j$, say at $j = 0$, and write

$$V_0 \subset V_1 \subset V_2 \subset \cdots \subset L^2. \tag{14}$$

We now define the wavelet-spanned subspace $W_0$ such that

$$V_1 = V_0 \oplus W_0 \tag{15}$$

which extends to

$$V_2 = V_0 \oplus W_0 \oplus W_1. \tag{16}$$

In general, this gives

$$L^2 = V_0 \oplus W_0 \oplus W_1 \oplus \cdots \tag{17}$$

when $V_0$ is the initial space spanned by the scaling function $\varphi(x - k)$. Figure 3 shows the nesting of the scaling function spaces $V_j$ for the different scales $j$ and how the wavelet spaces are the disjoint differences (except for the zero element), i.e. the orthogonal complements.

Figure 3. Scaling and wavelet functions vector spaces.

The scale of the initial space is arbitrary and could be chosen at a higher resolution of, say, $j = 10$ to give

$$L^2 = V_{10} \oplus W_{10} \oplus W_{11} \oplus \cdots \tag{18}$$

or at a lower resolution such as $j = -5$ to give

$$L^2 = V_{-5} \oplus W_{-5} \oplus W_{-4} \oplus \cdots \tag{19}$$

or even at $j = -\infty$, where (17) becomes

$$L^2 = \cdots \oplus W_{-2} \oplus W_{-1} \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots \tag{20}$$

eliminating the scaling space altogether.

Since these wavelets reside in the space spanned by the next narrower scaling function, $W_0 \subset V_1$, they can be represented by a weighted sum of the shifted scaling function $\varphi(2x)$ defined in (11):

$$\psi(x) = \sum_n h_1(n) \sqrt{2}\, \varphi(2x - n), \quad n \in \mathbb{Z} \tag{21}$$

for some set of coefficients $h_1(n)$. From the requirement that the wavelets span the difference or orthogonal complement spaces, and from the orthogonality of the integer translates of the wavelet (or scaling function), it can be shown that the wavelet coefficients (modulo translations by integer multiples of two) are required by orthogonality to be related to the scaling function coefficients by

$$h_1(n) = (-1)^n h(1 - n). \tag{22}$$

The function generated by (21) gives the prototype or mother wavelet $\psi(x)$ for a class of expansion functions of the form

$$\psi_{j,k}(x) = 2^{j/2} \psi(2^j x - k) \tag{23}$$

where $2^j$ is the scaling of $x$, $2^{-j}k$ is the translation in $x$, and $2^{j/2}$ maintains the $L^2$ norm of the wavelets across the different scales. The Haar wavelet function associated with the scaling function in Figure 2(a) is shown in Figure 2(b). For the Haar wavelet, the coefficients in (21) are $h_1(0) = 1/\sqrt{2}$, $h_1(1) = -1/\sqrt{2}$, which satisfy Equation (22). The Daubechies and Symlet wavelet functions associated with the scaling functions in Figures 2(c) and 2(e) are shown in Figures 2(d) and 2(f), respectively.

We have now constructed a set of functions $\varphi_k(x)$ and $\psi_{j,k}(x)$ that could span all of $L^2$. According to (17), any function $g(x) \in L^2$ could be written

$$g(x) = \sum_{k=-\infty}^{\infty} c(k)\, \varphi_k(x) + \sum_{j=0}^{\infty} \sum_{k=-\infty}^{\infty} d(j,k)\, \psi_{j,k}(x) \tag{24}$$

as a series expansion in terms of the scaling function and wavelets. The first summation in (24) gives a function that is a low resolution or coarse approximation of $g(x)$. For each increasing index $j$ in the second summation, a higher or finer resolution function is added, which contributes more detail.
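The relation between the scaling and wavelet coefficients in Equation (22) can be sketched in a few lines. The helper name `wavelet_filter` is hypothetical, and for filters longer than two taps the index is shifted to `h[N - 1 - n]`, which is an assumption of this sketch; it differs from $h(1 - n)$ only by an even translation, which the text allows.

```python
import math


def wavelet_filter(h):
    """Derive wavelet coefficients h1 from scaling coefficients h.

    For a 2-tap filter this is exactly Equation (22),
    h1(n) = (-1)**n * h(1 - n). For a longer even-length filter the
    index is shifted by an even amount to h(N - 1 - n) (assumption).
    """
    n_taps = len(h)
    if n_taps % 2 != 0:
        raise ValueError("expected an even number of scaling coefficients")
    return [(-1) ** n * h[n_taps - 1 - n] for n in range(n_taps)]


# Haar scaling coefficients from the text.
h_haar = [1 / math.sqrt(2), 1 / math.sqrt(2)]
h1_haar = wavelet_filter(h_haar)

# Orthogonality check: the two filters should have zero inner product.
inner = sum(a * b for a, b in zip(h_haar, h1_haar))
print(h1_haar, inner)
```

For the Haar case the result is $h_1(0) = 1/\sqrt{2}$, $h_1(1) = -1/\sqrt{2}$, matching the values stated for Equation (21).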
The 1D Discrete Wavelet Transform:

Since

$$L^2 = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots \tag{25}$$

by using (4) and (23), a more general statement of the expansion Equation (24) can be given by

$$g(x) = \sum_k c_{j_0}(k)\, 2^{j_0/2} \varphi(2^{j_0} x - k) + \sum_k \sum_{j=j_0}^{\infty} d_j(k)\, 2^{j/2} \psi(2^j x - k) \tag{26}$$

or

$$g(x) = \sum_k c_{j_0}(k)\, \varphi_{j_0,k}(x) + \sum_k \sum_{j=j_0}^{\infty} d_j(k)\, \psi_{j,k}(x) \tag{27}$$

where $j_0$ could be zero as in (17) and (24), ten as in (18), or negative infinity as in (20), where no scaling functions are used. The choice of $j_0$ sets the coarsest scale, whose space is spanned by $\varphi_{j_0,k}(x)$. The rest of $L^2$ is spanned by the wavelets, which provide the high resolution details of the signal. The coefficients in this wavelet expansion are called the one-dimensional discrete wavelet transform (1D DWT) of the signal $g(x)$. If the wavelet system is orthogonal, these coefficients can be calculated by the inner products

$$c_j(k) = \langle g(x), \varphi_{j,k}(x) \rangle = \int g(x)\, \varphi_{j,k}(x)\, dx \tag{28}$$

and

$$d_j(k) = \langle g(x), \psi_{j,k}(x) \rangle = \int g(x)\, \psi_{j,k}(x)\, dx. \tag{29}$$

The DWT is similar to a Fourier series but, in many ways, is much more flexible and informative. It can be made periodic like a Fourier series to represent periodic signals efficiently. However, unlike a Fourier series, it can be used directly on non-periodic transient signals with excellent results.

The 2D Discrete Wavelet Transform:

The one-dimensional transforms of the previous discussion are easily extended to two-dimensional functions such as images. In two dimensions, a two-dimensional scaling function, $\varphi(x, y)$, and three two-dimensional wavelets, $\psi^H(x, y)$, $\psi^V(x, y)$, and $\psi^D(x, y)$, are required. Each is the product of a one-dimensional scaling function $\varphi$ and the corresponding wavelet $\psi$. Excluding products of functions with the same variable, which produce one-dimensional results such as $\varphi(x)\psi(x)$, the four remaining products produce the separable scaling function

$$\varphi(x, y) = \varphi(x)\, \varphi(y) \tag{30}$$

and the separable, directionally sensitive wavelets

$$\psi^H(x, y) = \psi(x)\, \varphi(y) \tag{31}$$

$$\psi^V(x, y) = \varphi(x)\, \psi(y) \tag{32}$$

$$\psi^D(x, y) = \psi(x)\, \psi(y). \tag{33}$$

These wavelets measure functional variations (intensity or gray-level variations for images) along different directions: $\psi^H$ measures variations along columns (for example, horizontal edges), $\psi^V$ responds to variations along rows (like vertical edges), and $\psi^D$ corresponds to variations along diagonals. The directional sensitivity is a natural consequence of the separability imposed by Equations (31) to (33); it does not increase the computational complexity of the two-dimensional transform.

Given separable two-dimensional scaling and wavelet functions, extension of the one-dimensional DWT to two dimensions is straightforward. We first define the scaled and translated basis functions:

$$\varphi_{j,m,n}(x, y) = 2^{j/2} \varphi(2^j x - m,\, 2^j y - n) \tag{34}$$

$$\psi^i_{j,m,n}(x, y) = 2^{j/2} \psi^i(2^j x - m,\, 2^j y - n), \quad i \in \{H, V, D\}. \tag{35}$$

The discrete wavelet transform of a function $f(x, y)$ of size $M \times N$ is then

$$c_{j_0}(m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \varphi_{j_0,m,n}(x, y) \tag{36}$$

$$d^i_j(m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \psi^i_{j,m,n}(x, y), \quad i \in \{H, V, D\}. \tag{37}$$

As in the one-dimensional case, $j_0$ is an arbitrary starting scale and the $c_{j_0}(m, n)$ coefficients define an approximation of $f(x, y)$ at scale $j_0$. The $d^i_j(m, n)$ coefficients add horizontal, vertical, and diagonal details for scales $j \ge j_0$. We normally let $j_0 = 0$ and select $N = M = 2^J$ so that $j = 0, 1, 2, \ldots, J-1$ and $m, n = 0, 1, 2, \ldots, 2^j - 1$. Given the $c_{j_0}(m, n)$ and $d^i_j(m, n)$ of Equations (36) and (37), $f(x, y)$ is reconstructed via the inverse discrete wavelet transform

$$f(x, y) = \frac{1}{\sqrt{MN}} \sum_m \sum_n c_{j_0}(m, n)\, \varphi_{j_0,m,n}(x, y) + \frac{1}{\sqrt{MN}} \sum_{i=H,V,D} \sum_{j=j_0}^{\infty} \sum_m \sum_n d^i_j(m, n)\, \psi^i_{j,m,n}(x, y). \tag{38}$$

In the next section, we apply the 2D discrete wavelet transform to the frames of a video sequence independently to extract the low frequency and high frequency components of each video frame.

3. Proposed Video Coding System

As mentioned before, the main idea of this paper is to decompose a given video stream into two separate parts, such that one part contains the low frequency components (information about the main structures and the background of the video frames) and the other part contains the high frequency components (information about edges, borders, and details of the video frames). This decomposition of the input video stream is accomplished through the two-dimensional discrete wavelet transform.

As shown in the previous section, there are several well-known families of wavelets that can be used in image processing tasks, such as Haar wavelets, Daubechies wavelets and Symlets (short for symmetrical wavelets). Among the different families, the Haar wavelet transform is the simplest and has very low complexity; for this reason it is used in many applications in signal and image processing. Hence, our proposed method uses the two-dimensional Haar wavelet by default. To show that the technique generalizes to other types of wavelets, we have also tested the proposed scheme with the fifth-order two-dimensional Daubechies wavelet and the fifth-order two-dimensional Symlet. The results are given in Section 4.

Since the H.264 codec is better suited to coding the main structures of objects and the low frequency components in a video sequence, the proposed method uses the two-dimensional wavelet transform to extract the low frequency components from the video sequence and encodes them with the H.264 codec.
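The per-frame decomposition can be sketched for the Haar case as follows. This is a minimal illustration on plain nested lists, not the authors' implementation: the function names are hypothetical, the assignment of the labels LH and HL to the two single-direction detail bands is an assumption (conventions differ between texts), and the synthetic test frame is invented for the example.

```python
def haar_dwt2(frame):
    """One-level orthonormal 2D Haar transform of a 2D list.

    Returns (ll, lh, hl, hh); each band has half the height and width.
    Band labels for the detail bands are an assumption of this sketch.
    """
    rows, cols = len(frame), len(frame[0])
    if rows % 2 or cols % 2:
        raise ValueError("frame height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = frame[r][c], frame[r][c + 1]          # top pixel pair
            d, e = frame[r + 1][c], frame[r + 1][c + 1]  # bottom pixel pair
            ll_row.append((a + b + d + e) / 2)  # local average (approximation)
            lh_row.append((a - b + d - e) / 2)  # change between adjacent columns
            hl_row.append((a + b - d - e) / 2)  # change between adjacent rows
            hh_row.append((a - b - d + e) / 2)  # diagonal change
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-size frame from the four bands."""
    frame = []
    for ll_row, lh_row, hl_row, hh_row in zip(ll, lh, hl, hh):
        top, bottom = [], []
        for s, x, y, z in zip(ll_row, lh_row, hl_row, hh_row):
            top.extend([(s + x + y + z) / 2, (s - x + y - z) / 2])
            bottom.extend([(s + x - y - z) / 2, (s - x - y + z) / 2])
        frame.append(top)
        frame.append(bottom)
    return frame


# Synthetic QCIF-sized luminance plane (144 rows x 176 columns), for illustration.
qcif_luma = [[(r + c) % 256 for c in range(176)] for r in range(144)]
ll, lh, hl, hh = haar_dwt2(qcif_luma)
restored = haar_idwt2(ll, lh, hl, hh)
print(len(ll), len(ll[0]))  # size of the LL band
```

In this sketch only the `ll` band would be handed to the H.264 encoder, while `lh`, `hl`, and `hh` would go to the thresholding and run-length stage.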
The visual quality of these components depends directly on the quantization factor and the other parameters of the H.264 video codec. In our proposed method, the low frequency part of each frame has much smaller dimensions than the original frame. Quantizing these parts of the video with more bits and using efficient types of motion estimation for motion compensation will increase the quality of the reconstructed video.

The remaining parts of the frames in the video stream, which are the high frequency components, should be encoded in a different way. Since a large number of very small values are produced during the decomposition process, they can be neglected by setting them to zero in a thresholding procedure. Zero is then by far the most frequent symbol in the high frequency bands. When a specific symbol is repeated very frequently in a sequence, efficient source coding can be achieved by Run Length Coding (RLC): for each run of zeros, a single "zero" symbol is encoded, followed by the number of repetitions. The more often the symbol "zero" is repeated, the more the sequence is compressed [8]. By applying a proper threshold value, a sufficient number of zeros is produced, so the compression rate is increased. This hard threshold value ($T$) is simply applied to each transform coefficient value ($P_{i,j}$) of the high frequency bands by the following decision equation:

$$P_{i,j} = \begin{cases} P_{i,j}, & |P_{i,j}| > T \\ 0, & \text{otherwise.} \end{cases} \tag{39}$$

Figure 4 shows the block diagram of the overall proposed system. First, the two-dimensional discrete wavelet transform is applied to the video source. The low frequency part is encoded by the H.264 codec, and the remaining parts, which mostly contain information about the edges and borders of the video objects, are encoded using the RLC algorithm. To apply the two-dimensional wavelet transform to a given video sequence, it is applied to each frame of the sequence independently.
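The thresholding rule of Equation (39) followed by run-length coding of the zeros can be sketched as below. The paper does not specify the exact RLC symbol format, so the `(0, run_length)` tuples used here, the function names, and the sample coefficients are all assumptions made for illustration.

```python
def hard_threshold(coeffs, t):
    """Equation (39): keep a coefficient only if its magnitude exceeds t."""
    return [c if abs(c) > t else 0 for c in coeffs]


def rlc_encode(symbols):
    """Replace each run of zeros by one (0, run_length) pair.

    Non-zero symbols are passed through unchanged (assumed format).
    """
    encoded, run = [], 0
    for s in symbols:
        if s == 0:
            run += 1
            continue
        if run:
            encoded.append((0, run))
            run = 0
        encoded.append(s)
    if run:  # flush a trailing run of zeros
        encoded.append((0, run))
    return encoded


def rlc_decode(encoded):
    """Inverse of rlc_encode: expand every (0, run_length) pair."""
    decoded = []
    for item in encoded:
        if isinstance(item, tuple):
            decoded.extend([0] * item[1])
        else:
            decoded.append(item)
    return decoded


# Hypothetical detail-band coefficients scanned into a 1D sequence.
band = [0.2, -0.1, 7, 0.3, 0, -0.4, -9, 0.05, 0.1]
sparse = hard_threshold(band, t=1.0)
packed = rlc_encode(sparse)
print(packed)  # [(0, 2), 7, (0, 3), -9, (0, 2)]
```

Nine input symbols shrink to five output items in this toy case; raising `t` lengthens the zero runs and shortens the output further, at the cost of discarding more detail.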
Since the video is in QCIF format, each frame contains luminance (Y) and chrominance (Cb and Cr) layers; therefore, the two-dimensional wavelet transform is applied three times for each frame. By collecting the LL bands of the luminance and chrominance layers of each frame and combining them into a sequence of frames, a new video sequence is generated with much smaller dimensions but the same structure as the original video sequence.

Figure 4. Block diagram of the proposed system.

Figure 5 shows an example of the two-dimensional wavelet transform. Figure 5(a) is a frame of the "Suzie" video sequence. Applying a two-dimensional Haar wavelet transform to it gives Figure 5(b). Finally, Figure 5(c) indicates the LL, LH, HL, and HH bands corresponding to Figure 5(b). Since the LH, HL, and HH bands show the differences between neighboring pixels in the horizontal, vertical and oblique directions, respectively, these bands resemble the edges and borders in a video frame. Regions of the frame that do not contain edges or borders therefore produce zero or near-zero values in these bands. Applying the hard threshold further increases the number of "zero" symbols. As the threshold value is increased, more "zero" symbols are produced and the compression rate rises, so fewer bits are needed for encoding by the RLC algorithm. In other words, the number of bits used to represent the high frequency components of a frame is negligible compared to the number of bits produced by the H.264 encoder to represent the low frequency components of that frame [9].

Figure 5. (a) A video frame; (b) two-dimensional Haar wavelet transform of the frame; (c) the corresponding bands (LL, LH, HL, HH).

4. Experimental Results

In this section, the results of the proposed method are compared with the results of the H.264 default mode. First, a proper threshold value must be chosen. A suitable value for the threshold can be chosen by cross-validation. The proposed method is applied to well-known test video samples such as the "Suzie" and "foreman" video sequences. Experiments on these sequences show that the best rate-distortion performance is achieved when the hard threshold value is selected so that about 95 percent of the coefficients in the high frequency bands are set to "zero".
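One simple way to pick a threshold that zeroes roughly 95 percent of a band is to take the corresponding order statistic of the coefficient magnitudes. The sketch below assumes this percentile-style rule; the paper states only the 95 percent target (and that the value was found experimentally), so the function name, the integer `percent` parameter, and the sample data are hypothetical.

```python
def threshold_for_zero_percent(coeffs, percent=95):
    """Return T such that at least `percent` % of coeffs satisfy |c| <= T.

    Applying Equation (39) with this T zeroes that share of the band.
    Ties in magnitude can push the zeroed share slightly above the target.
    """
    if not coeffs:
        raise ValueError("band is empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    magnitudes = sorted(abs(c) for c in coeffs)
    # Integer ceiling of percent * n / 100: how many coefficients to zero.
    n_zero = (percent * len(magnitudes) + 99) // 100
    return magnitudes[n_zero - 1]


# Hypothetical band with 100 distinct magnitudes and mixed signs.
band = [v if v % 2 else -v for v in range(1, 101)]
t = threshold_for_zero_percent(band, percent=95)
thresholded = [c if abs(c) > t else 0 for c in band]
zero_share = thresholded.count(0) / len(thresholded)
print(t, zero_share)
```

Because the rule works on magnitudes, it is computed separately for each band and layer, which is consistent with different bands needing different threshold values to reach the same share of zeros.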
Note that, to achieve equal compression rates for the LH, HL, and HH bands and for the different layers of the input video (luminance and chrominance), different threshold values must be applied to the different bands, since the threshold value required for the HH band is lower than that required for the LH and HL bands. In Figure 6, the hard threshold value is chosen so that about 95 percent of the values in every band except the LL band become "zero"; an equivalent compression is therefore achieved for all three bands. The values produced by the two-dimensional wavelet transform for the LH, HL, and HH bands can be either positive or negative; therefore, the threshold in decision Equation (39) is applied to their absolute values.

Figure 6. The hard threshold values are chosen so that about 95 percent of the coefficients in the high frequency bands are set to "zero" (bands: LL coded by H.264; LH, HL, and HH thresholded at 95%).

The rate-distortion plots of the proposed method and the H.264 default mode are compared in Figure 7 for the "Suzie" video sequence. A rate-distortion plot presents the PSNR obtained at different bit rates. The PSNR for the default mode is computed by comparing the output video of the H.264 decoder with the original (input) video pixel by pixel, where each frame measures 176 × 144 pixels. The proposed method uses the two-dimensional Haar wavelet transform; therefore, each input frame to the H.264 encoder measures 88 × 72 pixels. Hence, the frames coded by H.264 in the proposed method contain one quarter of the pixels of the frames in the original H.264 mode, which results in a very large compression rate, while the PSNR remains comparable at very low bit rates.

Figure 7. Comparison between rate-distortion plots of the original H.264 and the proposed method by "Haar" wavelet for "Suzie" video sequence.

To test the performance of the proposed method with other types of wavelets, we also applied the technique with the fifth-order Daubechies wavelet and the fifth-order Symlet wavelet. The rate-distortion plots of the proposed method with these wavelets are compared with the default Haar wavelet and the original H.264 mode in Figure 8. As the figure shows, the performance of the proposed system with these wavelet families is comparable to that with the Haar wavelet. This suggests that the proposed method generalizes readily to other suitable types of wavelets.

Figure 8. Comparison among rate-distortion plots of the original H.264 and the proposed method by "Haar", "db5", and "Sym5" wavelets for "Suzie" video sequence.

To compare the visual quality of the decoded videos subjectively, Figure 9 shows a sample frame of the reconstructed videos for both the original H.264 codec and the proposed system. As can be seen in Figure 9, the visual quality of the decoded frame for the proposed scheme (right side pictures) is better than that of the decoded frame for the original H.264 method (left side pictures). Although the proposed method cannot achieve good results at high bit rates, it shows superior results at very low bit rates.

Figure 9. Subjective comparison between the visual qualities of the decoded videos for a sample frame of "Suzie" video sequence: (a) the original input frame; (b), (d), (f) the outputs of the H.264 decoder at 10 kbps (PSNR = 27.8), 11 kbps (PSNR = 28.2), and 13 kbps (PSNR = 28.9), respectively; (c), (e), (g) the outputs of the proposed decoder at 10 kbps (PSNR = 28.7), 11 kbps (PSNR = 29), and 13 kbps (PSNR = 29.7), respectively.
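For reference, the pixel-wise PSNR used in the rate-distortion comparison can be computed as in the following minimal sketch. It assumes 8-bit samples (peak value 255) and frames stored as nested lists; both are assumptions of this example, as are the function name and the toy frames.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error, count = 0, 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames are empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Toy 2x2 frames: every decoded pixel is off by exactly one gray level.
reference = [[10, 20], [30, 40]]
noisy = [[11, 19], [31, 39]]
value = psnr(reference, noisy)
print(f"{value:.2f} dB")
```

For a video sequence, one common convention is to compute this value per frame on the luminance layer and average over frames; the source does not state which convention was used for Figures 7 to 9.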
The main advantages of the proposed method are summarized as follows:

Advantage 1: For bit rates between 4 and 16 kb/s (very low bit rates), the PSNR of the proposed method is higher than the PSNR of the H.264 default mode. Important detail information is lost when quantizing with high quantization factors; the proposed method avoids losing this information by separating it from the original video and coding it with the RLC algorithm. This property is valuable in applications where very low bit rates are required for video communication, such as videoconferencing and video telephony.

Advantage 2: Compared with the H.264 default mode, the proposed method achieves good performance at much lower bit rates. It can therefore be used for sending video over very low capacity channels such as home dial-up connections. There is another case of very low capacity channels in which the proposed video coding system can be used effectively. In GSM (Global System for Mobile communication) networks, speech or other data are communicated between the BTS (Base Transceiver Station) and the MS (Mobile Station) mostly over a half-rate traffic channel at 11.4 kbits/s. To transmit or receive a video sequence over this very low capacity channel, it is better to use the video coding scheme proposed in this paper, since at such a bit rate (11.4 kbits/s) it provides a more acceptable basic-quality video than the original H.264 codec, as can be seen in Figure 9.

Advantage 3: The most challenging problem of the H.264/AVC standard is its high computational complexity, which has limited its usage in real-life applications. The computational complexity of the H.264/AVC standard is directly related to the dimensions of the frames in the video sequences.
Therefore, reducing the spatial resolution to a quarter of the original would reduce the computational complexity dramatically. Since the computational complexity of the wavelet transform is almost negligible compared to that of the H.264 codec, the proposed method is much faster than using the H.264 codec alone. This helps make the H.264/AVC standard more suitable for newly emerging applications.

5. Conclusions

In this paper we described a novel video compression approach that combines the H.264/AVC standard and the two-dimensional discrete wavelet transform. The main goal of the proposed method is to enhance the performance of the H.264/AVC standard so that it is more reliable for very low bit-rate applications. To do this, the video information is decomposed into two parts, the low frequency components and the high frequency components, which contain information about the main structures and the edges of the objects, respectively. To perform this decomposition, the two-dimensional discrete wavelet transform is applied to the sequence of frames. The low frequency parts of all frames are then encoded by the H.264/AVC standard, while the high frequency parts are encoded using the RLC algorithm. As revealed by experiments, the main advantage of the proposed method compared to the H.264 default mode is that it requires a lower bit rate for the same PSNR at very low bit rates. We also argued that the proposed method is computationally more efficient than the ordinary H.264/AVC standard.

6. Acknowledgement

This research has been supported by the Iran Telecommunication Research Center, Tehran, Iran, which is appreciated.

7. References

[1] B. J. Kim, Z. Xiong and W. A. Pearlman, "Low Bit-Rate Scalable Video Coding with 3-D Set Partitioning in Hierarchical Trees (3-D SPIHT)," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 10, No. 8, 2000, pp. 1374-1386.
[2] I. E. G. Richardson, "Video Codec Design: Developing Image and Video Compression Systems," John Wiley & Sons, 2002.
[3] J. Ostermann, J. Bormans, P. List, D. Marpe, M. Narroschke, F. Pereira, T. Stockhammer and T. Wedi, "Video Coding with H.264/AVC: Tools, Performance, and Complexity," IEEE Circuits and Systems Magazine, Vol. 4, No. 1, 2004, pp. 7-28.
[4] T. Wiegand, G. J. Sullivan, G. Bjøntegaard and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, 2003, pp. 560-576.
[5] R. Talluri, K. Oehler, T. Bannon, J. D. Courtney, A. Das and J. Liao, "A Robust, Scalable, Object-Based Video Compression Technique for Very Low Bit-Rate Coding," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 7, No. 1, 1997, pp. 221-232.
[6] C. S. Burrus, R. A. Gopinath and H. Guo, "Introduction to Wavelets and Wavelet Transforms: A Primer," Prentice Hall, 1998.
[7] R. C. Gonzalez and R. E. Woods, "Digital Image Processing," 2nd Edition, Prentice Hall, 2002.
[8] D. Salomon, "Data Compression: The Complete Reference," 4th Edition, Springer, Berlin, 2007.
[9] A. Aghagolzadeh, S. Meshgini, M. Nooshyar and M. Aghagolzadeh, "A Novel Video Compression Technique for Very Low Bit-Rate Coding by Combining H.264/AVC Standard and 2-D Wavelet Transform," Proceedings of 9th International Conference on Signal Processing, Beijing, 2008, pp. 1251-1254.
