Applied Mathematics
Vol.05 No.12(2014), Article ID:47391,10 pages

Fractal Image Compression Using Self-Organizing Mapping

Rashad A. Al-Jawfi1,2, Baligh M. Al-Helali1, Adil M. Ahmed3

1Department of Mathematics and Computer Science, Faculty of Science, Ibb University, Ibb, Yemen

2Department of Mathematics, Faculty of Sciences and Arts, Najran University, KSA

3Department of Mathematics, Faculty of Ibn Alhaitham for Education, Baghdad University, Baghdad, Iraq


Copyright © 2014 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY).

Received 25 January 2014; revised 1 March 2014; accepted 9 March 2014;


One of the main disadvantages of fractal image compression is the long time required to compress (encode) an image and convert it into an iterated function system (IFS). In this paper, the idea of the inverse problem of the fixed point is introduced. This inverse problem is based on the collage theorem, which is the cornerstone of the mathematical idea of fractal image compression. The idea is then applied through iterated function systems and grayscale iterated function systems, down to general transformations. A mathematical formulation is also provided on the space of digital images handled by the computer. Next, the process is revised to reduce the time required for image compression by excluding some parts of the image that have a specific feature. Neural network algorithms are applied to the compression (encoding) process. The experimental results are presented and the performance of the proposed algorithm is discussed. Finally, a comparison between the filtered ranges method and the self-organizing method is presented.


Fractal Image Compression, Organizing Mapping

1. Introduction

The mathematics behind fractals began to take shape in the 17th century, when the mathematician and philosopher Leibniz considered recursive self-similarity (although he made the mistake of thinking that only the straight line was self-similar in this sense) [1]. Iterated functions in the complex plane were investigated in the late 19th and early 20th centuries by Henri Poincaré, Felix Klein, Pierre Fatou and Gaston Julia. However, without the aid of modern computer graphics, they lacked the means to visualize the beauty of many of the objects that they had discovered [2]. In the 1960s, Benoît Mandelbrot started investigating self-similarity in papers such as "How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension", which built on earlier work by Lewis Fry Richardson. Finally, in 1975 Mandelbrot coined the word "fractal" to denote an object whose Hausdorff-Besicovitch dimension is greater than its topological dimension [2]. Fractal image compression (FIC) was introduced by Barnsley and Sloan [3]. In later work they introduced a better way to compress images [4], and since then FIC has been widely studied. FIC is based on the idea that any image contains self-similarities, that is, it consists of small parts similar to itself or to some larger part of it [5]. Iterated function systems are therefore used for modeling in FIC. Jacquin [6] presented a more flexible method of FIC than Barnsley's, based on recurrent iterated function systems (RIFSs), which he was the first to introduce. The RIFSs that have been used in image compression schemes consist of transformations with a constant vertical contraction factor. Fisher [7] improved Jacquin's partition. A hexagonal structure called the Spiral Architecture (SA) [8] was proposed by Sheridan in 1996. Bouboulis et al. [9] introduced an image compression scheme using fractal interpolation surfaces, which are attractors of some RIFSs. Kramm [10] presented a fairly fast algorithm that manages to merge low-scale redundancy from multiple images. In this work, self-organizing neural networks are used to optimize the process by providing domain classification. The experimental results are presented and the performance of the algorithms is discussed.

Artificial Neural Networks (ANNs) have been used for solving many problems, especially in cases where the results are very difficult to achieve by traditional analytical methods. A number of studies applying ANNs to image compression have already been published [11]. It is important to emphasize that, although there is no sign that neural networks can take over from the existing techniques [11], research on neural networks for image compression is still making some advances. In the future this could have a great impact on the development of new technologies and algorithms in this area.

J. Stark was the first to propose applying neural networks to iterated function systems (IFS) [12]. His method used a Hopfield neural network to solve the linear programming problem and obtain the Hutchinson metric quickly. However, his neural network approach cannot obtain the fractal code automatically. A few methods for optimizing the exhaustive search [13] [14], based on clustering the set of domain blocks, have been suggested. However, most of these methods either reduce the computational complexity only slightly or cause a large loss of image quality. Clustering by means of the Kohonen self-organizing neural network is the method least affected by these disadvantages [15].

2. Neural Networks

A neural net is an artificial representation of the human brain that tries to simulate its learning process. The term "artificial" means that neural nets are implemented in computer programs that are able to handle the large number of necessary calculations during the learning process. To show where neural nets have their origin, let's have a look at the biological model: the human brain.

2.1. The Components of a Neural Net

Generally speaking, there are many different types of neural nets, but they all have nearly the same components. If one wants to simulate the human brain using a neural net, it is obvious that some drastic simplifications have to be made. First of all, it is impossible to "copy" the true parallel processing of all neural cells. Although there are computers that are capable of parallel processing, the large number of processors that would be necessary cannot be afforded by today's hardware. Another limitation is that a computer's internal structure cannot be changed while it is performing a task.

And how can electrical stimulation be implemented in a computer program? These facts lead to an idealized model for simulation purposes. Like the human brain, a neural net consists of neurons and connections between them. The neurons transport incoming information on their outgoing connections to other neurons. In neural net terms these connections are called weights. The "electrical" information is simulated with specific values stored in those weights. By simply changing these weight values, changes in the connection structure can also be simulated.

As you can see, an artificial neuron looks similar to a biological neural cell, and it works in the same way. Input is sent to the neuron on its incoming weights. This input is processed by the propagation function, which adds up the values of all incoming weights. The resulting value is compared with a certain threshold value by the neuron's activation function. If the input exceeds the threshold value, the neuron will be activated; otherwise it will be inhibited. If activated, the neuron sends an output on its outgoing weights to all connected neurons, and so on. Figure 1(a) shows the structure of a neuron in a neural net. In a neural net, the neurons are grouped in layers, called neuron layers. Usually each neuron of one layer is connected to all neurons of the preceding and the following layer (except the input layer and the output layer of the net). The information given to a neural net is propagated layer-by-layer from the input layer to the output layer through either none, one or more hidden layers. Depending on the learning algorithm, it is also possible that information is propagated backwards through the net.
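The behaviour of a single neuron described above can be sketched in a few lines of Python. This is only an illustration: the function names, the weighted-sum form of the propagation function, and the example weights and threshold are assumptions, not taken from the paper.

```python
def propagate(inputs, weights):
    """Propagation function (assumed form): weighted sum of the incoming values."""
    return sum(w * x for w, x in zip(weights, inputs))


def activate(net_input, threshold):
    """Step activation: 1 if the summed input exceeds the threshold, else 0."""
    return 1 if net_input > threshold else 0


def neuron_output(inputs, weights, threshold):
    """Output of one neuron: propagation followed by threshold activation."""
    return activate(propagate(inputs, weights), threshold)


# Hypothetical example: a two-input neuron that fires only when both inputs are on.
weights = [0.6, 0.6]
threshold = 1.0
outputs = [neuron_output([a, b], weights, threshold) for a in (0, 1) for b in (0, 1)]
```

With these illustrative values the neuron is activated only for the input (1, 1), since 0.6 + 0.6 is the only sum that exceeds the threshold.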

Figure 1(b) shows a neural net with three neuron layers.

Note that this is not the general structure of a neural net. For example, some neural net types have no hidden layers, or the neurons in a layer are arranged as a matrix. What is common to all neural net types is the presence of at least one weight matrix, the connections between two neuron layers.

2.2. Types of Neural Nets

As mentioned before, several types of neural nets exist. They can be distinguished by their type (feedforward or feedback), their structure and the learning algorithm they use. The type of a neural net indicates whether the neurons of one of the net's layers may be connected to each other. Feedforward neural nets allow only neuron connections between two different layers, while nets of the feedback type also have connections between neurons of the same layer.

2.3. Supervised and Unsupervised Learning

In contrast to supervised learning, neural nets that learn unsupervised have no target outputs, so it cannot be determined in advance what the result of the learning process will look like. During the learning process, the units (weight values) of such a neural net are "arranged" inside a certain range, depending on the given input values. The goal is to group similar units close together in certain areas of the value range. This effect can be used efficiently for pattern classification purposes.

Figure 1. (a) Structure of a neuron in a neural net. (b) Neural net with three neuron layers.

3. The Self-Organizing Mapping (SOM)

3.1. Competitive Learning and Clustering

Competitive learning is a learning procedure that divides a set of input patterns into clusters that are inherent to the input data. A competitive learning network is provided only with input vectors x and thus implements an unsupervised learning procedure. We will show its equivalence to a class of "traditional" clustering algorithms shortly.

3.2. Winner Selection: Euclidean Distance

In the competitive structures, a winning processing element is determined for each input vector based on the similarity between the input vector and the weight vector.

To this end, the winning neuron $k$ is selected as the one whose weight vector $w_k$ is closest to the input pattern $x$, using the Euclidean distance measure:

$$k : \| w_k(t) - x \| \le \| w_l(t) - x \|, \quad \forall l. \tag{1.1}$$

Equivalently, the winning unit can be determined by

$$\| w_k(t) - x \| = \min_l \| w_l(t) - x \|, \tag{1.2}$$

where the index $k$ refers to the winning unit.

Once the winner $k$ has been selected, the weights are updated according to:

$$w_k(t+1) = w_k(t) + \gamma \left( x(t) - w_k(t) \right). \tag{1.3}$$

Note that only the weights of the winner $k$ are updated. The weight update given in Equation (1.3) effectively shifts the weight vector $w_k$ towards the input vector $x$.
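The winner selection of Equation (1.2) and the update of Equation (1.3) can be sketched as follows. The function names and the example vectors are illustrative assumptions; the paper does not prescribe an implementation.

```python
import math


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def select_winner(weights, x):
    """Index k minimising ||w_k - x||, as in Equation (1.2)."""
    return min(range(len(weights)), key=lambda l: distance(weights[l], x))


def update_winner(weights, x, gamma):
    """Apply Equation (1.3) in place to the winning unit only; return its index."""
    k = select_winner(weights, x)
    weights[k] = [wi + gamma * (xi - wi) for wi, xi in zip(weights[k], x)]
    return k


# Hypothetical example with two units and one input pattern.
weights = [[0.0, 0.0], [1.0, 1.0]]
x = [0.9, 0.7]
k = update_winner(weights, x, gamma=0.5)
```

Here the second unit wins and moves halfway towards the input, while the first unit is left unchanged.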

3.3. Cost Function

Earlier it was claimed that a competitive network performs a clustering process on the input data, i.e., input patterns are divided into disjoint clusters such that similarities between input patterns in the same cluster are much bigger than similarities between inputs in different clusters. Similarity is measured by a distance function on the input vectors, as discussed before. A common criterion to measure the quality of a given clustering is the square error criterion, given by

$$E = \sum_p \| w_k - x^p \|^2, \tag{1.4}$$

where $k$ is the winning neuron when input $x^p$ is presented. The weights $w$ are interpreted as cluster centers. It is not difficult to show that competitive learning indeed seeks to find a minimum for this square error by following the negative gradient of the error function [16]:

Theorem 3.1 The error function for pattern $x^p$,

$$E^p = \sum_i \left( w_{ki} - x_i^p \right)^2, \tag{1.5}$$

where $k$ is the winning unit, is minimized by the weight update rule

$$w_k(t+1) = w_k(t) + \gamma \left( x(t) - w_k(t) \right). \tag{1.6}$$

Proof 1 We calculate the effect of a weight change on the error function, moving each weight along the negative gradient:

$$\Delta_p w_{li} = -\gamma \frac{\partial E^p}{\partial w_{li}}, \tag{1.7}$$

where $\gamma$ is a constant of proportionality. Now we have to determine the partial derivative of $E^p$:

$$\frac{\partial E^p}{\partial w_{li}} = \begin{cases} 2\left( w_{li} - x_i^p \right), & \text{if unit } l \text{ wins}, \\ 0, & \text{otherwise}, \end{cases} \tag{1.8}$$

such that, for the winning unit,

$$\Delta_p w_{li} = -2\gamma \left( w_{li} - x_i^p \right) = 2\gamma \left( x_i^p - w_{li} \right), \tag{1.9}$$

which is Equation (1.6) written down for one element of $w_k$, with the constant factor 2 absorbed into $\gamma$. Therefore, Equation (1.4) is minimised by repeated weight updates using Equation (1.6).
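As a quick check of the theorem, consider a single application of the update rule (1.6) to the winner for a fixed pattern $x^p$. The restriction $0 < \gamma < 1$ and the assumption $E^p(t) > 0$ are made here only for the illustration. Subtracting $x^p$ from both sides of (1.6) gives

$$w_k(t+1) - x^p = (1 - \gamma)\left( w_k(t) - x^p \right) \quad \Longrightarrow \quad E^p(t+1) = (1 - \gamma)^2 E^p(t) < E^p(t),$$

so each update for pattern $x^p$ shrinks its contribution to the square error by the factor $(1-\gamma)^2$.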

3.4. Winner Selection: Dot Product

For the time being, we assume that both the input vectors $x$ and the weight vectors $w_l$ are normalised to unit length. Each output unit $l$ calculates its activation value $y_l$ according to the dot product of input and weight vector:

$$y_l = \sum_i w_{li} x_i = w_l^T x. \tag{1.10}$$

In a next pass, the output neuron $k$ with maximum activation is selected:

$$\forall l \ne k : \quad y_l \le y_k. \tag{1.11}$$

Activations are reset such that $y_k = 1$ and $y_{l \ne k} = 0$.

This is the competitive aspect of the network, and we refer to the output layer as the winner-take-all layer. The winner-take-all layer is usually implemented in software by simply selecting the output neuron with highest activation value.

We now prove that Equation (1.1) reduces to (1.10) and (1.11) if all vectors are normalised.

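Proposition 3.2 Let $x$ and $w_l$ be normalised vectors, and let $w_k$ be selected such that

$$\| w_k(t) - x \| \le \| w_l(t) - x \|, \quad \forall l; \quad \text{then} \quad y_l \le y_k, \quad \forall l \ne k,$$

where $y_l = \sum_i w_{li} x_i$.

Proof 2 Let $x$ be a normalised input vector and let $w_k$ be the winning unit determined by (1.1), i.e., by the minimum of the quantity $\| w_l(t) - x \|$:

$$\| w_k(t) - x \| \le \| w_l(t) - x \|, \quad \forall l,$$

where $\| y - x \| = \sqrt{\sum_i (y_i - x_i)^2}$.

Then we have that

$$\| w_k(t) - x \| \le \| w_l(t) - x \| \;\Rightarrow\; \sum_i (w_{ki} - x_i)^2 \le \sum_i (w_{li} - x_i)^2, \quad \forall l \tag{1.12}$$

$$\Rightarrow\; \sum_i \left( w_{ki}^2 - 2 w_{ki} x_i + x_i^2 \right) \le \sum_i \left( w_{li}^2 - 2 w_{li} x_i + x_i^2 \right), \quad \forall l \tag{1.13}$$

$$\Rightarrow\; \sum_i w_{ki}^2 - \sum_i 2 w_{ki} x_i + \sum_i x_i^2 \le \sum_i w_{li}^2 - \sum_i 2 w_{li} x_i + \sum_i x_i^2, \quad \forall l \tag{1.14}$$

$$\Rightarrow\; 1 - \sum_i 2 w_{ki} x_i + 1 \le 1 - \sum_i 2 w_{li} x_i + 1, \quad \forall l, \tag{1.15}$$

$$\text{where} \quad \sum_i w_{ki}^2 = \sum_i w_{li}^2 = \sum_i x_i^2 = 1, \tag{1.16}$$

$$\Rightarrow\; -2 \sum_i w_{ki} x_i \le -2 \sum_i w_{li} x_i, \quad \forall l \tag{1.17}$$

$$\Rightarrow\; \sum_i w_{ki} x_i \ge \sum_i w_{li} x_i, \quad \forall l \tag{1.18}$$

$$\Rightarrow\; y_k \ge y_l, \quad \forall l, \tag{1.19}$$

where $y_l = \sum_i w_{li} x_i$ and $y_k = \sum_i w_{ki} x_i$.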

The Euclidean distance norm is therefore a more general case of Equations (1.10) and (1.11).

It can be shown that this network converges to a situation where only the neuron with the highest initial activation survives, whereas the activations of all other neurons converge to zero. Once the winner $k$ has been selected, the weights are updated according to:

$$w_k(t+1) = \frac{w_k(t) + \gamma\left( x(t) - w_k(t) \right)}{\left\| w_k(t) + \gamma\left( x(t) - w_k(t) \right) \right\|}, \tag{1.20}$$

where the divisor ensures that all weight vectors $w$ are normalised. Note that only the weights of the winner $k$ are updated. The weight update given in Equation (1.20) effectively rotates the weight vector $w_k$ towards the input vector $x$. Each time an input $x$ is presented, the weight vector closest to this input is selected and is subsequently rotated towards the input. Consequently, weight vectors are rotated towards those areas where many inputs appear: the clusters in the input.

Previously it was assumed that both the inputs $x$ and the weight vectors $w$ were normalised. Using the activation function given in Equation (1.10) gives a "biologically plausible" solution. Figure 2(b) shows how the algorithm would fail if unnormalised vectors were to be used.
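The role of normalisation can be illustrated with a small sketch that compares the Euclidean winner of Equation (1.1) with the dot-product winner of Equations (1.10) and (1.11). The function names and the example vectors are illustrative assumptions chosen so that the two rules disagree on the raw vectors.

```python
import math


def normalise(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    return [vi / norm for vi in v]


def euclidean_winner(weights, x):
    """Unit whose weight vector is closest to x (smallest squared distance)."""
    return min(range(len(weights)),
               key=lambda l: sum((wi - xi) ** 2 for wi, xi in zip(weights[l], x)))


def dot_winner(weights, x):
    """Unit with the largest activation y_l = sum_i w_li * x_i."""
    return max(range(len(weights)),
               key=lambda l: sum(wi * xi for wi, xi in zip(weights[l], x)))


# Hypothetical example: x is already of unit length, the weights are not.
x = [1.0, 0.0]
raw_weights = [[0.9, 0.1], [3.0, 3.0]]
unit_weights = [normalise(w) for w in raw_weights]
```

On `raw_weights` the long vector (3, 3) wins the dot-product rule although (0.9, 0.1) is much closer to the input; on `unit_weights` both rules pick the same unit, in line with Proposition 3.2.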

An almost identical process of moving cluster centres is used in a large family of conventional clustering algorithms known as square error clustering methods, e.g., k-means, FORGY, ISODATA and CLUSTER. From now on, we will simply assume that a winner $k$ is selected, without being concerned with which algorithm is used.

4. The Inverse Problem of Fractals

The term "fractal" was coined by Mandelbrot in 1975 to describe such sets, from the Latin word fractus, meaning broken. He provided a precise technical definition [17]: "a fractal is a set with Hausdorff dimension strictly greater than its topological dimension". Instead of giving a precise definition of fractals, which would exclude some interesting cases, it is better to regard a fractal as a set that has the following properties [18]:

1) Has a “fine” structure.

2) Has some type of self-similarity.

3) Difficult to be described globally or locally by the classic Euclidean geometry.

4) Usually has a non-integer dimension.

5) Defined by a simple model that can be rendered recursively or iteratively.

6) Is usually a strange attractor for a dynamical system.

4.1. Iterated Function System

Figure 2. Winner selection by dot product: (a) with normalised vectors; (b) the selection fails if the vectors are unnormalised.

In 1988, Barnsley introduced the iterated function system (IFS) [17] as an application of the theory of discrete dynamical systems and a useful tool for building fractals and other self-similar sets. The mathematical theory of IFS is one of the bases for modeling techniques of fractals and is a powerful tool for producing mathematical fractals such as the Cantor set and the Sierpinski gasket, as well as representations of real-world fractal-like objects such as clouds, trees and faces. An IFS is defined through a finite set of contractive affine mappings, mostly of the form:

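$$f_i : \mathbb{R}^n \to \mathbb{R}^n, \quad i = 1, 2, \dots, N, \quad N \in \mathbb{N}.$$

In the particular case of two dimensions, affine maps have the following form:

$$f \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e \\ f \end{bmatrix}.$$

This map is characterized by the six constants a, b, c, d, e, f, which constitute the code of f.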
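To make the notion of an IFS code concrete, the sketch below stores each two-dimensional affine map as its six constants (a, b, c, d, e, f) and renders an attractor by random iteration (the "chaos game"). The three codes are the standard ones for a Sierpinski gasket; the function names, the seed and the burn-in length are illustrative assumptions.

```python
import random

# Each map is stored as its code (a, b, c, d, e, f).
SIERPINSKI = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]


def apply_map(code, point):
    """Apply the affine map with the given code to a point (x, y)."""
    a, b, c, d, e, f = code
    x, y = point
    return (a * x + b * y + e, c * x + d * y + f)


def chaos_game(codes, n_points, seed=0, burn_in=20):
    """Sample points near the attractor by applying randomly chosen maps."""
    rng = random.Random(seed)
    point = (0.0, 0.0)
    points = []
    for step in range(n_points + burn_in):
        point = apply_map(rng.choice(codes), point)
        if step >= burn_in:  # skip the first few points while the orbit settles
            points.append(point)
    return points


points = chaos_game(SIERPINSKI, 1000)
```

Only the 18 numbers in `SIERPINSKI` are needed to regenerate the whole set, which is the property that fractal compression exploits.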

4.2. Fractal Inverse Problem

The fractal inverse problem is an important research area with a great number of potential application fields. It consists in finding a fractal model or code that generates a given object. This concept has been introduced by Barnsley with the well known collage theorem [5]. When the considered object is an image, we often speak about fractal image compression. A method has been proposed by Jacquin [6] to solve this kind of inverse problem.

4.3. Collage Theorem on ( H ( X ) , h )

Since the number of points in fractal sets is infinite and complicatedly organized, it is difficult to specify exactly the generator IFS. From the practical point of view, it will be acceptable for the required IFS to be chosen such that its attractor is close to a given image for a pre-defined tolerance.

The collage theorem is very useful for simplifying the inverse problem for fractal images [5], and it has been addressed by many researchers as well [19].

Theorem 4.1 Let $\{X; w_1, w_2, \dots, w_N\}$ be a hyperbolic IFS with contractivity factor $s$, and let $W$ be the associated Hutchinson map. Then

$$h(A, A_W) \le \frac{h(A, W(A))}{1 - s}, \quad \forall A \in H(X), \tag{1.21}$$

where $A_W$ is the fixed point of $W$.

Hence, if $h(A, W(A)) < \epsilon$, then

$$h(A, A_W) < \frac{\epsilon}{1 - s}.$$

Proof 3 See [5].

The theorem can be used as follows. Given a fractal image $A$, find a set of contractive mappings that map $A$ into smaller copies of itself such that the union of the smaller copies is within $\epsilon$ of the target image. The determined contractions are the IFS codes, with corresponding Hutchinson operator $W$.

The theorem states that the attractor $A_W$ of the determined IFS approximates the target image $A$, i.e., $h(A, A_W) < \frac{\epsilon}{1 - s}$. It also implies that the more accurately the IFS maps the image to itself, the more accurately the IFS approximates the image.
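The bound can be checked numerically on a small one-dimensional example. The sketch below uses the two-map Cantor-set IFS on the real line (contractivity factor 1/3) and finite point sets, so the attractor is only approximated by iterating the Hutchinson map ten times from the point 0. The target set, the function names and the number of iterations are illustrative assumptions.

```python
def hausdorff(a_set, b_set):
    """Hausdorff distance h between two finite subsets of the real line."""
    def directed(p, q):
        return max(min(abs(x - y) for y in q) for x in p)
    return max(directed(a_set, b_set), directed(b_set, a_set))


# Cantor-set IFS: two maps with contractivity factor s = 1/3.
S = 1.0 / 3.0
MAPS = [lambda x: x / 3.0, lambda x: x / 3.0 + 2.0 / 3.0]


def hutchinson(points):
    """Hutchinson map W: union of the images of the set under every map."""
    return sorted({w(x) for w in MAPS for x in points})


def approximate_attractor(iterations=10):
    """Finite approximation of the attractor A_W obtained by iterating W from {0}."""
    points = [0.0]
    for _ in range(iterations):
        points = hutchinson(points)
    return points


target = [0.0, 0.5, 1.0]                          # the set A to be approximated
collage_error = hausdorff(target, hutchinson(target))  # h(A, W(A))
bound = collage_error / (1.0 - S)                 # right-hand side of the theorem
actual = hausdorff(target, approximate_attractor())    # approximates h(A, A_W)
```

For this target the collage error is 1/6, so the theorem guarantees that the attractor lies within 1/4 of the target, and the measured distance stays within that bound.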

5. Fractal Image Compression by Means of Kohonen Network

The Kohonen layer is a Winner-take-all (WTA) layer. Thus, for a given input vector, only one Kohonen layer output is 1 whereas all others are 0. No training vector is required to achieve this performance. Hence, the name: Self-Organizing Map Layer (SOM-Layer).

Let $r_i \in \mathbb{R}^n$ be the vector of intensities of the range block $R_i$, and let $d_j \in \mathbb{R}^n$ be the vector of intensities of the domain block $D_j$, transformed to the size of the corresponding range block. The matching error is

$$E(r_i, d_j) = \min_{\alpha, \beta} \| r_i - (\alpha d_j + \beta C) \|, \quad (\alpha, \beta) \in \mathbb{R}^2,$$

where $C \in \mathbb{R}^n$, $C = (1, \dots, 1)/\sqrt{n}$. Let $O$ be the operator of orthogonal projection that projects $\mathbb{R}^n$ onto the orthogonal complement $\Gamma^{\perp}$, where $\Gamma$ is the linear span of the vector $C$. For $Z = (z_1, \dots, z_n) \in \mathbb{R}^n \setminus \Gamma$ we define the operator

$$\tau(Z) = \frac{OZ}{\| OZ \|}.$$

Theorem 5.1 Assume that $n \ge 2$ and $X = \mathbb{R}^n \setminus \Gamma$. Let us define the function $\Delta : X \times X \to [0, \sqrt{2}]$ in the following way:

$$\Delta(d, r) = \min \left( \| \tau(r) + \tau(d) \|, \; \| \tau(r) - \tau(d) \| \right).$$

For $r_i, d_j \in X$ the minimum distance $E(r_i, d_j)$ is determined by the formula

$$E(r_i, d_j) = \langle r_i, \tau(r_i) \rangle \, g\left( \Delta(r_i, d_j) \right), \qquad g(\Delta) = \Delta \sqrt{1 - \frac{\Delta^2}{4}}.$$

Proof 4 See [20].

This theorem means that the smaller the distance $d(\tau(r_i), \pm\tau(d_j))$, the smaller the error $E(r_i, d_j)$. Let $D_\tau = \{ \pm\tau(d_j), \forall j \}$ and $R_\tau = \{ \tau(r_i), \forall i \}$, which we call the domain vectors and range vectors, respectively.
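Theorem 5.1 can be checked numerically by comparing its closed form with a direct least-squares fit. In the sketch below the projection $O$ is implemented as subtraction of the mean (which removes the component along the constant vector), and the optimal $\alpha$ is obtained in closed form from the projected vectors. The function names and the two example blocks are illustrative assumptions.

```python
import math


def project(z):
    """O z: remove the component along (1, ..., 1), i.e. subtract the mean."""
    mean = sum(z) / len(z)
    return [zi - mean for zi in z]


def norm(z):
    return math.sqrt(sum(zi * zi for zi in z))


def tau(z):
    """tau(z) = O z / ||O z||; z must not be a constant vector."""
    oz = project(z)
    length = norm(oz)
    return [zi / length for zi in oz]


def delta(d, r):
    """Delta(d, r) = min(||tau(r) + tau(d)||, ||tau(r) - tau(d)||)."""
    td, tr = tau(d), tau(r)
    plus = norm([a + b for a, b in zip(tr, td)])
    minus = norm([a - b for a, b in zip(tr, td)])
    return min(plus, minus)


def error_via_theorem(r, d):
    """E(r, d) = <r, tau(r)> * g(Delta), with g(D) = D * sqrt(1 - D^2 / 4)."""
    dl = delta(d, r)
    g = dl * math.sqrt(max(0.0, 1.0 - dl * dl / 4.0))
    inner = sum(a * b for a, b in zip(r, tau(r)))
    return inner * g


def error_via_least_squares(r, d):
    """min over (alpha, beta) of ||r - (alpha d + beta C)||, solved directly."""
    o_r, o_d = project(r), project(d)
    alpha = sum(a * b for a, b in zip(o_r, o_d)) / sum(b * b for b in o_d)
    return norm([a - alpha * b for a, b in zip(o_r, o_d)])


# Hypothetical 2 x 2 blocks flattened to vectors of length 4.
r = [10.0, 12.0, 9.0, 15.0]
d = [3.0, 8.0, 1.0, 7.0]
```

Both functions return the same error for these blocks, which is what allows the search to compare normalised vectors instead of solving a least-squares problem for every range-domain pair.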

Let us apply the Kohonen network algorithm to cluster the domain vectors. First, the domain vectors are used to train the network (learning); later, the range vectors are presented to the network to choose the optimal domain.

5.1. Global Codebook

The idea of the global codebook is to assign a fixed domain pool to the entire range pool or to a specific class of it (e.g., the set of range blocks that have the same size in a quadtree partition) [16]. In a local codebook, by contrast, each range block in the range pool has its own domain pool, and this domain pool can be selected in many ways. One method of selecting the domain pool for a range block is to construct a set of domain blocks that are spatially close to the range block [11].

5.2. Ranges Filtering Algorithm

1) Set $i = 0$.

2) If $\operatorname{var}(r_i) > \epsilon$, goto 5.

3) Add the arguments of this range (the position and the mean $\bar{r}_i$ are enough) to the first codebook.

4) Remove $r_i$ from $R$ and mark it as one-typed.

5) $i = i + 1$.

6) If $i < N_r$, goto 2.

7) End.
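The filtering steps above can be sketched as a single pass over the range pool. The data layout (a list of position and pixel-list pairs), the function names and the example threshold are illustrative assumptions.

```python
def variance(block):
    """Population variance of the pixel intensities in a block."""
    mean = sum(block) / len(block)
    return sum((v - mean) ** 2 for v in block) / len(block)


def filter_ranges(ranges, epsilon):
    """Split range blocks into a flat codebook and the blocks left to encode.

    ranges: list of (position, pixel_values) pairs.
    Returns (flat_codebook, remaining), where flat_codebook stores only
    (position, mean) for blocks whose variance does not exceed epsilon.
    """
    flat_codebook, remaining = [], []
    for position, pixels in ranges:
        if variance(pixels) > epsilon:
            remaining.append((position, pixels))
        else:
            flat_codebook.append((position, sum(pixels) / len(pixels)))
    return flat_codebook, remaining


# Hypothetical range pool with one nearly uniform and one textured block.
ranges = [
    ((0, 0), [100, 100, 101, 100]),
    ((0, 2), [10, 200, 30, 180]),
]
flat, remaining = filter_ranges(ranges, epsilon=1.0)
```

Only the blocks in `remaining` need a domain search; the nearly uniform block is stored by its position and mean alone.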

5.3. (Training) Domains Clustering Algorithm

1) Network initialization. Let the neurons of the net be

$$D_c = \{ d_{c_i} \}_{i=0}^{N_c},$$

where $d_{c_i}$ is cluster (neuron) number $i$ of the $N_c$ neurons and $w_i$ is the weight vector of neuron number $i$, which is initialized to a random domain.

2) Search for the nearest cluster for each $\tau(d_j) \in D_\tau$ and choose the winner neuron $d_{c_k} \in D_c$: $\forall \tau(d_j) \in D_\tau$, find $w_k$ such that

$$d(w_k, \tau(d_j)) \le d(w_i, \tau(d_j)), \quad i = 1, \dots, N_c,$$

and add the index of the vector $\tau(d_j)$ to the corresponding memory $D_{c_k}$.

3) Update the weight vector $w_k$ of the winning neuron $d_{c_k}$ in the following way:

$$w_k = w_k + \gamma \left( \tau(d_j) - w_k \right), \quad \gamma \in (0, 1).$$

4) End.

Table 1. Results of the classic algorithm (without neural networks) vs the neurofractal.
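The training steps of Section 5.3 can be sketched as follows. The sketch assumes that the inputs are already the normalised domain vectors $\tau(d_j)$, repeats the update for a fixed number of passes, and fills the cluster memories in a final pass after training; these choices, the function names and the parameter values are assumptions made for illustration and are not specified in the paper.

```python
import math
import random


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest(weights, v):
    """Index of the weight vector closest to v (the winning neuron)."""
    return min(range(len(weights)), key=lambda i: distance(weights[i], v))


def cluster_domains(domain_vectors, n_clusters, gamma=0.1, epochs=5, seed=0):
    """Return (weights, members): weight vectors and, per cluster, the domain indices."""
    rng = random.Random(seed)
    # Step 1: initialise each weight vector to a randomly chosen domain vector.
    weights = [list(v) for v in rng.sample(domain_vectors, n_clusters)]
    # Steps 2-3, repeated for a few passes (the number of passes is an assumption).
    for _ in range(epochs):
        for v in domain_vectors:
            k = nearest(weights, v)
            weights[k] = [w + gamma * (x - w) for w, x in zip(weights[k], v)]
    # Final assignment of each domain index to the memory of its winning cluster.
    members = [[] for _ in range(n_clusters)]
    for j, v in enumerate(domain_vectors):
        members[nearest(weights, v)].append(j)
    return weights, members


# Hypothetical domain vectors (already passed through tau).
domains = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
weights, members = cluster_domains(domains, n_clusters=2)
```

Every domain index ends up in exactly one cluster memory, so a later range search only has to look inside the clusters that can contain a good match.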

5.4. (Encoding) Ranges Matching Algorithm

1) For each cluster $d_{c_k}$, let $\min_k = \min_{j \in D_{c_k}} d(w_k, \tau(d_j))$ and $\max_k = \max_{j \in D_{c_k}} d(w_k, \tau(d_j))$.

2) $\forall r \in R$, set $k = 0$ and $d_r = d(\tau(r), w_k)$.

3) Let $d_r^k = d(\tau(r), w_k)$.

4) $\forall j \in D_{c_k}$: if $d(\tau(r), \tau(d_j)) \le d_r$, then $r_{code} = \arg(\tau(d_j))$ and $d_r = d(\tau(r), \tau(d_j))$.

5) $k = k + 1$; if $k > N_c$ then goto 7.

6) If $d(\tau(r), w_k) - \max_k \le d_r$ (by the triangle inequality, cluster $k$ may still contain a closer domain), goto 3; else goto 5.

7) Add $r_{code}$ to the codebook.

8) End.
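The matching steps can be sketched for a single range vector as below. The sketch assumes that the vectors are already normalised by $\tau$, starts from an infinite best distance rather than from the distance to the first weight vector, and skips a cluster when the triangle inequality shows it cannot contain a closer domain. The function names and example data are illustrative assumptions.

```python
import math


def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def match_range(r_vec, domain_vectors, weights, members):
    """Return (best_domain_index, best_distance) for one range vector."""
    # Step 1: radius of each cluster around its weight vector.
    max_k = [max((distance(w, domain_vectors[j]) for j in m), default=0.0)
             for w, m in zip(weights, members)]
    best_j, best_d = None, math.inf
    for k, w in enumerate(weights):
        # Triangle inequality: every domain in cluster k is at least
        # d(r, w_k) - max_k away, so skip the cluster if that exceeds best_d.
        if distance(r_vec, w) - max_k[k] > best_d:
            continue
        for j in members[k]:
            d = distance(r_vec, domain_vectors[j])
            if d <= best_d:
                best_j, best_d = j, d
    return best_j, best_d


# Hypothetical clustered domain pool and one range vector.
domains = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
weights = [[0.95, 0.05], [0.05, 0.95]]
members = [[0, 1], [2, 3]]
best_j, best_d = match_range([0.2, 0.8], domains, weights, members)
```

Because the pruning test is a lower bound on the distance to any member of the cluster, the result is the same as an exhaustive search over all domains, only with fewer distance evaluations when clusters are well separated.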

6. Results and Discussion

Gray-level images of size 256 × 256 were considered for training the network. A range pool is created having ranges of size 4 × 4 and 8 × domain blocks.

The computer simulations have been carried out in Visual C# environment on Pentium Dual CPU with 1.73 GHz and 2.00 GB RAM and the results have been presented in Table 1.

Table 1 compares the fractal image compression results of the standard scheme introduced in [19] with those of the self-organizing method. The neural method classifies domains using the self-organizing neural network approach; in each case, a total of 320 domain cells were used. A larger number of domains would have increased encoding times and provided marginally better compression ratios. The self-organizing method is faster than the filtered ranges method and therefore faster than the baseline method.


References

1. Rajeshri, R. and Yashwant, S.C. (2013) Escape Time Fractals of Inverse Tangent Function. International Journal of Computer and Organization Trends, 3, 16-21.

2. Satyendra, K.P., Munshi, Y. and Arunima (2012) Fracint Formula for Overlaying Fractals. Journal of Information Systems and Communication, 3, 347-352.

3. Barnsley, M.F. and Sloan, A.D. (1987) Chaotic Compression. Computer Graphics World, 10, 107-108.

4. Barnsley, M. and Sloan, A. (1988) A Better Way to Compress Images. Byte, 13, 215-223.

5. Barnsley, M.F. (1988) Fractals Everywhere. Academic Press, New York.

6. Jacquin, A.E. (1992) Image Coding Based on a Fractal Theory of Iterated Contractive Image Transformations. IEEE Transactions on Image Processing, 1, 18-30.

7. Fisher, Y. (1994) Fractal Image Compression: Theory and Application. Springer-Verlag, New York.

8. Sheridan, P. (1996) Spiral Architecture for Machine Vision. Ph.D. Thesis, University of Technology, Sydney.

9. Bouboulis, P., Dalla, P.L. and Drakopoulos, V. (2006) Image Compression Using Recurrent Bivariate Fractal Interpolation Surfaces. International Journal of Bifurcation and Chaos, 16, 2063-2071.

10. Kramm, M. (2007) Compression of Image Clusters Using Karhunen Loeve Transformations. Electronic Imaging, Human Vision, XII, 101-106.

11. Koli, N. and Ali, M. (2008) A Survey on Fractal Image Compression Key Issues. Information Technology Journal, 7, 1085-1095.

12. Bressloff, P.C. and Stark, J. (1991) Neural Networks, Learning Automata and Iterated Function Systems. In: Crilly, A.J., Earnshaw, R.A. and Jones, H., Eds., Fractals and Chaos, Springer-Verlag, 145-164.

13. Jacquin, A.E. (1992) Image Coding Based on a Fractal Theory of Iterated Contractive Image Transformations. IEEE Transactions on Image Processing, 1, 18-30.

14. Hamzaoui, R. (1995) Codebook Clustering by Self-Organizing Maps for Fractal Image Compression. NATO ASI Conference Fractal Image Encoding and Analysis, Trondheim, July 1995, 27-38.

15. Kohonen, T. (1982) Self-Organized Formation of Topologically Correct Feature Maps. Biological Cybernetics, 43, 59-69.

16. Kröse, B. and van der Smagt, P. (1996) An Introduction to Neural Networks. University of Amsterdam, Amsterdam.

17. Mandelbrot, B.B. (1982) The Fractal Geometry of Nature. Freeman Press, New York.

18. Nikiel, S. (2007) Iterated Function Systems for Real-Time Image Synthesis. Springer-Verlag, London.

19. Al-Helali, B. (2010) Fractal Image Compression Using Iterated Function Systems. M.Sc. Thesis, Taiz University, Yemen.

20. Saupe, D., Hamzaoui, R. and Hartenstein, H. (1996) Fractal Image Compression: An Introductory Overview. In: Saupe, D. and Hart, J., Eds., Fractal Models for Image Synthesis, Compression and Analysis, ACM, New Orleans, SIGGRAPH'96 Course Notes 27.