
TRIANGLE-BOX COUNTING METHOD FOR FRACTAL DIMENSION ESTIMATION

Kuntpong Woraratpanya1, Donyarut Kakanopas2, Ruttikorn Varakulsiripunth3

1 Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Thailand, Email: [email protected]

2 Faculty of Information Technology, King Mongkut's University of Technology North Bangkok, Thailand, Email: [email protected]

3 Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, Thailand, Email: [email protected]

Received: May 30, 2012

Abstract A fractal dimension (FD) is an important feature that characterizes the roughness and self-similarity of complex objects in natural images. In practice, the FD is commonly estimated by box counting (BC), one of the most widely used estimation algorithms. However, this algorithm is sensitive to the minimum box-covering requirement, which depends on the box-size variation and the box count, and this makes its results inaccurate. This paper proposes a triangle-box counting (TBC) method that provides more accurate FD estimates. Derived from classical BC, the method simply divides each square box into two equal triangular boxes to increase the precision of the box count and to fit the minimum box-covering requirement. The accuracy of the implemented algorithm is validated against fractal images with known theoretical FDs. The experimental results show that the TBC method provides more accurate estimates than the classical BC method. Furthermore, the TBC method is applied to a content-based image retrieval (CBIR) system, where it outperforms the classical BC method in terms of recall, precision, accuracy, and retrieved order. More accurate FD estimates thus yield more powerful features for both general and fine-grained applications.

Keywords: Fractal Dimension, Box Counting, Triangle-Box Counting, Fractal Image, Fractal Dimension Estimation, Content-Based Image Retrieval

Introduction A fractal dimension (FD) is an effective measure for complex objects. It can be viewed as a feature that characterizes roughness and self-similarity in natural images. Over the last ten years, the FD has been applied in many areas, such as pattern recognition, texture analysis, segmentation, and medical signal analysis [1]. These applications rely heavily on FD estimation. For example, O.M. Bruno et al. [2] applied the FD to plant identification, using box-counting (BC) and multiscale Minkowski methods to estimate the FD. Their results showed that both methods have strong potential for recognizing tree species. In medical applications, R.D. King et al. [3] used the BC method to estimate the FD of cortical ribbons in order to discriminate between patients with different degrees of cerebral atrophy; the FD successfully discriminated between the two clinical groups. J.Z. Liu et al. [4] also applied the BC method to estimate the FD of the human cerebellum. Their results indicated that the cerebellum skeleton is a highly fractal structure and that there is no significant difference in cerebellum fractal dimension between men and women. Although the FD has been applied successfully across broad research areas, most of these papers suggested that a more accurate FD estimation method is still required. Various FD estimation approaches have been introduced, such as the differential box-counting method (DBCM), the extended counting method (XCM), and fractional Brownian motion (fBm) methods, as reported in [1].

Invited Paper

The box-counting method defined by Russell et al. in 1980 [1], [5] is one of the most commonly used methods for estimating the FD of natural objects because it is simple and practical, as described in [1], [4], [6], [7]. However, the method is sensitive to the minimum box-covering requirement, which depends on the box-size variation and the box count, and this makes its results inaccurate [7]. Furthermore, it applies only to binary images [1]. Various papers [8], [9], [10], [11] attempt to extend BC to gray-level images. Nevertheless, FD estimation for binary images remains essential for applications whose features require high computational speed and little storage. A few research papers have introduced factors that help improve box-counting estimation for binary images. For example, K. Foroutan-pour et al. [7] reported that the box size is an important factor directly related to the accurate estimation of FDs in binary images. They proposed a procedure for defining the most appropriate range of box sizes for any individual image, but did not report an improved method or algorithm. A.R. Backes [12] introduced a combination of the multilevel Otsu approach and the FD to create a signature vector. Although this approach achieved correct classification, it produced a longer signature vector, which costs computation time.

As mentioned above, two factors are directly related to accurate FD estimation: the box size and the minimum box covering. This paper proposes the triangle-box counting (TBC) method to improve the accuracy of FD estimation. The method simply divides each square box into two equal triangular boxes so as to increase the precision of the box counts associated with each box size and to fit the minimum box-covering requirement suggested in [7]. Finally, to show that the proposed method outperforms the classical BC method, both methods are implemented in MATLAB and validated with fractal images that are generated by mathematical formulas and have known theoretical FDs. Furthermore, both methods are implemented in a CBIR system to evaluate their performance.

Proposed Method In this section, the background of the BC method for fractal dimension estimation is reviewed and its drawback is pointed out. The proposed method is described in the last subsection.

Backgrounds of Box-counting Method In mathematics, a fractal dimension is a ratio providing a statistical index of complexity that compares how the detail in a pattern changes with the scale at which it is measured [13]. In applications, on the other hand, the fractal dimension can be viewed as a feature that characterizes roughness and self-similarity, especially in natural images [1], [8].

The box-counting method defined by Russell et al. in 1980 [1], [5] is one of the most commonly used techniques for fractal dimension estimation. This method covers a binary image with boxes of side length s, and the fractal dimension is estimated as:

$$\mathrm{FD} = \frac{\log N(s)}{\log (1/s)}, \qquad (1)$$

where N(s) is the number of boxes needed to completely cover the image. In practice, the steps to estimate the FD of a binary image are as follows.

First, grids of various box sizes s are laid over the image, the number of boxes N(s) that cover the object is counted for each s as shown in Figure 1, and each N(s) is recorded as illustrated in the second column of Table 1.

Then log(N(s)) is plotted against log(1/s). Finally, a least-squares regression line is fitted through the data points. The slope of this regression line is the estimated FD.
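The three steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' MATLAB implementation: the function names `box_count` and `estimate_fd` are hypothetical, the image is assumed to be a non-empty 2-D binary array, and any rows or columns that do not fill a whole box are simply trimmed.

```python
import numpy as np

def box_count(image, s):
    """Count the s-by-s grid boxes that contain at least one object pixel."""
    image = np.asarray(image, dtype=bool)
    h, w = image.shape
    # Assumption: trim edges that do not fill a whole box.
    trimmed = image[: h - h % s, : w - w % s]
    # View the image as a grid of (h//s) x (w//s) boxes of s x s pixels.
    blocks = trimmed.reshape(h // s, s, w // s, s)
    return int(blocks.any(axis=(1, 3)).sum())

def estimate_fd(image, sizes):
    """Slope of the least-squares line of log N(s) against log(1/s)."""
    counts = np.array([box_count(image, s) for s in sizes], dtype=float)
    x = np.log10(1.0 / np.array(sizes, dtype=float))
    y = np.log10(counts)  # assumes every N(s) > 0
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)

# Sanity checks on shapes with known dimension (illustrative inputs).
filled = np.ones((64, 64), dtype=bool)   # a filled square
line = np.zeros((64, 64), dtype=bool)
line[0, :] = True                        # a straight line
fd_filled = estimate_fd(filled, [32, 16, 8, 4, 2])
fd_line = estimate_fd(line, [32, 16, 8, 4, 2])
```

For these two inputs the counts scale exactly as (64/s)^2 and 64/s, so the fitted slopes should come out at 2 and 1 up to floating-point error, which is a quick way to check the counting before trying a real fractal image.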

The following example demonstrates the steps of the BC method by estimating the FD of a Koch curve fractal image, whose theoretical FD is 1.2620. In this method, the important factor for accurate estimation of the FD is the box count. In this example, the largest box size is half of the image size and the smallest box size is 2x2. The box size is halved from one scale to the next, as depicted in Figure 1. Table 1 shows the relation between the box size, s, and the number of boxes needed to completely cover the image, N(s). Figure 2 shows the slope of the least-squares regression line. As a result, the FD estimated by the BC method is approximately 1.2131, which is 96.13% accurate when compared with the theoretical FD, 1.2620.

Accurate FD estimation requires the minimum box covering [5]. Looking back at the box counts for the different box sizes shown in Figure 1(b)-1(e), there are many boxes that do not completely cover the Koch curve, so the minimum box-covering requirement is not met. The estimation error caused by the box counts at the various box sizes therefore becomes the critical factor. One way to reduce this error is to provide box counts, at each box size, that cover the object as tightly as possible. To overcome this problem, triangle-box counting (TBC) is proposed to make the minimum box-covering requirement achievable. The following subsection demonstrates triangle-box counting for estimating FDs.

(a) s = 128 and N(s) = 4 (b) s = 64 and N(s) = 6 (c) s = 32 and N(s) = 18

(d) s = 16 and N(s) = 38 (e) s = 8 and N(s) = 83

Figure 1. Box counts (N(s)) with various box sizes (s).

Table 1. Relation between Box Counts and Box Sizes.

s      Box count N(s)   log10(1/s)   log10(N(s))
128    4                -2.1072      0.6021
64     6                -1.8062      0.7782
32     18               -1.5051      1.2553
16     38               -1.2041      1.5798
8      83               -0.9031      1.9191
4      218              -0.6021      2.3385
2      561              -0.3010      2.7490

Figure 2. A least-square regression line through the data points of the box-counting method (FD 1.2131).
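For reference, the slope of a least-squares line has a standard closed form. The block below is a sketch that applies it under the assumption that $x_i = \log_{10}(1/s_i)$ and $y_i = \log_{10} N(s_i)$ are the seven rows of Table 1; the symbols $x_i$, $y_i$, $\bar{x}$, $\bar{y}$, and $\hat{\beta}$ are introduced here for illustration and do not appear in the original text.

```latex
\[
\mathrm{FD} \approx \hat{\beta}
  = \frac{\sum_{i=1}^{7} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{7} (x_i - \bar{x})^{2}},
\qquad
\bar{x} = \frac{1}{7}\sum_{i=1}^{7} x_i,
\quad
\bar{y} = \frac{1}{7}\sum_{i=1}^{7} y_i .
\]
% With the Table 1 values, the numerator is about 3.0781
% and the denominator is about 2.5373.
\[
\hat{\beta} \approx \frac{3.0781}{2.5373} \approx 1.2131 .
\]
```

The result agrees with the slope reported for Figure 2.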

Triangle-box Counting Method The accuracy of FD estimation depends on the minimum box covering and the box size. To meet these requirements, the method simply divides each square box of the grid into two equal triangular boxes, which increases the precision of the box counts associated with each box size and fits the minimum box-covering requirement, as shown in Figure 4. For this purpose, the following algorithm is proposed for counting the objects covered by triangle boxes, i.e., the box count N(s).

Step 1: Set g to the largest box size, N(s) = 0, and i = 1, where g is the grid size.

Step 2: Split a square box into two equal triangular boxes, as depicted in Figure 3.

Step 3: Count the non-empty triangle boxes in both patterns, where C1 and C2 denote the counter variables for triangle-box pattern-1 and pattern-2, respectively.

Step 4: Update the box count as follows:
if C1 and C2 are both equal to 2, Ni(s) = Ni(s) + 2;
else if C1 and C2 are both equal to 1, Ni(s) = Ni(s) + 1;
else if C1 is not equal to C2 and both are greater than 0, n = min{C1, C2} and Ni(s) = Ni(s) + n;
else if C1 or C2 is equal to 0, n = max{C1, C2} and Ni(s) = Ni(s) + n;
else do nothing.

Step 5: If i has not reached the final box, set i = i + 1 and go to Step 2; otherwise, stop.
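Steps 2 to 5 can be sketched for a single box size s as follows. This is a hedged illustration in Python with NumPy rather than the authors' MATLAB code. Figure 3 is not reproduced here, so the split geometry is an assumption: pattern-1 is taken to cut along the main diagonal and pattern-2 along the anti-diagonal, with pixels on the cut assigned to one fixed side. The names `triangle_counts`, `step4_increment`, and `tbc_count` are hypothetical.

```python
import numpy as np

def triangle_counts(block):
    """Return (C1, C2): non-empty triangles under each assumed split pattern."""
    s = block.shape[0]
    rows, cols = np.indices((s, s))
    upper = cols >= rows          # assumed pattern-1: cut along the main diagonal
    left = rows + cols <= s - 1   # assumed pattern-2: cut along the anti-diagonal
    c1 = int(block[upper].any()) + int(block[~upper].any())
    c2 = int(block[left].any()) + int(block[~left].any())
    return c1, c2

def step4_increment(c1, c2):
    """Amount added to the box count for one square box (Step 4)."""
    if c1 == 2 and c2 == 2:
        return 2
    if c1 == 1 and c2 == 1:
        return 1
    if c1 != c2 and c1 > 0 and c2 > 0:
        return min(c1, c2)
    if c1 == 0 or c2 == 0:
        return max(c1, c2)
    return 0

def tbc_count(image, s):
    """Triangle-box count N(s) of a binary image for box size s."""
    image = np.asarray(image, dtype=bool)
    h, w = image.shape
    if h % s or w % s:
        raise ValueError("image sides must be multiples of the box size")
    total = 0
    for r in range(0, h, s):          # visit every square box of the grid
        for c in range(0, w, s):
            c1, c2 = triangle_counts(image[r:r + s, c:c + s])
            total += step4_increment(c1, c2)
    return total

n_full = tbc_count(np.ones((4, 4), dtype=bool), 4)   # both triangles occupied
n_diag = tbc_count(np.eye(4, dtype=bool), 4)         # pixels on one diagonal only
```

A fully occupied box contributes 2, while a box whose pixels can be covered by a single triangle in one of the patterns contributes 1, which is how the method tightens the covering relative to counting whole squares.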

(a) pattern-1 (b) pattern-2

Figure 3. Patterns of splitting a square box into two equal triangular boxes.

(a) s = 128 and N(s) = 6 (b) s = 64 and N(s) = 10 (c) s = 32 and N(s) = 29

(d) s = 16 and N(s) = 68 (e) s = 8 and N(s) = 148

Figure 4. Box counts (N(s)) with various triangle-box sizes (s).

Table 2. Relation between Box Counts and Triangle-Box Sizes.

s      Box count N(s)   log10(1/s)   log10(N(s))
128    6                -2.1072      0.7782
64     10               -1.8062      1.0000
32     29               -1.5051      1.4624
16     68               -1.2041      1.8325
8      148              -0.9031      2.1703
4      397              -0.6021      2.5988
2      1059             -0.3010      3.0249

Table 2 and Figure 4 show the box counts for the various triangle-box sizes, and Figure 5 shows the slope of the regression line, which represents the FD estimate. In this case, the FD estimated by the TBC method is approximately 1.2630, which is 99.92% accurate when compared with the theoretical FD. This is an improvement of about 3.8 percentage points over the 96.13% accuracy of the BC method in the same example.

Figure 5. A least-square regression line through the data points of the triangle-box counting method (FD 1.2630).
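The two regression slopes can be recomputed directly from the counts in Table 1 and Table 2. The sketch below uses NumPy's `polyfit`; the helper name `regression_slope` is hypothetical, and base-10 logarithms are assumed because they reproduce the logarithm columns of both tables.

```python
import numpy as np

# Box sizes shared by Table 1 and Table 2.
sizes = np.array([128, 64, 32, 16, 8, 4, 2], dtype=float)
# Box counts N(s) copied from Table 1 (BC) and Table 2 (TBC).
bc_counts = np.array([4, 6, 18, 38, 83, 218, 561], dtype=float)
tbc_counts = np.array([6, 10, 29, 68, 148, 397, 1059], dtype=float)

def regression_slope(sizes, counts):
    """Least-squares slope of log10 N(s) against log10(1/s)."""
    x = np.log10(1.0 / sizes)
    y = np.log10(counts)
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)

fd_bc = regression_slope(sizes, bc_counts)
fd_tbc = regression_slope(sizes, tbc_counts)
print(f"BC  slope: {fd_bc:.4f}")
print(f"TBC slope: {fd_tbc:.4f}")
```

The printed slopes should land close to the 1.2131 and 1.2630 reported for Figure 2 and Figure 5.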

Experimental Results To evaluate the TBC method, the accuracy of its FD estimates and its efficiency in a CBIR system are tested as follows.

Accuracy of Fractal Dimension Estimations In this experiment, both the classical BC and the TBC methods are implemented in MATLAB. The implemented algorithms are validated with fractal images generated by mathematical formulas, with theoretical FDs ranging from 1.2620 to 2.0000. The fractal images used for validation are 256x256 pixels, and the experiment reports the results as the accuracy of the FD estimates.

A comparison of the experimental results is shown in Table 3. For the first fractal image, the theoretical FD is 1.2620. The TBC method estimates 1.2472, an accuracy of 98.83%, whereas the BC method estimates 1.1860, an accuracy of 93.98%. Overall, the TBC method estimates FDs close to the theoretical values. For low and medium theoretical FDs, 1.2620-1.6280, the proposed method clearly outperforms the BC method, with accuracies of 97.11%-98.83% against 90.66%-93.98%. For high-FD images, the TBC and BC methods yield almost the same results, differing by less than half a percentage point. This experiment indicates that the TBC algorithm provides more accurate FD estimates in the low and medium range. The next subsection illustrates the efficiency of applying the TBC method to the CBIR system.

Table 3. Experimental Results of FD Estimation Methods Compared to Theoretical FD.

Theoretical FD of fractal image   TBC estimated FD (accuracy)   BC estimated FD (accuracy)
1.2620                            1.2472 (98.83%)               1.1860 (93.98%)
1.4650                            1.4227 (97.11%)               1.3282 (90.66%)
1.5850                            1.5585 (98.33%)               1.4677 (92.60%)
1.5850                            1.5635 (98.64%)               1.4561 (91.87%)
1.6280                            1.5815 (97.14%)               1.5018 (92.25%)
1.8930                            1.9103 (99.09%)               1.8839 (99.52%)
1.9000                            1.9020 (99.89%)               1.9032 (99.83%)
2.0000                            1.8949 (94.75%)               1.8962 (94.81%)
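The paper does not state how the accuracy percentages are computed. The values in Table 3 are consistent with one minus the relative error, which is the assumption made in the sketch below; the function name `accuracy_percent` is hypothetical. Note that an overestimate such as 1.9103 against 1.8930 is penalized in the same way as an underestimate.

```python
def accuracy_percent(estimated_fd, theoretical_fd):
    """Assumed accuracy measure: 100 * (1 - relative error)."""
    relative_error = abs(estimated_fd - theoretical_fd) / theoretical_fd
    return 100.0 * (1.0 - relative_error)

# (theoretical FD, TBC estimate, BC estimate) rows copied from Table 3.
rows = [
    (1.2620, 1.2472, 1.1860),
    (1.8930, 1.9103, 1.8839),
    (1.9000, 1.9020, 1.9032),
]
for theo, tbc, bc in rows:
    print(f"{theo:.4f}  TBC {accuracy_percent(tbc, theo):.2f}%  "
          f"BC {accuracy_percent(bc, theo):.2f}%")
```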

Efficiency in Implementing with CBIR In this experiment, a content-based image retrieval (CBIR) system is set up as illustrated schematically in Figure 6. The system consists of two parts, a training process and a testing process. In the training process, texture and shape features are extracted from the training set by a binarization procedure, as shown in Figure 7. The binarization procedure first converts color images into gray-level images, as depicted in Figure 7(b). Then a thresholding technique and the Canny algorithm are applied to the gray-level images to extract texture and shape features, as demonstrated in Figure 7(c) and 7(d), respectively. Finally, the TBC algorithm estimates the FDs of the texture and shape features in the binary images (Figure 7(c) and 7(d)) and forms the FD feature vectors.

In the testing process, a query image is processed in the same way as in the training process to obtain its FD feature vector. In addition, to test the accuracy of the FDs estimated by the TBC and BC methods in the CBIR system without bias, two influencing factors are controlled, namely the similarity measures and the features. Two simple similarity measures, L1 distance and k-nearest neighbor (k-NN), are used, and only the shape and texture features extracted by FD estimation are supplied to the system. In fact, images in the Caladium Bicolor database are characterized by three core features: shape, texture, and color. Thus, ignoring the color feature may reduce the recall and precision levels.
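The two similarity measures can be sketched on two-component FD feature vectors (texture FD, shape FD). Everything concrete below is assumed for illustration: the feature values and class labels are made up, the helper names `l1_rank` and `knn_predict` are hypothetical, and the paper does not say which distance its k-NN uses, so L1 is assumed there as well.

```python
import numpy as np
from collections import Counter

def l1_rank(query, features, top_n):
    """Indices of the top_n database vectors closest to the query in L1 distance."""
    distances = np.abs(features - query).sum(axis=1)
    return np.argsort(distances, kind="stable")[:top_n]

def knn_predict(query, features, labels, k=3):
    """Majority class among the k nearest database vectors (L1 distance assumed)."""
    nearest = l1_rank(query, features, k)
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (texture FD, shape FD) vectors for two classes.
features = np.array([
    [1.30, 1.10], [1.32, 1.12], [1.31, 1.09],   # class "A"
    [1.70, 1.40], [1.72, 1.43], [1.68, 1.40],   # class "B"
])
labels = ["A", "A", "A", "B", "B", "B"]
query = np.array([1.71, 1.42])

ranked = l1_rank(query, features, 3)
predicted = knn_predict(query, features, labels, k=3)
```

With features this low-dimensional, retrieval quality depends almost entirely on how well the FD estimates separate the classes, which is why the experiment holds the similarity measures and features fixed while swapping only the FD estimator.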

Figure 6. A schematic diagram of a CBIR system.

(a) Original image (b) Gray-level image (c) Texture image (d) Shape image

Figure 7. Results of the binarization procedure and Canny algorithm for providing texture and shape features, respectively.

Figure 8. A part of Caladium Bicolor test images.

The data set for the experiments is a Caladium Bicolor image database consisting of 50 classes with 10 sample images per class, 500 images in total. This data set is used as the training set. A sample of the Caladium Bicolor test images is shown in Figure 8. The images used in this experiment are grouped into three sizes: 64x64, 128x128, and 256x256. The efficiency of FD estimation for the CBIR system is evaluated by recall and precision, which are defined as

$$\mathrm{Re} = \frac{\mathrm{NRR}}{\mathrm{NPR}}, \qquad (2)$$

where Re is the recall, NRR is the number of relevant images that are retrieved, and NPR is the number of relevant images in the database, and

$$\mathrm{Pr} = \frac{\mathrm{NRR}}{\mathrm{NTR}}, \qquad (3)$$

where Pr is the precision and NTR is the total number of images that are retrieved for the query.

In the experiment, the testing set consists of 195 images from the 50 classes. For each class, 3-4 query images are tested with the implemented system, and the average recall and precision are calculated. Since NPR is equal to NTR, the recall is equal to the precision. Tables 4 and 5 show the average recall and precision of testing with L1 distance and k-NN (k=3), respectively. The proposed method clearly outperforms the classical BC method for all image sizes. Table 6 shows the average accuracy of testing with k-NN (k=3), which illustrates that more accurate FD features lead to higher image-retrieval accuracy. Furthermore, when L1 distance is applied to the CBIR system, the TBC method retrieves the first four images from the same class as the query (Figure 9(a)), whereas the BC method retrieves only the first image from the same class (Figure 9(b)), at the same recall and precision values (Re=Pr=0.4). The same behavior is observed when using k-NN, as demonstrated in Figure 10. This implies that the retrieved-image order of the proposed method is better than that of the BC method at the same levels of recall and precision.
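Equations (2) and (3), together with the retrieved-order argument, can be illustrated with a short sketch. The two retrieved lists below are made-up examples, not the contents of Figure 9, and the helper names `recall_precision` and `leading_hits` are hypothetical. Both lists contain four relevant images out of ten retrieved, so they share Re = Pr = 0.4 when NPR = 10, yet they differ in how many relevant images appear consecutively at the top.

```python
def recall_precision(retrieved_labels, query_label, n_relevant_in_db):
    """Return (Re, Pr) = (NRR / NPR, NRR / NTR) for one query."""
    nrr = sum(1 for label in retrieved_labels if label == query_label)
    ntr = len(retrieved_labels)
    return nrr / n_relevant_in_db, nrr / ntr

def leading_hits(retrieved_labels, query_label):
    """Number of consecutive relevant images at the top of the ranking."""
    count = 0
    for label in retrieved_labels:
        if label != query_label:
            break
        count += 1
    return count

# Hypothetical rankings: "q" marks an image from the query's class.
ranking_a = ["q", "q", "q", "q", "x", "x", "x", "x", "x", "x"]
ranking_b = ["q", "x", "q", "x", "q", "x", "q", "x", "x", "x"]

re_a, pr_a = recall_precision(ranking_a, "q", n_relevant_in_db=10)
re_b, pr_b = recall_precision(ranking_b, "q", n_relevant_in_db=10)
```

Recall and precision alone cannot distinguish the two rankings, which is why the retrieved order is reported as a separate criterion.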

Table 4. Average Recall and Precision of Testing with L1 Distance (see footnote 1).

Image size   TBC      BC
64x64        0.3831   0.2262
128x128      0.4205   0.2538
256x256      0.4533   0.2523

Table 5. Average Recall and Precision of Testing with k-NN, k=3 (see footnote 1).

Image size   TBC      BC
64x64        0.3754   0.2215
128x128      0.4149   0.2415
256x256      0.4344   0.2456

Table 6. Average Accuracy of Testing with k-NN, k=3.

Image size   TBC                BC
64x64        75.38% (147/195)   49.23% (96/195)
128x128      85.64% (167/195)   51.79% (101/195)
256x256      85.64% (167/195)   51.79% (101/195)

Conclusions In this paper, a triangle-box counting (TBC) method is proposed to improve the accuracy of fractal dimension estimation for binary images. The method increases the precision of the box counts and fits the minimum box-covering requirement by simply dividing each square box into two equal triangular boxes. The proposed method is simple and practical to implement in applications. This paper verifies the implemented algorithm in two ways. First, the algorithm is tested with standard fractal images. Validation against the theoretical FDs shows that the TBC method estimates FDs close to the theoretical values. For low and medium theoretical FDs in particular, the proposed method clearly outperforms the classical box-counting (BC) method. For high-FD images, the proposed method and the BC method yield almost the same results. Second, the TBC method is applied to extract shape and texture features for the CBIR system. The results illustrate that a more accurate FD feature leads to higher efficiency of the CBIR system in terms of recall, precision, accuracy, and retrieved order, when compared with the classical BC method.

1 Since the number of sample images for each class in the database (NPR = 10) is equal to the total number of images retrieved for a query (NTR = 10), the average recall and precision are identical.

For future studies, the TBC approach will be extended to estimate FD for gray-level images and color images.

(a) TBC method (Re=Pr=0.4) (b) BC method (Re=Pr=0.4)

Figure 9. An example of experimental results of a retrieval image order by using L1 distance.

(a) TBC method (Re=Pr=0.4) (b) BC method (Re=Pr=0.4)

Figure 10. An example of experimental results of a retrieval image order by using k-NN, k=3.

References

[1] R. Lopes and N. Betrouni, Fractal and multifractal analysis: A review, Journal of Medical Image Analysis, Vol. 13, pp. 634-649, 2009.

[2] O.M. Bruno, R.O. Plotze, M. Falvo, and M. Castro, Fractal dimension applied to plant identification, An International Journal of Information Sciences, Vol. 178, pp. 2722-2733, 2008.

[3] R.D. King, A.T. George, T. Jeon, L.S. Hynan, T.S. Youn, D.N. Kennedy, and B. Dickerson, Characterization of atrophic changes in the cerebral cortex using fractal dimensional analysis, Brain Imaging and Behavior, Vol. 3(2), pp. 154-166, 2009.

[4] J.Z. Liu, L.D. Zhang, and G.H. Yue, Fractal dimension in human cerebellum measured by magnetic resonance imaging, Biophysical Journal, Vol. 85, pp. 4041-4046, 2003.

[5] J. Theiler, Estimating fractal dimension, Journal of the Optical Society of America A, Vol. 7(6), pp. 1055-1073, 1990.

[6] S. Changjiang, J. Guangrong, and W. Yangfan, Study of texture images classification method based on fractal dimension calculation, The International Joint Conference on Artificial Intelligence, pp. 488-491, 2009.

[7] K. Foroutan-pour, P. Dutilleul, and D.L. Smith, Advances in the implementation of the box-counting method of fractal dimension estimation, Applied Mathematics and Computation, Vol. 105, pp. 195-210, 1999.

[8] J. Li, Q. Du, and C. Sun, An improved box-counting method for image fractal dimension estimation, Pattern Recognition, Vol. 42, pp. 2460-2469, 2009.

[9] J. Feng, W.C. Lin, and C.T. Chen, Fractional box-counting approach to fractal dimension estimation, Proceedings of ICPR '96, pp. 854-858, 1996.

[10] S.T. Liu, An improved differential box-counting approach to compute fractal dimension of gray-level image, 2008 International Symposium on Information Science and Engineering, pp. 303-306, 2008.

[11] D. Sankar and T. Thomas, Fractal features based on differential box counting method for the categorization of digital mammograms, International Journal of Computer Information Systems and Industrial Management Applications, Vol. 2, pp. 011-019, 2010.

[12] A.R. Backes and O.M. Bruno, A new approach to estimate fractal dimension of texture images, Image and Signal Processing, Lecture Notes in Computer Science, Vol. 5099, pp. 136-143, 2008.

[13] Fractal Dimension, Available: http://en.wikipedia.org/wiki/Fractal_dimension [Accessed: Mar, 2012].
