
Proceedings of the First National Conference on Soft Computing

18-19 November – 2015

University of Guilan


A multistage differential transformation method for approximate solution of Chemical Kinetics System M. Mirzazadeh, M. Moradi

1200

Neural networks for prediction of liquid ternary phase behavior M. Moghadam, S. Asgharzadeh, B. Sharifzadeh

1207

Complex fuzzy system of linear equations with symmetric positive definite coefficient matrix Behrouz Fathi Vajargah, Zeinab Hassanzadeh

1213

Resource Allocation Optimization by Shuffled Frog Leaping Algorithm Anis Vosoogh, Reza Nourmandi-Pour

1218

Security issues and challenges using cloud computing Anis Vosoogh, Mohssen nazarian parizi

1234

Nonparametric Wavelet Regression Estimates for Consecutive Survival data Esmaeil Shirazi, Reza Zarei

1240

Wavelet-Based Quantile Density Estimation By Block Thresholding Method Esmaeil Shirazi, Abdolsaeed Toomaj

1248

Two new fuzzy process capability indices for asymmetric tolerance interval Zainab Abbasi Ganji, Bahram Sadeghpour Gildeh

1259

The Fuzzy q-Laplace Transforms Z. Noeiaghdam, S. Khakrangin

1266

Numerical Studying of PDEs on Manifold with Gaussian Data Mostafa Eslami, Hadi Estebsari

1271

A New Approach for Solving Intuitionistic Fuzzy Linear Systems Mahbobeh Esmaeili, Mohammad Keyanpour

1275

Numerical Treatment of Nonlinear Systems of Equations by Imperialist Competitive Algorithm H. Rouhi, R. Ansari

1281

Comparison between Optimal Homotopy asymptotic method and Homotopy perturbation Method for solving a class of Volterra integral equations Z. Ayati, E. Moradi, F. Hajipour

1287


Investigation of surface and nonlocal effects in the vibration of mass-attached nanotubes by Differential transform method Asghar Zajkani, Gholam Reza Shaghaghi

1296

Free vibration analysis of Euler-Bernoulli beam theory applied to cracked nanobeams using a nonlocal elasticity model Asghar Zajkani, Mohammad Reza Kokaba, Hamed Mohaddes Deylami

1307

Free vibration of circular nanobeams and nanorings including surface effects and resting on elastic foundations Asghar Zajkani, Mohsen Daman

1316

Extracting Drug-Drug Interaction from Literature through Detecting Linguistic-Based Negation and Neutral Candidates Behrouz Bokharaeian, Alberto Diaz

1328

Solving Fuzzy Linear system in the presence of Hukuhara Difference M.Keyanpour , M.Mohaghegh tabar

1340

On the Grey Dynamics of Type 2 Fuzzy Neural Hybrid Force Control Farnaz Sabahi

1346

TWO METHODS FOR DEFUZZIFICATION OF NUISANCE PARAMETER Ahmad Hozhabr, Adel Mohammadpour

1351

Fuzzy Linear system of the form and Hukuhara Difference Mohamad Keyanpour, Maryam Mohagheghtabar

1356

Fuzzy regression: M-estimation approach J. Chachi, S.M. Taheri

1362

A Novel Metaheuristics for Optimization Inspired by Mother-Infant Communication in Animal Colonies Alireza Ghaffari-Hadigheh

1370

A decision tree based neural network method for prediction of poor prognosis in traumatic brain injury patients Saeedeh Pourahmad, S.Mahmoud Taheri, Iman Hafizi-Rastani, Hosseinali Khalili, Shahram Paydar

1378

An Algorithm for Image Watermarking Embedding and Blind Detection in the Domain of Wavelet Transform Esmaeil Najafi

1384


Quasilinearization Numerical Scheme for Solving a Class of Weakly Singular Volterra Integral Equations Esmaeil Najafi

1391

Solution of a time fractional inverse parabolic problem by a numerical algorithm based on finite differences method Afshin Babaei, Alireza Mohammadpour

1397

A spectral method for solving an inverse time-dependent source problem Afshin Babaei, Somayeh Nemati

1402

t- Best approximation results in fuzzy normed spaces H. R. Goudarzi

1408

Existence and uniqueness of fuzzy differential equations with monotone condition Samira Siahmansouri , Omid Solimani Fard

1415

Fuzzy Quotient BCK-algebras S. Saidi Goraghani, F. Forouzesh

1420

Kolmogorov-Smirnov fuzzy test for fuzzy random variables Vahid Ranjbar, Zahra Radmehr, Kamel Abdollahnezhad

1425

Numerical solving Fredholm fuzzy integral equations by using Radial Basis Functions R. Firouzdor, Sh. Asghari, R. Salehi

1432

Numerical solution of fuzzy linear system by using Particle Swarm optimization method R. Salehi, R. Firouzdor, M. Amirfakhrian, Sh. Asghari

1439

Fitting an ARMA model to the GARCH simulations artificial time series, Forecasting GARCH model, An R software implementation Fatemeh Hassantabar Darzi, Narges Khoshnazar , Mehrdad Eslami

1448

Application of Linear Fuzzy and Ordinary Linear Regression on the Geographical data with Outlier Observation, Case Study: Saghez Station Mohammad Hossein Dehghan , Hojatollah Daneshmand , Narges Khoshnazar

1455

A Survey on Different Strategies on Preparing Data for Data Mining Michael Bidollahkhany, Marzieh Faridi Masouleh

1461


Edge Detection of Digital Image Using Fuzzy Rule Based Technique Maryam koniyehnoor, Nasim Pillehvar

1468

Fuzzy wavelet transform in nonparametric regression estimating Mohammad-Javad Davoudabadi, Mina Aminghafari

1474

Comparison of Optimal Homotopy Asymptotic Method and Homotopy Perturbation Method for Improved Boussinesq Equation Z. Ayati, Sima ahmady

1477

A Numerical Method for Solving Fuzzy Differential Equations with Fractional Order N. Ahmady , E. Ahmady

1485

Fuzzy Optimization of Linear Fractional Function Subject to a System of Max-Arithmetic Mean Relational Inequalities Fateme Kouchakinejad, Mashaal lah Mashinchi, Esmaile Khorram

1491

Copula and W.L.W approaches for composite end point of multivariate failure time P. Azhdari, F. Abedini

1498

Stability of General Cubic Mapping in Fuzzy Normed Spaces S. Javadi

1505

An Accurate Fuzzy Frequent Pattern Based Classifier Using Confidence Tuning Alireza Hekmatinia, Mohammad Saniee Abadeh

1509

Solution of fractional Black–Scholes equation by using homotopy perturbation method Behrouz Fathi Vajargah, Maryam Ghazizadeh

1516

Fixed point theorems on intuitionistic fuzzy metric spaces S. Karimzadeh , M. Saheli

1523

Fuzzy Topology Generated By Fuzzy Norm M. Saheli , S. Karimzadeh

1525

Optimal correction of linear inequalities based on second order conic programming Hossein Moosaei, M Jalili

1527


Ridge regression model fitting with triangular fuzzy inputs-output data Mohammad Reza Rabiei, Fatemeh Piadeh Koohsar

1530

Hybrid algorithm based on PSO-TLBO and FCM for Image Color Quantization S. Milad. Nayyersabeti , A. Mostaar , MR. Deevband

1535

Tuning of PID Controller using PSO-TLBO Algorithm S.Milad.Nayyer sabeti, MR.Deevband

1542

A new fuzzy rule-based approach for automatic facial expression recognition Mohamad Roshanzamir , Ahmad Reza Naghsh Nilchi , Mahdi Roshanzamir

1547

A study on the vulnerability of CAPTCHA patterns of Iranian popular websites and presenting approaches for resolving it Hossein KardanMoghaddam , Hossein Moradi

1553


A multistage differential transformation method for approximate solution of Chemical Kinetics System

M. Mirzazadeh, M. Moradi

Department of Engineering Sciences, Faculty of Technology and Engineering, University of Guilan, East of Guilan, Rudsar, Iran

[email protected], [email protected]

Abstract

In this paper, the application of the multistage differential transform method (MDTM) is presented for obtaining analytic solutions of nonlinear systems that often appear in chemical problems. The results obtained by the MDTM are compared with those obtained by the fourth-order Runge–Kutta method (RK4). The results show that the MDTM is very effective and convenient, and that its solutions have higher accuracy than those of the standard DTM.

Keywords: Multistage differential transform method; Chemical kinetics system

Math Subject Classification: 78A60, 37K10, 35Q51, 35Q55

Introduction

In every real-life phenomenon, there are many parameters and variables related to each other under the law governing that phenomenon. When the relations between the parameters and variables are expressed in mathematical language, we usually obtain a mathematical model of the problem, which may be an equation, a differential equation, an integral equation, a system of integral equations, etc. Consider a model of a chemical process [1] consisting of three species, denoted by A, B and C. The three reactions are

A → B, (1)

B + C → A + C, (2)

B + B → C. (3)


Let u, v and w denote the concentrations of A, B and C, respectively. We assume these are scaled so that the total of the three concentrations is 1, and that each of the three constituent reactions adds to the concentration of any of the species exactly at the expense of corresponding amounts of the reactants. The reaction rate of Eq. (1) will be denoted by k₁; this means that the rate at which u decreases, and at which v increases, because of this reaction, equals k₁u. In the second reaction, Eq. (2), C acts as a catalyst in the production of A from B, and the reaction rate will be written as k₂, meaning that the increase of u, and the decrease of v, in this reaction has a rate equal to k₂vw. Finally, the production of C from B has a rate constant equal to k₃, meaning that the rate at which this reaction takes place is k₃v². Putting all these elements of the process together, we find the system of differential equations for the variation with time of the three concentrations to be:

$$\frac{du}{dx} = -k_1 u + k_2 vw,\qquad \frac{dv}{dx} = k_1 u - k_2 vw - k_3 v^2,\qquad \frac{dw}{dx} = k_3 v^2, \tag{4}$$

subject to the initial conditions:

$$u(0) = \alpha_1,\qquad v(0) = \alpha_2,\qquad w(0) = \alpha_3. \tag{5}$$

If the three reaction rates are moderately small numbers, not greatly different in magnitude, then this is a straightforward problem. Many different methods have recently been introduced to solve nonlinear problems, such as the differential transform method [2-9]. Recently, the multistage DTM (MDTM) [10-13] was proposed to accelerate the convergence of the truncated approximation in a large domain as well as to improve the accuracy of the standard DTM.

In this paper, the MDTM is used to obtain analytical solutions of the chemical kinetics problem. Compared with the classical fourth-order Runge–Kutta (RK4) method and the DTM, the MDTM is a very effective and easy method for the analytical solution of the chemical kinetics problem.

DTM for the system of chemical kinetics

The basic definitions and fundamental operations of the differential transform are given in [2-8]. For convenience of the reader, we present a review of the differential transform method. The differential transform of the k-th derivative of a function u(x) is defined as

$$U(k) = \frac{1}{k!}\left[\frac{d^k u(x)}{dx^k}\right]_{x=x_0}, \tag{6}$$


where u(x) is the original function and U(k) is the transformed function. The differential inverse transform of U(k) is defined as

$$u(x) = \sum_{k=0}^{\infty} U(k)\,(x-x_0)^k. \tag{7}$$

In a real application, when x₀ is taken as 0, the function u(x) is expressed by a finite series and Eq. (7) can be written as

$$u(x) \approx \sum_{k=0}^{n} U(k)\,x^k. \tag{8}$$

The fundamental mathematical operations performed by the one-dimensional differential transform method can readily be obtained and are listed in Table 1.

Table 1. Fundamental operations of the one-dimensional DTM

Original function                    Transformed function
u(x) = f(x) ± g(x)                   U(k) = F(k) ± G(k)
u(x) = λ g(x)                        U(k) = λ G(k)
u(x) = dg(x)/dx                      U(k) = (k+1) G(k+1)
u(x) = dᵐg(x)/dxᵐ                    U(k) = (k+1)(k+2)···(k+m) G(k+m)
u(x) = xᵐ                            U(k) = δ(k−m) = 1 if k = m, 0 otherwise
u(x) = f(x) g(x)                     U(k) = Σ_{r=0}^{k} F(r) G(k−r)
u(x) = f₁(x) f₂(x) ··· f_m(x)        U(k) = Σ_{k_{m−1}=0}^{k} ··· Σ_{k₁=0}^{k₂} F₁(k₁) F₂(k₂−k₁) ··· F_m(k−k_{m−1})

According to the DTM, taking the differential transform of both sides of the system (4) with the initial conditions (5) gives:

$$(k+1)\,U(k+1) = -k_1 U(k) + k_2 \sum_{r=0}^{k} V(r)\,W(k-r),$$
$$(k+1)\,V(k+1) = k_1 U(k) - k_2 \sum_{r=0}^{k} V(r)\,W(k-r) - k_3 \sum_{r=0}^{k} V(r)\,V(k-r), \tag{9}$$
$$(k+1)\,W(k+1) = k_3 \sum_{r=0}^{k} V(r)\,V(k-r),$$

with U(0) = α₁, V(0) = α₂, W(0) = α₃.

Therefore, according to the DTM, the n-term approximations for the solutions of (4) can be expressed as

$$u_n(x) = \sum_{k=0}^{n} U(k)\,x^k \approx u(x),\qquad v_n(x) = \sum_{k=0}^{n} V(k)\,x^k \approx v(x),\qquad w_n(x) = \sum_{k=0}^{n} W(k)\,x^k \approx w(x). \tag{10}$$

MDTM for the system of chemical kinetics

The approximate solutions (10) are generally, as will be shown in the numerical experiments of this paper, not valid for large x. A simple way of ensuring the validity of the approximations for large x is to treat (8) and (9) as an algorithm for approximating the solutions of (4) and (5) on a sequence of intervals, choosing the initial approximations as

$$u_0(x) = u(x^*) = \alpha_1,\qquad v_0(x) = v(x^*) = \alpha_2,\qquad w_0(x) = w(x^*) = \alpha_3, \tag{11}$$

where x* is the left endpoint of each subinterval.

In order to carry out the iterations on every subinterval of equal length Δx, [0, x₁), [x₁, x₂), …, [x_{j−1}, x_j), we need to know the values

$$u_0(x^*) = u(x^*),\qquad v_0(x^*) = v(x^*),\qquad w_0(x^*) = w(x^*). \tag{12}$$

In general, however, we do not have this information at our disposal except at the initial point x* = 0. A simple way of obtaining the necessary values is to use the previous n-term approximations uₙ, vₙ, wₙ of the preceding subinterval given by (10), that is,

$$u(x^*) \approx u_n(x^*),\qquad v(x^*) \approx v_n(x^*),\qquad w(x^*) \approx w_n(x^*). \tag{13}$$
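To make the stepping procedure concrete, the following Python sketch applies the recurrence (9) on each subinterval and restarts it from the previous n-term series as in (11)-(13). It is only an illustration of the scheme under stated assumptions; the step size, number of terms and the initial concentrations used in the call are placeholders, and the paper's own computations were carried out in Maple 14.

```python
# A minimal sketch of the multistage DTM (MDTM) for the kinetics system (4),
# assuming the recurrence (9) and the restart rule (11)-(13) described above.
import numpy as np

def mdtm_kinetics(k1, k2, k3, u0, v0, w0, x_end, dx=0.01, n_terms=10):
    """March over subintervals of length dx, applying the n-term DTM on each."""
    xs, us, vs, ws = [0.0], [u0], [v0], [w0]
    u_c, v_c, w_c = u0, v0, w0                        # values at the left endpoint x*
    for _ in range(int(round(x_end / dx))):
        U, V, W = (np.zeros(n_terms + 1) for _ in range(3))
        U[0], V[0], W[0] = u_c, v_c, w_c              # initial approximations, Eq. (11)
        for k in range(n_terms):
            vw = sum(V[r] * W[k - r] for r in range(k + 1))   # DTM product rule (Table 1)
            vv = sum(V[r] * V[k - r] for r in range(k + 1))
            U[k + 1] = (-k1 * U[k] + k2 * vw) / (k + 1)        # recurrence (9)
            V[k + 1] = (k1 * U[k] - k2 * vw - k3 * vv) / (k + 1)
            W[k + 1] = (k3 * vv) / (k + 1)
        powers = dx ** np.arange(n_terms + 1)          # evaluate the series (10) at x* + dx
        u_c, v_c, w_c = U @ powers, V @ powers, W @ powers     # restart values, Eq. (13)
        xs.append(xs[-1] + dx); us.append(u_c); vs.append(v_c); ws.append(w_c)
    return np.array(xs), np.array(us), np.array(vs), np.array(ws)

# Rate constants of Example 1; the initial concentrations here are assumptions.
x, u, v, w = mdtm_kinetics(0.04, 0.03, 0.05, u0=1.0, v0=0.0, w0=0.0, x_end=1.0)
```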

Example 1. As the first example, consider Eqs. (4) with k₁ = 0.04, k₂ = 0.03, k₃ = 0.05. In Fig. 1, we present the numerical solutions obtained by the 10-term DTM, the 10-term MDTM and RK4 with step Δx = 0.01.


Fig. 1. Comparisons between DTM, MDTM and RK4

Example 2. Consider the problem (4) with k₁ = 0.1, k₂ = 0.02, k₃ = 0.009 and α₁ = 10, α₂ = 5, α₃ = 20. In Fig. 2, we present the numerical solutions obtained by the 10-term DTM, the 6-term MDTM and RK4 with step Δx = 0.01.


Fig. 2. Comparisons between DTM, MDTM and RK4

Conclusions

In this article, we have applied the multistage differential transform method to solve the nonlinear system of chemical kinetics. The basic idea of the MDTM is to apply the standard DTM on each sub-domain, updating the initial condition. Comparison with the DTM in both examples reveals that the MDTM is very powerful and easy to use for obtaining approximate solutions of nonlinear problems. All computations were performed using Maple 14.

References

[1] J.C. Butcher, Numerical Methods for Ordinary Differential Equations, John Wiley and Sons, 2003.
[2] J.K. Zhou, Differential Transformation and Its Application for Electrical Circuits, Huazhong University Press, Wuhan, China, 1986 (in Chinese).
[3] M.J. Jang, C.L. Chen, Y.C. Liu, Two-dimensional differential transform for partial differential equations, Appl. Math. Comput. 121, 261–270 (2001).
[4] A. Arikoglu, I. Ozkol, Solution of boundary value problems for integro-differential equations by using differential transform method, Appl. Math. Comput. 168, 1145–1158 (2005).
[5] C.L. Chen, Y.C. Liu, Solution of two point boundary value problems using the differential transformation method, J. Opt. Theory Appl. 99, 23–35 (1998).
[6] F. Ayaz, Applications of differential transform method to differential-algebraic equations, Appl. Math. Comput. 152, 649–657 (2004).
[7] F. Kangalgil, F. Ayaz, Solitary wave solutions for the KdV and mKdV equations by differential transform method, Chaos, Solitons and Fractals 41(1), 464–472 (2009).
[8] A.S.V. Ravi Kanth, K. Aruna, Two-dimensional differential transform method for solving linear and non-linear Schrödinger equations, Chaos, Solitons and Fractals 41(5), 2277–2281 (2009).
[9] A. Arikoglu, I. Ozkol, Solution of fractional differential equations by using differential transform method, Chaos Soliton. Fract. 34, 1473–1481 (2007).
[10] A. Gokdogan, M. Merdan, A. Yildirim, A multistage differential transformation method for approximate solution of Hantavirus infection model, Communications in Nonlinear Science and Numerical Simulation 17, 1–8 (2012).
[11] A. Gokdogan, M. Merdan, A. Yildirim, Adaptive multi-step differential transformation method to solving nonlinear differential equations, Mathematical and Computer Modelling 55(3), 761–769 (2012).
[12] Z.M. Odibat, C. Bertelle, M.A. Aziz-Alaoui, G.H.E. Duchamp, A multi-step differential transform method and application to non-chaotic or chaotic systems, Computers & Mathematics with Applications 59(4), 1462–1472 (2010).
[13] V.S. Erturk, Z.M. Odibat, S. Momani, An approximate solution of a fractional order differential equation model of human T-cell lymphotropic virus I (HTLV-I) infection of CD4 T-cells, Computers & Mathematics with Applications 62(3), 996–1002 (2011).


Neural networks for prediction of liquid ternary phase behavior

M. Moghadam*, S. Asgharzadeh, B. Sharifzadeh

Faculty of Technology and Engineering East of Guilan, University of Guilan, Rudsar, Iran

[email protected]

Abstract
An artificial neural network was investigated for emulating liquid ternary phase behavior. Reproducing the feed data and expanding the ternary phase pattern make this approach able to predict the phase behavior beyond the collection of experimental measurements. Well-trained networks can even reproduce outlier data at different compositions and/or temperatures acceptably.
Keywords: Neural network, Ternary phase, Liquid-liquid equilibrium

1. INTRODUCTION
Understanding phase equilibria, and specifically ternary phase equilibria, is fundamental for improving extraction techniques and is important in the design and evaluation of industrial extraction processes. Generally, well-known thermodynamic models including the universal quasi-chemical (UNIQUAC) model [1], the nonrandom two-liquid (NRTL) model [2] and UNIFAC [3] are used for correlating the experimental data and obtaining the interaction parameters of the components. However, because of experimental expense and theoretical limitations, there is considerable interest in developing methods that can predict the phase behavior of ternary liquid systems in general. Neural networks are often used for statistical analysis and data modeling, where their role is perceived as an alternative to standard nonlinear regression or cluster analysis techniques. An ANN is an interconnected assembly of simple processing units (nodes); the processing ability of the network is stored in the interunit connection strengths (weights) obtained by learning from a set of training patterns [4,5]. In the past decade, applications of machine learning methods in chemistry have expanded further into the analysis of spectral data, pharmaceutical product development, classification of compounds, and prediction of chemical reactivity, physical properties, electrostatic potential, ionization potentials as well as QSARs [6-10]. A large number of researchers have employed multilayered back-propagation (MBP) ANNs in their work. This type of network consists of an input layer, one or more hidden layers and an output layer with predefined weight functions. Moreover, networks based on the group method of data handling (GMDH) algorithm have recently been developed and employed; these networks are self-constructive, flexible and suitable for modeling highly nonlinear and noisy systems, and they use short-term polynomial transfer functions as pattern estimators [11,12]. In recent years, prediction methods based on artificial neural networks (ANNs) have been adapted to establish interpretive and flexible models for vapor-liquid equilibria (VLE) and liquid-liquid equilibria (LLE) [13-15]. The focus of this paper is on modeling liquid-liquid equilibrium data using an ANN.
2. THEORETICAL CONSIDERATION
Ghanadzadeh et al. [16,17] have reported results on the application of neural networks to the liquid-liquid extraction process. It seems some modifications are required for applying ANNs to the extraction process: the statistical factors in the published works suggest that the networks might be overfitted owing to the complex network structure and the deficiency of the liquid-liquid equilibrium data sets employed. Vossoughi et al. [14] have recently reported a similar approach for modeling the liquid-liquid equilibrium behavior of aliphatic + aromatic + ionic liquid systems. They used a larger data set (340 points) and the statistical factors in their published work are reasonable; however, the network prediction capability is restricted to the data set, and these ANNs have not been developed to predict the system behavior beyond the data set space. These restrictions inspired us to enhance the methodology so that it can be used appropriately for generalization and prediction. The lack of input feed data affects the entire pattern of the equilibrium system available for proper training; how can this be solved?

w₁₁ + w₂₁ + w₃₁ = 1, (1)

z₁₂ + z₂₂ + z₃₂ = 1, (2)

w₁₃ + w₂₃ + w₃₃ = 1, (3)

F = L₁ + L₂, (4)

z_{i2} · F = w_{i1} · L₁ + w_{i3} · L₂. (5)

In these equations, w_{ij} (and z_{i2}) denotes the weight fraction of component i in phase j, F is the total feed weight and L_j is the weight of each equilibrium phase. All the points on a ternary tie line are connected with the same separated-phase compositions; in fact, the problem is mainly connected to the initial feed composition. It seems that there is no experimental solution due to insufficient experimental feed data. However, according to the law of mass conservation, the mass balance equations can be solved for an arbitrary total mass of the initial feed. Consequently, a proper number of data sets can be generated (Figure 1).

Figure 1. Equilibrium phase composition and corresponding reproduced feed data
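As an illustration of the feed-reproduction step just described, the short Python sketch below solves the mass balances (1)-(5) along one tie line for an arbitrary phase split (taking F = 1); the number of points and the tie-line compositions used in the demo are assumptions for illustration, not the experimental data of [18].

```python
# Sketch of reproducing feed compositions z_i2 along one tie line from the phase
# compositions w_i1 and w_i3, using the mass balances (1)-(5) with F = 1.
# The tie-line values used in the demo are illustrative, not the data of [18].
import numpy as np

def reproduce_feeds(w_phase1, w_phase3, n_points=15):
    """w_phase1, w_phase3: weight fractions of the 3 components in the two phases."""
    w1 = np.asarray(w_phase1, dtype=float)
    w3 = np.asarray(w_phase3, dtype=float)
    feeds = []
    for frac in np.linspace(0.0, 1.0, n_points + 1):    # L1 = frac * F, L2 = (1 - frac) * F
        z = frac * w1 + (1.0 - frac) * w3                # Eq. (5) divided by F
        feeds.append(z)                                   # each feed sums to 1 by Eqs. (1)-(3)
    return np.array(feeds)

# Illustrative tie line (water, butyric acid, oleyl alcohol):
aqueous_phase = [0.95, 0.04, 0.01]
organic_phase = [0.05, 0.30, 0.65]
print(reproduce_feeds(aqueous_phase, organic_phase))
```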

3. TERNARY SYSTEM AT DIFFERENT TEMPERATURES
The proposed approach was tested for the (Water + butyric acid + oleyl alcohol) system [18] at three different temperatures. The experimental data were reproduced with 15 intervals along each tie line, providing 384 data sets; the data were then partitioned into 70% for training and 30% for testing. Since the weight fractions of the components are interconnected, the number of network inputs and outputs reduces to three (z₁₂, z₂₂ and T) and four (w₁₁, w₂₁, w₁₃ and w₂₃), respectively. Using the reproduced data sets, a fine-structured MBP network (Figure 2) was trained and used to predict the ternary phase data in the equilibrium phases (Figures 3 and 4).

Figure 2. MBP network structure for (Water + butyric acid + oleyl alcohol) at different temperatures

The network training parameters for the investigated system are given in Table 1.


Table 1. MBP node weights for (Water + butyric acid + oleyl alcohol) at different temperatures

From the input layer to the 1st hidden layer:
              bias      1st neuron   2nd neuron   3rd neuron
1st neuron    5.0252    0.1184       5.4742       0.0300
2nd neuron    1.1445   -5.6580      -6.0590      -0.0665
3rd neuron   -0.9989   -1.4507      -3.0851      -0.0453
4th neuron   -0.8987   -0.7050      -0.2562       1.7479

From the input layer to the 2nd hidden layer:
              bias      1st neuron   2nd neuron   3rd neuron   4th neuron
1st neuron   -0.0530   -4.9606       3.4847       1.8964       0.5169
2nd neuron   -0.0892    5.2099      -3.6779      -1.6891      -0.5367
3rd neuron    0.0201    3.3395      -4.9225      -1.7330       2.3131
4th neuron    0.3559    5.2034      -4.1990      -2.1752      -0.1598


Figure 3. (Water + butyric acid + oleyl alcohol) at T = 298.15, 308.15 and 318.15 K: predicted weight fractions w₁₁, w₂₁, w₁₃ and w₂₃ versus data set number for the training and testing sets (connections between data points are used for better visualization and have no physical meaning).

The calculated statistical factors [19] for these systems are summarized in Table 2. The data in this table show that both types of the networks reproduce the equilibrium systems acceptably and can forecast the separation data reasonably.

Table 2. Network statistical factors for (Water + butyric acid + oleyl alcohol) at different temperatures (training sets: 268, testing sets: 116)

              w11                  w21                  w13                  w23
              training  testing    training  testing    training  testing    training  testing
MSE (a)       <0.0001   <0.0001    <0.0001   <0.0001    <0.0001   <0.0001    0.0003    0.0003
RMSE (b)      0.0040    0.0038     0.0039    0.0038     0.0060    0.0056     0.0165    0.0163
MRE % (c)     0.36      0.32       6.21      5.52       23.84     23.23      9.14      8.07
R2 (d)        1.0000    1.0000     0.9982    0.9982     0.9823    0.9818     0.9974    0.9973
r (e)         0.9948    0.9948     0.9947    0.9945     0.9805    0.9798     0.9950    0.9939

Average RMSE: MBP = 0.0075; UNIFAC = 0.0746 [51]
(a) Mean Square Error, (b) Root Mean Square Error, (c) Mean Relative Error, (d) Absolute Fraction of Variance, (e) Correlation coefficient.


Figure 4. Correlation between experimental and calculated network values for MBP. Fitted lines: w₁₁: y = 0.9995x (R² = 0.9884); w₂₁: y = 0.9983x (R² = 0.9881); w₁₃: y = 0.9654x (R² = 0.9597); w₂₃: y = 0.9876x (R² = 0.9888).

Generally, the trained network is validated according to the statistical factors associated with training and testing data sets drawn from the same data point pattern. However, a good examination of a well-trained network is whether it can calculate the outputs for outlier data consistently. Figure 5 shows a sample of such predictions for some random outliers of the (Water + butyric acid + oleyl alcohol) system. It shows that the quality of the network outputs is good for the outlier inputs.

Figure 5. MBP network outputs for outliers: () equilibrium separated phases, (•) random outlier feeds for (Water + butyric acid + oleyl alcohol).


4. CONCLUSIONS In this work, application of ANN for liquid-liquid separation process alongside LSER method was investigated. Reproducing the feed data and expanding the ternary pattern improves the network ability in modeling the phase behavior. Using proper population of data sets, the trained network can reproduce feature of liquid ternary system at different temperatures. well-trained networks even could reproduce the outlier data, acceptably. 5. REFERENCES [1] D.S. Abrams, J.M. Prausnitz, Statistical thermodynamics of liquid mixtures: A new expression for the excess Gibbs energy of partly or completely miscible systems, AIChE J. 21 (1975) 116–128. [2] H. Renon, J.M. Prausnitz, Local compositions in thermodynamic excess functions for liquid mixtures, AIChE J. 14 (1968) 135–144. [3] A. Fredenslund, R.L. Jones, J.M. Prausnitz, Group-contribution estimation of activity coefficients in nonideal liquid mixtures, AIChE J. 21 (1975) 1086–1099. [4] K. Gurney, An Introduction to Neural Networks, CRC Press, 2003. [5] GMDH, Group Method of Data Handling, Group Method Data Handl. (2015). http://www.gmdh.net/ [6] J.C. Cancilla, P. Díaz-Rodríguez, G. Matute, J.S. Torrecilla, The accurate estimation of physicochemical properties of ternary mixtures containing ionic liquids via artificial neural networks, Phys Chem Chem Phys. 17 (2015) 4533–4537. [7] J.C. Cancilla, P. Díaz-Rodríguez, J.G. Izquierdo, L. Bañares, J.S. Torrecilla, Artificial neural networks applied to fluorescence studies for accurate determination of N-butylpyridinium chloride concentration in aqueous solution, Sens. Actuators B Chem. 198 (2014) 173–179. [8] P. Díaz-Rodríguez, J.C. Cancilla, G. Matute, J.S. Torrecilla, Viscosity estimation of binary mixtures of ionic liquids through a multi-layer perceptron model, J. Ind. Eng. Chem. 21 (2015) 1350–1353. [9] J.S. Torrecilla, C. Tortuero, J.C. Cancilla, P. Díaz-Rodríguez, Estimation with neural networks of the water content in imidazolium-based ionic liquids using their experimental density and viscosity values, Talanta. 113 (2013) 93–98. [10] J.S. Torrecilla, C. Tortuero, J.C. Cancilla, P. Díaz-Rodríguez, Neural networks to estimate the water content of imidazolium-based ionic liquids using their refractive indices, Talanta. 116 (2013) 122–126. [11] S.J. Farlow, Self-Organizing Methods in Modeling: GMDH Type Algorithms, Marcel Dekker, Inc., New York and Basel, 1984. http://www.biblio.com/book/self-organizing-methods-modeling-gmdh-type/d/226388195 . [12] A.G. Ivakhnenko, Polynomial Theory of Complex Systems, IEEE Trans. Syst. Man Cybern. SMC-1 (1971) 364–378. [13] S. Bogdan, D. Gosak, Ð. Vasić-Rački, Mathematical modeling of liquid - liquid equlibria in aqueous polymer solution containing neutral proteinase and oxytetracycline using artificial neural network, Comput. Chem. Eng. 19, Supplement 1 (1995) 791–796. [14] M. Hakim, G. Behmardikalantari, H. Abedini Najafabadi, G. Pazuki, A. Vosoughi, M. Vossoughi, Prediction of liquid–liquid equilibrium behavior for aliphatic + aromatic + ionic liquid using two different neural network-based models, Fluid Phase Equilibria. 394 (2015) 140–147. [15] J.S. Torrecilla, M. Deetlefs, K.R. Seddon, F. Rodríguez, Estimation of ternary liquid–liquid equilibria for arene/alkane/ionic liquid mixtures using neural networks, Phys. Chem. Chem. Phys. 10 (2008) 5114–5120. [16] H. Ghanadzadeh, M. Ganji, S. Fallahi, Mathematical model of liquid–liquid equilibrium for a ternary system using the GMDH-type neural network and genetic algorithm, Appl. Math. Model. 
36 (2012) 4096–4105. [17] H. Ghanadzadeh, S. Fallahi, M. Ganji, Liquid–Liquid Equilibrium Calculation for Ternary Aqueous Mixtures of Ethanol and Acetic Acid with 2-Ethyl-1-hexanol Using the GMDH-Type Neural Network, Ind. Eng. Chem. Res. 50 (2011) 10158–10167. [18] M. Bilgin, Phase equilibria of liquid (water + butyric acid + oleyl alcohol) ternary system, J. Chem. Thermodyn. 38 (2006) 1634–1639. [19] J. Wu, G. Zhang, Q. Zhang, J. Zhou, Y. Wang, Artificial neural network analysis of the performance characteristics of a reversibly used cooling tower under cross flow conditions for heat pump heating system in winter, Energy Build. 43 (2011) 1685–1693.


Complex fuzzy system of linear equations with symmetric positive definite coefficient matrix

Behrouz Fathi Vajargah, Zeinab Hassanzadeh

Department of Mathematics, Faculty of Mathematical Sciences, University of Guilan

[email protected], [email protected]

Abstract

In this paper, we investigate an approach to solve complex fuzzy systems of linear equations of the form Ax = y, where A ∈ ℝⁿˣⁿ is a crisp symmetric positive definite matrix and x, y are complex fuzzy vectors. After reviewing the existence and uniqueness theorems, we propose a technique for converting such problems to crisp systems of linear equations. Then we apply direct and decomposition methods to solve them. The applicability and efficiency of this approach is illustrated with a numerical example.
Keywords: Complex fuzzy number, Symmetric positive definite, System of linear equations, Cholesky decomposition method.

1. INTRODUCTION
The embedding method for solving fuzzy systems of linear equations (FSLE) was first introduced by Friedman et al. [7], in which the coefficient matrix is crisp and the right-hand side is an arbitrary fuzzy vector. They replaced the original FSLE by a crisp system of linear equations with a nonnegative coefficient matrix. In [6] some iterative methods have been applied to solve the FSLE when the coefficient matrix is strictly diagonally dominant with positive diagonal entries. Abbasbandy et al. [1] investigated the existence of the solution of the FSLE when the coefficient matrix is symmetric positive definite (SPD) and then used the conjugate gradient (CG) iterative method to solve it. Also, in [2] and [3], Abbasbandy et al. used LU decomposition and steepest descent methods for solving FSLEs. In recent years, the complex fuzzy system of linear equations (CFSLE) has been considered by several authors. They solved the CFSLE by first obtaining the solution in terms of the fuzzy center and then using this solution together with the width to achieve the final solution (for more details see [4], [10]). Jahantigh et al. [8] obtained the solution of the CFSLE with a specific approach based on the logic of complex fuzzy number arithmetic. In this paper, we convert the original system into smaller subsystems, so that the CFSLE converts to two equivalent FSLEs. Then we indicate that the solutions of these systems are the same. We then apply the existence and uniqueness theorems to them when their coefficient matrix is SPD. First we recall some basic definitions and principles of fuzzy number arithmetic.

Definition 1.1 [7] A fuzzy number u in parametric form is an ordered pair of functions (u(r), ū(r)), 0 ≤ r ≤ 1, which satisfies the following conditions:
1. u(r) is a bounded left-continuous nondecreasing function on [0,1],
2. ū(r) is a bounded right-continuous nonincreasing function on [0,1],
3. u(r) ≤ ū(r).

If u(r) = ū(r), 0 ≤ r ≤ 1, then u is a crisp number. In this paper we use the notation u = [u(r), ū(r)] to denote the r-cut of an arbitrary fuzzy number u. For introducing the complex fuzzy numbers, the CFSLE and the computation of its solution, we recall the arithmetic operations of arbitrary fuzzy numbers x(r) = (x(r), x̄(r)) and y(r) = (y(r), ȳ(r)), 0 ≤ r ≤ 1, and a real number λ:
1. x(r) = y(r) if and only if x(r) = y(r) and x̄(r) = ȳ(r),
2. x(r) + y(r) = (x(r) + y(r), x̄(r) + ȳ(r)),
3. λx(r) = (λx(r), λx̄(r)) for λ ≥ 0, and λx(r) = (λx̄(r), λx(r)) for λ < 0,
4. x(r)·y(r) = (min{x(r)y(r), x(r)ȳ(r), x̄(r)y(r), x̄(r)ȳ(r)}, max{x(r)y(r), x(r)ȳ(r), x̄(r)y(r), x̄(r)ȳ(r)}).
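The parametric arithmetic above can be sketched in a few lines of Python; the discretization of r and the triangular-number constructor are illustrative assumptions and not part of the paper.

```python
# Minimal sketch of parametric (r-cut) fuzzy arithmetic: a fuzzy number is stored
# as arrays of lower and upper bounds sampled on a grid of r in [0, 1].
import numpy as np

R = np.linspace(0.0, 1.0, 11)       # r-levels; the discretization is an assumption

def triangular(a, b, c):
    """Triangular fuzzy number (a, b, c) in parametric form (lower(r), upper(r))."""
    return a + (b - a) * R, c - (c - b) * R

def add(x, y):                       # rule 2: add lower bounds and upper bounds separately
    return x[0] + y[0], x[1] + y[1]

def scale(lam, x):                   # rule 3: a negative scalar swaps the two bounds
    return (lam * x[0], lam * x[1]) if lam >= 0 else (lam * x[1], lam * x[0])

u = triangular(1.0, 2.0, 3.0)
v = triangular(0.0, 1.0, 2.0)
s = add(u, scale(-2.0, v))           # lower(r) <= upper(r) holds for every r-level
```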

1.1 Complex Fuzzy Numbers
A complex fuzzy number is defined by two fuzzy numbers that represent the real and imaginary parts of the complex number, z = x + iy, where x(r) = (x(r), x̄(r)), y(r) = (y(r), ȳ(r)) and 0 ≤ r ≤ 1. Therefore z can be written as z(r) = (z(r), z̄(r)), where z(r) = x(r) + i y(r) and z̄(r) = x̄(r) + i ȳ(r). Complex fuzzy arithmetic is defined similarly to that of real fuzzy numbers. Hence, for z₁ = x₁ + i y₁ and z₂ = x₂ + i y₂ we can write
1. z₁ + z₂ = (x₁ + x₂) + i (y₁ + y₂),
2. z₁ − z₂ = (x₁ − x₂) + i (y₁ − y₂),
3. z₁ z₂ = (x₁x₂ − y₁y₂) + i (x₁y₂ + x₂y₁).
For further details, we refer to [5] and the references therein.

1.2 Fuzzy system of linear equations
Let us consider an FSLE of the form
Ax = y, (1)
where A = (a_{ij}) ∈ ℝⁿˣⁿ is a crisp matrix and y is an arbitrary real fuzzy number vector. Following [7], we have the next definition.

Definition 1.2 [7] A fuzzy number vector (x₁, x₂, …, xₙ)ᵀ given by xⱼ = (xⱼ(r), x̄ⱼ(r)), 1 ≤ j ≤ n, 0 ≤ r ≤ 1, is called a solution of (1) if

$$\underline{\sum_{j=1}^{n} a_{ij}x_j} = \sum_{j=1}^{n} \underline{a_{ij}x_j} = \underline{y}_i(r),\qquad \overline{\sum_{j=1}^{n} a_{ij}x_j} = \sum_{j=1}^{n} \overline{a_{ij}x_j} = \overline{y}_i(r),\qquad i = 1,2,\dots,n. \tag{2}$$

It is easy to show that (2) is equivalent to the 2n × 2n crisp system of linear equations
SX = Y, (3)
where X = (x₁, …, xₙ, −x̄₁, …, −x̄ₙ)ᵀ, Y = (y₁, …, yₙ, −ȳ₁, …, −ȳₙ)ᵀ and the entries of S are determined as follows:
a_{ij} ≥ 0 ⟹ s_{ij} = a_{ij}, s_{i+n, j+n} = a_{ij},
a_{ij} < 0 ⟹ s_{i, j+n} = −a_{ij}, s_{i+n, j} = −a_{ij},
and any entry of S not determined by these rules is zero. The structure of S implies that S = [[B, C], [C, B]], where B contains the positive entries of A and C contains the absolute values of the negative entries of A; therefore we can write A = B − C.

Definition 1.3 [7] Let X = (x₁(r), …, xₙ(r), −x̄₁(r), …, −x̄ₙ(r))ᵀ denote the unique solution of (3). The fuzzy number vector U = (u₁, …, uₙ) defined by
uᵢ(r) = min{xᵢ(r), x̄ᵢ(r), xᵢ(1)},  ūᵢ(r) = max{xᵢ(r), x̄ᵢ(r), xᵢ(1)}
is called the fuzzy solution of SX = Y. If (xᵢ(r), x̄ᵢ(r)), 1 ≤ i ≤ n, are all fuzzy numbers, then uᵢ(r) = xᵢ(r), ūᵢ(r) = x̄ᵢ(r), 1 ≤ i ≤ n, and U is called a strong fuzzy solution; otherwise U is a weak fuzzy solution.


2. MAIN RESULTS
In this section, we first recall basic theorems for the existence of the solution of an FSLE and then turn to the main theorem of the paper.

Theorem 2.1 [7] The matrix S is nonsingular if and only if the matrices A = B − C and B + C are both nonsingular.
Theorem 2.2 [7] Let S be nonsingular. The unique solution of (3) is always a fuzzy vector for an arbitrary right-hand side if and only if S⁻¹ is nonnegative.
Lemma 2.1 [9] A symmetric matrix is positive definite if and only if all its eigenvalues are positive.
Theorem 2.3 If B − C and B + C are symmetric positive definite matrices, then S is symmetric positive definite.

2.1 Complex fuzzy system of linear equations
The system of linear equations (1) is called a CFSLE whenever the coefficient matrix is a complex crisp matrix in ℂⁿˣⁿ and y is an arbitrary complex fuzzy vector. In this paper, we suppose that A is a real SPD matrix and y is as above. By letting y = p + iq, we have
Ax = p + iq, (5)
where pᵢ(r) = (pᵢ(r), p̄ᵢ(r)), qᵢ(r) = (qᵢ(r), q̄ᵢ(r)), i = 1, 2, …, n. Similarly, for the unknown fuzzy vector x we can write x = w + iz, with wᵢ(r) = (wᵢ(r), w̄ᵢ(r)), zᵢ(r) = (zᵢ(r), z̄ᵢ(r)), i = 1, 2, …, n. Consequently, a complex fuzzy number vector given by xᵢ = wᵢ + izᵢ, i = 1, 2, …, n, is called a complex fuzzy solution of the CFSLE (5) if the fuzzy vectors w, z are fuzzy solutions of the FSLEs
Aw = p,  Az = q, (6)
respectively. As a result, to solve the CFSLE (1) it is sufficient to solve the FSLEs (6). For solving each of the systems (6), we use the method presented in Section 1. This means that, in order to solve the systems (6), we must solve two 2n × 2n crisp SLEs of the form
SW = P,  SZ = Q. (7)
Theorem 2.4 The complex fuzzy vector solutions of the CFSLEs (1) and (5) are equivalent.
By assuming that A is an SPD matrix, Theorem 2.3 implies that S is an SPD matrix. It is well known that every SPD matrix is nonsingular; therefore S is nonsingular and the systems (7) have unique solutions. Consequently, we can use the Cholesky factorization of S to solve the SLEs (7).

3. NUMERICAL EXPERIMENT
In this section, we apply a direct method and the Cholesky decomposition method to solve the systems (7) and then compute the final solution of (1).
3.1 Example
In the CFSLE (5) suppose that A, p, q are defined as follows:

=⎝⎜⎜⎛

7472482472102

2424

−−

−−−−

⎠⎟⎟⎞, =

⎝⎜⎜⎛

rrrrrrrr

2710315122222446543932814121830

−−−−−−−−

⎠⎟⎟⎞, =

⎝⎜⎜⎛

rrrrrrrr

5219183860221430431724483618420

−−−−−−−−

⎠⎟⎟⎞.

Therefore is computed as:


=⎝⎜⎜⎜⎜⎜⎜⎜⎜⎜⎛

740200704804002000100720224040020007074020020480472020010000202404

⎠⎟⎟⎟⎟⎟⎟⎟⎟⎟⎞,

where B − C and B + C are SPD matrices. In Table 1 the exact solution is computed. The Cholesky factorization of the matrix S is applied to obtain the solution reported in Table 2.

Table 1- Computed exact solution with direct method

⎝⎜⎜⎜⎜⎛

rrrrrr

rr

2121232

341121

−+−+−−+−+−

⎠⎟⎟⎟⎟⎞

⎝⎜⎜⎜⎜⎛

rrrrrrrr

22152223123222

−−+−−+−−+−−+

⎠⎟⎟⎟⎟⎞

⎝⎜⎜⎜⎜⎛

)22()21()1()21()52()2()22()32()31()34()23()1()22()1()2()21(

rirrirrirrirrirrirrirrir

−−+−+−++−+−+−++−−+−+−++

−+−+++−

⎠⎟⎟⎟⎟⎞

Table 2- Computed solution with Cholesky decomposition

⎝⎜⎜⎛

rrrrrrrr

9999989999999999.1999999869999999999.0000000450000000000.1999998929999999999.1000000550000000000.1000000130000000000.2000000220000000000.2000000530000000000.3000000620000000000.3000000270000000000.499986129999999999.0000000720000000000.1

9999996029999999999.0999999799999999999.0999999869999999999.0999999689999999999.1

−+−−−+

−−

⎠⎟⎟⎞

⎝⎜⎜⎛

rrrr

rrrr

999998889999999999.1000000550000000000.2999999849999999999.0999999659999999999.0000000550000000000.5000000270000000000.2000000070000000000.2000000170000000000.2

0.3000000200000000000.1000000760000000000.2000000410000000000.3999999589999999999.1999999659999999999.10.1000000100000000000.2

−−+−−+−

−+−−+

⎠⎟⎟⎞

4. CONCLUSIONS
In this paper, we have studied the solution of CFSLEs whose coefficient matrix is symmetric positive definite. The CFSLE has been converted into smaller FSLEs, and these systems have then been rearranged into crisp SLEs. Finally, we have applied the Cholesky decomposition method to solve them. An interesting problem for future work is to employ the presented approach to solve CFSLEs with a Hermitian positive definite coefficient matrix.
5. REFERENCES


1. Abbasbandy S, Jafarian A, Ezzati R. (2005), "Conjugate gradient method for fuzzy symmetric positive definite system of linear equations," Applied Mathematics and Computation, 171, pp. 1184-1191.

2. Abbasbandy S, Ezzati R, Jafarian A. (2006),“LU decomposition method for solving fuzzy system of linear equations ,”Applied Mathematics and Computation, 172,pp. 633-643.

3. Abbasbandy S, Jafarian A. (2006),“Steepest descent method for system of fuzzy linear equations ,”Applied Mathematics and Computation, 175,pp. 823-833.

4. Behera D, Chakraverty S. (2012),“A New method for solving real and complex fuzzy systems of linear equations,”Computational Mathematics and Modeling, 23,pp. 507-518.

5. Buckley J. J. (1989),“Fuzzy complex number, ” Fuzzy Sets and Systems, 33,pp. 333-345.

6. Dehghan M, Hashemi B. (2006),“Iterative solution of fuzzy linear systems,” Applied Mathematics and Computation, 175,pp. 645-674.

7. Friedman M, Ming M, Kandel A. (1998),“Fuzzy linear systems,”Fuzzy sets and systems, 96, pp. 201–209.

8. Jahantigh M. A, Khezerloo S, Khezerloo M. (2010), “Complex fuzzy linear systems,” International Journal of Industrial Mathematics, 2,pp. 21-28.

9. Laub A. J. (2005),“Matrix Analysis ,” Siam, California.

10. Majumdar S. (2013),“Numerical solutions of fuzzy complex system of linear equations ,”German Journal of Advanced Mathematical Sciences, 1,pp. 20-26.


Resource Allocation Optimization by Shuffled Frog Leaping Algorithm

Anis Vosoogh1*, Reza Nourmandi-Pour2

1 Department of Computer Engineering, Sirjan Science and Research Branch, Islamic Azad University, Sirjan, Iran
2 Department of Computer Engineering, Sirjan Branch, Islamic Azad University, Sirjan, Iran

Emails: [email protected], [email protected]

Abstract

In this paper, using the shuffled frog leaping algorithm as a model of a system that improves its behavior, a new scheduling of tasks and resources in the cloud environment is proposed. The proposed algorithm, named RASFLA, improves the operation of the workload scheduler in the cloud. The algorithm has also been extended to online scheduling. All proposed methods are simulated in the CloudSim environment and the results are reported.

Keywords: Cloud computing, task scheduling, resource allocation, SFLA, cloud simulation.

I. Introduction
Resource allocation in a cloud environment is an optimization problem which has a direct effect on network performance, response time in the cloud, and its operating power. In the virtualization layer of the cloud, the assignment of a virtual machine to execute a job can be performed in various ways. Since the requests waiting in a resource queue can often be predicted in advance, or show repetitive behavior over similar periods of time, the history of the request queue in a cloud can be used for resource allocation in the current situation. This gives the resource allocation section of the cloud the ability to learn. Hence, optimization approaches that provide learning and improvement of a cloud's activities based on its previous behavior can appropriately be applied to increase the productivity of the cloud network.

In this paper, we attempt to extend the shuffled frog leaping algorithm (SFLA) and new variants of this optimization method so as to improve resource allocation and increase productivity by decreasing the makespan and increasing the response rate. First, the primary definitions and the standard form of this algorithm are presented. After considering the general definitions of SFLA, we consider how to determine its parameters. Then we present a new formulation of the resource allocation optimization problem in the cloud, aimed at increasing the productivity of the virtual machines in using these resources, in such a manner that it can be implemented by SFLA. Based on this formulation, we introduce a new variant of SFLA as its extension for resource allocation optimization in the cloud. The proposed method, named RASFLA, provides an optimized choice for resource allocation in the scheduler section. We have examined the efficiency of this new optimization method in providing an appropriate response for resource allocation in the cloud environment separately, when resources are stationary and when they are dynamic. The results of the proposed algorithm have been obtained by simulation in the CloudSim environment.

II. Introducing the Frog Leaping Algorithm
In 2003, Eusuff and Lansey, two civil engineering researchers at the University of Arizona, introduced a new optimization method based on the leaping behavior of frogs searching for and reaching a goal (food) [7]. When frogs in a group are looking for a better answer in an environment, a kind of knowledge transfer is shared among the response locations. Hence, the responses obtained by each group of frogs, in a memetic evolution model, make it possible for the population to reach a better response. The fundamental point is the ability to jump, or leap, and change location, which provides a new way of searching the problem environment; a group of frogs can also influence the rest of the population.

Conceptually, the shuffled frog leaping algorithm (SFLA) is an evolutionary algorithm based on a meta-population and on the memetic concept. A meme plays a role analogous to that of the gene in Darwinian evolution theory [7], and its propagation is carried out through a randomization mechanism over a set of events. Here, evolution occurs in the form of the transmission of the cultural information of a limited subset to the whole population. Then, by evaluating the extent of the evolution of the population based on the distribution of the culture of a certain elite subset, we can achieve better elite subsets. Creating a completely elite subset ensures that it can dominate the whole population. By this action we can evaluate the population that must be optimized and observe the result. Therefore, the main strategy of SFLA for achieving the optimum answer in a population is to select and find the culture of an elite subset of the population and then distribute this culture through the whole population.

The main advantage of SFLA is its fast performance in reaching the optimum answer. This algorithm, besides fully implementing local search in the randomly created sub-environments, has good efficiency in global optimum searching [23]. In SFLA, the shuffling step provides the possibility of sharing messages during local search, and the algorithm also makes message sharing possible during global search [49]. This allows both global and local search to respond better [50]. Usually, transmission between memes improves local search, and combining it with global search is effective for reaching the global optimum point. Therefore, SFLA can solve many nonlinear, non-differentiable [23] and multimodal [49] problems. This algorithm was originally used to solve water resource distribution problems [50]. Comparing SFLA with the genetic algorithm (GA) and particle swarm optimization (PSO) appropriately indicates its distinguishing aspects; it has been shown that SFLA has higher accuracy and better global search ability than the GA and PSO algorithms [49]. One of the features of SFLA is its fast convergence, which supports its application in a cloud environment with online processing for resource allocation [56, 59].


• Environment and Performance of SFLA
Assume that a set of frogs with the same structure but different locations are searching for the location of food (presumably each of them, after finding the food, would guide the others to it). A search operation is thus taking place. At time zero, the frogs have no knowledge and no data about the probable locations of the food. Each frog provides a response, but to know which response is the optimum one, and whether the optimum response is among the answers given by these limited frogs, we need an optimization policy. For this purpose, in SFLA the population of frogs is divided into a number of groups (for instance, based on their locations); each group is called a "memeplex". The frogs in a memeplex search for the response locally in their own space. The number of frogs in the groups can be the same or different. This grouping is followed by two search strategies: performing local search, and then exchanging the responses and improving them through a global search. In other words, at the beginning the frogs in each group optimize their locations with respect to the food's location by exchanging information; then the results obtained by these groups are compared, and the global optimum response is improved.

Figure 1. A scheme of the SFLA search environment

How many frogs should search for the optimum point requires problem-specific investigation, and may be determined by repeatedly and experimentally executing that problem. In any case, this number has to be determined at the beginning of executing SFLA. After the number of frogs is determined, a population of this size is established randomly in the search space. This population can be ranked using indexes such as the talent of the frogs. Groups are then arranged so that the most talented frogs are placed in one group, and so on, and in this order the groups are created. The combination of frogs can vary. If we have 9 frogs as below, where f(i) represents their rank, the memeplexes can be created in various forms, as in the following samples:


Figure 2. Various methods of creating memeplex

Each group of frogs creates one or more subsets of talented frogs. Each group can also conduct local search independently and by a different method. Frogs in a subset can affect the other frogs in the same subset; therefore, the frogs in a subset evolve. Memetic evolution improves the memetic quality of single frogs and increases their ability to reach the goal. To reach a good goal, we can increase the weight of talented frogs and decrease the weight of bad frogs. After evolution, some of the memes of these subsets are combined with each other. As a result of combination, memes become optimized in the global field, and by using the combination mechanism, new subsets are created. The combination process practically increases the quality of memes that are under the influence of different subsets. The results of local and global search are continually combined until the convergence condition is met. The balance between the exchange of global messages and local search enables the algorithm to easily leap over local minima and progress until the optimum response is reached. The main steps of SFLA are given in the following [7,23].

• Steps of the shuffled frog leaping algorithm

Step 1 (population production and creation of memeplexes and submemeplexes):
A. The algorithm starts with a random selection of G frogs. The productivity of the ith frog, 1 ≤ i ≤ G, is denoted U(i); for instance, this parameter can be taken as the maximum leap length of the frog.
B. Once the primary population of frogs has been established, this set is divided into m memeplexes of n members each, so that G = m×n (more generally, G = n1 + n2 + … + nm).
C. So that the search of the frogs in each memeplex does not interfere with its local search (the global optimum has to be one of these local optima, and interference would cause problems), in each memeplex a submemeplex is formed from the competent frogs.

Step 2 (investigating the optimality of each submemeplex and correcting it to improve the optimum response, and combining frogs):
D. Let Px be the global optimum response, and Pbi and Pwi the best and worst responses of the ith submemeplex. After each local search, a leap is proposed for the position of the worst frog, as follows. First, an appropriate leap size Si for the ith memeplex is computed:
Si = min{int[rand × (Pbi − Pwi)], Smax}   (leap towards the positive direction)
Si = max{int[rand × (Pbi − Pwi)], −Smax}   (leap towards the negative direction)
Note that the leap Si can move the frog to a lower position (negative direction) or a higher position (positive direction) relative to its current one. Here Smax is the maximum leap step and rand is a random number in (0,1).
E. We then check whether this leap increases the optimality of the set. To this end, for the ith submemeplex, which has q (q < n) frogs, we compute
Ui(q) = Pwi + Si.
If Ui(q) is better than Pwi, it replaces Pwi; otherwise, the update is attempted with Pbi in place of Pwi. If even this does not yield a better response, a random response replaces Pwi. These steps are repeated a fixed number of N times.

Step 3 (combining memeplexes and creating the global optimum):
F. So that the optimality information of the memeplexes can be exchanged, the frogs are shuffled together. This combination may involve only the elite submemeplexes of each group, or a full memeplex shuffle. In this step, after the global search, the value of Px is evaluated for the target function and a number of steps are repeated. Finally, if the convergence conditions to the optimum response have not yet been met, the algorithm reverts to local search, builds a new grouping of memeplexes, and starts the local search procedure again.
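A minimal Python sketch of the local-search move in steps D and E above, for a single one-dimensional frog; the names (leap_worst_frog, fitness) and the fallback order of submemeplex best, then global best, then random are illustrative conventions of the usual SFLA description, not code from the paper.

```python
import random

def leap_worst_frog(Pw, Pb, Px, fitness, S_max):
    """One local-search move on the worst frog of a submemeplex (steps D and E).

    Pw, Pb: worst and best positions in the submemeplex; Px: global best position.
    fitness: callable scoring a position (larger is better).
    S_max: maximum allowed leap size.
    """
    def try_leap(target):
        # Step D: leap size towards `target`, clipped to [-S_max, S_max].
        S = random.random() * (target - Pw)
        return Pw + max(-S_max, min(S_max, S))

    # Step E: accept a leap only if it improves the worst frog; first try the
    # submemeplex best Pb, then the global best Px, otherwise randomize.
    for target in (Pb, Px):
        candidate = try_leap(target)
        if fitness(candidate) > fitness(Pw):
            return candidate
    return random.uniform(Pw - S_max, Pw + S_max)
```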

Parameters such as the number of frogs (G), the number of memeplexes (m), the number of frogs selected for the submemeplexes (ni), and the constant N used for the repetition loop all affect the performance of the algorithm. Increasing the number of frogs in each memeplex can improve the local optimum and, in the second step, lead to the correction of the less efficient frogs. When m is large, it is more likely that the global optimum response is identified and propagated in the combination step, although the number of stages for combining the memeplexes' responses grows and so does the execution time of the algorithm. Likewise, a large value of n increases the number of local responses, which also increases the computations of the local


optimum search section. Usually, during execution of the algorithm, these parameters are tuned so as to increase the performance of SFLA, and corrected at each run. The general stages of this algorithm are shown in figure 3.

• Step 1: The constant N may differ between the local and the global phases; we denote these values Nlocal and Nglobal, respectively.

• Step 2: If there is more than one decision variable, the primary local response must be a vector of responses, so that if d is the number of decision variables we have U(i) = (Ui1, Ui2, …, Uid).

• Step 3: So that the creation of memeplexes is not purely random, the frogs are sometimes arranged by f(i). Using the pairs <f(i), U(i)>, the following descending order is created: <f(a), U(a)>, …, <f(x), U(x)>, <f(x'), U(x')>, …, <f(b), U(b)>, where for each consecutive pair f(x) > f(x'). The frogs are then partitioned into memeplexes by various methods (figure 1), for example as sketched below. After the memeplexes are created, the local search is repeated Nlocal times, and in each repetition the leap and the improvement of the Pw of that memeplex are carried out.
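A minimal sketch of this shuffled partitioning, assuming the sorted frogs are dealt into the m memeplexes in round-robin fashion; the names build_memeplexes and fitness are illustrative.

```python
def build_memeplexes(frogs, fitness, m):
    """Sort the frogs by fitness (descending) and deal them into m memeplexes
    round-robin, so that each memeplex receives a mix of good and bad frogs."""
    ranked = sorted(frogs, key=fitness, reverse=True)
    return [ranked[i::m] for i in range(m)]

# usage: memeplexes = build_memeplexes(population, f, m=4)
```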

• Step 4: In this step Px, which must be a global optimum, is updated. At this point of the loop, Px may be updated with a larger value when maximizing or with a smaller value when minimizing.

When further iterations of the loop no longer lead to a better result, the algorithm has converged to an optimum response, and this response is taken as the global optimum.


Figure 3: SFLA algorithm

III. Scheduling Optimization and Resource Allocation in the Cloud Environment
In this section we present a new formulation for scheduling jobs in the cloud when the cloud resources are known in advance. The aim is to show how the SFLA algorithm can be applied so that, by exploiting the past behavior of the system in allocating virtual machines and assigning resources to user requests, it supports a schedule that increases the productivity of the resources.

If "T" is the line of duties so that we have T= T1, T2, …Tm , and includes "m" requests of users, T would always change dynamically. Hence in time "i", properties of request of user is Ti= Ti1, Ti2, …Tim . If physical resources are indicated by "R" (processor and storage space), so that R= R1, R2, …,Rn is obvious in advance. Virtualization is allocated on physical resources to the number of required Vi. An appropriate response S schedules allocation of the Vi to the line of duties. In the time "i", appropriate response of allocation of virtual machines is provided by Si. In this case, S= S1, S2, …Sr indicates sequence of scheduling resources to virtual machines and allocation of virtual machines to duties. Figure 4.4 indicates activity cycle of the cloud. Here, the set V= V1, V2, …, Vp indicates the number of virtual machines.



Figure 4. Activity cycle of the cloud: request of users/response of the cloud to the requests

Generally, in a cloud environment the scheduling of the V virtual machines is carried out without regard to previous states of the system, based only on its current information. Analysis of scheduling behavior shows the effect of previous system conditions on load imbalance [6]. We assume that the resources of the physical machines are constant; hence, to balance the working load, no virtual machine migrates and there are no extra costs for moving the Vi between physical machines. By selecting the Vi for tasks appropriately, the schedule should impose the least load on the system. This setting provides the grounds for introducing a suitable algorithm that uses data from the past.

• Cloud Computing Model
First we present the architecture of the cloud network in terms of its components. If the cloud network consists of M heterogeneous processing units, the jobs in it take the following form.

Tasks are dynamic, so it is not known in advance when jobs become ready to enter T. Each job Ti has characteristics such as its entrance time (ai), worst response time (bi), and deadline (di). It also uses resources: the required space (Si), the number of processors (Ci), and a part of the bandwidth BW(i,j) of the virtual machine assigned to Ti during execution. With these attributes, the data of T form an m×6 matrix. Note that uij denotes the resources used by Ti when Vj is assigned to it. The productivity of the virtual machines is therefore the resultant of the sum of the productivity of the resources Vj ∈ V over the jobs T = {T1, …, Tn}. If the resource used by the jth virtual machine for the ith job is denoted uij, then we have:

U(Vj) = Σ_{i=1..m} uij / qij   (1)

where qij is the corresponding makespan. If the makespan index decreases, the productivity of the cloud increases. Hence the productivity of all virtual machines in the cloud can be obtained from:
U(V) = Σ_{j=1..p} Σ_{i=1..m} uij / qij   (2)

Using the above definition, the average productivity of V depends on the number of tasks that, between their entrance time (ai) and their deadline (di), succeed in obtaining an appropriate virtual machine that can supply the resources required by Ti.
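A small numpy sketch of equations (1) and (2) as reconstructed above, where u[i][j] is the resource used by task i on virtual machine j and q[i][j] is the corresponding makespan; the matrices, values, and names are illustrative only.

```python
import numpy as np

def vm_utility(u, q):
    """Per-VM utility U(V_j) = sum_i u_ij / q_ij, as in equation (1)."""
    return (u / q).sum(axis=0)

def cloud_utility(u, q):
    """Overall cloud productivity, taken here as the sum of equation (1)
    over all virtual machines, matching the reconstruction of equation (2)."""
    return vm_utility(u, q).sum()

# toy example: 3 tasks (rows) on 2 virtual machines (columns)
u = np.array([[0.4, 0.2], [0.1, 0.3], [0.5, 0.6]])   # resource usage u_ij
q = np.array([[2.0, 3.0], [1.5, 2.5], [4.0, 1.0]])   # makespan q_ij
print(vm_utility(u, q), cloud_utility(u, q))
```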

The scheduler model is shown in figure 5.



Figure 5: Matrix of properties of duties existing in line

We assume that the physical resources in the cloud environment are constant during scheduling and that only the task queue behaves dynamically. These resources include the following parts:

− Storage space: Disc = Disc1, …, Discz
− Existing hosts: Host = Host1, …, Hostu
− Existing processors: CPU = CPU1, …, CPUv
− Existing data centers: DC = DC1, …, DCw
− Extra costs: Cost = Cost1, …, Costn
− Bandwidth: Bandwidth = BW1, …, BWs

The properties of the list of virtual machines are then as in figure 6. Providing an appropriate schedule amounts to finding a feasible response Si.

Figure 6. Matrix of properties of resources existing in the cloud

The scheduler of the cloud environment faces a dynamically changing task queue at every moment. When the resources needed by at least one of the jobs in the queue can be provided by a virtual machine, allocation of virtual machines to tasks proceeds, subject to the deadlines di. If Ve denotes the busy part of the virtual machines, then Vf = V − Ve is the free part of the cloud resources. Therefore, all tasks from the queue T whose required resources are available in Vf


form the queue T'. The scheduling approach should now allocate virtual machines so as to maximize resource productivity. One way to produce the response S is to order the jobs by their resource needs, smallest first (shortest job first). Let T" be the ordered queue of existing tasks; the scheduler then, as long as it is feasible, assigns the jobs to the virtual machines available in Vf.

S = T" ⊗ Vf (3)

If T"' would be a part of T" which has exited the line to be executed and a virtual machine is assigned to it, F function establishes the new situation by eliminating scheduled jobs and taking down the added jobs together with eliminating allocated resources from the free resources and taking down the new freed resources. Also the element H is updated by equation 4-7.

T = ((T ʘ T"') ⊕ Tnew) (4)

Vf = ((Vf ʘ V(T"')) ⊕ Vnew ) (5)

Here the operator ⊕ appends the rows of two matrices with the same columns, and the operator ʘ removes rows from such a matrix (a minimal sketch of this update is given below). When user requests in different working-load periods include similar jobs, a learning algorithm can be used to improve S. In the following we show how SFLA is applied for this purpose.
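A minimal sketch of the update in equations (4) and (5), treating the task queue and the free-VM list as lists of rows so that ⊕ appends rows and ʘ removes them; all names are hypothetical.

```python
def update_queues(T, T_done, T_new, Vf, V_used, V_freed):
    """Apply equations (4) and (5): drop the scheduled tasks and the VMs they
    occupy, then append newly arrived tasks and newly freed virtual machines."""
    T = [t for t in T if t not in T_done] + list(T_new)        # (T ʘ T''') ⊕ T_new
    Vf = [v for v in Vf if v not in V_used] + list(V_freed)    # (Vf ʘ V(T''')) ⊕ V_new
    return T, Vf
```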

IV. Shuffled Frog Leaping Algorithm for Resource Allocation in the Cloud (RASFLA)
Based on the way SFLA works, we suggest creating a correspondence between the components of scheduling and resource allocation in the cloud and the components of SFLA, for instance between the task queue and the frog population. Since the tasks in a queue can be divided into groups of "similar jobs" based on some of their attributes, the primary population can be defined easily and the memeplexes arise naturally. In this mapping, the search of a task for an optimal virtual machine that can execute it with fewer resources is assigned to the frog corresponding to that task. Similar jobs (SiTa), as in equation 3-8, lead to the creation of memeplexes. A high response rate can be taken as the convergence condition. In the proposed algorithm, called RASFLA, the number of frogs n is much larger than the number of memeplexes m (n >> m). Hence we follow the strategy of increasing the number of submemeplexes in the algorithm, since otherwise we may not reach appropriate local optimum responses.

• RASFLA Steps
Following the formulation above and the definition of the resource allocation problem in the cloud as an optimization problem with target function U(V) to be maximized, we define resource allocation within SFLA as follows:

A. Establishment of the primary population (frogs): An evolutionary algorithm starts from a population. In resource allocation, either the virtual machines or the job queue could serve as the primary frog population in SFLA. The job queue is the better choice, since the set of virtual machines (V) and of free machines (Vf) is limited, whereas the queue of jobs T changes constantly as new requests arrive. We must decide whether the frogs are the jobs, which leap towards resources such as virtual machines, or the virtual machines, which as cloud resources move


towards the waiting jobs to select the most appropriate job so as to optimize performance. It is reasonable to select the first option. Hence the set of virtual machines is taken as the constraint set of the problem and U(V) is the target function to be maximized. The other SFLA parameters are set as follows: the number of memeplexes is the constant MT = |SiTa|, based on the number of groups, and qi = |T| / (MT × MT) for 1 ≤ i ≤ MT. For the standard setting, the number of iterations N of the local search is taken equal to |V|.

B. Primary responses: For each job Ti we compute uij and take as the primary response the virtual machine j, 1 ≤ j ≤ |Vf|, with the highest productivity:
U = {Vj | uij = max_k uik}, i = 1 … |T|, j = 1 … |V|   (6)
We then assess the competency of each job in the created memeplexes:
F = {f(i) | f(i) = Vj achieving max_j qij}, j = 1 … |V|   (7)
where qij is the makespan, so we easily obtain an index of the competency of each job. The global and local search loops of SFLA can now be run on the structures created above. Management of the raw data of the scheduling behavior of the cloud environment can be used to increase the intelligence of RASFLA. The proposed algorithm for scheduling and resource allocation in the cloud, which we call RASFLA, is given by the pseudocode below; first, a small sketch of the primary-response computation.
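A small numpy sketch of equations (6) and (7) as read above: for each task, the free VM with maximal u_ij is selected, and the competency index f(i) is taken from the best makespan in its row; the matrices and names are illustrative.

```python
import numpy as np

def primary_response(u, q):
    """Equation (6): for each task i, index of the free VM with maximal u_ij.
    Equation (7): competency index f(i) taken from the best makespan q_ij."""
    best_vm = u.argmax(axis=1)      # most productive VM for every task
    competency = q.max(axis=1)      # f(i), used to rank frogs into memeplexes
    return best_vm, competency
```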

• Pseudo code of the RASFLA scheduling algorithm
Algorithm: Resource Allocation by Shuffled Frog Leaping Algorithm (RASFLA)
RASFLA (T, V, H, TH, S)
Inputs:
T: list of requests for VMs from client applications. Space: set of space / GB; DC: number of data centers; CPU: set of CPUs / GHz; Host: number of hosts; Cost: list of network costs; BW: network bandwidth. TH: threshold; H: array of size TH, initialized to 0. G := |T|; MEM[1…m] are the m subsets of T, each containing similar tasks; ni is the number of tasks in the ith subset; qi is the number of submemeplexes of MEMi; N := |V|.
Output:
S: list of VMs allocated to T such that utility is high and debt is low.
Initialize: in each state, the resource requirement of a task is decided by SFLA. The set of actions of SFLA is the set of permissible VMs allocated to tasks (cloudlets) with high utility. The following steps are taken:
1. while T ≠ ∅ do
2.   compute T and Vf
3.   sort T
4.   for i = 1 to MT do create MEMi, 1 ≤ i ≤ MT
5.   for i = 1 to |T|
6.     for j = 1 to |Vf|
7.       if Vj is a member of Ui then
8.         allocate Vj to Ti
9.       fi
10.    rof
11.  rof


12. for i = 1 to |T|
13.   for j = 1 to |Vf|
14.     if uij = max_k uik then
15.       U := U ∪ {Vj}
16.     fi
17.   rof
18. rof
19. while Px is not the total optimal solution
20.   search for the local optimum solution of each MEMi := compute Pbi for i = 1 … |T|
21.   shuffle MEM1, …, MEMn and compute Px
22. elihw

V. Simulating, Evaluating, and Comparing the Proposed Algorithms
To simulate the proposed approaches, we first perform scheduling based on the smallest task. We set a threshold (TH) for requests and, based on it, evaluate the productivity obtained for the scheduled jobs. Gradually, the scheduling and the effect of jobs of different sizes on the productivity of the cloud update the element H, and the selection priority is then chosen according to the productivity of the jobs in the behavior of the cloud. The output of executing the algorithm is shown in figure 7. As can be observed, as time increases the productivity response grows in a regular manner. Our result for the return time is close to the predicted value, but its growth intensifies as the time period increases.

Figure 7: Comparing productivity of resources of various methods (resources versus time)

Among the important scheduling indexes, the makespan, i.e., the difference between the completion time and the start time, has an important effect on the evaluation of a method. Here we compare the makespan obtained by the existing methods. As the number of cloudlets increases, the makespan of the nonlinear methods increases, while the proposed method, by using the capabilities of learning automata, behaves in an essentially decreasing manner.

(Chart series: RASFLA, LSTR, nLSTR, nLOSTR)


Figure 8: Makespan versus the number of cloudlets

The debt decreases under this method, since the increase in productivity interacts with the debt, which appears in |Ti| as a negative index. Hence an increase in productivity is followed by a decrease in the debt of responding to tasks in the cloud.

Figure 9: Debt versus the number of cloudlets

The proposed approach, using the capabilities of learning automata, shows that compared to the customary scheduling methods such as STF and FCFS, the indexes of resource productivity and debt reduction improve with an appropriate trend, while the average response time of these methods still provides the best possible conditions.

(Chart series: LSTR, NLSTR, ONLSTR, RASFLA)


Figure 10: Average of job flow time in scheduler (time versus number of jobs)

In any case, the main advantage that the learning methods provide for resource productivity, and the resulting improvement in the cloud's level of response to requests under the SLA, should be their ability to raise the quality of the service response level by improving the job flow time.

VI. Comparing the Complexity of the Methods
Having compared the proposed methods in the previous section in terms of productivity, debt, and average job flow time, we now state some theses on the complexity of the proposed methods and, at the end, compare the various methods considered here in terms of the amount of computation they require.

Thesis 1: The complexity of the NLSTR algorithm for concurrent scheduling of jobs and resources is O((αm + α + 3)n), where αm is the coefficient for the number of virtual machines in the cloud and α is the queue coefficient for FoTQ.

As we know, the general LSTR algorithm provides a linear mapping from the job queue (TQ) to the queue of resources available in the virtual machines. If the length of TQ is n and the length of RQ is m, the complexity is O(mn). Since naturally n >> m, we can replace the variable m with a constant coefficient αm. The nonlinear algorithm NLSTR then has the following parts:

A. The algorithm extends the TQ queue to the new queue TQ ∪ FoTQ. According to equation 3-9, the size of this queue depends on the coefficient α; for instance, with α = 4/3 and a normal distribution the length of the new queue is 7n/3, and in general it equals (1 + α)n.

B. In the NLSTR algorithm, the frequency sequence of jobs is used to predict the FoTQ queue. The length of this sequence is n, so each update requires O(n) time. Updating is performed in two phases, one before and one after scheduling a TQ queue.

Therefore, the total cost of executing this algorithm is:

NLSTRComplexity = O(αm n) + O(2n) + O((1+α)n) = O((αm + α+ 3)n)

(Chart series: LSTR, NLSTR, ONLSTR, RASFLA, FCFS, STF)


Thesis 2: The complexity of ONLSTR for concurrent scheduling of jobs and resources, when it is performed only for predetermined jobs, is O((β + 1)αm n), where αm is the coefficient for the number of virtual machines in the cloud and α is the queue coefficient for FoTQ.

As in the complexity analysis of NLSTR, in this scheduling method the FoTQ queue, of length n, is created after computing the frequency of occurrence of the various jobs in the cloud environment. Any job that is likely to occur in the near future, within a certain period of time, is predefined and its required resources are predicted. Scheduling then proceeds as in LSTR, so the complexity of the online scheduling method has the following two parts:

A. Creation of the FoTQ queue, whose complexity depends on the number of periods iterated over, denoted β. If we want an appropriate frequency of jobs for each period of the cloud's activity, β should naturally be as large as possible; in general, a period of 30 days is an appropriate choice. The complexity here is O(βαm n).

B. In the other part, the LSTR scheduling of FoTQ is O(αm n).

Therefore, the total cost of executing this algorithm is:

ONLSTRComplexity = O(βαm n) + O(αm n) = O((β+1)αm n)

Thesis 3: The complexity of RASFLA for scheduling resource allocation in the cloud is
O(Nlocal × Nglobal × (MT + |V|) × (|V| + |T|)),
where MT is the number of memeplexes. As shown in the RASFLA pseudocode, the algorithm has two nested "for" loops and two "while" loops; calculating their coefficients gives the stated bound.

As indicated in [30], the FCFS algorithm has very low complexity in terms of the debt of the scheduling method. The complexity of STF scheduling is of order O(θn), with 0 < θ < 1. We can therefore compare the various scheduling methods by their complexity, as in figure 11; here we take αm = 20, α = 4, β = 2.
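As a quick check of these expressions with the stated parameter values (this is only an evaluation of the formulas above, not a simulation result), the leading coefficients of the linear bounds are:
LSTR: αm n = 20n;  NLSTR: (αm + α + 3)n = (20 + 4 + 3)n = 27n;  ONLSTR: (β + 1)αm n = (2 + 1) × 20n = 60n.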

Figure 11: Comparing complexity of various scheduling methods

(Chart series: LSTR, NLSTR, ONLSTR, RASFLA, STF, FCFS)


An appropriate method is one in which there is a reasonable proportion between its productivity and its overhead (debt). The various scheduling methods considered here are therefore rated on productivity versus complexity; figure 12 rates them according to their improvement in scheduling resources and jobs.

Figure 12: Rating scheduling of various methods

VII. Conclusion
In this paper, using a learning scheduler approach in the cloud environment and ignoring the effects of dynamic changes in the cloud's resources, we proposed an appropriate algorithm for scheduling tasks and allocating virtual machines. Using the capabilities of learning automata, the algorithm attempts to identify an optimal scheduling behavior. The proposed algorithm, called RASFLA, increases resource productivity significantly; it also improves the makespan to some extent and improves the cost-reduction index of the cloud. Since online scheduling in the cloud is highly important, especially for providing cloud services to SLA users, we extended the proposed method to online scheduling. The obtained results confirm the efficiency of the proposed method.

References

1) 7. Eusuff, M. and Lansey, K. (2003). ”Optimization of Water Distribution Network Design Using the Shuffled Frog Leaping Algorithm.” J. Water Resour. Plann. Manage., 129(3), 210–225.

2) Junqing Li, Quanke Pan, Shengxian Xie, An effective shuffled frog-leaping algorithm for multi-objective flexible job shop scheduling problems, Applied Mathematics and Computation 218 (2012) 9353–9371.

3) 49. Chen Fang, Ling Wang, An effective shuffled frog-leaping algorithm for resource-constrained project scheduling problem, Computers & Operations Research 39 (2012) 890–901.

4) 50. Jian-ping Luo, Xia Li, Min-rong Chen, Hybrid shuffled frog leaping algorithm for energy-efficient dynamic consolidation of virtual machines in cloud data centers, Expert Systems with Applications 41 (2014) 5804–5816.

5) 56. A. Legrand, L. Marchal, H.Casanova, "Scheduling Distributed Applications: the SimGrid Simulation Framework", in Proceedings of the 3rd IEEE/ACM International Symposium on Cluster Computing and Grid, Cardiff, UK, pp.138–145, 2003.

6) 59. Yue Miao, Fu Rao and Luo Yu, Research on the Resource Scheduling of the Improved SFLA in Cloud Computing, International Journal of Grid Distribution Computing Vol.8, No.1 (2015), pp.101-108.

(Ranking chart series: LSTR, NLSTR, ONLSTR, RASFLA, STF, FCFS)


Security issues and challenges using cloud computing

Anis Vosoogh1*, Mohssen nazarian parizi2

1 Young Researchers and Elite Club, Sirjan Branch, Islamic Azad University, Sirjan, Iran

Email: [email protected]

2Department of computer engineering, Sirjan Science and research branch, Islamic azad university, Sirjan, Iran & Department of computer

engineering, Sirjan branch, Islamic azad university, Sirjan, Iran

[email protected]

Abstract. At the end of the first decade of the 21st century, the IT world witnessed the boom of a new technology called cloud computing, and leading companies in the IT industry have tried to use this technology in their customer service processes, so that the best of these companies today use cloud computing as a key competitive advantage against their rivals. Despite very important challenges, this relatively new technology is developing rapidly. At the beginning of its development, many researchers thought that cloud computing would soon be widespread, but recent studies suggest that security is the biggest barrier against the spread of this technology. This study reviews the security challenges of this technology to a certain extent, and why the security issues have caused a low reception of this technology among users and IT managers. First, we give an introduction and a definition of cloud computing; then we describe the advantages and benefits of using this technology as well as its security disadvantages and problems; finally, a security analysis is presented.
Keywords: Cloud computing; security; security in cloud computing

1 Introduction
Cloud computing is a different look at the internet. Until recently, the internet was mostly used for the transmission of data between devices connected to it, and almost all calculations and processing were performed locally on users' computers or the internal systems of organizations. In the cloud computing model, the internet acts as a medium to access computing resources and services on systems that are located in different and somewhat unidentified geographical areas. This concept goes back to 1961, when Professor John McCarthy proposed that computer time-sharing technology might lead to a future in which computing power and specific applications would be sold using a utility-type business model (Anthes & Gary, 2010). It was a famous idea by the end of that decade, but in the mid-70s it became clear that the IT technologies of the time were not capable of realizing such a futuristic computational model. However, at the turn of the millennium, this concept took on a new life and cloud computing technology began to emerge in technological circles.

1.1 Cloud Computing
Cloud computing is a model for providing self-service, on-demand access to a set of services and computational resources such as networks, service providers, storage media and applications through the net (internet), so that they can be provisioned and used quickly, with minimal involvement of suppliers and vendors, and according to the consumers' needs (M. Armbrust et al, 2009). This means that access to IT resources, at the time of request and based on the level of the user's request, is provided for the user via the internet in a flexible and scalable manner. The term "Cloud" is a metaphor that refers to the internet, and a picture of a cloud is also used in computer network diagrams to represent the internet. The internet is likened to a cloud because, like a cloud, it hides its technical details from the users' view and creates a layer of abstraction between these technical details and the users. For example, what a cloud computing software provider offers includes online business applications that are provided to the users through a web browser or other software. Applications and information are stored on servers and provided to the users upon their request. Details remain hidden from the user's view, and users do not need to be specialists in, or to control, the underlying cloud infrastructure in order to be able to use it (John W et al, 2010).

2. ADVANTAGES AND BENEFITS OF USING CLOUD COMPUTING


Because customers generally do not provide the infrastructure used in cloud computing environments by themselves, they can be relieved from high costs and may consume resources as a service just by paying their price. Many offerings of cloud computing have adopted a utility-computing billing model, and individuals receive bills based on a subscription fee. By sharing computing power among multiple users, rates of return are generally improved because cloud computing servers do not sit idle due to lack of use. This factor alone can significantly reduce infrastructure costs and increase the rate of development of applications. Another benefit of this model is that computing capacity increases dramatically, because customers do not need to manage their applications at peak consumption times, when the processing load is very high. Adoption of the cloud computing model is also a powerful choice because of greater access to faster bandwidth (Daniele Catteddu and Giles Hogben, 2009). Multi-tenancy makes it possible to share resources and cost across a large number of users. Major advantages of a multi-tenant approach include:
• Central infrastructure and lower costs
• Improved performance at peak consumption time
• Improved efficiency of systems that are often shared
• Dynamic allocation of CPU, storage and bandwidth
• Consistent performance that has been reviewed by the service provider.

2.1 SECURITY ADVANTAGES IN CLOUD ENVIRONMENTS
Current cloud service providers employ a tremendous number of systems. They use complex processing and have experienced staff to maintain their systems, resources that small companies cannot afford. As a result, there are direct and indirect security advantages for cloud users. Here we introduce a number of key security advantages of a cloud computing environment:
Centralization of data: In a cloud environment, the service provider takes care of storage problems and small companies do not need to spend large sums on physical storage devices. Cloud-based storage also offers a way for faster and potentially cheaper centralization of data. This is especially beneficial for small companies that cannot pay extra money to security specialists to monitor the data.
Event reaction: Infrastructure providers can create a dedicated forensic server that may be used as required. With a modest investment, an environment backup can easily be created and deployed on the cloud without affecting the normal course of the company's business.
Forensic image verification time: Several cloud storage implementations expose a checksum or hash; for example, Amazon S3 generates an MD5 hash automatically when you store an object (Craig Balding, 2008).
Logging: In a traditional computing paradigm, logging is often an afterthought. In a cloud, the storage needed for standard event logging is solved by default (Craig Balding, 2008).
Based on a report by ENISA, cloud computing environments also have the following safety and scale advantages: different kinds of security measures are less expensive when implemented on a large scale, including a variety of defensive measures such as filtering, patch management, etc. Other benefits of scale are multiple locations, uninterrupted response to events, and threat management.
Security as a market differentiator: Security is a major concern for many cloud customers.
Most of them purchase options based on their reputation for reliability, integrity and flexibility. This is a powerful force for providers to improve their security measures. Standardized user interface to manage security services: Big cloud providers can offer a standardized open source user interface in order to manage security service providers. Fast and smart scanning of resources: Cloud providers’ ability to dynamically re-allocate resources for filtering, traffic shaping, identification, encryption, etc. to defensive measures has obvious advantages for flexibility. Auditing and evaluation of gatherings: Cloud computing can provide exclusive and legitimate images for any amount of use of virtual machines that are accessible in offline environment without creating infrastructures. They will reduce the time needed for legal analysis.


3. CHALLENGES AND RISKS OF CLOUD COMPUTING The biggest challenges for cloud providers are the security of data storage, high-speed internet access and standardization. Storing a large volume of data related to users’ privacy, identity and special software priorities in focused situations, raises many issues for data protection. These issues in turn, include doubts in legal frameworks that should be implemented in an equal-based environment. Another challenge in cloud computing model is that the bandwidth penetration in the U.S. is much higher than that of many countries in Europe and Asia. Cloud computing without bandwidth connections (wired or wireless) is untenable. Without high speed bandwidths, cloud computing services cannot be widely available. Finally, technical standards used to implement a variety of computer systems and applications which must be created for cloud computing, are not well-defined yet or been widely reviewed and their hidden parts have not been approved either (John W et al, 2010). Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) are three general models of cloud computing. Each of these models has a different impact on software security. However, two common security questions arise in a typical scenario where software is hosted on a cloud: • How is the data security? • How is the code security? Security, availability and reliability are important quality issues for users of cloud services. Gens et al. have pointed out that security is a remarkable challenge among other quality challenges as depicted in figure1 (Gens. F, 2013). Fig. 1. Results of IDC survey ranking security challenges (Gens. F, 2013)

According to the article “security architectures for cloud computing”, users have feelings of security and trust when they really understand how the operation is being implemented and conducted. Although cloud computing provides great convenience for users by releasing them form their need to know the details of the processes, it forces the users to trust the cloud service provider who is concerned for many users. In today’s market, the awareness about the cloud computing issues is more shifting towards reliability and security subjects. For example, a survey managed by Fujitsu revealed the importance of cloud computing issues and showed that security, stable operation, support system, safety and reliability had the highest ranks among the concerns of the users. This survey also showed that cloud computing IT systems are unobvious to the users and it is understandable that customers highly demand their information to be fully protected and the provided services to be sustainable. The following concerns about security have especially increased among customers: • Within the same data center, there are several issues in which the information belongs to more than one customer and this information is located on a same computer. In such cases, are these variable sets of information stored properly? • Should we be concerned that information may be leaked or distorted in a data center? • Since the system policy of a cloud service provider is being shared by a wide variety of customers’ media, can reliability be a problem? For example, if a malicious program such as a virus penetrates the services, is it not possible for it to affect all the environments that are using such services? • When using multiple cloud services at the same time to do a job, the relationship between the tasks becomes difficult. Is service reliability guaranteed? (M. Okuhara, et al, 2010).


3.1 SECURITY FLAWS IN CLOUD ENVIRONMENTS Despite having security advantages, cloud computing paradigm also introduces some key security challenges. Here, we describe some of these key challenges: Location of data: In general, cloud users are not aware of the exact location of data centers and they also have no control over physical mechanisms of data processing. Most popular cloud service providers have data centers in different parts of the world. Many service providers also have the advantage of their global data centers. However, in many cases, applications and data may be stored in a country that can be faced with legal issues. For example, if the user’s data is stored in country X, then service providers would be faced with security and legal requirements of country X. This may occur while the user is not aware of such issues. Investigation: Investigating an illegal activity in cloud environments can be impossible. Cloud services are especially hard to investigate as multiple users’ data are located in a single location and can also be released among data centers. Users have a limited knowledge about the network topology of the environment they are using. Service provider can also sometimes apply services on users’ network security. Separation of data: Data in the cloud is typically in a shared environment with other customers’ data. Encryption cannot be considered as a specific solution for data separation issues. In many situations, customers do not want to encrypt data because there are cases that data will be destroyed upon their encryption. Long-term sustainability: Service providers should guarantee data security in changing business situations including mergers and acquisitions. Customers should also ensure data accessibility in such situations. Cloud service provider should also guarantee data security in bad business conditions such as prolonged power failures. Endangered servers: In a situation where a server is compromised, users need to shut down their servers until they can get a backup of the data. This will cause further concern about accessibility. Routine utility: Experienced cloud service providers are subjected to the penetration of external audits and security certificates. If a cloud service provider does not adhere to this security audit, it leads to an apparent decline in customers’ confidence. Recycling: Cloud service providers must guarantee the security of data in the event of natural and man-made disasters (A. Sangroya, 2010). 4. CLOUD COMPUTING AND ORGANIZATIONS During a February 2009 speech in IDC cloud computing forum, an analyst of this corporation said: “The major concern for organizations that are willing to use cloud services is information security. According to IDC, about 75 percent of IT managers are concerned for the security of cloud computing services”. In order to understand the cause, we should break security problems into smaller parts. In organizations that are using cloud services, all security problems are placed in three major groups: Security of the platform which is located at the place of cloud computing server. Security of workstations (endpoints) which are located at the place of clients. Security of data that is transmitted from endpoints to the platform. The last mentioned security concern which is the security of transmitted data, has been actually resolved by using encryption techniques, secure connections and VPN. 
Almost all the new services support these mechanisms and today, data transmission from endpoints to platforms is performed in a completely safe process. 4.1 SECURITY PROBLEMS RELATED TO SYSTEM VALIDATION AND OPERATION Today, the biggest nightmare of IT managers is security problems related to the performance of service platforms. For most of these individuals, finding an appropriate method of providing security for a system that cannot be directly controlled is not an easy process. Cloud services platform is not a centralized system in organization’s data center; rather it is often located in an unknown data center of an unknown country. In other words, the main security problem of cloud computing is due to flaws in validity and verification of service providers and in fact it’s the continuation of the same problems that occur during the process of assigning some of organizational affairs to external providers. Specialists and managers of organizations are not familiar with assigning vital issues such as providing security for their working data to external providers. However, we can be sure that this problem will be solved like the previous issues related to delegation of performing internal organizational tasks to miscellaneous providers and all concerns about the safety of resources will disappear.


The question here is that, what is the basis of such statement? First of all, providing security for a data center which includes processing resources is much easier for service providers. The reason is the distribution property (scale) of cloud computing. Since service providers offer their services to many users, they have the possibility to provide security for all users at the same time and thus, can be benefitted from more effective and sophisticated security approaches. Of course, companies like Google and Microsoft have more resources and facilities to provide information security compared to small companies and even big institutions that use exclusive data centers. Second, using cloud services between client organizations and service provider is always done based on the mutually agreed quality in contracts that predict all the responsibilities of service provider related to security problems. Third, further operation of service providers is directly related to their work experience and because of that they always try to provide information security at the highest level possible. Clients of cloud platforms have also concerns about security problems related to the performance of cloud systems in addition to validity and verification problems. While many of home systems have this property (due to long-term evolution), the situation becomes much more complex when using cloud services. Gartner research institute reviews seven of the most important security problems of could services in a short article entitled “Evaluation of cloud computing security risks”. Most of these problems are related to the operation of cloud systems. Gartner Inc. specially recommends the study of cloud systems in terms of the distribution of data access rights, information retrieval capabilities, research supports and periodic revisions. Is there a limitation that prevents the actual implementation of these properties? The answer is of course negative. Every measure that its implementation is possible in an organization can also be implemented through a cloud system. Security problems basically depend on the quality of cloud computing products and services design. Before entering the discussion of cloud computing platform security, you should solve some important legal and audit problems. The problem occurs when data separation is done between clients of a service provider in a cloud environment and this separation usually makes the process of ensuring the compliance with legislations and standards more complex. Despite the seriousness of this problem, there is no doubt that it will be solved sooner or later. On the one hand, technologies used for monitoring the implementation of legislations will be enhanced through the development of cloud computing and on the other hand, legislators consider the technical complexities of cloud computing environments as a serious matter in their legislations. 5 .CLOUD COMPUTING USERS IN ORGANIZATION In a perfect “cloud world”, since the information is not stored in accessories, security of cloud computing is provided at the platform level and through communication with auxiliary accessories. This model is not yet suitable for practical use and information that is sent to platform is actually being made, processed and stored in system’s endpoints. So it seems that the auxiliary accessories of cloud environments will always face security problems. Actually, according to a more powerful theory, such problems become more serious through time. 
Most security threats come from the global network and through entering the organizational infrastructure of the client. In a home system, the major problem occurs when interacting with the platform but in cloud environment, unprotected endpoints will face security problems. As it was said before, the security level of global cloud platforms such as Google and Microsoft is much higher than that of independent organizational systems because they are benefitted from so many facilities, capacities, professional experts and unlimited resources. Because of this, external attackers are confident about the futility of their attacks against such protected service providers and as a result, cyberspace criminals direct their attacks towards auxiliary accessories of the “cloud world”. The original concept of cloud computing, which includes permanent and unrestricted local access to a platform, increases the possibility of such incidents. By investigating the number of attacks to computers that are located at system’s endpoints, organizational information security services should be altered in a way that they will be able to protect the auxiliary accessories of the cloud environment. Doing this is vital to providing security for organizational information. Joseph Tobolski, director of cloud computing in Accenture institute, says: “I think a lot of security objections to the cloud are emotional in nature, it's reflexive’ response to this technology”.Doug Menefee, CIO of Schumacher Group, is also familiar with emotional aspects of this subject. He says: “My IT department came to me with a list of 100 security requirements and I thought, Wait a minute, we don't even have most of that in our own data center”( James Niccolai, 2009). Making decisions about the use of cloud computing is like driving a car for the first time. On the one hand, most of your acquaintances have probably taken this step but on the other hand, entering a crowded highway for the first time can be a frightening experience especially when we hear horrible stories about car accidents in daily news. Nevertheless, driving in a highway is not more dangerous than drinking coffee in a moving train or waiting at a bus station. Cloud computing situation is very similar to that of using software in a traditional way. Cloud environment needs attention to the information security and we are completely certain that there are appropriate solutions to the current problems. There are important considerations about the cloud environment security that have basically different priorities (from environmental protection to the protection of auxiliary accessories). However, if the developers of information security providing tools help the companies to overcome these obstacles, the cloud environment would have a bright future.


6. CONCLUSION
Cloud computing security has benefits and disadvantages, some of which have been described in this article. We should now see when and where to use it, evaluate the risks associated with using it, and determine to what extent security will be endangered by doing so. Security issues in cloud computing have played a major role in slowing down its acceptance; in fact, security ranks first as the greatest challenge of cloud computing. Data security breaches and losses are obvious dangers in this regard. Cloud computing technology is still growing and evolving, and there are many open problems, including some potentials of this technology that are not yet agreed upon, even in the IT community.
References:

1. Anthes, G. (2010). Security in the cloud. Communications of the ACM, 53(11), 16-18.
2. Armbrust, M., Fox, O., Griffith, R., Joseph, A. D., Katz, Y., Konwinski, A., ... & Zaharia, M. (2009). Above the clouds: A Berkeley view of cloud computing.
3. Rittinghouse, J. W., & Ransome, J. F. (2009). Cloud computing: implementation, management, and security. CRC Press.
4. Catteddu, D. (2010). Cloud computing: benefits, risks and recommendations for information security. In Web Application Security (pp. 17-17). Springer Berlin Heidelberg.
5. Gens, F. (2009). New IDC IT Cloud Services Survey: Top Benefits and Challenges. IDC eXchange. Available: http://blogs.idc.com/ie/?p=730 [1 August 2015]
6. Okuhara, M., Shiozaki, T., & Suzuki, T. (2010). Security architecture for cloud computing. Fujitsu Sci. Tech. J., 46(4), 397-402.
7. Craig Balding. (2008). ITG2008 World Cloud Computing Summit.
8. Sangroya, A., Kumar, S., Dhok, J., & Varma, V. (2010). Towards analyzing data security risks in cloud computing environments. In Information Systems, Technology and Management (pp. 255-265). Springer Berlin Heidelberg.



Nonparametric Wavelet Regression Estimates for Consecutive Survival data

Esmaeil Shirazi1, Reza Zarei2

1Department of Statistics, Faculty of Science, Gonbad Kavous University, Gonbad Kavous, Iran.

Email: [email protected]

2Department of Statistics, Faculty of Mathematical Sciences, University of Guilan, Rasht, Iran.

Email: [email protected]

Abstract. We consider a wavelet method for estimating a regression function nonparametrically for consecutive survival data. In some statistical applications to medical data, one is interested in sequentially observed periods of a disease. In such situations we encounter a set of data that may be truncated and/or censored. Our goal is to estimate nonparametrically the regression function of this type of survival time data and to investigate its asymptotic properties. A simulation study is conducted to investigate the performance of the proposed estimators.
Keywords: Consecutive lifetime data, Nonparametric regression, Thresholded wavelet estimator

1. Introduction and problem statement
Nonparametric regression estimation with survival data has received considerable attention recently. In survival or reliability studies, the observed data are typically censored and/or truncated. Left truncation and right censoring together naturally occur in cohort follow-up studies. Many studies in the nonparametric regression setup refer to the case when only the response (or dependent) variable is subject to truncation or censoring (see Shirazi et al. 2013, Chaubey et al. 2013 and Alvarez et al. 2012). Some authors also considered situations where both components are subject to truncation or censoring, but with different censoring times (see Ahmadi et al. 2012 and Park 2004). In practice, we may encounter a set of consecutive lifetime data that are subject to truncation and/or censoring. More precisely, suppose that X_1 ≤ X_2 ≤ X_3 are consecutive times denoting, e.g., the beginnings of different phases in the development of a disease. Furthermore, let E be the end of an observational period. Very often, an individual can be identified as having the disease, and thus become part of the study, only if X_2 takes place before E. In the other case, there is no access either to X_1 or to X_2. Given X_2 ≤ E, the quantity X_3 can be observed only if X_3 occurs before E. If X_2 ≤ E < X_3, besides X_1 and X_2 only E is known.
Typically, in a medical study, not the times but the individual durations of the different phases of a disease are of interest, i.e. U_1 = X_2 − X_1 and U_2 = X_3 − X_2. Set Z = E − X_1, the time elapsed from the possibly unknown onset time X_1 until the end of the study, assumed to be independent of (U_1, U_2). Then U_1 can be observed only if U_1 ≤ Z; in the other case, we say that U_1 is truncated from the right by Z. Furthermore, we may have censoring effects in that we observe U_2 only if U_1 + U_2 ≤ Z; in the other case, we observe Z, so that U_2 is censored from the right. The time Z elapsed between X_1 and E is mainly affected by experimental constraints, so that independence of (U_1, U_2) and Z may often be justified. Therefore, given that U_1 is not truncated, we observe U_1, Ũ_2 ≡ min(U_2, Z − U_1) and δ = I(U_1 + U_2 ≤ Z), indicating whether U_2 or some censoring variable was observed (a small simulation of this observation scheme is sketched below).
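A minimal numpy sketch of the observation scheme just described (right truncation of U_1 by Z and right censoring of U_2 by Z − U_1); the exponential distributions and the sample size are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# phase durations U1, U2 and follow-up time Z = E - X1 (illustrative exponentials)
U1 = rng.exponential(1.0, n)
U2 = rng.exponential(1.5, n)
Z = rng.exponential(3.0, n)

observed = U1 <= Z                        # U1 is right-truncated by Z
U1o, U2o, Zo = U1[observed], U2[observed], Z[observed]

delta = (U1o + U2o <= Zo).astype(int)     # censoring indicator I(U1 + U2 <= Z)
U2_tilde = np.minimum(U2o, Zo - U1o)      # observed min(U2, Z - U1)
```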


In this paper we are interested in the relation between U_2 and U_1, obtained by estimating the mean regression function ϕ(x) = E(U_2 | U_1 = x). Let (U_{1i}, Ũ_{2i}, δ_i), 1 ≤ i ≤ n, be a given sample of independent replicates of (U_1, Ũ_2, δ). The goal then is to estimate the regression function based on (U_{1i}, Ũ_{2i}, δ_i), 1 ≤ i ≤ n.
The subject of censored nonparametric regression has been treated elsewhere in the literature. Currently, wavelet-based methods for the estimation of the regression function are becoming increasingly popular as an alternative to the usual methods such as those briefly mentioned above. These methods offer fast computations and easy updating, in addition to being easily adapted to the design. The rest of this paper is organised as follows. In the next section we give some notation and details about the consecutive lifetime data, together with the basic elements of wavelet theory, the definition of the regression estimator, and the subsequent analysis. The main result is described in Section 3, and its proof is given in Section 4.

2. Wavelet estimator for the regression function ϕ
Consider the nonparametric regression model
Y_i = ϕ(X_i) + ε_i,  i = 1, 2, …, n,    (2.1)
where the errors ε_i, conditionally on X_i, have zero mean and are independent with a bounded (conditional) variance.
Our goal is to estimate the regression function, which can be written as
ϕ(x) = E(ρ(U_2) | U_1 = x) = (∫ ρ(u_2) f(u_2, x) du_2) / f(x) = g(x) / f(x),    (2.2)
where f(·,·) is the joint density of (U_2, U_1), f(x) is the marginal density of U_1, and ρ is a function satisfying assumption A1 below.

Similarly to the Nadaraya–Watson estimator (see, for example, Rosenblatt 1991), our estimate of ϕ will be obtained by taking the ratio of wavelet estimates of g = ϕf and of f. One possible way to estimate a function in an infinite-dimensional vector space, from a finite set of data, is to project it onto a sequence of nested approximating subspaces. In wavelet-based methods, such a sequence is given by the multiresolution analysis … ⊆ V_{−1} ⊆ V_0 ⊆ V_1 ⊆ … generated by a scaling function φ (see Mallat 1989). Recall that V_j is the linear space spanned by the functions φ_{j,k}(x) = 2^{j/2} φ(2^j x − k), k ∈ Z. We shall restrict ourselves to orthogonal scaling functions, meaning that {φ_{0,k}, k ∈ Z} forms an orthonormal basis of V_0. Then the projector onto V_j is given by
P_{V_j} g = Σ_{k∈Z} ϑ_{j,k} φ_{j,k}.    (2.3)
In a similar manner, one estimates f by
P_{V_j} f = Σ_{k∈Z} α_{j,k} φ_{j,k},    (2.4)
where ϑ_{j,k} and α_{j,k} are the wavelet coefficients, which we discuss further below. The above procedure is linear, but there exist other nonlinear procedures based on soft or hard thresholding of the wavelet coefficients (a generic sketch of thresholding is given at the end of Section 3). It is known that there exists a wavelet function ψ such that the orthogonal complement of V_j in V_{j+1} is spanned by ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k), k ∈ Z. One can now decompose the function P_{V_J} g ∈ V_J, for some large J, as
P_{V_J} g = Σ_{k∈Z} ϑ_{j_0,k} φ_{j_0,k} + Σ_{j≥j_0} Σ_{k∈Z} ξ_{j,k} ψ_{j,k},    (2.5)

where the wavelet coefficients are defined as
ϑ_{j_0,k} = ∫ g(x) φ_{j_0,k}(x) dx = ∫∫ ρ(u_2) φ_{j_0,k}(x) F(du_2, dx)    (2.6)
and
ξ_{j,k} = ∫ g(x) ψ_{j,k}(x) dx = ∫∫ ρ(u_2) ψ_{j,k}(x) F(du_2, dx).    (2.7)

Now, to estimate the above coefficients, we need some notation and an estimator of the joint distribution function F. Strzalkowska-Kominiak and Stute (2012) considered
F(u_1, u_2) = P(U_1 ≤ u_1, U_2 ≤ u_2)
and derived an estimator of F(u_1, u_2), defined as
F_n(u_1, u_2) = n^{−1} Σ_{i=1}^{n} [δ_i I(U_{1i} + Ũ_{2i} ≥ T_n) / A_n(U_{1i} + Ũ_{2i})] I(U_{1i} ≤ u_1, Ũ_{2i} ≤ u_2),    (2.8)
where the nonparametric estimator A_n is given by
A_n(x) = ∫_{[x,∞)} G*_n(dy) / F_{1n}(y).

Here *nG and 1nF present the nonparametric product estimator of the conditional sub-distribution *G and the

distribution function of Z which defined respectively,

( ) ( ) ( ) ( )1

1 111

1 11 , ,i

n

n n i iiU t n i

F t with G x I U x ZnC U n =>

= − = ≤ ≤

∑∏

and

( ) ( )*

1

1 ,n

n ii

G y I Z yn =

= ≤∑

To estimate the above wavelet coefficients from a random sample $(U_{1i}, U_{2i}, \tilde{\delta}_i)$, $1 \le i \le n$, we employ the above-mentioned estimator of $F$ and define the following empirical wavelet coefficients:

$$\hat{\vartheta}_{j_0 k} = \iint \rho(u_2)\,\phi_{j_0 k}(u_1)\, F_n(du_1, du_2), \qquad (2.9)$$
$$\hat{\xi}_{jk} = \iint \rho(u_2)\,\psi_{jk}(u_1)\, F_n(du_1, du_2), \qquad (2.10)$$

or, equivalently,

$$\hat{\vartheta}_{j_0 k} = n^{-1}\sum_{i=1}^{n} \frac{\delta_i\,\rho(U_{2i})\, I(U_{1i}+U_{2i} \ge T_n)\,\phi_{j_0 k}(U_{1i})}{A_n(U_{1i}+U_{2i})}, \qquad
\hat{\xi}_{jk} = n^{-1}\sum_{i=1}^{n} \frac{\delta_i\,\rho(U_{2i})\, I(U_{1i}+U_{2i} \ge T_n)\,\psi_{jk}(U_{1i})}{A_n(U_{1i}+U_{2i})}.$$

Now the proposed linear estimator of $g$, for all $x \in \mathbb{R}$, is

$$\hat{g}(x) = \sum_{k \in \mathbb{Z}} \hat{\vartheta}_{Jk}\,\phi_{Jk}(x) = n^{-1}\sum_{k \in \mathbb{Z}}\sum_{i=1}^{n} \frac{\delta_i\,\rho(U_{2i})\, I(U_{1i}+U_{2i} \ge T_n)\,\phi_{Jk}(U_{1i})}{A_n(U_{1i}+U_{2i})}\,\phi_{Jk}(x).$$

From Vidakovic (1999, pp. 171-173), it follows that $K(x, y) = \sum_{k \in \mathbb{Z}} \phi(x-k)\,\phi(y-k)$ is a reproducing kernel of $V_0$. By self-similarity of multiresolution subspaces, $K_j(x, y) = 2^{j} K(2^{j}x, 2^{j}y)$ is a reproducing kernel of $V_j$. Thus, the estimator of the projection of $g$ on the space $V_J$ can be written as

$$\hat{g}(x) = n^{-1}\sum_{i=1}^{n} \frac{\delta_i\,\rho(U_{2i})\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x, U_{1i}),$$

where $h_n = 2^{-J}$. We are now in a position to construct our regression estimator in the form of a Nadaraya-Watson estimator:

$$\hat{\varphi}_n(x) = \frac{\hat{g}(x)}{\hat{f}_n(x)} = \frac{\displaystyle\sum_{i=1}^{n} \frac{\delta_i\,\rho(U_{2i})\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x, U_{1i})}{\displaystyle\sum_{i=1}^{n} \frac{\delta_i\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x, U_{1i})}. \qquad (2.11)$$
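To make the construction concrete, the following minimal Python sketch (added here for illustration, not part of the paper) evaluates an estimator of this ratio form on simulated data. It is only a sketch under simplifying assumptions: the Haar scaling function plays the role of the projection kernel, the censoring weights $\delta_i I(U_{1i}+U_{2i}\ge T_n)/A_n(U_{1i}+U_{2i})$ are collapsed into a generic weight array w (computing $A_n$ itself is omitted), and $\rho$ is taken to be the identity; all names are hypothetical.

import numpy as np

def haar_kernel(x, y, J):
    # K_J(x, y) = 2^J K(2^J x, 2^J y), with K(u, v) = sum_k phi(u-k) phi(v-k);
    # for the Haar scaling function phi = 1_[0,1), K(u, v) = 1 iff floor(u) == floor(v).
    return (2.0 ** J) * (np.floor(2.0 ** J * x) == np.floor(2.0 ** J * y))

def ratio_estimator(x, U1, U2, w, J):
    # Nadaraya-Watson-type ratio of projection-kernel estimates:
    # numerator estimates g(x) with rho(U2) = U2, denominator estimates f(x).
    Kxi = haar_kernel(x, U1, J)
    num = np.mean(w * U2 * Kxi)
    den = np.mean(w * Kxi)
    return num / den if den > 0 else np.nan

# toy illustration with uncensored data (w = 1): phi(x) = E(U2 | U1 = x)
rng = np.random.default_rng(0)
n = 2000
U1 = rng.uniform(0, 1, n)
U2 = np.sin(2 * np.pi * U1) + rng.normal(scale=0.2, size=n)
w = np.ones(n)   # placeholder for the censoring weights
print(ratio_estimator(0.3, U1, U2, w, J=4))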

3. Results

3.1. Assumptions

Before describing the main results, we formulate the following assumptions:

A1. There exists a constant $C > 0$ such that
$$\sup_{y \in \mathbb{R}} |\rho(y)| < C, \qquad \int_{-\infty}^{+\infty} |\rho(y)|\, dy \le C.$$

A2. The function $f(x)$ is bounded and uniformly continuous on $\mathbb{R}$.

3.2. Main results

Theorem 3.1 below establishes the uniform rate of almost sure convergence of the estimator $\hat{\varphi}$ in (2.11) based on consecutive survival data.

Theorem 3.1. Let assumptions A1 and A2 hold. Then, for every compact set $D \subset \mathbb{R}$,
$$\sup_{x \in D} |\hat{\varphi}(x) - \varphi(x)| = O(\eta_n)$$
almost surely, where
$$\eta_n = \left(\frac{\log n}{n\, h_n}\right)^{1/2}.$$

4. Proofs

In this section, $C$ denotes a generic constant that does not depend on $j$, $k$ or $n$. Its value may change from one term to another and may depend on $\phi$.

Proof of Theorem 3.1. The proof proceeds in several steps. In the first step we show that the convergence of $\hat{\varphi}$ depends on the convergence of $\hat{g}$ and $\hat{f}_n$. Indeed,

$$\sup_{x \in D}|\hat{\varphi}(x) - \varphi(x)| = \sup_{x \in D}\left|\frac{(\hat{g}(x) - g(x))\, f(x) - g(x)\,(\hat{f}_n(x) - f(x))}{f(x)\,\hat{f}_n(x)}\right| = \sup_{x \in D}\left|\frac{(\hat{g}(x) - g(x)) - \varphi(x)\,(\hat{f}_n(x) - f(x))}{\hat{f}_n(x)}\right|.$$

Under assumptions A1 and A2, it is easy to show that

$$\inf_{x \in D}|\hat{f}_n(x)| = \inf_{x \in D}|f(x)| + o(1) > 0 \quad \text{a.s.}, \qquad \sup_{x \in D}|\varphi(x)| < \infty.$$

Now, to prove the main theorem it suffices to show that

$$\sup_{x \in D}|\hat{g}(x) - g(x)| = O(\eta_n) \quad \text{a.s.}, \qquad \sup_{x \in D}|\hat{f}_n(x) - f(x)| = O(\eta_n) \quad \text{a.s.}$$

In the next step of the proof we establish the almost sure convergence of $\hat{g}$. To this end we introduce the truncated estimator

$$\hat{g}_\tau(x) = \frac{1}{n}\sum_{i=1}^{n} \frac{\delta_i\,\rho(U_{2i})\, I(|\rho(U_{2i})| \le \tau_n)\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x, U_{1i}),$$

and write

$$\sup_{x \in \mathbb{R}}|\hat{g}(x) - g(x)| \le \sup_{x \in \mathbb{R}}|\hat{g}(x) - \hat{g}_\tau(x)| + \sup_{x \in \mathbb{R}}|\hat{g}_\tau(x) - E\hat{g}_\tau(x)| + \sup_{x \in \mathbb{R}}|E\hat{g}_\tau(x) - E\hat{g}(x)| =: S_1 + S_2 + S_3.$$

Upper bound for $S_1$:

$$\sup_{x \in \mathbb{R}}|\hat{g}(x) - \hat{g}_\tau(x)| = \sup_{x \in \mathbb{R}}\left|\frac{1}{n}\sum_{i=1}^{n}\frac{\delta_i\,\rho(U_{2i})\, I(|\rho(U_{2i})| > \tau_n)\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x, U_{1i})\right|
\le \frac{C}{n\, h_n}\sum_{i=1}^{n}\frac{|\rho(U_{2i})|\, I(|\rho(U_{2i})| > \tau_n)}{A_n(U_{1i}+U_{2i})}.$$

Now $P(|\rho(U_{2i})| > \tau_n) \le \tau_n^{-\nu}\, E(|\rho(U_{2i})|^{\nu})$ and, since $\sum_{n=1}^{\infty} \tau_n^{-\nu} < \infty$, the Borel-Cantelli lemma yields $|\rho(U_{2i})| \le \tau_n$ almost surely for all sufficiently large $n$. Since $\tau_n$ is increasing, $|\rho(U_{2i})| \le \tau_n$ for all $i \le n$. Hence $\sup_{x \in \mathbb{R}}|\hat{g}(x) - \hat{g}_\tau(x)|$ is eventually zero almost surely.

It is also necessary to mention that the term $A_n(U_{1i}+U_{2i})$ is bounded away from zero. Indeed, from Lemma 4.3 in Strzalkowska-Kominiak and Stute (2010) we know that the process $|A/A_n|$ is bounded from above with probability tending to one, so that

$$\sup\left|\frac{A}{A_n}\right| \le C \quad \text{a.s.}, \qquad \text{i.e.} \qquad |A_n| \ge C^{-1}|A| \quad \text{a.s.}$$

Upper bound for $S_3$: Using the same argument as above for $A_n$, and based on assumption A1, we have

$$|E\hat{g}_\tau(x) - E\hat{g}(x)| = \left|E\left[\frac{1}{n}\sum_{i=1}^{n}\frac{\delta_i\,\rho(U_{2i})\, I(|\rho(U_{2i})| > \tau_n)\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x, U_{1i})\right]\right|$$
$$\le \frac{M_1\kappa_1}{h_n}\, E\big(|\rho(U_2)|\, I(|\rho(U_2)| > \tau_n)\big)\int K\!\left(\frac{x-u}{h_n}\right)du
\le M_1\kappa_1\,\tau_n^{1-\nu}\, E\big(|\rho(U_2)|^{\nu}\big).$$

Based on the fact that $\tau_n^{1-\nu}\,\eta_n^{-1} \to 0$ as $n \to \infty$, we obtain $S_3 = O(\eta_n)$. We now turn to finding an upper bound for $S_2$.

Upper bound for $S_2$: Since $D$ is a compact set, it can be covered by a finite number $L_n$ of intervals of length $l_n$, centred at points $x_j$, $j = 1, \ldots, L_n$. Denote each such interval by $I(x_j, l_n)$. One can find $c > 0$ satisfying $L_n = c\, l_n^{-1}$. Now, for any $x \in D$ there exists an interval $I(x_j, l_n)$ containing $x$; therefore $|x - x_j| \le l_n$. Hence,

$$\sup_{x \in D}|\hat{g}_\tau(x) - E\hat{g}_\tau(x)| \le \max_{1 \le j \le L_n}\sup_{x \in I(x_j, l_n)}|\hat{g}_\tau(x) - E\hat{g}_\tau(x)|$$
$$\le \max_{1 \le j \le L_n}\sup_{x \in I(x_j, l_n)}|\hat{g}_\tau(x) - \hat{g}_\tau(x_j)| + \max_{1 \le j \le L_n}|\hat{g}_\tau(x_j) - E\hat{g}_\tau(x_j)| + \max_{1 \le j \le L_n}\sup_{x \in I(x_j, l_n)}|E\hat{g}_\tau(x_j) - E\hat{g}_\tau(x)| =: Q_1 + Q_2 + Q_3.$$

So, we can write

$$|\hat{g}_\tau(x) - \hat{g}_\tau(x_j)| = \left|\frac{1}{n}\sum_{i=1}^{n}\frac{\delta_i\,\rho(U_{2i})\, I(|\rho(U_{2i})| \le \tau_n)\, I(U_{1i}+U_{2i} \ge T_n)}{A_n(U_{1i}+U_{2i})}\big(K_{h_n}(x, U_{1i}) - K_{h_n}(x_j, U_{1i})\big)\right|
\le \frac{\tau_n\,\kappa_2\,|x - x_j|}{h_n^{2}} \le \frac{\tau_n\,\kappa_2\, l_n}{h_n^{2}}.$$

By choosing $l_n = h_n^{3/2}\big(\log n / n\big)^{1/2}\tau_n^{-1}$, we then have

$$Q_1 = \max_{1 \le j \le L_n}\sup_{x \in I(x_j, l_n)}|\hat{g}_\tau(x) - \hat{g}_\tau(x_j)| = O(\eta_n) \quad \text{almost surely.}$$

The upper bound for $Q_3$ is obtained by the same argument as for $Q_1$. The main task is to show that $Q_2 = O(\eta_n)$ almost surely. Let $c^{*} > 0$ be a constant to be chosen later. Note that

$$P\big(Q_2 \ge c^{*}\varepsilon\big) \le \sum_{j=1}^{L_n} P\big(|\hat{g}_\tau(x_j) - E\hat{g}_\tau(x_j)| > c^{*}\varepsilon\big) \le L_n\,\sup_{x \in \mathbb{R}} P\big(|\hat{g}_\tau(x) - E\hat{g}_\tau(x)| > c^{*}\varepsilon\big), \qquad (12)$$

where $L_n$ is of the order $c\, h_n^{-3/2}\,\tau_n\,(n/\log n)^{1/2}$. Note that

$$\hat{g}_\tau(x_j) - E\hat{g}_\tau(x_j) = \frac{1}{n}\sum_{i=1}^{n} W_{ni},$$

where

$$W_{ni} = \frac{\delta_i\,\rho(U_{2i})\, I(|\rho(U_{2i})| \le \tau_n)\, I(U_{1i}+U_{2i} > T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x_j, U_{1i}) - E\left[\frac{\delta_i\,\rho(U_{2i})\, I(|\rho(U_{2i})| \le \tau_n)\, I(U_{1i}+U_{2i} > T_n)}{A_n(U_{1i}+U_{2i})}\, K_{h_n}(x_j, U_{1i})\right].$$

Theorem 4.1 (Bernstein's inequality). Suppose that $W_{n1}, \ldots, W_{nn}$ are independent random variables with mean zero and $|W_{ni}| \le C_0$ a.s. Let $\sigma_i^2 = \operatorname{Var}(W_{ni})$ and $\sigma_n^2 = \frac{1}{n}\sum_{i=1}^{n}\sigma_i^2$. Then

$$P\left(\left|\frac{1}{n}\sum_{i=1}^{n} W_{ni}\right| \ge \varepsilon\right) \le 2\exp\left(-\frac{n\,\varepsilon^2}{2\sigma_n^2 + 2C_0\,\varepsilon/3}\right).$$

Now, for Bernstein's inequality to be applicable we need $\tau_n\,\eta_n \to 0$. Treating the $W_{ni}$ as independent random variables in the above theorem, we clearly have $E(W_{ni}) = 0$. Moreover,

$$|W_{ni}| \le C\big(|K_{h_n}(x_j, U_{1i})| + E|K_{h_n}(x_j, U_{1i})|\big)\,\tau_n \le C\, h_n^{-1}\tau_n, \qquad
\operatorname{Var}(W_{ni}) \le C\, h_n^{-1}\int K^{2}(x, u)\, du \le C\, h_n^{-1}.$$

Therefore, using the above results and Bernstein's inequality, we obtain

$$\sum_{n=1}^{\infty} P\big(Q_2 \ge c^{*}\eta_n\big) \le \sum_{n=1}^{\infty} c\, L_n\exp\left(-\frac{n\,(c^{*})^2\eta_n^2}{2\big(C h_n^{-1} + C h_n^{-1}\tau_n\, c^{*}\eta_n/3\big)}\right)
\le \sum_{n=1}^{\infty} c\, L_n\exp\left(-\frac{(c^{*})^2}{C_1}\log n\right) = c\sum_{n=1}^{\infty} L_n\, n^{-(c^{*})^2/C_1}.$$

So, by choosing $c^{*}$ large enough, we have $\sum_{n=1}^{\infty} L_n\, n^{-(c^{*})^2/C_1} < \infty$, and the Borel-Cantelli lemma yields

$$Q_2 = \max_{1 \le j \le L_n}|\hat{g}_\tau(x_j) - E\hat{g}_\tau(x_j)| = O(\eta_n)$$

almost surely.

References

[1] Ahmadi, J., Doostparast, M., Parsian, A. (2012). Estimation with left-truncated and right censored data: A comparison study, Statistics and Probability Letters, 82, 1391-1400.
[2] Alvarez, J., Liang, H. and Rodriguez-Casal, A. (2010). Nonlinear wavelet estimator of the regression function under left-truncated dependent data, Journal of Nonparametric Statistics, 22, 319-344.
[3] Chaubey, C., Chesneau, C. and Shirazi, E. (2013). Wavelet-based estimation of regression function for dependent biased data under a given random design, Journal of Nonparametric Statistics, 25(1), 53-71.
[4] Mallat, S. (2009). A Wavelet Tour of Signal Processing, third edition. Elsevier/Academic Press, Amsterdam. The sparse way, with contributions from Gabriel Peyré.
[5] Park, J. (2004). Optimal global rate of convergence in nonparametric regression with left-truncated and right-censored data, Journal of Multivariate Analysis, 89, 70-86.
[6] Rosenblatt, M. (1956). A central limit theorem and strong mixing conditions, Proc. Nat. Acad. Sci., 4, 43-47.
[7] Shirazi, E., Doosti, H., Nirumand, H. A. and Hosseiniun, N. (2013). Nonparametric regression estimates with censored data based on block thresholding method, Journal of Statistical Planning and Inference, 143, 1150-1165.
[8] Strzalkowska-Kominiak, E. and Stute, W. (2010). The statistical analysis of consecutive survival data under serial dependence, Journal of Nonparametric Statistics, 22, 585-597.
[9] Strzalkowska-Kominiak, E. and Stute, W. (2012). Nonparametric regression for consecutive survival data, South African Statistical Journal, 46, 357-376.
[10] Vidakovic, B. (1999). Statistical Modeling by Wavelets. John Wiley & Sons, New York.


Wavelet-Based Quantile Density Estimation By Block Thresholding Method

Esmaeil Shirazi, Abdolsaeed Toomaj

Department of Statistics, Faculty of Science, Gonbad Kavous University, Gonbad Kavous, Iran.

Abstract:

Here we consider wavelet-based identification and estimation of a quantile density function via block thresholding methods and investigate their asymptotic convergence rates. We show that these estimators, based on block thresholding of empirical wavelet coefficients, achieve optimal convergence rates over a large range of Besov function classes, and in particular enjoy those rates without the extraneous logarithmic penalties that are usually suffered by term-by-term thresholding methods.

Keywords: block thresholding method, Quantile density estimation, Wavelets. 1 Introduction and problem statement The quantile function approach is a useful tool in statistical analysis. It has been used in

exploratory data analysis, applied statistics, reliability and survival analysis (see, for example, Reid (1981), Slud et al. (1984), Su and Wei (1993), Nair et al. (2008), Nair and Sankaran (2009) and Sankaran and Nair (2009)). For a unified study of this concept, one can refer to Parzen (1979), Jones (1992), Friemer et al. (1988), Gilchrist (2000) and Nair and Sankaran (2009). The concept of quantiles has been used by Peng and Fine (2007), Jeong and Fine (2009) and Sankaran et al. (2010) for modelling competing risks.

In classical statistics, most distributions are defined in terms of their cumulative distribution function (cdf) or probability density function (pdf). Some distributions do not have the cdf/pdf in an explicit form, but a closed form of the quantile function is available, for example the Generalised Lambda distribution (GLD) and the skew logistic distribution (Gilchrist, 2000). Karian and Dudewicz (2000) showed the significance of different Lambda distributions for modelling failure time data. Quantile measures are less influenced by extreme observations; hence the quantile function can also be regarded as an alternative to the distribution function for lifetime data from heavy-tailed distributions. Moreover, for distributions whose reliability measures do not have a closed or explicit form, the reliability characteristics can be represented through the quantile function.

Let $X$ be a continuous random variable with cumulative distribution function $F(x)$, density function $f(x)$ and hazard function $r(x)$. The quantile function of $X$ is defined as

$$Q(x) = F^{-1}(x) = \inf\{y \in \mathbb{R} :\ F(y) \ge x\}. \qquad (1)$$

It satisfies $F(Q(x)) = x$. Parzen (1979) and Jones (1992) defined $g(x) = Q'(x)$ as the quantile density function corresponding to the quantile function $Q(x)$. Differentiating (1), we get


$$g(x) = \frac{1}{f(Q(x))}, \qquad x \in [0, 1]. \qquad (2)$$

Note that the sum of two quantile density functions is again a quantile density function. This idea is useful in modelling data. Nair and Sankaran (2009) defined the hazard quantile function

$$R(x) = r(Q(x)) = \frac{f(Q(x))}{1 - F(Q(x))} = \frac{1}{(1-x)\, g(x)}, \qquad x \in (0, 1).$$

Hence $g(x)$ appears in the expression for the hazard quantile function, and it is therefore useful to study nonparametric estimators of this unknown quantile density function.
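As a simple worked example (added here for illustration, not from the paper), consider the unit-rate exponential distribution:

$$F(y) = 1 - e^{-y}, \qquad Q(x) = -\log(1-x), \qquad g(x) = Q'(x) = \frac{1}{1-x} = \frac{1}{f(Q(x))}, \qquad R(x) = \frac{1}{(1-x)\, g(x)} = 1,$$

so the constant hazard of the exponential distribution is recovered by the hazard quantile function.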

Recently, the estimation of the quantile density function has been studied in the literature. For example, kernel methods were used by Jones (1999) and Soni et al. (2012) to build nonparametric estimators of the quantile density; they proposed some smooth estimators and investigated their asymptotic properties. Chesneau et al. (2015) discussed nonparametric wavelet estimators of the quantile density function and proposed two kinds of projection estimators, a linear and a nonlinear one.

The objective of this paper is to propose block thresholding wavelet estimators of the quantile density over a large function class and to investigate their asymptotic convergence rates. We show that these estimators attain optimal and nearly optimal rates of convergence over a wide range of Besov function classes and, in particular, enjoy those rates without the extraneous logarithmic penalties that appear in Chesneau et al. (2015).

The paper is structured as follows. Section 2 presents additional assumptions on the model. We describe in Section 3 our wavelet-based framework and strategy. Results are given in Section 4. The proofs are gathered in Section 5.

2 Wavelets, Besov balls and estimators

We start this section by introducing the concept of Multiresolution Analysis (MRA) on $\mathbb{R}$, as described in Meyer (1992). Let $\phi$ be a scaling function and $\psi$ its associated wavelet, generating a basis of $L_2([0,1])$, and define $\phi_{i_0 j}(x) = 2^{i_0/2}\phi(2^{i_0}x - j)$ and $\psi_{ij}(x) = 2^{i/2}\psi(2^{i}x - j)$. We assume that the father and mother wavelets, $\phi(x)$ and $\psi(x)$, are bounded and compactly supported over $[0,1]$, that $\int\phi = 1$, and that the wavelets are $r$-regular. We call a wavelet $\psi$ $r$-regular if $\psi$ has $r$ vanishing moments and $r$ continuous derivatives. Given a square-integrable $g$, put $\alpha_{i_0 j} = \int g\,\phi_{i_0 j}$ and $\beta_{ij} = \int g\,\psi_{ij}$. The wavelet expansion of any $g \in L_2(\mathbb{R})$ is given by

$$g(x) = \sum_{j \in \mathbb{Z}} \alpha_{i_0 j}\,\phi_{i_0 j}(x) + \sum_{i \ge i_0}\sum_{j \in \mathbb{Z}} \beta_{ij}\,\psi_{ij}(x).$$

As is common in the wavelet literature, we investigate the asymptotic convergence rates of wavelet-based estimators over a large range of Besov function classes $B^{s}_{pq}$, $s > 0$, $1 \le p, q \le \infty$. The parameter $s$ measures the number of derivatives, where the existence of derivatives is required in an $L_p$-sense, whereas the parameter $q$ provides a further, finer gradation. The Besov spaces include, in particular, the well-known Sobolev and Hölder spaces of smooth functions $H^{m}$ and $C^{s}$ ($B^{m}_{22}$ and $B^{s}_{\infty\infty}$, respectively), but also less traditional spaces, such as the space of functions of bounded variation, sandwiched between $B^{1}_{1,1}$ and $B^{1}_{1,\infty}$. The latter functions are of statistical interest because they allow for better models of spatial inhomogeneity (e.g. Meyer (1992) and Donoho and Johnstone (1995)).

For a given $r$-regular mother wavelet $\psi$ with $r > s$, define the sequence norm of the wavelet coefficients of a function $g \in B^{s}_{pq}$ by

$$|g|_{B^{s}_{pq}} = \left(\sum_{j}|\alpha_{i_0 j}|^{p}\right)^{1/p} + \left[\sum_{i=i_0}^{\infty}\left(2^{i\sigma}\left(\sum_{j}|\beta_{ij}|^{p}\right)^{1/p}\right)^{q}\right]^{1/q}, \qquad (3)$$

where $\sigma = s + 1/2 - 1/p$. Meyer (1992) shows that the Besov function norm $\|g\|_{B^{s}_{pq}}$ is equivalent to the sequence norm $|g|_{B^{s}_{pq}}$ of the wavelet coefficients of $g$; therefore we use the sequence norm to compute the Besov norm $\|g\|_{B^{s}_{pq}}$ in the sequel. We also consider a subset of the Besov space $B^{s}_{pq}$ with $sp > 1$ and $p, q \in [1, \infty]$. The classes of functions considered in this paper are

$$F^{s}_{p,q}(M) = \big\{g :\ g \in B^{s}_{pq},\ \|g\|_{B^{s}_{pq}} \le M,\ \mathrm{supp}\, g \subseteq [0,1]\big\},$$

i.e., $F^{s}_{p,q}(M)$ is a set of functions with fixed compact support, bounded in the norm of one of the Besov spaces $B^{s}_{pq}$. Moreover, $sp > 1$ implies that $F^{s}_{p,q}(M)$ is a subset of the space of bounded continuous functions.

From Vidakovic (1999, pp. 171-173), it follows that $K(x, y) = \sum_{j}\phi(x-j)\,\phi(y-j)$ is a reproducing kernel of $V_0$. By self-similarity of multiresolution subspaces, $K_i(x, y) = 2^{i}K(2^{i}x, 2^{i}y)$ is a reproducing kernel of $V_i$. Thus, the projection of $g$ on the space $V_i$ is given by

$$\mathrm{Proj}_{V_i} g(x) = K_i g(x) = \int 2^{i}K(2^{i}x, 2^{i}y)\, g(y)\, dy.$$

The detail spaces $W_i$ are reproducing kernel spaces as well, and

$$\mathrm{Proj}_{W_i} g(x) = D_i g(x) = \int 2^{i}D(2^{i}x, 2^{i}y)\, g(y)\, dy,$$

where $D(x, y) = \sum_{j}\psi(x-j)\,\psi(y-j)$. Note that $D_i(x, y) = K_{i+1}(x, y) - K_i(x, y)$ and that there exists an integrable function $Q$ such that $|K(x, y)| \le Q(x-y)$ for all $x, y$, which implies that, for all integers $i$ and all $1 \le p \le \infty$,

$$\|K_i g\|_p \le \|Q\|_1\|g\|_p \qquad \text{and} \qquad \|D_i g\|_p \le \|Q\|_1\|g\|_p.$$

The above representation suggests the following estimators of $K_i g(x)$ and $D_i g(x)$:

$$\hat{K}_i(x) = \frac{1}{n}\sum_{m=1}^{n} Y_m\, K_i(x, X_m), \qquad \hat{D}_i(x) = \frac{1}{n}\sum_{m=1}^{n} Y_m\, D_i(x, X_m). \qquad (4)$$

The term-by-term hard-thresholded wavelet estimator of $g(x)$ (see Chesneau et al. (2015)) is

$$\tilde{g}^{H}(x) = \sum_{j \in \mathbb{Z}}\hat{\alpha}_{i_0 j}\,\phi_{i_0 j}(x) + \sum_{i \ge i_0}\sum_{j \in \mathbb{Z}}\hat{\beta}_{ij}\, I(|\hat{\beta}_{ij}| > \lambda)\,\psi_{ij}(x), \qquad (5)$$

where $\lambda$ is a threshold and the (population) wavelet coefficients are given by

$$\alpha_{i_0 j} = \int_{[0,1]} g(x)\,\phi_{i_0 j}(x)\, dx = \int_{[0,1]}\frac{1}{f(F^{-1}(x))}\,\phi_{i_0 j}(x)\, dx = \int_{[0,1]}\phi_{i_0 j}(F(x))\, dx,$$

and, similarly,

$$\beta_{ij} = \int_{[0,1]}\psi_{ij}(F(x))\, dx.$$

Since $F$ is unknown, we estimate it by the empirical distribution function

$$\hat{F}_n(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i < x), \qquad x \in [0, 1].$$

This leads to the following integral estimators of $\alpha_{i_0 j}$ and $\beta_{ij}$:

$$\hat{\alpha}_{i_0 j} = \int_{[0,1]}\phi_{i_0 j}(\hat{F}_n(x))\, dx, \qquad \hat{\beta}_{ij} = \int_{[0,1]}\psi_{ij}(\hat{F}_n(x))\, dx. \qquad (6)$$

Clearly, $\hat{\alpha}_{i_0 j}$ and $\hat{\beta}_{ij}$ are not unbiased estimators of $\alpha_{i_0 j}$ and $\beta_{ij}$. However, using the dominated convergence theorem, one can prove that they are asymptotically unbiased.
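As an illustration of (6), the following minimal Python sketch (added here, not part of the paper) evaluates the empirical coefficients numerically on a grid, using the Haar mother wavelet as a concrete choice of ψ; all function and variable names are hypothetical.

import numpy as np

def haar_psi(x):
    # Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere
    return np.where((x >= 0) & (x < 0.5), 1.0, np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def psi_ij(x, i, j):
    return 2.0 ** (i / 2.0) * haar_psi(2.0 ** i * x - j)

def empirical_cdf(sample, x):
    # F_n(x) = n^{-1} sum_i I(X_i < x), evaluated at the points in x
    return np.mean(sample[None, :] < x[:, None], axis=1)

def beta_hat(sample, i, j, grid_size=10000):
    # beta_hat_{ij} = integral_0^1 psi_{ij}(F_n(x)) dx, approximated on a uniform grid
    x = (np.arange(grid_size) + 0.5) / grid_size
    return np.mean(psi_ij(empirical_cdf(sample, x), i, j))

rng = np.random.default_rng(1)
sample = rng.beta(2.0, 5.0, size=500)   # hypothetical data supported on [0, 1]
print(beta_hat(sample, i=3, j=2))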

The term-by-term thresholded estimators (5), which are also considered in Chesneau et al. (2008), do not attain the optimal convergence rate $n^{-2s/(2s+1)}$, but only the rate $(\log n / n)^{2s/(2s+1)}$, which involves a logarithmic penalty. The reason is that a coefficient is more likely to contain signal if neighbouring coefficients do too; incorporating information on neighbouring coefficients therefore improves the estimation accuracy, whereas for a term-by-term thresholded estimator other coefficients have no influence on the treatment of a particular coefficient.

A block thresholding estimator thresholds empirical wavelet coefficients in groups rather than individually. It is constructed as follows. At each resolution level $i$, the integers $j$ are divided into consecutive, nonoverlapping blocks of length $l$, say

$$\Gamma_{ik} := \{j :\ (k-1)l + 1 \le j \le kl\}, \qquad -\infty < k < \infty.$$

Within block $\Gamma_{ik}$, the average estimated squared bias $\hat{A}_{ik} = l^{-1}\sum_{j \in B(k)}\hat{\beta}_{ij}^{2}$ is compared with the threshold; here $B(k)$ refers to the set of indices $j$ in block $\Gamma_{ik}$. If the average squared bias is larger than the threshold, all coefficients in the block are kept; otherwise, all coefficients in the block are discarded. For additional details, see Cai (2002). A small computational sketch of this rule is given below.
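The following minimal Python sketch (added for illustration, not from the paper) applies the keep-or-kill block rule to an array of empirical detail coefficients at one resolution level; the block length, threshold constant and all names are hypothetical choices.

import numpy as np

def block_threshold(beta_hat, block_len, thresh):
    # Keep a whole block of coefficients if its average squared size exceeds
    # the threshold, otherwise set the whole block to zero.
    out = np.zeros_like(beta_hat)
    for start in range(0, len(beta_hat), block_len):
        block = beta_hat[start:start + block_len]
        A_ik = np.mean(block ** 2)          # average estimated squared bias in the block
        if A_ik > thresh:
            out[start:start + block_len] = block
    return out

rng = np.random.default_rng(2)
beta_hat = np.concatenate([rng.normal(0.5, 0.02, 8),     # a block carrying signal
                           rng.normal(0.0, 0.02, 24)])   # blocks of pure noise
n = 500
print(block_threshold(beta_hat, block_len=8, thresh=1.0 / n))   # threshold c/n with c = 1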

Let $A_{ik} = l^{-1}\sum_{j \in B(k)}\beta_{ij}^{2}$, estimated by $\hat{A}_{ik}$. The block thresholding wavelet estimator of $g(x)$ then becomes

$$\hat{g}_\tau(x) = \sum_{j \in \mathbb{Z}}\hat{\alpha}_{i_0 j}\,\phi_{i_0 j}(x) + \sum_{i=i_0}^{R}\sum_{k}\sum_{j \in B(k)}\hat{\beta}_{ij}\, I(\hat{A}_{ik} > c\, n^{-1})\,\psi_{ij}(x) \qquad (7)$$
$$= \hat{K}_{i_0}(x) + \sum_{i=i_0}^{R}\sum_{k}\hat{D}_{ik}(x)\, I_{J_{ik}}(x)\, I(\hat{A}_{ik} > c\, n^{-1}).$$

The smoothing parameter $R$ corresponds to the highest detail resolution level, $l$ is the block length and $c$ is a threshold constant. Notice that $\hat{D}_{ik}(x)$ is an estimate of $D_{ik}g(x) = \sum_{j \in B(k)}\beta_{ij}\,\psi_{ij}(x)$, and $J_{ik} := \bigcup_{j \in B(k)}\{\mathrm{supp}\,\psi_{ij}\} = \bigcup_{j \in B(k)}\{x :\ \psi_{ij}(x) \ne 0\}$.

3 Asymptotic results

Our main theorem shows that the wavelet-based estimators, based on block thresholding of the empirical wavelet coefficients, attain optimal and nearly optimal convergence rates over a large range of Besov function classes.

Theorem 3.1. If the above conditions hold and if $\hat{g}_\tau$ is defined by (7), with block length $l = \log n$ and $2^{R} \simeq n\,(\log n)^{-2}$, then there exists a positive constant $C$ such that, for all $M \in (0, \infty)$, $1/p < s < r$ and $q \in [1, \infty]$:

1. if $p \in [2, \infty]$,
$$\sup_{g \in F^{s}_{p,q}(M)} E\!\int(\hat{g}_\tau - g)^{2} \le C\, n^{-2s/(2s+1)};$$

2. if $p \in [1, 2)$,
$$\sup_{g \in F^{s}_{p,q}(M)} E\!\int(\hat{g}_\tau - g)^{2} \le C\,(\log n)^{(2-p)/(2s+1)}\, n^{-2s/(2s+1)}.$$

3.1 Auxiliary results

In this subsection we provide some asymptotic results that are needed in the proof of the theorem. The proof of Theorem 3.1 relies on Propositions 3.1 and 3.2 of Chesneau (2010), recalled below. They show that the estimators defined by (6) satisfy a standard moment inequality and a specific concentration inequality. Before presenting these inequalities, the following lemma gives an upper bound for $|\hat{\beta}_{ij} - \beta_{ij}|$.

Lemma 3.1. Suppose that the assumptions of Theorem 3.1 are satisfied. Then, for any $i \in \{i_0, \ldots, R+1\}$ and any $j \in \{0, \ldots, 2^{i}-1\}$, the estimator $\hat{\beta}_{ij}$ defined by (6) satisfies

$$|\hat{\beta}_{ij} - \beta_{ij}| \le 2^{3i/2} K\int_{[0,1]}|\hat{F}_n(x) - F(x)|\, dx \le 2^{3i/2} K\sup_{x \in [0,1]}|\hat{F}_n(x) - F(x)|,$$

where $K = \sup_{x \in [0,1]}|\psi'(x)|$ and $\psi'_{ij}(x) = 2^{3i/2}\psi'(2^{i}x - j)$.

Proposition 3.1. Let $p \ge 2$. Suppose that the assumptions of Theorem 3.1 are satisfied. Then there exists a constant $C > 0$ such that, for any $j \ge j_0$ and $n$ large enough, the estimators defined by (6) satisfy

$$E\big(|\hat{\beta}_{jk} - \beta_{jk}|^{p}\big) \le C\, n^{-p/2}. \qquad (8)$$

The inequality (8) also holds for $\hat{\alpha}_{jk}$, replacing $\hat{\beta}_{jk}$ by $\hat{\alpha}_{jk}$ and $\beta_{jk}$ by $\alpha_{jk}$.

Proposition 3.2. Let $p \ge 2$. Under the assumptions of Theorem 3.1, there exists a constant $c > 0$ such that, for any $j \ge j_0$ and $n$ large enough, the estimators defined by (6) satisfy the concentration inequality

$$P\left(\left(\sum_{j \in B(k)}|\hat{\beta}_{ij} - \beta_{ij}|^{p}\right)^{1/p} \ge c\, n^{-1/2}(\log n)^{1/2}\right) \le C\, n^{-p}, \qquad (9)$$

for some constant $C > 0$.

4 Proof

In this section, $C$ represents a constant which may differ from one term to another. We suppose that $n$ is large enough.

Proof of Theorem 3.1. For the sake of simplicity, we set $\hat{\theta}_{ij} = \hat{\beta}_{ij} - \beta_{ij}$. Applying the Minkowski inequality and an elementary convexity inequality, we have

$$E\|\hat{g}_\tau - g\|_2^2 \le 4\,(T_1 + T_2 + T_3 + T_4),$$

where

$$T_1 = E\|\hat{K}_{i_0} - K_{i_0}g\|_2^2,$$
$$T_2 = E\Big\|\sum_{i=i_0}^{i_s}\sum_{k}\big[\hat{D}_{ik}\, I_{J_{ik}}\, I(\hat{A}_{ik} > c\, n^{-1}) - D_{ik}g\big]\Big\|_2^2,$$
$$T_3 = E\Big\|\sum_{i=i_s+1}^{R}\sum_{k}\big[\hat{D}_{ik}\, I_{J_{ik}}\, I(\hat{A}_{ik} > c\, n^{-1}) - D_{ik}g\big]\Big\|_2^2,$$
$$T_4 = E\Big\|\sum_{i=R+1}^{\infty} D_i g\Big\|_2^2.$$

In order to prove the theorem, it suffices to bound each of the terms $T_1$, $T_2$, $T_3$ and $T_4$ separately.

Lemma 4.1. Let $u \in \mathbb{R}^{n}$ and $\|u\|_p = \big(\sum_{i}|u_i|^{p}\big)^{1/p}$, for $0 < p_1 \le p_2 \le \infty$. Then the following inequalities hold:

$$\|u\|_{p_2} \le \|u\|_{p_1} \le n^{\frac{1}{p_1}-\frac{1}{p_2}}\|u\|_{p_2}.$$

Lemma 4.2. Let $D_i g$ and $\hat{D}_i$ be defined as above. Then

$$E\Big\|\sum_{i=I}^{J}(\hat{D}_i - D_i g)\Big\|_2^2 \le \left[\sum_{i=I}^{J}\Big(E\|\hat{D}_i - D_i g\|_2^2\Big)^{1/2}\right]^{2}.$$

Lemma 4.3. If $\int_{J_{ik}}(D_{ik}g(x))^{2}dx \le \tfrac{1}{2}C\, n^{-1}l$, then

$$\left\{\int_{J_{ik}}\hat{D}_{ik}^{2}(x)\, dx \ge \tfrac{1}{2}c\, n^{-1}l\right\} \subseteq \left\{\int_{J_{ik}}\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx \ge 0.08\, c\, n^{-1}l\right\}.$$

If $\int_{J_{ik}}(D_{ik}g(x))^{2}dx > \tfrac{1}{2}C\, n^{-1}l$, then

$$\left\{\int_{J_{ik}}\hat{D}_{ik}^{2}(x)\, dx \le \tfrac{1}{2}c\, n^{-1}l\right\} \subseteq \left\{\int_{J_{ik}}\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx \ge 0.08\, c\, n^{-1}l\right\}.$$

The upper bound for $T_1$: It follows from Lemma 4.1 and Proposition 3.1 that

$$T_1 = E\Big\|\sum_{j=0}^{2^{i_0}-1}(\hat{\alpha}_{i_0 j} - \alpha_{i_0 j})\phi_{i_0 j}\Big\|_2^2 = \sum_{j=0}^{2^{i_0}-1} E\big(\hat{\alpha}_{i_0 j} - \alpha_{i_0 j}\big)^{2} \le C\, 2^{i_0}\, n^{-1}.$$

Based on our choice $i_0 = 0$, we have $T_1 = O(n^{-1})$.

The upper bound for $T_4$: Using the Minkowski inequality, the wavelet expansion of $g$, the inclusion $B^{s}_{p_1 q} \subseteq B^{s}_{p_2 q}$ for $p_1 \ge p_2$, and the fact that $sp > 1$, we find

$$T_4 \le \left[\sum_{i=R+1}^{\infty}\Big(\sum_{j}\beta_{ij}^{2}\Big)^{1/2}\right]^{2} \le C\, M^{2}\left[\sum_{i=R+1}^{\infty} 2^{-i\sigma}\right]^{2} \le C\, 2^{-2R\sigma} = C\, 2^{-2R(s+1/2-1/p)}.$$

On the basis of our choice of $R$, with $2^{R} \simeq n\,(\log n)^{-2}$, and since $2\sigma = 2(s + 1/2 - 1/p) > 2s/(2s+1)$, we obtain $T_4 = O\big(n^{-2s/(2s+1)}\big)$.

The upper bound for $T_2$: Applying the Minkowski inequality and an elementary convexity inequality, we have

$$T_2 \le \sum_{i=i_0}^{i_s}\left[\left(E\!\int\Big(\sum_{k}\hat{D}_{ik}(x)\, I_{J_{ik}}(x)\, I(\hat{A}_{ik} > c\, n^{-1}) - D_i g(x)\Big)^{2}dx\right)^{1/2}\right]^{2}.$$

Writing $D_i g(x) = \sum_{j}\beta_{ij}\psi_{ij}(x) = \sum_{k}\sum_{j \in B(k)}\beta_{ij}\psi_{ij}(x) =: \sum_{k}D_{ik}g(x)$, we have for the term in brackets

$$E\!\int\Big(\sum_{k}\hat{D}_{ik}(x)\, I_{J_{ik}}(x)\, I(\hat{A}_{ik} > c\, n^{-1}) - D_i g(x)\Big)^{2}dx$$
$$\le 3\,E\!\int\Big(\sum_{k}\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)\, I_{J_{ik}}(x)\, I(\hat{A}_{ik} > c\, n^{-1})\Big)^{2}dx
+ 3\,E\!\int\Big(\sum_{k}D_{ik}g(x)\, I(A_{ik} \le 2c\, n^{-1})\, I(\hat{A}_{ik} \le c\, n^{-1})\Big)^{2}dx \qquad (10)$$
$$\quad + 3\,E\!\int\Big(\sum_{k}D_{ik}g(x)\, I(A_{ik} > 2c\, n^{-1})\, I(\hat{A}_{ik} \le c\, n^{-1})\Big)^{2}dx$$
$$\le 3\Big[E\sum_{k}\!\int\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx
+ E\sum_{k}\!\int_{J_{ik}}(D_{ik}g(x))^{2}\, I(A_{ik} \le 2c\, n^{-1})\, dx
+ E\sum_{k}\!\int_{J_{ik}}(D_{ik}g(x))^{2}\, I(A_{ik} > 2c\, n^{-1})\, I(\hat{A}_{ik} \le c\, n^{-1})\, dx\Big]
=: 3\,(T_{21} + T_{22} + T_{23}),$$

where the last inequality follows from the orthogonality of $\psi$ and $\hat{D}_{ik}(x) = \sum_{j \in B(k)}\hat{\beta}_{ij}\psi_{ij}(x)$.

The upper bound for $T_{21}$: As for the term $T_1$, using Proposition 3.1 we get

$$T_{21} = \sum_{j} E(\hat{\beta}_{ij} - \beta_{ij})^{2} \le C\, 2^{i}\, n^{-1}. \qquad (11)$$

The upper bound for $T_{22}$: By the orthogonality of the wavelets $\psi$ we have

$$T_{22} = E\sum_{k}\Big(\sum_{j \in B(k)}\beta_{ij}^{2}\Big)\, I(A_{ik} \le 2c\, n^{-1}) = E\sum_{k} l\, A_{ik}\, I(A_{ik} \le 2c\, n^{-1}) \le 2c\, l\, n^{-1}\sum_{k} 1 \le C\, 2^{i}\, n^{-1}, \qquad (12)$$

where the last inequality uses the fact that there are at most $2^{i}l^{-1}$ terms in the sum over $k$ for each $i$. Thus we have $T_{22} \le C\, 2^{i}\, n^{-1}$.

The upper bound for $T_{23}$: From Lemma 4.3 and Proposition 3.2 we have

$$T_{23} \le E\sum_{k}\int_{J_{ik}}(D_{ik}g(x))^{2}dx\ I\!\left(\int_{J_{ik}}\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx \ge 0.16\, c\, n^{-1}l\right)$$
$$\le \sum_{k}\Big(\sum_{j \in B(k)}\beta_{ij}^{2}\Big)\, P\!\left(l^{-1}\sum_{j \in B(k)}(\hat{\beta}_{ij} - \beta_{ij})^{2} \ge 0.16\, c\, n^{-1}\right)
\le C\, n^{-1}\sum_{k}\sum_{j \in B(k)}\beta_{ij}^{2} \le C\, n^{-1}. \qquad (13)$$

Now, combining (11), (12) and (13), we have

$$T_2 \le \sum_{i=i_0}^{i_s}\Big[\big(C\, 2^{i}n^{-1} + C\, 2^{i}n^{-1} + C\, n^{-1}\big)^{1/2}\Big]^{2}
\le C\sum_{i=i_0}^{i_s}\Big(\big(2^{i}n^{-1}\big)^{1/2} + n^{-1/2}\Big)^{2}
\le C\big(2^{i_s}n^{-1} + i_s^{2}\, n^{-1}\big).$$

Now, if $i_s$ satisfies $2^{i_s} \simeq n^{1/(2s+1)}$, then $T_2 \le C\, n^{-2s/(2s+1)}$. If $i_s$ satisfies $2^{i_s} \simeq \big(n\,(\log n)^{-(2-p)}\big)^{1/(2s+1)}$, then $T_2 \le C\,(\log n)^{(2-p)/(2s+1)}\, n^{-2s/(2s+1)}$.

The upper bound for $T_3$: From Lemma 4.2, we have

$$T_3 \le \sum_{i=i_s+1}^{R}\left[\left(E\!\int\Big(\sum_{k}\hat{D}_{ik}(x)\, I_{J_{ik}}(x)\, I(\hat{A}_{ik} > c\, n^{-1}) - D_i g(x)\Big)^{2}dx\right)^{1/2}\right]^{2}.$$

Applying the same argument as for $T_2$, the term in brackets can be bounded by

$$E\!\int\Big(\sum_{k}\hat{D}_{ik}(x)\, I_{J_{ik}}(x)\, I(\hat{A}_{ik} > c\, n^{-1}) - D_i g(x)\Big)^{2}dx$$
$$\le C\Big[E\sum_{k}\!\int\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}\, I(A_{ik} > 2^{-1}c\, n^{-1})\, I(\hat{A}_{ik} > c\, n^{-1})\, dx
+ E\sum_{k}\!\int\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}\, I(A_{ik} \le 2^{-1}c\, n^{-1})\, I(\hat{A}_{ik} > c\, n^{-1})\, dx$$
$$\quad + E\sum_{k}\!\int(D_{ik}g(x))^{2}\, I(A_{ik} \le 2c\, n^{-1})\, I(\hat{A}_{ik} \le c\, n^{-1})\, dx
+ E\sum_{k}\!\int(D_{ik}g(x))^{2}\, I(A_{ik} > 2c\, n^{-1})\, I(\hat{A}_{ik} \le c\, n^{-1})\, dx\Big]
=: C\,(T_{31} + T_{32} + T_{33} + T_{34}). \qquad (14)$$

The upper bound for $T_{31}$: Using the facts that $2c^{-1}n\, A_{ik} \ge 1$ on the event $\{A_{ik} > 2^{-1}c\, n^{-1}\}$ and that $E(\hat{\beta}_{ij} - \beta_{ij})^{2} \le C\, n^{-1}$ for all $i$ and $j$, we have

$$T_{31} \le \frac{2n}{c}\sum_{k} A_{ik}\, E\!\int\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx
\le C\, n\, l^{-1}\Big(\sum_{j}\beta_{ij}^{2}\Big)\,\max_{k}\sum_{j \in B(k)} E(\hat{\beta}_{ij} - \beta_{ij})^{2}
\le C\, n\, l^{-1}\, l\, n^{-1}\sum_{j}\beta_{ij}^{2} = C\sum_{j}\beta_{ij}^{2}.$$

When $p \ge 2$, from (3) we have $\sum_{j}\beta_{ij}^{2} \le C\, 2^{-2is}$. Hence

$$T_{31} \le C\, 2^{-2is}. \qquad (15)$$

The upper bound for $T_{33}$: By the same arguments as for $T_{31}$, we have

$$T_{33} \le \sum_{k}\int(D_{ik}g(x))^{2}dx \le \sum_{j}\beta_{ij}^{2} \le C\, 2^{-2is}. \qquad (16)$$

; , we have

( ) 2211231 ))(2>()()(ˆ

p

ik

p

kikikik

kAnlnCncAdxIxgDxDET −− ∑∫∑ ≤−≤

.2

2

)(

21

212

2

)(

121

p

ijkBjk

ppp

ijkBjk

p

nCllCln

≤ ∑∑∑∑

+−−

−+−ββ

By using the result noted for 4T , when 2<1 p≤ , we have pii PPPP .2. ββ ≤ , thus

pijkBj

p

ijkBjββ ∑∑ ∈∈

≤)(

22)(

)( . Hence,

,221

21

21

21

31pip

ppp

ijj

pp

MnClnClT σβ −+−−+−−≤≤ ∑ (17)

As to the term 33T , for 2<p , noticing that 12 11 ≥−−ikAcn , we have

211112

33 ).(2)2())((p

ikikk

ikikk

AcnlAcnAdxIxgDT−−−− ∑∫∑ ≤≤≤

,2= 21

21

221 pip

ppp

ikk

p

MnClACln σ−+−−+−≤∑ (18)

where the last step follows as for $T_{31}$.

The upper bound for $T_{32}$: Suppose that the assumptions of Theorem 3.1 hold. Using Lemma 4.3, we have

$$T_{32} \le E\sum_{k}\int\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx\ \times\ I\!\left(\int_{J_{ik}}\big(\hat{D}_{ik}(x) - D_{ik}g(x)\big)^{2}dx \ge 0.16\, c\, n^{-1}l\right);$$

then the Cauchy-Schwarz inequality together with Propositions 3.1 and 3.2 gives

$$T_{32} \le \sum_{k}\left(E\Big(\sum_{j \in B(k)}(\hat{\beta}_{ij} - \beta_{ij})^{2}\Big)^{2}\right)^{1/2}\left[P\!\left(l^{-1}\sum_{j \in B(k)}(\hat{\beta}_{ij} - \beta_{ij})^{2} \ge 0.16\, c\, n^{-1}\right)\right]^{1/2}
\le C\sum_{k} n^{-1}\, n^{-1} \le C\, 2^{i}\, n^{-2} \le C\, n^{-1}.$$

(19)

The upper bound for $T_{34}$: The treatment of $T_{34}$ is the same as that of $T_{23}$; the only difference between them is the range of $i$. Since the argument for $T_{32}$ holds for all $i \le R$, we also have $T_{34} = O(n^{-1})$.

Combining these four terms, we have

$$T_3 \le \sum_{i=i_s+1}^{R}\Big[\big(C\, 2^{-2is} + C\, l^{1-p/2}n^{p/2-1}2^{-i\sigma p} + C\, n^{-1}\big)^{1/2}\Big]^{2}
\le C\sum_{i=i_s+1}^{R}\Big(2^{-is} + n^{-1/2}\Big)^{2}
\le C\big(2^{-2i_s s} + R^{2}\, n^{-1}\big) \le C\, n^{-2s/(2s+1)}.$$

Finally, combining these four bounds completes the proof of Theorem 3.1.

References

1. Ahmad, I.A. (1995). On multivariate kernel estimation for samples from weighted distributions, Statistics and Probability Letters, 22, 121-129.

2. Antoniadis, A. (1997). Wavelets in statistics: a review (with discussion). Journal of the Italian Statistical Society, Series B, 6, 97-144.

3. Antoniadis, A., Bigot, J. and Gijbels, I. (2007). Penalized wavelet monotone regression. Statistics and Probability Letters, 77, 1608-1621.

4. Antoniadis, A. and Pham, D. T. (1998). Wavelet regression for random or irregular design. Computational Statistics and Data Analysis, 28, 353-369.

5. Burman, P. (1991). Regression Function Estimation from Dependent Observations. Journal of Multivariate Analysis 36, 263-279.

6. Babu, G. J. and Chaubey, Y.(2006). Smooth estimation of a distribution and density function on a hypercube using Bernstein polynomials for dependent random vectors, Statistics and Probability Letters, 76, Issue 9, 959-969.

7. Chesneau, C. (2010). Wavelet block thresholding for density estimation in the presence of bias. Journal of the Korean Statistical Society, 39, 43-53.

8. Chesneau, C.and Shirazi, E. (2011). Nonparametric wavelet regression based on biased data. Submitted.

9. Cohen, A., Daubechies, I., Jawerth, B. and Vial, P. (1993). Wavelets on the interval and fast wavelet transforms, Applied and Computational Harmonic Analysis, 24, 1, 54-81.

10. Cristóbal, J.A. and Alcalá, J.T. (2000). Nonparametric regression estimators for length biased data. J. Statist. Plann. Inference, 89, 145-168.

11. Cristóbal, J.A. and Alcalá, J.T. (2001). An overview of nonparametric contributions to the problem of functional estimation from biased data. Test, 10(2), pp. 309-332.

12. Cristòbal, J.A., Ojeda, J.L. and Alcalà, J.T. (2004). Confidence bands in nonparametric regression with length biased data. Ann. Inst. Statist. Math., 56, 3, pp. 475-496.

13. Delyon, B. and Juditsky, A. (1996). On minimax wavelet estimators, Applied Computational Harmonic Analysis, 3, 215-228.

14. Donoho, D., Johnstone, I., Kerkyacharian, G., and Picard, D. (1996). Density estimation by


wavelet thresholding. Annals of Statistics, 24, 2, 508-539. 15. Efromovich, S. (2007). Adaptive estimation of error density in nonparametric regression with

small sample size, Journal of Statistical Planning and Inference, 137, 2, 363-378. 16. Fujii, T. and Konishi, S. (2006). Nonlinear regression modeling via regularized wavelets and

smoothing parameter selection, Journal of Multivariate Analysis, 97, 9, 2023-2033. 17. Hall, P. and Heyde, C. C. (1980). Martingale Limit Theory and its Application. Academic

Press, New York. 18. Härdle, W., Kerkyacharian, G., Picard, D. and Tsybakov, A. (1998). Wavelet, Approximation

and Statistical Applications, Lectures Notes in Statistics, New York 129, Springer Verlag. 19. Liebscher, E. (1996). Strong convergence of sums of a-mixing random variables with

applications to density estimation., Stochastic Processes and their Applics., 65, 69- 80. 20. Mallat, S. (2009). A wavelet tour of signal processing. Elsevier/ Academic Press, Amsterdam,

third edition. The sparse way, With contributions from Gabriel Peyré. 21. Masry, M.(2000). Wavelet-Based estimation of multivariate regression functions in besov

spaces. Journal of Nonparametric Statistics, 12: 2, 283-308. 22. Meyer, Y. (1992). Wavelets and Operators. Cambridge University Press, Cambridge. 23. Patil, G.P. and Taillie, C. (1989). Probing encountered data, meta analysis and weighted

distribution methods. In Statistical data analysis and inference (Neuchâtel, 1989), pp. 317-345. North-Holland, Amsterdam.

24. Petrov, V.V. (1995). Limit Theorems of Probability Theory: Sequences of Independent Random Variables. Oxford: Clarendon Press.

25. Ogden, T. and Parzen, E. (1996). Data Dependent Wavelet Thresholding in Nonparametric Regression with Change-point Applications, Computational Statistics and Data Analysis, 22, 53-70.

26. Ramirez, P. and Vidakovic, B. (2010). Wavelet density estimation for stratified size-biased sample. Journal of Statistical Planning and Inference, 140, 2, 419-432.

27. Rio, E. (1995). The functional law of the iterated logarithm for stationary strongly mixing sequences. Annals Prob., 23, 1188- 1203.

28. Rosenthal, H.P. (1970). On the subspaces of pL ( 2≥p ) spanned by sequences of independent random variables. Israel Journal of Mathematics, 8, 273-303.

29. Rosenblatt, M. (1956). A central limit theorem and strong mixing conditions, Proc. Nat. Acad. Sci., 4, 43-47.

30. Sköld, M. (1999). Kernel regression in the presence of size-bias, Journal of Nonparametric Statistics, 12, 41-51.

31. Vidakovic, B. (1999). Statistical Modeling by Wavelets. John Wiley & Sons, Inc., New York, 384 pp.

32. Wieczorek, B. and Ziegler, K. (2010). On optimal estimation of a non-smooth mode in a nonparametric regression model with α -mixing errors, Journal of Statistical Planning and Inference, 140, 2, 406-418.

33. Wu, C. O. (2000). Local polynomial regression with selection biased data. Statist. Sinica, 10, 3, 789-817.

34. Zhang, S., Wong, M. Y. and Zheng, Z. (2002). Wavelet Threshold Estimation of a Regression Function with Random Design, Journal of Multivariate Analysis, 80, 2, 256-284.

35. Yoshihara, K. and Kanagawa, S. (2009). Change-point problems in nonlinear regression estimation with dependent observations, Nonlinear Analysis: Theory, Methods and Applications, 71, 12, 15, 2152-2163.


Two new fuzzy process capability indices for asymmetric tolerance interval

Zainab Abbasi Ganji, Bahram Sadeghpour Gildeh*

Department of Statistics, Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad, Iran

* Corresponding Author’s E-mail: sadeghpour@ um.ac.ir

Abstract
Process capability indices are used to evaluate the performance of a manufacturing process. When the specification limits and the target value are not precise, the traditional methods for assessing process capability cannot be used. For processes with asymmetric tolerance intervals, fuzzy process capability indices such as those of Kaya and Kahraman have been introduced; in some cases, however, these indices may fail to reflect the process performance. In this paper we therefore propose two new fuzzy indices for the case where the specification limits and the target value are fuzzy while the data are crisp, in order to overcome this problem. An application example demonstrates the effectiveness and performance of the proposed indices.
Keywords: Normal distribution, Process capability index, Asymmetric tolerance interval, Fuzzy logic.

1. INTRODUCTION Process capability indices measure how much of the production process is in accordance with the requirements of the intended construction engineers or customers. A tolerance interval for a product characteristic X consists of lower and upper limits, LSL and USL, together with a target value T somewhere between these limits.

A tolerance interval is symmetric if the target value is the midpoint of the tolerance limits. Although the symmetric cases are by far the more common, there are sufficiently many instances in which the target value is not the midpoint of the tolerance limits. In these cases, the tolerance interval is called asymmetric.

Several indices have been introduced to handle processes with asymmetric tolerances, including superstructure indices of the form $C(u, v)$ (see [2], [3], [5], [6], [8], [9] and [10]).

Suppose the product characteristic $X$ has a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then two well-known indices widely used to measure the capability of processes with asymmetric tolerances are

$$C_{pk}'' = \frac{d^{*} - A^{*}}{3\sigma}, \qquad C_{pmk}'' = \frac{d^{*} - A^{*}}{3\sqrt{\sigma^{2} + A^{2}}}, \qquad (1)$$

where

$$A = \max\left\{\frac{d(\mu - T)}{D_u},\ \frac{d(T - \mu)}{D_l}\right\}, \qquad A^{*} = \max\left\{\frac{d^{*}(\mu - T)}{D_u},\ \frac{d^{*}(T - \mu)}{D_l}\right\},$$
$$d = \frac{USL - LSL}{2}, \qquad d^{*} = \min\{D_u, D_l\}, \qquad D_u = USL - T, \qquad D_l = T - LSL.$$

2. FUZZY SET THEORY AND BASIC DEFINITIONS
Fuzzy sets were introduced by Zadeh [11] to manipulate data and information possessing nonstatistical uncertainties. Here are some basic definitions.
Definition 1. (Fuzzy number) A fuzzy number is a fuzzy set of the real line with a normal, (fuzzy) convex and continuous membership function of bounded support. The family of fuzzy numbers is denoted by ℱ.
Definition 2. (α-cut) An α-level set of a fuzzy set $\tilde A$ of $X$ is a crisp set, written $\tilde A[\alpha]$ and defined by
$$\tilde A[\alpha] = \begin{cases}\{t \in X :\ \tilde A(t) \ge \alpha\}, & \alpha > 0,\\ \mathrm{cl}(\mathrm{supp}\,\tilde A), & \alpha = 0,\end{cases} \qquad (2)$$
where $\mathrm{cl}(\mathrm{supp}\,\tilde A)$ denotes the closure of the support of $\tilde A$.
Definition 3. (LR fuzzy number/quantity) Any fuzzy number (quantity) can be described as

where cl(suppA) denotes the closure of the support of A. Definition3. (LR fuzzy number/quantity) Any fuzzy number (quantity) can be described as


$$\tilde A(t) = \begin{cases} L\!\left(\dfrac{a - t}{\alpha}\right), & t \in [a-\alpha,\ a],\\[4pt] 1, & t \in [a,\ b],\\[4pt] R\!\left(\dfrac{t - b}{\beta}\right), & t \in [b,\ b+\beta],\\[4pt] 0, & \text{otherwise}, \end{cases} \qquad (3)$$

where L: [0, 1] → [0, 1], R: [0, 1] → [0, 1], are continuous and non-increasing shape functions with L(0) = R(0) = 1 and L(1) = R(1) = 0. We call this fuzzy interval of LR-type and refer to it by A = (a, b,α, β) . 3. FUZZY PROCESS CAPABILITY INDICES Kaya and Kahraman [7] introduced fuzzy process capability indices for the processes with asymmetric tolerances. So, in this section, we first review the indices proposed by them and then, we refer to a problem with those indices. In the next section, we propose two new fuzzy process capability indices that modify those indices and overcome the problem with them.

In this paper, we consider the cases in which the specification limits and target value are fuzzy. So, it is realistic to have a process capability index which is also fuzzy, since it would have more information than a precise capability index.

Suppose X is a random variable which is normally distributed and its mean and variance are unknown. A random sample of size n is taken, and the sample mean and standard deviation are calculated to estimate the mean and standard deviation of X. Suppose the data are crisp, so, the sample mean and standard deviation are crisp numbers. To obtain the fuzzy capability index, first, we obtain fuzzy process variance and fuzzy process mean. Buckley [1] introduced a new method in fuzzy statistics to estimate mean and variance parameters of normal distribution that is established by a set of confidence intervals, one on top of the other, and triangular shaped fuzzy numbers of mean and variance are formed.

3.1. Fuzzy process variance
As mentioned earlier, suppose $X$ is a random variable with probability density function $N(\mu, \sigma^2)$. To estimate $\sigma^2$ we take a random sample $X_1, X_2, \ldots, X_n$ from $N(\mu, \sigma^2)$. The sample variance $S^2$ is a point estimator of $\sigma^2$. If the observed values are $x_1, x_2, \ldots, x_n$, then $s^2 = \sum_{i=1}^{n}(x_i - \bar x)^2/(n-1)$, where $\bar x = \sum_{i=1}^{n}x_i/n$; $S^2$ is an unbiased estimator of $\sigma^2$.

We know that $(n-1)S^2/\sigma^2$ has a chi-square distribution with $n-1$ degrees of freedom. The unbiased fuzzy estimator of the variance is built from the confidence intervals

$$\left[\frac{(n-1)s^2}{L(\lambda)},\ \frac{(n-1)s^2}{R(\lambda)}\right], \qquad (4)$$

where $L(\lambda) = (1-\lambda)\,\chi^2_{R,\,0.005} + \lambda(n-1)$ and $R(\lambda) = (1-\lambda)\,\chi^2_{L,\,0.005} + \lambda(n-1)$, with $\chi^2_{R,\,0.005}$ and $\chi^2_{L,\,0.005}$ the right and left 0.5% points of the chi-square distribution with $n-1$ degrees of freedom.

Placing these confidence intervals one on top of another, we obtain the (unbiased) fuzzy estimator $\tilde\sigma^2$ of the variance. Hence, the α-cut intervals of $\tilde\sigma^2$ are

$$\tilde\sigma^2(\alpha) = \left[\frac{(n-1)s^2}{(1-\alpha)\,\chi^2_{R,\,0.005} + \alpha(n-1)},\ \frac{(n-1)s^2}{(1-\alpha)\,\chi^2_{L,\,0.005} + \alpha(n-1)}\right], \qquad 0.01 \le \alpha \le 1. \qquad (5)$$

3.2. Fuzzy process mean
To estimate $\mu$ (the mean of the normal distribution), we take a random sample $X_1, X_2, \ldots, X_n$ from $N(\mu, \sigma^2)$. The sample mean $\bar X$ is distributed as $N(\mu, \sigma^2/n)$, so a $(1-\beta)\times 100\%$ confidence interval for $\mu$ is

$$\left[\bar x - z_{1-\beta/2}\,\frac{\sigma}{\sqrt n},\ \bar x + z_{1-\beta/2}\,\frac{\sigma}{\sqrt n}\right], \qquad (6)$$

where $z_{\gamma}$ is the lower $\gamma$ quantile of the standard normal distribution. We obtain $\tilde\mu$ by placing these confidence intervals one on top of another. Using the α-cut intervals of the fuzzy variance, the α-cut intervals of the fuzzy mean are

$$\tilde\mu(\alpha) = [\mu_L(\alpha),\ \mu_U(\alpha)], \qquad (7)$$

where

$$\mu_L(\alpha) = \min_{i=1,2}\left\{\bar x - z_{1-\alpha/2}\sqrt{\frac{\tilde\sigma^2_i(\alpha)}{n}}\right\}, \qquad \mu_U(\alpha) = \max_{i=1,2}\left\{\bar x + z_{1-\alpha/2}\sqrt{\frac{\tilde\sigma^2_i(\alpha)}{n}}\right\}, \qquad (8)$$

and $\tilde\sigma^2_1(\alpha)$ and $\tilde\sigma^2_2(\alpha)$ are the left and right endpoints of the interval $\tilde\sigma^2(\alpha)$.
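The following minimal Python sketch (added for illustration, not part of the paper) computes the α-cut endpoints of the fuzzy variance and fuzzy mean from crisp data, assuming the 99% base confidence intervals (0.5% and 99.5% chi-square points) described above; all names and the choice of α are hypothetical.

import numpy as np
from scipy.stats import chi2, norm

def fuzzy_variance_cut(s2, n, alpha):
    # alpha-cut [sigma2_L, sigma2_U] of the Buckley-type fuzzy variance, 0.01 <= alpha <= 1
    chi_right = chi2.ppf(0.995, n - 1)   # right 0.5% point
    chi_left = chi2.ppf(0.005, n - 1)    # left 0.5% point
    lo = (n - 1) * s2 / ((1 - alpha) * chi_right + alpha * (n - 1))
    hi = (n - 1) * s2 / ((1 - alpha) * chi_left + alpha * (n - 1))
    return lo, hi

def fuzzy_mean_cut(xbar, s2, n, alpha):
    # alpha-cut of the fuzzy mean, using both endpoints of the fuzzy variance cut
    z = norm.ppf(1 - alpha / 2)
    v_lo, v_hi = fuzzy_variance_cut(s2, n, alpha)
    lo = min(xbar - z * np.sqrt(v / n) for v in (v_lo, v_hi))
    hi = max(xbar + z * np.sqrt(v / n) for v in (v_lo, v_hi))
    return lo, hi

# sample statistics from the application example of Section 5
print(fuzzy_variance_cut(s2=0.000071, n=200, alpha=0.5))
print(fuzzy_mean_cut(xbar=114.200, s2=0.000071, n=200, alpha=0.5))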


3.3. Fuzzy specification limits and fuzzy target value Suppose that the specification limits and target value of the manufacturing process are imprecise. So, fuzzy numbers are applied to manipulate them. In this paper, we consider two cases in which these data are triangular fuzzy numbers and trapezoidal fuzzy quantities.

First, suppose the lower and upper specification limits are triangular fuzzy numbers $\widetilde{LSL} = T(l_1, l_2, l_3)$ and $\widetilde{USL} = T(u_1, u_2, u_3)$, and the target value is $\tilde T = T(t_1, t_2, t_3)$. Then their α-cut intervals are

$$\widetilde{USL}(\alpha) = [U_1(\alpha),\ U_2(\alpha)] = [u_1 + (u_2 - u_1)\alpha,\ u_3 - (u_3 - u_2)\alpha],$$
$$\widetilde{LSL}(\alpha) = [L_1(\alpha),\ L_2(\alpha)] = [l_1 + (l_2 - l_1)\alpha,\ l_3 - (l_3 - l_2)\alpha], \qquad (9)$$
$$\tilde T(\alpha) = [T_1(\alpha),\ T_2(\alpha)] = [t_1 + (t_2 - t_1)\alpha,\ t_3 - (t_3 - t_2)\alpha].$$

Now suppose the specification limits and target are trapezoidal fuzzy quantities $\widetilde{LSL} = T(l_1, l_2, l_3, l_4)$, $\widetilde{USL} = T(u_1, u_2, u_3, u_4)$ and $\tilde T = T(t_1, t_2, t_3, t_4)$. Their α-cut intervals are

$$\widetilde{USL}(\alpha) = [u_1 + (u_2 - u_1)\alpha,\ u_4 - (u_4 - u_3)\alpha], \qquad
\widetilde{LSL}(\alpha) = [l_1 + (l_2 - l_1)\alpha,\ l_4 - (l_4 - l_3)\alpha], \qquad
\tilde T(\alpha) = [t_1 + (t_2 - t_1)\alpha,\ t_4 - (t_4 - t_3)\alpha]. \qquad (10)$$

3.4. Ranking function
There are many comparison and ranking techniques; some are difficult to apply and may yield contradictory results. The concept of area, however, is intuitive, and for this reason we apply the ranking method proposed by [4], as follows.

Suppose $\tilde A$ and $\tilde B$ are two fuzzy numbers (quantities); $\tilde A$ is said to be "greater than or equal to" $\tilde B$ according to $C(\tilde A \ge \tilde B)$ as follows:

$$\tilde A > \tilde B \ \text{ iff }\ C(\tilde A \ge \tilde B) > 0, \qquad \tilde A \ge \tilde B \ \text{ iff }\ C(\tilde A \ge \tilde B) \ge 0 \qquad (11)$$

(which implies $C(\tilde B \ge \tilde A) < 0$ in the strict case). This relation induces a complete ranking of all fuzzy numbers (quantities), corresponding to the defuzzification function ℛ given by the total integral value

$$C(\tilde A \ge \tilde B) = \mathcal{R}(\tilde A) - \mathcal{R}(\tilde B), \qquad \mathcal{R}(\tilde A) = \frac{1}{2}\int_{0}^{1}\big(\tilde A_L(\alpha) + \tilde A_U(\alpha)\big)\, d\alpha. \qquad (12)$$
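A minimal Python sketch of this area-based defuzzification (added for illustration; the triangular numbers used are hypothetical):

import numpy as np

def rank_value(cut, grid=np.linspace(0.0, 1.0, 201)):
    # R(A) = 1/2 * integral over [0,1] of (A_L(alpha) + A_U(alpha)),
    # approximated by an average over a uniform grid of alpha values
    lo, hi = np.array([cut(a) for a in grid]).T
    return 0.5 * np.mean(lo + hi)

def triangular_cut(a, b, c):
    # alpha-cuts of a triangular fuzzy number T(a, b, c)
    return lambda alpha: (a + (b - a) * alpha, c - (c - b) * alpha)

A = triangular_cut(1.0, 2.0, 3.0)
B = triangular_cut(1.5, 2.5, 4.0)
print(rank_value(A), rank_value(B), rank_value(A) - rank_value(B))   # C(A >= B) < 0 here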

3.5. Fuzzy estimation of the indices and As we mentioned before, when the specification limits and target value are fuzzy numbers (quantities), the capability index should be fuzzy number (quantity). Based on the formula of C , C is obtained as the following C = d ∗ ⊝ A ∗3σ , (13) where, d ∗ = min T ⊝ LSL , USL ⊝ T , A ∗ = d ∗⨂ max T ⊖ μ T ⊝ LSL , μ ⊝ T USL ⊝ T . (14)

Therefore, we have d ∗(α) = d ∗(α), d ∗(α) , and A ∗(α) = A ∗(α), A ∗(α) . So, we get α-cut intervals of the index C as follows C (α) = d ∗(α) − A ∗(α)3 σ (α) , d ∗(α) − A ∗(α)3 σ (α) . (15)

We note that A ∗(α) = maxA ∗ (α), A ∗ (α), where A ∗ (α) = d ∗(α) T (α) − μ (α)T (α) − L (α) , T (α) − μ (α)T (α) − L (α) , A ∗ (α) = d ∗(α) μ (α) − T (α)U (α) − T (α) , μ (α) − T (α)U (α) − T (α) , (16) and d ∗(α) = mind ∗(α), d ∗ (α), with d ∗(α) = (T − L , T − L ), d ∗ (α) = (u − T , u − T ). (17)

Based on the equation C = (d∗ − A∗) 3 (σ + A )⁄ , we get C as the following


C = d ∗ ⊝ A ∗3 σ ⨁A , (18)

where A = d ⨂ max T ⊖ μ T ⊝ LSL , μ ⊝ T USL ⊝ T , (19)

and d = (USL ⊖ LSL )/2. Then, d (α) = d (α), d (α) , A (α) = A (α), A (α) . Hence, α-cut intervals of the index C is as the following C (α) = d ∗(α) − A ∗(α)3 σ (α) + A (α) d ∗(α) − A ∗(α)3 σ (α) + A (α) . (20)

We note that A (α) = maxA (α), A (α), where A (α) = d (α) T (α) − μ (α)T (α) − L (α) , T (α) − μ (α)T (α) − L (α) , A (α) = d (α) μ (α) − T (α)U (α) − T (α) , μ (α) − T (α)U (α) − T (α) . (21) There is a basic problem with these indices because they only attend to the proportion of nonconforming

products and do not reflect the mean departure from the target. For example, suppose two processes A and B have the same fuzzy triangular specification limits LSL = T(130.149, 130.150, 130.151) and USL = T(130.207, 130.208, 130.209) and the fuzzy triangular target T = T(130.179, 130.180, 130.181). The tolerance interval is asymmetric because the target is closer to the upper specification limit than to the lower one. Let the mean of process A coincide with the fuzzy upper specification limit, i.e., μ_A = USL, and let the mean of process B lie on the fuzzy lower specification limit, i.e., μ_B = LSL. Also suppose that, based on a random sample of size n, the sample variance is 0.0000904. Based on the two fuzzy indices, the capability of both processes A and B is obtained as "approximately zero". Figure 1 shows the corresponding membership functions for the two processes.

Figure 1. The membership functions of the two fuzzy indices (left and right panels) for processes A and B.

It is seen that the capability of both processes is "approximately zero", but this is misleading. The proportions of non-conforming products of the two processes are equal, while the mean of process A is closer to the fuzzy target than the mean of process B. Not only the proportion of non-conforming products affects the process capability; process centering does as well. So, A should score higher than B.

4. NEW FUZZY CAPABILITY INDICES As mentioned in the previous section, the fuzzy indices of Kaya and Kahraman sometimes measure the process capability inconsistently, so in this section we modify them; the first subsection introduces the modification of the first index, and the subsequent subsection modifies the second. 4.1. Fuzzy capability index To modify the first fuzzy index, we replace the factor A∗ with the factor ℱ∗, defined as follows

ℱ ∗ = ⎩⎪⎨⎪⎧ μ ⊖ T ⨂ μ ⊝ T USL ⊝ T , max μ , T = μ , T ⊝ μ ⨂ T ⊝ μ T ⊝ LSL , max μ , T = T . (22)


Then, the α-cut intervals of ℱ ∗ are as

ℱ ∗(α) =⎩⎪⎨⎪⎧ μ (α) − T (α) U (α) − T (α) , μ (α) − T (α) U (α) − T (α) ; max μ , T = μ , T (α) − μ (α) T (α) − L (α) , T (α) − μ (α) T (α) − L (α) ; max μ , T = T . (23)

Based on C = (d ∗ ⊖ ℱ ∗) 3σ ⁄ , d ∗(α) = (d ∗(α), d ∗(α)), and ℱ ∗(α) = (ℱ ∗(α), ℱ ∗(α)) we have the α-cut

intervals of the fuzzy number (quantity) C as the following C (α) = d ∗(α) − ℱ ∗(α)3 σ (α) , d ∗(α) − ℱ ∗(α)3 σ (α) . (24)

4.2. Fuzzy capability index The index C is a modification of the index C . For this, we replace the factor A ∗ with ℱ ∗ as introduced in the previous subsection. Moreover, in the denominator of the index C we set the factor ℱ instead of A which is similar to it. So, we have

ℱ = ⎩⎪⎨⎪⎧d ⨂(μ ⊖ T )USL ⊝ T , max μ , T = μ , d ⨂(T ⊝ μ )T ⊝ LSL , max μ , T = T . (25)

Then, its α-cuts are as

ℱ (α) = ⎩⎪⎨⎪⎧ d (α) μ (α) − T (α) U (α) − T (α) , d (α) μ (α) − T (α) U (α) − T (α) ; max μ , T = μ , d (α) T (α) − μ (α) T (α) − L (α) , d (α) T (α) − μ (α) T (α) − L (α) ; max μ , T = T . (26)

Based on the equation C = (d ∗ ⊖ ℱ ∗) 3 σ ⨁ℱ ⁄ , the α-cut intervals of the fuzzy index C are as the following C (α) = d ∗(α) − ℱ ∗(α)3 σ (α) + ℱ (α) , d ∗(α) − ℱ ∗(α)3 σ (α) + ℱ (α) . (27)

To demonstrate the performance and effectiveness of the proposed indices, we obtain the capability of the processes A and B discussed in the previous section based on these indices. Based on C we get the capability of the process A as “approximately zero” while the capability of the process B is obtained “approximately -0.0701” which are reasonable.

Figure 2. The membership functions of the two proposed fuzzy indices (left and right panels) for processes A and B.

Now, we calculate the capability of two processes based on the index C . Based on C we obtain the capability of the process A as “approximately 0.0109” while the capability of the process B is obtained “approximately -0.0109”. Figure 2 shows the membership functions of the indices C and C for both processes.


5. AN APPLICATION EXAMPLE In this section, we apply the proposed indices to the data set used by Kaya and Kahraman [7] and compare the results with those obtained by their indices. The data come from a piston manufacturer located in the Konya Industrial Area, Turkey, whose main products are pistons, liners and piston rings. In this application, a piston for a Volvo Marine engine is selected; one of its measurable characteristics is the compression height. The specification limits and target value are assumed to be triangular fuzzy numbers: the lower and upper specification limits are approximately 114.174 and approximately 114.226, and the target value is approximately 114.220.

A sample of 200 data is taken and the sample variance and sample mean are 0.000071 and 114.200, respectively. So, we can obtain the fuzzy variance and fuzzy mean membership functions.

By using the ranking function, we obtain R(μ ) = 114.200 and R T = 114.220. It is obvious that R(μ ) < R(T ), so, μ < T that is, max μ , T = T . Capability of the process is obtained as “approximately -0.1066” based on C and its membership function is shown in Figure 3. Also, we get the capability of the process as “approximately -0.0097” based on C . Figure 4 shows the membership function of this fuzzy number.

Figure 3. The membership function of the first proposed index for the application example. Figure 4. The membership function of the second proposed index for the application example. 6. CONCLUSION The ability of a process to meet specifications can be expressed as a single number using a process capability index. When the tolerance interval is asymmetric, that is, when the target value is not the midpoint of the specification limits, several capability indices have been proposed for the crisp setting.

In this paper, we considered situations in which the specification limits and target value are not crisp. So, they are described by fuzzy numbers (quantities). Then, for these cases, it is appropriate that the capability index is fuzzy number (quantity), too. Moreover, we supposed the data taken from the process are crisp. So, we used Buckley's approach to obtain the fuzzy process variance, and fuzzy process mean.

Kaya and Kahraman [7] fuzzified the crisp indices for asymmetric tolerances and introduced their fuzzy counterparts. However, these indices may fail to reflect the capability of the process, so we modified them to overcome this problem and proposed two new fuzzy indices. To demonstrate the effectiveness of the new indices, we applied them to assess the capability of a real process and compared the results with those of the existing indices.

Finally, we conclude that the new proposed indices are superior to the existing ones, because they account for both the expected proportion of non-conforming products and the mean departure from the target.

7. REFERENCES

1. Buckley, J.J. (2003), "Fuzzy Probabilities: New Approach and Application," Physica-Verlag, Heidelberg.
2. Chen, K.S., and Pearn, W.L. (2001), "Capability Indices for Processes with Asymmetric Tolerances," Journal of the Chinese Institute of Engineers, 24 (5), pp. 559-568.
3. Chen, K.S., Yang, S.L., and Chen, H.T. (2015), "Process improvement capability index with cost - A modeling method of mathematical programming," Applied Mathematical Modelling, 39 (5-6), pp. 1577-1586.
4. Fortemps, P., and Roubens, M. (1996), "Ranking and defuzzification methods based on area compensation," Fuzzy Sets and Systems, 82, pp. 319-330.
5. Franklin, L.A., and Wasserman, G. (1992), "Bootstrap Lower Confidence Limits for Capability Indices," Journal of Quality Technology, 24 (4), pp. 196-210.
6. Kane, V.E. (1986), "Process Capability Indices," Journal of Quality Technology, 18, pp. 41-52.
7. Kaya, I., and Kahraman, C. (2011), "Fuzzy process capability indices with asymmetric tolerances," Expert Systems with Applications, 38, pp. 14882-14890.
8. Kushler, R.H., and Hurley, P. (1992), "Confidence Bounds for Capability Indices," Journal of Quality Technology, 24 (4), pp. 188-195.
9. Pan, J.-N., and Wendy K.-C. H. (2015), "Developing new multivariate process capability indices for autocorrelated data," Quality and Reliability Engineering International, 31 (3), pp. 431-444.
10. Siman, M. (2014), "Multivariate process capability indices: A directional approach," Communications in Statistics - Theory and Methods, 43, pp. 1949-1955.
11. Zadeh, L.A. (1965), "Fuzzy sets," Information and Control, 8, pp. 338-353.


The Fuzzy q-Laplace Transforms

Z. Noeiaghdam, S. Khakrangin1

Young Researchers and Elite Club, Ardabil Branch, Islamic Azad University, Ardabil, Iran.

Corresponding Author’s E-mail:[email protected]

Abstract

In this paper, the fuzzy q-Laplace transform for fuzzy-valued functions is introduced for the first time, and its important properties are stated and proved. Keywords: Fuzzy number; Generalized Hukuhara difference; q-calculus; q-Laplace transform; Fuzzy q-Laplace transform.

1. Introduction Quantum calculus (q-calculus for short) begins with the work of F. H. Jackson [2] in the early twentieth century, but only recently has it attracted renewed interest, owing to the demand for mathematics that models quantum computing. Quantum calculus serves as a connection between mathematics and physics and has many applications in different mathematical areas; for more details on q-calculus we refer the reader to [3, 4]. Fuzzy arithmetic also has many applications in various sciences, and in recent decades researchers have studied a wide range of problems in this field. The Laplace transform provides an effective method for solving linear differential equations with constant coefficients and certain integral equations. In this work, starting with a general definition of the fuzzy q-Laplace transform, the particular concepts are specified and many properties are given. The manuscript is organized as follows: Section 2 presents some preliminaries and basic definitions; Section 3 introduces the fuzzy q-Laplace transform with its principal properties; the final section contains some brief concluding comments.

2. Notations and Preliminary Results Before stating our main results, we introduce some definitions and notation that will be used throughout the paper. First, the reader requires some facts about fuzzy numbers, which can be found in [6, 7]. An arbitrary fuzzy number $u$ in parametric form is an ordered pair of functions $(\underline{u}(r), \overline{u}(r))$, $0 \le r \le 1$, which satisfy the following requirements:
• $\underline{u}(r)$ is a bounded, monotonically increasing, left-continuous function on $(0, 1]$;
• $\overline{u}(r)$ is a bounded, monotonically decreasing, left-continuous function on $(0, 1]$;
• $\underline{u}(r) \le \overline{u}(r)$, $0 \le r \le 1$.

The set of all these fuzzy numbers is denoted by $\mathbb{R}_f$. A crisp number $k$ is simply represented by $\underline{u}(r) = \overline{u}(r) = k$, $0 \le r \le 1$, and is called a singleton. For arbitrary $u, v \in \mathbb{R}_f$ and a scalar $k$, we define addition by

$$(u \oplus v)(r) = \big(\underline{u}(r) + \underline{v}(r),\ \overline{u}(r) + \overline{v}(r)\big),$$

and scalar multiplication by

$$(k \otimes u)(r) = \begin{cases}\big(k\,\underline{u}(r),\ k\,\overline{u}(r)\big), & k \ge 0,\\ \big(k\,\overline{u}(r),\ k\,\underline{u}(r)\big), & k < 0.\end{cases}$$

1 PhD student, Islamic Azad University - Science and Research Branch, Tehran, Iran.

In this paper, a fuzzy-valued function means a function $f : A \subseteq \mathbb{R} \to \mathbb{R}_f$, where $\mathbb{R}$ is the set of real numbers, and $f(x; r) = [\underline{f}(x; r), \overline{f}(x; r)]$ is the so-called r-cut, or parametric form, of the fuzzy-valued function $f$. Such a function $f$ is integrable on $[a, b]$ if $f$ is continuous with respect to the metric $d$; its definite integral then exists and

$$\int_a^b f(x; r)\, dx = \left[\int_a^b \underline{f}(x; r)\, dx,\ \int_a^b \overline{f}(x; r)\, dx\right].$$

Now let $u, v \in \mathbb{R}_f$. If there exists $w \in \mathbb{R}_f$ such that $u = v + w$, then $w$ is called the Hukuhara difference (H-difference for short) of $u$ and $v$, denoted $u \ominus v$. When such a $w \in \mathbb{R}_f$ exists, the generalized Hukuhara difference (gH-difference for short) is defined by

$$u \ominus_{gH} v = w \iff \begin{cases} (i)\ \ u = v + w, & \text{or}\\ (ii)\ \ v = u + (-1)w. \end{cases}$$

It is easy to show that (i) and (ii) are both valid if and only if $w$ is a crisp number. The generalized Hukuhara derivative of a fuzzy-valued function $f : (a, b) \to \mathbb{R}_f$ at $x_0$ is defined as

$$f'_{gH}(x_0) = \lim_{h \to 0}\frac{f(x_0 + h) \ominus_{gH} f(x_0)}{h}.$$

If $f'_{gH}(x_0) \in \mathbb{R}_f$, we say that $f$ is generalized Hukuhara differentiable (gH-differentiable for short) at $x_0$. Also, we say that $f$ is [(i)-gH]-differentiable at $x_0$ if

$$f'_{(i)-gH}(x_0; r) = \big[\underline{f}\,'(x_0; r),\ \overline{f}\,'(x_0; r)\big], \qquad 0 \le r \le 1,$$

and that $f$ is [(ii)-gH]-differentiable at $x_0$ if

$$f'_{(ii)-gH}(x_0; r) = \big[\overline{f}\,'(x_0; r),\ \underline{f}\,'(x_0; r)\big], \qquad 0 \le r \le 1.$$

Next, we summarize the basic definitions of q-calculus and of the q-Laplace transform; for more details, please refer to [1, 5]. Let $T_q$ be the time scale, for $0 < q < 1$,

$$T_q := \{q^{n} :\ n \in \mathbb{Z}\} \cup \{0\},$$

where $\mathbb{Z}$ is the set of integers. The q-analogue of a positive integer $n$ is defined by

$$[n]_q = \frac{1 - q^{n}}{1 - q}.$$

Consider an arbitrary function $f : T_q \to \mathbb{R}$. Its q-differential is $d_q f(t) = f(t) - f(qt)$, so the q-derivative of $f$ is given by
$$(D_q f)(t) = \frac{d_q f(t)}{d_q t} = \frac{f(t) - f(qt)}{(1-q)\,t}, \quad t \in T_q \setminus \{0\}.$$
It is obvious that for $t = 0$ we get $(D_q f)(0) = f'(0)$, where $f'$ is the usual derivative.

The q-antiderivative of $f$ is
$$\int_0^t f(s)\,d_q s = (1-q)\,t \sum_{i=0}^{\infty} q^i f(q^i t), \quad 0 < q < 1.$$
This series is called the Jackson integral [5] of $f(x)$. Of course, this definition implies that
$$\int_a^b f(s)\,d_q s = (1-q)\,b \sum_{i=0}^{\infty} q^i f(q^i b) - (1-q)\,a \sum_{i=0}^{\infty} q^i f(q^i a).$$

Also, the q-fractional function for $n \in \mathbb{N}$ is defined by
$$(t - s)_q^{\,n} = \prod_{i=0}^{n-1}\big(t - q^i s\big),$$
and for the case that $\alpha$ is not a positive integer we have
$$(t - s)_q^{\,\alpha} = t^{\alpha} \prod_{i=0}^{\infty} \frac{1 - (s/t)\,q^{\,i}}{1 - (s/t)\,q^{\,\alpha + i}}.$$

For a given function $f(\cdot)$ with support over $(0,\infty)$ and for $s \in \mathbb{C}$, we define its q-Laplace transform as
$$L_q[f(x)](s) = \int_0^{\infty} e_q^{-sx} f(x)\,d_q x, \quad \Re(s) > 0,$$
where $\Re(\cdot)$ denotes the real part of $(\cdot)$, $e_q^{-sx}$ is the q-exponential function (defined piecewise for $0 < q < 1$ and $q > 1$; see [1, 5]),

and $k$ is the normalizing constant. The basic properties of the q-Laplace transform are given below:
• Scaling: for a real constant $\lambda$, $L_q[\lambda f(x)](s) = \lambda\, L_q[f(x)](s)$;
• Linearity: $L_q[c_1 f(x) + c_2 g(x)](s) = c_1 L_q[f(x)](s) + c_2 L_q[g(x)](s)$, where $c_1, c_2 \in \mathbb{R}$;
• Transform of derivatives: for $\Re(s) > 0$, $L_q\!\big[\tfrac{d}{dx}f(x)\big](s) = s\,L_q[f(x)](qs)$ for all $q \in \mathbb{R}$, $q \ne 1$.

3. Fuzzy q-Laplace Transform
Let $f$ be a fuzzy-valued function. We first introduce the fuzzy q-Laplace transform and then state some useful properties, which can be proved easily.
Definition 1. Let $f$ be a continuous fuzzy-valued function. Suppose that $f(x;r) \otimes e_q^{-sx}$ is improper fuzzy Riemann-integrable on $[0,\infty)$; then $\int_0^{\infty} f(x;r) \otimes e_q^{-sx}\,d_q x$ is called the fuzzy q-Laplace transform and is denoted by
$$L_q[f(x;r)](s) = \int_0^{\infty} f(x;r) \otimes e_q^{-sx}\,d_q x.$$
From the last section, we have
$$\int_0^{\infty} f(x;r) \otimes e_q^{-sx}\,d_q x = \Big[\int_0^{\infty} \underline{f}(x;r)\,e_q^{-sx}\,d_q x,\ \int_0^{\infty} \overline{f}(x;r)\,e_q^{-sx}\,d_q x\Big].$$
Also, using the definition of the classical q-Laplace transform,
$$l_q[\underline{f}(x;r)](s) = \int_0^{\infty} \underline{f}(x;r)\,e_q^{-sx}\,d_q x, \qquad l_q[\overline{f}(x;r)](s) = \int_0^{\infty} \overline{f}(x;r)\,e_q^{-sx}\,d_q x,$$
it follows that
$$L_q[f(x;r)](s) = \big(l_q[\underline{f}(x;r)](s),\ l_q[\overline{f}(x;r)](s)\big).$$
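The last identity says that the fuzzy transform is computed componentwise on the r-cut. The sketch below (ours, not from the paper) illustrates exactly that structure; as a stand-in for the q-kernel it uses the classical kernel $e^{-sx}$, i.e., the $q \to 1$ limit of $e_q^{-sx}$, which is an assumption made purely for illustration.

```python
import numpy as np

def laplace_num(f, s, upper=40.0, n=100_000):
    # classical Laplace transform by trapezoidal quadrature; used here as the
    # q -> 1 stand-in for the q-Laplace transform
    x = np.linspace(0.0, upper, n)
    y = np.exp(-s * x) * f(x)
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x)) / 2.0)

def fuzzy_laplace(f_lower, f_upper, r, s):
    # componentwise transform of the r-cut [f_lower(x; r), f_upper(x; r)]
    return (laplace_num(lambda x: f_lower(x, r), s),
            laplace_num(lambda x: f_upper(x, r), s))

# example: f(x; r) = [(1 + r) x, (3 - r) x]; exact transforms are (1+r)/s^2 and (3-r)/s^2
lo, up = fuzzy_laplace(lambda x, r: (1 + r) * x, lambda x, r: (3 - r) * x, r=0.5, s=2.0)
print(lo, up)   # approximately 0.375 and 0.625
```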

Theorem 1. Let $f(x;r)$ and $g(x;r)$ be continuous fuzzy-valued functions, and let $c_1$ and $c_2$ be constants. Then
$$L_q[(c_1 \otimes f(x;r)) \oplus (c_2 \otimes g(x;r))] = (c_1 \otimes L_q[f(x;r)]) \oplus (c_2 \otimes L_q[g(x;r)]).$$
Proof. By using the definition of the fuzzy q-Laplace transform and fuzzy calculus we have
$$L_q[(c_1 \otimes f(x;r)) \oplus (c_2 \otimes g(x;r))] = \int_0^{\infty}\big((c_1 \otimes f(x;r)) \oplus (c_2 \otimes g(x;r))\big) \otimes e_q^{-sx}\,d_q x$$
$$= c_1 \otimes \int_0^{\infty} f(x;r) \otimes e_q^{-sx}\,d_q x \ \oplus\ c_2 \otimes \int_0^{\infty} g(x;r) \otimes e_q^{-sx}\,d_q x = (c_1 \otimes L_q[f(x;r)]) \oplus (c_2 \otimes L_q[g(x;r)]).$$
So, the theorem is proved.

Lemma 1. Let $f$ be a continuous fuzzy-valued function on $[0,\infty)$ and $\lambda \in \mathbb{R}$, $\lambda \ge 0$. Then
$$L_q[\lambda \otimes f(x;r)] = \lambda \otimes L_q[f(x;r)].$$
Proof. The proof is obvious.
Lemma 2. Let $f$ be a continuous fuzzy-valued function and let $g(x) \ge 0$ be a real-valued function. Suppose that $(f(x;r) \otimes g(x)) \otimes e_q^{-sx}$ is improper fuzzy Riemann-integrable on $[0,\infty)$; then
$$\int_0^{\infty}\big(f(x;r) \otimes g(x)\big) \otimes e_q^{-sx}\,d_q x = \Big[\int_0^{\infty} g(x)\,\underline{f}(x;r)\,e_q^{-sx}\,d_q x,\ \int_0^{\infty} g(x)\,\overline{f}(x;r)\,e_q^{-sx}\,d_q x\Big].$$
Proof. The proof is obvious.

Theorem 2. Let $f$ be a continuous fuzzy-valued function and $L_q[f(x;r)] = F(p)$. Then
$$L_q[e_q^{-ax} \otimes f(x;r)] = F(p + a), \quad a > 0.$$
Proof.
$$L_q[e_q^{-ax} \otimes f(x;r)] = \int_0^{\infty} e_q^{-ax} \otimes f(x;r) \otimes e_q^{-px}\,d_q x = \Big[\int_0^{\infty} e_q^{-ax}\,\underline{f}(x;r)\,e_q^{-px}\,d_q x,\ \int_0^{\infty} e_q^{-ax}\,\overline{f}(x;r)\,e_q^{-px}\,d_q x\Big] = \int_0^{\infty} f(x;r) \otimes e_q^{-(p+a)x}\,d_q x = F(p + a).$$

Theorem 3. Let $f'(x;r)$ be an integrable fuzzy-valued function, and let $f(x;r)$ be the primitive of $f'(x;r)$ on $[0,\infty)$. Then
$$L_q[f'(x;r)](s) = s \otimes L_q[f(x;r)](qs) \ominus f(0;r)$$
where $f$ is [(i)-gH]-differentiable, and
$$L_q[f'(x;r)](s) = \big(-f(0;r)\big) \ominus \big(-s \otimes L_q[f(x;r)](qs)\big)$$
where $f$ is [(ii)-gH]-differentiable.
Proof. For arbitrary fixed $r \in [0,1]$, we first consider $f$ [(i)-gH]-differentiable; then by linearity of $L_q$,
$$L_q[f'(x;r)](s) = L_q[(\underline{f}'(x;r), \overline{f}'(x;r))](s) = \big(l_q[\underline{f}'(x;r)](s),\ l_q[\overline{f}'(x;r)](s)\big).$$
From the last section, we have
$$l_q[\underline{f}'(x;r)](s) = s\,l_q[\underline{f}(x;r)](qs) - \underline{f}(0;r), \qquad l_q[\overline{f}'(x;r)](s) = s\,l_q[\overline{f}(x;r)](qs) - \overline{f}(0;r).$$
Hence, we get
$$L_q[f'(x;r)](s) = \big(s\,l_q[\underline{f}(x;r)](qs) - \underline{f}(0;r),\ s\,l_q[\overline{f}(x;r)](qs) - \overline{f}(0;r)\big) = s \otimes L_q[f(x;r)](qs) \ominus f(0;r).$$
Now, for [(ii)-gH]-differentiability, we have
$$\big(-f(0;r)\big) \ominus \big(-s \otimes L_q[f(x;r)](qs)\big) = \big(-\overline{f}(0;r) + s\,l_q[\overline{f}(x;r)](qs),\ -\underline{f}(0;r) + s\,l_q[\underline{f}(x;r)](qs)\big).$$
Since $l_q[\underline{f}'(x;r)](s) = s\,l_q[\underline{f}(x;r)](qs) - \underline{f}(0;r)$ and $l_q[\overline{f}'(x;r)](s) = s\,l_q[\overline{f}(x;r)](qs) - \overline{f}(0;r)$, it follows that

$$L_q[f'(x;r)](s) = \big(-f(0;r)\big) \ominus \big(-s \otimes L_q[f(x;r)](qs)\big).$$
4. Final Comments
The q-difference calculus, or quantum calculus, is an old discipline whose development was started by Jackson [2], and in recent decades it has become an important subject for applied science; the q-Laplace transform is a very suitable tool for describing and solving many problems in various sciences. In this article, the basic idea is to apply fuzzy concepts to the q-Laplace transform; the related theorems are then stated and proved. Obtaining a unique solution of fuzzy differential equations is an important objective for which the fuzzy q-Laplace transform is a very suitable method, but this topic exceeds the scope of this paper and we will investigate it in future work.

5. References
[1] Abdi, W. H. (1962), On certain q-difference equations and q-Laplace transform, Proceedings of the National Academy of Sciences, India A, 1-15.
[2] Jackson, F. H. (1908), On q-functions and a certain difference operator, Trans. Roy. Soc. Edin., (46) 253-281.
[3] Jarad, F., Abdeljawad, T. and Baleanu, D. (2013), Stability of q-fractional non-autonomous systems, Nonlinear Analysis: Real World Applications, (14) 780-784.
[4] Jiang, M. and Zhong, S. (2014), Existence of solutions for nonlinear fractional q-difference equations with Riemann-Liouville type q-derivatives, Journal of Applied Mathematics and Computing, 1-31.
[5] Kac, V. and Cheung, P. (2001), Quantum Calculus, Springer.
[6] Ma, M., Friedman, M. and Kandel, A. (1999), A new fuzzy arithmetic, Fuzzy Sets and Systems, (108) 83-90.
[7] Stefanini, L. and Bede, B. (2009), Generalized Hukuhara differentiability of interval-valued functions and interval differential equations, Nonlinear Analysis, (71) 1311-1328.


Numerical Studying of PDEs on Manifold with Gaussian Data

Mostafa Eslami1, Hadi Estebsari2

1- Department of Mathematics, Faculty of Mathematical Sciences, University of Mazandaran, Babolsar, Iran
2- Department of Mathematics, Iran University of Science and Technology, Tehran, Iran

Corresponding Author’s E-mail: [email protected]

Abstract

The present paper considers PDEs on curves with Gaussian data. We state the formulation of an elliptic PDE on curves using the recently developed implicit closest point method and derive the cut finite element solution. The Gaussian data is included in the PDE and we derive the required statistics analytically and numerically. Uncertainty quantification is performed with the Monte Carlo sampling method and the stochastic collocation method; the Karhunen-Loeve expansion is applied for approximation of the random data. The method is numerically implemented on some examples and the results illustrate its efficiency.
Keywords: Uncertainty quantification, PDE on curves, Tensor product spaces, Closest point method

1. INTRODUCTION
Assume that $(\Omega, \mathcal{F}, \mu)$ is a complete probability space, in which $\Omega$ is the sample space, $\mathcal{F}$ is a $\sigma$-algebra on it and $\mu$ is the probability measure. We have the following basic definitions from [1]:

Definition 1. We define the surface $\Sigma = \{x \in \mathbb{R}^{n+1} : \varphi(x) = 0\}$ implicitly with $\varphi : \mathbb{R}^{n+1} \to \mathbb{R}$. The signed distance function $\varphi_d$ enables us to find the closest point $\mathrm{cp}(x)$ on $\Sigma$ to a point $x \in \mathbb{R}^{n+1}$ by taking $\mathrm{cp}(x) = x - \varphi_d(x)\,\mathbf{n}(x)$, and we also define the projection $\mathbf{P}(x) := I - \mathbf{n}(x) \otimes \mathbf{n}(x)$, where $I$ is the identity tensor.
For the separable Hilbert spaces $W \equiv L^2(\Omega; H^1(\Sigma))$ and $V = \big\{v \in L^2(\Omega; H^1(\Sigma)) : \int_{\Omega\times\Sigma} v\,dx\,d\mu(\omega) = 0\big\}$, we denote the associated linear, boundedly invertible deterministic operator $A \in \mathcal{L}(W, V^*)$ with bilinear form
$$\mathcal{A}(u,v) := \langle Au, v\rangle_{V^*,V}, \qquad \mathcal{A} : W \times V \to \mathbb{R}.$$
We assume that $f \in V^*$ is Gaussian, characterized by its mean $m_f \in L^2(\Sigma)$ and its covariance $Q_f \in \mathcal{L}(L^2(D))$. Then we use the following linear operator equation with Gaussian data: given $f \in V^*$, find $u \in W$ such that
$$Au = f \quad \text{in } V^*. \tag{1}$$
Denoting the distribution of $f$ by $\mathcal{N}(m_f, Q_f)$, we can derive the characteristics of the solution by [2]:
$$m_u := \mathbb{E}[u] = A^{-1}m_f, \qquad Q_u = \mathrm{Cov}[u] = A^{-1} Q_f A^{-*},$$

which shows that the solution $u$ is itself Gaussian. For all the problems described, existence and uniqueness of solutions are guaranteed by the Lax-Milgram lemma.
2. Discretization
2.1 Semi-discretization with the finite element method


In the present paper we apply the so-called cut finite element method to hypersurface PDEs with Gaussian random data. Note that we can represent $\Sigma$ as the zero level set of an implicit function $\varphi$ or of a signed distance function $\varphi_d$. Using piecewise continuous finite elements on the background mesh $\mathcal{T}_h$, we obtain the discretized implicit function $\varphi_h = \pi_h \varphi$, i.e., the continuous piecewise linear interpolant of $\varphi$. Then the discretized curve $\Sigma_h$ is obtained as the zero level set of $\varphi_h$.

For $\Sigma_h$ we define discretized quantities: the discrete unit normal field (which equals the exact normal on $\Sigma_h$)
$$\mathbf{n}_h(x) = \frac{\nabla\varphi_h(x)}{\|\nabla\varphi_h(x)\|},$$
and the discrete closest point mapping
$$\mathrm{cp}_h(x) = x - \varphi_{d,h}(x)\,\mathbf{n}_h(x),$$
where $\varphi_{d,h}$ is the discretized signed distance function. We can also define approximate differential operators, such as the tangential gradient
$$\nabla_{\Sigma_h}\varphi(x) = \mathbf{P}_h(x)\nabla\varphi(x) = \big(I - \mathbf{n}_h(x)\otimes\mathbf{n}_h(x)\big)\nabla\varphi(x).$$
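For a concrete picture of the closest point mapping and projection just defined, the short sketch below (ours, not from the paper) uses the exact signed distance function of the unit circle, for which $\mathrm{cp}(x) = x - \varphi_d(x)\,\mathbf{n}(x)$ simply normalizes $x$ onto the circle.

```python
import numpy as np

def phi_d(x):
    # signed distance to the unit circle centered at the origin
    return np.linalg.norm(x) - 1.0

def normal(x):
    # n(x) = grad(phi_d)/|grad(phi_d)|; for the circle this is x/|x|
    return x / np.linalg.norm(x)

def closest_point(x):
    # cp(x) = x - phi_d(x) n(x)
    return x - phi_d(x) * normal(x)

def tangential_projection(x):
    # P(x) = I - n(x) (outer) n(x)
    n = normal(x)
    return np.eye(2) - np.outer(n, n)

x = np.array([1.5, 0.8])
print(closest_point(x))                        # lies on the unit circle
print(tangential_projection(x) @ normal(x))    # approximately zero: normal part removed
```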

The surface finite element space is the space of traces on $\Sigma_h$ of all piecewise linear continuous functions with respect to the background mesh $\mathcal{T}_h$. We introduce the finite element space
$$\mathcal{V}_h = \{v_h \in L^2(\Omega; C(\Omega_h)) : v_h|_S \in P_1(S)\ \ \forall S \in \Omega_h\},$$
where $P_1$ is the space of linear polynomials. The space $\mathcal{V}_h$ induces the surface space on $\Sigma_h$
$$\mathcal{V}_{h,\Sigma} = \{\psi_h \in L^2(\Omega; H^1(\Sigma_h)) : \exists\, v_h \in \mathcal{V}_h \ \text{s.t.}\ \psi_h = v_h|_{\Sigma_h}\}.$$
If we denote by $\phi_i$, $i = 1, \dots, N$, the nodal finite element basis functions corresponding to the vertices of the elements in $\Omega_h$, then the space $\mathcal{V}_{h,\Sigma}$ is spanned by the traces of the basis functions $\phi_i|_{\Sigma_h}$. We apply the method to the weak form of the random Laplace-Beltrami problem for the semi-discretization and obtain the formulation: find $u_h \in \mathcal{V}_{h,\Sigma}$ such that
$$\mathcal{A}_h(u_h, v_h) = l_h(v_h) \quad \forall v_h \in \mathcal{V}_{h,\Sigma}, \tag{2}$$
where the semi-discretized symmetric bilinear and linear forms are
$$\mathcal{A}_h(u_h, v_h) = \int_{\Omega}\langle \nabla_{\Sigma_h} u_h, \nabla_{\Sigma_h} v_h\rangle_{L^2(\Sigma_h)}\,d\mu(\omega), \qquad l_h(v_h) = \int_{\Omega}\langle f^e, v_h\rangle_{\Sigma_h}\,d\mu(\omega),$$
noting that $f^e(x) = f \circ \mathrm{cp}(x) = f(\mathrm{cp}(x))$.
The solution of (2) can be expressed as a linear combination of the traces of the outer basis functions on $\Sigma_h$:
$$u_h = \sum_{i=1}^{N} u_i\,\phi_i|_{\Sigma_h}.$$
Then we obtain the linear system $A\mathbf{u} = \mathbf{b}$ with $A_{ij} = \mathcal{A}_h(\phi_i, \phi_j)$ and $b_i = l_h(\phi_i)$.

2.2 Uncertainty Quantification
Finite Noise Assumption
A finite-dimensional noise assumption is employed for representing random fields; with this assumption the Gaussian random data $f$ can be approximated via a prescribed finite number of random variables, which enables a parameterization of the problem in $y$ rather than in the random events $\omega$.


Assume that $y_k : \Omega \to \Gamma_k \subset \mathbb{R}$, for $k = 1, 2, \dots, m$ with $m < \infty$, are real-valued random variables. Let $\rho_k : \Gamma_k \to \mathbb{R}^{+}$ be the probability density function of $y_k$, and let $\rho(y)$ be the joint density of
$$y = [y_1, y_2, \dots, y_m]^T \in \Gamma \equiv \prod_{k=1}^{m}\Gamma_k \subset \mathbb{R}^m.$$

For the random field $f$ we assume the expansion
$$f(x, \omega) = \bar{f}(x) + \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,\psi_k(x)\,y_k(\omega), \quad x \in D,\ \omega \in \Omega,$$
where $\bar{f} = \mathbb{E}[f]$, and $\lambda_k$ and $\psi_k$ are the eigenvalues and corresponding orthogonal eigenfunctions, $k = 1, 2, \dots$
Stochastic Collocation Finite Element Method
The stochastic collocation method is studied in, e.g., [3]; it is one of the efficient uncertainty quantification techniques and inherits some interesting properties of the MC method. For a given sample $y^k = (y_1(\omega^k), \dots, y_N(\omega^k)) \in \mathbb{R}^N$, $k = 1, 2, \dots, K$, of the random variable $y$, we get the state $u(y^k; \cdot)$ on $D$ as a weak solution of the parametric variational formulation
$$\mathcal{A}\big(u(y^k; x), v\big) = l(v), \quad y \in \Gamma,\ v \in H^1_0(\Sigma). \tag{3}$$

10( ( ; ), ) ( ), , ( ),k ku y x v l v y v H= ∈Γ ∈ ΣA (3)

The algorithm includes the following steps : a set of nodes according to either multivariate interpolation or integration theory to be determined; implementing deterministic code at each node and post-processing for interpolation . For collocation space we get 1 2 ( )P Lµ

− ⊂ ΓP as the space of polynomials with degree of 1P − . Assuming 1

1 ( )k P Pky −

= ⊂ ΓP to be collocation points and constructing the set of Lagrange basis functions build on collocation

points , ( ) , ,ji ijL y i jδ= = 1,2,K then the solution would be the Lagrange interpolant

1( ; ) ( ; ) ( ),

Pk k

p kk

u y x u y x L y=

= ∑

where ( ; )ku y D for 1, ,k P= … are the solution of

( ( ; ), ) ( ), , ,ku y x v l v v VΓ= ∈A (4)

Stochastic collocation method look for the solution in tensor product space 1 1 N k Ph j kyψ = =⊗ , that is for

1, ,k P= … we solve the equations

,( ( ; ), ) ( ), ,h h h hu y x v l v v Σ= ∀ ∈A V (5) the solution of complete discretized system (5) can be constructed as :

1 1( ; ) | ( ).

h

P Nk

p ik i kk i

u y x u L yφ Σ= =

= ∑∑

The stochastic collocation method can be very efficient if accompanied by sparse grids constructed with tensor products [2] or Smolyak-type sparse grids.
3. Numerical Examples
For the numerical test we consider the stochastic Laplace-Beltrami problem defined on a circle in 2D space, centered at $(0,0)$ with radius $r = 1$. The circle is represented by the signed distance function
$$\varphi_d(x_1, x_2) = \sqrt{(x_1 - 0)^2 + (x_2 - 0)^2} - r$$
and the background spatial domain is $[-2, 2] \times [-2, 2]$. The stochastic source function is given in polar coordinates by $f(r, \varphi) = 10\,y\,\sin(\varphi)$, where the random variable $y$ is Gaussian with $\mathbb{E}[y] = 10$ and $\mathrm{Var}[y] = 2$. The stochastic collocation method with a Smolyak-type sparse grid is applied for uncertainty quantification. In Figure 1, we compare the sparse grid with full tensor grids for 10 nodes.
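To make the collocation idea concrete for this kind of example, the sketch below (ours; the deterministic solve is replaced by a simple stand-in function of y, not the paper's cut finite element solver) compares a 5-point Gauss-Hermite collocation estimate of the mean with plain Monte Carlo sampling, for a Gaussian parameter with E[y] = 10 and Var[y] = 2.

```python
import numpy as np
from numpy.polynomial import hermite_e  # probabilists' Hermite, for Gaussian y

def solve_deterministic(y):
    # stand-in for the deterministic solve u(y; x) at a fixed point x;
    # here simply a smooth function of the random parameter y
    return np.sin(0.3 * y) + 0.05 * y**2

mean_y, std_y = 10.0, np.sqrt(2.0)

# Gauss-Hermite collocation nodes/weights for y ~ N(mean_y, std_y^2)
nodes, weights = hermite_e.hermegauss(5)
y_nodes = mean_y + std_y * nodes
colloc_mean = np.dot(weights, [solve_deterministic(y) for y in y_nodes]) / np.sqrt(2 * np.pi)

# Monte Carlo reference
rng = np.random.default_rng(0)
samples = rng.normal(mean_y, std_y, 100_000)
mc_mean = np.mean(solve_deterministic(samples))

print(colloc_mean, mc_mean)   # the two estimates of E[u] agree closely
```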

Figure 1: Numerical solution of the stochastic Laplace-Beltrami problem on the unit circle in 2D with the stochastic collocation method
4. References
1. S. J. Ruuth and B. Merriman, A simple embedding method for solving partial differential equations on surfaces, J. Comput. Phys., 227 (2008), pp. 1943-1961.
2. Ch. Schwab and C. J. Gittelson, Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs, Acta Numer. 20 (2011), pp. 291-467.
3. D. Xiu, Numerical methods for stochastic computations: a spectral method approach, Princeton University Press, 2010.


A New Approach for Solving Intuitionistic Fuzzy Linear Systems

Mahbobeh Esmaeili1, Mohammad Keyanpour2

1- Department of Mathematics, Faculty of Sciences, University of Guilan, Rasht, Iran 2- Department of Mathematics, Faculty of Sciences, University of Guilan, Rasht, Iran

Corresponding Author’s E-mail ([email protected])

Abstract

In this paper, we focus on solving intuitionistic fuzzy linear systems (IFLS) and propose a method based on the Moore-Penrose Pseudoinverse matrix of the extended crisp linear system. Some numerical examples are given to illustrate the effectiveness of the method.

Keywords: Intuitionistic fuzzy numbers; Intuitionistic fuzzy linear system; Intuitionistic fuzzy solution; quasi-inverse algorithm.

1. INTRODUCTION
Systems of simultaneous linear equations play a major role in various areas such as mathematics, physics, statistics, engineering and the social sciences [1]. Since in many applications the system parameters are represented by intuitionistic fuzzy numbers rather than fuzzy numbers, it is important to modify mathematical methods to deal with intuitionistic fuzzy linear systems. In this paper, we apply the Moore-Penrose pseudoinverse algorithm to find the solutions of such systems. The proposed method is illustrated by solving some numerical examples.
2. Preliminaries
Definition 2.1. Let $X$ be the universal set [3]. An intuitionistic fuzzy set (IFS) $\tilde{A}$ in $X$ is given by
$$\tilde{A} = \{(x, \mu_{\tilde{A}}(x), \nu_{\tilde{A}}(x)) \mid x \in X\},$$
where the functions $\mu_{\tilde{A}}(x)$ and $\nu_{\tilde{A}}(x)$ define, respectively, the degree of membership and the degree of non-membership of the element $x$ in the set $\tilde{A}$, and for every $x \in X$ we have $0 \le \mu_{\tilde{A}}(x) + \nu_{\tilde{A}}(x) \le 1$. The $\alpha$-cut of an intuitionistic fuzzy set $\tilde{A}$ is defined as $[\tilde{A}]^{\alpha} = [A^{\alpha}, A_{\alpha}]$, where
$$A^{\alpha} = \{x \in \mathbb{R} \mid \mu_{\tilde{A}}(x) \ge \alpha\}, \qquad A_{\alpha} = \{x \in \mathbb{R} \mid \nu_{\tilde{A}}(x) \le 1 - \alpha\}.$$
Definition 2.2. An IFS $\tilde{A} = \{(x, \mu_{\tilde{A}}(x), \nu_{\tilde{A}}(x)) \mid x \in X\}$ is called IF-normal if there exist at least two points $x_0, x_1 \in X$ such that $\mu_{\tilde{A}}(x_0) = 1$ and $\nu_{\tilde{A}}(x_1) = 1$.
Definition 2.3. An IFS $\tilde{A} = \{(x, \mu_{\tilde{A}}(x), \nu_{\tilde{A}}(x)) \mid x \in X\}$ of the real line $\mathbb{R}$ is called IF-convex if, for all $x_1, x_2 \in \mathbb{R}$ and $\lambda \in [0,1]$,
$$\mu_{\tilde{A}}(\lambda x_1 + (1-\lambda)x_2) \ge \mu_{\tilde{A}}(x_1) \wedge \mu_{\tilde{A}}(x_2), \qquad \nu_{\tilde{A}}(\lambda x_1 + (1-\lambda)x_2) \le \nu_{\tilde{A}}(x_1) \vee \nu_{\tilde{A}}(x_2).$$
Thus $\tilde{A}$ is IF-convex if its membership function is fuzzy convex and its non-membership function is fuzzy concave.


Definition 2.4. An IFS $\tilde{a} = \{(x, \mu_{\tilde{a}}(x), \nu_{\tilde{a}}(x)) \mid x \in X\}$ of the real line $\mathbb{R}$ is called an intuitionistic fuzzy number (IFN) if
(i) $\tilde{a}$ is IF-normal;
(ii) $\tilde{a}$ is IF-convex;
(iii) $\mu_{\tilde{a}}$ is a bounded upper semi-continuous function and $\nu_{\tilde{a}}$ is a lower semi-continuous function.
Every $\alpha$-cut of an intuitionistic fuzzy set is a closed interval; hence we can represent it as $[\tilde{A}]^{\alpha} = [(\underline{A}(\alpha), \overline{A}(\alpha)),\ (\underline{A}'(\alpha), \overline{A}'(\alpha))]$, where
$$\underline{A}(\alpha) = \inf\{x \in \mathbb{R} \mid \mu_{\tilde{A}}(x) \ge \alpha\}, \quad \overline{A}(\alpha) = \sup\{x \in \mathbb{R} \mid \mu_{\tilde{A}}(x) \ge \alpha\},$$
$$\underline{A}'(\alpha) = \inf\{x \in \mathbb{R} \mid \nu_{\tilde{A}}(x) \le 1-\alpha\}, \quad \overline{A}'(\alpha) = \sup\{x \in \mathbb{R} \mid \nu_{\tilde{A}}(x) \le 1-\alpha\}.$$
Definition 2.5. An arbitrary intuitionistic fuzzy number in parametric form is represented by an ordered pair of pairs of functions $[(\underline{a}(r), \overline{a}(r)),\ (\underline{a}'(r), \overline{a}'(r))]$ which satisfy the following requirements:
(i) $\underline{a}(r)$ and $\underline{a}'(r)$ are bounded left-continuous non-decreasing functions over $[0,1]$;
(ii) $\overline{a}(r)$ and $\overline{a}'(r)$ are bounded right-continuous non-increasing functions over $[0,1]$;
(iii) $\underline{a}(r) \le \overline{a}(r)$ and $\underline{a}'(r) \le \overline{a}'(r)$;
(iv) $\underline{a}'(r) \le \underline{a}(r)$ and $\overline{a}(r) \le \overline{a}'(r)$.
Definition 2.6. For arbitrary intuitionistic fuzzy numbers $x = [(\underline{x}(r), \overline{x}(r)), (\underline{x}'(r), \overline{x}'(r))]$, $y = [(\underline{y}(r), \overline{y}(r)), (\underline{y}'(r), \overline{y}'(r))]$ and a real number $k$, we may define the equality, the addition and the scalar multiplication of intuitionistic fuzzy numbers by using the extension principle as:
(i) $x = y$ if $\underline{x}(r) = \underline{y}(r)$, $\overline{x}(r) = \overline{y}(r)$, $\underline{x}'(r) = \underline{y}'(r)$, $\overline{x}'(r) = \overline{y}'(r)$;
(ii) $x + y = [(\underline{x}(r)+\underline{y}(r),\ \overline{x}(r)+\overline{y}(r)),\ (\underline{x}'(r)+\underline{y}'(r),\ \overline{x}'(r)+\overline{y}'(r))]$;
(iii) if $k$ is non-negative, $kx = [(k\underline{x}(r), k\overline{x}(r)), (k\underline{x}'(r), k\overline{x}'(r))]$, and if $k$ is negative, $kx = [(k\overline{x}(r), k\underline{x}(r)), (k\overline{x}'(r), k\underline{x}'(r))]$.

Definition 2.7. An intuitionistic fuzzy vector $(X, Y) = (x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_n)$ given by $(x_j, y_j) = ((\underline{x}_j(r), \overline{x}_j(r)), (\underline{x}'_j(r), \overline{x}'_j(r)))$, $j = 1, 2, \dots, n$, $0 \le r \le 1$, is called a solution of the intuitionistic fuzzy system if
$$\underline{\sum_{j=1}^{n} a_{ij}x_j} = \sum_{j=1}^{n} \underline{a_{ij}x_j} = \underline{b}_i(r), \qquad \overline{\sum_{j=1}^{n} a_{ij}x_j} = \sum_{j=1}^{n} \overline{a_{ij}x_j} = \overline{b}_i(r),$$
and analogously for the non-membership components $\underline{b}'_i(r)$ and $\overline{b}'_i(r)$.
In order to solve the intuitionistic fuzzy linear system, we first transform it into a crisp linear system whose right-hand side is the function vector $(\underline{b}_1, \dots, \underline{b}_m, \overline{b}_1, \dots, \overline{b}_m, \underline{b}'_1, \dots, \underline{b}'_m, \overline{b}'_1, \dots, \overline{b}'_m)^T$.

We get a $4m \times 4n$ crisp linear system, written compactly as $SX = Y$, in which the unknown vector $X$ collects the $4n$ parametric components of $x_1, \dots, x_n$ (membership and non-membership, lower and upper) and $Y$ collects the corresponding $4m$ components of $\tilde{b}$. The entries of $S = [s_{ij}]$ are determined as follows [2]:
$$a_{ij} \ge 0 \ \Rightarrow\ s_{ij} = a_{ij},\ \ s_{i+m,\,j+n} = a_{ij}; \qquad a_{ij} \le 0 \ \Rightarrow\ s_{i,\,j+n} = -a_{ij},\ \ s_{i+m,\,j} = -a_{ij},$$

for $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$. Any $s_{ij}$ which is not determined by this rule is taken to be zero. The membership and non-membership parts are thus each governed by the block matrix
$$\begin{pmatrix} S_1 & S_2 \\ S_2 & S_1 \end{pmatrix}, \qquad A = S_1 + S_2.$$

Theorem. The matrix $S$ is non-singular if and only if the matrices $A = S_1 + S_2$ and $S_1 - S_2$ are both non-singular.
Using matrix notation we get $SX = Y$, where $X$ collects the $4n$ parametric components of the unknowns (the lower components together with the negatives of the upper components, for both the membership and non-membership parts), $Y$ collects the corresponding $4m$ components of $\tilde{b}$, and $S$ has the block-diagonal structure
$$S = \begin{pmatrix} S_1 & S_2 & 0 & 0 \\ S_2 & S_1 & 0 & 0 \\ 0 & 0 & S_1 & S_2 \\ 0 & 0 & S_2 & S_1 \end{pmatrix}.$$

3. Moore-Penrose Pseudoinverse of a matrix
In this section we discuss the solution of $Ax = b$ when $A$ is an $m \times n$ matrix and $m \ne n$. If for the matrix $A_{m\times n}$ there exists a matrix $A^*$ satisfying the following conditions, then $A^*$ is called the quasi-inverse of $A$:
(i) $AA^*A = A$,
(ii) $A^*AA^* = A^*$,
(iii) $(A^*A)^T = A^*A$.
For the computation of $A^*$ we have:
(i) if $\operatorname{rank}(A_{m\times n}) = n$ then $A^* = (A^TA)^{-1}A^T$;
(ii) if $\operatorname{rank}(A_{m\times n}) = m$ then $A^* = A^T(AA^T)^{-1}$;
(iii) if $\operatorname{rank}(A_{m\times n}) = r$, $r \ne m, n$, then $A^* = C^T(CC^T)^{-1}(B^TB)^{-1}B^T$,
where $B_{m\times r}$ and $C_{r\times n}$ are two matrices such that $A = BC$.
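The two full-rank rules above can be checked numerically; the sketch below (ours) compares them with NumPy's built-in pseudoinverse on the 4x6 extended matrix that appears in Example 4.2 further down.

```python
import numpy as np

def quasi_inverse_full_rank(A):
    """Moore-Penrose pseudoinverse via the two full-rank cases stated above."""
    m, n = A.shape
    r = np.linalg.matrix_rank(A)
    if r == n:                                  # rank A = n  ->  (A^T A)^{-1} A^T
        return np.linalg.inv(A.T @ A) @ A.T
    if r == m:                                  # rank A = m  ->  A^T (A A^T)^{-1}
        return A.T @ np.linalg.inv(A @ A.T)
    raise ValueError("rank-deficient case: use a full-rank factorization A = BC")

A = np.array([[1., 2., 0., 0., 0., 3.],
              [3., 0., 1., 0., 1., 0.],
              [0., 0., 3., 1., 2., 0.],
              [0., 1., 0., 3., 0., 1.]])        # extended matrix of Example 4.2 below
print(np.allclose(quasi_inverse_full_rank(A), np.linalg.pinv(A)))   # True
```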

Consider the system $A\tilde{x} = \tilde{b}$, where $A$ is an $m \times n$ matrix and $\tilde{b}$ is an $m$-vector of intuitionistic fuzzy numbers.
(i) If $\operatorname{rank}(A) = m$, rearrange the columns of $A$ so that $A = [B, N]$, where $B$ is an $m \times m$ matrix of full rank and $N$ is an $m \times (n-m)$ matrix. Let $\tilde{x}_B$ and $\tilde{x}_N$ be the vectors corresponding to $B$ and $N$, respectively. Then the system $A\tilde{x} = \tilde{b}$ can be rewritten as
$$B\tilde{x}_B + N\tilde{x}_N = \tilde{b},$$
and the solution of the system is represented as
$$\tilde{x} = \begin{pmatrix} \tilde{x}_B \\ \tilde{x}_N \end{pmatrix} = \begin{pmatrix} B^*\tilde{b} - B^*N\tilde{x}_N \\ \tilde{x}_N \end{pmatrix}.$$
(ii) If $\operatorname{rank}(A_{m\times n}) = n$, then $\tilde{x} = A^*\tilde{b}$ where $A^* = (A^TA)^{-1}A^T$.
(iii) If $\operatorname{rank}(A_{m\times n}) = r$, $r \ne m, n$, then $\tilde{x} = A^*\tilde{b}$ where $A^* = C^T(CC^T)^{-1}(B^TB)^{-1}B^T$.
4. EXAMPLES
Example 4.1. Use the presented algorithm to find the solution of the following fuzzy linear system:

(iii) if rank ,m nA r m n× = ≠ then x A b∗= %% where 1 1( ) ( ) .T T T TA C CC B B B∗ − −= 4. EXAMPLES Example 4.1. Use the presented algorithm to find the solution of the following fuzzy linear system

1 1 2 21 21 2

1 1 2 21 21 2

2(( , ), ( , ) ( , ), ( , )) ((12 ,15 2 ), (9 4 ,16 3 )),

(( , ), ( , ) 5( , ), ( , )) ((10 2 ,14 2 ), (7 5 ,17 5 )),

x x x x x x x x r r r r

x x x x x x x x r r r r

− = + − + − + = + − + −

the extended matrix is

2 0 0 11 5 0 0

,0 1 2 00 0 1 5

S

− = −

and $S^*$ is computed by the quasi-inverse algorithm as
$$S_1^* = S_2^* = \begin{pmatrix} 0.5051 & -0.0101 & -0.0505 & 0.1010 \\ -0.1010 & 0.2020 & 0.0101 & -0.0202 \\ -0.0505 & 0.1010 & 0.5051 & -0.0101 \\ 0.0101 & -0.0202 & -0.1010 & 0.2020 \end{pmatrix}, \qquad S^* = \begin{pmatrix} S_1^* & 0 \\ 0 & S_2^* \end{pmatrix}.$$

We obtain the solution as follows

$[x_1]_r = ((6.6162 + 0.3838r,\ 7.8384 - 0.8384r),\ (5.3838 + 1.6162r,\ 8.1616 - 1.1616r)),$
$[x_2]_r = ((0.6768 + 0.3232r,\ 1.2323 - 0.2323r),\ (0.3232 + 0.6768r,\ 1.7677 - 0.7677r)).$
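The membership components of this solution can be verified directly: the sketch below (ours) solves the extended 4x4 crisp system with the pseudoinverse, under the usual sign convention (an assumption consistent with the construction in [2]) that the upper components enter the vectors with a minus sign.

```python
import numpy as np

S = np.array([[2., 0., 0., 1.],
              [1., 5., 0., 0.],
              [0., 1., 2., 0.],
              [0., 0., 1., 5.]])

def solve_membership(r):
    # right-hand side (b1_lower, b2_lower, -b1_upper, -b2_upper) at parameter r
    Y = np.array([12 + r, 10 + 2 * r, -(15 - 2 * r), -(14 - 2 * r)])
    a, b, neg_up1, neg_up2 = np.linalg.pinv(S) @ Y
    return (a, -neg_up1), (b, -neg_up2)   # (x1_lower, x1_upper), (x2_lower, x2_upper)

print(solve_membership(0.0))   # about ((6.6162, 7.8384), (0.6768, 1.2323))
print(solve_membership(1.0))   # about ((7.0, 7.0), (1.0, 1.0))
```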

Example 4.2. Use the presented algorithm to find the solution of the following fuzzy linear system:
$$x_1 + 2x_2 - 3x_3 = ((2 + 2r,\ 5 - r),\ (1 + 3r,\ 6 - 2r)),$$
$$3x_1 - x_2 + x_3 = ((7 + r,\ 9 - r),\ (4 + 4r,\ 10 - 2r)).$$
In this system $x_i = [(\underline{x}_i(r), \overline{x}_i(r)),\ (\underline{x}'_i(r), \overline{x}'_i(r))]$. The extended matrix is
$$S = \begin{pmatrix} 1 & 2 & 0 & 0 & 0 & 3 \\ 3 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 3 & 1 & 2 & 0 \\ 0 & 1 & 0 & 3 & 0 & 1 \end{pmatrix}.$$

Note that $\operatorname{rank}(A) = 4$, so $A$ can be decomposed into $[B, N]$ with
$$B = \begin{pmatrix} 1 & 2 & 0 & 0 \\ 3 & 0 & 1 & 0 \\ 0 & 0 & 3 & 1 \\ 0 & 1 & 0 & 3 \end{pmatrix}, \qquad N = \begin{pmatrix} 0 & 3 \\ 1 & 0 \\ 2 & 0 \\ 0 & 1 \end{pmatrix},$$
where $B$ is a $4 \times 4$ invertible matrix. Then $B^*$ is computed by the quasi-inverse algorithm as

$$B_1^* = B_2^* = \begin{pmatrix} -0.0189 & 0.3396 & 0.1132 & -0.0377 \\ 0.5094 & -0.1698 & -0.0566 & 0.0189 \\ 0.0566 & -0.0189 & -0.3396 & 0.1132 \\ 0.1698 & -0.0566 & -0.0189 & 0.3396 \end{pmatrix}, \qquad B^* = \begin{pmatrix} B_1^* & 0 \\ 0 & B_2^* \end{pmatrix}.$$

The solution of the above intuitionistic fuzzy linear system obtained by the algorithm is given below:
$[x_1]_r = ((2.820 + 0.201r,\ 3.113 - 0.091r),\ (1.901 + 1.120r,\ 3.434 - 0.411r)),$
$[x_2]_r = ((1.839 + 0.381r,\ 2.500 - 0.280r),\ (1.810 + 0.410r,\ 2.550 - 0.330r)),$
$[x_3]_r = ((1.037 + 0.116r,\ 1.500 - 0.345r),\ (0.844 + 0.309r,\ 1.510 - 0.350r)).$

5. CONCLUSIONS
In this work, we considered m × n intuitionistic fuzzy linear systems. We presented a method for solving the intuitionistic fuzzy linear system based on the quasi-inverse matrix and obtained the general solution of the system when the coefficient matrix has full row rank.


6. REFERENCES
1. Friedman, M., Ming, M. and Kandel, A. (1998) “Fuzzy linear systems,” Fuzzy Sets and Systems, pp. 201-209.
2. Matinfar, M., Nasseri, S.H. and Alemi, M. (2009) “A new method for solving of rectangular fuzzy linear system of equations based on Greville's algorithm,” Applied Mathematical Sciences, pp. 75-84.
3. Nehi, H.M. (2010) “A new ranking method for intuitionistic fuzzy numbers,” International Journal of Fuzzy Systems, pp. 80-86.


Numerical Treatment of Nonlinear Systems of Equations by Imperialist Competitive Algorithm

H. Rouhi*, R. Ansari

Department of Mechanical Engineering, University of Guilan, P.O. Box 3756, Rasht, Iran

* Corresponding Author’s E-mail: [email protected]

Abstract Solving systems of nonlinear equations is of great importance in numerical computation as well as in different engineering design problems. In this study, a novel socio-politically motivated optimization method, called imperialist competitive algorithm (ICA), is employed to solve system of nonlinear equations. In comparison to other evolutionary optimization strategies such as particle swarm optimization (PSO) algorithm, the proposed solver reveals great performance in global optima achievement. Several case studies are provided to demonstrate the efficiency of the present approach in solving nonlinear equations systems. Keywords: Imperialist competitive algorithm, Nonlinear system of equations, Meta-heuristic algorithms.

1. INTRODUCTION Since most physical systems are intrinsically nonlinear, solving nonlinear system of equations is of interest in various scientific fields such as computational mechanics, robotics, weather forecast, petroleum geological prospecting, etc. There are several methods for solving these problems. These approaches can be classified into two main categories. One category covers traditional approaches, which includes techniques such as Newton’s method, secant method and Muller’s method [1-3]. The other category is based on optimization algorithms such as genetic algorithm (GA) and particle swarm optimization (PSO) which are increasingly being viewed as an alternative way of solving nonlinear problems [4-7].

The imperialist competitive algorithm (ICA) proposed by Atashpaz-Gargari and Lucas [8] is one of the meta-heuristic optimization methods in the evolutionary computation field which has the ability to solve optimization problems of different types. A survey of literature shows that this technique has been successfully used to solve different problems [9-13]. ICA has been developed based on social policy of imperialisms. In this new algorithm, each agent is a country which can be categorized into two groups: colony and imperialist. Countries in ICA are the counterpart of chromosomes and particles in GA and PSO, respectively. These countries create initial empires. The ICA is inspired from imperialistic competition between these empires. During the competition, weak empires collapse and powerful ones occupy their colonies.

In the current work, the ICA is applied for numerical treatment of nonlinear systems of equations. The rest of the paper is structured as follows: In Section 2, the basic ideas of the ICA are expressed. In Section 3, the ICA is used for solving some important nonlinear systems of equations and the results are compared with the existing data in the literature, and at last conclusions of the research are presented in Section 4.

2. IMPERIALIST COMPETITIVE ALGORITHM The goal of this algorithm is to find an optimal solution in terms of the problem variables. Fig. 1 shows the flowchart of the ICA. The fundamental ideas of the algorithm are as follows.

In an $N_{var}$-dimensional optimization problem, a country is a $1 \times N_{var}$ array given by
country $= [p_1, p_2, p_3, \dots, p_{N_{var}}]$ (1)
in which the $p_i$ are the optimization variables. The algorithm starts with a random initial population of size $N_{country}$. In the first step, the generated population is categorized into two groups: imperialists and colonies. The $N_{imp}$ most powerful countries are chosen to form the empires, and the remaining $N_{col}$ countries are the colonies of the imperialists, which are distributed among the imperialists according to their power.


Figure 1. Flowchart of the imperialistic competitive algorithm

According to the assimilation policy, colonies are forced to move toward their imperialists. This modeling conception is shown in Fig. 2, schematically.

Figure 2. Movement of a colony toward its imperialist (assimilation) [8]

To increase the search area, each colony moves $x$ units toward its imperialist, with a deviation of $\theta$ from the line connecting them. Here $x$ and $\theta$ are random numbers with uniform distributions [8]:
$x \sim U(0, \beta \times d)$ (2)
$\theta \sim U(-\gamma, \gamma)$ (3)
where $\beta$ is a number greater than one and $d$ is the distance between the colony and its imperialist. Motivated by the results of Ref. [14], in this study the following linear variation is employed for the varying control parameter $\lambda$:
$\lambda = \lambda_{initial} - (\lambda_{initial} - \lambda_{final}) \times k / k_{\max}$ (4)
where $\lambda_{initial}$ and $\lambda_{final}$ are the initial and final values of the parameter, $k$ is the current iteration number and $k_{\max}$ is the maximum number of allowable iterations.
In every generation, some countries experience a rapid change in their characteristics. In ICA, this concept is called revolution, and it prevents trapping in local optima. The revolution rate specifies the percentage of colonies in each empire that alter their position at random. After moving toward the imperialist, a colony may reach a better position with lower cost than the imperialist; in this situation, the colony and its associated imperialist exchange their positions. In the next step, similar empires are united. To embark on the imperialistic competition, it is first necessary to compute the total objective function of each empire, which relies on the objective functions of both the imperialist and its colonies. Thereafter, each empire attempts to take over the colonies of other empires. Fig. 3


depicts this imperialistic competition. As seen from this figure, the weakest colony of the weakest empire is selected to give it to the suitable empire which is chosen on the basis of a competition among all empires. Based on this model, step by step, weaker empires lose their power and the stronger ones will possess their colonies. When an empire loses all of its colonies, it will collapse. At the end of competition there is a winner empire which has possessed all of the colonies.
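The assimilation-revolution-competition loop described above can be condensed into a few lines. The sketch below is our own highly simplified illustration, not the authors' implementation: it keeps a single imperialist, a fixed β and revolution rate, and omits the empire bookkeeping, but it shows the core update rules on a toy objective.

```python
import numpy as np

rng = np.random.default_rng(1)

def cost(x):                       # toy objective: residual of a small nonlinear system
    return (x[0]**2 + x[1] - 3)**2 + (x[0] + x[1]**2 - 5)**2

n_var, n_pop, n_iter = 2, 30, 200
beta, rev_rate = 2.0, 0.1
countries = rng.uniform(-10, 10, size=(n_pop, n_var))

for _ in range(n_iter):
    costs = np.array([cost(c) for c in countries])
    imp = countries[np.argmin(costs)].copy()          # best country acts as imperialist
    for i in range(n_pop):
        d = imp - countries[i]
        countries[i] += rng.uniform(0, beta) * d      # assimilation toward the imperialist
        if rng.random() < rev_rate:                   # revolution: random repositioning
            countries[i] = rng.uniform(-10, 10, n_var)

best = countries[np.argmin([cost(c) for c in countries])]
print(best, cost(best))    # approaches a root of the two-equation system
```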

Figure 3. Schematic of imperialistic competition [8]
3. APPLICATIONS
In this section, some case studies are given to show the accuracy of the ICA in solving nonlinear systems of equations. Let the general form of a system of nonlinear equations be
$f_1(x_1, x_2, \dots, x_n) = 0$
$f_2(x_1, x_2, \dots, x_n) = 0$
⋮
$f_n(x_1, x_2, \dots, x_n) = 0$
(5)
The above system of nonlinear equations is equivalent to the following optimization problem:
$\min F(x)$, where the merit function $F$ is built from the residuals $f_i$ (for instance, $F(x) = \sum_{i=1}^{n} f_i^2(x)$).
(6)
Let $x^*$ be the global minimum of $F(x)$. If $F(x^*) = 0$, then $x^*$ is a global minimum and thus $f_1(x^*) = f_2(x^*) = \cdots = f_n(x^*) = 0$; accordingly, $x^*$ is a root of the corresponding system of equations [6].

Case study 1. [15] + 4 + 0.75 = 0 (7a) + 0.405 − 1.405 = 0 (7b) 3 − 2 + 1.5 = 0 (7c) 4 − 0.605 − 0.395 = 0 (7d) − 2 + 1.5 = 0 (7e) − = 0 (7f)

ICA Parameters: Search range = [−100 100] Number of decades = 1000 Initial population = 300



Number of imperialist = 3 Revolution rate = 0.2 = 2, = 0 By employing ICA, two solutions are obtained for the problem. Case study 2. [16] + − 5 − 85 = 0 (8a) − − − 60 = 0 (8b) + − − 2 = 0 (8c) 3 ≤ ≤ 5, 2 ≤ ≤ 4, 0.5 ≤ ≤ 2. ICA Parameters: Number of decades = 1000 Initial population = 20 Number of imperialist = 2 Revolution rate = 0.1 = 2, varies from 0 to 2 Case study 3. [1] + + + + − 2 = 0 (9a) − + + + = 0 (9b) − + + = 0 (9c) − + = 0 (9d) + − = 0 (9e)

ICA Parameters: Search range = [−100 100] Number of decades = 2000 Initial population = 50 Number of imperialist = 2 Revolution rate = 0.1 = 2, varies from 0 to 4 Case study 4. [17] 3 − cos( ) − 0.5 = 0 (10a) − 625 − 0.25 = 0 (10b) + 20 + 10 − 33 = 0 (10c)

ICA Parameters: Search range = [−1000 1000] Number of decades = 1000 Initial population = 50 Number of imperialist = 3 Revolution rate = 0.1 varies from 1 to 2, varies from 0 to 4 Case study 5. [7] 3 − = 0 (11a) sin − − 4 = 0 (11b) − + 0.2707 = 0 (11c) 2 − − = 0 (11d)

ICA Parameters:


Search range = [−100 100] Number of decades = 2000 Initial population = 100 Number of imperialist = 2 Revolution rate = 0.1 = 2, varies from 0 to 4 Case study 6. [3] + + = 0 (12a) − 2 + = 0 (12b) + = 0 (12c)

ICA Parameters: Search range = [−10000 10000] Number of decades = 1000 Initial population = 20 Number of imperialist = 1 Revolution rate = 0.1 = 2, = 0 Table 1 presents the solutions of case studies 1-6 obtained by the ICA and other approaches.

Table 1- Results for case studies 1-6

Case study | Method | Solution
Case 1 | ICA (present) | -1.0002, 1.0001, -0.9997, 1.0004, -0.9999, 1.0001
Case 1 | ICA (present, second solution) | 0.9435, -0.6734, 0.3825, 1.8158, -2.1979, 2.0737
Case 1 | CDPSO [4] | -1, 1, -1, 1, -1, 1
Case 1 | PSO [6] | -1, 1, -1, 1, -1, 1
Case 2 | ICA (present) | 4, 3, 1
Case 2 | CDPSO [4] | 4, 3, 1
Case 2 | PSO [6] | 4, 3, 1
Case 3 | ICA (present) | 0.9999, 1.0000, -0.0054, -0.0059, 0.0009
Case 3 | Newton's method [1] | 1, 1, 0, 0, 0
Case 4 | ICA (present) | 0.5000, -0.0002, -0.5236
Case 4 | Modified Newton's method [17] | 0.50, 0, -0.5236
Case 5 | ICA (present) | 2.9996, 1.9999, 1.0001, 0
Case 5 | Invasive weed optimization algorithm and clustering [7] | 3, 2, 1, 0
Case 6 | ICA (present) | -0.0000, 0.6909e-3, -0.7489e-3
Case 6 | Quasi-Newton method [3] | 0, 0, 0

4. CONCLUSIONS
In this work, the application of the imperialist competitive method, a socio-politically based evolutionary algorithm, was examined for solving systems of nonlinear equations. By making use of this approach, some case studies for the systems


of nonlinear equations were presented. Numerical tests revealed the good performance of the ICA in handling a wide variety of nonlinear systems of equations. 5. REFERENCES [1] Shen, Y. Q., and Ypma, T. J., 2005, “Newton’s Method for Singular Nonlinear Equations Using Approximate Left

and Right Nullspaces of the Jacobian,” Appl. Numer. Math., 54, pp. 256–265.

[2] Darvishi, M. T., and Barati, A., 2007, “A Third-Order Newton-Type Method to Solve Systems of Nonlinear Equations,” Appl. Math. Comput., 187, pp. 630–635.

[3] Buhmiler, S., Krejic, N., and Lužanin, Z., 2010, “Practical Quasi-Newton Algorithms for Singular Nonlinear Systems,” Numer. Algor., 55, pp. 481–502.

[4] Mo, Y., Liu, H., and Wang, Q., 2009, “Conjugate Direction Particle Swarm Optimization Solving Systems of Nonlinear Equations,” Comput. Math. Appl., 57, pp. 1877–1882.

[5] Chang, W. D., and Shih, S. P., 2010, “PID Controller Design of Nonlinear Systems Using an Improved Particle Swarm Optimization Approach,” Commun. Nonlinear Sci. Numer. Simulat., 15, pp. 3632–3639.

[6] Jaberipour, M., Khorram, E., and Karimi, B., 2011, “Particle Swarm Algorithm for Solving Systems of Nonlinear Equations,” Comput. Math. Appl., 62, pp. 566–576.

[7] Pourjafari, E., and Mojallali, H., 2012, “Solving Nonlinear Equations Systems with a New Approach Based on Invasive Weed Optimization Algorithm and Clustering,” Swarm Evol. Comput., 4, pp. 33–43.

[8] Atashpaz-Gargari, E., and Lucas C., 2007, “Imperialist Competitive Algorithm: An Algorithm for Optimization Inspired by Imperialistic Competition,” IEEE Congress on Evolutionary Computation, Singapore, 2007, pp. 4661–4667.

[9] Niknam, T., Taherian Fard, E., Pourjafarian, N., and Rousta, A., 2011, “An Efficient Hybrid Algorithm Based on Modified Imperialist Competitive Algorithm and K-means for Data Clustering,” Eng. Appl. Artif. Intel., 24, pp. 306–317.

[10] Karami, A., Rezaei, E., Shahhosseni, M., and Aghakhani, M., 2012, “Optimization of Heat Transfer in an Air Cooler Equipped with Classic Twisted Tape Inserts Using Imperialist Competitive Algorithm,” Exp. Therm. Fluid Sci., 38, pp. 195–200.

[11] Yousefi, M., Darus, A. N., and Mohammadi, H., 2012, “An Imperialist Competitive Algorithm for Optimal Design of Plate-Fin Heat Exchangers,” Int. J. Heat Mass Transfer, 55, pp. 3178–3185.

[12] Talatahari, S., Farahmand Azar, B., Sheikholeslami, R., and Gandomi, A. H., 2012, “Imperialist Competitive Algorithm Combined with Chaos for Global Optimization,” Commun. Nonlinear Sci. Numer. Simulat., 17, pp. 1312–1319.

[13] Talatahari, S., and Mohajer Rahbari, N., 2015, “Enriched Imperialist Competitive Algorithm for System Identification of Magneto-Rheological Dampers,” Mech. Sys. Signal Process., 62–63, pp. 506–516.

[14] Eberhart, R. C., and Kennedy, J., 1995, “A New Optimizer Using Particle Swarm Theory,” Proc. Sixth Int. Symp. Micro Machine and Human Sci., Nagoya, Japan, pp. 39–43.

[15] Krzyworzcka, S., 1996, “Extension of the Lanczos and CGS Methods to Systems of Nonlinear Equations,” J. Comput. Appl. Math., 69, pp. 181–190.

[16] Zeng, Y., 2005, “The Application of Floating Genetic Algorithms to Solving Non-Linear Equation Groups,” J. East China Jiaotong University, 22, pp. 152–155.

[17] Hueso, J. L., Martínez, E., and Torregrosa, J. R., 2009, “Modified Newton’s Method for Systems of Nonlinear Equations with Singular Jacobian,” J. Comput. Appl. Math., 224, pp. 77–83.


Comparison between Optimal Homotopy asymptotic method and Homotopy perturbation Method for solving a class of voltera integral

equation

Z. Ayati 1 , E. Moradi , F. Hajipour2

1. Department of Engineering sciences, Faculty of Technology and Engineering East of Guilan, University of Guilan, P.C. 44891-63157, Rudsar-Vajargah, Iran

Abstract

Many phenomena in physics and engineering reduce to integral equations, and there are several methods for solving these equations. The homotopy perturbation method is a powerful device for solving a wide variety of problems arising in many scientific applications. In this article, the optimal homotopy asymptotic method, as a modification of the homotopy perturbation method, is used to obtain an approximate solution of a class of integral equations, and a numerical comparison is conducted between the optimal homotopy asymptotic method and the homotopy perturbation method for solving these equations. Two illustrative examples are presented to show that the solution obtained by the optimal homotopy asymptotic method is more accurate than that of the homotopy perturbation method.
Keywords: Optimal homotopy asymptotic method; Volterra integral equation; Homotopy perturbation method

1. Introduction
Many real-world problems are modeled by Volterra integral equations, for example in fluid mechanics and bio-mechanics [1, 2, 3], electrostatics [4], diffusion problems [5] and heat conduction problems [6]. However, it is difficult to obtain analytic solutions, especially for nonlinear equations; in most cases, only approximate solutions (either analytical or numerical) can be expected. There are several methods for obtaining approximate solutions of integral equations. One of them is the homotopy perturbation method; the method, introduced by He in 1998 and well addressed in [7], is known as a powerful device for solving different kinds of equations [8]. The general form of a Volterra integral equation of the first kind is
$$\int_{a(x)}^{b(x)} k(x,t)\,g(u(t))\,dt = f(x), \quad a(x) \le t \le b(x) \le T, \tag{1}$$
where $f$ is a known function called the free term, $k$ is the kernel of the integral and $g$ is a linear or nonlinear function of the unknown function $u$. To solve Volterra integral equations of the first kind by HPM or OHAM, we need their canonical form. Consider the following equation:
$$\int_a^x k(x,t)\,u(t)\,dt = g(x), \quad a \le t \le x \le b. \tag{2}$$
First, differentiate both sides of Eq. (2) with respect to $x$; according to the generalized Leibniz formula, we obtain
$$k(x,x)\,u(x) + \int_a^x k_x(x,t)\,u(t)\,dt = g'(x). \tag{3}$$
If $k(x,x) \ne 0$, then the above equation can be rewritten as
$$u(x) = -\int_a^x \frac{k_x(x,t)}{k(x,x)}\,u(t)\,dt + \frac{g'(x)}{k(x,x)}. \tag{4}$$
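The reduction (2) to (4) can be carried out symbolically. The short sketch below is ours; it applies the Leibniz rule with SymPy, using as input the kernel and free term of the first-kind equation solved as Example 1 in Section 3, and it reproduces the corresponding canonical second-kind form.

```python
import sympy as sp

x, t = sp.symbols('x t')
k = x**2 - t**2 + 1                        # kernel of Example 1 below
g = sp.exp(x) * (2 * x - 1) - x**2 + 1     # free term of Example 1 below

# Leibniz rule applied to  int_0^x k(x, t) u(t) dt = g(x):
k_diag  = k.subs(t, x)                     # k(x, x)
k_x     = sp.diff(k, x)                    # partial derivative k_x(x, t)
kernel2 = sp.simplify(-k_x / k_diag)       # new kernel multiplying u(t) in Eq. (4)
freeterm = sp.simplify(sp.diff(g, x) / k_diag)   # new free term g'(x)/k(x, x)

print(kernel2)    # -2*x
print(freeterm)   # (2*x + 1)*exp(x) - 2*x
```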

We are now ready to solve the problem with HPM or OHAM.

2. Basic idea of optimal homotopy asymptotic method (OHAM)

The general canonical form of the Volterra integral equations which we consider in this paper is
$$u(x) + f(x) + \lambda\int_a^x k(x,t)\,u^{p}(t)\,dt = 0, \quad x \in [a,b],\ p \in \mathbb{N}, \tag{5}$$
where $\lambda$ is a real number, $k(x,t)$ is the kernel of the integral equation, a continuous function on $[a,b]\times[a,b]$, and $f(x)$ is an analytic function defined on $[a,b]$. We construct an optimal homotopy $u(x,p) : \Omega \times [0,1] \to \mathbb{R}$ as follows:
$$(1-p)\big[u(x;p) + f(x)\big] - H(p)\Big[u(x;p) + f(x) + \lambda\int_a^x k(x,t)\,u^{p}(t)\,dt\Big] = 0, \tag{6}$$
where $p \in [0,1]$ is an embedding parameter, $H(p) = pc_1 + p^2c_2 + p^3c_3 + \cdots$, and $c_i$, $i = 1,2,3,\dots$, are auxiliary constants. To find the approximate solution of the problem, we use a Taylor series expansion about $p$ as follows:
$$u(x; p, c_i) = u_0(x) + \sum_{k=1}^{\infty} u_k(x; c_i)\,p^{k}, \quad i = 1,2,3,\dots \tag{7}$$
Substituting (7) into (6) and equating the coefficients of like powers of $p$, we obtain a sequence of problems:

the zeroth-order problem
$$p^0:\quad u_0(x) + f(x) = 0,$$
the first-order problem
$$p^1:\quad u_1(x) = c_1\lambda\int_a^x k(x,t)\,u_0^{\,p}(t)\,dt,$$
and, for $j \ge 2$, higher-order problems (8)-(13) in which $u_j(x)$ is expressed in terms of $u_0, u_1, \dots, u_{j-1}$, the auxiliary constants $c_1, \dots, c_j$, and integrals of the form $\int_a^x k(x,t)(\cdots)\,dt$ arising from the expansion of $u^{p}$. By substituting the solutions of these problems into Eq. (7), we obtain the approximate analytic solution of our problem. Note that the constants $c_i$, $i = 1,2,3,\dots$, appear in this solution; to determine them, we form the residual equation

$$R(x; c_i) = u(x; c_i) + f(x) + \lambda\int_a^x k(x,t)\,u^{p}(x; c_i)\,dt, \tag{14}$$
and then find $c_i$, $i = 1,2,3,\dots$, by one of two methods:
1. For points $k_i \in (a,b)$, impose
$$R(k_1, c_i) = R(k_2, c_i) = R(k_3, c_i) = \cdots = R(k_m, c_i) = 0, \quad i = 1,2,3,\dots,m.$$
2. Construct $J(c_1, c_2, c_3, \dots, c_m)$ as
$$J(c_1, c_2, \dots, c_m) = \int_a^b R^2(x; c_1, c_2, \dots, c_m)\,dx, \quad a \le x \le b,$$
and derive $c_i$, $i = 1,2,3,\dots$, by minimizing $J$, where $a$ and $b$ bound the domain of the problem.
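The second method above is a standard least-squares fit of the auxiliary constants. A generic sketch (ours, with a placeholder residual rather than the residual of the paper's examples) is given below; in practice R would be assembled from the computed OHAM terms u0, u1(c1), u2(c1, c2), and so on.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.integrate import quad

def residual(x, c):
    # placeholder residual R(x; c1, c2) used only to illustrate the procedure
    c1, c2 = c
    return (1 + c1) * np.exp(-x) + c2 * x - 1.0

def J(c, a=0.0, b=1.0):
    # J(c) = int_a^b R(x; c)^2 dx, evaluated by quadrature
    val, _ = quad(lambda x: residual(x, c) ** 2, a, b)
    return val

best = minimize(J, x0=[0.0, 0.0], method="Nelder-Mead")
print(best.x, J(best.x))    # constants c1, c2 minimizing the integrated squared residual
```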

3. Numerical examples

In this section, two examples of linear and nonlinear Volterra integral equations are solved via HPM and OHAM to illustrate the methods.

Example 1. Consider the following linear Volterra integral equation, with exact solution $u(x) = e^x$:
$$\int_0^x \big(x^2 - t^2 + 1\big)\,u(t)\,dt = e^x(2x - 1) - x^2 + 1. \tag{15}$$
Solution of the problem by HPM

(16)

Here, we construct a homotopy of Eq. (16).

00

(1 )( ( ) ( )) ( ( ) 2 [ ( ) 2 (1 2 )] ) 0x

xp U x u x p U x x U t x e x dt− − + − + − + =∫ .

So

(17)

It is very natural to assume that the solution of (17) can be expressed as a power series in p

20 1 2( ) ( ) ( ) ( )U x U x pU x p U X= + + +L (18)

Substitute Eq.(18) in Eq .(17) and Equating the coefficients of the terms with the identical powers of p, leads to

0

( ) 2 ( ) 2 (1 2 ).x

xu x x u t dt x e x+ = − + +∫

0 00

( ) ( ) ( ( ) 2 ( ) 2 (1 2 )) .x

xU x u x p u x x U t x e x dt− = − + + − +∫

1290 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

00 0

11 0

22 1

: ( ) ( ) 0

: ( ) 2 ( ) 0

: ( ) 2 ( ) 0

x

ox

o

p U x u x

p U x x u t dt

p U x x u t dt

− =

− =

− =

∫M

So we derive

02 3

13 5 3 4

2

( ) 2 (1 2 ),( ) 4 2 2 2 ,( ) 4 4 4 8

x

x x

x x

U x x e xU x x e xe x xU x x x e x x e

= − + +

= − − +

= − − +M

Therefore, the three-terms approximation of the solution will be derived as the following form

0 1 24 2 5 3 3

( ) ( ) ( ) ( )( ) 8 4 4 4 2x x x x

U x U x U x U XU x x e x e e x e x x

= + +

= + + − − +

Solution of the problem via OHAM

The OHAM formulation for Eq. (16) is as fallows

(1 )[ ( ( ; ) ( )] ( )[ ( ( ; )) ( ( ; )) ( )] 0p L u x p f x H p L u x p N u x p f x− + − + + =

Where 0

( ( ; )) ( ), ( ( ; )) 2 ( ) ,x

L u x p u x N u x p x u t dt= = ∫ and ( ) 2 (1 2 )xf x x e x= − + . By substituting

20 1 2( ) ,u x u pu p u= + + +L and 2 3

1 2 3( )H p pc p c p c= + + +L we derive

2 2 3 20 1 2 1 2 3 0 1 2

20 1 2

0

(1 )[( ( ) ( ) ( ) ) 2 (1 2 )] ( )[( ( ) ( ) ( ) )

2 ( ( ) ( ) ( ) ) 2 (1 2 )].

x

xx

p U x pU x p U x x e x pc p c p c U x pU x p U x

x U x pU x p U x dt x e x

− + + + + − + = + + + + + +

+ + + + + − +∫

L L L

L

Equating the coefficients of the same powers of ,p leads to: (19)

00 0: ( ) ( ) 2 (1 2 ),xp U x u x x e x= = − + +

(20)

so

21 1( ) 2 ( 2 1)x xU x xc x xe e= − + − +

42 2 2 2

2 1 1 2 1( ) 2 (1 )( (2 1) 1) 2 ( (2 1) 1) 4 ( 5 (1 5 ) 2 5)4

x x x xxU x xc c x e x xc x e x xc e x x e−= + − + − + + − + − + + + − + −

11 0 1 0 1 0 1

0

22 1 1 1 2 0 2 0 1 1 2

0 0

33 2 1 2 2 1 3 0 3 0

0

: ( ) ( ) (2 (1 2 )) ( ) 2 ( ) (2 (1 2 )),

: ( ) ( ) ( ) ( ) 2 ( ) 2 ( ) (2 (1 2 ),

: ( ) ( ) ( ) ( ) ( ) 2 ( ) 2

xx x

x xx

x

p u x u x x e x c u x xc u t dt c x e x

p u x u x c u x c u x xc u t dt xc u t dt c x e x

p u x u x c u x c u x c u x c x u t dt x

− − − + = + + − +

− = + + + + − +

− = + + + +

∫ ∫

∫ 2 1 1 20 0

( ) 2 ( ) ,x x

c u t dt xc u t dt+∫ ∫M

1291 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

(21)

Therefore, by considering 3

0( ) ( ; )j i

jU x U x c

=

= ∑ , we get

(22)

$$c_1 = -0.2008776205, \qquad c_2 = -0.3023103724, \qquad c_3 = -0.001243492533.$$

We Substituting above values in Eq. (22) and derive:

6 5

4 3 2

7 5 3

3.945698305 0.1296924363e 0.6484621814e1.810389418e 0.9051947090e 3.896370577e0.6484621814x 0.9700409271 2.8533799980.9987565075 3.945698304

x x

x x x

x x

u x x xx x x

x xe e x

= − − +

+ − −

+ − +

+ +

Table 1 shows the numerical results compared with the exact solution.
Table 1 - Numerical results compared with the exact solution (Example 1)
x | u_HPM | u_OHAM | u_exact | |u_HPM - u_exact| | |u_OHAM - u_exact|
0 | 1 | 0.9987565075 | 1 | 0 | 0.0012434912
0.1 | 1.147801208 | 1.104276307 | 1.105170918 | 0.042630290 | 0.000894611
0.2 | 1.408096266 | 1.221465888 | 1.221402758 | 0.1186693508 | 0.000063130
0.3 | 1.821774078 | 1.350538913 | 1.34985808 | 0.471915998 | 0.000680833
0.4 | 2.457251080 | 1.491876529 | 1.491824698 | 0.965426382 | 0.000051831
0.5 | 3.422442542 | 1.647358377 | 1.648721271 | 1.773721271 | 0.001362894
0.6 | 4.881792001 | 1.822408906 | 1.822118800 | 3.059673201 | 0.000290106
0.7 | 7.079575499 | 2.028851546 | 2.013752707 | 5.065822792 | 0.015098839
0.8 | 10.37095040 | 2.288646525 | 2.225540928 | 8.145409472 | 0.063105597
0.9 | 15.26251933 | 2.638561036 | 2.459603111 | 12.80291622 | 0.178929926
1 | 22.46453645 | 3.135779341 | 2.718281828 | 19.74625462 | 0.417497513

Example 2. Consider the following nonlinear Volterra integral equation with exact solution, ( )u x x= ,

7 3 3 3 3 3 4 3 2 53 1 1 3 1 3 1 2 1 2 1 2 1 2 1 2

6 3 5 3 4 3 3 3 2 3 3 2 4 21 1 1 1 1 1 3 1 2 1

3 2 2 2 2 21 1 1 2

( ) 8 6 2 2 16 8 8 4 816 8 16 8 4 2 4 4 168 8 4 4

x x x x x

x x x x x x x x

x x x x

U x x c x c x c xc e c e x c c e x c c e x c c e xc c x c ce x c e x c e x c e x c e x c e xc e x c xc c e x c

e x c e x c e x c e x c

= − + − + + + − + − −

+ − + − + − + + +

− + + + − 2 3 2 5 2 31 1 2 1 1 1

3 22 1 2 1

4 2 2 4 8 22 4 2 2 .

x x xe xc e xc e xc x c x c x cx c xc xc xc

− − + − −

− + + +

7 3 3 3 3 3 4 3 21 1 3 1 3 1 2 1 2 1 2 1 2

5 4 3 3 3 2 3 3 2 3 4 21 2 1 1 1 1 3 1 2 1 2 1

3 2 2 2 2 21 1 1 2 1

( ) 2 8 6 2 2 16 8 8 48 16 8 4 2 4 4 4 2412 12 12 8 6

x x x x x

x x x x x x

x x x x x

U X x x c x c x c xc e c e x c c e x c c e x c c e xc cx c c e x c e x c e x c e xc e x c xc c x c c e x ce x c e x c e x c e x c e xc

= − − + − + + + − + −

− + − + − + + + +

− + + + − 2 3 2 5 21 2 1 1

3 3 21 2 1 2 1

6 4 6 126 4 6 4 2 6 .

x x

x x

e xc e xc x c x cx c x c xc xc e e x xc

− − + −

− − + + + + +

$$\int_0^x \big(x\,u^3(t) - t\,u^3(t) + u(t)\big)\,dt = \frac{1}{20}x^5 + \frac{1}{2}x^2. \tag{23}$$

Solution of the problem by HPM
Differentiating Eq. (23) once, we have
$$u(x) - \int_0^x u^3(t)\,dt = \frac{1}{4}x^4 + x. \tag{24}$$
We construct a homotopy of Eq. (24) in the following form:
$$(1-p)\big(U(x) - u_0(x)\big) + p\Big(U(x) - \int_0^x U^3(t)\,dt - \frac{1}{4}x^4 - x\Big) = 0. \tag{25}$$
Substituting $U(x) = U_0(x) + pU_1(x) + p^2U_2(x) + \cdots$ into Eq. (25) and equating the coefficients of the terms with identical powers of $p$ leads to

identical powers of p, leads to

So, we get

40 0

4 3 13 10 7 41

0

2 22 19 16 13 10 132 0 1

0

1( ) ( ) ( ),4

1 1 3 3 1( ) ( )4 832 160 28 4

3 177 1509 591 39 3( ) 3 ( ) ( )292864 632320 465920 29120 560 28

x

x

U x u x x x

U x x x dt x x x x

U x U t U t dt x x x x x x

= = +

= + = + + +

= = + + + + +

∫M

Therefore, the three-term approximation of the solution will be derived as the following form

4 13 10 7

22 19 16

1 313 99 3( )2 14560 1120 14

3 177 1509 .292864 632320 465920

U X x x x x x

x x x

= + + + +

+ + +

Solution of the problem via OHAM

According to the OHAM, we construct the following homotopy

3 4

0

1( ) ( )4

x

u x u t dt x x− = +∫

0 40 0

1 31 0

0

2 22 0 1

0

1: ( ) ( ) ( ) 0,4

: ( ) ( ) 0,

: ( ) 3 ( ) ( ) 0,

x

x

p U x u x x x

p U x U t dt

p U x U t U t dt

− − + =

− =

− =

∫M


4 2 3 3 41 2 3

0

1 1(1 )[ ( ) ] ( )[ ( ) ( ) ] 0.4 4

x

p u x x x pc p c p c u x u t dt x x− − − − + + + − − − =∫L

Substituting equation (7) into above equation, and equating the terms with the identical powers of p, we get:

0 40 0

1 31 1 0

0

2 3 3 22 1 1 2 0 1 1 1 0 1

0 0 0

1: ( ) ( ) ( ) 0,4

: ( ) ( ) 0,

: ( ) (1 ) ( ) ( ) ( ) 3 ( ) ( ) 0,

x

x x x

p U x u x x x

p U x c u t dt

p U x c U x c u t dt c u t dt c u t u t dt

− − + =

+ =

− + + + − =

∫ ∫ ∫M

So we get

40 0

13 10 7 41 1

13 10 7 4 2 13 2 10 2 42 1 1 1 1 1 1 1

13 10 7 4 4 402 2 2 2 1

1( ) ( ) ,4

1 3 3 1( ) ( ),832 160 28 4

1 3 3 1 139 57 1( )832 160 28 4 7280 1120 4

1 3 3 1 1 9832 160 28 4 23037214720 4097966080

U x u x x x

U x c x x x x

U x c x c x c x c x c x c x c x

c x c x c x c x c x c

= = +

= − + + +

−= − − − + + −

− − − − + + 4 371

4 34 4 31 4 28 4 251 1 1 1

4 22 4 19 4 16 4 13 2 221 1 1 1 1

2 191 1

3357 107391 3069 7191965899724800 150212608000 456601600 1630720000

102021 1521 9 1 3502261760 2383360 7168 832 292864

177 1509632320 465920

x

c x c x c x c x

c x c x c x c x c x

c x c

+ + + +

+ + + + +

+ + 2 16 ,x

M

Therefore, the three-term approximation of the solution will be derived as the following form

3 10 3 4 5 40 5 37 5 341 1 1 1 1

5 31 5 28 5 25 5 22 5 191 1 1 1 1

51

9 1 1 9 3357( )280 4 23037214720 4097966080 65899724800

107391 3069 71919 102021 1521150212608000 456601600 1630720000 502261760 2383360

97168

U x x c x c x c x c x c x

c x c x c x c x c x

c

= + − + + +

+ + + + +

+ 16 5 13 13 10 7 4 19 161 3 3 3 3 2 1 2 1

13 10 7 4 22 13 10 71 2 1 2 1 2 1 2 2 1 1 1 1

4 2 131 1

1 1 3 3 1 177 1509832 832 160 28 4 632320 465920

521 9 3 1 3 3 9 929120 280 28 2 292864 832 160 283 1077 934 29120

x c x c x c x c x c x c c x c c x

c c x c c x c c x c c x c c x c x c x c x

c x c x

+ − − − − + +

+ + − − + − − −

− + + 2 10 2 7 2 4 13 10 7 41 1 1 2 2 2 2

4 40 4 37 4 34 4 31 4 281 1 1 1 1

4 251

3 3 1 3 3 11120 28 4 416 80 14 2

1 9 3357 107391 306911518607360 2048983040 32949862400 75106304000 228300800

71919 102021815360000 25113088

c x c x c x c x c x c x c x

c x c x c x c x c x

c x

− − − − − −

+ + + + +

+ + 4 22 4 19 4 16 4 13 2 22 2 191 1 1 1 1 1

2 16 3 28 3 25 3 22 3 19 3 16 3 131 1 1 1 1 1 1

1521 9 1 3 1770 1191680 3584 416 146432 316160

1509 33 5517 843 3033 519 1232960 24227840 232960000 3660800 2383360 163072 320

38583

c x c x c x c x c x c x

c x c x c x c x c x c x c x

+ + + + +

+ − − − − − +

− 3 31 41

1 .5776 4

c x x+


By using the OHAM technique, we obtain
$$C_1 = -19.016492923, \qquad C_2 = -26.2315124590, \qquad C_3 = -6.251765.$$

31 28 25 22

19 16 40 37

34 13 10 7

( ) 1.590697674 14.94786492 97.97938916 450.35876181409.450197 2749.741916 0.00009659658914 0.0048872652650.1133599124 2563.828145 132.4382406 5.39550209 0.3005

U x x x x x xx x x x

x x x x

= − − − −

− − − −

− − + + + 4277000x

Table 2 shows the numerical results compared with the exact solution.
Table 2 - Numerical results compared with the exact solution (Example 2)
x | u_HPM | u_OHAM | u_exact | |u_HPM - u_exact| | |u_OHAM - u_exact|
0 | 0 | 0 | 0 | 0 | 0
0.05 | 0.05000312517 | 0.05000188252 | 0.05 | 0.00000312517 | 0.00000188252
0.1 | 0.100500214 | 0.1000306052 | 0.1 | 0.000500214 | 0.0000306052
0.15 | 0.1502534916 | 0.1501620753 | 0.15 | 0.0002534916 | 0.0001620753
0.2 | 0.2008027520 | 0.2005613501 | 0.2 | 0.0008027520 | 0.0005613501
0.25 | 0.2519662886 | 0.2515906900 | 0.25 | 0.0019662886 | 0.0015906900
0.3 | 0.3040973897 | 0.3039755626 | 0.3 | 0.0040973897 | 0.0039755626
0.35 | 0.3576434589 | 0.3584597253 | 0.35 | 0.0076434589 | 0.0084597253
0.4 | 0.4131605001 | 0.4119945577 | 0.4 | 0.0131605001 | 0.0119945577
0.45 | 0.4713346199 | 0.4398798365 | 0.45 | 0.0213346199 | 0.0101201635

4. Conclusions
In this article, we sought the solution of a Volterra integral equation by applying the homotopy perturbation method and the optimal homotopy asymptotic method. The study shows that the optimal homotopy asymptotic method leads to more accurate results than the homotopy perturbation method and is therefore more effective. The computations associated with the examples in this work were performed using Maple 18.

5. References

1. A.J. Jerri, Introduction to Integral Equations with Applications, Wiley, New York, 1999.
2. M. Rahman, Integral Equations and their Application, WIT Press, Southampton, Boston, 2007.
3. A.M. Wazwaz, A First Course in Integral Equations, World Scientific, 1997.
4. H.J. Ding, H.M. Wang, W.Q. Chen, Analytical solution for the electrostatic dynamics of a nonhomogeneous spherically isotropic piezoelectric hollow sphere, Arch. Appl. Mech., 2003.
5. P. Baratella, A Nystrom interpolant for some weakly singular linear Volterra integral equations, J. Comput. Appl. Math., 2009.
6. M.A. Bartoshevich, On one heat conduction problem, Inz.-Fiz. Zh., 1975.
7. J.H. He, Homotopy perturbation technique, Computer Methods in Applied Mechanics and Engineering, 1999.
8. S. Salahshour, M. Khan, Exact solution of nonlinear interval Volterra integral equations, 2012.
9. N. Ngarasta, Solving integral equations of the first kind by the decomposition method, Kybernetes, 2009.


Investigation of surface and nonlocal effects in the vibration of mass- attached nanotubes by Differential transform method

Asghar Zajkani1, Gholam Reza Shaghaghi 2

1, 2- Department of Mechanical Engineering, Faculty of Engineering, Imam Khomeini International University, Qazvin

Corresponding author email: [email protected]

Abstract In this paper, the nonlocal Euler–Bernoulli beam theory is employed for the vibration of nanotubes carrying a spherical nanoscale object at the free end, with surface effects taken into account, using the semi-analytical differential transform method (DTM). The nonlocal Eringen theory takes into account the effect of small size, which makes the present model suitable for the analysis and design of nanosensors and nanoactuators. The governing equations are derived through Hamilton's principle and solved by the differential transform method (DTM). The good agreement between the results of this article and those available in the literature validates the presented approach. The detailed mathematical derivations are presented and numerical investigations are performed, with emphasis on the effects of several nanotube parameters, such as the size effect, mode number, surface effect and attached mass, on the normalized natural frequencies and mode shapes of the nanotube. It is concluded that these effects play a significant role in the dynamic behavior of the nanotube. Numerical results are presented to serve as benchmarks for future analyses of such nanotubes.

Keywords: Nanotube, Nonlocal elasticity, Differential transformation method, surface effect

1. INTRODUCTION Nanoscale engineering materials have significant mechanical, electrical and thermal performances that are superior to the conventional structural materials. They have attracted great interest in modern science and technology after the invention of carbon nanotubes (CNTs) by Iijima [1]. For example, in micro/nano electromechanical systems (MEMS/NEMS); nanostructures have been used in many areas including communications, machinery, information technology, biotechnology technologies. So far, three main methods were provided to study the mechanical behaviors of nanostructures: atomistic model by Baughman et al., [2], semi-continuum and continuum models by Wang and Cai, [3]. However, both atomistic and semi-continuum models are computationally expensive and are not suitable for analyzing large scale systems. In other words, since conducting experiments at the nanoscale is a daunting task, and atomistic modeling is restricted to small-scale systems owing to computer resource limitations, continuum mechanics offers an easy and useful tool for the analysis of nanostructures. Therefore, there are considerable efforts made to develop and calibrate continuum structural models for nanobeams analysis. Moreover due to the inherent size effects, at nanoscale, the mechanical characteristics of nanostructures are often significantly different from their behavior at macroscopic scale. Such effects are essential for nanoscale materials or structures and the influence on nano-instruments is great by Maranganti and Sharma, [4]. Generally, theoretical studies on size effects at nanoscale are by means of surface effects by Zhu et al., [5], strain gradients in elasticity by Mindlin, [6] and plasticity by Aifantis, [7], and the nonlocal stress field theory by Eringen [8,9] . Unfortunately, the classical continuum theories are deemed to fail for these nanostructures. Consequently, the classical continuum models need to be extended to consider the nanoscale effects and this can be achieved through the nonlocal elasticity theory proposed by Eringen [9] which consider the size-dependent effect. According to this theory, the stress state at a reference point is considered as a function of strain states of all points in the body. This nonlocal theory is proved to be in accordance with atomic model of lattice dynamics and with experimental observations on phonon dispersion by Eringen [8]. In nonlocal theory, the nonlocal nanoscale in the constitutive equation could be considered simply as a material-dependent parameter. The ratio of internal characteristic


scale (such as lattice parameter, C-C bond length, granular distance, etc.) to external characteristic scale (such as crack length, wave length, etc.) is defined within a nonlocal nanoscale parameter. If the internal characteristic scale is much smaller than the external characteristic scale, the nonlocal nanoscale parameter approaches zero and the classical continuum theory is recovered. In recent years, nanobeams hold a wide variety of potential applications Zhang et al [10]. Based on nonlocal elasticity theory, the equations of motion and bending of Euler–Bernoulli beam for cantilever microtubules was presented The object mass is used in medicine. For example, vibration of carbon nanotube (CNT)-based biosensor. A CNT-based biosensor is modeled as a nonlocal Timoshenko beam made of multiwall CNT carrying a spherical nanoscale bio- object at the free end has been investigated [11]. In order to study the mechanical behaviors of nanostructures, the surface effects and nonlocal elasticity are two important issues investigated by researchers separately, or simultaneously. The surface of a solid is a region with small thickness which has different properties from the bulk. If the surface energy-to-bulk energy ratio is large, for example in the case of nanostructures, the surface effects cannot be ignored. To account for the effect of surfaces on mechanical deformation, the surface elasticity theory is presented by modeling the surface as a two dimensional membrane adhering to the underlying bulk material without slipping [12]. In this study, differential transformation method is applied in analyzing vibration characteristics of nanotubes that carrying a spherical nanoscale object at the free end with considering surface effect. The superiority of the DTM is its simplicity and good precision and depends on Taylor series expansion while it takes less time to solve polynomial series. It is different from the traditional high order Taylor’s series method, which requires symbolic competition of the necessary derivatives of the data functions. The Taylor series method is computationally taken long time for large orders. With this method, it is possible to obtain highly accurate results or exact solutions for differential equations. To the author’s best knowledge there is no work reported on the application of DTM on vibration analysis of nanotubes. In this study the non-classical beam model is developed within the framework of Euler–Bernoulli beam theory is developed for nanotube. Governing equations and boundary conditions for the free vibration of a nonlocal beam have been derived via Hamilton principle. The detailed mathematical derivations are presented and numerical investigations are performed while the emphasis is placed on investigating the effect of several parameters such as size effects, mode number, attach mass and surface effect on vibration characteristics of nanotubes. Comparisons with the results from the existing literature are provided and the good agreement between the results of this article and those available in literature validated the presented approach. Numerical results are presented to serve as benchmarks for the application and the design of nanoelectronic and nano-drive devices, nano-oscillators, and nanosensors, in which nanotubes act as basic elements. 1. THEORY AND FORMULATION

1.1. NONLOCAL ELASTICITY THEORY

Based on the Eringen nonlocal elasticity model [8], the stress at a reference point x in a body is considered as a function of the strains of all points in the adjacent region. This assumption is in agreement with the experimental observations of the atomic theory and lattice dynamics in phonon scattering. For a homogeneous and isotropic elastic solid, the nonlocal

stress-tensor components σ_ij at any point x in the body can be expressed as

σ_ij(x) = ∫_Ω α(|x′ − x|, τ) t_ij(x′) dΩ(x′)    (1)

where t_ij(x′) are the components of the classical local stress tensor at point x′, which are related to the components of the linear strain tensor ε_kl by the conventional constitutive relations for a Hookean material,

t_ij = C_ijkl ε_kl    (2)

The meaning of Eq. (1) is that the nonlocal stress at point x is the weighted average of the local stress of all points in the neighborhood of x, the size of which is related to the nonlocal kernel α(|x′ − x|, τ). Here |x′ − x| is the Euclidean distance and τ is a constant given by

τ = e₀a / l    (3)


which represents the ratio between a characteristic internal length a (such as the lattice parameter, C–C bond length or granular distance) and a characteristic external length l (e.g. crack length, wavelength), through an adjusting constant e₀ that depends on each material. The magnitude of e₀ is determined experimentally or approximated by matching the dispersion curves of plane waves with those of the atomic lattice dynamics.

According to Eringen and Edelen [13], for a class of physically admissible kernels α(|x′ − x|, τ), it is possible to represent the integral constitutive relation in Eq. (1) in an equivalent differential form as

(1 − (e₀a)²∇²)σ_kl = t_kl    (4)

where ∇² is the Laplacian operator. Thus, the scale length e₀a takes into account the size effect on the response of a nanostructure. For an elastic material in the one-dimensional case, the nonlocal constitutive relation may be simplified as [13]

σ_xx − μ ∂²σ_xx/∂x² = E ε_xx    (5)

where σ_xx and ε_xx are the nonlocal stress and strain, respectively, μ = (e₀a)² is the nonlocal parameter, and E is the elasticity modulus.

1.2. SURFACE EFFECT THEORY The energy associated with atoms in the surface layers is different from that of atoms in the bulk of the material; it is called the surface free energy. In most studies this energy has been neglected, because it is introduced by only a few layers of atoms near the surface, but for nanosized structures it cannot be ignored [18]. At the nanoscale this effect has a dominant influence because of the high ratio of surface energy to volume, which implies a higher elastic modulus and mechanical strength than predicted by the classical theories.

Consider the nanotube of length L and inner and outer radii Ri and Ro in Fig. 1. The curvature of a beam in bending can be approximated by ∂²w/∂x², where w denotes the lateral deflection of the beam. The Laplace–Young equation [19], given below, indicates that for a beam with curvature ∂²w/∂x² the distributed transverse load q induced by the residual surface stress is:

Figure. 1. Geometry of a nanotube with length L and inner and outer radii Ri and Ro

q = q₀ + H ∂²w/∂x²    (6)

where q₀ is the initial distributed transverse load and the parameter H is a constant determined by the residual surface stress and the shape of the cross section [14]. For circular cross sections, H is given by

H = 2τ₀D    (7)


where τ₀ is the residual surface stress under unstrained conditions and D is the diameter of the circular beam. The effective flexural rigidity, (EI)*, for the nanotube shown in Fig. 1 is given by [14]:

(EI)* = (1/4)Eπ(Ro⁴ − Ri⁴) + Eˢπ(Ro³ + Ri³)    (8)

where Eˢ is the surface elastic modulus, which can be determined by atomistic simulations or experiments, and Ri and Ro denote the inner and outer radii of the nanotube, respectively.
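For illustration, Eqs. (7) and (8) can be evaluated numerically; the sketch below uses the Al bulk and surface properties listed later in Table 3 (E = 70 GPa, Eˢ = 5.1882 N/m, τ₀ = 0.9108 N/m), while the radii Ri and Ro are hypothetical values chosen only for this example.

```python
import math

# Bulk and surface properties of Al (Table 3)
E = 70e9        # bulk Young's modulus, Pa
Es = 5.1882     # surface elastic modulus, N/m
tau0 = 0.9108   # residual surface stress, N/m

# Hypothetical nanotube cross-section (illustrative values, not from the paper)
Ri = 1.0e-9     # inner radius, m
Ro = 2.5e-9     # outer radius, m
D = 2 * Ro      # outer diameter, m

H = 2 * tau0 * D                                              # Eq. (7)
EI_bulk = 0.25 * E * math.pi * (Ro**4 - Ri**4)                # classical flexural rigidity
EI_eff = EI_bulk + Es * math.pi * (Ro**3 + Ri**3)             # Eq. (8), with the surface layer

print(f"H = {H:.3e} N, bulk EI = {EI_bulk:.3e} N*m^2, (EI)* = {EI_eff:.3e} N*m^2")
```

With radii of a few nanometres the surface term is a noticeable fraction of the bulk flexural rigidity, which is consistent with the stiffening effect of the surface parameters discussed in the numerical results.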

1.3. THE EULER-BERNOULLI BEAM THEORY The simplest beam theory is the Euler–Bernoulli theory (EBT), which implies that plane sections that are normal to the mid-plane of the beam remain straight and normal to the mid-plane after deformation. Thus, the effects of shear deformation and rotational inertia are not included in this theory. By the EBT, the displacement u1 of a point at

coordinate x with a height z from the mid-plane of the beam is [15]:

u₁ = u(x, t) − z ∂w/∂x,   u₂ = 0,   u₃ = w(x, t)    (9)

where (u, w) are the axial and transverse displacements of a point on the mid-plane of the beam. The axial strain ε_xx and shear strain γ_xz of the beam based on the EBT are

ε_xx = ∂u/∂x − z ∂²w/∂x²,   γ_xz = ∂u₁/∂z + ∂u₃/∂x = 0    (10)

The governing equations of motion and the boundary conditions for beam can be derived from Hamilton’s principle [20]:

δ∫₀ᵗ (T − U + V) dt = 0    (11)

Where U is the strain energy, T is the kinetic energy and V is the work done by external forces. The first variation of the strain energy can be calculated as:

δU = ∫₀ᴸ (N ∂δu/∂x − M ∂²δw/∂x²) dx    (13)

Where N and M are the axial force and bending moment, respectively. These stress resultants used in Eq. (13) are defined as

N = ∫_A σ_xx dA,   M = ∫_A σ_xx z dA    (14)

The kinetic energy for the beam can be written as

T = (1/2) ∫₀ᴸ ∫_A ρ[(∂u₁/∂t)² + (∂u₃/∂t)²] dA dx    (15)

Accordingly, the first variation of Eq. (15) can be obtained as

δT = ∫₀ᴸ [ρA (∂u/∂t)(∂δu/∂t) + ρA (∂w/∂t)(∂δw/∂t) + ρI (∂²w/∂t∂x)(∂²δw/∂t∂x)] dx    (16)


where ρ, I and A are the mass density, moment of inertia and cross-sectional area of the nanotube, respectively.

The first variation of the external work of the beam can be written as

δV = ∫₀ᴸ q δw dx    (17)

in which q denotes the transverse load distributed along the length of the beam. Substituting Eqs. (13), (16) and (17) into Eq. (11) and setting the coefficients of δw and ∂(δw)/∂x equal to zero, we obtain the following equations of motion:

∂N/∂x = ρA ∂²u/∂t²    (18.a)

∂²M/∂x² − ∂/∂x(N ∂w/∂x) + q = ρA ∂²w/∂t² − ρI ∂⁴w/∂t²∂x²    (18.b)

Integrating Eq. (5) over the cross-sectional area of the beam, we obtain the force-strain and moment-strain relations of the nonlocal beam as follows:

N − μ ∂²N/∂x² = EA ∂u/∂x    (19.a)

M − μ ∂²M/∂x² = −(EI)* ∂²w/∂x²    (19.b)

The explicit expressions of the nonlocal normal force and bending moment of the beam can be derived by substituting the second derivative of the bending moment M in Eq. (18) into Eq. (19) as follows:

N = EA ∂u/∂x + μρA ∂³u/∂x∂t²    (20.a)

M = −(EI)* ∂²w/∂x² + μ(N ∂²w/∂x² − q + ρA ∂²w/∂t² − ρI ∂⁴w/∂t²∂x²)    (20.b)

The nonlocal governing equations of the Euler–Bernoulli nanotube in terms of the displacements can be derived by substituting N and M from Eq. (20) into Eq. (18), and using q = q₀ + H ∂²w/∂x² from Eq. (6), as follows:

EA ∂²u/∂x² = ρA ∂²u/∂t² − μρA ∂⁴u/∂t²∂x²    (21.a)

−(EI)* ∂⁴w/∂x⁴ + μ[N ∂⁴w/∂x⁴ − H ∂⁴w/∂x⁴ + ρA ∂⁴w/∂t²∂x² − ρI ∂⁶w/∂t²∂x⁴] − N ∂²w/∂x² + q₀ + H ∂²w/∂x² = ρA ∂²w/∂t² − ρI ∂⁴w/∂t²∂x²    (21.b)

For a beam in free vibration, the motion can be assumed to be of the harmonic type with a natural frequency ω, namely,

w(x, t) = W(x) e^{iωt}    (22)

Substituting the preceding equation into Eq. (21) leads to the following:


−(EI)* d⁴W/dx⁴ + μ[N d⁴W/dx⁴ − H d⁴W/dx⁴ + ρIω² d⁴W/dx⁴ − ρAω² d²W/dx²] − N d²W/dx² + q₀ + H d²W/dx² = −ρAω² W + ρIω² d²W/dx²    (23)

This frequency-domain equation of motion serves as the basis of the present study.

2. DIFFERENTIAL TRANSFORMATION METHOD The differential transformation method is one of the most useful techniques for solving differential equations with small calculation error. It is particularly suitable for the nonlinear equations of boundary value problems. Abdel-Halim Hassan [16] applied the DTM to the solution of some eigenvalue problems. Wang [17] presented the axial vibration analysis of stepped bars utilizing the DTM. The DTM has proved to be an effective computational tool for various engineering problems. Using the DTM, ordinary and partial differential equations can be transformed into algebraic equations, from which a closed-form series solution can be obtained easily. By this method, certain transformation rules are applied to both the governing differential equations of motion and the boundary conditions of the system, so as to transform them into a set of algebraic equations, as presented in Table 1 and Table 2. The solution of these algebraic equations gives the desired results of the problem. The basic definitions and the application procedure of this method are introduced below.

The transformation of a function f(x) is defined as [17]

F[k] = (1/k!) [dᵏf(x)/dxᵏ]_{x=x₀}    (24)

where f(x) is the original function and F[k] is the transformed function. The inverse transformation is defined as

f(x) = Σ_{k=0}^{∞} (x − x₀)ᵏ F[k]    (25)

Combining Eqs. (24) and (25), we obtain

f(x) = Σ_{k=0}^{∞} ((x − x₀)ᵏ / k!) [dᵏf(x)/dxᵏ]_{x=x₀}    (26)

In actual application, the function f(x) is expressed by a finite series. Thus, Eq. (26) can be written as follows:

f(x) = Σ_{k=0}^{N} ((x − x₀)ᵏ / k!) [dᵏf(x)/dxᵏ]_{x=x₀}    (27)

which implies that the remaining term as given below is negligible:

Σ_{k=N+1}^{∞} ((x − x₀)ᵏ / k!) [dᵏf(x)/dxᵏ]_{x=x₀}    (28)
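The transform pair (24)-(25) is essentially Taylor-series bookkeeping; as a minimal illustration (not taken from the paper), the sketch below computes the DTM coefficients F[k] of f(x) = sin(x) about x₀ = 0 and reconstructs the function from the truncated series of Eq. (27).

```python
import math

def dtm_coefficients_sin(N):
    """DTM coefficients F[k] of f(x) = sin(x) about x0 = 0, i.e. the k-th
    derivative at 0 divided by k! (Eq. (24))."""
    F = []
    for k in range(N + 1):
        deriv_at_0 = (0.0, 1.0, 0.0, -1.0)[k % 4]   # d^k sin / dx^k evaluated at 0
        F.append(deriv_at_0 / math.factorial(k))
    return F

def inverse_dtm(F, x, x0=0.0):
    """Finite inverse transform of Eq. (27): f(x) ~ sum_k (x - x0)^k F[k]."""
    return sum(Fk * (x - x0) ** k for k, Fk in enumerate(F))

F = dtm_coefficients_sin(N=15)
x = 1.2
print(inverse_dtm(F, x), math.sin(x))   # the truncated series matches sin(1.2) closely
```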

3. IMPLEMENTATION OF DIFFERENTIAL TRANSFORM METHOD In solving Eq. (23), the DTM approach will be adopted for its advantage in solving complicated transcendental algebraic equations with general boundary conditions. In order to derive the transcendental algebraic equations for the differential equation of Eq. (23), we refer to Table 1 and the following expressions for the two cases of vibration

equation and buckling equation.


Table 1: Some of the transformation rules of the one-dimensional DTM

Original function                      Transformed function
f(x) = g(x) ± h(x)                     F[k] = G[k] ± H[k]
f(x) = λ g(x)                          F[k] = λ G[k]
f(x) = g(x) h(x)                       F[k] = Σ_{l=0}^{k} G[k − l] H[l]
f(x) = dⁿg(x)/dxⁿ                      F[k] = ((k + n)!/k!) G[k + n]
f(x) = xⁿ                              F[k] = δ(k − n) = 1 if k = n, 0 if k ≠ n

Table 2: Transformed boundary conditions (B.C.) based on DTM

At x = 0:   f(0) = 0         →   F[0] = 0
            df/dx(0) = 0     →   F[1] = 0
            d²f/dx²(0) = 0   →   F[2] = 0
            d³f/dx³(0) = 0   →   F[3] = 0

At x = L:   f(L) = 0         →   Σ_{k=0}^{∞} F[k] = 0
            df/dx(L) = 0     →   Σ_{k=0}^{∞} k F[k] = 0
            d²f/dx²(L) = 0   →   Σ_{k=0}^{∞} k(k − 1) F[k] = 0
            d³f/dx³(L) = 0   →   Σ_{k=0}^{∞} k(k − 1)(k − 2) F[k] = 0

• Vibration equation:

((EI)* + μH − μρIω²) ((k + 4)!/k!) W[k + 4] + (μρAω² + ρIω² − H) ((k + 2)!/k!) W[k + 2] − ρAω² W[k] = 0    (29)

where W[k] is the transformed function of w. Also, according to the transformed expressions given for the various boundary conditions of the nanotube in Table 2, we obtain the following boundary conditions:

• Clamped–Free: W[0] = 0 and W[1] = 0 at the clamped end, while at the free end the transformed bending-moment and shear-force conditions, which involve the attached mass m, the nonlocal parameter μ and the tip-inertia term mω², give two additional series equations in W[k] (Eq. (30)).

By using Eq. (29) along with the transformed boundary conditions, we arrive at the following eigenvalue problem:


[A₁₁  A₁₂; A₂₁  A₂₂] {C} = {0}    (31)

where {C} corresponds to the missing boundary conditions at x = 0. For a non-trivial solution of Eq. (31), the

determinant of the coefficient matrix should be equal to zero. Solution of Eq. (31) is simply a polynomial root finding problem. Many techniques such as Newton’s method, Laguerre’s method, etc., can be used to find the roots of this

frequency equation.
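As an illustration of this root-finding step, the sketch below implements the recursion of Eq. (29) for a clamped-free beam of unit length and scans the characteristic determinant for sign changes. It is a simplified setting (the attached tip mass, axial force and q₀ are neglected and the classical free-end conditions W″(1) = W‴(1) = 0 are used), so with μ = H = ρI = 0 and EI = ρA = 1 it recovers the classical cantilever values 3.516, 22.03 and 61.70 appearing in Table 4.

```python
import numpy as np
from scipy.optimize import brentq

def char_det(omega, EI=1.0, rhoA=1.0, rhoI=0.0, mu=0.0, H=0.0, n_terms=60):
    """2x2 characteristic determinant built from the DTM recursion of Eq. (29)
    for a clamped-free beam of unit length (tip mass and q0 neglected)."""
    a4 = EI + mu*H - mu*rhoI*omega**2          # multiplies W[k+4]; EI stands for (EI)*
    a2 = mu*rhoA*omega**2 + rhoI*omega**2 - H  # multiplies W[k+2]
    a0 = -rhoA*omega**2                        # multiplies W[k]
    rows = []
    for w2, w3 in [(1.0, 0.0), (0.0, 1.0)]:    # unknown constants W[2] and W[3]
        W = np.zeros(n_terms)
        W[2], W[3] = w2, w3                    # clamped end: W[0] = W[1] = 0
        for k in range(n_terms - 4):
            W[k+4] = -(a2*(k+1)*(k+2)*W[k+2] + a0*W[k]) / (a4*(k+1)*(k+2)*(k+3)*(k+4))
        k = np.arange(n_terms)
        rows.append([np.sum(k*(k-1)*W),            # condition W''(1) = 0
                     np.sum(k*(k-1)*(k-2)*W)])     # condition W'''(1) = 0
    return np.linalg.det(np.array(rows))

# Scan for sign changes of the determinant and refine each bracketed root.
grid = np.linspace(0.5, 70.0, 1400)
vals = [char_det(w) for w in grid]
roots = [brentq(char_det, grid[i], grid[i+1])
         for i in range(len(grid) - 1) if np.sign(vals[i]) != np.sign(vals[i+1])]
print([round(r, 3) for r in roots[:3]])   # roughly [3.516, 22.034, 61.697] in the classical case
```

Nonzero values of mu, H or rhoI shift these roots in the same directions reported in the tables below.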

4. NUMERICAL RESULTS AND DISCUSSIONS In the present study, as indicated in the previous section, a nanotube with circular cross-section comprised of aluminum (Al) is considered in the case study to explain the general behavior of a nanotube. The elastic bulk and surface

properties of aluminum with the [1 1 1] crystallographic direction are given in Table 3.

Table 3. Material properties of Al [18, 19]

Material   E (GPa)   ρ (kg/m³)   ν     Eˢ (N/m)   τ₀ (N/m)
Al         70        2700        0.3   5.1882     0.9108

Table 4. Comparison of the non-dimensional frequencies of a nanotube for various nonlocal parameters, C-F boundary conditions (L/h = 10).

μ    Frequency number    Present paper    Eltaher et al. [20] (FEM)
0    1                   3.5160           3.5161
0    2                   22.0345          22.0375
0    3                   61.6972          61.7171
1    1                   3.5312           3.5314
1    2                   20.6792          20.6817
1    3                   51.0583          51.0695
2    1                   3.5469           3.5470
2    2                   19.5090          19.5111
2    3                   44.5529          44.5603
3    1                   3.5629           3.5630
3    2                   18.4839          18.5857
3    3                   40.0891          40.0954
4    1                   3.5794           3.5795
4    2                   17.5751          17.5767
4    3                   36.8047          36.8089
5    1                   3.5794           3.5963
5    2                   16.7615          16.7629
5    3                   34.2726          34.2759

Firstly, the accuracy of the nonlocal natural frequencies of the nanotube is investigated. In Table 4, the non-dimensional frequencies, ω̂ = ωL²√(ρA/EI), of the nonlocal nanotube without surface effects are listed along with the results of Eltaher et al. [20]. It is observed that the present results agree very well with


those given by Refs. [20], and that increasing the nonlocality parameter tends to decrease the natural frequency. The reason is that the presence of the nonlocal effect tends to decrease the stiffness of the nanostructures and hence decrease

the values of natural frequencies.

Next, the first three natural frequencies ω of the nanotube without surface effects are presented in Tables 5 and 6 for a constant aspect ratio L/D = 10 and various nonlocal parameters μ. The first three natural frequencies of the nanotube with surface effects can be presented in the same way. The nonlocal parameter μ = (e₀a)² is taken as 0, 1, 2, 3, 4 and 5 nm²; μ = 0 corresponds to the local beam theory. It can be found from the results that, with the consideration of surface effects, the stiffness of the nanotube increases. It can also be seen that the influence of the surface effects on the higher-order frequencies is smaller, so that the corresponding results are closer to each other. The surface effect on the fundamental natural frequency is more pronounced than that of the nonlocal parameter. Also, as the attached mass increases, the natural frequency decreases, both with and without surface effects.

Table 5. The first three natural frequencies of the nanotube for various attached masses m and nonlocal parameters μ = 0, 1 and 2, C-F boundary conditions (L/h = 10), without surface effects.

m            μ = 0 (modes 1, 2, 3)            μ = 1 (modes 1, 2, 3)            μ = 2 (modes 1, 2, 3)
1×10⁻²¹      0.21802   15.4332  49.9782       0.21803   14.6132  41.8080       0.21803   13.9092  36.6683
2×10⁻²¹      0.08914   15.4207  49.9675       0.08914   14.6016  41.7969       0.08914   13.8983  36.6585
1.1×10⁻²⁰    0.06584   15.4196  49.9663       0.06584   14.6005  41.7959       0.06584   13.8973  36.6576
1.6×10⁻²⁰    0.05460   15.4191  49.9659       0.05460   14.6001  41.7956       0.05460   13.897   36.6573
2.1×10⁻²⁰    0.04766   15.4189  49.9656       0.04766   14.5999  41.7954       0.04766   13.8968  36.6571
2.6×10⁻²⁰    0.04283   15.4188  49.9655       0.04285   14.5998  41.7952       0.04283   13.8967  36.6570
3.1×10⁻²⁰    0.03922   15.4187  49.9654       0.03922   14.5997  41.7952       0.03922   13.8966  36.6569
3.6×10⁻²⁰    0.03640   15.4186  49.9653       0.03640   14.5996  41.7951       0.03640   13.8965  36.6569
4.1×10⁻²⁰    0.03411   15.4186  49.9653       0.03411   14.5996  41.7951       0.03411   13.8965  36.6569
4.6×10⁻²⁰    0.03220   15.4185  49.9652       0.03220   14.5995  41.7950       0.03220   13.8964  36.6568
5.1×10⁻²⁰    0.03058   15.4185  49.9652       0.03058   14.5995  41.7950       0.03058   13.8964  36.6568
5.6×10⁻²⁰    0.02918   15.4185  49.9652       0.02918   14.5995  41.7950       0.02918   13.8964  36.6568
6.1×10⁻²⁰    0.02796   15.4185  49.9651       0.02796   14.5994  41.7949       0.02796   13.8964  36.6568
6.6×10⁻²⁰    0.02688   15.4184  49.9651       0.02688   14.5994  41.7949       0.02688   13.8964  36.6567
7.1×10⁻²⁰    0.02592   15.4184  49.9651       0.02592   14.5994  41.7949       0.02592   13.8963  36.6567
7.6×10⁻²⁰    0.02505   15.4184  49.9651       0.02505   14.5994  41.7949       0.02505   13.8963  36.6567
8.1×10⁻²⁰    0.02427   15.4184  49.9651       0.02427   14.5994  41.7949       0.02427   13.8963  36.6567
8.6×10⁻²⁰    0.02355   15.4184  49.9651       0.02355   14.5994  41.7949       0.02355   13.8963  36.6567
9.1×10⁻²⁰    0.022898  15.4184  49.9650       0.02289   14.5994  41.7949       0.02289   13.8963  36.6567
1×10⁻¹⁹      0.022293  15.4184  49.9650       0.02229   14.5994  41.7949       0.02229   13.8963  36.6567

Table 6. The first three natural frequencies of the nanotube for various attached masses m and nonlocal parameters μ = 3, 4 and 5, C-F boundary conditions (L/h = 10), without surface effects.

m            μ = 3 (modes 1, 2, 3)            μ = 4 (modes 1, 2, 3)            μ = 5 (modes 1, 2, 3)
1×10⁻²¹      0.21804    13.2966  33.0591      0.21804    12.7573  30.3468      0.21805    12.2778  28.2125
2×10⁻²¹      0.0891481  13.2864  33.0503      0.08914    12.7477  30.3386      0.08914    12.2689  28.2048
1.1×10⁻²⁰    0.0658492  13.2855  33.0495      0.06584    12.7468  30.3379      0.06584    12.2681  28.2041
1.6×10⁻²⁰    0.0546021  13.2851  33.0491      0.05460    12.7465  30.3376      0.05460    12.2678  28.2038
2.1×10⁻²⁰    0.0476619  13.2849  33.049       0.04766    12.7463  30.3374      0.047662   12.2676  28.2037
2.6×10⁻²⁰    0.0428352  13.2848  33.0489      0.04283    12.7462  30.3373      0.04283    12.2675  28.2036
3.1×10⁻²⁰    0.0392294  13.2847  33.0488      0.03922    12.7462  30.3373      0.03922    12.2675  28.2036
3.6×10⁻²⁰    0.0364037  13.2847  33.0488      0.03640    12.7461  30.3372      0.03640    12.2674  28.2035
4.1×10⁻²⁰    0.034112   13.2846  33.0487      0.034112   12.7461  30.3372      0.034112   12.2674  28.2035
4.6×10⁻²⁰    0.0322049  13.2846  33.0487      0.03220    12.746   30.3372      0.03220    12.2673  28.2035
5.1×10⁻²⁰    0.0305856  13.2846  33.0487      0.03058    12.746   30.3372      0.03058    12.2673  28.2034
5.6×10⁻²⁰    0.0291884  13.2846  33.0487      0.02918    12.746   30.3371      0.02918    12.2673  28.2034
6.1×10⁻²⁰    0.0279666  13.2845  33.0487      0.02796    12.746   30.3371      0.02796    12.2673  28.2034
6.6×10⁻²⁰    0.0268865  13.2845  33.0486      0.02688    12.746   30.3371      0.02688    12.2673  28.2034
7.1×10⁻²⁰    0.0259226  13.2845  33.0486      0.02592    12.746   30.3371      0.02592    12.2673  28.2034
7.6×10⁻²⁰    0.0250554  13.2845  33.0486      0.02505    12.7459  30.3371      0.02505    12.2673  28.2034
8.1×10⁻²⁰    0.0242698  13.2845  33.0486      0.02426    12.7459  30.3371      0.02426    12.2672  28.2034
8.6×10⁻²⁰    0.0235537  13.2845  33.0486      0.02355    12.7459  30.3371      0.02355    12.2672  28.2034
9.1×10⁻²⁰    0.0228975  13.2845  33.0486      0.02289    12.7459  30.3371      0.02289    12.2672  28.2034
1×10⁻¹⁹      0.0222933  13.2845  33.0486      0.02229    12.7459  30.3371      0.02229    12.2672  28.2034

5. CONCLUSIONS In this paper, the surface effects on the free vibration of nanotubes were studied in the framework of the EBT with clamped-free boundary conditions, using the nonlocal elasticity theory. The nanotube is considered to be made of Al with positive surface elasticity. Hamilton's principle was employed to derive the governing differential equations together with the boundary conditions. Then, the DTM was applied to the governing differential equations, by which the numerical solutions for the frequencies are obtained. From the present results, the following conclusions are drawn:

• For all values of the nonlocal parameter, the influence of the surface effect decreases as the attached mass increases.
• The surface effect on the higher-order frequencies is less prominent.
• Increasing the nonlocality parameter leads to a decrease in the fundamental frequency of the nanotube.
• The surface effect on the fundamental natural frequency is larger than that of the nonlocal parameter.

6. REFERENCES 1. Iijima, S. (1991). Helical microtubules of graphitic carbon. Nature, 354.6348: 56-58. 2. Baughman, R. H., Zakhidov, A. A., de Heer, W. A. (2002). Carbon nanotubes--the route toward

applications. Science, 297(5582): 787-792. 3. Wang, X., Cai, H. (2006). Effects of initial stress on non-coaxial resonance of multi-wall carbon nanotubes. Acta

materialia, 54(8): 2067-2074. 4. Maranganti, R., Sharma, P. (2007). Length scales at which classical elasticity breaks down for various

materials. Physical review letters, 98(19): 195504. 5. Zhu, H., Wang, J., Karihaloo, B. (2009). Effects of surface and initial stresses on the bending stiffness of trilayer

plates and nanofilms. J. Mech. Mater. Struct,4(3):, 589-604. 6. Mindlin R D. Micro-structure in linear elasticity. Arch Rational Mech Analysis, 1964, 16: 51–78 7. Aifantis, E. C. (1984). On the microstructural origin of certain inelastic models. Journal of Engineering Materials

and technology, 106(4): 326-330. 8. Eringen, A. C. (1972b). Nonlocal polar elastic continua. International Journal of Engineering Science, 10(1): 1-16. 9. Eringen, A. C. (1983). On differential equations of nonlocal elasticity and solutions of screw dislocation and

surface waves. Journal of Applied Physics,54(9): 4703-4710. 10. Zhang, Y. Q., Liu, G. R., Wang, J. S. (2004). Small-scale effects on buckling of multiwalled carbon nanotubes

under axial compression. Physical review B, 70(20): 205430. 11. Shen, Zhi-Bin, et al. "Transverse vibration of nanotube-based micro-mass sensor via nonlocal Timoshenko beam

theory." Computational Materials Science 53.1 (2012): 340-346. 12. Ebrahimi, Farzad, Gholam Reza Shaghaghi, and Mahya Boreiry. "A Semi-Analytical Evaluation of Surface and

Nonlocal Effects on Buckling and Vibrational Characteristics of Nanotubes with Various Boundary Conditions."International Journal of Structural Stability and Dynamics (2015): 1550023.

13. Eringen, A. Cemal, and D. G. B. Edelen. "On nonlocal elasticity." International Journal of Engineering Science 10.3 (1972): 233-248.

14. Miller, Ronald E., and Vijay B. Shenoy. "Size-dependent elastic properties of nanosized structural elements." Nanotechnology 11.3 (2000): 139.

15. Reddy, J. N. "Nonlocal theories for bending, buckling and vibration of beams." International Journal of Engineering Science 45, no. 2 (2007): 288-307.


16. Hassan, IH Abdel-Halim. "On solving some eigenvalue problems by using a differential transformation." Applied Mathematics and Computation 127.1 (2002): 1-22.

17. Wang, Zhen Gang. "Axial Vibration Analysis of Stepped Bar by Differential Transformation Method." Applied Mechanics and Materials 419 (2013): 273-279.

18. Ogata, Shigenobu, Ju Li, and Sidney Yip. "Ideal pure shear strength of aluminum and copper." Science 298.5594 (2002): 807-811.

19. Miller, Ronald E., and Vijay B. Shenoy. "Size-dependent elastic properties of nanosized structural elements." Nanotechnology 11.3 (2000): 139.

20. Eltaher, M. A., Amal E. Alshorbagy, and F. F. Mahmoud. "Vibration analysis of Euler–Bernoulli nanobeams by using finite element method." Applied Mathematical Modelling 37.7 (2013): 4787-4797.


Free vibration analysis of Euler-bernoulli beam theory applied to cracked nanobeams using a nonlocal elasticity model

Asghar Zajkani1, Mohammad Reza Kokaba 2 Hamed Mohaddes Deylami 3

1, 2- Department of Mechanical Engineering, Faculty of Engineering, Imam Khomeini International University, Qazvin

3- Faculty of Engineering, Faculty of Engineering - East Guilan, University of Guilan

Corresponding author email: [email protected]

Abstract This paper presents a nonlocal cracked-nanobeam model with which the transverse vibrations of a cracked Euler–Bernoulli nanobeam are analyzed. Several types of boundary conditions, including a clamped (fixed) end of the nanobeam, have been studied. The nonlocal Eringen elasticity theory and Hamilton's principle are used to formulate the problem. The effects of the nonlocal small-scale parameter, crack severity, cracked-section position and different boundary conditions are examined in this work. Keywords: Free vibration, Euler-Bernoulli, Cracked nanobeams, Nonlocal elasticity

1. INTRODUCTION

One of the well-known models is the non-local elasticity theory [1, 2]. This non-local theory has been applied to solve wave propagation, dislocation and crack problems. Certain elements of the theory were anticipated in attempts to connect lattice mechanics to continuum mechanics. It would appear that nonlocal continuum mechanics could potentially play a useful role in analysis related to nanotechnology applications. The theory of nonlocal continuum mechanics initiated by Eringen and co-worker to account for the scale effect in elasticity by assuming that the stress at a reference point can be considered to be a functional of the strain field at every point in the body [3]. This continuum theory on one hand is suitable for modeling sub micro‐ or nano sized structures, while on the other hand it avoids enormous computational efforts when compared with discrete atomistic or molecular dynamics simulations. Application of nonlocal continuum theory to nanotechnology was initially addressed by Peddieson et al.

[4], a version of nonlocal elasticity was proposed for formulating a nonlocal version of Euler–Bernoulli beam. Earlier works on the vibration of nonlocal beams did not consider the effects of transverse shear deformation and rotary inertia [16–18].Many researchers have applied the nonlocal elasticity concept for the bending, buckling and vibration analyses of beam-like elements in micro‐ or nano electromechanical devices [19–22]. Since the mathematical model must be as accurate as possible in the frequency range of interest, an Euler–Bernoulli formulation may not be adequate even in the case of slender beams, especially when excitation of higher frequencies is involved. In this paper, free vibration of nanobeams with multiple cracks based on nonlocal elasticity theory has been reported. Analytical solutions are given for cracked Euler–Bernoulli nano-beams of different boundary conditions. Cracks are modeled as rotational springs following the classical approach proposed by Loya et al [18].

2. MATHEMATICAL FRAMEWORK 2.1 Free vibration for an intact nanobeam A schematic diagram of a nanobeam with multiple cracks subjected to an axial force is depicted in Fig. 1. The nanobeam has n cracks located at x = xᵢ (i = 1, …, n).


Figure. 1. Nanobeam with multiple cracks subjected to axial force.

Nonlocal constitutive relations for the present nanobeam can be written as follows:

[1 − (e₀a)²∇²]σ = t    (1)

in which σ, t and ∇² are the stress tensor of the nonlocal elasticity, the classical local stress tensor and the Laplace operator, respectively. e₀a is a nanolength scale, where a is an internal characteristic length (e.g. C–C bond length, granular distance, lattice parameter) and e₀ is a calibration constant appropriate to each material. The magnitude of e₀ is determined experimentally or estimated such that the relations of the nonlocal elasticity model provide satisfactory approximations to the atomic dispersion curves of plane waves obtained from the atomistic lattice dynamics [13]. A conservative estimate of the scale coefficient e₀a < 2.0 nm for a single-walled carbon nanotube has been proposed by Wang [4].

The parameter e₀ was proposed as 0.39 by Eringen [2]. Zhang et al. [19] used results from molecular mechanics simulations of the critical axial buckling strain of a single-walled carbon nanotube to predict the value of e₀ as 0.82. Wang and Hu [20] presented e₀ = 0.288 for flexural wave propagation in a single-walled carbon nanotube using the nonlocal Timoshenko beam model and molecular dynamics simulations. Duan et al. [21] presented a calibration of the small scaling parameter in the nonlocal Timoshenko beam theory using molecular dynamics simulation results (at room temperature) for use in the free vibration analysis of single-walled carbon nanotubes. Instead of taking on a fixed value, they found the calibrated values of e₀ to vary between 0 and 19, depending on the length-to-diameter ratio, boundary conditions and mode shapes [21]. In this study, the nonlocal constitutive relation (Eq. (1)) for the one-dimensional case can be written as:

σ(x) − (e₀a)² σ″(x) = E ε(x)    (2)

in which E is the Young's modulus and ε(x) is the local strain. Consider the definition of the resultant bending moment for a beam:

M = ∫_A y σ dA    (3)

In which M, y and A are the resultant bending moment, the distance from the neutral axis and the cross-sectional area, respectively.

Substituting Eq. (3) into Eq. (2) leads to:

M − (e₀a)² ∂²M/∂x² = −EI ∂²w/∂x²    (4)


In which EI is the flexural stiffness and w is the transverse deflection of the nanobeam. From Fig. 3, the force equation of motion in the vertical direction has the form:

∂Q/∂x + N ∂²w/∂x² + q − m̄ ∂²w/∂t² = 0    (5)

and

∂M/∂x − Q = 0    (6)

in which Q, N and m̄ are the resultant shear force, the external axial force and the mass per unit length, respectively. Substituting Eq. (6) into Eq. (5), we have:

∂²M/∂x² + N ∂²w/∂x² − m̄ ∂²w/∂t² = 0    (7)

Assume that N and EI are constants. Substituting Eq. (7) into Eq. (4), we obtain:

M = −EI ∂²w/∂x² + (e₀a)²(−N ∂²w/∂x² + m̄ ∂²w/∂t²)    (8)

Then, substituting Eq. (8) into Eq. (7), we obtain the partial differential equation for this nanobeam in the form:

EI ∂⁴w/∂x⁴ + (e₀a)²(N ∂⁴w/∂x⁴ − m̄ ∂⁴w/∂x²∂t²) − N ∂²w/∂x² + m̄ ∂²w/∂t² = 0    (9)

For free vibration, the transverse displacement is assumed to be of the form w(x, t) = y(x) e^{iωt}, where ω is the angular frequency. The dimensionless variables are defined as follows:

x̄ = x/L,   ȳ = y/L,   ε = e₀a/L,   n̄ = NL²/(EI),   λ = m̄ω²L⁴/(EI)    (10)

Then, the equation of motion (Eq. (9)) is simplified to the form

(1 + ε²n̄) y⁗ + (ε²λ − n̄) y″ − λ y = 0    (11)

The solution of Eq. (11) has the exponential form:

y(x̄) = A e^{is x̄}    (12)

Inserting Eq. (12) into Eq. (11) and dividing through by A e^{is x̄}, we obtain the characteristic equation

(1 + ε²n̄) s⁴ − (ε²λ − n̄) s² − λ = 0    (13)

This has the solutions

s₁,₂ = ±β,   s₃,₄ = ±iα    (14)

where

β² = [(ε²λ − n̄) + √Δ] / [2(1 + ε²n̄)],   α² = [−(ε²λ − n̄) + √Δ] / [2(1 + ε²n̄)],   Δ = (ε²λ − n̄)² + 4λ(1 + ε²n̄)    (15)

In this case, Eq. (12) yields:

y(x̄) = C₁ cosh(αx̄) + C₂ sinh(αx̄) + C₃ cos(βx̄) + C₄ sin(βx̄)    (16)


Assume: ( ) = , ( ) = , ( ) = , ( ) = (17)

In order to simplify the analysis, the linearly independent fundamental solutions are denoted by ( )( = 1,2,3,4), which satisfy the following normalization condition at the origin of coordinate system [22]:

⎣⎢⎢⎢⎡ (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0)

⎦⎥⎥⎥⎤ = (18)

Where I4 is an identity matrix of order 4. ( ) CAN be constructed by the following equation (Eq. (19)):

⎩⎪⎨⎪⎧ ( ) ( ) ( ) ( )⎭⎪⎬

⎪⎫ = ⎣⎢⎢⎢⎡ (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) (0) ⎦⎥⎥⎥

⎤ ⎩⎨⎧ ( ) ( ) ( ) ( )⎭⎬

⎫ (19)

The primes in Eqs. (18) and (19) indicate the differentiation with respect to x. ( ) has the following form ( ) = ( cosh + cos ) (20a) ( ) = ( ∝ sinh + sin ) (20b)

( ) = (cosh − cos ) (20c) ( ) = ( ∝ sinh − cos ) (20d)

= (20e)

The general solution (Eq. (16)) can be written as a function of boundary conditions at x = 0 in the form ( ) = (0) (0) + (0) (0) + (0) (0) + (0) (0) (21)

2.2 Free vibration of multiple cracks nanobeam

The influence of the crack was represented by an elastic rotational spring connecting the two segments of the nanobeam at the cracked section [23] (Fig. 2). The crack is assumed to be open. The mode shape function of the first segment can be expressed as: ( ) = (0) (0) + (0) (0) + (0) (0) + (0) (0) 0 ≤ < (22)

In this paper, we assume that q = 0. The continuity conditions at the crack position between the two adjacent segments are: ( ) = ( ) (23a)


( ) = ( ) (23b)

( ) − ( ) = ( ) (23c)

( ) + ( ) = ( ) + ( ) (23d)

= (23e)

In which is the dimensionless position of the ith crack and is the dimensionless flexibility of the rotational spring. Using ( ), the mode shape function of the 2nd segment can be written as: ( ) = ( ) + ( )[ ( − ) − ( − )], ≤ ≤ (24)

We have n cracks, then the number of segments is n + 1. The mode shape functions of the ith and (n + 1)th segments are: ( ) = ( ) + ∑ − − − , ≤ ≤ (25) ( ) = ( ) + ∑ − − − , ≤ ≤ (26)

3. Result and discussion

Numerical results are given for the analytical solution and for several types of boundary conditions. The aim is to study the influences of the nonlocal parameter, crack location and crack flexibility on the natural frequencies of nanobeams.

3.1 Simply supported beam

For a nanobeam with simply supported end conditions, the following relations (Eq. (27)) must be satisfied:

y(0) = y″(0) = y(1) = y″(1) = 0    (27)

For example, we assume that the nanobeam has two similar cracks. Then, the mode shape functions for all segments are:

y₁(x̄) = y′(0) P₂(x̄) + y‴(0) P₄(x̄)    (28a)
y₂(x̄) = y₁(x̄) + C y₁″(x̄₁)[P₂(x̄ − x̄₁) − ψ P₄(x̄ − x̄₁)]    (28b)
y₃(x̄) = y₂(x̄) + C y₂″(x̄₂)[P₂(x̄ − x̄₂) − ψ P₄(x̄ − x̄₂)]    (28c)

Now, applying the boundary conditions at x = 1, we have :


[a₁₁  a₁₂; a₂₁  a₂₂] {y′(0); y‴(0)} = {0; 0}    (29)

where T(x̄) = P₂(x̄) − ψ P₄(x̄).

In fact, with an increase in the nonlocal parameter, the values of the nonlocal natural frequencies diverge from the values of the classical natural frequencies. The divergence rate is faster for the higher modes. Note that the results associated with ε = 0 correspond to those of the local nanobeam.

Table 1 shows the values of the first three natural frequencies for different crack flexibilities and nonlocal parameters:

S-S

Mode (ω)    ε      Cᵢ = 0      Cᵢ = 0.0086     Cᵢ = 0.032

1 0 9.8696 9.8678 9.8628

2 0 39.4835 39.4804 39.4784

3 0 88.8264 88.8259 88.8246

1 0.5 5.3002 5.2995 5.2976

2 0.5 11.9748 11.7852 11.307

3 0.5 18.439 18.425 18.3352

1 1 2.9935 2.9745 2.9602

2 1 6.205 6.1352 5.8795

3 1 9.3721 9.3641 9.3233
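The uncracked column of Table 1 can be recovered in closed form: using the dimensionless equation of motion reconstructed in Eqs. (10)-(11) with zero axial force (n̄ = 0), an intact simply supported nanobeam has sinusoidal modes sin(nπx̄), so the dimensionless frequency √λ follows directly. The short sketch below (a minimal check under these assumptions) prints values that agree with the Cᵢ = 0 column to within rounding.

```python
import math

def intact_ss_frequency(n, eps):
    """Dimensionless frequency sqrt(lambda) of an intact simply supported nonlocal
    nanobeam with zero axial force: substituting sin(n*pi*x) into Eq. (11) gives
    lambda = (n*pi)^4 / (1 + (eps*n*pi)^2)."""
    k = n * math.pi
    return k**2 / math.sqrt(1.0 + (eps * k)**2)

for eps in (0.0, 0.5, 1.0):
    print(eps, [round(intact_ss_frequency(n, eps), 4) for n in (1, 2, 3)])
# eps = 0   -> about 9.8696, 39.478, 88.826
# eps = 0.5 -> about 5.300, 11.974, 18.439   (cf. the Ci = 0 column of Table 1)
# eps = 1   -> about 2.994, 6.205, 9.372
```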

Fig. 2 shows the NDFF (non-dimensional fundamental frequency) of the simply supported beam vs. e₀a for various non-dimensional locations of the cracks (both cracks are similar). In this figure, it can be seen that as the distance between the cracks decreases, the NDFF also decreases.

Obviously, the type of boundary conditions influences the natural frequencies of the nanobeam. It is observed from Fig. 3 that the natural frequencies have their highest values in the clamped–clamped and clamped–simply supported cases, respectively.

3.2 Clamped-clamped beam

Table 2 shows the values of the first three natural frequencies for different crack flexibilities and nonlocal parameters:


C-C

Mode (ω)    ε      Cᵢ = 0      Cᵢ = 0.0086     Cᵢ = 0.032

1 0 22.3733 22.3706 22.3533

2 0 61.6728 61.02459 60.6687

3 0 120.903 120.3245 119.4593

1 0.5 10.9914 10.9805 10.9356

2 0.5 17.273 17.2532 16.8452

3 0.5 24.3324 24.3148 23.7485

1 1 6.056 6.0512 6.0278

2 1 8.8954 8.8514 8.5279

3 1 12.453 12.3765 11.9958

Figure. 2. The NDFF vs. e0a for various non-dimensional locations of cracks, where the nanobeam has two similar cracks

4. Conclusion

The transverse vibration analysis of nanobeams with multiple cracks of different boundary conditions was studied using nonlocal elasticity theory. The Euler–Bernoulli beam theory was used. As demonstrated in this paper, the effects of nonlocal parameter on the NDFF are apparent when the length of the beam is in nanoscale. Also the effects of the crack location and the crack parameter on the natural frequencies of the cracked nanobeam were investigated. In addition, the influences of different boundary conditions on the natural frequencies of the cracked nanobeam were considered.


Figure 3. The first non-dimensional natural frequencies of nanobeams of different boundary conditions vs. ε, where nanobeams have two similar cracks.

5. REFERENCES 1. C. Liu, R.K.N.D. Rajapakse, Continuum models incorporating surface energy for static and dynamic response of

nanoscale beams, IEEE Trans. Nanotechnol. 9 (4) (2010) 422–431 . 2. A.C. Eringen, On differential equations of nonlocal elasticity and solutions of screw dislocation and surface

waves, J. Appl. Phys. 54 (9) (1983) 4703. 3. A.C. Eringen, C.G. Speziale, B.S. Kim, Crack-tip problem in non-local elasticity, J. Mech. Phys. Solids 25 (5)

(1977) 339–355. 4. Q. Wang, Wave propagation in carbon nanotubes via nonlocal continuum mechanics, J. Appl. Phys. 98 (12)

(2005) 124301 5. G.F. Wang, T.J. Wang, X.Q. Feng, Surface effects on the diffraction of plane compressional waves by a nanosized

circular hole, Appl. Phys. Lett. 89 (23) (2006) 231923. 6. Q. Wang, V.K. Varadan, Vibration of carbon nanotubes studied using nonlocal continuum mechanics, Smart

Mater. Struct. 15 (2) (2006) 659–666. 7. T. Murmu, S.C. Pradhan, Thermo-mechanical vibration of a single-walled carbon nanotube embedded in an elastic

medium based on nonlocal elasticity theory, Comput. Mater. Sci. 46 (4) (2009) 854–859. 8. M. Aydogdu, A general nonlocal beam theory: its application to nanobeam bending, buckling and vibration, Phys.

E 41 (9) (2009) 1651–1655. 9. J.C. Niu, C.W. Lim, A.Y.T. Leung, Third-order non-local beam theories for the analysis of symmetrical

nanobeams, J. Mech. Eng. Sci. 223 (10) (2009) 2451–2463. 10. R. Chowdhury, C.Y. Wang, S. Adhikari, Low frequency vibration of multiwall carbon nanotubes with

heterogeneous boundaries, J. Phys. D Appl. Phys. 43 (8) (2010) 085405. 11. R. Chowdhury, S. Adhikari, C.Y. Wang, F. Scapra, A molecular mechanics approach for the vibration of single-

walled carbon nanotubes, Comput. Mater. Sci. 48 (4) (2010) 730–735. 12. J.K. Phadikar, S.C. Pradhan, Variational formulation and finite element analysis for nonlocal elastic nanobeams

and nanoplates, Comput. Mater. Sci. 49 (3) (2010) 492–499. 13. T. Murmu, S. Adhikari, Nonlocal effects in the longitudinal vibration of double-nanorod systems, Phys. E 43 (1)

(2010) 415–422. 14. T. Murmu, S. Adhikari, Nonlocal transverse vibration of double-nanobeam-systems, J. Appl. Phys. 108 (8) (2010)

083514. 15. A.Y. Joshi, S.C. Sharma, S.P. Harsha, Analysis of crack propagation in fixed-free single-walled carbon nanotube

under tensile loading using XFEM, J. Nanotechnol. Eng. Med. 1 (4) (2010) 041008. 16. T. Belystschko, S.P. Xiao, G.C. Schatz, R. Ruoff, Atomistic simulations of nanotube fracture, Phys. Rev. B 65

(23) (2002) 235430. 17. W. Brostow, A.M. Cunha, J. Quintanilla, R. Simoes, Crack formation and propagation in molecular dynamics

simulations of polymer liquid crystals, Macromol. Theory Simul. 11 (3) (2002) 308–314.



18. J. Loya, J. Lopez-Puente, R. Zaera, J. Fernandez-Saeza, Free transverse vibrations of cracked anobeams using a nonlocal elasticity model, J. Appl. Phys. 105 (4) (2009) 044309.

19. Y.Q. Zhang, G.R. Liu, X.Y. Xie, Free transverse vibrations of double-walled carbon nanotubes using a theory of nonlocal elasticity, Phys. Rev. B 71 (1 9) (2005) 195404.

20. L.F. Wang, H.Y. Hu, Flexural wave propagation in single-walled carbon nanotubes, Phys. Rev. B 71 (19) (2005) 195412.

21. W.H. Duan, C.M. Wang, Y.Y. Zhang, Calibration of nonlocal scaling effect parameter for free vibration of carbon nanotubes by molecular dynamics, J. Appl. Phys. 101 (2) (2007) 024305.

22. Q.S. Li, Vibratory characteristics of multi-step beams with an arbitrary number of cracks and concentrated masses, Appl. Acoust. 62 (6) (2001) 691–706.

23. S.M. Hasheminejad, B. Gheshlaghi, Y. Mirzaei, S. Abbasion, Free transverse vibrations of cracked nanobeams with surface effects, Thin Solid Films 519 (8) (2011) 2477–2482.

24. B. Binici, Vibration of beams with multiple open cracks subjected to axial force, J. Sound Vib. 287 (1–2) (2005) 277–295.

25. S.C. Pradha, J.K. Phadikar, Bending, buckling and vibration analyses of nonhomogenous nanotubes are using GDQ and nonlocal elasticity theory, Struct. Eng. Mech. 33 (2) (2009) 193–213.


Free vibration of circular nanobeams and nanorings including surface effects and resting on elastic foundations

Asghar Zajkani1, Mohsen Daman 2

1, 2- Department of Mechanical Engineering, Faculty of Engineering, Imam Khomeini International University, Qazvin

Corresponding author email: [email protected]

Abstract This paper deals with the free vibration problem of nanorings/arches with consideration of surface properties. The Gurtin–Murdoch model is employed for incorporating the surface effect parameters, including surface density, surface tension and surface elasticity. Linear and nonlinear elastic foundations are considered under the curved nanobeam. Simply supported boundary conditions are assumed at both ends. The analytical Navier solution is employed to solve the governing equation. It is explicitly shown that the vibration characteristics of a nanobeam are significantly influenced by these surface effects. It is also shown that by increasing the thickness of the curved nanobeam, the influence of the surface effects reduces to zero and the natural frequency reaches its classical value. Numerical results are presented to serve as benchmarks for future analyses of curved nanobeams.

Keywords: elastic foundations, Nonlocal elasticity, Differential transformation method, surface effect

1. INTRODUCTION Nano materials are attracting many researchers over the recent years due to their improvement of the quality properties. Both experimental and atomistic modeling studies show that when the dimensions of structures become very small, the size effect gains important. Due to this fact, the size effect plays an important role on the mechanical behavior of micro- and nanostructures [1]. Among various nano structures, nanobeams have more important applications [2,3]. A nonlocal beam theory is proposed by Thai [4], for bending, buckling, and vibration of nanobeams. Closed-form solutions of deflection, buckling load, and natural frequency obtained for simply supported nanobeams. However, the nonlinear vibration of the piezoelectric nanobeams based on the nonlocal theory and Timoshenko beam theory has been investigated by Liao-Liang et al [5]. In this paper, the nonlinear governing equations and boundary conditions are derived by using the Hamilton principle and discretized by using the differential quadrature (DQ) method. In addition Murmu and Adhikari [6], have investigated the nonlocal transverse vibration of double-nanobeam-system. In this research, an analytical method has been developed for determining the natural frequencies of the nonlocal double-nanobeam-system. Also Eltaher et al [7], have presented free vibration analysis of functionally graded (FG) size-dependent nanobeams using finite element method. He has modeled the nanobeam according to Euler–Bernoulli beam theory and its equations of motion derived using Hamilton’s principle. The finite element method was used to discretize the model and obtained a numerical approximation of the equation of motion. Because the nanobeams has the high proportion of the surface to volume, the surface stress effects has important role in their mechanics behavior of these structures. Hence Gurtin and Murdach [8] have considered surface stress effects. In this theory the surface is considered as a part of (nonphysical) the two-dimensional with zero thickness (mathematically) which has covered the total volume. This theory has used in many researches about nanobeams. The nonlinear flexural vibrations of micro and nanobeams in presence of surface effects have been studied within the framework of Euler–Bernoulli beam theory including the von Kármán geometric nonlinearity by Gheshlaghi and Hasheminejad [9]. In this research, exact solution has been obtained for the natural frequencies of a simply-supported nanobeam in terms of the Jacobi elliptic functions by using the free vibration modes of the corresponding linear problem. Nevertheless, nonlinear free vibration of functionally graded nanobeams has been investigated by Sarabiani and Haeri-Yazdi [10]. In this study within the framework of Euler–Bernoulli beam theory including the von Kármán


geometric nonlinearity. In addition, Sahmani et al, have investigated Surface energy effects on the free vibration characteristics of post buckled third-order shear deformable nanobeams [11]. And they have been studied Surface effects on the nonlinear forced vibration response of third-order shear deformable nanobeams [12]. In these papers they have been used to Gurtin-Murdach elasticity theory, using of third-order shear deformation beam theory. Furthermore, The nonlinear free vibration of nanobeams with considering surface effects (surface elasticity, tension and density) has been studied by Nazemnezhad et al [13] within framework Euler–Bernoulli beam model including the von kármán geometric nonlinearity. However, Hosseini-Hashemi and Nazemnezhad [14] have presented Nonlinear free vibration of simply supported FG nanoscale beams with considering surface effects (surface elasticity, tension and density) and in this research, balance condition between the FG nanobeam bulk and its surfaces has been studied. As well as, Ansari et al [15] have investigated nonlinear forced vibration characteristics of nanobeams including surface stress effect. In this study, a new formulation of the Timoshenko beam theory has been developed through the Gurtin–Murdoch elasticity theory in which the effect of surface stress has been incorporated. Moreover, The surface and nonlocal effects on the nonlinear flexural free vibrations of elastically supported non-uniform cross section nanobeams have been investigated by Malekzadeh and Shojaee [16] simultaneously.in this paper, The formulations have been derived based on both Euler–Bernoulli beam theory (EBT) and Timoshenko beam theory (TBT) independently using Hamilton’s principle in conjunction with Eringen’s nonlocal elasticity theory. In the field of elastic foundation there are linear and nonlinear which is named as Winkler and Pasternak respectively. Elastic foundation has employed in the size of macro and nanobeams in many recent researches as explained below Zhao et al [17], have investigated the axial buckling of a nanowire (NW) lying on Winkler–Pasternak substrate medium with the Timoshenko beam theory. In addition, Simple analytical expressions have been presented by Fallah and Aghdam [18] for large amplitude free vibration and post-buckling analysis of functionally graded (FG) beams rest on nonlinear elastic foundation. Furthermore, Jang et al [19], have presented a new method of analyzing the non-linear deflection behavior of an infinite beam on a non-linear elastic foundation. Also, Niknam and Aghdam [20] have obtained a closed form solution for both natural frequency and buckling load of nonlocal FG beams resting on nonlinear elastic foundation. Eringen’s nonlocal elasticity theory has been employed into the Euler–Bernoulli beam theory to obtain the nonlinear governing partial differential equation. Moreover, the static instability of a nanobeam with geometrical imperfections with elastic foundation has been investigated by Mohammadi et al [21]. In this paper, Size-dependent effect is included in the nonlinear model. Nevertheless, differential transformation method (DTM) has been used to predict the buckling behavior of single walled carbon nanotube (SWCNT) on Winkler foundation under various boundary conditions by Pradhan and Reddy [22]. In this study, four different boundary conditions namely clamped–clamped, simply supported, clamped hinged and clamped free have used.In all above articles, they implemented on straight beams not curved ones. 
In recent years the vibration of curved nanobeams and nanorings has been examined in many empirical experiments and molecular dynamics simulations [23]. Hence some researchers have become interested in studying the vibration of curved nanobeams. Yan and Jiang [24] have investigated the electromechanical response of a curved piezoelectric nanobeam with consideration of surface effects through the surface-layer-based model and the generalized Young–Laplace equations; for these size-dependent piezoelectric structures, the surface effects include surface piezoelectricity in addition to the residual surface stress and surface elasticity of elastic nanomaterials, and an Euler–Bernoulli curved beam theory is used to obtain explicit solutions for the electroelastic fields of a curved cantilever beam subjected to mechanical and electrical loads. In addition, a numerical technique, the differential quadrature method (DQM), has been developed for the dynamic analysis of nanobeams in the polar coordinate system by Kananipour et al. [25]. Moreover, Khater et al. [26] have investigated the effect of surface energy and thermal loading on the static stability of nanowires; in that research, the nanowires were modeled as curved fixed–fixed Euler–Bernoulli beams and the Gurtin–Murdoch theory was used to represent surface effects, with the model accounting for both the von Kármán strain and the axial strain. Wang and Duan [23] have studied the free vibration problem of nanorings/arches; the problem was formulated within the framework of Eringen's nonlocal theory of elasticity in order to account for the small length scale effect, defects and elastic boundary conditions were investigated, and the small length scale effect was found to lower the vibration frequencies. An explicit solution has been given for the size- and geometry-dependent free vibration of curved nanobeams with consideration of surface effects by Assadi and Farshi [27]; in that paper, surface elasticity, surface residual stress and surface mass density were included to generalize the existing classical theories, and the deviations of the actual dynamic characteristics from the classical theories for various geometries were reported as new results. To the best of the authors' knowledge, no study to date has addressed curved nanobeams with surface effects resting on an elastic foundation. Therefore, there is a strong scientific need to understand the vibration behavior of curved nanobeams with surface effects in the presence of elastic foundations. The aim of this research is to survey the effects of the Winkler and Pasternak elastic foundations on the vibrations and natural frequencies of curved nanobeams. In this regard, the curved


nanobeams are modeled within the framework of the Euler–Bernoulli beam theory, and the paper investigates the effects of surface density, surface elasticity and surface residual stress.

2. PROBLEM STATEMENT

The in-plane free vibration of a curved nanobeam is considered. The radius of curvature and the thickness are denoted by R and h, respectively. Additional surface effects are assumed on all the external surfaces.

The dynamic equilibrium equations for a curved element neglecting effects of rotary inertia and shear deformation are given as:

$$\frac{\partial V}{\partial\theta} + P = \rho A R\,\frac{\partial^2 u_r}{\partial t^2} + \rho_s R b\left(\frac{\partial^2 u_r^{+}}{\partial t^2} + \frac{\partial^2 u_r^{-}}{\partial t^2}\right) - f R$$

$$\frac{\partial P}{\partial\theta} - V = \rho A R\,\frac{\partial^2 u_\theta}{\partial t^2} + \rho_s R b\left(\frac{\partial^2 u_\theta^{+}}{\partial t^2} + \frac{\partial^2 u_\theta^{-}}{\partial t^2}\right) - p R$$

$$\frac{\partial M}{\partial\theta} + R V = 0 \qquad (1)$$

where A, ρ and ρ_s are the cross-sectional area, mass density and surface mass density of the structure, respectively, and in Eq. (1) b is the width of the curved element. It must be noted that, due to the compatibility relation of a simple curved beam, the displacement components of the surface material must satisfy the relations below [28]:

$$\ddot{u}_r^{+} = \ddot{u}_r^{-} = \ddot{u}_r\,; \qquad \ddot{u}_\theta^{+} + \ddot{u}_\theta^{-} = 2\,\ddot{u}_\theta \qquad (2)$$

Substituting Eq. (2) into Eq. (1) and simplifying the resulting relations yields the equilibrium equations:

$$P - \frac{1}{R}\frac{\partial^2 M}{\partial\theta^2} = \left(\rho A R + 2\rho_s R b\right)\frac{\partial^2 u_r}{\partial t^2} - f R$$

$$\frac{\partial P}{\partial\theta} + \frac{1}{R}\frac{\partial M}{\partial\theta} = \left(\rho A R + 2\rho_s R b\right)\frac{\partial^2 u_\theta}{\partial t^2} - p R \qquad (3)$$

Eliminating the normal stress resultant P from Eq. (3) yields the relation between the bending moment and the radial displacement component. For this purpose it is necessary to differentiate the first of the two equations in Eq. (3) with respect to θ and then substitute the resulting expression for ∂P/∂θ into the other. Further differentiation of the result with respect to θ and simplification yields

$$\frac{\partial^4 M}{\partial\theta^4} + \frac{\partial^2 M}{\partial\theta^2} + \left(\frac{\partial p}{\partial\theta} - \frac{\partial^2 f}{\partial\theta^2}\right)R^2 = \left(\rho A R^2 + 2\rho_s R^2 b\right)\left(\frac{\partial^2 u_r}{\partial t^2} - \frac{\partial^4 u_r}{\partial\theta^2\,\partial t^2}\right) \qquad (4)$$

On the other hand, the constitutive equation of elastic surface materials for the general case is given by [8]



$$\tau_{\alpha\beta}^{\pm} = \tau\,\delta_{\alpha\beta} + \left(\mu_s - \tau\right)\left(u_{\alpha,\beta}^{\pm} + u_{\beta,\alpha}^{\pm}\right) + \left(\lambda_s + \tau\right)u_{\gamma,\gamma}^{\pm}\,\delta_{\alpha\beta} + \tau\,u_{\alpha,\beta}^{\pm} \qquad (5)$$

where τ stands for the surface residual stress, while λ_s and μ_s are the Lamé constants of the surface material. The plus and minus signs refer to the S⁺ and S⁻ surfaces, respectively. The other relation required for the derivation of the modified differential equation is the tangential strain in terms of the planar displacement components, which can be written as [28]

$$\varepsilon_{\theta\theta} = \frac{1}{R}\left[\frac{\partial u_\theta}{\partial\theta} - u_r - \frac{x}{R}\left(\frac{\partial^2 u_r}{\partial\theta^2} + \frac{\partial u_\theta}{\partial\theta}\right)\right] \qquad (6)$$

Considering inextensible deformation of the curved element at x = 0, it can be concluded that $u_r = \partial u_\theta/\partial\theta$. Using the tangential strain expression, accounting for the corresponding out-of-plane contractions of the curved beam in Eq. (5), and assuming the same surface properties for S⁺ and S⁻, the stress–strain relation for the surface material can be obtained as

$$\tau_{\theta\theta}^{\pm} = \tau \pm \frac{\left(2\mu_s + \lambda_s\left(1-\nu\right) - \nu\tau\right)h}{2R^2}\left(u_r + \frac{\partial^2 u_r}{\partial\theta^2}\right) \qquad (7)$$

On the other hand, the resultant bending moment acting on the cross section of the curved beam can be obtained by integrating the strain components over the cross section as follows:

$$M = -b\int_{-h/2}^{h/2} E\,\varepsilon_{\theta\theta}\,x\,dx + \frac{b h}{2}\left(\tau_{\theta\theta}^{+} - \tau_{\theta\theta}^{-}\right) \qquad (8)$$

Using Eqs. (6) and (7) in Eq. (8), the bending moment of the cross section can be expressed in terms of the radial displacement as follows:

$$M = \left[\frac{EI}{R^2} + \frac{b h^2\left(2\mu_s + \lambda_s\left(1-\nu\right) - \nu\tau\right)}{2R^2}\right]\left(u_r + \frac{\partial^2 u_r}{\partial\theta^2}\right) \qquad (9)$$
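As a brief verification step (a derivation sketch assuming the reconstructed forms of Eqs. (7) and (8) above), the surface contribution in Eq. (9) follows from

$$\frac{b h}{2}\left(\tau_{\theta\theta}^{+} - \tau_{\theta\theta}^{-}\right) = \frac{b h^2\left(2\mu_s + \lambda_s\left(1-\nu\right) - \nu\tau\right)}{2R^2}\left(u_r + \frac{\partial^2 u_r}{\partial\theta^2}\right),$$

while the bulk contribution gives $-b\int_{-h/2}^{h/2} E\,\varepsilon_{\theta\theta}\,x\,dx = \frac{EI}{R^2}\left(u_r + \frac{\partial^2 u_r}{\partial\theta^2}\right)$ with $I = b h^3/12$.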

Substituting Eq. (9) into Eq. (4) yields the modified governing equation of motion for the curved beam as a pure function of the radial displacement components, which must be solved to obtain the natural frequencies and mode shapes of vibration.

$$\lambda\left(\frac{\partial^6 u_r}{\partial\theta^6} + 2\frac{\partial^4 u_r}{\partial\theta^4} + \frac{\partial^2 u_r}{\partial\theta^2}\right) + \left(\frac{\partial p}{\partial\theta} - \frac{\partial^2 f}{\partial\theta^2}\right)R^2 = \left(\rho A R^2 + 2\rho_s R^2 b\right)\left(\frac{\partial^2 u_r}{\partial t^2} - \frac{\partial^4 u_r}{\partial\theta^2\,\partial t^2}\right) \qquad (10)$$

where f and λ are defined as follows:

$$f = -K_w u + K_p \nabla^2 u \qquad (11)$$


$$\lambda = \frac{EI + 0.5\,b h^2\left(2\mu_s + \lambda_s\left(1-\nu\right) - \nu\tau\right)}{R^2} \qquad (12)$$
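To make explicit how the foundation enters the modal stiffness (a brief derivation sketch assuming the reconstructed forms of Eqs. (10) and (11) above), note that for a radial foundation reaction the Laplacian reduces to $(1/R^2)\,\partial^2/\partial\theta^2$ along the beam axis, so that for a harmonic mode $u_r = \sin(\lambda_n\theta)\,e^{i\omega_n t}$

$$f = -K_w u_r + \frac{K_p}{R^2}\frac{\partial^2 u_r}{\partial\theta^2} = -\left(K_w + \frac{K_p\lambda_n^2}{R^2}\right)u_r, \qquad -R^2\frac{\partial^2 f}{\partial\theta^2} = -\left(K_w R^2\lambda_n^2 + K_p\lambda_n^4\right)u_r,$$

which is the origin of the $K_w R^2\lambda_n^2$ and $K_p\lambda_n^4$ stiffness terms appearing later in Eq. (17).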

For free vibration of curved beams or rings, the radial displacement vector can be assumed as

$$u_r(\theta, t) = \bar{u}_r(\theta)\,e^{i\left(\omega_n t + \varphi\right)} \qquad (13)$$

in which ω_n is the nth natural frequency of the structure. By substituting Eq. (13) into Eq. (10), one can obtain the following relation:

$$\frac{\partial^6 \bar{u}_r}{\partial\theta^6} + 2\frac{\partial^4 \bar{u}_r}{\partial\theta^4} + \frac{\partial^2 \bar{u}_r}{\partial\theta^2} + \beta_n\left(\bar{u}_r - \frac{\partial^2 \bar{u}_r}{\partial\theta^2}\right) = 0, \qquad \beta_n = \frac{\left(\rho A + 2\rho_s b\right)R^4\,\omega_n^2}{EI + 0.5\,b h^2\left(2\mu_s + \lambda_s\left(1-\nu\right) - \nu\tau\right)} \qquad (14)$$

Solution of the above equation, with satisfaction of the simply supported boundary conditions, yields the vibration characteristics of curved nanostructures.

3. NUMERICAL RESULTS

Next, to work out a numerical case, consider a material with bulk properties E = 177.3 GPa, ρ = 7000 kg/m³ and ν = 0.27, for which the surface properties are λ_s = −8 N/m, μ_s = 2.5 N/m, τ = 1.7 N/m and ρ_s = 7 × 10⁻⁶ kg/m² [8]. To validate the results, let β_n* and ω_n* be the corresponding parameters for a curved beam without consideration of surface effects and with the Winkler and Pasternak foundations ignored. Then, using Eq. (14), in which β_n reflects the effect of the mode shapes with consideration of surface effects, and obtaining the relation between β_n and β_n*, it can be concluded that

$$\frac{R^{*4}}{h^{*2}} = \frac{E\left(\rho h + 2\rho_s\right)R^4}{\rho\left[E h^3 + 6 h^2\left(2\mu_s + \lambda_s\left(1-\nu\right) - \nu\tau\right)\right]} \qquad (15)$$

where h* and R* are the thickness and radius of the curved nanobeam without surface effects, respectively. As observed in Fig. 2, the results are in good agreement with reference [27].


Fig. 1 geometric comparison of curved beams for the same natural frequency with and without surface effects

For nanorings with a total central angle α and pinned at both ends, the Navier solution can be written as

$$u_r = \sin\!\left(\frac{n\pi\theta}{\alpha}\right)e^{i\omega_n t} \qquad (16)$$

Substituting Eq. (16) into Eq. (10), the dimensionless natural frequencies of the curved beam including surface effects and elastic foundations can be obtained as

$$\Omega_n^2 = \frac{\rho A R^4\,\omega_n^2}{EI}, \qquad \omega_n^2 = \frac{\lambda\left(\lambda_n^6 - 2\lambda_n^4 + \lambda_n^2\right) + K_w R^2\lambda_n^2 + K_p\lambda_n^4}{\left(\rho A R^2 + 2\rho_s R^2 b\right)\left(\lambda_n^2 + 1\right)}, \qquad \lambda_n = \frac{n\pi}{\alpha} \qquad (17)$$

The dimensionless natural frequency of curved beam without surface effects and elastic foundation can be written as

$$\Omega_{0n}^2 = \frac{\rho A R^4\,\omega_{0n}^2}{EI}, \qquad \omega_{0n}^2 = \frac{EI\left(\lambda_n^6 - 2\lambda_n^4 + \lambda_n^2\right)}{\rho A R^4\left(\lambda_n^2 + 1\right)}, \qquad \lambda_n = \frac{n\pi}{\alpha} \qquad (18)$$
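As a numerical illustration, the sketch below evaluates the first three modes (a minimal sketch assuming the reconstructed forms of Eqs. (17) and (18) above; the material constants follow the values quoted in this section, while the geometry, opening angle and foundation moduli are illustrative assumptions rather than values taken from a specific figure):

```python
import math

# Bulk and surface properties quoted in this section (SI units assumed).
E, rho, nu = 177.3e9, 7000.0, 0.27                 # Pa, kg/m^3, -
lam_s, mu_s, tau, rho_s = -8.0, 2.5, 1.7, 7e-6     # N/m, N/m, N/m, kg/m^2

# Illustrative geometry and foundation moduli (assumed values).
R, h, b = 30e-9, 10e-9, 10e-9                      # m
alpha = math.pi / 2                                # opening angle (rad)
K_w, K_p = 1e10, 1e-6                              # N/m^2, N

A, I = b * h, b * h**3 / 12.0
lam = (E * I + 0.5 * b * h**2 * (2 * mu_s + lam_s * (1 - nu) - nu * tau)) / R**2

def omega_classical(n):
    """omega_0n from the reconstructed Eq. (18): no surface effects, no foundation."""
    ln = n * math.pi / alpha
    return math.sqrt(E * I * (ln**6 - 2 * ln**4 + ln**2) / (rho * A * R**4 * (ln**2 + 1)))

def omega_surface_foundation(n):
    """omega_n from the reconstructed Eq. (17): surface effects plus Winkler/Pasternak terms."""
    ln = n * math.pi / alpha
    num = lam * (ln**6 - 2 * ln**4 + ln**2) + K_w * R**2 * ln**2 + K_p * ln**4
    den = (rho * A * R**2 + 2 * rho_s * R**2 * b) * (ln**2 + 1)
    return math.sqrt(num / den)

scale = math.sqrt(rho * A * R**4 / (E * I))        # converts omega_n to the dimensionless Omega_n
for n in (1, 2, 3):
    print(n, scale * omega_classical(n), scale * omega_surface_foundation(n))
```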

3.1. EFFECT OF THICKNESS ON FREQUENCY RATIO FOR DIFFERENT RADII OF CURVATURE


In this subsection, the effect of the thickness h on the frequency ratio with and without surface effects is examined for various curvature radii. The same material and geometric parameters as above are used for the results obtained by the present model in Fig. 2. In addition, the Winkler and Pasternak elastic foundation moduli for this case are 10¹⁰ N/m² and 10⁻⁶ N, respectively.

Fig 2. Frequency ratio with and without surface effects versus thickness h for different radius of curvature

To highlight the surface effects on the vibration frequencies of the curved nanobeams, the dispersion curves are presented in Fig. 2. It is clearly seen that, at low values of the thickness h, the frequency ratio of curved nanobeams with surface effects takes greater values. Therefore, it is concluded that with increasing thickness h the effect of the surface properties tends to diminish slightly. Fig. 2 also reveals that the surface effects play a more important role at higher curvature radii.

3.2. ANALYSIS OF HIGHER MODES ON THE FREQUENCY RATIO OF THE CURVED BEAM WITH AND WITHOUT SURFACE EFFECTS AND ELASTIC FOUNDATION

Another numerical investigation is carried out to examine the discussed effects for higher vibration mode numbers. The frequency ratio with and without surface effects is illustrated in Fig. 3. In this case, the following parameters are selected: R = 30 nm, K_w = 10¹⁰ N/m², K_p = 10⁻⁶ N.


Fig 3. First three Frequency ratio with and without surface effects versus thickness h

The trends in Fig. 3 are similar to those in Fig. 2. It is noted that with an increase of the curved nanobeam thickness h in Fig. 3, the frequency ratios tend to one for all three natural frequency mode numbers. This reveals that at high values of thickness the influence of the surface effects diminishes for all mode numbers.

3.3. EFFECT OF THE WINKLER FOUNDATION ON THE FREQUENCY PARAMETER

In this subsection, the effect of the Winkler elastic foundation on the vibration frequencies of curved nanobeams with surface effects is investigated with respect to the thickness of the curved nanobeam. For this purpose, the variation of the fundamental dimensionless natural frequency with respect to thickness for various Winkler elastic foundation moduli is considered, as shown in Fig. 4. In this case, the Pasternak elastic foundation modulus is assumed constant and equal to 10⁻⁶ N.

Fig 4. Dimensionless natural frequency respect to thickness h for various Winkler elastic foundations


From Fig. 4, it is seen that the Winkler elastic foundation can significantly influence the vibration of the curved nanobeam with surface effects. It is also observed that as the thickness of the curved nanobeam increases, the fundamental frequencies decrease, which indicates that the Winkler elastic foundation plays an important role in the dimensionless frequencies. As shown in Fig. 4, as the Winkler modulus increases, the dimensionless natural frequencies also increase.

3.4. EFFECT OF THE PASTERNAK FOUNDATION ON THE FREQUENCY PARAMETER OF THE CURVED NANOBEAM

To evaluate the influence of the Pasternak elastic foundation on the vibration of the curved nanobeam with surface effects, Fig. 5 presents the natural frequency of the Euler–Bernoulli model for different values of the Pasternak elastic foundation modulus. For this purpose, the variation of the fundamental dimensionless natural frequency with respect to thickness for various Pasternak elastic foundation moduli is considered, as shown in Fig. 5. In this case, the Winkler elastic foundation modulus is assumed constant and equal to 10¹⁰ N/m².

Fig 5. Dimensionless natural frequency respect to thickness h for various Pasternak elastic foundations

It is seen from Fig. 5 that the dimensionless frequency is more sensitive at low thicknesses. As the thickness of the curved nanobeam increases, the dimensionless frequency decreases. It is also observed that as the Pasternak modulus increases, the dimensionless natural frequencies increase.

3.5. EFFECT OF THE RADIUS OF CURVATURE AND OPENING ANGLE ON THE FREQUENCY PARAMETER

To understand the influence of the radius R on the fundamental dimensionless natural frequency of the curved nanobeam with surface effects, Table 1 presents the natural frequencies of the curved nanobeam with respect to the curvature radius for different opening angles.


Table 1. Effect of the radius of curvature and opening angle on the first three dimensionless frequencies of an S-S curved nanobeam with surface effects (h = 10 nm).

             n = 1 (opening angle)            n = 2 (opening angle)            n = 3 (opening angle)
R (nm)     π/4       π/2       π            π/4       π/2       π            π/4       π/2       π
10        8.5860    7.9899   10.4824      35.5131   34.3442   31.9595      80.4543   79.1681   75.1266
20        9.9584   14.5663   33.1482      36.7241   39.8337   58.2651      81.6298   84.1989   98.9823
30       12.5597   26.0609   70.3180      38.8763   50.2389  104.243       83.6543   93.4159  143.790
40       16.5435   42.3610  122.245       42.1267   66.1739  169.444       86.6159  107.609   210.267
50       21.9300   63.3998  188.974       46.6265   87.7200  253.599       90.6203  127.320   297.848
60       28.6895   89.1502  270.519       52.4893  114.758   356.601       95.7771  152.774   406.031
70       36.7900  119.601   366.884       59.7808  147.160   478.403      102.187   183.985   534.516
80       46.2081  154.746   478.071       68.5258  184.833   618.985      109.934   220.873   683.140
90       56.9286  194.584   604.080       78.7233  227.714   778.334      119.078   263.334   851.811
100      68.9415  239.111   744.914       90.3584  275.766   956.445      129.660   311.279  1040.47

According to Table 1, it is obvious that the dimensionless natural frequencies increase as the radius of curvature increases. It is interesting to note that the natural frequencies also increase with increasing opening angle. These results can be used for the design of curved nanobeams and nanorings in the future.

4. CONCLUSION

The governing equations for the free vibration of circular nanorings/arches including surface density, surface tension and surface elasticity were derived herein. The Winkler and Pasternak elastic foundations were considered in the vibration behavior of the curved nanobeam, and simply supported boundary conditions were assumed. The Navier method was employed to solve the governing equation while satisfying the boundary conditions. The effects of the thickness of the curved nanobeam, the Winkler elastic foundation, the Pasternak elastic foundation, the opening angle and the radius of curvature on the frequency parameters of the circular curved beams were investigated. It is concluded that with increasing thickness h the effect of the surface properties tends to diminish slightly. Furthermore, it is shown that the elastic foundations and the surface effects play an important role in the dynamic behavior of circular curved nanobeams. The solutions can be used as a benchmark for future research.

5. REFERENCES

1. Şimşek, Mesut. "Large amplitude free vibration of nanobeams with various boundary conditions based on the nonlocal elasticity theory." Composites Part B: Engineering 56 (2014): 621-628.
2. T. L. Daulton, K. S. Bondi, K. F. Kelton, "Nanobeam diffraction fluctuation electron microscopy technique for structural characterization of disordered materials application metallic glasses", Ultramicroscopy, Vol. 110, pp. 1279-1289, (2010).
3. B. Hu, Y. Ding, W. Chen, D. Kulkarni, Y. Shen, V. V. Tsukruk, Z. L. Wang, "External-strain induced insulating phase transition in VO2 nanobeam and its application as flexible strain sensor", Advanced Materials, Vol. 22, pp. 5134-5139, (2010).
4. Thai, Huu-Tai. "A nonlocal beam theory for bending, buckling, and vibration of nanobeams." International Journal of Engineering Science 52 (2012): 56-64.
5. Ke, Liao-Liang, Yue-Sheng Wang, and Zheng-Dao Wang. "Nonlinear vibration of the piezoelectric nanobeams based on the nonlocal theory." Composite Structures 94.6 (2012): 2038-2047.
6. Murmu, T., and S. Adhikari. "Nonlocal transverse vibration of double-nanobeam-systems." Journal of Applied Physics 108.8 (2010): 083514.
7. Eltaher, M. A., Samir A. Emam, and F. F. Mahmoud. "Free vibration analysis of functionally graded size-dependent nanobeams." Applied Mathematics and Computation 218.14 (2012): 7406-7420.
8. Gurtin, Morton E., and A. Ian Murdoch. "Surface stress in solids." International Journal of Solids and Structures 14.6 (1978): 431-440.
9. Gheshlaghi, Behnam, and Seyyed M. Hasheminejad. "Surface effects on nonlinear free vibration of nanobeams." Composites Part B: Engineering 42.4 (2011): 934-937.
10. Sharabiani, Pouya Asgharifard, and Mohammad Reza Haeri Yazdi. "Nonlinear free vibrations of functionally graded nanobeams with surface effects." Composites Part B: Engineering 45.1 (2013): 581-586.
11. Sahmani, S., M. Bahrami, and R. Ansari. "Surface energy effects on the free vibration characteristics of postbuckled third-order shear deformable nanobeams." Composite Structures 116 (2014): 552-561.
12. Sahmani, S., et al. "Surface effects on the nonlinear forced vibration response of third-order shear deformable nanobeams." Composite Structures 118 (2014): 149-158.
13. Nazemnezhad, R., et al. "An analytical study on the nonlinear free vibration of nanoscale beams incorporating surface density effects." Composites Part B: Engineering 43.8 (2012): 2893-2897.
14. Hosseini-Hashemi, Shahrokh, and Reza Nazemnezhad. "An analytical study on the nonlinear free vibration of functionally graded nanobeams incorporating surface effects." Composites Part B: Engineering 52 (2013): 199-206.
15. Ansari, R., et al. "On the forced vibration analysis of Timoshenko nanobeams based on the surface stress elasticity theory." Composites Part B: Engineering 60 (2014): 158-166.
16. Malekzadeh, Parviz, and Mohamad Shojaee. "Surface and nonlocal effects on the nonlinear free vibration of non-uniform nanobeams." Composites Part B: Engineering 52 (2013): 84-92.
17. Zhao, Tiankai, Jun Luo, and Zhongmin Xiao. "Buckling Analysis of a Nanowire Lying on Winkler–Pasternak Elastic Foundation." Mechanics of Advanced Materials and Structures 22.5 (2015): 394-401.
18. Fallah, A., and M. M. Aghdam. "Nonlinear free vibration and post-buckling analysis of functionally graded beams on nonlinear elastic foundation." European Journal of Mechanics-A/Solids 30.4 (2011): 571-583.
19. Jang, T. S., H. S. Baek, and J. K. Paik. "A new method for the non-linear deflection analysis of an infinite beam resting on a non-linear elastic foundation." International Journal of Non-Linear Mechanics 46.1 (2011): 339-346.
20. Niknam, H., and M. M. Aghdam. "A semi analytical approach for large amplitude free vibration and buckling of nonlocal FG beams resting on elastic foundation." Composite Structures 119 (2015): 452-462.
21. Mohammadi, Hossein, et al. "Postbuckling instability of nonlinear nanobeam with geometric imperfection embedded in elastic foundation." Nonlinear Dynamics 76.4 (2014): 2005-2016.
22. Pradhan, S. C., and G. K. Reddy. "Buckling analysis of single walled carbon nanotube on Winkler foundation using nonlocal elasticity theory and DTM." Computational Materials Science 50.3 (2011): 1052-1056.
23. Wang, Chien Ming, and W. H. Duan. "Free vibration of nanorings/arches based on nonlocal elasticity." Journal of Applied Physics 104.1 (2008): 014303.
24. Yan, Zhi, and Liying Jiang. "Electromechanical response of a curved piezoelectric nanobeam with the consideration of surface effects." Journal of Physics D: Applied Physics 44.36 (2011): 365301.
25. Kananipour, Hassan, Mehdi Ahmadi, and Hossein Chavoshi. "Application of nonlocal elasticity and DQM to dynamic analysis of curved nanobeams." Latin American Journal of Solids and Structures 11.5 (2014): 848-853.
26. Khater, M. E., et al. "Surface and thermal load effects on the buckling of curved nanowires." Engineering Science and Technology, an International Journal 17.4 (2014): 279-283.
27. Assadi, Abbas, and Behrooz Farshi. "Size dependent vibration of curved nanobeams and rings including surface energies." Physica E: Low-dimensional Systems and Nanostructures 43.4 (2011): 975-978.
28. Rao, Singiresu S. Vibration of continuous systems. John Wiley & Sons, 2007.


Extracting Drug-Drug Interaction from Literature through Detecting Linguistic-Based Negation and Neutral Candidates

Behrouz Bokharaeian, Alberto Diaz

NIL (Natural Interaction based on Language) Group, Universidad Complutense de Madrid, 28040 Madrid, Spain

[email protected]

Abstract
Extracting biomedical relations such as drug-drug interactions (DDI) from text is an important task in biomedical NLP. Most current relation extraction tasks and the produced corpora are based on binary relations, which state whether a relation between two entities exists in a sentence or not. This paper aims to identify neutral interaction candidates, i.e. candidate pairs about which the authors make no statement in the sentence, which have not been studied before. Alongside this, a linguistic-based negation feature set based on negation scopes and cues is employed. The proposed neutral candidate features show significantly better performance than the negation related features. The results show that by employing the proposed features alongside a bag-of-words kernel, the performance of the three kernel methods used improves. Moreover, the enhanced local context kernel outperforms the other methods used. Additionally, the experiments show that automatically produced negation scope and cue tags can be employed effectively.

Keywords: Drug-Drug interaction, kernel methods, negation detection, neutral candidate

1. Introduction
Being relatively new, biomedical relation extraction from text is a rapidly growing research field within the disciplines focused on Natural Language Processing (NLP). Considering the ever increasing number of published biomedical studies and the large amount of unstructured text resources they release, there is a strong demand for extracting biomedical relations from scientific articles and text reports. Given the clinical significance of drug-drug interactions, it is highly desirable for bioNLP studies to propose algorithms that automatically extract such interactions from text. A drug-drug interaction (DDI) refers to a situation where the level of activity exhibited by one drug is changed by another drug. According to FDA reports and other studies [1], over 2 million life-threatening DDIs occur per year in the United States [2]. Many academic researchers and pharmaceutical companies have developed relational and structured databases to record DDIs. However, a major portion of the research and valuable information is still found in unstructured text documents such as scientific publications and technical reports.

In addition, given the key role played by biomedical relations such as protein-protein interactions and drug-drug interactions in the identification of biological and medical processes, biomedical relation extraction is a highly important research topic in the field of biomedical text mining. Most of the existing literature on biomedical


relation extraction, including work on DDI detection, is based on supervised binary relation extraction. As such, other types of algorithms, such as complex relation extraction algorithms and semi-supervised ones, are also expected to be translated into this type of relation extraction algorithm [3].

On the other hand, since their general aim is to derive factual knowledge from textual data, most biomedical text mining tasks necessarily need to be able to detect negative assertions. According to linguistics [4], negation is a morphosyntactic operation in which the meaning of a lexical item or construction is denied or inverted by another lexical item; the lexical item that expresses the negation is referred to as a negator. Negation is common in clinical and biomedical text documents and is a significant source of inaccuracies in automated information retrieval systems [5].

Generally, two negation detection approaches have been developed and used to annotate the existing corpora, namely a linguistic-based approach and an event-oriented approach. Among negation annotated corpora, two well-known ones are the linguistically-focused, scope-based BioScope corpus and the event-oriented Genia corpus. In BioScope, with each argument of the key events lying inside the scope, scopes aim to capture the position of the negation with respect to the key event within the sentence [6]. In contrast, Genia deals with the modality of the events independently: in a Genia event, biological concepts (relations and events) are annotated for negation, but no linguistic cues are annotated. The main objective of the BioScope corpus is to investigate this language phenomenon in a general, task-independent, linguistically-oriented way; automatic recognition of sentence negation scopes and cues is another feature of BioScope. A detailed comparison between these two corpora can be found in [7]. Furthermore, feature extraction investigations indicated the "trigger of the event" to be a misleading feature for this task, because it suffers from uncertainties between positive and negative polarities. Consequently, the following actions were undertaken: first, identification of the relation state within its corresponding positive sentence; and second, detection of all factors that determine whether the meaning is inverted or not. As a final conclusion, the linguistically-oriented approach was chosen for the work undertaken herein.

On the other hand, most of the current relation extraction problems and the produced corpora are based on binary relations, which state whether a binary relation between two entities exists in the sentence or not. Similarly, in the DrugDDI corpus [8], implemented systems must predict whether an interaction between the two drugs has occurred or not. Although detecting positive interactions is the main target of the DrugDDI corpus, there is a difference between an interaction candidate about which the authors make a statement (a biased candidate) and an interaction candidate about which they do not (a neutral candidate), whereas both are considered false in DrugDDI. In other words, a neutral interaction candidate is one on which the author makes no remark in the sentence, while a biased interaction candidate is exactly the opposite (remarked on by the author). In fact, neutral candidates are a particular subclass of non-positive candidates that can be detected by a set of defined features.

The rest of the process consists of employing the extracted features in combination with the kernel methods. In the following sections, the method is first explained; then, in the third section, the results are presented; and in the last section, the results are discussed and concluded, and some suggestions are given for future work.

2. Method

Discussed in this section is the proposed method along with the different components of the system. The general framework of the implemented system can be seen in Figure 1. As the flowchart demonstrates, the sentence, the drug names and the negation scopes and cues extracted from the produced NegDDI-DrugBank corpus [9] are used as inputs for both the feature extraction and the kernel methods used, namely the global context kernel, the subtree kernel and the local context kernel. Combining the two previously produced representations, a new representation of the sentence is produced from these inputs. The new combined representation consists of several substructures produced by the kernel method features along


with the newly proposed NN feature values (in this paper, "NN" refers to the Negation scope and cue, and Neutral candidate feature sets).

A bag-of-words kernel method, which looks for a polynomial combination of the features (commonly referred to as the kernel function), is used to classify the newly created representation. A support vector machine with the SMO implementation [10] was used, which, according to the experiments performed in this study, outperformed other SVM implementations such as libSVM. The Weka API was used as the implementation platform. The bag-of-words based kernel method was executed without a stemming step and with its minimum term frequency set to one.
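As an illustration of this classification setup (a minimal sketch only: the paper uses Weka's SMO with a polynomial kernel over a bag-of-words representation, whereas the sketch below uses scikit-learn as an analogous stand-in; the feature strings and field names are hypothetical):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Each candidate pair is represented as a single string that concatenates the
# kernel substructures (e.g. token n-grams around the two drugs) with the
# proposed NN feature values; the exact substructures here are illustrative.
train_texts = [
    "DrugName did not influence DrugName BothinsideNegSc=false NeutralPair=false",
    "no interaction between DrugName and OtherDrugNames NeutralPair=true",
]
train_labels = [1, 0]  # 1 = interaction stated, 0 = no interaction

# Bag-of-words representation (no stemming, minimum term frequency of one)
# fed to an SVM with a polynomial kernel, mirroring the described setup.
model = make_pipeline(
    CountVectorizer(lowercase=True, min_df=1),
    SVC(kernel="poly", degree=2, C=1.0),
)
model.fit(train_texts, train_labels)
print(model.predict(["DrugName may increase the effect of DrugName NeutralPair=false"]))
```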

For all the mentioned methods, the features were made entity-blind: all the drug names in the generated features were replaced with two general terms, namely DrugName (for the two drugs whose interaction is under investigation) and OtherDrugNames (for the other drugs). The tokenization task was performed with the Stanford BioNLPTokenizer [11], adapted to pharmaceutical text; the Stanford parser was used for constituent parsing. Furthermore, just like the winning team in the DDIExtraction 2011 challenge, we used TreeTagger for the lemmatization and POS tagging tasks.
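A minimal sketch of the entity-blinding step described above (the function name and the way entity positions are supplied are assumptions made for illustration):

```python
def blind_entities(tokens, pair_indices, other_drug_indices):
    """Replace the candidate pair with 'DrugName' and all remaining drug
    mentions with 'OtherDrugNames', leaving the other tokens untouched."""
    blinded = list(tokens)
    for i in pair_indices:
        blinded[i] = "DrugName"
    for i in other_drug_indices:
        blinded[i] = "OtherDrugNames"
    return blinded

tokens = ["Acarbose", "has", "no", "effect", "on", "Ranitidine", "or", "Propranolol"]
# Candidate pair: (Acarbose, Ranitidine); Propranolol is another drug mention.
print(blind_entities(tokens, pair_indices=[0, 5], other_drug_indices=[7]))
# ['DrugName', 'has', 'no', 'effect', 'on', 'DrugName', 'or', 'OtherDrugNames']
```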

Figure 1. Basic components of the implemented and proposed methods

2.1. Negation scope and cue features
The relative position of the entities with respect to the negation scope and cue, which can be directly extracted from the extended corpus, is an important factor to consider. For instance, in the sentence below:


• Population pharmacokinetic analyses revealed that MTX, NSAIDs, corticosteroids, and TNF blocking agents [did not influence abatacept clearance].

MTX and NSAIDs are two drug names outside the negation scope, so their interaction relation is not inverted by the negation; in contrast, the interaction between abatacept and MTX is inverted by the negation, because abatacept lies inside the negation scope. Depending on whether the drug names are inside or outside the negation scope, there are six different possibilities, which are used as six features:

• BothinsideNegSc: A Boolean feature set to true when both drugs are inside the negation scope with other situations being false.

• BothLeftSNegSc: A Boolean feature set to true when both drugs are on the left side of the negation scope with other situations being false.

• BothRightNegSc: A Boolean feature set to true when both drugs are on the right side of the negation scope with other situations being false.

• OneLeftOneInsideNegSc: A Boolean feature set to true when one drug is on the left side of the negation scope, the other is inside the negation scope, and other situations are false.

• OneRightOneInsideNegSc: A Boolean feature set to true when one drug is on the right side of the negation scope, the other is inside the negation scope, and other situations are false.

• OneLeftOneRightSc: A Boolean feature set to true when one drug is on the right side of the negation scope, the other is on the left, and other situations are false.

In addition to the features above, the negation cue itself is also used as a text feature.

One may refer to [9] to see detailed information on different possibilities of relative position of drugs within the sentences along with the negation scope and the different types of the interactions.
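A minimal sketch of how these six Boolean features can be computed from character offsets (the representation of the scope as a (start, end) span and the function name are assumptions made for illustration):

```python
def negation_scope_features(drug1_pos, drug2_pos, scope):
    """Return the six Boolean position features for a drug pair, given the
    character offset of each drug mention and the negation scope span."""
    start, end = scope

    def side(pos):
        if pos < start:
            return "left"
        if pos > end:
            return "right"
        return "inside"

    sides = {side(drug1_pos), side(drug2_pos)}
    return {
        "BothinsideNegSc": sides == {"inside"},
        "BothLeftSNegSc": sides == {"left"},
        "BothRightNegSc": sides == {"right"},
        "OneLeftOneInsideNegSc": sides == {"left", "inside"},
        "OneRightOneInsideNegSc": sides == {"right", "inside"},
        "OneLeftOneRightSc": sides == {"left", "right"},
    }

# Example: one drug mention to the left of the scope, the other inside it.
print(negation_scope_features(drug1_pos=45, drug2_pos=120, scope=(100, 150)))
```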

2.2 Neutral candidate feature extraction
As explained in the introduction, there is a critical distinction between biased and neutral interaction candidates. A neutral candidate is a candidate on which the author makes no remark in the sentence, while a biased candidate is exactly the opposite (remarked on by the author). Neutral candidates are thus a particular subclass of non-positive candidates that can be detected by a set of defined features. The two concepts are demonstrated on the sample sentence analyzed in Figure 2. The status of the relation between each pair of the drugs listed at the end of the sentence (Propranolol, Ranitidine, etc.) is neutral, as the author makes no remark on their interaction with each other. However, a biased relation exists between Acarbose and Ranitidine, Propranolol and the other mentioned drugs, because the author states "...Acarbose has no effect on either the...".


Figure 2. A sample sentence with negation from NegDDI-DrugBank with neutral and biased false DDI

In the act of negation, a negation cue inverts the status of a biased candidate but not the status of an innately neutral interaction candidate. For instance, in the sentence shown in Figure 2, the positive status of the biased candidate between Acarbose and Ranitidine is changed into negative by the negation cue. However, no change occurs in the status of the neutral candidate between Propranolol and Ranitidine, and the interaction remains false. Another example is presented in the following sentence:

• There appears to be<scope><cue>no</cue>pharmacokinetic interaction between acitretin and cimetidine, digoxin, or glyburide </scope>.

Because of the negation cue in the sentence, the true relation between acitretin and cimetidine is inverted into a false one. However, regardless of whether a negation cue exists or not, no change occurs in the relation between digoxin and glyburide, which is considered false. Accordingly, based on different types and writing patterns, 10 Boolean features were defined to extract neutral candidates. A rule-based system was then implemented which utilizes regular expression rules to obtain the following features:

• When the second drug name is another name for the first drug, one of its brand names, or the category to which the first drug belongs, two of the Boolean features are set to true. For instance, in the sentence below, Purinethol and Imuran are the brand names of mercaptopurine and azathioprine, respectively:

– In patients receiving mercaptopurine (Purinethol) or azathioprine (Imuran), the concomitant administration of 300-600 mg of allopurinol per day will require a reduction in dose to approximately one-third to one-fourth of the usual dose of mercaptopurine or azathioprine.

• Two other Boolean features are set to true when the interactions of the two drugs of interest with a third drug (or drugs) are investigated while the interaction between the two drugs of interest themselves is not. For instance, no interaction between doxorubicin and bleomycin is investigated in the sentence below:


– However, in a well-controlled study of patients with lymphoma on combination therapy, Allopurinol did not increase the marrow toxicity of patients treated with cyclophosphamide, doxorubicin, bleomycin, procarbazine and/or mechlorethamine.

• A Boolean feature is set to true given two drug names located within two different sentences separated by a dot (.).

• In order to detect drug pairs located in sentences that carry no additional information about a DDI (hereafter referred to as non-informative sentences), five Boolean features were defined. The sentence below, for example, is not informative, with no investigation reported:

– There are no study data to evaluate the possibility, nitric oxide donor compounds, including sodium nitroprusside and nitroglycerin

It is worth mentioning how critical it is to consider the neutral candidates, since failing to do so properly may result in conflicts within the corpus later on. In Figure 2, no actual investigation is conducted on possible interactions between Propranolol and Ranitidine; in the corpus, however, such an interaction has been considered negative, even though no remark on the interaction between these two drugs is made by the author, which leads the corpus to contain some conflicts.
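A minimal sketch of two of these rule types (the regular expression and helper names below are illustrative assumptions, not the exact rules used in the paper):

```python
import re

# Illustrative pattern for non-informative sentences of the kind
# "There are no study data to evaluate ..." (one of several rule types).
NON_INFORMATIVE = re.compile(r"\bno (study data|formal .* studies)\b", re.IGNORECASE)

def different_sentences(text, drug1, drug2):
    """True when the two drug mentions never co-occur in a dot-separated sentence."""
    for sentence in text.split("."):
        if drug1 in sentence and drug2 in sentence:
            return False
    return True

def looks_non_informative(sentence):
    """True when the sentence matches a non-informative writing pattern."""
    return NON_INFORMATIVE.search(sentence) is not None

s = ("No formal pharmacokinetic drug interaction studies between Panretin gel "
     "and antiretroviral agents have been conducted.")
print(looks_non_informative(s))                        # True: flagged as containing neutral candidates
print(different_sentences(s, "Panretin", "digoxin"))   # True: pair never co-occurs in one sentence
```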

3. Results

This section begins by presenting the comparative results of the discussed methods along with those obtained by the original methods. The F-measure is then used to demonstrate the contribution of all the different possible feature sets.

Furthermore, as mentioned in the Method section, some experiments were undertaken to determine the adequacy of the automatic algorithm, which involves no manual checking phase. As shown in Table 1, the checked and unchecked configurations were compared under 10-fold cross validation to confirm that the errors incurred by the automatic negation detection algorithm do not lead to significantly worse results. The comparative experiments in Table 1 show the manually checked and unchecked results to be identical in some cases, while in the other comparative experiments an improvement of less than 0.3% (average 0.1%) was observed with the manually checked annotations. Additionally, as can be seen in Table 1, the manual checking process improved the results of the subtree kernel more than those of the two sequence kernels used.

Therefore, based on these experiments, automatic negation detection without a manual checking phase can be employed to achieve acceptable results in the DDI extraction task. These results are also in agreement with the analyses performed on the corpus: according to [9], the automated process correctly annotated 77% of the negation scopes and cues. The small number of errors remaining in the annotations may still have some negative impact on the performance of the DDI algorithm.

The following tables display the results of the two types of validation tests described in this section. First, similar to the SemEval DDI challenges, the system was trained and tested on the training and testing parts of the DrugBank corpus, respectively. Then, the NegDDI-DrugBank 2013 corpus was evaluated with 10-fold cross validation and the results are reported. Subsequently, a statistical sign test was applied to show that the improvements caused


by the proposed method over the three methods used were significant. Finally, an error analysis of the system is presented.
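A minimal sketch of such a sign test (an illustration only; the pairing of predictions and the counts used below are hypothetical):

```python
from math import comb

def sign_test_p_value(n_plus, n_minus):
    """Two-sided exact sign test: probability of a split at least this extreme
    under the null hypothesis that improvements and degradations are equally likely."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * tail)

# Hypothetical counts: instances where only the NN-enhanced kernel is correct (n_plus)
# versus instances where only the original kernel is correct (n_minus).
print(sign_test_p_value(n_plus=48, n_minus=22))
```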

Table 2 shows the results obtained by the modified global context kernel (with the NN postfix) along with those of the normal global context kernel. The first row contains the results for pairs, in the testing part, with negation cue(s) but no clause connector, while the second row reports the results for pairs with negation cue(s) and clause connector(s). The third and fourth rows report the results for pairs with no negation cue, with and without clause connector(s), respectively. The last row reports the results for the entire dataset.

Table 1- A comparison across the results of the 10-fold cross validation of the three methods used with manually checked (MC) and without manually checked (WC) negation annotation in terms of F-measure.

Method          | F-1 WC | F-1 MC | Dif.
GCK+NEG         | 77.3%  | 77.4%  | +0.1%
LCK+NEG         | 81.1%  | 81.1%  |  0%
SubtreeK+NEG    | 71.9%  | 72.1%  | +0.2%

Tables 2, 3 and 4 contain the results obtained by the improved methods (with the NN postfix) along with those of the corresponding original methods. For all three tables, the first row gives the results for sentences in the test dataset with negation cue(s) but no clause connector. The second row reports the results for pairs with negation cue(s) as well as clause connector(s). The third and fourth rows present the results for sentences with no negation cue, with and without clause connector(s), respectively. The last row displays the results for the entire test dataset.

Table 5 contains the results of the 10-fold cross validation over NegDDI-DrugBank 2013 (both the testing and training parts). As the table shows, the proposed features and method are associated with an increase in all the parameters studied, including recall and F-measure. In the 10-fold cross validation experiments, the best F-measure across the proposed methods was achieved by the proposed local context kernel-NN method (83.9%). Furthermore, with an F-score of 68.5%, the improved local context kernel-NN produced the best F-score on the testing part of NegDDI-DrugBank. In addition, the table indicates that the best improvement in terms of F-measure is achieved by the local context kernel; i.e., compared to the original LCK, the improved LCK method is more successful in detecting false and true interactions.

According to Tables 2, 3 and 4, the proposed method improves the F-measure for all test categories. With an average increase of +7.1% in F-measure, sentences with no negation cues and no clause connectors exhibit the best improvements for the global context and local context kernel methods. Moreover, for the subtree kernel, sentences with no negation cue but one or more clause connectors exhibit the highest improvement, with an average increase of +15.4% in F-measure. The only method for which the obtained F-measures were not correlated with one another was the subtree kernel, indicating non-equal success rates of the proposed subtree kernel in detecting true and false interactions.

These results show the proposed features to be efficient not only for sentences with negation cues and clause connectors, but also for other types of sentences, such as those with neither negation cues nor clause connectors. As shown in Tables 2, 3 and 4, this can be attributed to the significant improvements produced by the extracted neutral candidate features for all three methods applied to all four types of test datasets. In addition, for the sequence kernels studied (the global and local context kernels), the neutral candidate features exhibit their worst performance in sentences with no negation cue and scope.

In the subtree kernel, however, the best performance is observed on this category of datasets (Table 3). This is because the original subtree kernel is considerably more effective than the other two sequence kernels in sentences


with negation cues. Note that although no negation cue or scope is directly used to extract this category of features, a major portion of the non-informative sentences detected by these rules contain one or more negation cues. For instance, the following sentence is a non-informative sentence containing the negation cue "no" along with a possible interaction between Panretin gel and antiretroviral agents; using the proposed neutral candidate detection rules, it is detected as one containing a neutral candidate.

• No formal pharmacokinetic drug interaction studies between Panretin gel and antiretroviral agents have been conducted.

Therefore, these rules indirectly help to resolve this type of relation, which usually occurs in sentences with negation cues.

It was observed in the experiments with the proposed global context kernel that the dataset containing the sentences with neither negation cues nor clause connectors produced the best performance (71.9%). Furthermore, in most cases the proposed method treats the introduced negation scope and cue features, as well as the clause dependency features, as tokens. Consequently, just as in the original global context kernel, the two introduced feature categories did not yield the expected improvement in GCK.

On the other hand, the same experiments were repeated with the modified local context kernel; the results indicate that, similar to the GCK, the dataset with sentences lacking negation cues and clause connectors exhibits the best performance. However, even for the other three datasets, the system improves DDI detection performance to a satisfying level (Table 4).

Finally, according to the results of the experiment conducted with the improved subtree kernel (Table 3), the best performance in terms of improvement rate (15.3%) was observed for the dataset containing sentences with no negation cue but with clause connector(s). Even though the performance of the original subtree kernel was improved, to different extents, by all the feature sets, the best combination of feature sets was the one combining the neutral candidate features with the negation cue and scope features (Table 3), whose performance was comparable to that of the entire feature list (15.3%).

The results indicate that the subtree kernel, which is based on constituent parsing, is more effective than the other investigated methods in DDI detection for complex sentences; this is most likely because this method extracts more structural and compositional information from the sentence than the two sequence kernels.

However, no significant improvement was observed for sentences with negation cues, scopes and connectors, possibly because, as explained above, the original subtree kernel already handles these types of sentences well, so the introduced features are not as effective there.

In order to reduce the complexity of the sentences, additional experiments based on basic simplification methods were undertaken. Unfortunately, no improvement was obtained in the results.

The best result for the testing part was obtained by the improved local context kernel method (LCK-NN), with an F-measure of 68.5%, which is 2.7% higher than that of the winning system in the DDIExtraction 2011 challenge (implemented by Humboldt University of Berlin).


Table 2- F-measure results obtained with global context kernel for a combination of different feature sets used.

Test Category                   | F-1 GCK | F-1 GCK+NEG | Dif.  | F-1 GCK+NEUT | Dif.  | F-1 GCK-NN | Dif.
With negation, no connector     | 56.6%   | 54.9%       | -1.7% | 66.2%        | +9.6% | 59.8%      | 3.2%
With negation, with connector   | 51.7%   | 52.2%       | +0.5% | 59.7%        | +8.0% | 58.2%      | 6.5%
No negation, with connector     | 62.3%   | 62.3%       | 0.1%  | 65.3%        | +3.0% | 65.7%      | 3.4%
No negation, no connector       | 64.7%   | 64.8%       | 0.1%  | 71.8%        | +7.1% | 71.9%      | 4.2%
Total                           | 61.7%   | 61.3%       | -0.4% | 68.6%        | +6.9% | 67.5%      | 4.3%

Table 3- F-measure results obtained with subtree kernel for a combination of different feature sets used.

Test Category                   | F-1 Subtree | F-1 Subtree+NEG | Dif.  | F-1 Subtree+NEUT | Dif.  | F-1 Subtree-NN | Dif.
With negation, no connector     | 60.9%       | 59.2%           | -1.7% | 66.9%            | +6.0% | 70.2%          | 9.9%
With negation, with connector   | 63.2%       | 63.1%           | +0.9% | 63.2%            | 0%    | 63.1%          | -0.1%
No negation, with connector     | 58.6%       | 62.9%           | +4.3% | 68.5%            | +9.7% | 73.9%          | 5.3%
No negation, no connector       | 36.3%       | 36.3%           | 0%    | 38.7%            | +2.4% | 36.3%          | 0%
Total                           | 47.1%       | 47.6%           | +0.5% | 51.4%            | +4.3% | 51.6%          | 3.8%

Table 4- F-measure results obtained with local context kernel for a combination of different feature sets used.

Test Category                   | F-1 LCK | F-1 LCK+NEG | Dif.  | F-1 LCK+NEUT | Dif.  | F-1 LCK+NN | Dif.
With negation, no connector     | 62.6%   | 63.4%       | +0.8  | 66.0%        | +3.4% | 65.6%      | 3%
With negation, with connector   | 58.0%   | 52.2%       | -5.8% | 67.2%        | +9.2% | 64.8%      | 6.8%
No negation, with connector     | 64.8%   | 65.9%       | +1.1% | 66.2%        | +1.4% | 68.9%      | 4.1%
No negation, no connector       | 63.9%   | 65.3%       | +1.4% | 69.6%        | +5.7% | 69.9%      | 6%
Total                           | 63.4%   | 64.1%       | 0.7%  | 68.1%        | +4.7% | 68.5%      | 5.0%

Table 5- F-measure results obtained with 10-fold cross validation over NegDDI-DrugBank 2013 (DrugBank training and testing parts) for the three investigated kernels and their improved versions.

Method                | Rec    | Rec (Enhanced NN) | Dif.  | F-1    | F-1 (Enhanced NN) | Dif.
Global Context        | 76.2%  | 79.3%             | +3.1% | 77.4%  | 80.2%             | +2.8%
Subtree kernel        | 68.7%  | 72%               | +3.3% | 71.9%  | 74.7%             | +2.8%
Local Context Kernel  | 80.4%  | 82.7%             | +2.3% | 80.7%  | 83.9%             | +3.2%

3.2. Error Analysis
This section provides the results of the analyses conducted in the error identification phase.


Training the system with the entire training dataset of NegDDI-DrugBank, which is dominated by sentences without negation cues and clause connectors, can be seen as one of the main sources of the errors observed in sentences with negation cues and clause connectors, whose error rates on the testing dataset are higher than expected for both sequence kernels used.

Better results would be expected if the system were trained and tested only on sentences possessing negation cues and scopes. However, as the number of such sentences in the training dataset is small (around 20%), the system gave inferior results compared with the current ones. The same outcome was observed in the analogous experiments with sentences containing clause connectors. Therefore, one may suggest using corpora with higher percentages of sentences with negation cues or clause connectors to resolve this issue and enhance the quality of relation prediction in such sentences.

With further experiments, larger improvements were found when the features were selected manually. Therefore, an effective automatic feature selection method would be expected to improve the system performance for the proposed methods.

Some other known sources of error can be summarized as follows:

• Issues with pharmaceutical unification or the tokenizer: the tokenization task in this research was performed with the Stanford BioNLPTokenizer. However, in order to unify different tokens of the same pharmaceutical meaning and to adapt the tokenizer to pharmaceutical text resources, a more accurate unification process is needed. For instance, the token Co-administration should be treated as equivalent to the token Coadministration. Moreover, as sources of inaccuracy, some compound drug names composed of more than one word (including DrugName-containing and DrugName-induced) must be handled pharmaceutically when tokenization is concerned.

• Like all extraction systems, the proposed system suffers from the issue of parentheses as another source of inaccuracy; that is, some parentheses enclose a clause or explanation containing one or more drug names, as in the following example.

– Although specific drug or food interactions with mifepristone have not been studied, on the basis of this drugs metabolism by CYP 3A4, it is possible that ketoconazole, itraconazole, erythromycin, and grapefruit juice may inhibit its metabolism (increasing serum levels of mifepristone). Ketoconazole and mifepristone are two drug names that have an interaction in the corpus; however, because of the parentheses, their interaction is not detected by the system. A simplification algorithm could be useful to resolve the parentheses issue.

• Complex sentences with more than two independent clauses contribute to higher error rates; the higher the number of independent and dependent clauses, the higher the error rate. See the following example:

– ProAmatine. Alpha-adrenergic blocking agents, such as prazosin, terazosin, and doxazosin, can antagonize the effects of ProAmatine. Potential for Drug Interactions: It appears possible, although there is no supporting experimental evidence that the high renal clearance of desglymidodrine (a base) is due to active tubular secretion by the base-secreting system also responsible for the secretion of such drugs as metformin, cimetidine, ranitidine, procainamide, triamterene, flecainide, and quinidine.

4. Discussion and Conclusion


As an important task in biomedical NLP, supervised biomedical relation extraction aims to extract associations between biomedical entities. Investigated in this paper were drug-drug interactions (DDIs); these are considered critical components of biomedical associations, and many methods have been developed for extracting them. Substantial studies on the contribution of negation scopes and cues and of neutral candidates to the relation extraction task via the proposed methods had yet to be reported.

This paper proposed a total of 18 features (including negation related features and features used for neutral candidate identification), which were extracted and used in the kernels.

In addition, the results proved the proposed features to be efficient not only for sentences with negation cues and clause connectors, but also for other types of sentences, such as those with neither negation cues nor clause connectors. Meanwhile, one may argue that the linguistically-oriented, scope-based negation annotation process (which identifies the negation cue and scope) by itself provides insufficient information to cope with the act of negation.

In spite of the confidence we have in our current results, whether each and every kernel method benefits from these features remains open to discussion.

Based on the results of the subtree kernel for sentences with negation cues and clause connectors, the authors believe that more advanced kernels, which attempt to capture more features from the parse tree, parse graph and other sentence representations, are likely to benefit less from the proposed features and method than simpler kernels.

Furthermore, the proposed neutral candidate detection algorithm is still needed regardless of which annotation method is used. Another factor reducing the importance of the annotation method is the type of neutral candidates occurring in non-informative sentences that contain negators; these cases were detected by the proposed system and correctly recognized as false interactions. Another advantage of the linguistically-oriented approach is that it requires less manual effort than an event-oriented annotation method.

Used either as a pre-processing step or alongside other methods, this sort of algorithm can bring about better results.

The authors have recently utilized the proposed algorithm to annotate the Medline part of the DrugDDI corpus with negation cues and scopes. However, the Medline part of the corpus contains considerably less testing and training data than the DrugBank part. Future work may extend NegDDI-DrugBank with a manually checked version of the Medline part, as well as addressing further experiments on this subject.

Also for future work, one could employ similar features to resolve in-sentence speculation, another major source of inaccuracy in texts besides negation; similar approaches can be followed to determine speculation cues and scopes.

In spite of the confidence we have in our current results, whether each and every kernel method benefit from these features can make the basis for a challenging discussion.

The results of the subtree kernel on sentences with negation cues and clause connectors further support this view.



Solving Fuzzy Linear system in the presence of Hukuhara Difference

M.Keyanpour1 , M.Mohaghegh tabar1∗

1Department of Mathematics, Faculty of Sciences,

University of Guilan, Rasht, Iran, P.O.Box 41335-1914.

Abstract

In this paper, we deal with a linear system in which the operators are addition and the Hukuhara difference. We pay attention to the condition for existence of a solution of the mentioned system and, if a solution exists, we obtain it by using α-cuts. Numerical results confirm the applicability of our technique.

Keywords: Fuzzy Linear programming, inverse operator, Hukuhara difference, fuzzy solution.

1 Introduction

The arithmetic operations on real numbers were extended to operations on fuzzy numbers using the extension principle, first proposed by Zadeh [10]. In spite of all its usefulness, this principle has been regarded as a time-consuming and computationally expensive tool, due to the necessity of solving a nonlinear programming problem. To overcome this deficiency, many researchers have investigated this problem by viewing fuzzy numbers as collections of α-levels. One of the main applications of fuzzy number arithmetic is in linear systems whose parameters are all or partially represented by fuzzy numbers. A general model for solving an FLS whose coefficient matrix is crisp and whose right-hand side column is an arbitrary fuzzy number vector was first proposed by Friedman et al. [8]. They used the embedding method and replaced the original fuzzy linear system by a crisp linear system with a nonnegative coefficient matrix S, which may be singular even if A is nonsingular. Another class of methods for solving fuzzy linear systems is iterative methods. Allahviranloo [1] introduced the Jacobi method for solving FLS for the first time.

∗ Maryam Mohaghegh tabar([email protected]) Mohammad Keyanpour([email protected])


He also proposed the Gauss-Seidel and SOR methods in [1, 2]. Dehghan [4] extended the Adomian decomposition method for solving fully fuzzy linear systems. Dehghan et al. [5] also extended further iterative methods, such as EGS, AOR, ESOR, SSOR, USSOR, EMA and MSOR, to solve FLS. Dehghan [6] employed the Dubois and Prade [7] approximate arithmetic operators on the LR representation of fuzzy numbers to find a positive vector solution of a fully fuzzy linear system. Along with them, some authors who dealt with the α-cut representation proposed valuable methodologies to obtain a fuzzy solution of the FLS, but they notably ignored the requirement that the left and right cuts be non-decreasing and non-increasing, respectively. Failing to obtain a genuine fuzzy number is the important shortcoming of their results [3]. The fact is that by embedding the interval space into a space that has additive and multiplicative inverses, a Banach space, the characteristics of fuzzy numbers in the α-cut representation, as intervals, will change. Our focus is to use the Hukuhara difference to overcome this deficiency. Although the Hukuhara difference cannot be applied to any two intervals, when this difference exists we obtain a valid fuzzy number. To this end we propose a generalized fuzzy linear system based on the Hukuhara difference and then adapt the method proposed by Dehghan et al. [5] to find a fuzzy solution of the mentioned system. In Section 2, the basics of fuzzy set theory are discussed. Then we define the generalized fuzzy linear system and propose a numerical method to solve it in Section 3. Section 4 contains a numerical example.

2 Fuzzy Arithmetic

Fuzzy numbers are one way to describe the vagueness and lack of precision of data. The theory of fuzzy numbers is based on the theory of fuzzy sets, which was introduced in 1965 by Zadeh [10]. The concept of a fuzzy number was first used by Nahmias [9] and by Dubois and Prade [7] in the late 1970s. Definition of a fuzzy number [8]:

Definition 2.1. We represent an arbitrary fuzzy number by a pair of functions $[\underline{x}(\alpha), \overline{x}(\alpha)]$, $0 \le \alpha \le 1$, which satisfy the following requirements:

• $\underline{x}(\alpha)$ is a bounded left-continuous nondecreasing function over $[0,1]$;
• $\overline{x}(\alpha)$ is a bounded left-continuous nonincreasing function over $[0,1]$;
• $\underline{x}(\alpha) \le \overline{x}(\alpha)$, $0 \le \alpha \le 1$.

A crisp number $a$ is simply represented by $\underline{x}(\alpha) = \overline{x}(\alpha) = a$, $0 \le \alpha \le 1$. The set of all fuzzy numbers $[\underline{x}(\alpha), \overline{x}(\alpha)]$ becomes a convex cone, denoted by $E^1$, which is then embedded isomorphically and isometrically into a Banach space.

A popular fuzzy number is the trapezoidal fuzzy number $u = (x_0, y_0, \sigma, \beta)$ with two defuzzifiers $x_0, y_0$, left fuzziness $\sigma$ and right fuzziness $\beta$.

Its parametric form is


$$\underline{u}(\alpha) = x_0 - \sigma + \sigma\alpha, \qquad \overline{u}(\alpha) = y_0 + \beta - \beta\alpha. \qquad (1)$$

For arbitrary fuzzy numbers $x = [\underline{x}(\alpha), \overline{x}(\alpha)]$, $y = [\underline{y}(\alpha), \overline{y}(\alpha)]$ and a real number $k$, we may define the addition and scalar multiplication of fuzzy numbers using the Zadeh extension principle [10], which can be used to generalize [4] crisp mathematical concepts to fuzzy sets.

Definition 2.2 [11]. Let $A = [\underline{a}, \overline{a}]$ and $B = [\underline{b}, \overline{b}]$ be two crisp intervals. The H-difference is
$$A \ominus_H B = [\underline{a}, \overline{a}] \ominus_H [\underline{b}, \overline{b}] = [\underline{c}, \overline{c}] \iff \begin{cases} \underline{a} = \underline{b} + \underline{c}, \\ \overline{a} = \overline{b} + \overline{c}. \end{cases}$$

Note: Although the classic difference operator for intervals is not associative, it can easily be shown that the H-difference has this valuable property.
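As a concrete illustration of Definition 2.2, the following sketch (our own illustrative Python snippet, not part of the original paper; the function name hukuhara_diff is an assumption) computes the H-difference of two crisp intervals when it exists.

```python
# Illustrative sketch of the Hukuhara difference of two crisp intervals
# A = [a_lo, a_hi], B = [b_lo, b_hi]; A -_H B = C means A = B + C.

def hukuhara_diff(a_lo, a_hi, b_lo, b_hi):
    """Return C = (c_lo, c_hi) with A = B + C, or None if it does not exist."""
    c_lo = a_lo - b_lo
    c_hi = a_hi - b_hi
    # C must be a valid interval (c_lo <= c_hi) for the H-difference to exist.
    if c_lo > c_hi:
        return None
    return (c_lo, c_hi)

# Example: [1, 5] -_H [0, 2] = [1, 3], since [0, 2] + [1, 3] = [1, 5].
print(hukuhara_diff(1, 5, 0, 2))   # (1, 3)
print(hukuhara_diff(1, 2, 0, 5))   # None: no interval C with [0, 5] + C = [1, 2]
```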

3 Generalized Fuzzy Linear System (GFLS)

Consider a fuzzy linear system of equations:

$$\begin{aligned}
a_{11}x_1 \; o_{12} \; a_{12}x_2 \; o_{13} \; \cdots \; o_{1n} \; a_{1n}x_n &= b_1,\\
a_{21}x_1 \; o_{22} \; a_{22}x_2 \; o_{23} \; \cdots \; o_{2n} \; a_{2n}x_n &= b_2,\\
&\;\;\vdots\\
a_{n1}x_1 \; o_{n2} \; a_{n2}x_2 \; o_{n3} \; \cdots \; o_{nn} \; a_{nn}x_n &= b_n,
\end{aligned}$$

where the $a_{ij}$, $1 \le i, j \le n$, are crisp numbers, $o_{ij}$ denotes an operator that can be either addition or the Hukuhara difference, $b$ is a vector of fuzzy numbers, and the $x_j$ are fuzzy variables. This system is called a generalized fuzzy linear system (GFLS).

Definition 3.1. Corresponding to the GFLS, for all $1 \le i, j \le n$ we define a coefficient matrix $A = [a_{ij}]_{n\times n}$, an operator matrix $O = [o_{ij}]_{n\times n}$, where $o_{ij} \in \{+, -_H\}$, and a representation matrix $A_o = [a_1, o_2, a_2, o_3, \ldots, o_n, a_n]$, in which $a_j$ and $o_j$ are the columns of the matrices $A$ and $O$, respectively.

Corresponding to $A_o$, a crisp matrix $O = [o_{ij}]$ is defined as follows:
$$o_{ij} = \begin{cases} +, & \text{if } o_{ij} = +,\\ -, & \text{if } o_{ij} = -_H, \end{cases} \qquad 1 \le i, j \le n.$$


Consider the $2n \times 2n$ GFLS

$$\begin{aligned}
a_{11}\underline{x}_1(\alpha)\; o_{12}\; a_{12}\underline{x}_2(\alpha)\; o_{13}\; \cdots\; o_{1n}\; a_{1n}\underline{x}_n(\alpha) &= \underline{b}_1(\alpha),\\
&\;\;\vdots\\
a_{n1}\underline{x}_1(\alpha)\; o_{n2}\; a_{n2}\underline{x}_2(\alpha)\; o_{n3}\; \cdots\; o_{nn}\; a_{nn}\underline{x}_n(\alpha) &= \underline{b}_n(\alpha),\\
a_{11}\overline{x}_1(\alpha)\; o_{12}\; a_{12}\overline{x}_2(\alpha)\; o_{13}\; \cdots\; o_{1n}\; a_{1n}\overline{x}_n(\alpha) &= \overline{b}_1(\alpha),\\
&\;\;\vdots\\
a_{n1}\overline{x}_1(\alpha)\; o_{n2}\; a_{n2}\overline{x}_2(\alpha)\; o_{n3}\; \cdots\; o_{nn}\; a_{nn}\overline{x}_n(\alpha) &= \overline{b}_n(\alpha).
\end{aligned}$$

This system can be simplified as follows:

$$\sum_{j=1}^{n} d_{ij}\,\underline{x}_j(\alpha) + \sum_{j=1}^{n} d_{i(j+n)}\,\overline{x}_j(\alpha) = \underline{b}_i(\alpha), \qquad
\sum_{j=1}^{n} d_{(i+n)j}\,\underline{x}_j(\alpha) + \sum_{j=1}^{n} d_{(i+n)(j+n)}\,\overline{x}_j(\alpha) = \overline{b}_i(\alpha), \qquad 1 \le i \le n,$$

in which, for $1 \le i, j \le n$,
$$\begin{aligned}
&d_{ij} = d_{(i+n)(j+n)} = a_{ij}, \quad d_{i(j+n)} = d_{(i+n)j} = 0, && \text{if } o_{ij} = + \text{ and } a_{ij} > 0,\\
&d_{ij} = d_{(i+n)(j+n)} = 0, \quad d_{i(j+n)} = d_{(i+n)j} = a_{ij}, && \text{if } o_{ij} = + \text{ and } a_{ij} < 0,\\
&d_{ij} = d_{(i+n)(j+n)} = -a_{ij}, \quad d_{i(j+n)} = d_{(i+n)j} = 0, && \text{if } o_{ij} = -_H \text{ and } a_{ij} > 0,\\
&d_{ij} = d_{(i+n)(j+n)} = 0, \quad d_{i(j+n)} = d_{(i+n)j} = -a_{ij}, && \text{if } o_{ij} = -_H \text{ and } a_{ij} < 0.
\end{aligned}$$

The path we followed to reach $D$ is to adapt and generalize what Friedman proposed for solving a fuzzy linear system. Based on what Friedman and the many authors who followed him achieved, we employ α-cuts and L-R functions for solving the GFLS. Meanwhile, we are interested in a simpler representation of $D = [d_{ij}]_{2n\times 2n}$. To this end we define a binary matrix $P = [p_{ij}]_{n\times n}$, where


$$p_{ij} = \begin{cases} 1, & 1 \le i \le n,\; j = 1,\\ 1, & o_{ij} = +,\; 1 \le i \le n,\; 2 \le j \le n,\\ -1, & o_{ij} = -_H,\; 1 \le i \le n,\; 2 \le j \le n. \end{cases}$$

Proposition 3.1. Let $A$ be the coefficient matrix of a GFLS and $P$ its binary matrix. Then $D$ can be computed as
$$D = \begin{pmatrix} A \circ P \circ S(A) & A \circ P \circ S(-A) \\ A \circ P \circ S(-A) & A \circ P \circ S(A) \end{pmatrix},$$
where $\circ$ denotes the Hadamard product and $S$ is defined as

$$S: \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}, \quad A \mapsto S(A), \qquad (S(A))_{ij} = \begin{cases} 1, & a_{ij} \ge 0,\\ 0, & a_{ij} < 0. \end{cases}$$

4. Numerical Examples

Here an example illustrates the theoretical foundation of the previous section.

Let

$$\begin{cases} 2x_1 -_H 4x_2 = [2\alpha,\; 4 - 2\alpha],\\ 3x_2 = [-9 + 3\alpha,\; 3 - 9\alpha]. \end{cases}$$

Thus we have
$$A = \begin{pmatrix} 2 & 4 \\ 0 & 3 \end{pmatrix}, \qquad O = \begin{pmatrix} + & -_H \\ + & + \end{pmatrix}, \qquad b = \begin{pmatrix} [2\alpha,\; 4 - 2\alpha] \\ [-9 + 3\alpha,\; 3 - 9\alpha] \end{pmatrix},$$
$$P = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad S(A) = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad S(-A) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$

Then


$$D = \begin{pmatrix} 2 & -4 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & -4 \\ 0 & 0 & 0 & 3 \end{pmatrix}$$

And

$$x = D^{-1}b = \begin{pmatrix} 0.5 & 0.6667 & 0 & 0 \\ 0 & 0.3333 & 0 & 0 \\ 0 & 0 & 0.5 & 0.6667 \\ 0 & 0 & 0 & 0.3333 \end{pmatrix} \begin{pmatrix} 2\alpha \\ -9 + 3\alpha \\ 4 - 2\alpha \\ 3 - 9\alpha \end{pmatrix} = \begin{pmatrix} -6 + 3\alpha \\ -3 + \alpha \\ 4 - 7\alpha \\ 1 - 3\alpha \end{pmatrix}$$

Therefore

$x_1 = [-6 + 3\alpha,\; 4 - 7\alpha]$ and $x_2 = [-3 + \alpha,\; 1 - 3\alpha]$ are the solutions, and they are obviously exact.
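As a quick sanity check (our own illustrative snippet, not part of the paper), one can verify at a few α-levels that the obtained α-cuts satisfy the original system $2x_1 -_H 4x_2 = b_1$, $3x_2 = b_2$ using interval arithmetic:

```python
import numpy as np

for a in np.linspace(0.0, 1.0, 5):
    x1 = (-6 + 3*a, 4 - 7*a)          # alpha-cut of x1
    x2 = (-3 + a, 1 - 3*a)            # alpha-cut of x2
    t1 = (2*x1[0], 2*x1[1])           # 2*x1 (positive scalar keeps endpoint order)
    t2 = (4*x2[0], 4*x2[1])           # 4*x2
    lhs1 = (t1[0] - t2[0], t1[1] - t2[1])   # Hukuhara difference 2*x1 -_H 4*x2
    lhs2 = (3*x2[0], 3*x2[1])
    assert np.allclose(lhs1, (2*a, 4 - 2*a))
    assert np.allclose(lhs2, (-9 + 3*a, 3 - 9*a))
print("solution satisfies the system at all tested alpha-levels")
```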

5 Conclusion

In this paper the Hukuhara difference is applied to construct a generalized fuzzy linear system. We proposed a numerical method to find a fuzzy solution of the mentioned system by modifying an existing approach. The advantage of our approach is that it obtains a fuzzy solution in the α-cut representation that completely satisfies the system.

References

[1] T. Allahviranloo, Numerical methods for fuzzy system of linear equations, Appl. Math. Comput. 155 (2004), 493-502.
[2] T. Allahviranloo, Successive overrelaxation iterative method for fuzzy system of linear equations, Appl. Math. Comput. 162 (2005), 189-196.
[3] T. Allahviranloo, M. Ghanbari, A.A. Hosseinzadeh, E. Haghi, and R. Nuraei, A note on fuzzy linear systems, Fuzzy Sets and Systems 177 (2011), 87-92.
[4] S.L. Chang, L.A. Zadeh, On fuzzy mapping and control, IEEE Trans. Syst. Man Cyb. 2 (1972), 30-34.
[5] M. Dehghan, B. Hashemi, Solution of the fully fuzzy linear systems using the decomposition procedure, Applied Mathematics and Computation 182 (2006), 1568-1580.
[6] M. Dehghan, B. Hashemi, Iterative solution of fuzzy linear systems, Applied Mathematics and Computation 175 (2006), 645-674.
[7] M. Dehghan, B. Hashemi, M. Ghatee, Solution of the fully fuzzy linear systems using iterative techniques, Chaos, Solitons and Fractals 34 (2007), 316-336.
[8] D. Dubois and H. Prade, Operations on fuzzy numbers, J. Systems Sci. 9 (1978), 613-626.
[9] L.A. Zadeh, Fuzzy sets, Information and Control 8 (1965), 338-353.
[10] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Inform. Sci. 8 (1975), 199-249.


On the Grey Dynamics of Type 2 Fuzzy Neural Hybrid Force Control

Farnaz Sabahi

Engineering Faculty, Urmia University, Urmia, Iran

[email protected]

Abstract The problem of uncertainty in robot manipulator dynamics in contact with an environment is considered using hybrid force control and a neuro-fuzzy approach. Control of an industrial robot is mainly a problem of dynamics; it includes nonlinearity, uncertainties and external perturbations that should be considered in the design of control laws. For the first time, we formulate a type-2 neuro-fuzzy system based on the extended back-propagation (EBP) learning algorithm to adjust the parameters online, with no initial offline training, while using the force error as the objective function. In the proposed method, a neural system is used as an approximate model of the uncertain parts of the robot dynamics, while we assume from first-principles knowledge that some parts of the robot dynamics are known. In addition, a self-tuning type-2 fuzzy system is adopted to implement online compensation of the static error caused by the PD controller, based on the fuzzy rule set, to improve the control performance. The proposed controller guarantees closed-loop stability for arbitrary initial values of the states and any unknown-but-bounded disturbances. Simulation results show the applicability and adaptability of the method to hybrid force control, and the method is more accurate than alternative approaches.

Keywords: Fuzzy system, Neural system, Robot, Uncertainty.

1. INTRODUCTION Manipulators with end effectors such as robotic "hands" are attracting increasing attention. Accordingly, there has been a need for an efficient means to control the workspace of manipulators. The evolution in the control area has been fuelled by two major concerns: dealing with complex systems and accommodating less precise design requirements. This demands new techniques in robotics and control paradigms. The use of fuzzy systems and neural networks in control systems can be seen as a natural step in dealing with these difficulties.

Force control has been known to be one of the more complicated control algorithms for a manipulator interacting with its environment. The most common method of force control is hybrid position/force control [1]. In this paradigm, the desired force should be maintained and the desired trajectories should be followed. The improvement of controller robustness using neural networks (NN) [2] and fuzzy logic [3] has been shown before. However, most methods in this area focus on offline training and/or assume that all parts of the robot dynamics are known. Besides that, most studies use the standard error back-propagation learning algorithm, which suffers from problems such as local minima and low learning speed [4], so here we use the extended error back-propagation learning algorithm to obtain better results.


In fact, the combination of neural and fuzzy systems with extended back-propagation is a new paradigm that this paper considers within the hybrid force/position framework.

2. METHODOLOGY

We analyse uncertainty in the robot dynamics arising from friction, centrifugal force, Coriolis force, and gravity. Consider the robot dynamics

$$M(x)\ddot{x} + C(x, \dot{x})\dot{x} + G(x) = F + F_e \qquad (1)$$

We assume the manipulator moves in a singularity-free region of the workspace, so the vector of operational variables x constitutes a set of Lagrangian generalized coordinates and M has the meaning of a true inertia matrix. The uncertainties concerning joint friction, joint and link flexibility, and the Coriolis, centrifugal and gravity terms (C, G) in the plant are modelled by a neural network, while the controller is a fuzzy controller.

Using the back-propagation algorithm is the common approach for learning. However, some problems are associated with the standard back-propagation learning algorithm: the learning speed is slow and there exist local minima. To alleviate these two problems at the same time, Chen and Nutter [4] extended the back-propagation learning algorithm, and we use their algorithm here. In this algorithm, in addition to updating the weights over all training patterns, other free parameters, including the upper and lower bounds of the output function as well as the slope of the function, are also updated.
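A minimal sketch of the idea behind such an extended back-propagation update (our own simplified single-neuron illustration under assumed notation, not the authors' implementation): besides the weights, the lower bound, upper bound and slope of the output activation are also adjusted by gradient descent.

```python
import numpy as np

def ebp_step(w, lo, hi, slope, x, target, lr=0.01):
    """One extended back-propagation step for a single neuron (illustrative).

    Output activation: y = lo + (hi - lo) / (1 + exp(-slope * net)), net = w . x.
    Besides w, the activation parameters lo, hi and slope are also updated,
    which is the key idea of extended back-propagation."""
    net = np.dot(w, x)
    s = 1.0 / (1.0 + np.exp(-slope * net))      # standard logistic core
    y = lo + (hi - lo) * s
    e = y - target                              # error; loss = 0.5 * e**2

    # Partial derivatives of the loss with respect to each free parameter
    ds_dnet = slope * s * (1.0 - s)
    grad_w = e * (hi - lo) * ds_dnet * x
    grad_lo = e * (1.0 - s)
    grad_hi = e * s
    grad_slope = e * (hi - lo) * s * (1.0 - s) * net

    return (w - lr * grad_w, lo - lr * grad_lo,
            hi - lr * grad_hi, slope - lr * grad_slope)

# Example usage with arbitrary values
w, lo, hi, slope = ebp_step(np.array([0.5, -0.2]), 0.0, 1.0, 1.0,
                            np.array([1.0, 2.0]), target=0.7)
```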

Furthermore, since the stability of the system is more important than satisfying its performance requirements, we also prove the system's stability by the following theorem.

Theorem 1: Suppose the following assumptions hold:

1) There exist positive definite matrices P, X and Q satisfying the Lyapunov-type matrix equality for the linear part of the model.

2) There is a diagonal matrix Γ whose elements are non-negative.

3) There is a positive definite block-diagonal matrix z, defined with respect to x and ẋ, which is used in the subsequent matrix multiplication.

4) The states are finite, i.e., bounded.

Then, under the control law, all signals in the closed loop are bounded and the system is stable.


Proof:¹

The idea for hybrid position/force control is shown in Figure 1.

Figure 1: The proposed Hybrid position/force controller.

3. RESULTS OF SIMULATION

In the simulation, the robot is supposed to draw lines on an elastic surface. During execution, an unexpected unit-step disturbance occurs. The performance of the proposed approach is tested by tracking a trajectory on a 45° tilted surface. The reference force is a constant force (50 N). Figure 2 shows the position error of the system, and Figure 3 shows the force error. As can be seen, the control scheme yields a small tracking error.

¹ Due to the page limit, the proof is omitted.


Figure 2. Position error.

Figure 3. Force error.

4. CONCLUSION


In this paper, hybrid position/force control of a robot manipulator with uncertain dynamics has been analyzed. The proposed type-2 fuzzy-neural controller provides robustness under unknown parts of the robot dynamics, by compensating the uncertainties in the robot's dynamics with a neural network and applying fuzzy controllers. Simulation studies have been carried out to confirm the analytical discussion in the previous sections.

5. REFERENCES

[1] Li, Z., Cao, X., Tang, Y., Li, R., and Ye W., “Bilateral teleoperation of holonomic constrained robotic systems with time-varying delays”, IEEE Trans on Instrumentation and Measurement, 2013, pp. 752 - 765.

[2] S. Jung and T.C. Hsia, “Neural network impedance force control of robot manipulator,” IEEE Trans. on Industrial Electronics , Vol. 45, No. 3, 1998, pp. 451-461

[3] J. Lin, C.C. Lin, and H.-S. Lo, "Hybrid position/force control of robot manipulators mounted on oscillatory bases using adaptive fuzzy control," International Symposium on Intelligent Control, pp. 467-492, 2010.

[4] L. Chen and S. Roy, "An extended back-propagation learning by using heterogeneous processing units," IEEE Trans. Neural Networks, pp. 988-999, 1992.


TWO METHODS FOR DEFUZZIFICATION OF NUISANCE PARAMETER

Ahmad Hozhabr and Adel Mohammadpour

Department of Statistics, Amirkabir University of Technology (Tehran Polytechnic)

Abstract:

The present study addresses the notion of estimation with a nuisance parameter. Two methods are suggested for using fuzzy knowledge about the nuisance parameter. In the first method, we calculate a defuzzified value of the parameter and substitute it for the parameter. In the second method, we defuzzify the cumulative distribution function (cdf) of the random variable X, viewed as a function of the fuzzy parameter. Finally, we compare these two methods.

1. Introduction There are two main approaches to estimating a parameter θ: the classical and the Bayesian approach. The classical approach is applied when the observations and parameters are crisp, whereas the Bayesian approach is applied when the observations are crisp and the parameters are crisp random variables.

There is also a third approach to some practical problems. In this approach, similar to the above two approaches, observations are crisp, but there may be some fuzzy knowledge about an unknown but fixed parameter that can be expressed via a known membership function, denoted by m(.). This fuzzy knowledge comes from restrictions on the parameter or from our experience. For simplicity, we call such a parameter the fuzzy parameter. Ralescu [6] uses a similar terminology for parameters in a binomial distribution, and Haekwan & Tanaka [3] introduced regression with fuzzy parameters.

In order to formulate the problem, we assume that the observation $\mathbf{x} = (x_1, \ldots, x_n)'$ has cdf $F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$, where θ is a crisp parameter and ν is a nuisance fuzzy parameter, and we wish to estimate the crisp parameter θ.

The main question now is how to get rid of ν. In the Bayesian approach, where we have a prior distribution for ν, we simply obtain the marginal distribution $F_{X|\theta}(\mathbf{x}|\theta)$ and then estimate θ. In the fuzzy approach, however, we have a

membership function $m(\nu)$ satisfying $\int_S m(\nu)\,d\nu = 1$, where $S = \{\nu : m(\nu) > 0\}$. As mentioned above, there are two methods of using the

knowledge $m(\nu)$ to defuzzify $F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$. In the first method we calculate a representative value of ν, such as its mean $\bar\nu$ or median $\tilde\nu$, and substitute it for ν in $F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$. In the second method we consider $g(\nu) = F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$ as a function of the fuzzy parameter ν and calculate the mean or median of $g(\nu)$ to defuzzify $g(\nu)$.

Sections 2 and 3 explain these two methods, and Section 4 compares them and concludes. Henceforth, we assume that X is a continuous crisp random variable and that the fuzzy parameter ν is real and continuous with membership function $m(\nu)$.

2. First Method We consider the cdf of X as a function of the fuzzy parameter ν and defuzzify it by substituting $\bar\nu$ or $\tilde\nu$ for ν in $F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$.

Definition 1. Let ν be a fuzzy parameter with membership function $m(\nu)$. Then the mean $\bar\nu$ and the median $\tilde\nu$ of ν are defined, respectively, by
$$\bar\nu = \int_S \nu\, m(\nu)\, d\nu, \qquad \int_{-\infty}^{\tilde\nu} m(\nu)\, d\nu = \int_{\tilde\nu}^{\infty} m(\nu)\, d\nu = \frac{1}{2}.$$


Definition 2. Let X be a random variable with cdf $F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$, where θ is a crisp and ν is a fuzzy parameter. Then the defuzzified cdfs based on the defuzzified parameter mean $\bar\nu$ and median $\tilde\nu$ are, respectively,
$$F_{X|\nu,\theta}(\mathbf{x}|\bar\nu,\theta), \qquad F_{X|\nu,\theta}(\mathbf{x}|\tilde\nu,\theta).$$

Example 1. Let X have cdf $F_{X|\nu}(x|\nu) = 1 - e^{-\nu x}$, $\nu > 0$, $x > 0$, where ν is a fuzzy parameter with membership function $m(\nu) = e^{-\nu}$, $\nu > 0$. Then the defuzzified distributions corresponding to $\bar\nu$ and $\tilde\nu$ are calculated as follows:
$$\bar\nu = \int_S \nu\, m(\nu)\, d\nu = \int_0^{\infty} \nu e^{-\nu}\, d\nu = 1, \qquad F_{X|1}(x|1) = 1 - e^{-x},$$
$$\int_0^{\tilde\nu} m(\nu)\, d\nu = \int_0^{\tilde\nu} e^{-\nu}\, d\nu = \frac{1}{2} \;\Rightarrow\; \tilde\nu = \ln 2, \qquad F_{X|\ln 2}(x|\ln 2) = 1 - e^{-x \ln 2} = 1 - \left(\tfrac{1}{2}\right)^{x}.$$

Example 2. Let X have a cdf similar to that of Example 1 and
$$m(\nu) = \begin{cases} 2\nu, & 0 \le \nu < \tfrac{1}{2},\\[2pt] -\tfrac{2}{3}\nu + \tfrac{4}{3}, & \tfrac{1}{2} \le \nu < 2,\\[2pt] 0, & \text{otherwise.} \end{cases}$$
Then we can obtain
$$\bar\nu = \int \nu\, m(\nu)\, d\nu = \int_0^{1/2} 2\nu^2\, d\nu + \int_{1/2}^{2} \nu\left(-\tfrac{2}{3}\nu + \tfrac{4}{3}\right) d\nu = \tfrac{5}{6}, \qquad F_{X|5/6}\!\left(x\,\middle|\,\tfrac{5}{6}\right) = 1 - e^{-\frac{5}{6}x},$$
$$\int_0^{\tilde\nu} m(\nu)\, d\nu = \frac{1}{2} \;\Rightarrow\; \tilde\nu = 0.7753, \qquad F_{X|0.7753}(x|0.7753) = 1 - e^{-0.7753\,x}.$$

3. Second Method In this method we calculate the defuzzified distribution function by computing the mean or median of $g(\nu) = F_{X|\nu,\theta}(\mathbf{x}|\nu,\theta)$ as a fuzzy random variable. In order to do this, we must be able to calculate the membership function of $g(\nu)$, which is done using the Extension Principle [6]. But there is a problem: the mean or median of $g(\nu)$ obtained from the membership function of $g(\nu)$, when used as a defuzzified distribution function, does not preserve the properties of a distribution. This problem can be solved by using an increasing membership function, such as the cumulative membership function, say $M(\nu)$, and then using the Extension Principle to calculate the cumulative membership function of $g(\nu)$.

It will be shown that the defuzzified distribution function obtained in this way has all the properties of a distribution function.

Definition 3 (cumulative membership function). Let ν be a fuzzy parameter with membership function $m(\nu)$. The cumulative membership function (c.m.f.), denoted by $M(\nu)$, is defined as
$$M(\nu) = \int_{-\infty}^{\nu} m(t)\, dt.$$

Definition 4. Let ν be a fuzzy parameter with membership function $m(\nu)$ and continuous cumulative membership function $M(\nu)$, and let $g(\nu) = F_{X|\nu}(\mathbf{x}|\nu)$ be a fuzzy distribution. Then the defuzzified distribution functions of $g(\nu)$ obtained from its mean and its median are denoted by $\bar F$ and $\tilde F$, and are defined as follows:


$$\bar F = \int g(\nu)\, m(g(\nu))\, dg(\nu), \qquad \int_0^{\tilde F} m(g(\nu))\, dg(\nu) = \frac{1}{2},$$
where $m(g(\nu))$ is the membership function of $g(\nu)$, obtained by the Extension Principle.

Example 4 (continued). In Example 1 we have
$$m(\nu) = e^{-\nu} \;\Rightarrow\; M(\nu) = \int_0^{\nu} e^{-u}\, du = 1 - e^{-\nu}, \quad \nu > 0, \qquad F_{X|\nu}(x|\nu) = 1 - e^{-\nu x},$$
and by the use of the Extension Principle for $M(\nu)$,
$$M_{g(\nu)}(y) = M\!\left(-\tfrac{1}{x}\ln(1-y)\right) = 1 - e^{\frac{1}{x}\ln(1-y)} = 1 - (1-y)^{\frac{1}{x}}, \quad 0 < y < 1.$$
Therefore, the membership function of $y = g(\nu)$ is
$$m(y) = \frac{\partial}{\partial y} M(y) = \frac{1}{x}(1-y)^{\frac{1}{x}-1}, \quad 0 < y < 1.$$
With this membership function we obtain
$$\bar F = \int_0^1 y\, m(y)\, dy = \int_0^1 \frac{y}{x}(1-y)^{\frac{1}{x}-1}\, dy = 1 - \frac{1}{1+x},$$
$$\int_0^{\tilde F} m(y)\, dy = \frac{1}{2} \;\Rightarrow\; \tilde F = 1 - \left(\tfrac{1}{2}\right)^{x}.$$

We can see from Example 1 that the defuzzified value $F_{X|\tilde\nu}(x|\tilde\nu)$ and $\tilde F$ derived from the two methods are the same. This always holds when the conditions of the following theorem are satisfied [5]:

Theorem 1. If $L(\nu) = F_{X|\nu}(\mathbf{x}|\nu)$ is a monotone function with respect to ν and V has a unique median $F_V^{-1}\!\left(\tfrac{1}{2}\right)$, then
$$\tilde F_X(\mathbf{x}) = L\!\left(F_V^{-1}\!\left(\tfrac{1}{2}\right)\right).$$

Proof: Let
$$L^-(u) = \begin{cases} \inf\{\nu;\; L(\nu) \ge u\}, & \text{if } L \text{ is a nondecreasing function},\\ \inf\{\nu;\; L(\nu) \le u\}, & \text{if } L \text{ is a nonincreasing function}, \end{cases}$$
be the generalized inverse of L, e.g. [8] page 39. Noting that
$$\{(u,\nu): u \le L(\nu)\} = \begin{cases} \{(u,\nu): L^-(u) \le \nu\}, & \text{if } L \text{ is a nondecreasing function},\\ \{(u,\nu): L^-(u) \ge \nu\}, & \text{if } L \text{ is a nonincreasing function}, \end{cases}$$
and by the definition we have
$$P\!\left(L(V) \le \tilde F_X(\mathbf{x})\right) = \tfrac{1}{2}$$
$$\iff \begin{cases} P\!\left(V \le L^-(\tilde F_X(\mathbf{x}))\right) = \tfrac{1}{2}, & \text{if } L \text{ is nondecreasing},\\ P\!\left(V \ge L^-(\tilde F_X(\mathbf{x}))\right) = \tfrac{1}{2}, & \text{if } L \text{ is nonincreasing}, \end{cases}$$
$$\iff \begin{cases} F_V\!\left(L^-(\tilde F_X(\mathbf{x}))\right) = \tfrac{1}{2}, & \text{if } L \text{ is nondecreasing},\\ 1 - F_V\!\left(L^-(\tilde F_X(\mathbf{x}))\right) = \tfrac{1}{2}, & \text{if } L \text{ is nonincreasing}, \end{cases}$$
$$\iff F_V\!\left(L^-(\tilde F_X(\mathbf{x}))\right) = \tfrac{1}{2}$$
$$\iff L^-(\tilde F_X(\mathbf{x})) = F_V^{-1}\!\left(\tfrac{1}{2}\right) \quad \text{by uniqueness of the median of } V$$
$$\iff \tilde F_X(\mathbf{x}) = L\!\left(F_V^{-1}\!\left(\tfrac{1}{2}\right)\right),$$
where the last equivalence follows from $\{(u,\nu): L^-(u) = \nu\} \subseteq \{(u,\nu): L(\nu) = u\}$. ∎

As mentioned above, $\tilde F_X(\mathbf{x})$ has all the properties of a cdf. This is shown in the following theorem [5]:

Theorem 2. Let X have cdf $F_{X|\nu}(\mathbf{x}|\nu)$ depending on a fuzzy parameter V with membership function $f_V(\nu)$, and let the real fuzzy variable $T = F_{X|V}(\mathbf{x}|V)$ have a unique median for each fixed $\mathbf{x}$. Then:

1. $\tilde F_X(\mathbf{x})$ is an increasing function in each of its arguments.

2. If $F_{X|V}(\mathbf{x}|\nu)$ and $f_V(\nu)$ are continuous, then $\tilde F_X(\mathbf{x})$ is a continuous function in each of its arguments.

3. $0 < \tilde F_X(\mathbf{x}) \le 1$.

Proof: 1. Let $\mathbf{y} = (y_1, \ldots, y_n)'$, $\mathbf{z} = (z_1, \ldots, z_n)'$, with $y_j < z_j$ for a fixed $j$ and $y_i = z_i$ for $i \ne j$, $1 \le i, j \le n$, and take
$$k_{\mathbf y} = \tilde F_X(\mathbf{y}), \quad k_{\mathbf z} = \tilde F_X(\mathbf{z}), \quad Y = F_{X|V}(\mathbf{y}|V), \quad Z = F_{X|V}(\mathbf{z}|V).$$
Then, using (2), we have
$$P(Y \le k_{\mathbf y}) = P(Z \le k_{\mathbf z}) = \tfrac{1}{2}.$$
We also have $Y \le Z$, because $F_{X|V}$ is a nondecreasing function in each of its arguments. Therefore,
$$P(Y \le k_{\mathbf y}) = P(Z \le k_{\mathbf z}) \le P(Y \le k_{\mathbf z}),$$
$k_{\mathbf y}$ is the unique median of $Y$, and so $k_{\mathbf y} \le k_{\mathbf z}$, or equivalently, $\tilde F_X(\mathbf{x})$ is nondecreasing in its $j$-th argument.

2. Let $\mathbf{x} = (x_1, \ldots, x_{j-1}, x_j, x_{j+1}, \ldots, x_n)'$ and $\mathbf{t} = (x_1, \ldots, x_{j-1}, t, x_{j+1}, \ldots, x_n)'$. By part 1, $\tilde F_X(\mathbf{x})$ is a nondecreasing function in each of its arguments. Therefore,
$$\tilde F_X(\mathbf{x}^-) = \lim_{t \uparrow x_j} \tilde F_X(\mathbf{t}) \quad \text{and} \quad \tilde F_X(\mathbf{x}^+) = \lim_{t \downarrow x_j} \tilde F_X(\mathbf{t})$$
exist and are finite, e.g. [9]. Further, $F_{X|V}(\mathbf{x}|\nu)$ is continuous with respect to $x_j$, and so
$$P(F_{X|V}(\mathbf{x}^-|V) \le \tilde F_X(\mathbf{x}^-)) = P(F_{X|V}(\mathbf{x}|V) \le \tilde F_X(\mathbf{x}^-)),$$
$$P(F_{X|V}(\mathbf{x}^+|V) \le \tilde F_X(\mathbf{x}^+)) = P(F_{X|V}(\mathbf{x}|V) \le \tilde F_X(\mathbf{x}^+)),$$
and by (2) we have
$$P(F_{X|V}(\mathbf{x}|V) \le \tilde F_X(\mathbf{x}^-)) = P(F_{X|V}(\mathbf{x}|V) \le \tilde F_X(\mathbf{x})) = P(F_{X|V}(\mathbf{x}|V) \le \tilde F_X(\mathbf{x}^+)). \qquad (3)$$
But $\tilde F_X(\mathbf{x})$ is the unique median of $F_{X|V}(\mathbf{x}|V)$; therefore, by (3),
$$\tilde F_X(\mathbf{x}^-) = \tilde F_X(\mathbf{x}) = \tilde F_X(\mathbf{x}^+),$$
and thus $\tilde F_X(\mathbf{x})$ is continuous.

3. $\tilde F_X(\mathbf{x})$ is the median of the random variable $T = F_{X|V}(\mathbf{x}|V)$ with $0 \le T \le 1$, and so $0 \le \tilde F_X(\mathbf{x}) \le 1$. ∎

4. Applications and Conclusions


As mentioned in Section 3, the defuzzified values $F_{X|\nu}(\mathbf{x}|\tilde\nu)$ and $\tilde F$ derived from the two methods are the same when $F_{X|\nu}(\mathbf{x}|\nu)$ is a monotone function of ν and V has a unique median $F_V^{-1}\!\left(\tfrac12\right)$, and so we can easily obtain $\tilde F_X(\mathbf{x}) = F_{X|\nu}\!\left(\mathbf{x}\,\middle|\, M^{-1}\!\left(\tfrac12\right)\right)$. Thus, for location families, scale families and families with MLR (monotone likelihood ratio), where the distribution is monotone with respect to the parameter, the median-based method can easily be used. We have shown in this work that fuzzy knowledge about a nuisance parameter can be used in classical statistics for estimation. We introduced two estimation methods based on a definition of the distribution function of a random variable with a fuzzy parameter and on the estimation of a fuzzy parameter. These methods enable us to study some problems that cannot be studied in classical statistics by a parametric method. They can help with some open problems in hypothesis testing, such as testing the mean of an exchangeable normal population [7], the parametric solution for testing the independence assumption versus the exchangeability assumption for the normal distribution [1], time series [2] (for checking error terms), and quality control [2]. Note that the non-parametric tests for these problems are not robust and cannot be used in many real problems [2,4].

REFERENCES

[1] Arnold, S. F. (1979). Linear models with exchangeably distributed errors. Journal of the American Statistical Association, 74, 194-199.

[2] Dufour, J. M, & Roy, R. (1986). L'echangeabilite en séries chronologiques: quelques résultats exacts sur les autocorrelations et les statistiques Portemanteau. Cahiers du C.E.R.O., 28, 19-39.

[3] Haekwan, L. & Tanaka, H. (1999). Fuzzy approximations with non-symmetric fuzzy parameters in fuzzy regression analysis. J. Oper. Res. Soc. Japan, 42, 98-112.

[4] Mohammadpour, A. & Behboodian, J. (2000). Hypothesis testing for an exchangeable normal distribution. J. Sci. I. R. Iran, 11, 131-141.

[5] Mohammadpour, A. & Mohammad-Djafari, A. (2006). Inference with the median of a prior. Entropy, 8(2), 67-87.
[6] Ralescu, D. (1995). Statistical decision analysis with fuzzy information. Manuscript.
[7] Rao, C. R. (1973). Linear statistical inference and its applications, 2nd ed. New York: Wiley.
[8] Robert, C. P. & Casella, G. (2004). Monte Carlo statistical methods, 2nd ed. Springer, New York.
[9] Rohatgi, V. K. (1976). An Introduction to Probability Theory and Mathematical Statistics. Wiley, New York.


Fuzzy Linear System of the Form $Ax + b_A = Bx + b_B$ and Hukuhara Difference

Mohamad Keyanpour1 Maryam Mohagheghtabar2

1Department of Mathematics, Faculty of Sciences, University of Guilan, Rasht, Iran, P.O.Box 41335-1914.

2Faculty of Sciences, Islamic Azad University of Fouman and Shaft.

[email protected]

Abstract

In this paper, the α-cut representation and the generalized Hukuhara difference are employed to construct the fuzzy linear system $Ax + b_A = Bx + b_B$. The foundation of the paper is the absence of an inverse operator in fuzzy arithmetic. This paper aims at finding a unique fuzzy solution for the mentioned system. The effectiveness of the proposed methodology is illustrated by examples.

Keywords: Fuzzy linear programming, Hukuhara difference, inverse operator, fuzzy solution

1. Introduction

Fuzzy numbers and fuzzy arithmetic operations were first introduced by Zadeh [9] and by Dubois and Prade [4]. Fuzzy systems are used to study a variety of problems, such as fuzzy linear systems and fuzzy optimization. Fuzzy linear systems are one of the main applications of fuzzy number arithmetic. Economics, engineering and physics are instances of sciences dealing with fuzzy linear systems. In many applications, at least one of the elements of a system needs to be represented by fuzzy rather than crisp numbers. To this end, numerical procedures have been developed in the area of fuzzy linear systems. The existence of a minimal solution of the general dual fuzzy linear equation system of the form $Ax + b_A = Bx + b_B$ was investigated in [3], where A and B are real n × n matrices, the unknown vector x is a vector consisting of n fuzzy numbers, and the right-hand sides $b_A$ and $b_B$ are vectors consisting of m fuzzy numbers. Recently, Muzzioli et al. [6] considered fully fuzzy linear systems of the form $Ax + b_A = Bx + b_B$, with A and B square matrices of fuzzy coefficients and $b_A$ and $b_B$ fuzzy number vectors. In this paper, we find the solution of a fuzzy linear system of the form $Ax + b_A = Bx + b_B$, in which $b_A$ and $b_B$ are fuzzy number vectors and the unknown vector x consists of n fuzzy numbers. In Section 2, the basics of fuzzy set theory are discussed. Then we use the Hukuhara difference to construct a fuzzy linear system of the form $Ax + b_A = Bx + b_B$ and propose a numerical method to solve it in Sections 3 and 4. Section 5 contains a numerical example.

2. Fuzzy Arithmetic

Fuzzy numbers are one way to describe the vagueness and lack of precision of data. The theory of fuzzy numbers is based on the theory of fuzzy sets, which was introduced in 1965 by Zadeh [10]. The concept of a fuzzy number was first used by Nahmias [9] and by Dubois and Prade [7] in the late 1970s. Definition of a fuzzy number [8]:

Definition 2.1. We represent an arbitrary fuzzy number by a pair of functions $[\underline{x}(\alpha), \overline{x}(\alpha)]$, $0 \le \alpha \le 1$, which satisfy the following requirements:

• $\underline{x}(\alpha)$ is a bounded left-continuous nondecreasing function over $[0,1]$;

• $\overline{x}(\alpha)$ is a bounded left-continuous nonincreasing function over $[0,1]$;

• $\underline{x}(\alpha) \le \overline{x}(\alpha)$, $0 \le \alpha \le 1$.

A crisp number $a$ is simply represented by $\underline{x}(\alpha) = \overline{x}(\alpha) = a$, $0 \le \alpha \le 1$. The set of all fuzzy numbers $[\underline{x}(\alpha), \overline{x}(\alpha)]$ becomes a convex cone, denoted by $E^1$, which is then embedded isomorphically and isometrically into a Banach space.

3. Preliminary

Usually there is no inverse element for an arbitrary non-crisp fuzzy number $u \in E^1$; i.e., there exists no element $v \in E^1$ such that
$$u + v = 0.$$
Actually, for every non-crisp fuzzy number $u \in E^1$ we have
$$u + (-u) \ne 0.$$
Therefore, the fuzzy linear equation system
$$Ax + b_A = Bx + b_B$$
cannot be equivalently replaced by the fuzzy linear equation system
$$(A - B)x = b_B - b_A,$$
which had been investigated previously. In the sequel, we consider the fuzzy number system
$$Ax + b_A = Bx + b_B,$$
where $A = [a_{ij}]_{n\times n}$ and $B = [b_{ij}]_{n\times n}$ are crisp coefficient matrices and $b_A$, $b_B$ are fuzzy number vectors. We consider the fuzzy number vectors $b_A$, $b_B$ in parametric form, as ordered pairs of functions:
$$b_A = (\underline{b}_A, \overline{b}_A), \qquad b_B = (\underline{b}_B, \overline{b}_B).$$

Definition 3.1. Let $A = [\underline{a}, \overline{a}]$ and $B = [\underline{b}, \overline{b}]$ be two crisp intervals. The H-difference is
$$A \ominus_H B = [\underline{a}, \overline{a}] \ominus_H [\underline{b}, \overline{b}] = [\underline{c}, \overline{c}] \iff \begin{cases} \underline{a} = \underline{b} + \underline{c}, \\ \overline{a} = \overline{b} + \overline{c}. \end{cases}$$


Note: Although the classic difference operator for intervals is not associative, it can easily be shown that the H-difference has this valuable property.

4. Generalized Fuzzy Linear System (GFLS)

In the presence of the Hukuhara difference and based on Theorem 3.1, the equality
$$A_1 x -_H b_1 = A_2 x -_H b_2$$
can be equivalently replaced by
$$A_1 x -_H A_2 x = b_2 -_H b_1.$$

Definition 4.1. Corresponding to the GFLS, for all $1 \le i, j \le n$ we define two coefficient matrices $A_1 = [a^1_{ij}]_{n\times n}$ and $A_2 = [a^2_{ij}]_{n\times n}$, operator matrices $O_1 = [o^1_{ij}]_{n\times n}$ and $O_2 = [o^2_{ij}]_{n\times n}$, where $o^k_{ij} \in \{+, -_H\}$, $k = 1, 2$, and two representation matrices $A_{o,1} = [a_{1,1}, o_{1,2}, a_{1,2}, o_{1,3}, \ldots, o_{1,n}, a_{1,n}]$ and $A_{o,2} = [a_{2,1}, o_{2,2}, a_{2,2}, o_{2,3}, \ldots, o_{2,n}, a_{2,n}]$, in which $a_{k,j}$ and $o_{k,j}$, $k = 1, 2$, are the columns of the matrices $A_1, A_2$ and $O_1, O_2$, respectively.

Corresponding to $A_{o,k}$, two crisp matrices $O^k = [o^k_{ij}]$ are defined as follows:
$$o^k_{ij} = \begin{cases} +, & \text{if } o^k_{ij} = +,\\ -, & \text{if } o^k_{ij} = -_H, \end{cases} \qquad 1 \le i, j \le n, \; k = 1, 2.$$

Consider the $2n \times 2n$ GFLS

$$\begin{aligned}
(a^1_{11} - a^2_{11})\underline{x}_1(\alpha)\; o_{12}\; (a^1_{12} - a^2_{12})\underline{x}_2(\alpha)\; \cdots\; o_{1n}\; (a^1_{1n} - a^2_{1n})\underline{x}_n(\alpha) &= \underline{b}^2_1(\alpha) - \underline{b}^1_1(\alpha),\\
&\;\;\vdots\\
(a^1_{n1} - a^2_{n1})\underline{x}_1(\alpha)\; o_{n2}\; \cdots\; o_{nn}\; (a^1_{nn} - a^2_{nn})\underline{x}_n(\alpha) &= \underline{b}^2_n(\alpha) - \underline{b}^1_n(\alpha),\\
(a^1_{11} - a^2_{11})\overline{x}_1(\alpha)\; o_{12}\; (a^1_{12} - a^2_{12})\overline{x}_2(\alpha)\; \cdots\; o_{1n}\; (a^1_{1n} - a^2_{1n})\overline{x}_n(\alpha) &= \overline{b}^2_1(\alpha) - \overline{b}^1_1(\alpha),\\
&\;\;\vdots\\
(a^1_{n1} - a^2_{n1})\overline{x}_1(\alpha)\; o_{n2}\; \cdots\; o_{nn}\; (a^1_{nn} - a^2_{nn})\overline{x}_n(\alpha) &= \overline{b}^2_n(\alpha) - \overline{b}^1_n(\alpha).
\end{aligned}$$

For the new system of equations, with new coefficient matrix $A = A_1 - A_2$ and new right-hand side $b = b_2 - b_1$, the system can be simplified as follows:

$$\sum_{j=1}^{n} d_{ij}\,\underline{x}_j(\alpha) + \sum_{j=1}^{n} d_{i(j+n)}\,\overline{x}_j(\alpha) = \underline{b}_i(\alpha), \qquad
\sum_{j=1}^{n} d_{(i+n)j}\,\underline{x}_j(\alpha) + \sum_{j=1}^{n} d_{(i+n)(j+n)}\,\overline{x}_j(\alpha) = \overline{b}_i(\alpha), \qquad 1 \le i \le n,$$


in which, for $1 \le i, j \le n$,
$$\begin{aligned}
&d_{ij} = d_{(i+n)(j+n)} = a_{ij}, \quad d_{i(j+n)} = d_{(i+n)j} = 0, && \text{if } o_{ij} = + \text{ and } a_{ij} > 0,\\
&d_{ij} = d_{(i+n)(j+n)} = 0, \quad d_{i(j+n)} = d_{(i+n)j} = a_{ij}, && \text{if } o_{ij} = + \text{ and } a_{ij} < 0,\\
&d_{ij} = d_{(i+n)(j+n)} = -a_{ij}, \quad d_{i(j+n)} = d_{(i+n)j} = 0, && \text{if } o_{ij} = -_H \text{ and } a_{ij} > 0,\\
&d_{ij} = d_{(i+n)(j+n)} = 0, \quad d_{i(j+n)} = d_{(i+n)j} = -a_{ij}, && \text{if } o_{ij} = -_H \text{ and } a_{ij} < 0.
\end{aligned}$$

We are interested in a simpler representation of $D = [d_{ij}]_{2n\times 2n}$. To this end we define a binary matrix $P = [p_{ij}]_{n\times n}$, where
$$p_{ij} = \begin{cases} 1, & 1 \le i \le n,\; j = 1,\\ 1, & o_{ij} = +,\; 1 \le i \le n,\; 2 \le j \le n,\\ -1, & o_{ij} = -_H,\; 1 \le i \le n,\; 2 \le j \le n. \end{cases}$$

Proposition 4.1. Let $A$ be the coefficient matrix of a GFLS and $P$ its binary matrix. Then $D$ can be computed as
$$D = \begin{pmatrix} A \circ P \circ S(A) & A \circ P \circ S(-A) \\ A \circ P \circ S(-A) & A \circ P \circ S(A) \end{pmatrix},$$
where $\circ$ denotes the Hadamard product and $S$ is defined as
$$S: \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}, \quad A \mapsto S(A), \qquad (S(A))_{ij} = \begin{cases} 1, & a_{ij} \ge 0,\\ 0, & a_{ij} < 0. \end{cases}$$

Lemma 4.2. Let
$$Z = [z_{ij}]_{n\times n}, \qquad z_{ij} = \begin{cases} 1, & a_{ij} = 0,\\ 0, & \text{otherwise.} \end{cases}$$
Then
$$S(A) + S(-A) = \mathbf{1} + Z,$$
where $\mathbf{1}$ denotes the $n \times n$ matrix of ones.

Theorem 4.3. The matrix $D$ is nonsingular if and only if the matrices $A \circ P$ and $A \circ P \circ (2S(A) - \mathbf{1})$ are both nonsingular.

Theorem 4.4. If $D^{-1}$ exists, it must have the same structure as $D$:
$$D^{-1} = \begin{pmatrix} R & S \\ S & R \end{pmatrix},$$
where
$$R = \frac{1}{2}\left[(A \circ P)^{-1} + (A \circ P \circ (2S(A) - \mathbf{1}))^{-1}\right], \qquad
S = \frac{1}{2}\left[(A \circ P)^{-1} - (A \circ P \circ (2S(A) - \mathbf{1}))^{-1}\right].$$
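The block formulas of Theorem 4.4 are easy to check numerically; the sketch below (our own illustration, with assumed helper names such as sign_matrix) compares R and S computed from the formulas against a direct inversion of D on a small test matrix.

```python
import numpy as np

def sign_matrix(A):
    # (S(A))_{ij} = 1 if a_{ij} >= 0, else 0
    return (A >= 0).astype(float)

A = np.array([[2.0, -4.0], [1.0, 3.0]])          # small test matrix with mixed signs
P = np.array([[1.0, -1.0], [1.0, 1.0]])          # an arbitrary operator pattern

E = A * P * sign_matrix(A)                       # A.P.S(A)
F = A * P * sign_matrix(-A)                      # A.P.S(-A)
D = np.block([[E, F], [F, E]])

AP = A * P                                       # = E + F
APS = A * P * (2 * sign_matrix(A) - 1)           # = E - F  (no zero entries in A)
R = 0.5 * (np.linalg.inv(AP) + np.linalg.inv(APS))
S = 0.5 * (np.linalg.inv(AP) - np.linalg.inv(APS))

print(np.allclose(np.linalg.inv(D), np.block([[R, S], [S, R]])))   # True
```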

5. Numerical examples

Consider the following system:
$$A_1 = \begin{pmatrix} 3 & 2 \\ 4 & 1 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}, \quad
b_1 = \begin{pmatrix} [-16 + \alpha,\; -14 - \alpha] \\ [-14 + \alpha,\; -12 - \alpha] \end{pmatrix}, \quad
b_2 = \begin{pmatrix} [6 + 5\alpha,\; 20 - 9\alpha] \\ [-22 + 8\alpha,\; -8 - 6\alpha] \end{pmatrix}.$$
The matrices $P_1$, $P_2$, $S(\pm A_1)$, $S(\pm A_2)$ and $D$ are then formed as described in Section 4, and solving the resulting crisp system gives the solution of the proposed method, which coincides with the exact solution:
$$x_1 = [3\alpha,\; 6 - 3\alpha] \quad \text{and} \quad x_2 = [4 + \alpha,\; 7 - 2\alpha].$$

6. Conclusion

In this paper the Hukuhara difference is applied to construct a fuzzy linear system of the form $Ax + b_A = Bx + b_B$. We modified a numerical method that the authors previously considered for solving a fuzzy linear system.


Our modified method is guaranteed to obtain an exact fuzzy solution in the α-cut representation for the mentioned equation system.

References

1. S. Abbasbandy, A. Jafarian and R. Ezzati, Conjugate gradient method for fuzzy symmetric positive definite system of linear equations, Appl. Math. Comput. 171 (2005), 1184-1191.
2. S. Abbasbandy, R. Ezzati and A. Jafarian, LU decomposition method for solving fuzzy system of linear equations, Appl. Math. Comput. 172 (2006), 633-643.
3. S. Abbasbandy, M. Otadi and M. Mosleh, Minimal solution of general dual fuzzy linear systems, Chaos, Solitons & Fractals, in press.
4. D. Dubois and H. Prade, Operations on fuzzy numbers, J. Systems Sci. 9 (1978), 613-626.
5. M. Friedman, M. Ming, A. Kandel, Fuzzy linear systems, Fuzzy Sets and Systems 96 (1998), 201-209.
6. S. Muzzioli and H. Reynaerts, Fuzzy linear systems of the form A1x + b1 = A2x + b2, Fuzzy Sets and Systems 157 (2006), 939-951.
7. S. Nahmias, Fuzzy variables, Fuzzy Sets and Systems 1 (1978), 97-111.
8. X. Wang, Z. Zhong and M. Ha, Iteration algorithms for solving a system of fuzzy linear equations, Fuzzy Sets and Systems 119 (2001), 121-128.
9. L.A. Zadeh, Fuzzy sets, Information and Control 8 (1965), 338-353.
10. L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Inform. Sci. 8 (1975), 199-249.


Fuzzy regression: M-estimation approach

J. Chachi1, S.M. Taheri2

1- Department of Mathematics, Statistics and Computer Sciences, Semnan University, Semnan 35195-363, Iran

2- Faculty of Engineering Science, College of Engineering, University of Tehran, Tehran, P.O. Box 11365-4563, Iran

[email protected]

Abstract In this research, a new robust regression method is proposed, based on employing a fitting criterion that is not as vulnerable as least squares to unusual data. The fuzzy regression proposed in this paper is a modified M-estimation approach to fuzzy regression analysis for the case where the independent variables are crisp, the dependent variable is fuzzy, and outliers are present in the data set. The parameter estimation problem finally reduces to a weighted least-squares problem, which is a simple approach both theoretically and computationally. To illustrate how the proposed method is applied, a numerical example is discussed and compared with methods from the literature.

Keywords: Fuzzy regression, Weighted least-squares, Imprecise data.

1. INTRODUCTION An assessment of potential outliers is important in any analysis, especially when formalizing a linear regression model in a fuzzy domain, in which model parameters and/or data are fuzzy (imprecise or vague). A single outlier affects both the parameter estimates and the fit of the model to the data [7]. The main problem investigated in this paper is to provide a suitable method for dealing with fuzzy data contaminated by outliers. Robust fitting methods decrease the effect of outliers on the model fit. In this regard, the robust M-estimation method is considered in a fuzzy environment, providing a robust estimation method for the fuzzy regression model that is insensitive to outliers and possibly high-leverage points. In the proposed approach the outliers are carefully investigated and the importance of each observation is recognized. This is important because sometimes even a single observation can change the value of the parameter estimates, and omitting this observation from the data may lead to totally different estimates. Therefore, if there are outliers in the data set, the proposed method is preferable for estimating parameter values, because the detected outliers have low influence on the estimated values of the parameters. In the fuzzy regression analysis proposed in this paper, the outlier problem is dealt with from two different points of view: outlier detection and robust estimation. In the case of outliers in a fuzzy data set, the robust estimation of fuzzy regression parameters has been studied by many authors and robust methods have been defined. Recently, many studies have investigated robust fuzzy regression estimation (see the references in D'Urso et al. [7]). In this regard, the outlier problem has been addressed with regard to both outlier detection criteria and robust estimation procedures. The methods proposed to detect the presence of outliers in a fuzzy framework rely on graphical representation [6] and/or analytical procedures [5,10,12,13]. More recently, several issues related to robust fuzzy regression have been addressed in the literature and a great deal of progress has been made. See also [2,3,4,7] for possible published references on the topic of robust fuzzy regression analysis. The rest of the paper is organized as follows. In the next section we briefly review some basic concepts of fuzzy set theory. In Section 3, employing the M-estimation method, we construct a robust fuzzy multiple linear regression model for crisp-input fuzzy-output data. Section 4 reports an applicative real-valued example to illustrate the effectiveness of the proposed model in the presence of outliers. In Section 5, some final remarks conclude the paper.

2. FUZZY SETS AND FUZZY ARITHMETIC A fuzzy set $\tilde A$ on the universal set $\mathbb{X}$ is described by its membership function $\tilde A(x): \mathbb{X} \to [0,1]$. In this paper, we assume that $\mathbb{X} = \mathbb{R}$, the set of real numbers.


A specific class of fuzzy numbers, which is rich and flexible enough to cover most applications, is the class of so-called LR-fuzzy numbers $\tilde N = (n, l, r)_{LR}$ with central value $n \in \mathbb{R}$, left and right spreads $l, r \in \mathbb{R}^+$, and decreasing left and right shape functions $L: \mathbb{R}^+ \to [0,1]$, $R: \mathbb{R}^+ \to [0,1]$ with $L(0) = R(0) = 1$. Typically, the LR-fuzzy number $\tilde N$ has the following membership function [15]:
$$\tilde N(x) = \begin{cases} L\!\left(\dfrac{n - x}{l}\right), & x \le n,\\[6pt] R\!\left(\dfrac{x - n}{r}\right), & x \ge n. \end{cases}$$

We can easily obtain the α-cut of $\tilde N$ as
$$\tilde N_\alpha = [\,n - L^{-1}(\alpha)\, l,\;\; n + R^{-1}(\alpha)\, r\,] = [N^l_\alpha, N^u_\alpha], \qquad \alpha \in [0,1].$$

An LR-fuzzy number $\tilde N = (n, l, r)_{LR}$ with $L = R$ and $l = r = \lambda$ is called a symmetric L-fuzzy number and is abbreviated as $\tilde N = (n, \lambda)_L$. For the algebraic operations on LR-fuzzy numbers, we have the following result based on Zadeh's extension principle [15]. Let $\tilde M = (m, l_m, r_m)_{LR}$ and $\tilde N = (n, l_n, r_n)_{LR}$ be two LR-fuzzy numbers and λ a real number. Then

$$\lambda \otimes \tilde M = \begin{cases} (\lambda m,\; \lambda l_m,\; \lambda r_m)_{LR}, & \lambda > 0,\\ I_{\{0\}}, & \lambda = 0,\\ (\lambda m,\; |\lambda| r_m,\; |\lambda| l_m)_{RL}, & \lambda < 0, \end{cases}$$
$$\tilde M \oplus \lambda = (m + \lambda,\; l_m,\; r_m)_{LR}, \qquad \tilde M \oplus \tilde N = (m + n,\; l_m + l_n,\; r_m + r_n)_{LR},$$
where $I_{\{0\}}$ stands for the indicator function of the crisp zero.

3. THE PROPOSED MODEL

Assume that, in a practical study, the observed data on n statistical units are recorded as $(\mathbf{x}_1, \tilde y_1), \ldots, (\mathbf{x}_n, \tilde y_n)$, where $\tilde{\mathbf{y}}_{n\times 1} = [\tilde y_1, \ldots, \tilde y_n]'$ is the vector of symmetric L-fuzzy numbers, i.e. $\tilde y_i = (y_i, l_i)_L$ ($i = 1, \ldots, n$), which constitutes the fuzzy observations of the dependent variable, and $\mathbf{x}_i = [x_{0i}, x_{1i}, \ldots, x_{ki}]$ ($i = 1, \ldots, n$; $k < n$; $x_{0i} = 1$) forms the vector of crisp observed independent variables. Without loss of generality, we can assume that $x_{ji} > 0$, by a simple translation of all data if necessary. The following functional dependence between $\tilde{\mathbf{y}}_{n\times 1}$ and the design matrix $\mathbf{X}_{n\times(k+1)}$ is considered:

$$\begin{aligned}
\tilde y = (y, g(l))_L &= (\beta_0, \sigma_0)_L \oplus \big((\beta_1, \sigma_1)_L \otimes x_1\big) \oplus \cdots \oplus \big((\beta_k, \sigma_k)_L \otimes x_k\big)\\
&= (\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,\;\; \sigma_0 + \sigma_1 x_1 + \cdots + \sigma_k x_k)_L,
\end{aligned}$$
where $g: \mathbb{R}^+ \to \mathbb{R}$ is invertible.

In the proposed method, we do not need to impose a non-negativity condition on the estimation problem in order to avoid negative estimated spreads. Instead, we propose modeling a transformation of the spread of the response through a model on the explanatory variables [8]. Indeed, using this method, the center and the spread of the response variable are obtained as
$$(y, l)_L = \big(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k,\;\; g^{-1}(\sigma_0 + \sigma_1 x_1 + \cdots + \sigma_k x_k)\big)_L.$$


In the above model the spreads of the response variable are always non-negative. A common approach consists in transforming the spread by means of the natural logarithmic transformation, i.e. $g(t) = \log(t)$ [8]. We will use this approach in the numerical example to transform the spreads into real variables without the restriction of non-negativity.

The procedure for estimating the fuzzy parameter $\tilde{\boldsymbol\beta}_{(k+1)\times 1} = [(\beta_0, \sigma_0)_L, (\beta_1, \sigma_1)_L, \ldots, (\beta_k, \sigma_k)_L]'$ is based on choosing the best candidate $\hat{\tilde{\boldsymbol\beta}}_{(k+1)\times 1}$ for $\tilde{\boldsymbol\beta}_{(k+1)\times 1}$ by minimizing the total difference (the sum of the errors) between the observed values of the response variable, $\tilde{\mathbf{y}}_{n\times 1}$, and its theoretical counterpart, $\hat{\tilde{\mathbf{y}}}_{n\times 1}$, with respect to a distance. In the following, for simplicity, the orders of vectors and matrices will not be shown.

The errors, defined as the differences between the observed values of the dependent variable and its estimated values, play an important role in the estimation of a regression model. In the fuzzy least-squares method, using a distance between fuzzy numbers, the parameters of the model are estimated so that the Sum of Squared Errors (SSE) is minimized. A well-known distance between two LR-fuzzy numbers $\tilde M$ and $\tilde N$ is defined as
$$D^2(\tilde M, \tilde N) = \int_0^1 f(\alpha)\left( [M^l_\alpha - N^l_\alpha]^2 + [M^u_\alpha - N^u_\alpha]^2 \right) d\alpha,$$
where $f(\alpha)$ is an increasing function on $(0,1]$ satisfying $f(0) = 0$ and $\int_0^1 f(\alpha)\, d\alpha = 0.5$ (see also [11,14]). In the special case of two symmetric L-fuzzy numbers $\tilde M = (m, \lambda_m)_L$ and $\tilde N = (n, \lambda_n)_L$, the distance reduces to
$$D^2(\tilde M, \tilde N) = (m - n)^2 + c\,(\lambda_m - \lambda_n)^2,$$
where the constant $c$ is defined as $c = \int_0^1 f(\alpha)\left(L^{-1}(\alpha)\right)^2 d\alpha$.
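For symmetric triangular fuzzy numbers with $L(t) = \max(0, 1-t)$ and, say, $f(\alpha) = \alpha$ (one admissible choice), the constant evaluates to $c = \int_0^1 \alpha(1-\alpha)^2\, d\alpha = 1/12$, and the distance can be coded directly (our own illustrative snippet, not part of the paper):

```python
# Distance between symmetric triangular fuzzy numbers M = (m, lam_m)_L, N = (n, lam_n)_L,
# assuming L(t) = max(0, 1 - t) and f(alpha) = alpha, which gives c = 1/12.
def d2(m, lam_m, n, lam_n, c=1.0 / 12.0):
    return (m - n) ** 2 + c * (lam_m - lam_n) ** 2

print(d2(5.0, 1.0, 4.0, 2.0))   # 1 + 1/12
```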

For estimating the fuzzy parameter $\tilde{\boldsymbol\beta}$, we will consider the M-estimation approach, which is a fitting criterion for estimating $\tilde{\boldsymbol\beta}$ that is not as vulnerable as least squares to unusual data [9]. The general M-estimator minimizes an objective function that is a function ρ of the errors, which, in the proposed fuzzy regression model, is equivalent to
$$SSE = \sum_{i=1}^{n} \rho\!\left( D^2\!\left( (y_i, g(l_i))_L,\; \Big(\sum_{j=0}^{k}\beta_j x_{ji},\; \sum_{j=0}^{k}\sigma_j x_{ji}\Big)_L \right) \right).$$

The function ρ gives the contribution of each residual to the objective function. A reasonable function ρ should have the following properties:

• $\rho(e) \ge 0$;
• $\rho(0) = 0$;
• $\rho(e) = \rho(-e)$;
• $\rho(e) \le \rho(e')$ for $|e| \le |e'|$.


For example, for least-squares estimation, $\rho(e) = \dfrac{e^2}{2}$. In this paper the following function, known as the Huber function, is used in the computations [9]:
$$\rho(e) = \begin{cases} \dfrac{e^2}{2}, & |e| < 1.345,\\[6pt] 1.345\left(|e| - \dfrac{1.345}{2}\right), & |e| \ge 1.345. \end{cases}$$

Now, for estimating the fuzzy parameter $\tilde{\boldsymbol\beta}$, the sum of squared errors, i.e. SSE, should be minimized through the following optimization problem:
$$\min_{\tilde{\boldsymbol\beta}} \;\sum_{i=1}^{n} \rho\!\left( \Big(y_i - \sum_{j=0}^{k}\beta_j x_{ji}\Big)^2 + c\,\Big(g(l_i) - \sum_{j=0}^{k}\sigma_j x_{ji}\Big)^2 \right).$$

The minimum value of SSE can be found by differentiating with respect to the parameters $\beta_j$ and $\sigma_j$, $j = 0, 1, \ldots, k$, and setting the resulting partial derivatives to 0. This produces a system of $2(k+1)$ estimating equations for the parameters:
$$\frac{\partial\, SSE}{\partial \beta_j} = 0 \;\Longleftrightarrow\; \sum_{i=1}^{n} x_{ji}\,\psi\!\Big(y_i - \sum_{j=0}^{k} x_{ji}\beta_j\Big) = 0, \qquad j = 0, 1, \ldots, k,$$
$$\frac{\partial\, SSE}{\partial \sigma_j} = 0 \;\Longleftrightarrow\; \sum_{i=1}^{n} x_{ji}\,\psi\!\Big(g(l_i) - \sum_{j=0}^{k} x_{ji}\sigma_j\Big) = 0, \qquad j = 0, 1, \ldots, k,$$

where $\psi = \rho'$ is the derivative of ρ. In general, the ψ function is nonlinear and the above equations are best solved by iterative methods. In the following, the iteratively reweighted least-squares algorithm is described for estimating the fuzzy parameter $\tilde{\boldsymbol\beta}$. Defining the following weight functions for $i = 1, \ldots, n$,
$$\omega_i = \frac{\psi\!\Big(y_i - \sum_{j=0}^{k} x_{ji}\beta_j\Big)}{y_i - \sum_{j=0}^{k} x_{ji}\beta_j}, \qquad
\gamma_i = \frac{\psi\!\Big(g(l_i) - \sum_{j=0}^{k} x_{ji}\sigma_j\Big)}{g(l_i) - \sum_{j=0}^{k} x_{ji}\sigma_j},$$

the estimating equations may be written as
$$\sum_{i=1}^{n} x_{ji}\,\omega_i \Big(y_i - \sum_{j=0}^{k} x_{ji}\beta_j\Big) = 0, \qquad
\sum_{i=1}^{n} x_{ji}\,\gamma_i \Big(g(l_i) - \sum_{j=0}^{k} x_{ji}\sigma_j\Big) = 0, \qquad j = 0, 1, \ldots, k.$$

In these estimating equations, however, the weights depend upon the errors, the errors depend upon the estimated coefficients, and the estimated coefficients depend upon the weights. An iterative solution (called iteratively reweighted least-squares [1]) is therefore required to obtain the optimal parameters:

Step 1. Select initial estimate


$$\tilde{\boldsymbol\beta}^{(0)} = [(\beta_0^{(0)}, \sigma_0^{(0)})_L, (\beta_1^{(0)}, \sigma_1^{(0)})_L, \ldots, (\beta_k^{(0)}, \sigma_k^{(0)})_L]',$$
such as the least-squares estimates.

Step 2. At each iteration $t = 1, 2, \dots$, calculate the errors $e_i^{(t-1)}$ and $\varepsilon_i^{(t-1)}$ as

$$e_i^{(t-1)} = y_i - \sum_{j=0}^{k}\beta_j^{(t-1)} x_{ji}, \qquad \varepsilon_i^{(t-1)} = g(l_i) - \sum_{j=0}^{k}\sigma_j^{(t-1)} x_{ji},$$

and associate the following weights with the errors $e_i^{(t-1)}$ and $\varepsilon_i^{(t-1)}$ for $i = 1, \dots, n$:

$$\omega_i^{(t-1)} = \frac{\psi\big(e_i^{(t-1)}\big)}{e_i^{(t-1)}}, \qquad \gamma_i^{(t-1)} = \frac{\psi\big(\varepsilon_i^{(t-1)}\big)}{\varepsilon_i^{(t-1)}}.$$

Step 3. Calculate the new weighted-least-squares estimate $\widetilde{\boldsymbol{\beta}}^{(t)}$ as

$$\boldsymbol{\beta}^{(t)} = \big[\mathbf{X}'\boldsymbol{\Omega}^{(t-1)}\mathbf{X}\big]^{-1}\mathbf{X}'\boldsymbol{\Omega}^{(t-1)}\mathbf{y}, \qquad \boldsymbol{\sigma}^{(t)} = \big[\mathbf{X}'\boldsymbol{\Gamma}^{(t-1)}\mathbf{X}\big]^{-1}\mathbf{X}'\boldsymbol{\Gamma}^{(t-1)}\mathbf{g}(\mathbf{l}),$$

where $\mathbf{y} = [y_1,\dots,y_n]'$, $\mathbf{g}(\mathbf{l}) = [g(l_1),\dots,g(l_n)]'$, $\boldsymbol{\Omega}^{(t-1)} = \mathrm{diag}\big[\omega_1^{(t-1)},\dots,\omega_n^{(t-1)}\big]$, and $\boldsymbol{\Gamma}^{(t-1)} = \mathrm{diag}\big[\gamma_1^{(t-1)},\dots,\gamma_n^{(t-1)}\big]$ are the current weight matrices.

Step 4. Steps 2 and 3 are repeated until the estimated coefficients converge.
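A compact sketch of Steps 1-4 (assuming crisp inputs collected in a design matrix X with a leading column of ones, crisp centers y and transformed spreads g(l) of the fuzzy responses, and the Huber psi defined earlier; all variable names are illustrative, not the authors' code):

```python
import numpy as np

def irls_fuzzy(X, y, gl, psi, max_iter=100, tol=1e-8):
    """Iteratively reweighted least squares for the center (beta) and spread (sigma) parameters."""
    # Step 1: ordinary least-squares estimates as the initial values.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    sigma = np.linalg.lstsq(X, gl, rcond=None)[0]
    for _ in range(max_iter):
        # Step 2: current errors and the corresponding weights psi(e)/e.
        e, eps = y - X @ beta, gl - X @ sigma
        w = np.where(e != 0, psi(e) / e, 1.0)
        g = np.where(eps != 0, psi(eps) / eps, 1.0)
        # Step 3: weighted least-squares updates.
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        sigma_new = np.linalg.solve(X.T @ (g[:, None] * X), X.T @ (g * gl))
        # Step 4: stop when the estimated coefficients converge.
        converged = max(np.max(np.abs(beta_new - beta)), np.max(np.abs(sigma_new - sigma))) < tol
        beta, sigma = beta_new, sigma_new
        if converged:
            break
    return beta, sigma
```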

Finally, after obtaining the best solution of the above algorithm, the optimal model can be written as

$$(\hat{y}, \hat{l})_L = \Big(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \dots + \hat{\beta}_k x_k,\ g^{-1}\big(\hat{\sigma}_0 + \hat{\sigma}_1 x_1 + \dots + \hat{\sigma}_k x_k\big)\Big)_L.$$

4. NUMERICAL EXAMPLE

One of the classical problems in hydrology engineering is the measurement of suspended load (ton/day) and discharge (m³/s) in watersheds, in order to obtain a model for predicting suspended load from discharge. In a study on part of Darband (situated in the north east of Iran), some water characteristics were measured using standard procedures [2,3,4]. The daily discharge and suspended load of the watershed were measured 51 times. However, due to some limitations in the experimental environment, the observed values of the suspended load were reported as triangular fuzzy numbers. Since, in practical studies, it is important to estimate the amount of suspended load in terms of discharge, we wish to model the relationship between suspended load as the fuzzy dependent variable ($\widetilde{y} = (y, l)_T$) and discharge as the crisp independent variable ($x$).


Figure 1. The scatter plot of the centers of suspended load in term of discharge

Figure 2. The scatter plot of the spreads of suspended load in term of discharge

In order to gain some insight into the data, Figures 1 and 2 show the centers and the spreads of the suspended load against discharge, respectively. It turns out that these observations are contaminated by outliers with respect to the values of discharge and the centers of the dependent variable, and therefore our robust technique can help in detecting such outliers. By applying the procedure introduced in Section 3, the optimal model estimating the fuzzy dependent variable is obtained as

$$\hat{\widetilde{y}} = \big(5.43 + 0.24x,\ \exp(0.47 + 0.04x)\big)_T.$$

It is well known that the least-squares fit can be grossly influenced by outliers. To provide a comparative study, this example will also show that least squares does not work well in the presence of outliers. In this regard,


we also employ two fuzzy least-squares regression models, proposed by Xu and Li [14] and Ferraro et al. [8], to model the data set. We denote the methods of Xu and Li [14] and Ferraro et al. [8] as Xu and F, respectively. Applying these approaches to the data set gives the following fuzzy linear regression models:

$$\hat{\widetilde{y}}^{Xu} = (5.60, 1.58)_L \oplus (0.20, 0.08)_L \otimes x,$$
$$\hat{\widetilde{y}}^{F} = \big(5.60 + 0.20x,\ \exp(0.47 + 0.04x)\big)_T.$$

Several goodness-of-fit criteria have been introduced to evaluate the performance of a fuzzy regression model [2,3,4,7]. In this work, to provide a comparative study, we compare the model proposed here with some well-known models. To this end, we employ three criteria, the Mean of Similarity Measures (MSM) and two Means of Absolute Errors ($MAE_1$ and $MAE_2$), which have been widely used for the evaluation of fuzzy regression models. These criteria are defined as follows:

$$MSM = \frac{1}{n}\sum_{i=1}^{n}\frac{\int \min\big\{\widetilde{y}_i(t),\ \hat{\widetilde{y}}_i(t)\big\}\,dt}{\int \max\big\{\widetilde{y}_i(t),\ \hat{\widetilde{y}}_i(t)\big\}\,dt},$$

$$MAE_1 = \frac{1}{n}\sum_{i=1}^{n}\int \big|\widetilde{y}_i(t) - \hat{\widetilde{y}}_i(t)\big|\,dt,$$

$$MAE_2 = \frac{1}{n}\sum_{i=1}^{n}\frac{\int \big|\widetilde{y}_i(t) - \hat{\widetilde{y}}_i(t)\big|\,dt}{\int \widetilde{y}_i(t)\,dt}.$$
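For triangular fuzzy observations and estimates, these criteria can be approximated by discretizing $t$. The sketch below is illustrative only (the membership-function helper and grid size are assumptions, not from the paper) and evaluates the three criteria over $n$ pairs of symmetric triangular fuzzy numbers.

```python
import numpy as np

def integrate(f, t):
    """Simple trapezoidal rule over the grid t."""
    return np.sum((f[1:] + f[:-1]) * np.diff(t)) / 2.0

def tri(t, center, spread):
    """Membership function of a symmetric triangular fuzzy number (center, spread)_T."""
    return np.clip(1.0 - np.abs(t - center) / spread, 0.0, 1.0)

def criteria(centers, spreads, centers_hat, spreads_hat, grid_size=2001):
    """Approximate MSM, MAE1 and MAE2 for n fuzzy observations and their estimates."""
    msm, mae1, mae2 = 0.0, 0.0, 0.0
    n = len(centers)
    for c, s, ch, sh in zip(centers, spreads, centers_hat, spreads_hat):
        t = np.linspace(min(c - s, ch - sh), max(c + s, ch + sh), grid_size)
        y, yh = tri(t, c, s), tri(t, ch, sh)
        msm += integrate(np.minimum(y, yh), t) / integrate(np.maximum(y, yh), t)
        mae1 += integrate(np.abs(y - yh), t)
        mae2 += integrate(np.abs(y - yh), t) / integrate(y, t)
    return msm / n, mae1 / n, mae2 / n
```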

To compare the performances of these three fuzzy regression models, the criteria MSM, $MAE_1$ and $MAE_2$ are adopted to measure the accuracy of the models in estimating the observed responses. The corresponding values of these criteria are listed in Table 2.

Table 2. Comparison between fuzzy regression models.

Method proposed by      MSM     MAE1    MAE2
Xu and Li [14]          0.50    1.48    0.73
Ferraro et al. [8]      0.50    1.49    0.74
Proposed model          0.52    1.44    0.70

The MSM value for the proposed robust fuzzy method is 0.52, which is greater than the values of 0.50 obtained from both the Xu and F methods. On the other hand, the $MAE_1$ and $MAE_2$ values for the proposed method are 1.44 and 0.70, respectively, which are clearly smaller than the values of 1.48 and 0.73 obtained from the Xu method and 1.49 and 0.74 obtained from the F method. It should be mentioned that the models proposed by Xu and Li [14] and Ferraro et al. [8] have been shown to dominate several other fuzzy regression models under different criteria, including the MSM criterion used in this paper. Therefore, the above results suggest that the proposed model, at least in this example, is not only better than these two models but also better than many models dominated by them.


5. CONCLUSIONS

In this paper, the robust M-estimation method was employed to construct a fuzzy regression model for crisp-input, fuzzy-output data. Similarly, other robust estimation methods could be used to construct fuzzy regression models and compared with well-known fuzzy regression techniques in the literature. The proposed model was applied to estimate the suspended load from the discharge, based on a real data set in which the observations of the suspended load were fuzzy rather than crisp. To provide a comparative study, the proposed approach was compared with two well-known fuzzy regression techniques on this real data set. The results indicated that the proposed fuzzy regression model has better explanatory power than many well-known models. This conclusion was based on three goodness-of-fit criteria: a similarity-measure-based criterion and two absolute-error-based criteria.

6. REFERENCES 1. Andersen, R. (2007), “Modern Methods for Robust Regression’’. Sage, Thousand Oaks, CA,

2. Chachi, J. and M. Roozbeh, M. (2016). ``A fuzzy robust regression approach applied to bedload transport data'', Communications in Statistics-Simulation and Computation, In Press.

3. Chachi, J., Taheri, S. M. and Arghami, N. R. (2014), “A hybrid fuzzy regression model and its application in hydrology engineering’’, Applied Soft Comput. 25, pp. 149-158.

4. Chachi, J., Taheri, S. M. and Rezaei Pazhand, H. (2016), “Suspended load estimation using $L_1$-fuzzy regression, $L_2$-fuzzy regression and MARS-fuzzy regression models”, Hydrological Sciences J. doi:10.1080/02626667.2015.1016946.

5. Chen, Y. S. (2001), “Outliers detection and confidence interval modification in fuzzy regression’’, Fuzzy Sets Syst. 119, pp. 259-272.

6. Coppi, R., D'Urso, P., Giordani, P. and Santoro, A. (2006), “Least squares estimation of a linear regression model with LR fuzzy response”, Comp. Stat. Data Anal. 51, pp. 267-286.

7. D'Urso, P., Massari, R. and Santoro, A. (2011), “Robust fuzzy regression analysis”, Inform. Sci. 181, pp. 4154-4174.

8. Ferraro, M. B., Coppi, R. G., Gonzalez-Rodriguez, R. and Colubi, A. (2010), “A linear regression model for imprecise response”, Int. J. Approx. Reason. 51, pp. 759-770.

9. Huber, P. and Ronchetti, E. M. (2009), “Robust Statistics”, 2ed. Wiley, Hoboken, NJ.

10. Hung, W. L. and Yang, M. S. (2006), “An omission approach for detecting outliers in fuzzy regressions models”, Fuzzy Sets Syst. 157, pp. 3109-3122.

11. Mohammadi, J. and Taheri, S.M. (2004), “Pedomodels fitting with fuzzy least squares regression”, Iranian J. Fuzzy Syst. 1, pp. 45-62.

12. Nasrabadi, E., Hashemi, S. M. and Ghatee, M. (2007), “An LP-based approach to outliers detection in fuzzy regression analysis”, Int. J. Uncertainty, Fuzziness and Knowledge-Based Syst. 15, pp. 441-456.

13. Peters, G. (1994), “Fuzzy linear regression with fuzzy intervals”, Fuzzy Sets Syst. 63, pp. 45-55.

14. Xu, R. and Li, C. (2001), “Multidimensional least-squares fitting with a fuzzy model”, Fuzzy Sets Syst. 119, pp. 215-223.

15. Zimmermann, H. J. (2001), “Fuzzy Set Theory and Its Applications”, 4th ed., Kluwer Nihoff, Boston,


A Novel Metaheuristics for Optimization Inspired by Mother-Infant Communication in Animal Colonies

Alireza Ghaffari-Hadigheh

Azarbaijan Shahid Madani University, Tabriz, Iran

[email protected]

Abstract Mother-infant vocalization from a distance in some animal species, such as bats, gulls, and penguins, is a basic tool for finding the other's location. It is referred to as echolocation, or a bio-sonar characteristic, which is important for the mother to find exactly her own baby in a large colony. Moreover, the baby uses it to better guide the mother back to the nest without knowing its exact location. This natural fact is the motivation of the current study to devise a novel metaheuristic algorithm for solving optimization problems. The proposed methodology was tested on some continuous functions and led to promising results on convergence.

Keywords: Metaheuristics, Continuous optimization problem, Evolutionary Methods.

1. INTRODUCTION

Nowadays, many engineering challenges can be formulated as optimization problems, in discrete or continuous form, for which no exact polynomial-time algorithm is known. A great number of heuristics have been developed that lead to solutions close to the optimum, the majority of them tailored to a specific problem. Moreover, certain continuous optimization problems do allow a global optimum to be located exactly in a finite number of computations. For these kinds of problems, significant traditional methods are available. However, they are often ineffective if the objective function does not satisfy particular structural properties such as convexity and smoothness.

Metaheuristics, which are usually nature-inspired and based on swarm intelligence, have created developments in both domains. They are applied to almost all kinds of discrete problems and have been adapted to continuous problems as well. The sources of inspiration are quite diverse, while all of them employ some specific characteristics in formulating the key updating formulae. To name some examples, the genetic algorithm was inspired by the Darwinian evolution of biological systems [6], the ant colony algorithm employs the behavior of ants searching for a shortest path from their colony [1], particle swarm optimization is based on the swarming behavior of birds and fish [6], and the firefly algorithm is based on the flashing characteristics of tropical fireflies [7].

Recently, the bat algorithm, based on the echolocation features of microbats, has been introduced [8]; it employs a frequency-tuning technique to increase the diversity of the solutions in the population. At the same time, it uses automatic zooming to balance exploration and exploitation during the search process by mimicking the variations of pulse emission rates and loudness of bats when searching for prey. We refer the interested reader to [9] and the references therein for a thorough review of the literature.

The natural characteristics of mother-infant vocalization in some animal species have not yet been exploited in the heuristic and metaheuristic literature. Let us explain these characteristics for some species. According to observations on mother-infant reunion, adult female bats feed only their own babies, not others' in the colony. The mother can recognize her own infant through both odor and vocal cues, indicating that the isolation calls emitted by the infant


bats play an important role in mother-infant communication [10]. Another example is the lesser spear-nosed bat, when the mother returns to her baby after separation. The mother's calls have a specific frequency-time structure, which makes her individual voice distinguishable by her infant [2]. An experimental study revealed that the big brown bat responds exclusively to sinusoidally frequency-modulated signals [11]. In another study, the spontaneous vocalization development of the greater horseshoe bat was investigated.

Vocalizations of greater horseshoe bat infants can be categorized into those serving as precursors of echolocation sounds and those serving as isolation calls used to inform their mothers.

Similar behavior is observed in black-tailed gulls in their breeding colonies. Their vocal signals can be categorized into three main groups, of which the contact call is the one most used as a tool for distinguishing between individuals, especially between parents and chicks. Empirical research shows the importance of this call in social relationships within highly populated breeding colonies [5].

Analogously, when a king penguin chick is to be fed, the baby must recognize the call of its parents, mostly using frequency cues, against the continuously noisy background of the colony. Experimental observations revealed that the chick's frequency analysis of the call is not tuned towards precise peak-energy values, the signal being recognized even when the carrier frequency was shifted out of the pre-specified range [3]. This structure is repeated through several repetitions of the call, giving a distinct vocal signature. The experiments also showed that only a small amount of information is necessary to understand the message. The high redundancy in the time and frequency domains, and the almost infinite possibilities of coding provided by the frequency-modulation signature, permit the chick to recognize the adult without the help of a nest position. This remarkable capability inspired us to develop a new metaheuristic which, based on our experiments, efficiently solves convex and non-convex continuous problems to a (local) optimizer. There is a main difference between our methodology and the bat algorithm proposed in [8], although both can be considered frequency-tuning algorithms. There, the author considered a population of bats, each of them representing a potential optimal point of a given optimization problem. The optimal point is the position of the prey. Within an iteration, the algorithm selects the best place occupied by one of the bats according to the objective function value. Each bat emits a sound towards the prey with a randomly selected frequency in an interval and approximates a better direction towards the prey using the sound's reflection. The loudness and the rate of pulse emission are updated; the updating factor acts like the cooling factor in simulated annealing. We note that there is no additional assumption on communication of information between the bats and the prey, but only the sound emitted by the bats and its reflection, the loudness of the voices, and the pulse emission rates.

In our algorithm, instead of a population of bats, only one mother of such a species is considered, together with a population of infants that is updated in each iteration. It is assumed that the mother was with her baby before leaving for prey. The focus of our algorithm is their policy for reuniting within a colony of mothers and babies that may number in the millions. The only device for communication between the mother and her infant is echolocation sounds transmitted in a mutually pre-agreed range. They tune their voice frequencies within the range understandable to both. It is assumed that the infant stands at an optimal solution of the underlying objective function and that the mother moves toward this point in each iteration. In this algorithm, voice loudness and pulse emission rates are not employed. To distinguish our algorithm from the one in [8], it is referred to as the Parent-Infant Communication algorithm.

The paper is organized as follows. Section 2 describes the main assumptions and then explains the algorithm in detail; the convergence of the algorithm is also briefly discussed in this section. Section 3 presents experimental results on some test problems. The performance of the algorithm is investigated for different values of the parameters in Section 4. The final section includes some concluding remarks, future work directions, and possible generalizations and developments.


2. Standard Parent-Infant Communication Algorithm

Based on the above description and characteristics of Mother-Infant echolocation, we developed the algorithm with the three idealized rules:

1. All infants as well as the mother use echolocation to determine distance, and they also know the difference between each other and other barriers in some way.

2. The mother as well as its infant emit voices with a frequency varying in [ ; ]. The mother as well as its infant can automatically adjust the wavelength (or frequency) of their emitted pulses in this range.

3. The mother has the ability to recognize its own infant's voice among the other's in an unknown way. A pseudo-code of the proposed algorithm is presented in Algorithm 1. An iteration is described in the sequel.

2.1 Virtual Mother-Infants Treatment

A small number of infants are selected from the colony, and their positions are fixed. Only the mother's infant is at the optimal position of the given objective function, which might be a local optimizer. For an infant population of size n, the initial set (0) = (0); (0) = 1, … , is considered, where (0) ∈ [ ; ] is the emitted voice frequency of the -th infant situated at (0). Before the separation of the mother and her infant, the mother memorizes the initial frequencies of their voices and computes their ratio, which lies in an agreed interval. Let us denote this value by (0) = (0)/ (0), where (0) and (0) are the voice frequencies of the mother and the infant, respectively. They are randomly selected in [ ; ] at the start of the algorithm. This value will be adjusted by them as the mother gets closer to her infant. It is observed that this ratio approaches a limit point as the algorithm proceeds. In each iteration of the algorithm, the population of infants is replaced by a new one until the prespecified number of iterations has been performed.

2.2 Virtual Mother Motion

Let us describe the initial motion; the others are performed similarly. The mother first calculates the ratio (0) = (0)/ (0) for = 1, … , , selects the one closest to (0), and considers the corresponding infant, say the -th one, as her own baby (i.e., a candidate for the optimal solution). Her current position is (0); she moves toward (0) along the direction (0) = (0) − (0) and updates her position according to a line search for the smallest value of the objective function along the way. This point is taken as the mother's position at the next iteration, denoted by (1), corresponding to the optimal step length ∗.

2.3 Updates

The updates in the first step are as follows; analogous updates are performed at the other iterations. First, the voice frequencies of the mother and of the infant are adjusted in terms of the objective function values at the initial and updated positions of the mother. This could be carried out by a random process such as

(1) = (0) − [ ( 1∗) − ( 0 )] (1)

(1) = (0) − [ ( 1∗) − ( 0 )] (2)

where and are two random numbers in (0, 1). Furthermore, the mother disregards the current population of infants and considers a new set (1) = (1); (1) = 1, … , . The position of each infant in (1) is selected on the decreasing side of the objective function. One possible choice is the half-space identified by the hyperplane passing through ∗ and perpendicular to the vector − (0).


(1) ∈ 0(1);− (0) (3)

The voice frequencies of the infants are also chosen randomly. A potential update of the frequencies is

(1) = + ( − ); = 1, … , (4)

where is a random value in (0,1).
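Because several symbols in the equations above were lost in typesetting, the following sketch is only one plausible reading of the Parent-Infant Communication algorithm as described in the text (one mother, a population of infants resampled each iteration, a frequency-ratio matching rule, and a simple line search); all variable names and exact update rules are assumptions, not the authors' code.

```python
import numpy as np

def pic_optimize(f, lb, ub, n_infants=25, n_iter=100, fmin=0.0, fmax=100.0, rng=None):
    """Plausible sketch of the Parent-Infant Communication metaheuristic."""
    rng = np.random.default_rng(rng)
    dim = len(lb)
    x_mother = rng.uniform(lb, ub)                 # mother's initial position
    f_mother = rng.uniform(fmin, fmax)             # mother's voice frequency
    f_infant = rng.uniform(fmin, fmax)             # true infant's voice frequency
    r_target = f_infant / f_mother                 # memorized frequency ratio
    for _ in range(n_iter):
        # Fresh population of candidate infants with random positions and frequencies.
        pos = rng.uniform(lb, ub, size=(n_infants, dim))
        freq = rng.uniform(fmin, fmax, size=n_infants)
        # Mother picks the infant whose frequency ratio is closest to the memorized one.
        j = np.argmin(np.abs(freq / f_mother - r_target))
        direction = pos[j] - x_mother
        # Simple line search: 11 equally spaced step lengths in [0, 1].
        steps = np.linspace(0.0, 1.0, 11)
        candidates = x_mother + steps[:, None] * direction
        best = np.argmin([f(c) for c in candidates])
        f_old = f(candidates[0])                   # objective at the previous position
        x_mother = candidates[best]
        # Frequency re-tuned by a random amount driven by the objective improvement.
        f_mother = np.clip(f_mother - rng.random() * (f(x_mother) - f_old), fmin, fmax)
    return x_mother, f(x_mother)

# Usage: De Jong's sphere function shifted to (-10, 30), as in Example 1 below.
sphere = lambda x: (x[0] + 10.0) ** 2 + (x[1] - 30.0) ** 2
x_best, f_best = pic_optimize(sphere, lb=np.array([-50.0, -50.0]), ub=np.array([50.0, 50.0]))
```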

3. Some Empirical Results

In this section, some experimental results are presented and discussed. Three of the examples in [8] are considered to validate the proposed method. As in [8], the frequency variation range is fixed at [0,100], and the random coefficients in the update formulae are selected from a uniform distribution on (0,1). Moreover, a simple line search is implemented: the interval [0,1] containing the step length is divided into 10 subintervals, and the one with the smallest value of the objective function among the associated 11 points is selected. An infant population of size 25 and 100 iterations are assumed. In all figures, the black dots are the random starting points and the red ones are the final points produced by the algorithm. The algorithm is implemented in Matlab to visualize its effectiveness in practice. More experimental results are presented in the next section.

The first example considers a function with a unique global optimal solution.

Example 1. This function is a version of De Jong's standard sphere function with a single global minimum at $\mathbf{a} = (a_1, \dots, a_d)$:

$$h(\mathbf{x}) = \sum_{i=1}^{d} (x_i - a_i)^2. \qquad (5)$$

In 2D, $a_1 = -10$ and $a_2 = 30$ are considered, and consequently the optimal solution is $(-10, 30)$. In our experiments, it was observed that the starting point is unimportant for the efficient convergence of the algorithm. The right plot in Fig. 1 shows the trace of produced points converging to the optimal solution (denoted by blue +'s). The left plot depicts the starting points and the resulting points for 50 runs of the algorithm.


Figure 1. Results for De Jong's standard sphere function in 2D. The right figure shows one sample starting point, the trace of intermediate points, and the final point. The left one shows the results of 50 runs of the algorithm: the starting points and the accumulation point.


In the next example, a function with multiple global minimizers and no other local ones is considered.

Example 2. Consider the following Rosenbrock-type function:

$$f(\mathbf{x}) = (1 - x_1^2)^2 + 100\,(x_2 - x_1^2)^2. \qquad (6)$$

Analogously to [8], the box containing the optimal solutions is considered as $B = \{\mathbf{x} \mid -2.048 \le x_i \le 2.048,\ i = 1, \dots, d\}$. This function has two global minimizers, $(-1,1)$ and $(1,1)$, in 2D, and the results of 50 runs are depicted in Fig. 2. It is seen that either of the two global optimizers can be the accumulation point of the algorithm. The right plot gives a closer view around the global optimizers $(-1,1)$ and $(1,1)$.

Figure 2. The left plot shows the results of running the algorithm on the Rosenbrock function in 2D. The right one depicts the resulting points of the algorithm.

From Fig. 3, it is clear that the position of the starting point does not determine the resulting point of the algorithm. In both of these runs, the starting point is closer to (1, 1) than to (−1, 1); however, the final points are not the same. To avoid this behavior, when one of the optimizers is of special interest and its approximate position is known, the starting-point box B can be shrunk accordingly. Experiments confirm this remedy for the mentioned drawback. Nevertheless, it was observed in some runs that, even when the starting point was selected close enough to one of the optima, the algorithm stopped at the other one.


Figure 3. Two sample runs, each approaching one of the two global optimal solutions of the Rosenbrock function in 2D.

The last test function has a global minimizer at the origin with some local minimizers in its proximity.

Example 3. Let us consider the Eggcrate function in 2D:

$$f(x, y) = x^2 + y^2 + 25\,(\sin^2 x + \sin^2 y), \qquad -2\pi \le x, y \le 2\pi. \qquad (7)$$

Fig. 4 depicts the algorithm's behavior on this problem. The right plot shows that the algorithm converges to the global minimum regardless of the initial point's position. In addition, it was observed that when the infant population size is less than 25 and the number of iterations is not more than 50, the resulting point is trapped at a local minimizer.

Figure 4. The left figure shows the Eggcrate function f(x, y) plotted on the box [−2π, 2π] × [−2π, 2π]. The right one depicts the results of running the algorithm 50 times. Black circles are the random starting points and the red dots are the resulting points.

4. Concluding Remarks

A novel metaheuristic method was presented in this report. The test problems examined were unconstrained. Experimental results are promising in general. However, further computations have to be carried out to examine the


effect of the infant population size on convergence. For constrained cases, one may choose positions of infants that satisfy the constraints; selecting a feasible solution in this case might be a challenge. Our presented algorithm is a continuous version, and a binary case might be of further interest, which is our future study direction. In the implementation, a simple line search is performed between the current position of the mother and the best one in the selected population. One may consider exact line-search methods instead if the problem is well behaved; for the general case, the Armijo rule and golden-section search are two options [4]. To update the infant population in each iteration, a simple method was employed to detect a possible decreasing direction of the objective function in the search space. Other possibilities, such as the steepest-descent direction, can be considered when the objective function is smooth.

5. REFERENCES 1. Marco Dorigo and Thomas Stützle, The ant colony optimization metaheuristic: Algorithms, applications, and

advances, Handbook of metaheuristics, Springer, 2003, pp. 250-285. 2. Karl-Heinz Esser and Uwe Schmidt, Mother-infant communication in the lesser spear nosed bat phyllostomus

discolor (chiroptera, phyllostomidae) evidence for acoustic learning, Ethology 82 (1989), no. 2, 156-168. 3. Pierre Jouventin, Thierry Aubin, and Thierry Lengagne, Finding a parent in a king penguin colony: the acoustic

system of individual recognition, Animal Behaviour 57 (1999),no. 6, 1175-1183. 4. Jorge Nocedal and SJ Wright, Numerical optimization: Springer series in operations research and financial

engineering, Springer-Verlag (2006). 5. Shi-Ryong Park and DaeSik Park, Acoustic communication of the black-tailed gull (lamscrassirostris): the structure

and behavioral context of vocalizations, Korean J Bid Sci 1(1997), 565-569. 6. J DreoAPetrowski and P Siarry E Taillard, Metaheuristics for hard optimization, 2006. 7. Xin-She Yang, Firey algorithm, Engineering Optimization (2010), 221-230. 8. Xin-She Yang, A new metaheuristic bat-inspired algorithm, Nature inspired cooperativestrategies for optimization

(NICSO 2010), Springer, 2010, pp. 65-74. 9. Xin-She Yang and Xingshi He, Bat algorithm: literature review and applications, InternationalJournal of Bio-

Inspired Computation 5 (2013), no. 3, 141-149. 10. LIU Ying, Jiang Feng, Yun Lei Jiang, and Lei WU1and Ke Ping SUN, Vocalization development of greater

horseshoe bat, rhinolophusferrumequinum (rhinolophidae, chiroptera),Folia Zool 56 (2007), no. 2, 126-136. 11. Qi Yue, John H. Casseday, and Ellen Covey, Response properties and location of neuronsselective for sinusoidal

frequency modulations in the inferior colliculus of the big brownbat, 98 (2007), no. 3, 1364-1373.


A decision tree based neural network method for prediction of poor prognosis in traumatic brain injury patients

Saeedeh Pourahmad1, S.Mahmoud Taheri2*, Iman Hafizi-Rastani1,

Hosseinali Khalili3, Shahram Paydar4

1-Biostatistics Department, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran, 2-Faculty of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran,

3-Shiraz Neuro Science Research Center, Department of Neuro Surgery, Shiraz University of Medical Sciences, Shiraz, Iran

4-Trauma Research Center, Department of Surgery, Shiraz University of Medical Sciences, Shiraz, Iran

Corresponding Author’s E-mail: [email protected]

Abstract This paper aims to predict poor prognosis in traumatic brain injury (TBI) patients based on admission findings. A neural network is mapped from an initial decision tree constructed from a part of the data. The designed network is then trained and validated on the remaining data. The 10-fold cross-validation method is applied, and the area under the ROC curve and the accuracy rate are reported. The most important attributes are determined from the trained network by two methods: Change of Mean of Squared Error (COM) and Sensitivity Analysis (SA). The results reveal a high accuracy rate (91.1%) and a significant area under the ROC curve (0.655) for the model used in this study. Furthermore, the order of importance of the attributes determined by this method is clinically acceptable. Accordingly, combining different modeling methods may introduce some complexity in computation and interpretation but improves prediction accuracy. Keywords: Poor prognosis; Traumatic brain injury patients; Neural network; Decision tree.

1. INTRODUCTION

Data mining techniques, first applied in medicine by Cremilleux and Robert in 1997 [1], have been developed to extract hidden information from large databases. These techniques include several data exploration methods such as association, classification, clustering, forecasting, sequence or path analysis, etc. [2]. Decision tree algorithms and neural network modeling methods are two well-known soft computing approaches in this context. A decision tree is a hierarchical model of decision making with inductive inference, which can be expressed in the form of nested phrases (if-then rules) [1]. A neural network model is a classification method with high predictive ability that models the complex nonlinear relations among the data. However, it is sensitive to the network's parameters, requires more calculation, and is not easy for practical users to interpret [3]. Some applications of these methods to clinical datasets include disease diagnosis [4-6], cancer diagnosis [7-9], image analysis [10], death prediction [11-12], survival analysis [13], patients' classification [14], prognosis prediction [15-16], and a combination of the two methods for cancer relapse [17].

Traumatic brain injury (TBI) is the most important cause of death in people under 45 years old globally. The patients are highly unstable during the first week after TBI. Hence, a modeling process is desired to determine the important attributes in poor prognosis. Therefore, a combination method is utilized in the present study to model this relation by decision tree and neural network approaches.


2. METHODS

2.1 Data set

All 410 TBI patients referred to the ICU of Shahid Rajaee Hospital (Shiraz, in the southern part of Iran) during 2011-12 participated in the present study. They had a Glasgow Coma Scale (GCS) score equal to or less than 10 due to brain injury caused by motor-vehicle accidents, pedestrian accidents, falls, and assault. 24 attributes affecting the Glasgow Outcome Scale (GOS) of patients are considered as input attributes, including age, gender, CT scan findings, pulse rate, respiratory rate, pupil size, reactivity, cause of injury, etc. Patients' status 6 months after the injury (unfavorable, favorable) was considered as the output variable. Unfavorable status (or poor prognosis) is defined as an inappropriate situation of TBI patients after ICU discharge from hospital. An inappropriate situation (unfavorable status) is defined as an extended GOS (GOSE) equal to or less than 4, and favorable status is defined as GOSE equal to or greater than 5 [18].

2.2 Method

Decision tree (DT) is a directed, acyclic graph in the form of a tree. Each node in a tree has either zero, or more outgoing edges. If a node has no outgoing edges, then it is called a decision node, or a leaf node; otherwise, a node is called a test node, or an attribute node. Each splitting attribute has a splitting function associated with it. The splitting function determines the outgoing edge from the test node, based on the attribute value of an object in question. The problem of DT construction is to find a DT classifier, such that the misclassification rate is minimal [19].

A tree can be restructured as a neural network [20]. Hence, the number of neurons in the hidden layers, and also the connections among the neurons and layers, can be mapped from the internal nodes of the initial tree and their outgoing edges. In this study, the idea of Sethi [20] is followed. Accordingly, two hidden layers are sufficient to model a complex relation: the first hidden (partitioning) layer and the second hidden (ANDing) layer, which reflects the AND operator of each path in the DT. Since two or more leaves of a tree that lead to the same result are combined with an OR operator over the corresponding paths, the output layer is named the ORing layer. To design the mapped network, the following steps were followed:

1. The dataset is divided into two parts: a tree design set and a network training set. 2. The DT is developed using the C4.5 algorithm. 3. The tree is mapped into a four-layer neural network structure following these rules:

a) The number of neurons: for the input layer, it equals the number of input attributes. For the first hidden layer, it equals the number of internal nodes of the initial tree; each of these neurons implements one of the decision functions of the internal nodes. For the second hidden layer, it equals the number of decision (leaf) nodes. And for the output layer, it equals the number of categories of the output attribute.

b) The network connections, which reflect the hierarchy of the tree: each input from the input layer is connected to a neuron in the first hidden layer if it appears in the decision function of the corresponding internal node of the initial tree. From the first to the second hidden layer, the root neuron is connected to all second-layer neurons, and the remaining neurons to their final decision neurons. From the second hidden layer to the output layer, neurons with the same decision are connected to their corresponding output neuron.

It is essential to mention that the above rules do not necessarily lead to an optimal number of neurons in the first hidden layer. Hence, the construction algorithm of the initial tree is important.

4. The mapped network is trained using the back-propagation feed-forward algorithm.


To evaluate the performance of this method, the area under the ROC curve (AUC) and the accuracy rate were used. Furthermore, to determine the important attributes, two methods are used for back-propagation networks: sensitivity analysis and change of MSE (mean squared error) [21] (Eqs. 1, 2 and 3).

$$S_{ik,p} = \frac{\partial o_{kp}}{\partial x_{ip}} = \sum_{m}\bigg( f'_{k}\, w_{mk}\, f'_{m} \sum_{j}\big( w_{jm}\, f'_{j}\, w_{ij} \big)\bigg), \qquad (1)$$

where $f'_j$, $f'_m$, $f'_k$ are the corresponding derivatives of the activation functions, and $w_{ij}$, $w_{jm}$, $w_{mk}$ are the weights of the first, second, and output layers, respectively;

$$S_{ik} = \sum_{p=1}^{P} S_{ik,p}, \qquad (2)$$

where $P$ is the total number of training observations;

$$MSE = \frac{\sum_{p=1}^{P}\sum_{k=1}^{K} \big(t_{kp} - o_{kp}\big)^2}{K\,P}, \qquad (3)$$

where $t_{kp}$ and $o_{kp}$ are the desired (target) output and the calculated output, respectively, of the $k$-th output neuron for the $p$-th observation, $K$ is the total number of output nodes, and $P$ is the total number of training observations. The MSE values are calculated after deleting the attributes one by one. The absolute value of the change relative to the total MSE is considered as the index for ranking the attributes.
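A minimal sketch of the change-of-MSE ranking described above, assuming a trained predictor `predict` (any fitted model) and a NumPy design matrix; replacing a deleted attribute by zeros (its neutral indicator code) is one plausible reading of "deletion", not necessarily the authors' exact choice.

```python
import numpy as np

def mse(targets, outputs):
    """Eq. (3): mean squared error over all output nodes and observations."""
    targets, outputs = np.atleast_2d(targets), np.atleast_2d(outputs)
    return np.sum((targets - outputs) ** 2) / targets.size

def change_of_mse_ranking(predict, X, T):
    """Rank input attributes by the absolute change in MSE when each one is removed."""
    base = mse(T, predict(X))
    changes = []
    for i in range(X.shape[1]):
        X_del = X.copy()
        X_del[:, i] = 0.0          # "delete" attribute i (assumption: zero it out)
        changes.append(abs(mse(T, predict(X_del)) - base))
    # Attributes with the largest change are ranked as the most important.
    return np.argsort(changes)[::-1], np.array(changes)
```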

3. RESULTS

410 patients, 85.6% male vs. 14.4% female, with a mean age of 36.1 ± 18 years, participated in the study. Qualitative attributes (with more than two categories) were transformed into indicator variables for neural network training (code 1 for the desired category; 0 otherwise). Therefore, the number of input attributes increased to 34. For the output attribute, unfavorable and favorable were coded 1 and 0, respectively.

To apply the combined method, the data were divided into two parts randomly, and 30% (arbitrary) was used to construct the initial tree. An ANN with two hidden layers was then mapped from the initial tree. The remaining data (70%) were used to train the designed network using the back-propagation feed-forward algorithm. Table 1 describes the network and its performance measures.

Table 1. Characteristics of the mapped trained network

Hidden layers (no.)                                                   2
Neurons in first hidden layer                                         9
Neurons in second hidden layer                                        13
Neurons in output layer (decision classes)                            2
Inputs (quantitative and indicator variables)                         34
Learning algorithm                                                    Back-propagation feed-forward
Activation function                                                   Tansig
Learning rate                                                         0.2
Stop condition (difference between two adjacent error components)     0.000001
Validation method                                                     10-fold
MSE (mean squared error)                                              0.89
Accuracy rate (%)                                                     91.1
Area under the ROC curve                                              0.665 (p-value < 0.01)

To determine the important attributes from the trained network, the SA and change-of-MSE methods were used (Table 2 and Fig. 1). The first ten (arbitrarily chosen) most important attributes (the ones with the largest absolute change in MSE) were Pupil3, Sat,


AgeCat2, PupilS1, LSFX1, Pneumocephalus1, ICH1, PCO2, BE, PO2L60 (Fig. 1). Based on the sensitivity analysis, the first ten important attributes for the unfavorable decision class were RR, BE, Pneumocephalus1, Shift1, SBP, BENum, BSFX1, AgeCat3, PH, PCO2, while those for the favorable decision class were Pupil1, AgeCat3, DSFX1, EDH1, Ambient1, Sat, Sex1, GCS1h, Pupil2, PupilS3. The order of attributes and the prediction accuracy rate of the combined method were preferred by the clinicians.

Figure 1. The absolute values of change in MSE by elimination of an attribute

Table 2.Sensitivity values of the input attributes in two decision classes

Attributes' name Sensitivity values in unfavorable class (coded 1)

Sensitivity values in favorable class (coded 0)

Sex Age (<40) Age (40-59) Age (>=60) State Systolic Blood Pressure Pulse rate GCS1hour PupilReaction1 PupilReaction2 PupilReaction3 PupilSize1 PupilSize2 PupilSize3 Subarachnoid Hemorrhage Intraventricular Hemorrhage Linear skull fractures Basilar skull fracture Pneumocephalus Epidural Hemorrhage Subdural Hemorrhage Intracerebral Hemorrhage Depressing Skull Fracture Shift Ambient PO2 PCO2 PH Base Excess Arterial oxygen saturation Number of times with PO2 less than 60 during 72 hours after injury Number of times with PCO2 more than 40 during 72 hours after injury Number of times with BE less than -4 or more than -8 during 72 hours after injury

0.0072 0.0074 0.0095 0.0149 0.0019 0.0196 0.0110 0.0316 0.0027 0.0049 0.0026 0.0019 0.0042 0.0036 0.0052 0.0050 0.0029 0.0056 0.0160 0.0295 0.0059 0.0015 0.0029 0.0105 0.0233 0.0031 0.0078 0.0125 0.0134 0.0306 0.0042 0.0065 0.0058 0.0167

0.0323 0.0110 0.0063 0.0430 0.0088 0.0051 0.0136 0.0107 0.0290 0.0817 0.0289 0.0200 0.0118 0.0080 0.0283 0.0256 0.0203 0.0245 0.0021 0.0214 0.0394 0.0133 0.0125 0.0422 0.0222 0.0380 0.0227 0.0112 0.0073 0.0199 0.0362 0.0070 0.0101 0.0075



4. DISCUSSION

Although several studies have worked on prognostic models for TBI, there are differences in methodology, objectives, and study design compared to the present study. The features modeled in those studies include the trauma and injury severity score [22], surgical complications [23], and mortality [18]. In addition, input attributes had been collected from the patients' findings at the scene, at admission, or during hospitalization in the ICU. Further, most of the cited studies applied classical methods, such as univariate and multivariate logistic regression models, in their analysis [23]. However, soft computing approaches, including a combination of neural networks and DT with a different methodology [17] and DT classification [18], have been used in some studies.

The flexibility of soft computing approaches to real-data circumstances is well known. They do not depend on theoretical assumptions, and they can learn and model complex relations among the data. Nevertheless, soft computing methods need large datasets, are complex in design, are difficult to interpret (e.g., neural networks), or may be insensitive to confounders or noise signals (e.g., DT). Combining these methods may remedy their defects and enhance their performance.

The results of the combined method in the present study were clinically acceptable.

This study had some limitations in the data-gathering phase, including missing data in hospital records and in the patients' follow-up 6 months after the injury. In the data analysis phase, complex and professional computing was needed, especially for mapping and designing the network from the initial tree. Since in each run of the computer program a different training and testing dataset may be randomly chosen, several runs were made in each modeling stage to achieve a reasonable error rate. Finally, because the results of a neural network are difficult for clinical experts to interpret, an additional step can be added to the modeling process in future studies, in which the validated network is re-transformed into a DT.

5. ACKNOWLEDGMENT

This work was supported by grant number 91-6166 from the Shiraz University of Medical Sciences Research Council. This article was extracted from Iman Hafizi's Master of Science Thesis. The authors are thankful to the Trauma Research Center personnel for their help in data gathering.

6. REFERENCES

1. Witten, I.H. and Frank, E. (2005)."Data Mining: Practical Machine Learning Tools and Techniques", Morgan Kaufmann.

2. Seifert, J.W. (2007) "Data Mining and Homeland Security: An Overview", DTIC Document. 3. Bellazzi, R. and Zupan, B. (2008)."Predictive data mining in clinical medicine: current issues and guidelines,"

International Journal of Medical Informatics, 77, pp. 81-97. 4. Sakai, S., Kobayashi, K., Toyabe, S.I., Mandai, N., Kanda, T. and Akazawa, K . (2007). "Comparison of the

levels of accuracy of an artificial neural network model and a logistic regression model for the diagnosis of acute appendicitis," Journal of Medical Systems, 3,pp.1357-1364.

5. Heydari, S.T., Ayatollahi. S.M.T. and Zare, N. (2012) "Comparison of artificial neural networks with logistic regression for detection of obesity," Journal of Medical Systems, 36, pp.2449-2454.

6. Pearl, A. and Bar-Or, D. (Editors) (2009). "Using artificial neural networks to predict potential complications during trauma patients' hospitalization period. Medical Informatics in a United and Healthy Europe", IOS Press, 610-614.


7. Veltri, R.W., Chaudhari, M., Miller, M.C., Poolem, E.C., O’Dowd, G.J. and Partin, A.W. (2002)."Comparison of logistic regression and neural net modeling for prediction of prostate cancer pathologic stage", Clinical Chemistry, 48, pp.1828-1834.

8. McLaren, C.E., Chen, W.P., Nie, K. and Su, M.Y. (2009)."Prediction of malignant breast lesions from MRI features: a comparison of artificial neural network and logistic regression techniques", Academic Radiology, 16, pp.842-851.

9. Pourahmad, S., Azad, M. and Paydar, S. (2015)."Diagnosis of malignancy in thyroid tumors by multi-layer perceptron neural networks with different batch learning algorithms", Global Journal of Health Science, 7, pp.46-54.

10. Jiang, J., Trundle, P. and Ren, J. (2010)."Medical image analysis with artificial neural networks", Computerized Medical Imaging and Graphics, 34, pp.617-631.

11. Eftekhar, B., Mohammad, K., Ardebili, H.E., Ghodsi, M. and Ketabchi, E. (2005)."Comparison of artificial neural network and logistic regression models for prediction of mortality in head trauma based on initial clinical data", BMC Medical Informatics and Decision Making, 5, pp. 1-8.

12. Shi, H.Y., Lee, K.T., Lee, H.H., Ho, W.H., Sun, D.P. and Wang, J.J. (2012)."Comparison of artificial neural network and logistic regression models for predicting in-hospital mortality after primary liver cancer surgery", PloS one, 7, pp. 1-6.

13. DiRusso, S.M., Chahine, A.A., Sullivan, T., Risucci, D., Nealon, P. and Cuff, S. (2002)."Development of a model for prediction of survival in pediatric trauma patients: comparison of artificial neural networks and logistic regression", Journal of Pediatric Surgery, 37, pp. 1098-1104.

14. Kuo, W.J., Chang, R.F., Chen, D.R. and Lee, C.C. (2001)."Data mining with decision trees for diagnosis of breast tumor in medical ultrasonic images", Breast Cancer Research and Treatment, 66, pp. 51-57.

15. Soni, J., Ansari, U., Sharma, D. and Soni, S. (2011)."Predictive data mining for medical diagnosis: An overview of heart disease prediction", International Journal of Computer Applications, 17, pp. 43-48.

16. Bejarano, B., Bianco, M., Gonzalez-Moron, D., Sepulcre, J., Goñi, J., Arcocha, J. (2011)."Computational classifiers for predicting the short-term course of multiple sclerosis", BMC Neurology, 11(67), pp. 1-9.

17. Jerez-Aragonés J.M., Gómez-Ruiz J.A., Ramos-Jiménez, G., Muñoz-Pérez, J. and Alba-Conejo E. (2003)."A combined neural network and decision trees model for prognosis of breast cancer relapse", Artificial Intelligence in Medicine, 27, pp. 45-63.

18. Oh, H. and Seo, W. (2009)."Functional and cognitive recovery of patients with traumatic brain injury prediction tree model versus general model", Critical Care Nurse, 29, pp. 12-22.

19. Kuo, W.J., Chang, R.F., Chen, D.R. and Lee, C.C. (2001)."Data mining with decision trees for diagnosis of breast tumor in medical ultrasonic images", Breast Cancer Research and Treatment, 66, pp. 51-57.

20. Sethi, I.K. (1990)."Entropy nets: From decision trees to neural networks", Proceedings of the IEEE, 78, pp. 1605-1613.

21. Sung, A. (1998)."Ranking importance of input parameters of neural networks", Expert Systems with Applications, 15, pp. 405-411.

22. Lesko, M.M., Jenks, T., O'Brien, S.J., Childs, C., Bouamra, O. and Woodford, M. (2013)."Comparing model performance for survival prediction using total Glasgow Coma Scale and its components in traumatic brain injury", Journal of Neurotrauma, 30, pp. 17-22.

23. Shibuya, T.Y., Karam, A.M., Doerr, T., Stachler, R.J., Zormeier, M. and Mathog, R.H. (2007)."Facial fracture repair in the traumatic brain injury patient", Journal of Oral and Maxillofacial Surgery, 65, pp. 1693-1699.


An Algorithm for Image Watermarking Embedding and Blind Detection in the Domain of Wavelet Transform

Esmaeil Najafi

Department of Mathematics, Faculty of Science, Urmia University, Urmia, Iran [email protected]

Abstract

In this paper a new algorithm is proposed for the embedding and detection of a watermark in the domain of the wavelet transform. The embedding process is informed and the algorithm uses blind detection for both host image and watermark. The experimental results have shown that the proposed approach has the desired properties such as invisibility, capacity, reliable detection, and robustness against compression, noise, scaling and rotation attacks.

Keywords: Wavelet transform, Image processing, Image watermarking, Signal detection and filtering.

1. INTRODUCTION

With the fast growth of the Internet, people have paid more and more attention to the security of network information. The protection of digital data is an important topic for the owners of multimedia products. Digital watermarking embeds hidden data into multimedia in such a manner that it cannot easily be removed, and its detection verifies the ownership of digital products.

In order to be used as a means of copyright protection, the two main requirements of high robustness and capacity and high imperceptibility (low visibility) should be ensured for watermarking. The peak signal-to-noise ratio is a criterion used to evaluate imperceptibility. There exists an inverse relation between capacity, robustness, and imperceptibility, and achieving a good compromise between these conflicting parameters is the core motivation of most watermarking schemes.

Watermarking procedures that first transform an image into a set of frequency-domain coefficients are frequency-domain techniques [1]. These techniques include the discrete cosine transform (DCT), discrete Fourier transform (DFT), discrete wavelet transform (DWT), etc. In these techniques the watermark is embedded in the transform coefficients of the image; the coefficients are then inverse-transformed to form the watermarked image. In this way the watermark is less visible and more robust to some image processing operations and attacks. The watermark extraction process may be dependent on or independent of the original image or the embedded watermark; based on the level of required information, schemes are classified as non-blind, semi-blind, or blind detection processes. These classes are application-dependent and affect the capacity of the watermarking scheme.

A new robust image watermarking scheme based on the wavelet transform is investigated in this study. In the procedure, a binary watermark is embedded in all detail sub-bands of the three-level decomposition of the image. In the


extraction of the watermark there is no need for any information about the host image or the embedded watermark, and hence the detection process is completely blind.

2. DISCRETE WAVELET TRANSFORM OF IMAGES

A function (signal) $f(x)$ can be decomposed using the discrete wavelet transform (DWT) into a weighted sum of basis functions $\varphi_{j_0,k}(x)$ and $\psi_{j,k}(x)$ (scaling functions and wavelets):

$$f(x) = \frac{1}{\sqrt{M}}\sum_{k} W_\varphi(j_0, k)\, \varphi_{j_0,k}(x) + \frac{1}{\sqrt{M}}\sum_{j \ge j_0}\sum_{k} W_\psi(j, k)\, \psi_{j,k}(x), \qquad j, k \in \mathbb{Z},$$

where $j_0$ is the starting scale, $M$ is the length of the signal, and $W_\varphi(j_0,k)$ and $W_\psi(j,k)$ are the approximation and detail coefficients, respectively.

For two-dimensional signals (such as an image), the wavelets and scaling function are tensor products of the one-dimensional ones:

$$\varphi(x,y) = \varphi(x)\varphi(y), \quad \psi^{H}(x,y) = \psi(x)\varphi(y), \quad \psi^{V}(x,y) = \varphi(x)\psi(y), \quad \psi^{D}(x,y) = \psi(x)\psi(y),$$

and the decomposition of a signal $f(x,y)$ of size $M \times N$ becomes

$$f(x,y) = \frac{1}{\sqrt{MN}}\sum_{m}\sum_{n} W_\varphi(j_0, m, n)\, \varphi_{j_0,m,n}(x,y) + \frac{1}{\sqrt{MN}}\sum_{i\in\{H,V,D\}}\sum_{j \ge j_0}\sum_{m}\sum_{n} W_\psi^{i}(j, m, n)\, \psi^{i}_{j,m,n}(x,y), \qquad j, m, n \in \mathbb{Z}.$$

With this decomposition the 2D signal is filtered to produce the coefficients of the four sub-bands $W_\varphi(j,\cdot,\cdot)$, $W_\psi^{H}(j,\cdot,\cdot)$, $W_\psi^{V}(j,\cdot,\cdot)$ and $W_\psi^{D}(j,\cdot,\cdot)$, where $W_\varphi(j+1,\cdot,\cdot)$ holds the coefficients of the input image. This procedure is a one-level analysis filter bank, and the image is divided into four sub-bands LL, LH, HL and HH. LH (Low-High) is generated by the approximation coefficients in the X direction and the detail coefficients in the Y direction (see Figure 1).
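To make the one-level analysis and synthesis filter banks concrete, the following sketch (not part of the paper) performs a one-level 2D Haar decomposition into the LL, LH, HL and HH sub-bands and then reconstructs the image exactly; real schemes typically use longer filters and a library routine.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar analysis: returns the LL, LH, HL, HH sub-bands."""
    a = img.astype(float)
    # Filter along rows (X direction): low-pass L and high-pass H halves.
    L = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    H = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    # Filter along columns (Y direction).
    LL = (L[0::2, :] + L[1::2, :]) / np.sqrt(2)
    LH = (L[0::2, :] - L[1::2, :]) / np.sqrt(2)
    HL = (H[0::2, :] + H[1::2, :]) / np.sqrt(2)
    HH = (H[0::2, :] - H[1::2, :]) / np.sqrt(2)
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """One-level 2D Haar synthesis: perfect reconstruction of the image."""
    rows, cols = LL.shape
    L = np.zeros((2 * rows, cols)); H = np.zeros((2 * rows, cols))
    L[0::2, :] = (LL + LH) / np.sqrt(2); L[1::2, :] = (LL - LH) / np.sqrt(2)
    H[0::2, :] = (HL + HH) / np.sqrt(2); H[1::2, :] = (HL - HH) / np.sqrt(2)
    img = np.zeros((2 * rows, 2 * cols))
    img[:, 0::2] = (L + H) / np.sqrt(2); img[:, 1::2] = (L - H) / np.sqrt(2)
    return img

img = np.random.rand(8, 8)
assert np.allclose(haar_idwt2(*haar_dwt2(img)), img)  # perfect reconstruction
```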

Figure 1: Image decomposition with DWT: Subbands of 2 level decomposition


Figure 2: Image decomposition with DWT : 1 level analysis and synthesis filter bank

After decomposition of the image, the detail and approximation coefficients are used by the inverse DWT to recompose the original image. This process is the synthesis filter bank, and the filter coefficients (lowpass and highpass filters) are chosen such that the reconstruction is perfect [2, 3, 4]. Figure 2 illustrates the wavelet and inverse wavelet transform steps. The basic approach of wavelets in many applications, and their usefulness in image processing, is demonstrated in three steps: computing a 2-D DWT of an image, altering the coefficients of the transform, and finally computing the inverse transform to reconstruct the modified image. DWT properties can be utilised to enhance robustness and preserve imperceptibility in watermarking. Accordingly, this technique is adopted by the scheme proposed in this paper.

3. THE PROPOSED SCHEME

The proposed scheme is presented in this section. It includes watermark embedding and extraction processes, which are given as Algorithms 1 and 2. The watermark $W = \{w_1, w_2, \dots, w_n\}$ to be embedded is a binary watermark with values $w_i \in \{0, 1\}$ for $i = 1, \dots, n$. We also write $C(m, i)$ for the $i$-th (scanned) coefficient of the sub-band $m \in \{LH3, HL3, HH3\}$.

3.1 The watermark embedding procedure

The human eye is able to detect modifications to the lower frequencies, so it is better to embed a watermark into an image by modifying large detail coefficients of its multiresolution representation [5, 6]. Detail coefficients belong to the edges and borders of the image, where the frequency is high, and embedding the watermark in these locations is robust against the human visual system. The original image is decomposed into three levels. The sub-bands LH3, HH3 and HL3 in level three are selected for embedding the watermark, since these sub-bands cover a wide range of the frequency spectrum of the image and hence the robustness of the watermarking is increased. The steps of the embedding process are given in the following Algorithm 1:

Step 1: Decompose the original image A using a 3-level DWT into the sub-bands LL3, LH3, HL3 and HH3.

Step 2: Embed the watermark by the following scheme and obtain the new modified coefficients:

For $m \in \{LH3, HL3, HH3\}$ and $i = 1, \dots, n$ do:

If $w_i = 1$ and $C(m, i) < C(m, i+1)$, then swap$\big(C(m, i), C(m, i+1)\big)$;

else if $w_i = 0$ and $C(m, i) > C(m, i+1)$, then swap$\big(C(m, i), C(m, i+1)\big)$.

Step 3: The watermarked image is then obtained by applying the inverse DWT to the modified coefficients LH3, HH3 and HL3.
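A sketch of the coefficient-swapping rule of Algorithm 1, applied to one sub-band whose coefficients have been flattened into a vector (the scan order and the helper names are assumptions, not from the paper):

```python
import numpy as np

def embed_in_subband(coeffs, watermark):
    """Embed a binary watermark into a flattened sub-band by pairwise coefficient swapping.

    Assumes coeffs has at least len(watermark) + 1 entries in the chosen scan order.
    """
    c = coeffs.astype(float).ravel().copy()
    for i, w in enumerate(watermark):           # bit i uses the coefficient pair (i, i+1)
        if w == 1 and c[i] < c[i + 1]:
            c[i], c[i + 1] = c[i + 1], c[i]     # enforce c[i] >= c[i+1] for bit 1
        elif w == 0 and c[i] > c[i + 1]:
            c[i], c[i + 1] = c[i + 1], c[i]     # enforce c[i] <= c[i+1] for bit 0
    return c.reshape(coeffs.shape)
```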

3.2 The watermark extracting procedure

For the detection of the watermark, the inverse of the embedding is implemented. No information about the host image or the embedded watermark is required during detection, so the extraction procedure is blind. The watermarked image may be distorted by geometrical attacks and non-geometrical (image processing) attacks. After decomposing the watermarked image to the same levels as in the embedding, the following Algorithm 2 is applied to the sub-bands LH3, HL3 and HH3:

Step 1: Decompose the watermarked image using a 3-level DWT into the sub-bands LL3, LH3, HL3 and HH3.

Step 2: Extract the watermark $W' = \{w'_1, \dots, w'_n\}$ by the following process:

For $m \in \{LH3, HL3, HH3\}$ and $i = 1, \dots, n$ do:

If $C(m, i) < C(m, i+1)$, then $w'_i = 0$; else $w'_i = 1$.

Step 3: The extracted watermark is $W' = \{w'_1, w'_2, \dots, w'_n\}$.
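The corresponding extraction rule as a sketch; combining the decisions from the three detail sub-bands by majority vote is an assumption the paper does not spell out, and the helper names are illustrative.

```python
import numpy as np

def extract_from_subband(coeffs, n):
    """Recover n watermark bits from a flattened sub-band by comparing coefficient pairs."""
    c = coeffs.astype(float).ravel()
    return np.array([0 if c[i] < c[i + 1] else 1 for i in range(n)])

def extract_watermark(subbands, n):
    """Majority vote over the bits extracted from each of the three detail sub-bands."""
    votes = np.stack([extract_from_subband(sb, n) for sb in subbands])
    return (votes.sum(axis=0) >= 2).astype(int)
```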

4. EXPERIMENTAL RESULTS

The proposed procedure is simulated in MATLAB. The popular test images Peppers, Cameraman and Baboon with size 512x512 were used as host images, and the watermark is a binary string. The efficiency of the proposed scheme under different conditions is examined in terms of imperceptibility and robustness against various attacks. The most widely used criteria are the peak signal-to-noise ratio (PSNR) and the normalized correlation coefficient (NCC), which are used repeatedly in the literature. The PSNR, which is used to estimate imperceptibility, is a criterion evaluating the similarity between the host image $A$ and the watermarked image $A^*$ by the following relation:

$$PSNR = 10\log_{10}\!\left(\frac{\max_{i,j} A(i,j)^2}{MSE}\right), \qquad MSE = \frac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\big(A(i,j) - A^*(i,j)\big)^2,$$

where $M, N$ are the dimensions of the image and $MSE$ is the mean square error between the two images. The NCC is a criterion that measures robustness by evaluating the similarity between the original watermark and the extracted watermark and takes a value between 0 and 1:

$$NCC = \frac{\sum_{i=1}^{n} w_i\, w'_i}{\sqrt{\sum_{i=1}^{n} w_i^2}\ \sqrt{\sum_{i=1}^{n} (w'_i)^2}}.$$

4.1 THE IMPERCEPTIBILITY TEST OF THE SCHEME

The imperceptibility of the test images with various lenght of the watermark is examined. Obviously there is a reverse relation between the lenght of the watermark and PSNR, hence an optimal length of watermark must be selected such that an acceptable imperceptibility of the watermarked image to be obtained. These results are observed in Table 1. According to the results choosing = 6000 as the length of the watermark is ideal, since the minimum acceptable value of PSNR is about 38 dB [7]. Figure 2 displays the host and watermarked images which the length of the embedded watermark is = 6000.

Table 1: Imperceptibility values using PSNR (dB) with various watermark length for test images.

Length

Image = 2000 = 4000 = 6000 = 8000 = 10000

Peppers 42.9916 38.7867 37.5629 36.9227 35.2053

Cameraman 45.3508 39.2524 38.2870 36.6441 36.0325

Baboon 40.9367 38.8082 37.7057 37.4336 35.9913

Figure 2: Original (up) and watermarked (down) images with = .

4.2 THE REBUSTNESS TEST OF THE PROCEDURE

1389 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

The robustness is the resistance of the embedded watermark against distortions of the watermarked image. The distortions may be geometrical or non-geometrical attacks. Tables 2 - 4 represent the NCC values of the extracted watermark of the scheme when is impressed by different attack types. Geometrical attacks such as scaling and rotation were applied. Noise addition (salt and pepper, Gaussian) and jpeg compression attacks were selected as non-geometrical attacks (image processing attacks). The proposed scheme showed agreeable resistance under all attacks.

Table 2: Scaling and rotation attacks.

Image

Attack Peppers

Camera-

man Baboon

Image

Attack Peppers

Camera-

man Baboon

Scaling (0.5,2) 0.9741 0.9723 0.9755 Rotation 2∘ 0.8719 0.8556 0.8901

Scaling (0.25,4) 0.8663 0.8643 0.8796 Rotation 45∘ 0.8681 0.8584 0.8959

Scaling (0.125,8) 0.6606 0.6527 0.6916 Rotation 70∘ 0.8701 0.8668 0.8981

Scaling (2,0.5) 0.9793 0.9785 0.9849 Rotation 110∘ 0.8700 0.8562 0.8926

Scaling (4,0.25) 0.9179 0.9185 0.9292 Rotation −50∘ 0.8678 0.8521 0.8908

Scaling (8,0.125) 0.6490 0.6707 0.7134 Rotation −80∘ 0.8707 0.8522 0.8950

Table 3: Salt & pepper noise and Gaussian noise attacks.

Image

Attack Peppers

Camera-

man Baboon − Peppers

Camera-

man Baboon

Salt & pepper noise 0.001 0.9435 0.9158 0.9770 Gaussian noise 0.001 0.8762 0.8307 0.9509

Salt & pepper noise 0.005 0.8713 0.8259 0.9403 Gaussian noise 0.005 0.7917 0.7619 0.8928

Salt & pepper noise 0.01 0.8312 0.7761 0.9149 Gaussian noise 0.01 0.7446 0.7191 0.8580

Salt & pepper noise 0.1 0.6777 0.6690 0.7766 Gaussian noise 0.1 0.6388 0.6229 0.6850

Salt & pepper noise 0.3 0.6196 0.6050 0.6750 Gaussian noise 0.3 0.5708 0.5785 0.6257

Salt & pepper noise 0.5 0.5700 0.5617 0.5949 Gaussian noise 0.5 0.5553 0.5522 0.5954

Table 4: Jpeg compression attacks.

Image

Attack Peppers Cameraman Baboon

Jpeg compression 90 0.9755 0.9553 0.9946

Jpeg compression 70 0.9561 0.9222 0.9845

1390 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

Jpeg compression 50 0.9253 0.8874 0.9706

Jpeg compression 30 0.8943 0.8317 0.9523

Jpeg compression 20 0.8411 0.7910 0.9338

Jpeg compression 10 0.7552 0.7234 0.8703

5. CONCLUSION

In this study, a new rebust image watermarking based on discrete wavelet transform (DWT) is proposed. This scheme uses DWT properties to achieve the watermarking requiremments. These properties are the edge detection and perfect reconstruction of the DWT. In addition to the blind detection the proposed scheme is more appropriate for watermarking applications such as copyright pprotection. The scheme is resistant against geometrical and nongeometrical attacks with a good capacity of the embedding of the watermark.

6. REFERENCES

1. Shih, F. Y. (2010), “Image Processing and Patern Recognition” . John Wiley & Sons, Inc., Hoboken, New Jersey.

2. Gonzalez , R . C . Woods , R . E . ( 2008), “ Digital image processing” , Third edition , Pearson International Edition .

3. Daubechies , I . (1992) “ Ten lectures on wavelets” , Philadelphia , Pennsylvania .

4. Goswami , J. C. Chan, A. K. (1999), “Fundamentals of wavelets, theory, algorithms and applications”, Wiley-Interscience Publication.

5. Bounkong, S. Toch, B Saad. D. and Low, D. (2003), “ICA for Watermarking Digital Images”, Journal of Machine Learning Research, 4, pp. 1471–1498.

6. Schyndel, R. G. V. Tirkel A. Z. and Osborne C. F. (1994), “A digital watermark” , IEEE Proceedings of International Conference on Image Processing, 2, pp. 86–90.

7. Lee, Y. P. Lee, J. C. Chen, W. K. Chang, K. C. Su, I. J. Chang, C. P. (2012), “High-payload image hiding with quality recovery using tri-way pixel-value differencing” , Inf. Sci. 191 pp. 214-225.

1391 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

Quasilinearization Numerical Scheme for Solving a Class of Weakly Singular Volterra Integral Equations

Esmaeil Najafi

Department of Mathematics, Faculty of Science, Urmia University, Urmia, Iran [email protected]

Abstract

In this paper we propose a fast quasilinearization numerical scheme, coupled with collocation method, for solving nonlinear weakly singular Volterra integral equations. The kernel of the eqaution satisfies some monotonicity conditions. The mothod provides quadratic, uniform and monotone convergence of the quasilinearization method (QLM) for solving nonlinear problems. The convergence of QLM and its rate are examined numerically, on some numerical test examples.

Keywords: Volterra integral equations, Collocation method, Gauss-Kronrod quadrature.

1. INTRODUCTION AND PRELIMINARIES

The method of quasilinearization is an iterative procedure for solving a large variety of nonlinear problems including ordinary and partial differential equations [1, 2, 3, 4], integral and integro-differential equations [5] is introduced by Bellman and Kalaba [6] and later generalized by Lakshmikantham [7, 8]. Consider the nonlinear weakly singular Volterra integral equation

.))(,,()(=)(0

dssustktytut

∫+ (1)

with k a weakly singular function. Applying iterative processes to solve this equation, when ),,( ustk is nondecreasing

with respect to uand satisfies a Lipschitz condition, the successive approximations

,)))()())((,,())(,,(()(=)( 1110dssususustksustktytu pppup

t

p −−− −++ ∫ (2)

for L1,2,=p , yields a monotonic sequence which is uniformly convergent to the solution of Eq. (1). This scheme is linear and under nondecreasing monotonicity and convexity conditions on ),,( ustk is quadratically convergent to the solution of Eq. (1). The purpose of this paper is to employ numerical methods to approximate the solution of the linear integral equations (2) in a piecewise continuous polynomials space and then generating a sequence of approximation solutions where converge to the solution of the nonlinear integral equation (1), under some conditions on ),,( ustk as mentioned above. For numerical integration and quadrature formulaes we employ Gauss-Kronrod quadrature rule, an extension of the Gauss quadrature rules using Stieltjes polynomials based by Kronrod [9, 10, 11].

1392 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

2. METHOD OF QUASILINEARIZATION (QLM)

For R∈T and 0>T let ][0,= TJ and :),(= tsJJstD ≤×∈ , consider

,))(,,()(=)(0

dssustktytut

∫+ (3)

where ],[ RJCy∈ .

To solve Eq. (1) we use the following iterative scheme

,)))()())((,,())(,,(()(=)( 1110dssususustksustktytu pppup

t

p −−− −++ ∫ (4)

for 1,2,...=p where )(0 tu is the lower solution of Eq. (1). A function ],[ RJCv∈ is called a lower solution of Eq. (1)

on J if

, ,))(,,()()(0

Jtdssvstktytvt

∈+≤ ∫

and ],[ RJCw∈ an upper solution if the reversed inequality satisfied. Let ),()( ; ),,(= JttwutvDust ∈≤≤×∈Ω R .

Theorem 1 Assume that

)( 1H ],[0 RJCv ∈ is lower solution of Eq. (1) on J .

)( 2H 0),,( 0,),,( ≥≥ ustkustk uuu for Ω∈),,( ust .

Then the iterative scheme (4) defines a nondecreasing sequence )( tvp in ],[ RJC such that uvp → uniformly on

J , and the following quadratic convergent estimate holds:

. , 102

1 uvvvvuAvu ppp ≤≤≤≤−≤− − L PP PP

3. STEP-BY-STEP COLLOCATION METHOD We set the partition =<<<=0 10 TNτττ L on J and put 1= −− nnnh ττ with max= nn hh and indicate the

above partition by hJ . The first-kind Chebyshev polynomials are defined by the relation

,0,1,2,= ),(cos= ),(cos=)( KmzmzTm θθ

and the zeros of these polynomials in an increasing arrangement are as

.,1,= ),2

12(cos=1 mkm

kz km Kπ−

+−

1393 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

We define the mapping ],[1,1][: 1 nnn ττδ −− a with the relation

.22

=)( 11 −− ++

− nnnnn zz ττττ

δ

Now, the linear Volterra integral equation (4) may be shown in the form of

,1,2,= ,)(),(),()(=)(00

KpdssvstkdsstHtytv pp

t

p

t

p ∫∫ ++ (5)

where

.),( )),(,,(=),( ),())(,,())(,,(=),( 1111 DstsvstkstksvsvstksvstkstH pupppupp ∈− −−−− (6)

Suppose that hJ is a given partition on J . The piecewise polynomials space )(1)(1 hm JS −

− is defined by

,;1 |:],[)(=)( 11)(1 NnqJCtqJS mn

dhm ≤≤∈∈ −

−− πσR

where ],(= 1 nnn ττσ − and

µπ denotes the space of the polynomials of degree at most µ . With this definition we select

the Lagrange polynomials as a basis for 1−mπ on the subinterval nσ and approximate the solution of the integral

equation (5) in the piecewise polynomials space )(1)(1 hm JS −

− using collocation method. Then by letting )(=, knkn zt δ and

,1,= ,,1,=:= , mkNntY knh KK , the collocation points, this collocation solution )()(ˆ )(1 h

dmp JStv −∈ in the

subinterval nσ may be written as

.,1,= ,=)(

,,1,= 1,1],( ),(ˆ=)(ˆ=))((ˆ=)(ˆ

1=

,1=

,

mkzzzz

zL

NnzzLVzvzvtv

jk

jm

kjj

k

kpkn

m

knpnpp

K

K

−∈

δ

(7)

for ,1,2,= Kp where )(ˆ=ˆ,, knp

Pkn tvV . For it σ∈ , Eq. (5) may be written as

dssvstkdsstHdssvstkstHtytv pp

t

ip

t

ippp

j

j

i

jp )(),(),())(),(),(()(=)(

111

1

=1∫∫∫∑

−−−

++++ττ

τ

τ (8)

where ))((=)(, zvzv ipip δ . The collocation equation is defined by replacing the exact solution )(tvp by the collocation

solution )(ˆ tvp in Eq. (5) on the collocation points hY . Then using Eq. (??) for

hki Yt ∈,, the collocation equation has the

form

))())(,(ˆ))(,((2

)(=)(ˆ=ˆ,

1

1,1=

,

1

1

1

1=,,, dzzLztkVdsztH

htytvV qjkip

pqj

m

qjkip

ji

jkikip

pki δδ ∫∑∫∑ −−

++

1394 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

,)())(,(ˆ2

))(,(2 ,1,

1=,1

dzzLztkVhdzztHhqikip

kzpqi

m

q

iikip

kzi δδ ∫∑∫ −−++ (9)

for ,,1,= mk K where we have used Eq. (7) and by defining suitable vectors and matrices Eq. (9) is transformed to the linear system

.,1,= ),ˆ(22

=ˆ)2

( ,,1

1=NiVBH

hHhyVBhI p

jjp

ijp

ij

i

j

pi

ii

pi

pi

im K+++− ∑

− (10)

4. FULLY DISCRETIZING USING GAUSS-KRONROD QUADRATURE RULE The basic reasoning behind the extension of quadrature formulae is as follows. Let an n-point quadrature rule be augmented by the addition of p abscissae and let )(xG pn+

be the polynomial whose roots are the pn+ abscissae of the

new quadrature formula. A general polynomial of degree 12 −+ pn can be expressed as

,)()(=)(1

0=112

kk

p

kpnpnpn xcxGxQxF ∑

+−+−+ + (11)

where )(1 xQ pn −+ is a general polynomial of degree 1−+ pn . )(1 xQ pn −+

can always be exactly integrated by a )( pn+ -

point formula and if )(xG Pn+ is such that

1,0,...,= 0,=)(1

1−+−∫ pkdxxxG k

pn

then all of Eq. (11) can be exactly integrated by an )( pn+ -point quadrature formula. Thus it should be possible to derive formulae having pn + abscissae and of degree 12 −+ pn . Let )( xpn be the Legendre polynomials.

Stieltjes[12] considers the Legendre polynomials of the second kind

...,)(=)(

1, ,)(=)( 221

1

1

1+++

− +−∫ za

zazE

zqdx

xzxpzq n

n

nn

and obtains the remarkable property

1,0,...,= 0,=)()( 1

1

1−+−∫ pkdxxxExp k

nn

where means )()(=)( 1 xExpxG nnpn ++ with 1= +np . In addition to the orthogonality of the Stieltjes polynomials )(1 xEn+

with respect to the oscillatory weight factor )( xpn , Szego [13] showed that the zeros of )(1 xEn+ are real, simple, and

located in the interior of 1,1][− . They are separated by the zeros of )(xpn . Designate the zeros of )( xpn by nyy ,...,1

and those of )(1 xE n + by 11 ,..., +nxx . Then the quadrature rule

),()()(1

1=1=

1

1 kk

n

kkk

n

kxfbyfadttf ∑∑∫

+

−+≈

will be exact for 13 +∈ nf π . Tables of abscissas and weights and details of computation can be found in [9]. According

to this numerical quadrature we define fully discretized matrices and refresh the linear system (10) for them

1395 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

,,1,= ),~~~(2

~2

=~)~2

( ,,1

1=NiVBH

hHhyVBhI p

jjp

ijp

ij

i

j

pi

ii

pi

pi

im K+++− ∑

− (12)

where Tpmi

pi

pi VVV ]~,,~[=~

,,1 K . Then we define the fully discretized solution )()(~ 1)(1 hm JStv −

−∈ on iσ by the relation

1,1].( ),(~=)(~=))((~=)(~,

1=, −∈∑ zzLVzvzvtv k

pki

m

kipipp δ

and the linear system (12) has the unique solution

.,1,= ))~~~(2

~2

()~2

(=~ ,,1

1=

1 NiVBHh

HhyBhIV pj

jpi

jpi

ji

j

pi

ii

pi

im

pi K+++− ∑

−− (13)

5.NUMERICAL RESULTS

We have considered the numerical solution of Eq. (1) with some choices of related functions in following:

Example 1 The first example is the following equation [14] with exact solution 3=)( ttu and lower solution 0=)(0 tv .

The numerical results are represented in Table 1.

1.0 ,)()(64354096=)( 21/2

0

8.53 ≤≤−+− −∫ tdssusttstttut

Example 2 ([14]), Exact solution ttu =)( and lower solution 10

=)(0ttv :

.4150 ,

)()()16(15

15=)( 1/2

4

0

2 ≤≤−

+− ∫ tdsstsutttu

t

Example 3 ([15]), Exact solution ttu =)( and lower solution 0=)(0 tv :

1.0 ,)()()

34(1=)( 1/2

2

0≤≤

−−+ ∫ tds

stsutttu

t

Table 1: RMS errors of Examples 1, 2 and 3.

p Example 1

( 7=16,= mN )

Example 2

( 7=16,= mN )

Example 3

( 12=16,= mN )

2 3.9795 e- 02 1.6103 e- 01 1.3498 e- 01

4 2.1451 e- 05 5.5875 e- 02 4.6923 e- 05

6 7.3244 e- 13 4.9410 e- 03 5.1791 e- 07

1396 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

8 7.3238 e- 13 7.5842 e- 07 5.1791 e- 07

5. CONCLUSIONS In this study the well-known quasilonearization method (QLM) is applied to solve the nonlinear weakly singular Volterra integral equations. Due to the fast convergence of the QLM the solution of linear sequence of the weakly singular equations is approximated using step-by-step collocation method and the presented numerical examples verifies rapidly convergence of the method. For numerical integration and fully discretization, we employ the Gauss-Kronrod quadrature rule which has high accuracy in comparison with other Gauss quadrature rules. 6. REFERENCES 1. Dricia, Z. McRae, F. A. Vasundhara Devi, J. (2009), “Quasilinearization for functional differential equations with

retardation and anticipation”, Nonlinear Analysis 70 1763–1775.

2. Cabada, A Nieto, J. J. Pita-da-Veiga, R. (1998), “A note on rapid convergence of approximate solutions for an ordinary Dirichlet problem”, Dynamics of Continuous, Discrete and Impulsive Systems, 4, 23–30.

3. Koleva, M. N. Vulkov, L. G. (2013), “Quasilinearization numerical scheme for fully nonlinear parabolic problems with applications in models of mathematical finance”, Mathematical and Computer Modelling, Volume 57, Issues 10, pp. 2564-2575.

4. Amster, P. De Npoli, P. (2007), “A quasilinearization method for elliptic problems with a nonlinear boundary condition”, Nonlinear Analysis: Theory, Methods and Applications, Volume 66, Issue 10, pp. 2255-2263.

5. Ahmad, B. (2006), “A quasilinearization method for a class of integro-differential equations with mixed nonlinearities” Nonlinear Analysis: Real World Applications, Volume 7, Issue 5, pp. 997-1004.

6. Bellman, R. and Kalaba, R. E. (1965), “Quasilinearization and nonlinear boundary value problems” , American Elsevier Publishing Co, New York.

7. Lakshmikantham, V. Leela S. and Sivasundaram, S. (1994), “Extensions of the method of quasilinearization”, J. Opt. Th. Appl, pp. 315–321.

8. Lakshmikantham, V. (1996), “Further improvement of generalized quasilinearization, Nonlinear Analysis, 27 pp. 315–321.

9. Kronrod, A. S. (1965), ”Nodes and Weights for Quadrature Formulae. Sixteen-place Tables” , Nauka, Moscow, 1964; English transl., Consultants Bureau, New York.

10. Gautschi, W. (1987), “Gauss-Kronrod quadrature – a survey, Numerical Methods and Approximation Theory III” , (G. V. Milovanovic, ed.), University of Nis, pp. 39-66.

11. Monegato, G. (1982), “Stieltjes polynomials and related quadrature rules” , SIAM Rev., 24 pp. 137-158.

12. Stieltjes, T. J. (1905), “Correspondance dHermite et de Stieltjes” , vol. II, Gauthier-Villars, Paris, pp. 439-441.

13. Szego, G. (1959), “Orthogonal Polynomials” , Amer. Math. Soc., New York.

14. Zhua, L. Wang, Y. (2015), “Numerical solutions of Volterra integral equation with weakly singular kernel using SCW method”, Applied Mathematics and Computation, 260, pp. 63-70.

15. Baratella, P. (2013), “A Nystrom interpolant for some weakly singular nonlinear Volterra integral equations”, Journal of Computational and Applied Mathematics, 237, pp. 542-555.

1397 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

Solution of a time fractional inverse parabolic problem by a numerical algorithm based on finite differences method

Afshin Babaei1, Alireza Mohammadpour2

1- Department of Mathematics, University of Mazandaran, Babolsar, Iran 2- Department of Mathematics, Babol Branch, Islamic Azad University, Babol, Iran

Corresponding Author’s E-mail: [email protected]

Abstract

In this paper , a numerical algorithm will be introduced for solving a time fractional inverse parabolic problem with the Caputo fractional partial derivative . In the mathematical model of this problem , one of the boundary conditions is unknown . For solving this inverse problem , we introduce a numerical algorithm based on finite differences method according to an overspecified condition. Finally, an example will be presentedto illustrate the ability and efficiency of numerical method. Keywords: Caputo fractional derivatives , Time fractional inverse problem , Finite differences.

1. INTRODUCTION The time fractional diffusion equation is obtained from the standard diffusion equation , by replacing the first-order time derivative with a fractional derivative of order (0,1)α ∈ [1] .

Fractional derivatives and partial fractional derivatives have been applied recently to the solution of problems in fluid and continuum mechanics [1, 4]

Inverse problems of the fractional derivative diffusion equation have thus attracted more and more attention [5, 6] .

In boundary value problems , when the boundary conditions are not available , the measured temperature on some local positions is used for determining the unknown boundary condition . In this paper , we shall consider the time fractional inverse parabolic problem .

2

2( , ) ( , ) ( , ),0 1, 0, (1)

( ,0) ( ),0 1, (2)(0, ) ( ), 0, (3)(1, ) ( ), 0, (4)

u x t u x t p x t x tt x

u x f x xu t g t tu t h t t

α

α∂ ∂

= + < < >∂ ∂

= < <= >= >

where f(x), g(x) and source term p(x,t) are known functions , while the temperature u(x,t) and boundary condition h(t)are unknown which remain to be determined . Also the parameter 0 1α< < is the fractional order of time derivative .

The time fractional derivative in equation (1) uses the Caputo fractional partial derivative of order α with respect to time t [7] as

1398 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

10

1 ( , )( ) 1 ,( )( , )( , )

( , ) ,

mt mm

t m

m

u xt dx m mmu x tD u x t

t u x t mt

αα

αα

ττ α

α τ

α

− − ∂− − < <

Γ − ∂∂= =

∂= ∈

¥

where (.)Γ is Gamma function . In problem (1)-(4) , for determining the unknown function h(t) , we use the additional condition

( , ) ( ), 0,u x t s t t= > (5)

where s(t) is a known function and 0 1x< < is a fixed point . The problem (1)-(4) is considered in two cases , 0 x x< < , a direct problem , which is found in literatures previously [8] and 1x x< < which is an inverse problem . In the next section , we develop an implicit finite differences method for solution of this inverse problem .

2. Numerical Algorithm The method is based on the first-order approximation of Caputo's fractional derivative givenas [3]

( ) 1

1

1 ,(1 )(1 )

( )n

n j n jnt i j i i

jD u w u u

kαα

αα α− + −

=≅ −

Γ − −∑ (6)

where ( ) 1 1( 1) ,jw j jα α α− −= − − (7)

and ( , )ji jiu u x t= at the grid points , 0,1, 2, ,ix ih i N= = L and , 0,1, ,nt nk n M= = L for some positive integers N

and M . Recalling from (7) that ( )1 1w α = and taking 2

12

rh

= , the numerical scheme is obtained for n=1as

1 1 1 0 0 0 11 , 1 , 1 1( 2 ) ( 2 ) ( ) , 0,1,2, , 1,i k i i k i i i iru r u ru r u r u u p i Nα ασ σ− + + −− + + − = − + + + = −L (8)

and for 2n ≥ as 1 1 1

1 , 1 , 1 1( 2 ) ( 2 ) ( )n n n n n ni k i i k i i iru r u ru r u r u uα ασ σ − − −− + + −−+ + = − + + −

(9)

( ) 1

2, 0,1, 2, , 1,( )

nn j n j n

ij i ij

w u u p i Nα − + −

=− + = −∑ L

where ( , ).ni i np p x t=

Note that , 1 ( , )nnu u x h t− = − can be obtained from applying the Crank-Nicolson method to the direct problem in the

domain 0 .x x< < The linear system which must be solved is

1 0 0 0 ( 1),j jAU C BU D p for n+ + = + + = (10) and

1 ( 2),j j j j j jAU C BU D e p for n+ + = + − + ≥ (11) where

1399 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

0 0 0 0 00 0 0 0

2 0 0 0,

0 0 0 0 00 0 0 2 00 0 0 2

rr r

r r rA

rr r

r r r

σσ

σσ

− + − − + − = −

+ − − + −

LLL

M M M M M M MLLL

0 0 ... 0 0 02 0 ... 0 0 0

2 ... 0 0 0,

0 0 0 ... 0 00 0 0 ... 2 00 0 0 ... 2

rr r

r r rB

rr r

r r r

σσ

σσ

− − =

− −

M M M M M

and

( )( )

1 1 11 0 0

1 0 0

( 2 ) , , 0, 0, , 0, 0 ,

( 2 ) , , 0, 0, , 0, 0 ,

t j j jj

t j j jj

C ru r u ru

D ru r u ru

σ

σ

+ + +−

= − + + −

= + −

L

L

( )1 1 1 11 2 2 1, , , , ,t j j j j

j N Np p p p p+ + + +− −= L

( ), 1

2( ),

j

j k l j l j ll

e w U Uαασ ′ ′

− + −=

= −∑

where ( )1 2 2 1, , , ,tj j j j

j N NU u u u u− −= L and ( )0 1 2 1, , , ,tj j j j

j N NU u u u u′− −= L . This linear system gives M unknown pivotal

values along the boundary x=1 . 3. Numerical example. Consider (1)-(4) and (5) withf(x) = 0 , g(t) =0 , 0.6x = and

3( ) 0.36s t t= , in the domain

( , ) | 0 1, 0Q x t x t≡ < < > .

With these assumptions , the problem has the exact solution2 3( , )u x t x t= .

Computed the maximum errors between the exact values and the numerical values of the unknown boundary condition

( ) ( ) 1, u t h t= , using the proposed finite difference method at various time and space grids , with 0.25,0.5,0.7α =

are listed in Table 1 .

Also , a comparison between the exact and the numerical solution at t=0.5 and t=1 , with 0.5α = is shown in Figure 1 .

Table 1- Maximum error of the approximate values of h(t) for different values ofh and k

k t= ∆ h x= ∆ 5 0.2α = 0.5α = 5 0.7α =

1400 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

K=0.02 h=0.05 h=0.025 h=0.01

2.61101E-3 7.56919E-3 1.03076E-2

8.01694E-4 6.05765E-3 9.72424E-3

1.79317E-2 2.80813E-2 3.58906E-2

K=0.025 h=0.05 h=0.025 h=0.01

5.46654E-3 1.05333E-2 1.32905E-2

2.17941E-3 8.90543E-3 9.25887E-3

8.17443E-2 1.50416E-2

2.110005E-2

K=0.01 h=0.05 h=0.025 h=0.01

3.33047E-3 1.63229E-3 4.33333E-3

6.78719E-3 9.31287E-4 4.72201E-3

5.47413E-1 2.9308E0 5.1036E0

Figure 1.Comparison between the numerical solution and the exact solution of the example 1 , at t=0.5 and t=1 , with 0.5α= , 0.05x∆ = and 0.02t∆ = . Exact (red) and Numerical(black ) : (a) , (c) given by direct part and (b) , (d) given by inverse

part.

4. CONCLUSION In this study, a time fractional inverse parabolic problem with unknown boundary condition is considered. For identifying the solution of the problem and the unknown boundary condition, a convergence numerical algorithm, based on finite differences approximations, is presented. The problem is solved in two parts. In the first part, by considering of the given additional condition as the right-hand side boundary, a direct problem is solved. Then in part two, by using the results obtained from direct problem, the numerical algorithm is applied for solving the inverse

1401 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

problem and obtaining the unknown boundary condition. Numerical results showed that the proposed method gives good results. 5. REFERENCES

1. Mainardi, F and Pagnini, G. (2003),“ The Wright functions as solutions of the time-fractional diffusion equation”, App . Math . and Computation , 141, pp. 51-62 .

2. Liu, F. and Burrage , K. (2011), “Novel techniques in parameter estimation for fractional dynamical modelsarising from biological systems”, Computers & Math. with App.,62, pp. 822-833.

3. Murio, D.A. (2008), “Implicit finite difference approximation for time fractional diffusion equations”, Computers and Math . with App., 56, pp. 1138-1145.

4. Odibat, Z. and Momani , S.(2006), “Approximate solutions for boundary value problems of time-fractional wave equation”, App. Math. and Computation, 181, pp. 767-774.

5. Depollier, C. and Fellah, Z.E.A. (2004), “Propagation of transient acoustic waves in layered porous media Fractional equations for the scattering operators”, Nonlinear Dyn., 38, pp. 181-190.

6. Prakash, A. O. (2004), “Application of fractional derivatives in thermal analysis of disk brakes”, Nonlinear Dyn., 38, pp. 191-206.

7. Podlubny, I. (1999), “Fractional Differential Equations” , Academic Press , San Diego, Calif. , USA .

8. Sweilam , N.H., Khader, M.M. and Mahdy, A.M.S. (2012), “Crank-Nicolson finite difference method for solving time fractional diffusion equation”, J . of Fractional Calculus and App. 2, PP. 1-9.

1402 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

A spectral method for solving an inverse time-dependent source problem

Afshin Babaei, Somayeh Nemati

Department of Mathematics, University of Mazandaran, Babolsar, Iran Corresponding Author’s E-mail: [email protected]

Abstract In this paper, a numerical algorithm based on Chebyshev polynomials is presented for recovering the unknown time-dependent source term and obtaining a solution of the inverse problem. For solving the problem, the operational matrices of integration and derivation are utilized to reduce the mentioned problem into the matrix equations which correspond to a system of linear equations with unknown Chebyshev coefficients. Finally, anumerical example will be presented to illustrate the ability and efficiency of theintroduced scheme.

Keywords: Inverse problem, Nonhomogenous parabolic equation, time-dependent source term, Chebyshev polynomials, Operational matrix.

1. INTRODUCTION Consider the following problem of determining of function ),( txu satisfying the nonhomogenous parabolic equation

,<<,0<<0),()(=),(),(2

2TtLxtgxf

xtxu

ttxu

∂∂

−∂

∂ (1)

,<<0),(=,0)( 0 Lxxuxu (2)

,<<0)(=),(),(=)(0, 21 TttgtLutgtu (3)

where )(0 xu , )(xf , )(0 tg and )(1 tg are piecewise continuous functions in their domains. Also these functions satisfy the conditions (0)=(0) 00 gu and (0)=(1) 10 gu . This problem is induced in the process of transportation, diffusion and conduction of natural materials. In this problem, in addition to the function ),( txu , the source term )(tg is also unknown. This problem is called as inverse source problem [1]. While modeling natural systems, these types of problems occupy an important place [2, 3]. In [4-8] researchers haveintroducedsome numerical schemes for special cases of reconstructing the time-dependent source term.

For solving this inverse problem, we assume an overspecified condition as

.<<0 ),(=),( 0 Ttttxu χ (4)

In the rest o f this paper, by using extra conditions (4), a numerical algorithm is presented for solving this inverse problem based on the Chebyshev polynomials.

1403 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

2. Numerical algorithm

The second kind Chebyshev polynomials (SKCPs) are orthogonal polynomials on the interval 1,1][− and can be determined with the following recurrence formula [8]:

,1,2,3,= ),()(2=)( 11 KixUxxUxU iii −+ −

where 1=)(0 xU and xxU 2=)(1 . By the change of variable we will have the well-known shifted SKCPs on the

interval ][0,L as follows

.0,1,2,= 1),2(=)(, LixL

UxU iiL −

Definition 2.1. Bivariate shifted SKCPs are defined on ][0,][0, TL × as

.0,1,2,=, ),()(=),( ,,, LjitUxUtx jiL

Lij τ

τφ (5)

Bivariate shifted SKCPs are orthogonal with each other as:

,16

=),(),(),(2

,,,

00jnim

Lmn

LijL

L Ldxdttxtxtxw δδτπ

φφ τττ

τ

∫∫

where )()(=),(, twxwtxw LL ττ .

A function ),( txu defined on ][0,][0, τ×L may be approximated in terms of the bivariate SKCPs as follows

,),(=),(=),(),( ,,,

0=0=

CtxtxCtxctxu TLL

TLijij

N

j

N

iττ

τφ ΦΦ∑∑; (6)

where TLNN

LN

LN

LL txtxtxtxtx )],(,),,(,),,(,),,([=),( ,,

0,

0,

00,ττττ

τ φφφφ KKKΦ and

TNNNN ccccC ],,,,,,[= 0000 KKK.

The derivation of the vector ),(, txLτΦ with respect to x can be obtained as:

),,(=),(32

4=),(

,,

4321

, txDtx

ONIMMMM

OOOIOIOOOOIOOOOOOIOOOOOO

Lxtx

LLL

τττ ΦΦ

Φ∂

KMMMMMM

KKKK

(7)

1404 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

where D is a 22 1)(1)( +×+ NN matrix, 1M , 2M , 3M and 4M are I , O , I3 and O , for odd N and O , I2 ,

O and I4 , for even N , respectively, and I and O are the identity and zero matrix of order 1+N , respectively. Also, the integration of the vector ),(, txL τΦ with respect to t can be approximately obtained as

),(),( ,,0

txPtdtx LQLt

ττ Φ′′Φ∫ ; (8)

where QP is a 22 1)(1)( +×+ NN matrix and Q is a 1)(1)( +×+ NN matrix as

,2

=

QOOO

OQOOOOQOOOOQ

PQ

LMOMMM

LLL

τ

.

01)2(

100001

1)(

00810

810

41

000610

61

31

0000410

43

00000211

=

+−

+−

−−

NN

Q

NK

MMMMMMM

K

K

K

K

For solving the inverse problem (1)-(3), first we give an approximation of the unknown source function using the shifted SKCPs. To this aim, we transform this problem with the suitable change of variables to a zero initial and

boundary conditions problem as

,<<0 ,<<0 ),,(),(=),(),( TtLxtxFtxstxwtxw xxt +− (9)

By using the separation of variables, the solution of this problem may be expressed as follows

( ) ).(sin)(sin),(),(2=),( )(2)(001=

xkddekFstxw tkLt

k

πηξπξηξηξ ηπ

+ −−

∫∫∑ (10)

Let us approximate the unknown function )(tg in terms of the shifted SKCPs as follows

).()( ,0=

tUbtg jj

N

jτ∑;

Using equations (10) and (11), we obtain

(11)

1405 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

).(sin)(sin),())()((2=),( )(2)(,

0=001=

xkddekFUbftxw tkjj

N

j

Lt

k

πηξπξηξηξ ηπτ

+ −−

∑∫∫∑

Collocating equation (12) in 1+N nodes Nitx i ,0,1,= ),,( 0 K , where ,,0,1,= ,2

1= NiTNiti K

++

and using

equation (4), we get a system of 1+N linear equations to obtain the 1+N unknown coefficients jb , Nj ,0,1,= K in

(11) which can be solved using direct methods.

Now, we use the obtained source function to solve the main problem. Suppose )()(=),( tgxftxs . Substituting

the obtained source function ),( txs into equation (1), integrating both sides of (9) with respect to t and using initial condition (2), we have

,),(=),(),(),(00

0 tdtxstdtxutxutxut

xxt

′′′′−− ∫∫

where ).(=),( 00 xutxu We approximate the functions ),( txu , ),(0 txu and ),( txs in (13) using the method mentioned in previous as

),,(=),(),( ,,

0=0=

txCtxctxu LTL

ijij

M

j

M

τφ Φ∑∑;

),,(=),(),( ,0,

0=0=0 txCtxctxu L

TLijij

M

j

M

τφ Φ′∑∑;

),,(=),(),( ,,

0=0=

txStxstxs LTL

ijij

M

j

M

τφ Φ∑∑;

where 2≥M and the vector C in equation (14) is unknown. Substituting approximations (14)–(16) into equation

(13) and using equations (7) and (8) yield GUC= where ,= 0 SPCG TQ+ and ,)(= 2 T

QPDIU − here I is the

identity matrix of order 21)( +M . Now, to solve the main problem we need to apply the boundary conditions. To this

aim, the boundary conditions (3) are written using equations 00, )(=)(=)(0, WttWt TTL τττ ψψΦ and

11, )(=)(=),( WttWtL TTL τττ ψψΦ as

,= ,= 1100 GCWGCW

where 0W and 1W are two 21)(1)( +×+ NN known matrices and 0G and 1G are two known vectors. To obtain the solution of problem (1)–(3), we replace 1)2( +M rows of the augmented matrix ];[ GU with the rows of the augmented matrices ];[ 00 GW and ];[ 11 GW . In this way, the unknown vector C is determined by solving the new matrix equation. 3. Numerical example Consider (1) with 1=L , 3=T , )(sin)(1=)( 2 xxf ππ+ , )(sin=)(0 xxu π and 0=)(=)( 10 tgtg with overspecified condition as tetu =)(0.5, .

(13)

(14)

(15)

(16)

(12)

1406 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

This problem has the unique solution ),(=),( xsinetxu t π and .=)( tetg Table 1 displays the absolute errors of the sourcefunction at some selected points with N = 4, 6, 8, 10. Also, numerical results of prolem are displayed in Figures 1 and 2.

Table 1: Numerical results for unknown source function with different values of N

t 4=N 6=N 8=N 10=N 0.0 2106.04 −× 3101.54 −× 5102.21 −× 7102.07 −× 0.3 3104.24 −× 5108.74 −× 6101.49 −× 9107.61 −× 0.6 3104.20 −× 5101.41 −× 7102.62 −× 10107.24 −× 0.9 5107.93 −× 5101.75 −× 8108.62 −× 11105.73 −× 1.2 3101.61 −× 5101.30 −× 8104.68 −× 11106.24 −× 1.5 4108.72 −× 6108.02 −× 8103.98 −× 10101.23 −× 1.8 3101.44 −× 7101.93 −× 8105.24 −× 10102.31 −× 2.1 3103.41 −× 5102.10 −× 7101.09 −× 10105.44 −× 2.4 3101.93 −× 4101.07 −× 7104.01 −× 9101.68 −× 2.7 2105.03 −× 4106.72 −× 6103.33 −× 9103.21 −× 3.0 1102.53 −× 3108.99 −× 4101.82 −× 6102.36 −×

Figure 1: Plot of the function |)()(|log=)( tgtgte N− with 2,4,6=N for Example 1.

1407 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

Figure 2: plot of the function ),( txu and its approximations when 3=t with 4,6,8=M .

4. CONCLUSIONS

In this work, we applied a spectral method based on the second kind Chebyshev polynomials to the numerical solution of a parabolic inverse problem with unknown source function. First, we introduced a method to find an approximation of the unknown source function by considering this function in the form of a linear combination of Chebyshev polynomials. A system of linear equations was constructed to obtain the coefficients of this combination. Then, by substituting the result into the main problem and using the operational matrices, which are all sparse matrices, we obtained an approximation of the solution. Finally, to verify the applicability and efficiency of the method, a numerical test example was presented.

5. REFERENCES 1. Isakov,V. (1990) “Inverse source problems, Mathematical surveys and monographs” , vol. 34, American mathematical society, Providence, Rhode Island.

2. Friedman, A. (1964) “Partial Differential Equations of Parabolic Type” , Prentice-Hall Inc.

3. Ebel, A. andDavitashvili, T. (2007), “Air, water and soil quality modelling for risk and impact assessment”, Springer, Dordrecht.

4. Prilepko, A. I. and Soloviev, V. V. (1988), “Solvability theorems and Rothes methodfor inverse problems for a parabolic equation. I”, Differential Equations23(1) , pp. 1230-1237.

5. Farcas, A. and Lesnic, D. (2006), “The boundary-element method for the determination of a heat source dependent on one variable”, Journal of EngineeringMathematics54 , pp. 375-388.

6. Borukhov, V. T. and Vabishchevich, P. N. (2000), “Numerical solution of the inverse problem of reconstructing a distributed right-hand side of a parabolicequation”, Computer Physics Communications 126(1) , pp. 32-36.

7. Hasanov, A. and Pektas, B. (2013), “Identification of an unknown time-dependentheat source term from overspecified Dirichlet boundary data by conjugate gradient method”, Computers and Mathematics with Applications56, pp. 42-57.

8. Mason, J.C. and Handscomb, D.C. (2003), “Chebyshev Polynomials” , CRC Press LLC.

1408 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

t- Best approximation results in fuzzy normed spaces

H. R. Goudarzi

Department of Mathematics, Faculty of Science, Yasouj, Iran, Yasouj University, [email protected].

Abstract In this paper, first we see the definition of t-best approximation in fuzzy normed spaces from another point of view, and we prove some basic new results. The concept of t-orthogonality in a fuzzy normed space has been given with some properties. At last, we have an important decomposition of a fuzzy normed space, provided that one of the summands be a proximinal set.

Keywords: t-best approximation, t-orthogonality, t-proximity mapping, fuzzy normed spaces, proximinal sets.

1. INTRODUCTION

Fuzzy set theory is a powerful tool for modelling uncertainty and vagueness in various problems arising in the field of science and engineering . It has also very useful applications in various fields , e.g . population dynamics [1] , computer programming [4] , nonlinear dynamical systems [6] , nonlinear operators [10,15] , statistical convergence [9,11] , stability problem [8,12] , etc . The fuzzy topology proves to be a very useful tool to deal with such situations where the use of classical theories breaks down . One of the most important problems in fuzzy topology is to obtain an appropriate concept of fuzzy metric and fuzzy normed spaces . The problem of fuzzy metric spaces has been introduced by Kramosil and Michalek [7] and improved by George and Veeramani [3] . Many mathematicians have considered the notion of fuzzy normed spaces from different points of view see ([2,13]) . S . M . Vaezpour and F . Karimi have introduced the concept of t-best approximation in fuzzy normed spaces in [14] . Also to see the more rresults of best simultaneous approximation in fuzzy normed spaces , we refer the readers to [5] . In this paper , first we consider the definition of t-best approximation in fuzzy normed spaces , from another point of view and we prove some basic results . The concept of t-orthogonality in a fuzzy normed space has been given with its elementary properties . At the end , we investigate an important decomposition of a fuzzy normed space , provided that one of the summands be a proximinal set .

2. PRELIMINARIES

Definition 2.1. [5] A triangular norm(t-norm) T (or * ) on [0,1] is defined as an increasing, .commutative and associative mapping [ ] [ ]: 0,1 0,1T → satisfying ( )1,T x x= , for all [ ]0,1x ∈ .

Definition 2.2. [13] A fuzzy normed space(FN- space) is a 3-tuple ( ), ,X N T where X is linear space, T is a t-norm and

N is a fuzzy set on (0, )X× ∞ , such that for all ,x y X∈ and all , 0s t > the following conditions hold:

(FN-1) ( ), 0N x t > ;

(FN-2) ( ), 1N x t = if and only if x = 0;

1409 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

(FN-3) ( ) ( ), ,| |tN x t N xλλ

= ; for each 0λ ≠ ;

(FN-4) ( ) ( )( ) ( ), ,, , ;T N x t N y s N x y t s≤ + +

(FN-5) [ ], . : 0, 0,( 1( ) )N x ∞ → is continuous,

(FN-6) ( ), 1.tlim N x t→ ∞ =

Example 2.3. Let ( , . )X P P be a normed space, we define *a b ab= or * min( , )a b a b= and

( , ) , , ,|| ||

n

nktN x t k m n R

kt m x+= ∈

+

then ( , *),X N is a fuzzy normed space. In particular if k = m = n = 1 we have

( , ) .|| ||tN x t

t x=

+

which is called the standard fuzzy norm induced by norm || . || .

Definition 2.4. [14] Let A be a nonempty set of a fuzzy normed space ( , *),X N . For x X∈ and t > 0, let

( , , ) sup ( , ), .d A x t N y x t y A= − ∈

An element 0y A∈ is said to be a t- best approximation of x from A if

0( , ) ( , , )N y x t d A x t− =

Also we denote:

( ) : ( , , ) ( , ).tAP x y A d A x t N y x t= ∈ = −

if each x X∈ has at least (respectively exactly) one t-best approximation in A, then A is called a t-proximinal (respectively t-Chebyshev) set.

3. MAIN RESULTS

In this section we try to see the notion of best approximation from another point of view and then we prove some applied results.

Definition 3.1. Let G be a nonempty set of a fuzzy normed space ( , *),X N . An element 0g G∈ is called a best

approximation to x X∈ from G if for every g G∈ and t > 0, we have

0( , ) ( , ).N x g t N x g t− ≥ −

The set of such elements 0g G∈ that are called best approximations to x X∈ is denoted

1410 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

by :

0 0( ) : ( , ) ( , ), . tGP x g G N x g t N x g t g G= ∈ − ≥ − ∀ ∈

Hence ( )tGP x defines a multimap from X in to the power set of G called t- nearest point mapping or t-proximity map.

Now we try to prove some basic results, in line with the new scope.

Theorem 3.2. Let G be a subspace of a fuzzy normed space ( , *),X N .

i) If x G∈ then ( ) tGP x x= ,

ii) If ( )x cl G∈ but not in G, then ( )tGP x = ∅ (cl(G) is the clouser of G).

Proof. i) Let x G∈ . Then ,( 1)N x x t− = , for easch t > 0, and so ( ), , 1d G x t = . Hence

( ) : ( , ) ( , , ) : ( , ) 1 .

tGP x y G N x y t d x G t

y G N x y tx

= ∈ − == ∈ − ==

ii) Let ( )x cl G∈ but not in G, then there exists a sequence nx of elements of G such that

( ), 1.n nlim N x x t→ ∞ − =

Hence ( ), , 1 .d G x t = Therefore,

( ) : ( , ) ( , , ) : ( , ) 1 : .

tGP x y G N x y t d x G t

y G N x y ty y x

= ∈ − == ∈ − == = = ∅

because x G∉ .

In the sequel we try to present the concept of t-orthogonality in a fuzzy normed space.

Definition 3.3. Let ( , *),X N be a fuzzy normed space and ,x y Y∈ . we say that x is t-orthogonal to y (denoted by tx y⊥ ) if for each t > 0,

( , ) ( , ), N x y t N x t Rα α+ ≤ ∀ ∈

Also we say that x X∈ is orthogonal to G X⊆ (denoted by tx G⊥ ) if

,tx y y G⊥ ∀ ∈

Theorem 3.4. Let ( , *),X N be a fuzzy normed space and G be a subspace of X. Then for each t > 0, 0 ( )t

Gg P x∈ iff

0 ( )tGP x∈ .

1411 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

Proof. Suppose that 0 ( )t

Gg P x∈ . put1 0g g gα= − , for any fixed g G∈ and Rα ∈ . Since

0 ( )tGg P x∈ and

1g G∈ ,

0 1( ) ), ,(N x g t N x g t− ≥ − and so

0 0( ) ( )( , ., )N x g t N x g g tα− ≥ − −

Then

0 0( ) (( ), ) .,N x g t N x g g tα− ≥ − +

and therefore 0( ) tx g G− ⊥ . Conversely suppose that

0( ) tx g G− ⊥ .Then for all Rα ∈ and 1g G∈ we have,

0 0 1,( ) ,( ).N x g t N x g g tα− ≥ − +

Let g G∈ be arbitrary and fixed, and take 1 0g g g= − and 1α= , in the last inequality to get

0( ) ( , ), .N x g t N x g t− ≥ −

Therefore 0( ) tx g G− ⊥ .

For a subset G of X put

ˆ : ( , ) ( , , ) : ( , ) ( , ), ,

tG x X N x t d G x tx X N x t N x g t g G

= ∈ == ∈ ≥ − ∀ ∈

then we have:

Lemma 3.5. Let ( , *),X N be a fuzzy normed space and G be a subspace of X. Then for all x X∈ , 0 ( )t

Gg P x∈ iff

0ˆ( ) tx g G− ∈ .

Proof. 0 ( )t

Gg P x∈ if and only if 0( ) tx g G− ⊥ ( by theorem 3.4) if and only if

0ˆ( ) tx g G− ∈ (by definition of ˆ tG and since

G is a subspace). W

Corollary 3.6. Let ( , *),X N be a fuzzy normed space and G be a subspace of X. Then

i) ˆ tx G∈ implies that ˆ tx Gα ∈ _x 2 ^G, Rα∀ ∈

ii) ˆ tx G∈ iff 0 ( )tGP x∈ .

Proof. i) Let ˆ tx G∈ , then tx G⊥ , so that,

( , ) ( , ), ,N x g t N x t Rλ λ+ ≤ ∀ ∈

then for each Rλ∈

( , ) ( , ),N x g t N x tα αλ α α α+ ≤

1412 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

Then we have

( , ) ( , ), ,N x g t N x t t Rα µ α+ ≤ ∀ ∈

where µ αλ= , and this implies that , .tx g g Gα ⊥ ∀ ∈ Therefore, ˆ tx Gα ∈ (by theorem 3.4 and lemma 3.5).

ii) This part fallows from lemma 3.5

In the following theorem we see an important characteristic of P tG when G is a subspace of X.

Theorem 3.7. Let G be a subspace of a fuzzy normed space ( , *),X N , and ( )tGP x be the set of all best approximations

from x X∈ . Then

ˆ( ) ( )t tGP x G x G= −I

Proof. 0

ˆ( )tg G x G∈ −I if and only if 0g G∈ , and

0ˆ( )tg x G∈ − , if and only if

0g G∈ and 0 ˆg x g= − , where ˆ ˆ tg G∈ ,

if an only if 0g G∈ and

0ˆˆ tg x g G= − ∈ , if and only if

0 ( )tGg P x∈ , by lemma 3.5. Therefore ˆ( ) ( )t t

GP x G x G= −I .

W

Now, we are in a situation to see a decomposition for X, interms of G and ˆ tG .

Theorem 3.8. For a linear subspace G of a fuzzy normed space ( , *),X N , the following conditions hold;

1) G is proximinal, 2) ˆ ˆ : , t tX G G g x g G x G= + = + ∈ ∈

Proof. 1) 2)→ . Suppose that G is proximinal, x X∈ and 0 ( )t

Gg P x∈ . Then by lemma 3.5, 0( ) ˆ tg Gx − ∈ and

0 .g G∈

Now, since 0g G∈ and

0( ) ˆ tg Gx − ∈ , we have 0 0

ˆ( ) tx g x g G G− ∈ += + . Hence ˆ tX G G= + .

2) 1)→ Let ˆ ˆ : , t tX G G g x g G x G= + = + ∈ ∈ , and let x X∈ . Then 0x g y= + , for some

0g G∈ , and some ˆ ty G∈ .

then by corollary 3.6, ˆ ty G∈ , and so, 0 ( )tGP y∈ . But

0y x g= − , so 0( ) ( )t t

G GP P xy g−= , this implies that 00 ( )t

GP x g−∈ ,

then for all g G∈ ,

0 0( 0, ) ( , ),N x g t N x g g t− − ≥ − −

And so,

0 0( 0, ) ( ( ), ), .N x g t N x g g t g G− − ≥ − + ∀ ∈

But 0g g G+ ∈ , then for all

1 0g g g G= + ∈ , we have

0 1( , ) ( , ).N x g t N x g t− ≥ −

this means that 0 ( )t

Gg P x∈ . Therefore G is proximinal. W

1413 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

At the end of this paper we prove two basic facts.

Theorem 3.9. Let G be a nonempty subset of a fuzzy normed space ( , *),X N . Then

i) ( ) ( )t tG y Gx yP yP x+ + = + , for every ,x y X∈ ,

ii) ( ) ( )t tG GP x xPα α= , for every x X∈ and Rα∈ .

Proof. i) ( )0t

G y x yy P + +∈ if and only if 0y G y∈ + and

0( , ) ( ( ), ),N x y y t N x y g y t g y G y+ − ≥ + − + ∀ + ∈ +

if and only if 0( )y y G− ∈ and

0( ( ), ) ( , ), ,N x y y t N x g t g G− − ≥ − ∀ ∈

if and only if ( )0( ) tGy y xP− ∈ , if and only if ( )0

tGPy yx∈ + . Therefore

( ) ( ) , , ,t tG y Gx y x y xP P y X+ + = + ∀ ∈

ii) If 0α = we have ( ) ( )0 0 0t tGP Pxα = = . Because 0 0∈ and

( ) ( )0 0.t tG GP Px xα ==

Thus assume that 0α ≠ . Now 0 ( )t

Gy P x∈ if and only if 0y Gα∈ and

0( , ) ( , ), ,N x y t N x g t g Gα α α− ≥ − ∀ ∈

if and only if

01( , ) ( , ), ,t tN x y N x g g Gα α α

− ≥ − ∀ ∈

if and only if 0

1 ( )tGy P x

α∈ , if and only if

0 ( )tGy P xα∈ . Therefore

( ) ( ) .t tG Gx xP Pα α=

11. REFERENCES

1. L. C. Barros, R. C. Bassanezi and P. A. Tonelli, Fuzzy modeling in population dynamics, Ecol. Modell, 128 (2000), 27-33.

2. S. C. Cheng and J. N. Morsden, Fuzzy linear operator and fuzzy normed linear spaces, Bull. Calculatta Math. Soc., 86 (1994), 429-436.

1414 University of Guilan-Faculty of Engeineering & Technology-East of Guilan 18-19 November - 2015

3. A. George and P. V. Veermani, On some results in fuzzy metric spaces, Fuzzy Sets and Systems, 64 (1994), 395-399.

4. R. Gites, A computer program for fuzzy reasoning, Fuzzy Sets and Systems, 4 (1980), 221-234.

5. M. Goudarzi and S. M. Vaezpour, t-best approximation in fuzzy normed spaces, Iranian Journal of Fuzzy Systems Vol. 5, No. 2, (2008) pp. 93-99.

6. I. Hong and J. Q. Sun, Bifurcations of fuzzy nonlinear dynamical systems, Commun. Non-linear Sci. Numer. Simul. 1 (2006), 1-12.

7. Kramosil and J. Mischalek, Fuzzy metric and statistical metric spaces, Kybernetika, 11 (1975), 326-334.

8. S.A. Mohiuddine, Stability of Jensen functional equation in intuitionistic fuzzy normed space, Chaos Solitons Fractals 42 (2009) 2989-2996.

9. S.A. Mohiuddine, Q.M.D. Lohani, On generalized statistical convergence in intuitionistic fuzzy normed space, Chaos Solitons Fractals 42 (2009) 1731-1737.

10. M. Mursaleen, S.A. Mohiuddine, Nonlinear operators between intuitionistic fuzzy normed spaces and Frchet difierentiation, Chaos Solitons Fractals 42 (2009) 1010-1015.

11. M. Mursaleen, S.A. Mohiuddine, Statistical convergence of double sequences in intuitionistic fuzzy normed spaces, Chaos Solitons Fractals 41 (2009) 2414-2421.

12. M. Mursaleen, S.A. Mohiuddine, On stability of a cubic functional equation in intuitionistic fuzzy normed spaces, Chaos Solitons Fractals 42 (2009) 2997-3005.

13. R. Saadati and S. M. Vaezpour, Some results on fuzzy Banach spaces, J. Appl. Mathand Computing., 17(1-2) (2005), 475-484.

14. S. M. Vaezpour and F. Karimi, t-best approximation in fuzzy normed spaces, Iranian Journal of Fuzzy Systems, 2 (2008), 93-99.

15. Y. Yilmaz, On some basic properties of di_erentiation in intuitionistic fuzzy normed spaces, Math. Comput. Modelling 52 (2010) 448-458.

1415 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Existence and uniqueness of fuzzy differential equations with monotone condition

Samira Siahmansouri , Omid Solimani Fard Sama technical and vocational training college, Islamic Azad University, Varamin Branch, Varamin, Iran Department of Mathematics, Ferdowsi University Of Mashhad, Mashhad Iran Corresponding author. E-mail address: [email protected]

Abstract The main purpose of this paper is to investigate to fuzzy differential equation (FDE) is a type of differential equation driven by Liu process. We provide and prove a new existence and uniqueness theorem for fuzzy differential equations under Lipschitz condition and monotone condition. Then examples are given to show which are satisfied with the monotone condition without the linear growth condition.

Keywords: Fuzzy number, Fuzzy differential equation, liu process, credibility space. Stability

1. Introduction Fuzziness is a kind of uncertainty in the real world, which has been first investigated by Zadeh [2] through proposing the concept of fuzzy set via membership function.

During 2002 and 2004, Liu introduced credibility theory and presented for the first time the concept of credibility measure to facilitate measuring of fuzzy events. A powerful tool for dealing with, fuzzy phenomena and is based on normality, monotonicity, self-duality, and maximality axioms.

The main goal of this paper is the prove weaker conditions to study of existence and uniqueness of solution to the fuzzy differential equations. In this regard, we prove a new existence and uniqueness theorem under the Local Lipshitz and monotone conditions.

The paper is arranged as follow. We will review some basic concepts about credibility theory, fuzzy variable, fuzzy process, and liu process in Section 2. A new existence and uniqueness theorem is proved in Section 3. At last, In Section 4, examples are provided to show which are satisfied with the monotone condition without the linear growth condition. 2. Preliminaries The emphasis in this section is mainly on introducing some concepts such as credibility measure, credibility space, fuzzy variable, independence, expected value, variance, fuzzy process, liu process, and stopping time.

Suppose that Θ is a non-empty set and the power set of Θ. Each element of in is called an event. In order to present an axiomatic definition of credibility, it is necessary to assign to each event a number which indicates the credibility that will occur. To ensure the number has certain mathematical properties which we intuitively expect to have a credibility, we accept the following four axioms [2]:

1. Axiom (Normality) = 1.

1416 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

2. Axiom (Monotonicity) ≤ whenever ⊂ . 3. Axiom (Self-Duality) + = 1 for any event . 4. Axiom (Maximality) = sup for any events with

sup < 0.5.

Definition 2.1 [3]. The set function is called a credibility measure if it satisfies the normality, monotonicity, self-duality, and maximality axioms. Definition 2.2 [3]. Let Θ be a nonempty set, the power set of Θ, and a credibility measure. Then the triplet (Θ, , ) is called a credibility space. Definition 2.3 [3]. A fuzzy process is said to be a Liu process if

• (i) = 0, • (ii) has stationary and independent increments, • (iii) every increment − is a normally distributed fuzzy variable with expected value and

variance whose membership function is ( ) = 2(1 + exp( | |√ )) , −∞ < < +∞.

The parameters and are called the drift and diffusion coefficients, respectively.

3. Main result In this session, first, fundamental inequalities that are used wieldy in basic theorems is expressed and then we present the weak conditions they are used and at the end we prove the main theorem. Lemma 3.1 (Burkholder -Davis-Gundy inequality for Liu process):

Let ∈ ℓ ( , × ). Define for > 0, ( ) = ∫ g( ) ( ) = ∫ | ( )| .

Then, | ( )| ≤ ( sup | ( )| ) ≤ 4 | ( )| (1)

The integral inequalities of Gronwall type have been widely applied in the theory of ordinary differential equations, stochastic differential equations, and fuzzy differential equations to prove the results on existence, uniqueness, and stability. Lemma 3.2 (Gronwall’s inequality for liu process): Let > 0 and ≤ 0. Let (. ) be a credibility measurable bounded nonnegative function on [0, ], and let (. ) be a nonnegative integrable function on [0, ]. If ( ) ≤ + ∫ ( ) ( ) 0 ≤ ≤ , (2) then ( ) ≤ exp(∫ ( ) ) 0 ≤ ≤ . (3) Throughout this paper, we consider the fuzzy differential equations ( ) = ( ( ), ) + ( ( ), ) (4) where is a standard Liu process and , are some given functions. ( ) is the solution to the Eq. (4) which is a fuzzy process in the sense of liu. By the definition of fuzzy differential, this equation is equivalent to the following fuzzy integral equation: ( ) = + ∫ ( ( ), ) + ∫ ( ( ), ) . (5) Furthermore, let us state the following conditions. ( ) Local Lipschitz condition: For each integer ≥ 1, there exists a positive constant number such that

1417 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

| ( ( ), ) − ( ( ), )| ∨ | ( ( ), ) − ( ( ), )| ≤ | ( ) − ( )| ,

For those ( ), ( ) ∈ with | ( )| ∨ | ( )| ≤ .

( ) Linear growth condition: There exists a positive number such that | ( ( ), )| ∨ | ( ( ), )| ≤ (1 + | ( )| ),

( ) Monotone condition: there exists a positive constant such that

( ) ( ( ), ) + 12 | ( ( ), )| ≤ (1 + | ( )| ) for all ( ) ∈ .

The following Remark prove the exact solution to equation (4) under the monotone condition ( ). Remark 3.1: Assume the monotone condition ( ), there exists a positive constant such that the solution of (4) satisfies ( sup | ( )| ) ∨ ( sup | ( )| ) ≤ , (6)

where = ( , , ) is a constant independent of ℎ, (ℎ = be a given step size with integer ≥ 1 and = ℎ ).

We know some functions such as sin and −| | not satisfy in Lipshitz condition and Linear growth condition therefor we prove the following theorem until weaker conditions and all functions will be included.

Let us now turn to find the conditions that guarantee the existence and uniqueness of the solution to equation (4).

Theorem (Existence and Uniqueness of solution) 3.1: Assume that the locally Lipschitz condition ( ) holds, but the linear growth condition ( ) is replaced with the monotone condition ( ). Then there is a unique solution ( ) to equation (3.4) in (( , ], ).

Proof: Proof follows from truncation procedure. For each ≥ 1, define the truncation function ( ( ), ) = ( ( ), ) | ( )| ≤ ( ( )| ( )| , ) | ( )| >

( ( ), ) = ( ( ), ) | ( )| ≤ ( ( )| ( )| , ) | ( )| > , then and satisfy Lipschitz condition. So that, equation ( ) = (0) + ∫ ( ( ), ) + ∫ ( ( ), ) ≤ ≤ (7)

According to Remark 3.1, ( ) is a unique solution to equation (3.4) in (( , ], ). Additionally, ( ) ∈ (( , ], ). Of course, ( ) is the unique solution of equation ( ) = (0) + ( ( ), ) + ( ( ), ) ≤ ≤ , and ( ) ∈ (( , ], ).

Define the stopping time = ∧ inf ∈ [ , ]: |( ) | ≥ . By taking the expectation, and by the Hölder inequality, we have

| ( ) − ( )| ≤ 2 | ∫ [ ( ( ), )] − ∫ [ ( ( ), )] |

1418 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

+2 | ∫ [g ( ( ), )] − ∫ [ ( ( ), )] | ≤ 2( − ) ∫ ( ( ), ) − ( ( ), )| + ∫ ( ( ), ) − ( ( ), )| ≤ 4( − ) ∫ [| ( ( ), ) − ( ( ), )| + | ( ( ), ) − ( ( ), )| ]

+4 ∫ [| ( ( ), ) − ( ( ), )| + | ( ( ), ) − ( ( ), )| ] . For ≤ ≤ , getting

( ( ), ) = ( ( ), ) = ( ( ), ),

( ( ), ) = ( ( ), ) = ( ( ), ),

again by substituting ( + ) = ( + ) = ( ), ∈ ( , 0], one gets that (sup | ( ) − ( )| ) ≤ 4( − ) ∫ | ( ( ), ) − ( ( ), )|

+4 ∫ | ( ( ), ) − ( ( ), )|

4( − + 1) ∫ |( ( ), ) − ( ( ), )|

≤ 4( − + 1) ∫ (sup |( ( ), ) − ( ( ), )| ) .

From the Gronwall inequality, one sees that (sup | ( ) − ( )| ) = 0 ≤ ≤ ,

so this means that for ≤ ≤ , we always have ( ) = ( ). (8) We give that is increasing, that is as → ∞, ↑ a.s. By the monotone (I ) condition, for almost all ∈ Ω, there exists an integer = ( ) such that = as ≥ . Now define ( ) by ( ) = ( ), ∈[ , ]. In order to verify that ( ) is the solution of (4). By (8), ( ∧ ) = ( ∧ ), and by (7), it follows that

( ∧ ) = (0) + ∫ ∧ ( ( ), ) + ∫ ∧ ( ( ), )

= (0) + ∫ ∧ ( ( ), ) + ∫ ∧ ( ( ), ) . By letting → ∞ then yields

( ∧ ) = (0) + ∫ ∧ ( ( ), ) + ∫ ∧ ( ( ), ) that is

( ) = (0) + ∫ ( ( ), ) + ∫ ( ( ), ) . We can see ( ) is the solution of (3.4), and ( ) ∈ (( , ], ). So far, the existence is complete.

4. Examples


Here, the following examples illustrate Theorem 3.1. Example 1. Consider the one-dimensional fuzzy differential equation

dx(t) = [x(t) − x(t)³] dt + x(t)² dC(t), t ∈ [0, T] (9)

Here C(t) is a one-dimensional standard Liu process. Clearly, the local Lipschitz condition (I₁) is satisfied but the linear growth condition (I₂) is not. On the other hand, note that

x(t)[x(t) − x(t)³] + (1/2)x(t)⁴ ≤ x(t)² < 1 + x(t)². That is, the monotone condition is fulfilled. Hence Theorem 3.1 guarantees that equation (9) has a unique solution. Example 2. Consider the following fuzzy differential equation:

dx(t) = [−x(t)³] dt + [sin x(t)] dC(t), ∀ t ≥ 0. (10)

Clearly, the equation does not satisfy the linear growth condition (I₂). But the example can be analyzed under condition (I₃), which covers many nonlinear FDEs. On the other hand, we have

(− ) + (sin ) ≤ − + (sin ) ≤ 2(1 + ). (11) In other words, the equation satisfies condition ( ). Moreover, we also have

2 ( ( ), ) + | ( ( ), )| ≤ | ( )| + (1 + | ( )| ) ≤ (1 + 2 )(1 + | ( )| ). (12) We see clearly that ( ) follows from ( ). 5. Conclusion

The existence and uniqueness theorem is one of the basic and most useful theorems in the theory of fuzzy differential equations. However, weaker conditions have rarely been considered; in the present paper we aimed to prove a new existence and uniqueness theorem under the local Lipschitz and monotone conditions. References

[1] Liu, B., Uncertainty Theory, 2nd ed., Springer-Verlag, Berlin, 2007. Information Sciences, vol. 177, pp. 4329-4337, 2007.

[2] L. A. Zadeh, Fuzzy sets, Information and Control, vol.8, pp.338-353, 1965.

[3] Xiaowei Chen, Fuzzy Differential Equations, August 28, 2008.


Fuzzy Quotient BCK-algebras

S. Saidi Goraghani, F. Forouzesh 1- Department of Mathematics, Farhangian University, Kerman, Iran 2- Department of Mathematics, Higher Education Complex of Bam, Iran [email protected]

Abstract

In this paper, we introduce the concepts of fuzzy congruences and fuzzy quotient algebras in BCK-algebras. We prove that there is a bijection between the set of fuzzy ideals and the set of fuzzy congruences, and we show that for each fuzzy ideal μ there is an associated algebra A/μ. Also, we show that A/μ is a BCK-algebra.

Mathematical Subject Classification(2010): 06F35, 06D99.

Keywords: BCK-algebra, fuzzy ideal, fuzzy congruence.

1. INTRODUCTION The notion of a BCK-algebra was first formulated in 1966 by Imai and Iséki. This notion originated from two different sources. One of the motivations is based on set theory; another motivation comes from classical and non-classical propositional calculi. As is well known, there is a close relationship between the notion of set difference in set theory and the implication functor in logical systems. The following problems then arise from this relationship: What are the most essential and fundamental common properties? Can we establish a good theory of general algebras? To answer these problems, Y. Imai and K. Iséki introduced a new class of general algebras, called BCK-algebras. This name is taken from the BCK-system of C. A. Meredith. BCK-algebras have been applied to many branches of mathematics, such as group theory, functional analysis, probability theory and topology. The concept of a fuzzy subset was introduced by Zadeh. Since then, many studies have been performed on this subject and many researchers have started working on fuzzy algebraic structures. The concept of a fuzzy relation on a set was also introduced by Zadeh, and Kondo defined the quotient BCI-algebras induced by fuzzy ideals. Note that each congruence class in quotient BCI-algebras induced by fuzzy ideals is not a fuzzy set but a crisp set. We introduce the notions of fuzzy congruences, fuzzy congruence classes and fuzzy quotient algebras in BCK-algebras. We will show that the elements of fuzzy quotient algebras induced by fuzzy ideals are fuzzy sets in BCK-algebras. Hence we prove that there is a bijection between the set of fuzzy ideals and the set of fuzzy congruences. For each fuzzy ideal μ, there is an associated algebra X/μ. We prove that X/μ is a BCK-algebra and is isomorphic to the BCK-algebra X/U(μ). 2. PRELIMINARIES In this section, we state some definitions of BCK-algebras and ideals in BCK-algebras and we review related lemmas and theorems that we use in the next sections.

Definition 2.1. [8] A BCK-algebra is a structure X=(X,*,0) of type (2,0) such that:


(BCK1) ((x*y)*(x*z))*(z*y)=0, (BCK2) (x*(x*y))*y=0, (BCK3) x*x=0, (BCK4) 0*x=0, (BCK5) x*y=y*x=0 implies x=y, for all x,y,z ∈ X. The relation x ≤ y, defined by x*y=0, is a partial order on X with 0 as least element. In a BCK-algebra X, for any x,y,z ∈ X, we have (BCK6) (x*y)*z=(x*z)*y, (BCK7) x ≤ y implies z*y ≤ z*x, (BCK8) x ≤ y implies x*z ≤ y*z.

Definition 2.2. [3] A fuzzy set in a set X is a mapping μ : X → [0,1]. Let μ be a fuzzy set in X and t ∈ [0,1]; the set μ_t = {x ∈ X : μ(x) ≥ t} is called a level subset of μ.

Definition 2.3. [3] If X is a BCK-algebra, then a fuzzy set μ in X is a fuzzy ideal of X if it satisfies (FI1) μ(0) ≥ μ(x), for all x ∈ X, (FI2) μ(y) ≥ μ(x) ∧ μ(y*x), for all x,y ∈ X.

Note: From now on, in this paper, we let X be a BCK-algebra.

3. FUZZY CONGRUENCE IN BCK-ALGEBRAS In this section, we introduce the notions of fuzzy congruences, fuzzy congruence classes and fuzzy quotient algebras in BCK-algebras. We will show that the elements in fuzzy quotient algebras induced by fuzzy ideals are fuzzy sets in BCK-algebras.

Definition 3.1. Let X be a BCK-algebra. A fuzzy relation θ from X × X to [0,1] is called a fuzzy congruence on X if it satisfies the following:

(C1) θ(0,0) = θ(x,x), for all x ∈ X, (C2) θ(x,y) = θ(y,x), for all x,y ∈ X, (C3) θ(x,z) ≥ θ(x,y) ∧ θ(y,z), for all x,y,z ∈ X, (C4) θ(x*z, y*z) ≥ θ(x,y) and θ(z*x, z*y) ≥ θ(x,y). Example 3.2. Let X = {0, a, b, 1}, where 0 < a, b < 1. Define * by the table below. Then (X,*,0) is a BCK-algebra. Consider the fuzzy relation θ from X×X to [0,1] with θ(0,0) = θ(a,a) = θ(b,b) = θ(1,1) = 0.8, θ(a,1) = θ(b,0) = θ(1,a) = θ(0,b) = 0.5 and θ(0,1) = θ(1,0) = θ(a,b) = θ(b,a) = θ(a,0) = θ(b,1) = θ(0,a) = θ(1,b) = 0.3. It is easily checked that θ is a fuzzy

congruence in X.

* | 0 a b 1
0 | 0 0 0 0
a | a 0 … …
(only the first rows of the Cayley table for * survived extraction)
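For a concrete check of axioms (C1)-(C4), the R sketch below verifies the fuzzy relation θ of Example 3.2 on a finite BCK-algebra. Since the full Cayley table did not survive extraction, the operation used here is an assumed stand-in (the set-difference BCK-algebra on a two-element base set, whose first rows match the surviving fragment); it is for illustration only, not necessarily the table of the original example.

```r
# Illustrative check of (C1)-(C4) on an assumed 4-element BCK-algebra
# (0 = {}, a = {p}, b = {q}, 1 = {p,q}, with x*y = set difference).
X <- c("0", "a", "b", "1")
star <- matrix(c("0","0","0","0",      # 0 * .
                 "a","0","a","0",      # a * .
                 "b","b","0","0",      # b * .
                 "1","b","a","0"),     # 1 * .
               nrow = 4, byrow = TRUE, dimnames = list(X, X))

theta <- matrix(0.3, 4, 4, dimnames = list(X, X))   # values of Example 3.2
diag(theta) <- 0.8
theta["a","1"] <- theta["1","a"] <- theta["b","0"] <- theta["0","b"] <- 0.5

is_fuzzy_congruence <- function(theta, star, X) {
  ok <- all(theta["0","0"] == diag(theta))                  # (C1)
  ok <- ok && all(theta == t(theta))                        # (C2)
  for (x in X) for (y in X) for (z in X) {
    ok <- ok && theta[x, z] >= min(theta[x, y], theta[y, z])        # (C3)
    ok <- ok && theta[star[x, z], star[y, z]] >= theta[x, y]        # (C4), right
    ok <- ok && theta[star[z, x], star[z, y]] >= theta[x, y]        # (C4), left
  }
  ok
}
is_fuzzy_congruence(theta, star, X)   # TRUE for this table and theta
```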


Remark 3.3. If θ is a fuzzy congruence on X, then

(i) θ(Nx, Ny) ≥ θ(x,y), (ii) θ(x ∧ z, y ∧ z) ≥ θ(x,y), (iii) θ(x ∨ z, y ∨ z) ≥ θ(x,y), (iv) θ(0,0) ≥ θ(x,y), for all x,y ∈ X. Proof. (i),(ii),(iii): By (C4), the proof is routine. (iv) We have θ(0,0) = θ(x,x) and θ(x,x) ≥ θ(x,y) ∧ θ(y,x) = θ(x,y). Then θ(0,0) ≥ θ(x,y).

Theorem 3.4. Let θ be a fuzzy congruence on X. If U(θ; t) ≠ ∅, then U(θ; t) is a congruence on X, for all t ∈ [0,1], where U(θ; t) = {(x,y) ∈ X×X : θ(x,y) ≥ t}.

Proof. Suppose that θ is a fuzzy congruence on X. Take any t ∈ [0,1] such that U(θ; t) is not empty. Since U(θ; t) is not empty, there is an element (u,v) ∈ X×X such that (u,v) ∈ U(θ; t). This means that t ≤ θ(u,v). Since θ is a fuzzy congruence, we have t ≤ θ(u,v) ≤ θ(0,0) = θ(x,x). That is, (x,x) ∈ U(θ; t). Also, the relation is clearly symmetric. Let (x,y),(y,z) ∈ U(θ; t). Since t ≤ θ(x,y), θ(y,z), we have t ≤ θ(x,y) ∧ θ(y,z) ≤ θ(x,z). Hence (x,z) ∈ U(θ; t). Now, assume that (x,y) ∈ U(θ; t). Since t ≤ θ(x,y) ≤ θ(x*u, y*u), we have (x*u, y*u) ∈ U(θ; t). Thus U(θ; t) is a congruence on X.

Definition 3.5. Let θ be a fuzzy congruence on a BCK-algebra X and x ∈ X. Define the fuzzy set θ_x in X by θ_x(y) = θ(x,y), for all y ∈ X. The fuzzy set θ_x is called the fuzzy congruence class of x by θ in X. The set X/θ = {θ_x | x ∈ X} is called the fuzzy quotient set by θ. Example 3.6. Consider the BCK-algebra X = {0, a, b, 1} with the fuzzy congruence relation θ of Example 3.2. The fuzzy quotient set by θ is X/θ = {θ_0, θ_a, θ_b, θ_1}.

Lemma 3.7. Let θ be a fuzzy congruence on X. Then θ_0 is a fuzzy ideal in X. Proof. By Remark 3.3 (iv), θ_0(0) = θ(0,0) ≥ θ(0,x) = θ_0(x). Since θ is a fuzzy congruence on X, it follows that θ(0,y) ≥ θ(0, y*x) ∧ θ(y*x, y) and θ(y*x, y) = θ(y*x, y*0) ≥ θ(x, 0). Hence θ(0,y) ≥ θ(0, y*x) ∧ θ(0,x). Thus θ_0(y) ≥ θ_0(y*x) ∧ θ_0(x), for all x,y ∈ X. This shows that θ_0 is a fuzzy ideal in X. Lemma 3.8. Let μ be a fuzzy ideal in X. Then θ_μ(x,y) = μ(x*y) ∧ μ(y*x) is a fuzzy congruence on X. Proof. We only show that θ_μ satisfies conditions (C3) and (C4). We have θ_μ(x,z) = μ(x*z) ∧ μ(z*x) ≥ (μ(x*y) ∧ μ(y*z)) ∧ (μ(z*y) ∧ μ(y*x)) = (μ(x*y) ∧ μ(y*x)) ∧ (μ(y*z) ∧ μ(z*y)) = θ_μ(x,y) ∧ θ_μ(y,z). For (C4), it follows that θ_μ(x*z, y*z) = μ((x*z)*(y*z)) ∧ μ((y*z)*(x*z)). Since (x*z)*(y*z) ≤ x*y and (y*z)*(x*z) ≤ y*x, we obtain θ_μ(x*z, y*z) ≥ μ(x*y) ∧ μ(y*x) = θ_μ(x,y). Note. Let μ be a fuzzy ideal in X and x ∈ X. In the following, let μ_x denote the fuzzy congruence class of x by θ_μ in X and X/μ the fuzzy quotient set by θ_μ. Lemma 3.9. If μ is a fuzzy ideal in X, then μ_x = μ_y if and only if μ(x*y) = μ(y*x) = μ(0), for all x,y ∈ X. Proof. If μ is a fuzzy ideal in X, then μ_u(v) = θ_μ(u,v) = μ(u*v) ∧ μ(v*u), i.e., μ_u(v) = μ(u*v) ∧ μ(v*u), for

any u,v ∈ X. If μ_x = μ_y, then μ_x(x) = μ_y(x). From the above, it follows that μ(x*x) = μ(0) = μ(y*x) ∧ μ(x*y). Thus μ(x*y) = μ(y*x) = μ(0).


Conversely, let (x*y)= (y*x)= (0). Then (x*z)≥ (x*y)˄ (y*z) and (y*z)≥ (y*x)˄ (x*z). If (x*y)= (y*x)= (0), then (x*z)≥ (y*z) and (y*z)≥ (x*z) and so (x*z)= (y*z). Similarly, we have (z*x)= (z*y). This implies that (z)= (x*z)˄ (z*x)= (y*z)˄ (z*y)= (z). Hence = . Corollary 3.10 If is a fuzzy ideal, then = , if and only if x~ ( )y, where if and only if x*y∈ ( ) and

y*x∈ ( ). Note. Let be a fuzzy ideal in X. For any , ∈ X/ , we define ∗ = ∗ . Theorem 3.11. Let be a fuzzy ideal in X. Then X/ =(X/ ,*, ) is a BCK-algebra. Proof. We prove that the operation on X/ is well defined. Let = , = . Then by Corollary 3.10,

x~ ( ) and y~ ( ) t. Since ( ) is a congruence relation, we have x*y~ ( ) s*t, so ∗ = ∗ . It is routine to prove that X/ is a BCK-algebra.

Remark 3.12. From Corollary 3.10 and Theorem 3.11, we conclude that ˅ = ˅ and ˄ = ˅ . Theorem 3.13. Let be a fuzzy ideal in X. Define a mapping f:X→ / by f(x)= . Then (1) f is a surjective homomorphism, (2) Ker(f)= , (3) X/ is isomorphic to the BCK-algebra X/ ( ). Proof. (1) Clearly, f is surjective. We have f(x*y)= ∗ = ∗ =f(x)*f(y) and f(0)= . Hence f is a surjective

homomorphism. (2) x∈ Ker(f) if and only if f(x)= if and only if = if and only if x~ ( ) 0 if and only if x ∈ ( ).

Hence Ker(f)= ( ). (3) By (1) and (2) we have X/ is isomorphic to the BCK-algebra X/ ( ). Definition 3.14. Let be a equivalence relation and be a fuzzy relation on X. Then is called

-invariant if (x)= (a) and (y)= (b) imply (x,y)= (a,b), for x,y,a,b∈ X. Definition 3.15. Let be a congruence relation and be -invariant fuzzy relation on X. We define a fuzzy

relation on X/ as follows: ( (x), (y))= (x,y), for any x,y∈ X. Theorem 3.16. If is a -invariant fuzzy congruence relation on X, then is so on X/ . Proof. Since is a -invariant fuzzy congruence relation on X, we get is well defined. The proofs of , , are routine.

4. CONCLUSIONS The aim of this paper is to investigate fuzzy congruences in BCK-algebras. We obtained some important results in this field. 5. REFERENCES


[1] H. A. S. Abujabal, M. A. Obaid, M. Aslam, A. B. Thaheem, On Annihilators of BCK-algebras, Czechoslovak Mathematical Journal, 45(4) (1995), 727-735.

[2] O. Heubo-Kwegna and J. B. Nganou, A Global Local Principle for BCK-modules, International Journal of Algebra, 5(14) (2011), 691-702.

[3] C. S. Hoo, Fuzzy ideals of BCI and MV-algebras, Fuzzy Sets and Systems Vol. 62, pp. 111--114, 1994. [4] Y. Imai and K. Iséki, On axiom systems of propositional calculi, Proceedings of the Japan Academy, 42 (1966),

19-21. [5] K. Iséki, On ideals in BCK-algebras, Mathematics Seminar Notes, 3(1975), 1-12. [6] K. Iséki and S. Tanaka, Ideal theory of BCK-algebras, Mathematica Japonica, 21(1976), 351-366. [7] Y. B. Jun, K. J. Lee and C. H. Park, A method to make BCK-algebras, Communication of the Korean

Mathematical Society, 22(4) (2007), 503-508. [8] M. Kondo, W. A. Dudek, On the transfer principle in fuzzy theory, Mathware. Soft Comput., Vol. 12 pp.

4--55, 2005. [9] J. Meng and Y. B. Jun, BCK-algebras, Kyungmoon Sa Co, Korea, (1994). [10] J. G. Raftery, On prime ideal and subdirect decompositions of BCK-algebras, Mathematica Japonica, 32(1987),

811-818. [11] C. N. Tchikapa, C. Lele, Relation diagram between fuzzy n-fold filters in $BL$-algebras, Ann. Fuzzy Math.

Inform., 4, No. 1, (2012), 131--141. [12] O. Xi, Fuzzy BCK-algebras, Jpn. J. Math., 36 (1991), 935--942. [13] L. A. Zadeh, Fuzzy sets, Information and Control, Vol. 8, pp. 338--353, 1965.


Kolmogorov-Smirnov fuzzy test for fuzzy random variables

Vahid Ranjbar, Zahra Radmehr*, Kamel Abdollahnezhad Department of Statistics, Faculty of Sciences, University of Golestan, Gorgan, 49138-15739, Iran. E-mail: [email protected]

Abstract

In this paper, a new method is proposed for developing Kolmogorov-Smirnov (one sample and two samples) tests for the case when the data are observations of fuzzy random variables, and the hypotheses are imprecise rather than crisp.

Keywords: Kolmogorov-Smirnov test (K-S test), Fuzzy p-value, Fuzzy (empirical) cumulative distribution function, fuzzy random variable, Credibility degree.

1. Introduction Non-parametric procedures are statistical procedures that make relatively mild assumptions regarding the distribution and/or the form of the underlying functional relationship. The K-S test is a non-parametric procedure for the equality of continuous, one-dimensional probability distributions that can be used to compare a sample with a reference probability distribution or to compare two samples. In the classical version of such a test, the observations of the sample are generally assumed to be crisp (precise) quantities. The present paper aims to develop the one-sample and two-sample K-S tests for fuzzy random variables in which the underlying hypotheses are imprecise. In the following, an index to compare a fuzzy number Ã ∈ ℱ(R) and a crisp value x ∈ R is expressed, where ℱ(R) is the set of all fuzzy real numbers. Then a new notion of fuzzy random variable based on this index is defined.

Definition 1.1. Let Ã ∈ ℱ(R) and x ∈ R. The index C : ℱ(R) × R → [0,1] defined by C(Ã ≤ x) = ( sup_{t ≤ x} Ã(t) + 1 − sup_{t > x} Ã(t) ) / 2 shows the credibility degree that Ã is less than or equal to x. Similarly, C(Ã > x) = 1 − C(Ã ≤ x) shows the credibility degree that Ã is greater than x. (Liu 2013 [3])
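To make the index concrete, the following R sketch evaluates C(Ã ≤ x) for a triangular fuzzy number. The triangular shape, the grid-based supremum, and the helper names are illustrative assumptions of this sketch, not part of the paper's definitions.

```r
# Illustrative sketch: credibility index C(A <= x) of Definition 1.1 for a
# triangular fuzzy number A = (a; l, r) with core a and support [a - l, a + r].
tri_membership <- function(t, a, l, r) {
  pmax(0, pmin((t - (a - l)) / l, ((a + r) - t) / r, 1))
}

credibility_leq <- function(x, a, l, r, grid = seq(a - l, a + r, length.out = 2001)) {
  mu <- tri_membership(grid, a, l, r)
  sup_le <- if (any(grid <= x)) max(mu[grid <= x]) else 0   # sup of membership on (-inf, x]
  sup_gt <- if (any(grid >  x)) max(mu[grid >  x]) else 0   # sup of membership on (x, +inf)
  (sup_le + 1 - sup_gt) / 2
}

credibility_leq(0.10, a = 0.1039, l = 0.0073, r = 0.0052)   # a value in [0, 1]
```

The example call reuses one observation from Table 1 of Example 6.1; any other LR fuzzy number could be substituted.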

Definition 1.2. Let ∈ ℱ(R) and [0,1], then = ∈ [0] : ≤ ≥ is called the -pessimistic value of . Clearly, is non-decreasing function of (0,1]. Definition 1.3. Let ℱ ( ) be the class of LR-fuzzy numbers with continuous membership functions. For , ∈ℱ ( ), let ∆ = ⊖ − ∫ ⊖ . :( ⊖ ) (1.1)


Then, the degree of truth of is greater than , is defined to be , = ∆ ∆ ∆ . (1.2) So that we have the following relations ⊕ = + ⨂ = ≥ 0 ≤ 0

Theorem 1.4. The fuzzy preference relation D is transitive, i.e. for , , ∈ ( ), if , ≥ 0.5 and , ≥0.5, then , ≥ 0.5. Remark 1.5. For two fuzzy numbers and , we say that " is bigger than " if , > 0.5. Suppose that a random experiment is described by a probability space (Ω,A,P), where Ω is a set of all possible outcomes of the experiment, A is a -algebra of subsets of Ω and P is a probability measure on the measurable space (Ω,A). Definition 1.6. (Hesamian and Chachi 2015 [1]) The fuzzy-valued mapping : Ω → ℱ(R) is called a fuzzy random variable if for any ∈ [0,1], the real-valued mapping : Ω → R is a real-valued random variable on (Ω,A,P), where ( ) = ( )[0] ∶ ( ) ≤ ≥ . 2. Fuzzy (empirical) cumulative distribution function In this section, we extend the concept of cumulative distribution function for fuzzy random variables. To do this, first we present the definition of cumulative distribution function for fuzzy random variables which defined by Hesamian and Chachi [1]. Definition 2.1. The fuzzy cumulative distribution function (f.c.d.f) of fuzzy random variable at ∈ R is defined as fuzzy set ( ) with the following membership function ( )( )= [0,1] : ( ) = , ∈ [0,1]. Definition 2.2. The f.c.d.f. of fuzzy random variable at ∈ R is defined as fuzzy set ( ) with the following α-pessimistic ( ) = ∈ ( ) [0], ( )( ) ≥ 1 − α = ( ) [1 − ] = max ( ), ( ) = ( ). Definition 2.3. Suppose that , , … , is a fuzzy random sample with fuzzy observed values , , … , . The fuzzy empirical distribution function (f.e.d.f.) of , , … , , at ∈ is defined to be the fuzzy set ( ) with the following α-pessimistic function ( ) = 1 ( ) ≤

, (2.1)

where, I is the indicator function defined as ( ) = 1 ,0 . 3. Kolmogorov-Smirnov tests in fuzzy environment 3.1 One sample test Now, suppose that we have a fuzzy random sample , , … , with observed values , , … , from a population with continuous f.c.d.f. . In this section we generalize the classical K-S one-sample test to a fuzzy environment. In fact, based on the observations of a fuzzy random sample, we are going to test the following fuzzy hypothesis


left one-sided: H̃₀: F̃(x) = F₀(x) for all x ∈ R, versus H̃₁: F̃(x) ≻ F₀(x) for some x ∈ R, (3.1)

right one-sided: H̃₀: F̃(x) = F₀(x) for all x ∈ R, versus H̃₁: F̃(x) ≺ F₀(x) for some x ∈ R, (3.2)

two-sided: H̃₀: F̃(x) = F₀(x) for all x ∈ R, versus H̃₁: F̃(x) ≠ F₀(x) for some x ∈ R. (3.3)

Definition 3.1. For a fuzzy random sample , , … , , the fuzzy K-S one-sample test statistic is a fuzzy set √ , √ and √ on [0,∞) with the following α-pessimistic function, respectively, for left one-sided, right one-sided and two-sided

left one-sided: √ = √ sup ∈ ( ) − ( ) ( ) , (3.4)

right one-sided: √ = √ sup ∈ ( ) − ( ) ( ) , (3.5)

two-sided: √ = max ∈ √ , √ . (3.6)
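Because the exact α-pessimistic forms in (3.4)-(3.6) are partially lost in this extraction, the R sketch below only illustrates the general mechanism: triangular fuzzy observations are reduced to their α-pessimistic values via the credibility index of Definition 1.1, the empirical c.d.f. of Definition 2.3 is formed, and a two-sided K-S-type statistic against a crisp null c.d.f. F₀ is computed. All helper names, the crisp null, and the evaluation only at observation points are our own simplifications.

```r
# Illustrative, simplified sketch of a fuzzy K-S statistic at a single level alpha.
alpha_pessimistic <- function(center, l, r, alpha) {
  xs <- seq(center - l, center + r, length.out = 2001)
  mu <- pmax(0, pmin((xs - (center - l)) / l, ((center + r) - xs) / r))
  cred <- sapply(xs, function(x) (max(mu[xs <= x]) + 1 - max(c(mu[xs > x], 0))) / 2)
  xs[which(cred >= alpha)[1]]            # smallest x with C(X <= x) >= alpha
}

fuzzy_ks_two_sided <- function(centers, l, r, F0, alpha) {
  obs_a <- mapply(alpha_pessimistic, centers, l, r, MoreArgs = list(alpha = alpha))
  n  <- length(obs_a)
  xs <- sort(obs_a)
  Fhat <- ecdf(xs)                       # alpha-level empirical c.d.f.
  sqrt(n) * max(abs(Fhat(xs) - F0(xs)))  # two-sided statistic at this alpha
}

# Example: the fuzzy data of Table 1 (Example 6.1) against the uniform(0,1) c.d.f.
centers <- c(0.0123, 0.1039, 0.1954, 0.2621, 0.2802, 0.3217, 0.3645, 0.3919, 0.4240,
             0.4814, 0.5139, 0.5856, 0.6275, 0.6541, 0.6889, 0.7621, 0.8320, 0.8871,
             0.9249, 0.9634)
fuzzy_ks_two_sided(centers, l = 0.07 * centers, r = 0.05 * centers,
                   F0 = punif, alpha = 0.5)
```

Repeating the last call over a grid of α values traces out an α-pessimistic curve of the statistic, which is the ingredient used to build the fuzzy p-value in Section 4.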

Remark 3.2. If the fuzzy random sample reduces to a crisp random sample x₁, x₂, . . . , xₙ, then, for every α ∈ (0,1], √ = √ sup ∈ ( ) − ( ) = √ , which is the classical K-S one-sample test statistic. 3.2 Two-sample test In this section, we extend the classical two-sample K-S test to the case when the underlying hypotheses and the available observations are imprecise rather than crisp. Let F̃ and G̃ be the f.c.d.f.'s of fuzzy random variables X̃ and Ỹ, respectively. We are going to test the following fuzzy hypotheses

left one-sided: H̃₀: F̃(x) = G̃(x) for all x ∈ R, versus H̃₁: F̃(x) ≻ G̃(x) for some x ∈ R, (3.7)

right one-sided: H̃₀: F̃(x) = G̃(x) for all x ∈ R, versus H̃₁: F̃(x) ≺ G̃(x) for some x ∈ R, (3.8)

two-sided: H̃₀: F̃(x) = G̃(x) for all x ∈ R, versus H̃₁: F̃(x) ≠ G̃(x) for some x ∈ R. (3.9)

Definition 3.3. The fuzzy K-S two-sample test statistic for testing the fuzzy hypotheses H̃₀: F̃ = G̃ is defined as a fuzzy set with the following α-pessimistic functions, respectively, for the left one-sided, right one-sided and two-sided cases


left one-sided: = sup ∈ ( ) − ( ) ( ) , (3.10)

right one-sided: = sup ∈ ( ) − ( ) ( ) , (3.11)

two-sided: = , . (3.12)

Remark 3.4. If the fuzzy random sample , , … , and , , … , reduce to the crisp random sample , , . . . , and , , . . . , , then, for every ∈ (0,1],

+ = sup ∈ ( ) − ( ) = + , which is the classical K-S two-samples test statistic. 4. Fuzzy p-value In the following, we extend the concept of p-value in fuzzy environment. Definition 4.1. In the problem of one sample K-S tests for testing hypotheses (3.3), based on a fuzzy random sample, the fuzzy p-value is defined to be fuzzy set -value with the following -pessimistic function, ( − ) = √ ( ) ≥ √ ( ) , (4.1)

where √ ( ) denotes the observed fuzzy statistics √ ( ). Similar arguments can be made for deriving

fuzzy p-value for testing the right and left one-sided one-sample K-S imprecise hypotheses.

Remark 4.2. It is noticeable that for any ∈ (0,1], √ ( ) ≥ √ ( ) is the classical p-value for

testing the null hypothesis based on a crisp random sample , , … , . Therefore -value is a natural extension of the classical p-value for K-S one-sample test with fuzzy observations and fuzzy hypothesis. Definition 4.3. In the problem of two-samples K-S test for testing hypotheses (3.9) based on a fuzzy random sample, the fuzzy p-value is defined to be fuzzy set -value with the following α-pessimistic function, ( − ) = + ( ) ≥ + ( ) , where ( )denotes the observed fuzzy statistics ( ). Similar arguments can be made

for deriving fuzzy p-value for testing the right and left one-sided two-samples K-S imprecise hypotheses.

Remark 4.4. Similar to one-sample approach, the -value is a natural extension of the classical p-value for K-S two-samples test with fuzzy observations and fuzzy hypothesis.

5. Method of decision making In the classical testing problem a decision rule is made to accept or to reject the null hypothesis, by comparing the observed crisp p-value with the given significance level. But, in the proposed methods, the p-value is defined as a fuzzy set. Therefore, it is natural to consider the significance level as a fuzzy set, too.


Definition 5.1. Consider the problem of testing the fuzzy hypotheses discussed in this paper, based on a fuzzy random sample. Then, at a fuzzy significance level δ̃, the fuzzy test φ̃ is defined to be a fuzzy set as follows: φ̃ = { φ̃[1]⁄1 , φ̃[0]⁄0 }, where φ̃[1] = D(p̃-value, δ̃) is called the degree of acceptance of H̃₀ and φ̃[0] = 1 − φ̃[1] is the degree of rejection of H̃₀. 6. Numerical examples To demonstrate the application of the proposed method, we provide practical examples in this section. Example 6.1. The 20 observations in Table 1 are chosen randomly from the continuous uniform distribution over (0,1). We want to test the null hypothesis that the square roots of these numbers also have a continuous uniform distribution over (0,1). We may assume that the data are fuzzy. In fact, imprecision is formulated by the fuzzy numbers x̃_i = (x_i; 0.07x_i, 0.05x_i), i = 1, 2, …, 20.

Table 1 - The fuzzy data of Example 6.1, x̃_i = (x_i; l_i, r_i):
(0.0123; 0.0009, 0.0006)  (0.1039; 0.0073, 0.0052)  (0.1954; 0.0136, 0.0098)
(0.2621; 0.0183, 0.0131)  (0.2802; 0.0196, 0.0140)  (0.3217; 0.0225, 0.0161)
(0.3645; 0.0225, 0.0182)  (0.3919; 0.0274, 0.0196)  (0.4240; 0.0299, 0.0212)
(0.4814; 0.0337, 0.0241)  (0.5139; 0.0360, 0.0257)  (0.5856; 0.0409, 0.0293)
(0.6275; 0.0439, 0.0314)  (0.6541; 0.0458, 0.0327)  (0.6889; 0.0482, 0.0344)
(0.7621; 0.0533, 0.0381)  (0.8320; 0.0582, 0.0416)  (0.8871; 0.0621, 0.0444)
(0.9249; 0.0647, 0.0462)  (0.9634; 0.0674, 0.0482)

Now, suppose that we wish to test fuzzy hypothesis (3.1) in which denotes the c.d.f. of the uniform distribution on the fuzzy interval , where = (0; 0.01,0.01) and = (1; 0.01,0.01) . To compute -value, we should calculate the α-pessimistic of fuzzy test statistic for every α ∈ (0,1]. √ = √ sup ∈ ( ) − ( ) ( ) , in which ( ) ( ) = − − = + 0.01 − 0.02 . So, we obtain √ . = 1.4, and from (4.1) we obtain ( − value ) . = 0.041. By continuing this procedure for

other values of α, the α-pessimistic function of the observed -value is drawn point-by-point based on values of α ∈ 0.001,0.002,...,1 versus ( − value ) ∈[0,1], as we observed in Figure 1. Finally, the membership function of − value is derived about 0.05. For the fuzzy significance level = (0.05; 0.01,0.01) (with α-pessimistic = 0.05 + 0.01), by some calculation, we get [1] = 0.395. Thus, the fuzzy test function is obtained as follows

= [1]1 , [0]0 = 0.3951 , 0.6050 .


Therefore, H̃₀ is accepted with degree of acceptance 0.395 or, equivalently, we reject H̃₀ with degree of rejection 0.605.

Figure 1. The α-pessimistic of the fuzzy p-value (line) and the fuzzy significance level (dash) in Example 6.1.

Example 6.2. The data in Table 2 are a subset of the data obtained by Friedman et al. (1971) in an experiment comparing the average concentrations of human plasma growth hormone both resting and after arginine hydrochloride infusion in relatively coronary prone subjects (persons with type A behavior patterns) with the corresponding concentrations of relatively coronary- resistant individuals (subjects with type B behavior patterns). Type A behavior is characterized by an excessive sense of time urgency, drive, and competitiveness; type B denotes a converse type of behavior. However, due to some difficulties in measurements, we are not sure about the crispness of the observations. So, it is convenient to consider the data as fuzzy observations. Now, in the following, based on this fuzzy data presented in Table 2, we test the hypotheses : = versus : ≠

Table 2 - The fuzzy data of Example 6.2.

i  | Persons with type A behavior patterns x̃_i = (x_i; l_i, r_i) | Subjects with type B behavior patterns ỹ_i = (y_i; l_i, r_i)
1  | (3.6; 1.15, 0.75)  | (16.2; 0.78, 0.98)
2  | (2.6; 0.95, 0.88)  | (17.4; 1.01, 1.15)
3  | (4.7; 0.82, 1.05)  | (8.5; 1.56, 1.20)
4  | (8.0; 1.21, 1.22)  | (15.6; 1.48, 1.34)
5  | (3.1; 0.88, 1.17)  | (5.4; 1.62, 1.29)
6  | (8.8; 1.16, 0.92)  | (9.8; 1.33, 1.37)
7  | (4.6; 1.11, 1.16)  | (14.9; 1.42, 1.18)
8  | (5.8; 1.08, 0.86)  | (16.6; 1.51, 1.11)
9  | (4.0; 0.99, 1.13)  | (15.9; 1.18, 1.09)
10 | (4.6; 1.14, 1.19)  | (5.3; 1.26, 1.31)
11 | (10.5; 1.44, 1.27) | —

Based on the procedure proposed in this paper, the α-pessimistic function of fuzzy p-value is derived point by point as follows (see also, Figure 2)

(p̃-value)_α =
0.047, 0.00 ≤ α < 0.43,
0.128, 0.43 ≤ α < 0.69,
0.272, 0.69 ≤ α < 0.81,
0.483, 0.81 ≤ α < 0.83,
0.725, 0.83 ≤ α ≤ 1.00.


For the fuzzy significance level = (0.05,0.10,0.15) (with α-pessimistic = 0.1 + 0.05) by Definition 1.3 we get [1] = − value, = 0.885. Thus, the fuzzy test function is obtained as follows

= [1]1 , [0]0 = 0.8851 , 0.1150 . Therefore, the null hypothesis of is accepted with degree of acceptance 0.885.

Figure 2. The α-pessimistic of the fuzzy p-value (line) and the fuzzy significance level (dash) in Example 6.2.

References
[1] Hesamian, G. and Chachi, J. (2013), "Two-sample Kolmogorov-Smirnov fuzzy test for fuzzy random variables", Statistical Papers, Volume 56, Issue 1, pp. 61-82.
[2] Liu, B. (2013), "Uncertainty theory", 4th ed., Springer, Berlin.
[3] Viertl, R. (2011), "Statistical methods for fuzzy data", Wiley, Chichester.
[4] Wu, H.C. (2005), "Statistical hypotheses testing for fuzzy data", Fuzzy Sets and Systems 175, pp. 30-56.


Numerical solving Fredholm fuzzy integral equations by using Radial Basis Functions

R. Firouzdor1, Sh. Asghari2, R. Salehi3 1- Central Tehran Branch, Islamic Azad University, Young Researcher and Elite Club 2- Department of Mathematics, Islamic Azad University, Hamedan. 3- Department of Electrical and Computer and Information Technology, Islamic Azad University, Ghazvin. [email protected]

Abstract The idea of this paper is to use radial basis functions for solving integral equations. The method is applied to Fredholm fuzzy integral equations of the second kind with a combination of the collocation method and radial basis functions (RBFs). Numerical solutions, obtained from the resulting linear system, are shown at the end. Keywords: Radial basis functions, Fuzzy number, Fuzzy integral equations.

1. INTRODUCTION Fuzzy integral equations are important in solving problems in many topics of applied mathematics, physics, geography, medicine and biology. Dubois and Prade [1] were the first to introduce the integration of fuzzy functions. This concept has grown quickly in recent years, chiefly for the approximation of fuzzy integral equations. S. Abbasbandy and T. Allahviranloo studied numerical methods for fuzzy differential equations [2,3,4]. The significant problems in approximating fuzzy integral equations are the existence and uniqueness of the solution and the structure of the numerical methods. Therefore the numerical methods for fuzzy integral equations comprise various techniques. Some of these methods are Nystrom techniques, quadrature rules [5,6] and Chebyshev interpolation [7]. RBFs play an important role in approximating the numerical solution of integral equations [8,9]. The radial basis function (RBF) method, especially the multiquadric function, was introduced for solving linear integral equations.

In this work, the radial basis functions (RBFs) are applied to solve the one-dimensional fuzzy Fredholm integral equation of the second kind with unknown function y(x). For this purpose we define the radial basis function interpolant as follows:

y(x) = Σ_{j=1}^{N} c_j φ(||x − x_j||),

where the c_j's are unknown scalars which should be determined to obtain the approximation function.


We utilize N distinct points x₁, x₂, ..., x_N ∈ R^d, where d is the dimension of the Euclidean space, as centers for each known function used in the approximation. The main radial basis function used here to reach an acceptable approximation is the Gaussian (GA).

2. RBF A radial basis function is a function R^d → R built from a univariate profile φ(r) of the distance r. Another form of this function that we can mention is the multiquadric φ(r) = √(r² + β²) (MQ), where r = ||x − x_j|| is the distance to a center point of the function and β is a real constant used as the shape parameter of the RBF. In some places φ(r) is written as φ(r, β). One of the main advantages of RBFs is that they do not need a grid, so scattered data can be used. The Euclidean distances between points are easily calculated, so the linear interpolation problem for the approximation is straightforward to set up.

3. FUZZY INTRODUCTION Fuzzy logic was first introduced by Lotfi A. Zadeh in the paper "Fuzzy Sets" [10]. He and other researchers have been expanding it in recent years. Fuzzy logic has become one of the chief tools for a number of different applications. Developments of fuzzy models such as fuzzy logic control, fuzzy system models, and fuzzy integral and differential equations are considerable [11].

Definition 3.1. In the following, concepts of fuzzy are introduced by describing the fuzzy number.

A fuzzy number is a pair (u̲, ū) of functions in parametric form which satisfies the following conditions:

1. u̲(r) is a bounded left-continuous non-decreasing function over [0, 1],

2. ū(r) is a bounded left-continuous non-increasing function over [0, 1],

3. u̲(r) ≤ ū(r), 0 ≤ r ≤ 1.

The set of all fuzzy numbers is denoted by E¹. A popular fuzzy number is the triangular fuzzy number, which is a fuzzy number represented by three points as follows: A = (a₁, a₂, a₃).

This representation is interpreted as the membership function

μ_A(x) = 0 for x < a₁; (x − a₁)/(a₂ − a₁) for a₁ ≤ x ≤ a₂; (a₃ − x)/(a₃ − a₂) for a₂ < x ≤ a₃; 0 for x > a₃.
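A minimal R sketch of this piecewise membership function (the function name is ours):

```r
# Triangular membership for A = (a1, a2, a3), following the definition above.
tri_mf <- function(x, a1, a2, a3) {
  ifelse(x < a1, 0,
  ifelse(x <= a2, (x - a1) / (a2 - a1),
  ifelse(x <= a3, (a3 - x) / (a3 - a2), 0)))
}

tri_mf(c(0.5, 1, 1.5, 2.5), a1 = 0, a2 = 1, a3 = 2)   # 0.5 1.0 0.5 0.0
```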


4. FUZZY INTEGRAL In this section, we consider Fredholm integral equations of the second kind:

y(s) = f(s) + ∫_a^b K(s, t) y(t) dt, (1)

where K is an arbitrary given kernel function and f(s) is a given function of s ∈ [a, b]. If in this integral y(s) is selected as a fuzzy function, the solution and the integral equation will be fuzzy. Of course, it is clear that if it is a crisp function, then the solution of the above equation is crisp. We introduce the fuzzy integral equation as follows:

ỹ(s) = f̃(s) + ∫_a^b K(s, t) ỹ(t) dt. (2)

In this paper we write the fuzzy function in the form ỹ(s) = y(s, r), where r is a fuzzy parameter. So we replace equation (2) by equation (3):

y(s, r) = f(s, r) + ∫_a^b K(s, t) y(t, r) dt. (3)

From now on, we use the above equation for solving our numerical examples.

5. RBF METHOD FOR SOLVING FUZZY INTEGRAL EQUATIONS In this section we explain our method for obtaining the solution of the fuzzy integral equation. The goal is the approximation of the unknown function y(s):

y(s) = Σ_{j=1}^{N} c_j φ_j(s), where φ_j(s) = φ(||s − s_j||). (4)

In addition, we introduce the unknown fuzzy function in the form of equation (5); therefore the shape of the fuzzy function used in this paper is as follows:

y(s, r) = Σ_{j=1}^{N} c_j(r) φ_j(s). (5)

Observe carefully that the scalars c_j are now selected to be fuzzy.

Substituting equation (5) into equation (3), we have:

Σ_{j=1}^{N} c_j(r) φ_j(s) = f(s, r) + ∫_a^b K(s, t) Σ_{j=1}^{N} c_j(r) φ_j(t) dt. (6)

Therefore


Σ_{j=1}^{N} c_j(r) φ_j(s) = f(s, r) + Σ_{j=1}^{N} c_j(r) ∫_a^b K(s, t) φ_j(t) dt. (7)

So we have

Σ_{j=1}^{N} c_j(r) [ φ_j(s) − ∫_a^b K(s, t) φ_j(t) dt ] = f(s, r). (8)

For s = s_i, i = 1, …, N, equation (8) becomes equation (9):

Σ_{j=1}^{N} c_j(r) [ φ_j(s_i) − ∫_a^b K(s_i, t) φ_j(t) dt ] = f(s_i, r). (9)

Equation (8) in matrix form is shown in equation (10):

[Φ − B] C = F, (10)

where C = (c_j(r)), Φ_ij = φ_j(s_i), B_ij = ∫_a^b K(s_i, t) φ_j(t) dt, and F_i = f(s_i, r).

We are thus confronted with a linear system of equations, from which the unknown vector C is found.
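The R sketch below shows, under stated assumptions, how the collocation system (10) can be assembled for one parametric level r: the Gaussian basis, the rectangle-rule quadrature for the entries B_ij, and all helper names (and the placeholder kernel and data) are our own choices, not the paper's implementation.

```r
# Illustrative sketch: assemble [Phi - B] C = F of equation (10) for one r-level of
# the fuzzy Fredholm equation y(s,r) = f(s,r) + int_a^b K(s,t) y(t,r) dt.
phi <- function(r, beta = 3) exp(-(beta * r)^2)        # assumed Gaussian basis

assemble_system <- function(s_nodes, K, f_r, a, b, n_quad = 200) {
  N  <- length(s_nodes)
  tq <- seq(a, b, length.out = n_quad)                 # simple quadrature grid
  w  <- rep((b - a) / n_quad, n_quad)                  # rectangle-rule weights
  Phi <- outer(s_nodes, s_nodes, function(s, sj) phi(abs(s - sj)))
  B <- matrix(0, N, N)
  for (i in 1:N) for (j in 1:N)                        # B_ij = int K(s_i, t) phi_j(t) dt
    B[i, j] <- sum(w * K(s_nodes[i], tq) * phi(abs(tq - s_nodes[j])))
  list(A = Phi - B, F = f_r(s_nodes))
}

# Usage for a fixed r: solve for the coefficients c_j(r). K and f_r below are
# placeholders to be replaced by the kernel and data of equation (3).
K   <- function(s, t) s * t
f_r <- function(s) s                                   # stands for f(s, r) at a fixed r
sys <- assemble_system(seq(0, 1, length.out = 5), K, f_r, a = 0, b = 1)
c_r <- solve(sys$A, sys$F)                             # RBF coefficients of y(s, r)
```

Repeating the solve for the lower and upper branches of f(s, r) over a grid of r values yields the fuzzy approximate solution.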

6. NUMERICAL SOLUTION Consider the following function fuzzy integral equation ( , ) = ( , ) + ( )

( , ) and ( , ) = (−0.11070137908008493 + 0.11070137908008493 ) + (−1 + ) ( , ) = (2 − 2 ) + (0.22140275816016985  − 0.22140275816016985 ) where the analytical solution is ( , ) = ( − 1,2 − 2 ) . Gaussian function is used in RBF interpolation in which

N = 5. The analytical solution, the solution of the proposed method, and the accuracy of the method are shown in Figures 1-3. The numerical error for r = 0.1 is shown in Figure 4.



Figure 1. (A) is the left and (B) is the right exact solution for ∈ [ , . ] and ∈ [ , ]

Figure 2. (A) is the left and (B) is the right RBF solution for ∈ [ , . ] and ∈ [ , ]


Figure 3. (A) is the left and (B) is the right error function for ∈ [ , . ] and ∈ [ , ]


Figure 4. (A) is the left and (B) is the right error function for = . and ∈ [ , ]

7. Conclusion

In this paper we approximated the unknown function of a fuzzy integral equation by using RBF interpolation. This method is simple and efficient for solving this type of integral equation. The error obtained is of order 10⁻², which is acceptable for finding the approximating function. REFERENCES 1. D. Dubois, H. Prade, Towards fuzzy differential calculus, Fuzzy Sets and Systems 8 (1982) 1-7, 105-116, 225-233.

2. S. Abbasbandy, T. Allahviranloo, Oscar Lopez-Pouso, Juan J. Nieto, Numerical Methods for Fuzzy Differential Inclusions, Journal of Computer and Mathematics with Applications 48 (2004) 1633-1641.

3. S. Abbasbandy and T. Allahviranloo, Numerical solution of fuzzy differential equations, Mathematical and computational Applications, 7 (2002), No.1, 41-52.

4. S. Abbasbandy and T. Allahviranloo, Numerical solution of fuzzy differential equations, Numerical Solutions of Fuzzy Differential Equations By Taylor Method, Computational Methods in Applied Mathematics, 2(2002), No.2, 113-124.

5. S. Abbasbandy, E. Babolian, M. Alavi, Numerical method for solving linear Fredholm fuzzy integral equations of the second kind, Chaos, Solitons & Fractals, 31 (1) (2007) 138-146.

6. M. Khezerloo, T. Allahviranloo, S. Salahshour, M. KhorasaniKiasari, S. HajiGhasemi, Application of Gaussian quadratures in solving fuzzy Fredholm integral equations, in: Information Processing and Management of Uncertainty in Knowledge-Based Systems, Applications, in: Communications in Computer and Information Science, 81 (2010) 481-490.

7. M. Barkhordari Ahmadi, M. Khezerloo, Fuzzy bivariate Chebyshev method for solving fuzzy Volterra Fredholm integral equations, Int. J. Ind. Math, 3 (2) (2011) 67-77.

8. K.E. Atkinson, The Numerical Solution of Integral equations of the second kind, Cambridge University Press, Cambridge, 1997.


9. L.M. Delves, J.L. Mohamed, Computational Method for Integral Equations, Cambridge University Press, New York, 1985.

10. L. A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Information Sciences, 8 (1975), 199-249.

11. S. Abbasbandy, E. Babolian and M. Alavi, Numerical method for solving linear fredholm fuzzy integral equations of the second kind, Chaos Solutions & Fractals, 31 (2007), 138-146.


Numerical solution of fuzzy linear system by using Particle Swarm optimization method

R. Salehi1 ,R. Firouzdor2, M. Amirfakhrian3, Sh. Asghari3

1- Department of Electrical and Computer and Information Technology, Islamic Azad University, Ghazvin. 2- Central Tehran Branch, Islamic Azad University, Young Researcher and Elite Club 3- Department of Mathematics, Islamic Azad University, Central Tehran Branch. 2- Department of Mathematics, Islamic Azad University, Hamedan. [email protected]

Abstract In this paper a numerical algorithm for solving a fuzzy linear system of equations (FLS) is considered. The system is converted into an optimization problem which is solved by the Particle Swarm Optimization (PSO) algorithm. The efficiency of the algorithm is illustrated by some numerical examples. Keywords: Fuzzy linear systems, Particle Swarm optimization, Hausdorff distance, Optimization.

1. INTRODUCTION Fuzzy systems are employed in various areas such as mathematics, physics, statistics, engineering, etc., since system parameters and measurements have uncertainty in mathematical models. First, Zadeh introduced fuzzy numbers and fuzzy arithmetic [18]. Then they were investigated by Dubois and Prade [8]. A general model for solving a fuzzy linear system with a crisp coefficient matrix and an arbitrary fuzzy number vector on the right-hand side was introduced by Friedman et al. [9]. They replaced the fuzzy n×n linear system AX = B by a crisp 2n×2n matrix. Different approaches have been proposed to solve this system. For example, the LU decomposition method is applied to general fuzzy linear systems or symmetric fuzzy linear systems [1, 2]. Muzzioli et al. [14] considered fully fuzzy linear systems of the form A₁x + b₁ = A₂x + b₂ with A₁, A₂ square matrices of fuzzy coefficients and b₁, b₂ fuzzy number vectors, and Dehghan et al. [8] considered fully fuzzy linear systems of the form Ax = b where A and b are a fuzzy matrix and a fuzzy vector, respectively, and then discussed the iterative solution of fully fuzzy linear systems. In this study, we focus on solving fuzzy linear systems with fuzzy variations by Particle Swarm Optimization.

1 PSO


The mentioned method is based on particle swarm theory. Particle Swarm Optimization has recently attracted many researchers, since it is applicable to various problems such as classification [3].

The rest of this paper is organized as follows. Some background on fuzzy numbers and fuzzy differential equations which will be applied is given in the next section. The PSO method is described in Section 3. Some numerical examples to illustrate this method are presented in Section 4. Finally, Section 5 presents concluding remarks.

2. Preliminaries and notations In this section, some definitions and features of fuzzy numbers and fuzzy differential equations which will be used throughout the paper are reviewed. Definition 2.1 ([15]). A fuzzy number ũ is completely determined by an ordered pair of functions

ũ = [u̲(r), ū(r)], 0 ≤ r ≤ 1, satisfying the following requirements:

1. u̲(r) is a bounded, monotonic, increasing (non-decreasing) left-continuous function for all r ∈ (0,1] and right-continuous for r = 0. 2. ū(r) is a bounded, monotonic, decreasing (non-increasing) left-continuous function for all r ∈ (0,1] and right-continuous for r = 0. 3. For all r ∈ (0,1] we have u̲(r) ≤ ū(r).

For every ũ = [u̲(r), ū(r)], ṽ = [v̲(r), v̄(r)] and k > 0, addition and multiplication have the following properties:

ũ ⊕ ṽ = [u̲(r) + v̲(r), ū(r) + v̄(r)], (1)

ũ ⊖ ṽ = [u̲(r) − v̄(r), ū(r) − v̲(r)], (2)

k ũ = [k u̲(r), k ū(r)] for k ≥ 0, and k ũ = [k ū(r), k u̲(r)] for k < 0. (3)

The collection of all fuzzy numbers with addition and multiplication as defined by Eqs. (1)-(3) is denoted by E¹. For 0 < r ≤ 1 we define the r-cuts of a fuzzy number ũ as [ũ]_r = {x ∈ R | μ_ũ(x) ≥ r}, and the support of ũ is defined as [ũ]_0 = {x ∈ R | μ_ũ(x) > 0}.


Definition 2.2 Let ũ = (m, n, α, β)_LR, (α > 0, β > 0, m ≤ n), where m, n are the two centers (defuzzifiers) and α, β are the left and right spreads, respectively. L((m − x)/α) and R((x − n)/β) are non-increasing functions with L(0) = 1 and R(0) = 1, respectively. ũ is an L-R fuzzy number if its membership function has the following form:

μ_ũ(x) = L((m − x)/α) for −∞ < x < m; 1 for m ≤ x ≤ n; R((x − n)/β) for n < x < ∞; 0 otherwise.

This definition is very general and covers quite different types of information. For example, the fuzzy number ũ is a trapezoidal fuzzy number when m < n and L((m − x)/α), R((x − n)/β) are linear functions; when m = n and L((m − x)/α), R((x − n)/β) are linear functions, ũ denotes a triangular fuzzy number and we write ũ = (m, α, β).

Definition 2.3 The Hausdorff distance D: E¹ × E¹ → R⁺ ∪ {0} between fuzzy numbers is given by

D(ũ, ṽ) = sup_{r∈[0,1]} ||[ũ ⊖ ṽ]_r^H||*,

where, for an interval [a, b], the norm is ||[a, b]||* = max{|a|, |b|}, and [ũ ⊖ ṽ]_r^H = [u̲(r) − v̲(r), ū(r) − v̄(r)].

It is easy to see that D is a metric on E¹ and has the following properties ([4]). Lemma 2.1 For u, v, w, e ∈ E¹ and k ∈ R, we have the following results:

(1) D(u ⊕ w, v ⊕ w) = D(u, v), (2) D(k u, k v) = |k| D(u, v), (3) D(u ⊕ v, w ⊕ e) ≤ D(u, w) + D(v, e).

Definition 2.4 ([7]) Let x, y ∈ E¹. If there exists z ∈ E¹ such that x = y + z, then z is called the H-difference of x and y, and it is denoted by x ⊖ y. Definition 2.5 ([13]) The n×n linear system of equations


a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = y₁
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = y₂
⋮
aₙ₁x₁ + aₙ₂x₂ + ⋯ + aₙₙxₙ = yₙ (4)

where the coefficient matrix A = (a_ij), 1 ≤ i, j ≤ n, is a crisp n×n matrix and y_i ∈ E¹, is called a fuzzy system of linear equations (FSLE). 3. Particle Swarm Optimization PSO optimizes a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formulae. The movements of the particles are guided by the best found positions in the search-space, which are updated as better positions are found by the particles.

The PSO algorithm works with a population (called a swarm) of candidate solutions (called particles). These particles are moved around in the search-space according to a few simple formulae. The movements of the particles are guided by their own best known position in the search-space as well as the entire swarm's best known position. When improved positions are discovered, they then guide the movements of the swarm. The process is repeated until a satisfactory solution is discovered.

PSO Variants: Various variants of a basic PSO algorithm are possible. New and some more sophisticated PSO variants are continually being introduced in an attempt to improve optimization performance. There is a trend in that research; one can make a hybrid optimization method using PSO combined with other optimization techniques [10, 16].

1. Discrete PSO 2. Constriction Coefficient 3. Bare-bones PSO 4. Fully informed PSO.

Applications: The first practical application of PSO was in the field of neural network training and was reported together with the algorithm itself (Kennedy and Eberhart 1995). Many more areas of application have been explored ever since, including telecommunications, control, data mining, design, combinatorial optimization, power systems, signal processing, and many others. PSO algorithms have been developed to solve:

1. Constrained optimization problems 2. Min-max problems 3. Multi objective optimization problems 4. Dynamic tracking.

Implementation Algorithm: The PSO algorithm is simple in concept, easy to implement and computationally efficient. The original PSO was implemented in a synchronous manner, but an improved convergence rate is achieved by the asynchronous PSO algorithm [6, 17].

In the PSO algorithm each individual is called a "particle", and is subject to a movement in a multidimensional space that represents the belief space. Particles have memory, thus retaining part of their previous state. There is no restriction for particles to share the same point in belief space, but in any case their individuality is preserved. Each particle’s movement is the composition of an initial random velocity and two randomly weighted influences: individuality, the tendency to return to the particle’s best previous position, and sociality, the tendency to move towards the neighborhood’s best previous position.

3.1. The Continuous PSO


There are two versions of the basic PSO algorithm. The "continuous" version uses a real-valued multidimensional space as belief space, and evolves the position of each particle in that space using the following equations:

v_id(t+1) = w · v_id(t) + c₁ · ψ₁ · (p_id − x_id(t)) + c₂ · ψ₂ · (p_gd − x_id(t)) (5)

and

x_id(t+1) = x_id(t) + v_id(t+1) (6)

where
• v_id(t): component in dimension d of the velocity of particle i in iteration t.
• x_id(t): component in dimension d of the position of particle i in iteration t.
• c₁, c₂: constant weight factors.
• p_i: best position achieved so far by particle i.
• p_g: best position found by the neighbors of particle i.
• ψ₁, ψ₂: random factors in the [0,1] interval.
• w: inertia weight.
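A compact R sketch of the update equations (5)-(6); the test objective (the sphere function), the swarm size, and all tuning parameters are illustrative choices of ours, not values used in the paper.

```r
# Minimal sketch of continuous PSO with a global-best neighborhood.
pso_minimize <- function(obj, dim, n_particles = 30, iters = 200,
                         w = 0.7, c1 = 1.5, c2 = 1.5, lower = -5, upper = 5) {
  x <- matrix(runif(n_particles * dim, lower, upper), n_particles, dim)   # positions
  v <- matrix(0, n_particles, dim)                                        # velocities
  p_best <- x; p_val <- apply(x, 1, obj)                                  # personal bests
  g_best <- p_best[which.min(p_val), ]                                    # global best

  for (t in 1:iters) {
    psi1 <- matrix(runif(n_particles * dim), n_particles, dim)
    psi2 <- matrix(runif(n_particles * dim), n_particles, dim)
    v <- w * v + c1 * psi1 * (p_best - x) +
                 c2 * psi2 * (matrix(g_best, n_particles, dim, byrow = TRUE) - x)  # (5)
    x <- x + v                                                            # equation (6)
    val <- apply(x, 1, obj)
    improved <- val < p_val
    p_best[improved, ] <- x[improved, ]; p_val[improved] <- val[improved]
    g_best <- p_best[which.min(p_val), ]
  }
  list(par = g_best, value = min(p_val))
}

pso_minimize(function(z) sum(z^2), dim = 2)$value   # should be close to 0
```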

The particle used to calculate gp depends on the type of neighborhood selected. In the basic algorithm either a global (gbest) or local (lbest) neighborhood is used. In the global neighborhood, all the particles are considered when calculating gp . In the case of the local neighborhood, neighborhood is only composed by a certain number of particles among the whole population. The local neighborhood of a given particle does not change during the iteration of the algorithm.

A constraint (v_max) is imposed on v_id(t) to ensure convergence. Its value is usually kept within the interval [−x_id^max, x_id^max], where x_id^max is the maximum value for the particle position [11]. A large inertia weight (w) favors global search, while a small inertia weight favors local search. If inertia is used, it is sometimes decreased linearly during the iterations of the algorithm, starting at an initial value close to 1 [11, 17]. An alternative formulation of Eq. (5) adds a constriction coefficient that replaces the velocity constraint (v_max) [10]. The PSO algorithm requires tuning of

some parameters: the individual and sociality weights (c₁, c₂), and the inertia factor (w). Both theoretical and empirical studies are available to help in selecting proper values [5, 10, 6, 16, 11, 17]. 3.2. PSO algorithm for solving fuzzy linear systems In this section we use PSO algorithms for solving a fuzzy linear system with crisp coefficients. Consider Equation (4). We change this equation to an optimization problem by using Definition (2.3) as follows:

min D(A X̃, Ỹ) (7)

where A = (a_ij), 1 ≤ i, j ≤ n, X̃ = (x̃₁, x̃₂, …, x̃ₙ) and Ỹ = (ỹ₁, ỹ₂, …, ỹₙ).

Using the PSO method introduced in Section 3.1 for the above problem, we have

Ṽ_id(t+1) = w · Ṽ_id(t) + c₁ · ψ₁ · D[p̃_id, X̃_id(t)] + c₂ · ψ₂ · D[P̃_gd, X̃_id(t)] (8)

and


X̃_id(t+1) = X̃_id(t) + Ṽ_id(t+1) (9)

Thus, using (2.1) and (2.2) for (8), (7) and (9), we can solve this fuzzy linear system. In the next section we give a numerical example using this method. 4. Numerical examples In this section we present a numerical example. Example 4.1 Consider the following 2×2 fuzzy linear system with crisp coefficients:

x̃₁ − x̃₂ = (r, 2 − r),
x̃₁ + 3x̃₂ = (4 + r, 7 − 2r). (10)

min D( (x̃₁ − x̃₂, x̃₁ + 3x̃₂), ((r, 2 − r), (4 + r, 7 − 2r)) ) (11)

We used the PSO algorithm for (11). The numerical solution and the exact solution are shown in Figure 1. Also the approximations of (x̃₁, x̃₂) and their errors obtained by this method are given in Table 1. In this example the number of iterations is 5000 and the CPU time is 8.734375 × 10¹ s. The exact solution is as follows:

x̃₁ = (x̲₁(r), x̄₁(r)) = (1.375 + 0.625r, 2.875 − 0.875r),
x̃₂ = (x̲₂(r), x̄₂(r)) = (0.875 + 0.125r, 1.375 − 0.375r).
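A quick R check (illustrative) that the stated exact solution satisfies system (10) at a few r-levels, using the parametric arithmetic of Eqs. (1)-(3):

```r
r <- c(0, 0.5, 1)
x1_low <- 1.375 + 0.625 * r; x1_up <- 2.875 - 0.875 * r
x2_low <- 0.875 + 0.125 * r; x2_up <- 1.375 - 0.375 * r

# First equation: x1 - x2 = (r, 2 - r); the lower bound uses x1_low - x2_up.
all.equal(c(x1_low - x2_up, x1_up - x2_low), c(r, 2 - r))
# Second equation: x1 + 3*x2 = (4 + r, 7 - 2r).
all.equal(c(x1_low + 3 * x2_low, x1_up + 3 * x2_up), c(4 + r, 7 - 2 * r))
```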

Table 1 - Numerical solution (x̃₁, x̃₂) and error of the solution for r-cuts (r ∈ [0,1]).

r    | x̃₁: Left     x̃₁: Right   | x̃₂: Left     x̃₂: Right   | Error
0.0  | 1.374995     2.875005    | 0.8750026    1.374998    | 5.051863 × 10⁻⁶
0.1  | 1.438050     2.787624    | 0.8873753    1.337540    | 6.422254 × 10⁻⁴
0.2  | 1.497690     2.701660    | 0.9008841    1.299360    | 1.890763 × 10⁻³
0.3  | 1.560148     2.614695    | 0.9123947    1.262228    | 4.316193 × 10⁻³
0.4  | 1.623309     2.528050    | 0.9256899    1.223260    | 3.228672 × 10⁻³
0.5  | 1.689865     2.431490    | 0.9368498    1.188494    | 6.320487 × 10⁻³
0.6  | 1.753675     2.349796    | 0.9475587    1.151994    | 7.383220 × 10⁻³


0.7  | 1.817734     2.255146    | 0.9630345    1.115765    | 1.090006 × 10⁻²
0.8  | 1.874780     2.176776    | 0.9726433    1.073165    | 9.313747 × 10⁻³
0.9  | 1.936833     2.086880    | 0.9886899    1.038379    | 2.243998 × 10⁻²
1.0  | 1.999522     1.999708    | 1.002070     1.000298    | 6.276715 × 10⁻³

Figure 1. Exact solution and numerical solution.


5. Conclusion In this paper we used PSO for solving a system of fuzzy linear equations with crisp coefficients. The obtained results show a small error. One advantage of this algorithm is that it can be used without knowing the exact solution, obtaining an approximation with small error. REFERENCES

1. S. Abbasbandy, R. Ezzati and A. Jafarian, LU decomposition method for solving fuzzy system of linear

equations, Appl. Math. Comput. 172 (2006)633-643.

2. B. Asady, S. Abbasbandy and M. Alavi, Fuzzy general linear systems, Appl. Math. Comput. 169 (2005) 34-40.

3. A. Chatterjee, P. Siarry, A PSO aided neuro-fuzzy classifier employing linguistic hedge concepts. Expert

Systems with Applications, 33(4)(2007) 1097-1109.

4. De. Boor,Multivariate piecewise polynomials Acta Numerica, 2 (1993) 65-109.

doi:10.1017/s0962492900002348.

5. L. J. Cimini Jr. Analysis and simulation of a digital mobile channel using orthogonal frequency division

multiplexing. IEEE Trans. Commun. 1985; COM-33: 665-75.

6. K. Choi, K. Kang, and Kim S. Peak power reduction scheme based on subcarrier scrambling for MC-CDMA

system. IEE Proceedings Communications. 2004; 151: 39-43.

7. L. Derong, Ch. Tsu-Shuan, Z. Yi, A constructive algorithm for feedforward neural networks with incremental

training, IEEE Trans. Circ. Syst., 49 (2002) 12.

8. D. Dubois and H. Prade, Operations on fuzzy numbers, J. Systems Sci. 9(1978) 613-626.

9. M. Friedman, Ma Ming and A. Kandel, Fuzzy linear systems, Fuzzy Sets and Systems 96 (1998) 201-209.

10. A.E. Jones , T. A. Wilknson and S. K. Barton, Block coding scheme for reduction of peak to mean envelope

power ration of multicarrier transmission scheme. Electron Lett., 1994; 30: 2098-99.

11. T. Jiang, W. B. Yao, P. Guo, Y. H. Song and D. Qu, Two novel nonlinear companding schemes with interactive

receiver to reduce PAPR in multi-carrier modulation systems. IEEE Trans. Broadcast., 2006; 52: 268-73.

12. X. Li and L. J. Cimini, Jr. Effects of clipping and filtering of the performance of OFDM. IEEE Commun. Lett.,

1998; 2:131-33.

13. Tofigh Allahviranloo, Numerical methods for fuzzy system of linear equations, Applied Mathematics and

Computation 155 (2004) 493–502.

14. S. Muzzioli and H. Reynaerts, Fuzzy linear systems of the form A₁x + b₁ = A₂x + b₂, Fuzzy Sets and Systems,

In press.

15. L. Stefanini, L. Sorini, M. L. Guerra, Parametric representation of fuzzy numbers and application to fuzzy

calculus, Fuzzy Sets and Systems, 157 2423-2455, 2006.


16. J. Tao and Z. Guangxi, Nonlinear companding transform for reducing peak-to-average power ratio of OFDM

signals. IEEE Trans. Broadcast., 2004; 50: 342-46.

17. C. L. Wang and Y. Ouyang, Low-complexity selected mapping schemes for peak-to-average power ratio

reduction in OFDM systems. IEEE Trans. Signal Proc. 2005; 53: 4652-60.

18. L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Inform. Sci. 8

(1975) 199-249.


Fitting an ARMA model to the GARCH simulations artificial time series, Forecasting GARCH model, An R software

implementation

Fatemeh Hassantabar Darzi1, Narges Khoshnazar 2, Mehrdad Eslami3

1- Academic member, Department of Statistics, University of Sistan and Baluchestan.

2- BSc, Department of Statistics, University of Sistan and Baluchestan.

3- Academic member, Department of Computer Science, University of Velayat Iranshahr.

[email protected]

Abstract

We report on concepts and methods to implement the family of ARMA models with GARCH simulation and model fitting. The software implementation is written in the R programming language. The implementation is tested with an ARMA(1,1) model applied to a simulated data set generated by a GARCH(1,1) model. Three different types of GARCH models are used to fit the ARMA(1,1) residuals. The models are compared by their AIC and BIC values. Moreover, forecasts 10 steps ahead are made from the GARCH(1,4) model. Implementations are available for the R software environment, which is an open-source project for computational finance and financial engineering. Keywords: ARIMA, GARCH, Simulation, Fitting, Forecasting

1. INTRODUCTION GARCH (Generalized Autoregressive Conditional Heteroskedastic) models have become important in the analysis of time series data, particularly in financial applications when the goal is to analyze and forecast volatility. For this purpose, we describe functions for simulating, estimating and forecasting various univariate GARCH-type time series models in the conditional variance and an ARMA specification in the conditional mean [1]. The parameter estimates are checked by several diagnostic analysis tools including graphical features and hypothesis tests. Functions to compute n-step ahead forecasts of both the conditional mean and variance are also available. The number of GARCH models is immense. The GARCH model was introduced by Bollerslev (1986) [2]. R software is free software and comes with ABSOLUTELY NO WARRANTY. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Among other things it has: • An effective data handling and storage facility. • A suite of operators for calculations on arrays, in particular matrices. • A large, coherent, integrated collection of intermediate tools for data analysis. • Graphical facilities for data analysis and display either directly at the computer or on hard copy. • A well-developed, simple and effective programming language (called 'S') which includes conditionals, loops, user-defined recursive functions and input and output facilities (indeed most of the system-supplied functions are themselves written in the S language) [3]. 2. MEAN AND VARIANCE EQUATION


We describe the mean equation of a univariate time series x_t by the process x_t = E(x_t | Ω_{t−1}) + ε_t (1), where E(·|·) denotes the conditional expectation operator, Ω_{t−1} the information set at time t−1, and ε_t the innovations or residuals of the time series; ε_t describes uncorrelated disturbances with zero mean and plays the role of the unpredictable part of the time series. In the following we model the mean equation as an ARMA process, and the innovations are generated from a GARCH process. ARMA Mean Equation: The ARMA(m,n) process of autoregressive order m and moving average order n can be described as

x_t = μ + Σ_{i=1}^{m} a_i x_{t−i} + Σ_{j=1}^{n} b_j ε_{t−j} + ε_t = μ + a(B)x_t + b(B)ε_t (2)

with mean µ, autoregressive coefficients a_i and moving average coefficients b_j. Note that the model can be expressed in a quite comprehensive form using the backshift operator B defined by B x_t = x_{t−1}. The functions a(B) and b(B) are polynomials of degree m and n respectively in the backward shift operator B. If n = 0 we have a pure autoregressive process, and if on the other hand m = 0 we have a pure moving average process. The ARMA time series is stationary when the series a(B), which is the generating function of the coefficients a_i, converges for |B| < 1, that is, on or within the unit circle. GARCH Variance Equation: The mean equation cannot take into account heteroskedastic effects of the time series process, typically observed in the form of fat tails, clustering of volatilities, and the leverage effect.

In this context Engle (1982) [4] introduced the Autoregressive Conditional Heteroskedastic model, named ARCH, later generalized by Bollerslev (1986) [2] to GARCH. The $\varepsilon_t$ terms in the ARMA mean equation (2) are the innovations of the time series process. Engle (1982) [4] defined them as an autoregressive conditional heteroskedastic process where all $\varepsilon_t$ are of the form

$\varepsilon_t = z_t \sigma_t$   (3)

where $z_t$ is an iid process with zero mean and unit variance. Although $\varepsilon_t$ is serially uncorrelated by definition, its conditional variance equals $\sigma_t^2$ and, therefore, may change over time. All the GARCH models we consider in the following differ only in their functional form for the conditional variance. The variance equation of the GARCH(p,q) model can be expressed as $\varepsilon_t = z_t \sigma_t$, $z_t \sim D_\nu(0,1)$,

$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2 = \omega + \alpha(B)\varepsilon_{t-1}^2 + \beta(B)\sigma_{t-1}^2$   (4)

where $D_\nu(0,1)$ is the probability density function of the innovations or residuals with zero mean and unit variance. Optionally, $\nu$ are additional distributional parameters describing the skew and the shape of the distribution. For ARMA models a GARCH specification often leads to a more parsimonious representation of the temporal dependencies and thus provides similar added flexibility over the linear ARCH model when parameterizing the conditional variance. Bollerslev (1986) [2] has shown that the GARCH(p,q) process is wide-sense stationary with $E(\varepsilon_t)=0$, $\mathrm{var}(\varepsilon_t)=\omega/(1-\alpha(1)-\beta(1))$ and $\mathrm{cov}(\varepsilon_t,\varepsilon_s)=0$ for $t \neq s$ if and only if $\alpha(1)+\beta(1)<1$.

3. THE SPECIFICATION STRUCTURE
The function garchSpec( ) from the fGarch R package [5] creates an S4 object called the specification structure, which specifies a time series process from the ARMA-GARCH family and which can easily be extended to include further types of GARCH models. The specification structure maintains the information that defines a model used for time series simulation, parameter estimation, diagnostic analysis, and forecasting. The specification structure is a very helpful object, which can be attached to other objects, like the result of a time series simulation or a parameter fit, so that all background information is always available.
> library(fGarch)   # load the fGarch package
> args(garchSpec)
function (model = list(omega = 1.0e-6, alpha = 0.1, beta = 0.8), presample = NULL, cond.dist = c("norm", "ged", "std", "snorm", "sged", "sstd"), rseed = NULL)


The function garchSpec( ) takes the inputs and derives the formula object from the model arguments. By default the model is specified as a GARCH(1,1) process with ω=10⁻⁶, α=0.1, β=0.8, and with normal innovations. A formula object is automatically created from the model list and is available through the @formula slot, which is a list with two formula elements named formula.mean and formula.var, most likely returned as arma(m, n) and garch(p, q), where m, n, p, and q are integers denoting the model order.
> spec <- garchSpec(model = list())   # default GARCH(1,1) parameter settings
> spec
Formula: ~ garch(1, 1)
Model: omega: 1e-06  alpha: 0.1  beta: 0.8
Distribution: norm
Presample:
  time          z     h  y
1    0 -0.4480984 1e-05  0

4. SIMULATION OF ARTIFICIAL TIME SERIES
The function garchSim( ) from the fGarch R package [5] creates an artificial ARMA time series process with GARCH errors. The function requires the model parameters, which can also be an object of class garchSpec, the length n of the time series, a presample to start the iteration, and the name of cond.dist, the conditional distribution used to generate the innovations. A presample is created automatically by default if not specified. The function garchSim( ) returns the sample path of the simulated return series; the model specification is added as an attribute. In Figure 1 the time series plot of the simulated objects has been drawn with the ts.plot function, which is available in the stats R package [6].
> spec <- garchSpec(model = list())   # specify a GARCH(1,1) model
> data <- garchSim(spec, n = 1000)    # simulate a series of 1000 objects
> data
GMT
                   garch
2013-01-19  3.813062e-03
2013-01-20  5.099345e-04
2013-01-21 -7.147296e-04
2013-01-22 -4.495025e-05
........
> ts.plot(data, main = "1000 simulated objects")

Figure 1. Time Series plot of 1000 simulated objects
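To obtain the ARMA(1,1)-GARCH(1,1) combination discussed in the abstract directly from the specification, the mean coefficients can be passed in the same model list. A minimal sketch, assuming garchSpec( ) accepts ar and ma entries as documented for fGarch; the coefficient values are purely illustrative:
> specArma <- garchSpec(model = list(ar = 0.5, ma = -0.3, omega = 1e-6, alpha = 0.1, beta = 0.8))   # ARMA(1,1) mean with GARCH(1,1) errors (illustrative values)
> dataArma <- garchSim(specArma, n = 1000)   # simulated series carrying both dependence structures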

5. FITTING AN ARMA(1,1) MODEL TO THE SIMULATED DATA



The next natural step is to model ARMA(m,n) time series processes for the simulated data. Both the ARMA and the GARCH parts may have general orders m, n, p, and q. For fitting an ARMA model with specific orders, the arima( ) function in R can be used; the arima function (in the standard package stats) is documented in ARIMA Modelling of Time Series [6], and extension packages contain related and extended functionality. The process of fitting an ARMA(1,1) model to the data variable containing the 1000 simulated objects, together with the model summary, is:
> library(stats)                     # load the stats package
> farma <- arima(data, c(1, 0, 1))   # fit an ARMA(1,1) model to the simulated data
> farma
Call: arima(x = data, order = c(1, 0, 1))
Coefficients:
          ar1      ma1  intercept
      -0.0016  -0.0038      0e+00
s.e.   2.0452   2.0361      1e-04
sigma^2 estimated as 9.803e-06: log likelihood = 4347.46, aic = -8686.92
To check the adequacy of the fitted model and the normality of its residuals, i.e. whether the residuals and squared residuals have significant autocorrelations or not, we write a function producing six plots (histogram, density, normal Q-Q, time series, ACF and PACF plots) of the model residuals, which are available in the RGraphics package [7]. The plots in Figure 2 show that the ARMA(1,1) model residuals are independent and normally distributed with homogeneous variance, which indicates that the fitted model is appropriate for the simulated data. We use the par( ) function to partition the graphical device into six parts.
> library(graphics)          # load the graphics package
> res1 <- resid(farma)       # ARMA model residuals
> example1 <- function(res1) {   # residual plots function
+   par(mfrow = c(2, 3))
+   hist(res1)
+   acf(res1)
+   pacf(res1)
+   plot(res1, type = "p")
+   plot(density(res1))
+   qqnorm(res1)
+ }
> example1(res1)
Moreover, to check the independence of the model residuals, the Box-Ljung test from the stats R package [6] can be applied; the Box.test( ) function computes this test.
> Box.test(res1, type = "Ljung")
Box-Ljung test
data: res1
X-squared = 0.00023347, df = 1, p-value = 0.9878
which shows that the model residuals are independently distributed.


Figure 2. Diagnostic plots of the ARMA(1,1) model residuals

6. FITTING A GARCH(1,4) MODEL TO THE ARMA(1,1) MODEL RESIDUALS
To fit a GARCH model to the ARMA(1,1) model residuals, the first step is to apply the McLeod-Li test for ARCH effects. McLeod and Li (1983) [8] proposed a formal test for ARCH effects based on the Ljung-Box test. It looks at the autocorrelation function of the squares of the pre-whitened data and tests whether the first L autocorrelations of the squared residuals are collectively small in magnitude. For a fixed, sufficiently large L, the Ljung-Box Q-statistic of the McLeod-Li test is given by

$Q = N(N+2) \sum_{k=1}^{L} \frac{\hat{r}_k^2(\varepsilon^2)}{N-k}$   (5)

where N is the sample size and $\hat{r}_k^2(\varepsilon^2)$ is the squared sample autocorrelation of the squared residual series at lag k.

Under the null hypothesis of a linear generating mechanism for the data, namely no ARCH effect in the data, the test statistic is asymptotically $\chi^2(L)$ distributed. We use the McLeod.Li.test( ) function from the TSA R package [9] to run this test. Figure 3 shows the results of the McLeod-Li test for the ARMA(1,1) model residuals; it illustrates that the null hypothesis of no ARCH effect is rejected for the residuals.
> library(TSA)   # load the TSA package
> McLeod.Li.test(farma <- arima(data, order = c(1, 0, 1)))

Figure 3. McLeod-Li test for the ARMA (1, 1) model residuals
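A complementary check, not part of the original workflow above, is to apply the Ljung-Box test directly to the squared ARMA residuals with the same stats package; small p-values would again point to ARCH effects. The lag of 12 is an illustrative choice:
> Box.test(res1^2, lag = 12, type = "Ljung")   # Ljung-Box test on the squared residuals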

The second step is to fit a GARCH model to the ARMA(1,1) residuals. The garchFit( ) function from the fGarch R package [5] fits the model parameters by the maximum log-likelihood estimator; it estimates the time series coefficients and optionally the distributional parameters of the specified GARCH model. The summary( ) function creates a summary report containing, among other things, the AIC and BIC values. As shown in Table 1, the best GARCH model for the ARMA(1,1) residuals is GARCH(1,4), which has the smallest AIC and BIC values.

Table 1- AICs and BICs of the fitted GARCH models

MODEL        AIC        BIC
GARCH(1,4)   1.119631   1.139446
GARCH(3,4)   1.120810   1.146286
GARCH(3,1)   1.128006   1.144990
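The comparison in Table 1 can be reproduced by fitting each candidate order in a short loop. A minimal sketch, assuming garchFit( ) accepts a formula built with as.formula and that the fitted object exposes its information criteria through the @fit$ics slot, as in recent fGarch versions; the candidate orders simply mirror the table:
> resid1 <- resid(farma)                        # ARMA(1,1) residuals to be modeled
> orders <- list(c(1, 4), c(3, 4), c(3, 1))     # candidate GARCH(p,q) orders
> for (o in orders) {
+   fml <- as.formula(paste0("~ garch(", o[1], ", ", o[2], ")"))
+   fit <- garchFit(fml, data = resid1, trace = FALSE)
+   print(c(p = o[1], q = o[2], fit@fit$ics[c("AIC", "BIC")]))   # order followed by AIC and BIC
+ }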

In Figure 4 the ACF and PACF plots of the GARCH(1,4) model residuals likewise indicate an adequate fit.

Figure 4. ACF and PACF plots of GARCH (1, 4) model residuals.

> g <- garchFit(resid1 ~ garch(1, 4))   # fit the GARCH(1,4) model
> summary(g)
> par(mfrow = c(2, 1))                  # draw the ACF and PACF diagrams
> acf(residuals(g), main = "acf plot of residuals GARCH(1,4)")
> pacf(residuals(g), main = "pacf plot of residuals GARCH(1,4)")

7. FORECASTING THE HETEROSKEDASTIC TIME SERIES WITH THE GARCH(1,4) MODEL
One of the major aspects in the investigation of heteroskedastic time series is to produce forecasts. Expressions for forecasts of both the conditional mean and the conditional variance can be derived. predict( ) is a generic function that forecasts ten steps ahead (by default) from an estimated model. This function can be used to predict future volatility from a GARCH model and is also available in the FinTS R package [10]. The prediction with confidence intervals for the GARCH(1,4) model is shown in Figure 5.
> library(FinTS)   # load the FinTS package
> predict(garchFit(resid1 ~ garch(1, 4)), plot = TRUE)   # forecast 10 steps ahead
  meanForecast   meanError  standardDeviation  lowerInterval  upperInterval
1 -0.003299163   0.3990333          0.3990333     -0.7853901      0.7787918
2 -0.003299163   0.3723256          0.3723256     -0.7330439      0.7264456
3 -0.003299163   0.3645423          0.3645423     -0.7177890      0.7111907
.....

Figure 5. Prediction with confidence intervals plot of GARCH (1, 4) model



8. CONCLUSIONS
In this paper we have presented and discussed the implementation of R functions for modeling univariate time series processes from the ARMA-GARCH family. The functions listed in this paper are part of R time series packages such as fGarch, FitARMA, RGraphics, TSA and FinTS. R is an open-source project providing a freely available, high-quality computing environment, and its thousands of add-on packages make it a strong platform for statistical computation.

9. REFERENCES
1. Brooks C., Burke S.P., Persand G., (2001); Benchmarks and the Accuracy of GARCH Model Estimation, International Journal of Forecasting 17, 45-56.

2. Bollerslev T., (1986); Generalized Autoregressive Conditional Heteroskedasticity, Journal of Econometrics 31, 307–327.

3. W. N. Venables, D. M. Smith and the R Core Team, (2013); An introduction to R, https://cran.r-project.org/doc/manuals/r-release/R-intro.html.

4. Engle R.F., (1982); Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation, Econometrica 50, 987–1007.

5. Diethelm Wuertz and Yohan Chalabi with contribution from Michal Miklovic, Chris Boudt, Pierre Chausse and others, (2013); Rmetrics - Autoregressive Conditional Heteroskedastic Modelling, Package ‘fGarch, https://cran.r-project.org/web/packages/fGarch/index.html.

6. R Core Team and contributors worldwide, Stats Package, https://stat.ethz.ch/R-manual/R-devel/library/stats/html/00Index.html.

7. Paul Murrell, (2015); Data and Functions from the book R Graphics, Second Edition, Package ‘RGraphics’, https://cran.r-project.org/web/packages/RGraphics/index.html.

8. McLeod, A. I. and Li, W. K.: Diagnostic checking ARMA time series models using squared residual autocorrelations, Journal of Time Series Analysis, 4, 269–273, 1983.

9. Kung-Sik Chan, Brian Ripley, (2012); Time Series Analysis, Package ‘TSA’, https://cran.r-project.org/web/packages/TSA/index.html.

10. Spencer Graves, (2009); Companion to Tsay (2005) Analysis of Financial Time Series, Package ‘FinTS’, https://cran.r-project.org/web/packages/FinTS/index.html.


Application of Linear Fuzzy and Ordinary Linear Regression on the Geographical data with Outlier Observation

Case Study: Saghez Station

Mohammad Hossein Dehghan 1, Hojatollah Daneshmand 2, Narges Khoshnazar 3

1- Academic member, Statistics Department, University of Sistan and Baluchestan, Zahedan/Iran, [email protected]

2- Academic member, Physics Department, University of Sistan and Baluchestan, Zahedan/Iran, [email protected]

3- MSc, Statistics Department, University of Sistan and Baluchestan, Zahedan/Iran, [email protected]


Abstract

In regression models, both the data and the parameters are normally considered crisp. In some cases, however, there is vagueness in the model parameters or in the observations, and in these cases fuzzy regression can be a fair alternative model. In this research we apply the mentioned models to forecast the temperature (response variable) of the Saghez area. To do this we consider the wet temperature (WT), relative humidity (RH) and cloud angle (CA) as descriptive variables. The estimated models and parameters show a high determination coefficient and significant values for forecasting the temperature. We apply these approaches to geographical data (WT, RH, WS, CA) with symmetric triangular fuzzy response observations. Keywords: Linear Fuzzy Regression, Absolute Value and Least Squares Regression, Climatology, Precipitation

1. INTRODUCTION
Since Zadeh [1] introduced fuzzy set theory, it has been widely developed in theory and application [1]. Regression analysis is one of the areas in which fuzzy set theory has been used frequently. Since Tanaka et al. [2] initiated research on fuzzy linear regression (FLR) analysis, this area has been widely developed and a wide variety of methods have been proposed. One approach to FLR is the least-squares method, which was first introduced by Celmins and developed by others [3]. FLR models can be classified into two general categories according to the type of dependent and independent variables: (a) input data are non-fuzzy and output data are fuzzy numbers; (b) both input and output data are fuzzy numbers. Temperature is one of the fundamental elements in climate formation and one of the few climate elements that can be measured in all places and geographic spaces. This climate variable has direct and indirect correlations with solar radiation, humidity, wind and rainfall, and may control the climate processes. One of the most fundamental and momentous factors in environmental, agricultural and industrial management, in glacial and frostbite management, in the evaluation of energy consumption and in assessing the probability of tornadoes and flooding is the forecasting of air temperatures (Smith et al. 2009 [4], Afzali et al. 2012 [5], Tasadogh 2005). A review of the literature indicates multiple models for minimum temperature, which are applied against the risk of frost and glacial damage (Allen 1957 [6]). The Adaptive Neuro-Fuzzy Inference System has been utilized in forecasting as a novel method (Jang 1993 [7]). Temperature forecasting in north-west Iran has been carried out with an adaptive Neuro-Fuzzy Inference System (Darbandi, Ouranghi 2011 [8]).


Autumn drought forecasting has been carried out with different input variables in eastern Iran; climate indicators, rainfall and drought indices were used as input variables in an ANFIS system (Azhdry Moqadam et al. 1391 [14]).

2. LINEAR MODELS
The regression model is one of the most efficient statistical tools and is used in many research areas. Regression analysis is a statistical approach to study and model the relationship between variables, describe the status of the data, estimate parameters, fit the model, and predict and control the response variable. To create these models, we need observations of the independent (descriptive) variables, say $x_1, x_2, \dots, x_n$, and the dependent (response) variable, say $y$. For instance, consider a simple linear regression in which there is only one independent variable, $x$. Suppose the relationship between the dependent and independent variables is expressed by the linear model

$y_i = \beta_0 + \beta_1 x_i, \quad i=1,\dots,n$   (1)

Since the data in hand involve error, they do not exactly follow a mathematical linear model such as the one above. Consequently we find an appropriate model based on the data in hand, called the estimated model, such as

$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i, \quad i=1,\dots,n$   (2)

Obviously there is some amount of error between any observed value $y_i$ and the estimated value $\hat{y}_i$, so the above model changes to the following equation:

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i=1,\dots,n$   (3)

In classic regression both the data and the parameters are considered crisp. But in some cases the observed data associated with one or more variables are imprecise. In such cases, fuzzy regression is suggested as a replacement for classic regression. In general, fuzzy regression can be divided into various types, for instance possibilistic regression, introduced for the first time by Tanaka et al. [9], and least squares regression, introduced by Diamond [10] and Celmins [3]. One of the most important issues in regression modeling arises when there are outliers in the data set. In such cases we usually use a robust approach (least absolute deviations) to reduce the effect of the outlier data; this method was proposed by Chang and Lee [11].

3. LEAST SQUARES REGRESSION
The least squares method is a criterion to estimate the parameters in linear models. Its convenience is the reason for its widespread application in practice, but one of its imperfections is that it is strongly influenced by outlier observations. Estimating the parameters by minimizing $\sum_{i=1}^{n} \varepsilon_i^2$ is known as the least squares regression method.
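For reference, the crisp least squares fit can be computed in R with the built-in lm( ) function. A minimal sketch using a few of the center values that appear later in Table 2; this is purely illustrative and not a reproduction of the estimates reported in the examples below:
saghez <- data.frame(DTM = c(-8, -5, -1, -2, -3), WT = c(-9, -6, -2, -4, -5))   # first few Table 2 centers
fit_lse <- lm(DTM ~ WT, data = saghez)   # ordinary least squares estimates of beta0 and beta1
coef(fit_lse)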

4. LAD REGRESSION
In fact, when the data in hand involve outliers in the fuzzy case, it may be more appropriate to use Least Absolute Deviations (LAD) in fuzzy regression. We first recall a distance between fuzzy numbers, proposed by Kelkinnama and Taheri [12] and based on least absolute values; we then present the regression model and apply it to the geographical data.

4.1 A METRIC ON FUZZY NUMBERS
Definition 1. Let $\tilde{X}=(m_x,\alpha_x,\beta_x)_{LR}$ and $\tilde{Y}=(m_y,\alpha_y,\beta_y)_{LR}$ be two LR-type fuzzy numbers. Then the distance between the two fuzzy numbers is defined by [12]

$D_{LR}(\tilde{X},\tilde{Y}) = \tfrac{1}{2}\Big(|m_x-m_y| + |(m_x-l\alpha_x)-(m_y-l\alpha_y)| + |(m_x+r\beta_x)-(m_y+r\beta_y)|\Big)$   (4)

where $l=\int_0^1 L^{-1}(w)\,dw$ and $r=\int_0^1 R^{-1}(w)\,dw$.

Theorem 1. $(F_{LR}(\mathbb{R}), D_{LR})$ is a complete metric space [13].

4.2 FUZZY REGRESSION BASED ON LEAST ABSOLUTE DEVIATIONS


Consider an observed data set $(x_i, \tilde{y}_i)$, $i=1,\dots,n$, where $x_i=(x_{i0}, x_{i1},\dots,x_{ip})$ with $x_{i0}=1$ and $x_{ij}\ge 0$, $i=1,\dots,n$, $j=1,\dots,p$, and $\tilde{y}_i=(m_{y_i},\alpha_{y_i},\beta_{y_i})_{LR}$, $i=1,\dots,n$, are the fuzzy observations of the dependent variable. Based on such a data set we wish to fit a fuzzy linear model

$\tilde{Y} = \tilde{A}_0 \oplus \tilde{A}_1 x_1 \oplus \dots \oplus \tilde{A}_p x_p$   (5)

where $\tilde{A}_j=(m_j,\alpha_j,\beta_j)_{LR}$, $j=0,\dots,p$, denotes the fuzzy coefficients of the model. Among fuzzy numbers, triangular numbers are used because of their convenience in calculation; therefore $\tilde{y}_i$ and $\tilde{A}_j$ are considered triangular fuzzy numbers, for which $l=r=\tfrac{1}{2}$ in the metric above. Finally, the LR fuzzy numbers $\tilde{Y}_i$ are formed as follows:

$\tilde{Y}_i = \Big(\sum_{j=0}^{p} m_j x_{ij},\ \sum_{j=0}^{p} \alpha_j x_{ij},\ \sum_{j=0}^{p} \beta_j x_{ij}\Big)_T, \quad i=1,\dots,n$   (6)

We want to minimize the sum of distances between $\tilde{Y}_i$ and $\tilde{y}_i$ based on the above metric, which is equivalent, up to the constant factor $\tfrac{1}{2}$, to minimizing the following expression:

$\sum_{i=1}^{n} D_{LR}(\tilde{y}_i,\tilde{Y}_i) \propto \sum_{i=1}^{n}\Big[\big|m_{y_i}-\sum_{j=0}^{p} m_j x_{ij}\big| + \big|(m_{y_i}-\tfrac{1}{2}\alpha_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}-\tfrac{1}{2}\sum_{j=0}^{p}\alpha_j x_{ij}\big)\big| + \big|(m_{y_i}+\tfrac{1}{2}\beta_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}+\tfrac{1}{2}\sum_{j=0}^{p}\beta_j x_{ij}\big)\big|\Big]$   (7)

Now we convert this problem to a standard linear programming problem. By introducing two nonnegative variables $d_i^{m1}$ and $d_i^{m2}$, we can write

$m_{y_i}-\sum_{j=0}^{p} m_j x_{ij} = d_i^{m1}-d_i^{m2}$   (8)

$\big|m_{y_i}-\sum_{j=0}^{p} m_j x_{ij}\big| = d_i^{m1}+d_i^{m2}$   (9)

If $m_{y_i}-\sum_{j=0}^{p} m_j x_{ij}\ge 0$ then $d_i^{m1}=m_{y_i}-\sum_{j=0}^{p} m_j x_{ij}$ and $d_i^{m2}=0$, and if $m_{y_i}-\sum_{j=0}^{p} m_j x_{ij}<0$ then $d_i^{m1}=0$ and $d_i^{m2}=-\big(m_{y_i}-\sum_{j=0}^{p} m_j x_{ij}\big)$. In other words, at least one of these two variables will be zero. Similarly, by defining nonnegative deviation variables $d_i^{L1}, d_i^{L2}$ and $d_i^{R1}, d_i^{R2}$, the two other absolute values are written as follows:

$\big|(m_{y_i}-\tfrac{1}{2}\alpha_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}-\tfrac{1}{2}\sum_{j=0}^{p}\alpha_j x_{ij}\big)\big| = d_i^{L1}+d_i^{L2}$   (10)

$(m_{y_i}-\tfrac{1}{2}\alpha_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}-\tfrac{1}{2}\sum_{j=0}^{p}\alpha_j x_{ij}\big) = d_i^{L1}-d_i^{L2}$   (11)

$\big|(m_{y_i}+\tfrac{1}{2}\beta_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}+\tfrac{1}{2}\sum_{j=0}^{p}\beta_j x_{ij}\big)\big| = d_i^{R1}+d_i^{R2}$   (12)

$(m_{y_i}+\tfrac{1}{2}\beta_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}+\tfrac{1}{2}\sum_{j=0}^{p}\beta_j x_{ij}\big) = d_i^{R1}-d_i^{R2}$   (13)

Finally, the minimization of expression (7) is reformulated as

min $\sum_{i=1}^{n}\big(d_i^{m1}+d_i^{m2}+d_i^{L1}+d_i^{L2}+d_i^{R1}+d_i^{R2}\big)$   (14)

s.t.
$m_{y_i}-\sum_{j=0}^{p} m_j x_{ij} = d_i^{m1}-d_i^{m2}$   (15)
$(m_{y_i}-\tfrac{1}{2}\alpha_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}-\tfrac{1}{2}\sum_{j=0}^{p}\alpha_j x_{ij}\big) = d_i^{L1}-d_i^{L2}$   (16)
$(m_{y_i}+\tfrac{1}{2}\beta_{y_i})-\big(\sum_{j=0}^{p} m_j x_{ij}+\tfrac{1}{2}\sum_{j=0}^{p}\beta_j x_{ij}\big) = d_i^{R1}-d_i^{R2}$   (17)
$d_i^{m1}, d_i^{m2}, d_i^{L1}, d_i^{L2}, d_i^{R1}, d_i^{R2}\ge 0$ for $i=1,\dots,n$, and $\alpha_j, \beta_j \ge 0$ for $j=0,\dots,p$ (the spreads must be nonnegative).

This is a linear programming problem that can be solved with LINGO or other suitable software.
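The linear program (14)-(17) can also be set up directly in R. A minimal sketch, assuming the lpSolve package, triangular fuzzy responses, and free center coefficients split into positive and negative parts; all function and variable names here are illustrative, not taken from the paper:
library(lpSolve)   # LP solver used to minimize the total absolute deviation

# Fit the LAD fuzzy regression (14)-(17); X must contain the intercept column of 1s,
# my/ay/by are the centers and the left/right spreads of the triangular fuzzy responses.
fuzzy_lad <- function(X, my, ay, by) {
  n <- nrow(X); k <- ncol(X)
  nv <- 4 * k + 6 * n                          # m+, m-, alpha, beta, then 6 deviations per case
  obj <- c(rep(0, 4 * k), rep(1, 6 * n))       # only the deviation variables are penalized
  A <- matrix(0, nrow = 3 * n, ncol = nv)
  rhs <- numeric(3 * n)
  mp <- 1:k; mn <- k + 1:k; al <- 2 * k + 1:k; be <- 3 * k + 1:k
  for (i in 1:n) {
    d <- 4 * k + (i - 1) * 6 + 1:6             # d_m1, d_m2, d_L1, d_L2, d_R1, d_R2
    r <- 3 * (i - 1)
    A[r + 1, mp] <- X[i, ]; A[r + 1, mn] <- -X[i, ]                 # (15) centers
    A[r + 1, d[1:2]] <- c(1, -1); rhs[r + 1] <- my[i]
    A[r + 2, mp] <- X[i, ]; A[r + 2, mn] <- -X[i, ]                 # (16) left spreads
    A[r + 2, al] <- -0.5 * X[i, ]
    A[r + 2, d[3:4]] <- c(1, -1); rhs[r + 2] <- my[i] - 0.5 * ay[i]
    A[r + 3, mp] <- X[i, ]; A[r + 3, mn] <- -X[i, ]                 # (17) right spreads
    A[r + 3, be] <- 0.5 * X[i, ]
    A[r + 3, d[5:6]] <- c(1, -1); rhs[r + 3] <- my[i] + 0.5 * by[i]
  }
  sol <- lp("min", obj, A, rep("=", 3 * n), rhs)   # all variables are nonnegative by default
  list(m     = sol$solution[mp] - sol$solution[mn],
       alpha = sol$solution[al],
       beta  = sol$solution[be])
}

# Toy usage with the first few Table 2 rows (DTM centers and spreads against WT):
X <- cbind(1, c(-9, -6, -2, -4, -5))
res <- fuzzy_lad(X, my = c(-8, -5, -1, -2, -3),
                 ay = c(0.7, 0.2, 3.1, 1.9, 0.7), by = c(0.7, 0.2, 3.1, 1.9, 0.7))
This only illustrates how (14)-(17) map onto a standard LP solver; solving the paper's actual problem would use all 60 observations and the full set of explanatory variables.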

5. NUMERICAL EXAMPLES
5.1 TABLE AND ESTIMATED MODELS
Temperature is one of the most important elements that form the climate, and the minimum temperature is particularly important because of glacial conditions and frost, normally in two seasons. To estimate the air temperature (response variable), we consider the wet temperature (WT), relative humidity (RH) and cloud angle (CA) as descriptive variables. 60 observations, drawn randomly from Saghez station over 3 months (March, April and May), are shown in Table 2. In other words, we focus on the effect of outlier observations on the models. Finally we apply the mentioned models to the data in hand, estimate the parameters of the models, and then forecast the response variable.

Table 2- Observations


Number  Ly    DTM=My  Ry    WT=x1  RH=x2  CA=x3
1       0.7   -8      0.7   -9     61     0
2       0.2   -5      0.2   -6     71     0
3       3.1   -1      3.1   -2     85     0
4       1.9   -2      1.9   -4     45     250
5       0.7   -3      0.7   -5     53     0
6       0.8   -9      0.8   -10    68     0
7       1.2   -4      1.2   -6     60     0
8       0.9    1      0.9   -2     54     0
9       3.4    3      3.4    2     78     0
...     ...   ...     ...   ...    ...    ...

Example 1: We modeled a simple ordinary and a fuzzy linear regression between DTM=Y and the WT=x1 variable without the outlier observations. The estimated models are expressed by $\hat{Y} = 3.5\mathrm{e}{-07} + (2\mathrm{e}{-16})X_1$ and $\tilde{Y} = (1.43, 2.71, 0.094)_T \oplus (0.096, 1.29, 0.039)_T \otimes x_1$. The FLR models' ANOVA is shown in Table 1. The scatter plot of the observations with the simple ordinary and fuzzy regression lines is shown in Figure 1.
Example 2: We also modeled a simple ordinary and a fuzzy linear regression between DTM=Y and the WT=x1 variable with the outlier observations. The estimated models are expressed by $\hat{Y} = 3.6\mathrm{e}{-09} + (7.3\mathrm{e}{-12})X_1$ and $\tilde{Y} = (1.26, 3, 0.7)_T \oplus (0.1182, 1.25, 0.05)_T \otimes x_1$. The FLR models' ANOVA is shown in Table 1. The outlier observations are shown in Table 3, and the scatter plot with the simple ordinary and fuzzy regression lines for the observations including outliers is drawn in Figure 2.

Table 1- Models ANOVA

      Simple FLR (P_value)   Simple FLR including outliers (P_value)   Multiple FLR (P_value)
B0    3.5e-07                3.6e-09                                   8.715
B1    2e-16                  7.3e-12                                   1.183
B2    -                      -                                         -0.098
B3    -                      -                                         -0.002
R2    0.89                   0.558                                     0.993

Figure 1. Observations scatter plot with regression lines (Example 1)

Example 3: In the last part we modeled a multiple ordinary and a fuzzy linear regression between DTM=Y and the WT=x1, RH=x2, CA=x3 variables with the outlier observations. The estimated models are expressed by

$\hat{Y} = 8.715 + (1.183)X_1 + (0.098)X_2 - (0.002)X_3$

and $\tilde{Y} = (1.47, 8.76, 0.89)_T \oplus (0.083, 1.18, 0.037)_T \otimes x_1 \oplus (0, 0.099, 0.0119)_T \otimes x_2 \ominus 0.0012 \otimes x_3$.

Table 3- Observations including outliers

Number  Ly    DTM=My  Ry    WT=x1
1       0.7   -8      0.7   -9
2       0.2   -5      0.2   -6
3       4.5   16      4.5   -2
4       3.7   15      3.7   -4
5       1.2   11      1.2   -5
6       0.8   -9      0.8   -10
7       1.2   -4      1.2   -6
8       0.9    1      0.9   -2
9       3.4    3      3.4    2
...     ...   ...     ...   ...

Figure 2. Observations scatter plot with regression lines (Example 2)

6. CONCLUSIONS
As the figures show, it is not surprising that the linear fuzzy regression with the LAD method is only slightly affected by the outlier data, while the ordinary linear regression (LSE) is affected much more by the outlier observations. Hence the linear fuzzy regression is more stable than the ordinary linear regression. The models show that the dry temperature is a positive function of the wet temperature and the relative humidity, and a slightly negative function of the cloud angle.

7. REFERENCES

1. L. A. Zadeh, Fuzzy sets, Information and Control, 8 (1965), 338-353. 2. H. Tanaka, S. Uejima and K. Asai, Linear regression analysis with fuzzy model, IEEE Transactions on

Systems, Man and Cybernetics, 12 (1982), 903-907.


3. A. Celmins, "Least Squares Model Fitting to Fuzzy Vector Data," Fuzzy Sets and Systems, Vol. 22, pp. 260-269, 1987.

4. Smith, BA., Hoogenboom, G., McClendon, R.W (2009) Artificial neural networks for automated year-round temperature prediction. Comput and Electro Agric 68: 52-61.

5. Afzali, M., Afzali, A., Zahedi, G (2012). The Potential of Artificial Neural Network Technique in Daily and Monthly Ambient Air Temperature Prediction. International Journal of Environmental science and development, 3, 33-38.

6. Allen C.C (1957) A simplified equation for minimum temperature prediction. Monthly Weather Review 85,pp 119-120.

7. Jang J.S.R (1993) ANFIS: Adaptive- Network Based fuzzy inference system. IEEE Trans system, Man, Cybernetic 23(3): 665-685.

8. Darbandi, S., Arvanaghi, H., 2009. Air temperature estimation using artificial intelligent methods (Case study: Maragheh City). European Journal of Scientific Research 61 (2), 290–298.

9. H. Tanaka, S. Uejima, K. Asai, "Linear Regression Analysis with Fuzzy Model," IEEE Transactions on Systems, Man, Cybernetics, Vol. 12, pp. 903-907, 1982.

10. P. Diamond, "Least Squares Fitting of Several Fuzzy Variables," Proceedings of the Second IFSA Congress, Tokyo, pp. 20-25, 1987.

11. P.T. Chang, E.S. Lee, “Fuzzy Least Absolute Deviations Regression Based on the Ranking of Fuzzy Numbers,” Proceeding of the Third IEEE World Congress on Computational Intelligence, Orlando, FL, pp. 1365-1369, 1994.

12. S.M. Taheri, M. Kelkinnama, "Fuzzy Linear Regression Based on Least Absolute Deviations," Iranian Journal of Fuzzy Systems, Vol. 9, No. 1, pp. 121-140, 2012.

13. S.M. Taheri, M. Kelkinnama, "Fuzzy Least Absolutes Regression," 4th International IEEE Conference on Intelligent Systems, Varna, Bulgaria, pp. 55-58, 2008.

14. M. Azhdry Moqadam, M. Khosravi, N. Hosseinpour, and A. Jafari Nadushan, (1391). Drought Forecasting using Neuro-Fuzzy Model, Climatic Index, Precipitation and Drought Index (case study: Zahedan). Geography and Development Quarterly, (in Persian).


A Survey on Different Strategies on Preparing Data for Data Mining

Michael Bidollahkhany1, Marzieh Faridi Masouleh2

1- Bachelor Student, Ahrar University, Rasht, Guilan, Iran

[email protected]

2- Ph.D. Candidate, Tehran Science and Research Branch, Islamic Azad University, Tehran, Iran

[email protected]

Abstract

Nowadays each individual and organization (business, family or institution) produces and collects huge volumes of data about itself and its environment. Data mining is the process of selection, exploration, and modeling of large quantities of data to discover regularities or relations that are at first unknown, with the aim of obtaining clear and useful results for the owner of the database. Data mining is a preparatory tool for making strategic decisions and plays an important role in market segmentation, customer services, fraud detection, credit and behavior scoring, and benchmarking. Keywords: Big Data, Data, Data mining, Data preparation

1. INTRODUCTION
Recent tremendous technical advances and technological developments in the economy, politics, scientific research and other areas are creating unprecedented quantities of digital data. Data mining, the science of extracting useful knowledge from such huge data repositories, has emerged as a young and interdisciplinary field in computer science. Data mining techniques have been widely applied to problems in industry, science, engineering and government, and it is widely believed that data mining will have a profound impact on our society. The growing consensus that data mining can bring real value has led to an explosion in demand for novel data mining technologies and for students who are trained in data mining - students who have an understanding of data mining techniques, can apply them to real-life problems, and are trained for research and development of new data mining methods. Courses in data mining have started to spread all over the world [1]. Building on this development of the field, this paper sets up a survey of the various strategies for preparing data for data mining and gives recommendations to researchers and IT professionals for choosing the best method of data preparation. Based on feedback from researchers, educators, and students, we are convinced that it is an important task to provide a careful comparison, with tangible examples and technically rich, balanced knowledge, for this investigation [1]. A comprehensive and balanced study will ensure the best choice among data preparation methods, provide a solid foundation for reliable results, and offer guidance to researchers, developers and technology users on how to choose the best data preparation strategy.

2. THE DEFINITION OF DATA
Data (/ˈdeɪtə/ DAY-tə, /ˈdætə/ DA-tə, or /ˈdɑːtə/ DAH-tə) is a set of values of qualitative or quantitative variables; restated, pieces of data are individual pieces of information. Data is measured, collected, reported, and analyzed, whereupon it can be visualized using graphs or images. Data as a general concept refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing. Raw data, i.e. unprocessed data, is a collection of numbers and characters; data processing commonly occurs in stages, and the "processed data" from one stage may be considered the "raw data" of the next. Field data is raw data collected in an uncontrolled in situ environment. Experimental data is data generated within the context of a scientific investigation by observation and recording. The word "data" used to be considered the plural of "datum", and still is by some English speakers. Nowadays, though, "data" is most commonly used in the singular, as a mass noun (like "information", "sand" or "rain") [22].

2.1. TYPES OF DATA
In computer science and computer programming, a data type (or simply type) is a classification identifying one of various types of data, such as real, integer or Boolean, that determines the possible values for that type, the operations that can be done on values of that type, the meaning of the data, and the way values of that type can be stored. This is not all there is to data types: in modern classifications we also find new categories of data, including videos, audio, metadata, images, and so on. We now have more varieties of data types, and this huge amount of data should be analyzed and converted into usable knowledge [19,22]. For example, some familiar data types are integers, Booleans, characters, floating-point numbers and alphanumeric strings.

2.2. METADATA

Metadata is "data about data". Two types of metadata exist: structural metadata and descriptive metadata. Structural metadata is data about the containers of data. Descriptive metadata uses individual instances of application data or the data content [20].

Metadata was traditionally found in the card catalogs of libraries. As information has become increasingly digital, metadata is also used to describe digital data using metadata standards specific to a particular discipline. Describing the contents and context of data or data files increases their usefulness. For example, a web page may include metadata specifying what language the page is written in, what tools were used to create it, and where to find more information about the subject; this metadata can automatically improve the reader's experience. The main purpose of metadata is to facilitate the discovery of relevant information, more often classified as resource discovery. Metadata also helps organize electronic resources, provides digital identification, and supports the archiving and preservation of resources. Metadata assists in resource discovery by "allowing resources to be found by relevant criteria, identifying resources, bringing similar resources together, distinguishing dissimilar resources, and giving location information." [20,22].

Why is knowing the type of data important? Because an important detail to be highlighted in data mining usually concerns the types of data. Some algorithms only accept numeric values and others accept only nominal values; however, for the algorithms used in this work we do not need to worry about this, because they transform the data to the required type automatically [12].

2.3. DATA STREAMS
A data stream is an ordered sequence of instances that arrive at a rate that does not permit storing them permanently in memory. Data streams are potentially unbounded in size, making them impossible to process by most data mining approaches [9]. Data stream mining is one of the areas gaining a lot of practical significance and is progressing at a brisk pace, with new methods, methodologies and findings in various applications related to medicine, computer science, bioinformatics, stock market prediction, weather forecasting, and text, audio and video processing, to name a few [25].

2.4. DATA SET
A data set (or dataset) is a collection of data. Most commonly a data set corresponds to the contents of a single database table, or a single statistical data matrix, where every column of the table represents a particular variable and each row corresponds to a given member of the data set in question. The data set lists values for each of the variables, such as the height and weight of an object, for each member of the data set. Each value is known as a datum. The data set may comprise data for one or more members, corresponding to the number of rows [22].


2.5. BIG DATA
Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data creation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can mean greater operational efficiency, cost reduction and reduced risk. The most fundamental challenge for big data applications is to explore the large volumes of data and extract useful information or knowledge for future actions [24].

2.6. DIFFERENCES BETWEEN DATA AND INFORMATION
The words data, information and knowledge are frequently heard used as if they were the same thing. Data are the facts of the world. For example, take yourself. You may be 5 ft tall, have brown hair and blue eyes. All of this is "data". You have brown hair whether this is written down somewhere or not. In many ways, data can be thought of as a description of the world. We can perceive these data with our senses, and then the brain can process them. Human beings have used data as long as we have existed to form knowledge of the world. Until we started using information, all we could use was data directly. If you wanted to know how tall I was, you would have to come and look at me. Our knowledge was limited by our direct experiences. Information allows us to expand our knowledge beyond the range of our senses. We can capture data in information, and then move it about so that other people can access it at different times. Here is a simple analogy: if I take a picture of you, the photograph is information, but what you look like is data [17,22].

I can move the photo of you around; send it to other people via e-mail etc. However, I’m not actually moving you around – or what you look like. I’m simply allowing other people who can’t directly see you from where they are to know what you look like. If I lose or destroy the photo, this doesn’t change how you look. So, in the case of the lost tax records, the CDs were information. The information was lost, but the data wasn’t. Mrs. Jones still lives at 14 Whitewater road, and she was still born on 15th August 1971 [17,22].

3. THE DEFINITION OF DATA MINING
Data mining is an iterative process of analyzing large data sets in order to extract information and knowledge that can be used by those involved in decision making and problem solving. The data mining process has a recurring nature because, between its different phases, there is the possibility of feedback and revision [3]. The term data mining therefore describes a process that involves the collection and analysis of data, the development of inductive learning models, and finally the adoption of appropriate measures and practical decisions based on the acquired knowledge. The phrase "computational learning theory" refers to the collection of models and mathematical methods that underlie any type of data mining analysis and are used for the production of new knowledge [3]. Data mining is a process of discovering hidden patterns and information from existing data. The difference between data in databases and in a data warehouse is that in a database the data are in structured form, whereas in a data warehouse the data may or may not be present in a structured format; the structure of the data may be defined to make it compatible for processing [25]. Data mining is based on inductive learning techniques whose main goal is to extract patterns from the available examples: data mining analysis achieves its results based on a sample of observations and generalizes them to the whole population. In this regard, several kinds of patterns, such as linear equations and other models, if-then rules, clusters, charts and trees, are available. In fact, such data have often been stored for purposes other than data mining; for example, information related to the use of a phone number is collected by phone service providers mainly for administrative purposes, so in this case data collection is done independently of data mining. For any analysis over a huge number of database fields, data sampling should be used to reduce processing time. Sampling is a common practice for selecting a subset of data to be analyzed: instead of dealing with an entire data stream, we select instances at periodic intervals. Sampling is used to compute statistics (expected values) of the stream. While sampling methods reduce the amount of data to process and, by consequence, the computational costs, they can also be a source of errors, namely in monitoring applications that require detecting anomalies or extreme values [3].
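In R, such a subset can be drawn with the base sample( ) function. A minimal sketch on a built-in data set; the sample size is illustrative:
set.seed(42)                               # reproducible draw
idx <- sample(nrow(mtcars), size = 10)     # select 10 record indices at random
mtcars_sample <- mtcars[idx, ]             # subset analyzed instead of the full data
colMeans(mtcars_sample)                    # sample-based estimates of the column means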

4. DATA PREPARATION OR DATA PREPROCESSING
The data processing task is also one of the criteria that must be taken care of in the process of data mining. The data input to a data mining algorithm may not be in the proper format and hence not suitable for efficient processing. In such a case, we need to make sure the data are in a proper format so that they are suitable for processing. This case generally arises when we try to mine the data using existing data mining tools or algorithms. Different data mining tools available in the market have different input formats, which forces the user to transform the existing input dataset into a new format. This is itself very time consuming and laborious, and carries a chance of data loss, as the data have to be entered manually into the new format supported by the tool [25]. Data preparation or data preprocessing in this context means manipulation of data into a form suitable for further analysis and processing. It is a process that involves many different tasks and cannot be fully automated. Many of the data preparation activities are routine, tedious, and time consuming. It has been estimated that data preparation accounts for 60%-80% of the time spent on a data mining project. Data preparation is essential for successful data mining, and good data preparation is a key prerequisite to successful data mining [16]. Poor quality data typically result in incorrect and unreliable data mining results; data preparation improves the quality of data and consequently helps improve the quality of data mining results. The well-known saying "garbage in, garbage out" is very relevant to this domain [18]. In other words, the data preparation phase covers all activities needed to construct the final dataset (the data that will be fed into the modeling tool(s)) from the initial raw data. Data preparation tasks are likely to be performed multiple times, and not in any prescribed order. Tasks include table, record, and attribute selection as well as transformation and cleaning of data for modeling tools. Correct data preparation prepares both the miner and the data: preparing the data means the model is built right; preparing the miner means the right model is built. Data preparation and the data survey lead to an understanding of the data that allows the right model to be built, and built right the first time. But it may well be that, in any case, the preparation and survey lead the miner to an understanding of the information enfolded in the data, and perhaps that is all that is wanted [15]. Often today, instead of adequate data preparation and an accurate data survey, time-consuming models are built and rebuilt in an effort to understand data. Modeling and remodeling are not the most cost-efficient or the most effective way to discover what is enfolded in a data set. If a model is needed, the data survey shows exactly which model (or models, if several best fit the need) is appropriate, how to build it, how well it will work, where it can be applied, how reliable it will be and its limits to performance. All this can be done before any model is built, and in a small fraction of the time it takes to explore data by modeling. There were two objectives for using data preprocessing techniques in the common applications: (i) to solve problems in the data, and (ii) to learn more about the nature of the data [15].
For missing values, after identifying the percentage of missing attributes in each record, records containing more than 20% missing attributes were eliminated and those with 20% or less were kept for data analysis. Handling incomplete records required some additional work. However, when large amounts of data are collected, there is always a chance of having some problems with the data, and the use of proper techniques to solve data problems is very important because of the information content of the remaining fields [15].
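The 20% rule described above can be expressed in a couple of lines of R. A minimal sketch, assuming missing entries are coded as NA; the toy records and the threshold simply illustrate the rule, not the original study's data:
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6), c = c(7, 8, 9),
                 d = c(1, 2, NA), e = c(5, NA, 2))     # toy records with missing values
frac_missing <- rowMeans(is.na(df))                    # share of missing attributes per record
df_kept <- df[frac_missing <= 0.20, ]                  # keep records with at most 20% missing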

5. DATA PREPARATION METHODS 5.1. DATA DISCRETIZATION

Data discretization is a part of data reduction of particular importance, especially for numerical data. Discretization is the process of partitioning the values of continuous variables into categories. The goal of discretization is to find a set of cut points that divide the range of the data into some number of intervals. Discretization methods can be fundamentally divided into two groups: supervised and unsupervised. Supervised methods utilize class information during the discretization process whereas unsupervised methods do not; supervised methods are considered more efficient and deliver better results [24,25]. A basic discretization process can be composed of four steps: sorting the continuous range of data to be discretized, evaluating points for splitting (or intervals for merging), applying the splitting or merging process according to specified rules, and stopping the process after reaching some postulated criteria (especially for iterative, incremental processes) [13,26].
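A simple unsupervised (equal-width) discretization can be written with the base cut( ) function in R. A minimal sketch; the values and the number of intervals are illustrative:
x <- c(2.1, 3.7, 5.0, 8.9, 6.4, 1.2, 7.5)                  # toy continuous attribute
breaks <- seq(min(x), max(x), length.out = 5)              # cut points for 4 equal-width intervals
x_disc <- cut(x, breaks = breaks, include.lowest = TRUE)   # categorical version of x
table(x_disc)                                              # counts per interval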

5.2. DATA CLEANING
Filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies all belong to the field of data cleaning. Data cleaning techniques usually rely on some quality rules to identify violating tuples and then fix these violations using some repair algorithm. Oftentimes the rules, which are related to the business logic, can only be defined on some target report generated by transformations over multiple data sources. This creates a situation where the violations detected in the report are decoupled in space and time from the actual source of errors. In addition, applying the repair on the report would need to be repeated whenever the data sources change. Finally, even if repairing the report is possible and affordable, this would be of little help towards identifying and analyzing the actual sources of errors for future prevention of violations at the target [13,27].

5.3. DATA INTEGRATION
Data integration is the integration of multiple databases, data cubes, or files, where multiple data sources are combined. A popular trend in the information industry is to perform data cleaning and data integration as a preprocessing step, with the resulting data stored in a data warehouse [13,28].

5.4. DATA TRANSFORMATION
Data transformation means normalization and aggregation, where data are transformed and consolidated into forms appropriate for mining by performing summary or aggregation operations. Sometimes data transformation and consolidation are performed before the data selection process, particularly in the case of data warehousing. Data reduction may also be performed to obtain a smaller representation of the original data without sacrificing its integrity. Data transformations (e.g., normalization) may be applied, where data are scaled to fall within a smaller range such as 0.0 to 1.0. This can improve the accuracy and efficiency of mining algorithms involving distance measurements. These techniques are not mutually exclusive; they may work together. For example, data cleaning can involve transformations to correct wrong data, such as by transforming all entries for a date field to a common format [28].
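For instance, the min-max scaling to the 0.0-1.0 range mentioned above takes only a couple of lines of R. A minimal sketch; the column and its values are illustrative:
min_max <- function(v) (v - min(v, na.rm = TRUE)) / (max(v, na.rm = TRUE) - min(v, na.rm = TRUE))
df <- data.frame(income = c(1200, 3400, 560, 9800, 2100))   # toy attribute
df$income_scaled <- min_max(df$income)                      # values now fall within 0.0 to 1.0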

5.5. DATA REDUCTION
Data reduction can reduce data size by, for instance, aggregating, eliminating redundant features, or clustering. It obtains a representation reduced in volume that produces the same or similar analytical results [28]. When the task is to develop a model using a high-dimensional opportunistic database with a couple of thousand possible independent predictor variables, the first task is to reduce the variables to a "reasonable" number, and the second is to reduce the number of levels in the categorical variables [7,13].
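One common way to obtain such a reduced representation of numeric data, used here only as an illustration rather than a method taken from this survey, is principal component analysis with base R:
pca <- prcomp(mtcars, center = TRUE, scale. = TRUE)   # project the 11 mtcars variables onto orthogonal components
summary(pca)                                          # proportion of variance captured by each component
reduced <- pca$x[, 1:3]                               # keep the first three components as a smaller representation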

5.6. DATA CLUSTERING
Unlike classification and regression, which analyze class-labeled (training) data sets, clustering analyzes data objects without consulting class labels. In many cases, class-labeled data may simply not exist at the beginning. Clustering can be used to generate class labels for a group of data. The objects are clustered or grouped based on the principle of maximizing the intra-class similarity and minimizing the inter-class similarity [28]. Clustering refers to the division of data into groups of similar objects. Each group, or cluster, consists of objects that are similar to one another and dissimilar to objects in other groups. When representing a quantity of data with a relatively small number of clusters, we achieve some simplification at the price of some loss of detail (as in lossy data compression, for example) [13]. Clustering is a form of data modeling, which puts it in a historical perspective rooted in mathematics and statistics [6]. Clustering values and data types and identifying outliers help to discover what is enfolded in a data set; these data mining functionalities can help data workers find relevant sub-contents and determine which data objects are useful. For example, clustering of user information or data from web server logs can facilitate the development and execution of future marketing strategies, both online and offline, such as automated return mail to visitors falling within a certain cluster, or dynamically changing a particular site for a visitor on a return visit, based on past classification of that visitor [4].
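A small clustering run on numeric data illustrates the idea. A minimal sketch with the built-in kmeans( ) function on the iris measurements; the choice of three clusters is illustrative:
num <- scale(iris[, 1:4])                       # numeric attributes only, standardized
set.seed(1)                                     # reproducible cluster assignment
km <- kmeans(num, centers = 3, nstart = 25)     # group the objects into three clusters
table(km$cluster, iris$Species)                 # compare the clusters with the withheld class labels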


6. ANALYSIS
Analysis shows that better rules (with higher accuracy and coverage) can be discovered by using aggregate predicates in the background knowledge, for example when comparing traditional data mining algorithms with multi-relational data mining algorithms. In fact, analysis helps us reach better efficiency, as it avoids costly multiple join operations when combining aggregate functions and data mining. When the data are scattered over many tables, this causes many problems in data mining; to avoid the problems caused by huge amounts of data, data workers have used some of the aggregate functions [10], and analysis methods save time and effort toward profitable achievements [22].

7. DECISION MAKING
In decision making, the rationality of individuals is limited by the information they have, the cognitive limitations of their minds, and the finite amount of time they have to make a decision. In complex situations, individuals who intend to make rational choices are bound to make satisfactory (rather than maximizing) choices [3]. The development of information and communication technologies has dramatically changed data collection and processing methods. What distinguishes current data sets from earlier ones is automatic data feeds: we do not just have people entering information into a computer; we have computers entering data into each other [14]. Detecting data anomalies, rectifying them early, and reducing the data to be analyzed can lead to huge payoffs for decision making [28].

8. CONCLUSION AND SUGGESTIONS
In this paper the definition of data and the types of data classification relevant to any processing have been presented, and in particular the influence of data discretization and stream mining on the efficiency of the process has been explained. All definitions were illustrated using datasets and preparation methods. It can therefore be concluded that the definition of the data should always be taken into consideration, because improvements in data preparation quality may be achieved for specific tasks. Another important point is that learning and testing datasets, especially attribute sets, should be investigated in depth. Improperly chosen datasets can negatively influence the processing methods, especially in ETL (Extract, Transform and Load); this will be the subject of further research. Data preparation methods applied in ETL after the data transformation step did not prove their usefulness, whereas the best results have been obtained when the data are prepared at the beginning of the data mining process, which shows only a minor negative influence on the whole process. REFERENCES

1. Chakrabarti, S., Ester, M., Fayyad, U., Gehrke, J., Han, J., Morishita, S., ... and Wang, W. (2006). Data mining curriculum: A proposal (Version 1.0). Intensive Working Group of ACM SIGKDD Curriculum Committee.

2. Muthukrishnan, S. (2005). Data Streams: Algorithms and Applications. Now Publishers. 3. Gama, J. (2013). Data Stream Mining: the Bounded Rationality. Informatica (Slovenia), 37(1), 21-

25. 4. Cooley, R., Mobasher, B., and Srivastava, J. (1999). Data preparation for mining World Wide Web

browsing patterns. Knowledge and information systems, 1(1), 5-32. 5. Keramati, A., and Yousefi, N. (2011, January). A proposed classification of data mining techniques

in credit scoring. In Proc. 2011 Int. Conf. on Industrial Engineering and Operations Management Kuala Lumpur, Malaysia.

6. Kogan, J., Nicholas, C., and Teboulle, M. (2006). Grouping multidimensional data (pp. 25-71). Springer.


7. Manahan, C. (2006). Comparison of Data Preparation Methods for Use in Model Development with SAS Enterprise Miner. In Proceedings of the 31th Annual SAS Users Group International Conference (pp. 079-31).

8. Chakrabarti, S., Ester, M., Fayyad, U., Gehrke, J., Han, J., Morishita, S. and Wang, W. (2004). Data mining curriculum: A proposal (Version 0.91).

9. Brzeziński, D. (2010). „Mining Data Streams with Concept Drifts”. 10. Padhy, N., and Panigrahi, R. (2012). Multi Relational Data Mining Approaches: A Data Mining

Technique. arXiv preprint arXiv:1211.3871. 11. Kruczyk, M. (2013). Rule-Based Approaches for Large Biological Datasets Analysis: A Suite of

Tools and Methods. 12. Rezende, H. R., and Esmin, A. A. A. Proposed Application of Data Mining Techniques for

Clustering Software Projects. 13. Mining, D. (2001). Concepts and Techniques. Jiawei Han and Micheline Kamber, 2. 14. Zhang, S., Zhang, C., and Yang, Q. (2003). Data preparation for data mining. Applied Artificial

Intelligence, 17(5-6), 375-381. 15. Famili, F., Shen, W. M., Weber, R., and Simoudis, E. (1997). Data pre-processing and intelligent

data analysis. International Journal on Intelligent Data Analysis, 1(1). 16. Jermyn, P., Dixon, M., and Read, B. J. (1999). Preparing clean views of data for data mining.

ERCIM Work. on Database Res, 1-15. 17. N Ingebrigtsen, (2007-2015) [http://www.infogineering.net/index.php]. 18. Pyle, Dorian. Data preparation for data mining. Vol. 1. Morgan Kaufmann, 1999. 19. Paul Zandbergen, Types of Data: Text, Numbers & Multimedia, PhD from the University of British

Columbia and has taught Geographic Information Systems, statistics and computer programming 20. Guenther, Rebecca, and Jacqueline Radebaugh. "Understanding metadata." National Information

Standard Organization (NISO) Press, Bethesda, USA (2004). 21. Philip Bagley (Nov 1968), Extension of programming language concepts, Philadelphia: University

City Science Center 22. Wikipedia, The Free Encyclopedia; 2015Sep 26, 10:00GMT [cited since 2000 Jan to 2015 Sep].

Available from: http://en.wikipedia.org/w/index.php. 23. Online blog contributors. Related to Data mining [Data prepare- process] articles. 24. Wu, X., Zhu, X., Wu, G. Q., and Ding, W. (2014). Data mining with big data. Knowledge and Data

Engineering, IEEE Transactions on, 26(1), 97-107. 25. PhridviRaj, M. S. B., and GuruRao, C. V. (2014). Data mining–past, present and future–a typical

survey on data streams. Procedia Technology, 12, 255-263. 26. Baron, G. (2014). Influence of Data Discretization on Efficiency of Bayesian Classifier for

Authorship Attribution. Procedia Computer Science, 35, 1112-1121. 27. Chalamalla, A., Ilyas, I. F., Ouzzani, M., and Papotti, P. (2014, June). Descriptive and prescriptive

data cleaning. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data (pp. 445-456). ACM.

28. Han, J., Kamber, M., and Pei, J. (2011). Data mining: concepts and techniques: concepts and techniques. Elsevier.

1468 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Edge Detection of Digital Image Using Fuzzy Rule Based Technique

Maryam koniyehnoor1, Nasim Pillehvar2

1Islamic Azad University, Science and Research 2Islamic Azad University, North Tehran Branch

[email protected], [email protected],

Abstract: The edge detection in digital images based on fuzzy inference system has become popular todays. So, this paper proposes a simple and efficient method based on fuzzy inference rules without determining threshold value for edge detection in digital images .The approach of this algorithm to detect the edges of an input image by scanning it throughout using a 2*2 pixel window over the whole image pixel , It has 4 inputs corresponding to 4 pixels of scanning matrix and has one output identifying the pixel is review that if it is “edge” pixels .The Fuzzy system includes sixteen fuzzy rules to classify the pixels. The results of the implemented algorithm have been compared with the standard edge detection algorithm such as, Canny, Sobel, Prewitt and Roberts. Keywords: Fuzzy edge detection, fuzzy inference system, fuzzy rules

I. INTRODUCTION In recent years the application of edge detection is widely used in different usage. It is also widely used in the area of image segmentation. Edge detection is a segmentation method based on detecting sharp, local changes in intensity. Edge is one of the most essential features of an image and it can be useful in many image processing and machine vision applications. The edge detection in digital images based on fuzzy inference system has become popular recently .Edges which are one of the most important features of images are defined as significant changes in image intensity According to four edge profiles exist: step, ramp, ridge and roof .In some applications there is a need to detect regions with significant changes in intensity ,in earlier edge detection methods, such as Sobel and Perwitt are based on spatial derivative filtering that calculates the intensity gradient magnitude at each image pixel to only detect edges of certain coordination. Then this gradient value is compared to a threshold. A pixel location is classified as edge when its gradient value is more value than the threshold but the problem of this method is happened because of noisy and blurred image and detection of threshold value. To solve the noise problem, Canny proposed a method which in the image is convolved with first order derivatives of a Gaussian smoothing function followed by thresholding and those methods such as Canny which involve Gaussian filtering suffer from edge displacement and false edges. The Laplacian edge detection method is proposed by using a 2D linear filter to approximate second order derivatives of the image pixel values. The classic methods work well when image contrast is high. In this paper we have proposed a fuzzy rule based soft computing approach for edge detection. and it’s mentioned with the extension of a fuzzy logic rules based algorithm without using a thresholding value for the detection of image edges. A 2x2 window of pixels, the smallest possible slide, is used as a scanning mask. The mask slides moved the whole image pixel by pixel and peaks the edge pixels. The fuzzy rule base comprising of only 16 rules are capable of detecting the edges in an image. The rule base indentifies the pixels belonging to “Edge” set. The results obtained by applying this method are compared

1469 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

with that of the Sobel and Canny, Prewitt algorithm, the standard edge detection method. The results are found to be very significant, precise and accurate.

1. Algorithms

All of the required steps in the proposed method are summarized in the following 1 .Read gray image X with dimensions M×N. 2. For each window on image: a. Set the initial 2×2 mask as P1, P2, P3 and P4 and calculate FIS inputs as D1, D2, D3, D4, D5 and D6 .(fig. 2) b. Map input pixels to FIS for fuzzification using input membership functions. c. Apply AND operator as Fuzzy t-norms operator (MIN operator). d. Fire fuzzy rules. e. Generate aggregated output with applying s-norms (MAX operator). f. Defuzzify the output using centroid method and get the crisp P4 output pixel. g. If calculated pixel isn’t last pixel in image X, slide

1 , The window to the next pixel and goto step 2, Otherwise go to step 3. 3. Convert output image with two gray levels to black and white image. 4. Apply thinning process with “thin” morphological operators for one-pixel edge extraction. 5. End

Figure1:Block diagram proposed algorithm

P1 X(i-1 , j-1)

P2 X(i-1, j)

P3 X(I , j-1)

P4 X(I , j)

Figure 2 :2*2 mask

1470 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

2. FUZZY INFERENCE SYSTEM New Fuzzy Inference System designed of Mamdani type and the algorithm detects edges of an input image by using a window mask of 2x2 size that slides over the whole image horizontally pixel by pixel. The FIS is implemented by considering four inputs which correspond to four pixels P1, P2, P3 and P4 of the 2*2 mask in (Fig. 2) and one output variable ,the first step of the FIS implementation is input fuzzification with definition of membership functions(fig.3) , in this design inputs are fuzzified using triangular membership functions because these functions have better and simple performance(1) ( ; , , ) = ( , , 0 ) (1)

black=trimf (x,[0 ,0, 255] white=trimf (x,[0,255,255]

Figure 3: Input membership function

Two fuzzy sets are used for the input Black & White and three fuzzy sets are used for the output. sixteen numbers of rules have been defined to apply implication on the inputs. The inference rules depend on the weights of 3 neighbors i.e. P1, P2 and P3 and P4 itself, if the weights are degree of Black or White. These weights are evaluated by using AND operator as it shows in the rule base , These fuzzy output of all rules are calculated to a single fuzzy set by aggregating them with the OR (max) operation. The output of applying these rules is fuzzy too and the final output of the FIS as defuzzification is presented to get two values between 0 and 255. We define a triangular membership function for the output with the name “edge” that is shown in (fig.4)

Figure4: output membership function

Black= Trimf (x, [0,3,5]) Edge= Trimf (x, [130,133,135]) White= Trimf(x,[249,252,255])

1471 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

These rules are defined as following: 1. If P1 is Black and P2 is Black and P3 is Black and P4 is Black then Black 2. If P1 is Black and P2 is Black and P3 is Black and P4 is White then edge 3. If P1 is Black and P2 is Black and P3 is White and P4 is Black then edge 4. If P1 is Black and P2 is Black and P3 is White and P4 is White then edge 5. If P1 is Black and P2 is White and P3 is Black and P4 is Black then edge 6. If P1 is Black and P2 is White and P3 is Black and P4 is White then edge 7. If P1 is Black and P2 is White and P3 is White and P4 is Black then edge 8. If P1 is Black and P2 is White and P3 is White and P4 is White then White 9. If P1 is White and P2 is Black and P3 is Black and P4 is Black then edge 10. If P1 is White and P2 is Black and P3 is Black and P4 is White then edge 11. If P1 is White and P2 is Black and P3 is White and P4 is Black then edge 12. If P1 is White and P2 is Black and P3 is White and P4 is White then edge 13. If P1 is White and P2 is White and P3 is Black and P4 is Black then edge 14. If P1 is White and P2 is White and P3 is Black and P4 is White then edge 15. If P1 is White and P2 is White and P3 is White and P4 is Black then edge 16. If P1 is White and P2 is White and P3 is White and P4 is White then White

3. EXPERIMENTAL RESULTS

In this paper, the simulation is very simple & small method but very efficient, fuzzy rule based edge detection algorithm in MATLAB environment. Comparisons were made with other edge detection algorithms (Prewitt, Sobel, Canny,,) and the result have shown the accuracy of the edge detection using the fuzzy rule based over the other algorithms(fig. 5,6).The time of execution is 2.01380(s). = 20 √ (2) = ∑ ∑ ‖ ( , ) − ( , )‖ (3) Legend(2,3): f represents the matrix data of our original image g represents the matrix data of our degraded image in question m represents the numbers of rows of pixels of the images , i represents the index of that row n represents the number of columns of pixels of the image and j represents the index of the column MAX f is the maximum signal value that exists in our original “known to be good” image

Prewitt Sobel Canny Proposed algorithm 0.0201 0.0203 0.0870 0.1325 MSE

62.0421 62.0110 57.2387 56.2025 PSNR Figure5: accuracy of edge detection

The fuzzy rule based algorithm has been successful in detection of the edges that is present in an image after the simulation and execution of images. Using on valuation metrics and visual judgment from different images is used to evaluate edge detection methods and proved that method using both of these assessment in comparison with other methods is robust and by using new method and also solving noise problem.

1472 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

This algorithm provide much more distinct marked edges and have better visual appearance than the standard existing. The sample output have shown below in (Fig. 6) compares the other Edge detection algorithm and our fuzzy edge detection algorithm.

Figure6 :The result of Edge detection methods

4. Conclusions

In this paper Fuzzy rule base edge detection system designed very simple and small computation but very efficient .This algorithm have been developed in Matlab environment in compared with various other edge detection The algorithm uses a 2*2 window mask, which is of smallest size, so it minimizes the computation .and also no threshold value need to be determined in this algorithm. The result have shown the accuracy of this method in short time. The fuzzy rule based algorithm has been successful in pick up the edges our method finds meaningful edges in most images and unlike other three methods don’t need threshold adjustment. This paper used the metric for matching degree evaluation between our method in one side and the methods mentioned above in other ways . Finally, visual assessment of proposed method is illustrated in different images. (Figures 5) show the resultant images produced by proposed method in comparison with (Canny, Sobel,Prewitt ,Log and Robert) standard methods and showed a good result for Fuzzy method in short time .

5. Future work

Ø Optimization of the FIS can be done by using other soft computing techniques such as

PSO and ANN, GA etc. Ø The size of mask can be increased and more rule can set and the results would be

compared.

1473 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Ø . The Algorithm is able to detect edges of gray images only or have to converted into gray, this work may be extended for color images directly.

Ø Using type 2 fis with expectation better result.

6. REFERENCES

1. S. Nagendram, G. Divya, L. Avinash Bharadwaj, P. Dharanijam “A Novel Method of Exploring Human Reasoning Power in Image Analysis”, International Journal of Science and Advanced Technology, Vol. 1, Issue 8, Oct 2011. 2. N. Senthilkumaran and R. Rajesh, “Edge Detection Techniques for Image Segmentation – A Survey of Soft Computing Approaches”, International Journal of Recent Trends in Engineering, Vol. 1, No. 2, May 2009 3. R. C. Gonzalez and R. E. Wood, Digital Image Processing, 3rd Ed., Prentice Hall, 2009 4. D. O. Aborisade, “Novel Fuzzy logic Based Edge Detection Technique”,International Journal of Advanced Science and Technology, pp. 75-82,2011. 5. O. P. Verma, M. Hanmandlu, P. Kumar, S. Chhabra, and A. Jindal, “A novel bacterial foraging technique for edge detection”, Pattern recognition letters, pp. 1187-1196, 2011. 6. A. DO, “Fuzzy Logic Based Digital Image Edge Detection”, Global Journal of Computer Science and Technology, 2010. 7. Bijuphukan Bhagabati,Chumi das,” Detection of Digital Images Using Fuzzy Rule Based Technique” , International Journal of Advanced Research in Computer Science and Software Engineering, June 2012 8-Mahdieh Alimohammadi, Javad Pourdeilami and Ali A. Pouyan “Edge Detection Using Fuzzy Inference Rules and First Order Derivation”, 2013 13th Iranian Conference on Fuzzy Systems (IFSC) 9. Er Kiranpreet Kaur,Vikram Mutenja, Inderjeet Singh Gill, “Fuzzy Logic Based Image Edge Detection Algorithm in MATLAB”, International Journal of Computer Applications,2010 10. Shashank Mathur, Anil Ahlawat, “Application Of Fuzzy Logic In Image Detection”, International Conference "Intelligent Information and Engineering Systems "INFOS 2008, Varna, Bulgaria, June-July 2008

1474 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Fuzzy wavelet transform in nonparametric regression estimating

Mohammad-Javad Davoudabadi, Mina Aminghafari

Faculty of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran

[email protected]

[email protected]

Abstract We propose a new Fuzzy Wavelet denoising method for estimation of nonparametric regression function. We apply proposed method on the simulated signal. We show that the proposed method has robust results with decreasing signal to noise ratio (SNR). Keywords: Fuzzy transform, Fuzzy wavelet, Thresholding.

1. INTRODUCTION Consider the following model = ( ) + , = 1, … , and ∈ Where = ( , … , )′~ (0, ), is unknown and ( ) is an unknown function to be estimated on some index set ∈ .

Without parametric assumptions, such as those lead to linear regression , this is a nonparametric regression problem . In this paper, we will construct an approximation model based on the fuzzy wavelet theory that is based on one of the three fuzzy transforms given in [4].

In classical mathematics , several kinds of transform (Fourier , Laplace , integral , wavelet) are used as an effective tool for construction of approximation . Their core idea consist in transforming an original functions into a special space of functions where some computations are simpler . The inverse transform returns to the original space and produces either the original function or its approximation . Using wavelet transform we estimate or denoise with three following steps:

1) Decompose ( , … , ) by wavelet transform 2) Threshold wavelet coefficients obtained in step 1 3) Apply inverse wavelet transform Hereafter this method denoted by WD.

Perfilieva [4] developed three techniques of direct and inverse Fuzzy transform (F-transform) and approximation properties of the inverse F-transform. In this paper, the main idea of the F-transforms is a fuzzy partition of a universe; for example support of into fuzzy subsets.

The benefit of the inverse formula of the F-transform is a simple approximate representation of the original function. Therefore, in complex computations, the inversion formula can be used instead of precise representation of the original function. In addition to the aforementioned characteristics, the inverse F-transform has good filtering properties that Perfilieva and her colleagues proved it ]٥[ . Beg and Aamir [1] developed notion of Fuzzy Wavelet (FW) and proved a number of theorems establishing properties of fuzzy wavelet that filtering properties of F-transform had been exploited in developing fuzzy wavelets. They have extended fuzzy multiresolution analysis schemes to the fuzzy wavelets.

1475 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

2. Fuzzy Wavelet Denoising Consider the model same as introduction = ( ) + , = 1, . . . , and =

Where ~ (0, ) and iid and is unknown. We want to estimate by fuzzy wavelet denoising (FWD).

In FW decomposition using decimation operator on convolution of and we can extract detail coefficients of function . When detail coefficients are small, they might be ignored without substantially affecting the general function . Thus the idea of thresholding FW coefficients is a way of cleaning out unimportant details considered to be noise. Therefore our proposed method is as follows:

1) Decompose ( , … , ) by fuzzy wavelet transform 2) Threshold wavelet coefficients obtained in step 1 3) Apply inverse fuzzy wavelet transform

The optimal thresholding rules are obtained by choosing the threshold λ . There are a variety of methods to choose the threshold level λ .These thresholds all require an estimate of .

A robust estimate of (based on the median absolute deviation) given by ]٦[ : = 1.4825 (| − ( )| ) where is the vector of detail coefficients at level 1 .

We use the universal threshold for thresholding detail coefficients where the name of the universal threshold given by Donoho and Johnstone [2] to λ = 2 ( ) .

The two most common thresholding policies are hard and soft. The analytic expressions for the hard- and soft-thresholding rules are as follows ( , λ ) = (| | > λ ) , 0λ ≥ , d ∈ ¡ and , λ = − ( )d λ | | > λ = ( )(| | − λ ) , 0λ ≥ , d ∈ ¡ For FW coefficients we can write = + , = 1, … , where is FW coefficients of , is FW coefficients of ( ) and is FW coefficients of .

Since the basis to which we project are orthonormal, the noises and have the same probabilistic properties. 3. Simulation results In this section we compare WD and FWD through Mean Square Error (MSE) on different function from wavelet toolbox of contaminated by additive noise in MATLAB. Where MSE is ∑ ( − ) and is estimator of . We estimate by FWD and WD then compute MSE for each sample and then iterate it 50 times and calculate mean of MSE over 50 iteration. In Table 1 , when signal is smooth WD gives better results other wise our method is better.

Table 1- MSE FWD method and WD method with sample size 512.

sample Mean MSE FWD Mean MSE WD Blocks 7.8568 1.2186 Bumps 2.9537 0.7262 Heavy sine 8.5188 0.3474 Doppler 0.1741 0.3333 Quadchirp 0.5615 0.6846 Mishmash 1.4163 1.4863

1476 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

In Table 2 we compare FWD method and WD method when data is genarated with different noise variance. As we can see the performance of our method does not change with increasing noise variance (decreasing SNR).

Table 2- MSE FWD method and WD method with sample size 128 of Doppler 's sample . sample Mean MSE FWD Mean MSE

WD 0.5 0.1018 0.1101 1 0.1093 0.2033 2 0.1256 0.3721 4 0.1594 0.7166 8 0.2272 1.4186 16 0.3636 2.7854

4. REFERENCES 15. Beg , I . Aamir , K . M . (2013) . Fuzzy Wavelets , The Journal of Fuzzy Mathematics. 21 , No.3, pp.

623-638 . 16. Donoho, D, L. Johnstone, I, M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika.

VOL. 81 , PP. 425–455 .

17. Misiti , M . Misiti , y . Oppenheim , G . Poggi , J-M . (2007) . Wavelets and their Applications . Wiley-ISTE.

18. Perfilieva , I . (2006) . Fuzzy transforms : Theory and applications . Fuzzy Sets and Systems. 157 , pp. 993-1023 . Trifunac,.

19. Perfilieva , I . Hodakova,P . (2011) . Fuzzy and Fourier Transforms , 7th conference of the European Society for Fuzzy Logic and Technology, Aix-les-Bains , France.

20. Vidakovic, B. (1999). Statistical Modeling by Wavelets. Wiley.

1477 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Comparison of Optimal Homotopy Asymptotic Method and Homotopy Perturbation Method for Improved Boussinesq

Equation

Z. Ayati1, Sima ahmady2

1.Depatment of Enginneering sciences, Faculty of Technology and Engineering East of Guilan, University of Guilan P.C.44891-Rudsar-Vajargah,Iran,E-mail:[email protected]

2.Ms student, Department of Mathematics, East Tehran Payame Noor University,E-mail:sima. [email protected]

Abstract Today nonlinear phenomenons play an important role in applied mathematics, physics and engineering. Explicit solutions of nonlinear partial differential equations sound essential in most caces. In homotopy method, without converting or linearization and by applying homotopy technique in topology, a homotopy with embedded parameters of [0,1]p ∈ can be created. In this paper, the improved Boussinesq equation will be solved by two methods: homotopy perturbation method and optimal homotopy Asymptotic method. Finally, the results will be compared and evaluated with the numerical values obtained by these methods. It is shown that the results of the optimal homotopy asymptotic method are better than the homotopy perturbation method. Keywords: Optimal Homotopy Asymptotic Method(OHAM), Homotopy Perturbation Method(HPM), Improve Boussinesq equation.

1. Introduction The Boussinesq equation was first introduced in 1870 by Joseph Boussinesq to model the propagation of shallow water waves in multiple directions. Subsequently, it was applied to many other areas of mathematical physics dealing with wave phenomena [1-4] Applications to acoustic waves, ion-sound waves , plasma and nonlinear lattice waves, are described in references[5,6,7,8]. A general form for the Boussinesq-type equations considered in these reference is as follows

2( , ) ( , ) ( , ) ( ( , )) ,tt xx xxxx xxu x t u x t qu x t u x t= + + (1) where 1q = or 1− . The original equation used by Boussinesq in [9] was Eq.1with 1q = . In the literature,it is known as the “bad “ Boussinesq equation .Eq.(1) with 1q = − is typically called the “good” Boussinesq equation. Bogolubsky in [10,11] has shown that the “bad “ Boussinesq equation describes non- realistic instability at short wavelengths. Consequently, the following so-called improved Boussinesq equation (IBq) was proposed 2( , ) ( , ) ( , ) ( ( , )) .tt xx xxtt xxu x t u x t u x t u x t= + + (2) The IBq is more suitable for computer simulation than its predecessor. It has been used to model ion- sound wave propagation [12] , nonlinear wave dynamics in weakly dispersive media [13], and acoustic waves on elastic rods with a circular variants give rise to compact and non- compact physical structures [14]. The homotopy perturbation method introdused by He [15] and the optimal homotopy asymptotic method sujested by Marinca and Herisuna, [16] andthey have been widely used by numerouse researchers,Successfully for different physical systems such as bifurcation , asympotology nonlinear wave equations. In this work, we consider improved Boussinesq equation with the following initial condition 3( ) ,tt xx xxttu u u= + , 0,x t−∞ ≤ ≤ ∞ ≥ (3)

( ,0) , ( ,0)3 3t

x xu x u x= − = − .

1478 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

This equation has exact solution as below 1( , )

( 1)3xu x t

t=

−.

2. Outlines of HPM To illustrate the basic idea of homotopy perturbation method for solving nonlinear deferential equations, the following equation is considered

( ) ( ) 0,A u f r− = ,r ∈ Ω (4) with boundary conditions:

( , ) 0,uB un

∂=

∂ ,r ∈ Γ

(5) where A is a general functional operator , B is a boundary operator, f(r) is a known analytical

function, Γ is the boundary of the domain ,Ω and n∂

∂ denotes differentiation along the normal drawn

outwards from Ω . Generally speaking the operator A can be divided into two parts L and N , where L is a linear and N is a nonlinear operator. Eq. (4), therefore can be rewritten as the following:

( ) ( ) ( ) 0.L u N u f r+ − = (6) We construct a homotopy [ ]( , ) : 0,1 ,v r p RΩ × → of Eq.(3) which satisfies

[ ] [ ]0( , ) (1 ) ( ) ( ) ( ) ( ) 0,H v P p L v L u p A v f r= − − + − = [ ]0,1 ,p ∈ ,r ∈ Ω (7) and is equivalent to

[ ]0 0( , ) ( ) ( ) ( ) ( ) ( ) 0,H v P L v L u pL u p N v f r= − + + − = (8) where [ ]0,1p ∈ is an embedding parameter, 0u is an initial approximation for the solution of Eq.(4), which satisfies the boundary conditions. Obviously, from Eqs (7) and (8) we will have

0( ,0) ( ) ( ) 0,H v L v L u= − = (9)

.( ,1) ( ) ( ) 0H v A v f r= − = (10) Thus, the changing process of p from zero to unity is just that of ( , )v r p from 0u to ru . In topology these are called deformation, and 0( ) ( )L v L u− , ( ) ( )A v f r− called homotopic. Here the embedding parameter is introduced much more naturally, unaffected by artificial factors ; further it can be considered as small parameter for 0 1p≤ ≤ . So it is very natural to assumed that the solution of (7) and (8) can be expressed as a power series in p :

20 1 2 ....v v pv p v= + + +

(11) Setting 1p = , results in the approximate solution of Eq .(4) will be obtained as follows

0 1 2 .1lim ...p

u v v v v→

= = + + +

(12) 3.Basic idea of optimal homotopy asymptotic method We review the basic principles of OHAM as expounded in [16,17]. (i) Consider the following differential equation:

[ ( )] ( ) 0, ,A v x a x x+ = ∈Ω (13) where Ω is problem domain, ( ) ( ) ( )A V L V N V= + ,where ,L N are linear and nonlinear operator, ( )v x is an unknown function, and ( )a x is a known function. (ii) Construct an optimal homotopy equation as:

1479 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

(1 )[ ( ( ; ) ( )] ( )[ ( ( ; ) ( )] 0,p L x p a x H p A x p a xφ φ− + − + = (14) convergence of the solution is greatly dependent. The auxiliary function ( )H p adjusts the convergence domain and controls the convergence region, where 1o p≤ ≤ is an embedding parameter, and

1( )

mk

kk

H p p c=

= ∑ is auxiliary function.

iii) Expand ( ; , )jx p cφ in Taylor’s series about p , one has an approximate solution :

01

( ; , ) ( ) ( , ) , 1, 2,3,....kj k j

kx p c v x v x c p jφ

=

= + =∑ (15)

Many researchers have observed that the convergence of the series Eq.(14) depends upon , 1,2,3,...,jc j m= . If it is convergence then, we obtain:

01

( ) ( , ).k jk

v v x v x c∞

=

= + ∑%

(16) iv) substituting Eq . (15) in Eq .(13) , the following residual has been derived ( ; ) ( ( ; )) ( ) ( ( ; )).j j jR x c L v x c a x N v x c= + +% % (17) If ( ; ) 0jR x c = , then v% will be the exact solution . For nonlinear problems, generally this will not be the

case. For determminig , 1,2,3,...jc j = Galerkin’s method, Rits method or the method of least squares can be used. v) Finally, substitute these constans in Eq .(16) and one can get the approximate solution. 4.Example 4.1. Solution of the improved Boussinesq equation via HPM Consider improved Boussinesq equation with the following initial condition

3( ) ,tt xx xxttu u u= + , 0x t−∞⟨ ⟨∞ ≥

(18)

( ,0) , ( ,0) .3 3t

x xu x u x= − = − (19)

This equation have exact solution as below

1( , ) .( 1)3

xu x tt

=−

(20) According to the ,HPM the following homotopy can be constructed

30 0( ) [( ) ( ) ].tt tt xx xxtt ttv u p v v u− = + − (21)

Substitute 0

ii

iv v p

=

= ∑ in Eq .(21) leads to

30 0

0 0 0( ) ( ) [( ) ( ) ( ) ].i i i

i tt tt i xx i xxtt tti i i

v p u p v p v p u∞ ∞ ∞

= = =

− = + −∑ ∑ ∑

Equating the coefficients of the terms with the identical powers of p, leads to: 0

0 01

1 0 0 02 2 2

2 0 0 0 0 1

: ( ) ( ) 0,: ( ) ( ) ( ) ( ) ,: ( ) 6 (( ) ) 3 ( ) ( ) ,

tt tt

tt xx xxtt tt

tt x xx xxtt

p v up v v v up v v v v v v

− =

= + −

= + +

(22)

M

1480 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Assume 0 0 (1 ),3

xv u t= = − + then

0 0

2 3 4 51

4 5 6 7 8 92

(1 ),3

1 1 1 13 3 3 3 ,3 3 6 301 3 7 1 1 13 3 3 3 3 3 ,6 10 30 10 40 360

xv u t

v x t x t x t x t

v x t x t x t x t x t x t

= = − +

= − − − −

= − − − − − −

(23)

6 7 8 9 10 113

12 13

1 7 1 29 121 113 3 3 3 3 310 30 4 180 1800 600

11 113 3 ,3600 46800

v x t x t x t x t x t x t

x t x t

= − − − − − −

− −

M Consider approximation for four terms 0 1 2 3u v v v v≈ + + + solution of equation will be obtained as the following form

82 3 4 5 6 7 3

9 10 11 12 13

1 1 1 1 1 1 1 113(1 ) 3 3 3 3 3 3 33 3 3 3 3 3 3 4059 121 11 11 113 3 3 3 3 .360 1800 600 360 46800

u x t x t x t x t x t x t x t x t

x t x t x t x t x t

= − + − − − − − − −

− − − − − (24)

4.2. Solution of the improved Boussinesq equation via OHAM According to the OHAM, byapplying Eq .(14) to Eq .(3), we derive

2 3 31 2 3(1 ) ( ...)[( ) ].tt xx xxtt ttp v c p c p c p v v v− = + + + + −

By substituting Eq.(15) into above equation, we get 2 3 2 3 3 2

0 1 2 3 1 2 3 0 0 1

2 2 20 2 0 1

1 20 1 2

20 1 2

(1 )( ...) ( ...)[( ) (3 )

(3 3 ) ...

( ) ( ) ( ) ...

( ) ( ) ...].

tt tt tt tt

tt

xx xx

xx

xxtt xxtt xxtt

tt tt

p v v p v p v p c p c p c p v v v p

v v v v p

v v p v p

v v p v p

− + + + = + + + +

+ + +

+ + + +

− − − −

(25)

Equating the coefficients of the terms with the identical powers of p , leads to

00 0: 0 (1 ),

3tt

xp v v t−= → = +

1 31 0 1 0 1 0 1 0: ( ) ( ) ( ) ( ) ( ) ,tt tt xx xxtt ttp v v c v c v c v− = + − (26)

22 32 1 1 0 1 1 1 1 1 2 0 2 0 2 0: ( ) ( ) (3 ) ( ) ( ) ( ) ( ) ( ) ,tt tt xx xxtt tt xx xxtt ttp v v c v v c v c v c v c v c v− = + − + + −

M So we derive

31 1 0( )xxv c v dtdt= ∫∫

2 32 1 1 0 1 1 1 2 0( ) (3 ) ( ) ( ) )tt xx tt xxv v c v v c v c v dtdt= + − +∫∫ (27)

1481 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

5 4 3 21 1 1 1

2 9 2 8 2 71 1 1

2 6 2 5 2 31 1 1

2 2 5 4 3 21 2 2 2 2

1 1 1 13 3 3 330 6 3 31 1 13 3 3

360 40 107 4 13 3 3

30 15 31 1 1 1 13 3 3 3 3.3 30 6 3 3

c x t c x t c x t c x t

c t x c t x c t x

c t x c t x c t x

c t x c t x c t x c t x c t x

= − − − −

− − − −

− + +

− − − −

(28) M Therefore,the three

terms’ approximation using OHAM for solution will be obtaind as fallow

51

4 2 2 91 1 1

2 8 2 7 2 61 1 1

2 5 2 3 2 21 1 1

5 4 3 22 2 2 2

1 1 13 3 33 3 15

1 2 13 3 33 3 3601 1 73 3 340 10 30

4 1 13 3 315 3 3

1 1 1 13 3 3 3.30 6 3 3

v x t x c t x

c t x c t x c t x

c t x c t x c t x

c t x c t x c t x

c t x c t x c t x c t x

= − − −

− − −

− − −

+ +

− − − −

(29)

1 2 = 18.08904377 and c = 281.9692890c are computed by the method of least squares. (30) Table 1 Comparison of OHAM and HPM for t=0.1

X OHAM Solution HPM Solution Exact Solution Error (OHAM) Error(HPM) 0.0 0 0 0 0 0

0.1 -0.0590196310 -0.3023353179 -0.0525388744 0.0064807566 0.2497964435

0.2 -0.1180395046 -0.6046706358 -0.1050777490 0.0129618156 0.4995928868

0.3 -0.1770592570 -0.9070059537 -0.1576166234 0.0194426336 0.7493893303

1482 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Table 2 Comparison of OHAM and HPM for t=0.09

0.4 -0.2360790093 -1.209341272 -0.2101554979 0.0259235114 0.9991857741

0.5 -0.2950987616 -1.511676590 -0.2626943724 0.0324043892 1.248982218

0.6 -0.3541185139 -1.814011907 -0.3152332469 0.0388852670 1.551317535

0.7 -0.4131382662 -2.116347225 -0.3677721214 0.0453661448 1.748575104

0.8 -0.4721580186 -2.418682543 -0.4203109958 0.0518470228 1.998371547

0.9 -0.5311777709 -2.721017861 -0.4728498703 0.0583279006 2.248167991

1.0 -0.5901975232 -3.023353179 -0.5253887448 0.0648087784 2.497964434

X OHAM Solution HPM Solution Exact Solution Error (OHAM) Error(HPM) 0.0 0 0 0 0 0

0.1 -0.05885248439 -0.06415000795 -0.05196152421 0.00689096018 0.01218848374

0.2 -0.1177049688 -0.1283000159 -0.1039230484 0.0137819204 0.0243769675

0.3 -0.1765574532 -0.1924500238 -0.1558845726 0.0206728806 0.0365654512

0.4 -0.2354099376 -0.2566000318 -0.2078460968 0.0275638408 0.0487539350

0.5 -0.2942624220 -0.03207500398 -0.2598076210 0.0344548010 0.0609424188

0.6 -0.3531149063 -0.3849000477 -0.3117691453 0.0413457610 0.0731309024

0.7 -0.4119673907 -0.4490500556 -0.3637306695 0.0482367212 0.0853193861

0.8 -0.4708198751 -0.5132000636 -0.4156921937 0.0551276814 0.0975078699

0.9 -0.5296723595 -0.5773500716 -0.4676537119 0.0620186416 0.1096963537

1.0 -0.5885248439 -0.6415000795 -0.5196152421 0.0689096018 0.1218848374

1483 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Figure 1.comparison of improved Boussinesq equation profile using OHAM , HPM and Exact solution For t=0.1

(……. HPM ---- OHPM Exact)

Figure 2. comparison of improved Boussinesq equation profile using OHAM , HPM and Exact solution For t=0.09

(……. HPM ---- OHPM Exact)

1484 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

6. Conclusion In this paper, the improved Boussinesq equation by HPM and OHAM have been solved. The results obtained by OHAM are very consistent in comparison with HPM, and Exact solutions.It is found that OHAM compared with HPM is a reliable efficient and powerful method for solving nonlinear partial differential equations, especially the improved Boussinesq equation. 7. Reference

1. S. Lai, Y.H. Wu, The asymptotic solution of the Cauchy problem for a generalized Boussinesq equation, Discrete and Continuous Dynamical Systems, B 3 (2003) 401-408.

2. S. Lai, Y.H. Wu, X. Yang, The global solution of an initial boundary value problem for the damped

Boussinesq equation, Communications on Pure andApplied Analysis, 3 (2004) 319-328. 3. S. Lai, Y.H. Wu, B. Wiwatanapataphee, On exact travelling wave solutions for two types of

nonlinear K.n; n/ equations and a generalized KP equation,Journal of Computational and Applied Mathematics, 212 (2008) 291-299.

4. Q. Lin, Y.H. Wu, S. Lai, On global solution of an initial boundary value problem for a class of damped nonlinear equations, Nonlinear Analysis, 69 (2008)4340-4351.

5. P.A. Clarkson, M.D. Kruskal, New similarity reductions of the Boussinesq equation, Journal of Mathematical Physics, 30 (1989) 2201-2213.

6. L. Debnath, Non-linear PDEs for Scientists and Engineers, Birkhäuser, Boston, 1997. 7. F. Linares, Global existence of small solutions for a generalized Boussinesq equation, Journal of

Differential Equations, 106 (1993) 257-293.

8. V.G. Makhankov, Dynamics of classical solitons, Physics Reports (Section C of Physics Letters), 35 (1) (1978) 1-128.

9. J. Boussinesq, Théorie des ondes et de remous qui se propagent . . . , Journal de Mathmatiques Pures et Appliques, 17 (1872) 55-108.

10. I.L. Bogolubsky, JETP Letters 24 (1976) 160. 11. I.L. Bogolubsky, Some examples of inelastic soliton interaction, Computer Physics

Communications, 13 (1977) 149-155. 12. V.G. Makhankov, Dynamics of classical solitons, Physics Reports (Section C of Physics Letters), 35

(1) (1978) 1-128. 13. M.P. Soerensen, P.L. Christiansen, P.S. Lomdahl, Solitary waves on non-linear elastic rods. I,

Journal of the Acoustical Society of America, 76 (3) (1984)871-879.

14. Qun Lin,Linear B-spline element method for the improved Boussinesq equation, Journal of Computational and Applied Mathematics, 224(2009) 658-667. 15. He, J.H., 2006. Some asymptotic method for sttongly nonlinear equations. Int.J. Mod. Phs.20

(2006)1141-1199. 16. Marinca, V., Herisanu, N., Application of optimal asymptoticmethod for solving nonlinear

equations arising in heat transfer. Int.Commun. Heat Mass Transfer, 35(2008) 710–715.

17. Mabood, F., Khan, Waqar A., Ahmad Izani Bin Ismail, optimal homotopy asymptotic method for heat transfer in hollow sphere with robin boundary conditions. Heat Transfer- Asian, 43(2014)124-133.

1485 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

A Numerical Method for Solving Fuzzy differential Equations With Fractional Order

N. Ahmady , E. Ahmady

Department of Mathematics, Varamin-Pishva Branch, Islamic Azad University, Varamin, Iran

Department of Mathematics, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran

Corresponding author. Email address: [email protected] Tel:+9126385642

Abstract In this paper a new solution for fuzzy differential equation with fractional order is considered. The

fuzzy solution of fractional fuzzy differential equation is construct in power series in the Caputo derivatives sense. To illustrate the reliability of the method some examples are provided.

Keywords:Generalized Hukuhara differentiability, Caputo differentiability, Fuzzy fractional differential equations.

1 Introduction Fractional calculus theory is a mathematical analysis tool applied to the study of integrals and derivatives of arbitrary order. Fractional differential equation with uncertainly have excited, in recent years. Riemann-Liouville differentiability concept based on the Hukuhara differentiability was introduced by Agarwal et al [1]. Existence and uniqueness of solution of fuzzy fractional differential equation was investigated in [4, 2]. Allahviranloo et al in [3] give the explicit solutions of uncertain fractional differential equations under Riemann-Liouville H-differentiability using Mittag- Leffler functions and in [13] give the solutions of fuzzy fractional differential equations under Riemann-Liouville H-differentiability by fuzzy Laplace transforms. In this paper, some theorems of the fractional power series are generalized for the fuzzy fractional power series by using fuzzy Caputo fractional derivatives. We use the fuzzy fractional power series to solve the fractional differential equations subject to given fuzzy initial conditions. In section 2, some basic definitions are brought. The fuzzy fractional power series and the proposed method for solving fuzzy fractional differential equation are introduced in section 3 and 4. Examples are presented in section 4 and finally conclusion is drawn. 2 Preliminaries First notations which shall be used in this paper are introduced. We denote by ℝℱ , the set of fuzzy numbers, that is, normal, fuzzy convex, upper semi-continuous and compactly supported fuzzy sets which are defined over the real line. The Hausdorff distance between fuzzy numbers is given by :ℝℱ × ℝℱ → ℝ ∪ 0 as

( , ) = sup ∈[ , ]max| ( ) − ( )|, | ( ) − ( )|. Consider , , , ∈ ℝℱ and ∈ ℝ, then the following properties are well-known for metric ,

• ( ⊕ , ⊕ ) = ( , ); • ( , ) = | | ( , ); • ( ⊕ , ⊕ ) ≤ ( , ) + ( , ); • ( ! , ! ) ≤ ( , ) + ( , ), as long as ! and ! exist, where , , , ∈ ℝℱ.

where, ⊖ is the Hukuhara difference(H-difference), it means that ! = if and only if ⊕ = .

Definition 2.1 Let , ∈ . If there exists ∈ such that

1486 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

⊖ = ⇔ ( ) = + , ( ) = + (−1) , (1)

Then is called the generalized Hukuhara difference of and . A function : [ , ] → so called fuzzy-valued function. The r-level representation of fuzzy valued function is expressed by [ ( )] = [ ( , ), ( , )], ∈ [ , ], ∈ [0,1]. Definition 2.2 The generalized Hukuhara derivative of a fuzzy-valued function : ( , ) → at is defined as

′ ( ) = lim → ( )⊖ ( ) , (2) if ′ ( ) ∈ , we say that is generalized Hukuhara differentiable (gH-differentiable) at . Also we say that is [ − ]-differentiable at if

[ ′ ] ( ) = [ ′ ( , ), ′ ( , )], 0 ≤ ≤ 1, (3) and say is [ − ]-differentiable at if

[ ′ ] ( ) = [ ′ ( , ), ′ ( , )], 0 ≤ ≤ 1. (4) Definition 2.3 Let ∈ [ , ] ∩ [ , ]. The fuzzy Riemann-Liouville integral of fuzzy-valued function is defined as following:

( )( ) = ( ) ∫ ( ) ( ) , < < , 0 < ≤ 1. (7)

Definition 2.4 Let ′ ∈ [ , ] ∩ [ , ]. The fractional generalized Hukuhara Caputo derivative of fuzzy-valued function is defined as follows:

( ∗ )( ) = ( ′ )( ) = ( ) ∫ ( )( ) ( ) , < < , 0 < ≤ 1. (8) Also we say that is [ − ]-differentiable at if

( ∗ ) ( )( ) = [ ∗ ( , ), ∗ ( , )], 0 ≤ ≤ 1, (9) and say is [ − ]-differentiable at if

( ∗ ) ( )( ) = [ ∗ ( , ), ∗ ( , )], 0 ≤ ≤ 1, (10)

3 Fuzzy Fractional Power Series In this section, we will generalize some important definition and theorem related with the fuzzy power series into fractional case in the sense of the fuzzy Caputo definition. Definition 3.1 A fuzzy power series representation of the form

∑ ⊙ ( − ) = ⊕ ⊙ ( − ) ⊕ ⊙ ( − ) ⊕⋯, (11) where 0 ≤ − 1 < ≤ and ≥ is called fuzzy power series (FPS) about , where is a variable and are fuzzy constants called the coefficient of the series.

Theorem 3.1 If ( ) is a fuzzy-valued function defined by ( ) = ∑ ⊙ , then for 0 ≤ − 1 < ≤ , If is [( ) − ]-differentiable we have:

∗ ( ) = ∑ ⊙ ( ) (( ) ) ( ) , (12) If is [( ) − ]-differentiable

∗ ( ) =⊖ ∑ ⊙ ( ) (( ) ) ( ) , (13) Proof: Define ( ) = ∑ ⊙ . If is [( ) − ]-differentiable therefore ( ) is [( ) − ]-differentiable, therefore

[ ∗ ( )] = ( ) ∫ ( − ) [ ( )( )] thus

∗ ( , ) = ( ) ∫ ( − ) ( )( , ) and

∗ ( , ) = ( ) ∫ ( − ) ( )( , )

1487 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

∗ ( , ) = ( ) ∫ ( − ) (∑ ( ). )

∗ ( , ) = ( ) ∫ ( − ) (∑ ( ). )

therefore ∗ ( ) = ∑ e ( ) ∫ ( − ) ( ) = ∑ e ∗ ( ),

If we make the change of variable = , ≥ 0, we have ∗ ( ) = ∗ ( ) = ∑ e ∗ ( ) = ∑ e ( ) (( ) ) ( ) ,

By similarly way, If is [ − ]-differentiable we have ∗ ( ) =⊖ ∑ e ( ) (( ) ) ( ) ,

Theorem 3.2 Suppose that fuzzy-valued function ( ) has fuzzy power series representation at of the form:

( ) = ∑ ⊙ ( − ) , 0 ≤ − 1 < ≤ , (14) then

= ( )⊙ ∗ ( ) (15)

Proof: First we put = into equation (14), we get ( ) = . On the other aspect as well, by using (12) we have: ∗ ( ) = ⊙ Γ( + 1) + ⊙ ( ) ( ) ( − ) + ⊙ ( ) ( ) ( − ) + ⋯ (16)

The substitution of = into quation (16)then we have ∗ ( ) = ⊙ Γ( + 1) ⇒ = ( )⊙ ∗ ( ) (17)

Applying Equation (12) on the series representation in Equation (16), on can obtain that ∗ ( ) = ⊙ Γ(2 + 1) + ⊙ ( ) ( ) ( − ) + ⊙ ( ) ( ) ( − ) + ⋯ (18)

at = we have ∗ ( ) = ⊙ Γ(2 + 1) ⇒ = ( )⊙ ∗ ( )

Now we can see pattern and discover the general formula for , = ( )⊙ ∗ ( ),

By substituting of = ( )⊙ ∗ ( ), = 0,1,2,⋯ back into the series of Equation (11)will lead to the following expansion for fuzzy-valued function ( ) about :

( ) = ∑ ( )⊙ ∗ ( )e( − ) (19) which is the Generalized Taylor’s series. Theorem 3.3 Suppose that fuzzy-valued function ( ) has a Generalized Taylor’s series representation at of the form

( ) = ∑ ( )⊙ ∗ ( )e( − ) (20) then

∗ ( ) = ( ) ! ⊙ ( )( ), (21)

∗ ( ) = ( ) ! ⊙ ( )( ), (22) where

( ) = (( − ) / + ). (23)

Proof:By change of variable = ( − ) / + into Equation (20) then we obtain ( ) = (( − ) / + ) = ∑ ( )⊙ ∗ ( )e( − ) (24)

1488 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

The other hand, the power series of fuzzy-valued function ( ) about take the form ( ) = ∑ ( )( )⊙ ( ) ! (25)

then the two series expansion in Equation (24) and (25) converge to the same function ( ). Therefore (∑ ( )⊙ ∗ ( )⊙ ( − ) ,∑ ( )( )⊙ ( ) ! ) = 0

this means ∗ ( ) = ( ) ! ⊙ ( )( ), ∗ ( ) = ( ) ! ⊙ ( )( ),

4 Proposed Method The idea of this method is to look for the solution in the form of a power series, the coefficient of the series must be determined. Now we recall the fuzzy fractional differential equation, as follows:

( ∗ )( ) = ( ), > 0, 0 < < 1, (26) where we suppose that 0 < < 1. We assume that the fuzzy-valued function ( ) can be expand in the fuzzy power series. Then

( ) = ∑ ⊙ = ⊕ ⊙ ⊕ ⊙ ⊕⋯, (27) Recalling the rule of fractional differentiation of the power function

∗ = ( ) ( ) (28) , we look for the solution of the equation (26) in the form of the following power series

( ) = ∑ e (29) Taking into account the formula in Theorem (3.1) we note that

∗ ( ) = ∑ ⊙ ( ) (( ) ) ( ) , (30) Substituting the expression (29) in

( ∗ ( ), ( )) = 0 This means

supmax| ∗ ( , ) − ( , )|, | ∗ ( , ) − ( , )| = 0 therefore

∗ ( , ) = ( , ), ∗ ( , ) = ( , ),

∑ . ( ) (( ) ) ( ) = ∑ .

this means ∑ . (( ) ) ( ) = ∑ .

By comparison of the coefficient of both series gives: = , = ( ) ( ) , = ( ) ( ) , ⋮ = ( ) (( ) )

by similarly way we obtain: = , = ( ) ( ) , = ( ) ( ) , ⋮

1489 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

= ( ) (( ) ) therefore, under the above assumptions the solution of the equation (26), ( ) = ( ( , ), ( , )) is

( , ) = ( ) + ∑ (( ) ) ( ) ( ) (31)

( , ) = ( ) + ∑ (( ) ) ( ) ( ) (32) 5 Numerical Example

Example 5.1 Consider Fractional Relaxation equation

( ∗ . )( ) = ( ), (0) = ∈ (33)

( ) = ∑ ⊙

therefore ∗ . ( ) = ∑ ⊙ ( ) ( ( ) ) ( )

by putting in (33) we have

(∑ ⊙ ( ( ) ) ( ) ,∑ ⊙ ) = 0, Therefore we obtain: ( ) = . ( . )

Figure 1:Solution of Example 5.1 Conclusion In this paper we generalized fuzzy power series to fractional fuzzy power series in sence of Caputo derivatives. Then we used fuzzy fractional power series for solving fractional fuzzy differential equation. References

[1] R. P. Agarwal , V. Lakshmikantham , J. J. Nieto , On the concept of solution for fractional differential equations with uncertainty, Nonlinear Anal 72(2010)59-62. [2] T. Allahviranloo , S. Abbasbandy , S. Salahshour, Fuzzy fractional differential equations with Nagumo and Krasnoselskii-Krein condition, In: EUSFLAT-LFA 2011, July 2011, Aix-les-Bains, France.

1490 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

[3] T. Allahviranloo, S. Salahshour, S. Abbasbandy, Explicit solutions of fractional differential equations with uncertainty, Soft Comput. Fus. Found. Meth. Appl. 16 (2012) 297-302. [4] S. Arshad, V. Lupulescu, On the fractional differential equations with uncertainty, Nonlinear Anal 74(2011)85-93. [5] S. Arshad, V. Luplescu, fractional differential equation with fuzzy initial conditon ,Electronic Journal of Differential Equations,34(2011) 1-8. [6] B. Bede, S. Gal , Generalizations of the differentiability of fuzzy number valued functions with applications to fuzzy differential equations,Fuzzy Sets and Systems, 151( 2005) 581-599. [7] T. Gnana Bhaskar, V. Lakshmikantham, S. Leela, fractional differential equations with a Krasnoselskii-Krein-type conditions, Nonlinear analysis:Hybrid system 3(2009)734-737. [8] K. Diethelm, G. Gesellschaft, The Analysis of Fractional Differential Equations, Springer Heidelberg Dordrecht London New York. [9] K. Diethelm, The Analysis of Fractional Differential Equations( An Application-Oriented Exposition Using Differential Operators of Caputo Type), Lecture Notes in Mathematics,Springer-Verlag Berlin Heidelberg, (2004). [10] I. Podlubny, Fractional Differential Equations, Academic Press, San Diego, (1999). [11] M. L. Puri, D. A. Ralescu, Differentials for fuzzy functions, J. Math. Anal. Appl. 91 (1983) 552-558. [12] M. L. Puri, D. A. Ralescu, Fuzzy random variables, J. Math. Anal. Appl. 114 (1986) 409-422. [13] S. Salahshour, T. Allahviranloo, S. Abbasbandy, Solving fuzzy fractional differential equations by fuzzy Laplace transforms. Commun Nonlinear Sci Numer Simulat 17 (2012) 1372-1381.

1491 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Fuzzy Optimization of Linear Fractional Function Subject to a System of Max- Arithmetic Mean Relational Inequalities

Fateme Kouchakinejad1, Mashaal lah Mashinchi2, Esmaile Khorram3

1- Department of Mathematics , Graduate University of Advanced Technology , Kerman

2-Department of Statistics , Faculty of Mathematics and Computer Sciences , Shahid Bahonar University , Kerman

3- Faculty of Mathematics and Computer Sciences , Amirkabir University of Technology , Tehran

Corresponding Author’s E-mail: [email protected]

Abstract

A linear fractional optimization problem subject to a system of fuzzy relational inequalities is considered. First, the problem is solved using Charnes and Cooper’s transformation. In case, decision maker is not satisfied with the optimal solution, employing linear membership functions and Bellman-Zadeh decision, a new optimal solution is found. If he\she is not satisfied with the new solution, more perturbation should be forced on the inequalities. This process is continued until obtain the desired solution of his\her.

Keywords: Fuzzy inequality, Fuzzy optimization, Fuzzy relational inequalities, Linear fractional objective function, Max- arithmetic mean composition.

1. INTRODUCTION

Considering wide application of fuzzy relational equations (FRE) and inequalities (FRI) , a lot of works have been done in this field with several different compositions; see [1,2,3,4] . In [5], it is shown that in the sense of sensitivity, arithmetic mean is one of the best choices to employ in the composition of FRE s and FRI s. So far, some researchers have used max-arithmetic mean composition in FREs or FRIs [6,7,8,9]. In the present paper , linear fractional optimization problem subjected to the FRIs with max-arithmetic mean composition is considered and an algorithm is given to obtain a fuzzy solution. 2. P ROBLEM SOLVIN G Consider the following linear fractional optimization problem

min

. .[0,1] ,

t

t

av

n

c xZd x

s t A x bx

µν

+=

+≤

o (1)

where ( )ij m nA a ×= , 1( )i mb b ×= , 1( )j nx x ×= such that for all 1,2, , i I m∈ = … and 1, 2, , j J n∈ = … , we have , , [0,1]ij i ja b x ∈ . Also , 1( )j nc c ×= and 1( )j nd d ×= such that ,j jc d ∈ R for all j J∈ ; and

,µ ν ∈ R . Moreover , avo stands for the max- arithmetic mean composition where , avi ia o x b≤ means

max (( 2))j J ij j ia x b∈ + ≤ for all i I∈ . In this paper, we assume that tc x µ+ and td x ν+ remain positive. To solve (1), first, some previously obtained results are given [5,10].

Notation 1. Set

1492 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

( , ) [0,1] : i n avi iS A b x a o x b= ∈ ≤ for all i I∈ ,

( , ) ( , ) [0,1] : i n avi I

S A b S A b x Ao x b∈

= = ∈ ≤I .

Theorem 1. a) ( , )S A b ≠ ∅ if and only if for all i I∈ and all j J∈ , 2 0i ijb a− ≥ . b) If ( , )S A b ≠ ∅ then 10 [0,0,...,0]t

n×= is the unique minimum solution of ( , )S A b . Also, x is the unique maximum solution of ( , )S A b , where , 1( )j nx x ×= and min1, m 2 in j i I i ijx b a∈= − .

Here , by Theorem 1, the feasible domain of (1) can be presented as in the next corollary . Corollary 1. If ( , )S A b ≠ ∅ then , ( , ) [0, ]S A b x= .

Now , (1) is equivalent to the following problem due to Corollary 1

min

. . [0, ].

t

tc xZd x

s t x x

µν

+=

+∈

(2)

Since the feasible domain of (2) is non-empty and bounded then, the problem can equivalently be transformed to a linear programming problem using Charnes and Cooper's transformation [11]. To do this , let

1( )tt d x ν

=+

and .y tx= Therefore , (2) is equivalent to the following linear programming problem

min. . [0, ]

10.

t

t

c y ts t y y

d y tt

µ

ν

+

+ =≥

(3)

Note 1. T he optimal solution of (2) and equivalently (1) is obtained by (1 )os osx t y= , where

osy is the optimal solution of (3). Also, the optimal value of (1) is the same as optimal value of (3) namely, .osz

Example 1. Consider the following linear fractional optimization problem

1 2 3 4 5

1 2 3 4 5

2 3 4 2 10min3 3 2 5

0.5 0.2 0.3 1.0 0.0 0.90.4 0.8 0.1 0.2 0.5 0.7

. . .0.0 0.3 0.7 0.6 0.7 0.50.1 0.3 0.1 0.4 0.5 0.6

av

x x x x xx x x x x

s t o x

− + − − + ++ − + + +

By Theorem 1, we have ( )1.0 0.6 0.3 0.4 0.3x = and so, ( , ) [0, ]S A b x= . Using (3), we have the following equivalent problem

1 2 3 4 5

1 2 3 4 5

min 2 3 4 2 10. . [0, ]

3 3 2 5 10.

y y y y y ts t y y

y y y y y tt

− + − − + +

∈+ − + + + =

(4)

Solving (4) by simplex method, we obtain ( )0.115 0.069 0.034 0.046 0.000 ,osy = 0.115t = and

0.701osz = . Thus, we have ( )0.999 0.600 0.295 0.400 0.000 .osx =

1493 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

In the next section, we try to find a fuzzy solution. 3. FINDING FUZZY SOLUTION L et the decision maker (DM) is dissatisfied with the optimal solution of (1). T o obtain better solution which violate at least one constraint and still is acceptable to be a solution based on the DM's view, (1)'s constraints can be softened having interaction with the DM [5,12]. In case that , the flexibility of the constraints are not sufficient , the fuzzy solution may be the same as optimal solution of (1) and therefore , more flexibility is enforced on the constraints to find a better solution [12]. To this end , we focus on the following problem

. .[0,1] .

t

t

av

n

c xmin Zd x

s t A x bx

µν

+=

+

%

o °

Here , " "min% and " "° represent a fuzzy version of those of "min" and " "≤ meaning that " objective function should be minimized as much as possible " and " the constraints should be possibly well satisfied" , respectively [13] . To be more explicit , they mean that " the objective function Z should be essentially smaller than or equal to an aspiration level 0z of the DM " and " the constraints avA xo should be essentially smaller than or equal to b" , respectively . In fact , (1) is converted to the following problem

0

0.

av

ZA x b

z

x ≥o°

° (5)

To model the ith fuzzy inequality of (5) , the following linear membership functions are used as in [12] and [5].

1

( ) 1

0 ,

avi i

avav avi i

i i i i i ii

avi i i

a x ba x ba x b a x b d

da x b d

µ

−= − ≤ ≤ + ≥ +

ooo o

o

(6)

0

00 0 0

0

0 0

1

( ) 1

0

Z zZ z

Z z Z z dd

Z z d

µ

−= − ≤ ≤ + ≥ +

(7)

where , 0 0osz z vd= − for some fixed (0,1)v ∈ and each id is a chosen constant expressing the limit of the admissible violation of the ith inequality for 0,1,...,i m= . Also , 0( ) 1, ( ) 1osz z vµ µ= = − and

0( (1 ) ) 0osz v dµ + − = . The parameters 0,v d and id can usually be found based on the the DM’s view [12]. N otation 2. [12] Set [0,1] : ( , ) nS x x S A bΛ = ∈ ∉ . Note 2. Just those vectors [0,1]nx ∈ can be a better solution of (5) than osx that violate at least one inequality

avi ia o x bÑ . That means x is an infeasible solution or equivalently by Nota tion 2 , .x SΛ∈

Theorem 2. (5) is equivalent to the following problems 0 0[0,1]

max min ( ),min (max ( )),n i I i i j J ij jxB D Z B D a x∈ ∈∈

Λ = − − + (8)

1494 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

0 0

max. . (max( )) ;

( )[0,1] .

i ij j ij J

n

s t D a x B i I

D Z Bx

λλ

λ∈

+ + ≤ ∈

+ ≤

(9)

3-1. PROBLEM SIMPLIFICATION Now , similar to [5], some theorems are given to convert (8) and (9) into the equivalent problems that are more simplified and more easily solvable as well . Notation 3. [12] Let 0 0 0( ) ( )x B D Zλ = − , ( ) (max ( ))i i i j J ij jx B D a xλ ∈= − + for all ,i I∈ ( ) ( )ij j i i ij jx B D a xλ = − + for all i I∈ and j J∈ and 0( ) min ( ).m

i ix xλ=Λ = Theorem 3. [5] The functions iλ for all i I∈ and ijλ for all i I∈ and j J∈ are non-increasing continuous functions . Especially , ijλ is decreasing for each component [0,1]jx ∈ . Theorem 4. [12] Let .i I∈ Then , for all [0,1]nx ∈ , ( ) min ( )i j J ij jx xλ λ∈= .

By Theorem 4, (8) has the equivalent form , 0[0,1]

max min ( ),min ( ).n i I ijxj J

x xλ λ∈∈∈

Λ =

Theorem 5. [5] For all i I∈ and ,j J∈ a) If 2 1i ijb a− Ö then , ( ) 1ij jxλ Ö for all [0,1]jx ∈ . b) If 2 1i ijb a− < then , ( ) 1ij jxλ Ö for all [0, 2 ]j i ijx b a∈ − . Corollary 2. [12] Let i I∈ and .j J∈ ( ) 1ij jxλ Ö for all [0,1]jx ∈ if and only if jx does not violate the

inequality avij j ia x bo Ñ .

Corollary 3. [5] x SΛ∈ if and only if there exist i I∈ such that ( ) 1i xλ < . Here , a simplification process is given to obtain more simplified equivalent problems . Theorem 6. [5] Let .i I∈ Then , ( ) min ( )

ii j J ij jx xλ λ∈= , where : 2 1i i ijJ j J b a= ∈ − < . Theorem 7. (9) is equivalent to the following problem

0 0

max. . (max( )) ;

( )[0,1] .

ii ij j ij J

n

s t D a x B i I

D Z Bx

λλ

λ∈

+ + ≤ ∈

+ ≤

(10)

Corollary 4. Let ' : 2 1j i ijI i I b a= ∈ − < for all j J∈ and ' ' : jJ j J I= ∈ ≠ ∅ . a) We can remove all column 'j J∉ from matrix A with no effect on the optimal solution of (9). b) ( )j os jx x∗ = for all 'j J∉ , where x ∗ is the optimal solution of (9). Corollary 5. (14) is equivalent to the following problem

'

'

'0 0

max. . (max( )) ;

( )

[0,1] .

i ij j ij J

j jj J

j jj J

n

s t D a x B i I

c xD B

d x

x

λλ

µλ

ν

+ + ≤ ∈

++ ≤

+

∑∑

(11)

1495 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

3-2. P ROPOSED ALGORITHM So far , (9) was simplified and (11) was obtained as equivalent problem . Now , some definitions are given to provide an algorithm to obtain the fuzzy solution . In the sense that it is assumed that the simplification that has been mentioned earlier has been done on (11). Definition 1. [5] Let (0,1)Lλ ∈ . For all j J

∈ define ' : ( )ij j ij j LI i I xλ λ= ∈ = for some (0,1)i

jx ∈ and '

min j

i ij i I jx x∈= .

Note 3. [5] a) Since ( )iij j Lxλ λ= and

'i ij jx x≤ implies

'( ) ( )i i

ij j ij j Lx xλ λ λ≥ = by Theorem 2, and then '

( )iij j Lxλ λ≥ for all ji I∈ . Therefore , it can be assumed that '' jI i= .

b) If for some '' '''i i≠ we have '' '''

min j

i i ij j i I jx x x∈= = , then set '' 'i i= in case

'' '' ''' '''

'' '''

i i j i i j

i i

d a d a

b b

+≤

+

, otherwise '' ''i i= . Remark 1. In case, where, for some i I∈ it holds that '

ji I∈ for more than one 'j J∈ . This means there

exist I I⊆g such that for all i I∈ g , there exist 'iJ J⊆g such that '

ji I∈ for all ij J∈ g .

Theorem 8. a) If for all ',j j J′

∈ such that 'j j≠ , 'jI and '

'j

I are disjoint , then (11) is equivalent to:

'

'

' '

0 0

max. . ( ) ;

( )

[0,1] .

i ij j i j

j jj J

j jj J

n

s t D a x B i I and j J

c xD B

d x

x

λ

λ

µλ

ν∈

+ + ≤ ∀ ∈ ∀ ∈

++ ≤

+

∑∑

(12)

b ( Let for some i I∈ , 'ji I∈ for more than one 'j J∈ . In this case , (11) is equivalent to:

'

'

' ''

'

0 0

max. . ( ) ;

( ) ;

( )

[0,1] .

i ij j i j

i ij j i j

j jj J

j jj J

i

n

s t D a x B i I and j J

D a x B i I and j

c xD B

d x

x

J

λ

λ

λ

µλ

ν∈

+ + ≤ ∀ ∈ ∈

+ + ≤ ∀ ∈ ∈

++ ≤

+

∑∑

g (13)

Where , '''i

i I

J J J∈

=g

gUÇ .

Now , the algorithm is presented to obtain a fuzzy solution for (1). Algorithm 1 . Suppose (1) is given and , ( )iv d i I∈ and 0d are suggested by the DM . Step 1: Solve (1) and obtain osx and osz by Note 1. Step 2: Derive (9) . Step 3: Obtain iJ for all i I∈ and convert problem obtained in Step 2 to (10) using Theorem 6. Step 4: Derive '

jI for all j J∈ and obtain 'J . If 'j J∉ , then remove jth column of A and also , remove jth variable in the problem obtained in Step 3 . Step 5: Get , Pò . Let 1 1, LL vλ= = − and ( ) ( )j os jx x∗ = for all 'j J∈ .

1496 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Step 6: Until 1Lλ ≥ − ò or L P= or 1L Lλ λ −= do Step 6-1: Derive jI for all 'j J∈ using Definition 1.

Step 6-2: If for some ' , jj J I∈ = ∅ then, if 1L vλ = − , then (1) with this Lλ has no better solution than x ∗ and so x ∗ and z ∗ are optimal . Otherwise , 1Lλ λ∗

−= , x x∗ = and go to Step 8 . Step 6-3: Obtain '

jI using Definition 1, Note 3. Step 6-4: If for all 'j j≠ and '

' ' ' ', ; j jj j J I I∈ ∩ = ∅ , then convert problem obtained in

Step 4 to (12) , otherwise convert it to (13) using Theorem 8. Step 6-5: Solve the problem obtained in 6 -4 with any method of this kind of problems and find

,x λ . If it has no optimal solution , then set 1Lλ λ∗−= and x x∗ = is the optimal

solution and go to Step 8 . Step 6-6: : 1, LL L λ λ= + = .

Step 7 : j jx x∗ = for all 'j J∈ and λ λ∗ = .

Step 8 : t

t

c xzd x

µν

∗∗

+=

+ .

Step 9 : End . 4. CONCLUSIONS Employing Charnes and Cooper's transformation, a linear fractional optimization problem with a system of fuzzy relational inequalities as constraints in the presence of max-arithmetic mean composition has been solved. In order to use infeasible points to obtain better solution, the problem has been converted into a new one with fuzzy inequalities. Then, the dimension of the problem has been reduced and an algorithm has been introduced to generate optimal solution. If the algorithm gives a similar solution, then the decision maker should accept more perturbation on the constraints. This process should be continued un till achieve the desired solution of the decision maker . D ue to page restriction; numerical example will be given elsewhere. 5. REFERENCES 21. Khorram , E . Ezzati, R . and Valizadeh , Z . (2014) , “ Linear Fractional Multi-Objective Optimization

Problems Subject to Fuzzy Relational Equations with a Continuous Archimedean Triangular Norm ,” Information Sciences, 267, pp . 225–239 .

22. Li , D . C . and Geng , S . L . (2014 ,( “Optimal Solution of Multi-Objective Linear Programming with Inf-→ fuzzy Relation Equations Constraint, ” Information Sciences , 271 , pp . 159–178 .

23. Molai, A . A . (2014), “Linear Fractional Programming Problem with Fuzzy Relation Inequality Constraints, " Proc. of the 14th Iranian Conf. On Fuzzy Systems, Tabriz, pp. 334–339.

24. Shivanian, E. (2015 ), “Linear Optimization of Fuzzy Relation Inequalities with Max-

Lukasiewicz Composition,” International Journal of Industrial Mathematics, 7 ( 2 ) , pp . 129–138.

25. Kouchakinejad , F . Khorram , E . and Mashinchi , M . (2015), “ Fuzzy Optimization of Linear Objective Function Subject to Max-Average Relational Inequality Constraints,” Journal of Intelligent and Fuzzy Systems 29, pp. 635–645.

26. Kouchakinejad , F . Khorram , E . and Mashinchi , M . (2015), “Solving Multi-Objective Optimization Problems with Fuzzy Max-Arithmetic Mean Relational Inequality Constraints, ” Proc. of the 1th Conf. on Research in Advanced Sciences of Mathematics (On a CD-Rom( , Urmia , Paper No . RAMS338 .

1497 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

27. Kouchakinejad , F . Mashinchi , M. and Khorram , E . (2015), “On the System of Fuzzy Relational Equations and Inequalities,” Proc. of the Second National Conf. on Mathematics and its Applications in Engineering Sciences ) On a CD- Rom), Mazandaran , pp . 436–446 .

28. Kouchakinejad , F . Mashinchi , M. and Khorram , E . (2015), “Solving Multi-Objective Linear Programming Problems with Fuzzy Goals in the Presence of Fuzzy Max-Arithmetic Mean Relational Inequality Constraints,” Proc. of the 4th Iranian Joint Cong. on Fuzzy and Intelligent Systems, Zahedan, pp. 29–34.

29. Wu , Y . K . (2007), “Optimization of Fuzzy Relational Equations with Max-Av Composition,” Information Sciences , 177 , pp . 4216–4229 .

30. Shivanian , E . Khorram , E . and Ghodousian , A . (2007 ), “Optimization of Linear Objective Function Subject to Fuzzy Relation Inequalities Constraints with Max-Average Composition,” Iranian Journal of Fuzzy Systems, 4 , pp . 15–29 .

31. Bajalinov, E. B. (2003 ) , “ Linear-Fractional Programming: Theory, Methods, Applications and Software”, Kluwer Academic Publishers, Boston.

32. Ghodousian, A . and Khorram , E . (2008) , “Fuzzy Linear Optimization in the Presence of the Fuzzy Relation Inequality Constraints with Max-Min Composition ,” Information Sciences, 178, pp. 501–519 .

33. Sakawa , M . (1993), “Fuzzy sets and interactive multi-objective optimization ,” Plenum , New York .

1498 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Copula and W.L.W approaches for composite end point of multivariate failure time

P. Azhdari*, F. Abedini1 *,1Department of statistics, Tehran North Branch, Islamic Azad University, Tehran, Iran

[email protected]

Abstract

Copula models offer an alternative approach for modeling association between failure time. In

this article two types of copula is considered: Clayton copula and Frank copula. Another

approach is the marginal approach of Wei, Lin, Weissfeld (WLW) for handling multivariate

failure time data.

Keywords: multivariate failure time data, Clayton copula, Frank copula, Composite End Point (CEP), terminal, clinical trials.

1. INTRODUCTION Multivariate failure times are routinely encountered in clinical trials and observational Studies [5].

Investigators have increasingly turned to multiple endpoints in clinical trials and regulatory agencies are

increasingly requiring demonstration of efficacy for multiple endpoints [1]. In a composite endpoint, several

clinical outcomes of interest are combined into a single endpoint and each endpoint is considered as a

component of the composite endpoint. Instead of separate analysis of each endpoint, the event time is the

time of the first occurrence of any component endpoint [2]. Many diseases put individuals at elevated risk for

a multitude of adverse clinical events and randomized clinical trials are routinely designed to evaluate the

effectiveness of experimental interventions for the prevention of these events. Trials in cardiology, for

example, record times of events such as non-fatal myocardial infarction, non-fatal cardiac arrest, and

cardiovascular death [6].

There have several methods for the analysis of multivariate failure time data. These approaches are copula-

based models and robust marginal methods. Copula models have the appealing property of linking two

marginal distributions and so marginal features may be constructed in any desirable way. One possible

extension of the Cox regression model for dealing with multivariate failure time data is the marginal

approach of Wei, Lin and Weissfeld [7], famous to the WLW approach. The rest of the paper is as follows:

section 2 introduces the different kinds of copula and section 3 is about marginal approach of Wei, Lin,

Weissfeld (WLW). The section 4 is simulations.

2. Composite Endpoint Analysis based on a Clayton and Frank Copula

1499 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

In this paper we use bivariate failure times. Consider the joint distribution of ( )1 2, |T T Z in which the

marginal distributions for |kT Z 1,2k = , feature proportional hazards; so with

0( | ) ( )exp( )k k kt z t zλ λ β= 0( | ) ( )exp( )k k kt z t zβΛ = Λ where 0 00( ) ( )

t

k kt s dsλΛ = ∫ 1,2k =

If the joint survivor function 12 1 2( , | ; )F t t z Ω is determined by the below relation

12 1 2 1 1 2 2 1 1 1 2 2 2( , | ; ) ( , ) ( ( | ; ), ( | ; ); ) (2.1)F t t z P T t T t C F t z F t zθ α α θΩ = ≥ ≥ =

Where ( ),α θ ′′Ω = with ( )1 2,α α α′′ ′= . Copula function is used for joint distribution ( )1 2, |T T z .

Clayton copula [3] has the following copula function: 1

1 2 1 2( , ; ) (u 1) (2.2)C u u uθ θ θθ − − −= + −

With 1θ ≥ − and Kendall's τ is then given by ( 2)θτ θ θ= + , which can be seen to vary over [ ]1,1− .

The survivor function of the failure time 1 2min( , )T T T= given z is

1 21

10 20( | ; ) exp( ( ) ) exp( ( ) ) 1 (2.3)z zF t z t e t eθβ βθ θ

− Ω = Λ + Λ −

Hence the hazard ratio for the treatment versus control groups for the composite endpoint is 2 2

0 0 01 1

2 20 0 01 1

( ) exp( ( ) ) exp( ( ) ) 1( | 1; ) (2.4)( | 0; ) ( ) exp( ( )) exp( ( )) 1

k kk k k kk k

k k kk k

t t e t et zt z t t t

β βλ β θ θλλ λ θ θ

= =

= =

+ Λ Λ −= Ω == Ω Λ Λ −

∑ ∑∑ ∑

which is not invariant with respect to time in general. Note that this ratio is 1 when 1 2β β= . To gain some insight into this function, suppose the marginal distributions are exponential with common baseline hazards of 10 20( ) (t) log10tλ λ λ= = = so that the probability of a type k event occurring before

t = 1 is 0.90 for a control subject (i.e. ( 1| 0) 0.90)kP T Z< = = . Further suppose a common hazard ratio of 0.50 holds for the two margins(i.e. 1 2(exp( ) exp( ) 0.50)β β= = . This setting is consistent with the

recommendations that the component events occur with comparable frequency since 1 2P(T | ) 0.5T Z< =

and have comparable treatment effects 1 2( )β β= . Figure 4.1 (a) presents variation in this hazard ratio over [0,1]. The generator for the Frank copula [4] is ( ; ) log((exp( ) 1) (exp( ) 1))H u tθ θ θ= − − − − and the resulting copula function is

( )( )1 2

11 2

1 1C(u , ; ) log 1 (2.5)

1

u ue eu

e

θ θ

θθ θ− −

−−

− − = − +

Where θ ∈ ℜ ; Kendall's τ is then 1 2

01 4 4 (exp( ) 1)t t dt

θ

θτ θ θ− −= − + −∫ . If we adopt the same

marginal distributions as before, the survivor function for the composite endpoint is 1 2

1 2( ) ( )1 (exp( ) 1)(exp( ) 1( | z) log 1 (2.6)1

z zt e t ee eF te

β β

θ

θ θθ

−Λ −Λ

− − − −= − +

− Note that since ( | ; ) = − log ( ) ⁄ , the hazard ratio for the treatment versus control groups for the composite endpoint is complicated. Figure 4.1 (b) presents a plot of this hazard ratio over [0,1].

1500 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

3. Limiting Values for a Wei-Lin-Weissfeld Analysis In this section, we investigate the marginal approach of W.L.W for multivariate failure time data. This method is based on Cox models for each component to obtain component-specific estimates of treatment effect. We proceed in the derivations in the case where the composite endpoint is comprised of K components but subsequently will focus on the case K = 2. We let ik ikdN (s) = I(T = s) ), and let ikN (s); 0 < s denote the counting process for type k events and

i1 i2Ni(s) =(N (s);N (s); 0 < s denote the bivariate counting process for subject i, i = 1,...,m . Let

ik ikY (s) = I(s T )≤ , †iY (s) = I(s C )i ≤ and †

ik ikY (s) = Y (s)Y (s)i , k = 1,...,K , i = 1,...,m . A Cox model is assumed for type k events meaning

i k0 k i(t|z ) = (t) exp( z ) ,kλ λ β

where k0 (t) λ is the baseline hazard function for type k events and kβ is the treatment effect on the kth

component. The kth component-specific score function for kβ is (1)

(0)01

( , )( ) ( ) ( ) (3.1)( , )

mk k

k k ik i iki k k

S tU Y t Z dN tS t

ββ

β

=

= −

∑∫

Where (1)1

( ,u)= ( ) exp , 0,1m rk ik i k ii

S Y t Z Z rβ β=

=∑ . The proportional hazards assumption holds for each component and the solution to the score equation (3.1), ˆ

kβ , is a consistent estimate of true treatment effect kβ . If we let 1( ,..., )Kβ β β= and its estimate

1ˆ ˆ( ,..., )T

Kβ β β= , Wei et al. show that ( )ˆm β β− converges in distribution to a multivariate Normal

distribution with zero-mean vector and variance-covariance matrix ( )β∑ and provided a consistent

sandwich-type estimate for ( )β∑ . The global estimate of treatment effect proposed by Wei et al. is simply a linear combination of all component-specific treatment effect estimates 1

ˆ,..., Kβ β and can be obtained as

ˆ ˆ ˆ( ) (3.2)cβ β β′=

where the weight 11 1ˆ ˆ ˆˆ ˆ( ) ( ) ( )c J J Jβ β β

−− − ′= ∑ ∑ is chosen to estimate the weight matrix to

minimize the variance in the class of all linear estimators; ˆ( )β∑ is the estimate for the variance-

covariance matrix of β and (1,...,1)J ′= . In order to compare the performances of the global approach and the composite endpoints analysis, we obtain

the limiting value of β as

c( ) (3.3)β β β′=

1501 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Where 11 1( ) ( ) ( )c J J Jβ β β

−− −′ = ∑ ∑ . We therefore require the limiting value of the robust

variance ( )β∑ to obtain the limiting value β . 4. Simulation

Figure 4.1: Plots of the hazard ratio (treatment vs. control) over the time interval [0; 1] for the composite endpoint

analysis implied by the Clayton copula (Panel (a)) and Frank copula (Panel (b)) with marginal exponential

distributions with 1 2 log10λ λ= = and 1 2exp( ) exp( ) exp( ) 0.50β β β= = = and mild ( 0.20)θτ = ,

moderate ( 0.40)θτ = and strong ( 0.60)θτ = association.

Figure 4.1 (a) contains a plot of the hazard ratio (2.4) over the time interval [0; 1] for models with mild

( 0.20)θτ = moderate ( 0.40)θτ = and strong ( 0.60)θτ = association. As can be seen, even when the

treatment effects are the same for the two component endpoints, there can be non-negligible variation in the

hazard ratio over time, and within this family of models the nature of this variation depends on the strength of

the association between the two failure times.

1502 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Table 4.1: Frequency properties of estimator of treatment effect based on global analysis using the Wei-Lin-Weissfeld approach: Clayton copula with 0.4τ = , 1 .223β = −

Aπ π m β AVE( α ) ESE ASE1 ASE2 *ECP % ECP% EP%

Common Treatment Effect: 2β = -0.223 0.2 0.2 621 -0.223 -0.223 0.084 0.072 0.086 95.9 95.9 83.6 0.4 828 -0.223 -0.223 0.086 0.074 0.087 95.1 95.1 82.0 0.6 1242 -0.223 -0.221 0.088 0.077 0.088 95.0 95.0 80.8 0.8 2484 -0.223 -0.223 0.089 0.083 0.090 95.6 95.6 80.3 0.4 0.4 828 -0.223 -0.223 0.087 0.076 0.087 95.4 95.4 82.7 0.6 1242 -0.223 -0.221 0.089 0.078 0.088 95.0 95.0 79.9 0.8 2484 -0.223 -0.223 0.089 0.083 0.090 95.6 95.6 80.6 0.6 0.6 1242 -0.223 -0.223 0.090 0.081 0.089 95.1 95.1 79.7 0.8 2484 -0.223 -0.222 0.089 0.083 0.090 95.2 95.2 80.5 0.8 0.8 2484 -0.223 -0.225 0.088 0.086 0.090 95.2 95.2 80.5 Different Treatment Effects: 2β = 0 0.2 0.2 7090 -0.066 -0.067 0.025 0.021 0.025 95.9 0.0 84.2 0.4 9664 -0.065 -0.066 0.025 0.022 0.025 94.5 0.0 83.3 0.6 14623 -0.065 -0.066 0.026 0.023 0.026 94.8 0.0 82.8 0.8 28219 -0.066 -0.066 0.026 0.024 0.027 95.3 0.0 81.7 0.4 0.4 10203 -0.064 -0.065 0.025 0.022 0.025 95.1 0.0 83.6 0.6 14897 -0.064 -0.066 0.025 0.023 0.025 94.6 0.0 83.2 0.8 28316 -0.066 -0.066 0.026 0.024 0.027 95.2 0.0 80.6 0.6 0.6 14733 -0.065 -0.066 0.026 0.024 0.026 94.1 0.0 83.4 0.8 28202 -0.066 -0.067 0.026 0.025 0.027 95.2 0.0 81.7 0.8 0.8 27355 -0.067 -0.069 0.026 0.026 0.027 95.4 0.0 82.2

†( )A P C Tπ = < is the administrative censoring rate, †( )P C Tπ = < is the net censoring rate, ESE is

the empirical standard error, 1ASE is the average model based standard error, 2ASE is the average robust

standard error, %ECP ∗ is the empirical coverage probability for β of a nominal 95% confidence interval

using the robust standard error, ECP% is the empirical coverage probability for 1β of a nominal 95%

confidence interval using the robust standard error, EP% is the empirical power of a Wald test of

0 : 0H β = based on the robust standard error.

Table 4.1 reports the results from a global analysis of treatment effect based on the marginal analysis

proposed by Wei et al. In this table the sample sizes were computed based on the formula for the composite

endpoint analysis using the limiting value of the regression coefficient. As one would expect from (3.1) when

the treatment effects are equal then the marginal analysis yields consistent estimators for this common effect

and the mean estimate across all simulated trials is very close to the limiting value. Moreover, the empirical

standard error and the average robust standard error were in very close agreement; the average model-based

standard error is conservative since it is based on the working independence assumption being correct. The

1503 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

empirical coverage probabilities (based on the robust standard errors) were compatible with the nominal 95%

level for β when 1 2β β= . When 1 2β β≠ the empirical coverage for 1β was zero, a reflection of the

difference between β and 1β . When 2 0β = , the limiting value β was quite small and hence the sample

sizes of the trial were much larger. Since the sample size was computed based on the composite endpoint

analysis with β , it is not surprising that there is a slight gain in empirical power from the global analysis

since each individual may contribute more than one event.

When 1 2β β≠ , the composite endpoint and global analyses yield estimators which do not coincide with 1β

, 2β , or each other. We next compare the two limiting value: one is α ∗ from the composite endpoint

analysis and the other one is β from the global analysis. We consider the case in which two failure times are

generated by a Clayton copula with exponential margins and a single treatment covariate modeled through

proportional hazards with 1 log(0.80)β = and 2 0β = . We consider mild and moderate association

between the failure times with 0.20τ = and 0.40τ = respectively. Administrative censoring was set to

40% and additional random censoring from an exponential withdrawal time gave cases with 60% and 80% as

well. The limiting value of the composite endpoint and global analyses were plotted against

1 2 1( | 0)P T T Z p< = = in Figure 4.2. It is apparent that when 1p approaches zero, the limiting value for

both methods approaches 0. For the composite endpoint this makes sense since the first event is most likely

to be a type 2 event for which there is no treatment benefit. As 1p approaches 1, the limiting value for the

composite endpoint analysis approaches 1β for analogous reasons. The limiting value from the global

analyses track these limiting values quite well, but tend to correspond to larger estimates of treatment effect

since the limiting value is larger in absolute value. Thus even when the two components have equal

frequencies and the proportional hazards assumption holds for each component, the global analysis, in the

limit, will yield an estimate of treatment effect which is greater than that of the composite endpoint analysis.

These relationships hold across both levels of association and over different degrees of censoring.

1504 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Figure 4.2: Plot of limiting values of regression estimator of treatment effect based on a composite endpoint analysis and a global Wei-Lin-Weissfeld analysis with bivariate data generated via a Clayton copula; 1 log(0.80)β = and

2 0β = ; administrative censoring only.

5. REFERENCES 1. Buzney EA and Kimball AB (2008), “A critical assessment of composite and coprimary endpoints: a complex problem.” Journal of the American Academy of Dermatology, 59:890-6. 2. Chi GYH (2005). “Some issues with composite endpoints in clinical trials.” Fundamental & Clinical Pharmacology, 19:609-619.

3. Clayton, DG (1978). “ A model for association bivariate tables and its application in epidemiological studies of family tendency in chronic disease incidence,” Biometrika, 65:141-151.

4. Genest C. (1987). “Frank ‘s family of bivariate distributions.” Biometrika, 74: 549-550.

5. Lawless, J.F. (2003). “ Statistical Models and Methods for Lifetime Data.” John Wiley and Sons.

6. POISE Study Group. “Effects of extended-release metoprolol succinate in patients undergoing non-cardiac surgery (POISE trial): a randomised controlled trial.” Lancet, 371:1839- 1847.

7. Wei, LJ, Lin,DY, and Weissfeld L (1989). “Regression Analysis of Multivariate Incomplete Failure Time

Data by Modeling Marginal Distributions.” Journal of American Statistical Association, 84:1065-1073.

1505 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Stability of General Cubic Mapping in Fuzzy Normed Spaces

S. Javadi

Faculty of Engineering- East Guilan, University of Guilan

[email protected]

Abstract We establish some stability results concerning the general cubic functional equation f (x + ky) - kf (x + y) + kf (x -y) - f (x - ky) = 2k( - 1)f (y) for fixed k∈ N-1 in the fuzzy normed spaces. More precisely, we show under some suitable conditions that an approximately cubic function can be approximated by a cubic mapping in a fuzzy sense and we establish that the existence of a solution for any approximately cubic mapping guarantees the completene-ss of the fuzzy normed spaces

Keywords: Generalized Hyers-Ulam-Rassias stability, Cubic functional equation, Fuzzy normed spaces.

1. INTRODUCTION In order to construct a fuzzy structure on a linear space, in 1984, Katsaras [1] defined a fuzzy norm on a li-near space to construct a fuzzy vector topological structure on the space. At the same year Wu and Fang [2] also introduced a notion of fuzzy normed space and gave the generalization of the Kolmogoroff normalized theorem for a fuzzy topological linear space. In [3], Biswas defined and studied fuzzy inner product spaces in a linear space. Since then some mathematicians have defined fuzzy norms on a linear space from various poi-nts of view [4, 5]. In 1994, Cheng and Mordeson introduced a definition of fuzzy norm on a linear space in such a manner that the corresponding induced fuzzy metric is of Kramosil and Michalek type [6]. In 2003, Bag and Samanta [7] modified the definition of Cheng and Mordeson [8] by removing a regular condition. They also established a decomposition theorem of a fuzzy norm into a family of crisp norms and investigated some properties of fuzzy norms (see [9]). The concept of stability of a functional equation arises when one replaces a functional equation by an inequality which acts as a perturbation of the equation. In 1940, Ulam [10] posed the first stability problem. In the next year, Hyers [11] gave an affirmative answer to the question of Ulam. Hyers’s theorem was generalized by Aoki [for additive mappings and by Rassias [12] for linear mappings by considering an unbounded Cauchy difference. The concept of the generalized Hyers-Ulam stability was originated from Rassias’s paper [12] for the stability of functional equations. During the last decades several stability problems for various functional equations have been investigated by many mathematicians; we refer the reader to [13, 14]. The functional equation f(x + ky) − kf(x + y) + kf(x − y) − f(x − ky) = 2k(k2− 1)f(y) (1.1) for fixed k with k ∈ N-1 is called the general cubic functional equation, since the function f(x) = x3 is its solution. Every solution of the general cubic functional equation is said to be cubic mapping. From (1.1), putting x = y = 0 yields f(0) = 0. Note that the left hand side of (1.1) changes sign when y is replaced by −y. Thus f is odd. Putting x = 0 and y = x in (1.1). We conclude that f(kx) =k 3f(x). By induction, we infer that f(knx) =knf(x) for all positive integer n. The stability problem for the cubic functional equation was proved

1506 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

by Jun and Kim [14] for mappings f : X → Y , where X is a real normed space and Y is a Banach space. Later a number of mathematicians worked on the stability of some types of the cubic equation [9,10, 11, 12, 13,14]. Najati in [15] established the general solution and the generalized Hyers-Ulam stability for the equation (1.1). 2. FUZZY GENERALIZED HYERS-ULAM THEOREM:NON-UNIFORM VERSION. In this section we proved the non-uniform version of the generalized Hyers-Ulam stability of the general cubic functional equation (1.1) in fuzzy normed spaces. The uniform version is discussed in Section 3. Theorem 2.1. Let k ∈ N \ 1, α ∈ [1, +∞) and α≠ k3. Let X be a linear space and let (Z, N′) be a fuzzy normed space. Suppose that an even function φ : X × X → Z satisfies φ(knx, kn y) = αn φ(x, y), for all x, y ∈ X and for all n ∈ N. Suppose that (Y, N) is a fuzzy Banach space. If a map f : X → Y satisfies

N (f(x+ky) -kf(x+y)+kf(x -y)-f(x -ky)-2k(k 2-1)f(y), t) ≥ N′(φ(x, y), t)

for all x, y ∈ X and t > 0, then there exists a unique cubic map C : X → Y which satisfies (1.1) and inequality

N(f(x)- C(x), t)≥min N′ (φ(0, x), − t), N′(φ(0, x), ( )( ) t) holds for all α <k 3, x ∈ X and t > 0. Also, N(f(x)- C(x), t)≥min N′ (φ(0, x), t), N′(φ(0, x), ( )( ) t) holds for all α > k 3, x ∈ X and t > 0. . Corollary 2.2. Let X be a Banach space and ε > 0 be a real number. Suppose that a function f : X → X satisfies ∥f(x + ky) − kf(x + y) + kf(x − y) − f(x − ky) − 2k(k2 − 1)f(y)∥ ≤ ε(∥x∥2p + ∥y∥2p + ∥x∥p∥y∥p) for all x, y ∈ X where 0 < p < and k ∈ N \ 1. Then there exists a unique cubic function C : X → X which satisfying (1.1) and the inequality ∥C(x) − f(x)∥ < ∥ ∥ for all x ∈ X. Corollary 2.3. Let X be a Banach space and ε > 0 be a real number. Suppose that a function f : X → X satisfies ∥f(x + ky) − kf(x + y) + kf(x − y) − f(x − ky) − 2k(k2− 1)f(y)∥ ≤ ε(∥x∥ + ∥y∥ − ∥ ∥ ∥ ∥ ) for all x, y ∈ X where k ∈ N \ 1. Then there exists a unique cubic function C : X → X which satisfying (1.1) and the inequality ∥C(x) − f(x)∥ < ∥ ∥ for all x ∈ X. Let X be a Banach space. Denote by N and N′ the fuzzy norms obtained as Corollary 2.2 on X and R, respectively. Let ϵ > 0 and let φ : X ×X → R be defined by φ(x, y) = ε for all x, y ∈ X. Let f : X → X be a φ-approximately cubic mapping in the sense that

∥f(x + ky) − kf(x + y) + kf(x − y) − f(x − ky) − 2k(k2 − 1)f(y)∥ < ε,

then there exists a unique cubic function C : X → X which satisfies ∥f(x) − C(x)∥ ≤ for all x ∈ X. Let f be a mapping from X to Y . For each k ∈ N, let Dfk: X × X → Y be a mapping defined by

1507 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Dfk(x, y) = f(x + ky) − f(x − ky) − kf(x + y) + kf(x − y) − 2k(k2 − 1)f(y)

Proposition 2.4. Let X be a linear space and let Y be a normed space. Let ε be a nonnegative real number and let f be a mapping from X to Y . Suppose that ∥ Df2(x, y) ∥ ≤ ε for all x, y ∈ X. Then there exists a sequence of nonnegative real numbers such that ε0 = ε, ε1 = 4ε, ε2= 10ε, ..., εk = 2εk-1 + (k + 1)ε + εk-2 (k ≥ 3) and ∥Dfk(x, y)∥ ≤ εk-2.

Corollary 2.5. Let X be a linear space and let Y be a Banach space. Let f be a mapping from X to Y and let ε be a nonnegative real number. Suppose that ∥Df2(x, y)∥ ≤ ε holds for all x, y ∈ X. Then for each positive integer k > 1, there exists a unique cubic mapping Ck : X → Y such that ∥Ck(x) − f(x)∥ ≤ for all x ∈ X. 3. Fuzzy Generalized Hyers-Ulam theorem: uniform version In this section, we deal with a fuzzy version of the generalized Hyers-Ulam stability in which we have uniformly approximate cubic mapping. Theorem 3.1. Let X be a linear space and (Y, N) be a fuzzy Banach space. Let φ : X × X → [0, ∞) be a function such that

ϕ(x, y) =∑ ( , ) <∞ (3.1)

for all x, y ∈ X. Let f : X → Y be a uniformly approximately cubic function respect to φ in the sense that → ( ( + ) − ( + ) + ( − ) − ( − ) − ( − ) ( ), ( , )) =

uniformly on X × X. Then T (x) := N − lim → ( ) for each x ∈ X exists and defines a cubic mapping T : X → Y such that if for some δ > 0, α > 0,

N(f(x+ky)−kf(x+y)+kf(x−y)−f(x−ky)−2k(k2 −1)f(y), δφ(x, y)) > α for all x, y ∈ X, then

N ( T (x) − f(x), ϕ(0, x) )> α for all x ∈ X. Corollary 3.2. Let X be a linear space and (Y, N) be a fuzzy Banach space. Let φ : X × X → [0, ∞) be a function satisfying (3.1). Let f : X → Y be a uniformly approximately cubic function with respect to φ. Then there is a unique cubic mapping T : X → Y such that → ( ( ) − ( ), ( , )) = uniformly on X. 4. Fuzzy Completeness We proved that under suitable conditions including the completeness of a space, for every approximately cubic function there exists a unique cubic map-ping which is close to it. It is natural to ask whether the converse of this result holds. More precisely, under what conditions involving approximately cubic functions, our fuzzy normed space is complete. The following result gives a partial answer to this question.

1508 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Definition 4.1. Let (X, N) be a fuzzy normed space. A mapping f : N∪0 → X is said to be approximately cubic if for each α ∈ (0, 1) there is some nα ∈ N such that

N(f(i + kj) − kf(i + j) + kf(i − j) − f(i − kj) − 2k(k2 − 1)f(j), t) ≥ α, for all i ≥ 2j ≥ n α. Definition 4.2. Let (X, N) be a fuzzy normed space. A mapping f : N∪0 → X is said to be a conditional cubic if

f(i + kj) − kf(i + j) + kf(i − j) − f(i − kj) = 2k(k2 − 1)f(j), for all i ≥ 2j. Theorem 4.3. Let (X, N) be a fuzzy normed space such that for each approx-imately cubic type mapping f : N ∪ 0 → X, there is a conditional cubic mapping C : N ∪ 0 → X such that lim → ( ( ) − ( ), 1) = 1. Then (X, N) is a fuzzy Banach space. REFERENCES

34. Katsaras, A. K. (1984), “ Fuzzy topological vector spaces II,” Fuzzy Sets and Systems, 12 , pp. 143–154.

35. Congxin, W. and Fang, J., (1984), “ Fuzzy generalization of klomogoroffs theorem,” J. Harbin Inst. Technol., 1, pp. 1-7.

36. Biswas, R. (1991), “ Fuzzy inner product spaces and fuzzy normed functions,” Inform. Sci., 53, pp. 185–190.

37. Balopoulos, V., Hatzimichailidis, A. G. and Papadopoulos, B. K., (2007), “Distance and Similarity measures for fuzzy operators,” Inform. Sci., 177, pp. 2336–2348.

38. Felbin, C., (1992),, “Finit dimensional fuzzy normed linear spaces,” Fuzzy Sets and Systems, 48, pp. 239-248.

39. Kramosil, I., Michalek, J., (1975), “Fuzzy metric and statistical metric spaces,” Kybernetica, 11, pp. 326–334.

40. Bag, T., Samanta, S. K., (2003), “Finite dimentional fuzzy normed linear spaces,” J. Fuzzy Math., 11, pp. 687–705.

41. Cheng, S.C., Mordeson, J. N., (1994), “Fuzzy linear operators and fuzzy normed linear spaces,” Bull. Calutta Math. Soc., 86, pp. 429–436.

42. Bag, T. and Samanta, S. K., (2005), “Fuzzy bounded linear operators,” Fuzzy Sets and Systems, 151, pp. 513–547.

43. Ulam, S. M., (1964),“Some equations in analysis, Stability, Problems in Modern Mathematics,” Science eds., Wiley, New York.

44. Hyers, D. H., (1941), “On the stability of the linear functional equation,” Proc. Nat. Acad. Sci. USA, 27, pp. 222–224.

45. Rassias, Th. M., (1978),“ On the stability of linear mapping in Banach spaces,” Proc. Amer. Math. Soc., 72, pp. 297–300.

46. Czerwik, S., (2002), “Functional equations and Inequalities in Several Variables,” Worldscientific, iver Edge, NJ.

47. Jun, K. M. and Kim, H. M., (2002), “The generalized Hyers-Ulam-Rassias stability of a cubic functional equation,,” J. Math. Anal. Appl., 274, pp. 867–878.

48. Najati, A., (2007), “The generalized Hyers-Ulam-Rassias stability of a cubic functional equation,” Turkish J. Math., 31, pp. 395-408

1509 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

An Accurate Fuzzy Frequent Pattern Based Classifier Using Confidence Tuning

Alireza Hekmatinia, Mohammad Saniee Abadeh

Faculty of Electrical and Computer Engineering

Tarbiat Modares University, Tehran, Iran

[email protected]

[email protected]

Abstract

Associative classifier algorithms combine two data mining paradigms, namely sample classification and association rule mining. These methods are very interesting for building an accurate classification model in a wide area of real-world applications. Lately, many methods have been presented to integrate associative classifiers with fuzzy set theory, in order to improve the quality of previous algorithms. This paper presents a three-step fuzzy frequent pattern (FFP) based classifier which uses an Apriori like algorithm to generate a large number of FFPs from each data class. Our algorithm in the second step selects a subset of useful FFPs and removes redundant ones. Finally, in order to tune the boundaries between various data classes, we use a confidence improvement process. We tested our algorithm on six real-world datasets and compared the achieved results with two well-known fuzzy associative classifier algorithms. Keywords: Fuzzy associative classifier, Certainty factor tuning, Fuzzy frequent pattern, Fuzzy rule

1. INTRODUCTION Sample Classification tasks and association rule mining methods are two interesting areas of research. Classification is a well-studied problem. Various algorithms for classifier construction have been proposed including Decision Tree Induction, Bayesian Network, K-Nearest Neighbor, Rule Induction and etc.[1][2] In rule induction methods, fuzzy rule based classifiers due to their high interpretability by human are a mainstay of research. There are many methods for evolutionary fuzzy rule mining.[3]. These algorithms face with the explosion of the fuzzy rule search space. This explosion makes the induction process more difficult, and in most cases, it leads to problems of scalability such as time and memory[4].

Besides, association rule mining has received a noticeable deal of attention from many years ago [5].Association rule mining has two main steps: frequent pattern discovery and association rule extraction. Suppose that we have a dataset DS of transactions in which the transaction ∈ is a set of items. A set of items which duplicates more than a predefined threshold in transactional dataset is called frequent pattern. After generating frequent patterns of dataset, association rule extraction from frequent patterns is started. In brief, an association rule is an expression as form → , where X and Y are sets of items. → Showing that when a transaction includes X, it probably contains Y, too. X and Y are called antecedent and consequent, respectively. In association rules, consequent part can contain several items. If we add a class label to consequent part of the frequent pattern, a frequent pattern for classification (FPC) will be created.

In the last decade, classification using association rule mining, has shown good performance in classification tasks. The resulting classification model, called associative classifier (AC), consists of class association rules (CAR). The antecedent of a CAR consists of an item set and the consequent part of it is a class label. There are several works that use association rule mining to build accurate classifiers. [6] Proposes CBA algorithm as a two steps AC method that uses Apriori like algorithm to generate CARs from input dataset. This method in the first step generates CARs, then eliminates redundant CARs based on minimum support and pessimistic error rate. Second step of this method includes selecting a subset of CARs and uses them for building classifier. To improve the efficiency of Apriori like algorithms, [7] proposes CMAR

1510 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

algorithm which uses a divide-and-conquer approach to partition the dataset to minor data parts. Recently an AC algorithm proposed in [8] that tries to establish tradeoff between accuracy and compactness of generated rule set. This work presents a new rule quality called principality.

In order to improve the interpretability of the obtained CARs and to avoid crisp partitioning of attributes, several studies have proposed to obtain fuzzy associative classifiers. For instance, [9] presents the well-known FARC_HD method that uses Apriori like algorithm to extract Fuzzy CARs from each class then removes some of the redundant fuzzy rules during a prescreening phase. Finally, the last phase of the method uses a genetic algorithm to select and adjusting a compact set of fuzzy class association rules with high classification performance. Another method to this field is D-MOFARC algorithm that uses multi objective algorithms to tune fuzzy membership functions and at the same time, selects a rule subset[10]. D-MOFARC is an extended version of FARC_HD algorithm. Furthermore, newly AC-FFP algorithm was presented in [11] which extends the CMAR algorithm in fuzzy set theory. In AC-FFP, at first, dataset attributes are discretized. Then appropriate fuzzy set for each attribute is created. Finally, this algorithm used of three pruning steps to eliminate redundant fuzzy class association rules and building a fuzzy rule based classifier.

In this paper, we present an algorithm which generates a set of fuzzy frequent patterns for classification (FFPC). The resulting set of FFPCs constructs an accurate fuzzy rule based classifier with interpretable model. Our method is based on three main steps:

1) Using Apriori like algorithm presented in [9] to extract a set of FFPCs from each data class. 2) Selecting a subset of FFPCs based on class data coverage and pruning unnecessary ones. 3) Tuning FFPCs confidence value by an iterative process.

The rest of this work is arranged as follows. Section 2 introduces the basic notions in fuzzy rule based classification field. Section 3 describes our method in more details. Section 4 presents results obtained by our method. At last, in Section 5 we conclude paper and offer some guidance for future works. 2. BACKGROUND For N dimensional classification problem with L data classes, fuzzy rules will be used as follows:

1 1: i i n in i iR if x is FA and x is FA then Class l with Confidence c… = (1)

Where, iR , ( )1, , nx x x= … , , 1, ..., il i L∉ and ic indicate ith fuzzy rule, N dimensional feature vector, class of ith fuzzy rule and fuzzy rule confidence, respectively. Two important rule quality measures in association rule mining field are support and confidence. Let us assume there are T data samples 1( , ..., ), 1, ...,s sns x x sx T= = . Fuzzy support for one rule, calculates as follows:

( )( )

i

is i

A sx ClasslFuzzySupport Rx

T

µ∈=

(2)

Where, ( )iA sxµ demonstrates matching degree of data sample sx with antecedent part of iR . For

matching degree calculation, product operator is used as follows:

1 1( ) ( ) .... ( )i i inA s A s A sNx x xµ µ µ= × × (3)

Where, (.)ijAµ is the membership function of the antecedent fuzzy set ijA .

Fuzzy rule confidence is calculated as follows:

1

(R )( )

(x )

i

i

is i

A s

T

A II

x ClasslFuzzyConfidencexµ

µ=

∈=∑

(4)

Several types and shapes of membership functions are proposed in the literature, namely Triangles, Trapezoids, Bell curves, etc. In this work, we use the Triangles type membership function to keep the interpretability of fuzzy rules. Figure1 is showing five fuzzy sets used in this work for each dataset attribute.

1511 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

The type of reasoning method used in a fuzzy rule based classifier is another important part of fuzzy rule based classifiers. Several reasoning methods are proposed in the literature; in this paper, we used weighed vote as reasoning method. In this type of reasoning method, an unseen sample is classified into the class with maximum average weight vote of fuzzy rules. In particular, for a test sample as tx , the weighted vote of fuzzy rules for class li is computed as follows:

, ( )( ) ( )

iii i i

t tAlR RS L R l

V x xµ∈ =

= ∑ (5)

Where, RS is Set of Fuzzy Rules and L (Ri) indicates class of Ri.

3. THE PROPOSED METHOD In this section, we will describe our proposed method to obtain a fuzzy frequent pattern based classifier. As shown in figure 2, our method is based on the following three main steps:

1) The first step of our method separates data from different classes and then generates fuzzy frequent patterns for classification (FFPC) from each class using Apriori like algorithm.

2) This step selects a subset of fuzzy FFPCs and removes redundant ones based on classification accuracy.

3) The last step of our algorithm tunes the FFPCs confidence to improve classification accuracy.

Figure 2. Method overview

At the end of three steps, we obtain a fuzzy rule based classifier, which can be used for the classification task of unseen samples. In following subsections, we describe above-mentioned steps in more details.

...Class 1 Class 2

... Apriori nApriori 2Apriori 1

Step1

Step2Select FFPC subset by coverage

Select FFPC subset by coverage

Select FFPC subset by coverage

Aggregate FFPC s of different classes

Prune redundant FFPC based on classifier accuracy

Step3Tune FFPCs

confidence

Accuracy increases

Yes

No

Dataset

...

Fuzzy Rule Based Classifier

Class n

Figure 1. Fuzzy partition for each

Attribute Value

1512 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

3.1 STEP1: FUZZY FREQUENT PATTERN DISCOVERY To find FFPs, we used Apriori like algorithm proposed in[9]. This algorithm in first stage lists all attribute values as item sets with size one (S1-item set). Attribute values are fuzzy linguistic terms that are defined for each attribute.

Example1: assume that there are two attributes (Att1, Att2) and two fuzzy linguistic values (Low, High). In stage1 this algorithm generates following candidate S1-item sets.

1 1 2 2Att =Low, Att =High, Att =Low, Att =High

After generating all the candidate S1-item set, fuzzy support is calculated for each one. Those S1-item sets which have fuzzy support greater than the minimum fuzzy support threshold are frequent item sets. Frequent item sets will pass to next stage, and infrequent ones are removed. In second stage, the algorithm generates a set of acceptable combination of S1-item sets to produce item sets with size 2. An acceptable combination of item sets is a state in which one item set belongs to an attribute combined with another item set from different attribute. It is invalid to have two S1-itemsets from one attribute in an item set.

Example2: by using S1-itemsets of example1 an acceptable combination of item sets is as follows:

1 2 1 2 1 2Att =Low, Att =Low, Att =High, Att =Low, Att =High, Att =High

And an invalid combination is:

1 1Att =Low, Att =High

This algorithm in each stage increases the length of item sets by one. For instance, in stage 1, the algorithm generates item sets with the length 1, then passes frequent ones to next stage. In second stage, acceptable item sets with length 2 are generated and frequent ones are passed to the next stage. This process will be continued until a user defined level l .

In this paper, fuzzy support of an item set is calculated based on equation (2). Note that in fuzzy support calculation we consider T equal to number of samples in each data class. Number of extracted fuzzy frequent item sets depends on the minimum support threshold directly. The fuzzy support is usually calculated by considering the total number of samples in dataset. However, the number of samples for each class in a dataset can be different. For this reason, in this paper, the number of samples of each class is used as T in equation (2) and as a result, minimum support is applied properly on data class with different sizes. For example, suppose that there is a class with 2 samples and another class with 8 samples. If minimum support is equal to 0.4, a fuzzy item set like I* which covers one sample of first class, its fuzzy support will exceed minimum support threshold (FuzzySupp (I*) =0.5). While if in fuzzy support calculation we use total number of data samples as T, I* cannot pass the minimum support threshold (FuzzySupp (I*) =0.1) and will be removed. This type of fuzzy support calculation is different from that used in [9]. After this stage, a large number of candidate fuzzy frequent item sets will be generated for each data class. If we consider consequent part of fuzzy frequent item sets as the class label from which it is extracted, we will have a set of fuzzy frequent patterns for classification (FFPC).

At the end of this step, fuzzy confidence of all FFPCs is calculated and FFPCs with confidence level lower than minimum confidence threshold is removed. In this work, fuzzy confidence is calculated based on equation (4). 3.2 STEP2: FFPC SUBSET SELECTION AND PRUNING Previous stage generated huge number of FFPCs that some of them are redundant and should be removed. In this step, we first select a subset of FFPCs based on data sample coverage, then prune redundant FFPCs through an iterative procedure. In order to select a subset of FFPCs, we first sort them. For sorting FFPCs, we used following definition:

Given two fuzzy rules, namely 1FFPC and

2FFPC , then 1FFPC is prior to

2FFPC and denotes as

1 2FFPC FFPCf if:

1513 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

1 21)FConf(FFPC )>FConf(FFPC ) OR

1 2 1 22)FConf(FFPC )=FConf(FFPC ) And FSupp(FFPC )>FSupp(FFPC ) OR

1 2 1 2 1 13)FConf(FFPC )=FConf(FFPC ) And FSupp(FFPC )=FSupp(FFPC ) And Len(FFPC ) < Len(FFPC )

Where, Len (FFPC) denotes the number of items in FFPC. If all above conditions are equal, the FFPC which is generated earlier is selected. Based on the above

conditions, if support and confidence values of two FFPCs are equal, we prefer shorter one. Because using short FFPCs increases the interpretability of the classifier. After sorting, a subset of FFPCs is selected based on data coverage for each class. As shown in Figure 3, the algorithm starts from first FFPC, and iterates over class data. Each FFPC that covers at least one sample is added to the selected set of FFPCs. Moreover, the weight of the covered sample is increased by the coverage degree. For example, suppose FFPC 1 covers samplex by fuzzy matching degree of 0.2, then FFPC 1 is added to the selected set of FFPCs and the weight of samplex is increased by 0.2. Whenever the weight of a sample gets larger than the coverage thresholdτ , it is removed from the training set and no longer considered for remaining FFPCs. This procedure will be ended, when there is not any data sample or all FFPCs have been visited.

After selecting a subset of FFPCs from each class, all of FFPCs are aggregated in a single FFPC set. Then the algorithm removes ineffective FFPCs by an iterative procedure. In each iteration of the inner loop, the FFPCi is removed from FFPC set and classification accuracy is calculated. If the accuracy is decreased, the removed FFPCi has returned at the end of the FFPC set, and the algorithm picks up FFPCi+k. In the experiments, we found that k=2 gives the best performance. Note that, if FFPCi is removed and Accuracy has not changed (or increased); the pruning process will be applied on FFPCi+1 and so on. This process repeats until no changes in the number of FFPC. The Accuracy is calculated as follows: NCP

AccuracyN

= (6)

Where, NCP is number of samples which classified correctly and N is total number of dataset samples. 3.3 FFPC CONFIDENCE TUNING In the previous step, we obtained a FFPC set with few numbers of FFPCs. This step is used to tuning the FFPCs confidence and establishing a more accurate boundary between classes. FFPC confidence plays an

Figure 4. FFPCs confidence tuning procedure Figure 4

Figure 3. Select FFPC by data coverage procedure

1514 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

important role in determining the boundary between data classes. In this work, we tried to adjust the confidence values of FFPCs by an iterative procedure. The tuning procedure, for each FFPC, modifies the confidence by adding a small decimal value. If modification of confidence causes classification accuracy improvement, the changes will be permanent; otherwise, confidence value returns to its previous state. As shown in figure 4, this process repeats for all FFPCs iteratively. The input vector Alfa includes decimal values to make changes in confidence values. The values of Alfa vector are generated with a fixed step size as follows: Alfa[0]=step, Alfa[1]=-step, Alfa[2]=Alfa[0] step, Alfa[3]=Alfa[1] step,...+ − Where, step variable is computes by an input parameter ∂ as follows: 1

step =∂

(7)

In order to show the confidence tuning effect, we generate an artificial dataset with one feature as shown in Figure 3. This artificial dataset has 12 samples of two classes and used four fuzzy linguistic values for this feature. After running our algorithm on this dataset without confidence tuning step, the algorithm generates two FFPCs as shown in figure 5-A: Using this FFPC set, our algorithm achieves classification accuracy equal to 91% on this artificial dataset. Figure 6, shows data sample which causes the misclassification error. As shown in Figure 6, the data sample is located on the boundary between two classes. Determining appropriate class boundary causes correct classifying of this data sample. Accordingly, after running proposed confidence tuning procedure on the dataset, FFPCs are modified as shown in figure 4-B. We observed accuracy equal to 100% by using this improved FFPC set of artificial dataset. This example indicates that confidence tuning will improve fuzzy classifier performance.

4. RESULTS In this section, we first describe benchmark datasets used for our algorithm performance evaluation. We used six dataset available on the keel website[12]. Table 1 lists the datasets and their specifications. For testing our method, we first normalized the dataset features, then performed 10 fold cross validation on benchmark datasets. In experiments, we used algorithm calibration as shown in Table 2. Dataset name Number Of

classes Number of

features Number of

samples Appendicitis 2 7 106

Monkes2 2 6 432 Pima 2 8 768

New thyroid 3 5 215 Iris 3 4 150

Magic 2 10 19020

S MS M L

Class1 Class2

(.)µ

X0.11 0.5 0.62 0.90

Parameter name

Parameter value

FMinSupp 0.02

FMinConf 0.6 l 3 τ 0.3

∂ 10

Figure 5- A. FFPCs before confidence tuning. B. FFPCs after confidence tuning.

Table2.Algorithm calibration

Figure 6. One dimensional feature space and samples of two classes

Table2. Dataset specifications

1515 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

The results of executing our method on datasets are shown in Table 3. We compared proposed method with two well-known fuzzy associative classifiers proposed in[9][11]. We picked up the methods classification accuracy from reported results in reference papers. Results show that our algorithm has superior performance on three datasets. Also on remaining datasets, our algorithm achieves competitive results. As shown in table 4, our method also produces few numbers of fuzzy rules comparable with FARC_HD algorithm.

Table 1. Performance results

Dataset Train Accuracy Test Accuracy Number Of Rules

Our method FARC_HD

AC-FFP Our method FARC_HD

AC-FFP Our method

FARC_HD

Pima 81.71 82.90 79.25 76.57 75.66 74.87 47.7 22.7 Appendices 91.29 93.82 91.51 87.72 84.18 85.09 6.0 6.8

Monks 100 99.92 97.22 100 99.77 97.27 10.8 14.2 Iris 98.00 98.59 96.15 96.66 96 98.00 6.6 4

Magic 83.53 85.36 73.59 82.65 84.51 73.40 83.2 43.3 New thyroid 97.26 98.91* 97.73 93.96 95.32 95.87 7.6 9.3

Average 91.96 93.25 89.24 89.59 89.24 87.41 26.98 16.71 5. CONCLUSION In this paper, we proposed a three-step fuzzy associative classifier which uses fuzzy frequent patterns for classification tasks. Our algorithm generated a compact set of fuzzy rules in the last step. Experiments show that our algorithm produces good results in comparing with other fuzzy associative classifiers. For future works, we can speed up our algorithm for handling datasets with larger sizes. 6. REFERENCES

1. T. N. Phyu, “Survey of Classification Techniques in Data Mining,” Int. MUlticonference Eng. Comput. Sci., vol. I, pp. 18–20, 2009.

2. P. Langley and H. a. Simon, “Applications of machine learning and rule induction,” Commun. ACM, vol. 38, no. 11, pp. 54–64, 1995.

3. F. Herrera, “Genetic fuzzy systems: Taxonomy, current research trends and prospects,” Evol. Intell., vol. 1, no. 1, pp. 27–46, 2008.

4. W. E. Combs and J. E. Andrews, “Combinatorial rule explosion eliminated by a fuzzy rule configuration,” Fuzzy Systems, IEEE Transactions on, vol. 6, no. 1. pp. 1–11, 1998.

5. G. Nakhaeizadeh, J. Hipp, and U. Güntzer, “Algorithms for association rule mining --- a general survey and comparison,” ACM SIGKDD Explor. Newsl., vol. 2, no. 1, pp. 58–64, 2000.

6. B. Liu, W. Hsu, Y. Ma, and B. Ma, “Integrating Classification and Association Rule Mining,” Knowl. Discov. Data Min., pp. 80–86, 1998.

7. W. Li, J. Han, and J. P. Cmar, “Accurate and e cient classi cation based on multiple class-association rules,” Proc. ICDM, vol. pages, pp. 369–376, 2001.

8. F. Chen, Y. Wang, M. Li, H. Wu, and J. Tian, “Principal Association Mining: An efficient classification approach,” Knowledge-Based Syst., vol. 67, pp. 16–25, 2014.

9. J.Alcalá-Fdez, R. Alcalá, and F. Herrera, “A fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning,” IEEE Trans. Fuzzy Syst., vol. 19, no. 5, pp. 857–872, 2011.

10. M. Fazzolari, R. Alcalá, and F. Herrera, “A multi-objective evolutionary method for learning granularities based on fuzzy discretization to improve the accuracy-complexity trade-off of fuzzy rule-based classification systems: D-MOFARC algorithm,” Appl. Soft Comput., vol. 24, pp. 470–481, 2014.

11. M. Antonelli, P. Ducange, F. Marcelloni, and A. Segatori, “A novel associative classification model based on a fuzzy frequent pattern mining algorithm,” Expert Syst. Appl., vol. 42, no. 4, pp. 2086–2097, 2015.

12. “www.keel.es.” .

*We used KEEL software [12] to get FARC-HD results on New thyroid dataset. Default parameter values

1516 University of Guilan-Faculty of Engineering & Technology-East of Guilan 18-19 November - 2015

Solution of fractional Black–Scholes equation by using homotopy perturbation method

Behrouz Fathi Vajargah 1, Maryam Ghazizadeh2

1,2Department of Mathematics, University of Guilan, Rasht, P.O. Box: 1914, Iran

Abstract

In the present paper, solution of nonlinear fractional Black–Scholes equation is deduced with the

help of the powerful homotopy perturbation method (HPM). To illustrate the method an example

has been prepared. The method for this equation has been lead to an exact solution. The reliability,

simplicity and cost-effectiveness of the method are confirmed by the results of applying the method

on the different forms of these kinds of functional equations. Keywords: Homotopy perturbation method; fractional Black–Scholes equation.

1. Introduction
The Black–Scholes equation, proposed by Black and Scholes (1973), is the financial model concerned with options [1]. An option is a contract between the seller and the buyer; it may be a call option or a put option. The option value depends on the underlying asset price and on time. A European option can only be exercised at the expiration date, whereas an American option can be exercised at any time before the expiration date. The solution of the Black–Scholes equation provides an option pricing formula for European options. The analytic solution is used in the general case under the basic assumptions, but it is not satisfactory in some situations; some restrictions appear in the classical Black–Scholes equation, which is the weakness of this model. The fractional Black–Scholes equation reads

∂^α u/∂τ^α + r(τ) s ∂u/∂s + (1/2) σ²(τ) s² ∂²u/∂s² − r(τ) u = 0,   (s, τ) ∈ R⁺ × [0, T],

with the terminal and boundary conditions

u(s, T) = max(s − E, 0), s ∈ R⁺,   u(0, τ) = 0, τ ∈ [0, T],

where u(s, τ) is the value of the European call option at underlying asset price s and time τ, T is the expiration date, r is the risk-free interest rate, σ is the volatility of the underlying asset price and E is the strike price. In the above equation, degeneration occurs in the approximation as s goes to zero. We transform the Black–Scholes equation into a nondegenerate partial differential equation by the logarithmic transformation x = ln s, t = T − τ, and, for convenience in the numerical experiments, work on the computational domain given by

∂^α u/∂t^α + (1/2) σ² ∂²u/∂x² + (r − (1/2) σ²) ∂u/∂x − r u = 0.

The fundamental definitions of fractional calculus used in what follows are given next.

2. Basic definitions
In this section some basic definitions and properties of the fractional calculus theory used in this work are recalled.

Definition 2.1: A real function f(x), x > 0, is said to be in the space C_μ, μ ∈ R, if there exists a real number p > μ such that f(x) = x^p f₁(x), where f₁(x) ∈ C[0, ∞), and it is said to be in the space C_μ^m if f^(m) ∈ C_μ, m ∈ N.

Definition 2.2: The Riemann–Liouville fractional integral operator of order α ≥ 0 of a function f ∈ C_μ, μ ≥ −1, is defined as

J^α f(x) = (1/Γ(α)) ∫₀^x (x − t)^(α−1) f(t) dt,  α > 0, x > 0,   J⁰ f(x) = f(x).

The general and detailed properties of the operator J^α can be found in references [2-4]. For this study, where f ∈ C_μ, μ ≥ −1, α, β ≥ 0 and γ > −1:

(1) J^α J^β f(x) = J^(α+β) f(x),
(2) J^α J^β f(x) = J^β J^α f(x),
(3) J^α x^γ = (Γ(γ + 1)/Γ(α + γ + 1)) x^(α+γ).

It is worth mentioning here that the Riemann–Liouville derivative has some disadvantages when used to model real-world phenomena with fractional differential equations. Therefore, a modified fractional differential operator D^α should be introduced to overcome those weaknesses in the previous models. Such modified fractional differential operators D^α were first proposed by Caputo in his work on the theory of viscoelasticity.

Definition 2.3: The fractional derivative of f(x) according to Caputo is defined as

D^α f(x) = J^(m−α) D^m f(x) = (1/Γ(m − α)) ∫₀^x (x − t)^(m−α−1) f^(m)(t) dt,

for m − 1 < α ≤ m, m ∈ N, x > 0, f ∈ C_{−1}^m.

The following two properties of this operator will be used in what comes next.

Lemma 2.4: If m − 1 < α ≤ m, m ∈ N, and f ∈ C_μ^m, μ ≥ −1, then D^α J^α f(x) = f(x) and

J^α D^α f(x) = f(x) − Σ_{k=0}^{m−1} f^(k)(0⁺) x^k / k!,   x > 0.

Definition 2.5: For m the smallest integer that exceeds α, the Caputo time-fractional derivative operator of order α > 0 is defined as

D_t^α u(x, t) = ∂^α u(x, t)/∂t^α = (1/Γ(m − α)) ∫₀^t (t − ξ)^(m−α−1) ∂^m u(x, ξ)/∂ξ^m dξ,  for m − 1 < α < m,

and D_t^α u(x, t) = ∂^m u(x, t)/∂t^m for α = m ∈ N.

For more information on the mathematical properties of fractional derivatives and integrals, one

can consult the above mentioned references.
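As a quick sanity check of these definitions, the following Python sketch (an illustration added here; the values of α, γ and x are arbitrary choices, not from the paper) evaluates the Riemann–Liouville integral of f(x) = x^γ by numerical quadrature and compares it with property (3).

import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def riemann_liouville_integral(f, x, alpha):
    # J^alpha f(x) = (1/Gamma(alpha)) * int_0^x (x - t)^(alpha - 1) f(t) dt
    # the algebraic endpoint singularity is handled with the 'alg' weight of QUADPACK
    value, _ = quad(f, 0.0, x, weight='alg', wvar=(0.0, alpha - 1.0))
    return value / gamma(alpha)

alpha, gam, x = 0.5, 2.0, 1.5          # illustrative parameter values
numeric = riemann_liouville_integral(lambda t: t ** gam, x, alpha)
closed_form = gamma(gam + 1) / gamma(alpha + gam + 1) * x ** (alpha + gam)
print(numeric, closed_form)            # the two values agree up to quadrature error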

3. Homotopy perturbation method
To illustrate the methodology of the homotopy perturbation method, consider the following nonlinear differential equation [5-27]:

A(u) − f(r) = 0,  r ∈ Ω, (1)
B(u, ∂u/∂n) = 0,  r ∈ Γ, (2)

where A is a general differential operator, f(r) is a known analytic function, B is a boundary operator and Γ is the boundary of the domain Ω. The operator A can generally be divided into two parts, L and N, where L is linear and N is nonlinear. Equation (1) can therefore be written as

L(u) + N(u) − f(r) = 0. (3)

Using the homotopy technique, we construct a homotopy u(r, p) : Ω × [0,1] → R which satisfies

H(u, p) = (1 − p)[L(u) − L(u₀)] + p[A(u) − f(r)] = 0,  p ∈ [0,1], r ∈ Ω, (4)

or

H(u, p) = L(u) − L(u₀) + p L(u₀) + p[N(u) − f(r)] = 0, (5)

where p ∈ [0,1] is called the homotopy parameter and u₀ is an initial approximation to the solution of Eq. (1) which satisfies the boundary conditions. Obviously, from Eqs. (4) and (5) we have

H(u, 0) = L(u) − L(u₀) = 0, (6)
H(u, 1) = A(u) − f(r) = 0. (7)

We can assume that the solution of (4) or (5) can be expressed as a power series in p, as

u = u₀ + p u₁ + p² u₂ + ⋯ (8)

Setting p = 1 results in the approximate solution of Eq. (1):

u = lim_{p→1} (u₀ + p u₁ + p² u₂ + ⋯) = u₀ + u₁ + u₂ + ⋯ (9)
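As a concrete illustration of the recursion behind Eqs. (4)-(9), the Python (SymPy) sketch below applies HPM to the toy problem u′ + u = 0, u(0) = 1 (an illustrative example of our own, not the Black–Scholes equation of Section 4). Collecting powers of p gives u₀ = 1 and u_k = −∫₀^t u_{k−1} dτ, and the partial sums reproduce the Taylor series of the exact solution e^(−t).

import sympy as sp

t, tau = sp.symbols('t tau')

# Toy problem (assumption for illustration): u'(t) + u(t) = 0, u(0) = 1, exact solution exp(-t).
# HPM recursion from collecting powers of p: u0 = u(0), u_k = -integral_0^t u_{k-1} dtau.
u = [sp.Integer(1)]                      # u0 = initial approximation u(0) = 1
for k in range(1, 8):
    u.append(sp.integrate(-u[-1].subs(t, tau), (tau, 0, t)))

approx = sp.expand(sum(u))               # partial sum u0 + u1 + ... (the p -> 1 limit)
print(approx)                            # 1 - t + t**2/2 - ..., the series of exp(-t)
print(sp.simplify(approx - sp.series(sp.exp(-t), t, 0, 8).removeO()))   # prints 0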

4. Numerical Example

Consider the following equation

∂^α u/∂t^α + (1/2) σ² ∂²u/∂x² + (r − (1/2) σ²) ∂u/∂x − r u = 0, (10)

with initial condition

u(x, 0) = e^x.

To solve Eq. (10) by the homotopy perturbation method, we construct the homotopy

(1 − p)(∂^α u/∂t^α − ∂^α u₀/∂t^α) + p(∂^α u/∂t^α + (1/2) σ² ∂²u/∂x² + (r − (1/2) σ²) ∂u/∂x − r u) = 0, (11)

or

∂^α u/∂t^α − ∂^α u₀/∂t^α + p ∂^α u₀/∂t^α + p((1/2) σ² ∂²u/∂x² + (r − (1/2) σ²) ∂u/∂x − r u) = 0. (12)

Suppose the solution of Eq. (12) has the form

u = u₀ + p u₁ + p² u₂ + ⋯ (13)

Applying the inverse operator J_t^α to both sides of the above equation results in

u(x, t) = u(x, 0) + J_t^α(∂^α u₀/∂t^α) − p J_t^α(∂^α u₀/∂t^α + (1/2) σ² ∂²u/∂x² + (r − (1/2) σ²) ∂u/∂x − r u). (14)

Suppose the solution of Eq. (14) has the form of Eq. (8); substituting Eq. (8) into Eq. (14), collecting the terms with the same powers of p and equating each coefficient of p to zero results in

p⁰: u₀(x, t) = u(x, 0) + J_t^α(∂^α u₀(x, t)/∂t^α),
p¹: u₁(x, t) = −J_t^α(∂^α u₀/∂t^α + (1/2) σ² ∂²u₀/∂x² + (r − (1/2) σ²) ∂u₀/∂x − r u₀),
p²: u₂(x, t) = −J_t^α((1/2) σ² ∂²u₁/∂x² + (r − (1/2) σ²) ∂u₁/∂x − r u₁),
⋮
p^(j+1): u_{j+1}(x, t) = −J_t^α((1/2) σ² ∂²u_j/∂x² + (r − (1/2) σ²) ∂u_j/∂x − r u_j).
⋮ (15)

Assuming

u₀ = e^x, (16)

and taking the parameter values 0.2, 0.1, 1 and 1 for σ, r, T and α, the iteration formula (15) applied with the initial approximation (16) gives the further components u₁, u₂, …, u₇, … (their explicit expressions are not reproduced here). The approximate solution of (10) is obtained by setting p = 1,

u = lim_{p→1} (u₀ + p u₁ + p² u₂ + ⋯) = u₀ + u₁ + u₂ + ⋯

Taking the truncated sum u ≈ Σ_{j=0}^{10} u_j, the results are presented in Fig. 1.

5. Conclusion

In this Letter, the application of HPM to the fractional Black–Scholes equation has been presented successfully. We note that the solutions were computed via a simple algorithm, without any need for perturbation techniques, special transformations, linearization or discretization. Thus, it can be concluded that the method is an effective numerical tool for solving such functional equations. All computations were performed using Maple 15.

References
[1] F. Black, M. Scholes, J. Political Econ. 81 (3), 1973, 637-654.
[2] B. J. West, M. Bologna, P. Grigolini, Physics of Fractal Operators (New York: Academic Press) (2003).
[3] K. S. Miller, B. Ross, An Introduction to the Fractional Calculus and Fractional Differential Equations (New York: Academic Press) (1993).
[4] I. Podlubny, Fractional Differential Equations (San Diego: Academic Press) (1999).
[5] J. H. He, Comput. Meth. Appl. Mech. Eng. 178, 257 (1999).
[6] J. H. He, Int. J. Non-linear Mech. 35, 37 (2000).
[7] J. H. He, Int. J. Mod. Phys. B 20, 2561 (2006).
[8] J. H. He, Topol. Meth. Nonlinear Anal. 31, 205 (2008).
[9] S. Abbasbandy, Chaos Soliton Fract. 31, 257 (2007).
[10] A. M. Siddiqui, A. Zeb, Q. K. Ghori, Phys. Lett. A 352, 404 (2006).
[11] Qi Wang, Chaos Soliton Fract. 35, 843 (2008).
[12] J. Biazar, M. Eslami and H. Ghazvini, International Journal of Nonlinear Sciences and Numerical Simulation 8, 411 (2007).
[13] J. Biazar, M. Eslami, International Journal of Numerical Methods for Heat & Fluid Flow 22, 803 (2012).
[14] A. M. El-Sayed, A. Elsaid, I. L. El-Kalla, and D. Hammad, Applied Mathematics and Computation 218, 8329 (2012).
[15] S. Abbasbandy, Appl. Math. Comput. 173, 493 (2006).
[16] F. I. Compean, D. Olvera, F. J. Campa, L. N. Lopez de Lacalle, A. Elias-Zuniga, and C. A. Rodriguez, International Journal of Machine Tools and Manufacture 57 (2012).
[17] U. Filobello-Nino, H. Vazquez-Leal, R. Castaneda-Sheissa, Asian Journal of Mathematics and Statistics 5, 50 (2012).
[18] Y. G. Wang, W. H. Lin and N. Liu, International Journal for Computational Methods in Engineering Science and Mechanics 13, 197 (2012).
[19] L. Cveticanin, Chaos Soliton Fract. 30, 1221 (2006).
[20] Turgut Ozis, Ahmet Yıldırım, Chaos Soliton Fract. 34, 989 (2007).
[21] E. Yusufoglu, Int. J. Nonlinear Sci. Numer. Simul. 8, 353 (2007).
[22] M. Zare, O. Jalili and M. Delshadmanesh, Indian J. Phys. 86, 855 (2012).
[23] Z. Azimzadeh, A. R. Vahidi and E. Babolian, Indian J. Phys. 86, 721 (2012).
[24] A. R. Vahidi, Z. Azimzadeh, M. Didgar, Indian J. Phys., accepted (2013).
[25] J. H. He, S. K. Elagan, Z. B. Li, Phys. Lett. A 376 (4) (2012).
[26] J. H. He, Z. B. Li, Thermal Science 16 (2) (2012).


Fixed point theorems on intuitionistic fuzzy metric spaces

S. Karimzadeh 1, M. Saheli 2 1- Department of Mathematics, Vali-e-Asr University of Rafsanjan, Rafsanjan,Iran 2- Department of Mathematics, Vali-e-Asr University of Rafsanjan, Rafsanjan,Iran

Corresponding Author’s E-mail ([email protected])

Abstract

In this paper, we use the definition of intuitionistic fuzzy metric spaces and provide two types of fuzzy versions of contraction. Moreover, we show that these mappings necessarily have fixed points in intuitionistic fuzzy metric spaces. Keywords: Fixed point, Contractive conditions, Intuitionistic fuzzy metric space.

1. Introduction

Banach's contraction mapping principle is one of the fundamental results of analysis and a central object of metric fixed point theory; its importance lies in its wide applicability across branches of mathematics. The following are a few examples of such contractions.

Theorem.([1]) Let (X, d) be a complete metric space, and let f : X → X be a self-mapping which satisfies the following inequality φ (d(f(x), f(y))) ≤ c φ (d(x, y)), for all x, y ∈ X and for some 0 < c < 1, where φ : [0, +∞) → [0, +∞) is a continuous monotonically nondecreasing mapping such that φ (0) = 0. Then f has a unique fixed point.

Theorem.([3]) Let (X, d) be a complete metric space, and let f : X →X be a self-mapping which satisfies the following inequality d(f(x), f(y)) ≤ d(x, y) − φ (d(x, y)), for all x, y ∈ X, where ϕ : [0, +∞) → [0, +∞) is a continuous nondecreasing mapping such that ϕ(t) = 0 if and only if t = 0. Then f has a unique fixed point. A natural question is whether we can provide contractive conditions which imply existence of fixed point in an intuitionistic fuzzy metric space. In this paper, we use the definition of intuitionistic fuzzy metric space and discuss two types of fuzzy versions of contraction and some corollaries. 2. Preliminaries

We give below some basic preliminaries required for this paper.

Definition 2.1. ([2])A 5-tuple (X, M, N, ∗, ♦) is said to be an intuitionistic fuzzy metric space if X is an arbitrary set, ∗ is a continuous t-norm, ♦ is a continuous t-conorm and M, N are fuzzy sets on X × (0, +∞) satisfying the following conditions: for all x, y, z ∈ X, s, t > 0, (F1) M(x, y, t) + N(x, y, t) ≤ 1, (F2) M(x, y, t) > 0, (F3) M(x, y, t) = 1 if and only if x = y, (F4) M(x, y, t) = M(y, x, t), (F5) M(x, y, t) ∗ M(y, z, s) ≤ M(x, z, t + s), (F6) M(x, y, .) : (0, ∞) → (0, 1] is continuous, (F7) N(x, y, t) > 0, (F8) N(x, y, t) = 0 if and only if x = y, (F9) N(x, y, t) = N(y, x, t), (F10) N(x, y, t)♦N(y, z, s) ≥ N(x, z, t + s), (F11) N(x, y, .) : (0, ∞) → (0, 1] is continuous.

We assume that (F12) M(x, y, .) is a strictly increasing on the subset t : 0 < M(x, y, t) < 1 of R and limt→∞ M(x, y, t) = 1, (F13) N(x, y, .) is a strictly increasing on the subset t : 0 < N(x, y, t) < 1 of R and limt→∞ N(x, y, t) = 0,


3. Fixed point theorems

First we introduce the following notation. We denote by Ψ the set of functions ψ : [0, +∞) → [0, +∞) satisfying the following hypotheses: (i) ψ is continuous and nondecreasing, (ii) ψ(t) = 0 if and only if t = 0. We denote by Φ the set of functions φ : [0, +∞) → [0, +∞) satisfying the following hypotheses: (i) φ is continuous and strictly increasing, (ii) φ(t) = 0 if and only if t = 0.

Theorem 3.1. Let (X, M, N, min, max) be a complete intuitionistic fuzzy metric space satisfying (F12), (F13), and let f : X → X be a self-map such that for all x, y ∈ X, t > 0 and α ∈ (0, 1], M(x, y, t) ≥ α implies M(f(x), f(y), ϕ⁻¹(ϕ(t) − φ(t))) ≥ α, and N(x, y, t) ≤ 1 − α implies N(f(x), f(y), ϕ⁻¹(ϕ(t) − φ(t))) ≤ 1 − α, where φ, ϕ ∈ Φ and ϕ(t) ≥ φ(t) for all t > 0. Then f has a fixed point in X.

Corollary 3.2. Let (X, M, N, min, max) be a complete intuitionistic fuzzy metric space satisfying (F12), (F13), and let f : X → X be a self-map such that for all x, y ∈ X and t > 0, M(f(x), f(y), ϕ⁻¹(ϕ(t) − φ(t))) ≥ M(x, y, t) and N(f(x), f(y), ϕ⁻¹(ϕ(t) − φ(t))) ≤ N(x, y, t), where φ, ϕ ∈ Φ and ϕ(t) ≥ φ(t) for all t > 0. Then f has a fixed point in X.

Theorem 3.3. Let (X, M, N, min, max) be a complete intuitionistic fuzzy metric space satisfying (F12), (F13), and let f : X → X be a self-map such that for all x, y ∈ X, s, t > 0 and α ∈ (0, 1], M(x, f(y), t) ≥ α and M(f(x), y, s) ≥ α imply M(f(x), f(y), (1/2)(t + s) − θ(t, s)) ≥ α, and N(x, f(y), t) ≤ 1 − α and N(f(x), y, s) ≤ 1 − α imply N(f(x), f(y), (1/2)(t + s) − θ(t, s)) ≤ 1 − α, where θ : [0, +∞) × [0, +∞) → [0, +∞) is a continuous mapping such that θ(x, y) = 0 if and only if x = y = 0. Then f has a fixed point in X.

Corollary 3.4. Let (X, M, N, min, max) be a complete intuitionistic fuzzy metric space satisfying (F12), (F13), and let f : X → X be a self-map such that for all x, y ∈ X and s, t > 0, M(f(x), f(y), (1/2)(t + s) − θ(t, s)) ≥ min{M(x, f(y), t), M(f(x), y, s)} and N(f(x), f(y), (1/2)(t + s) − θ(t, s)) ≤ max{N(x, f(y), t), N(f(x), y, s)}, where θ : [0, +∞) × [0, +∞) → [0, +∞) is a continuous mapping such that θ(x, y) = 0 if and only if x = y = 0. Then f has a fixed point in X.

4. Acknowledgment
This research was supported by Vali-e-Asr University of Rafsanjan.

5. References
1. Khan, M. S., Swaleh, M. and Sessa, S. (1984), “Fixed point theorems by altering distances between the points,” Bulletin of the Australian Mathematical Society, 30, pp. 1-9.
2. Park, J. H. (2004), “Intuitionistic fuzzy metric spaces,” Chaos, Solitons and Fractals, 22, pp. 1039-1046.
3. Rhoades, B. E. (2001), “Some theorems on weakly contractive maps,” Nonlinear Analysis: Theory, Methods and Applications, 47, pp. 2683-2693.


Fuzzy Topology Generated By Fuzzy Norm

M. Saheli 1, S. Karimzadeh 2 1- Department of Mathematics, Vali-e-Asr University of Rafsanjan, Rafsanjan,Iran 2- Department of Mathematics, Vali-e-Asr University of Rafsanjan, Rafsanjan,Iran

Corresponding Author’s E-mail ([email protected])

Abstract

In this paper, a fuzzy topology generated by a fuzzy norm is studied and some basic properties of this topology are proved. Keywords: Fuzzy normed linear space, Fuzzy topology, Fuzzy topological space.

1. Introduction Felbin [3] has offered in 1992 an alternative definition of a fuzzy norm on a linear space with an associated metric of the Kaleva and Seikkala type [4]. In 1999, Das and Das [1] constructed a fuzzy topology on the fuzzy normed linear space. Later Fang [2, 6] studied some properties of fuzzy topology generated by fuzzy norm. 2. Preliminaries

We give below some basic preliminaries required for this paper.

Definition 2.3. A fuzzy topology on a set X is a family τ of fuzzy subsets of X satisfying the following: (i) The fuzzy subsets 1 and 0 are in τ, (ii) τ is closed under finite intersection and arbitrary union of fuzzy subsets. The pair (X, τ) is called a fuzzy topological space.

Definition 2.4. A fuzzy subset µ of a vector space X is said to be convex if µ(kx+(1-k)y) > min(µ(x), µ(y)) for all x, y ∈ X and k ∈ [0, 1].

Definition 2.5. A fuzzy set µ in a fuzzy topological space (X, τ) is called a nbhd of a point x ∈ X iff there is ρ in τ such that ρ ⊆ µ and µ(x) = ρ(x) > 0.

Definition 2.8. A fuzzy topological space (X, τ) is said to be fuzzy Hausdorff if for x, y ∈ X and x≠ y there exist µ, η ∈ τ with µ(x) = η(y) = 1 and µ ∩ η = ∅.

Definition 2.9. Let X be a vector space over R. Assume the mappings L, R : [0, 1] × [0, 1] → [0, 1] are symmetric and non-decreasing in both arguments, and that L(0, 0) = 0 and R(1, 1) = 1. Let ∥.∥ : X → F +(R). The quadruple (X, ∥.∥, L, R) is called a fuzzy normed linear space with the fuzzy norm ∥.∥, if the following conditions are satisfied: (F1) if x≠ 0 then inf ∥ x ∥ > 0,

(F2) ∥x∥ = 0 if and only if x = 0, (F3) ∥rx∥ = |r |∥x∥ for x ∈ X and r ∈ R, (F4) for all x, y ∈ X, (F4L)∥x + y∥(s + t) ≥ L(∥x∥(s), ∥y∥(t)) whenever s ≤∥ x ∥ , t ≤ ∥ y ∥ and s + t ≤∥ x + y ∥ , (F4R)∥x+ y∥(s+ t) ≤ R(∥x∥(s), ∥y∥(t)) whenever s ≥ ∥ x ∥ , t ≥ ∥ y ∥ and s + t ≥ ∥ x + y ∥ .

In what follows L(s, t) = min(s, t) and R(s, t) = max(s, t) for all s, t ∈ [0,1]. We write (X, ∥.∥) or simply X when L and R are as above.


3. Fuzzy Topology Generated by Fuzzy Norm

Definition 3.1. Let (X, ∥.∥) be a fuzzy normed linear space and ϵ > 0. The fuzzy set on X given by (x) = sup{α ∈ (0, 1] : ∥x∥ ≤ α} is said to be a fuzzy sphere with center 0 and radius ϵ in X.

Theorem 3.2. Let (X, ∥.∥) be a fuzzy normed linear space. Then the family τ∥.∥ = {µ : for all x ∈ supp µ and 0 < r < µ(x) there is ϵ > 0 s.t. x + ∩ r ⊆ µ} is a fuzzy topology on X.

Theorem 3.3. Let (X, ∥.∥) be a fuzzy normed linear space. Then (X, τ∥.∥) is a fuzzy topological vector space and, for every λ ∈ (0, 1), { ∩ r : ϵ > 0, r ∈ (1 − λ, 1]} is a Q-neighborhood base of 0.

Theorem 3.4. Let (X, ∥.∥) be a fuzzy normed linear space. Then ∩ r is a fuzzy convex set, for all ϵ > 0 and r ∈ [0, 1].

Corollary 3.5. Let (X, ∥.∥) be a fuzzy normed linear space. Then the fuzzy topological space (X, τ∥.∥) is a locally convex fuzzy topological vector space.

Theorem 3.6. Let (X, ∥.∥) be a fuzzy normed linear space. Then the fuzzy sphere with center 0 and radius ϵ is an open set, for all ϵ > 0.

4. Acknowledgment
This research was supported by Vali-e-Asr University of Rafsanjan.

5. References
1. Das, N. F. and Das, P. (1999), “Fuzzy topology generated by fuzzy norm,” Fuzzy Sets and Systems, 107, pp. 349-354.
2. Fang, J. X. (2006), “On I-topology generated by fuzzy norm,” Fuzzy Sets and Systems, 157, pp. 2739-2750.
3. Felbin, C. (1992), “Finite dimensional fuzzy normed linear space,” Fuzzy Sets and Systems, 48, pp. 239-248.
4. Kaleva, O. and Seikkala, S. (1984), “On fuzzy metric spaces,” Fuzzy Sets and Systems, 12, pp. 215-229.
5. Saheli, M. (2015), “On fuzzy topology and fuzzy norm,” Annals of Fuzzy Mathematics and Informatics, 10, pp. 639-647.
6. Xu, G. H. and Fang, J. X. (2007), “A new I-vector topology generated by a fuzzy norm,” Fuzzy Sets and Systems, 158, pp. 2375-2385.


Optimal correction of linear inequalities based on second order conic programming

Hossein Moosaei1, M Jalili2

1- Department of Mathematics, Neyshabur Branch, Islamic Azad University, Neyshabur, Iran 2- Department of Mathematics, Neyshabur Branch, Islamic Azad University, Neyshabur, Iran

Corresponding Author’s E-mail ([email protected])

Abstract

In this work, we study the optimal correction of infeasible linear inequalities by making minimal changes in the right-hand side, using second order conic programming. Solving this problem is equivalent to minimizing a nonlinear quadratic function. We introduce a method for solving this problem, and some randomly generated problems are provided to illustrate the efficiency and validity of the proposed method.

Keywords: Linear Inequalities, Convex Optimization, Second Order Conic Programming.

1. Introduction

The fully-discretized feasibility model of the inverse problem of intensity modulated radiation therapy (IMRT) gives rise to a system of linear inequalities that describes the effects of radiation on the irradiated body and the treatment prescription, see Censor, Altschuler and Powlis [1, 2].

Also, many models in industrial engineering lead to problems that present themselves as infeasible linear inequalities. Numerous reasons can be given for the infeasibility of a system of linear inequalities, including data and modeling errors. Remodeling such a system and locating its errors might take remarkable time and expense, and might eventually result in yet another infeasible system, hence we refrain from doing so. We therefore focus on an optimal correction of the given system.

In fact, we would like to reach feasible systems with the least changes in data [3,4,5] . In this paper we consider the following set of linear inequalities that are inconsistent [5]:

Ax ≤ b; (1)

x ≥ 0

where A ∈ R^(m×n) and b ∈ R^m.

In order to make the above-mentioned system feasible, we apply changes to the right-hand side vector b; in order to correct system (1) efficiently, we need to solve the following problem:

min_{x ≥ 0} (1/2) ∥(Ax − b)₊∥²,

where a₊ denotes the vector obtained from a by replacing each negative component by zero.

In this paper, we study the above problem. An equivalent formulation of the problem is given and an efficient algorithm is designed to solve the new formulation.

2. 2-norm Corrections
The minimal correction in the l₂ norm obtained by changing the right-hand side vector is

min_{x ≥ 0, r} (1/2) ∥r∥²   s.t.   Ax ≤ b + r. (2)

In the following theorem we show how the optimal x and r are computed.

Theorem. Let x* and r* be the optimal solution of (2). Then r* = (Ax* − b)₊, and x* is an optimal solution of

min_{x ≥ 0} (1/2) ∥(Ax − b)₊∥². (3)

To solve (3) we would like to use second order conic programming.

3. Conic Programming
A specific cone of interest is the second-order cone, which is defined as follows.

Definition. The second-order (Lorentz) cone is defined by

L^n = {x = (x₁, …, xₙ) ∈ R^n : √(x₁² + ⋯ + x²_{n−1}) ≤ xₙ}.

Optimization over these cones is called second order conic programming (SOCP).

Lemma. We have

∥x₊∥ ≤ t  ⇔  ∃ y = (y₁, …, yₙ) such that xᵢ ≤ yᵢ, i = 1, 2, …, n, and ∥y∥ ≤ t.

By using the above Lemma, problem (3), equivalently the minimization of ∥(Ax − b)₊∥, can be written in the standard dual form of an SOCP: maximize −t over x ∈ R^n₊, y ∈ R^m and t, subject to Ax − b ≤ y and (t, y) ∈ L^(m+1). Then

K* = R^m₊ × R^m₊ × R^n₊ × L^(m+1)

is the product of three linear cones and one second-order cone.
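For readers who want to check the optimal value of (3) without an SOCP solver, the following Python sketch (our own illustration, not the method proposed in this paper) solves min_{x≥0} (1/2)∥(Ax − b)₊∥² directly by projected gradient descent; its objective value can be compared against the SOCP solution. The random test data below are illustrative only.

import numpy as np

def correct_rhs(A, b, steps=5000, lr=None):
    # Minimize 0.5*||(Ax - b)_+||^2 over x >= 0 by projected gradient descent.
    m, n = A.shape
    if lr is None:
        lr = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1e-12)    # step from the Lipschitz constant
    x = np.zeros(n)
    for _ in range(steps):
        r = np.maximum(A @ x - b, 0.0)                    # r = (Ax - b)_+
        x = np.maximum(x - lr * (A.T @ r), 0.0)           # gradient step, then projection onto x >= 0
    r = np.maximum(A @ x - b, 0.0)
    return x, r, 0.5 * np.dot(r, r)

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
b = A @ rng.random(10) - 2.0                              # a generally infeasible instance
x, r, obj = correct_rhs(A, b)
print(obj, bool(np.all(A @ x <= b + r + 1e-8)))           # corrected system Ax <= b + r holds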

4. Computational Results

To solve (3) we have utilized the SOCP formulation. Our computational experiments on several problems show the superior performance of the method. The following MATLAB script generates a random infeasible (inconsistent) system Ax <= b; the inputs are m, n and d (density), and the outputs are A and b:

% MATLAB generator of random infeasible linear inequalities Ax <= b
pl = inline('(abs(x)+x)/2');                 % plus-part function (.)_+
m = input('enter m= ');
n = input('enter n= ');
d = input('enter d= ');
m1 = max(m-round(0.5*m), m-n);
A1 = sprand(m1, n, d);  A1 = 1*(A1 - 0.5*spones(A1));
x = spdiags(rand(n,1), 0, n, n) * 1 * (rand(n,1) - rand(n,1));
x = spdiags(ones(n,1) - sign(x), 0, n, n) * 10 * (rand(n,1) - rand(n,1));
m2 = m - m1;  u = randperm(m2);  A2 = A1(u,:);
b1 = A1*x + spdiags(rand(m1,1), 0, m1, m1) * 1 * ones(m1,1);
b2 = b1(u) + spdiags(rand(m2,1), 0, m2, m2) * 10 * ones(m2,1);
A = 100*[A1; -A2];
b = [b1; -b2];

The results are shown in Table 1.

Table 1: Computational Results
m × n       ∥(Ax − b)₊∥     ∥x∥             time (sec)
50 × 10     1.9890e+001     3.1505e+002     0.461452
100 × 50    4.6850e+001     1.2320e+000     0.478509
200 × 100   5.7488e+001     7.0744e-001     0.611178
600 × 500   9.4989e+001     2.6950e+000     6.115507

5. Acknowledgment The authors would like to acknowledge Islamic Azad University, Neyshabur Branch (Iran), for financial support of this research. This paper is from project “Optimal correction of inconsistent linear systems by using Conic Programming”. 6. References 1. Y. Censor, Mathematical optimization for the inverse problem of intensity-modulated radiation therapy, in: J.R. Palta and T.R.

Mackie (Editors), Intensity-Modulated Radiation Therapy: The State of The Art, American Association of Physicists in Medicine, Medical Physics Monograph No. 29, Medical Physics Publishing, Madison, Wisconsin, USA, 2003, pp. 2549.

2. Y. Censor, M.D. Altschuler and W.D. Powlis, A computational solution of the inverse problem in radiation therapy treatment planning, Applied Mathematics and Computation, Vol. 25, pp. 5787, (1988).

3. S. Ketabchi, H. Moosaei, Optimal Error Correction and Methods of Feasible Directions, J. Optim. Theory Appl. 154 (2012), pp. 209-216.

4. S. Ketabchi, H. Moosaei, S. Fallahi, Optimal error correction of the absolute value equation using a genetic algorithm, Mathematical and Computer Modelling 57 (2013), pp. 2339-2342.

5. M. Salahi, S. Ketabchi, Correcting an inconsistent set of linear inequalities by the generalized Newton method, Optimization Methods and Software. 25 (2010) , pp. 457–465.


Ridge regression model fitting with triangular fuzzy inputs-output data

Mohammad Reza Rabiei1, Fatemeh Piadeh Koohsar2

1- Department of Statistics, School of Mathematical Sciences, University of Shahrood, Shahrood, Iran 2- Master of Science in Statistics, Shiraz University

Email:[email protected].

Abstract

This paper deals with ridge regression modeling in the context of fuzzy analysis in which the input and output are triangular fuzzy numbers. First , some approximations for multiplication of two triangular fuzzy numbers are introduced. Then, to evaluate the fuzzy linear ridge regression models, the best approximation is selected to minimize a suitable function. Further, experimental results are then presented, which indicate the performance of this model. Keywords Fuzzy regression-Ridge regression-Fuzzy number

1. Introduction

Fuzzy regression analysis is a powerful tool for analyzing phenomena in which one variable namely output (response or dependent) variable depends on one or more variables namely input (explanatory or independent) variables, in a fuzzy environment. A fuzzy linear regression model was first proposed by Tanaka et al. [9] to analyze fuzzy data with vague relationship between the dependent variable and independent variables. In this paper, we will be dealing with a ridge regression methodology where the data including the inputs and output are fuzzy numbers. Least squares and ridge regressions are classical statistical algorithms which have been known for a long time. Ridge regression is particularly used for multivariate settings in which multicollinearity exists between the columns of design matrix.

In this paper, we adopt the fuzzy regression model of Diamond [1] and refine it for a multicollinear system. It allows us to perform nonlinear analysis for his model by constructing a fuzzy linear regression function in a high dimensional feature space with fuzzy inputs and fuzzy output data. The rest of this paper is organized as follows. Section 2 illustrates preliminaries and ridge regression procedure for fuzzy multivariate linear models. Section 3 describes how to apply this idea to the fuzzy multivariate model. Section 4 gives some conclusions.

2. Preliminaries

In this section, we review some preliminary definitions and a well-known result on triangular fuzzy numbers which are needed for the construction of fuzzy ridge regression. These are taken from Hassanpour et al. [4]. Let R be the set of real numbers.

Definition 1: [6] A fuzzy set Ã on R is called a fuzzy number iff Ã is convex and there exists exactly one point, say a, with Ã(a) = 1.

Definition 2: [8] Let Ã and B̃ be two fuzzy sets. Ã is said to be a subset of B̃, denoted by Ã ⊆ B̃, if and only if its membership function is less than or equal to that of B̃ everywhere on R: Ã ⊆ B̃ ⟺ Ã(x) ≤ B̃(x), ∀ x ∈ R. (1)

Proposition 1: Suppose Ã, B̃ and C̃ are fuzzy numbers. Then Ã(B̃ + C̃) ⊆ ÃB̃ + ÃC̃. (2) The equality holds if Ã, B̃, C̃ > 0. Proof: See [7].

Definition 3: [11] A triangular fuzzy number (TFN), say Ã, is denoted by Ã = (a, a_l, a_r), where a, a_l and a_r are the center, left and right spreads of Ã, respectively; a is a real number and a_l, a_r > 0. The membership function of Ã is

Ã(x) = (x − (a − a_l))/a_l for a − a_l ≤ x ≤ a;  ((a + a_r) − x)/a_r for a ≤ x ≤ a + a_r;  0 otherwise. (3)

Definition 4: The closed intervals [a − (1 − h)a_l, a + (1 − h)a_r] and [a − a_l, a + a_r] are called the h-level set, h ∈ (0, 1], and the support of Ã, respectively. If a_l = a_r then Ã is a symmetric TFN and is denoted by Ã = (a, a_l). A real number can be considered as a degenerated TFN whose spreads are zero.

Definition 5: The following formulas for the addition of two TFNs and the multiplication of a TFN by a scalar are drawn from the extension principle of Zadeh [10]. If Ã = (a, a_l, a_r) and B̃ = (b, b_l, b_r) are in T(R) and k ∈ R, then

Ã + B̃ = (a + b, a_l + b_l, a_r + b_r), (4)
kÃ = (ka, k a_l, k a_r) if k ≥ 0,  and  kÃ = (ka, −k a_r, −k a_l) if k < 0. (5)

Definition 6: The multiplication of two TFNs based on the extension principle is not necessarily a TFN [4]. The following approximation for the product of two TFNs Ã = (a, a_l, a_r) and B̃ = (b, b_l, b_r) is proposed by Dubois and Prade [4]:

ÃB̃ ≅ (ab, a b_l + b a_l, a b_r + b a_r) if Ã > 0, B̃ > 0;
ÃB̃ ≅ (ab, b a_l − a b_r, b a_r − a b_l) if Ã < 0, B̃ > 0;
ÃB̃ ≅ (ab, −b a_r − a b_r, −b a_l − a b_l) if Ã < 0, B̃ < 0, (6)

where, for each factor, > 0 (< 0) means that its center minus its left spread is ≥ 0 (its center plus its right spread is ≤ 0).

Definition 7: Let Ã = (a, a_l, a_r) and B̃ = (b, b_l, b_r) be any two LR fuzzy numbers in fuzzy linear regression (FLR). Diamond [2] defined a distance between Ã and B̃ as

d²(Ã, B̃) = (a − b)² + ((a − a_l) − (b − b_l))² + ((a + a_r) − (b + b_r))². (7)

Although this distance was originally defined for triangular fuzzy numbers, it is still suitable for LR fuzzy numbers with finite lower and upper limits. Indeed, the distance (7) measures the closeness between the membership functions of two LR fuzzy numbers; in particular, when d(Ã, B̃) = 0 the membership functions of Ã and B̃ are equal.

In this paper, we also study the ridge estimation using Diamond's distance [2] for triangular fuzzy numbers. In what follows, we study ridge regression of the fuzzy linear model.

Definition 8: Suppose we are given the training data (x₁, y₁), …, (xₙ, yₙ), where the xᵢ are vectors in R^m and yᵢ ∈ R, i = 1, …, n. The ridge regression procedure is a slight modification of the least-squares method and replaces the least-squares objective by

R(w) = λ∥w∥² + Σᵢ (yᵢ − w·xᵢ)²,

where the dot · denotes the dot product in R^m. The parameter λ controls the smoothness and degree of fit. If multicollinearity occurs in the input variables, numerical stability problems arise and require the use of ridge regression. See [3] for more details.
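A minimal Python sketch of the crisp ridge estimator described in Definition 8 is given below (the data and the value of λ are illustrative choices); the fuzzy model of Section 3 extends this idea to triangular fuzzy inputs and outputs.

import numpy as np

def ridge_fit(X, y, lam=1.0):
    # Ridge regression: minimize lam*||w||^2 + sum_i (y_i - w.x_i)^2, via the normal equations
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.normal(size=50)      # nearly collinear columns
y = X @ np.array([1.0, -2.0, 1.0]) + 0.1 * rng.normal(size=50)

print(ridge_fit(X, y, lam=0.0))    # ordinary least squares: unstable under collinearity
print(ridge_fit(X, y, lam=5.0))    # ridge: shrunken, numerically stable coefficients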

3. The proposed regression model
Consider a set of TFN data {(X̃ᵢ₁, …, X̃ᵢₘ, Ỹᵢ) | i = 1, …, n}, in which X̃ᵢⱼ (i = 1, …, n, j = 1, …, m) is the value of the j-th independent variable and Ỹᵢ (i = 1, …, n) is the corresponding value of the dependent variable in the i-th case. The purpose of fuzzy linear regression (FLR) is to fit a fuzzy linear model to the given TFN data. This model can be written as

Ỹ = Ã₀ + Ã₁X̃₁ + Ã₂X̃₂ + ⋯ + ÃₘX̃ₘ. (8)

In model (8), Ã₀, Ã₁, …, Ãₘ are the FLR coefficients (parameters), which are fuzzy numbers. These parameters must be estimated so that the estimated responses

Ŷᵢ = Ã₀ + Ã₁X̃ᵢ₁ + Ã₂X̃ᵢ₂ + ⋯ + ÃₘX̃ᵢₘ,  i = 1, …, n, (9)

have the best fit to the observed responses Ỹᵢ (i = 1, …, n), according to a criterion of goodness of fit. In this paper, the FLR coefficients, inputs and outputs are supposed to be non-symmetric TFNs. Also, the given inputs X̃ᵢⱼ, i = 1, …, n, j = 1, …, m, are assumed to be positive non-symmetric TFNs, after a simple translation of all data if necessary. On the other hand, the observed responses Ỹᵢ ∈ T(R), i = 1, …, n, are non-symmetric TFNs. To estimate the regression coefficients, it is necessary to evaluate the product of the TFNs Ãⱼ and X̃ᵢⱼ in model (8).

It is apparent that the multiplication in (6) depends on the sign of the TFNs. Therefore, one would have to formulate different models for the different sign patterns of the regression coefficients. Another problem with using the multiplication (6) is that it is proposed only for pairs of positive and/or negative TFNs, so it cannot be used for TFNs whose supports contain both positive and negative real numbers. Instead, we approximate the product of two TFNs by some (in fact, by infinitely many) TFNs, and then the best approximation is chosen among them by minimizing a suitable function.

Note that a real number a has infinitely many representations of the form a = a′ − a″, where a′ and a″ are nonnegative real numbers. Assume Ã is an arbitrary TFN and X̃ is a positive TFN. Writing Ã = Ã′ + Ã″ with Ã′ positive and Ã″ negative, and using Proposition 1, the multiplication (6) and the addition (4), the product ÃX̃ is contained in a TFN, which yields a triangular approximation Ã ⊗ X̃ of the product, as in (10)-(11); its center is (a′ − a″)x and its spreads are obtained by adding the corresponding spreads of Ã′X̃ and Ã″X̃.

For each choice of Ã′ⱼ and Ã″ⱼ (j = 1, …, m), using (11) and (4) we obtain a triangular approximation of the i-th estimated response, denoted Ŷᵢ(A′, A″), i = 1, …, n, as in (12), where A′ and A″ are the m-dimensional vectors with j-th elements Ã′ⱼ and Ã″ⱼ, respectively. As noted above, Ŷᵢ is not necessarily a TFN, but for each choice of A′ and A″, Ŷᵢ(A′, A″) is a triangular approximation of Ŷᵢ which is independent of the sign of the regression coefficients, and we consider it as an approximation of the i-th response. Among the many possible approximations, we choose the best one by making the membership function of each approximated response Ŷᵢ(A′, A″) as close as possible to the membership function of the corresponding observed response [5]. Supposing the observed responses are non-symmetric TFNs Ỹᵢ = (yᵢ, y_l,i, y_r,i), i = 1, …, n, the ridge regression problem can be formulated as the optimization problem

minimize  λ∥A∥² + Σᵢ d²(Ŷᵢ(A′, A″), Ỹᵢ), (13)

subject to the decomposition constraints linking (Ã′ⱼ, Ã″ⱼ) to Ãⱼ and the nonnegativity of all centers and spreads, j = 1, …, m. By using the distance (7), model (13) reduces to a convex optimization problem in the centers and spreads of the coefficients, which is then solved.

4. Illustrative example

In this section, we provide a real example to explain how the proposed method can be applied to derive a regression model for fuzzy observations. We illustrate the fuzzy ridge regression model through one example dealing with estimating fuzzy linear models. These data are taken from [3], where they were used to analyze the effect of the composition of Portland cement on the heat evolved during hardening.

Ỹ                  X̃₁                 X̃₂
(4, 0.5, 0.5)      (2, 0.5, 0.5)      (2, 0.5, 0.5)
(5.5, 0.5, 0.5)    (3.5, 0.5, 0.5)    (5, 0.5, 0.5)
(7.5, 1, 1)        (5.5, 1, 1)        (1, 1, 1)
(6.5, 0.5, 0.5)    (7, 0.5, 0.5)      (6, 0.5, 0.5)
(8.5, 0.5, 0.5)    (8.5, 0.5, 0.5)    (8, 0.5, 0.5)
(8, 1, 1)          (10.5, 1, 1)       (7, 1, 1)
(10.5, 0.5, 0.5)   (11, 0.5, 0.5)     (9, 0.5, 0.5)
(9.5, 0.5, 0.5)    (12.5, 0.5, 0.5)   (1, 0.5, 0.5)

According to our results, we have

Ỹ = (1.89, 0, 0) + (0.62, 0, 0) X̃₁ + (0.14, 0, 0) X̃₂.

The above fuzzy ridge regression model can be applied to predict Ỹ for a new case.
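The short Python sketch below shows how such a prediction can be evaluated with the TFN arithmetic of Eqs. (4) and (6) (for positive TFNs); the fitted coefficients are taken from the model above, and applying it to the first data row is our own illustrative check.

import numpy as np

def tfn_add(A, B):
    # Addition of TFNs (center, left spread, right spread), Eq. (4)
    return tuple(a + b for a, b in zip(A, B))

def tfn_mul(A, B):
    # Dubois-Prade product approximation, Eq. (6), for two positive TFNs
    (a, al, ar), (b, bl, br) = A, B
    return (a * b, a * bl + b * al, a * br + b * ar)

A0, A1, A2 = (1.89, 0.0, 0.0), (0.62, 0.0, 0.0), (0.14, 0.0, 0.0)
X1, X2 = (2.0, 0.5, 0.5), (2.0, 0.5, 0.5)          # first row of the data table
Y_hat = tfn_add(tfn_add(A0, tfn_mul(A1, X1)), tfn_mul(A2, X2))
print(tuple(round(v, 3) for v in Y_hat))            # compare with the observed (4, 0.5, 0.5)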

5. Conclusions
In this paper, we dealt with estimating a fuzzy ridge regression model with multiple fuzzy inputs and a triangular fuzzy output. This method proves to be successfully applicable when multicollinearity occurs among the input variables.

6. References
1. Diamond, P. (1987), “Least squares fitting of several fuzzy variables,” In Preprints of Second IFSA Congress, Tokyo, July, pp. 20-25.
2. Diamond, P. (1988), “Fuzzy least squares,” Information Sciences, 46(3), pp. 141-157.
3. Draper, N.R. and Smith, H. (1981), “Applied regression analysis,” Wiley, New York.
4. Dubois, D. and Prade, H. (1980), “Fuzzy sets and systems: Theory and applications,” Academic Press, New York.
5. Hassanpour, H., Maleki, R. and Yaghoobi, M. A. (2011), “A goal programming approach to fuzzy linear regression with fuzzy input-output data,” Soft Computing, 15(8), pp. 1569-1580.
6. Bandemer, H. and Wolfgang, N. (2012), “Fuzzy data analysis,” Vol. 20, Springer Science & Business Media.
7. Nguyen, H.T. and Walker, E.A. (2005), “A first course in fuzzy logic,” 3rd edn., Chapman & Hall, New York.
8. Sakawa, M. (1993), “Fuzzy sets and interactive multiobjective optimization,” Plenum, New York.
9. Tanaka, H., Uejima, S. and Asai, K. (1982), “Linear regression analysis with fuzzy model,” IEEE Trans. Systems, Man and Cybernetics, 12, pp. 903-907.
10. Zadeh, L.A. (1978), “Fuzzy sets as a basis for a theory of possibility,” Fuzzy Sets and Systems, pp. 3-28.
11. Zimmermann, H.J. (2001), “Fuzzy Set Theory and its Applications,” Kluwer Academic Press, Dordrecht.


Hybrid algorithm based on PSO-TLBO and FCM for Image Color Quantization

S. Milad. Nayyersabeti 1, A. Mostaar 2, MR. Deevband 3

1- Department of Medical Physics and Biomedical engineering, Shahid Beheshti University of Medical Sciences, Tehran, Iran

2,3- Department of Medical Physics and Biomedical engineering, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Corresponding Author’s [email protected]

Abstract One of the most common approaches to color quantization is the use of clustering algorithms in the pixel space. There are many clustering algorithms, some of which have been used for color quantization. The purposes of color quantization are displaying images on limited hardware, reducing the use of storage media and accelerating image transmission. In this paper a hybrid of the TLBO and PSO algorithms with the FCM algorithm is proposed. The results demonstrate the superior performance of the proposed algorithm in comparison with other color quantization algorithms. Keywords: Color Quantization, FCM, PSO Algorithm, TLBO Algorithm, Clustering

1. Introduction

In the past, color quantization was a necessity due to the limitations of display hardware, which could not handle the over 16 million possible colors of 24-bit images. Although 24-bit display hardware has become more common, color quantization still maintains its practical value [1]. Modern applications of color quantization in graphics and image processing include: (i) compression, (ii) segmentation, (iii) text localization/detection, (iv) color texture analysis, (v) watermarking, (vi) non-photorealistic rendering, and (vii) content-based retrieval [2]. In the image processing literature many different algorithms have been introduced that aim to find a palette that allows for good image quality of the quantized image [3, 4]. The FCM algorithm is an important color quantization technique; FCM allows more flexibility by introducing the possibility of partial memberships to clusters [1]. Soft computing techniques have also been used for color quantization [5].

In the following, the FCM algorithm is reviewed in section 2; PSO and TLBO are discussed in sections 3 and 4, and the next section reviews the hybrid of PSO and TLBO. In section 6, the FCM-PSO-TLBO algorithm is described. Finally, we present the experiments and results.

2. The concept of the FCM algorithm
The FCM algorithm is an iterative optimization algorithm that minimizes the objective function

J_m(U, V) = Σ_{i=1}^{c} Σ_{j=1}^{n} u_{ij}^m d_{ij}², (1)

subject to Σ_{i=1}^{c} u_{ij} = 1 and u_{ij} ≥ 0 (i = 1, …, c, j = 1, …, n), where X = (x₁, x₂, …, xₙ) ⊂ R^s is the data set of dimension s, n is the number of data points, c is the number of clusters, m ∈ (1, ∞) is a weighting constant, d_{ij} = ∥Vᵢ − xⱼ∥ is the distance (usually the Euclidean distance) between the i-th cluster center and the j-th data point, u_{ij} is the membership of vector xⱼ in the i-th cluster, U = [u_{ij}] is the c × n fuzzy c-partition matrix, and V = [V₁, V₂, …, V_c]^T is the c × s cluster center matrix. The FCM algorithm is simply an iterative procedure:

Step 1: Initialize the cluster centers with random values, and choose ε > 0.
Step 2: Calculate the partition matrix U(k): for all i, j, if d_{ij}(k) > 0,

u_{ij}(k) = 1 / Σ_{l=1}^{c} (d_{ij}(k) / d_{lj}(k))^(2/(m−1));

if d_{ij}(k) = 0, set u_{ij}(k) = 1 and u_{rj}(k) = 0 for r ≠ i. (2)

Step 3: Calculate the cluster center matrix V(k+1):

Vᵢ(k+1) = Σ_{j=1}^{n} u_{ij}^m(k) xⱼ / Σ_{j=1}^{n} u_{ij}^m(k),  ∀ i. (3)

Step 4: If ∥V(k) − V(k+1)∥ < ε, stop; otherwise set k+1 → k and go to Step 2. A minimal sketch of this iteration is given below.
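The following Python sketch implements the plain FCM iteration of Steps 1-4 on small synthetic data (the data set, c, m and ε are illustrative choices, not the image-quantization setting used in the experiments).

import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=200, seed=0):
    # Fuzzy c-means: returns cluster centers V (c x s) and memberships U (c x n)
    rng = np.random.default_rng(seed)
    n, s = X.shape
    V = X[rng.choice(n, size=c, replace=False)]            # Step 1: random initial centers
    U = np.full((c, n), 1.0 / c)
    for _ in range(max_iter):
        d = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2) + 1e-12     # d_ij, shape (c, n)
        U = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)   # Step 2
        V_new = (U ** m) @ X / np.sum(U ** m, axis=1, keepdims=True)           # Step 3
        if np.linalg.norm(V_new - V) < eps:                # Step 4: convergence test
            return V_new, U
        V = V_new
    return V, U

# Two well-separated blobs; FCM should recover centers near (0,0,0) and (5,5,5)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (100, 3)), rng.normal(5, 0.5, (100, 3))])
V, U = fcm(X, c=2)
print(np.round(V, 2))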

3. PSO algorithm
PSO was developed by Kennedy and Eberhart, who were inspired by research on artificial life [7]. PSO starts with a group of random answers (particles) and then searches for the optimal answer in the problem space by updating the particles' positions. Every multi-dimensional particle (the dimension depends on the nature of the problem) is specified by P_id and V_id, which denote the position and velocity in the d-th dimension of particle i. In every phase of the swarm movement, the position of each particle is updated using two best values. The first is the best answer, according to fitness, obtained so far by each particle separately, called P_i^b. The other is the best value obtained so far by all particles of the whole swarm, called P_g^b. If a neighborhood is defined for every particle, this value is computed in that neighborhood. In every iteration of the algorithm, after finding P_g^b and P_i^b, the new velocity and position of each particle are updated by (3) and (4) [7]:

V_id(s+1) = w V_id(s) + γ₁ R₁ [P_gd^b(s) − P_id(s)] + γ₂ R₂ [P_id^b(s) − P_id(s)], (3)
P_id(s+1) = P_id(s) + V_id(s), (4)

where P_id(s) and V_id(s) are, respectively, the position and velocity of particle i at time s, w is called the inertia weight and decides how much the old velocity affects the new one, and the coefficients γ₁ and γ₂ are constant values called learning factors, which decide the degree of influence of P_g^b and P_i^b. In particular, γ₁ is a weight that accounts for the "social" component, while γ₂ represents the "cognitive" component, accounting for the memory of an individual particle over time. Two random numbers, R₁ and R₂, with uniform distribution on [0, 1], are included to enrich the search space. Finally, a fitness function must be given to evaluate the quality of a position.

Algorithm 1 (pseudo-code version of the PSO algorithm).

Initialization
s ← 0                                 /* s: time variable */
for i = 1 to N do                     /* N: size of the swarm */
    Initialize vectors V_i and P_i to random values
    P_i^b ← P_i
end for
P_g^b ← best P_i^b, i = 1, …, N       /* initial global best */
Main Loop
while (not termination condition) do
    Evaluation Loop
    for i = 1 to N do
        if f(P_i) is better than f(P_i^b) then     /* f: fitness function */
            P_i^b ← P_i                            /* particle's best position */
        end if
        if f(P_i^b) is better than f(P_g^b) then
            P_g^b ← P_i^b                          /* swarm's best position */
        end if
    end for
    Update Loop
    for i = 1 to N do
        V_i ← w·V_i + γ₁ R(0,1)·(P_g^b − P_i) + γ₂ R(0,1)·(P_i^b − P_i)
        P_i ← P_i + V_i
    end for
    s ← s + 1
end while

This procedure is repeated several times (thus yielding successive generations) until a termination condition is reached. This PSO procedure is sketched in Algorithm 1; a compact numerical sketch is given below.
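The sketch below is a minimal Python implementation of Algorithm 1 for a generic minimization problem (the sphere function and all parameter values are illustrative assumptions, not the settings of Table 1).

import numpy as np

def pso(f, dim, n_particles=30, iters=100, w=0.7, g1=1.5, g2=1.5, bounds=(-5.0, 5.0), seed=0):
    # Minimal PSO following Algorithm 1; returns the best position found for minimizing f
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    P = rng.uniform(lo, hi, (n_particles, dim))          # positions
    V = np.zeros((n_particles, dim))                     # velocities
    Pb = P.copy()                                        # personal bests P_i^b
    fb = np.apply_along_axis(f, 1, Pb)
    Pg = Pb[np.argmin(fb)]                               # global best P_g^b
    for _ in range(iters):
        R1 = rng.random((n_particles, dim))
        R2 = rng.random((n_particles, dim))
        V = w * V + g1 * R1 * (Pg - P) + g2 * R2 * (Pb - P)     # velocity update, Eq. (3)
        P = P + V                                               # position update, Eq. (4)
        fx = np.apply_along_axis(f, 1, P)
        improved = fx < fb
        Pb[improved], fb[improved] = P[improved], fx[improved]  # update personal bests
        Pg = Pb[np.argmin(fb)]                                  # update global best
    return Pg, fb.min()

best_x, best_f = pso(lambda x: float(np.sum(x ** 2)), dim=5)    # sphere test function
print(np.round(best_x, 3), best_f)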

4. TLBO algorithm
One of the most recently developed metaheuristics is the teaching-learning-based optimization (TLBO) algorithm [8]. TLBO is a population-based algorithm inspired by the learning process in a classroom. The search process consists of two phases, i.e. the Teacher Phase and the Learner Phase. In the teacher phase, learners first get knowledge from a teacher, and then from classmates in the learner phase. In the entire population, the best solution is considered as the teacher (X_teacher). In the teacher phase, the teacher tries to enhance the results of the other individuals (X_i) by moving the mean result of the classroom (X_mean) towards his/her own position X_teacher:

X_new = X_i + r·(X_teacher − T_F·X_mean), (5)

where r is a randomly selected number in the range 0 to 1 and T_F is a teaching factor which can be either 1 or 2:

T_F = round[1 + rand(0,1)·(2 − 1)]. (6)

In the second phase, i.e. the learner phase, the learners attempt to increase their information by interacting with the others. An individual learns new knowledge if the other individuals have more knowledge than him/her. Throughout this phase, the student X_i interacts randomly with another student X_j (i ≠ j) in order to improve his/her knowledge. In the case that X_j is better than X_i (i.e. f(X_j) < f(X_i) for minimization problems), X_i is moved toward X_j; otherwise it is moved away from X_j:

X_new = X_i + r·(X_j − X_i)  if f(X_i) > f(X_j), (7)
X_new = X_i + r·(X_i − X_j)  if f(X_i) < f(X_j). (8)

If the new solution X_new is better, it is accepted in the population. The algorithm continues until the termination condition is met. This TLBO procedure is sketched in Algorithm 2.

Algorithm 2 (pseudo-code version of the TLBO algorithm).

Initialization
s ← 0                                                /* s: time variable */
Objective function f(X), X = (x₁, x₂, …, x_d)ᵀ       /* d: no. of design variables */
Generate the initial students of the classroom randomly, X_i, i = 1, 2, …, N   /* N: no. of students */
Calculate the objective function f(X) for all students of the classroom
Main Loop
while (not termination condition) do
    Teacher Phase
    Calculate the mean of each design variable, X_mean
    Identify the best solution (teacher)
    for i = 1 to N do
        Calculate the teaching factor T_F = round[1 + rand(0,1)·(2 − 1)]
        Modify the solution based on the best solution (teacher):
            X_new^i ← X_i + rand(0,1)·[X_teacher − T_F·X_mean]
        Calculate the objective function for the new student, f(X_new^i)
        if X_new^i is better than X_i, i.e. f(X_new^i) < f(X_i)
            X_i ← X_new^i
        end if
        End of Teacher Phase
        Student Phase
        Randomly select another learner X_j, such that j ≠ i
        if X_i is better than X_j, i.e. f(X_i) < f(X_j)
            X_new^i ← X_i + rand(0,1)·(X_i − X_j)
        else
            X_new^i ← X_i + rand(0,1)·(X_j − X_i)
        end if
        if X_new^i is better than X_i, i.e. f(X_new^i) < f(X_i)
            X_i ← X_new^i
        end if
        End of Student Phase
    end for
    s ← s + 1
end while
Post-process results and visualization
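A minimal Python sketch of this teacher/learner iteration on a generic minimization problem is given below (population size, iteration count and the test function are illustrative assumptions).

import numpy as np

def tlbo(f, dim, n_students=30, iters=100, bounds=(-5.0, 5.0), seed=0):
    # Minimal TLBO following Algorithm 2; returns the best solution found for minimizing f
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, (n_students, dim))
    fX = np.apply_along_axis(f, 1, X)
    for _ in range(iters):
        X_mean = X.mean(axis=0)
        teacher = X[np.argmin(fX)]
        for i in range(n_students):
            TF = round(1 + rng.random())                             # teaching factor, 1 or 2, Eq. (6)
            X_new = X[i] + rng.random(dim) * (teacher - TF * X_mean) # teacher phase, Eq. (5)
            if f(X_new) < fX[i]:
                X[i], fX[i] = X_new, f(X_new)
            j = rng.choice([k for k in range(n_students) if k != i])
            if fX[i] < fX[j]:
                X_new = X[i] + rng.random(dim) * (X[i] - X[j])       # move away from the worse learner, Eq. (8)
            else:
                X_new = X[i] + rng.random(dim) * (X[j] - X[i])       # move toward the better learner, Eq. (7)
            if f(X_new) < fX[i]:
                X[i], fX[i] = X_new, f(X_new)
    best = np.argmin(fX)
    return X[best], fX[best]

best_x, best_f = tlbo(lambda x: float(np.sum(x ** 2)), dim=5)
print(np.round(best_x, 3), best_f)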

5. Hybridization of PSO and TLBO
The drawback of PSO is that stochastic approaches have problem-dependent performance. This dependency usually results from the parameter settings of the algorithm; in general, no single parameter setting can be applied to all problems. Increasing the inertia weight (w) increases the speed of the particles, resulting in more exploration (global search) and less exploitation (local search). Thus, finding the best value for this parameter is not an easy task, and it may differ from one problem to another. A further drawback is that the swarm may converge prematurely. Therefore, the PSO performance is problem dependent. This problem-dependent performance can be addressed through a hybrid mechanism, which combines different approaches in order to benefit from the advantages of each. To overcome the limitations of PSO, a hybrid algorithm with TLBO is proposed. TLBO is based on the influence of a teacher on the output of the learners in the class; the teacher is considered a highly learned person who shares knowledge with the learners. However, in practice, teachers' knowledge generally needs to be refreshed. As an advantage, the TLBO algorithm has no parameters to be tuned, but in classic TLBO the "Teachers Refresher" process is not considered. In this paper, we make full use of the advantages of PSO and TLBO by combining them. The PSO-TLBO first runs N PSO populations simultaneously. After M1 iterations, the best particle in each population is selected to constitute an N-individual population for the TLBO operations. The population is then run using the TLBO operators. After M2 iterations, the best solution of TLBO is transmitted back to the PSO populations. We define gap_PSO, gap_TLBO and gap as the iteration counters of the PSO sub-system, the TLBO sub-system and the whole system, respectively, and M1, M2 and GAP_MAX as the corresponding permissible maximum iteration numbers. The algorithm proceeds as follows:

1. Initialize N populations of PSO; set gap = 0.
2. Set gap_PSO = 0.
3. Execute the PSO operators for each particle in each population.
4. Output the best solution and stop if the termination criterion is satisfied.
5. If gap_PSO = M1, then go to step 6; else gap_PSO++, go to step 3.
6. Select the best particle in each PSO population, encode them into a TLBO population, set gap_TLBO = 0.
7. Execute the TLBO operators.
8. Output the best solution and stop if the termination criterion is satisfied.
9. If gap_TLBO = M2 then go to step 10, else gap_TLBO++ and go to step 7.
10. If gap ≥ GAP_MAX then output the best solution and stop; else gap++ and transmit the best individual's information back to each population of PSO, go to step 2.

6. Hybrid PSO-TLBO and FCM

In the proposed algorithm, the hybrid PSO-TLBO is used to find the optimal cluster centers c1, c2, …, cK; the cluster centers that minimize (1) are taken as the optimal cluster centers. The first step is defining the particles. Since our data are images in which each pixel has the three values R, G and B, and the number of clusters K is defined by the user, every particle (chromosome) has K × 3 entries; (1) is used as the fitness function, and the steps of the proposed algorithm follow the hybrid PSO-TLBO procedure described above.

7. Experiments and results

The Mean Squared Error (MSE) is one of the most important criteria for comparing two clustering algorithms. For an image of dimensions m × n, the MSE is calculated as [1]

MSE(I₁, I₂) = (1/(3nm)) Σ_{i=1}^{n} Σ_{j=1}^{m} [(R₁(i,j) − R₂(i,j))² + (G₁(i,j) − G₂(i,j))² + (B₁(i,j) − B₂(i,j))²], (9)

where I₁ is the original image, I₂ is the quantized image, and R(i,j), G(i,j) and B(i,j) are the red, green and blue pixel values at location (i,j). Another criterion is the Peak Signal-to-Noise Ratio (PSNR), described as [1]

PSNR(I₁, I₂) = 10 log₁₀ (255² / MSE(I₁, I₂)). (10)
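A direct Python translation of Eqs. (9)-(10) is given below (the random test images are illustrative only).

import numpy as np

def mse(img1, img2):
    # MSE between two RGB images of shape (n, m, 3), per Eq. (9)
    diff = img1.astype(np.float64) - img2.astype(np.float64)
    return np.mean(diff ** 2)          # averages over all pixels and the 3 channels

def psnr(img1, img2):
    # PSNR in dB, per Eq. (10), for 8-bit images
    return 10.0 * np.log10(255.0 ** 2 / mse(img1, img2))

rng = np.random.default_rng(0)
original = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
quantized = (original // 32) * 32 + 16      # crude uniform quantization for illustration
print(round(psnr(original, quantized), 2))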

The parameters of PSO-TLBO are listed in Table 1.

Table 1 - Parameters of the PSO-TLBO Algorithm
Swarm size (population size)      100
Number of iterations              50
Inertia weight                    0.2
Inertia weight damping ratio      0.9
Learning coefficient              1.5

In Table 2, the algorithms are compared by PSNR for the Lena, Peppers and Baboon images for different values of K. The Pepper color quantization is depicted in Figures 1, 2 and 3. The results demonstrate the superior performance of the proposed algorithm in comparison with other color quantization algorithms.

Table 2 - Comparison of algorithms by PSNR for the Lena, Peppers and Baboon images for different K

K     IMAGE     CHENG    SOM      FS-SOM   PROPOSED ALGORITHM
16    Lena      27.29    29.61    29.67    29.69
16    Pepper    23.61    26.70    26.70    26.95
16    Baboon    23.61    24.85    24.89    24.93
32    Lena      30.43    32.09    32.12    32.19
32    Pepper    25.43    29.10    29.22    29.34
32    Baboon    26.13    27.05    27.11    27.17
64    Lena      32.39    34.23    34.25    34.34
64    Pepper    27.75    31.40    31.59    31.55
64    Baboon    27.82    29.01    29.10    29.19
128   Lena      33.71    36.08    36.15    36.29
128   Pepper    30.33    33.48    33.65    33.66
128   Baboon    29.56    30.91    31.04    31.13

Figure 1 - Pepper original image
Figure 2 - Quantized image with SOM
Figure 3 - Quantized image with FCM-PSO-TLBO

8. Conclusion

In this paper a clustering algorithm was presented, which is a hybrid of the TLBO and Particle Swarm Optimization algorithms with the FCM algorithm. This algorithm was compared with several color quantization algorithms. The results demonstrate the superiority of the proposed algorithm over the other algorithms.

9. References
1. L. Brun, A. Trémeau, Digital Color Imaging Handbook, CRC Press, 2002, Ch. Color Quantization, pp. 589-638.
2. Celebi, M.E., Wen, Q., Chen, J., "Color quantization using c-means clustering algorithms", 2011 18th IEEE International Conference on Image Processing (ICIP), pp. 1729-1732, 2011.
3. G. Schaefer, H. Zhou, "Fuzzy clustering for color reduction in images", Telecommunication Systems, pp. 17-25, 2009.
4. A. Dekker, "Kohonen neural networks for optimal color quantization", Network: Computation in Neural Systems, 5, pp. 351-367, 1994.
5. L. Nolle, G. Schaefer, "Color map design through optimization", Engineering Optimization, 39(3), pp. 327-343, 2007.
6. D.E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning. Reading, MA: Addison-Wesley, 1989.
7. J. Kennedy, R.C. Eberhart, "Particle swarm optimization", Proceedings of IEEE International Conference on Neural Networks, vol. 4, 1995.
8. Rao, R. V., Savsani, V. J. & Vakharia, D. P. (2012). Teaching-learning-based optimization: An optimization method for continuous non-linear large scale problems. Information Sciences, Vol. 183, pp. 1-15.


Tuning of PID Controller using PSO-TLBO Algorithm

S.Milad.Nayyer sabeti1, MR.Deevband2

1- Department of Medical Physics and Biomedical engineering, Shahid Beheshti University of Medical Sciences, Tehran, Iran

2- Department of Medical Physics and Biomedical engineering, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Corresponding Author’s [email protected]

Abstract

This paper presents a new approach for tuning a PID controller. The PID controller has advantages such as a good response, a simple structure and robustness, but it also has disadvantages: optimally tuning the PID gains is quite difficult, and it cannot adapt to a nonlinear state of the plant. In this paper a hybrid of PSO and TLBO is proposed for autotuning a PID. The results demonstrate the superior performance of the proposed method in comparison with other PID controllers. Keywords: PID Controller, TLBO, PSO, Step Response

1. Introduction

The PID controller has been used extensively in controlling industrial plants. A PID controller tries to correct the error between the measured outputs and the desired outputs of the process so that the transient and steady-state responses are improved as much as possible. In a PID controller, three variables are involved in the tuning process (Kp, Ki and Kd). The Ziegler-Nichols tuning formula is one of the best known tuning methods [1]. In [2], NSGA-II is used for tuning a PID controller, and in [3] a fuzzy system with a GA algorithm is used for autotuning a PID. In this paper, the PID controller and Ziegler-Nichols tuning are reviewed in section 2, and sections 3, 4 and 5 discuss PSO, TLBO and the hybrid PSO-TLBO, respectively. Finally, in section 6, we present the experiments and results.

2. A brief description of the PID controller

PID control is one of the earliest control strategies, namely classical control, whose typical structure is shown in Figure 1. C(s) is the controller and P(s) is the plant to be controlled. For a PID controller, C(s) is defined as

C(s) = Kp + Ki/s + Kd·s, (1)

where Kp, Ki and Kd are the proportional, integral and derivative gains, respectively. The output of the controller is

u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ + Kd de(t)/dt. (2)

Kp, Ki and Kd can be tuned by the Ziegler-Nichols tuning formula, which is obtained when the plant model is given a step input.

Figure 1. Block diagram of the control system: the controller C(s) receives the error e(t) between the reference r(t) and the measured output and produces the control signal u(t) for the plant P(s), whose output y(t) is affected by a disturbance d(t) and measurement noise.


The Ziegler-Nichols tuning method is a heuristic method of tuning a PID controller. It is performed by setting the I and D gains to zero. The P gain Kp is then increased from zero until it reaches the ultimate gain Ku, at which the output of the control loop shows stable and consistent oscillations. Ku and the oscillation period Tu are then used to set the P, I and D gains depending on the type of controller used:

Table 1. Ziegler-Nichols tuning formula

Controller type | kP      | ki     | kd
P               | 0.5 Ku  | -      | -
PI              | 0.45 Ku | Tu/1.2 | -
PD              | 0.8 Ku  | -      | Tu/8
PID             | 0.6 Ku  | Tu/2   | Tu/8

3. PSO Algorithm

PSO was developed by Kennedy and Eberhart, who were inspired by research on artificial life [4]. PSO starts with a group of random candidate solutions (particles) and then searches for the optimal solution in the problem space by updating the particle positions. Each particle is multi-dimensional (depending on the nature of the problem) and is specified by P_id and V_id, which denote the position and velocity of the d-th dimension of particle i. In every step of the swarm movement, the position of each particle is updated using two best values. The first is the best solution, according to fitness, obtained so far by each particle individually, called the personal best P^b_i. The other is the best value obtained so far by all particles of the whole swarm, called the global best P^g. If a neighbourhood is defined for each particle, this value is computed within that neighbourhood and is then called the local best P^l_i. In every iteration of the algorithm, after finding P^g and P^b_i, the new velocity and position of each particle are updated by (3) and (4) [5]:

V_id(s+1) = w·V_id(s) + γ1·R1·[P^g_d(s) − P_id(s)] + γ2·R2·[P^b_id(s) − P_id(s)]    (3)

P_id(s+1) = P_id(s) + V_id(s+1)    (4)

where P_id(s) and V_id(s) are the position and velocity of particle i at time s, and w is the inertia weight, which decides how much the old velocity affects the new one. The coefficients γ1 and γ2 are constants called learning factors, which decide the degree of attraction towards P^g and P^b_i: γ1 weights the "social" component, while γ2 represents the "cognitive" component, accounting for the memory of an individual particle over time. Two random numbers, R1 and R2, uniformly distributed on [0, 1], are included to enrich the search. Finally, a fitness function must be given to evaluate the quality of a position.

This procedure is repeated over successive iterations (generations) until a termination condition is reached. The flow chart of the standard PSO is shown in Figure 2.
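For illustration, a minimal NumPy sketch of one PSO iteration implementing equations (3) and (4) for a minimisation problem is given below; the helper name pso_step is an assumption introduced here, and the default coefficient values follow those later listed in Table 2.

```python
import numpy as np

def pso_step(fitness, pos, vel, pbest, gbest,
             w=0.2, g1=1.5, g2=1.5, rng=None):
    """One PSO iteration, Eqs. (3)-(4), for a minimisation problem.
    pos, vel, pbest: arrays of shape (n_particles, dim); gbest: shape (dim,).
    w and the learning factors g1, g2 default to the Table 2 values (assumption:
    the single learning coefficient 1.5 is used for both terms)."""
    rng = rng or np.random.default_rng()
    r1 = rng.random(pos.shape)                 # R1, R2 ~ U[0, 1]
    r2 = rng.random(pos.shape)
    vel = w * vel + g1 * r1 * (gbest - pos) + g2 * r2 * (pbest - pos)   # Eq. (3)
    pos = pos + vel                                                     # Eq. (4)
    f_pos = np.array([fitness(x) for x in pos])
    f_pb = np.array([fitness(x) for x in pbest])
    improved = f_pos < f_pb                    # update personal bests
    pbest = np.where(improved[:, None], pos, pbest)
    gbest = pbest[np.argmin([fitness(x) for x in pbest])]   # update global best
    return pos, vel, pbest, gbest
```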

4. TLBO Algorithm

One of the most recently developed metaheuristics is the teaching-learning-based optimization (TLBO) algorithm [6]. TLBO is a population-based algorithm inspired by the learning process in a classroom. The search consists of two phases, the Teacher Phase and the Learner Phase: learners first get knowledge from a teacher and then from their classmates. In the entire population, the best solution is considered the teacher, X_teacher. In the teacher phase, the teacher tries to enhance the results of the other individuals X_i by moving the mean result of the classroom, X_mean, towards his/her own position X_teacher:

X_new = X_i + r·(X_teacher − T_F·X_mean)    (5)

where r is a random number in the range [0, 1] and T_F is a teaching factor which can be either 1 or 2:

T_F = round[1 + rand(0, 1)·(2 − 1)]    (6)

In the second phase, i.e. the learner phase, the learners attempt to increase their information by interacting with others. Therefore, an individual learns new knowledge if the other individuals have more knowledge than him/her.


Throughout this phase, the student X_i interacts randomly with another student X_j (i ≠ j) in order to improve his/her knowledge. If X_j is better than X_i (i.e. f(X_j) < f(X_i) for minimization problems), X_i is moved toward X_j; otherwise it is moved away from X_j:

X_new = X_i + r·(X_j − X_i)   if f(X_i) > f(X_j)    (7)
X_new = X_i + r·(X_i − X_j)   if f(X_i) < f(X_j)    (8)

If the new solution X_new is better, it is accepted into the population. The algorithm continues until the termination condition is met. The flow chart of the standard TLBO is shown in Figure 3.
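Along the same lines, a minimal NumPy sketch of one TLBO iteration (teacher phase, equations (5) and (6), followed by the learner phase, equations (7) and (8)); the name tlbo_step is an assumption introduced here.

```python
import numpy as np

def tlbo_step(fitness, X, rng=None):
    """One TLBO iteration for a minimisation problem. X: array (n_learners, dim)."""
    rng = rng or np.random.default_rng()
    X = X.copy()
    n, d = X.shape
    f = np.array([fitness(x) for x in X])
    # teacher phase: the best learner acts as the teacher
    teacher = X[np.argmin(f)]
    mean = X.mean(axis=0)
    TF = int(rng.integers(1, 3))                               # teaching factor, 1 or 2 (Eq. 6)
    X_new = X + rng.random((n, d)) * (teacher - TF * mean)     # Eq. (5)
    f_new = np.array([fitness(x) for x in X_new])
    improved = f_new < f                                       # greedy acceptance
    X[improved], f[improved] = X_new[improved], f_new[improved]
    # learner phase: each learner interacts with a random classmate
    for i in range(n):
        j = int(rng.integers(n))
        if j == i:
            continue
        r = rng.random(d)
        step = (X[j] - X[i]) if f[j] < f[i] else (X[i] - X[j])  # Eqs. (7)-(8)
        cand = X[i] + r * step
        fc = fitness(cand)
        if fc < f[i]:
            X[i], f[i] = cand, fc
    return X
```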

Figure 2. Flowchart of the standard PSO.

Figure 3. Flowchart of the standard TLBO.

5. Hybrid PSO-TLBO

The drawback of PSO is that stochastic approaches have problem-dependent performance, which usually results from the parameter settings of the algorithm. In general, no single parameter setting can be applied to all problems. Increasing the inertia weight w increases the speed of the particles, resulting in more exploration (global search) and less exploitation (local search); thus, finding the best value for this parameter is not an easy task, and it may differ from one problem to another. A further drawback is that the swarm may converge prematurely. It can therefore be concluded that the performance of PSO is problem-dependent. This problem-dependent performance can be addressed through a hybrid mechanism, which combines different approaches in order to benefit from the advantages of each. To overcome the limitations of PSO, a hybrid algorithm with TLBO is proposed. TLBO is based on the influence of a teacher on the output of the learners in a class. The teacher is considered a highly learned person who shares knowledge with the learners; in practice, however, a teacher's knowledge generally needs to be refreshed. As an advantage, the TLBO algorithm has no parameters to be tuned, but in classic TLBO the "teacher refresher" process is not considered. In this paper, we make full use of the advantages of PSO and TLBO by combining them. PSO-TLBO first runs N PSO populations simultaneously. After M1 iterations, the best particle of each population is selected to constitute an N-individual population for the TLBO operations. This population is then run using the TLBO operators. After M2 iterations, the best solution of TLBO is transmitted back to the PSO populations. We define gap_PSO, gap_TLBO and gap as the iteration counters of the PSO subsystem, the TLBO subsystem and the whole system, respectively, and M1, M2 and GAP_MAX as the corresponding maximum permissible iteration numbers. The algorithm proceeds as follows (a code sketch is given after the list):

1. Initialize N populations of PSO; set gap = 0.
2. Set gap_PSO = 0.
3. Execute the PSO operators for each particle in each population.
4. Output the best solution and stop if the termination criterion is satisfied.
5. If gap_PSO = M1, go to step 6; else gap_PSO++ and go to step 3.
6. Select the best particle in each PSO population, encode them into a TLBO population; set gap_TLBO = 0.
7. Execute the TLBO operators.
8. Output the best solution and stop if the termination criterion is satisfied.
9. If gap_TLBO = M2, go to step 10; else gap_TLBO++ and go to step 7.
10. If gap ≥ GAP_MAX, output the best solution and stop; else gap++, transmit the best individual's information back to each PSO population, and go to step 2.
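A compact Python sketch of this loop, reusing the pso_step and tlbo_step sketches from sections 3 and 4; the function name pso_tlbo, the default sizes and the search bounds are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def pso_tlbo(fitness, n_pops=5, pop_size=20, dim=3, lo=0.0, hi=5.0,
             M1=10, M2=10, gap_max=5, seed=0):
    """Hybrid PSO-TLBO loop following steps 1-10 above (a fixed iteration budget
    is used here instead of an explicit termination test, for brevity)."""
    rng = np.random.default_rng(seed)
    pops = [rng.uniform(lo, hi, (pop_size, dim)) for _ in range(n_pops)]   # step 1
    vels = [np.zeros((pop_size, dim)) for _ in range(n_pops)]
    pbests = [p.copy() for p in pops]
    gbests = [p[np.argmin([fitness(x) for x in p])] for p in pops]
    for _ in range(gap_max):                                               # gap loop (step 10)
        for _ in range(M1):                                                # steps 2-5: PSO phase
            for k in range(n_pops):
                pops[k], vels[k], pbests[k], gbests[k] = pso_step(
                    fitness, pops[k], vels[k], pbests[k], gbests[k], rng=rng)
        elite = np.array(gbests)                                           # step 6: best of each population
        for _ in range(M2):                                                # steps 7-9: TLBO phase
            elite = tlbo_step(fitness, elite, rng=rng)
        best = elite[np.argmin([fitness(x) for x in elite])]
        for k in range(n_pops):                                            # step 10: feed the best back
            pops[k][0] = best.copy()
            pbests[k][0] = best.copy()
    return best
```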

6. Experiments and results

The output of the system has several important characteristics. As in [7], to design an optimal PID controller and to make a comparison, we use the total cost function

Z_T = Z_Mp + Z_Tr + Z_Ts + Z_IAE    (9)

These characteristics are defined as follows (a small code sketch for computing them is given after the list):
1. Mp: the overshoot, Mp = y_max − y_ss, where y_max is the maximum value of y and y_ss is its steady-state value.
2. Tr: the time required for the step response to rise from 10 to 90 percent of its final value.
3. Ts: the time required for the step response to stay within 2 percent of its final value.
4. IAE: IAE = ∫_0^∞ |e(t)| dt (3×Ts is an acceptable approximation of the real value of IAE).
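A small NumPy sketch of computing these four characteristics and the total cost of equation (9) from a sampled unit-step response; the helper names step_metrics and total_cost are assumptions introduced here.

```python
import numpy as np

def step_metrics(t, y):
    """Mp, Tr, Ts and IAE of a sampled unit-step response (t, y), per items 1-4 above."""
    y = np.asarray(y, dtype=float)
    y_ss = y[-1]                                   # steady-state value (assumed reached)
    mp = max(y.max() - y_ss, 0.0)                  # overshoot Mp = ymax - yss
    t10 = t[np.argmax(y >= 0.1 * y_ss)]            # rise time: 10% -> 90% of final value
    t90 = t[np.argmax(y >= 0.9 * y_ss)]
    tr = t90 - t10
    outside = np.abs(y - y_ss) > 0.02 * abs(y_ss)  # +/-2% settling band
    ts = t[np.where(outside)[0][-1]] if outside.any() else t[0]
    iae = np.trapz(np.abs(1.0 - y), t)             # IAE for a unit step reference
    return mp, tr, ts, iae

def total_cost(t, y):
    return sum(step_metrics(t, y))                 # Z_T = Z_Mp + Z_Tr + Z_Ts + Z_IAE, Eq. (9)
```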

As in [7], P(s) is defined as

P(s) = 4.228 / [(s + 0.5)(s² + 1.64s + 8.456)]
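For completeness, a SciPy-based sketch of obtaining the closed-loop unit-step response of this plant with a candidate PID controller under unity feedback; the helper name closed_loop_step is an assumption, and total_cost refers to the earlier sketch.

```python
import numpy as np
from scipy import signal

def closed_loop_step(Kp, Ki, Kd, t_end=20.0):
    """Unit-step response of the unity-feedback loop C(s)P(s)/(1 + C(s)P(s)),
    with P(s) defined above and C(s) from Eq. (1)."""
    num_p = [4.228]
    den_p = np.polymul([1, 0.5], [1, 1.64, 8.456])     # (s+0.5)(s^2+1.64s+8.456)
    num_c, den_c = [Kd, Kp, Ki], [1, 0]                # C(s) = (Kd s^2 + Kp s + Ki)/s
    num_ol = np.polymul(num_c, num_p)                  # open loop C(s)P(s)
    den_ol = np.polymul(den_c, den_p)
    den_cl = np.polyadd(den_ol, num_ol)                # closed-loop denominator
    t = np.linspace(0.0, t_end, 4000)
    t, y = signal.step(signal.TransferFunction(num_ol, den_cl), T=t)
    return t, y

# Example (illustrative): cost of the PSO-TLBO gains reported in Table 3
# t, y = closed_loop_step(3.440, 2.1665, 4.231); print(total_cost(t, y))
```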

Figure 4 shows the step responses of the system for the Ziegler-Nichols and PSO-TLBO controllers. The overshoot and settling time with the PSO-TLBO method are smaller than with Ziegler-Nichols. The parameters used for tuning the PID controller with the PSO-TLBO algorithm are listed in Table 2.

Table 2. Parameters of the PSO-TLBO algorithm

Parameter | Value
Swarm size (population size) | 100
Number of iterations | 50
Inertia weight | 0.2
Inertia weight damping ratio | 0.9
Learning coefficient | 1.5

In Table 3, the four controllers obtained with the Ziegler-Nichols method, ACO, continuous GA and the PSO-TLBO algorithm are listed and compared. The results indicate that PSO-TLBO yields the lowest total cost.


Table 3. The controllers and the characteristics of their step responses

Controller       | Z_Mp   | Z_Tr | Z_Ts | Z_IAE  | Z_T    | Parameters [Kp, Ki, Kd]
GA               | 0.1762 | 0.31 | 2.86 | 0.8669 | 4.2131 | [3.369, 2.125, 4.204]
ACO              | 0.156  | 0.62 | 4.99 | 0.8861 | 6.6525 | [2.517, 2.219, 1.151]
PSO-TLBO         | 0.1828 | 0.29 | 2.80 | 0.8890 | 4.1618 | [3.440, 2.1665, 4.231]
Ziegler-Nichols  | 0.165  | 0.73 | 5.37 | 0.9595 | 7.2242 | [2.19, 2.126, 0.565]

Figure 4. Step response of the system for the Ziegler-Nichols and PSO-TLBO controllers

7. Conclusion

We proposed a hybrid of PSO and TLBO for autotuning a PID controller. The results demonstrate the superior performance of the proposed method in comparison with other PID controllers.

8. References
1. J. G. Ziegler and N. B. Nichols, "Optimum Settings for Automatic Controllers", Trans. ASME, vol. 64, 1942, pp. 759-768.
2. A. Herreros, E. Baeyens, J. R. Perán, "Design of PID-type controllers using multiobjective genetic algorithms", ISA Transactions 41 (2002) 457-472.
3. R. Bandyopadhyay, U. K. Chakraborty, D. Patranabis, "Autotuning a PID controller: A fuzzy-genetic approach", Journal of Systems Architecture 47 (2001) 663-673.
4. D. E. Goldberg, Genetic Algorithms in Search, Optimization & Machine Learning. Reading, MA: Addison Wesley, 1989.
5. J. Kennedy, R. C. Eberhart, "Particle swarm optimization", Proceedings of IEEE International Conference on Neural Networks, vol. 4, 1995.
6. Rao, R. V., Savsani, V. J. & Vakharia, D. P. (2012). Teaching-learning-based optimization: An optimization method for continuous non-linear large scale problems. Information Sciences, Vol. 183, pp. 1-15.
7. Y. T. Hsiao, C. L. Chuang and C. C. Chien, "Ant Colony Optimization for Designing of PID Controllers", 2004 IEEE International Symposium on Computer Aided Control Systems Design, September 2004, Taipei, Taiwan.



A new fuzzy rule-based approach for automatic facial expression recognition

Mohamad Roshanzamir 1, Ahmad Reza Naghsh Nilchi 2, Mahdi Roshanzamir 3

1- Electronic and Computer Faculty, Fasa Branch, Islamic Azad University, Fasa, Iran 2- Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran

3- Electronic and Computer Faculty, University of Tabriz, Tabriz, Iran

[email protected]

Abstract
One of the most important channels of human interaction is the expression of emotion, and the most common way of expressing emotion is facial expression. In artificial intelligence, automatic facial expression recognition is the basic step in automatic interaction between humans and computers. Many different methods have been proposed recently; none of them performs the task ideally and completely automatically, but different methods try to improve or solve parts of it. In this research, in addition to identifying some new face deformations that occur when a facial expression is shown, a new fuzzy rule-based method is proposed to classify features extracted from image sequences of facial expressions and to detect the facial expression. Motion vectors are the features extracted from the image sequences and used for classification; they are extracted using an optical flow algorithm and classified using a Sugeno-type fuzzy inference system. Experimental results show that the performance of the method is 81.22%.
Keywords: automatic facial expression recognition, optical flow, Sugeno type fuzzy inference system.

1. Introduction

One of the most important topics in the field of artificial intelligence is intelligent interaction between humans and computers. Indeed, a computer, as a virtual person, must be able to interact with humans and perceive their emotions. One of the most direct methods for understanding emotion is facial expression recognition. Other methods, such as speaking tone or body language, can be used too; however, facial expression seems to be the best.

Different methods have been developed for automatic facial expression recognition [1-3]. Optical flow algorithms and fuzzy methods have also been used in some studies [4-8]. One of the most important problems in automatic facial expression recognition is the uncertain nature of the subject: human beings do not show the same emotion in exactly the same form; the expressions differ to some extent. It is therefore preferable not to use deterministic systems to solve this problem. Moreover, extracting the data is not easy and the data are always noisy. These two problems show that it is essential to offer a probabilistic or fuzzy model to process such uncertain data. In this article, a Sugeno fuzzy inference system is introduced to handle these uncertainties and classify the extracted data into the correct class of facial expression.

This article is organized in six sections. Section 2 introduces the database used in this article. In sections 3 and 4, the suggested algorithm is described in detail, and in section 5 the implementation results are analyzed. Finally, the conclusion and future works are presented in section 6.

2. Database

From a psychological point of view, emotions are categorized into six main groups [9]: happiness, sadness, surprise, anger, disgust and fear. An overview of psychological work on basic emotions can be found in [10]. These emotions lead to changes in the face because of contractions of the facial muscles. It must be considered that emotions and facial expressions are two different concepts; however, it is possible to define a correspondence between them. When a person feels an emotion, his/her face shows some specific changes. Bassili [11] describes these changes as in figure 1.


Figure 1: Bassili's description of face deformation for each emotion (happiness, sadness, surprise, fear, anger and disgust) [11]

In this research, the Cohn-Kanade (CK) facial expression database [12] is used. It is one of the most common and comprehensive databases and contains 486 image sequences from 97 posers; 65% of them are female, 15% are African-American, 3% are Asian or Latino, and the others are Americans.

The image sequences in this database start from a neutral expression and end at the peak of one of the six basic emotions. Not all six emotions were available for every subject. In addition to the face deformations represented in figure 1, our investigation of the Cohn-Kanade database shows that there are six more deformations in some emotion representations. We show these deformations in figure 2.

Figure 2: Other types of face deformation in different emotions (Happiness Type 2, Sadness Type 2, Sadness Type 3, Sadness Type 4, Anger Type 2, Fear Type 2)

3. Facial data extraction and presentation

There are different algorithms for extracting features from an image or an image sequence of a facial expression. In this research, an image sequence of a facial expression is used as the system input; an example of the system input can be seen in figure 3.a. This image sequence is given to an optical flow algorithm [13]. The output is a collection of motion vectors which show the deformations created in the face because of an emotion. Figure 3.b illustrates the result of applying the optical flow algorithm to the image sequence of figure 3.a.

Figure 3: (a) Image sequence of happiness. (b) Happiness motion vectors.


As mentioned previously, showing an emotion causes some deformations in the face. These deformations are extracted as a collection of motion vectors using the optical flow algorithm. The next step is the classification of the motion vectors into one of the six basic emotions.

4. Fuzzy facial expression recognition

Classification of the motion vectors is the main issue in this research. First, the face is segmented into eight areas as shown in figure 4. The reason for this segmentation can be explained according to figures 2 and 3: it is done so that all motion vectors in an area have approximately the same direction. To divide the face into eight areas, the eye and mouth locations must be determined first. This can be done automatically [14, 15] or manually; in this research, it is done manually. Lines 1 and 2 determine the eye and mouth locations, respectively. Line 3 is the bisector of line 1. Lines 4 and 5 connect the intersection of lines 2 and 3 to the lower corners of the image. It is not necessary to determine the precise locations of the lines; approximate locations are enough.

Figure 4: Face segmentation into 8 areas

After face segmentation, the area of each motion vector is determined. Then three features are extracted from these motion vectors: 1) the number of motion vectors in each area, 2) the average size of the motion vectors in each area horizontally and 3) the average size of the motion vectors in each area vertically. In total, 24 features are extracted. As the face is symmetric, only one side of the face (the left side) is considered; consequently, the number of features is reduced to 12.
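A minimal NumPy sketch of this feature-extraction step, assuming each motion vector has already been assigned to one of the eight areas; the function name area_features and the input format are assumptions made here for illustration.

```python
import numpy as np

def area_features(vectors, areas, n_areas=8):
    """vectors: array (n, 2) of (dx, dy) motion vectors; areas: array (n,) of area
    indices 0..n_areas-1. Returns (count %, mean dx, mean dy) for each area."""
    feats = []
    n_total = len(vectors)
    for a in range(n_areas):
        v = vectors[areas == a]
        if len(v) == 0:
            feats.append((0.0, 0.0, 0.0))
        else:
            feats.append((100.0 * len(v) / n_total,   # percentage of vectors in the area
                          v[:, 0].mean(),             # average horizontal size
                          v[:, 1].mean()))            # average vertical size
    # 8 areas x 3 features = 24; keeping only the left side halves this to 12
    return np.array(feats)
```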

To classify these 12 features, a Sugeno-type fuzzy inference system is used. The Sugeno, or Takagi-Sugeno-Kang, method of fuzzy inference was introduced in 1985 [16]. The main difference between a regular and a Sugeno-type fuzzy inference system is that the Sugeno output membership functions are either linear or constant. In this research, constant outputs are used; the outputs are the six main emotions.
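For illustration only, a minimal sketch of how constant-output Sugeno rules can be evaluated and turned into a class label; the rule format, the helper name sugeno_classify and the accumulate-and-argmax step are assumptions made here and do not reproduce the actual rule base of Table 2.

```python
import numpy as np

EMOTIONS = ["happiness", "sadness", "surprise", "fear", "anger", "disgust"]

def sugeno_classify(features, rules):
    """features: dict mapping input names (VN1, VX1, ...) to crisp values.
    rules: list of (antecedent, emotion), where antecedent maps an input name to a
    membership function (e.g. the zmf/pimf/smf of Eqs. (1)-(3) below). The firing
    strength of a rule is the product of its antecedent degrees; the emotion with
    the largest accumulated strength is returned."""
    strength = dict.fromkeys(EMOTIONS, 0.0)
    for antecedent, emotion in rules:
        w = np.prod([mf(features[name]) for name, mf in antecedent.items()])
        strength[emotion] += w
    return max(strength, key=strength.get)

# Illustrative rule shape (not from Table 2):
# rules = [({"VN1": mf_high_vn1, "VY1": mf_pos_vy1}, "happiness"), ...]
```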

Input variable characteristics such as membership functions, their ranges, types and parameters as well as fuzzy rules must be determined. To determine them, some essential points are considered:

1. If there are only a few motion vectors in an area, their directions are not important; so, when rules are written for such areas, only the number of vectors is considered.

2. It is not appropriate to use the same membership functions for the number of motion vectors in different areas, because the sizes of the areas differ from each other. In this article, the membership functions are defined based on the extracted motion vectors and according to the properties of the normal distribution. In a normal distribution, as illustrated in figure 5, 68% of the data lie in the μ±σ interval and 99.7% lie in the μ±3σ interval. The membership function types, named zmf, pimf and smf, are defined in equations 1, 2 and 3, respectively. Their parameters are also defined in these equations using the mean (μ) and standard deviation (σ) of the extracted motion vectors.

3. Deformation of the face is almost symmetric, and the extracted motion vectors confirm this fact. So, in the fuzzy rules only areas 1, 2, 3 and 4 are considered.

4. The difference between the numbers of extracted motion vectors in areas 3 and 4 is not significant when different facial expressions are represented. So, only their lengths are considered in the fuzzy rules and their numbers are not used. Consequently, the number of features is reduced to ten.


Figure 5: For the normal distribution, the values less than one standard deviation away from the mean account for 68.27% of the set; while two standard deviations from the mean account for 95.45%; and three standard deviations account for 99.73%.

zmf(x; a, b) =
  1,                              x ≤ a
  1 − 2[(x − a)/(b − a)]²,        a ≤ x ≤ (a + b)/2
  2[(x − b)/(b − a)]²,            (a + b)/2 ≤ x ≤ b
  0,                              x ≥ b
with a = μ + σ, b = μ + 3σ    (1)

pimf(x; a, b, c, d) =
  0,                              x ≤ a
  2[(x − a)/(b − a)]²,            a ≤ x ≤ (a + b)/2
  1 − 2[(x − b)/(b − a)]²,        (a + b)/2 ≤ x ≤ b
  1,                              b ≤ x ≤ c
  1 − 2[(x − c)/(d − c)]²,        c ≤ x ≤ (c + d)/2
  2[(x − d)/(d − c)]²,            (c + d)/2 ≤ x ≤ d
  0,                              x ≥ d
with a = μ − 3σ, b = μ − σ, c = μ + σ, d = μ + 3σ    (2)

smf(x; a, b) =
  0,                              x ≤ a
  2[(x − a)/(b − a)]²,            a ≤ x ≤ (a + b)/2
  1 − 2[(x − b)/(b − a)]²,        (a + b)/2 ≤ x ≤ b
  1,                              x ≥ b
with a = μ − 3σ, b = μ − σ    (3)
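A small NumPy sketch of these three membership functions, following the reconstructed shapes of equations (1)-(3); the example parameter choices in the closing comment mirror those equations.

```python
import numpy as np

def smf(x, a, b):
    """S-shaped membership function: 0 below a, 1 above b (Eq. 3 shape)."""
    x = np.asarray(x, dtype=float)
    m = (a + b) / 2.0
    return np.where(x <= a, 0.0,
           np.where(x <= m, 2 * ((x - a) / (b - a)) ** 2,
           np.where(x <= b, 1 - 2 * ((x - b) / (b - a)) ** 2, 1.0)))

def zmf(x, a, b):
    """Z-shaped membership function: 1 below a, 0 above b (Eq. 1 shape)."""
    return 1.0 - smf(x, a, b)

def pimf(x, a, b, c, d):
    """Pi-shaped membership function: rises on [a, b], flat on [b, c], falls on [c, d] (Eq. 2)."""
    return np.minimum(smf(x, a, b), zmf(x, c, d))

# Example, per Eqs. (1)-(3), for an input with sample mean mu and std sigma:
#   zmf(x, mu + sigma, mu + 3*sigma)
#   pimf(x, mu - 3*sigma, mu - sigma, mu + sigma, mu + 3*sigma)
#   smf(x, mu - 3*sigma, mu - sigma)
```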

5. Experimental results

The suggested method was tested on 339 randomly selected image sequences of the CK facial expression database; eight images of each sequence were used for each test. It was implemented in MATLAB 2012. The rules must be applied to the motion vectors extracted from the left side of the face. It is very easy to apply them to the motion vectors extracted from the right side of the face: only the sign of the average horizontal size of the motion vectors must be inverted, while the two other features, i.e. the number of motion vectors and their average vertical size, are unchanged. It was therefore possible to test the proposed algorithm on the 339 cases. The membership function parameters were extracted using the mean and standard deviation of the motion vectors extracted from the left side of the face as the training data; their values, together with the other input variable characteristics, are shown in table 1. The system was then tested on the motion vectors extracted from the right side of the face as the test data, using the fuzzy rules represented in table 2. Table 3 specifies the number of cases for each emotion and the result of applying the proposed algorithm. If the different types of each emotion are not distinguished, the performance improves considerably; in this situation there is no difference between the various types of an emotion, and the confusion matrix is as shown in table 4, with an overall performance of 81.22%.


Table 1: Input variable characteristics

Name | Description | Range | No. of MFs | Membership functions (name, type, parameters)
VN1 | Percentage of motion vector number in area #1 | [0 60] | 3 | Low zmf [11.75 16.75]; Medium pimf [3.25 10.5 24.5 32]; High smf [3.25 10.5 24.5 32]
VX1 | Average size of the motion vectors horizontally in area #1 | [-1.5 1.5] | 2 | Zero zmf [0.18 0.5]; Pos smf [-0.23 0.08]
VY1 | Average size of the motion vectors vertically in area #1 | [-3 3] | 3 | Neg zmf [-0.75 0.3]; Zero pimf [-1.6 -0.6 0.4 1.4]; Pos smf [-0.2 0.3]
VN2 | Percentage of motion vector number in area #2 | [0 60] | 3 | Low zmf [17 22]; Medium pimf [4.35 10.3 22.3 28.2]; High smf [15.5 20.5]
VX2 | Average size of the motion vectors horizontally in area #2 | [-1.5 1.5] | 2 | Neg zmf [-0.15 0.2]; Zero smf [-0.4 0.09]
VY2 | Average size of the motion vectors vertically in area #2 | [-3 3] | 2 | Neg zmf [-0.01 1]; Pos smf [-0.4 0.03]
VX3 | Average size of the motion vectors horizontally in area #3 | [-1.5 1.5] | 3 | Neg zmf [-0.2 0.25]; Zero pimf [-0.5 -0.2 0.2 0.5]; Pos smf [-0.45 0.02]
VY3 | Average size of the motion vectors vertically in area #3 | [-3 3] | 2 | Neg zmf [-0.2 0.5]; Pos smf [-1 -0.05]
VX4 | Average size of the motion vectors horizontally in area #4 | [-1.5 1.5] | 2 | Neg zmf [-0.04 0.35]; Zero smf [-0.497 -0.197]
VY4 | Average size of the motion vectors vertically in area #4 | [-3 3] | 3 | Neg zmf [-0.07 0.7]; Pos pimf [-0.6 0.2 0.95 1.73]; VPos smf [-0.12 0.95]

Table 2: Fuzzy Rules

Table 3: Confusion matrix and total number of cases for each emotion. Rows show the actual class; columns show the predicted class. HA, SD, SU, FE, AN, DI, HAT2, SDT2, SDT3, SDT4, ANT2 and FET2 denote happiness, sadness, surprise, fear, anger, disgust, happiness type 2, sadness types 2, 3 and 4, anger type 2 and fear type 2, respectively.

      | HA | SD | SU | FE | AN | DI | HAT2 | SDT2 | SDT3 | SDT4 | ANT2 | FET2 | Total
HA    | 43 |  0 |  0 |  3 |  0 |  2 |   2  |   0  |   0  |   0  |   0  |   0  | 50
SD    |  1 |  5 |  0 |  1 |  0 |  0 |   0  |   0  |   0  |   1  |   1  |   0  | 9
SU    |  5 |  2 | 56 |  3 |  0 |  0 |   0  |   0  |   1  |   0  |   0  |   0  | 67
FE    |  3 |  0 |  1 | 23 |  0 |  0 |   2  |   0  |   0  |   0  |   4  |   1  | 34
AN    |  1 |  2 |  0 |  0 |  1 |  1 |   0  |   0  |   0  |   0  |   9  |   0  | 14
DI    |  2 |  2 |  0 |  1 |  1 | 35 |   0  |   0  |   0  |   0  |   4  |   0  | 45
HAT2  |  6 |  0 |  0 |  3 |  0 |  0 |  36  |   0  |   0  |   0  |   0  |   0  | 45
SDT2  |  5 |  1 |  0 |  1 |  0 |  0 |   0  |   1  |  16  |   1  |   0  |   1  | 26
SDT3  |  0 |  0 |  0 |  0 |  0 |  0 |   0  |   1  |   3  |   0  |   0  |   0  | 4
SDT4  |  0 |  0 |  0 |  1 |  0 |  0 |   0  |   2  |   3  |   3  |   0  |   0  | 9
ANT2  |  0 |  1 |  0 |  0 |  0 |  1 |   0  |   0  |   0  |   1  |  19  |   0  | 22
FET2  |  1 |  0 |  0 |  4 |  0 |  0 |   3  |   1  |   0  |   0  |   0  |   5  | 14

According to tables 3 and 4, some emotions, such as fear, are frequently misclassified as happiness. This confusion happens because there are similarities between these emotions: according to figures 1 and 2, the motion vectors in the lower part of the face are identical for these two emotions. To reduce this type of mistake, the membership functions and rules must be defined more precisely.


Table 4: Rows show the actual class; columns show the predicted class (values in %)

          | Happiness | Sadness | Surprise | Fear  | Anger | Disgust
Happiness |   91.58   |    0    |    0     |  6.32 |   0   |  2.11
Sadness   |   10.42   |  81.25  |    0     |  6.25 |  2.08 |   0
Surprise  |    7.46   |   4.48  |  83.58   |  4.48 |   0   |   0
Fear      |   16.67   |   2.08  |   2.08   | 70.83 |  8.33 |   0
Anger     |    2.78   |  11.11  |    0     |   0   | 80.56 |  5.56
Disgust   |    4.44   |   4.44  |    0     |   0   | 11.11 |  80

6. Conclusion and future works

In this research, a fuzzy rule-based system was suggested to classify features extracted from facial image sequences. These features are motion vectors extracted from image sequences of facial expressions using an optical flow algorithm. The motion vectors are grouped into eight areas and classified using a Sugeno-type fuzzy inference system. The percentage of motion vectors and their average horizontal and vertical sizes are used as the input variables to the Sugeno fuzzy inference system, which classifies the features into one of the basic emotions. The system performance is 81.22%, which is a good result for a first step with this method; however, more work is needed to improve it.

To improve the system performance, there are two suggestions. Extracting all motion vectors is time consuming, so it is better to extract the face motion vectors only in the parts that are more distinctive. Another option for performance improvement is the optimization of the membership functions and rules; they must be changed, especially in the cases where more mistakes occur, according to the confusion matrices.

7. References
[1] B. Mishra, S. L. Fernandes, K. Abhishek, A. Alva, C. Shetty, C. V. Ajila, D. Shetty, H. Rao and P. Shetty, "Facial expression recognition using feature based techniques and model based techniques: A survey", 2nd International Conference on Electronics and Communication Systems (ICECS), pp. 589-594, 2015.
[2] M. Valstar, J. Girard, T. Almaev, G. McKeown, M. Mehu, L. Yin, M. Pantic, and J. Cohn, "FERA 2015 - second facial expression recognition and analysis challenge," in Automatic Face and Gesture Recognition (FG), 2015 11th IEEE International Conference and Workshops on, 2015.
[3] S. Zhong, Y. Chen and Sh. Liu, "Facial Expression Recognition Using Local Feature Selection and the Extended Nearest Neighbor Algorithm," Seventh International Symposium on Computational Intelligence and Design (ISCID), Vol. 1, pp. 328-331, 2014.
[4] X. Fan and T. Tjahjadi, "A spatial-temporal framework based on histogram of gradients and optical flow for facial expression recognition in video sequences," Pattern Recognition, Vol. 48, Issue 11, pp. 3407-3416, 2015.
[5] A. Sanchez, J. V. Ruiz, A. B. Moreno, A. S. Montemayor, J. Hernandez, and J. J. Pantrigo, "Differential optical flow applied to automatic facial expression recognition," Neurocomputing, vol. 74, no. 8, pp. 1272-1282, Mar. 2011.
[6] M. Mufti and A. Khanam, "Fuzzy rule based facial expression recognition," IEEE Intelligent Agents, Web Technologies and Internet Commerce '06, pp. 57-61, 2006.
[7] S. D. More and S. Deshpande, "Fuzzy Model for Human Face Expression Recognition", International Journal of Advanced Technology and Engineering Research (IJATER), vol. 2, Issue 2, pp. 149-153, May 2012.
[8] R. Ghasemi, M. Ahmadi, "Facial Expression Recognition Using Facial Effective Areas And Fuzzy Logic", Iranian Conference on Intelligent Systems (ICIS), 2014.
[9] P. Ekman, Emotion in the Human Face, Cambridge Univ. Press, 1982.
[10] W. Turner, A. Ortony, "What's basic about basic emotions?", Psychological Review, pp. 315-331, 1990.
[11] J. N. Bassili, "Facial Motion in the Perception of Faces and of Emotional Expression," J. Experimental Psychology 4, pp. 373-379, 1978.
[12] T. Kanade, J. Cohn, Y. Tian, "Comprehensive database for facial expression analysis," in: IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2000.
[13] T. Gautama and M. M. Van Hulle, "A Phase-Based Approach to the Estimation of the Optical Flow Field Using Spatial Filtering," IEEE Transactions on Neural Networks, Vol. 13, No. 5, September 2002.
[14] M. Asadifard and J. Shanbezadeh, "Automatic Adaptive Center of Pupil Detection Using Face Detection and CDF Analysis," International MultiConference of Engineers and Computer Scientists, pp. 130-133, March 2010.
[15] L. Wang, H. Ye, L. Xia, "Mouth Detection Based on Interest Point," IEEE Conference of Control, pp. 610-613, 2007.
[16] M. Sugeno, Industrial Applications of Fuzzy Control, Elsevier Science Pub. Co., 1985.


A study on the vulnerability of CAPTCHA patterns of Iranian popular websites and presenting approaches for resolving it

Hossein KardanMoghaddam 1, Hossein Moradi *2

1- Faculty Member of Birjand University of Technology, Birjand, Iran 2- Faculty Member of Birjand University of Technology, Birjand, Iran

[email protected]

Abstract

CAPTCHAs create tests that humans can easily answer but that computers are not able to recognize and respond to. Given that one of the main ways to break a CAPTCHA is to use Optical Character Recognition (OCR), this study investigates the vulnerability of the CAPTCHA patterns used in 20 top Iranian websites against OCR solutions. The study showed that 43% of the surveyed websites either do not use CAPTCHA or use patterns that can be broken without using OCR. In addition, except for the CAPTCHA patterns used by three websites (Blogfa.com, Varzesh3.com and Persianblog.ir), the patterns used by Iranian popular websites were broken by the two approaches of Captcha Sniper and GSA Captcha Breaker with success rates of 15 to 96 percent. Finally, some corrective suggestions were proposed to fix the existing vulnerabilities, and based on them a safe Persian CAPTCHA was suggested which, due to the use of a Persian dictionary, has high readability for human users. Moreover, considering the specific security features of this new CAPTCHA pattern, it is unlikely to be broken even in case of relative development of Persian OCR solutions.
Keywords: Breaking CAPTCHA, CAPTCHA pattern, Vulnerability, Iranian popular websites, OCR, Persian CAPTCHA

1. Introduction

Many websites use CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) to prevent automated interaction of machines with the website. These efforts, made in various forms, significantly influence the success of these websites. For example, Gmail improves its services by blocking automated spam; eBay, by blocking robots, prevents flooding and scamming attacks and improves its market; and Facebook, by limiting the creation of fake profiles, prevents the sending of spam to its trusted users and cheating in games.

The most important and common CAPTCHA pattern uses a combination of deformed or transformed characters and ambiguity techniques that humans can decipher but that are difficult for automated scripts to detect. However, despite the importance of CAPTCHA, its widespread use and the growing number of research studies in this area, there is no systematic methodology for designing or evaluating CAPTCHAs; in fact, many websites still use patterns that are vulnerable to automated attacks [1]. Bursztein et al. demonstrated that their tool (DECAPTCHA) is able to break a variety of CAPTCHAs of globally popular websites, such as the Wikipedia CAPTCHA pattern shown in Figure 1, with a high success rate; among all of them, only Google and RECAPTCHA resisted the attacks and were not broken [1]. However, Ahmad et al. challenged the strength of the Google CAPTCHA and RECAPTCHA as well, since they succeeded in breaking the Google CAPTCHA with a success rate of 46.75%. These studies show that designing a secure CAPTCHA is still an open research problem, and before using any CAPTCHA pattern, its vulnerability should be seriously studied [2] [3].

Given the fragility of the CAPTCHA patterns used in popular websites worldwide, it is expected that the vulnerability of the various CAPTCHA patterns used in Iranian popular websites will be high as well. Thus, this study examines the vulnerability of the CAPTCHA types used in a number of Iranian popular websites against OCR attacks and attempts to answer the following questions:

Primary question: How vulnerable are the CAPTCHA patterns applied in Iranian popular websites against OCR attacks?

Sub-question 1: What are the most important OCR approaches on the market to test or break various common CAPTCHA patterns?


Sub-question 2: What are the recommendations to fix the vulnerabilities of CAPTCHA patterns used in Iran's popular websites?

To answer the research questions, first the research literature and background were reviewed; then the research methodology and process were described; finally, the research findings and corrective recommendations were presented, and at the end a secure Persian CAPTCHA pattern based on the corrective recommendations was provided.

2. Literature review

Bursztein et al. have mentioned that the purpose of designing CAPTCHA is that the success of automated scripts would not be more than 1 in 10,000 (i.e., the accuracy of 0.01%). However, achieving such a security goal seems to be very ambitious due to the possible success of random guesses; therefore, they assumed that if a solution to break CAPTCHA can achieve at least 1% of accuracy, the CAPTCHA pattern has been broken [1]. Moreover, according to Ahmad et al., the common and accepted purpose in designing CAPTCHA is that the success of automated attacks should not be more than one percent, while the success rate of humans has to be at least 90% [2].

Another important issue in the assessment of CAPTCHA strength is how to choose the set of test images. Bursztein et al. believe that cross-validation is useful for initial experiments but not sufficient to verify the hypothesis that a CAPTCHA is unsafe, since it does not reflect the real-world situation in which an attacker attacks a website with unknown CAPTCHAs. Instead, they tried to use the best practices of the machine learning community: they used a set of test images completely different from the set of training images, and they emphasized that simple CAPTCHAs should not be placed in the test set in order to avoid distorting the assessment [1].

Bursztein et al. evaluated 15 different CAPTCHA patterns used by the websites Authorize, Baidu, Blizzard, Captcha.net, CNN, Digg, eBay, Google, Megaupload, NIH, Recaptcha, Reddit, Skyrock, Slashdot and Wikipedia. These patterns are shown in Figure 1.

Figure 1: Images of CAPTCHA patterns in 15 global popular websites [1]

The research conducted by Bursztein et al. to break the CAPTCHA patterns of globally popular websites achieved the following success rates: 1 to 10% for the Baidu and Skyrock websites, 10 to 24% for CNN and Digg, 25 to 49% for eBay, Reddit, Slashdot and Wikipedia, and 50% or higher for Authorize, Blizzard, Captcha.net, Megaupload and NIH. They stressed that only the CAPTCHA patterns of the Google and RECAPTCHA websites resisted these attacks and were not broken [1]. However, Ahmad et al. challenged the CAPTCHA strength of Google and RECAPTCHA: by designing a new algorithm targeting the segmentation-resistance mechanism based on crowding characters together, they could attack a family of CAPTCHA patterns and succeeded in breaking Google's CAPTCHA with a success rate of 46.75% [2] [3].

3. Research methodology

The common and accepted aim in designing a CAPTCHA is to limit the success rate of automated attacks to a maximum of 1%. Thus, to reflect the real-world conditions in which an attacker attacks a website with instances of unknown CAPTCHAs, a set of test images quite different from the set of training images was used. In addition, by avoiding the occurrence of simple CAPTCHA images in the test set, we tried to prevent bias in the evaluation. In this study, 300 test images and 100 separate training images were used; in this situation, if an attacker wants to reach an accuracy of 1%, he/she has to solve three of the test images in order to confirm the non-security hypothesis of a pattern. This method was used in all the assessments made in this study.

4. Research process

The main process of implementing this research included the following steps:
1. Identifying 20 Iranian popular websites
2. Identifying the CAPTCHA patterns used in the mentioned websites
3. Identifying and selecting several OCR solutions for breaking the CAPTCHAs
4. Evaluating the effectiveness of the existing OCR solutions
5. Estimating the vulnerability of each of the studied CAPTCHA patterns
6. Presentation of the research findings
7. Providing corrective recommendations
8. Designing a safe Persian CAPTCHA based on the corrective recommendations

4.1 Identifying the 20 top popular Iranian websites

The Alexa.com website, launched in 1996, owes its reputation to the Alexa ranking, a website ranking system that tracks user visits to about 30 million websites worldwide [4]. This website can present its rankings on a global, national and thematic basis. At this stage of the study, the 20 websites most popular with Iranian users were extracted. A number of the websites popular with Iranian users were universal websites such as Yahoo, Google and Wikipedia, which were excluded from the final list. The final list of Iranian popular websites on January 31, 2014 is shown in Table 1. Since the Peyvandha.ir website, which belongs to the national Internet filtering system, is loaded unintentionally, 21 websites are listed in the table in order to obtain the 20 main popular websites of Iranian users.

Table 1: List of Iranian popular websites [4]

Row | Domain name     | Row | Domain name   | Row | Domain name
1   | Blogfa.com      | 8   | Farsnews.com  | 15  | Picofile.com
2   | Varzesh3.com    | 9   | Tabnak.ir     | 16  | Rozblog.com
3   | Mihanblog.com   | 10  | Peyvandha.ir  | 17  | Yjc.ir
4   | Facenama.com    | 11  | Digikala.com  | 18  | Bahseazad.ir
5   | Mihanwebads.com | 12  | Aparat.com    | 19  | Bmi.ir
6   | Persianblog.ir  | 13  | Beytoote.com  | 20  | Blogsky.com
7   | Bankmellat.ir   | 14  | Cloob.com     | 21  | Asriran.com

Next, the Iranian popular websites were examined and three examples of the CAPTCHA images used in each were randomly extracted. Given that some of the national popular websites use different CAPTCHA patterns in different sections, we tried to extract and analyze samples of their most important CAPTCHA patterns. The preliminary study indicated that CAPTCHA patterns are not used on several popular websites, including Facenama.com, Mihanwebads.com, Beytoote.com, Cloob.com, Asriran.com and Picofile.com, which suggests that these websites are vulnerable to automated machine attacks. Meanwhile, a number of Iranian websites use methods other than image CAPTCHAs. For example, to comment on the Farsnews.com website, the user of the Fars News Agency must perform a drag & drop, a protocol that is easy to break. In addition, a full-text CAPTCHA is used in the form for adding websites to the useful links of Peyvandha.ir, which is also very easy to break: on this page the user is asked to add or subtract two numbers whose values appear on the page and to insert the result in the relevant textbox.

4.2 Identifying and selecting some suitable OCR approaches for breaking the CAPTCHAs

To test the fragility of a CAPTCHA against the OCR method, two types of strategies, general and specific, can be used. General solutions are those designed with the aim of transforming general images into text; although these solutions are not designed to break CAPTCHAs, some of them can be used to break less distorted CAPTCHAs. The specific solutions are designed from the outset to break or test different types of CAPTCHA and work well for breaking some well-known CAPTCHAs. The CAPTCHA-breaking solutions can also be divided into two other categories: stand-alone software and online solutions. Due to the cost of using specialized online CAPTCHA-solving services such as Deathbycaptcha.com, Decaptcher.com and Bypasscaptcha.com, and the problems of paying for them, this category of solutions was not used in this study. However, important websites and applications in the three other categories of strategies for testing CAPTCHAs were investigated. Therefore, the OCR strategies selected in this study to test the CAPTCHA images extracted from the 20 Iranian popular websites comprised the following three categories, with a total of 5 different approaches:

• General online OCR strategies: the two websites Onlineocr.net and Free-online-ocr.com
• General standalone OCR software: the Tesseract-OCR software
• CAPTCHA-breaking standalone software: the Captcha Sniper and GSA Captcha Breaker software

In the preliminary stage of the study, one random CAPTCHA sample image was used for configuration and two other images for testing. The preliminary results showed that the Captcha Sniper x44 and GSA Captcha Breaker V2.29 tools had a high success rate of 18 to 23% in breaking the various CAPTCHA patterns used in Iranian popular websites. The Tesseract-OCR and Free-online-ocr.com solutions, with 5% success in breaking the different CAPTCHA patterns, were at the next level of importance, while the Onlineocr.net solution was not able to break any of the samples examined at this stage.

4.3 Evaluating the efficacy of existing tools

The main limitation of the preliminary stage was the small number of training and testing images, which was inevitable given the large number of evaluated websites and the use of 5 different approaches. Thus, in the main stage of the research, the evaluation was reduced to the two better solutions, Captcha Sniper and GSA Captcha Breaker, referred to by the abbreviations CS and GCB. In this stage, 100 training images and 300 different test images were used, providing a more accurate assessment of the vulnerability of the CAPTCHA patterns of the studied websites. The analysis results on the vulnerability of the national popular websites are described in Table 2, where the different CAPTCHA patterns used in the national popular websites are presented in descending order of their vulnerability against the OCR solutions. As can be seen in the table, the success rates of the GCB and CS approaches in breaking the 300 test images are presented separately; in addition, by comparing the success rates of the two approaches, the higher of the two is reported as the best OCR success rate.

Table 2: Vulnerability of Iranian popular websites against OCR (in descending order; the original table also shows an example instance of each CAPTCHA pattern and the image processed by the best OCR)

CAPTCHA pattern of the website                       | GCB    | CS  | Best OCR
1- Contact us at Digikala.com                        | 96.67% | 82% | 96.67%
2- Forgotten password form at Rozblog.com            | 95%    | 70% | 95%
3- Payment at Bmi.ir                                 | 46.33% | 80% | 80%
4- Request new password at Bmi.ir                    | 66.67% | 76% | 76%
5- Follow contradiction at Bankmellat.ir             | 70%    | 47% | 70%
6- Requesting filter removal at Peyvandha.ir         | 53%    | 55% | 55%
7- Payment at Bankmellat.ir                          | 7.33%  | 52% | 52%
8- Blog registration at Blogsky.com                  | 51.33% | 36% | 51.33%
9- Blog registration at Rozblog.com                  | 20%    | 41% | 41%
10- Login to Internet Banking at Bankmellat.ir       | 21%    | 18% | 21%
11- Subscribe for newsletter at Tabnak.ir            | 0%     | 20% | 20%
12- Subscribe for newsletter at Yjc.ir               | 0%     | 15% | 15%
13- Blog registration and comment at Mihanblog.com * | 3%     | 5%  | 5%
14- Sign up at Aparat.com *                          | 0%     | 3%  | 3%
15- Comment at Blogfa.com                            | 0%     | 0%  | 0%
16- Contact us at Varzesh3.com                       | 0%     | 0%  | 0%
17- Comment at Persianblog.ir                        | 0%     | 0%  | 0%
18- Blog registration at Persianblog.ir              | 0%     | 0%  | 0%

* In this table, the vulnerability rates of the CAPTCHAs at Mihanblog.com and Aparat.com are presented regardless of their design problem.

4.4 Research findings

As can be seen in Table 2, the success rate of the GCB strategy in breaking the subscribe-for-newsletter patterns at Yjc.ir and Tabnak.ir was 0%, while the CS solution achieved success rates of 15% and 20% in breaking them. Moreover, the success rate of the GCB approach in breaking the payment pattern at Bankmellat.ir and the blog registration pattern at Rozblog.com was 7.33% and 20%, respectively, whereas the CS solution achieved success rates of 52% and 41% for these two patterns; for the other patterns, no significant differences were seen between the success rates of the two solutions. It can therefore be argued that the CS solution was more successful than GCB in breaking CAPTCHA patterns that use small, low-quality images, letter rotation, or added lines in a colour different from the text colour.

The study results showed that the majority of popular websites in Iran are vulnerable to automated OCR tools. The degree of vulnerability of the CAPTCHA patterns in Iranian popular websites can be divided into four general categories. The first category includes quite vulnerable CAPTCHAs, for which the success rate of solving them was between 41 and 97 percent; in descending order these are: contact us at Digikala.com, forgotten password form at Rozblog.com, payment at Bmi.ir, request new password at Bmi.ir, follow contradiction at Bankmellat.ir, requesting filter removal at Peyvandha.ir, payment at Bankmellat.ir, blog registration at Blogsky.com, and blog registration at Rozblog.com.

The second category includes relatively vulnerable CAPTCHAs, for which the success rate was between 15 and 21 percent; in descending order these are: login to Internet Banking at Bankmellat.ir, subscribe for newsletter at Tabnak.ir, and subscribe for newsletter at Yjc.ir.

The third category includes relatively safe CAPTCHAs, for which the success rate was between 3 and 5 percent, namely blog registration and comment at Mihanblog.com and sign up at Aparat.com. The fourth category includes safe CAPTCHAs, for which the success rate of the CS and GCB strategies was 0%: comment at Blogfa.com, contact us at Varzesh3.com, comment at Persianblog.ir and blog registration at Persianblog.ir.

However, it should be noted that the CAPTCHA patterns of blog registration and comment at Mihanblog.com and sign up at Aparat.com, despite their relative strength against OCR, suffer from a serious design error that jeopardizes their security. In these two CAPTCHAs, in each new session the original text of the CAPTCHA is chosen from only four constant 4-character strings; thus, after refreshing the CAPTCHA four times, the original text repeats and only its rendering changes. Therefore, with no need for OCR, these two patterns can actually be broken with a probability of 25%. Obviously, if the OCR method is combined with the mentioned error, solving the CAPTCHA becomes even more likely, since the exact detection of only one or two characters of the CAPTCHA image is enough to identify the main text.

5. Recommendations

According to the results presented in the research findings, it is suggested that the use of the quite vulnerable CAPTCHAs be stopped. CAPTCHA designers are also recommended to reduce the vulnerability of the relatively vulnerable and relatively safe CAPTCHAs by considering the following points:
• Using the Persian language (like the comment CAPTCHA at Persianblog.ir)
• Rotating each character by a random angle relative to the other characters (like the comment pattern at Blogfa.com)
• Adding lines and noise with the same colour as the main CAPTCHA text (like the contact us CAPTCHA at Varzesh3.com)
• Reducing the size and quality of the CAPTCHA image (like the comment CAPTCHA at Blogfa.com and subscribe for newsletter at Yjc.ir)
• Overlapping the characters of the original CAPTCHA text (like the contact us CAPTCHA at Varzesh3.com)
• Using a combination of uppercase letters, lowercase letters, numbers and special symbols in designing the CAPTCHA
• Using CAPTCHAs of variable length; however, when using a variable number of characters, the CAPTCHA should not be shorter than 5 characters. For example, the blog registration CAPTCHA at Rozblog.com uses this idea, but in some cases the CAPTCHA length is only 3 characters, which is its Achilles heel.

• Fixing the design error in the CAPTCHAs of Mihanblog.com and Aparat.com

In addition, imposing the following changes on the CAPTCHA images does not have a serious impact on their security, since these changes can easily be reversed by applying one or more simple filters:
• Overall rotation of the text by a fixed angle in all CAPTCHA instances (such as the request new password pattern at Bmi.ir)
• Adding an image or a similar background pattern to all CAPTCHA instances (such as the contact us pattern at Digikala.com)

Moreover, random functions should be used carefully, in such a way that none of the generated CAPTCHA instances is insecure. For example, if the CAPTCHA background is produced randomly and in some cases results in simple or pure white backgrounds, the CAPTCHA security is reduced. The pattern of login to Internet banking at Mellatbank.ir faces this issue: it produces three different random images, of which the first (right image) is difficult to break while the other two can be easily broken (Fig. 2).

Figure 2: Three random images generated by the pattern of login to Internet banking at MellatBank.ir


It is worth mentioning that some of the recommendations of this study, which were derived independently from the practical experiences and observations of the present work, are very similar to the corrective recommendations presented by Hindle et al. [5]. They reported that the following methods can cause serious problems for automated tools in segmenting and identifying the letters: use of more characters, similarity of the background to the text, use of overlapping characters, limiting the number of attempts to solve the CAPTCHA, use of continuous characters with very little distance between two characters, not using reversible linear transformations, and not using letters capable of being filled.

6. A safe Persian CAPTCHA

Finally, utilizing several available secure open-source CAPTCHAs, together with their localization and improvement, we designed a Persian CAPTCHA that, by applying the corrective suggestions of this research, can have high strength against OCR. In addition, due to the use of common Persian words, it has relatively high readability for users. Figure 3 shows the Persian words "دوران", "داوران" and "ناودان" produced by the safe Persian CAPTCHA provided in this study.

Figure 3: Three examples of CAPTCHA produced by the safe Persian CAPTCHA
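A rough Pillow-based sketch of the kind of distortion pipeline described in this section (dictionary words, per-character rotation, noise and lines in the text colour, blurring); the word list, font path and helper name make_captcha are placeholders, proper right-to-left shaping and joining of Persian glyphs is not handled, and this is not the authors' implementation.

```python
import random
from PIL import Image, ImageDraw, ImageFilter, ImageFont

WORDS = ["دوران", "داوران", "ناودان"]   # sample dictionary words from Figure 3
FONT = "Vazir.ttf"                        # hypothetical Persian TrueType font path

def make_captcha(width=220, height=70, color=(20, 20, 20)):
    word = random.choice(WORDS)
    img = Image.new("RGB", (width, height), "white")
    font = ImageFont.truetype(FONT, 40)
    x = 10
    for ch in reversed(word):               # draw letters right-to-left; each on its own tile
        tile = Image.new("RGBA", (60, 60), (0, 0, 0, 0))
        ImageDraw.Draw(tile).text((5, 5), ch, font=font, fill=color)
        tile = tile.rotate(random.uniform(-25, 25), expand=True)   # random per-letter rotation
        img.paste(tile, (x, 5), tile)
        x += 35                             # small step so that letters overlap slightly
    draw = ImageDraw.Draw(img)
    for _ in range(4):                      # extra lines in the same colour as the text
        draw.line([(random.randint(0, width), random.randint(0, height)) for _ in range(2)],
                  fill=color, width=2)
    for _ in range(300):                    # salt noise in the text colour
        draw.point((random.randint(0, width - 1), random.randint(0, height - 1)), fill=color)
    return word, img.filter(ImageFilter.GaussianBlur(0.8))   # slight blur / quality reduction

# word, image = make_captcha(); image.save("captcha.png")
```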

Several Persian CAPTCHAs have been developed previously; the most important of them can be seen in Table 3. If the Persian-specific features of these CAPTCHA patterns are set aside, it can be claimed that all of them are relatively simple; their simplicity can easily be demonstrated by comparing them with the patterns examined in this study, presented in Table 2. Therefore, with simple pre-processing the noise, lines and background images of these patterns can be removed, and assuming the relative maturity and development of Persian OCR, all of the patterns below could be broken after a simple training phase.

Table 3: A number of Persian CAPTCHA patterns (the original table also shows the original and processed image of each pattern)

Explanation | Source
Persian Baffletext CAPTCHA | Shirali-Shahreza, M. H., & Shirali-Shahreza, M. [6]
Advanced Nastaliq CAPTCHA | Shirali-Shahreza, M. H., & Shirali-Shahreza, M. [7]
CAPTCHAFA | http://captchafa.net/contact
Persian CAPTCHA, type A (two upper images) and type B (two lower images) | Samsunchi, N. [8]
Numerical Persian CAPTCHA | http://persianblog.ir/Signup.aspx

In contrast, the CAPTCHA pattern presented in this study, in addition to using the Persian language, benefits from the following features, which make it safer than the other available Persian CAPTCHA patterns; even assuming the relative development of Persian OCR solutions, it will retain a high security factor:


• Random selection of Persian fonts
• Adding random letters or a constant string to the CAPTCHA when the word extracted from the dictionary is short
• Random replacement of the phonetic alphabet of dictionary words
• Blurring the image
• Reducing the image quality
• Waving the image characters
• Adding noise to the image in a colour chosen by the designer
• Adding an arbitrary number of extra lines in the same colour as the main CAPTCHA text
• Random rotation of each letter

7. Conclusion

The major achievement of this study was to determine the vulnerability of the CAPTCHA patterns used in 20 Iranian popular websites against OCR attacks. After identifying the vulnerability of the CAPTCHA patterns applied in Iranian popular websites, several corrective recommendations were provided to cover their weaknesses. The study found that 9 of the 21 studied websites (43% of them) either do not use CAPTCHA or use patterns that are breakable without using OCR. It was also shown that a typical Persian CAPTCHA has higher strength than many common CAPTCHAs, which is rooted in the lack of development of OCR solutions for the Persian and Arabic languages. Therefore, a safe Persian CAPTCHA was presented based on the corrective recommendations of this research, with high readability for human users due to the use of a Persian dictionary. In addition, due to the special security features of this new Persian pattern, it is unlikely to be broken even in case of relative development of Persian OCR solutions. Thus, the webmasters of the national popular websites, and other websites using similar CAPTCHA patterns, can change or reinforce their CAPTCHAs using the findings and applied recommendations presented in this study.

8. References
1. Bursztein, E., Martin, M., and Mitchell, J., "Text-based CAPTCHA strengths and weaknesses," in Proceedings of the 18th ACM Conference on Computer and Communications Security, pp. 125-138, 2011.
2. Ahmad, S., Yan, J., and Tayara, M., "The robustness of Google CAPTCHAs," Newcastle University, Technical Report CS-TR-1278, 2011.
3. Azad, S., Jain, K., "Captcha: Attacks and weaknesses against OCR technology," Global Journal of Computer Science and Technology, Vol. 13, No. 3, pp. 14-17, 2013.
4. Alexa, "Top Sites in Iran," Alexa website, January 2014, http://www.alexa.com/topsites/countries;0/IR.
5. Hindle, A., Godfrey, M. W., and Holt, R. C., "Reverse Engineering CAPTCHAs," in 15th Working Conference on Reverse Engineering, pp. 59-68, 2008.
6. Shirali-Shahreza, M. H., and Shirali-Shahreza, M., "Persian/Arabic Baffletext CAPTCHA," J. UCS, vol. 12, no. 12, pp. 1783-1796, 2006.
7. Shirali-Shahreza, M. H., and Shirali-Shahreza, M., "Advanced Nastaliq CAPTCHA," in 7th IEEE International Conference on Cybernetic Intelligent Systems, pp. 1-3, 2008.
8. Samsunchi, N., "An enhanced human interaction proof system for web based energy information and control systems based on the Persian OCR CAPTCHA," presented at the 5th Iranian Conference on Machine Vision and Image Processing, 2008.