
ORIGINAL ARTICLE

Fuzzy belief measure in random fuzzy information systems and its application to knowledge reduction

Jialu Zhang • Xiaoling Liu

Network Center, Xiangnan University, Chenzhou 423000, Hunan, China
e-mail: [email protected]

Received: 26 January 2011 / Accepted: 19 April 2012 / Published online: 17 May 2012

© Springer-Verlag London Limited 2012

Abstract In a random fuzzy information system, by introducing a fuzzy t-similarity relation on the object set for each subset of the attribute set, approximate representations of knowledge are established. By studying the fuzzy belief measures and fuzzy plausibility measures defined by the lower and upper approximations in a random fuzzy approximation space, some equivalent conditions for knowledge reduction in a random fuzzy information system are proved. As in an ordinary information system, fuzzy-set-valued attribute discernibility matrices in a random fuzzy information system are constructed. Knowledge reduction is defined from the viewpoint of fuzzy belief measures and fuzzy plausibility measures, a heuristic knowledge reduction algorithm is proposed, and the time complexity of this algorithm is $O(|U|^2|A|)$. A running example illustrates the potential application of the algorithm, and experimental results on data sets with numerical attributes show that the proposed method is effective.

Keywords Random fuzzy information systems · Fuzzy belief measures · Fuzzy plausibility measures · Fuzzy-set-valued attribute discernibility matrix · Knowledge reduction · Complexity of algorithm

1 Introduction

An information system is a database comprising an object set U and an attribute set A. Such a database is understandable because its data express the relation between objects and attributes, and the knowledge patterns ultimately expressed in the form of decision rules are stated in terms of attributes that have direct meaning. It is well known that not all condition attributes are necessary to characterize the decision attribute before decision rules are generated, and a decision rule with too long a description means a high prediction cost. Hence, knowledge reduction in an information system, or attribute reduction in a database, by which irrelevant or superfluous attributes are eliminated according to the learning task without losing essential information about the original data, is an important aspect of knowledge discovery, which is a way to identify patterns (knowledge) in a database. As a result of knowledge reduction, a set of concise and meaningful rules is produced. In the process of knowledge reduction, various mathematical tools have been used, and consequently many approaches have been presented [1–3]. Recently, starting from Pawlak's rough set theory, many types of knowledge reduction have been proposed in complete information systems, complete decision systems, incomplete information systems, incomplete decision systems, and covering information systems [1–14], each of them aimed at some basic requirement.

All the above-mentioned reductions are based on classical rough set data analysis, which uses only internal knowledge, avoids external parameters, and does not rely on prior model assumptions. However, there are various types of uncertainty in real-world problems, and this is where traditional rough set theory encounters a problem. When the attribute value domains of an information system are the real interval [0, 1], the theory cannot say whether two attribute values are similar or to what extent they are the same; for example, two close values may differ only as a result of noise, but in the classical rough set based approach they are considered as different as two values of different orders of magnitude. Data discretization must take place before reduction methods based on crisp rough sets can be applied, and this is often still inadequate, as the degrees of membership of values to the discretized values are not considered at all. To combat this, fuzzy rough set theory has been developed. Fuzzy rough sets encapsulate the related but distinct concepts of vagueness (for fuzzy sets [15]) and indiscernibility (for rough sets), both of which occur as a result of uncertainty in knowledge [16–18]. The fuzzy rough set based approach considers the extent to which fuzzified values are similar. Based on fuzzy rough set theory, many scholars have studied knowledge reduction in fuzzy decision systems, in which the condition attribute value domains are [0, 1] and the decision attribute takes symbolic values. For example, to keep the dependency degree invariant, Jensen et al. [19, 20] studied a knowledge reduction method based on fuzzy rough sets. Tsang and Chen et al. [21, 22] introduced a formal notion of knowledge reduction based on fuzzy rough sets and analyzed the mathematical structure of knowledge reduction using the discernibility matrix approach. Zhao et al. [23] addressed, from the theoretical viewpoint, whether and how different fuzzy approximation operators affect the result of knowledge reduction. Hu et al. [24, 25] developed a new model of fuzzy rough sets to reduce the influence of noise on the fuzzy dependency function.

As is well known, random phenomena form one class of uncertain phenomena that has been well studied. For the case where the available database is obtained by a randomization method, Wu [26] introduced, on the basis of probability, the notion of random information systems and constructed random rough set models, presenting the notions of belief reduction and plausibility reduction via the Dempster–Shafer theory of evidence. However, in a practical decision rule making process, we often face a hybrid uncertain environment in which linguistic and frequency-based uncertainty coexist. For example, in a decision information system on credit card applications, the condition attributes C = {a (account balance), b (monthly income)} are numerical attributes, which can be further normalized into the real interval [0, 1], and the decision attribute d (application evaluation of the credit card) has the discrete values {agree (1), reject (0)}. If the applicants are obtained by random sampling from a special crowd, that is, there is a probability distribution on the set of objects, then randomness and fuzziness appear simultaneously. The randomness has two causes: some objects of the information system may be retained by random sampling, and some attribute values may contain errors caused by noise when the attribute value domain is the real interval [0, 1]; the fuzziness emerges because the attribute values lie in the real interval [0, 1]. To deal with this twofold uncertainty, it is necessary to employ random fuzzy theory, which is a combination of classical probability theory and fuzzy set theory [27, 28]. To depict phenomena in which randomness and fuzziness appear simultaneously, the random fuzzy set, also called a probabilistic set in [28], is a fundamental concept in this theory.

As rough set theory and the Dempster–Shafer theory of evidence have strong relationships, there are also strong relationships between fuzzy rough set theory and fuzzy evidence theory. Based on this fact, many scholars have analyzed knowledge acquisition in fuzzy information systems and fuzzy decision systems by using fuzzy evidence theory. For example, based on the R-implication operator and the S-implication operator, Chen et al. [29] explored two types of fuzzy belief and plausibility functions that are, respectively, the fuzzy lower and upper approximation probabilities. Wu et al. [30] proposed a general type of fuzzy belief structure induced by a general fuzzy implication operator in an infinite universe, and obtained generalized fuzzy belief and plausibility functions by generalizing Shafer's approach to the fuzzy environment. Yao et al. [31] studied knowledge reduction in fuzzy decision systems based on generalized fuzzy evidence theory and proved that the concepts of fuzzy positive region reduction, lower approximation reduction, and generalized fuzzy belief reduction are all equivalent, and that the concepts of fuzzy upper approximation reduction and generalized fuzzy plausibility reduction are equivalent.

In this paper, we investigate knowledge reduction in random fuzzy information systems in which there is a normal probability measure P on U and the condition attribute and decision attribute value domains are [0, 1]. Since fuzzy belief functions have strong connections with fuzzy rough approximation operators, we study knowledge reduction in random fuzzy information systems by employing fuzzy evidence theory. This paper focuses on establishing a new model of knowledge representation, called the random fuzzy rough set model, and on finding new ways of knowledge reduction in random fuzzy information systems. For a given subset of the attribute set, by introducing a fuzzy t-similarity relation on the object set of a random fuzzy information system by virtue of the residuated implication (R-implication) operators of fuzzy logic, the corresponding random fuzzy approximation spaces are defined. By discussing the fuzzy belief measures and fuzzy plausibility measures defined by the lower approximation and upper approximation in random fuzzy approximation spaces, respectively, we obtain some characterizations of knowledge reduction in random fuzzy information systems and random fuzzy decision information systems. Similarly to Pawlak's rough set theory, fuzzy-set-valued attribute discernibility matrices in random fuzzy information systems and random fuzzy decision information systems are constructed. We then propose some knowledge reduction methods from the viewpoint of fuzzy belief measures and fuzzy plausibility measures. On the basis of the fuzzy-set-valued attribute discernibility matrix, a heuristic algorithm is proposed and its time complexity is analyzed. To illustrate the potential application and validity of the presented algorithm, a running example and some experiments are presented, respectively.

2 Random fuzzy approximation space

2.1 t-norms and R-implications

We first summarize the basic concepts of t-norms and their residuated implications. For more details of these concepts, we refer the reader to [32, 33].

A t-norm is a binary operation $\otimes$ on [0, 1] (i.e., $\otimes : [0,1]^2 \to [0,1]$) satisfying the following conditions: (1) $\otimes$ is commutative, i.e., $x \otimes y = y \otimes x$ for all $x, y \in [0,1]$; (2) $\otimes$ is associative, i.e., $(x \otimes y) \otimes z = x \otimes (y \otimes z)$ for all $x, y, z \in [0,1]$; (3) $\otimes$ is non-decreasing in both arguments, i.e., $x_1 \le x_2$ implies $x_1 \otimes y \le x_2 \otimes y$, and $y_1 \le y_2$ implies $x \otimes y_1 \le x \otimes y_2$; (4) $1 \otimes x = x$ and $0 \otimes x = 0$ for all $x \in [0,1]$. A t-norm is called left-continuous if it is also a left-continuous mapping from $[0,1]^2$ into [0, 1] (in the usual sense).

The following are our most important examples of left-continuous t-norms:

$$a \otimes_G b = a \wedge b,$$
$$a \otimes_{Go} b = ab,$$
$$a \otimes_{Lu} b = (a + b - 1) \vee 0,$$
$$a \otimes_{R_0} b = \begin{cases} a \wedge b, & a + b > 1, \\ 0, & a + b \le 1. \end{cases}$$

For a left-continuous t-norm $\otimes$, the operator $\to : [0,1]^2 \to [0,1]$,

$$a \to b = \bigvee \{ c \in [0,1] : a \otimes c \le b \},$$

is called the R-implication operator induced by the t-norm $\otimes$. $\otimes$ and $\to$ form an adjoint pair, i.e., $a \otimes b \le c$ if and only if $a \le b \to c$ for all $a, b, c \in [0,1]$.

The following are the four R-implication operators induced by the above four t-norms:

$$a \to_G b = \begin{cases} 1, & a \le b, \\ b, & a > b, \end{cases} \qquad a \to_{Go} b = \begin{cases} 1, & a \le b, \\ b/a, & a > b, \end{cases}$$
$$a \to_{Lu} b = \begin{cases} 1, & a \le b, \\ 1 - a + b, & a > b, \end{cases} \qquad a \to_{R_0} b = \begin{cases} 1, & a \le b, \\ (1-a) \vee b, & a > b, \end{cases}$$

respectively.

In the following, we list some properties of R-implication operators [32, 33]:

(1) $1 \to a = a$;
(2) $a \le b \iff a \to b = 1$;
(3) $a \le b \to a \otimes b$;
(4) $a \otimes (a \to b) \le b$;
(5) $a \otimes b \to c = a \to (b \to c)$;
(6) $a \to (b \wedge c) = (a \to b) \wedge (a \to c)$;
(7) $(a \vee b) \to c = (a \to c) \wedge (b \to c)$;
(8) $b \to c \le (a \to b) \to (a \to c)$;
(9) $(a \leftrightarrow b) \otimes (b \leftrightarrow c) \le (a \leftrightarrow c)$, where $a \leftrightarrow b = (a \to b) \wedge (b \to a)$.

The four R-implication operators $\to_G, \to_{Go}, \to_{Lu}, \to_{R_0}$ also have the following properties:

(10) $a \to (b \vee c) = (a \to b) \vee (a \to c)$;
(11) $(a \wedge b) \to c = (a \to c) \vee (b \to c)$.

Moreover, $\to_{Lu}$ and $\to_{R_0}$ also satisfy:

(12) $a \to 0 = 1 - a$.

In this paper, we restrict the R-implication operator to one of the four R-implication operators listed above.
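As an aside (our own sketch, not part of the original paper), the four t-norms and their R-implications are direct to compute; the following minimal Python functions, with names of our own choosing, are reused by the illustrative snippets later in this transcript.

```python
# A minimal sketch of the four left-continuous t-norms and their
# residuated implications listed above; function names are ours.

def t_godel(a, b):        # a (*)_G b = a ^ b
    return min(a, b)

def t_goguen(a, b):       # a (*)_Go b = ab
    return a * b

def t_luka(a, b):         # a (*)_Lu b = (a + b - 1) v 0
    return max(a + b - 1.0, 0.0)

def t_r0(a, b):           # nilpotent minimum
    return min(a, b) if a + b > 1.0 else 0.0

def imp_godel(a, b):      # a ->_G b
    return 1.0 if a <= b else b

def imp_goguen(a, b):     # a ->_Go b
    return 1.0 if a <= b else b / a

def imp_luka(a, b):       # a ->_Lu b
    return 1.0 if a <= b else 1.0 - a + b

def imp_r0(a, b):         # a ->_R0 b
    return 1.0 if a <= b else max(1.0 - a, b)

def equiv(a, b, imp=imp_godel):   # a <-> b = (a -> b) ^ (b -> a)
    return min(imp(a, b), imp(b, a))
```

For instance, property (12) can be checked numerically: `imp_luka(a, 0)` and `imp_r0(a, 0)` both return 1 - a, while `imp_godel(a, 0)` returns 0 for every a > 0.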

2.2 Information systems

An information system (or database system) (IS, for short) is a quadruple (U, A, V, F), where $U = \{x_1, x_2, \ldots, x_n\}$ is an object set, each $x_i$ being called an object; $A = \{a_1, a_2, \ldots, a_m\}$ is an attribute set, each $a_i$ being called an attribute; $V = \bigcup_{a_k \in A} V_{a_k}$ is the attribute value set, where $V_{a_k}$ is the value domain of attribute $a_k$; and $F = \{f_{a_k} : U \to V_{a_k},\ k \le m\}$ is a set of relations from U to A.

In an IS, the relation set F is very important: it shows the connection between the object set and the attribute set and provides the information source for knowledge discovery. For example, $f_{a_k}(x_i) = v$ means that the value of attribute $a_k$ on object $x_i$ is v.

Example 2.1 Table 1 gives a case information system with three symptoms and six objects.

Table 1 A case information system

U     a1   a2   a3
x1    2    1    3
x2    3    2    1
x3    2    2    3
x4    1    1    2
x5    3    2    1
x6    1    1    4

Here the object set is $U = \{x_1, x_2, \ldots, x_6\}$, the attribute set is $A = \{a_1, a_2, a_3\}$, and the attribute value domains are $V_{a_1} = \{1, 2, 3\}$, $V_{a_2} = \{1, 2\}$, and $V_{a_3} = \{1, 2, 3, 4\}$, respectively. $F = \{f_{a_1}, f_{a_2}, f_{a_3}\}$, where $f_{a_1} : U \to V_{a_1}$ gives the value of symptom $a_1$ for every object; the meaning of $f_{a_2} : U \to V_{a_2}$ and $f_{a_3} : U \to V_{a_3}$ is similar.

Knowledge discovery in an IS is the classification of objects on the basis of attribute values; a classification determines a group of concepts, and therefore knowledge discovery in an IS is the discovery of concepts. When the relations among attributes are investigated, we introduce the notion of a decision information system (sometimes called a decision table). If the attribute set of an IS is partitioned into A and D, then it is called a decision information system (DIS, for short), where A is referred to as the condition attribute set and D as the decision attribute set.

In a DIS, the relations $f_a : U \to V_a,\ a \in A$, show the connection between the object set and the condition attribute set, and the relations $f_d : U \to V_d,\ d \in D$, show the connection between the object set and the decision attribute set; consequently, condition attributes and decision attributes are connected via the object set. This connection provides the information source for knowledge discovery in a DIS. For example, $f_a(x_i) = v$ means that condition attribute a of object $x_i$ has value v, and $f_d(x_i) = u$ means that decision attribute d of object $x_i$ has value u; therefore, the propositional rule $(a, v) \to (d, u)$ is obtained via object $x_i$. In the following, we often write $F(x_i, a) = f_a(x_i)$, so that $F : U \times A \to V$ is a relation from $U \times A$ to V.

In an IS, if $V_a$ ($a \in A$) is the real interval [0, 1], then the IS is called a continuous-valued information system; it is also called a fuzzy information system (FIS, for short) in [1, 20, 30]. A FIS is referred to as a random fuzzy information system (RFIS, for short) if there is a normal probability measure P on U, where a normal probability measure P means that $P(\{x\}) > 0$ for all $x \in U$ and $\sum_{x \in U} P(\{x\}) = 1$. If in a DIS the $V_a$ ($a \in A$) are [0, 1] and the $V_d$ ($d \in D$) are symbolic value sets, then it is called a fuzzy decision system (FDS, for short) [31]. Further, if the $V_d$ ($d \in D$) are also [0, 1], then it is called a fuzzy decision information system (FDIS, for short). Likewise, the notions of a random fuzzy decision system (RFDS, for short) and of a random fuzzy decision information system (RFDIS, for short) are understood analogously. It should be noted that a FIS may be treated as a RFIS with the special probability $P(\{x\}) = 1/|U|$ for all $x \in U$.

Example 2.2 Table 2 gives a RFDIS.

Table 2 A RFDIS

U     a1    a2    a3    a4    d1    d2
x1    0.4   0.6   0.3   0.8   0.7   0.6
x2    0.5   0.7   0.6   0.4   0.8   0.7
x3    0.4   0.5   0.4   0.7   0.6   0.5
x4    0.7   0.6   0.4   0.8   0.8   0.7

Here the object set is $U = \{x_1, x_2, x_3, x_4\}$, the condition attribute set is $A = \{a_1, a_2, a_3, a_4\}$, the decision attribute set is $D = \{d_1, d_2\}$, and the attribute value domains are [0, 1]. $F = \{f_{a_1}, f_{a_2}, f_{a_3}, f_{a_4}, f_{d_1}, f_{d_2}\}$, with $f_{a_1} : U \to V_{a_1}$, $f_{a_1}(x_1) = 0.4$, $f_{a_1}(x_2) = 0.5$, $f_{a_1}(x_3) = 0.4$, $f_{a_1}(x_4) = 0.7$; the meaning of $f_{a_2}, f_{a_3}, f_{a_4}, f_{d_1}, f_{d_2}$ is similar. P is the uniform probability distribution on $U = \{x_1, x_2, x_3, x_4\}$.

Let U be a finite and non-empty set called the universe. A fuzzy set $\xi$ of U is defined by a membership function $\xi : U \to [0,1]$; $\xi(x)$ is the value of the fuzzy set $\xi$ at the element x. A fuzzy set $\eta$ is called a fuzzy subset of a fuzzy set $\xi$, denoted $\eta \subseteq \xi$, if $\eta(x) \le \xi(x)$ for all $x \in U$. The intersection $\xi \cap \eta$ and union $\xi \cup \eta$ of two fuzzy sets $\xi$ and $\eta$ are interpreted in the usual sense, namely $(\xi \cap \eta)(x) = \xi(x) \wedge \eta(x)$ and $(\xi \cup \eta)(x) = \xi(x) \vee \eta(x)$ for all $x \in U$. A fuzzy set R of $U \times U$ is also called a fuzzy relation on U. The t-composite relation $R \circ R$ of the fuzzy relation R with itself is defined by $R \circ R(x, y) = \bigvee_{z \in U} (R(x, z) \otimes R(z, y))$ for all $x, y \in U$. If R is reflexive (i.e., $R(x, x) = 1$ for all $x \in U$), symmetric (i.e., $R(x, y) = R(y, x)$ for all $x, y \in U$), and t-transitive (i.e., $R \circ R \subseteq R$), then R is called a fuzzy t-similarity relation on U.

2.3 Random fuzzy approximation space

As in an IS, in a FIS (U, A, F) each attribute subset $B \subseteq A$ determines a fuzzy relation $R_B$ on U by

$$R_B(x_i, x_j) = \bigwedge_{a \in B} (F(x_i, a) \leftrightarrow F(x_j, a)), \quad x_i, x_j \in U.$$

Theorem 2.3 (see [34]) $R_B$ is a fuzzy t-similarity relation on U, and the fuzzy set $[x_i]_B(y) = R_B(x_i, y)$, $y \in U$, is called the fuzzy t-similarity class of the element $x_i$ generated by the fuzzy t-similarity relation $R_B$.
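As a small illustration (our own sketch, not from the paper), $R_B$ and the similarity classes can be computed directly from a data table. Here we hard-code the condition part of Table 2 and reuse the `equiv` helper (Gödel operators) from the earlier snippet.

```python
# Sketch: the fuzzy t-similarity relation R_B(xi, xj) = min over a in B of
# F(xi, a) <-> F(xj, a), on the condition attributes of Table 2.

F = {
    'x1': {'a1': 0.4, 'a2': 0.6, 'a3': 0.3, 'a4': 0.8},
    'x2': {'a1': 0.5, 'a2': 0.7, 'a3': 0.6, 'a4': 0.4},
    'x3': {'a1': 0.4, 'a2': 0.5, 'a3': 0.4, 'a4': 0.7},
    'x4': {'a1': 0.7, 'a2': 0.6, 'a3': 0.4, 'a4': 0.8},
}
A = ['a1', 'a2', 'a3', 'a4']

def R(B, xi, xj):
    return min(equiv(F[xi][a], F[xj][a]) for a in B)

# fuzzy t-similarity class [x1]_A, i.e. the membership vector y -> R_A(x1, y)
print({y: R(A, 'x1', y) for y in F})
```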

Definition 2.4 ((U, P), $R_B$) is called a random fuzzy approximation space (RFAS, for short) w.r.t. the attribute subset B. For any $X \in \mathcal{F}(U)$ ($\mathcal{F}(U)$ denotes the set of all fuzzy sets of U), we define a pair of lower and upper approximations of X w.r.t. the RFAS ((U, P), $R_B$) as follows:

$$\underline{R}_B(X)(x) = \bigwedge_{y \in U} (R_B(x, y) \to X(y)),$$
$$\overline{R}_B(X)(x) = \bigvee_{y \in U} (R_B(x, y) \otimes X(y)).$$

The operators $\underline{R}_B$ and $\overline{R}_B$ from $\mathcal{F}(U)$ to $\mathcal{F}(U)$ are referred to as the lower and upper random fuzzy rough approximation operators of ((U, P), $R_B$), respectively, and the pair $(\underline{R}_B(X), \overline{R}_B(X))$ is called the random fuzzy rough set of X w.r.t. ((U, P), $R_B$).

Remark 2.5 In Definition 2.4, the expression $\bigwedge_{y \in U} (R_B(x, y) \to X(y))$ can be interpreted as a measure of the inclusion of the fuzzy set $[x]_B$ in the fuzzy set X, and it can be denoted by $u([x]_B \subseteq X)$. If X and $[x]_B$ are crisp sets, then obviously $u([x]_B \subseteq X) = 1$ if $[x]_B \subseteq X$ and $u([x]_B \subseteq X) = 0$ otherwise. Hence, $\underline{R}_B(X)(x) = u([x]_B \subseteq X)$. Analogously, the expression $\bigvee_{y \in U} (R_B(x, y) \otimes X(y))$ can be interpreted as a measure of the intersection of the fuzzy set $[x]_B$ with the fuzzy set X, and it can be denoted by $u([x]_B \cap X)$. If X and $[x]_B$ are crisp sets, then obviously $u([x]_B \cap X) = 1$ if $[x]_B \cap X \ne \emptyset$ and $u([x]_B \cap X) = 0$ otherwise. Hence, $\overline{R}_B(X)(x) = u([x]_B \cap X)$. Therefore, the lower approximation $\underline{R}_B$ and the upper approximation $\overline{R}_B$ of a fuzzy set X w.r.t. a RFAS can be viewed as extensions of Pawlak's lower and upper approximation operators.
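Continuing the sketch above (again with the Gödel pair; all names come from our snippets, not from the paper), the two approximation operators of Definition 2.4 read as follows.

```python
# Sketch: lower/upper random fuzzy rough approximations of a fuzzy set X:
#   lower(X)(x) = min_y (R_B(x,y) -> X(y)),
#   upper(X)(x) = max_y (R_B(x,y) (*) X(y)).

def lower(B, X):
    return {x: min(imp_godel(R(B, x, y), X[y]) for y in F) for x in F}

def upper(B, X):
    return {x: max(t_godel(R(B, x, y), X[y]) for y in F) for x in F}

X = {'x1': 0.7, 'x2': 0.8, 'x3': 0.6, 'x4': 0.8}   # e.g. the d1 column of Table 2
print(lower(A, X))
print(upper(A, X))   # lower(X) <= X <= upper(X) holds pointwise
```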

Theorem 2.6 For $X, Y \in \mathcal{F}(U)$:

(1) $\underline{R}_B(\emptyset) = \overline{R}_B(\emptyset) = \emptyset$, $\underline{R}_B(U) = \overline{R}_B(U) = U$;
(2) $\underline{R}_B(X) \subseteq X \subseteq \overline{R}_B(X)$;
(3) $\underline{R}_B(X \cap Y) = \underline{R}_B(X) \cap \underline{R}_B(Y)$, $\overline{R}_B(X \cup Y) = \overline{R}_B(X) \cup \overline{R}_B(Y)$;
(4) $\underline{R}_B(X \cup Y) \supseteq \underline{R}_B(X) \cup \underline{R}_B(Y)$, $\overline{R}_B(X \cap Y) \subseteq \overline{R}_B(X) \cap \overline{R}_B(Y)$;
(5) $X \subseteq Y \Rightarrow \underline{R}_B(X) \subseteq \underline{R}_B(Y)$ and $\overline{R}_B(X) \subseteq \overline{R}_B(Y)$;
(6) $\underline{R}_B([x]_B) = [x]_B$, $\overline{R}_B([x]_B) = [x]_B$;
(7) $\underline{R}_B(\underline{R}_B(X)) = \underline{R}_B(X)$, $\overline{R}_B(\overline{R}_B(X)) = \overline{R}_B(X)$;
(8) $\underline{R}_B(\overline{R}_B(X)) = \overline{R}_B(X)$, $\overline{R}_B(\underline{R}_B(X)) = \underline{R}_B(X)$.

Proof They are obvious. □

The lower and upper approximations can be understood as a pair of additional unary fuzzy set-theoretic operators $\underline{R}_B, \overline{R}_B : \mathcal{F}(U) \to \mathcal{F}(U)$, called approximation operators. There is an obvious duality between $\underline{R}_B$ and $\overline{R}_B$, related to the duality between $\to$ and $\otimes$. In Theorem 2.6, (1) gives the boundary conditions that the operators must meet at the two extreme points of $\mathcal{F}(U)$, the minimum element $\emptyset$ and the maximum element U. (2) says that the two operators produce a range within which the given set lies. (3) and (4) may be viewed as distributivity and weak distributivity of the operators $\underline{R}_B, \overline{R}_B$ over fuzzy set intersection and union. (5) shows that the two operators are monotone on $\mathcal{F}(U)$. (6) indicates that all fuzzy t-similarity classes $[x_i]_B$ are fixed under the two operators. (7) shows that the lower approximation $\underline{R}_B(X)$ and the upper approximation $\overline{R}_B(X)$ of each fuzzy set are fixed points of the operators $\underline{R}_B$ and $\overline{R}_B$, respectively. (8) shows that the upper approximation $\overline{R}_B(X)$ of each fuzzy set is a fixed point of the operator $\underline{R}_B$, and the lower approximation $\underline{R}_B(X)$ of each fuzzy set is a fixed point of the operator $\overline{R}_B$.

The expressions for the lower approximation $\underline{R}_B(X)$ and the upper approximation $\overline{R}_B(X)$ of a fuzzy set X in a RFAS are the same as in the fuzzy rough set model [35]; the probability distribution P appears only implicitly in the expressions, being used solely to compute the probabilities of the fuzzy events $\underline{R}_B(X)$ and $\overline{R}_B(X)$ (as shown later in this paper). If the attribute values of a fuzzy information system are taken only in {0, 1}, then $R_B$ is a crisp relation; in this case, the lower approximation operator $\underline{R}_B$ and the upper approximation operator $\overline{R}_B$ of this paper coincide with the lower and upper approximation operators of Pawlak's rough set model, for which see [1].

Another way to look at the definition of the lower approximation operator $\underline{R}_B$ and the upper approximation operator $\overline{R}_B$ in a RFAS is through fuzzy topology. In fact, $\underline{R}_B$ belongs to a very special subclass of the fuzzy interior operators of the class of fuzzy topological spaces called fuzzy t-locality spaces, and $\overline{R}_B$ belongs to a very special subclass of the fuzzy closure operators of the class of fuzzy topological spaces called fuzzy t-neighborhood spaces, for which see [36, 37].

In order to find new ways of knowledge reduction in RFIS and RFDIS, we now discuss fuzzy belief measures and fuzzy plausibility measures derived from the lower and upper approximations in a RFAS. We first recall the definition of fuzzy belief measures and fuzzy plausibility measures [1, 38, 39], which is nothing but an extension to fuzzy sets of the definition given in [40] for crisp sets.

Definition 2.7 A function $m : \mathcal{F}(U) \to [0,1]$ with

$$m(\emptyset) = 0, \qquad \sum_{E \in \mathcal{F}(U)} m(E) = 1,$$

is called a basic probability assignment; if $m(E) > 0$, then E is called a focal element.

Proposition 2.8 (see [38]) Given a basic probability assignment m on $\mathcal{F}(U)$, the focal set $\mathcal{M} = \{E : m(E) > 0\}$ is a countable (possibly finite) set.

Definition 2.9 (see [1]) Suppose that m is a basic probability assignment and $\mathcal{M}$ is its focal set. The function $Bel : \mathcal{F}(U) \to [0,1]$ that maps every $X \in \mathcal{F}(U)$ to its belief degree Bel(X), defined by

$$Bel(X) = \sum_{E \in \mathcal{M}} m(E) \left( \bigwedge_{y \in U} (E(y) \to X(y)) \right),$$

is called the fuzzy belief measure induced by m, and the function $Pl : \mathcal{F}(U) \to [0,1]$ that maps every $X \in \mathcal{F}(U)$ to its plausibility degree Pl(X), defined by

$$Pl(X) = \sum_{E \in \mathcal{M}} m(E) \left( \bigvee_{y \in U} (E(y) \otimes X(y)) \right),$$

is called the fuzzy plausibility measure induced by m.

Proposition 2.10 Let $\mathcal{A}(B) = \{[x]_B : x \in U\}$ be the set of all fuzzy t-similarity classes generated by the fuzzy t-similarity relation $R_B$. The function $m_B : \mathcal{F}(U) \to [0,1]$ defined by

$$m_B(E) = P(\{x \in U : [x]_B = E\})$$

is a basic probability assignment, and $\mathcal{M}_B = \{[x]_B : x \in U\}$ is its focal set.

Proof Since P is a normal probability distribution on U, we have $m_B(\emptyset) = 0$, $m_B(E) > 0$ for every $E \in \mathcal{M}_B$, and $\sum_{E \in \mathcal{M}_B} m_B(E) = P(U) = 1$. This means that $m_B$ is a basic probability assignment and $\mathcal{M}_B = \{[x]_B : x \in U\}$ is its focal set. □

Theorem 2.11 Let ((U, P), $R_B$) be a RFAS. The fuzzy belief measure induced by $m_B$ satisfies:

(1) $Bel_B(\emptyset) = 0$, $Bel_B(U) = 1$;
(2) $Bel_B(X_1 \cup X_2 \cup \cdots \cup X_n) \ge \sum_{\emptyset \ne I \subseteq \{1,2,\ldots,n\}} (-1)^{|I|+1} Bel_B(\bigcap_{i \in I} X_i)$, for every positive integer n and every n-tuple $X_1, X_2, \ldots, X_n$ of fuzzy subsets of U;
(3) $Bel_B(X) = \widetilde{P}(\underline{R}_B(X)) = \sum_{x \in U} P(\{x\}) \underline{R}_B(X)(x)$.

The fuzzy plausibility measure induced by $m_B$ satisfies:

(4) $Pl_B(\emptyset) = 0$, $Pl_B(U) = 1$;
(5) $Pl_B(X_1 \cap X_2 \cap \cdots \cap X_n) \le \sum_{\emptyset \ne I \subseteq \{1,2,\ldots,n\}} (-1)^{|I|+1} Pl_B(\bigcup_{i \in I} X_i)$, for every positive integer n and every n-tuple $X_1, X_2, \ldots, X_n$ of fuzzy subsets of U;
(6) $Pl_B(X) = \widetilde{P}(\overline{R}_B(X)) = \sum_{x \in U} P(\{x\}) \overline{R}_B(X)(x)$.

Proof The result of this theorem is a special case of [30]. □

Remark 2.12 By Remark 2.5 and Theorem 2.11, we have

$$Bel_B(X) = \sum_{E \in \mathcal{M}_B} m_B(E)\, u(E \subseteq X), \qquad Pl_B(X) = \sum_{E \in \mathcal{M}_B} m_B(E)\, u(E \cap X).$$

These expressions are very close to the traditional definition of a belief function through the mass function [40].
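Items (3) and (6) of Theorem 2.11 make $Bel_B$ and $Pl_B$ direct to compute: they are just P-weighted averages of the lower and upper approximations. Continuing our sketch (uniform P as in Example 2.2, Gödel operators, names from the earlier snippets):

```python
# Sketch: Bel_B(X) = sum_x P({x}) * lower(X)(x),
#         Pl_B(X)  = sum_x P({x}) * upper(X)(x)   (Theorem 2.11 (3) and (6)).

P = {x: 1.0 / len(F) for x in F}      # the uniform probability of Example 2.2

def bel(B, X):
    lo = lower(B, X)
    return sum(P[x] * lo[x] for x in F)

def pl(B, X):
    up = upper(B, X)
    return sum(P[x] * up[x] for x in F)

print(bel(A, X), pl(A, X))            # Bel_B(X) <= Pl_B(X)
```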

3 Knowledge reduction methods in RFIS and RFDIS

Using fuzzy belief measures and fuzzy plausibility measures, we can obtain knowledge reduction methods in RFIS and RFDIS.

Let ((U, P), A, F) be a RFIS. If an attribute subset B satisfies $R_A = R_B$ and $R_{B - \{b\}} \ne R_A$ for every $b \in B$, then B is referred to as a reduction of ((U, P), A, F). A RFDIS ((U, P), A ∪ D, F) is called consistent if $R_A \subseteq R_D$. For a consistent RFDIS ((U, P), A ∪ D, F), if an attribute subset B satisfies $R_B \subseteq R_D$ and $R_{B'} \not\subseteq R_D$ for every $B' \subset B$, then B is referred to as a reduction of ((U, P), A ∪ D, F). The intersection of all reductions is called the core of A.

Proposition 3.1 Let ((U, P), A, F) be a RFIS. For $B \subseteq A$, $R_B = R_A$ if and only if

$$\sum_{i=1}^{l} Pl_B(X_i) = M_A,$$

where $\mathcal{A}(A) = \{[x]_A : x \in U\} = \{X_1, \ldots, X_l\}$ and $M_A = \sum_{i=1}^{l} \widetilde{P}(X_i) = \sum_{i=1}^{l} \sum_{x \in U} P(\{x\}) X_i(x)$.

Proof If $B \subseteq A$ and $R_B = R_A$, then $\overline{R}_B(X_i) = \overline{R}_A(X_i) = X_i$ ($i \le l$) by Theorem 2.6. Hence,

$$\sum_{i=1}^{l} Pl_B(X_i) = \sum_{i=1}^{l} \widetilde{P}(\overline{R}_B(X_i)) = \sum_{i=1}^{l} \widetilde{P}(X_i) = M_A.$$

Conversely, suppose that $\sum_{i=1}^{l} Pl_B(X_i) = M_A$. By Theorem 2.6, we have $\overline{R}_B(X_i) \supseteq X_i$ ($i \le l$). Hence,

$$\sum_{i=1}^{l} Pl_B(X_i) = \sum_{i=1}^{l} \widetilde{P}(\overline{R}_B(X_i)) \ge \sum_{i=1}^{l} \widetilde{P}(X_i) = M_A.$$

It follows from $\sum_{i=1}^{l} Pl_B(X_i) = M_A$ that

$$\sum_{i=1}^{l} \widetilde{P}(\overline{R}_B(X_i)) = \sum_{i=1}^{l} \widetilde{P}(X_i),$$

and together with $\overline{R}_B(X_i) \supseteq X_i$ this gives $\overline{R}_B(X_i) = X_i$ ($i \le l$). Based on this fact, we now prove that $R_A = R_B$.

In fact, if $R_A \ne R_B$, then there are $x_0, y_0 \in U$ such that $R_A(x_0, y_0) \ne R_B(x_0, y_0)$. On the one hand, from $R_A \subseteq R_B$ we then have $R_A(x_0, y_0) < R_B(x_0, y_0)$. On the other hand, it follows from

$$\overline{R}_B([x_0]_A)(y_0) = \bigvee_{z \in U} [R_B(y_0, z) \otimes [x_0]_A(z)] = \bigvee_{z \in U} [R_B(y_0, z) \otimes R_A(x_0, z)] = [x_0]_A(y_0)$$

that $R_B(y_0, z) \otimes R_A(x_0, z) \le [x_0]_A(y_0) = R_A(x_0, y_0)$ for all $z \in U$. Hence, $R_B(y_0, x_0) \otimes R_A(x_0, x_0) \le R_A(x_0, y_0)$, which shows that $R_B(x_0, y_0) \le R_A(x_0, y_0)$. This is a contradiction. □

Proposition 3.2 Let ((U, P), A, F) be a RFIS. For $B \subseteq A$, $R_B = R_A$ if and only if

$$\sum_{i=1}^{l} Bel_B(X_i) = M_A,$$

where $\mathcal{A}(A) = \{[x]_A : x \in U\} = \{X_1, \ldots, X_l\}$ and $M_A = \sum_{i=1}^{l} \widetilde{P}(X_i) = \sum_{i=1}^{l} \sum_{x \in U} P(\{x\}) X_i(x)$.

Proof If $B \subseteq A$ and $R_B = R_A$, then $\underline{R}_B(X_i) = \underline{R}_A(X_i) = X_i$ ($i \le l$) by Theorem 2.6. Hence,

$$\sum_{i=1}^{l} Bel_B(X_i) = \sum_{i=1}^{l} \widetilde{P}(\underline{R}_B(X_i)) = \sum_{i=1}^{l} \widetilde{P}(X_i) = M_A.$$

Conversely, suppose that $\sum_{i=1}^{l} Bel_B(X_i) = M_A$. By Theorem 2.6, we have $\underline{R}_B(X_i) \subseteq X_i$ ($i \le l$). Hence,

$$\sum_{i=1}^{l} Bel_B(X_i) = \sum_{i=1}^{l} \widetilde{P}(\underline{R}_B(X_i)) \le \sum_{i=1}^{l} \widetilde{P}(X_i) = M_A.$$

It follows from $\sum_{i=1}^{l} Bel_B(X_i) = M_A$ that

$$\sum_{i=1}^{l} \widetilde{P}(\underline{R}_B(X_i)) = \sum_{i=1}^{l} \widetilde{P}(X_i),$$

and together with $\underline{R}_B(X_i) \subseteq X_i$ this gives $\underline{R}_B(X_i) = X_i$ ($i \le l$). Based on this fact, we now prove that $R_A = R_B$.

In fact, if $R_A \ne R_B$, then there are $x_0, y_0 \in U$ such that $R_A(x_0, y_0) \ne R_B(x_0, y_0)$. On the one hand, from $R_A \subseteq R_B$ we then have $R_A(x_0, y_0) < R_B(x_0, y_0)$. On the other hand, it follows from

$$\underline{R}_B([x_0]_A)(x_0) = \bigwedge_{z \in U} [R_B(x_0, z) \to [x_0]_A(z)] = \bigwedge_{z \in U} [R_B(x_0, z) \to R_A(x_0, z)] = [x_0]_A(x_0) = R_A(x_0, x_0) = 1$$

that $R_B(x_0, z) \to R_A(x_0, z) = 1$ for all $z \in U$. Hence, $R_B(x_0, z) \le R_A(x_0, z)$ for all $z \in U$; in particular, $R_B(x_0, y_0) \le R_A(x_0, y_0)$. This is a contradiction. □

Propositions 3.1 and 3.2 are interesting in the sense that whether a subset of the attribute set determines the same fuzzy t-similarity relation as the whole attribute set can be characterized by the fuzzy belief measures or fuzzy plausibility measures of the fuzzy t-similarity classes; thus we can decide whether a subset of the attribute set is a reduction by computing the fuzzy belief measures or fuzzy plausibility measures of the fuzzy t-similarity classes. Hence, from Propositions 3.1 and 3.2 we obtain the following Theorem 3.3, which gives a knowledge reduction method in a RFIS.

Theorem 3.3 Let ((U, P), A, F) be a RFIS. Denote $\mathcal{A}(A) = \{[x]_A : x \in U\} = \{X_1, \ldots, X_l\}$ and $M_A = \sum_{i=1}^{l} \widetilde{P}(X_i)$. Then the following three assertions are equivalent:

(1) $B \subseteq A$ is a reduction of ((U, P), A, F).
(2) $\sum_{i=1}^{l} Bel_B(X_i) = M_A$ and $\sum_{i=1}^{l} Bel_{B'}(X_i) < M_A$ for every $B' \subset B$.
(3) $\sum_{i=1}^{l} Pl_B(X_i) = M_A$ and $\sum_{i=1}^{l} Pl_{B'}(X_i) > M_A$ for every $B' \subset B$.

Theorem 3.4 Let ((U, P), A ∪ D, F) be a RFDIS. Denote $\mathcal{A}(D) = \{[x]_D : x \in U\} = \{D_1, \ldots, D_k\}$ and $M_D = \sum_{j=1}^{k} \widetilde{P}(D_j)$. Then the following three assertions are equivalent:

(1) $B \subseteq A$ is a reduction of ((U, P), A ∪ D, F).
(2) $\sum_{j=1}^{k} Bel_B(D_j) = M_D$ and $\sum_{j=1}^{k} Bel_{B'}(D_j) < M_D$ for every $B' \subset B$.
(3) $\sum_{j=1}^{k} Pl_B(D_j) = M_D$ and $\sum_{j=1}^{k} Pl_{B'}(D_j) > M_D$ for every $B' \subset B$.

Proof The proof is similar to that of Theorem 3.3. □
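The first part of condition (2) of Theorem 3.4 can be tested numerically. The sketch below (ours, built on the previous snippets and the Table 2 data) forms the distinct decision classes $[x]_D$, computes $M_D$, and compares it with $\sum_j Bel_B(D_j)$ for a candidate subset B.

```python
# Sketch: checking sum_j Bel_B(D_j) = M_D (cf. Theorem 3.4 (2)) on Table 2.

Fd = {'x1': {'d1': 0.7, 'd2': 0.6}, 'x2': {'d1': 0.8, 'd2': 0.7},
      'x3': {'d1': 0.6, 'd2': 0.5}, 'x4': {'d1': 0.8, 'd2': 0.7}}

def Rd(xi, xj):   # R_D(xi, xj) = min over d of F(xi, d) <-> F(xj, d)
    return min(equiv(Fd[xi][d], Fd[xj][d]) for d in ('d1', 'd2'))

# the distinct fuzzy t-similarity classes D_1, ..., D_k generated by R_D
classes = {tuple(Rd(x, y) for y in sorted(F)): {y: Rd(x, y) for y in F}
           for x in F}
D_classes = list(classes.values())

M_D = sum(P[x] * Dj[x] for Dj in D_classes for x in F)

def bel_sum(B):
    return sum(bel(B, Dj) for Dj in D_classes)

print(bel_sum(A), M_D)   # equal for B = A, since this RFDIS is consistent
```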

Remark 3.5 If ((U, P), A ∪ D, F) is a fuzzy decision system and P is the uniform probability distribution on U, then $D_1, \ldots, D_k$ are crisp sets, $\mathcal{A}(D) = \{[x]_D : x \in U\} = \{D_1, \ldots, D_k\}$ is a partition of U, and $M_D = \sum_{j=1}^{k} \widetilde{P}(D_j) = \sum_{j=1}^{k} \sum_{x \in U} P(\{x\}) D_j(x) = \sum_{j=1}^{k} |D_j| / |U| = 1$. In this case, the knowledge reduction of this paper is exactly the upper approximation reduction of [31]. Hence, Theorem 3.4 can be viewed as an extension of Theorem 6 and Corollary 3 of [31].

Theorem 3.4 gives a knowledge reduction method for a RFDIS: by computing the fuzzy belief measures or fuzzy plausibility measures of the fuzzy t-similarity classes determined by the decision attribute set, we can decide whether a subset of the attribute set is a reduction.

Because R-implication operators are extensively used in fuzzy reasoning, in the following we introduce the notion of the fuzzy-set-valued attribute discernibility matrix in RFIS and RFDIS using R-implication operators, which can be regarded as an extension to RFIS and RFDIS of the corresponding notion in Pawlak's rough set theory [2, 5], and then make use of it to give a knowledge reduction method in RFIS and RFDIS.

Let ((U, P), A, F) be a RFIS. We define a fuzzy-set-valued attribute discernibility matrix in the RFIS as follows: for $x_i, x_j \in U$,

$$\sigma(x_i, x_j)(a) = (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)), \quad a \in A.$$

Denote

$$H = \{(x_i, x_j) : i \le j\}, \qquad \mathcal{F}(H) = \{\sigma(x_i, x_j) : (x_i, x_j) \in H\}.$$

If we define $P'(x_i, x_j) = 2 P(x_i) P(x_j)$ for $i \ne j$ and $P'(x_i, x_j) = P(x_i) P(x_j)$ for $i = j$, then $P'$ is a probability distribution on H. For any $E \in \mathcal{F}(H)$, if we define

$$\kappa(E) = \{(x_i, x_j) \in H : \sigma(x_i, x_j) = E\}, \qquad m(E) = P'(\kappa(E)),$$

then $\sum_{E \in \mathcal{F}(H)} m(E) = 1$. For each attribute subset $B \subseteq A$, denote

$$Pl^*(B) = \sum_{\sigma(x_i, x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigvee_{a \in A} (\sigma(x_i, x_j)(a) \otimes B(a)),$$

$$Bel^*(B) = \sum_{\sigma(x_i, x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigwedge_{a \in A} (\sigma(x_i, x_j)(a) \to B(a)),$$

where $B(a) = 1$ if $a \in B$ and $B(a) = 0$ otherwise.
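In code (our sketch, continuing the earlier snippets with the Gödel operators), a discernibility entry is just the implication from the per-attribute equivalence to $R_A(x_i, x_j)$, and $Pl^*(B)$ can be accumulated pair by pair, since pairs sharing the same $\sigma$ simply split their mass.

```python
# Sketch: sigma(xi, xj)(a) and Pl*(B) for the RFIS built on Table 2's
# condition attributes; the mass P'(xi, xj) is summed per pair, which is
# equivalent to summing m(E) over the distinct entries E.

from itertools import combinations_with_replacement

def sigma(xi, xj):
    ra = R(A, xi, xj)     # R_A(xi, xj)
    return {a: imp_godel(equiv(F[xi][a], F[xj][a]), ra) for a in A}

H = list(combinations_with_replacement(sorted(F), 2))   # pairs with i <= j

def mass(xi, xj):         # P'(xi, xj)
    return P[xi] * P[xj] * (2 if xi != xj else 1)

def pl_star(B):
    # with the Goedel t-norm, the join over a in A of sigma(a) (*) B(a)
    # is just the maximum of sigma over the attributes in B
    return sum(mass(xi, xj) * max(sigma(xi, xj)[a] for a in B)
               for xi, xj in H)

print(pl_star(A))   # = 1: at every pair, sigma(xi, xj)(a) = 1 for some a
```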

Theorem 3.6 Let ((U, P), A, F) be a RFIS, with $\sigma(x_i, x_j)$ as above. If there are $x_i, x_j$ such that $\{a \in A : \sigma(x_i, x_j)(a) = 1\}$ is a singleton $\{a_0\}$, then $a_0$ is an element of core(A).

Proof If |A| = 1, i.e., A is a singleton {a}, then core(A) = {a}. Suppose there are $x_i, x_j$ such that $\{a \in A : \sigma(x_i, x_j)(a) = 1\}$ is the singleton $\{a_0\}$; we need to prove that $a_0$ is a common element of all reductions. Since $\{a \in A : \sigma(x_i, x_j)(a) = 1\} = \{a_0\}$, we know that

$$F(x_i, a_0) \leftrightarrow F(x_j, a_0) \le \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)),$$

and

$$F(x_i, a) \leftrightarrow F(x_j, a) > \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b))$$

for all $a \in A - \{a_0\}$. Hence,

$$R_{A - \{a_0\}}(x_i, x_j) = \bigwedge_{b \in A - \{a_0\}} (F(x_i, b) \leftrightarrow F(x_j, b)) > R_A(x_i, x_j) = \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)).$$

Therefore, $R_{A - \{a_0\}} \ne R_A$. This shows that $a_0$ is a common element of all reductions. □

On the basis of the above argument, we have the following reduction method in a RFIS.

Theorem 3.7 Let ((U, P), A, F) be a RFIS. The following two assertions are equivalent:

(1) $B \subseteq A$ is a reduction of ((U, P), A, F).
(2) $Pl^*(B) = 1$ and $Pl^*(B') < 1$ for every $B' \subset B$.

Proof (1) ⇒ (2). Suppose that $B \subseteq A$ is a reduction; then $R_B = R_A$. Using property (11) of the R-implication operators,

$$Pl^*(B) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigvee_{a \in A} (\sigma(x_i, x_j)(a) \otimes B(a)) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigvee_{a \in B} \sigma(x_i, x_j)(a)$$
$$= \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigvee_{a \in B} \left[ (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)) \right]$$
$$= \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \left[ \bigwedge_{a \in B} (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)) \right]$$
$$= \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) (R_B(x_i, x_j) \to R_A(x_i, x_j)) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) = 1.$$

Since $R_{B'} \ne R_A$ for any $B' \subset B$, there is $(x_i^{(0)}, x_j^{(0)})$ such that $R_{B'}(x_i^{(0)}, x_j^{(0)}) > R_A(x_i^{(0)}, x_j^{(0)})$. Hence

$$\bigwedge_{a \in B'} (F(x_i^{(0)}, a) \leftrightarrow F(x_j^{(0)}, a)) \to \bigwedge_{b \in A} (F(x_i^{(0)}, b) \leftrightarrow F(x_j^{(0)}, b)) < 1.$$

Therefore,

$$Pl^*(B') = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \left[ \bigwedge_{a \in B'} (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)) \right] < 1.$$

(2) ⇒ (1). Suppose that $Pl^*(B) = 1$ and $Pl^*(B') < 1$ for any $B' \subset B$. Then

$$Pl^*(B) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) (R_B(x_i, x_j) \to R_A(x_i, x_j)) = 1 = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)).$$

Hence $R_B(x_i, x_j) \to R_A(x_i, x_j) = 1$ for all $(x_i, x_j) \in H$, which shows that $R_B(x_i, x_j) \le R_A(x_i, x_j)$. Noting that $R_A(x_i, x_j) \le R_B(x_i, x_j)$, we obtain $R_B = R_A$.

It follows from

$$Pl^*(B') = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) (R_{B'}(x_i, x_j) \to R_A(x_i, x_j)) < 1 = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j))$$

that there is $(x_i^{(0)}, x_j^{(0)})$ such that $R_{B'}(x_i^{(0)}, x_j^{(0)}) \to R_A(x_i^{(0)}, x_j^{(0)}) < 1$. Thus $R_{B'}(x_i^{(0)}, x_j^{(0)}) > R_A(x_i^{(0)}, x_j^{(0)})$, which shows that $R_{B'} \ne R_A$. Therefore, B is a reduction of ((U, P), A, F). □

If the R-implication operator is chosen as $\to_{Lu}$ or $\to_{R_0}$, which satisfy $a \to 0 = 1 - a$, then we have the following theorem.

Theorem 3.8 Let ((U, P), A, F) be a RFIS. The following two assertions are equivalent:

(1) $B \subseteq A$ is a reduction of ((U, P), A, F).
(2) $Bel^*(A - B) = 0$ and $Bel^*(A - B') > 0$ for every $B' \subset B$.

Proof (1) ⇒ (2). Suppose that $B \subseteq A$ is a reduction. Then $R_B = R_A$, and it follows that $\bigvee_{a \in B} \sigma(x_i, x_j)(a) = 1$ for every $(x_i, x_j) \in H$. Hence,

$$Bel^*(A - B) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigwedge_{a \in A} (\sigma(x_i, x_j)(a) \to (A - B)(a))$$
$$= \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigwedge_{a \in B} (\sigma(x_i, x_j)(a) \to 0) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigwedge_{a \in B} (1 - \sigma(x_i, x_j)(a))$$
$$= \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \left( 1 - \bigvee_{a \in B} \sigma(x_i, x_j)(a) \right) = 0.$$

Since $R_{B'} \ne R_A$ for any $B' \subset B$, there is $(x_i^{(0)}, x_j^{(0)})$ such that $R_{B'}(x_i^{(0)}, x_j^{(0)}) > R_A(x_i^{(0)}, x_j^{(0)})$. This means that $\bigvee_{a \in B'} \sigma(x_i^{(0)}, x_j^{(0)})(a) < 1$. Hence,

$$Bel^*(A - B') = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \left( 1 - \bigvee_{a \in B'} \sigma(x_i, x_j)(a) \right) > 0.$$

(2) ⇒ (1). Suppose that $Bel^*(A - B) = 0$ and $Bel^*(A - B') > 0$ for any $B' \subset B$. From $Bel^*(A - B) = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) (1 - \bigvee_{a \in B} \sigma(x_i, x_j)(a)) = 0$, we obtain $1 - \bigvee_{a \in B} \sigma(x_i, x_j)(a) = 0$ for all $(x_i, x_j) \in H$. This means that

$$\bigvee_{a \in B} \left[ (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)) \right] = \bigwedge_{a \in B} (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b)) = 1.$$

Hence $R_B(x_i, x_j) \le R_A(x_i, x_j)$. Noting that $R_A(x_i, x_j) \le R_B(x_i, x_j)$, we obtain $R_B = R_A$.

It follows from $Bel^*(A - B') = \sum_{\sigma(x_i,x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) (1 - \bigvee_{a \in B'} \sigma(x_i, x_j)(a)) > 0$ that there is $(x_i^{(0)}, x_j^{(0)})$ such that $1 - \bigvee_{a \in B'} \sigma(x_i^{(0)}, x_j^{(0)})(a) > 0$. This means that

$$\bigvee_{a \in B'} \left[ (F(x_i^{(0)}, a) \leftrightarrow F(x_j^{(0)}, a)) \to \bigwedge_{b \in A} (F(x_i^{(0)}, b) \leftrightarrow F(x_j^{(0)}, b)) \right] = \bigwedge_{a \in B'} (F(x_i^{(0)}, a) \leftrightarrow F(x_j^{(0)}, a)) \to \bigwedge_{b \in A} (F(x_i^{(0)}, b) \leftrightarrow F(x_j^{(0)}, b)) < 1.$$

Thus $R_{B'}(x_i^{(0)}, x_j^{(0)}) > R_A(x_i^{(0)}, x_j^{(0)})$, which shows that $R_{B'} \ne R_A$. Therefore, B is a reduction of ((U, P), A, F). □

A similar discussion can be carried out in a RFDIS. Let ((U, P), A ∪ D, F) be a RFDIS. We define the fuzzy-set-valued attribute discernibility matrix in the RFDIS as follows: for $x_i, x_j \in U$,

$$\sigma(x_i, x_j)(a) = (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{d \in D} (F(x_i, d) \leftrightarrow F(x_j, d)), \quad a \in A.$$

Denote

$$H = \{(x_i, x_j) : i \le j\}, \qquad \mathcal{F}(H) = \{\sigma(x_i, x_j) : (x_i, x_j) \in H\}.$$

If we define $P'(x_i, x_j) = 2 P(x_i) P(x_j)$ for $i \ne j$ and $P'(x_i, x_j) = P(x_i) P(x_j)$ for $i = j$, then $P'$ is a probability distribution on H. For $E \in \mathcal{F}(H)$, if we define

$$\kappa(E) = \{(x_i, x_j) \in H : \sigma(x_i, x_j) = E\}, \qquad m(E) = P'(\kappa(E)),$$

then $\sum_{E \in \mathcal{F}(H)} m(E) = 1$. For each attribute subset $B \subseteq A$, denote

$$Pl_d^*(B) = \sum_{\sigma(x_i, x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigvee_{a \in A} (\sigma(x_i, x_j)(a) \otimes B(a)),$$

$$Bel_d^*(B) = \sum_{\sigma(x_i, x_j) \in \mathcal{F}(H)} m(\sigma(x_i, x_j)) \bigwedge_{a \in A} (\sigma(x_i, x_j)(a) \to B(a)).$$
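In our running sketch, only the consequent of the implication changes in the RFDIS variant: $R_A(x_i, x_j)$ is replaced by $R_D(x_i, x_j)$ (computed by the `Rd` helper from an earlier snippet); `mass` and `pl_star` carry over unchanged.

```python
# Sketch: the RFDIS discernibility entry; everything else is as before.
def sigma_d(xi, xj):
    return {a: imp_godel(equiv(F[xi][a], F[xj][a]), Rd(xi, xj)) for a in A}
```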

Theorem 3.9 Let ((U, P), A ∪ D, F) be a RFDIS, with $\sigma(x_i, x_j)$ as above. If there are $x_i, x_j$ such that $\{a \in A : \sigma(x_i, x_j)(a) = 1\}$ is a singleton $\{a_0\}$, then $a_0$ is an element of core(A).

Proof The proof is similar to that of Theorem 3.6. □

On the basis of the above argument, and analogously to the proofs of Theorems 3.7 and 3.8, we can prove the following theorem.

Theorem 3.10 Let ((U, P), A ∪ D, F) be a RFDIS. The following two assertions are equivalent:

(1) $B \subseteq A$ is a reduction of ((U, P), A ∪ D, F).
(2) $Pl_d^*(B) = 1$ and $Pl_d^*(B') < 1$ for every $B' \subset B$.

Property 3.11 In computing $Pl_d^*(B)$ in a RFDIS ($Pl^*(B)$ in a RFIS), if for every $\sigma(x_i, x_j)$ there is $a \in B$ such that $\sigma(x_i, x_j)(a) = 1$, then $Pl_d^*(B) = 1$ ($Pl^*(B) = 1$).

If the R-implication operator is chosen as $\to_{Lu}$ or $\to_{R_0}$, then we have the following theorem.

Theorem 3.12 Let ((U, P), A ∪ D, F) be a RFDIS. The following two assertions are equivalent:

(1) $B \subseteq A$ is a reduction of ((U, P), A ∪ D, F).
(2) $Bel_d^*(A - B) = 0$ and $Bel_d^*(A - B') > 0$ for every $B' \subset B$.

4 Algorithm based on fuzzy belief measures and fuzzy plausibility measures for knowledge reduction

In this section, a heuristic algorithm based on fuzzy belief measures and fuzzy plausibility measures for knowledge reduction (FBMKR, for short) is presented. Since the core is the common part of all reductions, it can be used as the starting point for computing a reduction. The algorithm finds an approximately minimal reduction.

Algorithm for knowledge reduction in a RFDIS (RFIS):

Input: a RFDIS ((U, P), A ∪ D, F) (or a RFIS ((U, P), A, F)).
Output: one reduction Q of A.

Step 1. n := |U|, core(A) := ∅.
For i = 1 to n do
  For j = 1 to i do
    (1) C(x_i, x_j) := ∅.
    (2) Compute $\sigma(x_i, x_j)(a) = (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{d \in D} (F(x_i, d) \leftrightarrow F(x_j, d))$ for every attribute $a \in A$ (in a RFIS, $\sigma(x_i, x_j)(a) = (F(x_i, a) \leftrightarrow F(x_j, a)) \to \bigwedge_{b \in A} (F(x_i, b) \leftrightarrow F(x_j, b))$).
    (3) For every attribute $a \in A$: if $\sigma(x_i, x_j)(a) = 1$, then C(x_i, x_j) ∪ {a} → C(x_i, x_j).
    (4) If C(x_i, x_j) is a singleton, i.e., C(x_i, x_j) = {a}, then core(A) ∪ {a} → core(A).
  Endfor
Endfor

Step 2. If C(x_i, x_j) ∩ core(A) ≠ ∅ for all C(x_i, x_j), then the algorithm terminates (core(A) is the minimal reduction).

(From Step 3 to Step 5, a subset C of the attribute set A is created by adding attributes.)

Step 3. C := core(A), C′ := A − core(A).
Step 4. Take a ∈ C′; C ∪ {a} → C.
Step 5. If there is some C(x_i, x_j) such that C(x_i, x_j) ∩ C = ∅, then C′ − {a} → C′ and go to Step 4.

Step 6. (Create a reduction Q of A by dropping attributes.) Set C″ := C − core(A), |C″| → m.
For k = 1 to m do
  (1) Remove the k-th attribute $a_k$ from C″.
  (2) If there is some C(x_i, x_j) such that C(x_i, x_j) ∩ (C″ ∪ core(A)) = ∅, then C″ ∪ {a_k} → C″.
Endfor

Step 7. Let C″ ∪ core(A) → Q; the algorithm terminates (the result Q constitutes a reduction of A).
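The following compact sketch (ours, using the Gödel operators and the helpers from the earlier snippets; the attribute scan order is arbitrary, so other reductions may be returned) implements the steps above on the Table 2 RFDIS.

```python
# Sketch of FBMKR on the Table 2 RFDIS, reusing sigma_d from above.

def fbmkr(A, F):
    objs = sorted(F)
    pairs = [(objs[i], objs[j]) for i in range(len(objs)) for j in range(i + 1)]
    # Step 1: discernibility sets C(xi, xj) and the core
    C = [{a for a in A if sigma_d(xi, xj)[a] == 1.0} for xi, xj in pairs]
    core = {next(iter(s)) for s in C if len(s) == 1}
    # Step 2: if the core already meets every C(xi, xj), it is the reduction
    if all(s & core for s in C):
        return core
    # Steps 3-5: grow a cover by adding attributes until every set is met
    cover, rest = set(core), [a for a in A if a not in core]
    while any(not (s & cover) for s in C):
        cover.add(rest.pop(0))
    # Step 6: try to drop each added attribute again
    extra = cover - core
    for a in sorted(extra):
        if all(s & ((extra - {a}) | core) for s in C):
            extra.discard(a)
    # Step 7: the reduction Q
    return extra | core

print(fbmkr(A, F))   # {'a1', 'a2'} with the Goedel implication (cf. Example 4.1)
```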

Using this algorithm, the time complexity to find one reduction is polynomial. At Step 1, the time complexity of computing all $\sigma(x_i, x_j)$ is $O(|U|^2 |A \cup D|)$ (in a RFIS, it is $O(|U|^2 |A|)$), and the time complexity of computing all C(x_i, x_j) is $O(|U|^2 |A|)$; so the cost of Step 1 is $O(|U|^2 |A \cup D|)$ ($O(|U|^2 |A|)$ in a RFIS). From Step 4 to Step 5, the time complexity is $O(|U|^2 |A|)$. At Step 6, the time complexity is $O(|U|^2 |A|)$. Thus, the time complexity of this algorithm is $O(|U|^2 |A \cup D|)$ ($O(|U|^2 |A|)$ in a RFIS).

Example 4.1 Let us consider the RFDIS presented in Example 2.2. For Table 2, we compute an approximately minimal reduction using the algorithm FBMKR. The R-implication operator is first chosen as $\to_G$.

Step 1A. Compute $\sigma(x_1, x_1) = \sigma(x_2, x_2) = \sigma(x_3, x_3) = \sigma(x_4, x_4) = \sigma(x_1, x_2) = \sigma(x_2, x_3) = \sigma(x_2, x_4) = A$; $\sigma(x_1, x_3) = 0.5/a_1 + 1/a_2 + 1/a_3 + 0.5/a_4$; $\sigma(x_1, x_4) = 1/a_1 + 0.6/a_2 + 1/a_3 + 0.6/a_4$; $\sigma(x_3, x_4) = 1/a_1 + 1/a_2 + 0.5/a_3 + 0.5/a_4$. Compute $C(x_1, x_1) = C(x_2, x_2) = C(x_3, x_3) = C(x_4, x_4) = C(x_1, x_2) = C(x_2, x_3) = C(x_2, x_4) = A$, $C(x_1, x_3) = \{a_2, a_3\}$, $C(x_1, x_4) = \{a_1, a_3\}$, $C(x_3, x_4) = \{a_1, a_2\}$; core(A) = ∅.

Step 3A. Set C := ∅, C′ := A.
Step 4A. Taking $a_1$, set C := ∅ ∪ {a_1} = {a_1}.
Step 5A. Since $C(x_1, x_3) \cap C = \emptyset$, set C′ := {a_2, a_3, a_4} and go to Step 4A′.
Step 4A′. Taking $a_2$, set C := {a_1} ∪ {a_2} = {a_1, a_2}.
Step 5A′. Since $C(x_i, x_j) \cap C \ne \emptyset$ for all $x_i, x_j$, go to Step 6A.
Step 6A. Set C″ := C − core(A) = {a_1, a_2}, m := |C″| = 2.
For k = 1: set C″ := C″ − {a_1} = {a_2}. Since $C(x_1, x_4) \cap (C'' \cup \mathrm{core}(A)) = \{a_1, a_3\} \cap \{a_2\} = \emptyset$, set C″ := C″ ∪ {a_1} = {a_1, a_2}.
For k = 2: set C″ := C″ − {a_2} = {a_1}. Since $C(x_1, x_3) \cap (C'' \cup \mathrm{core}(A)) = \{a_2, a_3\} \cap \{a_1\} = \emptyset$, set C″ := C″ ∪ {a_2} = {a_1, a_2}.
Step 7A. Let Q := C″ ∪ core(A) = {a_1, a_2}; Q = {a_1, a_2} is one reduction of A.

In the above run, if in Step 4A′ we take $a_4$ instead and set C := {a_1} ∪ {a_4} = {a_1, a_4}, then we check easily that {a_1, a_4} is not a reduction of A. But if the R-implication operator is chosen as $\to_{Lu}$, then we obtain the following result.

Step 1B. Compute $\sigma(x_1, x_1) = \sigma(x_2, x_2) = \sigma(x_3, x_3) = \sigma(x_4, x_4) = \sigma(x_1, x_2) = \sigma(x_2, x_4) = A$; $\sigma(x_1, x_3) = \sigma(x_2, x_3) = 0.9/a_1 + 1/a_2 + 1/a_3 + 1/a_4$; $\sigma(x_1, x_4) = 1/a_1 + 0.9/a_2 + 1/a_3 + 0.9/a_4$; $\sigma(x_3, x_4) = 1/a_1 + 0.9/a_2 + 0.8/a_3 + 0.9/a_4$. Compute $C(x_1, x_1) = C(x_2, x_2) = C(x_3, x_3) = C(x_4, x_4) = C(x_1, x_2) = C(x_2, x_4) = A$, $C(x_1, x_3) = C(x_2, x_3) = \{a_2, a_3, a_4\}$, $C(x_1, x_4) = \{a_1, a_3\}$, $C(x_3, x_4) = \{a_1\}$; core(A) = {a_1}.

Step 3B. Set C := {a_1}, C′ := A − C = {a_2, a_3, a_4}.
Step 4B. Taking $a_4$, set C := {a_1} ∪ {a_4} = {a_1, a_4}.
Step 5B. Since $C(x_i, x_j) \cap C \ne \emptyset$ for all $x_i, x_j$, go to Step 6B.
Step 6B. Set C″ := C − core(A) = {a_4}, m := |C″| = 1. Set C″ := C″ − {a_4} = ∅. Since $C(x_1, x_3) \cap (C'' \cup \mathrm{core}(A)) = \{a_2, a_3, a_4\} \cap \{a_1\} = \emptyset$, set C″ := C″ ∪ {a_4} = {a_4}.
Step 7B. Let Q := C″ ∪ core(A) = {a_1, a_4}; Q = {a_1, a_4} is one reduction of A.

Therefore, {a_1, a_4} is a reduction of A. This fact shows that knowledge reduction in RFDIS and RFIS depends on the choice of the R-implication operator.

5 Experimentation

To show the utility of the FBMKR algorithm and to compare it with the knowledge reduction algorithm based on the neighborhood model (BNMKR, for short) in [14] (both algorithms avoid replacing numerical attributes by just the dominant symbolic labels of a discretization, thereby reducing the potential loss of information), we test them on four real data sets: Letter, Diabe, Glass, and Wine. These four data sets are selected from the UCI machine learning repository (http://archive.ics.uci.edu/ml/), and all their attribute values are numerical. The description of the data sets is given in Table 3.

Table 3 Data sets description

Data sets   Samples   Numerical attributes   Classes
Letter      20,000    16                     26
Diabe       768       8                      2
Glass       214       9                      7
Wine        178       13                     3

In order to run the FBMKR algorithm on the above four data sets, we first normalize them by the formula $x' = \frac{x - a_{\min}}{a_{\max} - a_{\min}}$, where $a_{\min}$ and $a_{\max}$ are the minimum and maximum values of attribute a, respectively.
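For completeness, the normalization above is the standard column-wise min-max rescaling; a small sketch (ours) on a list-of-rows table:

```python
# Sketch: min-max normalization x' = (x - a_min) / (a_max - a_min), per column.

def normalize(rows):
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0   # constant columns map to 0
             for v, l, h in zip(row, lo, hi)] for row in rows]

print(normalize([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]))
# [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
```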

To observe the impact of the number of samples on the computing time of the two algorithms (FBMKR and BNMKR), we use the largest sample set, keep the attribute set unchanged, and run the experiments while gradually increasing the number of samples. The results are shown as curves in Fig. 1. The computing environment is a PC (P4 3.0 GHz, 1 GB memory). In algorithm FBMKR, the R-implication operator is chosen as $\to_G$; in algorithm BNMKR, the neighborhood threshold is set to 0.15, using the infinity-norm distance.

Fig. 1 Comparison of computational time between the two algorithms with different numbers of samples (computing time in seconds against sample number, up to 2 × 10^4 samples; figure omitted)

In order to test the validity of the algorithm FBMKR on different data, we also make a comparison on the other three data sets. Since the scale of these data sets is small, another PC (P4 2.40 GHz, 256 MB memory) is used to minimize the measurement error, and the results are shown in Fig. 2. As we can see, FBMKR does have better performance in general, and the bigger the data scale, the more obvious the effect turns out to be.

Fig. 2 Comparison of computational time between the two algorithms on different data sets (computing time in seconds for Diabe, Glass, and Wine; figure omitted)

The reduction effect of algorithm FBMKR with the R-implication operator chosen as $\to_G$ and as $\to_{Lu}$ is listed in Tables 4 and 5, respectively, and the reduction effect of algorithm BNMKR is listed in Table 6. As we can see, algorithm FBMKR has a better reduction effect than BNMKR, and the R-implication operator $\to_G$ has a better reduction effect than $\to_{Lu}$.

Table 4 Comparison of attribute numbers before and after reduction using algorithm FBMKR with the R-implication operator $\to_G$

Data sets   Attribute number before reduction   Attribute number after reduction
Letter      16                                  13
Diabe       8                                   7
Glass       9                                   8
Wine        13                                  6

Table 5 Comparison of attribute numbers before and after reduction using algorithm FBMKR with the R-implication operator $\to_{Lu}$

Data sets   Attribute number before reduction   Attribute number after reduction
Letter      16                                  14
Diabe       8                                   7
Glass       9                                   8
Wine        13                                  7

Table 6 Comparison of attribute numbers before and after reduction using algorithm BNMKR

Data sets   Attribute number before reduction   Attribute number after reduction
Letter      16                                  16
Diabe       8                                   8
Glass       9                                   8
Wine        13                                  7

6 Conclusion

In this paper, we have studied knowledge reduction in RFIS and RFDIS by combining fuzzy set theory, random set theory, and rough set theory. Based on a RFIS and an R-implication operator from fuzzy logic, a fuzzy t-similarity relation on the object set is derived for a given subset of the attribute set. The corresponding RFAS, in which fuzzy set theory, random set theory, and rough set theory are well combined, is defined, and the properties of the lower and upper approximation operators in a RFAS are investigated. We introduce fuzzy belief measures and fuzzy plausibility measures by means of the lower and upper approximations, and then prove some equivalent conditions for knowledge reduction in RFIS and RFDIS. Using the R-implication operator, we construct the fuzzy-set-valued attribute discernibility matrix in RFIS and RFDIS. A heuristic algorithm for knowledge reduction is proposed for finding an approximately minimal reduction in RFIS and RFDIS; the time complexity of this algorithm is $O(|U|^2|A|)$. Experimental results on real data sets with numerical attributes demonstrate the effectiveness of the proposed algorithm.

The present research can be regarded as an extension of Zhang and Wu's work [1, 23], in which random rough set models were analyzed, and the notions of fuzzy belief measures and fuzzy plausibility measures of fuzzy sets can also be regarded as an extension of the belief measures and plausibility measures of crisp sets in the Dempster–Shafer theory of evidence. We believe that this extended research will turn out to be useful in application fields of rough set theory and the Dempster–Shafer theory of evidence.

Acknowledgments The work of this paper has been supported by the Construction Program of the Key Discipline in Hunan Province and the Aid Program for Science and Technology Innovative Research Team in Higher Educational Institutions of Hunan Province.

References

1. Zhang WX, Liang Y, Wu WZ (2003) Information systems and knowledge discoveries. Science Press, Beijing


2. Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer, Boston
3. Polkowski L, Tsumoto S, Lin TY (eds) (2000) Rough set methods and applications. Physica-Verlag, Berlin
4. Pawlak Z (1998) Rough set theory and its application to data analysis. Cybern Syst Int J 29:661–688
5. Skowron A, Rauszer C (1992) The discernibility matrices and functions in information systems. In: Slowinski R (ed) Intelligent decision support: handbook of applications and advances of rough set theory. Kluwer, Dordrecht, pp 331–362
6. Wang GY (2003) Rough reduction in algebra view and information view. Int J Intell Syst 18:679–688
7. Mi JS, Wu WZ, Zhang WX (2004) Approaches to knowledge reduction based on variable precision rough set model. Inf Sci 159:255–272
8. Li DY, Zhang B, Leung Y (2004) On knowledge reduction in inconsistent decision information systems. Int J Uncertain Fuzziness Knowl Based Syst 12:651–672
9. Zhang WX, Mi JS, Wu WZ (2003) Approaches to knowledge reductions in inconsistent systems. Int J Intell Syst 21:989–1000
10. Zhu W, Wang FY (2003) Reduction and axiomization of covering generalized rough sets. Inf Sci 152:217–230
11. Wu WZ (2008) Attribute reduction based on evidence theory in incomplete decision systems. Inf Sci 178:1355–1371
12. Kryszkiewicz M (2001) Comparative study of alternative types of knowledge reduction in incomplete information systems. Int J Intell Syst 16:105–120
13. Liang JY, Xu ZB (2002) The algorithm on knowledge reduction in incomplete information systems. Int J Uncertain Fuzziness Knowl Based Syst 10:95–103
14. Hu QH, Yu DR, Xie ZX (2008) Numerical attribute reduction based on neighborhood granulation and rough approximation. J Softw 19:640–649
15. Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353
16. Dubois D, Prade H (1992) Putting rough sets and fuzzy sets together. In: Slowinski R (ed) Intelligent decision support. Kluwer, Dordrecht, pp 203–232
17. Dubois D, Prade H (1990) Rough fuzzy sets and fuzzy rough sets. Int J Gen Syst 17:191–209
18. Wu WZ, Leung Y, Mi JS (2005) On characterizations of (I, T)-fuzzy rough approximation operators. Fuzzy Sets Syst 154:76–102
19. Jensen R, Shen Q (2004) Fuzzy-rough attribute reduction with application to web categorization. Fuzzy Sets Syst 141:469–485
20. Jensen R, Shen Q (2007) Fuzzy-rough sets assisted attribute selection. IEEE Trans Fuzzy Syst 15:73–89
21. Tsang ECC, Chen DG, Yeung DS, Wang XZ, Lee JWT (2008) Attributes reduction using fuzzy rough sets. IEEE Trans Fuzzy Syst 16(5):1130–1141
22. Chen DG, Tsang ECC, Zhao SY (2007) Attributes reduction with TL fuzzy rough sets. 2007 IEEE Int Conf Syst Man Cybern 1:486–491
23. Zhao SY, Tsang ECC (2008) On fuzzy approximation operators in attribute reduction with fuzzy rough sets. Inf Sci 178:3163–3176
24. Hu QH, Xie ZX, Yu DR (2007) Hybrid attribute reduction based on a novel fuzzy-rough model and information granulation. Pattern Recogn 40:3509–3521
25. Hu QH, An S, Yu DR (2010) Soft fuzzy rough sets for robust feature evaluation and selection. Inf Sci 180(22):4384–4400
26. Wu WZ, Zhang M, Li HZ, Mi JS (2005) Knowledge reduction in random information systems via Dempster–Shafer theory of evidence. Inf Sci 174:143–164
27. Liu WJ, He JR (1992) Introduction to fuzzy mathematics. Sichuan Education Press, Chengdu
28. Hirota K (1981) Concepts of probabilistic sets. Fuzzy Sets Syst 5:31–46
29. Chen DG, Yang WX, Li FC (2008) Measures of general fuzzy rough sets on a probabilistic space. Inf Sci 178:3177–3187
30. Wu WZ, Leung Y, Mi JS (2009) On generalized fuzzy belief functions in infinite spaces. IEEE Trans Fuzzy Syst 17(2):385–397
31. Yao YQ, Mi JS, Li ZJ (2011) Attribute reduction based on generalized fuzzy evidence theory in fuzzy decision systems. Fuzzy Sets Syst 170:64–75
32. Hajek P (1998) Metamathematics of fuzzy logic. Kluwer, Dordrecht
33. Wang GJ (2000) Non-classical mathematical logic and approximate reasoning. Science Press, Beijing
34. Valverde L (1985) On the structure of F-indistinguishability operators. Fuzzy Sets Syst 15:95–107
35. Morsi NN, Yakout MM (1998) Axiomatics for fuzzy rough sets. Fuzzy Sets Syst 100:327–342
36. Morsi NN (1995) Fuzzy t-locality spaces. Fuzzy Sets Syst 69:193–215
37. Morsi NN (1998) Dual fuzzy neighbourhood spaces II. J Fuzzy Math 3:29–67
38. Biacino L (2007) Fuzzy subsethood and belief functions of fuzzy events. Fuzzy Sets Syst 158:38–49
39. Denoeux T (2000) Modeling vague beliefs using fuzzy-valued belief structures. Fuzzy Sets Syst 116:167–199
40. Shafer G (1976) A mathematical theory of evidence. Princeton University Press, Princeton
