H.Washizaki and Y.Fukazawa, OOIS'02, September 4, Montpellier, France 1

A Retrieval Technique for Software Components Using Directed Replaceability Similarity

Hironori Washizaki, Yoshiaki Fukazawa
{washi, fukazawa}@fuka.info.waseda.ac.jp

http://www.fuka.info.waseda.ac.jp/

Waseda University, Tokyo, Japan

8th International Conference on Object-Oriented Information Systems (OOIS’02)

Software component
・ Definition: reusable/substitutable software artifact
・ A physical packaging of executable software with published interfaces
・ Object-oriented classes, reusable at the instance level
・ Distributed in the form of object code, without source code
・ Systems: CORBA, EJB, JavaBeans, ActiveX/COM

Background of component technology
・ Needs: end-user computing (EUC), reducing development cost
・ Seeds: growth of reuse technology based on OO, appearance of a component market on the Internet

[Figure labels: Component, Interface, Coordination services, Component Framework, glue code]

Targeted components

JavaBeans component
・ Composed of one or more Java classes
・ Opens one Facade class to the public

Available information
・ Property
    Readable property: field whose value can be observed
    Writable property: field whose value can be updated
・ Read/Write method: operation to observe/update the field's value from the outside
・ Business method: executable operation

[Figure: a Facade class in front of the subsystem classes (ClassA, ClassB)]

BarComponent
    - foo: String                (readable and writable property)
    + setFoo(p: String): void
    + getFoo(): String           (read method)
    + dosome(): void             (business method)

{ setFoo: {String} → void, getFoo: {void} → String, dosome: {void} → void }
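The property and method information listed above can be read from a compiled bean without its source code. The following is a minimal sketch of that idea, assuming the standard `java.beans.Introspector` API. The class `BeanStructureSketch` and its helper methods are hypothetical names used only for illustration, and the nested `BarComponent` merely mirrors the example above; none of this is taken from the authors' system.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.MethodDescriptor;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
final class BeanStructureSketch {

    /** Stand-in for the BarComponent Facade class of the example. */
    public static class BarComponent {
        private String foo;
        public void setFoo(String p) { this.foo = p; }
        public String getFoo() { return foo; }
        public void dosome() { /* business method, body omitted */ }
    }

    private static BeanInfo beanInfo(Class<?> beanClass) {
        try {
            // Stop at Object so that inherited Object methods are not reported.
            return Introspector.getBeanInfo(beanClass, Object.class);
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Maps each property name to "R", "W" or "RW". */
    static Map<String, String> propertyAccess(Class<?> beanClass) {
        Map<String, String> result = new TreeMap<>();
        for (PropertyDescriptor pd : beanInfo(beanClass).getPropertyDescriptors()) {
            String access = (pd.getReadMethod() != null ? "R" : "")
                    + (pd.getWriteMethod() != null ? "W" : "");
            result.put(pd.getName(), access);
        }
        return result;
    }

    /** Lists public method signatures as "name: {params} -> return", sorted by name. */
    static List<String> methodSignatures(Class<?> beanClass) {
        List<String> out = new ArrayList<>();
        for (MethodDescriptor md : beanInfo(beanClass).getMethodDescriptors()) {
            Method m = md.getMethod();
            StringJoiner params = new StringJoiner(", ", "{", "}");
            for (Class<?> p : m.getParameterTypes()) {
                params.add(p.getSimpleName());
            }
            out.add(m.getName() + ": " + params + " -> " + m.getReturnType().getSimpleName());
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(propertyAccess(BarComponent.class));
        System.out.println(methodSignatures(BarComponent.class));
    }
}
```

In this sketch an empty parameter list is printed as `{}`, where the slide notation writes `{void}`.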

Retrieval System

Component-based software development
・ System internal design with high reusability
・ Retrieval mechanisms are necessary

[Figure: development phases (requirement gathering, analysis, design, implementation, testing), shown for object-oriented software development and for component-based development. The component-based phases include architecture analysis, component analysis, architecture design, and component implementation/integration. Components are registered in a component repository; the application developer hands a requirement specification to the component retrieval system, which retrieves components (Component A, Component B) according to their adaptability for the requirement.]

Conventional retrieval approaches

For components in the wide sense
・ Automatic extraction
    Query: keywords corresponding to the names of components, interfaces, etc.
    Preparation: component itself
    Characteristic: structure
    Problem: extracted information is insufficient when source code is not available
・ Catalog-based
    Query: keywords about service names, component names, etc.
    Preparation: catalog information
    Characteristic: structure, behavior
    Problem: preparation costs are large
・ Formal specification-based
    Query: predicate logic
    Preparation: formal specification
    Characteristic: structure, behavior
    Problem: preparation costs are very large
・ Similarity distance-based
    Query: (prototype) component
    Preparation: (component itself)
    Characteristic: structure
    Problem: consideration of a single characteristic
・ Type-based
    Query: type information
    Preparation: (component itself)
    Characteristic: structure
    Problem: consideration of a single characteristic, rough retrieval result

Problems of conventional approaches
・ Additional information is necessary
・ Inefficient or not available when source code is not available
・ Consideration of a single characteristic, not total semantics

Characteristics of components

Important characteristics when retrieving
・ Structure: internal participants
・ Behavior: stateless/stateful behavior
・ Granularity: size, classification level
・ Encapsulation: degree of information hiding
・ Nature: main use stage
・ Accessibility to source code: modifiability/availability of source code

Index for retrieval result: Directed Replaceability Similarity
・ Consideration of structure, behavior and granularity
・ Only requires the components themselves

Directed Replaceability Similarity (DRS)

Definition of DRS(Cq, Cs): necessary adaptation cost at the surrounding parts of Cq when Cq is replaced with Cs

Assumptions
・ All interfaces of Cq are uniformly used
・ System requirements are the same before and after replacement

[Figure: replacing Cq with Cs; the parts around the component that need to be modified are marked]

Three primitive similarities are combined
・ Structural similarity (DRS_S): name, method signatures
・ Behavioral similarity (DRS_B): method execution results
・ Granularity similarity (DRS_G): size, method execution times

\[ DRS(C_q, C_s) ::= w_1 \, DRS_S(C_q, C_s) + w_2 \, DRS_B(C_q, C_s) + w_3 \, DRS_G(C_q, C_s) \]
\[ \sum_{i=1}^{3} w_i = 1, \quad w_i \ge 0 \quad \text{(weight values can be changed dynamically)} \]
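As a small illustration of the weighted combination, the sketch below computes one DRS value from three primitive similarities. It assumes that the weights are non-negative and sum to 1; the class name `DrsCombinationSketch`, the method `combine`, and the numeric inputs are hypothetical and chosen only for the example.

```java
// Hypothetical helper illustrating the weighted sum; not the authors' code.
final class DrsCombinationSketch {

    /**
     * Combines the three primitive similarities into a single DRS value.
     * Assumes each weight is non-negative and that the weights sum to 1.
     */
    static double combine(double drsS, double drsB, double drsG,
                          double w1, double w2, double w3) {
        if (w1 < 0 || w2 < 0 || w3 < 0 || Math.abs(w1 + w2 + w3 - 1.0) > 1e-9) {
            throw new IllegalArgumentException("weights must be >= 0 and sum to 1");
        }
        return w1 * drsS + w2 * drsB + w3 * drsG;
    }

    public static void main(String[] args) {
        // Made-up primitive values, weighted toward structure.
        System.out.println(combine(0.2, 0.4, 0.6, 0.5, 0.3, 0.2));
    }
}
```

Because the weights are plain parameters, a user who cares more about behavior than structure only has to pass a different triple.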

 

Function d(x,y,z)

d(x,y,z) is commonly used in the primitive similarities
・ How Y differs from X, with common deepest ancestor Z
・ x, y, z: positions of X, Y, Z in a partially ordered set
・ Always normalized to [0, 1]
・ d(x,y,z) = 0 when X = Y
・ d(x,y,z) < d(y,x,z) when x < y
・ d(x+a, y+a, z+a) < d(x,y,z)

Definition
[Equation: definition of d(x,y,z) for x, y, z in the positive integer set; the formula is not recoverable from this transcript]

S: set of value types, 〈S, ⊆〉: partially ordered set
[Figure: chain of value types long, int, short]
・ Position of int is 2
・ Position of short is 3
・ Common deepest ancestor of long and short is long

Structural Similarity (DRS_S)

\[ DRS ::= w_1 \, DRS_S + w_2 \, DRS_B + w_3 \, DRS_G \]

Attributes of structural characteristic
・ Component's name
・ Component's method structures (method signatures)

Replacement example
・ C1 (former) has one method, int calc1(int x)
    C1 = { name = "C1", methods: { calc1: {int} → int } }
・ C2 (candidate) has one method, short calc2(int x)
    C2 { calc2: {int} → short }
・ C3 (candidate) has one method, long calc3(short x)
    C3 { calc3: {short} → long }

[Figure: replacing C1 with C2 or C3; the surrounding classes that must be modified are marked in each case]

Comparison of adaptation costs
\[ DRS_S(C_1, C_2) < DRS_S(C_1, C_3) \]

Details of structural similarity

DRS_S is composed of ...
・ String similarity (dw)
・ Instance set similarity (dr)
・ Method structure similarity (dms)
・ Functional similarity (df)
・ Normal type similarity (dt)
・ Function d(x,y,z) in dw, dt

Cq: structure of component (each element is compared with its counterpart in Cs by the similarity on the right)
    name_q : String                        dw
    methods_q : Instance set               dr
        m_q1 ~ m_qn : Method structure     dms
            name_q : String                dw
            signature_q : Functional type  df
                params_q : Instance set    dr
                    t_q1 ~ t_qm : Normal type   dt
                return_q : Normal type     dt

Cs has the same structure: name_s, methods_s (m_s1 ~ m_sp), each method with name_s and signature_s (params_s with t_s1 ~ t_sr, return_s).

Structure of component (M_S: method structure):
\[ C_S ::= \{ name : String, \ methods : \{ m_1 : M_S, \ldots, m_n : M_S \} \} \]
\[ c_q, c_s : C_S, \quad c_q = \{ name = n_q, \ methods = ms_q \}, \quad c_s = \{ name = n_s, \ methods = ms_s \} \]
\[ DRS_S(c_q, c_s) ::= \frac{dw(n_q, n_s) + 2 \, dr(ms_q, ms_s)}{3} \]

String similarity, Instance set similarity

String similarity (dw)
・ Difference between two Strings
\[ \#w ::= \text{length of String } w \]
\[ w_p : \text{longest common substring of } w_q \text{ and } w_s \]
\[ dw(w_q, w_s) ::= \begin{cases} d(\#w_q, \#w_s, \#w_p) & \text{if } w_p \text{ exists} \\ 1 & \text{if } w_p \text{ does not exist} \end{cases} \]
・ e.g. dw("date", "getDate") < dw("date", "do")

Instance set similarity (dr)
・ Difference between two instance sets (methods, parameters)
\[ R_q = \{ x_{q1} : X, \ldots, x_{qm} : X \}, \quad R_s = \{ x_{s1} : X, \ldots, x_{sn} : X \} \quad \text{(instance sets)} \]
\[ S : \text{pairs } (q, s) \text{ selected from } R_q, R_s \text{ in order from the smallest } dx(q, s) \]
\[ f_1(R_q, R_s) ::= \sum_{(q,s) \in S} dx(q, s) \]
\[ f_2(R_q, R_s) ::= \sum_{q \in \{ q' : R_q \mid q' \text{ not paired in } S \}} dx(q, root) \quad (\text{if } \#R_q > \#R_s) \]
\[ f_3(R_q, R_s) ::= \sum_{s \in \{ s' : R_s \mid s' \text{ not paired in } S \}} dx(root, s) \quad (\text{if } \#R_q < \#R_s) \]
\[ dr(R_q, R_s) ::= \frac{f_1(R_q, R_s) + f_2(R_q, R_s) + f_3(R_q, R_s)}{\max(\#R_q, \#R_s)} \]
・ e.g. dr( {m1, m2, m3}, {ma, mb} ) < dr( {ma, mb}, {m1, m2, m3} )
    C1: { m1: int → int, m2: String → void, m3: Date → Time }
    C2: { ma: float → int, mb: String → void }
    root: {} → rootT
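The pairing step of the instance set similarity (pairs selected in order from the smallest element distance, leftovers compared against a root element, result divided by the larger set size) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the element distance `dx` and the `root` element are passed in by the caller, the class and method names are hypothetical, returning 0 for two empty sets is an assumption, and the demo uses an absolute difference on integers as a stand-in for `dx` even though the real distance is directed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of the greedy pairing used by an instance set similarity.
final class InstanceSetSimilaritySketch {

    static <X> double dr(List<X> rq, List<X> rs, X root, ToDoubleBiFunction<X, X> dx) {
        int m = rq.size();
        int n = rs.size();
        if (m == 0 && n == 0) {
            return 0.0; // assumption: two empty sets are treated as identical
        }
        // Enumerate all candidate pairs and sort them by ascending distance.
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                pairs.add(new int[] {i, j});
            }
        }
        pairs.sort(Comparator.comparingDouble(
                (int[] p) -> dx.applyAsDouble(rq.get(p[0]), rs.get(p[1]))));

        // Greedily accept a pair when neither element has been used yet.
        boolean[] usedQ = new boolean[m];
        boolean[] usedS = new boolean[n];
        double sum = 0.0;
        for (int[] p : pairs) {
            if (!usedQ[p[0]] && !usedS[p[1]]) {
                usedQ[p[0]] = true;
                usedS[p[1]] = true;
                sum += dx.applyAsDouble(rq.get(p[0]), rs.get(p[1]));
            }
        }
        // Unpaired query elements are compared with the root element ...
        for (int i = 0; i < m; i++) {
            if (!usedQ[i]) {
                sum += dx.applyAsDouble(rq.get(i), root);
            }
        }
        // ... and the root element is compared with unpaired candidate elements.
        for (int j = 0; j < n; j++) {
            if (!usedS[j]) {
                sum += dx.applyAsDouble(root, rs.get(j));
            }
        }
        return sum / Math.max(m, n);
    }

    public static void main(String[] args) {
        // Stand-in distance on integers, with 0 playing the role of the root.
        System.out.println(dr(List.of(1, 5), List.of(2), 0,
                (a, b) -> (double) Math.abs(a - b)));
    }
}
```

A greedy selection is simple and follows the "smallest first" wording, but it does not guarantee a globally minimal pairing cost; an optimal assignment would need a different algorithm.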

Method structural similarity, functional similarity

Method structural similarity (dms)
・ Name of method, functional type of method signature
\[ M_S ::= \{ name : String, \ signature : F \} \quad \text{(method structure)} \]
\[ m_q = \{ name = name_q, \ signature = sig_q \}, \quad m_s = \{ name = name_s, \ signature = sig_s \} \]
\[ dms(m_q, m_s) ::= \frac{dw(name_q, name_s) + 2 \, df(sig_q, sig_s)}{3} \]

Functional similarity (df)
・ Types of parameters, type of return value
\[ F ::= \{ params : \{ t_1 : T, \ldots, t_n : T \}, \ return : T \} \quad \text{(functional type)} \]
\[ f_q = \{ params = p_q, \ return = r_q \}, \quad f_s = \{ params = p_s, \ return = r_s \} \]
\[ df(f_q, f_s) ::= \frac{dr(p_s, p_q) + dt(r_q, r_s)}{2} \]
・ Subtyping relation
\[ \frac{H \vdash s' <: s, \quad t <: t'}{H \vdash s \to t \ <: \ s' \to t'} \]
・ e.g. df( {int} → int, {long} → int ) < df( {int} → int, {short} → int )

Normal type distance (dt)

・ Normal type: value type, object type
・ Formation of a single Is-a graph based on the subtyping relation
    "In a context where the type of an expression is t, if s can be used safely, then s is a subtype of t"

\[ t_p : \text{deepest common supertype of } t_q \text{ and } t_s \quad (t_q <: t_p, \ t_s <: t_p) \]
\[ l(x) ::= \text{depth of type } x \text{ from root}, \quad l(root) = 1 \]
\[ dt(t_q, t_s) ::= d(l(t_q), l(t_s), l(t_p)) \]

[Figure: Is-a graph with rootT at l(x) = 1 and the types void, long, int, Void, Long, Object, Date, Time, ... at depths l(x) = 2 to l(x) = 4; supertypes above, subtypes below]

・ Replacing with a subtype is easy, replacing with a supertype is difficult:
\[ t_s <: t_q \ \Rightarrow \ dt(t_q, t_s) < dt(t_s, t_q) \]
・ e.g. dt( int, short ) < dt( int, long )
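The two inputs that the normal type distance takes from the Is-a graph, the depth l(x) and the deepest common supertype, can be computed with a simple parent map. The sketch below is only an illustration: the class `TypeDepthSketch` and its methods are hypothetical, the small hierarchy in `example()` is an assumed fragment (rootT above long, int, short and above Object, Date), and the function d itself is deliberately left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: depth and deepest common supertype in a single-rooted Is-a tree.
final class TypeDepthSketch {

    // Maps each type to its direct supertype; the root has no entry.
    private final Map<String, String> supertypeOf = new HashMap<>();

    void add(String type, String supertype) {
        supertypeOf.put(type, supertype);
    }

    /** l(x): depth of a type counted from the root, where the root has depth 1. */
    int depth(String type) {
        int depth = 1;
        for (String t = supertypeOf.get(type); t != null; t = supertypeOf.get(t)) {
            depth++;
        }
        return depth;
    }

    /** Deepest type that is a supertype of both arguments (a type counts as its own supertype). */
    String deepestCommonSupertype(String a, String b) {
        Set<String> ancestorsOfA = new HashSet<>();
        for (String t = a; t != null; t = supertypeOf.get(t)) {
            ancestorsOfA.add(t);
        }
        // Walking upward from b, the first hit is the deepest shared ancestor.
        for (String t = b; t != null; t = supertypeOf.get(t)) {
            if (ancestorsOfA.contains(t)) {
                return t;
            }
        }
        throw new IllegalArgumentException("types do not share a root");
    }

    /** Assumed example hierarchy, not the full graph of the slide. */
    static TypeDepthSketch example() {
        TypeDepthSketch g = new TypeDepthSketch();
        g.add("long", "rootT");
        g.add("int", "long");
        g.add("short", "int");
        g.add("Object", "rootT");
        g.add("Date", "Object");
        return g;
    }

    public static void main(String[] args) {
        TypeDepthSketch g = example();
        System.out.println(g.depth("int") + " " + g.deepestCommonSupertype("int", "short"));
    }
}
```

With these two helpers, dt(t_q, t_s) would be obtained by passing `depth(tq)`, `depth(ts)` and the depth of `deepestCommonSupertype(tq, ts)` to an implementation of d.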

Behavioral similarity (DRS_B)

\[ DRS ::= w_1 \, DRS_S + w_2 \, DRS_B + w_3 \, DRS_G \]

Attributes of behavioral characteristic
・ Whether the return value of a method is the initial value (null, 0, false, ...) or not
・ Types of the readable properties whose values are changed

Replacement example (observation from outside)
・ C1 (former) has two methods and one readable property: the value of an int is changed
    int data = 0;
    int getData() { return data; }
    short calc1(int x) { data = x; return (short) x; }
・ C2 (candidate) has two methods and one readable property: the value of a long is changed
    long data = 0L;
    long getData() { return data; }
    short calc2(int x) { data = x; return (short) x; }
・ C3 (candidate) has two methods and one readable property: nothing changes
    long data = 0L;
    long getData() { return data; }
    short calc3(int x) { return (short) x; }

Comparison of adaptation costs
\[ DRS_B(C_1, C_2) < DRS_B(C_1, C_3) \]
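Observation from outside can be approximated by reading every readable property before and after a method call and reporting the types of the properties whose values differ. The following sketch assumes the standard `java.beans` and reflection APIs; `BehaviorObservationSketch`, `changedPropertyTypes`, and the nested classes `C1` and `C3` (public versions of the example components) are hypothetical names for illustration, and the sketch only handles methods with a single int parameter.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: detect which readable properties a method call changes.
final class BehaviorObservationSketch {

    /** Example component whose int property changes when calc1 is called. */
    public static class C1 {
        private int data = 0;
        public int getData() { return data; }
        public short calc1(int x) { data = x; return (short) x; }
    }

    /** Example component whose property is left untouched by calc3. */
    public static class C3 {
        private long data = 0L;
        public long getData() { return data; }
        public short calc3(int x) { return (short) x; }
    }

    private static Object read(PropertyDescriptor pd, Object bean)
            throws ReflectiveOperationException {
        Method getter = pd.getReadMethod();
        return getter == null ? null : getter.invoke(bean);
    }

    /** Invokes bean.methodName(arg) and returns the types of the changed readable properties. */
    static List<String> changedPropertyTypes(Object bean, String methodName, int arg) {
        try {
            PropertyDescriptor[] props = Introspector
                    .getBeanInfo(bean.getClass(), Object.class)
                    .getPropertyDescriptors();
            // Snapshot of all readable property values before the call.
            Object[] before = new Object[props.length];
            for (int i = 0; i < props.length; i++) {
                before[i] = read(props[i], bean);
            }
            bean.getClass().getMethod(methodName, int.class).invoke(bean, arg);
            // Compare with the values after the call.
            List<String> changed = new ArrayList<>();
            for (int i = 0; i < props.length; i++) {
                if (!Objects.equals(before[i], read(props[i], bean))) {
                    changed.add(props[i].getPropertyType().getSimpleName());
                }
            }
            return changed;
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(changedPropertyTypes(new C1(), "calc1", 5));
        System.out.println(changedPropertyTypes(new C3(), "calc3", 5));
    }
}
```

The argument has to differ from the property's initial value; calling `calc1(0)` on a fresh `C1` would not show a visible change.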

Granularity similarity (DRS_G)

\[ DRS ::= w_1 \, DRS_S + w_2 \, DRS_B + w_3 \, DRS_G \]

Attributes of granularity characteristic
・ Component's static size [byte]
・ Component's internal complexity
    The execution time [sec] of methods, when they are invoked with the same parameter values, can reflect the internal complexity

Replacement example: replacing C1 with C2, C3
・ C1 (former): size 10k [byte], total method execution time 10 [msec]
・ C2 (candidate): size 15k [byte], total method execution time 20 [msec]
・ C3 (candidate): size 100k [byte], total method execution time 150 [msec]

Comparison of adaptation costs
\[ DRS_G(C_1, C_2) < DRS_G(C_1, C_3) \]

Component retrieval system: RetrievalJ

・ Uses DRS as the index for search ranking
・ Query: prototype component
    Without detailed implementation
    Required interface structure, behavior
・ Targeted system: JavaBeans
・ Additional information (e.g. source code) is not necessary
・ User can change the weight values corresponding to structure, behavior, granularity

Implementation of RetrievalJ

[Figure: screens of RetrievalJ]
(1) Inputting query, weight values
(2) List of retrieval results
(3) Individual output
(4) Trial, download

Implementation environment: Java 2 SE v1.3, Servlet 2.2, Tomcat 3.2.3, Xerces-J 1.4.2, Web browser

Preparation for evaluation

Evaluation samples
・ 257 JavaBeans components
・ 13 agreement groups: components that resemble each other functionally
・ Are the components in the same group as the query ranked high?

Our technique
・ Heuristically, structure is the most important (terms: structure, behavior, granularity)
\[ DRS'(c_q, c_s) = 0.5 \, DRS_S(c_q, c_s) + 0.3 \, DRS_B(c_q, c_s) + 0.2 \, DRS_G(c_q, c_s) \]

Conventional techniques: two similarity distance-based methods
・ [SC94]: using positions in the class inheritance hierarchy
・ [MN99]: using terms of element names

Agreement groups

    Agreement group    Components in group
    Calendar           5
    Progress Bar       3
    SMTP               3
    POP3               3
    Clock              3
    Calculator         2
    Gauge              2
    Finger             2
    Stock              2
    Scrollbar          2
    GUI for SMTP       2
    GUI for POP3       2
    PlotChart          2

[SC94] G. Spanoudakis et al., "Measuring Similarity Between Software Artifacts," SEKE, 1994
[MN99] A. Michail et al., "Assessing Software Libraries by Browsing Similar Classes, Functions and Relationships," ICSE, 1999

Normalized recall

Normalized recall (R_norm) for all 13 groups
\[ \#G ::= \text{number of components in } G \]
\[ rank(C) ::= \text{rank of component } C \]
\[ R_{norm}(G) ::= 1 - \frac{2 \sum_{C \in G - \{c_q\}} rank(C) - \#G(\#G - 1)}{2(\#G - 1)(257 - \#G)} \quad (c_q: \text{query component}) \]

Result of our method
・ Higher/maximum values in 11 groups
・ Consideration of the structural similarity of interfaces which provide the same functions

[Table, flattened in the source; columns: Component, Date selecting method, Return, dms, Our, [SC94], [MN99]. Cells whose boundaries cannot be determined are kept as extracted.]
    Component: Calendar, SSCalendar, CalendarBean, CalPanel, CalendarViewer
    Date selecting method: getResultSelectedDateAsStringgetAllSelectedDatesgetSelectedDategetDate( none )
    Return: StringMessageStringint( none )
    dms: --, 0.067, 0.011, 0.071, 0.187
    Our: --12512
    [SC94]: --19762021
    [MN99]: --414563
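For orientation, the sketch below computes the standard normalized recall for a ranked list: 1 when all relevant items occupy the top ranks and 0 when they occupy the bottom ranks. It is the textbook form, R_norm = 1 - (sum of actual ranks - sum of ideal ranks) / (n (N - n)), and the slide's exact variant (for example how the query itself is counted among the 257 components) may differ. The class and method names are hypothetical.

```java
// Hypothetical sketch of the standard normalized recall measure.
final class NormalizedRecallSketch {

    /**
     * @param relevantRanks 1-based ranks at which the relevant items were retrieved
     * @param totalItems    number of ranked items N
     */
    static double normalizedRecall(int[] relevantRanks, int totalItems) {
        int n = relevantRanks.length;
        if (n == 0 || n >= totalItems) {
            throw new IllegalArgumentException("need 0 < relevant items < total items");
        }
        long rankSum = 0;
        for (int r : relevantRanks) {
            rankSum += r;
        }
        // In the ideal ranking the relevant items occupy ranks 1..n.
        long idealSum = (long) n * (n + 1) / 2;
        return 1.0 - (double) (rankSum - idealSum) / ((long) n * (totalItems - n));
    }

    public static void main(String[] args) {
        // Made-up ranks: three relevant items among ten ranked items.
        System.out.println(normalizedRecall(new int[] {1, 2, 5}, 10));
    }
}
```

A group whose members are all ranked directly after each other at the top therefore scores 1.0, which is what "put at the high rank" asks for.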

Consideration of total characteristics

Evaluation policy
・ Query: GUI label component
・ Retrieving targets: all interface structures are the same
    Bean1 behaves correctly as the GUI label
    Bean2 does not at all

Bean1
    public void setText(String text) {
        this.text = text;
        ……
    }
Bean2
    public void setText(String text) {
        // do nothing
        ……
    }

    Technique                                 Bean1         Bean2
    [SC94] similarity                         0.3935 (2)    0.3935 (2)
    [MN99] similarity                         1362.1 (3)    1362.1 (3)
    Our: total similarity (DRS')              0.0238 (1)    0.0304 (5)
    Our: structural similarity (1.0 DRS_S)    0.0399 (1)    0.0399 (1)

Result of our method
・ The user can identify Bean1 as more desirable by using DRS', which can consider the total characteristics

Conclusion

Component retrieval method using DRS
・ Preparation cost is low, and usefulness is confirmed
・ Consideration of the total semantics of components through changeable weights

Future work
・ Verification of the possibility of using our method together with other retrieval methods
・ Implementation on other component systems (ActiveX, ...)