
VOL 8 NO 6
2010

Some Perspectives on Software Engineering


SETLabs Briefings Advisory Board

Gaurav Rastogi, Vice President, Head - Learning Services
George Eby Mathew, Senior Principal, Infosys Australia
Kochikar V P PhD, Associate Vice President, Education & Research Unit
Raj Joshi, Managing Director, Infosys Consulting Inc.
Ranganath M, Vice President & Chief Risk Officer
Subu Goparaju, Senior Vice President & Head, Software Engineering & Technology Labs

Savor Some External Thought Leadership

Hello readers!

At SETLabs Briefings we are always committed toward delivering to you the best thought leadership in chosen thematic areas. In recent times you have seen your journal achieving newer and newer heights, passing through many milestones and growing in stature both nationally and internationally. I have always believed that it is the committed, passionate and hardworking thought leaders of Infosys who have helped us soar northwards in our quest to achieve intellectual excellence.

The current issue is special in many ways. While you are aware that we have been carrying third party interviews and opinions in our issues from time to time, this is the first time we have compiled an issue that is dominated by contributions from thought leaders outside of Infosys.

Infosys had the opportunity to co-sponsor the 3rd India Software Engineering Conference (ISEC) at Mysore during February 25-27, 2010, with some very senior thought leaders from Infosys anchoring the mega event. At SETLabs Briefings, we felt that the presentations made in the conference were top notch and wanted to bring some of them to you, post making some articulation level changes. To this effect, we sought permission from the conference organizers, which they gladly gave. Some long discussions with the conference leadership have helped us pick a handful of papers which we felt are relevant and would be of interest to you.

There is no gainsaying the fact that only when knowledge transcends boundaries can its true worth be enjoyed. Whether the contributors in this compilation are from Infosys or from other organizations, passion for research is what unifies them all. There is no set theme to the compilation. In fact, we did not want to weave the contributions around any organized theme. Since software engineering is the focus area of the compilation we have titled it Some Perspectives on Software Engineering; otherwise the issue has as many flavors as you desire.

The ideas in the contributions and associated research inputs are owned by the respective authors and the original copyright of these contributions is with ISEC. Neither SETLabs Briefings nor Infosys vouches for the veracity of the compilation, nor are they responsible for any IP related issues. The role of the journal management team was restricted to content selection and routine copy editing.

Do let us know if you found the collection interesting. This will help us bring more and more external thought leadership into our journal through syndication.

Happy reading!

Praveen B. Malla
[email protected]


Index

Research Review: Usability of Refactoring Tools for Java Development
By Jeffrey Mahmood and Prof. Y Raghu Reddy
Refactoring tools available commercially and in open source have not been developed efficiently for Java. The authors do a usability assessment of the available tools and suggest ways of improvement to make them more effective.

Model: Information Feedback Model for Scalability in Distributed Software Architecture
By Manjunath Ramachandra, Narendranath Udupa and Shyam Vasudev Rao
Components added at later stages in a distributed software architecture affect the network. Prediction of the status of the network can overcome such sudden changes in load. The authors propose a mechanism based on information feedback to manage various components that act as a bottleneck when the architecture is scaled up.

Framework: Creating and Benchmarking OWL and Topic Map Ontologies for UBL Processes
By Kiran Prakash Sawant and Suman Roy PhD
This paper discusses two knowledge representation formalisms, topic maps and OWL. The authors discuss the results of their experiment by taking the payment process of UBL as a case study.

Practitioners' Perspective: Automated Test Case Generation from C Program by Model Checking through Re-engineering
By Suman Roy PhD and Kuntal Chakraborty
Several mainstream industries are implementing model-based development in their software development life cycle as they realize the benefits of following this mode. In this paper the authors discuss a model-based approach to generate test cases from a C program through model checking with suitable coverage criteria.

Practitioners' Solution: Architecture Reconstruction from Code for Business Applications - A Practical Approach
By Santonu Sarkar PhD and Vikrant Kaulgud
Application development today is multiplatform, multisite and multilingual, and hence is complex. Comprehending and maintaining such a system remains a challenge, and a new system to be developed has to co-exist with several existing applications. The authors propose a semi-automated, iterative approach to model the hierarchical functional architecture of a family of applications.

Spotlight: A Knowledge Transaction Approach for the Request Lifecycle in Application Maintenance
By Himanshu Tyagi and Kapil Shinde
Application maintenance is a major challenge that software vendors face post implementation. Given the multiple dependencies involved, the knowledge gathered and required for maintenance is often lost or becomes inaccessible. The authors suggest a much needed solution for a knowledge base when handling a maintenance request lifecycle.


There are as many prescriptions as there are perceptions. Ontology helps in developing an overarching framework to standardize the distinct perceptions.

Suman Roy PhD
Senior Research Scientist
Software Engineering Research Group
SETLabs, Infosys Technologies Limited

Cost effective application maintenance sounds oxymoronic unless maintenance projects are powered with the right mix of efficacy and domain knowledge.

Himanshu Tyagi
Product Manager
Maintenance Center of Excellence
SETLabs, Infosys Technologies Limited


Usability of Refactoring Tools for Java Development

By Jeffrey Mahmood and Y Raghu Reddy

Proper usage of refactoring tools can pare down the probability of compile time and run time errors

Refactoring is a technique of changing existing source code to improve the design without changing the external system behavior [1]. At times, while refactoring, developers or maintainers may inadvertently inject compile time or run time errors due to code complexities. Refactoring can be a time consuming task and various factors contribute to its complexity. The amount of time consumed in refactoring mainly depends on the size of the system, the extent to which the system needs to be restructured, the availability of tools and how well developers understand the system.
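As a minimal illustration (a hypothetical Invoice class, not drawn from the projects studied), an extract method refactoring changes only the internal structure; callers of computeTotal() observe the same behavior before and after:

// Hypothetical example: the original computeTotal() mixed price summation
// and tax handling in one long method; extract-method moves the tax logic
// into applyTax() without changing what callers observe.
public class Invoice {
    private final double[] prices;
    private final double taxRate;

    public Invoice(double[] prices, double taxRate) {
        this.prices = prices;
        this.taxRate = taxRate;
    }

    public double computeTotal() {
        double subtotal = 0.0;
        for (double p : prices) {
            subtotal += p;
        }
        return applyTax(subtotal);   // extracted call replaces the inlined tax code
    }

    // Extracted method: behavior identical to the code it replaced.
    private double applyTax(double subtotal) {
        return subtotal * (1.0 + taxRate);
    }
}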

Refactoring tools can be used by developers to prevent human errors and to perform refactoring tasks faster. However, a survey of programmers at Agile Open Northwest 2007 revealed that although 90% of the developers had access to refactoring tools, only 40% used them [2]. Some of the reasons cited by developers for avoiding software refactoring tools were being able to re-factor by hand faster, too many steps or screens, complex key sequences, linear menus, code selection and no general purpose mechanism for refactoring several pieces of code at once. Other reasons for not using software refactoring tools were the lack of automated support and multi-stage integration [3, 4].

The ability to re-factor by hand faster potentially implies the tools' inability to provide a positive user experience and a lack of efficiency when using the tools. Developers' responses regarding too many steps or screens and complex key sequences imply that the tools are not matching the users' mental model of the system or minimizing memorization. A lack of a general purpose mechanism for refactoring several pieces of code at once describes the tools' inability to assist developers with automating mundane or computable tasks. These attributes are categorized under the usability guidelines and imply that the tools' poor usability is deterring developers from using them [4].

Developers' understanding of the system also plays a key role in using refactoring tools. Often, developers are unsure of the specific refactoring that may need to be applied. Also, the lack of good tool support that can


increase the understandability of a system adds to the problem. Providing tool support for identification and selection of the refactoring type can reduce the amount of time a developer spends on refactoring. At the same time, it is important to characterize the complexity of the refactoring to be undertaken, and automated tools can aid developers in prioritizing their efforts [5].

    The goal of this study is to evaluate the

    usability of four different software refactoring

    tools, compare the results to other studies

    of similar nature and suggest improvements

    for tools that can help reduce the perceived

    ineffectiveness of software refactoring tools.

    A large software usability gap can

    lead to users getting confused, frustrated or

    panicky and can result in the software system

    being misused or not used at all [6]. A software

    system should be easy to use, and quick and

    pleasant in order to promote learning and recall

for end-user supported tasks. Consistency in the usability of software applications reduces user training time by 25% to 50% [6].

    Minimal usage of software refactoring tools

    by developers/maintainers suggests that they

    suffer from a large software usability gap.

Refactoring manually requires developers to validate their re-factorings by updating the affected modules to compensate for the changes. Li and Thompson agree that compensating for refactoring is a flexible but fault-prone method of validating refactoring [7]. The other types of validation for refactoring are preservation of pre- and post-conditions. Pre-condition checking is the method used by most software refactoring tools and is defined as preserving all the behavior involved before allowing the refactoring to take place. In most tools, pre-condition checks are implemented using abstract syntax trees (ASTs) derived from the source code, or using control flow graphs and program dependency graphs [8]. Another approach to implementing pre-condition checks is to formalize the proposed transformations as constrained types [8]. Pre-conditions can also be checked using analysis functions that describe the relationships amongst source code entities such as classes, methods and fields [9]. A post-condition check relies on testing the code after a refactoring and does not apply to refactoring tools since it requires a test suite to verify the changes [7]. However, the aforementioned analysis functions have not been used to check the post-conditions of a refactoring [9]. For most refactoring tools, when a pre-condition is violated the user is notified via an error message. This error message allows the user to identify where the error was made in order to fix it. The failure to produce and properly display error messages has deterred developers from using software refactoring tools [2, 10].
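As a sketch of the pre-condition idea (hypothetical helper, not the API of any tool evaluated here), a rename-field refactoring can be refused before any code is touched if the new name would collide with an existing member:

import java.util.Set;

// Hypothetical pre-condition check for a "rename field" refactoring:
// the transformation is applied only if no existing field already uses
// the proposed name; otherwise an error message is reported.
public class RenamePrecondition {

    public static boolean canRenameField(Set<String> existingFieldNames,
                                         String oldName, String newName) {
        if (!existingFieldNames.contains(oldName)) {
            return false;                              // nothing to rename
        }
        return !existingFieldNames.contains(newName);  // collision => violated
    }

    public static void main(String[] args) {
        Set<String> fields = Set.of("total", "taxRate");
        if (canRenameField(fields, "total", "taxRate")) {
            System.out.println("Refactoring applied.");
        } else {
            // Mirrors the behavior described above: the user is notified
            // via an error message when a pre-condition is violated.
            System.out.println("Pre-condition violated: name already in use.");
        }
    }
}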

Automation of refactoring can reduce some of the errors caused by manual refactoring. The benefit of automated tools lies in their ability to be customized. For example, users can set their preferred automation level by selecting specific refactorings that can be automated. Three suggested levels of automation are assisted, global and severity based [5]. At the assisted level, code smells are identified by the integrated development environment (IDE) and suggested resolutions are provided to the user. The global level implies full automation, where the IDE automatically resolves any issues found. The severity based level detects issues the same way as the other levels, but it only automates a solution based on a complexity threshold specified by the user.
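A minimal sketch of how the three levels could be wired together (hypothetical names and thresholds, not the configuration of any tool evaluated here):

// Hypothetical sketch of the three automation levels described above.
public class RefactoringAutomation {

    enum Level { ASSISTED, GLOBAL, SEVERITY_BASED }

    static void handleCodeSmell(Level level, int complexity, int userThreshold) {
        switch (level) {
            case ASSISTED:
                System.out.println("Highlight the smell and suggest a resolution.");
                break;
            case GLOBAL:
                System.out.println("Apply the resolution automatically.");
                break;
            case SEVERITY_BASED:
                // Only automate below the user-specified complexity threshold;
                // anything above it falls back to a suggestion.
                if (complexity <= userThreshold) {
                    System.out.println("Apply the resolution automatically.");
                } else {
                    System.out.println("Too complex: suggest only.");
                }
                break;
        }
    }

    public static void main(String[] args) {
        handleCodeSmell(Level.SEVERITY_BASED, 7, 5);
    }
}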

In the authors' experience, the assisted level of automation is the most prevalent


amongst IDEs. As discussed later in the paper, the IntelliJ IDEA and Eclipse IDEs are packaged with default plug-ins that visually highlight code smells and provide suggestions to fix them. These code smell indicators provide refactorings at a base level in most cases and can be used to semi-automatically fix the code. However, certain refactorings require more thorough examination and do not provide an automated solution. In large projects with numerous developers, and even in small projects with inexperienced developers, these code smells can grow unchecked. The simple changes that these plug-ins draw attention to help keep the code clean and maintainable. Such small scale changes are not restricted to just syntactical code formatting. Some of the code smells the tools detect are more symptomatic of poor design choices, such as long and overly complex methods, method signatures with too many parameters and large classes with too many fields.

The Agile Open Northwest 2007 survey results imply that a lack of sufficient refactoring tools leads to developers performing manual refactorings [2]. One of the goals of the IDE and refactoring tools should be to facilitate the development of software by providing meaningful tools that minimize defects injected by the developer. Refactoring tools should allow developers to perform code refactorings with relative ease, from the syntax level to the component design level to the package level. Providing refactoring tools with high usability has the potential to improve overall code quality and maintainability and to minimize future rework.

    EXPERIMENTAL METHOD

The study method can be used for commercial as well as open source tools. The refactoring tools chosen were two commercial IDEs and two open source projects which supported Java development. The commercial IDEs selected for this study are IntelliJ IDEA 7.0.4 and JBuilder 2008; the open source refactoring tools are RefactorIT 2.7 beta and the open source IDE Eclipse 3.5.1. In a previous study, Mealy et al. used the RefactorIT 2.5 plug-in version and an earlier version of Eclipse with and without a refactoring plug-in [4]. In our study we use the standalone version of RefactorIT and the latest default version of Eclipse. When this study was initially conducted, IntelliJ IDEA was available as a 30 day trial with all available functionalities, and JBuilder 2008 was the free-for-download, limited functionality release of JBuilder. IntelliJ now offers a free community edition of IDEA that provides the same refactorings and similar functionality as the one tested. For JBuilder, two commercially available versions exist that charge per seat or per license.

For the evaluation of the tools, a small scale undergraduate level student project and a considerably larger open source project were used. The student project was an implementation of a bowling game simulator that had 24 classes and approximately 2,000 lines of code. The open source project used was Google Web Toolkit; in particular, only its /dev/src module was evaluated, containing 421 classes and approximately 110,000 lines of code. The bowling alley simulation was used because of its relative ease of understanding, and the open source project Google Web Toolkit was chosen because of its maturity and considerably larger size.

Each software refactoring tool chosen supported more than 20 refactorings, and each one was applied to the two software systems selected. IntelliJ IDEA supported refactorings for rename, change signature, make static, convert to instance method, move, copy, safe delete, extract method, replace method code


duplicates, invert Boolean, introduce variable, introduce field, introduce constant, introduce parameter, extract interface, extract superclass, use interface where possible, pull members up, push members down, replace inheritance with delegation, inline, convert anonymous to inner, encapsulate field, replace temp with query, replace constructor with factory method, generify and migrate.

JBuilder 2008 supported refactorings for rename, move, change method signature, extract method, extract constant, inline, convert anonymous class to nested, convert member type to local, convert local variable to field, extract superclass, extract interface, use supertype where possible, push down, pull up, introduce indirection, introduce factory, introduce parameter object, introduce parameter and encapsulate field.

RefactorIT supported refactorings for undo, redo, add delegate methods, change method signature, clean imports, convert temp to field, create constructor, create factory method, extract superclass/interface, inline, introduce explaining variable, minimize access rights, move, override/implement methods, pull up/push down, rename and use supertype where possible.

Eclipse supported refactorings for rename, move, change method signature, extract method, extract local variable, extract constant, inline, convert anonymous class to nested, convert member type to top level, convert local variable to field, extract superclass, extract interface, use supertype where possible, pull members up, push members down, extract class, introduce parameter object, introduce indirection, introduce factory, introduce parameter, encapsulate fields, generalize declared type, infer generic type arguments, migrate jar file, create script and apply script.

The usability guidelines used to evaluate the usability of the software refactoring tools comprised eight categories: consistency, errors, user experience, ease of use, design for the user, information processing, user control and goal assessment. The specific criteria for each category are listed below.

Consistency (C)
(C1) Ensure that things that look the same act the same and things that look different act different.
(C2) Be consistent with any interface standards (either explicit or implicit) for the domain/environment.

Errors (E)
(E1) Assist the user to prevent errors (through feedback, constrained interface, use of redundancy).
(E2) Be tolerant of errors.
(E3) Provide understandable, polite, meaningful, informative error messages.
(E4) Provide a strategy to recover from errors.
(E5) Permit reversal of actions/ability to restart.
(E6) Allow the user to finish their entry/action before requiring errors to be fixed. Do not interrupt the task being completed.
(E7) Automate error-prone tasks/sub-tasks.

User Experience (UX)
(UX1) Make the interface minimal, simple to understand, organized, without redundancy, socially relevant (especially for communication) and aesthetically pleasing.
(UX2) Provide the information, or access to the information, needed for a decision when/where the decision is made.
(UX3) Use the fewest number of steps/screens/actions to achieve the user's goals.

Ease of Use (EU)
(EU1) Make the system flexible.
(EU2) Make the system simple to use.
(EU3) Make the system efficient to use.
(EU4) Make the system enjoyable to use.
(EU5) Automate tedious/repetitive/time-consuming tasks/sub-tasks.

Design for the User (DU)
(DU1) Define the user and match the system to the user.
(DU2) Use the user's mental model and language (avoid codes).
(DU3) Automate mundane/computable tasks/sub-tasks.

Information Processing (IP)
(IP1) Assist the user to understand the system.
(IP2) Minimize memorization (i.e., reduce short-term memory load) through use of selection rather than entry, names and not numbers, predictable behavior and access to required data at decision points.
(IP3) Make commands and system responses self-explanatory.
(IP4) Use abstraction or layered approaches to assist understanding.
(IP5) Provide help and documentation, including tutorials and diagnostic tools.
(IP6) Assist the user to maintain a mental model of the structure of the application system/data/task.
(IP7) Maximize the user's understanding of the application system/task/data at the required levels of detail.

User Control (UC)
(UC1) Adapt to the user's ability, allow experienced users to use shortcuts/personalize the system, and use multiple entry formats or styles.
(UC2) Put the user in control of the system, ensure that they feel in control and can achieve what they want to achieve. Allow users to control the level of detail, error messages and the choice of system style.

Goal Assessment (GA)
(GA1) Ensure the user always knows what is happening. Respond quickly, meaningfully, informatively, consistently and cleanly to user requests and actions.
(GA2) Make it easy for the user to find out what to do next.
(GA3) Make clear the cause of every system action or response.
(GA4) Provide an action/response for every possible type of user input/action.
(GA5) Provide feedback/assessment/diagnostics to allow the user to evaluate the application system/data/tasks.

Each criterion is rated on an integer scale from 1 to 5 based on compliance with the usability guideline, where 1 is strongly disagree, 2 is disagree, 3 is neutral, 4 is agree and 5 is strongly agree. Each tool evaluated in this study is assumed to provide an adequate set of refactorings for refactoring both applications, so tool adequacy has not been considered as a category in the usability guidelines.


Mealy et al. were able to conceive 81 usability requirements that they rated based on 34 usability guidelines [4]. These usability requirements were not available at the time of this particular study. For the basis of comparison, the 1 to 5 scale was used in order to add a finer level of granularity to the assessment of the guidelines in the absence of the individual requirements.

    RESULTS

The entire set of compliance scores for each refactoring tool is tabulated in Exhibit 1. Table 1 shows the raw data for the most significant usability guideline scores. From the percentage usability scores it can be seen that all four tools were similar in their compliance with the usability guidelines. RefactorIT scored lower than IntelliJ IDEA, JBuilder and Eclipse because of its strict precondition validation rules that do not allow the user to modify the code by means of a built-in text editor. This was especially problematic in instances where new variables needed to be created to continue a sequence of refactorings, or where the introduction of a new parameter in a method signature could not be extracted. As a result, the ease of use category for RefactorIT brought down its overall score.

The strategy for error recovery row in Table 1 refers to the error handling capabilities of the refactoring tools. For each tool the most common way to handle improper use of a refactoring was an obtuse error message. Clearing the error message provided no further assistance about the refactoring and the user was left to empirically figure out what had caused the error. In more than one instance the refactoring was never performed and resulted in manual refactoring.

Criterion   IntelliJ   JBuilder   RefactorIT   Eclipse
C1             4          4           4           4
C2             4          4           4           4
E1             3          4           3           4
E2             4          3           3           4
E3             3          2           2           3
E4             1          1           1           1
E5             5          5           5           5
E6             4          4           4           4
E7             1          1           1           1
UX1            3          3           3           3
UX2            5          5           2           4
UX3            4          3           3           4
EU1            4          4           1           4
EU2            3          4           1           4
EU3            4          4           3           4
EU4            4          4           1           4
EU5            1          1           1           1
DU1            5          4           4           4
DU2            5          5           3           4
DU3            2          1           1           1
IP1            4          4           3           4
IP2            4          4           4           4
IP3            3          3           3           3
IP4            4          4           4           4
IP5            5          3           5           4
IP6            3          3           3           3
IP7            3          4           3           3
UC1            3          3           1           3
UC2            1          1           1           1
GA1            5          5           5           5
GA2            4          4           3           4
GA3            4          4           3           4
GA4            4          4           3           4
GA5            2          2           3           4
Total        118        114          94         115
%            69%        67%         55%         68%

Exhibit 1: Full Set of Compliance Scores
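For reference, the percentage row in Exhibit 1 follows from the 34 criteria each being rated on the 1 to 5 scale, giving a maximum raw score of 170; for IntelliJ IDEA, for example:

$$\frac{118}{34 \times 5} \;=\; \frac{118}{170} \;\approx\; 69\%$$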


The tools provide a means of reversing actions when a refactoring is carried out improperly. This is represented in Table 1 by high marks in the permit reversal of actions row. The tools evaluated allowed reversal of actions by integrating the refactoring tools' undo and redo commands with the overall undo and redo commands of the IDE.

The low scores in the automate error-prone tasks and automate tedious/time consuming tasks rows of Table 1 are a reflection of the general lack of automation in the refactoring tools. Performing refactorings on the bowling simulation student project presented numerous instances where array indexes and string constants were extracted as constants. The only failing was that once the constant had been extracted, unless it was extracted at the field level, the constant could not be propagated throughout the class or, more efficiently, throughout the project. This left the developer to perform any replacements using the newly extracted constants by hand.

The user control rows of Table 1 were derived from the ability to provide variable, parameter and class names when performing refactorings such as introduce field, introduce variable, convert anonymous to inner and encapsulate field. The refactoring tools themselves did not allow customizations, but IntelliJ IDEA provided a set of templates to create new files and classes that can be customized.

In Table 2 overleaf, Mealy et al.'s previous scores can be seen. Again, the scores of the usability guidelines show that there is little difference in terms of usability of the software refactoring tools, with the exception of Condenser. A calculation of the percentage of raw points earned in the Mealy et al. study shows that the minimum percentage for compliance was 55% and the maximum was 67%. This is almost exactly the same range as the compliance percentages from Table 1, which show a minimum compliance of 55% and a maximum compliance of 69%. This shows that there is no discernible difference between the commercial tools and the open source tools evaluated by Mealy et al., and there is no discernible difference between the previous RefactorIT plug-ins and the RefactorIT standalone version.

                                          IntelliJ 7.0.4   JBuilder 2008   RefactorIT 2.7 beta   Eclipse 3.5.1
Errors
  Strategy for error recovery                    1               1                 1                  1
  Permit reversal of actions                     5               5                 5                  5
  Automate error-prone tasks                     1               1                 1                  1
Ease of Use
  Automate tedious/time consuming tasks          1               1                 1                  1
User Control
  Adapt to the users, customization              3               3                 2                  3
  Allow users to control details, error
    messages, style                              1               1                 1                  1
Usability Totals                               118             114                94                115

Table 1: Usability Guidelines Compliance - Significant Results


The versions of Eclipse 3.2 and Eclipse 3.2 with the Simian UI plug-in showed a marked improvement with the Simian UI plug-in in the previous Mealy et al. study [4]. The Simian UI was used in the Mealy et al. study because of the code smell inspection properties the plug-in added to the Eclipse IDE [2]. Given the similarity in usability compliance between Eclipse 3.5 and Eclipse 3.2 with the Simian plug-in, the conclusion can be drawn that the Eclipse platform has since been updated to incorporate code smell inspections that improve the refactoring usability of the IDE, similar to the Simian UI plug-in. This was also apparent because the code smell inspection capabilities of IntelliJ IDEA and Eclipse were very similar.

Since there is no difference between the previously evaluated tools and the current ones, we can state that there has been no significant improvement in the usability of software refactoring tools since the previous evaluation. The raw score numbers confirm that automation in the tools is still non-existent and that user control is still lacking.

Since the goal of the study was to evaluate the usability of the refactoring tools, and there was no difference between the usability of the tools on the student project and the open source project evaluated, the results are presented as a single table. The two projects proved useful in providing ample opportunities to attempt all the refactorings offered by the tools.

    DISCUSSION

A major issue not discussed in the previous section, which consistently scored low in the raw data in Table 1, was the strategy for error recovery. In the current state of software refactoring tools, whenever a pre-condition is violated an error message is displayed to the user. It was the ineffectiveness of this error message that led Murphy-Hill and Black to develop their plug-ins for Eclipse to enhance extract method [11]. Once an error is encountered, the user is notified and the refactoring is either canceled or allowed to continue at the user's discretion, and it is up to the user to compensate for any errors injected into the system.

In order to improve the usability of the tool, the tool should instead analyze the user's code selection and, based on the context of the refactoring, suggest a set of corrective actions necessary to complete the refactoring.

Refactoring Tool                 Consistency   Errors   Information   User         Define     User      Goal         Total    %
                                                        Processing    Experience   for User   Control   Assessment
Eclipse 3.2                           4          8.5         9            7           7.5       6          3.5        45.5    56%
Condenser 1.05                        0.5        5           1            3           2.5       1.5        3          16.5    20%
RefactorIT 2.5.1                      4         10.5         7            9           7         4          6          47.5    59%
Eclipse 3.2 with Simian 2.2.12        4         12           9            9           7.5       7          6          54      67%
Total Requirements by Category        4         15          10           10           9        25          8          81     100%

Table 2: Usability Compliance of Mealy et al. - Analysis [4]


For instance, in the scope of an extract method, if the user has an incomplete selection then the refactoring tool should display a corrective action such as "Did you mean ...", where the display is the suggested block of highlighted text that would make the refactoring feasible. The user can then confirm the intended action and the tool can perform the refactoring as expected. In the instance of a change signature refactoring, if a method variable name clashes with a parameter name then the tool could suggest appending the name with param or local for the appropriate case, where the suggested suffix is presented in an editable text field. Currently this scenario either brings the user back to the initial refactoring screen or allows the user to continue and leaves them to resolve the error post refactoring. By making the system more tolerant of errors and providing suggestions, the user will be able to refactor the code more quickly and associate it with a more enjoyable user experience.
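A rough sketch of how such a corrective suggestion might work for an incomplete extract method selection (hypothetical helper, not a feature of the tools studied): the selection is expanded to the nearest statement boundaries and offered back to the user.

// Hypothetical sketch of the "Did you mean ..." idea: if the user's selection
// for an extract-method refactoring cuts a statement in half, expand it to the
// nearest statement boundaries and offer the expanded range as a suggestion.
public class SelectionAssistSketch {

    static int[] suggestSelection(String source, int selStart, int selEnd) {
        // Walk left to the previous ';' or '{' and right to the next ';'
        // so the suggested block covers whole statements only.
        int start = selStart;
        while (start > 0 && source.charAt(start - 1) != ';'
                         && source.charAt(start - 1) != '{') {
            start--;
        }
        int end = selEnd;
        while (end < source.length() && source.charAt(end - 1) != ';') {
            end++;
        }
        return new int[] { start, end };
    }

    public static void main(String[] args) {
        String code = "int a = 1; int b = a + 2; print(b);";
        int[] suggestion = suggestSelection(code, 15, 20); // mid-statement selection
        System.out.println("Did you mean: \""
                + code.substring(suggestion[0], suggestion[1]).trim() + "\"?");
    }
}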

The automation of tasks is another failing of the refactoring tools, highlighted in the ease of use rows of Table 1. When extracting a constant, most tools already provide an option for replacing all pre-existing instances of the extracted expression within the class. However, these constants can then be moved to a superclass or interface level, and propagating these new constants to other classes within the hierarchy, package or project is still an exercise in performing refactorings manually.

At the package level, refactoring the location of classes is still left to be performed manually. For the student project, all the classes were created at the default package level. The only refactorings that provided any assistance in this area were related to the class hierarchy. None of the four tools provided a simple refactoring for grouping classes as packages, or a code smell that offered suggestions for grouping classes into appropriate packages. This could be done with a tool that checks the dependencies amongst the classes and suggests improvements to the current packaging of the classes, as sketched below.
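One possible shape of such a dependency check (hypothetical class names and threshold, purely illustrative): classes that reference each other heavily are reported as candidates for a common package.

import java.util.List;
import java.util.Map;

// Hypothetical sketch of the dependency check suggested above: classes that
// reference each other above a threshold are reported as candidates for one package.
public class PackageGroupingSketch {

    // classDeps maps a class name to the classes it references.
    static void suggestGroups(Map<String, List<String>> classDeps, int threshold) {
        for (Map.Entry<String, List<String>> e : classDeps.entrySet()) {
            for (String other : e.getValue()) {
                List<String> reverse = classDeps.getOrDefault(other, List.<String>of());
                // Mutual references plus enough outgoing dependencies suggest cohesion.
                if (e.getKey().compareTo(other) < 0
                        && reverse.contains(e.getKey())
                        && e.getValue().size() >= threshold) {
                    System.out.println("Consider one package for "
                            + e.getKey() + " and " + other);
                }
            }
        }
    }

    public static void main(String[] args) {
        suggestGroups(Map.of(
                "Frame", List.of("Game", "Score"),
                "Game", List.of("Frame", "Score"),
                "Score", List.<String>of()), 2);
    }
}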

The closest approach to automated refactorings was provided by Eclipse 3.5. The Eclipse 3.5 IDE provided a method for creating custom scripts based on previous refactorings carried out by the user, which could then be saved and applied later or in another project. Since most of the refactorings carried out are very specific to variable, method and class names, this is a very difficult process to make generic for use across multiple projects.

    CONCLUSION

An error-tolerant system will enable the user to refactor the code more quickly.

From the results of the comparison against the usability guidelines we found that sufficient improvements have not been made to open


source as well as commercially available refactoring tools for Java development. However, in the case of extract method, one of the refactorings singled out by Martin Fowler as being fundamental to refactoring, work has been done to aid developers in increasing their efficiency with the development of the Selection Assist, Box View and Refactoring Annotations plug-ins available for Eclipse.

Work is currently being done to address the automation and integration of the entire refactoring process and to apply additional user control to the proposed automation improvements. Identification of problem code is an area that needs improvement, and better identification tools will allow further automation of refactoring, either by mapping from a code smell to a transformation or from a single transformation to numerous source code candidates.

Certain refactorings, such as the introduction of constants, removal of unused code and even extraction of a method, can be done automatically. These types of code smells require a simple change and can be carried out automatically. In the case of an extract method, performing this refactoring automatically can be done by identifying repeated code and encapsulating the code block into a parameter-constrained method. However, such refactoring requires an explicit pre- and post-condition check. Any type of refactoring that requires redesign, such as long methods, large classes, too many method parameters or high coupling, can be detected and brought to the user's attention, but any type of fix would be at the user's discretion. These types of refactorings can still be identified at the assisted level, but are most likely to be categorized above the complexity threshold of a severity level automated refactoring.

Robust automation of refactoring can be attained with the help of the right identification tools.

An example of an assisted level tool is the Inspection Gadgets plug-in for IntelliJ IDEA. This tool detects a large range of code smells such as abstraction issues, class structure, control issues, data flow issues, inheritance issues, pattern validation, probable bugs and a myriad of others, and flags the developer via yellow bars in the text editor. Each category mentioned, as well as the others, has a more fine-grained list of the specific inspections that are carried out on the static source code. Users are able to select which inspections to turn on or off and select which inspections they wish to address. Each flagged inspection provides suggestions for fixing the problem that can be carried out semi-automatically, ignored with the addition of a code comment, or simply ignored by the user by not addressing the flag. The breadth of the tool is impressive, as is its ability to aid developers without


being distracting and allowing a level of customization.

Incorporating the use of refactoring tools into the educational process can give future developers more exposure to their possibilities. Since the refactorings themselves are universal to object-oriented systems, and the tools show a high correlation of the same refactorings being available, this would not be an issue of IDE selection, but rather one of lack of exposure to these types of tools.

Proper usage of software refactoring tools reduces the chance of the compile time and run time errors that are inherent in manual refactorings. It is the belief of these researchers that a developer's time spent on refactoring could be significantly reduced by removing the usability barriers of software refactoring tools.

A tool for IntelliJ IDEA is being developed to aid with automated design level refactoring. The tool would work at the assisted and severity based levels. Based on the user settings, the tool would be able to identify candidates for design improvement and, based on the detected level of complexity, either carry out the refactoring automatically or provide suggestions to the developer. The tool will leverage the XML framework used in previous studies.

    REFERENCES

1. Fowler, M. (1999), Refactoring: Improving the Design of Existing Code, Addison-Wesley: New Jersey.

2. Murphy-Hill, E. (2007), Activating Refactorings Faster, in the Companion to the 22nd ACM SIGPLAN Conference on Object-Oriented Programming Systems and Applications, pp. 925-926.

3. Mealy, E. and Strooper, P. (2006), Evaluating Software Refactoring Tool Support, in the Proceedings of the Australian Software Engineering Conference, pp. 331-340.

4. Mealy, E., Carrington, D., Strooper, P. and Wyeth, P. (2007), Improving Usability of Software Refactoring Tools, in the Proceedings of the Australian Software Engineering Conference, pp. 1-10.

5. Drozdz, M. Z. (2008), A Critical Analysis of Two Refactoring Tools, Masters Dissertation, University of Pretoria, South Africa.

6. Henry, P. (1998), User-Centered Information Design for Improved Software Usability, Artech House: Norwood, Massachusetts.

7. Li, H. and Thompson, S. (2008), Tool Support for Refactoring Functional Programs, in the Proceedings of the 2008 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation, pp. 199-203.

8. Maruyama, K. and Yamamoto (2005), Design and Implementation of an Extensible and Modifiable Refactoring Tool, in the Proceedings of the 13th International Workshop on Program Comprehension, pp. 195-204.

9. Mendonça, N. C., Maia, P. H. M., Fonseca, L. A. and Andrade, R. M. (2004), RefaX: A Refactoring Framework Based on XML, in the Proceedings of the 20th IEEE International Conference on Software Maintenance, pp. 147-156.

10. Murphy-Hill, E. and Black, A. P. (2008), Breaking the Barriers to Successful Refactoring: Observations and Tools for Extract Method, in the Proceedings of the 30th International Conference on Software Engineering, pp. 421-430.

11. Murphy-Hill, E. (2006), Improving Usability of Refactoring Tools, in the Companion to the 21st ACM SIGPLAN Symposium on Object-Oriented Programming Systems, Languages, and Applications, pp. 746-747.


Information Feedback Model for Scalability in Distributed Software Architecture

By Manjunath Ramachandra, Narendranath Udupa and Shyam Vasudev Rao

Pre-empting and appropriately predicting network status through simulation can help overcome network load related issues

Software architecture that has scalable and extensible components is useful in meeting the requirements of reuse and re-configurability. However, it involves increased message handling and inter-process communication. The addition of new components to a scalable and extensible software architecture poses the issues of increased memory requirements, delays/latencies, blocking in the queue, etc.

In this paper, a novel mechanism based on information feedback is suggested to control these parameters while scaling up the architecture. Accurate models are required to sustain performance as the software components scale. Although a good amount of literature exists on the effects of scaling on performance, it does not provide insight into improving or retaining performance. In this paper, scaling is linked to the time shifts of the feedback signal provided to the source component through an active control mechanism such as Random Early Detection (RED).
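For context, the standard RED marking rule (a textbook formulation, not reproduced from this paper) drops or marks an arriving packet with a probability that rises with the average queue length $\bar{q}$ between a minimum and a maximum threshold:

$$p_b \;=\; \begin{cases} 0, & \bar{q} < q_{\min} \\ p_{\max}\,\dfrac{\bar{q}-q_{\min}}{q_{\max}-q_{\min}}, & q_{\min} \le \bar{q} < q_{\max} \\ 1, & \bar{q} \ge q_{\max} \end{cases}$$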

The paradigm of component based software engineering has significantly reduced software development cycle time and cost [1]. It comes with attractive features such as parallel development, software reuse, pluggable maintenance, etc.

In a distributed software architecture, inter-component communication increases the burden on available resources such as bandwidth and buffer space. The supporting network provides limited infrastructure. With the addition of every component to the network, the contention for resources increases.

    DISTRIBUTED COMPONENT

    ARCHITECTURE

    Components cluster the reusable portions of the

    code together and provide appropriate interfaces


to the external world. The programmer can use them with the right configuration, minimizing the code. As a result, software development efforts have been greatly reduced.

The component based software model has been extended to support distributed software architecture spanning multiple machines [2]. As a result, components or pieces of software residing on other machines can be used conveniently. This has led to distributed component based software development being widely practiced today. It calls for the transfer of components to the runtime environment and their configuration there. Each instance of a component can bear a separate configuration accordingly. The containers embedding the components take responsibility for communication with reliability, scalability, etc.

    The Model-based Approach

A simulation model is used for a variety of applications. It provides a better understanding of the issues, based on which the design parameters are fine tuned. To make this happen, the model and the actual network over which the components communicate are to be compared apples to apples. The actual network and the simulation model are shown in Figures 1 and 2. The details of the implementation of the model are provided in the next sections. The model is expected to capture the behavior of the inter-component communication.

    Relevance of the Quality of Service (QoS) in

    the Distributed Component Architecture

A component provides a variety of services to its users, including locating other components, instantiation, servicing, computation, communication, etc. In applications that involve a graphical user interface (GUI), real time response is mandated to provide a reasonably good user experience.

Components often communicate services that are required to adhere to an agreed QoS. In applications such as air ticket booking, the data becomes invalid if stringent timing constraints are not honored. Low latency, high throughput systems require controllers to ensure the QoS.

The increased contention among components takes a toll on the performance of the applications that use them. Meeting the agreed QoS, such as packet delay and loss, becomes difficult. The QoS defined at the network level translates onto the QoS of the inter-component communication in the distributed network. This calls for the use of an efficient traffic shaper at the component.

Figure 1: Inter-component Communication Network (input signal, network, observed output characteristics, desired output)

Figure 2: Simulation Model (simulated input signal, simulation model, observed output properties)


The traffic shaper fairly distributes the available resources among the different components in the network, considering their real time performance requirements.

Contention for resources can also happen when several applications share and try to access the same component. The paths leading to this component would get congested, leading to a dearth of resources. Prediction of access patterns and of the priority of the applications would be helpful to stagger the access.

The problem gets compounded when the number of components in the network increases. The traffic shaper or controller has to adapt to the dynamic scaling of the components that come in and move out of the network over a period of time. Applications generally invoke a variable number of components during a run.

    Mechanisms for the Implementation of the

    QoS

Support for QoS at the network level involves IntServ, DiffServ, etc. [3]. QoS mechanisms are also supported in component architectures such as CORBA for message publishing and subscription [4]. However, these mechanisms are static in nature. For the runtime transport of components, an effective, adaptable QoS mechanism is required. The same is addressed in this paper.

INFORMATION FEEDBACK BASED CONTROLLER MODEL

The components exchange data over a more reliable network connection to minimize data loss. The equations of the packet flow used for traffic shaping are taken from [5].
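A plausible form of equations (1) and (2), assuming [5] refers to the widely used TCP/AQM fluid-flow model with the variable definitions given below:

$$\frac{dW(t)}{dt} \;=\; \frac{1}{R(t)} \;-\; \frac{W(t)\,W(t-R(t))}{2\,R(t-R(t))}\,p(t-R(t)) \qquad (1)$$

$$\frac{dq(t)}{dt} \;=\; \frac{N(t)}{R(t)}\,W(t) \;-\; C \qquad (2)$$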

where W is the data window size, q is the queue length, R is the round trip time, C is the link capacity, N is the load and p is the packet drop probability. The drawback with this equation is that the status of the network will be known only after the duration of the round trip time (RTT). By then, the characteristics of the network would have changed.

A reliable network connection can be used to exchange components to reduce data loss.

So, a time shift is proposed for the feedback signal to be used in the controller [6]. In addition, if the status of the network is made known several steps ahead of time, there would be ample time for the components pumping the data to take appropriate steps. The bigger the network, the more the traffic and the more time required to control it. Scalability of the components is linked to their performance. In this section, a traffic shaper blended with a predictor is introduced. The effect of scaling the components with such a traffic shaper in place has been analyzed in


    the next section and demonstrated through

    simulation results.

In the proposed method, instead of the present value of the marking probability, its predicted value is used. This predicted value is generated with the help of a differentially-fed artificial neural network (DANN) that makes use of several previous values to generate the output [7]. The architecture of a DANN is shown in Figure 3. It consists of a predictor, such as a conventional artificial neural network, with the differentials of the output as additional inputs for prediction. Other inputs are derived from parameters such as the instantaneous load on the network. An immediate advantage of a DANN is that it provides the packet drop probability several time steps in advance.
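A minimal sketch of the differential feedback idea (a toy linear predictor with invented weights standing in for the neural network, not the authors' implementation):

// Hypothetical sketch of a differentially-fed predictor: alongside the network
// measurement, the first-order difference of its own previous outputs is fed
// back as an extra input (a simple linear model stands in for the ANN).
public class DifferentiallyFedPredictor {

    private double prevOutput = 0.0;
    private double prevDiff = 0.0;
    private final double[] weights = { 0.6, 0.3, 0.1 }; // load, prev output, diff

    /** Predicts the next packet-drop probability from the current load. */
    public double predict(double instantaneousLoad) {
        double output = weights[0] * instantaneousLoad
                      + weights[1] * prevOutput
                      + weights[2] * prevDiff;
        output = Math.min(1.0, Math.max(0.0, output)); // keep it a probability
        prevDiff = output - prevOutput;                 // differential feedback
        prevOutput = output;
        return output;
    }

    public static void main(String[] args) {
        DifferentiallyFedPredictor p = new DifferentiallyFedPredictor();
        double[] loads = { 0.2, 0.4, 0.8, 0.9 };
        for (double load : loads) {
            System.out.printf("load=%.1f -> predicted drop p=%.3f%n",
                    load, p.predict(load));
        }
    }
}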

Apart from the prediction of the packet drop probability, a DANN provides several interesting features suitable for shaping the inter-component communication traffic [8]. The important ones are highlighted below:

- It makes use of a long history of inputs, because of the inherent autoregressive nature of the model.
- For a given number of iterations, the square error is found to decrease with the order of differential feedback.
- The model can work on a reduced set of training data.
- For an infinite order of differential feedback, the square error is zero.
- The output with different orders of feedback forms a manifold of hyperplanes, each of which is self similar to the others [9].
- The origin of the self similarity is the associated long-range dependency, or use of the previous input and output samples.
- The self similar outputs have varying degrees of abstraction, each of which may be generated from another by convolving with a Gaussian pulse.

    Features of the Proposed Model

Figure 3: Differentially Fed Artificial Neural Network (network information, other inputs, delay, predictor)

Figure 4: Feedback-based Controller in the Network (status signals from the network drive the feedback controller, which returns control signals to the network)

The proposed information feedback model goes well with the characteristics of the network that supports the distributed components. The traffic over the network, be it LAN or WAN, turns out to be self similar [9]. In other words, the shape of the autocorrelation function of the traffic observed over different time scales remains the same. The status signal, such as the probability of packet loss sampled from the network, spans several


time scales. It happens to be the resultant of events happening at different timescales. The controller model has precisely these characteristics, making it look like the network [10]. If it sits in the feedback path, it should be easy to control the network, as shown in Figure 4.

Quality of Service (QoS) is related to buffer dynamics. Though large buffers reduce cell loss, packet delays increase. On the other hand, small buffers could result in reduced delay at the risk of increased cell loss. This calls for different enforcements, i.e., packet discard rules, to ensure the QoS adaptively depending on buffer size, growth, rate of growth, etc. These parameters are considered as inputs of the DANN that predicts the traffic at different time scales.

As a result of the differential feedback used in the controller, the control signal spans multiple time scales [11]. This important feature provides the additional advantage of multiple time scale control for the inter-component communication.

The usage of these previous values is equivalent to sieving the traffic through multiple time scales. Hence, when a DANN is used, it is equivalent to scaling the time by a factor that is a function of the order of the feedback. The DANN produces the same results as scaling. Replace p(t-R(t)) with p(t); the equation may then be thought of as a backward prediction starting from p(t) and going up to p(t-RTT). An artificial neural network works the same way for backward prediction. The time shift provided for the predicted samples amounts to generating the samples with future and past values. Derivative action generally reduces oscillations, so a proportional derivative with prediction may be used to reduce the variance of q. The idea here is to use the shifted versions of the near-future prediction of the loss probability as the control signal that is fed back to the input. The shift provided to the feedback signal is equivalent to scaling the network with an increased number of components.

    Relevance of the Congestion Detection

    Interval

The congestion detection interval is a critical point in the design of the proposed congestion avoidance algorithm. If the interval is too long, the controller cannot adapt swiftly to changes in the traffic input rate, making the difference between the predicted input rate and the actual input rate very large. As a result, packets get dropped in the network. On the other hand, if the interval is too short, the prediction error would be too large and the network would be left in a near congestion state.

The controller handling short time scales accounts for the bursty traffic from a single dominant source, or a limited number of data sources, that tends to flood the buffers along the path. Such a source can be contained at the earliest after one RTT, once adequate buffer space has been reserved to handle it.
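One reasonable reading of this trade-off is to bound the detection interval between a fraction of the RTT and a few RTTs, shrinking it for burstier traffic. The sketch below illustrates that heuristic; the bounds and the burstiness measure are assumptions, not values from the paper.

    def detection_interval(rtt, burstiness, lo=0.5, hi=4.0):
        """Pick a congestion-detection interval as a multiple of RTT:
        burstier traffic pushes the interval towards the lower bound so the
        controller reacts quickly; smooth traffic tolerates a longer interval,
        which keeps the prediction error small.  Bounds lo/hi are assumptions."""
        factor = hi - (hi - lo) * min(max(burstiness, 0.0), 1.0)
        return factor * rtt

    print(detection_interval(rtt=0.1, burstiness=0.9))  # ~0.085 s for very bursty traffic
    print(detection_interval(rtt=0.1, burstiness=0.1))  # ~0.365 s for smooth traffic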

SCALING OF THE COMPONENTS

As N, the number of components interacting at a given time, increases, the time slot allocated to each of them for sharing the common bandwidth decreases. This leads to traffic control at different time scales. Let the N communicating components share a path with a fixed link of capacity C, and let T be the propagation delay of the path.

Let q(t) and p(t) denote, respectively, the queue length and the packet dropping probability of the link at time t. Let Wi(t) and Ri(t) denote, respectively, the window size and the round trip time of flow i at time t. Equations (1) and (2) then reduce to their scaled counterparts, with the multiplicative parameter and the packet dropping probability p appearing as before.

The rescaling, or the equivalent shift given to the feedback signal, increases the link bandwidth and reduces the link delay by the scaling parameter α. This is because each packet is served at α times larger bandwidth (which reduces its transmission time by a factor of 1/α) and experiences 1/α times the original delay. The time instants at which packet events occur are also advanced by the factor 1/α, so the delay reaches its stable value faster. That is, a packet event that occurs at time t in the original network, before the components are scaled, advances to (1/α)t in the scaled network as a result of scaling the components.

As seen earlier, a shift in the feedback signal amounts to scaling of the components or of the supporting network. One effect of scaling is that the number of events the packets encounter is reduced. The bandwidth-delay product remains the same and is governed by the capacity of the network; physically, it represents the capacity of the flow in terms of the number of data packets that can be transmitted without waiting for an acknowledgment.

Let B and D represent the available bandwidth and the delay along a path in the original network and P_BDP the bandwidth-delay product along the path. Let B', D' and P'_BDP be the equivalent parameters in the scaled network. Since the bandwidth-delay product remains invariant [8], the constraint for scaling becomes

P_BDP = B . D = B' . D' = P'_BDP     (6)

From equation (6), the scaling parameter α is determined as follows:

B' = α . B     (7)
D' = D / α     (8)

where α > 1.
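The invariance in equations (6) to (8) is easy to check numerically. The short sketch below, with α written as alpha and arbitrary example values, simply applies the constraint; it is an illustration, not part of the simulation setup.

    def scale_path(bandwidth, delay, alpha):
        """Scale a path while keeping the bandwidth-delay product invariant
        (equations 6-8): bandwidth grows by alpha, delay shrinks by alpha."""
        assert alpha > 1.0, "the scaling parameter is assumed to exceed 1"
        b_scaled, d_scaled = alpha * bandwidth, delay / alpha
        # P_BDP is unchanged up to floating-point rounding.
        assert abs(b_scaled * d_scaled - bandwidth * delay) < 1e-6 * bandwidth * delay
        return b_scaled, d_scaled

    print(scale_path(bandwidth=10e6, delay=0.08, alpha=2.0))   # (20 Mb/s, 40 ms)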

The shift changes the self-similarity, because a shift implies a different order [12]. In central interpolation the differential is replaced by a pair of shifts, i.e., the time scale becomes different. This is as though the number of sources increases: increasing the number of sources is equivalent to changing t to αt, which in turn corresponds to scaling the differential, since differentials and shifts are related through central interpolation.

It is important to note that the capacity of the network defines and governs the bandwidth-delay product. Under scaling, the governing relation for y keeps the same form, with A, B and C as arbitrary proportionality constants.

The outputs of the controller are central differences generated by applying shifts to the predicted packet-drop probability signal. These derivatives are realized with the help of a DANN. Here the near-future prediction of the loss probability is used as the control signal that is fed back to the input. The mechanism works because of the round trip time: the controller does not react immediately to the packet drop decision.

Scaling of the components results in increased RTT, increased load, reduced capacity and reduced buffer size. These changes happen dynamically in the network and adversely affect the performance of the packet network in terms of increased losses, increased delays, etc. In this work, it is emphasized that shifts amount to scaling; by appropriately shifting the feedback signal it is possible to achieve the same or better performance even when the number of components scales up. The packet sources or components can shift this feedback signal appropriately, depending upon the instantaneously measured network parameters, and achieve better end-to-end performance.

Scaling of the Queue Dynamics

The delay gets scaled as the available bandwidth reduces, and it increases with an increase in N. The last section explained that, with scaling, all the events get advanced by the factor 1/α. The queue length in the buffers of the original network is larger than that in the downscaled network, because the downscaled network responds earlier but in line with the original network.

It is assumed that the channel capacity C remains the same. With this, dq/dt reduces to

dq/dt = Σi Wi(t)/Ri(t) - C     (11)

which shows the effect of scaling. It may be noted that, in equation (11), the change in the queue size is


computed as the difference between the arrival rate of the different flows and the link capacity. The arrival rate of a flow i is computed using equation (6), by dividing the bandwidth-delay product by the delay.
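A compact sketch of this queue relation follows: the per-flow arrival rate is taken as the bandwidth-delay product divided by the delay, and the queue is integrated as aggregate arrivals minus capacity. The Euler step, parameter values and flow list are assumptions for illustration only.

    def simulate_queue(flows, capacity, dt=0.01, steps=1000, q0=0.0):
        """Integrate dq/dt = sum_i (P_BDP_i / D_i) - C, clipping the queue at zero.
        Each flow is a (bandwidth_delay_product, delay) pair."""
        q = q0
        trace = []
        for _ in range(steps):
            arrival = sum(bdp / delay for bdp, delay in flows)   # packets per second
            q = max(0.0, q + dt * (arrival - capacity))
            trace.append(q)
        return trace

    # Toy run: three flows against a 1500 packet/s link.
    trace = simulate_queue(flows=[(60, 0.1), (45, 0.08), (30, 0.05)], capacity=1500)
    print("final queue length:", round(trace[-1], 2), "packets")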

Tables 1 to 4 show the variance of the packet queue for different numbers of components in the network, giving the change in the queue size variance for different degrees of shift in the predicted feedback signal. The simulation has been carried out in SIMULINK; the setup used for all the experiments in the paper is shown in Figure 5, and the SIMULINK block-sets that generate the desired traffic pattern have been used. The buffer size of the gateway is 8000 packets and the target queue length is 3200 packets.

Figures 6 and 7 on page 24 show the instantaneous queue for this simulation. It can be seen that, when the number of sources is 40, the ripples in the queue settle down quickly. The same holds for the average queue shown in Figures 8 and 9 on page 24.

As seen in Tables 1 to 4, the variance reduces as the prediction step increases. This continues for some time, after which the variance shows an upward trend again. This is because the autocorrelation function of a long-range dependent series exhibits oscillations, and the data traffic over the network is self-similar and long-range dependent.

Table 1: Performance with RED and the Proposed Method for Different Number of Components in the Network with Prediction Step = 1

No. of sources   Variance with RED   Standard deviation with proposed method
20               125.2279            111.5828
30               134.0159            126.0763
40               140.5793            129.6867
60               142.8687            111.8134
80               177.0254            126.0417
100              194.5093            138.2350
120              219.2376            151.1265

Table 2: Performance with RED and the Proposed Method for Different Number of Components in the Network with Prediction Step = 3

No. of sources   Variance with RED   Standard deviation with proposed method
20               125.2279            106.8508
30               134.0159            120.8611
40               140.5793            128.3137
60               142.8687            111.8134
80               177.0254            126.0417
100              194.5093            138.2350
120              219.2376            151.1265

Table 3: Performance with RED and the Proposed Method for Different Number of Components in the Network with Prediction Step = 4

No. of sources   Variance with RED   Standard deviation with proposed method
20               125.2279            106.5868
30               134.0159            118.4315
40               140.5793            128.5254
60               142.8687            111.8134
80               177.0254            126.0417
100              194.5093            138.2350
120              219.2376            151.1265

Table 4: Performance with RED and the Proposed Method for Different Number of Components in the Network with Prediction Step = 6

No. of sources   Variance with RED   Standard deviation with proposed method
20               125.2279            106.6906
30               134.0159            119.2891
40               140.5793            127.5494
60               142.8687            111.8134
80               177.0254            126.0417
100              194.5093            138.2350
120              219.2376            151.1265


As the load increases, the variance increases. When the transmission rate reaches the capacity of the network, the controller is able to keep the aggregate throughput steady, and as a result the variance falls. At this stage, if the load on the network increases further with the scaling of components, the feedback signal dominates and reduces the transmission rate, and the variance starts increasing again. The same is evident in the tables.

RED detects network congestion by computing the probability of packet loss at each packet arrival. As a result, both the congestion detection and the packet drop are computed at a small time scale. In the proposed scheme, by contrast, the congestion detection and the packet drop are performed at different time scales, overcoming the issues stated above.
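For contrast with the proposed scheme, the classic RED rule [5] can be sketched as below: a drop probability is derived from an exponentially weighted average queue at every packet arrival. The thresholds and weights are assumptions chosen only to line up with the simulated buffer sizes, not values taken from the paper.

    def red_drop_probability(avg_queue, min_th=1000, max_th=3200, max_p=0.1):
        """Classic RED: no drops below min_th, drop probability growing linearly
        up to max_p between the thresholds, forced drop above max_th."""
        if avg_queue < min_th:
            return 0.0
        if avg_queue >= max_th:
            return 1.0
        return max_p * (avg_queue - min_th) / (max_th - min_th)

    def update_avg_queue(avg, instantaneous, weight=0.002):
        """Exponentially weighted moving average used by RED."""
        return (1 - weight) * avg + weight * instantaneous

    avg = 0.0
    for q in (500, 1500, 2800, 4000):                 # sample instantaneous queue lengths
        avg = update_avg_queue(avg, q, weight=0.5)    # large weight purely for illustration
        print(f"inst={q:4d}  avg={avg:7.1f}  drop_p={red_drop_probability(avg):.3f}")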

As the proposed scheme detects and predicts the congestion at a large time scale, the components can react to the network conditions more rapidly. They receive the congestion status signal and reduce their data rates before the network actually gets into congestion. The reduced chances of congestion minimize the required packet buffer size and therefore reduce the end-to-end delay and the jitter, as is evident from Figures 10 and 11 on page 25 and Figures 12 and 13 on page 26.

The queue and the delay obtained from the simulation are shown for different loads. The total number of data sources activated in each 120-second simulation ranges from 40 to 120. To see the effect of the shift, the number of sources has been set to 40 and 120 respectively, and the results are taken for shifts of 4.

Figure 5: Simulation Model (SIMULINK block diagram: TCP and HTTP traffic sources feeding RED routers, with blocks measuring the instantaneous queue length, average queue length, queueing delay and drop probability, and gain/product blocks closing the feedback loop)


As the load increases, the advantage due to shifts gets reduced. However, the delay with shifts is always less than that without shifts. The packet loss rates have been found to be 5.2% and 4.6% respectively, which makes the throughput with shift 1.2 times larger than the one without shift. The factor is 1.06 for a shift of 1. A high gain of 5.32 in the throughput factor has been observed when the load is low, i.e., 20 sources, with a shift of 1.

Scaling of the Window Dynamics

Equation (11) implies that each connection in the downscaled scenario (a reduction in the number of components) exhibits the same window dynamics, but the response is 1/α times faster than in the original case, since the connection, as a result of downscaling, adjusts its rate per round trip time Ri(t). The equation governing the window dynamics in the scaled network takes the same form, with the response compressed in time by the factor α.

Figure 6: Instantaneous Value of the Queue for 20 Sources (without shift, with shift 1 and with shift 2)

Figure 7: Instantaneous Value of the Queue for 40 Sources (without shift, with shift 1 and with shift 2)

Figure 8: Time Response for 20 Sources (without shift, with shift 1 and with shift 2)

Figure 9: Time Response for 40 Sources (without shift, with shift 1 and with shift 2)


By rearranging the terms, the effect of scaling becomes apparent. It may be noted that the current bandwidth-delay product of a data path in the network, as explained in equation (6), is used as the current window size, and this remains consistent with equation (11) over a short time interval. Due to the advancement of the time events as a result of scaling, the window size can be reduced to achieve the same or better performance relative to the original network.
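The paper's scaled window equation is not reproduced legibly here. As a stand-in, the sketch below integrates a widely used TCP-like fluid model for the window and shows how dividing the effective round trip time by a scaling factor alpha speeds up the response, which is the behaviour argued in the text. The model form, parameters and names are assumptions, not the authors' exact equations.

    def window_trajectory(p, rtt, alpha=1.0, w0=1.0, dt=0.01, steps=2000):
        """Integrate a standard TCP-like fluid model,
            dW/dt = 1/R - (W^2 / (2R)) * p,
        with the effective round trip time R = rtt / alpha, so alpha > 1 makes
        the window settle roughly 1/alpha times faster."""
        r = rtt / alpha
        w, trace = w0, []
        for _ in range(steps):
            dw = 1.0 / r - (w * w / (2.0 * r)) * p
            w = max(1.0, w + dt * dw)
            trace.append(w)
        return trace

    base = window_trajectory(p=0.01, rtt=0.1)
    scaled = window_trajectory(p=0.01, rtt=0.1, alpha=2.0)
    print("window after 0.5 s, original vs scaled:",
          round(base[50], 2), round(scaled[50], 2))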

Scaling of the RTT Dynamics

The round trip time in the scaled time domain is obtained using predicted versions of the feedback signal; the RTT gets reduced by the amount of the prediction or forecast.

When the RTT is small, the components sensing the network respond very fast. If the network comes out of congestion, all the components start increasing their transmission rates simultaneously, pushing the network back towards the congestion state. Hence, the phase transition of the network can happen at a lower load, well before any individual component tends to grab the network with higher data rates. The traffic exhibits self-similarity at lower loads when the RTT is small and is therefore better predictable. The same is true when the number of components gets scaled up: the contribution of an individual component towards the network load is still small.

Figure 10: Packet Delay for 40 Sources with a Shift of 4 (without shift vs. with shift)

Figure 11: Packet Delay for 80 Sources with a Shift of 4 (without shift vs. with shift)


The addition or deletion of a component into the network influences the phase transition. Although contention for the resources increases with an increase in the number of components, the traffic is better controlled, as is evident from the graphs.

Organization of the Components to Meet the Prescribed QoS in the Communication

The simulation results shown in the figures indicate that the QoS of the applications driven by the components may be enhanced by using the prediction feedback signal. Depending upon the availability of network bandwidth, and guided by the prediction feedback signal, an appropriate abstract version of the data may be transferred over the network. The end application that calls this service then gets a reduced service option instead of a total denial of the service. The dynamic mapping between the different abstractions of the content and the quality of service is shown in Figure 14: if the available network bandwidth is small, an abstract version of the content may be transferred instead of denying the content altogether.
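A minimal sketch of this dynamic mapping follows: the predicted available bandwidth selects one of the abstraction levels Q1 to Q4 rather than denying the service outright. The threshold values and the level ordering are assumptions made only for illustration.

    # Hypothetical abstraction levels, ordered from richest (Q1) to most abstract (Q4),
    # with the minimum bandwidth (in kb/s) each level is assumed to need.
    ABSTRACTION_LEVELS = [("Q1", 2000), ("Q2", 800), ("Q3", 256), ("Q4", 64)]

    def select_abstraction(predicted_bandwidth_kbps):
        """Return the richest content abstraction the predicted bandwidth can carry;
        fall back to the most abstract version rather than denying service."""
        for level, required in ABSTRACTION_LEVELS:
            if predicted_bandwidth_kbps >= required:
                return level
        return ABSTRACTION_LEVELS[-1][0]

    for bw in (2500, 900, 100, 20):
        print(f"predicted bandwidth {bw:5d} kb/s -> serve {select_abstraction(bw)}")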

Figure 12: Packet Delay for 120 Sources with a Shift of 4 (without shift vs. with shift)

Figure 13: Queue Size for 40 Sources with a Shift of 8 (without shift vs. with shift)

Figure 14: Organization of the Data/Signal that gets Exchanged between Components (abstraction levels Q1 to Q4 mapped to the quality of service)

Figure 15: Organization of the Data/Signal that gets Exchanged between Components (data/signal fragments associated with encryption keys K1 to K5)


Organization of the Components to Meet the Prescribed Security Constraints

Component communication has opened up security concerns as transactions over the internet increase, and powerful encryption algorithms are required to support it. However, providing the same degree of security for the communication of all the components, especially when their numbers are getting scaled up, is nearly impossible. A solution is provided to get around this problem.

The data or commands to be exchanged among the components are broken down into a set of logically disjoint fragments so that they can be transferred independently. Figure 15 on page 26 shows the hierarchical organization of the data (or commands) at different degrees of abstraction. Each of the abstractions is encoded with a different encryption key catering to its security needs. For example, components handling financial applications are required to be encrypted with higher security.
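The fragmentation-plus-per-level-keys idea can be sketched as below. The sketch uses Fernet symmetric keys from the Python cryptography package purely for illustration; the paper does not prescribe an algorithm, and the mapping of fragments to key levels is an assumption.

    from cryptography.fernet import Fernet   # pip install cryptography

    # One key per abstraction level (K1..K5 in Figure 15); higher levels could be
    # backed by stronger key-management policies, e.g., for financial components.
    keys = {f"K{i}": Fernet(Fernet.generate_key()) for i in range(1, 6)}

    def protect_fragments(fragments):
        """Encrypt each logically disjoint fragment with the key of its level,
        so the fragments can be transferred independently."""
        return [(level, keys[level].encrypt(payload)) for level, payload in fragments]

    fragments = [("K1", b"public catalogue summary"),
                 ("K4", b"order header"),
                 ("K5", b"payment instruction")]       # assumed most sensitive fragment
    for level, token in protect_fragments(fragments):
        print(level, token[:16], b"...")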

CONCLUSION

With the addition of every component to a distributed software architecture, the performance of the rest of the components in the enabling distributed network gets affected. Through advance prediction of the status of the network, as proposed in this paper, it should be possible to overcome the issue of the network load; the delay as well as the loss rate, which would otherwise be substantial, gets reduced. The simulation currently provides encouraging results with improved quality of service. As an extension of the simulation, the components are to be loaded into the actual build of a project.

    REFERENCES

1. Heineman, G.T. and Councill, W.T. (eds.) (2001), Component Based Software Engineering: Putting the Pieces Together, Addison Wesley, Reading, Massachusetts.
2. Wallnau, K., Hissam, S. and Seacord, R. (2001), Building Systems from Commercial Components, Addison Wesley.
3. Harju, J. and Kivimaki, P. (2000), Co-operation and Comparison of DiffServ and IntServ: Performance Measurements, in Proceedings of the 25th Annual IEEE Conference on Local Computer Networks, p. 177.
4. OMG (2002), Notification Service Specification, Object Management Group. Available at www.omg.org.
5. Floyd, S. and Jacobson, V. (1993), Random Early Detection Gateways for Congestion Avoidance, IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp. 397-413.
6. Manjunath, R. and Gurumurthy, K.S. (2004), Maintaining Long-range Dependency of Traffic in a Network, CODEC04.
7. Manjunath, R. and Gurumurthy, K.S. (2002), Information Geometry of Differentially Fed Artificial Neural Networks, in Proceedings of the IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering, pp. 1521-1525.
8. Kim, H. (2004), Enabling Theoretical Model based Techniques for Simulating Large Scale Networks, Doctoral Dissertation, University of Illinois at Urbana-Champaign.
9. Erramilli, A., Roughan, M., Veitch, D. and Willinger, W. (2002), Self-similar Traffic and Network Dynamics, in Proceedings of the IEEE, Vol. 90, No. 5, pp. 800-819.
10. Manjunath, R. and Jain, V. (2010), Traffic Controller for Handling Service Quality in Multimedia Network, book chapter in Intelligent Quality of Service Technologies and Network Management: Models for Enhancing Communication, IDEA Group Publishers, pp. 96-112.
11. Manjunath, R. and Shyam, V.R. (2009), Data Network Performance Modeling and Control Through Prediction Feedback, ISSRE 2009.
12. Marsan, M.A., Garetto, M., Giaccone, P., Leonardi, E., Schiattarella, E. and Tarello, A. (2005), Using Partial Differential Equations to Model TCP Mice and Elephants in Large IP Networks, IEEE/ACM Transactions on Networking, Vol. 13, No. 6, pp. 1289-1301.


SETLabs Briefings VOL 8 NO 6, 2010

Creating and Benchmarking OWL and Topic Map Ontologies for UBL Processes

By Kiran Prakash Sawant and Suman Roy PhD

Build a knowledge-based system that caters to the need for a common standard framework to store, manage and retrieve data

Arriving at a consensus on what constitutes software business processes is an extremely difficult task, as there is a lack of common agreement amongst stakeholders. A solution to this lies in proposing a common, shared and widely acknowledged framework for the domain, and ontology description can play a useful role here. An ontology can be viewed as a special kind of semantic network representing the terminology, the concepts and the relationships among these concepts for a particular application domain. Various knowledge-based systems (KBSs) have been built to support such ontologies. These KBSs can differ in a number of ways, such as main-memory containment, secondary storage and reasoning capability.

In this paper, we consider ways to create and choose an appropriate KBS for a large business process application built around universal business language (UBL) processes. UBL is an OASIS standard for developing common business document schema to provide document interoperability in the electronic business domain [1, 2]. UBL comes with a library of reusable components, such as address and price, and a set of document schema, such as order, invoice and remittance advice, that are meant for use in e-business. It is becoming increasingly popular with public and private sector organizations around the world.

Managing the data stored in UBL documents is not an easy task, especially when it comes to ensuring efficient information retrieval, discovery and auditing. The challenge is to extract meaningful information from the large amounts of available data, and search engines are not of much help for such documents. It is therefore useful to add structure and semantics, providing a mechanism to describe the data in these UBL documents more precisely [3]. Further, for semantic information to be useful, it should be able to


define characteristics that the document should possess, viz., methods of ordering and payment, constraints on spatial and temporal availability, etc. Such semantic information can be defined effectively through ontology languages from the semantic web initiatives, such as topic maps (TM) and the web ontology language (OWL) [4, 5, 6].

We felt that it would be useful to use semantic web formalisms like TM and OWL to construct KBSs that support the ontologies hidden in UBL documents. We believe that our TM-based approach is superior to an RDBMS-based one because the use of scopes within TM allows powerful navigation of the content by effectively pruning the dataset, leading to more effective queries, as well as providing a means to validate associations and optimize merge operations. The advantage of an OWL-based KBS, in turn, is that it provides sufficient reasoning capability to support the semantic requirements of the application in mind and to carry out queries about the instances over the ontologies.

Further, we would like to evaluate these KBSs with respect to scalability, efficiency and reasoning capability. Since we shall be working with UBL process diagrams, we shall be generating huge datasets that provide a good basis for evaluating these ontologies for scalability and efficiency. As for the reasoning requirement, the KBS should be sound and complete for both OWL-DL and TM. However, data processing time and query response time go up as reasoning capability increases, so one has to keep track of the trade-offs between scalability and reasoning capability when evaluating KBSs [7].

For all the above reasons, certain benchmarks should be established to facilitate the evaluation of semantic web KBSs in a standard way. These benchmarks should be based on well-known practices for benchmarking databases and knowledge bases [7, 8, 9]. The benchmarking is done on ontologies created out of UBL process diagrams. The test data are generated using XML instances of business documents that follow their standard schema and can be scaled to arbitrary size.
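One way to obtain arbitrarily sized test data is to replicate a template XML instance with varied field values, as sketched below. The element names here are hypothetical placeholders; in the benchmark the instances would instead follow the standard UBL document schema.

    import xml.etree.ElementTree as ET

    def generate_instances(n):
        """Produce n toy 'invoice-like' XML instances; in the benchmark these would
        follow the standard UBL document schema instead of this placeholder layout."""
        docs = []
        for i in range(n):
            root = ET.Element("Invoice")                     # hypothetical element names
            ET.SubElement(root, "ID").text = f"INV-{i:06d}"
            ET.SubElement(root, "Amount", currency="USD").text = str(100 + i % 500)
            docs.append(ET.tostring(root, encoding="unicode"))
        return docs

    dataset = generate_instances(10_000)      # scale n up to stress the KBS
    print(len(dataset), dataset[0])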

An experiment is conducted using the UBL process of payment [10]. Two kinds of reasoning systems are used: one based on OWL-DL reasoning and another on TMs. Experimental results are provided and the performance of the systems is discussed using several metrics. Different aspects of system performance, both objective and subjective in nature, are highlighted. Objective performance metrics are load time, repository size, query response time, completeness, etc., and ten test queries are worked upon to evaluate these metrics. The importance of subjective measures is demonstrated by capturing various dimensions of the users' perspective, like effort estimation and



usability. Using this benchmarking approach, we are able to empirically compare the two different KBSs.
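A minimal harness for the objective metrics mentioned above (load time and query response time) might look as follows. The KBS interface, with load and query methods, is a hypothetical stand-in for whichever OWL-DL or TM engine is being measured, and the dummy implementation exists only so the sketch runs on its own.

    import time

    def benchmark(kbs, dataset, queries, repeats=3):
        """Measure load time once and the mean response time per query.
        'kbs' is assumed to expose load(dataset) and query(q) methods."""
        t0 = time.perf_counter()
        kbs.load(dataset)
        load_time = time.perf_counter() - t0

        results = {}
        for q in queries:
            timings = []
            for _ in range(repeats):
                t0 = time.perf_counter()
                kbs.query(q)
                timings.append(time.perf_counter() - t0)
            results[q] = sum(timings) / repeats
        return load_time, results

    class DummyKBS:                      # stand-in so the sketch runs on its own
        def load(self, dataset):  self.data = list(dataset)
        def query(self, q):       return [d for d in self.data if q in d]

    load_time, per_query = benchmark(DummyKBS(), ["invoice", "order", "payment"] * 1000,
                                     queries=["invoice", "payment"])
    print(f"load: {load_time:.4f}s, per-query means: {per_query}")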

Guo et al. carry out a benchmarking of semantic web knowledge-based systems with respect to their use in large OWL applications [9]. They consider the Lehigh University benchmark (LUBM) as an example for designing such benchmarks. The LUBM features four KBSs representing a university domain. The authors evaluated these KBSs with respect to fourteen queries, employing the following performance metrics as yardsticks: load time, repository size, query response time, query completeness and soundness, and a combined metric.

Some pieces of work deal with the evaluation of ontology-based retrieval systems relative to free-text searching systems. Kim conducted an evaluation study based on 20 questions and 10 domain experts, with respect to search time and relevance criteria [10]; the study shows that ontology-based systems perform better than free-text searching systems. Sure and Iosif compare two ontology-based systems to a free-text searching system [11]; their study revealed that users of the ontology-based systems made fewer mistakes and took less time to complete tasks. However, most of these studies have considered objective measurements, such as search time, mistakes committed and search completeness, and have not measured user criteria such as satisfaction, appropriateness and ease of use.

Some studies have also focused on subjective performance metrics. Gyun Oh and Park have compared the performance of a TM-based Korean Folk Music (Pansori) Retrieval System (TMPRS) and a representative Current Pansori Retrieval System (CPRS) [13]. Participants were given the task of carrying out several predefined jobs and executing their own queries, and on this basis the study measures the subjective as well as objective performance of the two systems. A similar task concerning the users' searching performance was carried out in [14]. In this work, Yi compares a TM-based ontology information retrieval system (TOIRS) and a thesaurus-based information retrieval system (TIRS). Forty participants took part in a task-based evaluation in which two dependent variables, recall and search time, were measured. The study indicates that TOIRS has a significant and positive effect on both variables compared to TIRS.

We borrow some of the ideas related to performance metrics from both the LUBM benchmarking of semantic KBSs and the performance studies of the above information retrieval systems. While we draw material heavily from the LUBM