A COMPARISON OF THE EFFICIENCY OF AN ATOMIC COMPONENT OPERATION



A COMPARISON OF THE EFFICIENCY OF AN ATOMIC COMPONENT

OPERATION VERSUS PRIMITIVE OPERATIONS FOR BUILDING A

REAL-TIME COLLABORATIVE EDITING API

_______________

A Thesis

Presented to the

Faculty of

San Diego State University

_______________

In Partial Fulfillment

of the Requirements for the Degree

Master of Science

in

Computer Science

_______________

by

Leslie A. Viviani

Summer 2013

Copyright © 2013

by

Leslie A. Viviani

All Rights Reserved

DEDICATION

This thesis is dedicated to my family, who has stood by and supported me in every

way possible, from loving encouragement to those not-so-gentle nudges to “just get it done.”

Before software can be reusable it first has to be usable. ---Ralph Johnson

ABSTRACT OF THE THESIS

A Comparison of the Efficiency of an Atomic Component Operation versus Primitive Operations for Building a Real-Time Collaborative Editing API

by

Leslie A. Viviani

Master of Science in Computer Science

San Diego State University, 2013

Real-time collaborative editing is a productive way to work in groups and drive innovation. A software application is more likely to be adopted by its users if it is familiar to them and something they already know how to use. Thus, an API that would allow a development team to turn a single-user application into a collaborative application is needed. Such an API would need to balance its complexity, from the perspective of the developers building the API, against its ease of use for the developers who use the API to build a real-time collaborative editor.

The API should be flexible and include enough operations to be useful, but not so many operations as to make the operation transformations overly complex. This paper presents a comparison of the efficiency of primitive algorithms versus atomic component algorithms in the context of building a real-time collaborative editing API. The atomic component operations perform better, both in terms of CPU clock cycles and in terms of ease of use for a developer building an application.

TABLE OF CONTENTS

ABSTRACT

LIST OF TABLES

LIST OF FIGURES

ACKNOWLEDGEMENTS

CHAPTER

1 INTRODUCTION

2 BACKGROUND AND RELATED WORK
    2.1 Real Time Collaborative Editing
    2.2 Concurrency Control
    2.3 Operational Transform
    2.4 Algorithms to Support OT for RTCE

3 METHODS
    3.1 Efficiency
    3.2 Functionality
    3.3 Design
    3.4 Implementation
        3.4.1 The Test Code
        3.4.2 Test Code Example

4 RESULTS

5 DISCUSSION
    5.1 Average Run Time Comparison
    5.2 Developer Time

6 CONCLUSION

REFERENCES

LIST OF TABLES

Table 3.1. Test Suites Description
Table 3.2. Core Classes of the Model RTCE System
Table 3.3. Test Descriptions
Table 4.1. Time Results Primitive Algorithms Run First
Table 4.2. Time Results Atomic Component Algorithm Run First
Table 4.3. Total Time Results All Primitive Operation Tests
Table 4.4. Total Time Results All Atomic Component Operation Tests

LIST OF FIGURES

Figure 2.1. Graphical representation of operation transformation.
Figure 3.1. Code listing 1 primitive operation test pseudo code.
Figure 3.2. Code listing 2 atomic component operation test pseudo code.
Figure 4.1. Time by test - primitives run first.
Figure 4.2. Time by test - atomic component run first.
Figure 4.3. Total time of all tests.
Figure 4.4. Time by run number primitive operation tests run first.
Figure 4.5. Time by run number atomic component operation tests run first.
Figure 4.6. Combined time by run number.
Figure 4.7. Total combined test execution time.

ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. Joseph Lewis, for his support through this

process and his enthusiasm for teaching. His work has opened my eyes and mind to a world

that is beyond my normal black and white thinking.

Additionally, I would like to thank the members of my committee for their work and

advice through this process.

Finally, I wish to thank my family whose enthusiasm and support have helped make

this achievement possible.

CHAPTER 1

INTRODUCTION

Many companies have workers who are not co-located but must still collaborate on shared

work. It is often the case that many hours are wasted sending

documents back and forth across email or servers. A typical scenario is that one user updates

the document, sends the document with their changes to another user and then must wait for

the next user to send their changes back. In today’s fast-paced world of short deadlines and

an around-the-globe workforce, a real-time collaborative editing (RTCE) system would increase

productivity and lessen frustration among those users.

A real-time collaborative editor could be implemented as a stand-alone application

that users could log into and use or it could be implemented as an interface to an existing

single user application, which would allow it to become a real-time group editor. Building an

interface to an existing single user application is beneficial for the end user in that they don’t

have a new tool to learn. However, this approach raises an important issue – the developer of

the API must find a good balance between the complexity of the RTCE from the point of

view of API development and the ease of use of that API from the perspective of the

developer using that API to build such an interface.

Real-time collaborative editors must handle several consistency issues and

prevent problems such as divergence, causality violations, and intention violations. There are

many algorithms to handle these issues, but one of the most prevalent is Operational

Transform (OT). Operational transform resolves conflicts that can occur when two or more

users are attempting to update the same portion of a document model. OT allows those users

to do this without locking or manual intervention.

An OT system for document editing can be built using the primitive operations of

Insert and Delete. Any type of behavior needed in a real time collaborative text editor could

be modeled using these two operations. However, it is more efficient, both in terms of CPU

time and in terms of developer time, to provide higher-level atomic component operations,

such as a Move operation.

CHAPTER 2

BACKGROUND AND RELATED WORK

2.1 REAL TIME COLLABORATIVE EDITING

Real-time collaborative editing (RTCE) allows two or more users in possibly

different locations working on different computers to simultaneously work on the same

electronic file (for example, a text document) and see each other’s changes in real time.

Most modern RTCEs based on Operational Transformation (OT) use a replicated

architecture in which the shared document is replicated at each site involved in the

collaboration [1]. A shared copy of the document model at each collaboration site helps

ensure a good user experience. A user makes a change to the document and they see that

change performed in the document immediately. The operation is then propagated to the

remote sites and transformed against local operations at each collaboration site. Some

systems include a server that contains the master document, and each remote site involved in

the collaboration then performs its transformations only against the master document, instead

of against every other remote site.

There are many different implementations of RTCE systems developed over the last

several years. These systems all belong in one of two broad categories – the system is either a

standalone, fully self-contained separate application or is an add-on to an existing single user

application to make it collaborative [2]. Some examples of these applications are CoWord,

CoMaya, Ace Editor, ShareJS, and Google Wave.

There are advantages to the approach of modifying a single user application to make

it collaborative. For example, taking a tool that is familiar to many users, such as Microsoft

Word, and providing a way to make it collaborative would get more user buy-in rather than

asking a user to become familiar with a different word processing tool. There is much less of

a learning curve and it may be easier to convince users to try it rather than try a new software

package [3-5].

Regardless of the type of collaborative editing application (standalone or a modification of

an existing application), any RTCE application must handle concurrency control and

consistency correctness.

2.2 CONCURRENCY CONTROL

Concurrency control is concerned with the coordination of concurrent access to a

shared resource and resolving any conflicts that may arise when two or more users attempt to

modify the same portion of a document model [6, 7]. One of the primary functions of

concurrency control is to ensure the consistent state of the model. In other words, it must

ensure that the correct results are generated in all instances of the document model. There are

several inconsistency problems that can occur in RTCE. The primary errors that an RTCE

needs its concurrency control algorithm to prevent are divergence, causality violations, and

intention violations [3, 7, 8].

Divergence occurs when the final results in all instances of the document model

are not identical. This can happen when operations arrive at different sites in different orders.

Since there may be dependencies among the operations originating from a site, but the

operations get executed in different orders at other sites, the final document state may

diverge [8-10].
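
As a minimal sketch of how divergence can arise, assume a three-character document and two hypothetical string helpers, insert and delete (illustrative names, not the classes of the model RTCE system). Two sites generate concurrent operations and each site then executes the remote operation exactly as it was received, without any transformation:

```java
final class DivergenceDemo {
    // Hypothetical helper: inserts text so that its first character lands at index.
    static String insert(String doc, int index, String text) {
        return doc.substring(0, index) + text + doc.substring(index);
    }

    // Hypothetical helper: removes the characters in the range [start, end).
    static String delete(String doc, int start, int end) {
        return doc.substring(0, start) + doc.substring(end);
    }

    public static void main(String[] args) {
        String start = "abc";

        // Site 1 generates insert(0, "x"). Site 2 concurrently generates
        // delete(2, 3), intending to remove the "c".

        // Site 1 runs its own operation first, then the untransformed remote one.
        String site1 = delete(insert(start, 0, "x"), 2, 3);

        // Site 2 runs its own operation first, then the untransformed remote one.
        String site2 = insert(delete(start, 2, 3), 0, "x");

        System.out.println("site 1: " + site1); // site 1: xac
        System.out.println("site 2: " + site2); // site 2: xab
    }
}
```

The two sites end with different strings, and site 1 has removed "b" rather than the "c" that the user at site 2 meant to delete, so the same run also illustrates an intention violation.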

A causality violation occurs when the execution order of the operations differs from

their cause-and-effect order. Since operations may arrive in an order different from the one in

which they were generated, they may be executed out of their original order, which may confuse

the user [11-13].

An intention violation can occur when the actual effect of an operation execution is

different from the intended effect. This can happen if operations cause a different operation

to commit an unintended effect. The intention of an operation must be preserved across all

client sites, regardless of any concurrent operations. This means that the observed effect of

the operation at all sites is the same as the operation at the site it originated from [14, 15].

2.3 OPERATIONAL TRANSFORM

One such method of concurrency control is Operational Transform (OT), which is an

optimistic concurrency algorithm. The premise of optimistic concurrency is that the

probability of two transactions accessing and modifying the same object or set of data is low.

Operations are allowed to execute as if there were no possibility of conflicts (as

opposed to a locking concurrency control which only allows one user at a time to modify an

object or set of data) [4, 16].

An OT system is built from a collection of algorithms that provide a way to resolve

conflicts without user intervention and without locking the data model so that more than one

user may work on the same data model at the same time. Without locking, OT is able to

operate in high latency environments, such as web applications, without lag time and delay in

the user experience [17].

OT allows you to look at handling changes based on the operation level instead of on

a whole document model level. It is much easier to handle transforming a single operation

against another operation and providing those to the remote document sites to bring the

document model into convergence than it is to consider doing so for an entire document

model [4, 7].

Inevitably, conflicts do occur and the algorithm must be accompanied by the

transformation of the operations so that operations invoked by different users can be applied

to the documents whose states have diverged and bring those documents back to the same

state. An example scenario of how a conflict can occur and is resolved with operational

transform follows.

A user named Ben and his colleague named Charlie are both working to complete the

end of the month sales report for their manager. Ben and Charlie are not co-located and are

using an RTCE to complete their work. They both see an issue at the end of the document and

both attempt to insert text in the same spot in the document model. Ben inserts a “b” and

Charlie simultaneously inserts an “s”. Both operations are sent to the server. The

transformation that happens with the operation sent by Ben will retain two items and insert

bs, and the operation sent by Charlie will retain the two items and insert sb. The server can

only apply one operation at a time and chooses one set of transformed operations to apply

first. However, as soon as one of the transformed operations is applied, the retain portion of

the second transformed operation becomes invalid. Depending on which transformed

operation was applied first, the ending of the document will either contain sb or bs.

Each client, as well as the server, needs to be made aware of every other client’s

operations. However, just sending Ben’s operations to Charlie’s version of the document

model and vice versa will not work, and the document states will not converge. This is where

Operational Transform can help.

The problem can be visualized as the diamond problem shown in Figure 2.1. This

diagram shows the application of two separate operations on a document model at the same

time, operation a and operation b. In a diagram such as this, the client operations move the

document model to the left and server operations move the document model to the right.

However, both types of operations move the document model downward. This view is a

representation of the operations applied in what is called a state space. When both the client

and server lines pass through the same point, it means that the document model, at that point

in time, has converged.

Figure 2.1. Graphical representation of operation transformation.

Going back to the above example, imagine that Ben’s operation is the a operation in

the diagram and Charlie’s operation is the b operation. In this case, the b operation is applied

first by the server, followed by the a operation. In order to ensure the document model states

converge, the a operation needs to be transformed with respect to the b operation and the b

operation needs to transform with respect to the a operation.

The transform function is based on the mathematical identity:

xform(a,b) = {a’,b’}

where a and b are the original operations; one server and one client operation. The transform

function takes these two operations and produces a pair of operations such that when applied,

both documents wind up converging. In other words, if the client applies a followed by b’

and the server applies b followed by a’, then both documents will end up with the same final

state.
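
The identity can be illustrated with a minimal sketch for the case of two concurrent inserts, such as the Ben and Charlie scenario. The Insert class, its site field (used only to break ties when both inserts target the same index), and the transformAgainst helper are assumptions made for this illustration; they are not the classes of the model RTCE system.

```java
final class InsertTransformDemo {
    // Hypothetical insert operation. The site number is an assumed tie-breaker
    // for concurrent inserts at the same index.
    static final class Insert {
        final int index;
        final String text;
        final int site;

        Insert(int index, String text, int site) {
            this.index = index;
            this.text = text;
            this.site = site;
        }

        String apply(String doc) {
            return doc.substring(0, index) + text + doc.substring(index);
        }
    }

    // Rewrites op so that it can be applied after other has already been applied.
    static Insert transformAgainst(Insert op, Insert other) {
        boolean otherGoesFirst = other.index < op.index
                || (other.index == op.index && other.site < op.site);
        return otherGoesFirst
                ? new Insert(op.index + other.text.length(), op.text, op.site)
                : op;
    }

    // xform(a, b) = (a', b')
    static Insert[] xform(Insert a, Insert b) {
        return new Insert[] { transformAgainst(a, b), transformAgainst(b, a) };
    }

    public static void main(String[] args) {
        String doc = "xy";                 // two existing items
        Insert a = new Insert(2, "b", 1);  // Ben's operation
        Insert b = new Insert(2, "s", 2);  // Charlie's operation
        Insert[] pair = xform(a, b);       // pair[0] is a', pair[1] is b'

        String client = pair[1].apply(a.apply(doc)); // a followed by b'
        String server = pair[0].apply(b.apply(doc)); // b followed by a'
        System.out.println(client + " / " + server); // xybs / xybs
    }
}
```

With this tie-breaking rule both sides end with the same string. A different rule would be equally valid as long as every site applies the same one.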

Using the operational transform identity, when operation b is received on the client

side from the server, it is paired with operation a to produce (a’, b’); b’ is then applied

after a to produce the final document model state. A comparable procedure is

followed on the server side, which ensures that the document models on the client side and the

server side converge to the same document model state.

2.4 ALGORITHMS TO SUPPORT OT FOR RTCE

One of the great pioneers of RTCE using OT is Dr. Sun Chengzheng, who explains in

several papers that the operational model of a basic OT system needs only the primitive

operations of Insert and Delete. This is correct for string-based document transformations;

these two primitives can model virtually any complex operation needed.

However, when building an RTCE API in order for a development team to program an

interface to turn a single-user application into a multi-user collaborative application, it would

be not only more efficient, but more usable for a developer to have access to higher level

atomic component operations.

An application programming interface (API) is the collection of methods intended to

be called to build a program. A good API should be readable and easy to use even without

formal documentation. It should also provide enough building blocks to make it worthwhile

for a development team to invest its time to learn and use.

There are tradeoffs that must be considered in designing an API between the usability

of that API for the developer using it, and the complexity of the underlying library of code.

An API gains usability by providing higher-level operations. Higher-level operations

make it easier for a developer to implement a system using the API [18]. Ensuring that

the API is easy to use and well written will increase its chance of being used and adopted by

a developer community [19, 20].

When designing an API to be used to build an RTCE based on Operational Transform,

great care must be taken to balance the usability of the API with the number of operations

provided. For each operation that is provided in the API, you must be able to transform that

operation against every other operation in the API.

The more operations you provide, the greater the complexity of performing the OT

operations that you need to handle [21]. However, you really want someone to use your API

and the more readable and usable it is, the greater the likelihood is that they will use it. If you

push off the cost of effort to the developers using the API, it has a lower chance of being

adopted.
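
To make the growth in transformation cases concrete, the following sketch enumerates the ordered pairs of operation types that need a transformation rule. It assumes that an operation must also be transformed against another operation of its own type (as with the two concurrent inserts in the Ben and Charlie example), so n operation types give n × n rules. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

final class TransformPairs {
    // Lists every ordered pair (x, y) that needs a rule "transform x against y",
    // including pairs of the same operation type (an assumption of this sketch).
    static List<String> pairs(List<String> operations) {
        List<String> result = new ArrayList<>();
        for (String x : operations) {
            for (String y : operations) {
                result.add(x + " against " + y);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pairs(List.of("Insert", "Delete")).size());         // 4
        System.out.println(pairs(List.of("Insert", "Delete", "Move")).size()); // 9
    }
}
```

Under this assumption, adding a single Move operation to Insert and Delete raises the number of transformation rules from four to nine.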

CHAPTER 3

METHODS

In order to answer the question of whether primitive operations are more or less

efficient than an atomic component operation in the context of an RTCE, two issues

needed to be addressed. The first was to define what is meant by efficiency for

this problem. The second was to define a functionality that could be completed both by an

atomic component operation and by primitive operations.

3.1 EFFICIENCY

Efficiency is defined as a comparison of production

with cost [22]. Cost can be measured in terms of energy, time, and/or money. In this case, I

am considering the efficiency of the algorithms by measuring the CPU clock cycles that it

takes to perform the same amount of work (moving text from one area of a document to

another) as well as considering the efficiency of the algorithm in terms of the amount of work

it takes from a developer standpoint to use the algorithm in question.

3.2 FUNCTIONALITY

The behavior I decided to test is a move behavior; for example, moving a paragraph

of text from one part of the document model to another part. This behavior can be achieved

by both a combination of primitive operations of insert/delete as well as a fully atomic

component operation.

3.3 DESIGN

In order to test the efficiency of the different operations I wrote a suite of tests that

perform various move behaviors and measured the time it took to run those tests. I wrote a

test suite for the move functionality modeled by the Insert and Delete operations and an

analogous test suite for the move functionality performed by the Move operation.

In order to minimize effects on the test results from outside influences such as the

size of text moved or the position of text moved (i.e., from beginning to end versus middle to

middle), I used the same set of starting text for each suite of tests and performed the same

operations. I compared a Move Text From Beginning To End for the primitives with a Move

Text From Beginning To End with the atomic component move algorithm.

I ran the suite of tests in two test batches: test batch 1 ran the primitive operations

first, followed by the atomic component operations, and test batch 2 ran the atomic

component operations followed by the primitive operations. Each test batch consisted of 5

runs of primitive operation tests and 5 runs of atomic component operation tests with each

run consisting of 1000 executions of each of the 7 tests. A description of each test batch,

including the number of runs and the number of executions per run, is given in Table 3.1. I

ran the test suites on a laptop computer with the following specifications:

Intel Core i7-3632QM CPU @ 2.20GHz

8.0 GB RAM

64-bit Operating System, x64-based processor
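
The batch structure described above (5 runs, each executing every test 1000 times) could be timed with a harness along the following lines. This is only a sketch: the class name, the use of Runnable for a test, and the use of System.nanoTime to measure elapsed time are assumptions, since the text does not state which timer was used.

```java
final class TimingHarness {
    // Runs one test the given number of times and returns the elapsed nanoseconds.
    static long timeRun(Runnable test, int executions) {
        long start = System.nanoTime();
        for (int i = 0; i < executions; i++) {
            test.run();
        }
        return System.nanoTime() - start;
    }

    // One batch: each run executes every test the given number of times.
    // Returns the total elapsed time per run.
    static long[] timeBatch(Runnable[] tests, int runs, int executions) {
        long[] totals = new long[runs];
        for (int run = 0; run < runs; run++) {
            for (Runnable test : tests) {
                totals[run] += timeRun(test, executions);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Stand-in for one of the seven move tests (hypothetical workload).
        Runnable sample = () -> new StringBuilder("P3P1P2").delete(0, 2).append("P3").toString();

        long[] totals = timeBatch(new Runnable[] { sample }, 5, 1000);
        for (int run = 0; run < totals.length; run++) {
            System.out.println("run " + (run + 1) + ": " + totals[run] + " ns");
        }
    }
}
```

Elapsed time measured this way includes JIT warm-up and other system activity, which is one reason to repeat the runs and to alternate which suite goes first, as the two batches do.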

3.4 IMPLEMENTATION

The algorithms described in this paper were written in Java using built-in libraries

and the test classes were written in Java using JUnit. The data model for this RTCE was

modeled using strings. The core classes that make up the model RTCE are briefly described

in Table 3.2.

3.4.1 The Test Code

The test suites consist of the same set of tests for the primitive operations, as well as

the atomic component operation. Each test for the primitive operations had a corresponding

test for the move operation, which performed the same functionality. Each test in the Atomic

Component Operations suite of tests performed the same amount of work as its counterpart in

the Primitive Operations suite of tests.

The text chosen as the document model is as follows, in the order in which it should

appear at the end of each test. I numbered the paragraphs for ease of discussion.

1. When in the course of human events, it becomes necessary for one people to dissolve the political bands which have connected them with another, and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.

Table 3.1. Test Suites Description

Test Batch                   Run #’s   Test Name                   # of Executions

Batch 1 Primitives           1-5       Move Beginning to End       1000
                                       Move Beginning to Middle
                                       Move End to Beginning
                                       Move End to Middle
                                       Move Middle to Beginning
                                       Move Middle to End
                                       Move Middle to Middle

Batch 1 Atomic Component     6-10      Move Beginning to End       1000
                                       Move Beginning to Middle
                                       Move End to Beginning
                                       Move End to Middle
                                       Move Middle to Beginning
                                       Move Middle to End
                                       Move Middle to Middle

Batch 2 Atomic Component     11-15     Move Beginning to End       1000
                                       Move Beginning to Middle
                                       Move End to Beginning
                                       Move End to Middle
                                       Move Middle to Beginning
                                       Move Middle to End
                                       Move Middle to Middle

Batch 2 Primitives           16-20     Move Beginning to End       1000
                                       Move Beginning to Middle
                                       Move End to Beginning
                                       Move End to Middle
                                       Move Middle to Beginning
                                       Move Middle to End
                                       Move Middle to Middle

Table 3.2. Core Classes of the Model RTCE System

Class                 Description

Insert                Set up a new Insert operation by providing the index to insert at
                      and the characters to insert. This operation is then applied to a string.
                          Insert insertOp = new Insert(14, " and my dog is blue");
                          insertOp.apply("The sky is red.");
                      Resulting string: The sky is red and my dog is blue.

Delete                Set up a new Delete operation by providing the index range of text
                      to be removed. This operation is then applied to a string.
                          Delete deleteOp = new Delete(10, 14);
                          deleteOp.apply("My dog is was named Jasper.");
                      Resulting string: My dog is named Jasper.

Move                  Set up a Move operation by providing the index range of the text to
                      move and the index of where to move it to. This operation is then
                      applied to a string.

OTModel               Simple POJO that uses a String to model the data and an ID to
                      keep track of the client.

DeriveOperations      Derives the steps necessary to transform from one operation to
                      another to lead to convergence.

ConcurrencyControl    The controller for handling the concurrency issues.

2. We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable rights, that among these are life, liberty and the pursuit of happiness

3. That to secure these rights, governments are instituted among men, deriving their just powers from the consent of the governed. That whenever any form of government becomes destructive of these ends, it is the right of the people to alter or abolish it, and to institute new government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their safety and happiness.

For each test, I rearranged the order of the starting text to work with, ran the test,

validated that the test passed, and then recorded the time results.

All of the tests work in basically the same way; they differ only in their starting points,

ending points, and insert points. The pseudo code given below is therefore a model for all of

the tests in their respective test suites. Note that the work of finding the indices is outside the

scope of the operations themselves.

Code Listing 1 (Figure 3.1) shows the pseudo code for modeling the move behavior

using only the primitive operations of Insert and Delete.

//create the model with the starting string (model the current document state)
OTModel documentModel = new OTModel(getStartString());

//first, find the string to move so that it can be re-inserted after the delete
String textToMove = findString(startIndex, endIndex);

//create the delete operation
DeleteOperation deleteOp = new DeleteOperation(startIndex, endIndex);

//apply operation to main string & store the result (main string minus removed text)
String tempString = deleteOp.apply(documentModel.getValue());

//create the insert operation with the correct index and the text we want to insert
InsertOperation insertOp = new InsertOperation(insertIndex, textToMove);

//now apply the insert operation to the temp string to combine them back together
String endString = insertOp.apply(tempString);

//verify that the endString matches what we expect (with JUnit; expected value first)
assertEquals(getModelEndString(), endString);

//end the test and measure the time

Figure 3.1. Code listing 1 primitive operation test pseudo code.

Code Listing 2 (Figure 3.2) shows the pseudo code for modeling the move behavior

using the atomic component Move operation.

//create the model with the starting string (model the current document state)
OTModel documentModel = new OTModel(this.getStartString());

//create the move operation with the indices
MoveOperation moveOp = new MoveOperation(startIndex, endIndex, moveToIndex);

//apply the move operation to the original model's string
String endString = moveOp.apply(documentModel.getValue());

//verify that the endString matches what we expect (with JUnit; expected value first)
assertEquals(getModelEndString(), endString);

//end the test and measure the time

Figure 3.2. Code listing 2 atomic component operation test pseudo code.
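
The two listings can be compared in a runnable form with the following minimal sketch. The internals of the operations are not shown in the listings, so everything here is an assumption made for illustration: the static helper names, the end-exclusive index range (which matches the Delete example in Table 3.2), the interpretation of the target index as a position in the string after the text has been removed, and the short bracketed labels that stand in for the three paragraphs.

```java
final class MoveSketch {
    // Removes the characters in the range [start, end).
    static String delete(String doc, int start, int end) {
        return doc.substring(0, start) + doc.substring(end);
    }

    // Inserts text so that its first character lands at index.
    static String insert(String doc, int index, String text) {
        return doc.substring(0, index) + text + doc.substring(index);
    }

    // Move modeled with the two primitives: the caller extracts the text,
    // deletes it, and re-inserts it, holding an intermediate string.
    static String moveWithPrimitives(String doc, int start, int end, int insertIndex) {
        String textToMove = doc.substring(start, end);
        String temp = delete(doc, start, end);
        return insert(temp, insertIndex, textToMove);
    }

    // Move as a single operation on one buffer. moveToIndex is assumed to be
    // an index into the string after the text has been removed.
    static String move(String doc, int start, int end, int moveToIndex) {
        StringBuilder buffer = new StringBuilder(doc);
        String textToMove = buffer.substring(start, end);
        buffer.delete(start, end);
        buffer.insert(moveToIndex, textToMove);
        return buffer.toString();
    }

    public static void main(String[] args) {
        String doc = "[P3][P1][P2]"; // placeholder labels for the paragraphs

        // Move the first four characters, "[P3]", to the end of the document.
        System.out.println(moveWithPrimitives(doc, 0, 4, 8)); // [P1][P2][P3]
        System.out.println(move(doc, 0, 4, 8));               // [P1][P2][P3]
    }
}
```

Both paths produce the same string. The difference a developer sees is in the calling code: the primitive version needs three steps and an intermediate value, while the atomic version needs one call.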

I will describe one test in detail and provide a high-level overview of the remaining
tests. Each of the tests had a different setup so that the test result would be easily verifiable
against the same ending string. The purpose of each test was to move a substring of data from
one position of the document to another.

3.4.2 Test Code Example

The specific test, Move Beginning to End, provides a good template for all of the

tests that I wrote. Each test in the test suite can follow the same pattern in the test setup, test

body, and test result.

Test setup: I arranged the setup document so that it contained paragraph 3, then
paragraph 1, then paragraph 2, abbreviated as P3->P1->P2.

Test body: The code moved paragraph 3, which started in position one, to the end of
the document, which is where it belongs.

Test result: I verified that the final document was in the order Paragraph 1,
Paragraph 2, Paragraph 3, abbreviated as P1->P2->P3.
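The setup and expected result of the Move Beginning to End test, together with the index lookup that sits outside the operations themselves, might look like the following sketch. The paragraph contents, the class name MoveBeginningToEndSketch, and the helper findRange are all hypothetical; the thesis does not show how its indices were located.

```java
/** Hypothetical setup for the Move Beginning to End test; names are illustrative only. */
final class MoveBeginningToEndSketch {
    static final String P1 = "Paragraph 1.\n";
    static final String P2 = "Paragraph 2.\n";
    static final String P3 = "Paragraph 3.\n";

    /** Test setup: the document starts out of order as P3->P1->P2. */
    static String startDocument() {
        return P3 + P1 + P2;
    }

    /** Test result: the document should end in order as P1->P2->P3. */
    static String expectedDocument() {
        return P1 + P2 + P3;
    }

    /** Returns {startIndex, endIndex} of the first occurrence of part; endIndex is exclusive. */
    static int[] findRange(String document, String part) {
        int start = document.indexOf(part);
        if (start < 0) {
            throw new IllegalArgumentException("text not found in document");
        }
        return new int[] {start, start + part.length()};
    }

    public static void main(String[] args) {
        String document = startDocument();
        int[] range = findRange(document, P3);  // P3 sits at the front of the document
        int moveToIndex = document.length();    // target: the end of the document
        System.out.printf("move [%d, %d) to %d%n", range[0], range[1], moveToIndex);
    }
}
```

The same three pieces (start document, located range, expected document) would be supplied to either test suite, so both suites do the same lookup work.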

The list of tests performed in each test suite, along with a brief explanation of each, is
given in Table 3.3.

Table 3.3. Test Descriptions

Test Name                   Test Description
Move Beginning to End       Move text from the beginning of the document to the end.
Move Beginning to Middle    Set up the test in order of P2->P1->P3. Move the text such that it ends up as P1->P2->P3.
Move End to Beginning       Set up the test in order of P2->P3->P1. Move the text such that it ends up as P1->P2->P3.
Move End to Middle          Set up the test in order of P1->P3->P2. Move the text such that it ends up as P1->P2->P3.
Move Middle to Beginning    Set up the test in order of P2->P1->P3. Move the text such that it ends up as P1->P2->P3.
Move Middle to End          Set up the test in order of P1->P3->P2. Move the text such that it ends up as P1->P2->P3.
Move Middle to Middle       Set up the test in order of P1->P2->P3, but with P2 rearranged into P2a and P2b, so that a middle section could be moved and the document would wind up in the correct P1->P2->P3 order.

CHAPTER 4

RESULTS

The results of the test runs are given in the next several tables and graphs. The tests

were run in two batches; the first batch ran the primitive operations test first and the atomic

component operations test second and the second batch did the reverse order. Each batch

consisted of 5 runs and each run consisted of 1000 executions of each test, for a total of 5000

executions of each test per batch.
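The batch structure described above could be driven by a harness along the following lines. This is a sketch under assumptions, not the measurement code used for the thesis: the class BatchTimingSketch, the use of System.nanoTime, the placeholder test bodies, and the choice to time both suites within each run are all hypothetical.

```java
/** Hypothetical timing harness; the thesis does not show its measurement code. */
final class BatchTimingSketch {
    static final int RUNS_PER_BATCH = 5;
    static final int EXECUTIONS_PER_RUN = 1000;

    /** Executes the test body EXECUTIONS_PER_RUN times and returns the elapsed time in ms. */
    static long timeOneRun(Runnable testBody) {
        long startNanos = System.nanoTime();
        for (int i = 0; i < EXECUTIONS_PER_RUN; i++) {
            testBody.run();
        }
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }

    /**
     * Runs one batch. In every run the suite passed as "first" is timed before "second".
     * Returns times[0][run] for the first suite and times[1][run] for the second.
     */
    static long[][] runBatch(Runnable first, Runnable second) {
        long[][] times = new long[2][RUNS_PER_BATCH];
        for (int run = 0; run < RUNS_PER_BATCH; run++) {
            times[0][run] = timeOneRun(first);
            times[1][run] = timeOneRun(second);
        }
        return times;
    }

    /** Average over the runs, as in the Average column of the results tables. */
    static double average(long[] runTimes) {
        long sum = 0;
        for (long t : runTimes) {
            sum += t;
        }
        return (double) sum / runTimes.length;
    }

    public static void main(String[] args) {
        // Placeholder bodies standing in for one primitive test and one atomic test.
        Runnable primitiveTest = () -> { String s = "P1 P2 " + "P3 "; };
        Runnable atomicTest = () -> { String s = new StringBuilder("P1 P2 ").append("P3 ").toString(); };

        long[][] batch1 = runBatch(primitiveTest, atomicTest); // primitives run first
        long[][] batch2 = runBatch(atomicTest, primitiveTest); // atomic component run first
        System.out.println("batch 1 primitive average (ms): " + average(batch1[0]));
        System.out.println("batch 2 primitive average (ms): " + average(batch2[1]));
    }
}
```

Swapping the two arguments to runBatch is what distinguishes the second batch from the first.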

Table 4.1 shows the results of the 5 runs of tests where the primitive operation tests
were run first. The average execution time of each test in the atomic component operation
test set was consistently faster than the average time of the corresponding test in the
primitive operation tests.

Notice, though, that certain runs of certain atomic component operation tests were
slower than the corresponding run of the primitive operation test. Most notably,
Run 1 of the Atomic Component Operation test “Move Beginning to End” took 110 ms while
its Primitive test counterpart took 109 ms, and Run 4 of the Atomic Component Operation
test “Move Middle to End” took 63 ms while its Primitive test counterpart took only 47 ms.

Table 4.2 shows the results of the 5 runs of tests where the atomic component
operation tests were run first. The average execution time in the atomic component
operation test set was slower than the average time of the corresponding primitive
operation test for five of the seven tests; the exceptions were Move Middle to End
(46.6 ms versus 47 ms) and Move Middle to Middle (43.8 ms versus 46.6 ms). Notice also
that certain runs of certain primitive operation tests were slower than the corresponding
run of the atomic component operation test. Most notably, Run 4 of the Primitive Operation
test “Move Beginning to End” took 47 ms while its Atomic Component test counterpart took
31 ms, and Run 1 of the Primitive Operation test “Move Middle to Middle” took 62 ms while
its Atomic Component test counterpart took only 47 ms.

Table 4.3 shows a summary of the primitive operations test results. This table shows
the total combined time of all the primitive operations when this set was run first,
when it was run second, as well as the total combined time of both runs and the average time
of the tests.

Table 4.1. Time Results (ms) Primitive Algorithms Run First

Primitive Operations Tests          Run 1  Run 2  Run 3  Run 4  Run 5  Average
Move Beginning To End                 109     47     63     31     47     59.4
Move Beginning To Middle               93     63     78     32     47     62.6
Move End to Beginning                  94     78     47     32     32     56.6
Move End To Middle                     94    203     47     47     47     87.6
Move Middle to Begin                   78     62     47     63     31     56.2
Move Middle to End                     63     47     47     47     31       47
Move Middle to Middle                  63     47     31     47     63     50.2
Total time per run                    594    547    360    299    298    419.6

Atomic Component Operation Tests    Run 1  Run 2  Run 3  Run 4  Run 5  Average
Move Beginning To End                 110     47     31     31     47     53.2
Move Beginning To Middle               62     63     47     31     47       50
Move End to Beginning                  47     47     47     32     47       44
Move End To Middle                     63     47     31     47     47       47
Move Middle to Begin                   47     47     31     46     47     43.6
Move Middle to End                     32     46     31     63     47     43.8
Move Middle to Middle                  63     63     31     31     47       47
Total time per run                    424    360    249    281    329    328.6

Table 4.2. Time Results (ms) Atomic Component Algorithm Run First

Primitive Operations Tests          Run 1  Run 2  Run 3  Run 4  Run 5  Average
Move Beginning To End                  94     47     47     47     31     53.2
Move Beginning To Middle               94     62     47     47     31     56.2
Move End to Beginning                  62     47     47     47     31     46.8
Move End To Middle                     47     47     78     47     31       50
Move Middle to Begin                   47     31     47     47     47     43.8
Move Middle to End                     47     63     47     47     31       47
Move Middle to Middle                  62     47     46     47     31     46.6
Total time per run                    453    344    359    329    233    343.6

Atomic Component Operation Tests    Run 1  Run 2  Run 3  Run 4  Run 5  Average
Move Beginning To End                 110     47     78     31     46     62.4
Move Beginning To Middle              109     63     47     31     47     59.4
Move End to Beginning                  78     62     62     47     32     56.2
Move End To Middle                     93     63     31     47     31       53
Move Middle to Begin                   94    204     31     47     31     81.4
Move Middle to End                     62     31     47     62     31     46.6
Move Middle to Middle                  47     47     47     47     31     43.8
Total time per run                    593    517    343    312    249    402.8

Table 4.3. Total Time Results All Primitive Operation Tests

Summary                                            Time (ms)
Primitive Operations Tests run first                   419.6
Primitive Operations Tests run second                  343.6
Total run time all primitive operations tests          763.2
Average run time all primitive operations tests        381.6

Table 4.4 shows a summary of the atomic component operations test results. This

table shows the total combined time of all the atomic component operations when this set

was run first, when it was run second as well as the total combined time of both runs and the

average time of the tests.

Table 4.4. Total Time Results All Atomic Component Operation Tests

Summary                                                   Time (ms)
Atomic Component Operations Tests run first                   402.8
Atomic Component Operations Tests run second                  328.6
Total run time all atomic component operations tests          731.4
Average run time all atomic component operations tests        365.7

Figure 4.1 shows a comparison of the running time of each test in the first test batch,
in which the primitive operation tests were run first. The atomic component operations
slightly outperformed the primitive operations in every test except Move End to Middle,
where the atomic component operation noticeably outperformed the primitive operation
(an average of 47 ms versus 87.6 ms in Table 4.1).

Figure 4.1. Time by test - primitives run first.

Figure 4.2 shows a comparison of the running time of each test in the second test
batch, in which the atomic component operation tests were run first. The primitive
operations slightly outperformed the atomic component operations in four tests and
noticeably outperformed them in the Move Middle to Begin test (an average of 43.8 ms
versus 81.4 ms in Table 4.2). The atomic component operations were slightly faster in the
Move Middle to End and Move Middle to Middle tests.

Figure 4.2. Time by test - atomic component run first.

Figure 4.3 shows a comparison of the total running time for each test in both test
batches, broken out by test. The Atomic Component operations ran faster than the Primitive
operations for Move Beginning to Middle, Move End to Beginning, Move End to Middle,
Move Middle to End, and Move Middle to Middle.

For some tests, such as the Move End to Beginning test, the difference in running times
was small, but there was a much greater difference for the Move End to Middle test.
The Primitive Operations outperformed the Atomic Component Operations in the Move
Beginning to End and the Move Middle to Begin tests. The difference in performance for the
Move Beginning to End test was slight, while the difference for the Move Middle to Begin
test was more noticeable.

Figure 4.4 shows a comparison of the total running time of each run in the first test
batch, in which the primitive operation tests were run first. The combined tests
for the atomic component operations outperformed the primitive operations in all runs
except Run 5.

Figure 4.3. Total time of all tests.

Figure 4.4. Time by run number primitive operation tests run first.

Figure 4.5 shows a comparison of the total running time of each run in the second test
batch, in which the atomic component operation tests were run first. The combined tests
for the primitive operations outperformed the atomic component operations in
Run 1, Run 2, and Run 5, but not in Run 3 and Run 4.

Figure 4.5. Time by run number atomic component operation tests run first.

Figure 4.6 shows a comparison of the combined running time for all tests run in
both test batches, broken out by test run. The combined tests for the atomic component
operations outperformed the primitive operations in all runs except Run 5.

Figure 4.7 shows a comparison of the combined running time for all tests across all
runs in both test batches. The combined tests for the atomic component operations
outperformed the primitive operations.

Figure 4.6. Combined time by run number.

Figure 4.7. Total combined test execution time.

CHAPTER 5

DISCUSSION

5.1 AVERAGE RUN TIME COMPARISON

Table 4.1 shows the average time per run per test for batch 1 of the test, in which the

Primitive Operations tests ran first and the atomic component operations tests ran second.

Table 4.2 shows the average time per run per test for batch 2 of the test, in which the Atomic

Component Operations tests ran first and the Primitive operations tests ran second.

From these two tables you can see that the order in which the tests were run had an

impact on the time it took for the tests to complete. The Primitive Operations suite of tests

took an average of 419.6 ms to run when run first versus 343.6 ms when run second. The

Atomic Component Operations test took an average of 402.8 ms when run first and an

average of 328.6 ms when run second.

The individual per-test averages were generally also affected by the order in which the
suites were run. All of the primitive tests ran faster when they were run second (batch 2)
than when they were run first (batch 1), except for Move Middle To End, which averaged
47 ms in both batches. Six of the seven atomic component tests also ran faster when run
second (batch 1); the exception was Move Middle to Middle, which averaged 47 ms when run
second and 43.8 ms when run first.

The time efficiency difference seen based on the order in which the tests were run

likely was affected by issues outside the control of these tests. There are specific issues

related to the JVM that are outside the control of this code, such as garbage collection and

object finalization. In addition, there are other issues that could contribute to an individual

run of a test showing a greater slowdown such as other processes running on the computer at

the same time.

I attempted to minimize the influence of these outside variables as much as possible
by running several iterations of the tests prior to capturing results, minimizing any other
automatic processes running on the machine, and calculating the total average time
over both test batches.

Table 4.3 and Table 4.4 show the total average time across all runs of both batches of
tests. You can see from this data that the Atomic Component Operations tests ran faster than
the primitive operations tests: an average of 365.7 ms per run versus 381.6 ms, a difference
of 15.9 ms, or about 4 percent. Keeping in mind that both sets of tests performed the same
amount of work, this shows that the atomic component operations perform slightly faster
than the primitive operations.
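The summary figures in Table 4.3 and Table 4.4 can be reproduced from the per-batch averages in Table 4.1 and Table 4.2. The short sketch below does that arithmetic; the class name SummarySketch and its helper methods are hypothetical and only restate the calculation, while the numeric inputs are taken from the tables.

```java
/** Hypothetical helper that reproduces the summary arithmetic from the results tables. */
final class SummarySketch {

    /** Mean of the two per-batch averages (run first, run second). */
    static double mean(double runFirst, double runSecond) {
        return (runFirst + runSecond) / 2.0;
    }

    /** Percentage by which candidate is faster than baseline. */
    static double percentFaster(double baseline, double candidate) {
        return (baseline - candidate) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double primitive = mean(419.6, 343.6); // primitive suite: run first, run second
        double atomic = mean(402.8, 328.6);    // atomic component suite: run first, run second
        System.out.printf("primitive average: %.1f ms%n", primitive);
        System.out.printf("atomic component average: %.1f ms%n", atomic);
        System.out.printf("atomic component faster by: %.1f%%%n", percentFaster(primitive, atomic));
    }
}
```

The gap between the two suites (about 16 ms) is considerably smaller than the gap between running a suite first and running it second (about 75 ms for either suite), which is worth keeping in mind when reading the overall averages.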

5.2 DEVELOPER TIME

It is harder to quantify the amount of work required to use the operations themselves.
This is a measurement of the work done by the development team using the algorithms.
When setting up the tests for this thesis, the average amount of code needed to
model a move operation using only Insert/Delete was much greater than the average amount
needed to run the atomic component move operation.

Code listing 1 (Figure 3.1) shows the pseudo code for a sample move test using only

primitive operations and Code listing 2 (Figure 3.2) shows the pseudo code for that same

move test using the atomic component move operation. You can see from Code listing 1 that

the number of lines of code is almost double what is shown in Code listing 2, primarily

because I was forced to handle the data structures to store the text and its parts that needed to

be moved. In contrast, all of the complexity of the move is handled behind the scenes for the

developer when using the atomic component move operation.

Setting up the tests that modeled different moves using just Insert and Delete took me
at least double the amount of time it took to set up the tests using the
Move operation. This was a small set of tests that simply moved text around. It would be
impractical to develop an application that could turn a single-user application into an RTCE
application using only Insert and Delete.

CHAPTER 6

CONCLUSION

I conducted an efficiency comparison of primitive algorithms versus atomic

component algorithms in the context of a RTCE using Operational Transform. Although

virtually any operation that would need to be performed in a RTCE could be modeled just

using primitive Insert and Delete operations, it is more efficient in terms of running time of

the algorithm, as well as from the perspective of the developer building an application using

such an API, to use the atomic component operations.

The tests I ran ensured that each test in the atomic component operation suite did the
same amount of work as was done by the primitive operations. In other words, I made sure the
amount of text to move, the distance the text needed to move, and the work of finding
the indices were the same. I ran the tests in two batches: the first batch ran the primitive
operations tests first, and the second batch ran the atomic component operations tests first.
Each batch consisted of 5 runs of 1000 executions of each test.

The results show that the atomic component operations are more efficient in run time

compared with the primitive operations after you average the total time across all tests, and

across all runs of the tests. Additionally, based on the greater amount of code and the longer

amount of time it took to setup and use the primitive operations compared with the atomic

component operations to perform the same function (i.e., move a paragraph from one part of

a document to another), the atomic component operations outperformed the primitives in this

area as well.

There are several areas for future work in this domain. Another atomic component

operation could be developed and then compared for efficiency against something modeled

with Insert and Delete primitive operations. Further work could be explored on the difference

in effort from the API side of development in how much more time it takes to handle the

transformations when dealing with atomic component operations versus the primitive

operations. Also, the same algorithms used in this paper could be compared for efficiency

with complex custom objects instead of strings.

Any operation needed by an RTCE could be modeled using the primitive operations
of Insert and Delete. However, the atomic component operation performed better in
terms of running time, as well as in terms of the amount of code required to use
those operations and the developer time they cost. When building an API for an
RTCE, it would be better to implement some higher level API calls rather than require
consumers of your API to rely solely on the primitive operations.

REFERENCES

[1] D. Wang, A. Mah, and S. Lassen. Google wave operational transformation, 2010. http://wave-protocol.googlecode.com/hg/whitepapers/operational-transform/operational-transform.html, accessed Feb. 2013.

[2] C. Sun, S. Xia, D. Sun, D. Chen. H. F. Shen and W. Cai. Transparent adaptation of single-user applications for multi-user real-time collaboration. ACM Transactions on Computer-Human Interaction, 13(4): 531–582, 2006.

[3] C. A. Ellis and C. Sun. Operational transformation in real-time group editors: Issues, algorithms, and achievements. Proceedings of the 1998 ACM Conference on Computer Supported Cooperative Work, Seattle, 1998. ACM.

[4] S. Xia, D. Sun, C. Sun, D. Chen and H. Shen. Leveraging single-user applications for multi-user collaboration: The coword approach. Proceedings of the 2004 ACM Conference on Computer Supported Cooperative Work, Chicago, 2004. ACM.

[5] L. Kaewkitipong. Diffusion of an Online Collaboration Tool: The case of google wave adoption failure. Proceedings of the System Science (HICSS), 2012 45th Hawaii International Conference on System Sciences, Maui, 2012. IEEE.

[6] D. A. Nichols, P. Curtis, M. Dixon, and J. Lamping. High-latency, low-bandwidth windowing in the Jupiter collaboration system. Proceedings of the 8th Annual ACM Symposium on User Interface and Software Technology, Pittsburgh, 1995. ACM.

[7] C. A. Ellis, and S. J. Gibbs. Concurrency control in groupware systems. Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data (SIGMOD '89), New York, 1989. ACM.

[8] C. Ignat and M. C. Norrie. Tree-based model algorithm for maintaining consistency in real-time collaborative editing systems. Proceeding of the 4th International Workshop on Collaborative Editing, New Orleans, 2002. CSCW.

[9] Q. Wu, C. Pu, and J. E. Ferreira. A partial persistent data structure to support consistency in real-time collaborative editing. Proceeding of Data Engineering (ICDE), 2010 IEEE 26th International Conference, Long Beach, 2010. IEEE.

[10] G. Oster, P. Molli, P. Urso, and A. Imine. Tombstone transformation functions for ensuring consistency in collaborative editing systems. Proceedings of the 2006 International Conference on Collaborative Computing: Networking, Applications and Worksharing, Atlanta, 2006. IEEE.

[11] C. Sun, & D. Chen. Consistency maintenance in real-time collaborative graphics editing systems. ACM Transactions on Computer-Human Interaction (TOCHI), 9: 1-41, 2002.

Page 38: A COMPARISON OF THE EFFICIENCY OF AN ATOMIC COMPONENT OPERATION

28

[12] M. Ressel, D. Nitsche-Ruhland, and R. Gunzenhäuser. An integrating, transformation-oriented approach to concurrency control and undo in group editors. Proceedings of the 1996 ACM conference on Computer Supported Cooperative Work, Boston, 1996. ACM.

[13] Y. Cheng, F. He, S. Jing, and Z. Huang. An multiuser undo/redo method for replicated collaborative modeling systems. Proceedings of the 13th International Conference on Computer Supported Cooperative Work in Design, Santiago, 2009. IEEE.

[14] L. Xue, M. Orgun, and K. Zhang. A multi-versioning algorithm for intention preservation in distributed real-time group editors. Proceedings of the 26th Australasian Computer Science Conference, Adelaide, 2003. Australian Computer Society, Inc.

[15] C. Sun, and D. Chen. A multi-version approach to conflict resolution in distributed groupware systems. Proceedings of the 20th International Conference of Distributed Computing Systems, Taipei, 2000. IEEE.

[16] Wikipedia. Operational transformation, 2013. http://en.wikipedia.org/wiki/Operational_transformation, accessed Mar. 20, 2013

[17] D. Li, L. Zhou, R. Muntz, and C. Sun. Operation propagation in real-time group editors. Multimedia, IEEE, 7(4): 55-61, 2000.

[18] Robert W. Sebesta. Language evaluation criteria concepts of programming languages, 9th ed, pages 7-17. Addison-Wesley Publishing Co., Reading, Mass., 2009.

[19] S. G. McLellan, A. W. Roesler, J. T. Tempest, and C. I. Spinuzzi. Building more usable APIs. Software, IEEE, 15: 78-86, 1998.

[20] B. E. Teasley. The effects of naming style and expertise on program comprehension. International Journal of Human-Computer Studies, 40: 757-770, 1994.

[21] D. Li, and R. Li, An admissibility-based operational transformation framework for collaborative editing systems. Computer Supported Cooperative Work (CSCW), 19: 1-43, 2010.

[22] Merriam-Webster Online. Efficiency [Def. 2], 2013. http://www.merriam-webster.com/dictionary/efficiency, accessed Mar. 20, 2013.