SWE 622 - Distributed Systems Project Phase III: Eric Barnes, David Chang, David Nelson, Fisayo Oluwadiya, Xiang Shen


SWE 622- Distributed Systems

Project Phase III

Eric Barnes, David Chang, David Nelson

Fisayo Oluwadiya, Xiang Shen

Initial Design

[Figure: initial design diagram. Exchange-side components: Exchange, Processor, ExchangeServerSqlite, DB, JFrameExchangeGUI, SwingWorker, TraderInput, TraderOutput, input queue, output queues. Trader-side components: TraderClient, Trader, Monitor, JFrameTraderGUI, SwingWorker, TraderWorker, Listener, processed queue. Arrow labels: new, initDB, start, query, commit, update, sell, buy, business msg, time msg, sync time, one TCP connection. The diagram also shows "some other exchange" and "some other trader".]

Exchange waits for new connections and spawns new TraderOutput and TraderInput threads for each new connection. It also creates all the queues.

Final Design

[Figure: final design diagram. Exchange-side components: Exchange, Processor, ExchangeServerSqlite, DB, JFrameExchangeGUI, SwingWorker, TraderConnectionManager, TraderInput, TraderOutput, input queue, output queues. Trader-side components: TraderClient, Trader, SyncClock, JFrameTraderGUI, SwingWorker, ExchangeConnectionManager, Listener, processed queue. Arrow labels: new, initDB, start, schedule, query, commit, update, sell, buy, business msg, time/keepalive msg, sync time, sync clock, one TCP connection. The diagram also shows "some other exchange" and "some other trader".]

Exchange waits for new connections. For each one it spawns a new TraderConnectionManager thread, which starts and manages the TraderOutput, TraderInput, and CheckTimeout threads for that connection.

Initial Class Diagram

Final Class Diagram

public void init() throws Exception {
    _serverSocket = null;
    try {
        _serverSocket = new ServerSocket(_exchangePort);
    } catch (IOException e) {
        log.error("Could not listen on port:" + _exchangePort);
        System.exit(1);
    }

    // Accept loop: one TraderConnectionManager per incoming trader connection.
    while (true) {
        Socket clientSocket = null;
        try {
            log.info("WAITING");
            clientSocket = _serverSocket.accept();
            log.info("ACCEPTING");
        } catch (IOException e) {
            log.error("Accept failed.");
            System.exit(1);
        }

        TraderConnectionManager manager = new TraderConnectionManager(clientSocket, this);
        traderManagers.add(manager);
    }
}

Class Exchange init() method

public TraderConnectionManager(Socket clientSocket, Exchange _exchange) {
    this.exchange = _exchange;
    this.outMsgQueue = new ArrayBlockingQueue<String>(50);

    try {
        out = new PrintWriter(clientSocket.getOutputStream(), true);
        in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
    } catch (IOException e) {
        ......
    }

    Thread tin = new Thread(new TraderInput(clientSocket, outMsgQueue));
    tin.start();

    Thread tout = new Thread(new TraderOutput(clientSocket, outMsgQueue));
    tout.start();

    // set last keepalive so that we don't time out right away
    lastKeepaliveReceived = System.currentTimeMillis();

    // schedule keepalives
    keepaliveTimer = new Timer("keepallive");
    keepaliveTimer.scheduleAtFixedRate(new CheckTimeout(), KEEPALIVE_INTERVAL, KEEPALIVE_INTERVAL);
}

Class TraderConnectionManager constructor

public void run() {
    boolean connectedToExchange = initSocket(exchangAddr, port);

    if (connectedToExchange) {
        // set last keepalive so that we don't time out right away
        lastKeepaliveReceived = System.currentTimeMillis();

        // schedule keepalives
        keepaliveTimer = new Timer("keepalive-" + _timeMsgQueue);
        keepaliveTimer.scheduleAtFixedRate(new SendKeepalive(), KEEPALIVE_INTERVAL, KEEPALIVE_INTERVAL);

        // schedule clock syncs
        clockSyncTimer = new Timer("clockSync-" + _timeMsgQueue);
        clockSyncTimer.scheduleAtFixedRate(new SyncClock(), 0, CLOCK_OFFSET_CHECK_INTERVAL);
    }
}

Note:
KEEPALIVE_INTERVAL = 5000;
MAX_MISSED_KEEPALIVES = 2;
// check every 1 hour (1000 ms/sec * 60 sec/min * 60 min/hour)
CLOCK_OFFSET_CHECK_INTERVAL = 1000 * 60 * 60;

Class ExchangeConnectionManager run() method

TradeTransaction t = new TradeTransaction(fromServer);

boolean outOfOrder = t.getMessageID() < newestMessageProcessed;

// If the message was out of order, flag it, but don't set it as the last ID
// received, to ensure that subsequent out-of-order messages are also flagged.
// E.g. if IDs arrive in the order 5 3 4, both 3 and 4 should be flagged.
if (outOfOrder) {
    log.error("Out of order");
    t.setOutOfOrder(true);
} else {
    newestMessageProcessed = t.getMessageID();
    log.info("Message sequence:" + newestMessageProcessed);
}

t.setStale(_clockDelta + System.currentTimeMillis());
processedTAQueue.put(t);

Just in case the echo messages are out of order
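The flagging rule can be tried in isolation. The sketch below is a minimal illustration and not the project's code: the class name SequenceChecker and its methods are hypothetical, and message IDs are assumed to be plain long values that increase with each message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the out-of-order rule; not part of the project.
class SequenceChecker {
    // Highest in-order message ID seen so far (plays the role of newestMessageProcessed).
    private long newestMessageProcessed = Long.MIN_VALUE;

    // Returns true if the ID is older than the newest in-order ID seen so far.
    // The high-water mark only advances on in-order messages, so a run of
    // late messages (for example 3 and 4 arriving after 5) is flagged in full.
    boolean isOutOfOrder(long messageId) {
        if (messageId < newestMessageProcessed) {
            return true;
        }
        newestMessageProcessed = messageId;
        return false;
    }

    // Convenience wrapper: computes the flags for a whole arrival sequence.
    static List<Boolean> flags(long... ids) {
        SequenceChecker checker = new SequenceChecker();
        List<Boolean> result = new ArrayList<Boolean>();
        for (long id : ids) {
            result.add(checker.isOutOfOrder(id));
        }
        return result;
    }

    public static void main(String[] args) {
        // Arrival order 5, 3, 4, 6: prints [false, true, true, false]
        System.out.println(flags(5, 3, 4, 6));
    }
}
```

Because the comparison is a strict "less than", a repeated ID is not flagged in this sketch; whether duplicates can occur in the real system is not stated in the slides.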


private class SyncClock extends TimerTask {
    @Override
    public void run() {
        long requestTime;
        long receiveTime;
        long oneWay;
        long exchangeTime;
        long totalDelta = 0;
        int tryTimes = 3;
        String timeStr = "";

        for (int i = 0; i < tryTimes; i++) {
            requestTime = System.currentTimeMillis();
            log.info("requestTime " + i + "=" + requestTime);
            requestTime(requestTime);
            try {
                timeStr = _timeMsgQueue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // no reply to parse, so keep the previous _clockDelta
            }
            exchangeTime = Long.parseLong(timeStr.replace("time:", ""));
            log.info("exchangeTime " + i + "=" + exchangeTime);
            receiveTime = System.currentTimeMillis();
            log.info("receiveTime " + i + "=" + receiveTime);
            // assume the reply spent half of the round trip in transit
            oneWay = (receiveTime - requestTime) / 2;
            totalDelta += exchangeTime + oneWay - receiveTime;
        }
        // average the samples to smooth out network jitter
        _clockDelta = totalDelta / tryTimes;
        log.info("clockDelta=" + _clockDelta);
    }
}

How to compute the clock delta between exchange and trader
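The arithmetic is easier to follow with concrete numbers. The sketch below restates the per-sample formula from SyncClock as a standalone calculation; the class ClockDelta, its method names, and all timestamps are made up for illustration. It relies on the same assumption as the original code, that the reply spends half of the round trip in transit, which is the idea behind Cristian's algorithm.

```java
// Hypothetical, self-contained restatement of the SyncClock arithmetic.
class ClockDelta {
    // One sample: exchange clock minus trader clock, corrected for transit time.
    static long sampleDelta(long requestTime, long exchangeTime, long receiveTime) {
        long oneWay = (receiveTime - requestTime) / 2; // half the round trip
        return exchangeTime + oneWay - receiveTime;
    }

    // Average over several samples; each row is {requestTime, exchangeTime, receiveTime}.
    static long averageDelta(long[][] samples) {
        long total = 0;
        for (long[] s : samples) {
            total += sampleDelta(s[0], s[1], s[2]);
        }
        return total / samples.length;
    }

    public static void main(String[] args) {
        // Invented timestamps in milliseconds.
        long[][] samples = {
            {1000, 1510, 1020}, // round trip 20 ms, delta 500
            {2000, 2517, 2030}, // round trip 30 ms, delta 502
            {3000, 3506, 3010}, // round trip 10 ms, delta 501
        };
        System.out.println(averageDelta(samples)); // prints 501
    }
}
```

In the first sample the trader sends at 1000 and receives at 1020, so the one-way estimate is 10 ms. The exchange reported 1510, which the trader takes to mean the exchange clock reads 1520 at receive time, giving a delta of 1520 - 1020 = 500 ms.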

All SLOC:
SLOC    Directory    SLOC-by-Language (Sorted)
435     gui          java=435
375     exchange     java=375
368     trader       java=368
147     common       java=147
Totals grouped by language (dominant language first):
java: 1325 (100.00%)

Reused SLOC (GUI code provided by Prof. Sousa):
SLOC    Directory    SLOC-by-Language (Sorted)
311     gui          java=311
Totals grouped by language (dominant language first):
java: 311 (100.00%)

Total New SLOC:
SLOC    Directory    SLOC-by-Language (Sorted)
1014    src          java=1014
Totals grouped by language (dominant language first):
java: 1014 (100.00%)

Total Physical Source Lines of Code (SLOC) = 1,014
Development Effort Estimate, Person-Years (Person-Months) = 0.20 (2.44)
 (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months) = 0.29 (3.51)
 (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Total Estimated Cost to Develop = $ 27,415
 (average salary = $56,286/year, overhead = 2.40).
SLOCCount, Copyright (C) 2001-2004 David A. Wheeler

Source Lines Of Code
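The SLOCCount estimates can be checked by hand from the formulas it prints. The sketch below is an illustrative recalculation, not SLOCCount's own code: the class BasicCocomo and its method names are hypothetical, and the cost formula (person-years times salary times overhead) is an assumption inferred from the figures in the report.

```java
// Hypothetical recalculation of the Basic COCOMO figures reported by SLOCCount.
class BasicCocomo {
    // Effort in person-months: 2.4 * KSLOC^1.05 (formula quoted in the report).
    static double personMonths(double ksloc) {
        return 2.4 * Math.pow(ksloc, 1.05);
    }

    // Schedule in months: 2.5 * personMonths^0.38 (formula quoted in the report).
    static double scheduleMonths(double personMonths) {
        return 2.5 * Math.pow(personMonths, 0.38);
    }

    // Assumed cost model: person-years * average salary * overhead factor.
    static double cost(double personMonths, double salaryPerYear, double overhead) {
        return personMonths / 12.0 * salaryPerYear * overhead;
    }

    public static void main(String[] args) {
        double ksloc = 1.014; // 1,014 new SLOC
        double effort = personMonths(ksloc);
        System.out.printf("effort   = %.2f person-months%n", effort);
        System.out.printf("schedule = %.2f months%n", scheduleMonths(effort));
        System.out.printf("cost     = $%.0f%n", cost(effort, 56286, 2.40));
    }
}
```

With 1.014 KSLOC this should land on the reported 2.44 person-months, 3.51 months, and roughly $27,415, up to rounding.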

• Design: 120

• Coding: 100

• Integration: 40

• Experiments: 60

• Total effort: 320

Note: all figures are estimates.

Effort spent on the project (man-hours)

• SUSE Linux 10 SP1 (i586)
  – Kernel: 2.6.16.46-0.12-bigsmp #1 SMP Thu May 17 14:00:09 UTC 2007
  – Memory: 4GB
  – CPU: 2 x Intel(R) 3.40GHz
  – Java: java version "1.6.0_12"

• Solaris 10
  – Kernel: SunOS mason 5.10 Generic_138888-01 sun4u sparc SUNW,Sun-Fire-V890
  – Memory: 32GB
  – CPU: 8 x Sparc 1350 MHz
  – Java: java version "1.6.0_11"

• Microsoft Windows XP SP3
  – Java: java version "1.6.0_01"

• Microsoft Windows Vista SP1
  – Java: java version "1.6.0_12"

Testing Environment

LAN:
• Ping < 1ms
• Directly connected through a network switch

WAN:
• 10ms < Ping < 30ms

Pinging mason.gmu.edu [129.174.1.13] with 32 bytes of data:
Reply from 129.174.1.13: bytes=32 time=13ms TTL=246
Reply from 129.174.1.13: bytes=32 time=14ms TTL=246
Reply from 129.174.1.13: bytes=32 time=19ms TTL=246
Reply from 129.174.1.13: bytes=32 time=12ms TTL=246

Ping statistics for 129.174.1.13:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 12ms, Maximum = 19ms, Average = 14ms

• Many network hops

Tracing route to mason.gmu.edu [129.174.1.13]
over a maximum of 30 hops:

  1     1 ms     1 ms     1 ms  192.168.0.1
  2    10 ms     9 ms     8 ms  10.3.224.1
  3    12 ms     9 ms     9 ms  ip72-219-223-161.dc.dc.cox.net [72.219.223.161]
  4     9 ms    10 ms     9 ms  mrfddsrj01-ge110.rd.dc.cox.net [68.100.0.161]
  5    13 ms    10 ms    10 ms  ashbbbrj01-ae0.0.r2.as.cox.net [68.1.0.220]
  6    13 ms    64 ms    11 ms  ae-11-69.car1.Washington1.Level3.net [4.68.17.3]
  7    16 ms    13 ms    12 ms  GEORGE-MASO.car1.Washington1.Level3.net [4.79.204.66]
  8     *        *        *     Request timed out.
  9     *        *        *     Request timed out.
 10    13 ms    15 ms    12 ms  mason.gmu.edu [129.174.1.13]

Testing Network and network latency

Latency between starting to send an order and proceeding with other activities

[Chart: latency in msec (y-axis, 0 to 180) per order (x-axis, orders 1 to 29), with two series, "Latency in a LAN" and "Latency in a WAN". Averages annotated on the chart: Avg=71.55 and Avg=51.72.]

LAN: Ping statistics for 172.16.104.231: Approximate round trip times in milli-seconds Minimum = 0ms, Maximum = 0ms, Average = 0ms

WAN: Ping statistics for 129.174.1.13 (mason.gmu.edu): Approximate round trip times in milli-seconds: Minimum = 17ms, Maximum = 20ms, Average = 18ms

Latency between issuing an order and the last trader receiving its echo

[Chart: latency in msec (y-axis, 0 to 180) per order (x-axis, orders 1 to 20), one series, "Latency". Average annotated on the chart: Avg=86.7.]

• There is one TCP connection between each Exchange and each Trader

• If the Trader fails, each Exchange detects it almost immediately

– The Exchange gets a "Read failed" error on the connection

– The detection time depends on the network latency

• If the network also fails, each Exchange disconnects the failed Trader after the time-out

– MAX_MISSED_KEEPALIVES * KEEPALIVE_INTERVAL (10 seconds if the values noted earlier, KEEPALIVE_INTERVAL = 5000 ms and MAX_MISSED_KEEPALIVES = 2, apply here)

In the log file:

ERROR TraderConnectionManager:130 - Read failed. Client seems gone or the exchange is closing.
INFO TraderConnectionManager:68 - Disconnecting
INFO TraderConnectionManager:85 - disconnecting trader
INFO TraderConnectionManager:137 - EXIT INPUT
INFO TraderConnectionManager:177 - EXIT OUTPUT

Latency between a trader failure and the last exchange stopping echoing orders
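The slides do not show the CheckTimeout task itself, so the following is only a guess at the kind of check it could perform. The class KeepaliveMonitor and its methods are hypothetical; the two constants are the values quoted in the slides. Time is passed in explicitly so the logic can be exercised without waiting.

```java
// Hypothetical sketch of a keepalive timeout check; the project's CheckTimeout
// class is not shown in the slides, so this logic is an assumption.
class KeepaliveMonitor {
    static final long KEEPALIVE_INTERVAL = 5000; // ms, value from the slides
    static final int MAX_MISSED_KEEPALIVES = 2;  // value from the slides

    private long lastKeepaliveReceived;

    // Start the window at connection time so the peer does not time out right away.
    KeepaliveMonitor(long nowMillis) {
        lastKeepaliveReceived = nowMillis;
    }

    // Called whenever a keepalive message arrives from the peer.
    void onKeepalive(long nowMillis) {
        lastKeepaliveReceived = nowMillis;
    }

    // Time-out window: MAX_MISSED_KEEPALIVES * KEEPALIVE_INTERVAL.
    static long timeoutMillis() {
        return MAX_MISSED_KEEPALIVES * KEEPALIVE_INTERVAL;
    }

    // True once more than the whole window has passed without a keepalive.
    // Using ">" rather than ">=" is an arbitrary choice in this sketch.
    boolean isTimedOut(long nowMillis) {
        return nowMillis - lastKeepaliveReceived > timeoutMillis();
    }

    public static void main(String[] args) {
        KeepaliveMonitor monitor = new KeepaliveMonitor(0);
        System.out.println(timeoutMillis());           // prints 10000
        System.out.println(monitor.isTimedOut(10001)); // prints true
        monitor.onKeepalive(8000);
        System.out.println(monitor.isTimedOut(15000)); // prints false
    }
}
```

A periodic task scheduled every KEEPALIVE_INTERVAL could call isTimedOut(System.currentTimeMillis()) and disconnect the trader when it returns true, which would bound the detection delay at roughly the time-out plus one check interval.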

Storage footprint for one exchange

[Chart: memory in kB (y-axis, 0 to 350000) against the number of traders (x-axis, 1 to 15), with two series, "Resident set size" and "Total VM size".]

• The Linux and Sun boxes both have ntpd running

• All Windows boxes have the Windows Time service running

• Clock skew we observed on the Linux box (from the NTP log):
……
25 Apr 04:56:57 ntpd[3151]: time reset -0.780931 s
25 Apr 05:01:13 ntpd[3151]: synchronized to LOCAL(0), stratum 10
25 Apr 05:02:18 ntpd[3151]: synchronized to 172.16.200.1, stratum 1
25 Apr 05:28:11 ntpd[3151]: time reset +0.504970 s
25 Apr 05:32:28 ntpd[3151]: synchronized to LOCAL(0), stratum 10
25 Apr 05:33:31 ntpd[3151]: synchronized to 172.16.200.1, stratum 1
25 Apr 05:47:29 ntpd[3151]: time reset +0.241215 s
……

• Testing scenario:

– Stop ntpd on Linux

– Change the time, for example: date +%T -s "17:07"

– Run the test

Clock Skew

Time difference, Clock delta, Transaction time

[Chart: values from 0 to 700 (y-axis) over 10 samples (x-axis, labeled "Time difference"), with three series, "Clock Delta", "Transaction how long ago", and "Time difference".]