Towards Real-Time, Many Task Applications on Large...
![Page 1: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/1.jpg)
Towards Real-Time, Many Task Applications on Large Distributed Systems
- focusing on the implementation of RT-BOINC
Sangho Yi ([email protected])
![Page 2: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/2.jpg)
Content
Motivation and Background
RT-BOINC in a nutshell
Internal structures
Design & implementation
Conclusions and future work
![Page 3: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/3.jpg)
Motivation
Demand for computing large-scale real-time (RT) tasks has increased in distributed computing environments
Chess, Game of Go
Real-time forensic analysis
Ultra HD-level real-time multimedia processing
…
Existing Desktop Grid and Volunteer Computing environments lack support for RT
![Page 4: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/4.jpg)
About BOINC
BOINC is tailored for maximizing task throughput, not minimizing latency on the order of seconds.
XtremWeb and Condor have similar characteristics.
A BOINC project has
A BOINC server (web, storage, database, ...)
Multiple BOINC clients
Network connections between the server and the clients
![Page 5: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/5.jpg)
BOINC Projects
Normally perform a few transactions per second with host clients.
1~50 transactions per second (ref. http://boincstats.com)
Send large chunks of computation to the host clients.
A couple of hours, or even days, of computation
Do not have an RT guarantee
Because BOINC is tailored for maximizing the total amount of computation.
![Page 6: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/6.jpg)
Significant Gaps here...
"I need a 10-second car." - in the movie "Fast & Furious"
Vin Diesel – the main actor in the movie
![Page 7: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/7.jpg)
Significant Gaps here...
"We need a 10-second completion." - in a "Chess game"
![Page 8: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/8.jpg)
RT-BOINC in a Nutshell
RT-BOINC features
Providing low WCET (worst-case execution time) for all components
No database operations at run-time
O(1) interfaces for data structures
Reduced complexity for server daemons: almost O(1)
![Page 9: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/9.jpg)
Original BOINC Internal
[Figure: internal structure of a BOINC project. Labels on the BOINC server side: Scheduler, Work-generator, Transitioner, Feeder, Validator, Assimilator, File-deleter, workunits in DB, workunit-result ready queue, workunit-results in DB. Labels on the BOINC hosts side: requests for work distribution, results of work. Legend: flow of distributing work requests, flow of reporting work results.]
![Page 10: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/10.jpg)
RT-BOINC Internal
![Page 11: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/11.jpg)
Data management
MySQL Database vs. In-memory data structures
(a) BOINC
BOINC DB (workunits, results, hosts, users, apps, platforms, and …) - based on MySQL
Complexity for lookup, insert, and remove: O(log N) ~ O(N^2)
(b) RT-BOINC
In-memory data structures - O(1)
Multi-level lookup tables and fixed-size list
Lookup pools
Main database: in-memory data records with data format compaction (workunits, results, hosts, users, ...) - based on shm-IPC
![Page 12: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/12.jpg)
Example 1) select from where;
Retrieving RESULT from the O(1) data structure
Ex) select * from result where workunitid = '0x1234';
[Figure: the ID of the result (hex digits 1 2 3 4) is split into fields of 8 bits, 4 bits, and 4 bits; the lookup tables have 2^8 = 256 entries and 2^4 = 16 entries and lead to the result table in main memory]
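The lookup in Example 1 can be sketched roughly as follows. This is a minimal illustration, not the RT-BOINC code: only the 8/4/4-bit split of a 16-bit ID comes from the slide, while the names `split_id` and `MultiLevelTable` are hypothetical, and the lower levels are allocated lazily here for brevity, whereas a real-time server would presumably preallocate them.

```python
# Hypothetical sketch of a multi-level lookup table keyed by a 16-bit ID.
# The 8/4/4-bit split follows the slide; all names are assumptions.

L1_BITS, L2_BITS, L3_BITS = 8, 4, 4


def split_id(key):
    """Split a 16-bit key into (8-bit, 4-bit, 4-bit) table indices."""
    i1 = (key >> (L2_BITS + L3_BITS)) & ((1 << L1_BITS) - 1)
    i2 = (key >> L3_BITS) & ((1 << L2_BITS) - 1)
    i3 = key & ((1 << L3_BITS) - 1)
    return i1, i2, i3


class MultiLevelTable:
    def __init__(self):
        # Top level: 2^8 = 256 entries, allocated up front.
        self.top = [None] * (1 << L1_BITS)

    def put(self, key, record):
        i1, i2, i3 = split_id(key)
        if self.top[i1] is None:
            self.top[i1] = [None] * (1 << L2_BITS)  # 2^4 = 16 entries
        mid = self.top[i1]
        if mid[i2] is None:
            mid[i2] = [None] * (1 << L3_BITS)  # 2^4 = 16 entries
        mid[i2][i3] = record

    def get(self, key):
        # Always three index operations, independent of how many records exist.
        i1, i2, i3 = split_id(key)
        mid = self.top[i1]
        if mid is None or mid[i2] is None:
            return None
        return mid[i2][i3]
```

Each lookup touches a fixed number of table levels, which is what makes the cost independent of the number of stored results.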
![Page 13: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/13.jpg)
Performance Evaluation
1) Micro and Macro Benchmarks
Based on dummy server load
2) Case Studies
Game of Go AI (and Chess AI – soon)
![Page 14: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/14.jpg)
Macro-benchmarks (high load)
![Page 15: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/15.jpg)
Performance Evaluation - #2 Case Studies
Game of Go - 9x9 board (currently working)
Fuego - a Monte Carlo-based AI
GTP (Go Text Protocol)
KGS Go Server - can play with AI and human
Chess (developing with Emmanuel Jeannot)
Distributed depth-first-search-based AI
UCI (Universal Chess Interface) protocol
![Page 16: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/16.jpg)
Summary
RT-BOINC provides...
Faster response time and better real-time performance than BOINC.
300~1,000 times lower WCET (worst-case execution time) for each server-side operation.
A smaller difference between the average and the worst-case performance.
A smaller difference between low and high load conditions.
![Page 17: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/17.jpg)
Future work (the remaining part)
RT-BOINC Server
Project manager requests work
T: deadline
Nc: # of computations
Ps: probability of successful execution
request
RT-BOINC server provides the worst-case number of transactions processed per second: Nt
Lots of volunteer hosts
...
distribution
returning results
T Nc/Nt
Time for handling transactions in server
Time for computation in volunteer hosts
Time for communication between server and hosts
Checkpointing & replication are required in the presence of host failures.
Red: What we have done in the first paper
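A rough way to read the quantities on this slide: Nc/Nt is the worst-case time the server spends handling Nc transactions at Nt transactions per second. The slide does not state how this term combines with the computation and communication times, so the simple sum against the deadline T below is an assumption, and the function names are hypothetical.

```python
# Hypothetical deadline check built from the slide's symbols T, Nc, Nt.
# Summing the three time components is an assumption, not the authors' model.

def server_time(nc, nt):
    """Worst-case seconds for the server to handle nc transactions at nt per second."""
    return nc / nt


def fits_deadline(t, nc, nt, compute_s=0.0, comm_s=0.0):
    """True if server handling + host computation + communication fit within t seconds."""
    return server_time(nc, nt) + compute_s + comm_s <= t
```

For instance, under this assumption `fits_deadline(30, 1000, 500, compute_s=10, comm_s=10)` holds, because the server term is only 2 seconds.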
![Page 18: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/18.jpg)
Future work (the remaining part)
Blue: What we will show in the next paper
![Page 19: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/19.jpg)
Go AI on RT-BOINC
Participants: KGS Go server, GTP Client, Go AI Master, RT-BOINC server (Work generator, Transitioner, Feeder, Scheduler, Validator, Assimilator (aggregator), File deleter), RT-BOINC Clients (Worker)
Ask to move
Send “genmove” command
Send input file
Generate a workunit (initiate deadline timer)
Generate workunit-result pairs
Insert pairs into scheduler pool
Send works to clients
Compute works (5~10 secs)
Return results to scheduler
Store results
Set need_validate = TRUE
Activate Transitioner
Validate results, and set ASSIMILATE_READY
Assimilate results into one file and return to Master
Select the best move (0~1 secs)
Select and return the best move
Return the best move
Set ASSIMILATE_DONE, and activate Transitioner
Set FILE_DELETE_READY, and activate File deleter
Delete the result files
Set FILE_DELETE_DONE, and activate Feeder to clean the in-memory data structures
Delete data in-memory
Network Communication Delay (5~10 secs)
Response time = 15~25 secs
Deadline timer can activate Transitioner
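The flag hand-offs in this workflow can be read as a small state machine. The sketch below is a simplification under stated assumptions: the flag names need_validate, ASSIMILATE_READY, ASSIMILATE_DONE, FILE_DELETE_READY, and FILE_DELETE_DONE come from the slide, while the `CREATED` and `SENT` stages, the single linear `Stage` enum, and the daemon assigned to each step are hypothetical.

```python
# Hypothetical linear model of the workunit lifecycle on the slide.
# CREATED and SENT are illustrative stages that the slide does not name.
from enum import Enum, auto


class Stage(Enum):
    CREATED = auto()
    SENT = auto()
    NEED_VALIDATE = auto()
    ASSIMILATE_READY = auto()
    ASSIMILATE_DONE = auto()
    FILE_DELETE_READY = auto()
    FILE_DELETE_DONE = auto()


# stage -> (daemon assumed to act on it, next stage)
TRANSITIONS = {
    Stage.CREATED: ("scheduler", Stage.SENT),
    Stage.SENT: ("scheduler", Stage.NEED_VALIDATE),
    Stage.NEED_VALIDATE: ("validator", Stage.ASSIMILATE_READY),
    Stage.ASSIMILATE_READY: ("assimilator", Stage.ASSIMILATE_DONE),
    Stage.ASSIMILATE_DONE: ("transitioner", Stage.FILE_DELETE_READY),
    Stage.FILE_DELETE_READY: ("file_deleter", Stage.FILE_DELETE_DONE),
}


def run(stage=Stage.CREATED):
    """Walk the lifecycle and return the (daemon, new stage name) steps taken."""
    trace = []
    while stage in TRANSITIONS:
        daemon, nxt = TRANSITIONS[stage]
        trace.append((daemon, nxt.name))
        stage = nxt
    return trace
```

Calling `run()` lists the hand-offs in order and ends at FILE_DELETE_DONE, the point where the slide has the Feeder clean the in-memory data structures.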
![Page 20: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/20.jpg)
Experimental Setup (1)
We used a fairly fast machine, but only 2 cores for these experiments.
We’ll extend the scale of the experiments when we have a greater # of volunteers.

| Component | Description | Notes |
| --- | --- | --- |
| Processor | 2.00 GHz (Dual-Quad) | Intel Xeon E5504 |
| Main Memory | 32GB (1,000 MHz) | |
| Secondary Storage | HDD | sorry for lack of info :’) |
| Operating System | Ubuntu 9.10 (karmic) | Linux Kernel 2.6.31-19 |
![Page 21: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/21.jpg)
Experimental Setup (2)
RT-BOINC
Up to 50k active workunits, results, hosts, and users
3.9GB of memory usage on a 64-bit machine
1.9GB of memory usage for O(1) data structures (49.5 % of total)
BOINC
Recent server-stable version (Jun. 2010)
![Page 22: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/22.jpg)
Minor Things for Experiments
Apache & MySQL
Max # of connections (default is 100~256)
Need 2 identical (physical) servers
For BOINC vs. RT-BOINC testing
![Page 23: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/23.jpg)
Preliminary Results (Go AI)
Only preliminary results are available now.
Two cases: 160 and 480 cores (of volunteers)
Deadline = 30 secs / move
![Page 24: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/24.jpg)
Screen Shot on KGS
![Page 25: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/25.jpg)
Macro-benchmarks
Difference in worst-case performance between low and high load conditions
![Page 26: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/26.jpg)
Performance Evaluation - #1
Purpose: to measure the real-time performance of BOINC and RT-BOINC
Criteria: the worst-case and the average execution time
Method: micro and macro benchmarks
Micro-benchmark: for each primary operation related to the server processes
Macro-benchmark: for each server process (including feeder, scheduler, transitioner, work-generator, assimilator, validator, and file-deleter)
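The two criteria above can be collected with a very small harness. This is a generic sketch, not the benchmark code used in the evaluation: `benchmark` and its `runs` parameter are hypothetical, and the largest observed sample is only an empirical estimate of the worst case, not a proven WCET bound.

```python
# Hypothetical micro-benchmark harness: average and worst observed time of one operation.
import time


def benchmark(op, runs=1000):
    """Call op() runs times; return (average seconds, worst observed seconds)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    average = sum(samples) / len(samples)
    worst = max(samples)
    return average, worst
```

The gap between the two returned values is the quantity the later micro-benchmark slides compare between BOINC and RT-BOINC.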
![Page 27: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/27.jpg)
Experimental Environment
We used a somewhat slow, commercial off-the-shelf system. ;-)
For ease of reproduction of the results

| Component | Description | Notes |
| --- | --- | --- |
| Processor | 1.60 GHz, 3MB L2 cache | Intel Core 2 Duo |
| Main Memory | 3GB (800 MHz) | Dual-channel DDR3 |
| Secondary Storage | Solid State Drive | SLC Type |
| Operating System | Ubuntu 9.10 (karmic) | Linux Kernel 2.6.31-19 |
| BOINC version | Server stable version | Nov. 11, 2009 (from SVN) |
![Page 28: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/28.jpg)
Micro-benchmarks
Average execution time (in seconds)
![Page 29: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/29.jpg)
Micro-benchmarks
Worst-case execution time (in seconds)
![Page 30: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/30.jpg)
Micro-benchmarks
Performance improvement ratio (RT-BOINC / BOINC)
![Page 31: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/31.jpg)
Micro-benchmarks
Performance gap between worst-case and average
![Page 32: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/32.jpg)
Macro-benchmarks (low load)
![Page 33: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/33.jpg)
Source code on the Web
http://sourceforge.net/projects/rt-boinc
![Page 34: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/34.jpg)
Size of Data Structures
RT-BOINC uses ’shared memory segment’ IPC between server daemon processes to share the data structures.
For 10,000 entries of hosts, results, and workunits, it consumes 1.09GB of main memory in total.
Memory overhead for the O(1) data structures is 38.6% of the total usage.
Using 1GB of memory is reasonable on commercial off-the-shelf 64-bit hardware platforms.
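Sharing a fixed-size record table through a shared memory segment can be sketched as below. This is only an illustration of the idea using Python's standard `multiprocessing.shared_memory` module (Python 3.8+); the record layout `RECORD`, the table size `MAX_RECORDS`, and the helper names are hypothetical and do not reflect RT-BOINC's actual compacted data format.

```python
# Hypothetical fixed-size record table in a shared memory segment.
import struct
from multiprocessing import shared_memory

# Assumed compact record: result id, workunit id, valid flag (9 bytes, no padding).
RECORD = struct.Struct("<IIB")
MAX_RECORDS = 1024


def write_record(buf, slot, rec_id, wu_id, valid):
    """Pack one record into its fixed slot: O(1), no allocation."""
    RECORD.pack_into(buf, slot * RECORD.size, rec_id, wu_id, valid)


def read_record(buf, slot):
    """Unpack the record stored in the given slot."""
    return RECORD.unpack_from(buf, slot * RECORD.size)


def demo():
    owner = shared_memory.SharedMemory(create=True, size=RECORD.size * MAX_RECORDS)
    try:
        write_record(owner.buf, 7, 7, 0x1234, 1)
        # Another daemon process would attach to the same segment by name.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            return read_record(peer.buf, 7)
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()
```

Because every record has the same size, the total footprint is fixed at creation time, which matches the slide's point that memory usage is a known, up-front cost.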
![Page 35: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/35.jpg)
Detailed information on the Web
http://rt-boinc.sourceforge.net
![Page 36: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/36.jpg)
Future work (Remaining issues)
Providing ’dynamic shared-memory management’ to reduce memory usage
Studying trade-offs between execution time and memory usage
Studying better data structure management for O(1) response
Finding a better task deployment policy to
Reduce server-side load and latency
Improve real-time performance
![Page 37: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/37.jpg)
Thanks! / Questions?
![Page 38: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/38.jpg)
Example 2) insert into values(...);
Inserting RESULT to the O(1) data structure
Ex) insert into result ... values (...);
Result table in main memory
Get an available result field’s id from the end of the list
Then, remove the ‘id’ from the end of the list
Lookup pool for available results
Insert the result at this place
(a) Insertion
![Page 39: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/39.jpg)
Example 3) delete from where;
Deleting RESULT from the O(1) data structure
Ex) delete from result where id=’1234’;
Result table in main memory
Insert ‘1234’ at the end of the result lookup list
Lookup pool for available results
Invalidate the 1234th result
(b) Deletion
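The insertion and deletion examples both rely on a lookup pool of available ids. A minimal sketch of that idea follows; the class name `ResultPool` and its methods are hypothetical, and a Python list stands in for the fixed-size list, so `pop` and `append` at the end are O(1) only in the amortized sense.

```python
# Hypothetical fixed-size result table with a lookup pool of available ids.

class ResultPool:
    def __init__(self, capacity):
        self.records = [None] * capacity       # result table in main memory
        self.free_ids = list(range(capacity))  # lookup pool for available results

    def insert(self, record):
        """Take an available id from the end of the list and store the record there."""
        if not self.free_ids:
            raise MemoryError("result table is full")
        rid = self.free_ids.pop()
        self.records[rid] = record
        return rid

    def delete(self, rid):
        """Invalidate the record and put its id back at the end of the list."""
        if self.records[rid] is None:
            raise KeyError(rid)
        self.records[rid] = None
        self.free_ids.append(rid)
```

A freed id is the next one handed out, so the table never needs searching or compaction on either path.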
![Page 40: Towards Real-Time, Many Task Applications on Large ...mescal.imag.fr/membres/derrick.kondo/pubs/yi_europar10_pres.pdf · Motivation Demands for computing large-scale real-time(RT)](https://reader034.fdocuments.in/reader034/viewer/2022051917/6009bd6cd3f3c330174c1b42/html5/thumbnails/40.jpg)
Prototype Implementation
Additional information
Compaction of BOINC's data format
Modification of PHP code
Trade-offs between memory usage and WCET
Statically adjustable with parameters
Compatibility with BOINC
The remaining parts are still compatible with BOINC.