Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision*...
Transcript of Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision*...
![Page 1: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/1.jpg)
Efficient Concurrency Supportfor
Computer Vision Applications
Robert LiKamWaLin Zhong
Rice University
Starfish
1
![Page 2: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/2.jpg)
In the year 2020...
2
![Page 3: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/3.jpg)
Where did I place my keys?
Oh, I haven't talked to Fred
recently...
3
Remind me to tell Lin to let me graduate!
Continuous mobile vision
?????
?????
![Page 4: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/4.jpg)
4
Continuous mobile vision
Insufficient concurrency support for vision!
Energy consumption overwhelms wearable battery!
![Page 5: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/5.jpg)
5
Application Capture Service
Vision Headers Vision
Library
![Page 6: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/6.jpg)
6
Application Capture Service
Vision Headers Scale
Face
![Page 7: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/7.jpg)
VisionLibrary
7
VisionLibrary
VisionLibrary
We need efficient concurrency support!
Camera is overworkedComputation is overworked
200+ ms
200+ ms
200+ ms
![Page 8: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/8.jpg)
8
VisionLibrary
VisionLibrary
VisionLibrary
Observations:• Vision apps utilize common vision library
![Page 9: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/9.jpg)
9
Face
SURF
Scale
Scale
Scale
Face
Observations:• Vision apps utilize common vision library• Capture/Computation is redundant across apps
common primitives
![Page 10: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/10.jpg)
10
Face
SURF
Scale
Scale
Scale
Face
Observations:• Vision apps utilize common vision library• Capture/Computation is redundant across apps
common primitives
![Page 11: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/11.jpg)
11
Share computations,Reduce redundancyKey Idea:
Proposal: Split-‐process design for centralized control
Scale
SURF
Face
![Page 12: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/12.jpg)
Starfish CoreService
12
Vision Library
Starfish LibraryUses vision headers
to intercept library calls
Starfish Core ServiceExecutes, tracks, and shareslibrary call computations
Vision LibraryStarfishLibrary
StarfishLibrary
StarfishLibrary
![Page 13: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/13.jpg)
Starfish CoreService
13
Vision Library
Vision LibraryStarfishLibrary
StarfishLibrary
StarfishLibrary
StarfishSplit-‐process library solution for
efficient, transparent multi-‐app service
![Page 14: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/14.jpg)
14
First CallExecute call (20-‐200 ms)
Store results
Subsequent Call(s)Skip execution (0 ms)
Retrieve results
Starfish Core
Vision LibraryExec.
Function CacheStorage/Retrieval
Function Cache Arg Search
faceDetect(img)
faceDetect(img)
Vision LibraryStarfishLibrary
StarfishLibrary
StarfishLibrary
![Page 15: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/15.jpg)
Starfish Core
FunctionCache
15
StarfishShare computations,Reduce redundancy
Cache Search
Vision Library
Vision LibraryStarfishLibrary
StarfishLibrary
StarfishLibrary
![Page 16: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/16.jpg)
16
Timing optimizations:+ Reuse "Fresh Frames” to promote cache hits+ Delay call return to encourage device sleep
Starfish Core
FunctionCache
Cache Search
Vision Library
Vision LibraryStarfishLibrary
StarfishLibrary
StarfishLibrary
![Page 17: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/17.jpg)
Starfish Core
FunctionCache
17
StarfishLibrary
StarfishLibrary
StarfishLibrary
Cache Search
Vision Library
+ Promote sharing by reusing "Fresh Frames”+ Maintain code privacy by delaying call return+ Manage concurrent requests through fine-‐grained locks
Mitigating Split-‐Process Overhead
![Page 18: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/18.jpg)
Starfish Core
18
StarfishLibrary
StarfishLibrary
StarfishLibrary
FunctionCache
Cache Search
Vision Library
Reduce expense of argument passing
Avoid library object modification
![Page 19: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/19.jpg)
Split-‐Process in the Literature
19
Zero-‐copyRequire library redesign
Object trackingOptimized for code offload
• Object Redefinition• Lightweight RPC• SUN RPC
• Cloud Transfer • COMET• MAUI• CloneCloud
![Page 20: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/20.jpg)
Split-‐Process Argument Passing
20
faceDetect( )
Goal: Minimize deep copy overhead
3.5 ms
Shared MemoryStarfish Library Starfish Core
deep copyis expensive
Vision Library
Execution
![Page 21: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/21.jpg)
1) Protected shallow copy
21
Issue copy-‐on-‐write (mprotect) on received objects
faceDetect( )
Shared MemoryStarfish Library Starfish Core
Vision Library
Execution
CoW
CoW
![Page 22: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/22.jpg)
2) Direct output marshalling
22
Write new data directly into shared memory
faceDetect( )
Shared MemoryStarfish Library Starfish Core
Vision Library
Execution
CoW
CoW malloc()
![Page 23: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/23.jpg)
3) Reuse arguments from prior calls
23
Track, reuse previous inputs/outputs
faceDetect( )
Shared MemoryStarfish Library Starfish Core
Vision Library
Execution
CoW
CoW malloc()
Argument Table (img, shm_ptr)
![Page 24: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/24.jpg)
(Usually) Zero-‐Copy Argument Passing
24
+ Reduce deep copies through argument reuse+ Reduce allocation through buffer reuse
faceDetect( )
Shared MemoryStarfish Library Starfish Core
Vision Library
Execution
CoW
CoW malloc()
Argument Table (img, shm_ptr)
![Page 25: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/25.jpg)
Starfish Optimizations• Share Computations• Maintain expectations of developers & users
• Increase sharing efficiency by relaxing timing
• Decrease redundancy• Reduce argument copy• Reuse memory buffers
25
![Page 26: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/26.jpg)
26
Experimental Platform
Benchmarks:1) Per-‐call micro-‐benchmarks2) Multi-‐app benchmarks
OMAP4430 dual-‐core Cortex-‐A9 pinned to 600 MHz
OpenCV + Android + Google Glass Monsoon Power Monitor
![Page 27: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/27.jpg)
27
First CallNative : 21 ms
Unopt. Starfish : 42 ms Starfish : 24ms
Memory optimizations cut Starfish overhead in half
0" 5" 10" 15" 20" 25"
Receive&Outputs&
Prepare&Outputs&
Allocate&Outputs&
Exec.&Function&
Search&Cache&
Send&Inputs&
Prepare&Inputs&
Execution&time&(ms)&
Library'call'execution:'resize()Native&
Starfish&Unopt.&
Starfish&
![Page 28: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/28.jpg)
28
First CallNative : 21 ms
Unopt. Starfish : 42 ms Starfish : 24ms
Second CallNative : 21 ms
Unopt. Starfish : 10 ms Starfish : 6 ms
Starfish works well whennative function execution >> 5 ms
0" 5" 10" 15" 20" 25"
Receive&Outputs&
Prepare&Outputs&
Allocate&Outputs&
Exec.&Function&
Search&Cache&
Send&Inputs&
Prepare&Inputs&
Execution&time&(ms)&
Library'call'execution:'resize()Native&
Starfish&Unopt.&
Starfish&
Memory optimizations cut Starfish overhead in half
![Page 29: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/29.jpg)
Places where Starfish fails• 5-‐6 ms performance overhead per-‐call:– Bad for fast computations– Bad with large, deep arguments
• Non-‐cacheable functions– Random, temporal, external dependencies– Functions with specific parameters
29
But great for many high-‐level functions:Face detect, Corner detect, Image resize
![Page 30: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/30.jpg)
Starfish vs. Multi-‐App Workload
30
Scale FaceDetect
FaceRecog
If Lin then ThatSocial LoggerFacebookGoogle+TwitterMySpaceWhatsapp
...
0.3 FPS
![Page 31: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/31.jpg)
0"0.5"1"
1.5"2"
2.5"3"
3.5"4"
4.5"5"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
Frame&Rate&(FPS)&
Number&of&App&Instances&
Na/ve" Starfish"
When running multiple apps, Starfish achieves higher performance
31
Higher is
better
![Page 32: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/32.jpg)
When running multiple apps,Starfish draws single-‐app power
32
0"
500"
1000"
1500"
2000"
2500"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
Power&Consump-on&
(mW)&
Number&of&App&Instances&
Na.ve" Starfish"
Loweris
better
![Page 33: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/33.jpg)
When running multiple apps,Starfish draws single-‐app power
33
0"
500"
1000"
1500"
2000"
2500"
1" 2" 3" 4" 5" 6" 7" 8" 9" 10"
Power&Consump-on&
(mW)&
Number&of&App&Instances&
Na.ve" Starfish"
Starfish achieves efficient concurrency support!
![Page 34: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/34.jpg)
Starfish Core
FunctionCache
34
StarfishShare computations,Reduce redundancy
Cache Search
Vision Library
Vision LibraryStarfishLibrary
StarfishLibrary
StarfishLibrary
![Page 35: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/35.jpg)
35
Continuingmobile vision
MitigatingAnalog
Signal ChainBandwidth
Preserving
User/Subject
Privacy
DesigningEfficientSystems
![Page 36: Efficient(Concurrency(Supportroblkw.com/likamwa2015starfish-slides.pdf · faceDetect(img) Vision* Library Starfish Starfish Library Starfish Library. Starfish*Core Function Cache](https://reader034.fdocuments.in/reader034/viewer/2022042622/5fa60f0614a37570ba4e6a39/html5/thumbnails/36.jpg)
Starfish: Share computations, Reduce redundancy
36
Transparent, efficient library call caching• Timing optimizations for
frame freshness and performance preservation• Memory optimizations for
minimal deep copy and small cache footprint
Google Glass Experiments:• Low overhead
Memory optimizations slash overhead in half• Reduced power draw
Multi-‐app workloads draw Single app power
•