SVE: Distributed Video Processing at Facebook Scale
![Page 1: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/1.jpg)
SVE: Distributed Video Processing at Facebook Scale
Qi Huang, Petchean Ang, Peter Knowles, Tomasz Nykiel, Iaroslav Tverdokhlib, Amit Yajurvedi, Paul Dapolito IV, Xifan Yan, Maxim Bykov, Chuen Liang, Mohit Talwar, Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd
Facebook, University of Southern California, Cornell, Princeton
Presentation by Jonas Umland
![Page 2: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/2.jpg)
Introduction
● Every day:
  ○ 8B video views
  ○ 500M users watch 100M hours of video
  ○ Many tens of millions of uploads
![Page 3: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/3.jpg)
Overview
● Legacy design (MES) vs new design (SVE)
● Performance comparison
● DAG execution system
● Overload control
● Production lessons
![Page 4: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/4.jpg)
Full Video Pipeline
![Page 5: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/5.jpg)
Production Video Applications
Tasks: 153, 22, 18, >1000
![Page 6: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/6.jpg)
Monolithic Encoding Script
![Page 7: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/7.jpg)
Design Goals for a New Engine
● Fast
● Robust
● Flexible
![Page 8: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/8.jpg)
SVE Architecture Overview
![Page 9: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/9.jpg)
SVE Architecture - Preprocessor
● Validation
● Splitting video into chunks for old clients
● DAG generation
● Storing input video
● Caching
![Page 10: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/10.jpg)
SVE Architecture - Scheduler & Workers
● Scheduler
  ○ Receiving DAG from preprocessor
  ○ Scheduling tasks
  ○ Queuing tasks when no worker is available (high- and low-priority queues)
● Worker
  ○ Executing task
  ○ Fetching data from preprocessor or intermediate storage
  ○ Writing to intermediate storage
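The queuing behaviour on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SVE's scheduler: the class name, method names, and the rule that a freed worker always drains the high-priority queue first are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """Toy scheduler: dispatch to an idle worker, otherwise queue by priority."""

    idle_workers: deque = field(default_factory=deque)
    high: deque = field(default_factory=deque)  # tasks queued with high priority
    low: deque = field(default_factory=deque)   # tasks queued with low priority
    assignments: list = field(default_factory=list)  # (worker, task) log

    def submit(self, task: str, high_priority: bool) -> None:
        # Run immediately if a worker is free; queue only when none is.
        if self.idle_workers:
            self.assignments.append((self.idle_workers.popleft(), task))
        else:
            (self.high if high_priority else self.low).append(task)

    def worker_idle(self, worker: str) -> None:
        # Assumption: a freed worker serves the high-priority queue first.
        if self.high:
            self.assignments.append((worker, self.high.popleft()))
        elif self.low:
            self.assignments.append((worker, self.low.popleft()))
        else:
            self.idle_workers.append(worker)
```

With a single worker, a low-priority task submitted before a high-priority one still runs after it, because both were queued while the worker was busy.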
![Page 11: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/11.jpg)
SVE Architecture Overview - Intermediate Storage
● Caching of application metadata
● Caching of video/audio data
● Storing DAG state
● Automatically freeing data
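The slide does not say how intermediate data is freed automatically. One common approach is a time-to-live on each entry; the sketch below assumes that approach purely for illustration. The class name, the TTL parameter, and the injectable clock are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class IntermediateStore:
    """Toy key-value store whose entries expire after a fixed TTL (assumption)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store the value together with its expiry time.
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[bytes]:
        self._evict_expired()
        entry = self._items.get(key)
        return entry[1] if entry else None

    def _evict_expired(self) -> None:
        # Drop every entry whose expiry time has passed.
        now = self._clock()
        for key in [k for k, (expiry, _) in self._items.items() if expiry <= now]:
            del self._items[key]
```

Workers never delete anything explicitly in this sketch; data simply disappears once its TTL has passed.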
![Page 12: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/12.jpg)
Overlap Upload and Encoding
![Page 13: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/13.jpg)
Overlap Upload and Encoding
![Page 14: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/14.jpg)
Parallel Processing
![Page 15: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/15.jpg)
Parallel Processing
![Page 16: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/16.jpg)
Video Sync (Durably Storing)
![Page 17: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/17.jpg)
Video Sync (Durably Storing)
![Page 18: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/18.jpg)
Overall latency improvement
![Page 19: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/19.jpg)
DAG Execution System
![Page 20: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/20.jpg)
Dynamic DAG Generation
● Processing tasks depend on video properties
● Enables performance testing
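To illustrate what "processing tasks depend on video properties" can mean, here is a minimal sketch that builds a different task graph for each upload. The task names, the resolution ladder, and the `build_dag` function are hypothetical and are not taken from SVE; only the idea of generating the DAG per video comes from the slide.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, Set


def build_dag(height: int, has_audio: bool) -> Dict[str, Set[str]]:
    """Return a task graph as {task: set of tasks it depends on}."""
    dag: Dict[str, Set[str]] = {"split": set(), "thumbnail": {"split"}}

    # Hypothetical encoding ladder: only resolutions up to the source height.
    ladder = [r for r in (240, 480, 720, 1080) if r <= height] or [240]
    outputs = set()
    for res in ladder:
        name = f"encode_{res}p"
        dag[name] = {"split"}
        outputs.add(name)

    # Audio tasks exist only when the upload actually has an audio track.
    if has_audio:
        dag["encode_audio"] = {"split"}
        outputs.add("encode_audio")

    dag["combine"] = outputs  # final step waits for every encoding
    return dag


if __name__ == "__main__":
    dag = build_dag(height=720, has_audio=True)
    print(list(TopologicalSorter(dag).static_order()))
```

A silent 480p upload yields a graph with two video encodings and no audio task, while a 1080p upload with sound yields four video encodings plus an audio encoding.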
![Page 21: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/21.jpg)
Fault Tolerance Strategies
| Component | Strategy |
|---|---|
| Client device | Anticipate intermittent uploads |
| Front-end | Replicate state externally |
| Preprocessor | Replicate state externally |
| Scheduler | Synchronously replicate state externally |
| Worker | Replicate in time |
| Task | Many retries |
| Storage | Replicate on multiple disks |
![Page 22: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/22.jpg)
Retry Tasks After Recoverable Error
| | Success rate |
|---|---|
| First try | 99.788% |
| 2 local retries | 99.795% |
| 1 retry on different worker | 99.901% |
| 6 retries on different workers | 99.995% |
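One way to read the retry stages above is as a single escalating policy: try once, retry locally, then move to other workers. The sketch below assumes that reading. `RecoverableError`, `run_with_retries`, and the way alternative workers are chosen are hypothetical names and choices made for illustration only.

```python
from typing import Callable, Sequence


class RecoverableError(Exception):
    """Stand-in for any error that is worth retrying."""


def run_with_retries(task: Callable[[str], str], workers: Sequence[str],
                     local_retries: int = 2, remote_retries: int = 6) -> str:
    """Run task on workers[0]; retry locally, then on different workers."""
    if not workers:
        raise ValueError("at least one worker is required")

    last_error = None
    # First try plus local retries, all on the same worker.
    for _ in range(1 + local_retries):
        try:
            return task(workers[0])
        except RecoverableError as err:
            last_error = err

    # Retries on different workers, cycling through the alternatives.
    others = list(workers[1:]) or list(workers)
    for attempt in range(remote_retries):
        try:
            return task(others[attempt % len(others)])
        except RecoverableError as err:
            last_error = err

    raise last_error
```

The success rates on the slide imply that the failure rate falls from 0.212% on the first try to 0.005% after six retries on different workers, and that most of the gain comes from changing workers rather than from retrying locally.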
![Page 23: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/23.jpg)
Failure of 20% of Preprocessors in a Region
![Page 24: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/24.jpg)
Gradual Failure of 5% of Workers in a Region
![Page 25: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/25.jpg)
Mitigate overload
1) Delay latency-insensitive tasks
2) Delay latency-sensitive tasks and notify an engineer
3) Redirect a portion of video uploads to a different region
4) Delay video processing
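The four steps form an escalation ladder: each one is applied only when the earlier ones are not enough. The sketch below shows one way to express that. The load metric (demand divided by capacity in a region) and every threshold value are hypothetical; the slide gives only the order of the steps.

```python
from enum import IntEnum
from typing import List


class Mitigation(IntEnum):
    """The four escalation steps, numbered as on the slide."""

    DELAY_LATENCY_INSENSITIVE_TASKS = 1
    DELAY_LATENCY_SENSITIVE_TASKS_AND_NOTIFY = 2
    REDIRECT_UPLOADS_TO_OTHER_REGION = 3
    DELAY_VIDEO_PROCESSING = 4


# Hypothetical trigger points, expressed as demand / capacity for one region.
THRESHOLDS = {
    Mitigation.DELAY_LATENCY_INSENSITIVE_TASKS: 1.0,
    Mitigation.DELAY_LATENCY_SENSITIVE_TASKS_AND_NOTIFY: 1.2,
    Mitigation.REDIRECT_UPLOADS_TO_OTHER_REGION: 1.5,
    Mitigation.DELAY_VIDEO_PROCESSING: 2.0,
}


def active_mitigations(load: float) -> List[Mitigation]:
    """Return every step whose threshold is reached, in escalation order."""
    return [step for step in Mitigation if load >= THRESHOLDS[step]]
```

Because the steps are cumulative, a region at 1.3 times capacity in this sketch delays both latency-insensitive and latency-sensitive tasks but does not yet redirect uploads.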
![Page 26: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/26.jpg)
Overload Control in Practice
![Page 27: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/27.jpg)
Production Lessons
● Mismatch for livestreaming
● Failures from global inconsistency
● Failures from regional inconsistency
● Continuous sandboxing
![Page 28: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/28.jpg)
Summary
● Three additional sources of parallelism to improve latency
● DAG execution system
● Robust to overload and faults
● Large-scale production insights
![Page 29: SVE: Distributed Video Processing at Yajurvedi, Paul ...iwanicki/courses/ds/2019/... · Abhishek Mathur, Sachin Kulkarni, Matthew Burke and Wyatt Lloyd Facebook, University of Southern](https://reader033.fdocuments.in/reader033/viewer/2022050510/5f9b50ff5691142a035b640e/html5/thumbnails/29.jpg)
Sources
Most images are extracted from the paper:
https://www.cs.princeton.edu/~wlloyd/papers/sve-sosp17.pdf
and from Qi Huang's talk:
www.qhuangcs.com/slides/sosp_sve.pptx