Rendering Takes Flight
Transcript of Rendering Takes Flight
Housekeeping
• Recording
• Available on-demand approximately 5 minutes after today’s presentation
• Resources
• Questions
• Please rate this webinar
Agenda
• Learn about Moonbot & Taking Flight
• Hear specifics about Moonbot Studios
  – Why the cloud?
  – Goals
  – Details, details, details
  – Result Statistics
• Using Google Cloud Platform for Rendering
• Ensuring Accessibility and Performance
Today’s Speakers
Sara Hebert, Director of Marketing, Moonbot Studios
Jeff Kember, Cloud Solutions Architect, Google Cloud Platform
Aaron Wetherold, Systems Engineer, Avere Systems
Brennan Chapman, Pipeline Supervisor, Moonbot Studios
• FOUNDED: 2009
• LOCATION: Shreveport, Louisiana
• EMPLOYEES: 50+
• AWARDS: 239, including…
  – 1 Oscar
  – 4 Emmys
  – 14 Cannes Lions
  – 12 Clios
  – 5 Webbys
“Top-shelf creative pedigree” – FAST COMPANY
“Storytelling of the future” – LA TIMES
Why we decided to use the cloud
• 3 overlapping projects with a high volume of renders
  – 30 second spot
  – 5 minute short film
  – 11 minute pilot episode
• No space on-site for the required equipment
  – Additional space, power, and networking would be needed for an on-site expansion
• Faster scaling
Our Current Setup
• Moonbot has about 50 people on staff
• 12 lighters across the 3 projects
• Software
  – Maya 2015
  – Arnold
  – Nuke
• Qube Render Management
• Shotgun
Our Current Setup
• Isilon Storage NAS – 100TB
• 32 IBM blades
  – 2x E5450 @ 3GHz
  – 32GB RAM
• 40 Dell FX2 (rentals)
  – 2x E5-2650 v3 @ 2.3GHz
  – 64GB RAM
Arnold Notes
• We found it more cost-effective to run Arnold licenses on the fastest nodes.
• The old IBM blades were 4x slower than the new FX2s.
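The economics behind putting licenses on the fastest nodes can be sketched with a little arithmetic. The throughput and license price below are made-up illustrative numbers, not Moonbot's or Autodesk's figures; the point is that with per-host licensing, a license on a 4x-faster node yields 4x the frames per license-hour:

```python
# Illustrative license arithmetic (all prices and throughputs hypothetical):
frames_per_hour_fx2 = 4.0                        # assumed FX2 throughput
frames_per_hour_blade = frames_per_hour_fx2 / 4  # old blades were 4x slower

license_cost_per_hour = 1.0                      # hypothetical license rate
cost_per_frame_fx2 = license_cost_per_hour / frames_per_hour_fx2
cost_per_frame_blade = license_cost_per_hour / frames_per_hour_blade
# The slower node pays 4x the license cost per finished frame.
```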
Targets We Set
• Overnight renders
• Compute power should scale hourly if needed
• Add up to 2,500 additional cores
• Easy to use for both pipeline and artists
• Need a forecasting system to predict when we need cloud capacity
• Minimal amount of new tools
• Consistency between the setup of on-site render nodes and cloud render nodes
Targets We Set
• Use Qube for render management
• Only use the cloud for Arnold renders; keep Nuke renders on-site
Previous Experiences with Cloud
• Zync for Silent, a project for Dolby Laboratories
  – Rendered a few thousand frames
  – Worked well, but added complexity for artists
How to Handle Storage
• Projects required transfer of about 12TB
• Limited to a 100Mbps connection
• Need to copy assets to the cloud
• Need to copy results back on-site
How to Handle Storage
• Avere presents storage to cloud render nodes the same way on-site render nodes see on-site storage
• Handles all transfers to and from the cloud for assets and rendered images
• Uses a clustered system to spread load across multiple nodes
• Simple to set up
• Mounts via NFS
How to Handle Storage
• Faster and easier to set up: Avere automatically loads only the files required to perform each render
Vs.
• Developing scripts that find and copy all dependencies, then start the render
How to Handle Storage
• We used the read and write cache with a 30-second write-back
• Using the write cache allowed the render nodes to finish faster; the transfer was then completed by the Avere cluster
• ISSUE: No way to get a notification from Avere when the write-back transfers were finished.
Cloud Render Node Configuration
• CentOS
• Use fstab to mount NFS storage from Avere just like on-site nodes mount the NFS storage from the Isilon
• Used Ansible to configure our instance, then saved it as an image in Google Cloud
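A minimal sketch of what such an fstab entry might look like — the hostname, export path, mount point, and options here are assumptions for illustration, not Moonbot's actual configuration:

```
# /etc/fstab (hypothetical hostname and paths)
# Cloud nodes mount the Avere vFXT the same way on-site nodes mount the Isilon:
avere-vfxt:/projects   /mnt/projects   nfs   rw,hard,proto=tcp,vers=3   0 0
```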
Preemptible vs On-Demand Instances
• On-Demand
  – Higher cost
  – Availability guaranteed
• Preemptible instances
  – Lower cost, 1/3 of the price
  – Restarted every 24 hours
  – Availability not guaranteed
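A back-of-envelope comparison shows why preemptible instances can win even when some work is lost to preemption and must be redone. The hourly price and redo fraction below are invented for illustration, not actual GCP rates:

```python
# Illustrative cost comparison (made-up numbers, not GCP pricing).
ON_DEMAND_HOURLY = 1.20                      # assumed $/hr for one node
PREEMPTIBLE_HOURLY = ON_DEMAND_HOURLY / 3    # "1/3 of the price" per the slide

node_hours = 1000          # total node-hours of rendering needed
redo_fraction = 0.10       # assume 10% of preemptible work is lost and redone

on_demand_cost = node_hours * ON_DEMAND_HOURLY
preemptible_cost = node_hours * (1 + redo_fraction) * PREEMPTIBLE_HOURLY
# Even with the redo overhead, the preemptible run costs far less.
```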
Instance Groups
• Google Cloud offers a system for managing pools of instances using Instance Groups and Templates.
• This automatically handles starting and stopping instances.
• When using preemptible instances, they are automatically added/removed based on their availability.
• Using this system we had spin-up times of around 2-3 minutes.
Arnold Notes
• Not all n1-standard-32 nodes are the same
• Haswell gave the best performance, but was only available in certain zones
• Arnold was 20% faster on Haswell vs Ivy Bridge
Qube Integration
• Utilized startup and shutdown scripts from Google to facilitate adding and removing workers in Qube
• Instance Groups create instances with new names every time
• Need to register and unregister workers on startup and shutdown
• Didn’t have to configure much else; everything else was set up just like it is on-site
Forecasting Renders
• We utilized preview frames to estimate render times for the farm
• Preview frames usually include the first, middle and last frames of each shot sent to the farm
• Preview frames have the highest priority on the farm
Forecasting Workflow
• Artists submit their jobs to the farm before they leave.
• Pipeline team waits until preview frames have been rendered
• Run the forecasting tool, which calculates the # of cloud render nodes required to finish the renders by the next morning
• Spin up the required number of nodes on Google Cloud
• Shut down the instances the next morning
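The forecasting step above amounts to a small capacity calculation. The sketch below uses hypothetical names and numbers (`cloud_nodes_needed`, the example throughput figures) and is not Moonbot's actual tool — it just shows the arithmetic of turning preview-frame timings into a node count:

```python
import math

def cloud_nodes_needed(frames_remaining, minutes_per_frame,
                       hours_until_morning, slots_per_node=1,
                       local_slots=0):
    """Estimate how many cloud render nodes to start so every frame
    finishes by morning. minutes_per_frame would come from the preview
    frames (first, middle, last of each shot)."""
    total_minutes = frames_remaining * minutes_per_frame
    # How many frames must render concurrently to finish in time:
    concurrent = total_minutes / (hours_until_morning * 60)
    cloud_slots = max(0.0, concurrent - local_slots)
    return math.ceil(cloud_slots / slots_per_node)

# e.g. 2,000 frames at 45 min each, 12 hours until morning,
# 72 render slots on-site, one Arnold render per cloud node:
print(cloud_nodes_needed(2000, 45, 12, slots_per_node=1, local_slots=72))  # → 53
```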
Tile Rendering
• On submission of jobs to Qube we built support to split frames into tiles.
• Goal with tiles is to decrease the maximum time for a render to about 30 minutes.
• Less risk of losing work on preemptible instances
• Allows frames to finish quicker, especially preview frames
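The tiling idea can be sketched as a grid split at submission time. This is an illustrative helper (`tile_regions` is a name invented here); the real Qube integration and the reassembly of finished tiles are not shown:

```python
def tile_regions(width, height, cols, rows):
    """Split a frame into cols x rows tile regions, returned as
    (x0, y0, x1, y1) inclusive pixel bounds. Each tile can be
    submitted as its own short job, so a preempted instance loses
    at most one tile rather than a whole frame."""
    regions = []
    for r in range(rows):
        for c in range(cols):
            regions.append((c * width // cols,
                            r * height // rows,
                            (c + 1) * width // cols - 1,
                            (r + 1) * height // rows - 1))
    return regions

# A 1920x1080 frame split into a 2x2 grid:
for tile in tile_regions(1920, 1080, 2, 2):
    print(tile)
```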
Stats
• Ping time to Google Cloud’s Iowa data center: 35ms
• Ingress: 12TB
• Egress: 2TB
• 60% of renders were done on Google Cloud
• 4,000 render jobs total
• 3,000 Arnold jobs
• ~1,800 were completed on Google Cloud
Results
• We met our deadlines!
• Worked really well once we got the VPN connection figured out
• Will save a lot of time planning and budgeting
• Allows the render farm to work around the schedule instead of the schedule around the farm
• Saves time managing local hardware
• Allows us to stay nimble as a small company; don’t have to invest large capital into a render farm
Google Cloud Platform for Rendering & Animation
Jeff Kember, Cloud Solutions Architect, Google Cloud Platform
Render Cloud Bursting Use Case
Customer Requirements
• Overflow render capacity into Google Cloud Platform
• No copying of data back and forth
• Avoid deploying additional hardware
• No application rewrite
• Provide flexibility to adjust to project demands
Avere Solutions
• Deploy the scalable vFXT cluster within GCE
• Access and accelerate existing on-prem NAS and GCS storage while hiding access latency
• Dynamically tier the active working set into SSD and DRAM
• Standard NFS and SMB client access
• Ease of deployment, management and expansion
The Challenge
[Diagram: on-prem compute (render farm, artist workstations) with NAS storage, alongside a virtual render farm in Google Compute Engine and Google Cloud Storage]
The Avere vFXT Solution
[Diagram: the same environment with a Virtual FXT cluster in Google Compute Engine bridging the virtual render farm to the on-prem NAS and Google Cloud Storage]
Avere Deployment Flexibility
[Diagram: Virtual FXT in Google Compute Engine and Physical FXT on-prem, fronting NAS and object storage for both the on-prem render farm and the virtual render farm]
Avere Product Line

|           | n1-highmem-8 (virtual) | n1-highmem-32 (virtual) | FXT 3200 | FXT 3850 | FXT 4850 |
|-----------|------------------------|-------------------------|----------|----------|----------|
| DRAM (GB) | 52                     | 208                     | 96       | 288      | 288      |
| SSD (TB)  | 1 (persistent) or 1.5 (local) | 4 (persistent)   | -        | 0.8      | 4.8      |
| SAS (TB)  | -                      | -                       | 4.8      | 7.8      | -        |
| Network   | 10GbE                  | 10GbE                   | 2x10GbE, 6x1GbE | 2x10GbE, 6x1GbE | 2x10GbE, 6x1GbE |

Protocols
• To client: NFSv3 (TCP/UDP), CIFS (SMB 1.0 & 2.0)
• To core filer: NFSv3 (TCP), S3 API

Clustering
• Cluster from 3 to 50 FXT nodes for performance and capacity scaling
• HA failover, mirrored writes, redundant network ports & power

Management
• GUI, analytics, email alerts, SNMP, XML-RPC interface, policy-based management

Virtual FXT
• Persistent and Local SSD support
• Standard, DRA and Nearline GCS support
• Per-minute billing
Next Steps
• Ask questions
• Review the attachments section for relevant resources
• Rate this webinar