Copyright © 2008 Altair Engineering, Inc. All rights reserved.

Green Provisioning™ with PBS GridWorks: A Success Story

Bill Nitzberg, Ph. D.

CTO, PBS GridWorks

Altair Engineering, Inc.

Shajy Thomas

Head Of Technology

Crest Animation Studios Ltd.


One Slide Summary™

Only machines running jobs use power

• Shutdown service
  • Acts when load average stays below the threshold for a period of time
• Wakeup service
  • Acts when enough work is queued (waiting jobs)
  • Issues Wake-on-LAN to subsets; throttled to avoid circuit overload

At Crest:

Green Provisioning: ~20% operational savings (in power)

Overall ROI: ~6 months to break-even with GridWorks
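The shutdown and wakeup services summarized on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the PBS GridWorks implementation: the function names, the threshold and batch parameters, and the choice of UDP broadcast on port 9 are all assumptions. The magic-packet layout (6 bytes of 0xFF followed by the MAC address repeated 16 times) is the standard Wake-on-LAN format.

```python
import socket
import time


def should_shut_down(load_samples, threshold):
    """True when every load sample in the observation window is below threshold."""
    return bool(load_samples) and all(s < threshold for s in load_samples)


def should_wake(queued_jobs, min_queued):
    """True when enough jobs are waiting to justify powering nodes on."""
    return queued_jobs >= min_queued


def magic_packet(mac):
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC 16 times."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return b"\xff" * 6 + raw * 16


def batches(items, size):
    """Split items into consecutive subsets of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def broadcast(packet, port=9):
    """Send one packet as a UDP broadcast (port 9 is an assumed, common choice)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(packet, ("255.255.255.255", port))


def wake_throttled(macs, batch_size, pause_s, send=None, sleep=time.sleep):
    """Wake nodes one subset at a time, pausing between subsets so that
    many machines do not draw start-up current on a circuit at once."""
    if send is None:
        send = broadcast
    groups = batches(macs, batch_size)
    for n, group in enumerate(groups):
        for mac in group:
            send(magic_packet(mac))
        if n < len(groups) - 1:  # no pause needed after the last subset
            sleep(pause_s)
    return len(groups)
```

Passing `send` and `sleep` as parameters keeps the throttling logic testable without touching the network; a real deployment would also need per-node MAC inventory and circuit-aware batch sizes, which are outside this sketch.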


Agenda

• Altair
• GridWorks
• Crest
• Discussion


About

A global software and technology company focused on:

Product Development

Advanced Computing

Enterprise Analytics


Innovation Milestones

[Timeline chart with milestones at ‘85, ‘89, ‘94, ‘96, ‘99, ‘03, ‘04, ‘05, ‘06, ‘07; chart label: $100M]

Founded in Michigan in 1985

Today:

• 1400 employees
• Operations in 17 countries
• Revenues of $140M


Global Offices

Seattle, USA

Los Angeles, USA

Austin, USA

Dallas, USA

Mexico City, Mexico

Toronto, Canada

Windsor, Canada

Detroit, USA

Boston, USA

Milwaukee, USA

Atlanta, USA

Sao Paulo, Brazil

Lund, Sweden

Gothenburg, Sweden

Coventry, UK

Manchester, UK

Boeblingen, Germany

Cologne, Germany

Hanover, Germany

Munich, Germany

Paris, France

Sophia Antipolis, France

Torino, Italy

Milan, Italy

Moscow, Russia

Delhi, India

Pune, India

Chennai, India

Hyderabad, India

Bangalore, India

Kuala Lumpur, Malaysia

Beijing, China

Shanghai, China

Tokyo, Japan

Osaka, Japan

Nagoya, Japan

Seoul, Korea

Melbourne, Australia


Customers

3,800+ Customers Worldwide

Automotive, Aerospace, Heavy Equipment, Government, Life/Earth Sciences, Consumer Goods, Oil & Gas



PBS GridWorks

• Easy to use

• Hard to break

• Do more with less

• Keep track & plan

• Open architecture


Easy to Use: Portals Increase Productivity

“I push enter and my jobs run; the results come back when they’re done.”

--Major Automotive OEM


Hard to Break: Business Continuity & Risk Management

SMP Jobs

MPP Jobs

Works with 1,000s of users, 10,000s of CPUs, and 1,000,000s of jobs


Do More with Less: Maximize Value

[Figure: placement of existing jobs. Panel labels: Throughput, Turn-around. Job types: Exclusive, MPI non-exclusive, MPI exclusive. Roles: User, CEO, Admin.]

Service Levels (SLAs)

Access

Priority

Fairshare

Preemption

Reservation


Keep Track & Plan: Optimize ROI


Open Architecture: Best-in-Class Technology

• Commitment to standards
  • POSIX Batch Standard, key participant
  • MPI-2 standard, editor
  • Grid Forum, Board of Directors
  • OGSA HPC Basic Profile, contributor

• Extensible architecture
  • Unified job and resource architecture
  • Interfaces for 3rd party integrations (API, AIF, Hooks)

Since 1999

Concepts for Rendering

Presented by:

Shajy Thomas

Head Of Technology – Crest Animation Studios Ltd.

9th Sep 2008

What is Green?

• Green = Efficient
  – Efficient operations minimize input costs
    • Electricity, water, capital
    • Power and cooling, the critical component of TCO evaluation

• Green should be good for business
  – Reduced operational costs impact the business bottom line
  – Efficient capital deployment is a competitive advantage

• Green should not
  – Impact reliability and availability
  – Cost more than it saves

Datacenter Facts and Challenges

For every rupee spent on a piece of hardware in a data center, another 50 paisa is spent on energy to power it

For every kW of power consumed by a server, roughly another kW or more must go toward cooling it

Energy footprint has grown more than 10x in 10 years: from 300 watts/sq ft to 4,000 watts/sq ft

Per the Uptime estimate, by 2012 the cost of power and cooling will be 22 times the initial purchase cost, compared with 2.5 times today
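The rules of thumb on this slide reduce to simple arithmetic. The sketch below works them through in Python; the function names and the example inputs (a 100 kW server load, a hardware budget of 1,000 rupees) are illustrative assumptions, and the ratios are the slide's own rough figures rather than measured values. Because the slide says "another kW or more" for cooling, a cooling ratio of 1.0 is a lower bound.

```python
def facility_power_kw(server_kw, cooling_ratio=1.0):
    """Total draw when each kW of server load needs `cooling_ratio` kW of cooling.

    cooling_ratio=1.0 encodes "roughly another kW" from the slide; real
    facilities may need more.
    """
    return server_kw * (1.0 + cooling_ratio)


def energy_spend(hardware_cost, energy_per_unit=0.5):
    """Energy cost implied by "another 50 paisa per rupee of hardware"."""
    return hardware_cost * energy_per_unit


def growth_factor(before, after):
    """Ratio between two power densities, e.g. watts per square foot."""
    return after / before


# 100 kW of servers implies at least 200 kW at the facility level.
total_kw = facility_power_kw(100)

# 1,000 rupees of hardware implies about 500 rupees of energy spend.
energy_rupees = energy_spend(1000)

# The quoted densities work out to roughly a 13x increase.
density_growth = round(growth_factor(300, 4000), 1)
```

Under these assumptions, every server that is powered off saves at least twice its own draw at the facility level, which is why shutting down idle nodes has an outsized effect on the power bill.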

Solution

• Adaptive infrastructure, smart data centre design
• Power and cooling resource management
• Eco-friendly hardware (servers/storage/network)
• Proper cabling
• Automated tools, management software, virtualization
• Future planning and scalability

CGI Work Flow

• Modelling
• Texturing
• Rigging
• Background (BG)
• Animation
• Lighting
• Rendering
• Compositing

Rendering Concept

• Rendering is the process of generating an image from a model by means of a software program. The model is a description of three-dimensional objects in a defined language or data structure.

• During rendering, the render engine computes details such as ray-tracing settings, resolution settings, textures, and lights from the scene description.

• Rendering utilizes 98-100% of compute-node resources such as CPU and memory.

• Rendering time depends on the number and complexity of the elements in the scene.

Need of HPC for Rendering

• Rendering on local machines costs artists productivity.

• Commonly available network-distributed rendering software creates many I/O and network bottlenecks.

• As the number of render scenes increases, more compute power is required; HPC is the ideal solution.

• HPC-based render farms let the setup scale horizontally and vertically.

• High efficiency, stability, and performance.

eRender/PBS Pro: Focus on Green IT

• Crest and Altair join hands to develop a global rendering product.

• Consolidated design: All compute nodes access the same central storage and authenticate against a single LDAP server. This prevents data duplication and provides single-point authentication and reporting.

• Smart scheduler: The scheduling engine takes control of job submission, prioritization, management, etc.

• Server pooling: All compute nodes are in a single cluster. Nodes are allocated per project. Based on availability, idle nodes can be dynamically assigned to the active pool if required.

• Power on demand: If there are no jobs in the queue, the servers are shut down automatically. When a new job is submitted, the master node calculates the total number of nodes required to compute the job and switches on only that many nodes (based on the frames per core). This saves considerable power and cooling.

• Cycle stealing: Based on availability, the CPUs of idle Windows workstations can be dynamically assigned to the active pool for rendering whenever required.
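The "power on demand" calculation can be illustrated with a short Python sketch. The slide states only that the node count is derived from the frames per core; the function name, its parameters, and the ceiling-division sizing rule below are assumptions made for illustration, not the eRender algorithm.

```python
import math


def nodes_to_power_on(frames, frames_per_core, cores_per_node,
                      nodes_running=0, nodes_available=None):
    """Estimate how many additional nodes to switch on for a render job.

    frames:           number of frames in the submitted job
    frames_per_core:  frames each core is expected to handle (assumed input)
    cores_per_node:   cores in one compute node
    nodes_running:    nodes already powered on and free for this job
    nodes_available:  powered-off nodes that could be woken (None = no cap)
    """
    if frames <= 0:
        return 0
    cores_needed = math.ceil(frames / frames_per_core)
    nodes_needed = math.ceil(cores_needed / cores_per_node)
    extra = max(0, nodes_needed - nodes_running)
    if nodes_available is not None:
        extra = min(extra, nodes_available)
    return extra


# A 240-frame job at 5 frames per core on 8-core nodes needs 48 cores, i.e. 6 nodes.
example = nodes_to_power_on(240, frames_per_core=5, cores_per_node=8)
```

Rounding up at both steps means the job never gets fewer cores than the frames-per-core target implies, at the cost of occasionally waking one partly used node.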

Features

• Virtualization: The complete grid acts like a supercomputer. The master node makes sure no node is underutilized or idle while there are jobs in the queue.

• Pre-file analysis: Before a compute node receives the job, the master node analyzes the prerequisites of the scene to be rendered.

• Prechecks: The master node checks the scene to be rendered (references, textures, etc.) before it commands the compute nodes.

• Self-healing: Some common errors, such as memory segmentation, NFS, and I/O errors, are resolved automatically by scripts.
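The precheck and self-healing features can be pictured with the following Python sketch. It is hypothetical throughout: the slides do not describe how eRender implements either feature, so the function names, the file-existence check, and the retry-with-remedy loop are illustrative assumptions only.

```python
import os


def precheck(asset_paths, exists=os.path.exists):
    """Check that every file a scene references (textures, references, ...)
    is present before any compute node is asked to render it.

    Returns (ok, missing), where `missing` lists the paths not found.
    """
    missing = [p for p in asset_paths if not exists(p)]
    return (not missing, missing)


def run_with_self_healing(task, remedies, max_attempts=3):
    """Run `task`; on a known error type, apply its remedy and retry.

    remedies: dict mapping an exception type to a zero-argument fix function
              (for example, a hypothetical NFS remount script).
    Unknown errors, or a known error on the last attempt, propagate.
    """
    handled = tuple(remedies)
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except handled as err:
            if attempt == max_attempts:
                raise
            for exc_type, fix in remedies.items():
                if isinstance(err, exc_type):
                    fix()
                    break
```

Running the precheck on the master node keeps a job with a missing texture from waking nodes at all, which fits the power-on-demand goal; the retry loop is bounded so that a fault the scripts cannot fix still surfaces to an operator.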


Green Provisioning™

Current scenario: racks are ON all the time (wasted power)

Power management feature shuts down the servers that are not utilized.

When a user submits jobs, and again when more jobs are submitted, only the required resources are powered ON.

Operational savings: 20%

eRender-Control Architecture

[Architecture diagram. Component labels: Internet, Remote Access Monitoring, Web Serving Layer, eRender Login, Process Layer, R-PRO, End User GUI, ADS Auth, Render Engines, Giga Network, SAN Storage. Connection labels: NFS Request & Mounts, O/P files to Storage.]

HPC: Facts

• Huge and complex problems
• Increasing time pressure and deadlines
• Resource negotiations with team members
• Fluctuating utilization, too many jobs running on the same machine
• Constant change of hardware and solvers

ROI

• Optimal utilization of compute resources (from around 25% to 30%+)

• High availability

• Improved IT service levels for users

• Helps allocate resources to “starved” jobs

• Dynamic license check

• Project timelines were reduced

• Efficiency and quality improved

• The number of retakes and re-renders was reduced

• Delivery output increased

Thank You

Shajy Thomas

www.crestindia.com

Bill Nitzberg

www.altair.com