1 © Bull, 2013 BTSA Feb. 2013 February, 2013 Sizing & TCO for bullion.

Page 1:

February, 2013

Sizing & TCO for bullion

Page 2:

Agenda

Sizing:
– Methodology
– Bullion performance numbers
– Consolidation (scale-out vs. scale-up)
– Excel tool

TCO:
– Examples
– Excel tool

Page 3:

Input data

Inventory of the physical servers and VMs to be replaced by bullions => makes it possible to find the SPECint*rate performance:
– Performance requirements (SPECint*rate, SAPS, …)
– Physical servers: number of cores, GHz, sockets, RAM size
– Number of VMs and possible VM consolidation
– ESXi over-commitment ratio for CPU and memory: usually from 1 to 5, according to the performance expected for the applications
– I/Os: bandwidth for vMotion, for VMs, for H.A.
– High Availability: number of nodes in the cluster
– DRS (2 sites): yes or no; 1 or 2 clusters (synchronous/asynchronous)

Page 4:

bullion perf. (1) ratio within E7-4800 series

Processor  Cores  Clock     Power  H.T.  Ratio (2)  4 sock.   8 sock.   12 sock.  16 sock.  Perf/price
E7-4807    6      1.86 GHz  95 W   yes   0.461      517       974       1430      1896      2.241
E7-4820    8      2.00 GHz  105 W  yes   0.672      753       1419      2084      2763      2.286
E7-4830    8      2.13 GHz  105 W  yes   0.727      814       1533      2253      2987      1.594
E7-8837    8      2.66 GHz  130 W  no    0.741      829       1563      2296      3044      1.679
E7-4850    10     2.00 GHz  130 W  yes   0.853      955       1800 (3)  2645      3506      1.45
E7-8867L   10     2.13 GHz  105 W  yes   0.887      994       1872      2750      3646      0.943
E7-4860    10     2.26 GHz  130 W  yes   0.940      1052      1983      2913      3862      1.102
E7-4870    10     2.40 GHz  130 W  yes   1          1120 (3)  2110 (3)  3100 (4)  4110 (3)  1

Ratio: perf ratio E7-xxxx/E7-4870
4/8/12/16 sock.: perf of a 4-, 8-, 12- or 16-socket bullion
Perf/price: ratio perf/price for a 4-socket bullion

(1) SPECint*rate_base2006
(2) Intel reference
(3) Published on spec.org
(4) estimated

E7-4820 is the best ratio perf/price in 8 cores
E7-4850 is the best ratio perf/price in 10 cores
E7-4870 is the best perf
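
The 4-socket figures can be approximated by scaling the published E7-4870 result by the Intel reference ratio. The Python sketch below assumes plain multiplication and rounding; the name `estimate_perf` is illustrative and not part of any Bull tool. Because the ratios are rounded to three decimals, the estimates may differ from the slide by about one point.

```python
# Published 4-socket E7-4870 result (SPECint*rate_base2006) from the slide.
BASELINE_4S_E7_4870 = 1120

# Intel reference ratios E7-xxxx / E7-4870, copied from the slide.
RATIOS = {
    "E7-4807": 0.461,
    "E7-4820": 0.672,
    "E7-4830": 0.727,
    "E7-8837": 0.741,
    "E7-4850": 0.853,
    "E7-8867L": 0.887,
    "E7-4860": 0.940,
    "E7-4870": 1.0,
}


def estimate_perf(processor: str, baseline: float = BASELINE_4S_E7_4870) -> int:
    """Estimate performance by scaling the E7-4870 baseline (assumed method)."""
    return round(RATIOS[processor] * baseline)


if __name__ == "__main__":
    for name in RATIOS:
        print(f"{name:9s} ~{estimate_perf(name)} SPECint*rate (4 sockets)")
```

Passing a different `baseline` (for example the published 8-socket or 16-socket E7-4870 result) gives a rough estimate for larger configurations, again only as an approximation.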

Page 5:

CPU perf. (native Linux SPECint®_rate2006 with E7-4870)

[Chart: SPECint*rate versus number of modules. 1 module: 1120; 2 modules: 2110; 4 modules: 4110]

Perfect linearity. Scalability ~x4 (x3.67)

Page 6:

SPECint*rate

Benchmark Hardware Vendor System Result Baseline # Cores # Chips

CINT2006rate Bull SAS bullion E7-4870 (160 cores - 4TB RAM) 4110 3890 160 16

CINT2006rate Hewlett-Packard Company ProLiant DL980 G7 (2.4 GHz, Intel Xeon E7-4870) 2180 2070 80 8

CINT2006rate Bull SAS bullion E7-4870 (80 cores - 2TB RAM) 2110 2000 80 8

CINT2006rate HITACHI BladeSymphony BS2000 (Intel Xeon E7-8870) 1920 1790 80 8

CINT2006rate HITACHI Compute Blade 2000 (Intel Xeon E7-8870) 1920 1790 80 8

CINT2006rate Unisys Corporation Unisys ES7000 Model 7600R G3 (Intel Xeon E7-8870) 1910 1780 80 8

CINT2006rate NEC Corporation Express5800/A1080a-E (Intel Xeon E7-8870) 1900 1790 80 8

CINT2006rate Fujitsu PRIMEQUEST 1800E2, Intel Xeon E7-8870, 2.40 GHz 1890 1770 80 8

CINT2006rate Fujitsu PRIMERGY RX900 S2, Intel Xeon E7-8870, 2.40 GHz 1890 1770 80 8

CINT2006rate IBM Corporation IBM System x 3850 X5 (Intel Xeon E7-8870) 0 1770 80 8

BCS or BCS-like

Glueless 8 sockets

Page 7:

Intel Xeon Processor E5 and E7 performance comparison

Intel Xeon E7-4800 series ideal for data-demanding application performance
Intel Xeon E5-4600 for HPC

Bullion 4-sockets

Page 8:

On-Line Transaction Processing (OLTP) perf.

bullion E7-4870 with VMware

Sockets     tpmC (estimation)  tpsE (estimation)
4 sockets   ~ 2,800,000        ~ 2,700
8 sockets   ~ 5,500,000        ~ 4,600
12 sockets  ~ 7,500,000        ~ 7,100
16 sockets  ~ 10,000,000       ~ 9,500

Page 9:

Virtualization perf.: SPECvirt benchmark

bullion with VMware   SPECvirt_sc2010
4 sockets (X7560)     2721@168 (28 tiles) (1)
8 sockets (E7-4870)   8287@512 (85 tiles) (2)
12 sockets            N/A (3)
16 sockets            N/A (3)

(1) published in February 2011 with 512GB, 32 cores & ESXi 4.1

(2) estimation

(3) ESXi V5 is limited to 512 VMs and 160 logical CPUs

Page 10:

ERP performance

bullion with VMware    SAPS
4 sockets (X7560)      41 420 (1)
8 sockets (E7-4870)    100 000 (2)
12 sockets (E7-4870)   175 000 (2)
16 sockets (E7-4870)   250 000 (2)

(1) published in May 2010 with 128 GB in a 2-tier SD architecture

(2) estimation in a 3-tier SD architecture

Page 11:

CPU load & VMs: comparison scale-out/scale-up

Same number of cores (320) on both sides. Legend: one box = 8 vCPUs.

scale-out (2-socket x 8-core servers)
20x 16-core servers => 20 ESXi
In each server:
– 2 VMs of 8 vCPUs => no vCPU left
– 1 VM of 16 vCPUs
– 0 VM of 32 vCPUs
• VMs limited to 16 vCPUs
• Load peaks => servers are 100% full
• vMotion impossible

scale-up (16-socket x 10-core bullions)
2x 160-core bullions => 2 ESXi (32 free vCPUs in each bullion)
In each server:
– 16 VMs of 8 vCPUs
– 8 VMs of 16 vCPUs => 32 vCPUs left
– 4 VMs of 32 vCPUs
– 2 VMs of 64 vCPUs
• No limitation on the VM size
• Load peak: fully managed (without vMotion)
• vMotion possible for big VMs
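
The VM counts on this slide follow from integer division of the available cores by the VM size, with no CPU over-commitment. The sketch below reproduces them; the function `vms_per_host` and its `reserved_vcpus` parameter are illustrative names, and the 32 reserved vCPUs per bullion are taken from the "32 free vCPUs" labels on the slide.

```python
def vms_per_host(host_cores: int, vm_vcpus: int, reserved_vcpus: int = 0) -> int:
    """Number of VMs of a given size that fit on one host, one core per vCPU."""
    usable = host_cores - reserved_vcpus
    return max(usable // vm_vcpus, 0)


if __name__ == "__main__":
    for size in (8, 16, 32, 64):
        small = vms_per_host(16, size)                       # 2-socket x 8-core server
        big = vms_per_host(160, size, reserved_vcpus=32)     # 16-socket x 10-core bullion
        print(f"{size:>2} vCPUs per VM: 16-core server = {small}, 160-core bullion = {big}")
```

A 16-core server cannot host any VM larger than 16 vCPUs, whereas a 160-core bullion still fits two 64-vCPU VMs while keeping 32 vCPUs free.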

Page 12:

CPU load & VMs: comparison scale-out/scale-up

Legend: one box = 8 vCPUs.

scale-out (2-socket x 8-core servers)
32x 16-core servers => 32 ESXi
• 256 cores used
• 512 cores paid

scale-up (16-socket x 10-core bullions)
2x 160-core bullions => 2 ESXi (32 free vCPUs in each bullion)
• 256 cores used
• 320 cores paid

HW investment: -37.5%

Page 13:

CPU load & VMs: comparison scale-out/scale-up (with VMware HA)

Legend: one box = 8 vCPUs.

scale-out (2-socket x 8-core servers)
32x 16-core servers => 32 ESXi
• 256 cores used
• 512 cores paid

scale-up (8-socket x 10-core bullions)
5x 80-core bullions => 5 ESXi
• 320 cores used
• 400 cores paid

HW investment: -22%
Performance: +25%
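
The percentages quoted in the two scale-out/scale-up comparisons can be checked with simple ratios. The sketch below assumes that hardware investment is proportional to the number of cores paid and that performance is proportional to the number of cores used; both are simplifying assumptions, and the function names are illustrative.

```python
def hw_delta(cores_paid_scale_out: int, cores_paid_scale_up: int) -> float:
    """Relative change in paid cores when moving from scale-out to scale-up."""
    return cores_paid_scale_up / cores_paid_scale_out - 1.0


def perf_delta(cores_used_scale_out: int, cores_used_scale_up: int) -> float:
    """Relative change in used cores, taken here as a proxy for performance."""
    return cores_used_scale_up / cores_used_scale_out - 1.0


if __name__ == "__main__":
    # 32 x 16-core servers versus 2 x 160-core bullions
    print(f"HW investment: {hw_delta(512, 320):+.1%}")
    # 32 x 16-core servers versus 5 x 80-core bullions (VMware HA)
    print(f"HW investment: {hw_delta(512, 400):+.1%}")
    print(f"Performance:   {perf_delta(256, 320):+.1%}")
```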

Page 14:

CPU load & VMs: comparison scale-out/scale-up

Legend: one box = 8 vCPUs.

scale-out (2-socket x 8-core servers)
32x 16-core servers => 32 ESXi
Communication through the NICs

scale-up (16-socket x 10-core bullions)
2x 160-core bullions => 2 ESXi (32 free vCPUs in each bullion)
Communication internal to bullion
=> fewer Eth. adapters/cables/switches
=> best performance

Page 15:

VMs : size and quantity

In a 16-socket bullion you can theoretically fit up to 5 VMs of 32 vCPUs each, with one physical core available for each vCPU (160 cores), for best performance (no over-commitment).

On a 4-socket X7560 bullion (64 logical CPUs with H.T.), we could run 168 VMs with a CPU over-commitment of x2.6 and a good QoS (cf. SPECvirt constraints):

28 tiles, each one with 6 VMs (with 1 vCPU):
– 1 DB server + 1 Java application server + 1 mail server + 1 web server + 1 NFS server + 1 server in standby to measure the network latency (SPECpoll, 99.5% of requests < 1 s)

Some consolidation projects make it possible to consolidate VMs inside the same cluster, which reduces the necessary HW (CPU, RAM, I/Os).

Page 16:

VDI sizing

For Citrix XenDesktop (running on top of the ESXi hypervisor):
– 1 VM per user
– 1 physical core for 8 VMs
– 1 GB of memory per VM (no memory over-commitment, in order to avoid swapping)

More precisely, memory varies according to the guest OS: from 512 MB for a Windows XP VM to 2 GB for a Windows 7 VM.

Example: for 1500 concurrent users => 190 cores (1500/8 ≈ 188, rounded up) & 1.5 TB memory

The configuration must be tuned to take the following into account:
– Considerations about load and HA (number of ESX hosts)
– Hosting of other necessary VMs for XenApp (XenApp broker, ...) and other Citrix modules
– Consolidation of other applications
– Etc.
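
The VDI rule of thumb above (1 VM per user, 8 VMs per physical core, 1 GB per VM) can be written as a small helper. This is a sketch only: `vdi_sizing` and its parameter names are illustrative, and it ignores the HA and extra-VM considerations listed on the slide.

```python
import math


def vdi_sizing(users: int, vms_per_core: int = 8, gb_per_vm: float = 1.0):
    """Return (physical cores, memory in GB) for a number of concurrent users."""
    cores = math.ceil(users / vms_per_core)   # one VM per user
    memory_gb = users * gb_per_vm             # no memory over-commitment
    return cores, memory_gb


if __name__ == "__main__":
    cores, memory_gb = vdi_sizing(1500)
    print(f"1500 users: {cores} cores, {memory_gb:.0f} GB of memory")
```

For 1500 users this gives 188 cores and 1500 GB, which the slide rounds to 190 cores and 1.5 TB. Setting `gb_per_vm` to 0.5 or 2.0 covers the Windows XP and Windows 7 cases mentioned above.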

Page 17:

CPU/memory load & High Availability

• Use several bullions (ESXi) for your VMware cluster: if one ESXi/bullion fails, VMware HA will restart the VMs on the other bullions

• Minimum is 2 bullions (fail-safe / maintenance)

• For no perf degradation (no CPU/memory over-commitment*):
– 50% average load for 2 bullions
– 67% average load for 3 bullions <= best compromise
– 75% average load for 4 bullions
– 80% average load above

* The maximum average load, regardless of the number of bullions, should be up to 80%
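
The load figures above match the rule "the surviving bullions must absorb the load of one failed bullion", that is (n - 1) / n, capped at 80%. The sketch below encodes that reading; the function name and the `cap` parameter are illustrative.

```python
def max_average_load(n_bullions: int, cap: float = 0.80) -> float:
    """Maximum average load per bullion so that one failure causes no over-commitment."""
    if n_bullions < 2:
        raise ValueError("an HA cluster needs at least 2 bullions")
    return min((n_bullions - 1) / n_bullions, cap)


if __name__ == "__main__":
    for n in range(2, 7):
        print(f"{n} bullions: {max_average_load(n):.0%} average load")
```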

Page 18:

CPU consolidation

Consolidating an existing park of small (1/2 sockets) physical servers:
– By default, consider the average CPU load to be no more than 15%
– Use Capacity Planner to obtain the exact number (e.g. XX => 7% CPU for 49 servers)

Consolidating an existing park of small (1/2 sockets) virtualized servers:
– By default, consider the average CPU load to be 50%
– Use Capacity Planner to obtain the exact number

bullion proposition: bullion should be sized for an average load of up to 80%

Page 19:

Memory consolidation

• Get the amount and load of memory of the existing park to be consolidated

• The % memory load is either given by an audit tool like Capacity Planner, or assumed to be 80% if you don’t know

• Sizing rules for the memory in bullion are the same as for CPU (50%-50%, 67%-67%-67%, max 80%)

Page 20:

bullion Inputs/Outputs sizing

I/O: check the capabilities of bullion:
– 6 PCIe adapters / module: FC 4/8 Gbps, Ethernet 1/10 Gbps
– 4 internal 1 GigE / module
WARNING: check bullion limitations with multi-modules

VMs running in the same server (especially a 16-socket one) communicate inside the server, which reduces the number of Ethernet adapters compared to smaller servers, where VMs need to communicate outside the server.

Page 21:

Sizing Ethernet communication

For applications with many I/Os between VMs (e.g. Xerox dematerialisation application):
=> you may decrease your global bullion configuration by up to ~25% (compared to small servers)

For applications with few I/Os between VMs (e.g. VDI):
=> decrease your global bullion configuration by ~5%

Page 22:

IO configurations max for quadri-module bullion

Activated Kawela (1 GigE) 4 0 2 2 0 2

MegaRAID (disks) 0 0 0 0 0 1

LPE12002/1250 (FC) 7 7 4 7 7 7

I350-T2 (1 GigE) 0 0 0 0 0 0

i350-T4 (1 GigE) 0 4 0 0 0 0

X520-SR2/T2 * (10 gigE) 0 0 3 3 4 3

* X520-DA2 can be ordered through SFR

smaller configurations are possible by removing adapters

Page 23:

IO configurations max for tri-module bullion

Activated Kawela (1 GigE) 0 0 0

MegaRAID (disks) 0 0 1

LPE12002/1250 (FC) 6 6 5

I350-T2 (1 GigE) 0 2 2

i350-T4 (1 GigE) 4 0 0

X520-SR2/T2 * (10 GigE) 0 3 3

* X520-DA2 can be ordered through SFR

smaller configurations are possible by removing adapters

Page 24:

IO configurations max for bi-module bullion

Activated Kawela (1 GigE) 0 4 2 0 0 0 2 0* 2**

MegaRAID 0 0 0 0 1 1 1 1 0

LPE12002/1250 (FC) 4 4 4 4 4 3 0 0 2

i350-T2 (1 GigE) 0 0 0 0 2 2 0 0 0

i350-T4 (1 GigE) 4 2 0 0 0 0 0 0 0

X520-SR2/T2 (10 GigE) 0 0 2 4 0 3 2 4 3

SFR only

* vSphere 5 maximum of 8x Eth 10 Gbps ports is respected

** vSphere 5 maximum of combined Eth 6x 10 Gbps ports + 4x 1 Gbps ports is respected

smaller configurations are possible by removing adapters

Page 25:

Ethernet network example for a bi-module

– 2 links 10 Gb/s dedicated to vMotion (1 TB can be evacuated in ~20 min) + VMware admin
– huge bandwidth for the VMs (6 links 10 Gb/s)
– very high internal inter-module bandwidth (~300 Gb/s)
– Hyper-Threading can be activated (perf. +5-10%)

Page 26:

FC SAN example for a bi-module

– 4 HBAs (2 HBAs per module)
– 4 boot paths

Page 27:

Bullion sizing calculator (excel file)

Page 28:

Sizing exercise

Propose an alternative solution with bullions to a DC with:
– 20 blades UCS B200 M2 (2x 6-core CPU X5690), 96 GB mem/blade
– SPECint*rate 1 blade = 432

Page 29:

Sizing exercise

– SPECint*rate 1 UCS B200 M2 (2x 6-core CPU X5690) = 432
– 20 blades => 8 640 SPECint*rate
– CPU load blade = 50% => 4 320 SPECint*rate

3 bi-module E7-4820 bullions provide:
– With a 100% CPU load: 4256 SPECint*rate, i.e. 101% of the target
– With a 2/3 CPU load: 2851 SPECint*rate, i.e. 68% of the target

4 bi-module E7-4820 bullions provide:
– With a 100% CPU load: 5674 SPECint*rate, i.e. 134% of the target
– With a 3/4 CPU load: 4256 SPECint*rate, i.e. 101% of the target

A good choice is to propose 4 bi-module E7-4820 bullions
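
The exercise can be replayed with a few lines of Python. The sketch below assumes 1419 SPECint*rate per bi-module (8-socket) E7-4820 bullion, the value given in the performance table, and applies the HA rule of (n - 1) / n average load capped at 80%. Function names are illustrative, and the results differ slightly from the slide because of rounding in the original figures.

```python
PERF_BLADE = 432          # SPECint*rate of one UCS B200 M2 blade (from the slide)
PERF_BI_MODULE = 1419     # 8-socket E7-4820 bullion (from the performance table)


def target_perf(n_blades: int, perf_per_blade: float, avg_load: float) -> float:
    """Performance actually consumed by the existing servers."""
    return n_blades * perf_per_blade * avg_load


def ha_load(n_bullions: int, cap: float = 0.80) -> float:
    """Average load that still leaves room for one bullion failure."""
    return min((n_bullions - 1) / n_bullions, cap)


def usable_perf(n_bullions: int, perf_per_bullion: float) -> float:
    """Performance available once the HA load limit is applied."""
    return n_bullions * perf_per_bullion * ha_load(n_bullions)


if __name__ == "__main__":
    target = target_perf(20, PERF_BLADE, 0.50)
    print(f"target: {target:.0f} SPECint*rate")
    for n in (3, 4):
        full = n * PERF_BI_MODULE
        usable = usable_perf(n, PERF_BI_MODULE)
        print(f"{n} bullions: {full} at 100% load, "
              f"{usable:.0f} at {ha_load(n):.0%} load "
              f"({usable / target:.1%} of target)")
```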

Page 30:

Project example: target architecture

Page 31:

Comparison blades vs bullion

blades
1 blade, 4 sockets E7-4870, 40 cores / 256 GB (32 DIMMs of 8 GB; max 48 DIMMs)
+ 1 chassis + Fabric Extender
+ 1 switch Fabric Interconnect
7U + 2U
1 553 watts

bullion
1 module bullion, 4 sockets E7-4870, 40 cores / 256 GB (32 DIMMs of 8 GB; max 64 DIMMs)
3U
900 watts (-42%)

Page 32:

Project example: initial proposal

[Diagram labels: Fermat, Galois]

Needs: 752 vCPUs => /5 = 150 cores; 1 504 GB vRAM => x 0.7 = 1 052 GB

blades
4 blades => 4 ESXi
16 sockets (160 cores)
1 024 GB
18 U
5 256 watts

bullion
4 servers bullion => 4 ESXi
16 sockets (160 cores)
1 024 GB
12 U
3 600 watts (-32%)
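
The "Needs" line in this project example converts virtual resources into physical ones: vCPUs divided by a CPU over-commitment of 5, and vRAM multiplied by 0.7. A minimal sketch of that conversion follows; the function and parameter names are illustrative, and the slides round the results by hand.

```python
def physical_needs(vcpus: int, vram_gb: float,
                   vcpus_per_core: float = 5.0, ram_factor: float = 0.7):
    """Convert vCPU and vRAM requirements into physical cores and GB of RAM."""
    cores = vcpus / vcpus_per_core      # CPU over-commitment ratio
    ram_gb = vram_gb * ram_factor       # share of vRAM backed by physical memory
    return cores, ram_gb


if __name__ == "__main__":
    cores, ram_gb = physical_needs(752, 1504)
    print(f"{cores:.1f} cores, {ram_gb:.1f} GB")
```

The same function applies to the later evolutions of the project by passing their vCPU and vRAM totals.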

Page 33:

Project example: 1st evolution

[Diagram labels: Fermat, Galois]

Needs: 1 799 vCPUs => /5 = 360 cores; vRAM 3 598 GB => x 0.7 = 2 518 GB

blades
10 blades => 10 ESXi
36 sockets (360 cores)
2 592 GB
32 U
10 100 watts
vSphere licenses (Enterprise+): 36 sockets x $4152 = $149,500

bullion
4 servers bullion => 4 ESXi
32 sockets (320 cores)
2 592 GB
24 U
7 200 watts (-29%)
vSphere licenses (Enterprise+): 32 sockets x $4152 = $132,885 (-12%)

Page 34:

Project example: 2nd evolution

[Diagram labels: Fermat, Galois]

Needs: 2 847 vCPUs => /5 = 570 cores; vRAM 5 693 GB => x 0.7 = 3 985 GB

blades
16 blades => 16 ESXi
60 sockets (600 cores)
4 080 GB
32 U
14 300 watts
vSphere licenses (Enterprise+): 58 sockets x $4152 = $240,239
Need to add 2 extra chassis (+14U) in order to add more than 2 CPUs

bullion
4 servers bullion => 4 ESXi
48 sockets (480 cores)
4 080 GB
36 U
10 800 watts (-25%)
vSphere licenses (Enterprise+): 48 sockets x $4152 = $199,327 (-17%)

+25% still available for future upgrade without adding servers

Page 35:

Example #2

Requirements (split across 2 datacenters for disaster recovery (PRA)):
– 428 VMs (spread among 10 application domains)
– 2 576 cores (/40 = 68.9)
– RAM 9 832 GB

Bullion scenarios:

Scenario MONO (Total, per DC, Actual)
Total modules without VM consolidation 77 38.5 39
Total modules 78
Total RAM 9 832
RAM per server 126 128
Total cores 3 120

Scenario QUAD (Total, per DC, Actual)
Total modules with VM consolidation 68.9 34.45 35
Total quad bullions per DC 18 9 9
Total modules 72
Total RAM 9 832 10 368
RAM per server 546 576
Total cores 2 880
=> 6 fewer modules

Scenario QUAD optimized (consolidation -7%) (Total, per DC, Actual)
Total modules 32 64
Total servers 8 8
=> 14 fewer modules

Page 36:

TCO calculation

TCO 3 years                          UCS B200    Bullion
Capex
  Total Hardware                     $329,250    $368,760
  Hardware Installation              $3,424      $1,712
  VMware licenses                    $136,178    $37,718
Opex
  Hardware administration            $61,635     $20,545
  Hardware Maintenance subscription  $19,913     $16,352
  ESXi Admin/Maintenance             $51,363     $5,136
  Power supply                       $214,287    $65,619
  Space use                          $11,853     $11,853
Total                                $827,904    $527,697

savings with bullion = $300,207 (36%)

- Quantity of HW to install/maintain

- Nb of licenses based on nb of sockets

- Power consumption

- Space in Data Center

- Nb of VMware nodes
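
The totals in the TCO table can be re-derived by summing the Capex and Opex lines. The sketch below copies the figures from the slide (the maintenance subscription for UCS is read as $19,913); the structure and names are illustrative and unrelated to the Excel TCO calculator. The computed totals land within a couple of dollars of the slide because the individual lines are rounded.

```python
# Three-year cost lines in USD: (UCS B200, bullion), copied from the slide.
COSTS = {
    "Total Hardware": (329_250, 368_760),
    "Hardware Installation": (3_424, 1_712),
    "VMware licenses": (136_178, 37_718),
    "Hardware administration": (61_635, 20_545),
    "Hardware Maintenance subscription": (19_913, 16_352),
    "ESXi Admin/Maintenance": (51_363, 5_136),
    "Power supply": (214_287, 65_619),
    "Space use": (11_853, 11_853),
}


def totals(costs):
    """Sum each column of the cost table."""
    ucs = sum(u for u, _ in costs.values())
    bullion = sum(b for _, b in costs.values())
    return ucs, bullion


def savings(costs):
    """Absolute and relative savings of bullion versus UCS B200."""
    ucs, bullion = totals(costs)
    return ucs - bullion, (ucs - bullion) / ucs


if __name__ == "__main__":
    ucs, bullion = totals(COSTS)
    amount, ratio = savings(COSTS)
    print(f"UCS B200: ${ucs:,}  bullion: ${bullion:,}")
    print(f"savings with bullion: ${amount:,} ({ratio:.0%})")
```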

Page 37:

TCO calculator bullion (excel file)

Page 38:

Summary

bullion: best performance & capacity (4110 SPECint*rate, 160 cores, 2 TB) => ideal for consolidation

Consolidation:
– HW (sockets, memory, IO adapters)
– VMs

Tools:
– Sizing tool (based on SPECint*rate and number of cluster nodes)
– TCO (comparison against competition: OPEX, CAPEX 3 & 5 years)

Page 39: