Transform your mission-critical environment Superdome X + Windows Server/SQL Server
Maurice De Vidts, Laurence Grizaud, Ken Pomaranski
BRK2584
Agenda
• Introduction
• HP Integrity Superdome X overview
• HP Integrity Superdome X architecture
• Scalability and flexibility with Microsoft Windows Server 2012 R2
• Enterprise-class server with built-in reliability
• Leveraging SQL Server on HP Superdome X
The most exciting shifts of our time are underway
Changing the world we live in
Healthcare is proactive
Airlines go social
Manufacturers mass customize
Time to revenue is critical
Decisions must be rapid
Change is constant
Business happens anywhere
Cloud: 2-fold growth in next 2 years
Security: $104B black market per year
Mobility: 100B connected devices by 2020
Big Data: 14.6 PB created per company
Creating new imperatives for IT
Speed innovation
Create flexible
infrastructure
Control energy and space costs
Speed
Efficiency
Business value
Reimagine the server.
Delivered as general-purpose, dedicated, physical infrastructure
Siloed
Technology centric
Manual
Infrastructure is a cost center
Think Compute.
Dynamically aligns pools of resources with laser precision to business goals
Infrastructure is a service differentiator
Software-defined and cloud-ready
Workload optimized
Converged
Compute enables efficient and effective IT services
Common customer challenges
Business Processing and Decision Support workloads
“We’re deploying new applications on Windows and need more reliability than we have today.”
“My scale-out solution has become too complex, and our management and networking costs are escalating.”
“I need more scalability and availability for our existing x86 applications.”
“We need to reduce our operational costs for mission-critical applications.”
“I’m not getting the x86 performance we require for our large database.”
Superdome X + Windows and SQL Server:
A winning combination to meet your challenges today
Redefine compute economics
Boost business performance
Breakthrough x86 scalability and efficiencies
Groundbreaking x86 performance
Increase competitive differentiation
Superior x86 uptime
The right compute for the right workload at the right economics … every time
• x86 performance not there for large database workloads
• x86 uptime not able to meet mission-critical SLAs
• 4-socket x86 scale-up not sufficient for the new generation of workloads
• Complex, high-OPEX large scale-out environments
Today’s challenges New with Superdome X
What is Superdome X?
HP Integrity Superdome X, for your critical business processing and decision support workloads
Power your most demanding workloads
Support your largest enterprise applications
Maximize the uptime of your critical x86 apps
16 sockets
12TB memory
7.5x performance of current HP 8-socket system
20x greater reliability with unique hardware partitions
Certified
New
Drive Real-Time Business with Real-Time Insights
SQL Server 2014
Over 100x query speed and significant data compression with In-Memory ColumnStore
Up to 30x faster transaction processing with In-Memory OLTP
Greater performance with In-Memory Analysis Services
Billions of rows per second with PowerPivot In-Memory for Excel
Faster Insights: IN-MEMORY ANALYTICS
Faster Queries: IN-MEMORY DW
Faster Transactions: IN-MEMORY OLTP
Target workloads
Superdome X + Windows Server/SQL Server
Business Processing
• Enterprise Resource Planning (ERP)
• Customer Relationship Management (CRM)
• Online transaction processing (OLTP)
• Batch
Decision Support
• Data warehousing/data mart (scale-up)
• Data analysis/data mining
Highest mission criticality
• Mega instance and in-memory OLTP
• UNIX migration
• Large-scale workload consolidation
HP Integrity Superdome X BladeSystem Enclosure
Ideal for scale-up and consolidation
HP Superdome X – at a glance
Form factor
• Blade: 2S blade, full extended height
• Enclosure: 18U in standard HP 19” rack; standard power, airflow, cooling
Compute blades and CPUs
• Blade: 2 CPU sockets for Xeon E7 v2; low and high core count CPU SKUs
• Enclosure: up to eight 2S cell blades; 2-16 sockets; Xeon CPUs in one or many nPars
Memory
• Blade: 48 DIMM slots, 1.5TB per blade with 32GB DIMMs
• Enclosure: 12TB memory capacity with 32GB DIMMs
I/O
• Blade: 2 LOM cards (fully configurable/customizable 10GbE Flex NICs); 3 mezzanine slots
• Enclosure: 16 LOM cards (configurable 10GbE Flex NICs); 24 mezzanine slots
RASUM
• Blade: mission-critical RAS; iLO 4 management processor
• Enclosure: mission-critical RAS; SD2-based mission-critical OA
Partitioning and virtualization
• Blade: electrically isolated blades, can be grouped into nPars through the flexible crossbar fabric
• Enclosure: electrically isolated nPars; industry-standard virtualization (Hyper-V, VMware)
Superdome X architecture
Crossbar for Reliability
Crossbar for Scaling
Compute: 16 CPUs, 240 cores, 480 threads
Memory: 384 DIMMs, 12 TB capacity
I/O: 16 FlexLOMs, 24 Mezz I/O
• Crossbar aggregate bandwidth (BW) > 1.2TB/s
• CPU/MEM aggregate BW > 1TB/s (measured)
• I/O aggregate BW ~ 800GB/s
• End to End retry through multiple paths
• Electrical isolation of hard partitions (HP nPars) for ultimate flexibility and maintainability
Superdome X optimizes cache coherency
4-socket and 8-socket “glueless” designs use the E7 v2 “In memory snoop directory” (also known as directory mode) capability
• Removes snoop/snoop response processing from the critical path
• Reduces snoop traffic that consumes QPI interconnect bandwidth on 4S and larger systems
The external node controller (XNC2) chip isolates snoop traffic between blades in the nPartition, so SDX can use the new E7 v2 “Opportunistic Snoop Broadcast” (OSB) mode to improve performance
• Improves cache latency in many scenarios
• Reduced memory lookup translates to reduced memory latency and improved bandwidth usage
The result is an optimized cache coherency solution for Superdome X
• Intel Opportunistic Snoop Broadcast (OSB)
• Cache coherency enabled, even for enormous 16S partitions, without “snoop” traffic penalty
• Improves performance by reducing local memory access latency
• Increased performance when compared to directory mode for NUMA-optimized workloads
• Scalable and fast directory cache in XNC2 ASIC for off-blade accesses
SQL Server 2014 Architecture
Perfect match for Superdome X hardware technology!
SQL Server tech pillars
Maximize uptime
• SQL Server AlwaysOn
• Windows failover cluster
• Integrated HA and backup/restore
• Encryption, audit
Main-memory optimized
• Optimized for in-memory data
• Indexes (hash and range) exist only in memory
• No buffer pool
• Stream-based storage for durability
High concurrency
• Multiversion optimistic concurrency control with full ACID support
• Core engine uses lock-free algorithms
• No lock manager, latches, or spinlocks
T-SQL compiled to machine code
• T-SQL compiled to machine code via C code generator and Visual C compiler
• Invoking a procedure is just a DLL entry-point
• Aggressive optimizations at compile time
Benefits
• Availability / Security
• High-performance data operations
• Frictionless scale-up
• Efficient business-logic processing
Drivers
• Hardware trends: steadily declining memory price, NVRAM; many-core processors; stalling CPU clock rate
• Business: TCO
Unprecedented x86 performance
HP Integrity Superdome X SPECcpu2006
8-socket Superdome X performance exceeds all x86 competition
• #1 8S x86 SPECfp_rate_base_2006 / #1 8S x86 SPECfp_rate2006
• #1 8S x86 SPECint_rate_base_2006 / #1 (tie) 8S x86 SPECint_rate2006
• Overshadows the best from: Fujitsu PRIMEQUEST 2800E, IBM System x3950 X6, Hitachi BladeSymphony BS520X
16-socket Superdome X system performance even beats out “big iron”
• #1 16S SPECfp_rate_base_2006 / #2 16S SPECfp_rate2006
• #1 16S SPECint_rate_base_2006 / #1 16S SPECint_rate2006
• Superdome X 240-core system wins over the best from: Fujitsu M10-4S (256 cores), IBM Power 780 (128 cores)
HPTC workloads
• Preliminary Graph 500 results already put 16S Superdome X as the #2 single-node solution in performance terms, as well as the #4 most power efficient, with tuning still left to do
• Early HPLinpack results at 5 TFlops out of a theoretical peak of 5.376 TFlops (93% efficiency)
SPEC and the benchmark name SPEC CPU are registered trademarks of the Standard Performance Evaluation Corporation (SPEC); see spec.org/ as of 12/1/2014
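The HPLinpack efficiency quoted above is simply achieved performance divided by theoretical peak. The short sketch below reproduces that arithmetic. The decomposition of the 5.376 TFlops peak into 240 cores at 2.8 GHz with 8 double-precision flops per cycle is an assumption made here for illustration (it happens to multiply out to the quoted figure); the slides state only the totals.

```python
# Reproduce the HPLinpack efficiency figure from the slide.
ACHIEVED_TFLOPS = 5.0   # from the slide
PEAK_TFLOPS = 5.376     # from the slide

def efficiency(achieved: float, peak: float) -> float:
    """Return achieved/peak as a fraction between 0 and 1."""
    return achieved / peak

# ASSUMPTION (not stated in the slides): one way to arrive at the peak.
CORES = 240              # 16 sockets x 15 cores, from the slides
CLOCK_GHZ = 2.8          # assumed clock rate
FLOPS_PER_CYCLE = 8      # assumed double-precision flops per core per cycle

def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in TFlops: cores x GHz x flops/cycle gives GFlops."""
    return cores * clock_ghz * flops_per_cycle / 1000.0

if __name__ == "__main__":
    print(f"efficiency: {efficiency(ACHIEVED_TFLOPS, PEAK_TFLOPS):.1%}")
    print(f"assumed peak: {peak_tflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE):.3f} TFlops")
```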
Superdome X 16S performance w/ SQL 2014
7.5x BI Performance improvement over 8S DL980 G7
60 billion rows processed in 16 seconds
• Internal workload: DSS-focused query
• SDX: 16S/4TB RAM; DL980: 8S/4TB RAM
• 100% CPU-bound query
Benchmark work in progress… Stay tuned
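For a sense of scale, the row count and elapsed time on this slide can be turned into a throughput figure. The sketch below is only arithmetic on the numbers quoted; the per-core figure assumes all 240 cores of the 16S system took part, which is consistent with the "100% CPU bound" note but is not stated as a measurement.

```python
# Throughput implied by "60 billion rows processed in 16 seconds".
ROWS = 60_000_000_000
SECONDS = 16
CORES_16S = 240  # 16 sockets x 15 cores, from the configuration slides

def rows_per_second(rows: int, seconds: float) -> float:
    """Aggregate scan throughput in rows per second."""
    return rows / seconds

def rows_per_second_per_core(rows: int, seconds: float, cores: int) -> float:
    """Throughput per core, assuming every core contributes equally."""
    return rows / seconds / cores

if __name__ == "__main__":
    print(f"{rows_per_second(ROWS, SECONDS):,.0f} rows/s overall")
    print(f"{rows_per_second_per_core(ROWS, SECONDS, CORES_16S):,.0f} rows/s per core")
```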
The unique value of HP nPars
Hard partitions add resource and cost efficiencies
Lower your TCO: Optimize software costs by using HP nPars
Maximize resource utilization: Create different development, test, and production environments within a single enclosure
Minimize downtime: Take one partition offline, while the others continue to run undisturbed
20x greater reliability than soft partitions
Protect your data: Electronic isolation provides a high degree of security between partitions
[Diagram: HP BladeSystem Superdome Enclosure. Three hard partitions (nPar A: Dev system, nPar B: Test system, nPar C: Prod system), each running its own OS and applications. Eight blades (Blade 1/1 through Blade 1/8), each with two processors (for example Processor 1/1/0 and 1/1/1), memory, and blade I/O, are connected through the crossbar fabric.]
Experience superior x86 availability with Superdome X
60% downtime reduction: Increase availability with end-to-end mission-critical design
Zero planned downtime: Perform maintenance and updates online without application outage
95% reduction in memory outages: Ensure continuity with HP Firmware First
20x greater reliability with HP nPars
Superdome X with Power-on-Once technology
Availability from components to complete solutions
Up to 100% application availability
Error identification, reporting, recovery
Infrastructure reliability
Common components
Hard partitioning (HP nPars)
‘Firmware First’ architecture
Error Analysis Engine
Enhanced Failover Clustering in R2
Insight Remote Support
Proactive services
Windows Server 2012 R2
Online optimization and repair
Fault-tolerant fabric
Integrated with SQL Server AlwaysOn at database level
The ideal foundation for your mission-critical x86 environment
Benefit from proven reliability, availability, and serviceability (RAS)
Superdome X
Extending the proven HP Integrity Superdome 2 mission-critical RAS features
HP fault management
• Diagnostics
• Error analysis engine
• True ‘One Stop’ fault management
Self healing
• Deconfiguration (core, DIMM and blade)
• Runtime deactivation (DIMM, I/O, and fabric)
Memory RAS
• Proactive memory scrubbing
• Enhanced DDDC + 1
Platform RAS
• Clock redundancy
• Fault-tolerant crossbar fabric
Partitioning/error isolation
• Crossbar and hard partitions
Serviceability
• Redundant, hot swap: power supplies and fans, I/O switches, HP Onboard Administrator modules
HP firmware
• PCIe Live Error Recovery (LER)
• Advanced error reporting
• Viral error containment
HP hardware
• Advanced memory error recovery
• Corrupt data containment
• LER containment
OS level RAS
Processor RAS
• Processor interconnect (CRC)
• Advanced MCA recovery
Superdome X RAS features begin where most commodity x86 servers leave off
End-to-end mission-critical design
Protect your mission-critical data on HP Superdome X
Superdome X protects, analyzes the evidence, recovers – results in 60% downtime reduction
Generic x86: weak containment, no critical analysis
• MCA: Detect!
• OS crashes, attempts reboot, no critical analysis
• Bad data may end up in storage
MC x86: strong containment, deep analysis
• MCA: Detect!
• Self-heal
• Collect evidence
• OS recovery
• Bad data contained immediately
• Critical analysis
• Repair
Data integrity
Results in fewer memory, I/O, and processor-based outages
HP Advanced Error Recovery: recovery from uncorrectable processor, cache and memory errors during execution
HP Memory Quarantine: recovery from uncorrectable memory errors which may cause a system crash
HP Advanced Error Recovery
1. Uncorrectable error detected in processor, cache or memory under execution pipeline
2. Firmware informs OS, hypervisor and end-application
3. OS, hypervisor and end-application initiate recovery action: thread, process, VM, or application killed/restarted
4. System keeps running and prevents crash
[Diagram: processors with core, L1 cache, L2 cache, uncore, and DRAM]
HP Advanced Error Recovery and HP Memory Quarantine require recovery awareness in OS, hypervisor and end-application
HP Memory Quarantine
1. MCA Recovery detects uncorrectable memory error
2. HP Memory Quarantine tags memory location as bad and sends address to OS/hypervisor
3. OS/hypervisor decides how to handle recovery
4. OS/hypervisor blocks use of bad memory location
Windows Server 2012 R2 has a 4TB RAM limit
Maximum configuration                        16 socket    8 socket    4 socket    2 socket
Processor cores (15 cores per socket)        240          120         60          30
Logical processors (Hyper-Threading on)      480          240         120         60
Memory capacity (current supported limit)    12TB (4TB)   6TB (4TB)   3TB         1.5TB
Mezzanine slots (current supported limit)    24 (16)      12 (8)      6 (4)       3 (2)
LOM                                          16           8           4           2
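The rows of the maximum-configuration table all follow from per-socket and per-blade figures given elsewhere in the deck (15 cores per socket, two sockets per blade, 1.5TB and 3 mezzanine slots per blade) plus the 4TB Windows Server 2012 R2 limit. The sketch below derives the table from those figures. The function and field names are made up for illustration, and the "2 supported mezzanine slots per blade" rule is inferred from the table rather than stated in the slides.

```python
# Derive the maximum-configuration table from per-blade figures.
CORES_PER_SOCKET = 15
SOCKETS_PER_BLADE = 2
MEMORY_TB_PER_BLADE = 1.5          # 48 DIMM slots x 32GB
MEZZ_SLOTS_PER_BLADE = 3
SUPPORTED_MEZZ_PER_BLADE = 2       # inferred from the table (assumption)
LOM_PER_BLADE = 2
OS_MEMORY_LIMIT_TB = 4             # Windows Server 2012 R2 limit

def max_config(sockets: int) -> dict:
    """Return the table column for an nPar with the given socket count."""
    if sockets % SOCKETS_PER_BLADE != 0:
        raise ValueError("sockets come in pairs (one 2S blade each)")
    blades = sockets // SOCKETS_PER_BLADE
    cores = sockets * CORES_PER_SOCKET
    memory_tb = blades * MEMORY_TB_PER_BLADE
    return {
        "cores": cores,
        "logical_processors": cores * 2,  # Hyper-Threading on
        "memory_tb": memory_tb,
        "supported_memory_tb": min(memory_tb, OS_MEMORY_LIMIT_TB),
        "mezz_slots": blades * MEZZ_SLOTS_PER_BLADE,
        "supported_mezz_slots": blades * SUPPORTED_MEZZ_PER_BLADE,
        "lom": blades * LOM_PER_BLADE,
    }

if __name__ == "__main__":
    for s in (16, 8, 4, 2):
        print(s, max_config(s))
```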
Superdome X and Microsoft Windows Server/SQL Server
Deploy your most demanding Business Processing and Decision Support workloads with confidence
Scale-up
• Large OLTP – In-memory DB
• Scale-up DW – real-time DW
• Large SQL for SAP backend
[Diagram: Windows running a single SQL Server Mega Instance]
Multi-workload consolidation (multi-instance)
• SQL Server consolidation
• …
[Diagram: Windows running SQL Server instance 1, instance 2, instance 3, through instance “n”]
Multi-partitions (mission critical)
• SQL Server consolidation
• OLTP – In-memory DB
• BI in a box
• SAP in a box
[Diagram: three partitions, each running SQL Server/Windows]
Mixtures of consolidation types provide additional flexibility
Scale-up Reference Configuration
• Scale-up single-instance DB
• OLTP workload
• Windows Server 2012 R2 and SQL Server 2014
• 8 Sockets - 4 TB
• 16 Sockets - 4 TB
Reference Configuration white paper in development
[Diagram: HP Integrity Superdome X in two configurations, an nPar (server) with 16 sockets/4TB and an nPar (server) with 8 sockets/4TB. Each runs Windows Server with a SQL Server Mega Instance serving an OLTP workload on HP 3PAR StoreServ storage.]
Mixed Workload Reference Architecture
• Mixed workload (DW/OLTP) in multi-partition mixed workload configuration
• Performance information from the Reference Config.
Config:
• Windows Server 2012 R2 and SQL Server 2014
• 2x 8S 4TB nPar
• OLTP instance HA failover
• In-Memory Database engine
• Data Warehouse instance
[Diagram: HP Integrity Superdome X with 2x nPar (server), each 8 sockets/4TB, on HP 3PAR StoreServ storage. nPar1 and nPar2 each run Windows Server and are joined in a Windows Failover Cluster. A SQL Server Mega Instance with In-Memory OLTP serves the OLTP workload as active on one nPar and passive on the other; a SQL Server Data Warehouse instance serves the DW/BI workload.]
SQL Server 2014 - Platform Migration options
Preserve legacy architecture (where possible)
• One-to-one server to nPar mapping
• Preserve topology
Pros
• Quick
• Lower risk
Cons
• Suboptimal use of resources
• Stranded resources
[Diagram: nPar1, nPar2, and nPar3, each running Windows Server with a SQL Server instance]
SQL Server 2014 - Platform Migration (Cont)
Refactor multi-server architecture into fewer, larger nPars than servers
• Many-to-many server to nPar mapping
• Refactored topology
Pros
• Better performance
• Better resource utilization
• Better ROI
Cons
• Requires architecture modification
• Takes longer planning/design time
• Higher implementation risks
[Diagram: nPar1, nPar2, and nPar3, each running Windows Server with a SQL Server instance]
SQL Server 2014 - HA
Windows Failover Clustering
• SAN storage
• Traditional SQL Server Failover Cluster Instance (FCI) support between hardware partitions (nPars)
• Windows Failover Cluster between two hardware partitions (nPars)
[Diagram: a Windows Failover Cluster spanning nPar1 and nPar2, each running Windows Server. The SQL Server instance is active on one nPar and passive on the other, with shared HP 3PAR Storage.]
SQL Server 2014 - HA
Availability groups
• Requires duplicate storage on SAN
• Useful for quick as-is migrations of disparate report servers using replicated data
• Can be quickly set up using 3PAR Virtual Copy technology
[Diagram: a Windows Failover Cluster spanning nPar1 and nPar2 (SQL Server instance active on one, passive on the other) on HP 3PAR Storage. nPar3 runs Windows Server with a SQL Server instance acting as the SQL Server Availability Group read-only secondary, on its own HP 3PAR Storage.]
SQL Server 2014 – Resource Management
SQL Server instance is NUMA aware
• By default, all processors and all memory are allocated to an instance
• NUMA node or individual processor affinity
SQL Server 2014 – Resource Management
Memory resource management
• Set static limits to DB instance and SSIS engine(s)
Example nPar: 8 sockets – 4 TB
• OLTP instance
• SSAS (Tabular)
• SSAS (Multidimensional)
[Diagram: one nPar running Windows Server with a SQL Server instance (OLTP workload), a SQL Server Analysis Services Tabular instance (analytics), and a SQL Server Analysis Services Multidimensional instance (analytics), spread over CPU 0 through CPU 7. Memory limits shown: 1920 GB, 968 GB, 968 GB.]
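The static limits in this example can be sanity-checked against the 4 TB nPar. The sketch below adds up the three limits shown on the slide and reports what is left over. Which limit belongs to which engine is an assumption here (the slide residue lists three engines and three figures but does not pair them), and treating 4 TB as 4096 GB and the remainder as headroom for the OS and other services is likewise an assumption.

```python
# Check that the static memory limits fit inside the 4 TB nPar.
NPAR_MEMORY_GB = 4 * 1024  # assumes 1 TB = 1024 GB

# ASSUMPTION: pairing of limits to engines is illustrative only.
static_limits_gb = {
    "oltp_instance": 1920,
    "ssas_tabular": 968,
    "ssas_multidimensional": 968,
}

def remaining_memory_gb(total_gb: int, limits_gb: dict) -> int:
    """Return memory left after all static limits; fail if over-committed."""
    allocated = sum(limits_gb.values())
    if allocated > total_gb:
        raise ValueError(f"over-committed by {allocated - total_gb} GB")
    return total_gb - allocated

if __name__ == "__main__":
    left = remaining_memory_gb(NPAR_MEMORY_GB, static_limits_gb)
    print(f"allocated {sum(static_limits_gb.values())} GB, {left} GB left")
```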
SQL Server 2014 – Resource Management
SQL Server database NUMA affinity
• SQL Server Configuration Manager
• Database/application soft NUMA by ports
I/O resource management
• SQL 2014 Resource Governor (IOPS only)
• 3PAR Priority Optimization (QoS): includes latency targets and automatically throttles down lower-priority workloads
SQL Server 2014 Performance
7.5x BI performance increase over DL980 G7 (16S vs. 8S)
2x OLTP increase over DL980 G7 (8S vs. 8S)
Scaling together to drive the most demanding workloads
Windows Server 2012 R2 scalability
• Ever-evolving hardware support: Windows Server 2012 R2 supports 64 sockets and 4TB RAM
• NUMA: processors have their own set of memory; faster access (local vs. remote)
• Processor groups: scheduling entity
• x2APIC: interrupt address scaling above 256 LPs; MSI-X distribution across logical processors
• Many applications benefit, for example SQL Server
• HP BL920s Gen8 Windows Drivers Bundle: HP drivers and software
[Diagram: a processor group (k-group) contains NUMA nodes (sockets); each NUMA node contains cores; each core contains two logical processors.]
Support for Windows Server 2012 R2
Hardware configuration
Hardware partition (nPar)
• Minimum of 1 BL920s
• Support for 1, 2, 4, or 8 blades (2S, 4S, 8S, 16S) per nPar
• Maximum 4 TB RAM per nPar (Microsoft Windows Server 2012 R2 limit)
• At least one NIC per blade (FlexLOM: HP Ethernet 10Gb 2-port 560FLB Adapter)
• At least one Fibre Channel adapter per blade (HP QMH2672 16Gb FC HBA) for maximum flexibility
SAN boot
• There is no internal disk
UEFI support
• Support of larger drives (GPT)
NUMA – OVERVIEW
[Diagram: two BL920s Gen8 blades (#1 and #3) connected through the crossbar. Each blade holds two NUMA nodes (NUMA Node 0 through NUMA Node 3 in total), and each NUMA node is one CPU with its own memory.]
PROCESSOR GROUP RULES
• NUMA nodes cannot span processor groups. All the processors within a NUMA node must be in the same group.
• If a NUMA node has more than 64 processors, the OS splits it into smaller nodes.
• Legacy applications that do not understand processor groups run only within a single processor group. They have full access to all available memory but not to all processors.
• Windows assigns processes at startup to the different processor groups in round-robin fashion. From the user's perspective the assignment seems random.
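The first rule (a NUMA node never spans groups) combined with the 64-logical-processor ceiling per group determines how many groups a given nPar ends up with. The sketch below is a simplified greedy model of that packing, written for illustration only: it is not the actual Windows assignment algorithm, the function name is hypothetical, and it does not model the splitting of oversized NUMA nodes.

```python
# Simplified model: pack whole NUMA nodes into processor groups of <= 64 LPs.
MAX_LPS_PER_GROUP = 64

def pack_numa_nodes(node_lp_counts, max_per_group=MAX_LPS_PER_GROUP):
    """Return a list of groups, each a list of NUMA node ids.

    Nodes are taken in order and never split across groups.
    """
    groups, current, used = [], [], 0
    for node_id, lps in enumerate(node_lp_counts):
        if lps > max_per_group:
            # The OS would split such a node; that case is not modeled here.
            raise ValueError(f"node {node_id} exceeds {max_per_group} LPs")
        if used + lps > max_per_group:
            groups.append(current)
            current, used = [], 0
        current.append(node_id)
        used += lps
    if current:
        groups.append(current)
    return groups

if __name__ == "__main__":
    # 16 sockets, 15 cores each, Hyper-Threading on: 30 LPs per NUMA node.
    groups = pack_numa_nodes([30] * 16)
    print(len(groups), "groups:", groups)
```

Under this model a 16-socket nPar with 30 logical processors per socket lands in eight groups of two NUMA nodes each, which is why a legacy application confined to one group sees only a fraction of the processors.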
PROCESSOR GROUPS FOR APPLICATIONS
• To get information about processor groups, applications must use the new system APIs, which are available only on Windows Server 2008 R2 and later.
• An application starts in one processor group: its first thread runs in that group, and any threads it creates also run in that group.
• API calls that specify an affinity in relation to a processor group, using the new calls, allow threads to cross processor group boundaries. Use SetThreadGroupAffinity(), for example.
• DLLs may also have to be modified to support the processor group structures.
RUNNING NON PROCESSOR GROUP AWARE APPLICATIONS IN SPECIFIC PROCESSOR GROUPS
• In a CMD window, START can override the processor group assignment by using the /NODE option:
START /NODE 3 LegacyApp.Exe
  This starts LegacyApp within the processor group that contains NUMA node 3.
• The process is assigned to start in the processor group that the NUMA node is affiliated with. It is not restricted to the processors on just that NUMA node; it can use the whole processor group.
• Similarly, services can be set to have a preferred NUMA node with the SC.exe command:
sc.exe preferrednode AnyService 3
PROCESSOR GROUPS /NODE
START /NODE 3 LegacyApp causes the application to start in Processor Group 1 and run on the processors of both NUMA node 2 and NUMA node 3.
[Diagram: processor groups spanning NUMA nodes. Processor Group 0 contains NUMA node 0 and NUMA node 1; Processor Group 1 (the starting processor group) contains NUMA node 2 and NUMA node 3. Each NUMA node shows logical processors LP 0 through LP 31.]
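The START /NODE 3 example can be worked through numerically. The sketch below assumes the layout in the diagram (32 logical processors per NUMA node, so two nodes fit in each 64-LP processor group, numbered in order) and computes which group a node belongs to and which nodes the started process can run on. The helper names are hypothetical; this models the slide's example, not the Windows scheduler.

```python
# Which processor group does START /NODE n land in, and which nodes share it?
MAX_LPS_PER_GROUP = 64

def group_for_node(node: int, lps_per_node: int = 32) -> int:
    """Processor group containing the given NUMA node (uniform nodes assumed)."""
    nodes_per_group = MAX_LPS_PER_GROUP // lps_per_node
    return node // nodes_per_group

def nodes_in_group(group: int, lps_per_node: int = 32) -> list:
    """NUMA nodes whose processors are usable by a process in this group."""
    nodes_per_group = MAX_LPS_PER_GROUP // lps_per_node
    first = group * nodes_per_group
    return list(range(first, first + nodes_per_group))

if __name__ == "__main__":
    node = 3
    group = group_for_node(node)
    print(f"/NODE {node} -> Processor Group {group}, "
          f"runs on NUMA nodes {nodes_in_group(group)}")
```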
SQL Server 2014 Performance Superdome vs DL980 8 socket comparison x.y increase using
X cores Y cores disabled on Superdome