IBM Systems
© 2015 IBM Corporation
z/VM Support for IBM z13: MultiThreading and CPU Scalability
Romney White
z Systems Architecture and Technology
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries.
The following are trademarks or registered trademarks of other companies.
* Registered trademarks of IBM Corporation
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
IBM*
IBM Logo*
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Agenda
▪ Simultaneous MultiThreading
▪ z/VM SMT Objectives
▪ SMT Value for z/VM Clients
▪ Implementation
▪ Externals
▪ Operation
▪ Guest Reconfiguration Considerations
▪ CPU Scalability
▪ Performance
▪ Limitations
▪ Support Information
Simultaneous MultiThreading
▪ Allows instructions from more than one thread to execute in any given pipeline stage at a time
▪ Supported for IFLs and zIIPs on z13
▪ Helps address memory latency to increase processing efficiency and throughput
– zIIPs have an average of 38% capacity improvement compared to zEC12
– IFLs have an average of 32% capacity improvement compared to zEC12
– zIIPs have an average of 72% capacity improvement compared to z196
– IFLs have an average of 65% capacity improvement compared to z196
[Figure: Which approach is designed for the higher volume of traffic? Which road is faster? (*Illustrative numbers only)]
SMT Pipeline View
[Figure: Use of Pipeline Stages in SMT. Instructions from Thread-A and Thread-B pass through the Load/Store (L1 Cache), Execution Units (FXU/FPU), and Shared Cache stages. Legend: Thread-A, Thread-B, Both threads, Stage idle.]
z/VM SMT Objectives
▪ Provide increased capacity by exploiting SMT on z13 IFLs
▪ Do not require (do not support) guest awareness or exploitation of SMT
– Deliver SMT benefits to guests transparently
▪ Support up to 32 multithreaded cores
▪ Increase CPU scalability to 64 CPUs (threads or cores)
SMT Value for z/VM Clients
▪ Increased capacity
– Wider, not (much) higher
▪ Twice as many execution paths as zEC12
– Cores or threads
▪ No guest changes required to exploit SMT
– Automatic upgrade
▪ Comprehensive measurement data for performance monitoring, capacity planning, accounting, and chargeback
▪ SSI compatibility
SMT Value Example
[Figure: elapsed time for Guest A and Guest B in two cases, "Two guests, one core" and "Two guests, two threads", with the additional capacity gained in the two-thread case marked (assumes one thread delivers 70% of a core).]
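The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration of the slide's 70% assumption, not a measurement; the function names and the idea of expressing each guest's work in "core-time units" are hypothetical.

```python
# Illustrative assumption taken from the slide: one thread delivers 70% of a core.
THREAD_FRACTION_OF_CORE = 0.70


def elapsed_one_core(work_per_guest: float, guests: int = 2) -> float:
    """Guests share one non-SMT core, so their work runs back to back."""
    return work_per_guest * guests


def elapsed_two_threads(work_per_guest: float) -> float:
    """Each guest runs on its own thread, at a fraction of full core speed."""
    return work_per_guest / THREAD_FRACTION_OF_CORE


def core_capacity_with_smt(threads: int = 2) -> float:
    """Aggregate capacity of one core, relative to the same core without SMT."""
    return threads * THREAD_FRACTION_OF_CORE


if __name__ == "__main__":
    work = 1.0  # hypothetical: each guest needs 1.0 unit of full-core time
    print("one core:    ", elapsed_one_core(work))
    print("two threads: ", round(elapsed_two_threads(work), 2))
    print("capacity:    ", core_capacity_with_smt())
```

Under these assumptions each guest individually runs slower, but both finish sooner than they would by sharing one core serially, which is the "wider, not (much) higher" effect.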
Implementation
▪ Enable SMT for IFLs
▪ Treat each thread as an independent processor (CPU)
▪ Dispatch virtual IFLs on threads
– Same or different guests can share threads of a core
– Adds to variability
▪ Exploit topology awareness
– Single Dispatch Vector per core
– Topologically-aware steal
– Slight bias towards placing virtual MP sibling CPUs on same Dispatch Vector
Implementation …
▪ Exploit Compare and Delay in spin loops to yield to other thread
▪ Improve handling of guest IPTE and similar instructions
– Helps compensate for guest and host overhead that additional virtual CPUs might induce
– Benefits all (i.e., SMT and non-SMT) environments
Implementation …
▪ CPU address expansion
– Without SMT
• CPU x0014 = 0000 0000 0001 0100
– With SMT
• CPU x0014 thread 0 = 0000 0000 0010 1000 (x0028)
• CPU x0014 thread 1 = 0000 0000 0010 1001 (x0029)
– Non-IFL processor odd address unavailable or unused
Partition: ZVMPROD
Cores     Threads (CPUs)
CP 00     00, 01
CP 01     02, 03
zIIP 02   04, 05
IFL 03    06, 07
IFL 04    08, 09
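The address expansion shown above amounts to shifting the core address left by one bit and placing the thread number in the low-order bit. A minimal sketch, assuming at most two threads per core as on this slide; the function names are hypothetical and are not z/VM interfaces.

```python
def thread_cpu_address(core_id: int, thread: int, smt_enabled: bool = True) -> int:
    """Return the CPU address for a thread of a core.

    Without SMT the CPU address is the core address itself.
    With SMT the core address is shifted left one bit and the
    thread number (0 or 1) occupies the low-order bit.
    """
    if not smt_enabled:
        if thread != 0:
            raise ValueError("only thread 0 exists when SMT is not enabled")
        return core_id
    if thread not in (0, 1):
        raise ValueError("this sketch assumes at most two threads per core")
    return (core_id << 1) | thread


def core_of(cpu_address: int) -> int:
    """Recover the core address from an SMT-expanded CPU address."""
    return cpu_address >> 1


if __name__ == "__main__":
    for t in (0, 1):
        addr = thread_cpu_address(0x14, t)
        print(f"core x0014 thread {t} -> x{addr:04X}")
```

For core x0014 this yields x0028 and x0029, matching the slide. On non-IFL cores only the even (thread 0) address is in use.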
Externals
▪ MULTITHREADING Configuration Statement
▪ QUERY MULTITHREAD|MT
▪ QUERY PROCESSOR Response
▪ VARY CORE
▪ VARY PROCESSOR
▪ INDICATE MULTITHREAD|MT
▪ Metrics
▪ Processor Time Accounting
▪ STSI
▪ STHYI
▪ Live Guest Relocation Implications
▪ Monitor Changes
MULTITHREADING Configuration Statement
►►──MULTITHreading──┬─DISAble─────────────────────┬────►◄
                    └─ENAble──┤ Enable Operands ├─┘
Enable Operands:
   ┌─MAX_THREADS──MAX─────┐  ┌─TYPE──ALL──MAX──────────────┐
├──┼──────────────────────┼──┼─────────────────────────────┼──┤
   └─MAX_THREADS──┬─nn──┬─┘  ├─TYPE──ALL──┬─nn──┬──────────┤
                  └─MAX─┘    │            └─MAX─┘          │
                             │ ◄─────────────────────────◄ │
                             └───TYPE──┬─CP───┬──┬─nn──┬───┘
                                       ├─ICF──┤  └─MAX─┘
                                       ├─IFL──┤
                                       └─ZIIP─┘
QUERY MULTITHREAD|MT
►►──Query──┬─MULTITHread─┬─────────────────────────────►◄
           └─MT──────────┘
Multithreading is enabled.
              Requested  Activated
              Threads    Threads
MAX_THREADS   MAX        2
CP core       MAX        1
IFL core      MAX        2
ICF core      MAX        1
zIIP core     MAX        1
QUERY PROCESSOR Response
PROCESSOR 00 MASTER CP CORE 0000
PROCESSOR 02 ALTERNATE CP CORE 0001
PROCESSOR 04 ALTERNATE CP CORE 0002
PROCESSOR 06 ALTERNATE IFL CORE 0003
PROCESSOR 07 ALTERNATE IFL CORE 0003
PROCESSOR 08 ALTERNATE IFL CORE 0004
PROCESSOR 09 ALTERNATE IFL CORE 0004
PROCESSOR 0A ALTERNATE IFL CORE 0005
PROCESSOR 0B ALTERNATE IFL CORE 0005
PROCESSOR 0C ALTERNATE ZIIP CORE 0006
PROCESSOR 0E ALTERNATE ZIIP CORE 0007
PROCESSOR 10 ALTERNATE CP CORE 0008
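To see which processors (threads) belong to which core, the response above can be grouped by its CORE field. The sketch below assumes the response has exactly the six-field line layout shown on this slide; the real command output may carry additional lines or fields, and the function name is hypothetical.

```python
SAMPLE = """\
PROCESSOR 00 MASTER CP CORE 0000
PROCESSOR 02 ALTERNATE CP CORE 0001
PROCESSOR 04 ALTERNATE CP CORE 0002
PROCESSOR 06 ALTERNATE IFL CORE 0003
PROCESSOR 07 ALTERNATE IFL CORE 0003
PROCESSOR 08 ALTERNATE IFL CORE 0004
PROCESSOR 09 ALTERNATE IFL CORE 0004
PROCESSOR 0A ALTERNATE IFL CORE 0005
PROCESSOR 0B ALTERNATE IFL CORE 0005
PROCESSOR 0C ALTERNATE ZIIP CORE 0006
PROCESSOR 0E ALTERNATE ZIIP CORE 0007
PROCESSOR 10 ALTERNATE CP CORE 0008
"""


def group_by_core(response: str) -> dict:
    """Map each core ID to its type and the processor addresses on it."""
    cores = {}
    for line in response.splitlines():
        fields = line.split()
        # Skip anything that does not match the layout shown on the slide.
        if len(fields) != 6 or fields[0] != "PROCESSOR" or fields[4] != "CORE":
            continue
        _, cpu, _role, cpu_type, _, core = fields
        entry = cores.setdefault(core, {"type": cpu_type, "cpus": []})
        entry["cpus"].append(cpu)
    return cores


if __name__ == "__main__":
    for core, info in group_by_core(SAMPLE).items():
        print(core, info["type"], "-".join(info["cpus"]))
```

In the sample, only the IFL cores (0003 to 0005) report two processors each; the CP and zIIP cores report a single even-numbered processor.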
VARY CORE
►►──VARY──┬─ONline──┬──CORE──nnnn──────────────────────►◄
          └─OFFline─┘
VARY PROCESSOR
▪ Not permitted when SMT enabled
– Must vary entire core
▪ VARY CORE supported in SMT and non-SMT environments
INDICATE MULTITHREAD|MT
►►──INDicate──┬─MULTITHread─┬──────────────────────────►◄
              └─MT──────────┘
Multithreading is enabled.
AvgUtil 35% 7
Core Type CP 1 AvgUtil 8% AvgProd 78%
CF 25% MaxCF 100%
Core 0000 VH Util 15% Prod 85% Procs 0000
Core 0001 VM Util 10% Prod 83% Procs 0002
Core 0002 VL Util 5% Prod 71% Procs 0004
Core 0006 VL Util 2% Prod 72% Procs 000C
Core Type IFL 2 AvgUtil 92% AvgProd 81%
CF 80% MaxCF 140%
Core 0003 VH Util 100% Prod 75% Procs 0006-0007
Core 0004 VL Util 90% Prod 88% Procs 0008-0009
Core 0005 VH Util 85% Prod 80% Procs 000A-000B
Metrics
▪ Cycle count with one thread active (C_1)
▪ Cycle count with two threads active (C_2)
▪ Instruction count with one thread active (I_1)
▪ Instruction count with two threads active (I_2)
▪ Instructions per cycle with one thread active (IPC_1)
▪ Instructions per cycle with two threads active (IPC_2)
▪ Prod = Productivity (work actually accomplished/maximum capacity)
▪ CF = Capacity Factor (instructions/cycle)
Metrics …
▪ Mean Thread Density = ((1 * C_1) + (2 * C_2)) / (C_1 + C_2)
▪ Core Productivity = (I_1 + I_2) / (IPC_2 * (C_1 + C_2))
▪ Core Busy Time = (C_1 + C_2) / Processor Speed
▪ CPU Type Productivity = (I_1 + I_2) / (IPC_2 * (C_1 + C_2))
▪ CPU Type Maximum Capacity = IPC_2 / IPC_1
▪ CPU Type ChargeBack Factor = (CPU Type Capacity Factor) / (Mean Thread Density for CPU type)
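The formulas above can be exercised with a small Python sketch. It assumes IPC_1 = I_1 / C_1 and IPC_2 = I_2 / C_2, which the slides imply but do not state, and it takes the Capacity Factor as an input because no formula for it is given here. The class, the function names, and the counter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CoreCounters:
    c1: int  # cycles with one thread active (C_1), must be non-zero
    c2: int  # cycles with two threads active (C_2), must be non-zero
    i1: int  # instructions with one thread active (I_1)
    i2: int  # instructions with two threads active (I_2)

    @property
    def ipc1(self) -> float:
        return self.i1 / self.c1  # assumed definition of IPC_1

    @property
    def ipc2(self) -> float:
        return self.i2 / self.c2  # assumed definition of IPC_2


def mean_thread_density(k: CoreCounters) -> float:
    return (1 * k.c1 + 2 * k.c2) / (k.c1 + k.c2)


def productivity(k: CoreCounters) -> float:
    return (k.i1 + k.i2) / (k.ipc2 * (k.c1 + k.c2))


def busy_seconds(k: CoreCounters, cycles_per_second: float) -> float:
    return (k.c1 + k.c2) / cycles_per_second


def maximum_capacity(k: CoreCounters) -> float:
    return k.ipc2 / k.ipc1


def chargeback_factor(capacity_factor: float, thread_density: float) -> float:
    return capacity_factor / thread_density


if __name__ == "__main__":
    k = CoreCounters(c1=400, c2=600, i1=400, i2=840)  # made-up counter values
    print("thread density:  ", mean_thread_density(k))
    print("productivity:    ", round(productivity(k), 3))
    print("maximum capacity:", round(maximum_capacity(k), 3))
```

With these made-up counters the core averages 1.6 active threads while busy and delivers about 89% of what it could have delivered had both threads been active the whole time.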
Processor Time Accounting
▪ z/VM keeps up to three sets of books with respect to virtual processor CPU time
– Raw Time
• Time dispatched on thread
• Reported by guest Processor Timer
• Accurate when SMT not enabled
• Used for Dispatcher time slice and Scheduler priority computations
– MT-1 Equivalent Time
• When SMT enabled, approximates Raw Time if SMT were not enabled
– Prorated Core Time (deferred)
• Proportional distribution of core use among consumers
• Requires CPUMF counter extraction
• Intended for future use with CPU Pools
Processor Time Accounting …
▪ MT-1 Equivalent Time reported by
– All command responses that include CPU time
• INDICATE USER, QUERY TIME, LOGOFF
– User type 1 accounting record
▪ Monitor data currently reports Raw Time
– MT-1 Equivalent and (when calculated) Prorated Core Times added
▪ New type F accounting record reports Raw Time and (when calculated) Prorated Core Time
STSI
▪ Response indicates SMT enabled
▪ Reports threading level
▪ Processor counts report numbers of cores
STHYI
▪ Response enhanced
– SMT enablement indicator
– Number of threads enabled on each CP and IFL core
– Which CPU-time set of books enforces LIMITHARD and CAPACITY limits
▪ Counts and capacities expressed in cores, not threads/processors/CPUs
Live Guest Relocation Implications
▪ Guest can relocate between SMT and non-SMT systems
– Capacity will be affected
• Number of virtual CPUs might require adjustment
– CPU time reconciled in any combination of circumstances
• Special cases for times in Monitor data, which must be monotonically increasing
Monitor Changes
▪ Many records that report information per processor will report per thread
▪ Some records that report information per processor will report per core
▪ Some records will report both core and thread information
▪ Some records have MT-1 Equivalent Time and Prorated Time added
▪ New record for CPUMF SMT counters
Records Reporting Data per Thread
▪ D0R1 SYTSYP System wide utilization
▪ D0R4 SYTRSP Real storage exception conditions
▪ D0R5 SYTXSP Expanded storage statistics
▪ D0R11 SYTCOM IUCV/VMCF communication activity
▪ D0R13 SYTSCP Scheduler activity
▪ D0R22 SYTSXP System Execution Space storage utilization
▪ D3R2 STORSP Real memory utilization
▪ D3R20 STOSXP System Execution Space storage utilization
▪ D5R1 PRCVON Processor varied online
▪ D5R2 PRCVOF Processor varied offline
Records Reporting Data per Thread …
▪ D5R3 PRCPRP Work dispatched and CPU state
▪ D5R11 PRCINS Instruction simulation counters
▪ D5R12 PRCDIA Diagnose counters
▪ D5R13 PRCMFC CPU Measurement Facility counters
▪ D5R15 PRCDSV Logical CPU dispatch vector assignment
▪ D5R18 PRCDHF High frequency dispatch vector sampling
Records Reporting Data per Core
▪ D0R15 SYTCUG CPU utilization in a LPAR environment
▪ D0R16 SYTCUP Logical partition CPU utilization
▪ D0R17 SYTCUM LPAR management physical CPU utilization
▪ D0R19 SYTSYG System wide utilization data
▪ D1R4 MTRSYS System configuration data
▪ D1R18 MTRCCC CPU capability change
▪ D1R26 MTRTOP System topology counts
▪ D5R14 PRCTOP System topology
Records Reporting Both Core and Thread Data
▪ D0R2 SYTPRP Real processor data
▪ D1R5 MTRPRP Real processor configuration
▪ D5R16 PRCPUP CPU park/unpark decision
▪ D5R17 PRCRCD Real CPU data
Records With Additional CPU Time Data
▪ D2R4 SCLADL Add user to dispatch list
▪ D2R5 SCLDDL Drop user from dispatch list
▪ D2R13 SCLALL Add VMDBK to limit list
▪ D2R14 SCLDLL Drop VMDBK from limit list
▪ D4R2 USELOF User logoff data
▪ D4R3 USEACT User activity data
▪ D4R9 USEATE User activity data at transaction end
New Record
▪ D5R20 PRCMFM CPUMF MT-Diagnostic Counters
Guest Reconfiguration Considerations
▪ Guests may need additional virtual CPUs in order to consume sufficient processor resources
▪ For illustration purposes, assume
– z13 core is 10% faster than zEC12
– Thread delivers 70% of core
▪ Then, z13 thread = 77% (70% of 1.1) of zEC12 core
▪ Guests with utilization above 77% may need to be reconfigured
▪ Alternative is longer run durations, increased response times, different resource contention profiles (e.g., more memory, less I/O)
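The 77% figure and its sizing consequence can be worked through in Python. The 10% and 70% inputs are the illustrative assumptions from this slide, not measurements. The vcpus_needed rule of thumb is a hypothetical sketch and not an IBM sizing method; it also assumes the guest's software can actually use the extra virtual CPUs.

```python
import math

# Illustrative assumptions from the slide.
Z13_CORE_SPEEDUP = 1.10         # z13 core vs. zEC12 core
THREAD_FRACTION_OF_CORE = 0.70  # one thread vs. a whole z13 core


def thread_vs_old_core() -> float:
    """Capacity of one z13 thread relative to one zEC12 core."""
    return Z13_CORE_SPEEDUP * THREAD_FRACTION_OF_CORE


def vcpus_needed(old_vcpus: int, old_utilization: float) -> int:
    """Hypothetical rule of thumb: virtual CPUs needed on z13 threads to
    deliver the CPU the guest consumed on zEC12 cores, never fewer than
    the guest had before."""
    demand = old_vcpus * old_utilization  # in zEC12-core equivalents
    # The small epsilon guards against floating-point noise at exact boundaries.
    needed = math.ceil(demand / thread_vs_old_core() - 1e-9)
    return max(old_vcpus, needed)


if __name__ == "__main__":
    print("z13 thread vs zEC12 core:", round(thread_vs_old_core(), 2))
    for vcpus, util in [(1, 0.50), (1, 0.90), (2, 0.90)]:
        print(f"{vcpus} vCPU(s) at {util:.0%} ->", vcpus_needed(vcpus, util))
```

Under these assumptions a guest whose single virtual CPU ran at 90% on zEC12 would want a second virtual CPU on an SMT-enabled z13, while one running at 50% would not.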
Guest Reconfiguration Illustration
[Figure: elapsed time for three cases, "One vCPU, one core", "One vCPU, one thread", and "Two vCPUs, two threads" (vCPU 0 and vCPU 1), with the additional capacity in the two-thread case marked (assumes one thread delivers 70% of a core).]
CPU Scalability Improvements
▪ Processing streamlined, bottlenecks removed to allow scaling to 64 processors (64 cores without SMT or 32 cores with SMT) on z13
– Prior machine support limit remains 32 processors
▪ Reduced scheduler lock contention
– Some high-frequency paths changed to obtain lock in shared mode or bypass lock altogether
– Eliminated dispatch-list reorder on test-idle transitions
CPU Scalability Improvements …
▪ More efficient memory management in key areas
– Batching and processor-local queues for VSWITCH buffers
– Streamlined address space access in VDISK support
– Adaptive resizing of local available lists, to address demand spikes in concurrent IPL of many guests
– Segregation of firmware and software frame post-processing lists to reduce contention and streamline operation
– Cache-friendlier page-level locking
Performance
▪ Too early to provide comprehensive measurements
▪ 70% factor based on modeling and only for illustration purposes
▪ Pertinent factors
– Customers on older hardware will see more benefit
– z/VM imposes more load on address translator and TLB than z/OS and Linux
– Additional virtual CPUs only help guests running software that can exploit them
▪ Plan to update z/VM Performance Report
– Measurement results
– Virtual machine sizing considerations
– SMT effectiveness evaluation methodology
▪ Anticipate individual workload evaluation required
Workloads Suited to SMT
▪ Care mainly (or only) about throughput, not thread speed / latency
– Tolerant of longer elapsed time for jobs / transactions
▪ “Scale out” well with many more logical processors / active threads
– Tolerant of more/slower threads to get capacity
▪ Are capacity-constrained by the size of available systems
– SMT yields more throughput per box
▪ Share instructions and/or read-only data among threads
– Turns SMT cache sharing into an asset
▪ Make balanced / moderate use of processor resources
– Do not drive any features to very-high utilization
Workloads Not Suited to SMT
▪ Need the fastest available threads
▪ Have high interaction via serialized resources
– (More + slower threads) = (higher lock contention)
▪ Are very sensitive to cache sizes
– SMT = less effective cache for each thread to use (but SMT is helpful for mitigating cache miss latency, up to a point)
▪ Drive some processor features / design elements to saturation
– E.g., floating-point-computation-intensive programs
▪ Run at very high overall instructions per cycle (low CPI) in CPU
– Little “white space” in pipeline for SMT to leverage
Limitations
▪ No dynamic switching of SMT mode
▪ No mechanism to give guest whole core by leaving one thread idle
▪ Requires vertical polarization (HiperDispatch)
– No support for dedicating processors
– RESHUFFLE is only supported work distribution method
▪ Thread-aware CPU Pooling not available
– Will be remedied after GA
▪ No STSI 15.1.4 support (i.e., no drawer awareness)
Support Information – Available 2015-03-13
▪ VM65577 (z/VM 6.2, 6.3)
– Crypto CEX5S with Enhanced Domain Support
– z13 Processor compatibility
– z13 I/O compatibility
▪ VM65586 (z/VM 6.3)
– CPU Scalability
– Host SMT Exploitation
Support Information – Available 2015-03-13 …
▪ Performance Toolkit
– VM65527 (z/VM 6.2, 6.3)
• z13 Compatibility
– VM65529 (z/VM 6.3)
• SMT
▪ VM65588 (z/VM 6.2, 6.3)
– DirMaint Support for Crypto CEX5S with Enhanced Domain Support
▪ VM65676 (CMS) and VM65677 (CP)
– Stand-alone dump SMT support
– Order separately (will not be on RSU due to size)
Support Information – Available 2Q2015
▪ VM65583 and PI21053 (z/VM 6.3)
– Multi-VSWITCH Link Aggregation
▪ VM65528 (z/VM 6.3)
– Performance Toolkit Multi-VSWITCH Link Aggregation
Support Information – Availability TBD
▪ VM65680 (z/VM 6.3)
– Manage CPU Pool capacity using prorated core time
Questions?