Virtualizing Databases Doing IT Right – The Sequel
VAPP1318
Michael Corey, Ntirety - A Division of Hosting Jeff Szastak, VMware, Inc
Jeff Szastak
MSIA, CISSP, VCP, MCSE, etc.
Manager, Systems Engineering; CTO Ambassador, VMware, Inc.
Microsoft Exchange & SQL virtualization; BC/DR SME
@szastak
Blog contributor: blogs.vmware.com/apps, www.virtualinsanity.com
Michael J Corey
Books Include: Virtualizing SQL Server with VMware Doing IT Right Oracle Database 12c: Install, Configure & Maintain like a Professional Oracle 11g A Beginner’s Guide Oracle 10g A Beginner’s Guide Oracle 9i - A Beginner's Guide SQL Server 7 Data Warehousing Oracle8i - Data Warehousing Oracle8i - A Beginner's Guide Oracle8 - Data Warehousing Oracle8 – Tuning Oracle8 - A Beginner's Guide Oracle - Data Warehousing Oracle - A Beginner's Guide Tuning Oracle
Key Past/Current Affiliations: Past President of the IOUG Founding Board IOUG Virtualization SIG Past Member IOUG Board of Directors Past Director of Education IOUG Founder Professional Association of SQL Server Talkin’Cloud Top 200 Channel Partner Experts Cloud Past Member Microsoft Data Warehouse Council Past Member Oracle Educational Advisory Council Past Director of Conferences IOUG Alive Executive Board Massachusetts Robert H. Goddard Council on Science, Technology, Engineering & Mathematics
Started Working with Oracle Version 3.0 Beta Tested Oracle 5,6,6.2,7,8.X,9.X.…. Presented on Technology & Business Topics from Brazil to Australia Worked with Oracle on UNIX, Linux, Windows, MVS,VM, VMS,..
Doing Something Different
• Presentation Covers Both Oracle & Microsoft SQL Server
• More & More DBAs are faced with maintaining both
• Many issues faced are shared
“This is a Database on Virtualized Infrastructure Session; Principles Apply to All Databases”
VMware: concise, focused driver set; very efficient, well-vetted drivers
Hardware resource with O/S du jour: many drivers, many versions; new drivers can cause issues
Why Your Company Cares: Virtualization is Strategic
Physical World (1:1)
• 1:1 relationship between applications and hardware
• Relevant cost metric = cost per server
• 8% - 12% utilization is typical
Virtual World (Many:1)
• Many:1 relationship between applications and hardware
• Relevant cost metric = cost per application
• 60% - 80% utilization is typical
• 60% reduction in CapEx
• 30% reduction in OpEx
• 80% reduction in energy
The New Norm
“Can You Say Right-Sizing”
Oracle - Hot Add Memory
Oracle database memory parameters are defined at instance startup. You will have to restart the database to take advantage of added memory, unless you have set SGA_MAX_SIZE large enough in advance.
Big caution: this is a shared-resource environment! Typically SGA_TARGET <= SGA_MAX_SIZE; an oversized SGA_MAX_SIZE could be wasting memory.
http://www.vmware.com/files/pdf/solutions/oracle/Oracle_Databases_VMware_Workload_Characterization_Study.pdf
1St Time Goal of Consistency Standardization Can Be Achieved
“Any Resource, Any Server, At Any Time” in the (Pool)
The 10 Millionth Model T was produced on June 4, 1927
Very Large ERP System • 75+ application tiers – VMware/RHEL • 8 TB database; 8.8 billion rows of data • 52 million transactions per day • 79K IOPS • 40K blocks per second interconnect traffic • 40,000+ named users • 4,000+ peak concurrent users
Source EMC
“Yes This is Virtualized”
Performance Test Environment (Topology)
■ VMware vSphere 5.1, Red Hat Enterprise Linux (RHEL) 6.3
■ Oracle 11gR2 (11.2.0.3) Single Instance and RAC
■ 3PAR StoreServ 10400
■ 192 x 15K RPM Fibre Channel Disks
■ 32 x 150K RPM Solid State Disk (SSD)
■ ProLiant DL580 G7 (client)
■ Intel® Xeon® CPU X7560 @ 2.26 GHz (8 cores)
■ 128GB memory
■ ProLiant BL660c Gen8 - 4 sockets / 24 cores (database server)
■ Intel® Xeon® CPU E5-4610 @ 2.40 GHz (6 cores)
■ 64GB memory
■ HP Virtual Connect FlexFabric 10Gb/24-Port Module
Recent “HP” Performance Study – Choose Your Vendor DU-JOUR
Performance Results
• Virtualization has ~5% overhead as compared to native
• The database tps on a virtual machine is 5% less than that on the physical machine
• 2P represents 12 cores and 4P represents 24 cores
• For 100 users the delta is ~6% and that increases up to ~10% for 1700 users.
• When the system gets busier, native starts to have a slightly larger advantage over virtualization.
Performance Results - Continued
• Both virtual and native, by moving from 2P (12 cores) to 4P (24 cores)
• The database tps increases by 40% to 50%
• The CPU utilization drops from 80% to 60%
• For RAC , by moving from 2P (12 cores) to 4P (24 cores)
• The database tps increases by 40% to 60%
• The CPU utilization drops from 75% to 60%
“Who Architects a Database With Less Than 5% Overhead? One Busy Day and You're Done”
Workload Characteristics
• OLTP type of workload with a read/write ratio of 2:1
• Oracle Database size of 600GB
• Workload is an implementation of an online store
• The driver program simulates users logging in, browsing for products by title or category, adding selected products to their shopping cart, and then purchasing those products
Mega vMotion: RAC on vSphere Functional Stress Test
VMW, EMC, Cisco. Executed by “Principled Technologies”, 2013
www.principledtechnologies.com/Vmware/vMotion_oracle_rac_1013.pdf
3 RAC nodes, vMotion on all 3 nodes simultaneously, without any network disruption
Service Level Agreement / The DBA
Situation: Customer monitors critical medical equipment within a hospital. A SQL Server database is at the core of the system. It is having huge performance problems, and “failure is not an option”.
Solution: Need to take the server down and adjust a BIOS setting that is causing SQL Server to have access to only 50% of the available CPU.
Customer: There is never a time they can take the server down for 5 minutes.
Stand-alone instance: had it been virtualized, the DBA would have had options.
No Win - SLA
Yet this situation points to a bigger issue concerning management's expectations of the availability of the database and the physical infrastructure's ability to support those expectations.
Have The Conversation
• Get the resources you need to meet the expectation
• OR reset expectations concerning database uptime
Avoid the Good-Intention BIOS Setting: Check Power Management Settings
• The default on a lot of servers is a “green”-friendly setting
• Saves energy when the server is inactive
• Many times does not ramp the CPU up quickly, and in some cases not completely
• Avoid the “dozing” setting
• Slows the CPU to half its speed
Proper setting for a server hosting a database is “High Performance”
BIOS Settings to Consider If Your Processors Support It
• Enable “Turbo Mode” • Enable “Hyper-threading”
Enable all hardware-assisted virtualization features in the BIOS.
Fun Facts
Faster than the rate of babies born in the U.S.
10 VMs STARTED EVERY MINUTE 80 ,000 VMware-certified Professionals in 146 Countries (July 2012) 6 vMOTIONS PER SECOND
More VMs are in motion than planes in flight. 20 MILLION VMs - 2011
If they were physical machines they would stretch 2x the length of Great Wall of China
Lessons Learned – Tier 1
“What works in Tier-2 (non-production) will not always work with Tier-1 (production)”
Doing It Right 1st Time: Very Conservative
Designed to ensure you avoid the common traps & pitfalls associated with virtualizing production databases
Doing It Right: Read Best Practices Guides
Read the documentation from all your vendors……
VMware, Microsoft, Storage Vendor, Network Vendor….
Appendix of this deck
Professional Association of SQL Server
http://virtualization.sqlpass.org/ “Take Advantage of All resources Available to You”
• “Oracle Performance Management with vCenter Operations Manager and Oracle Enterprise Manager Adapter”
• “Virtualizing Oracle 11gR2 RAC on Vmware vSphere: Best Practices” • “Virtualization Bootcamp: Optimizing Oracle Databases on Vmware”
Sign-up for the NEW VMware SIG and gain access to content, webinars and networking opportunities
Blogs: Longwhiteclouds.com
http://vsphere-land.com/news/2014-top-vmware-virtualization-blog-voting-results.html?utm_content=bufferc62e1&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
#13
Installation
• Plan your SQL Server installation
– SLAs, RPOs, RTOs
– Baseline current workload, at least 1 business cycle
– Baseline existing (workload) vSphere implementation
– Estimated growth rates
– I/O requirements (I/O per sec, throughput, latency)
– Storage (disk type/speed, RAID, flash cache solution, etc.)
– Software versions (vSphere, Windows, SQL)
– Product keys
– Licensing (may determine architecture)
– Workload type (OLTP, Batch, Warehouse)
– Accounts needed for installation / service accounts
– High Availability strategy
– Backup & Recovery strategy
“If you aim at nothing, you will hit it every time” – Zig Ziglar
Planning a High Availability Strategy
§ Requirements
• Recovery Time Objective (RTO)
• What does 99.99% availability really mean?
• Recovery Point Objective (RPO)
• Zero data lost?
• HA vs. DR requirements
§ Evaluating a technology
• What's the cost of implementing the technology?
• What's the complexity of implementing and managing the technology?
• What's the downtime potential?
• What's the data loss exposure?
Availability %            Downtime / Year   Downtime / Month*   Downtime / Week
"Two Nines" - 99%         3.65 Days         7.2 Hours           1.69 Hours
"Three Nines" - 99.9%     8.76 Hours        43.2 Minutes        10.1 Minutes
"Four Nines" - 99.99%     52.56 Minutes     4.32 Minutes        1.01 Minutes
"Five Nines" - 99.999%    5.26 Minutes      25.9 Seconds        6.06 Seconds
* Using a 30 day month
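The downtime budgets in the availability table follow from simple arithmetic: the unavailable fraction multiplied by the length of the period. A minimal Python sketch, assuming a 365-day year, a 30-day month, and a 7-day week (the function name is illustrative, not from the deck):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_seconds(availability_pct, period_days):
    """Return the downtime budget in seconds for a period of `period_days` days."""
    unavailable_fraction = 1.0 - availability_pct / 100.0
    return unavailable_fraction * period_days * SECONDS_PER_DAY


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        year_s = allowed_downtime_seconds(pct, 365)
        month_s = allowed_downtime_seconds(pct, 30)
        week_s = allowed_downtime_seconds(pct, 7)
        print(f"{pct}%: {year_s / 3600:.2f} h/year, "
              f"{month_s / 60:.2f} min/month, {week_s / 60:.2f} min/week")
```

Running the same function against an agreed SLA figure is a quick way to ground the RTO conversation in minutes rather than "nines".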
Baseline, Baseline, Baseline………
Why will making it virtual make it perform better? If so, how?
– New Hardware? – Faster CPU? – Faster Drives?
“There are no silver bullets”
“IT” Food Groups: What to Baseline
• Existing Physical Database Infrastructure
• Existing/Proposed vSphere Infrastructure
When You Baseline a Database
§ Make sure the sample interval is frequent
§ CPU, Memory, Disk (15 seconds or less)
§ SQL Server T-SQL (1 minute)
“A Lot can happen in a
short amount of time”
“SAME Applies to Oracle ! ! ! - A lot Can Happen
Oracle 12c Cloud Control/DB Express
The default thresholds for alerting in Cloud Control 12c are a good starting point
Database As A Service – Road Map
Multiple Tier Approach
• Different levels for different DB placement
• Basic and Premium
– Basic = Low utilization, test / dev DBs
– Premium = Moderate to high utilization, production, high visibility
• Different underlying hardware
• Different SLAs, RTOs, RPOs and HA between tiers
Center of Excellence
• Assist with migrations, net new DBs and capacity management
– Communication, no “throwing it over the wall”
• VMware/SAN/Network/DB teams to discuss DB migrations
– Optional teams: Security, Procurement
“Few Dedicated Personnel at Each Level of the Stack – End Users Are Taking Advantage of Automation”
Understanding Workload Resource Requirements
Basic performance characteristics (CPU, memory, IO, Network)
• Daily average resource usage
• Daily peak resource usage
• Daily peak hours
• Month-end, quarter-end, year-end peaks
Monitoring Tools
• Windows Perfmon (Example)
– Processor(*) → % Processor Time
– Process(sqlservr) → % Processor Time
– SQLServer:Memory Manager → Total Server Memory (KB)
– PhysicalDisk(*) → Disk Reads/sec, Disk Writes/sec
– PhysicalDisk(*) → Disk Read Bytes/sec, Disk Write Bytes/sec
– Network Interface(*) → Bytes Received/sec, Bytes Sent/sec
%MLMTD
§ VM Level - The percentage of time the vCPU was ready to run but deliberately wasn't scheduled because that would violate the “CPU limit” settings. If larger than 0, the world is being throttled due to the limit on CPU
Migration – Baseline: Physical (disk) Pre
LogicalDisk\Avg. Disk sec/Read      Read latency
LogicalDisk\Avg. Disk sec/Write     Write latency
LogicalDisk\Disk Read Bytes/sec     Read throughput
LogicalDisk\Disk Write Bytes/sec    Write throughput
LogicalDisk\Disk Reads/sec          Read IOPS
LogicalDisk\Disk Writes/sec         Write IOPS
LogicalDisk\Disk Transfers/sec      Combined IOPS
Migration – Baseline: Virtual (disk) Post
§ Export the output to Excel, or graph it using a variety of tools, such as Jonathan Kehayias' PowerShell script.
§ Compare the results against the required IOPS as measured in the pre-deployment assessment.
Determine IOPS & Throughput
ORION (part of 11.2 now)
    sudo -u root ./orion_linux_x86-64 -run advanced -testname traxpoc -num_disks 20 -cache_size 8000 -duration 240 -matrix basic
SLOB (Silly Little Oracle Benchmark)
Calibrate I/O – Native to Oracle starting in 11.1
    SQL> declare
           l_latency integer;
           l_iops    integer;
           l_mbps    integer;
         begin
           dbms_resource_manager.calibrate_io(5, 10, l_iops, l_mbps, l_latency);
           dbms_output.put_line('max_iops = ' || l_iops);
           dbms_output.put_line('latency = ' || l_latency);
           dbms_output.put_line('max_mbps = ' || l_mbps);
         end;
         /
    max_iops = 5348
    latency = 10
    max_mbps = 641
Other Free Tools:
• Swingbench
• TPC Benchmark
• Custom scripts
How do you know for sure? Oracle's - $$$: Database Replay
Don't Keep It a Secret
• DBAs – tell vSphere, Storage, and Network Admins your needs
– Storage: (IOPS / throughput)
– CPU: (MHz)
– Memory: (Total GB)
– Network: Bandwidth
– Features (e.g. Windows clustering)
– Anticipated growth rates
– Anticipated activity
– Other
“They Flunked Mind Reading”
Before You Install a Database on a New VM
• Do basic throughput testing of the IO subsystem prior to deploying a database
• Tools you can use
– SQLIO/IOMETER
– SLOB…..
“Check It Before You Wreck it” -- Jeff Szastak
SQL Server - Unattended Installation Options
§ VMware vCAC Command Line
• http://msdn.microsoft.com/en-us/library/ms144259
§ Configuration File
• http://msdn.microsoft.com/en-us/library/dd239405
§ Sysprep
• http://msdn.microsoft.com/en-us/library/ee210664
• FYI – Available as of SQL Server 2008 R2
ORACLE - Unattended Installation Options
You At the VMworld Party While your Database is Provisioned
VMware vCAC DBCA Silent Install
http://docs.oracle.com/cd/E11882_01/install.112/e24321/app_nonint.htm#CIHHFDGG RAC Silent Install http://docs.oracle.com/cd/E11882_01/install.112/e24660/cripts.htm#RILIN1119
Phone-A-Friend
VMware has stated that it will take the ______support call if a customer calls ______ Support and ______ Support is being difficult because the
customer is running on VMware.
• Hint……. “TSANET.ORG--- Hardware or Software”
Use SQL Server/Oracle recommended installation guidelines for the respective operating system – same as physical!
Physical World 1 :1 Virtual World
Many :1
Same As Physical
If your OS and database don’t know they are virtualized do you need to tell them?
Did You Hear That?
Database Workload Types
OLTP
§ Large number of small queries
§ Sustained CPU utilization during working hours
§ Sensitive to peak contention (slowdowns affect SLA)
§ Generally write intensive
§ May generate many chatty network round trips
Batch / ETL
§ Typically runs during off-peak hours, low CPU utilization during the normal working hours
§ Can withstand peak contention, but sustained activity is key
DSS
§ Small number of large queries
§ CPU, memory, disk IO intensive
§ Peaks during month end, quarter end, year end
§ Can benefit from inter-query parallelism with a large number of threads
OLTP vs. Batch Workloads
OLTP Workload (avg. 15%)
§ What this says:
• Average 15% utilization
• Moderate sustained activity (around 28% during working hours, 8am-6pm)
• Minimal activity during non-working hours
• Peak utilization of 58%
Batch Workload (avg. 15%)
§ What this says:
• Average 15% utilization
• Very quiet during the working day (less than 8% utilization)
• Heavy activity during 1am-4am, with avg. 73% and peak 95%
OLTP vs. Batch Workloads
§ What This Means
• Better server utilization
• Improved consolidation ratios
• Less equipment to patch, service, etc.
• Saves money / less licensing
OLTP/Batch Combined Workload
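Why daytime OLTP and overnight batch stack well can be illustrated by adding two hourly CPU profiles. The Python sketch below uses hypothetical hourly values loosely shaped like the slides (they are not the measured data), and it assumes both workloads would land on hosts of equal capacity so that percentages can simply be summed:

```python
def combined_profile(oltp, batch):
    """Add two 24-entry hourly CPU utilization profiles (percent of host)."""
    if len(oltp) != 24 or len(batch) != 24:
        raise ValueError("each profile needs 24 hourly samples")
    return [o + b for o, b in zip(oltp, batch)]


# Hypothetical profiles: OLTP is busy 8am-6pm, batch is busy 1am-4am.
oltp = [5] * 8 + [28] * 10 + [5] * 6
batch = [8] + [73] * 3 + [8] * 20

combined = combined_profile(oltp, batch)
combined_peak = max(combined)
sum_of_peaks = max(oltp) + max(batch)

if __name__ == "__main__":
    print(f"combined peak: {combined_peak}%, sum of separate peaks: {sum_of_peaks}%")
```

Because the peaks fall at different hours, the combined peak stays well below the sum of the individual peaks, which is the headroom that makes consolidation possible.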
Separate development, test from production environments into different host clusters in the beginning
More VMs vs. More DB Instances
More VMs
• Better resource isolation
• Better security, patch management
• Better performance
• Less risk
Fewer VMs (more instances)
• Less expensive in some licensing models
• No OS isolation (configuration, security, fault)
• No resource isolation
• Less segmentation (HIPAA, PCI,…..)
Note: Both Work, Both Valid Strategies
General Rules of Thumb
• Resource utilization is the basis, but not everything
• Consider business, security, management, and other requirements
• Consider workload characteristics
• OLTP workloads can be stacked up to a sustained utilization level
• OLTP workloads with high usage during the daytime and batch workloads that run during off-peak hours mix well together
• Batch/ETL workloads with different peak periods share well together
• Consider operational history, e.g. month end, quarter end
• Additional VMs may be added to handle peak periods during month end, quarter end, and year end if scale out is a possibility
• CPU and memory hot-add may be used to handle the peak workload
• Reduce VM density, or add more hosts to the cluster
Golden Rules
“Your Database is just an
extension of your Storage”
Michael Webster
“Your Storage is Just a Set
of containers for your
database”
Don Sullivan
Storage
• The fundamental relationship between consumption and supply has not changed
• Spindle count and RAID configuration still rule
• Host demand is an aggregate of VMs
• Factors that affect storage performance
• Storage protocols
• Storage configuration
• VMFS configuration (separate LUNs, all on one LUN, does it even matter?)
VMFS
Use VMFS vs. RDM
• VMFS Advantages
– Negligible performance cost and superior functionality
– Ability to take full advantage of future functionality enhancements (Future Awesomeness)
• Align VMFS on 64K boundaries
– Automatic with vCenter
– www.vmware.com/pdf/esx3_partition_align.pdf
• With vSphere 4.1 – Use VAAI (Storage API)*
• With vSphere 5.x – Use VASA (Storage API)*
[Chart: VMFS Scalability. IOPS (axis 0 to 8000) at 4K, 16K, and 64K I/O sizes for VMFS, RDM (virtual), and RDM (physical)]
* Work With Storage Vendor For Details
Thin Provisioning Perf / Block Zeroing (MBs I/O Throughput)
§ Use Thick Eager Zeroed disk for best performance
§ Maximum performance happens eventually, but when using lazy zeroing, zeroing needs to occur before you can get maximum performance
§ At minimum for databases, logs, and tempdb
§ Check with your storage vendor to see how they handle thin provisioning. Your mileage may vary
§ VAAI-capable array can alter config
http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf
Database Thick Provision Eager Zeroed OpSons
Inflation Storage vMotion
Windows
vmkfstools - VMware KB 1011170
- vmkfstools -D "My VM.vmdk"
- shows whether the disk is eager zeroed or zeroedthick
- vmkfstools -k "My VM.vmdk"
- converts to eager zeroed
Optimizations – SQL Server: Disk
§ Disk
• Instant file initialization – add the SQL Server service account to “Perform volume maintenance tasks” under User Rights Assignment in the server's Local Policies settings.
• By default, every time the database file needs to grow, the OS will zero fill this file & block writes until complete
• Adding requires a restart of the SQL service
• Removal requires a reboot
http://msdn.microsoft.com/en-us/library/ms175935(v=SQL.105).aspx
SQL Server: System Databases
Tempdb
• Depending on workload, consider creating multiple tempdb files (see next slide)
• Microsoft recommends 1 datafile per CPU
• Isolate tempdb from database and logs, and consider a dedicated vSCSI adapter
• Verify via testing
http://technet.microsoft.com/library/Cc966534
Oracle - No Datafile to CPU relationship
For those who want to be less conservative (for TempDB): SQL 2005, 50% of the number of cores, up to 8; 2008+, 25%-50% ratio of files to cores, usually up to 8.
The number of data files and tempdb files is important enough that Microsoft has two spots in the Top 10 SQL Server Storage best practices highlighting the number of data files per CPU
TEMPDB 1 datafile per CPU (DUAL Core Counts as 2 CPU’s)
(Raid 1+0 – Write Intensive)
Data Files 1 datafile per CPU 200GB DB/4 vCPU = 4@50GB Make Equal Size/Grow Equally
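The "one data file per CPU, equal size" guidance reduces to a simple division. A small Python sketch of the slide's example (200 GB database on 4 vCPUs); the function name is illustrative, and the one-file-per-vCPU ratio is the deck's conservative starting point rather than a fixed rule:

```python
def equal_data_files(db_size_gb, vcpus):
    """Split a database into one equally sized data file per vCPU.

    Returns a list with the size in GB of each file.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    file_size_gb = db_size_gb / vcpus
    return [file_size_gb] * vcpus


if __name__ == "__main__":
    files = equal_data_files(200, 4)
    print(f"{len(files)} files @ {files[0]:.0f} GB each")
```

Keeping the files the same size (and growing them equally) matters because SQL Server's proportional-fill allocation favors files with more free space.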
http://technet.microsoft.com/en-us/library/cc966534.aspx
Storage: Paravirtual SCSI (PVSCSI) adapters
PVSCSI adapters are high-performance storage adapters that can result in greater throughput and lower CPU utilization.
• Up to 30% CPU Savings
• Up to 12% I/O Improvement
Paravirtual Adapter Knows It's Virtual
* Very Important to Use Most Current Version
PVSCSI adapters are best suited for environments, especially SAN environments, where hardware or applications drive a very high amount of I/O throughput.
PVSCSI adapters are not suited for DAS (Direct Attached Storage) environments.
Paravirtual SCSI (PVSCSI) Storage Adapters
Always Check Your Storage Vendor's Best Practices
“>80% of the issues in a virtualized
Environment have to do with Storage misconfigurations”
Storage – Putting It All Together
• Work with your storage engineer, deliver realistic requirements early in the cycle
• Size for performance, not capacity
• Large number of small drives, not small number of large drives
• More / faster spindles are better for performance
• Understand the I/O requirements of different workloads
• Transactional data vs. log vs. backup
• OLTP vs. DSS
“Golden Rule: Capacity Versus Performance”
Storage – Putting It All Together
• Understand the path to the drives, i.e. throughput, multi-pathing
• Use eagerzeroedthick disk provisioning to avoid lazy zeroing
• Place swap file on a separate dedicated drive on SAN; mitigate the impact of swapping with EFD (for high performance workloads)
• Can potentially slow down vMotions
• Follow SQL Server storage best practices: http://technet.microsoft.com/en-us/library/cc966534.aspx
Work with your SAN vendor as well; they have best practices for running these workloads on your array
The Bottom Line
“>80% of performance problems with virtualization occur at the storage layer” Now that you know, don’t let it happen to YOU
vCPUs – Hyper-Threading
Hyper-threading allows a single physical processor to appear as two "logical" processors to the host operating system.
Still only one processor
vCPUs
• With databases, avoid overcommitment of processor resources until you have “actionable” performance data you can scale on (vCOPs)
• 1:1 ratio of physical cores to vCPUs
• Out of the gate!
Hyper-Threaded CPU != Full vCPU
Within The VM
In a virtual environment each vCPU is a single thread. There is no virtual equivalent of a hyper-thread.
Guest O/S sees the number of allocated vCPUs
Non-virtualized O/S would see the hyper-threads
Oracle: latches, parallelism… are based upon visible CPUs. Be careful how you set these things.
Hardware Generation Matters
• Use the latest processors
• Support for hardware-assisted virtualization
• H/W assist for CPU: AMD-V on AMD or VT-x on Intel
• H/W assist for MMU
• NPT* on AMD or EPT on Intel: NPT used in our tests
• Enabled at BIOS level
• Enable NUMA support
• Understand VMM (Virtual Machine Monitor)
Benefits of hardware assistance for CPU and Memory Virtualization
http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf
Point – Use Latest Greatest Hardware!!!
Processor – Putting It All Together
• Leverage hardware-assisted virtualization (enabled by default)
• Consider avg. and peak utilization
• Be aware of hyper-threading; a hyper-thread does not provide the full power of a physical core
• Consider future growth of the system; sufficient headroom should be reserved
• In a high performance environment, consider adding additional hosts when avg. host CPU utilization exceeds 65%
• Consider increasing CPU resources if guest VM CPU utilization is above 65% on average
• Ensure Power Saving Features are “OFF”
• Use vCOPs for consumption & capacity
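The two 65% thresholds above can be turned into a simple capacity check. A minimal Python sketch; the function name and the wording of the recommendations are illustrative, and the utilization inputs are assumed to be averages taken from your own baseline:

```python
CPU_THRESHOLD_PCT = 65.0  # threshold quoted on the slide


def cpu_recommendations(avg_host_cpu_pct, avg_guest_cpu_pct,
                        threshold_pct=CPU_THRESHOLD_PCT):
    """Return capacity recommendations based on average CPU utilization."""
    recommendations = []
    if avg_host_cpu_pct > threshold_pct:
        recommendations.append("consider adding hosts to the cluster")
    if avg_guest_cpu_pct > threshold_pct:
        recommendations.append("consider increasing VM CPU resources")
    return recommendations


if __name__ == "__main__":
    print(cpu_recommendations(avg_host_cpu_pct=72.0, avg_guest_cpu_pct=40.0))
```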
Optimizations SQL Server: Memory
Memory – Max / Min
§ Min is set to 0
• Only change when the OS is requesting memory for other apps
§ Max is 2,147,483,647 MB by default (effectively unlimited)
• Should not equal or exceed total VM RAM; may lead to OS starvation
• Do not set to 0; may prevent SQL from starting
• If using “Hot Add”, remember to modify this setting
SQL Max Memory = VMMem – ThreadStack – OS Mem – VM Overhead
• ThreadStack = NumOfSQLThreads × ThreadStackSize
• ThreadStackSize = 1 MB on x86 | 2 MB on x64
http://msdn.microsoft.com/en-us/library/ms178067.aspx
Max SQL Mem Example: Ntirety Rule**
• 2 Gig + additional 1 Gig per 16 Gig physical memory
**In the context of the VM size or physical machine size
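As a worked example of this rule of thumb, the Python sketch below reads it as the amount of memory to leave for the operating system: 2 GB plus 1 GB for every full 16 GB of VM (or physical) memory, with the remainder available for SQL Server max server memory. That reading, the rounding down to whole 16 GB blocks, and the function names are assumptions made for illustration:

```python
def os_reserve_gb(total_memory_gb):
    """Memory to leave for the OS: 2 GB plus 1 GB per full 16 GB of total memory."""
    return 2 + total_memory_gb // 16


def max_server_memory_gb(total_memory_gb):
    """Candidate SQL Server max server memory after the OS reserve."""
    return total_memory_gb - os_reserve_gb(total_memory_gb)


if __name__ == "__main__":
    for total in (16, 32, 64, 128):
        print(f"{total} GB VM: reserve {os_reserve_gb(total)} GB, "
              f"max server memory {max_server_memory_gb(total)} GB")
```

Treat the result as a starting value to validate against the baseline, not a final setting.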
Running Multiple Instances on Same VM
Two options, and do nothing is not one of them
Option 1: Use max server memory
• Create max setting for each instance
• Give each instance memory proportional to expected workload / db size
• Do not exceed total RAM allocated to VM
Option 2: Use min server memory
• Create min settings for each instance
• Give each instance memory proportional to expected workload / db size
• The sum should be 1-2 GB less than RAM allocated to VM
§ Settings can be modified without having to restart the instances
Max server memory
  Pro: When a new process or instance starts, memory is available immediately to fulfill the request
  Con: If instances are not running, the running instances cannot access the available RAM
Min server memory
  Pro: Running instances can leverage memory previously used by instances that are no longer running
  Con: When a new process or instance starts, running instances need to release memory
SQL Server: Memory
Lock Pages in Memory
■ This keeps SQL more responsive when paging occurs
■ SQL Server Lock Pages in Memory is ON in >= 32/64 bit Standard Edition (2012)
■ Account needs “Locked pages in Memory” rights
▪ Give it the RIGHTS
http://msdn.microsoft.com/en-us/library/ms178067.aspx
Non-Uniform Memory Access (NUMA)
• NUMA avoids the performance hit when several processors attempt to address the same memory by providing separate memory for each NUMA node
• Speeds up processing
• NUMA nodes are specific to each processor model
Non-Uniform Memory Access (NUMA)
“All Processors Can Use All Memory”
• 4 sockets, 6 cores each
• 4 NUMA nodes
• 128 Gig RAM
• Each NUMA node = 32 Gig RAM
“In this example Optimal Performance: Each VM < 32GB*”
*CPU Overhead Needs to be accounted for. Minimal
*vNuma – Minimizes Impact when this happens
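The NUMA sizing in this example is a division followed by a comparison. A minimal Python sketch, assuming memory is spread evenly across NUMA nodes and ignoring the small per-VM overhead the slide mentions (function names are illustrative):

```python
def numa_node_memory_gb(host_memory_gb, numa_nodes):
    """Memory local to each NUMA node, assuming an even split across nodes."""
    return host_memory_gb / numa_nodes


def fits_in_one_numa_node(vm_memory_gb, vm_vcpus,
                          host_memory_gb, numa_nodes, cores_per_node):
    """True if the VM's memory and vCPUs fit inside a single NUMA node."""
    node_memory_gb = numa_node_memory_gb(host_memory_gb, numa_nodes)
    # Strictly less than node memory, mirroring "Each VM < 32GB" on the slide.
    return vm_memory_gb < node_memory_gb and vm_vcpus <= cores_per_node


if __name__ == "__main__":
    # Slide example: 4 sockets x 6 cores, 128 GB RAM, 4 NUMA nodes.
    print(numa_node_memory_gb(128, 4))
    print(fits_in_one_numa_node(24, 6, 128, 4, 6))
    print(fits_in_one_numa_node(48, 6, 128, 4, 6))
```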
Home Node - NUMA
The home node for a virtual machine is first selected considering current CPU and memory load across all NUMA nodes.
Wide NUMA allows for the efficient use of multiple NUMA nodes
Hot Add CPU disables vNUMA
**** Properly size the database VM so you don't need Hot Add CPU ****
Memory Allocated to VM Is Determined by……
• DRS Shares/Limits**
• Total Memory of Host
• Reservations
• Memory Load of the Host
** Avoid shares/Limits Unless you really understand How they work
Swapping Occurs Two Places 1. Guest VM Swapping 2. ESXi Host Swapping
Swapping can slow down I/O performance of disks for other VM’s
Is Google Your Best Friend….
“There is the Google DBA, The GUI DBA , or the DBA that does all the work” Charles Kim
Ballooning
• Kicks in when the physical host is experiencing memory contention
• Balloon driver runs on each individual VM
• Communicates with the guest O/S to determine what is happening with memory
• Works with the server to reclaim pages that are considered least valuable by the guest OS
Exceeding Host Memory can lead to ballooning, Memory Compression or Swapping
Swapping can slow down I/O performance of disks for other VM’s
How Many VMs Can I Put on a Host?
§ As many as have active memory that will fit in physical RAM, while leaving some room for memory spikes.
Total Memory Demand = Active memory (%ACTV) of VMs + Memory Overhead – Page sharing of VMs (de-duping)
De-duping = Transparent Page Sharing
Transparent Page Sharing is more effective the more similar the VMs are
“Put Like Operating Systems On Same Physical Host”
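The memory demand relationship on this slide can be checked with a few lines of Python. The VM figures below are hypothetical, and the 10% headroom for memory spikes is an assumed value chosen only for illustration; the deck does not give a number:

```python
def total_memory_demand_gb(vms, page_sharing_gb):
    """Active memory + per-VM overhead - memory saved by page sharing (all in GB)."""
    active = sum(vm["active_gb"] for vm in vms)
    overhead = sum(vm["overhead_gb"] for vm in vms)
    return active + overhead - page_sharing_gb


def fits_on_host(vms, page_sharing_gb, host_ram_gb, headroom_fraction=0.10):
    """True if demand fits in host RAM after leaving headroom for spikes."""
    usable_gb = host_ram_gb * (1.0 - headroom_fraction)
    return total_memory_demand_gb(vms, page_sharing_gb) <= usable_gb


# Hypothetical VMs: active memory and virtualization overhead per VM.
example_vms = [
    {"active_gb": 8, "overhead_gb": 0.5},
    {"active_gb": 12, "overhead_gb": 0.5},
    {"active_gb": 6, "overhead_gb": 0.25},
]

if __name__ == "__main__":
    print(total_memory_demand_gb(example_vms, page_sharing_gb=2))
    print(fits_on_host(example_vms, page_sharing_gb=2, host_ram_gb=32))
```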
TPS – When It Kicks In
• Before ballooning
• Always running on a preset cycle, looking for opportunities to reclaim memory
• Very low overhead
• Runs at HOST level
• This is incorrect guidance floating around the Internet – Here's why:
Reference: www.vmware.com/files/pdf/mem_mgmt_perf_vsphere5.pdf
Myth: Disable Memory TPS
Disable Unnecessary Foreground/Background Processes within the Guest O/S
• Windows Example
– Alerter, automatic updates, clip book, error reporting
– Help & Support, indexing, messenger, NetMeeting
– Remote desktop
– Once established (clone for reuse by VMware)
Keep VM Footprint as small as Possible: NUMA, Shared Resource Pool
Memory Reservations
• VM is only allowed to power on if the CPU & memory reservation is available (strict admission)
• The amount of memory can be guaranteed even under heavy loads
• SET CPU/Not Guaranteed
• VMware HA Strict Admission Control – settings can override this behavior
Reservations Rock!
• Set the appropriate reservations to guarantee physical memory for the VM.
• In many cases, the configured size and reservation size could be the same
Oracle Approximate Memory Architecture
Set the memory reservation to SGA size plus OS. (Reservation & configured memory might be the same.)
[Diagram: VM Configured Memory, made up of client sessions and context; SGA (DB buffer cache, and others); Operating System; Instance (PMON, SMON, DBWR, LGWR, CKPT, others)]
Large Pages/Huge Pages - Broken Down at Hypervisor Level, Not Guest O/S
“Large/Huge PAGES Do Not Normally SWAP”
In the cases where host memory is overcommitted, ESX may have to swap out pages. Since ESX will not swap out large pages, during host swapping, a large page will be broken into small pages. ESX tries to share those small pages using the pre-generated hashes before they are swapped out. The motivation of doing this is that the overhead of breaking a shared page is much smaller than the overhead of swapping in a page if the page is accessed again in the future.
http://kb.vmware.com/kb/1021095
Oracle – Hugepages
/etc/security/limits.conf to set soft and hard limits:
    oracle soft nofile 131072
    oracle hard nofile 131072
    oracle soft nproc 131072
    oracle hard nproc 131072
    oracle soft core unlimited
    oracle hard core unlimited
    # -- The following entries need to be adjusted with HugePages settings
    # oracle soft memlock 50000000
    # oracle hard memlock 50000000
“HUGE PAGES Do Not Normally SWAP”
SQL Server In-Guest Memory Best Practices
§ Use large pages in the guest (start SQL Server w/ trace flag -T834)
Memory – Putting It ALL Together
• Do not overcommit memory for production, mission critical SQL Server VMs
• Set provisioned memory = reservation = SQL Server max server memory + OS memory + virtualization overhead
• Set provisioned memory = reservation = Oracle SGA + OS memory + virtualization overhead
• To avoid swapping, the memory limit should never be set below the provisioned size. Setting a memory limit is not recommended in general
• To avoid NUMA remote memory access, size VM memory equal to or less than the memory per NUMA node if possible
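The sizing rule above (provisioned memory = reservation = database memory + OS memory + virtualization overhead) can be sketched in Python. The input values in the example are hypothetical; "database memory" stands for SQL Server max server memory or the Oracle SGA, depending on the platform:

```python
def vm_memory_plan_gb(db_memory_gb, os_memory_gb, virtualization_overhead_gb):
    """Return provisioned and reserved memory (GB) following the slide's rule."""
    provisioned_gb = db_memory_gb + os_memory_gb + virtualization_overhead_gb
    # For production databases the reservation equals the provisioned size.
    return {"provisioned_gb": provisioned_gb, "reservation_gb": provisioned_gb}


if __name__ == "__main__":
    # Hypothetical VM: 24 GB for the database, 4 GB for the OS, 1 GB overhead.
    print(vm_memory_plan_gb(24, 4, 1))
```

Comparing the provisioned figure with the per-node NUMA memory of the target host covers the last bullet as well.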
Jumbo Frames
• Jumbo frames are Ethernet frames with more than 1500 bytes of payload. Conventionally, jumbo frames can carry up to 9000 bytes of payload
Jumbo Frames
The original 1500-byte payload size for Ethernet frames was used because of the high error rates and low speed of communications.
“Why The Picture Of A Typewriter Here?”
Enable Jumbo Frames: Check to See
Will Succeed
    ping -M do -s 8972 -c 2 rac01a-priv
    ping -M do -s 8972 -c 2 rac01b-priv
    ping -M do -s 8972 -c 2 rac02a-priv
    ping -M do -s 8972 -c 2 rac02b-priv
    PING rac01a (10.17.33.31) 8972(9000) bytes of data.
    8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=1 ttl=64 time=0.017 ms
    8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=2 ttl=64 time=0.018 ms
Will Fail
    ping -M do -s 8973 -c 2 rac01a-priv
    ping -M do -s 8973 -c 2 rac01b-priv
    ping -M do -s 8973 -c 2 rac02a-priv
    ping -M do -s 8973 -c 2 rac02b-priv
Make sure: switch support is enabled
9000 Bytes - 20 Bytes IP Header - 8 Bytes of ICMP Header = 8972 Bytes
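The 8972-byte ping size comes straight from the header arithmetic: the MTU minus the IPv4 header minus the ICMP header. A minimal Python sketch, assuming IPv4 without IP options (the function name is illustrative):

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu_bytes):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu_bytes - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


if __name__ == "__main__":
    print(max_ping_payload(9000))  # size to pass to ping -s with jumbo frames
    print(max_ping_payload(1500))  # same check for a standard MTU
```

A payload one byte larger than this value, sent with fragmentation prohibited, is what makes the "Will Fail" test fail.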
“8192/64 = 128”
SQL Server: Network
Network
§ Default packet size is 4,096
• If jumbo frames are available for the entire stack, set packet size to 8,192
§ Maximize Data Throughput for Network Applications
• Limit file system cache by OS
• NIC > File & Printer Sharing for Microsoft Networks
• Use Minimize Memory or Balance
http://blogs.msdn.com/b/johnhicks/archive/2008/03/03/sql-server-checklist.aspx
Network – Putting It All Together
• Separate SQL workloads with chatty network traffic (Microsoft AlwaysOn – “Are you there?”) from those with chunky access onto different physical NICs
• With 10GbE, do this at the VLAN level (4 x 1GbE NICs = 4Gb total vs. 2 x 10GbE NICs = 20Gb total)
• Separate traffic for vMotion, service console, and SQL Server at the physical NIC level
• 10GbE gives sufficient bandwidth at the host, but separate by VLAN
• Have 4 NICs per host to ensure performance and redundancy of the network (Virtualized Environment = Network Heavy)
• Using 4 x 10GbE NICs is overkill from a redundancy perspective. 2 x 10GbE NICs are usually enough
• vSphere 5.0 introduced the ability to use more than 1 NIC for vMotion. (More vMotions going at one time. Added specifically for memory intensive applications, i.e. databases)
• Use VMXNET3 (VMware driver – reduces physical CPU utilization)
AlwaysOn Availability Group Cluster Settings
§ Depending on YOUR network, tuning may be necessary – work with Network Team and Microsoft to determine appropriate settings
Cluster Heartbeat Parameter   Default Value
CrossSubnetDelay              1000 ms
CrossSubnetThreshold          5 hb
SameSubnetDelay               1000 ms
SameSubnetThreshold           5 hb
View: cluster /cluster:<clustername> /prop
Modify: cluster /cluster:<clustername> /prop <prop_name>=<value>
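How these two parameters interact is simple arithmetic: the time before a node is declared unreachable is roughly the heartbeat interval multiplied by the number of missed heartbeats tolerated. A small Python sketch using the defaults listed above; the function name is illustrative, and the tuned values in the second call are hypothetical:

```python
def heartbeat_failure_window_ms(delay_ms, threshold_heartbeats):
    """Approximate time of silence tolerated before a node is considered down."""
    return delay_ms * threshold_heartbeats


if __name__ == "__main__":
    # Defaults from the slide: 1000 ms delay, 5 missed heartbeats.
    print(heartbeat_failure_window_ms(1000, 5))
    # Hypothetical relaxed setting for a slower cross-subnet link.
    print(heartbeat_failure_window_ms(2000, 10))
```

Keeping this window in mind helps when deciding whether brief network stalls, such as a vMotion switchover, could be mistaken for a node failure.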
WSFC – Cluster Validation Wizard
§ Use this to validate support for your configuration
• Required by Microsoft Support as a condition of support for YOUR configuration
§ Run this before installing AAG (AlwaysOn Availability Group), and every time you make changes
• Save the resulting HTML reports for reference
§ If running non-symmetrical storage, hotfixes may be required
• http://msdn.microsoft.com/en-us/library/ff878487(SQL.110).aspx#SystemReqsForAOAG
SQL Server Best Practice Analyzer
§ Use SQL Server Best Practice Analyzer to check local or remote systems
• If running against a remote system, issue Enable-PSRemoting -f via PowerShell on the target system
• In the wizard, don't click “Connect to remote computer” on the Home page
• On the Enter Parameters link, enter the SQL Server under Alternate_Server_to_Scan
• Select options
• Scan
http://www.pearsonitcertification.com/store/virtualizing-oracle-databases-on-vsphere-9780133570182 http://www.pearsonitcertification.com/store/virtualizing-sql-server-with-vmware-doing-it-right-9780321927750
New RDBMS books from VMware Press
vmwarepress.com
Thank You
Michael Corey [email protected]
Blog: http://michaelcorey.ntirety.com http://www.dbtablog.com/
@Michael_Corey
Jeff Szastak @Szastak
Fill out a survey Every completed survey is
entered into a drawing for a $25 VMware company store gift
certificate