Database Hardware Selection Guidelines - Bruce...

26
Database Hardware Selection Guidelines BRUCE MOMJIAN Database servers have hardware requirements different from other infrastructure software, specifically unique demands on I/O and memory. This presentation covers these differences and various I/O options and their benefits. Creative Commons Attribution License http://momjian.us/presentations Last updated: January, 2017 1 / 26

Transcript of Database Hardware Selection Guidelines - Bruce...

Page 1: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Database HardwareSelection Guidelines

BRUCE MOMJIAN

Database servers have hardware requirements different from other

infrastructure software, specifically unique demands on I/O and

memory. This presentation covers these differences and various I/O

options and their benefits.

Creative Commons Attribution License http://momjian.us/presentations

Last updated: January, 2017

1 / 26

Page 2: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Outline

◮ CPU

◮ Multi-threading

◮ GHz

◮ Pipelining

◮ SMP

◮ NUMA

2 / 26

Page 3: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Nope!

◮ CPU

◮ Multi-threading

◮ GHz

◮ Pipelining

◮ SMP

◮ NUMA

3 / 26

Page 4: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Normal Server Priorities

I/O

Memory

CPU

4 / 26

Page 5: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Database Server Priorities

I/O

CPU

Memory

5 / 26

Page 6: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Why the Difference?

Traditional servers are often CPU constrained because of:

◮ Network overhead (http)

◮ Text processing (email)

◮ Virtual machines (application servers)

◮ Application code

6 / 26

Page 7: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Database Server’s Unique Requirements

◮ Sequential scans of large tables

◮ Index scans causing random I/O

◮ Unpredictable query requirements

◮ Reporting

These do not require major CPU resources.

7 / 26

Page 8: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Durability Adds Even More I/O Requirements

ACID (D = durability) requires committed transactions to bestored permanently. Few other server facilities must honor thisrequirement.

Recovery

fsync

fsync

Query and Checkpoint Operations Transaction Durability

BackendPostgres

BackendPostgres

BackendPostgres

PostgreSQL Shared Buffer Cache Write−Ahead Log

Kernel Disk Buffer Cache

Disk Blocks

8 / 26

Page 9: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Magnetic Disk I/O Stack

fsync

fsync

Kernel Disk Buffer Cache

Disk Cache

HBA/RAID Cache

PostgreSQL Shared Buffer Cache Write−Ahead Log

Magnetic Disk

write−through

write−back

9 / 26

Page 10: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Magnetic Disk I/O Stack With BBU

fsync

fsync

Kernel Disk Buffer Cache

Disk Cache

HBA/RAID Cache

PostgreSQL Shared Buffer Cache Write−Ahead Log

Magnetic Disk

BATT

write−back

10 / 26

Page 11: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Flash / NAND Storage I/O Stack

fsync

fsync

Kernel Disk Buffer Cache

HBA/RAID Cache

PostgreSQL Shared Buffer Cache Write−Ahead Log

Disk Staging Cache

Flash (NAND) Solid State Disk (SSD)

BATT

BATT

?

?

write−back

11 / 26

Page 12: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

DRAM Storage I/O Stack

fsync

fsync

Kernel Disk Buffer Cache

HBA/RAID Cache

PostgreSQL Shared Buffer Cache Write−Ahead Log

DRAM Solid State Disk (SSD)BATT

BATT

?

12 / 26

Page 13: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Write-Back vs. Write-Through Caching

◮ Write-back caching returns write success before passing datato lower storage layers

◮ Write-through caching waits for write acknowledgementfrom lower storage layers before returning success

13 / 26

Page 14: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Caching Layers

◮ HBA/RAID cache behavior is usually controlled by theHBA/RAID firmware, often conditionally based on the healthof the BBU

◮ Storage drive cache behavior can be set by utility commandsor by using certain operating system calls

◮ Enterprise/SAS storage devices usually default towrite-through, while consumer/SATA devices usually defaultto write-back

14 / 26

Page 15: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

HBA/RAID CACHING

◮ HBA/RAID controllers often set storage drive caching mode towrite-through

◮ With an HBA/RAID non-volatile cache, there is littleadvantage to using write-back mode on storage drives

15 / 26

Page 16: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Magnetic Disk Selection

◮ More small spindles is better than fewer large spindles

◮ RAID 5/6 is too slow for database writes

◮ RAID 10 is popular

◮ make sure SMART reporting is fully supported

◮ SAS/SCSI disks are usually designed for enterprise workloads,unlike SATA/ATA

◮ reliability◮ error reporting◮ 24-hour operation◮ heat◮ vibration◮ http://www.intel.com/support/motherboards/server/sb/CS-031831.htm

16 / 26

Page 17: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

SSDs

◮ Flash/NAND vs. DRAM

◮ Write staging area — it is not just cache

◮ Running a NAND SSD in write-through mode can reduce itsusable life because of increased write cycles

◮ Best for WAL and random I/O, e.g. indexes

◮ Set random_page_cost = 1.1

◮ Set effective_io_concurrency = 256◮ Intel SSD 320 Series: http://blog.2ndquadrant.com/en/2011/04/

intel-ssd-now-off-the-sherr-sh.html

17 / 26

Page 18: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Filesystem Options

◮ xfs or ext4 over ext3

◮ reduce file system logging, particularly for /pg_xlog directory

◮ disable access (atime) recording

18 / 26

Page 19: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Battery-Backed Unit (BBU)

◮ Verify battery or super-capacitor (supercap) existence visually

◮ Typically lasts for 48-72 hours

◮ Some write the cache to local flash memory on power failure

◮ Detected battery/super-capacitor failure can disablewrite-back cache mode

◮ Requires failure monitoring

◮ Requires replacement

19 / 26

Page 20: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Battery-Backed Unit (BBU)

https://www.flickr.com/photos/jemimus/

20 / 26

Page 21: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Supercapacitor-Backed Unit

https://commons.wikimedia.org/wiki/File:Embedded_World_2014_SSD.jpg

21 / 26

Page 22: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Shared Storage

◮ SAN and NAS replace direct-attached storage (DAS) withshared storage

◮ Often used for easier storage management

◮ Shared I/O resource

◮ Databases often wait for I/O completion, meaning they haveto contend with shared resource contention

◮ SAN serves block devices, NAS serves file systems

22 / 26

Page 23: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

RAM

◮ The more RAM, the better; this reduces I/O requirements

◮ Ideally, five minutes of your working set

◮ The more RAM, the more possibility of RAM failure

◮ Use ECC (Error Correction Codes) RAM

◮ detect errors◮ correct errors◮ report faulty memory◮ cosmic radiation

23 / 26

Page 24: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

CPUs

◮ Parallel query allows a single session to use multiple CPUs

◮ Partial support added in Postgres 9.6

◮ Heavy use of server-side functions might generate significantCPU load

◮ CPUs can become a bottleneck if the entire database fits inRAM and the workload is read-only

24 / 26

Page 25: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Not the Same

Just because something has the same interface doesn’t mean hasthe same capabilities. Compatible computer hardware is not allthe same.

https://www.flickr.com/photos/cdevers/

25 / 26

Page 26: Database Hardware Selection Guidelines - Bruce Momjianmomjian.us/main/writings/pgsql/hw_selection.pdf · Database Hardware Selection Guidelines ... Disk Staging Cache Flash (NAND)

Conclusion

I/O

CPU

Memory

http://momjian.us/presentations

26 / 26