Supporting SPDK in Oracle RDBMS

33

Transcript of Supporting SPDK in Oracle RDBMS

Page 1: Supporting SPDK in Oracle RDBMS
Page 2: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Presenters: Bang Nguyen, Akshay ShahContributors: Zahra Khatami, Avneesh Pant, Sumanta Chatterjee

Oracle Database Virtual OSMay 2018

Supporting SPDK in Oracle RDBMS

Page 3: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

3

Page 4: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Program Agenda

SPDK

Challenges

Oracle Dispatcher

Memory Model

Future Work

1

2

3

4

4

5

Page 5: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

SPDK

Challenges

Oracle Dispatcher

Memory Model

Future Work

1

2

3

4

5

5

Page 6: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Drivers

StorageServices

StorageProtocols

iSCSI Target

NVMe-oF*Target

SCSI

vhost-scsiTarget

NVMe

NVMe Devices

Blobstore

NVMe-oF*

Initiator

Intel® QuickDataTechnology Driver

Block Device Abstraction (bdev)

Ceph RBD

Linux AIO

Logical Volumes

3rd Party

NVMe

NVMe*

PCIeDriver

Released

In Progress

vhost-blkTarget

BlobFS

Integration

RocksDB

Ceph

Tools

fio

GPT

PMDKblk

virtio(scsi/blk)

VPP TCP/IP

QEMU

Cinder

QoS

Linux nbd

RDMA

DPDKEncryption

TCP

RDMA

TCP

SPDK Architecture

iSCSI malloc

spdk-cli

vhost-nvmeTarget

virtio

virtio-PCIe

vhost-user

nvme-cli

Page 7: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

SPDK benefits

7

Enables scalable storage applications.

High-performing millions of I/Os per second.

Direct access to local NVMe SSDs as well as access to remote storage targets using NVMeoF.

Highly concurrent and asynchronous runtime with no locking in the I/O path.

Directly polling the hardware queues for completions.

Significantly improves I/O performance for latency sensitive applications processing lots of concurrent disk I/O requests

Page 8: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

SPDK

Challenges

Oracle Dispatcher

Memory Model

Future Work

1

2

3

4

8

5

Page 9: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Challenges

9

NVMe SSDs contain a limited number ofhardware IO queues.

• Databases usually comprise 10s-1000s of processes.

• Each client process can allocate one or more IO queues for PCIe I/O to local NVMe devices.

• This can cause rapid exhaustion of available hardware queues.

Oracle’s existing memory management infrastructure conflicts with DPDK.

• Allocating private/shared memory from same region.

Page 10: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

SPDK

Challenges

Oracle Dispatcher

Memory Model

Future Work

1

3

2

4

10

5

Page 11: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle/ SPDK I/O stack

11

• 2 different I/O code paths into SPDK.

• Remote I/O: Submit/ Poll I/O directly to/ from SPDK.

• Local I/O: Submit/ Poll I/O directly to/ from Oracle dispatcher.

• Oracle dispatcher submits/ polls I/O to/ from SPDK.

Page 12: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Dispatcher

12

Local IO proxy that runs one or more slaves.

Each Slave runs in its own core.

Each Slave has one or more request queues.

Request queue implemented as a lock-free ring.

Slave processes IO requests from clients.

Queues IO completions to client completion queues.

Runs as secondary process to the target.

Page 13: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Client

13

Bound to specific Dispatcher Slave/Request queue

Clients bound to the same Request queue must run on separate cores.

Clients bound to different Request queues can run on the same core.

Each client has its own completion queue.

Submits I/O requests to the Slave request queue.

Polls I/O completions from its completion queue.

Page 14: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Lock-free Queues

14

• Multiple producers Single consumer queue.

• IO clients submit requests and Dispatcher Slave processes them.

Request Queue

• Single producer Single consumer queue.

• Dispatcher Slave queue IO completions and Client reaps them.

Completion Queue

Page 15: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

SPDK

Challenges

Oracle Dispatcher

Memory Model

Future Work

1

4

2

3

15

5

Page 16: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

DPDK Memory Model

16

• DPDK toolkit is used for memory management:

1. Attaches to huge pages upfront.

2. Allocates private/shared memory from same area.

3. Shares memory map with both primary/secondary processes.

Page 17: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Memory Model

17

Managed Global Area

(MGA)

Shared Global Area (SGA)

Private Global Area

(PGA)

Page 18: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Memory Model

18

Managed Global Area

(MGA)

Shared Global Area (SGA)

Private Global Area

(PGA)

PGA: private and not-shared

Memory needed for the operation of one process.

Page 19: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Memory Model

19

Managed Global Area

(MGA)

Shared Global Area (SGA)

Private Global Area

(PGA)

SGA: large physically shared memory.

Addressable by all processes within an instance.

Page 20: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Oracle Memory Model

20

Managed Global Area

(MGA)

Shared Global Area (SGA)

Private Global Area

(PGA)

MGA: Can be shared.

Uniquely identified by its name.

New segments can be added or existing segments be deleted.

Page 21: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Environment Abstraction Layer (EAL)

21

• Provides access to low-level resources such as hardware and memory space.

• Hides environment specifics from applications and libraries.

• Provides core assignment/ affinity, memory management, PCI enumeration, address translation, etc.

• DPDK is the default SPDK RTE.

• ORAENV is DPDK equivalent for Oracle database.

Page 22: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

ORAENV

22

• Dynamic allocation of shared memory from SGA and MGA pools and private memory from PGA pools.

• Similar features to DPDK RTE environment using Oracle runtime services.

• RDMA data transfer optimizations for both local and remote IO.

Page 23: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

ORAENV: Dynamic Memory Allocation

23

• Space is organized into heaps.• Heap can be allocated in address

space that is private to a particular process or shared by many processes.

• When client requests memory, chunk is allocated from a particular heap.

• A heap is composed of a set of contiguous chunks.

• Returns set of extents contained in the heap.

Page 24: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

ORAENV: RDMA Data Transfer

24

Used for key lookup services.

Access to shared data regions.

Read and write access to the process.

Allows a single mapping to be registered.

Re-using memory mapping improves cache performance.

PD is valid as long as one user process is attached.

Shared Protection Domain

Problem: Registering entire memory region in each process is not scalable

Page 25: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

ORAENV: Shared Protection Domain

25

Process A:

Allocate a shared protection domain and register memory using the PD in process A.

All other processes:

Map/attach to the allocated memory using the same shared PD.

Shared memory registration enables zero copy data transfer.

local/remote zero_copy

data transfer

Page 26: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

ORAENV

26

P1 P2 P3 P4 P5

SGA

ORA

MGA

Direct Memory Access SGA and MGA are both registered

using shared PD so only a single memory registration for the region can be used across 1000s of processes.

Page 27: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

SPDK

Challenges

Oracle Dispatcher

Memory Model

Future Work

1

4

2

3

27

5

Page 28: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

I/O Resource Management with SPDK

28

• Ability to rate limit IOPS and throughput for database workloads.

• Database workloads can be standalone databases or pluggable databases in a multi-tenant container.

• Prioritize high priority I/Os such as redo log writes over other I/Os.

• Prevent low priority tasks such as database backups from impacting other workloads.

Page 29: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Security Management for NVMeoF

29

• All devices in an NVM subsystem are accessible from all hosts and databases.

• Need ability to isolate access to namespaces for different database tenants.

• Implement per-connection authentication and access control checks to restrict visibility and access to namespaces.

• Add IPSec support for encrypted data transfer over the network.

Page 30: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

NVMeoF Transport

30

• Current implementation is based on NVMe over RDMA.

• Implement support for TCP as it is more commonly available in data centers.

• Awaiting TCP/IP standardization from NVMe technical working group.

• Awaiting TCP/IP support in SPDK.

Page 31: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |

Conclusion

31

• SPDK enables scalable I/O performance for Oracle database.

• Oracle dispatcher reduces I/O latency for local NVMe devices.

• ORAENV integrates Oracle’s existing memory model with SPDK.

• Provides support for memory management, registration and address translation.

Page 32: Supporting SPDK in Oracle RDBMS

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | 32

Thank you!

Page 33: Supporting SPDK in Oracle RDBMS