DB2 V11 Overview NEDB2UG 2016-05-153 © 2016IBM"Corporation...

© 2016 IBM Corporation

DB2 V11.1 Technology Overview

2016-04-18

© 2016 IBM Corporation2

Safe Harbor Statement

2

Copyright © IBM Corporation 2016. All rights reserved.U.S. Government Users Restricted Rights - Use, duplication, or disclosure restricted by GSA ADP Schedule Contract

with IBM Corporation

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON CURRENT THINKING REGARDING TRENDS AND DIRECTIONS, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. FUNCTION DESCRIBED HEREIN MY NEVER BE DELIVERED BY I BM. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.

IBM, the IBM logo, ibm.com and DB2 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml


DB2 Version 11.1 Overview

§ Themes and Highlights§ Packaging and Pricing§ Installation, Migration, and Platform Considerations§Manageability Enhancements§ Enterprise Security§ pureScale Enhancements§ BLU on MPP§ BLU Enhancements§ SQL Functions and Syntax Compatibility§ New, Changed, Deprecated and Discontinued


Announcement Summary

§ DB2 Version 11.1 Announced on April 12th with eGA of June 15th

§ End of Marketing for DB2 Version 10.5 (September 30th, 2016)– All DB2 Version 10.5 Editions and features/offerings

• IBM DB2 Encryption Offering• IBM DB2 Business Application Continuity Offering• IBM DB2 BLU Acceleration In-Memory Offering

– Performance Management Feature Version 10.5– Tools and DB2 Connect Version 10.5

§ End of Service for both DB2 Version 9.7 and 10.1– Effective End of Service date of September 30th, 2017– Direct direct upgrade from DB2 Version 9.7, 10.1, and 10.5 to V11.1– End of Service date is not applicable to SAP customers with an ASL as they have a different end of service period


Packaging Summary

§ DB2 Express Edition replaced by Workgroup Server Edition§Workgroup and Enterprise Edition– pureScale Standby Node Option– Table partitioning, Encryption– Multi-dimensional Clustering– Limited Federation (DB2 & Informix)

§ Exclusions– Data Partitioning– SQL Warehouse (SQW)– BLU Acceleration, Compression, Materialized Query Tables (ESE includes)– No MQ or CDC Replication

§Workgroup Simple Limits– Authorized User (AUSI) , PVU, and Virtual Server– 16 Cores, 128 GB of memory with no database size limit

§ Advanced Workgroup and Enterprise Include All Features

Metric New Price Workgroup Express

PVU 100 158 68

AUSI 354 554 291

Prices are in US dollars and are subject to change.


New DB2 Direct Editions

§ New Delivery Mechanism for DB2 licenses– New license metrics to facilitate hybrid cloud deployments– Option to deploy either on-premises or on cloud

§ Two Versions depending on Requirements– DB2 Direct Standard Edition 11.1

• Has all of the database features of DB2 Workgroup Server Edition– DB2 Direct Advanced Edition 11.1

• Has all of the database features of DB2 Advanced Enterprise Server Edition– Both include access to previous (10.5) editions of DB2

§ New Simplified Licensing Metric– Virtual Processor Core (VPC) sold as a monthly license charge– 1 VPC for every VPC available to a virtual Servers Operating System, or 1 VPC for each physical core of a non-partitioned physical server

§ Predictable maintenance windows – Sliding 24 month version N, N-1 Support

Metric Standard Advanced

VPC Monthly 135 354

Prices are in US dollars and are subject to change.


Federation Included in Packaging

§ Integrated support for homogeneous federation– Single install replacing any prior separate Infosphere Federation Server install– Support for upgrading from either a DB2 database product or InfosphereFederation Server

§ Additional Wrappers in Advanced Editions– DB2, PureData System for Analytics (PDA), Oracle, Informix, dashDB, SQLServer, BigSQL, SparkSQL, Hive, Impala, and other Big Data sources.

Application


Encryption Included in Packaging

§ V11.1 adds support for KMIP 1.1 complaint centralized key managers– Validated on IBM's Security Key Lifecycle Manager (ISKLM)

§ Direct support for Hardware Security Modules (HSMs) (Preview)– Support to include SafeNet Luna & Thales nShield Connect+

DB2 Native Encryption

Centralized Key Manager

KMIP 1.1

Local KeystoreFile

DB2 V10 FP5

Hardware Security Module

DB2 V11.1

TechnologyPreview

Simple Key Mgt : a local flat file used for a specific DB2 instance

Enterprise Key Mgt : a centralized key manager or HSM that can be used across many databases, file systems and other uses across an enterprise


DB2 pureScale Included in Packaging

§ DB2 pureScale available in DB2 Advanced Editions, including new Direct Advanced Edition

§ Low cost active/passive licensing where one DB2 member has minimal licensing and the other DB2 member(s) fully licensed– All application workloads are directed to the primary active member(s)

• Sometimes referred to as the “primary" member– Utilities and admin tasks allowed on the secondary admin member

• Admin member licensed as warm standby (e.g. 100 PVUs or 1 VPC)• Great for off-loading backups from primary members $


Licensing - DB2 pureScale Active/Passive Model

CF CF

PrimaryMember*

SecondaryAdminMember

BackupRestore

Configuration

DDL

Runstats

Reorg

Replication

Security

Monitoring

Backup

Backup

Workload

Workload

Workload

Workload

Application workloads (transactional, batch, etc.) run on the primary member

Administrative tasks/utilities allowed to run onsecondary memberAdministrative

tasks/utilities allowed, but best practice is to run them on secondary member *can have up to 127 Primary Members


INSTALLATION, MIGRATION, AND PLATFORM CONSIDERATIONS


Operating Systems - Supported

§ New Operating System Support– Power Linux LE (Little Endian)

• Red Hat Enterprise Linux (RHEL) 7.1+• SUSE Linux Enterprise Server (SLES) 12• Ubuntu 14.04 LTS

§ Supported Operating Systems– Intel 64-bit

• Windows 7, 8.1, 10, Windows Server 2012 R2 • Red Hat Enterprise Linux (RHEL) 6.7+, 7.1+• SUSE Linux Enterprise Server (SLES) 11SP4+, 12• Ubuntu 14.04 LTS

– AIX Version 7.1 TL 3 SP5+– zLinux

• Red Hat Enterprise Linux (RHEL) 7.1+• SUSE Linux Enterprise Server (SLES) 12


Operating Systems - Discontinued

§ In DB2 V11, the following operating systems (on any platform) are no longer supported for Client or Server:– HP-UX– Solaris– Power Linux BE– Inspur K-UX

§Migration– Customers on these platforms will continue to be supported until the end-of-service date for DB2 V10.5 (last release that supports these platforms)


Operating Systems - Virtualization

§ IBM System z– IBM Processor Resource/System Manager– z/VM and z/KVM on IBM System z

§ IBM Power– IBM PowerVM and PowerKVM and IBM Workload Partitions on IBM Power Systems

§ Linux X86-64 Platforms– Red Hat KVM– SUSE KVM

§ VMWare ESXi§ Docker container support – Linux only§Microsoft– Hyper-V– Microsoft Windows Azure on x86-64 Windows Platforms only

§ pureScale support on Power VM/KVM, VMWare, and KVM


Streamlined Upgrade Process

§ Upgrade directly from Version 9.7, 10.1 and 10.5 (3 releases back)§ Ability to roll-forward through database version upgrades– Upgrading from DB2 Version 10.5 Fix Pack 7, or later– Users are no longer required to perform an offline backup of existing databases before or after they upgrade

– A recovery procedure involving roll-forward through database upgrade now exists

– Applies to all editions and configurations except Database Partitioning Feature (DPF)

§ HADR environments can now be upgraded without the need to re-initialize the standby database after performing an upgrade on the primary database– Applies to all editions except DB2 pureScale– DB2 Version 10.5 Fix Pack 7, or later


DB2 Database Migration Log Records

§ Two New Propagatable Log Records– The database migration begin log record is written to mark the start of database upgrade

– The database migration end log record is written to mark the successful completion of database upgrade

§ Supports the ability to roll forward (replay) through a database upgrade– Also used to allow for an HADR update on secondary sites without re-initialization of the database

§ Database upgrade no longer renames log files– When upgrading from 10.1 or 10.5, database log files will no longer be renamed from .LOG to .MIG

§ For Version 10.1 and Version 10.5, any unarchived files prior to the upgrade will be archived from uplevel and will not require any action


10.5FP7

11.1

Online Backup

Transactions

Transactions

10.5

11.1

Migration


10.5FP7

Reinstall DB2 10.5

Stop at Failure Point

10.5 Apply 10.5 Logs

Restore Online Backup

11.1Install DB2 11.1

11.1

Apply 11.1 Logs

Recovery


MANAGEABILITY AND TOOLING


INPLACE Table Reorganization

§ The manageability of large range partitioned tables has been improved– A single partition of a partitioned table can now be reorganized with the INPLACE option if:• the table has no nonpartitioned indexes• ON DATA PARTITION is specified

– Only one data partition can be reorganized at a time– INPLACE table reorganization can be run only on tables that are at least three pages in size


New options for the ADMIN_MOVE_TABLE

§ REPORT– The REPORT option can be used to monitor the progress of table moves– Calculates a set of values to monitor the progress of a single or multiple table moves

– Focus is the COPY and REPLAY phase of a running table move– To get values for all table moves, tabschema and tabname must be NULL or the empty string

§ TERM– The TERM option can be used to terminate a table move in progress– Terminates a running or table move– TERM will force off the application running the table move, roll back all open transactions and set the table move to a well defined operational status

– From here, the table move can be cancelled or continued


Remote Storage Option for Utilities

§ Remote storage is now accessible from:– INGEST, LOAD, BACKUP, and RESTORE– Accessed through the use of storage access aliases

§ Supported Storage– IBM® SoftLayer® Object Storage – Amazon Simple Storage Service (S3)


DB2 Support for the NX842 Accelerator

§ DB2 backup and log archive compression now support the NX842 hardware accelerator on POWER 7+ and POWER 8 processors

§ DB2 BACKUPs require the use of a specific NX842 library– backup database <dbname> compress comprlib libdb2nx842.a

§ Backups can be compressed by default with NX842– Registry variable DB2_BCKP_COMPRESSION has to be set to NX842– Use the following backup command format:– backup database <dbname> compress

§ Log archive compression is also supported– Update the database configuration parameter LOGARCHCOMPR1 or LOGARCHCOMPR2 to NX842

– update database configuration for <dbname>using LOGARCHCOMPR1 NX842

– Note: These two parameters can still take different values


DB2 Backup Compression Performance Results

§ Preliminary results from early system testing § About 50% DB2 backup size reduction compared to uncompressed§ Factor 2x less CPU consumption compared to DB2 compression

• Very significant reduction in CPU consumption • Very significant reduction in elapsed time• Maintains almost all of the compression storage benefits

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

Internal Tests at IBM Germany Research & Development

Uncompressed Compressed NX842 Compressed0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

00:00

01:00

02:00

03:00

04:00

05:00

06:00

Size related to UncompressedBackup Runtime to /dev/null in minutesRestore in minutesM

inutes

Backup to /dev/nullSoftware Compressed H/W

Compressed

NotComp-‐ressed


PURESCALE ENHANCEMENTS


DB2 pureScale: Simplified Install and Deployment

§ Fast Up and Running– Up and running in hours compared to competitive cluster databases

§ Install re-engineering includes:– “Push-Button” install for pureScale clusters

• Socket complexity reduced by at least 40% • Smarter defaults, intuitive options, parallel & quick pre-deployment host validation

– 30-step native GPFS setup reduced to simple 4-step DB2 install process• Also easier conversion to GPFS replication post-deployment using db2cluster

– Increased Resiliency for aborted/partial installations• Clean rollback for re-installation

§ Additional assistance via:– Simplified documentation– Enhanced pre-checking of storage, tiebreaker devices, network adapters, firmware libraries

– Intuitive and user-friendly errors & warnings


Simplified pureScale Storage Replication Deployment

Today DB2 Version 11.1

1Takes ~8 native GPFS commands to create a replicated file system with the standard three redundancy groups.

1 db2cluster command

2Takes ~24 native GPFS commands to convert a non-replicated FS to a replicated FS.

2 db2cluster commands (one for conversion and one for adding storage)

3Takes ~8 native GPFS commands to add a new disk to an existing replicated FS


4Takes ~7 native GPFS commands to remove a disk from existingreplicated FS



DB2 pureScale Health Check

§ Unified health check tool for a DB2 pureScale cluster– Post-installation command:

• db2cluster –verify

– Validations performed include, but are not limited to, the following:• Configuration settings in peer domain and GPFS cluster• Communications between members and CFs• Replication setting for each file system• Status of each disk in the file system


Linux Virtualization Enhancements

§ RDMA over Converged Ethernet (RoCE) support added in VMWare– RoCE SR-IOV for RHEL 7.2 only– Allows a single adapter to be shared across multiple patitions

§ Single-Root I/O Virtualization (SR-IOV) – Standard that enables one PCI Express (PCIe) adapter to be presented as multiple separate logical devices (Virtual Functions) to virtual machines

– Allow the virtual machines to run native RoCE and achieve near wire speed performance.

– Can be enabled on Mellanox ConnectX-3/ConnectX-3 Pro/Connect X-3 VPI adapters for Ethernet


RoCENetwork

Shared RoCE Adapters

VMWarePartitions

VMWare RoCE Adapter Sharing Example


HADR Support for SYNC and NEARSYNC Mode

§ Support for SYNC and NEARSYNC has been added to pureScale– This enhancement combines the continuous availability of DB2 pureScale with the robust disaster recovery capabilities of HADR providing an integrated zero data loss (i.e. RPO=0) disaster recovery solution

– HADR peer window (hadr_peer_window) is not supported § HADR support with pureScale now includes:– SYNC, NEARSYNC, ASYNC and SUPERASYNC modes – Time delayed apply, Log spooling– Both non-forced (role switch) and forced (failover) takeovers

CFCF

CFCFPrimary Cluster

Standby DR Cluster


GDPC Support Enhancements

§ DB2 V11 adds improved high availability for Geographically dispersed DB2 pureScale clusters (GDPC) for both RoCE & TCP/IP– Multiple adapter ports per member and CF to support higher bandwidth and improved redundancy at the adapter level

– Dual switches can be configured at each site to eliminate the switch as a site-specific single point of failure (i.e. 4-switch configuration)

M1 M3 M2 M4CFSCFP

Site A Site BWorkload fully balanced

Secondary CF

Member3

Member 4

Site 1

Storage

Storage

GPFS

replication

ro1ro0

Switch 1Peer 1

Switch 2Peer 1

Switch 3Peer 2

Switch 4Peer 2

Site 2

Primary CF

ro0 ro1 ro0 ro1

Member 1

ro0 ro1

ro1ro0

Member 2

ro1ro0

en2 en2 en2

en2 en2 en2

Site 3


• 80% read / 20% write OLTP workload• POWER8 4c/32t, 160 GB LBP• 10 Gb RoCE RDMA Ethernet / 10 Gb TCP sockets

26.8

50.8

7082.7

38

72.8

105

133

0

20

40

60

80

100

120

140

1 member 2 members 3 members 4 members

Thousands of SQL statements/s

Scale-out Throughput – DB2 pureScale on LE POWER Linux

Sockets RDMA

Horizontal Scaling with DB2 pureScale on POWER Linux



DB2 V10.5 fp5 DB2 V11.1tps 5848 12448

02000400060008000100001200014000

Workload #2 – DB2 pureScale

DB2 V10.5 fp5 DB2 V11.1tps 5040 7950

0100020003000400050006000700080009000

Workload #1 - DB2 ESE

1.58x2.1x

• Workload 1 based on an industry benchmark standard

• POWER7 32c, 512 GB

• Workload 2 implements a warehouse-based transactional order system

• 4 members, 2 CFs with 16c, 256 GB

Improved Performance for Highly Concurrent Workloads

§ Streamlined bufferpool latching protocol implemented in DB2 V11– Reduces contention which can develop on large systems with many threads– Particularly helpful with transactional workloads



Application throughput - DB2 v10.5 fp5 Application throughput – DB2 v11.1

Improved Table TRUNCATE Performance in pureScale

§More efficient processing of Global Bufferpool (GBP) pages– Speeds up truncate of permanent tables especially with large GBP sizes– Helps DROP TABLE and LOAD / IMPORT / INGEST with REPLACE option– Enables improved batch processing with these operations

§ Example– Workload with INGEST (blue) and TRUNCATE (green) of an unrelated table – DB2 v11.1 has much smaller impact on OLTP workload than DB2 10.5 fp5


OLTPBATCH

Member 0 Member 1 Member 2 Member 4

CFCF

Member 5

Database

Member 3

CALL SYSPROC.WLM_ALTER_MEMBER_SUBSET('BATCH', NULL,'(ADD 2 FAILOVER_PRIORITY 1)');CALL SYSPROC.WLM_ALTER_MEMBER_SUBSET('OLTP', NULL,'(ADD 3 FAILOVER_PRIORITY 1)');

SUBSET MEMBER FAILOVER_PRIORITY

BATCH 0 0BATCH 1 0BATCH 2 1OLTP 4 0OLTP 5 0OLTP 3 1

This information is available from SYSCAT.MEMBERSUBSETMEMBERS anddb2pd –membersubsetstatus -detail

Member Subsets : FAILOVER PRIORITY


OLTPBATCH


CFCF

Member 5

Database

Member 3



This information is available from SYSCAT.MEMBERSUBSETMEMBERS anddb2pd –membersubsetstatus -detail

Member Subsets : FAILOVER PRIORITYCALL SYSPROC.WLM_ALTER_MEMBER_SUBSET('BATCH', NULL,'(ADD 2 FAILOVER_PRIORITY 1)');CALL SYSPROC.WLM_ALTER_MEMBER_SUBSET('OLTP', NULL,'(ADD 3 FAILOVER_PRIORITY 1)');


OLTPBATCH


CFCF

Member 5

Database

Member 3



This information is available from SYSCAT.MEMBERSUBSETMEMBERS anddb2pd –membersubsetstatus detail

Member Subsets : FAILOVER PRIORITYCALL SYSPROC.WLM_ALTER_MEMBER_SUBSET('BATCH', NULL,'(ADD 2 FAILOVER_PRIORITY 1)');CALL SYSPROC.WLM_ALTER_MEMBER_SUBSET('OLTP', NULL,'(ADD 3 FAILOVER_PRIORITY 1)');


DB2 BLU ENHANCEMENTS


BLU Acceleration: MPP Scale Out

§ Technology– Pervasive SMP & MPP Query Parallelism– Inter-partition query parallelism simultaneous with intra-partition- parallelized, memory-optimized, columnar, SIMD-enabled, BLU processing

§ Value Proposition– Improve Response Time

• All servers contribute to the processing of a query– Massively Scale Data

• Significantly beyond current practical limits– Streamline BLU Adoption

• Add BLU Acceleration to existing data warehouses

1/3 data

Hash partition(BLU

Acceleration)

Query #1processing

Query #1

Query #1processing

Query #1processing

1/3 data

Hash partition(BLU

Acceleration)

1/3 data

Hash partition(BLU

Acceleration)

DB2 10.5 BLU Capacity

DB2 V11.1 BLUCapacity

10s of TB 1000s of TB

100s of Cores 1000s of Cores


3 Node (10TB) -‐ 6 MLNS


QpH 111 213

0

50

100

150

200

250

Que

ries P

er Hour

Scaling Hardware at constant Data Volume



QpH 113 109

020406080

100120

Que

ries P

er Hour

Scaling Hardware along with Data Volume

1.92xQpH

Held up!


Demonstrating BLU MPP Linear Scaling

§ DB2 Version 11.1 on an IBM Power Systems E850 Cluster

§ Scaling was measured in two different ways– Doubling the hardware but keeping the database constant– Doubling the hardware and doubling the database size– Both tests used the BD Insights Heavy Analytics Internal Workload


Competition on 3 Nodes of AWS DB2 V11.1 on 3 Nodes of AWSQpH 145 274

0

50

100

150

200

250

300

Queries Per Hour

Query Throughput BD Insights (4TB)

1.9x Faster!!

Outstanding BLU MPP Performance

§ 1.9x Higher Throughput with DB2 Version 11.1 vs. the Competition



Optimized SQL Support for Columnar Tables

§ SQL OLAP improvements for deeper in-database analytics with column-organized tables

§ Additional Oracle Compatibility Support– Wide rows– Logical character support (CODEUNITS32)

§ DGTT support (except not logged on rollback preserve rows)– Parallel insert into not-logged DGTT from BLU source

§ IDENTITY and EXPRESSION generated columns§ European Language support (Codepage 819)§ NOT LOGGED INITIALLY support§ Row and Column Access Control (RCAC)§ ROWID Support§ Faster SQL MERGE processing§ Nested Loop Join Support

DB2 : a polyglotdatabase


SQL Functions Optimized for Columnar Mode

§ String Functions– LPAD, RPAD, TO_CHAR, INITCAP

§ Numeric Functions– POWER, EXP, LOG10, LN, MOD, TRUNCATE, TO_NUMBER– SIN, COS, TAN, COT, ASIN, ACOS, ATAN

§ Date and Time Functions– TO_DATE, MONTHNAME, DAYNAME

§OLAP– RANK, DENSE_RANK, ROW_NUMBER – AVG, SUM, MIN, MAX– COUNT, COUNT_BIG– FIRST_VALUE – RATIO_TO_REPORT

§Miscellaneous– COLLATION_KEY


with v1 as(select i_category, i_brand, cc_name, d_year, d_moy, sm_type,

sum(cs_sales_price) sum_sales,avg(sum(cs_sales_price)) over(partition by i_category, i_brand, cc_name, d_year)

avg_monthly_sales,rank() over(partition by i_category, i_brand, cc_name

order by d_year, d_moy) rnfrom BDINSIGHTS.item

, BDINSIGHTS.catalog_sales BDINSIGHTS.date_dim, BDINSIGHTS.call_center, BDINSIGHTS.ship_mode

where cs_item_sk = i_item_sk and cs_sold_date_sk = d_date_skand cc_call_center_sk= cs_call_center_sk

and cs_ship_mode_sk = sm_ship_mode_skand d_year = 2000

group by i_category, i_brand , cc_name , d_year , d_moy, sm_type),

v2 as(select v1.i_category, v1.i_brand, v1.cc_name, v1.d_year, v1.d_moy

, v1.avg_monthly_sales, v1.sum_sales, v1.sm_type, v1_lag.sum_sales psum, v1_lead.sum_sales nsum

from v1, v1 v1_lag , v1 v1_lead

where v1.i_category = v1_lag.i_categoryand v1.i_category = v1_lead.i_category and v1.i_brand = v1_lag.i_brandand v1.i_brand = v1_lead.i_brand and v1. cc_name = v1_lag.cc_nameand v1. cc_name = v1_lead.cc_name and v1.rn = v1_lag.rn + 1and v1.rn = v1_lead.rn - 1)

select *from v2where d_year = 2000and avg_monthly_sales > 0and case when avg_monthly_sales > 0

then abs(sum_sales - avg_monthly_sales) / avg_monthly_saleselse null end > 0.1

order by sum_sales - avg_monthly_sales, cc_name

fetch first 100 rows only

V10.5 V11.1OLAP Query 22.814 5.326

0

5

10

15

20

25

Elapsed Time in Se

cond

s

OLAP Query Elapsed Time (s) (lower is better)

4.3x Faster!!

Columnar Engine Native Sort + OLAP Support

§ No longer compensated on single instance DB2 V11.1


Columnar Engine Native Sort + OLAP Support

§ Access Plan Difference with Native Evaluator support


BLU Acceleration: Massive Gains for ELT & ISV Apps

16x Faster !

BLU Declared Global Temporary Table (not-‐logged DGTT) Parallelism




Industry Leading Parallel Sort§ Leverages the latest sort innovations from IBM TJ Watson Research and DB2 Development– Enhancements can increase BLU Acceleration performance by as much as 13.9X

§ BLU Sort+OLAP on SMP Environment

§ Configuration Details– On 4-socket Intel Xeon platform with 72 Cores and 742G RAM– 1 TB TPC-DS database– Query scenarios involving multiple sort and OLAP operations

13.9 X faster 5.4 X faster4.6 X faster

0

500

1000

1500

2000

Sort+OLAP query 1 Sort+OLAP query 2 Sort+OLAP query 3

Query elapsed time(s)

Row Sort BLU Sort


DB2 V10.5 FP5 DB2 V11.1QpH 703.85 955.82

020040060080010001200

Queries Per Hour

Query Throughput BD Insights (800GB)

1.36x• Native Sort• Native OLAP (usually combined with sort)• Enables query plans to remain as much as possible within the columnar engine

Native BLU Evaluation

• Find areas to improve degree determination and improve parallel use

Query Rewrite Improvements

• SORTHEAP used for building hash tables for JOINs, GROUP BYs, and other runtime work• Efficient use allows for more concurrent intra-query and inter-query operations to co-exist.

Improved SORTHEAP Utilization

Reasons for Improvement

Demonstrating BLU Single Instance Improvement

§ DB2 V11.1 on Intel Haswell EP

§ Configuration Details– 2 socket, 36 core Intel Xeon E5-2699 v3 @ 2.3GHz– 192GB RAM– BD Insights Internal Multiuser Workload 800GB



SQL ENHANCEMENTS


New Functions, Data Types and Columnar OptimizationDate/Time Date/Time Statistics Bit Manipulation Data Types Strings OLAP Pushdown OLAP Pushdown

DATE_PART ADD_YEAR COVARIANCE_SAMP HASH INT2 STRPOS RANK FIRST_VALUE

DATE_TRUNC ADD_MONTHS STDDEV_SAMP HASH4 INT4 STRLEFT DENSE_RANK RATIO_TO_REPORT

AGE ADD_DAYS VARIANCE_SAMP HASH8 INT8 STRRIGHT ROW_NUMBER EXP

LOCALTIMESTAMP ADD_HOURS CUME_DIST TO_HEX FLOAT4 REGEXP_COUNT LPAD LOG10

NOW Function ADD_MINUTES PERCENT_RANK RAWTOHEX FLOAT8 REGEXP_EXTRACT RPAD COLLATION_KEY

THIS_QUARTER ADD_SECONDS PERCENTILE_DISC INT2AND BPCHAR REGEXP_INSTR TO_CHAR LN

THIS_WEEK DAYOFMONTH PERCENTILE_CONT INT2OR BINARY REGEXP_LIKE INITCAP TO_NUMBER

THIS_YEAR FIRST_DAY MEDIAN INT2XOR VARBINARY REGEXP_MATCH_COUNT TO_DATE MOD

THIS_MONTH DAYS_TO_END_OF_MONTH WIDTH_BUCKET INT2NOT LOG REGEXP_REPLACE MONTHNAME SIN

NEXT_QUARTER HOURS_BETWEEN COVAR_POP INT4AND RANDOM REGEXP_SUBSTR DAYNAME COS

NEXT_WEEK MINUTES_BETWEEN STDDEV_POP INT4OR BTRIM POWER TAN

NEXT_YEAR SECONDS_BETWEEN VAR_POP INT4XOR AVG COT

NEXT_MONTH DAYS_BETWEEN VAR_SAMP INT4NOT COUNT ASIN

NEXT_DAY WEEKS_BETWEEN INT8AND COUNT_BIG ACOS

EXTRACT INT8OR MIN ATAN

INT8XOR MAX TRUNCATE

INT8NOT SUM


BINARY and VARBINARY data

§ BINARY and VARBINARY data types– Allow binary string data to be stored and manipulated without the overhead of using a BLOB type

– A binary string is a sequence of bytes that are used to store as pictures, sound, or mixed media

– BINARY and VARBINARY data types are compatible with each other and are compatible with the BLOB data type

– Binary string data types are not compatible with character string data types, except those character strings that are defined as FOR BIT DATA

– Support for BINARY and VARBINARY data types enhances compatibility with other relational database management systems

– BINARY can contain 1-254 bytes– VARBINARY can be up to 32672 bytes


CHAR LENGTH

§ The CHAR data type now supports a new maximum length of 255 bytes (previously 254 bytes)– Makes the CHAR data type compatible with a variety of systems

§ The new maximum length can be used wherever CHAR types are specified– Tables, views, host variables, and also user-defined types and functions

§ Possible application specific changes required– This change affects the return type of the CONCAT scalar function when the combined length is 255 bytes

– Change also affects the result length of the CHAR scalar function without an explicit length parameter


CREATE TABLE Extension

§ The CREATE TABLE statement can now use a SELECT clause to generate the definition and LOAD the data– CREATE TABLE EMP AS (SELECT * FROM EMPLOYEE) WITH DATA

§ CREATE statement can also override column names and create the table definition only – CREATE TABLE EMP(LAST) AS (SELECT LASTNAME FROM EMPLOYEE)

DEFINITION ONLY

§Option to LOAD data directly into the new table– CREATE TABLE TEMP(LAST,PAY) AS

(SELECT LASTNAME, SALARY FROM EMPLOYEE

WHERE WORKDEPT='D11'FETCH FIRST 3 ROWS ONLY) WITH DATA


OFFSET with FETCH FIRST

§ The OFFSET clause can be specified before a FETCH FIRST statement

§ Rows are retrieved after offset values are skipped§ SELECT * FROM EMPLOYEE

OFFSET 10 ROWS FETCH FIRST 5 ROWS ONLY

§OFFSET can also be used in a subselect– SELECT * FROM EMPLOYEE

WHERE SALARY > (SELECT SALARY FROM EMPLOYEE OFFSET 10 FETCH FIRST 1 ROW ONLY)


CREATE FUNCTION Statement for Aggregate UDFs

§ The new CREATE FUNCTION (aggregate interface) statement allows you to create your own aggregate functions– An aggregate function returns a single value that is the result of an evaluation of a set of like values, such as those in a column within a set of rows

– Use your choice of programming language– Four sections within the function are defined based on the stage of the aggregation process • INITIATE• ACCUMULATE• MERGE• FINALIZE


Support for the (+) Outer Join Operator

§ Support for the outer join operator enhances cross-vendor support§Queries can use the outer join operator (+) as alternative syntax within predicates of a WHERE clause

§ The use of the Oracle compatibility mode is not required to enable this feature


COLLATION_KEY Function

§ COLLATION_KEY– The COLLATION_KEY function returns a VARBINARY string that represents the collation key of the expression argument, in the specified collation

– This function is used in SQL for ordering and comparison operations§ Format– COLLATION_KEY( expression, collation_name, [length] )

§ Arguments– Expression - An expression for which the collation key is determined– Collation_Name - An expression that specifies the collation to use when the collation key is determined

– Length - An expression that specifies the length attribute of the result in bytes§ Example: – The following query orders employees by their surnames by using the language-aware collation for German in code page 923

– SELECT FIRSTNME, LASTNAME FROM EMPLOYEE ORDER BY COLLATION_KEY (LASTNAME, 'SYSTEM_923_DE')


Date and Time FunctionsFunction Description

DATE_PART or EXTRACT The DATE_PART function returns a portion of a datetime based on its arguments

DATE_TRUNC THE DATE_TRUNC function returns a time or timestamp expression truncated to the unit specified by the format-string

AGE The AGE function returns a numeric value that represents the number of full years, full months, and full days between the current timestamp and the argument

LOCALTIMESTAMPspecial register

Synonym for CURRENT TIMESTAMP

NOW Function Returns a timestamp based on a reading of the time-of-day clock when the SQL statement is executed at the current server

THIS_QUARTER Returns the first day of the quarter

THIS_WEEK Returns the first day of the week (Sunday is considered the first day of the week)

THIS_YEAR Returns the first day of the year

THIS_MONTH Returns the first day of the quarter

NEXT_QUARTER Returns the first day of the next quarter

NEXT_WEEK Returns the first day of the next week (Sunday is considered the first day of the week)

NEXT_YEAR Returns the first day of the next year

NEXT_MONTH Returns the first day of the next month

NEXT_DAY Returns a datetime value that represents the first weekday, named by string-expression, that is later than the date in expression


Date and Time FunctionsFunction Description

ADD_YEAR Add years to a date.

ADD_MONTHS Add months to a date.

ADD_DAYS Add days to a date.

ADD_HOURS Add hours to a date.

ADD_MINUTES Add minutes to a date.

ADD_SECONDS Add seconds to a date.

DAYOFMONTH Returns an integer between 1 and 31 that represents the day of the argument.

FIRST_DAY Returns a date or timestamp that represents the first day of the month of the argument.

DAYS_TO_END_OF_MONTH Returns the number of days to the end of the month.

HOURS_BETWEEN Returns the number of full hours between two arguments.

MINUTES_BETWEEN Returns the number of full minutes between two arguments.

SECONDS_BETWEEN Returns the number of full seconds between two arguments.

DAYS_BETWEEN Returns the number of full days between two arguments.

WEEKS_BETWEEN Returns the number of full weeks between two arguments.

YEARS_BETWEEN Returns the number of full years between two arguments.

MONTHS_BETWEEN Returns an estimate of the number of months between two arguments.

YMD_BETWEEN The YMD_BETWEEN function returns a numeric value that specifies the number of full years, full months, and full days between two datetime values.

OVERLAPS Use the OVERLAPS predicate to determine whether two chronological periods overlap.


Extract Function

§ EXTRACT Function– The EXTRACT function returns a portion of a datetime based on its arguments

§ Format– result = EXTRACT( element FROM expression )

§ Element– Various elements that can be extracted from the date, time, or timestamp are defined on the Date and Time Elements page

§ Expression – An expression that specifies a date, time or timestamp that will have a field extracted from it

§ Result– The data type of the result of the function depends on the part of the datetimevalue that is specified and is described on the next page


Date and Time Elements

§Many of the Date/Time functions refer to a date or time element– The element name is used as an argument to the functionElement Name Description

EPOCH Specifies the number of seconds since 1970-01-01 00:00:00.00. The value can be positive or negative.

MILLENNIUM(S) Specifies the millennium is to be returned.

CENTURY(CENTURIES) Specifies the number of full 100-year periods represented by the year.

DECADE(S) Specifies the number of full 10-year periods represented by the year.

YEAR(S) Specifies that the year portion is to be returned.

QUARTER Specifies the quarter of the year (1 - 4) is to be returned.

MONTH Specifies that the month portion is to be returned.

WEEK Specifies the number of the week of the year (1 - 53) that the specified day is to be returned. The value uses the ISO-8601 definition of a week, which begins on Monday.

DAY(S) Specifies that the day portion is to be returned.

DOW Specifies the day of the week that is to be returned. Note that "1" represents Sunday.

DOY Specifies the day (1 - 366) of the year that is to be returned.

HOUR(S) Specifies that the hour portion is to be returned.

MINUTE(S) Specifies that the minute portion is to be returned.

SECOND(S) Specifies that the second portion is to be returned.

MILLISECOND(S) Specifies the second of the minute, including fractional parts to one thousandth of a second

MICROSECOND(S) Specifies the second of the minute, including fractional parts to one millionth of a second


OVERLAPS Predicate

§ Use the OVERLAPS predicate to determine whether two chronological periods overlap– A chronological period is specified by a pair of date-time expressions (the first expression specifies the start of a period;; the second specifies its end)

– (start1,end1) OVERLAPS (start2, end2)– The begin and end values are not included in the periods. For example, the periods 2016-10-19 to 2016-10-20 and 2016-10-20 to 2016-10-21 do not overlap

§ Example– SELECT * FROM T1 WHERE

('2016-03-17','2016-03-21') OVERLAPS ('2016-03-20','2016-03-22')


FunctionsFunction Description

HASH The HASH function returns a 128-bit, 160-bit, 256-bit or 512-bit hash of the input data, depending on the algorithm selected, and is intended for cryptographic purposes.

HASH4 The HASH4 function returns the 32-bit checksum hash of the input data.

HASH8 The HASH8 function returns the 64-bit checksum hash of the input data.

TO_HEX The TO_HEX function converts a numeric expression into the hexadecimal representation.

RAWTOHEX The RAWTOHEX function returns a hexadecimal representation of a value as a character string.

BTRIM The BTRIM function removes the characters that are specified in a trim string from the beginning and end of a source string.

DATASLICEID A pseudo column used to return the database partition number of any number of rows.


Statistical FunctionsFunction Description

COVARIANCE_SAMP The COVARIANCE_SAMP function returns the sample covariance of a set of number pairs.

STDDEV_SAMP The STDDEV_SAMP function returns the sample standard deviation (division by [n-1]) of a set of numbers.

VARIANCE_SAMPor VAR_SAMP

The VARIANCE_SAMP function returns the sample variance (division by [n-1]) of a set of numbers.

CUME_DIST The CUME_DIST function returns the cumulative distribution of a row that is hypothetically inserted into a group of rows.

PERCENT_RANK The PERCENT_RANK function returns the relative percentile rank of a row that is hypothetically inserted into a group of rows.

PERCENTILE_DISC Returns the value that corresponds to the specified percentile given a sort specification by using a discrete distribution model.

PERCENTILE_CONT Returns the value that corresponds to the specified percentile given a sort specification by using a continuous distribution model.

MEDIAN The MEDIAN function returns the median value in a set of values.

WIDTH_BUCKET The WIDTH_BUCKET function is used to create equal-width histograms.


WIDTH_BUCKET Scalar Function

§WIDTH_BUCKET– The WIDTH_BUCKET function is used to create equal-width histograms

§ Format– WIDTH_BUCKET(expression, left_bound, right_bound, num_buckets)

§ Arguments– Expression

• An expression that specifies the value to be assigned into a bucket– Left_bound, Right_bound

• An expression that specifies the left end point or right end point– NUM_BUCKETS

• An expression that specifies the number of buckets between bound1 and bound2

§ Example– SELECT EMPNO, SALARY,

WIDTH_BUCKET(SALARY, 35000, 100000, 13)FROM EMPLOYEE ORDER BY EMPNO


Regular Expressions

§ A regular expression is a sequence of characters that act as a pattern for matching and manipulating strings– Much more powerful than the traditional "LIKE" statement in SQL

§ A regular expression is made up of:– Special meta-characters– Expression operators– Modifier flags

§ Seven new Regular Expression Functions– REGEXP_COUNT– REGEXP_EXTRACT– REGEXP_INSTR– REGEXP_LIKE– REGEXP_MATCH_COUNT– REGEXP_REPLACE– REGEXP_SUBSTR


Regular Expression Functions

§ Regular expression use a similar syntax:– REGEXP(source, pattern, flags, start_pos, codeunits)– Source – string to be searched– Pattern – the regular expression that contains what we are searching for– Flag – settings that control how matching is done– Start_pos – where to start in the string– Codeunits – which type unit of measurement start_pos refers to (for Unicode)

Function Description

REGEXP_COUNT Returns a count of the number of times that a regular expression pattern is matched in a string.

REGEXP_EXTRACT Returns one occurrence of a substring of a string that matches the regular expression pattern.

REGEXP_INSTR Returns the starting or ending position of the matched substring, depending on the value of the return option argument.

REGEXP_LIKE Returns a Boolean value indicating if the regular expression pattern is found in a string. The function can be used only where a predicate is supported.

REGEXP_MATCH_COUNT Returns a count of the number of times that a regular expression pattern is matched in a string.

REGEXP_REPLACE Returns a modified version of the source string where occurrences of the regular expression pattern found in the source string are replaced with the specified replacement string.

REGEXP_SUBSTR Returns one occurrence of a substring of a string that matches the regular expression pattern.


Regular Expression Operators (1/2)Operator Description

| Alternation. A|B matches either A or B

* Match 0 or more times. Match as many times as possible.

+ Match 1 or more times. Match as many times as possible.

? Match zero or one time. Prefer one.

n Match exactly n times.

n, Match at least n times. Match as many times as possible.

n,m Match between n and m times. Match as many times as possible, but not more than m.

*? Match 0 or more times. Match as few times as possible.

+? Match 1 or more times. Match as few times as possible.

?? Match zero or one time. Prefer zero.

n? Match exactly n times.

n,? Match at least n times, but no more than required for an overall pattern match.

n,m? Match between n and m times. Match as few times as possible, but not less than n

*+ Match 0 or more times. Match as many times as possible when first encountered, do not retry with fewer even if overall match fails (Possessive Match)

++ Match 1 or more times. Possessive match.

?+ Match zero or 1 time. Possessive match.


Regular Expression Operators (2/2)Operator Description

n+ Match exactly n times.

n,+ Match at least n times. Possessive Match.

n,m+ Match between n and m times. Possessive Match.

( ... ) Capturing parentheses. Range of input that matched the parenthesized subexpression is available after the match.

(?: ... ) Non-capturing parentheses. Groups the included pattern, but does not provide capturing of matching text. More efficient than capturing parentheses.

(?> ... ) Atomic-match parentheses. First match of the parenthesized subexpression is the only one tried. If it does not lead to an overall pattern match, back up the search for a match to a position before the "(?>"

(?# ... ) Free-format comment (?# comment )

(?= ... ) Look-ahead assertion. True if the parenthesized pattern matches at the current input position, but does not advance the input position.

(?! ... ) Negative look-ahead assertion. True if the parenthesized pattern does not match at the current input position. Does not advance the input position.

(?<= ... ) Look-behind assertion. True if the parenthesized pattern matches text that precedes the current input position. The last character of the match is the input character just before the current position. Does not alter the input position. The length of possible strings that is matched by the look-behind pattern must not be unbounded (no * or + operators.)

(?<!...) Negative Look-behind assertion. True if the parenthesized pattern does not match text that precedes preceding the current input position. The last character of the match is the input character just before the current position. Does not alter the input position. The length of possible strings that is matched by the look-behind pattern must not be unbounded (no * or + operators.)


Regular Expression Examples

Example expression Description

[abc] Match any of the characters a, b, or c

[^abc] Negation - match any character except a, b, or c

[A-M] Range - match any character from A to M. The characters to include are determined by Unicode code point order.

[\u0000-\U0010ffff] Range - match all characters.

[\pLetter] [\pGeneral_Category=Letter] [\pL] Characters with Unicode Category = Letter. All forms that are shown are equivalent.

[\PLetter] Negated property. (Uppercase \P) Match everything except Letters.

[\pnumeric_value=9] Match all numbers with a numeric value of 9. Any Unicode Property might be used in set expressions.

[\pLetter&&\pscript=cyrillic] Logical AND or intersection. Match the set of all Cyrillic letters.

[\pLetter--\pscript=latin] Subtraction. Match all non-Latin letters.

[[a-z][A-Z][0-9]] [a-zA-Z0-9]] Implicit Logical OR or Union of Sets. The examples match ASCII letters and digits. The two forms are equivalent.

[:script=Greek:] Alternate POSIX-like syntax for properties. Equivalent to \pscript=Greek


Bitwise FunctionsFunction Description

SMALLINT Functions

INT2AND Performs a bitwise AND operation, 1 only if the corresponding bits in both arguments are 1.

INT2OR Performs a bitwise OR operation, 1 unless the corresponding bits in both arguments are zero.

INT2XOR Performs a bitwise exclusive OR operation, 1 unless the corresponding bits in both arguments are the same.

INT2NOT Performs a bitwise NOT operation, opposite of the corresponding bit in the argument.

INTEGER Functions





BIGINT Functions






SQL_COMPAT Global Variable

§ This built-in global variable specifies the SQL compatibility mode– Its value determines which set of syntax rules are applied to SQL queries

§ This global variable has the following characteristics:– It is a read/write variable, with values maintained by the user– The type is VARCHAR(3)– The schema is SYSIBM– The scope of this global variable is session– It has a default value of NULL

§ SQL_COMPAT must have a value of NULL, 'DB2', or 'NPS'– NULL: The default setting ('DB2') is used– DB2: DB2 syntax rules are applied to SQL queries– NPS: Netezza syntax rules are applied to SQL queries. If you use this setting, some SQL behavior will differ from what is documented in the SQL reference information


Compatibility features for Netezza Performance Server

§ Use the SQL_COMPAT global variable to activate the following optional NPS compatibility features– SET SQL_COMPAT='NPS'

§ Compatibility Features– Double-dot notation

• You can use double-dot notation to specify a database object– TRANSLATE parameter syntax

• TRANSLATE (char-string-exp, from-string-exp ,to-string-exp)– NPS Operator Symbols

• The operators ^ and ** are both interpreted as the exponential operator, and the operator # is interpreted as bitwise XOR

– Grouping by SELECT clause columns • You can specify the ordinal position or exposed name of a SELECT clause column when grouping the results of a query

– Routines written in NZPLSQL• The NZPLSQL language can be used in addition to the SQL PL language.


NPS Grouping

§ A GROUP BY clause groups the results of a query that have matching values for one or more grouping expressions– Each column included in a grouping expression must unambiguously identify a column of the query's SELECT clause or an exposed column of the query's intermediate result

– If a SELECT clause contains column expressions that are not aggregate expressions, and if a GROUP BY clause is specified, those column expressions must be in the GROUP BY clause

– Example:• SELECT c1 AS a, c2+c3 AS b, COUNT(*) AS c

FROM t1 GROUP BY c1, c2+c3;

• SELECT c1 AS a, c2+c3 AS b, COUNT(*) AS c FROM t1

GROUP BY 1, 2;

DB2 V11 Overview NEDB2UG 2016-05-153 © 2016IBM"Corporation...

Documents

Transcript of DB2 V11 Overview NEDB2UG 2016-05-153 © 2016IBM"Corporation...