
Analysis of Database Workloads on Modern Processors

Advisor: Prof. Shan Wang

Ph.D. student: Dawei Liu

Key Laboratory of Data Engineering and Knowledge Engineering, MOE

School of Information, Renmin University of China

Outline

• 1. Background

• 2. Motivation

• 3. Our research work

• 4. Future works

Background

• LAMA Project (Large Scale Data Management)

Joint research with HP Lab China

• Goal

Advanced issues of Massively Parallel Processing (MPP) databases;

Architecture and design aspects;

Next-generation memory-oriented DB

My focus

2. Motivation

Motivation

• Continued evolution of hardware: processors

Motivation (cont.)

• Memory

Larger and Larger Flash Memory

Cont.

• Traditional research

Dedicated to I/O optimization;

Fails to utilize processor resources efficiently

Cont.

• Modern processors (Itanium II)

Multi-level memory hierarchies;

Superscalar, out-of-order execution;

Multi-threading;

Multi-core;

These create opportunities to improve database performance.

Cont.

• Objective

Accurately characterize workload behavior on modern processors;

Find the bottlenecks;

• Benefit

Identify a set of characteristics;

Performance optimization

Detailed issues?

My Ph.D. track

(1) Accurately characterize database workloads on modern processors;

(2) Investigate MMDB workloads on modern processors;

(3) Develop a specialized benchmark for MMDB

(1) Processor Issue

• Previous research [*]

Conclusion

• DBMSs achieve low IPC (instructions per cycle)

• Processors are used inefficiently

Platform

• Intel Pentium II / Pentium Pro

----------------------------------------------------------------------------------------------------------------------------------------

* A. Ailamaki, D. J. DeWitt, M. D. Hill, D. A. Wood. DBMSs on a Modern Processor: Where Does Time Go? In Proc. VLDB, 1999.
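The IPC metric quoted above is the ratio of two raw counter readings. A minimal sketch of that arithmetic follows; the function names, the counter values, and the peak rate of 3 instructions per cycle are all illustrative assumptions, not figures from the cited study or from these slides.

```python
def ipc(instructions_retired: int, cpu_cycles: int) -> float:
    """Instructions per cycle, computed from two raw counter readings."""
    if cpu_cycles <= 0:
        raise ValueError("cpu_cycles must be positive")
    return instructions_retired / cpu_cycles


def issue_utilization(ipc_value: float, peak_ipc: float) -> float:
    """Fraction of an assumed peak instruction rate that was achieved."""
    if peak_ipc <= 0:
        raise ValueError("peak_ipc must be positive")
    return ipc_value / peak_ipc


if __name__ == "__main__":
    # Hypothetical readings for illustration only (not measured data).
    instructions = 1_200_000
    cycles = 2_000_000
    assumed_peak = 3.0  # assumption: peak instructions per cycle
    value = ipc(instructions, cycles)
    share = issue_utilization(value, assumed_peak)
    print(f"IPC = {value:.2f}, share of assumed peak = {share:.0%}")
```

A low share of the assumed peak is what "processors are used inefficiently" would look like in such a calculation.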

Cont.

We are interested in

• DBMSs on today’s processors

• Itanium II

• AMD Opteron (tm)

Where did the 8 years go?

(2) Main Memory DB Issue

• Previous research

DB: Disk Resident Databases (DRDB)

Workload: TPC-C

• Current problems

DB: Main Memory Databases (MMDB)

Workload: TPC-H (compute intensive)

The “move up” in the memory hierarchy;

Larger and larger on-chip and off-chip caches;

Steadily increasing RAM;

(3) MMDB-Oriented Benchmark

• Performance evaluation

OO1 Benchmark, OO7 Benchmark (obsolete)

• Industrial standards

TPC Benchmark C (OLTP)

TPC Benchmark H (OLAP)

How to benchmark a main memory database?

We found they are not appropriate for benchmarking MMDBs

3. Our research

Methodology

• Analysis framework

• Experiment study

Pipeline of modern processors

Query Execution Time Breakdown

• $T_Q = T_C + T_M + T_B + T_R - T_{OVL}$ [*]

$T_C$: useful computation time;

$T_M$: stall time due to memory stalls;

$T_B$: branch misprediction overhead;

$T_R$: resource-related stalls;

$T_{OVL}$: stall time that can be overlapped

* A. Ailamaki, D. J. DeWitt, M. D. Hill, D. A. Wood. DBMSs on a Modern Processor: Where Does Time Go? In Proc. VLDB, 1999.
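The breakdown formula above can be evaluated directly once the five components are known. The sketch below shows the arithmetic only; the function names and the cycle counts are hypothetical and are not results from this study.

```python
def query_time(tc: float, tm: float, tb: float, tr: float, tovl: float) -> float:
    """T_Q = T_C + T_M + T_B + T_R - T_OVL (all in the same unit, e.g. cycles)."""
    return tc + tm + tb + tr - tovl


def component_shares(tc: float, tm: float, tb: float, tr: float, tovl: float) -> dict:
    """Each component as a fraction of T_Q.

    The four positive components sum to more than 1.0 whenever T_OVL > 0,
    because overlapped stall time is counted once in each overlapping term.
    """
    tq = query_time(tc, tm, tb, tr, tovl)
    if tq <= 0:
        raise ValueError("total query time must be positive")
    parts = {"T_C": tc, "T_M": tm, "T_B": tb, "T_R": tr, "T_OVL": tovl}
    return {name: value / tq for name, value in parts.items()}


if __name__ == "__main__":
    # Hypothetical cycle counts, chosen only to make the arithmetic visible.
    shares = component_shares(tc=40, tm=35, tb=10, tr=20, tovl=5)
    for name, share in shares.items():
        print(f"{name}: {share:.0%}")
```

Reporting shares rather than absolute cycles makes queries of different lengths comparable.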

Execution time components on Itanium II platform

Experimental setup

• Platform-specific hardware

• Software

• Experimental methodology

The Hardware Platform

• HP Integrity rx2620-2 server

• Itanium II based server

• Cache

Cache characteristics

Software and Methodology

• Calibrator (CWI *)

Cache access and miss latency;

Main memory access latency;

Number of TLB levels;

Each level’s TLB miss latency

* Centrum voor Wiskunde en Informatica: national research institute for mathematics and computer science in the Netherlands

Cont.

• PerfSuite (NCSA *)

Controls the hardware counters

Measures 60 event types for the results

* National Center for Supercomputing Applications (NCSA)

Hardware counters

Stall time components on Itanium II

Results analysis

• Part one: DRDB

Workload characterization on Itanium II

• OLTP

• OLAP

• Part two: MMDB issue

Characterization of MMDB TPC-H workload

---------------------------------------------------------------------------------------------------------

• Dawei Liu, Shan Wang, Biao Qin, Weiwei Gong: Characterizing DSS Workloads from the Processor Perspective. The International Workshop on Database Management and Application over Network, DBMAN 2007: 235-240

• Dawei Liu, Shan Wang, Qiming Chen, Yun Tian, Weiwei Gong: “Main Memory Database TPC-H Workload Characterization on Modern Processor,” Renmin University of China, TR-01, 2007, http://deke.ruc.edu.cn/tr/TR 2007-01.

Memory stall time breakdown

TPC-H Workload on a DRDB

Index Influence

TPC-H Workload on a DRDB

Branch Instruction Misprediction

TPC-H Workload on a DRDB

DRDB vs. MMDB

Storage Architecture Influence

Summary

Characterized workloads on an Itanium II based platform;

Characterized an MMDB read-optimized workload on modern processors;

Compared the workload breakdown of DRDB and MMDB;

Explored the difference between column-oriented and row-oriented storage models in CPU and cache utilization;

Investigated the index influence at a low level
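One way to see why the storage model affects cache utilization is to count the cache lines touched when a query scans a single attribute. The sketch below is a simplified model, not the measurement method used in this work: the function names, the table dimensions, and the 128-byte line size are assumptions, and it supposes aligned data with an attribute that never straddles two lines.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def row_store_lines(num_rows: int, row_bytes: int, line_bytes: int) -> int:
    """Cache lines touched when reading one attribute from every row.

    Small rows: every line of the table is touched.
    Rows larger than a line: one line per row is touched.
    """
    all_lines = ceil_div(num_rows * row_bytes, line_bytes)
    return min(all_lines, num_rows)


def column_store_lines(num_rows: int, attr_bytes: int, line_bytes: int) -> int:
    """Cache lines touched when the attribute is stored contiguously."""
    return ceil_div(num_rows * attr_bytes, line_bytes)


if __name__ == "__main__":
    # Hypothetical table: 1M rows of 128 bytes, scanning one 8-byte attribute.
    rows, row_bytes, attr_bytes, line = 1_000_000, 128, 8, 128
    r = row_store_lines(rows, row_bytes, line)
    c = column_store_lines(rows, attr_bytes, line)
    print(f"row store: {r} lines, column store: {c} lines, ratio: {r / c:.1f}x")
```

Under these assumptions the column layout touches far fewer lines for a single-attribute scan; the advantage shrinks as a query reads more attributes per row.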

4. Future works

Future works

In-depth analysis of the results

Develop new parallel techniques

Instruction-level parallelism

MMDB benchmark issue

• The results are expected to benefit

The performance optimization of DBMSs;

The architecture of next-generation memory-oriented databases.

The End

Thanks! Welcome to visit RUC.

Dawei Liu | School of Information | Renmin University of China

[email protected] | http://deke.ruc.edu.cn | Tel.: +86 (10) 62513934
