GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13...

78
DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #13: QUERY COMPILATION

Transcript of GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13...

Page 1: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

DATABASE SYSTEM IMPLEMENTATION

GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ

LECTURE #13: QUERY COMPILATION

Page 2: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

TODAY’S AGENDA

BackgroundCode Generation / TranspilationJIT Compilation (LLVM)Real-world Implementations

2

Page 3: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

ANATOMY OF A DATABASE SYSTEM

Connection Manager + Admission Control

Query Parser

Query Optimizer

Query Executor

Lock Manager (Concurrency Control)

Access Methods (or Indexes)

Buffer Pool Manager

Log Manager

Memory Manager + Disk Manager

Networking Manager

3

QueryTransactional

Storage Manager

Query Processor

Shared Utilities

Process Manager

Source: Anatomy of a Database System

Page 4: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HEKATON REMARK

After switching to an in-memory DBMS, the only way to increase throughput is to reduce the number of instructions executed.→ To go 10x faster, the DBMS must execute 90% fewer

instructions…→ To go 100x faster, the DBMS must execute 99% fewer

instructions…

4

COMPILATION IN THE MICROSOFT SQL SERVER HEKATON ENGINEIEEE Data Engineering Bulletin 2011

Page 5: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

OBSERVATION

One way to achieve such a reduction in instructions is through code specialization.

This means generating code that is specific to a particular query in the DBMS.→ Encode everything known about the data (e.g., type) and

query (e.g., where clause)Most code is written to make it easy for humans to understand rather than performance…→ Interpretation vs. Compilation

5

Page 6: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

EXAMPLE DATABASE

6

CREATE TABLE A (id INT PRIMARY KEY,val INT

);

CREATE TABLE B (id INT PRIMARY KEY,val INT

);

CREATE TABLE C (a_id INT REFERENCES A(id),b_id INT REFERENCES B(id),PRIMARY KEY (a_id, b_id)

);

Page 7: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY PROCESSING

7

SELECT A.id, B.valFROM A, BWHERE A.id = B.idAND B.val > 100

A B

A.id=B.id

val>100

A.id, B.val

⨝s

p

Page 8: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY PROCESSING

8

Tuple-at-a-time→ Each operator calls next on their child to get

the next tuple to process.Vector-at-a-time→ Each operator calls next on their child to get

the next chunk of data to process.Operator-at-a-time→ Each operator materializes their entire output

for their parent operator.

SELECT A.id, B.valFROM A, BWHERE A.id = B.idAND B.val > 100

A B

A.id=B.id

val>100

A.id, B.val

⨝s

p

Page 9: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY PROCESSING

9

Tuple-at-a-time→ Each operator calls next on their child to get

the next tuple to process.Vector-at-a-time→ Each operator calls next on their child to get

the next chunk of data to process.Operator-at-a-time→ Each operator materializes their entire output

for their parent operator.

SELECT A.id, B.valFROM A, BWHERE A.id = B.idAND B.val > 100

A B

A.id=B.id

val>100

A.id, B.val

⨝s

p

Page 10: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY INTERPRETATION

10

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

⨝A.id=C.a_id

σA.val=123

A

⨝B.id=C.b_id

ΓB.id, COUNT(*)

σB.val=?+1

B C

Page 11: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY INTERPRETATION

11

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

⨝A.id=C.a_id

σA.val=123

A

⨝B.id=C.b_id

ΓB.id, COUNT(*)

σB.val=?+1

B C

⨝ for t1 in left.next():buildHashTable(t1)

for t2 in right.next():if probe(t2): emit(t1⨝t2)

for t in child.next():if evalPred(t): emit(t)σ ⨝ for t1 in left.next():

buildHashTable(t1)for t2 in right.next():

if probe(t2): emit(t1⨝t2)

for t in A:emit(t)A

for t in B:emit(t)B for t in C:

emit(t)C

for t in child.next():if evalPred(t): emit(t)σ

Γfor t in child.next():

buildAggregateTable(t)for t in aggregateTable:

emit(t)

Page 12: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PREDICATE INTERPRETATION

12

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Page 13: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PREDICATE INTERPRETATION

13

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Page 14: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Execution Context

PREDICATE INTERPRETATION

14

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 15: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Execution Context

PREDICATE INTERPRETATION

15

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 16: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Execution Context

PREDICATE INTERPRETATION

16

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 17: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

Execution Context

PREDICATE INTERPRETATION

17

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 18: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

Execution Context

PREDICATE INTERPRETATION

18

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 19: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

Execution Context

PREDICATE INTERPRETATION

19

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 20: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

999

Execution Context

PREDICATE INTERPRETATION

20

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 21: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

999

Execution Context

PREDICATE INTERPRETATION

21

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 22: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

999 1

Execution Context

PREDICATE INTERPRETATION

22

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 23: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

1000

999 1

true

1000

Execution Context

PREDICATE INTERPRETATION

23

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Current Tuple(123, 1000)

Query Parameters(int:999)

Table SchemaB→(int:id, int:val)

TupleAttribute(B.val)

Constant(1)

=

+

Parameter(0)

Page 24: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

CODE SPECIALIZATION

Any CPU intensive entity of database can be natively compiled if they have a similar execution pattern on different inputs. → Access Methods→ Stored Procedures→ Operator Execution→ Predicate Evaluation→ Logging Operations

24

Page 25: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

BENEFITS

Attribute types are known a priori.→ Data access function calls can be converted to inline

pointer casting.Predicates are known a priori.→ They can be evaluated using primitive data comparisons.No function calls in loops→ Allows the compiler to efficiently distribute data to

registers and increase cache reuse.

25

Page 26: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

ARCHITECTURE OVERVIEW

26

SQL Query

ParserAbstract

SyntaxTree

Physical Plan

CostEstimates

SystemCatalog

Binder

OptimizerAnnotated

AST

Native Code

Compiler

Page 27: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

CODE GENERATION

Approach #1: Transpilation→ Write code that converts a relational query plan into

C/C++ and then run it through a conventional compiler to generate native code.

Approach #2: JIT Compilation→ Generate an intermediate representation (IR) of the query

that can be quickly compiled into native code .

27

Page 28: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HIQUE – CODE GENERATION

For a given query plan, create a C/C++ program that implements that query’s execution.→ Bake in all the predicates and type conversions.

Use an off-shelf compiler to convert the code into a shared object, link it to the DBMS process, and then invoke the exec function.

28

GENERATING CODE FOR HOLISTIC QUERY EVALUATIONICDE 2010

Page 29: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

OPERATOR TEMPLATES

29

SELECT * FROM A WHERE A.val = ? + 1

Page 30: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Interpreted Plan

OPERATOR TEMPLATES

30

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

Page 31: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Interpreted Plan

OPERATOR TEMPLATES

31

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

1. Get schema in catalog for table.2. Calculate offset based on tuple size.3. Return pointer to tuple.

Page 32: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Interpreted Plan

OPERATOR TEMPLATES

32

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

1. Get schema in catalog for table.2. Calculate offset based on tuple size.3. Return pointer to tuple.

1. Traverse predicate tree and pull values up.2. If tuple value, calculate the offset of the target

attribute.3. Perform casting as needed for comparison operators.4. Return true / false.

Page 33: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Templated PlanInterpreted Plan

OPERATOR TEMPLATES

33

tuple_size = ###predicate_offset = ###parameter_value = ###

for t in range(table.num_tuples):tuple = table.data + t ∗ tuple_sizeval = (tuple+predicate_offset)if (val == parameter_value + 1):

emit(tuple)

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

1. Get schema in catalog for table.2. Calculate offset based on tuple size.3. Return pointer to tuple.

1. Traverse predicate tree and pull values up.2. If tuple value, calculate the offset of the target

attribute.3. Perform casting as needed for comparison operators.4. Return true / false.

Page 34: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Templated PlanInterpreted Plan

OPERATOR TEMPLATES

34

tuple_size = ###predicate_offset = ###parameter_value = ###

for t in range(table.num_tuples):tuple = table.data + t ∗ tuple_sizeval = (tuple+predicate_offset)if (val == parameter_value + 1):

emit(tuple)

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

1. Get schema in catalog for table.2. Calculate offset based on tuple size.3. Return pointer to tuple.

1. Traverse predicate tree and pull values up.2. If tuple value, calculate the offset of the target

attribute.3. Perform casting as needed for comparison operators.4. Return true / false.

Page 35: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Templated PlanInterpreted Plan

OPERATOR TEMPLATES

35

tuple_size = ###predicate_offset = ###parameter_value = ###

for t in range(table.num_tuples):tuple = table.data + t ∗ tuple_sizeval = (tuple+predicate_offset)if (val == parameter_value + 1):

emit(tuple)

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

1. Get schema in catalog for table.2. Calculate offset based on tuple size.3. Return pointer to tuple.

1. Traverse predicate tree and pull values up.2. If tuple value, calculate the offset of the target

attribute.3. Perform casting as needed for comparison operators.4. Return true / false.

Page 36: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

Templated PlanInterpreted Plan

OPERATOR TEMPLATES

36

tuple_size = ###predicate_offset = ###parameter_value = ###

for t in range(table.num_tuples):tuple = table.data + t ∗ tuple_sizeval = (tuple+predicate_offset)if (val == parameter_value + 1):

emit(tuple)

for t in range(table.num_tuples):tuple = get_tuple(table, t)if eval(predicate, tuple, params):

emit(tuple)

1. Get schema in catalog for table.2. Calculate offset based on tuple size.3. Return pointer to tuple.

1. Traverse predicate tree and pull values up.2. If tuple value, calculate the offset of the target

attribute.3. Perform casting as needed for comparison operators.4. Return true / false.

Page 37: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

DBMS INTEGRATION

The generated query code can invoke any other function in the DBMS.

This allows it to use all the same components as interpreted queries.→ Concurrency Control→ Logging / Checkpoints→ Indexes

37

Page 38: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

EVALUATION

Generic Iterators→ Canonical model with generic predicate evaluation.Optimized Iterators→ Type-specific iterators with inline predicates.Generic Hardcoded→ Handwritten code with generic iterators/predicates.Optimized Hardcoded→ Direct tuple access with pointer arithmetic.HIQUE→ Query-specific specialized code.

38

Page 39: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY COMPILATION EVALUATION

39

0

50

100

150

200

250

Generic Iterators

Optimized Iterators

Generic Hardcoded

Optimized Hardcoded

HIQUE

Exec

utio

n Ti

me (

ms)

L2-cache Miss Memory Stall Instruction Exec.

Intel Core 2 Duo 6300 @ 1.86GHzJoin Query: 10k⨝ 10k→10m

Source: Konstantinos Krikellas

Page 40: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY COMPILATION EVALUATION

40

0

50

100

150

200

250

Generic Iterators

Optimized Iterators

Generic Hardcoded

Optimized Hardcoded

HIQUE

Exec

utio

n Ti

me (

ms)

L2-cache Miss Memory Stall Instruction Exec.

Intel Core 2 Duo 6300 @ 1.86GHzJoin Query: 10k⨝ 10k→10m

Source: Konstantinos Krikellas

Page 41: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY COMPILATION COST

41

121 160 213274

403

619

0

200

400

600

800

Q1 Q2 Q3

Com

pila

tion

Tim

e (m

s)

Compile (-O0) Compile (-O2)

Intel Core 2 Duo 6300 @ 1.86GHzTPC-H Queries

Source: Konstantinos Krikellas

Page 42: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

OBSERVATION

Relational operators are a useful way to reason about a query but are not the most efficient way to execute it.

It takes a (relatively) long time to compile a C/C++ source file into executable code.

HIQUE does not allow for full pipelining…

42

Page 43: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PIPELINED OPERATORS

43

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

⨝A.id=C.a_id

σA.val=123

A

⨝B.id=C.b_id

ΓB.id,COUNT(*)

σB.val=?+1

B C

Page 44: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PIPELINED OPERATORS

44

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

⨝A.id=C.a_id

σA.val=123

A

⨝B.id=C.b_id

ΓB.id,COUNT(*)

σB.val=?+1

B C

Pipeline Boundaries #1

#4

#2

#3

Page 45: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – JIT QUERY COMPILATION

Compile queries in-memory into native code using the LLVM toolkit.

Organizes query processing in a way to keep a tuple in CPU registers for as long as possible.→ Push-based vs. Pull-based→ Data Centric vs. Operator Centric

45

EFFICIENTLY COMPILING EFFICIENT QUERY PLANS FOR MODERN HARDWAREVLDB 2011

Page 46: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

LLVM

Collection of modular and reusable compiler and toolchain technologies.

Core component is a low-level programming language (IR) that is similar to assembly.

Not all of the DBMS components need to be written in LLVM IR.→ LLVM code can make calls to C++ code.

46

Page 47: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

LLVM

47

C Frontend

CommonOptimizer

X86 Backend

Fortran Frontend

Ada Frontend

PowerPCBackend

ARMBackendIntermediate

Representation (IR)

Source: The Architecture of Open Source Applications

Page 48: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PUSH-BASED EXECUTION

48

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Generated Query Planfor t in A:

if t.val == 123:Materialize t in HashTable ⨝(A.id=C.a_id)

for t in B:if t.val == <param> + 1:

Aggregate t in HashTable Γ(B.id)

for t in Γ(B.id):Materialize t in HashTable ⨝(B.id=C.b_id)

for t3 in C:for t2 in ⨝(B.id=C.b_id):

for t1 in ⨝(A.id=C.a_id):emit(t1⨝t2⨝t3)

Page 49: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PUSH-BASED EXECUTION

49

SELECT *FROM A, C, (SELECT B.id, COUNT(*)

FROM BWHERE B.val = ? + 1GROUP BY B.id) AS B

WHERE A.val = 123 AND A.id = C.a_idAND B.id = C.b_id

Generated Query Planfor t in A:

if t.val == 123:Materialize t in HashTable ⨝(A.id=C.a_id)

for t in B:if t.val == <param> + 1:

Aggregate t in HashTable Γ(B.id)

for t in Γ(B.id):Materialize t in HashTable ⨝(B.id=C.b_id)

for t3 in C:for t2 in ⨝(B.id=C.b_id):

for t1 in ⨝(A.id=C.a_id):emit(t1⨝t2⨝t3)

#1

#4

#2

#3

Page 50: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY COMPILATION EVALUATION

50

1

10

100

1000

10000

100000

Q1 Q2 Q3 Q4 Q5

Exec

utio

n Ti

me (

ms)

HyPer (LLVM) HyPer (C++) VectorWise MonetDB ???

Dual Socket Intel Xeon X5770 @ 2.93GHzTPC-H Queries

Source: Thomas Neumann

Page 51: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY COMPILATION COST

51

274403

619

13 37 150

200

400

600

800

Query #1 Query #2 Query #3

Com

pila

tion

Tim

e (m

s)

HIQUE HyPer

HIQUE (-O2) vs. HyPerTPC-H Queries

Source: Konstantinos Krikellas

Page 52: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

QUERY COMPILATION COST

LLVM's compilation time grows super-linearly relative to the query size.→ # of joins→ # of predicates→ # of aggregations

Not a big issue with OLTP applications.Major problem with OLAP workloads.

52

Page 53: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

First generate the LLVM IR for the query.Then execute that IR in an interpreter.Compile the query in the background.When the compiled query is ready, seamlessly replace the interpretive execution.

53

ADAPTIVE EXECUTION OF COMPILED QUERIESICDE 2018

Page 54: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

54

Page 55: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

55

Optimizer(0.2 ms)

SQL Query

Page 56: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

56

Optimizer(0.2 ms)

SQL Query

Code Generator(0.7 ms)

Query Plan

Page 57: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

57

Optimizer(0.2 ms)

SQL Query

Code Generator(0.7 ms)

Query Plan

Page 58: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

58

Optimizer(0.2 ms)

SQL Query

Code Generator(0.7 ms)

Query Plan

Byte Code Compiler(0.4 ms)

LLVM IR

Page 59: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

59

Optimizer(0.2 ms)

Byte Code

SQL Query

Code Generator(0.7 ms)

Query Plan

Byte Code Compiler(0.4 ms)

LLVM IR

Page 60: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

60

Optimizer(0.2 ms)

Byte Code

SQL Query

Code Generator(0.7 ms)

Query Plan

Byte Code Compiler(0.4 ms)

Unoptimized LLVM Compiler

(6 ms)

LLVM IR

LLVM IR

x86 Code

Page 61: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

61

Optimizer(0.2 ms)

Byte Code

SQL Query

Code Generator(0.7 ms)

Query Plan

LLVM Passes(25 ms)

Byte Code Compiler(0.4 ms)

Unoptimized LLVM Compiler

(6 ms)

LLVM IR

LLVM IR

LLVM IR

x86 Code

Page 62: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

HYPER – ADAPTIVE EXECUTION

62

Optimizer(0.2 ms)

Byte Code

SQL Query

Code Generator(0.7 ms)

Query Plan

LLVM Passes(25 ms)

Byte Code Compiler(0.4 ms)

Unoptimized LLVM Compiler

(6 ms)

Optimized LLVM Compiler

(17 ms)

LLVM IR

LLVM IR

LLVM IR

LLVM IR

x86 Code

x86 Code

Page 63: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

REAL-WORLD IMPLEMENTATIONS

IBM System ROracleMicrosoft HekatonCloudera ImpalaActian Vector

63

MemSQLVitesseDBApache SparkPeloton

Page 64: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

IBM SYSTEM R

A primitive form of code generation and query compilation was used by IBM in 1970s.→ Compiled SQL statements into assembly code by

selecting code templates for each operator.

Technique was abandoned when IBM built DB2:→ High cost of external function calls→ Poor portability→ Software engineer complications

64

A HISTORY AND EVALUATION OF SYSTEM RCOMMUNICATIONS OF THE ACM 1981

Page 65: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

ORACLE

Convert PL/SQL stored procedures into Pro*C code and then compiled into native C/C++ code.

They also put Oracle-specific operations directlyin the SPARC chips as co-processors.→ Memory Scans→ Bit-pattern Dictionary Compression→ Vectorized instructions designed for DBMSs→ Security/encryption

65

Page 66: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MICROSOFT HEKATON

Can compile both procedures and SQL.→ Non-Hekaton queries can access Hekaton tables through

compiled inter-operators.Generates C code from an imperative syntax tree, compiles it into DLL, and links at runtime.

Employs safety measures to prevent somebody from injecting malicious code in a query.

66

COMPILATION IN THE MICROSOFT SQL SERVER HEKATON ENGINEIEEE DATA ENGINEERING BULLETIN 2011

Page 67: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

CLOUDERA IMPALA

LLVM JIT compilation for predicate evaluation and record parsing.→ Not sure if they are also doing operator compilation.

Optimized record parsing is important for Impala because they need to handle multiple data formats stored on HDFS.

67

IMPALA: A MODERN, OPEN-SOURCE SQL ENGINE FOR HADOOPCIDR 2015

Page 68: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MEMSQL (PRE–2016)

Performs the same C/C++ code generation as HIQUE and then invokes gcc.Converts all queries into a parameterized form and caches the compiled query plan.

68

Page 69: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MEMSQL (PRE–2016)

Performs the same C/C++ code generation as HIQUE and then invokes gcc.Converts all queries into a parameterized form and caches the compiled query plan.

69

SELECT * FROM A WHERE A.id = 123

Page 70: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MEMSQL (PRE–2016)

Performs the same C/C++ code generation as HIQUE and then invokes gcc.Converts all queries into a parameterized form and caches the compiled query plan.

70

SELECT * FROM A WHERE A.id = ?

SELECT * FROM A WHERE A.id = 123

Page 71: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MEMSQL (PRE–2016)

Performs the same C/C++ code generation as HIQUE and then invokes gcc.Converts all queries into a parameterized form and caches the compiled query plan.

71

SELECT * FROM A WHERE A.id = ?

SELECT * FROM A WHERE A.id = 123

Page 72: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MEMSQL (PRE–2016)

Performs the same C/C++ code generation as HIQUE and then invokes gcc.Converts all queries into a parameterized form and caches the compiled query plan.

72

SELECT * FROM A WHERE A.id = ?

SELECT * FROM A WHERE A.id = 123

SELECT * FROM A WHERE A.id = 456

Page 73: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

MEMSQL (2016–PRESENT)

A query plan is converted into an imperative plan expressed in a high-level imperative DSL.→ MemSQL Programming Language (MPL)→ Think of this as a C++ dialect.The DSL then gets converted into a second language of opcodes.→ MemSQL Bit Code (MBC)→ Think of this as JVM byte code.Finally the DBMS compiles the opcodes into LLVM IR and then to native code.

73

Source: Drew Paroski

Page 74: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

APACHE SPARK

Introduced in the new Tungsten engine in 2015.The system converts a query's WHERE clause expression trees into ASTs.It then compiles these ASTs to generate JVM bytecode, which is then executed natively.

74

SPARK SQL: RELATIONAL DATA PROCESSING IN SPARKSIGMOD 2015

Page 75: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PELOTON

Full compilation of the entire query plan.

Relax the pipeline breakers of HyPer to create mini-batches for operators that can be vectorized.

Use software pre-fetching to hide memory stalls.

75

RELAXED OPERATOR FUSION FOR IN-MEMORY DATABASES: MAKING COMPILATION, VECTORIZATION, AND PREFETCHING WORK TOGETHER AT LASTVLDB 2017

Page 76: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PELOTON

76

8814726350

87473

996021500

901 13962641

383 540892 8461763

191 220

1

10

100

1000

10000

100000

Q1 Q3 Q13 Q14 Q19

Exec

utio

n Ti

me (

ms)

Interpreted LLVM LLVM + ROF

Dual Socket Intel Xeon E5-2630v4 @ 2.20GHzTPC-H 10 GB Database

Source: Prashanth Menon

Page 77: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

PARTING THOUGHTS

Query compilation makes a difference but is non-trivial to implement.

The 2016 version of MemSQL is the best query compilation implementation out there.Hekaton is very good too.

Any new DBMS that wants to compete has to implement query compilation.

77

Page 78: GT 4420/6422 // SPRING 2019 // @JOY ARULRAJ LECTURE #13 ...jarulraj/courses/4420-s19/slides/13... · CODE GENERATION Approach #1: Transpilation →Write code that converts a relational

NEXT CLASS

Query Optimization

78