Simple Nested Sets and some other DB optimizations

NESTED SETS & DB DESIGN PRINCIPLES

INTRODUCTION AND TIPS & TRICKS

JWC13 - Eli Aschkenasy - Nested Sets & DB Design Principles

AGENDA

• WHO AM I ?

• NESTED SET INTRO

• NESTED SET BASIC QUERIES

• NESTED SET BASIC QUERIES OPTIMIZATION

• STRING LOOKUP OPTIMIZATION TRICK

• INDEXING OPTIONS

• SUMMARY

WHO AM I ?

Eli Aschkenasy

I LOVE DATA!

WHO AM I ?

Eli Aschkenasy

Lived in 7 countries

Live for Skiing/Kayaking

Love! Data

Like Joomla!

DATA

I LOVE DATA!

WHO AM I ?

Eli Aschkenasy

I LOVE JOOMLA!

WHO AM I ?

Eli Aschkenasy

Lived in 7 countries

Live for Skiing/Kayaking

Like Data

Love Joomla!

JOOMLA

NESTED SETS

Example tree used on the following slides (each node shown as lft, rgt):

Root Node (1, 36)
    (2, 15)
        (3, 8)
            (4, 5)
            (6, 7)
        (9, 14)
            (10, 11)
            (12, 13)
    (16, 21)
        (17, 18)
        (19, 20)
    (22, 35)
        (23, 28)
            (24, 25)
            (26, 27)
        (29, 34)
            (30, 31)
            (32, 33)

ROOT NODE

root: lft = 1

TOTAL NODES

(rgt - lft + 1) / 2 = total, using the root's lft and rgt

(36 - 1 + 1) / 2 = 18
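A minimal sketch of the node count, assuming a hypothetical SQLite table named tree (the slides leave the table name as #__) loaded with the 18 lft/rgt pairs of the example tree. It compares the formula with a plain COUNT(*); the id column is an illustration-only assumption.

```python
import sqlite3

# (lft, rgt) pairs of the example tree from the slides
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11), (12, 13),
         (16, 21), (17, 18), (19, 20), (22, 35), (23, 28), (24, 25), (26, 27),
         (29, 34), (30, 31), (32, 33)]

# "tree" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# The root is the row with lft = 1
root_lft, root_rgt = db.execute("SELECT lft, rgt FROM tree WHERE lft = 1").fetchone()
total = (root_rgt - root_lft + 1) // 2

# Cross-check the formula against an actual row count
count = db.execute("SELECT COUNT(*) FROM tree").fetchone()[0]
print(total, count)
```

The formula only needs the root row, so it avoids counting rows at all, provided the numbering has no gaps.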

IS LEAF NODE?

leaf = (rgt - lft == 1) ? true : false;

true = (5 - 4 == 1)

false = (8 - 3 == 1)

LEAF NODE OPTIMIZATION
Naïve implementation

SELECT *
FROM #__
WHERE rgt - lft = 1

LEAF NODE OPTIMIZATION
Normal implementation

SELECT *
FROM #__
WHERE lft = (rgt - 1)

LEAF NODE OPTIMIZATION
Optimized implementation

SELECT x, x, x
FROM #__
WHERE lft = (rgt - 1)

(x, x, x: list only the columns you need instead of *)
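The leaf test can be tried against the example tree. This is a sketch only: the table name tree and the id column are assumptions (the slides elide the table name as #__), and SQLite stands in for the deck's database.

```python
import sqlite3

# (lft, rgt) pairs of the example tree from the slides
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11), (12, 13),
         (16, 21), (17, 18), (19, 20), (22, 35), (23, 28), (24, 25), (26, 27),
         (29, 34), (30, 31), (32, 33)]

# "tree" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# A leaf has no room between its two boundaries: lft = rgt - 1.
# Only the needed columns are selected, as on the "optimized" slide.
leaves = db.execute(
    "SELECT lft, rgt FROM tree WHERE lft = (rgt - 1) ORDER BY lft"
).fetchall()
print(len(leaves), leaves)
```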

SUBTREE SELECTION
Schoolbook implementation

SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN b.lft AND b.rgt
AND c.rgt BETWEEN b.lft AND b.rgt;

Anything between 22 and 35

SUBTREE SELECTION OPTIMIZATION
Naïve implementation

SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN b.lft AND b.rgt;

Anything between 22 and 35

SUBTREE SELECTION ONLY SUBTREE
Naïve implementation

SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN b.lft AND b.rgt
AND c.lft <> b.lft;

Anything between 22 and 35 and isn’t 22

SUBTREE SELECTION ONLY SUBTREE
Optimized implementation

SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN (b.lft+1) AND b.rgt;

Anything between 23 and 35
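A small sketch of the subtree selection for the node (22, 35). The table name tree, the id column, and the extra filter b.lft = 22 (used here to pick one parent) are assumptions for illustration; the slide queries themselves carry no such filter.

```python
import sqlite3

# (lft, rgt) pairs of the example tree from the slides
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11), (12, 13),
         (16, 21), (17, 18), (19, 20), (22, 35), (23, 28), (24, 25), (26, 27),
         (29, 34), (30, 31), (32, 33)]

# "tree" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# Whole subtree, parent included: c.lft BETWEEN b.lft AND b.rgt
with_parent = db.execute(
    "SELECT c.lft, c.rgt FROM tree AS c, tree AS b "
    "WHERE c.lft BETWEEN b.lft AND b.rgt AND b.lft = ? ORDER BY c.lft",
    (22,),
).fetchall()

# Descendants only: start the range one past the parent's lft
descendants = db.execute(
    "SELECT c.lft, c.rgt FROM tree AS c, tree AS b "
    "WHERE c.lft BETWEEN (b.lft + 1) AND b.rgt AND b.lft = ? ORDER BY c.lft",
    (22,),
).fetchall()
print(with_parent)
print(descendants)
```

Shifting the lower bound by one replaces the separate c.lft <> b.lft test, because lft values are unique within the tree.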

PATH TO A NODE

SELECT alias
FROM #__
WHERE lft < 4 AND rgt > 5
ORDER BY lft ASC;

PATH TO A NODE
Including leaf node

SELECT alias
FROM #__
WHERE lft < 5 AND rgt > 4
ORDER BY lft ASC;

PATH TO A NODE

SELECT alias
FROM #__
WHERE lft < 27 AND rgt > 26
ORDER BY lft ASC;
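The three path queries can be checked against the example tree. A sketch under assumptions: the table name tree, the id column, and the helper path() are hypothetical, and lft/rgt pairs are returned in place of the alias column.

```python
import sqlite3

# (lft, rgt) pairs of the example tree from the slides
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11), (12, 13),
         (16, 21), (17, 18), (19, 20), (22, 35), (23, 28), (24, 25), (26, 27),
         (29, 34), (30, 31), (32, 33)]

# "tree" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

def path(lft_bound, rgt_bound):
    """Rows with lft < lft_bound AND rgt > rgt_bound, outermost first."""
    return db.execute(
        "SELECT lft, rgt FROM tree WHERE lft < ? AND rgt > ? ORDER BY lft ASC",
        (lft_bound, rgt_bound),
    ).fetchall()

ancestors = path(4, 5)      # ancestors of leaf (4, 5), leaf itself excluded
with_leaf = path(5, 4)      # swapped bounds: the leaf itself is included
path_26_27 = path(27, 26)   # path to leaf (26, 27), leaf included
print(ancestors, with_leaf, path_26_27)
```

Swapping the two bounds includes the node itself only when it is a leaf; for an inner node the same swap would also pull in its descendants.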

LEVEL OF NODE

SELECT b.id, COUNT(a.id) AS level
FROM #__ AS a, #__ AS b
WHERE b.lft BETWEEN a.lft AND a.rgt
GROUP BY b.id

MAXIMUM DEPTH

SELECT MAX(level) AS height
FROM (
    SELECT b.id, (COUNT(a.id) - 1) AS level
    FROM #__ AS a, #__ AS b
    WHERE b.lft BETWEEN a.lft AND a.rgt
    GROUP BY b.id
) AS L1
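Both the level query and the maximum-depth query can be run on the example tree. A sketch only: the table name tree is hypothetical, and the ids are assumed to be assigned 1 to 18 in lft order.

```python
import sqlite3

# (lft, rgt) pairs of the example tree from the slides, in lft order
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11), (12, 13),
         (16, 21), (17, 18), (19, 20), (22, 35), (23, 28), (24, 25), (26, 27),
         (29, 34), (30, 31), (32, 33)]

# "tree" is a hypothetical table name; ids become 1..18 in insertion order
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# Level = number of nodes whose interval contains this node's lft (itself included)
levels = dict(db.execute(
    """
    SELECT b.id, COUNT(a.id) AS level
    FROM tree AS a, tree AS b
    WHERE b.lft BETWEEN a.lft AND a.rgt
    GROUP BY b.id
    """
).fetchall())

# Height = deepest level, counting the root as 0
height = db.execute(
    """
    SELECT MAX(level) AS height
    FROM (
        SELECT b.id, (COUNT(a.id) - 1) AS level
        FROM tree AS a, tree AS b
        WHERE b.lft BETWEEN a.lft AND a.rgt
        GROUP BY b.id
    ) AS L1
    """
).fetchone()[0]
print(levels, height)
```

Note that the level query counts the root as level 1, while the height query subtracts one so the root sits at depth 0.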

PARENT ID OF A NODE

SELECT id,
    (SELECT id FROM #__ t2
     WHERE t2.lft < t1.lft AND t2.rgt > t1.rgt
     ORDER BY t2.rgt-t1.rgt ASC
     LIMIT 1) AS parent
FROM #__ t1
ORDER BY (rgt-lft) DESC

INSERT A NODE

UPDATE #__ SET rgt=rgt+2 WHERE rgt >= 25;
UPDATE #__ SET lft=lft+2 WHERE lft >= 24;
INSERT INTO #__ SET lft=24, rgt=25, ….;

INSERT A NODE (result)

After the three statements, the new leaf is (24, 25) and every value at or to the right of the insertion point has grown by 2:

Root Node (1, 38)
    (2, 15)    unchanged subtree
    (16, 21)   unchanged subtree
    (22, 37)
        (23, 30)
            (24, 25)   new node
            (26, 27)
            (28, 29)
        (31, 36)
            (32, 33)
            (34, 35)
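The insert can be replayed on the example tree to confirm the renumbering. This is a sketch under assumptions: the table name tree is hypothetical, SQLite stands in for the deck's database, and because SQLite has no INSERT ... SET form the last statement uses VALUES. Wrapping the three statements in one transaction is an addition of this sketch, not something the slide states.

```python
import sqlite3

# (lft, rgt) pairs of the example tree from the slides
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11), (12, 13),
         (16, 21), (17, 18), (19, 20), (22, 35), (23, 28), (24, 25), (26, 27),
         (29, 34), (30, 31), (32, 33)]

# "tree" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)
db.commit()

# Open a gap of 2 at position 24, then drop the new leaf into it.
# "with db" commits all three statements together or rolls them back.
with db:
    db.execute("UPDATE tree SET rgt = rgt + 2 WHERE rgt >= 25")
    db.execute("UPDATE tree SET lft = lft + 2 WHERE lft >= 24")
    db.execute("INSERT INTO tree (lft, rgt) VALUES (24, 25)")

after = db.execute("SELECT lft, rgt FROM tree ORDER BY lft").fetchall()
# Every boundary value 1..38 should now be used exactly once
used = sorted(v for pair in after for v in pair)
print(after)
```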

OPTIMIZED STRING LOOKUP

Common INSERT

INSERT INTO #__ (part_number, unit_price, eau, ….)
VALUES (….);

Common SELECT

SELECT unit_price
FROM #__
WHERE part_number = $part_number;

PROBLEMS:

1. Speed
2. Data accuracy (inconsistent white spacing)

Amended INSERT

php: $concat = preg_replace('/\s+/', '', $input);

INSERT INTO #__ (part_number, unit_price, eau, part_number_concat, …)
VALUES (…, $concat);

Amended SELECT

SELECT unit_price
FROM #__
WHERE part_number_concat = $concat;

PROBLEMS:

1. Speed
2. Data accuracy (inconsistent white spacing)

Optimized INSERT

php: $concat = preg_replace('/\s+/', '', $input);
php: $crc = crc32($concat);

INSERT INTO #__ (part_number, unit_price, eau, part_number_concat, crc_partnumberconcat …)
VALUES (…, $concat, $crc);

Optimized SELECT

SELECT unit_price
FROM #__
WHERE crc_partnumberconcat = $crc
AND part_number_concat = $concat;

(singularity: the string comparison rules out CRC32 collisions)

PROBLEMS:

1. Speed
2. Data accuracy (inconsistent white spacing)
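The same idea can be sketched end to end. Assumptions: the table name parts and the helpers normalize(), crc(), insert_part() and lookup_price() are hypothetical, the eau column is left out, SQLite stands in for the deck's database, and Python's zlib.crc32 takes the place of PHP's crc32().

```python
import re
import sqlite3
import zlib

def normalize(part_number):
    # Same idea as the preg_replace on the slide: drop all whitespace
    return re.sub(r"\s+", "", part_number)

def crc(text):
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3
    return zlib.crc32(text.encode("utf-8"))

# "parts" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE parts (
    part_number TEXT, unit_price REAL,
    part_number_concat TEXT, crc_partnumberconcat INTEGER)""")
db.execute("CREATE INDEX idx_crc_pnc "
           "ON parts (crc_partnumberconcat, part_number_concat)")

def insert_part(part_number, unit_price):
    concat = normalize(part_number)
    db.execute("INSERT INTO parts VALUES (?, ?, ?, ?)",
               (part_number, unit_price, concat, crc(concat)))

def lookup_price(part_number):
    concat = normalize(part_number)
    # Integer match first, then the string match to exclude CRC collisions
    row = db.execute(
        "SELECT unit_price FROM parts "
        "WHERE crc_partnumberconcat = ? AND part_number_concat = ?",
        (crc(concat), concat)).fetchone()
    return row[0] if row else None

insert_part("AB 123 - X", 9.5)
print(lookup_price("AB123  -X"))   # same part, different spacing
```

Bound parameters are used instead of interpolating $concat into the SQL string, which also sidesteps quoting problems.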

Optimized INSERT (* BATCH)

php: $concat = preg_replace('/\s+/', '', $input);
php: $crc = crc32($concat);

ALTER TABLE #__ DROP INDEX x;
INSERT INTO #__ (part_number, unit_price, eau, part_number_concat, crc_partnumberconcat …)
VALUES (…, $concat, $crc);
ALTER TABLE #__ ADD INDEX x (`column`);

Optimized SELECT

SELECT unit_price
FROM #__
WHERE crc_partnumberconcat = $crc
AND part_number_concat = $concat;

PROBLEMS:

1. Speed
2. Data accuracy (inconsistent white spacing)

Indexing Options

ALWAYS BENCHMARK (against COLD DB)
USE EXPLAIN (EXTENSIVELY!!!)

Assumption: I might search part_number_concat directly:
idx_pnc_crc (part_number_concat, crc_partnumberconcat)

Alternative:
idx_crc_pnc (crc_partnumberconcat, part_number_concat)
idx_pnc (part_number_concat)
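As a rough illustration of checking which index a query uses, the sketch below builds the "Alternative" pair of indexes and asks the planner for its choice. It swaps in SQLite's EXPLAIN QUERY PLAN for the EXPLAIN statement the slide refers to, so the output format differs from what the deck's database would show; the table name parts and the helper plan() are hypothetical.

```python
import sqlite3

# "parts" is a hypothetical table name standing in for #__
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE parts (
    part_number TEXT, unit_price REAL,
    part_number_concat TEXT, crc_partnumberconcat INTEGER)""")

# The "Alternative" pair of indexes from the slide
db.execute("CREATE INDEX idx_crc_pnc "
           "ON parts (crc_partnumberconcat, part_number_concat)")
db.execute("CREATE INDEX idx_pnc ON parts (part_number_concat)")

def plan(sql, params):
    # The last column of each EXPLAIN QUERY PLAN row is a readable detail string
    return [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql, params)]

# Lookup by checksum plus string, as in the optimized SELECT
both = plan("SELECT unit_price FROM parts "
            "WHERE crc_partnumberconcat = ? AND part_number_concat = ?",
            (0, "x"))

# Direct lookup by the normalized string only
direct = plan("SELECT unit_price FROM parts WHERE part_number_concat = ?",
              ("x",))
print(both)
print(direct)
```

A plan only shows which index is chosen, not how fast the query runs, so it complements benchmarking rather than replacing it.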

SUMMARY

• DETACH “LEAF CONTENT” FROM TREE

• UTILIZE LIVE QUERIES INSTEAD OF SAVING IN DB

• ALTERNATIVELY - CREATE SUMMARY TABLES

• CREATE SIMPLIFICATION COLUMNS FOR INDEXING

• CHOOSE INDEXING STRATEGY ACCORDING TO USAGE AND SPECIFICITY

THANK YOU