Simple Nested Sets and some other DB optimizations
NESTED SETS & DB DESIGN PRINCIPLES
INTRODUCTION AND TIPS & TRICKS
JWC13 - Eli Aschkenasy - Nested Sets & DB Design Principles
AGENDA
• WHO AM I ?
• NESTED SET INTRO
• NESTED SET BASIC QUERIES
• NESTED SET BASIC QUERIES OPTIMIZATION
• STRING LOOKUP OPTIMIZATION TRICK
• INDEXING OPTIONS
• SUMMARY
WHO AM I ?
Eli Aschkenasy
Lived in 7 countries
Live for Skiing/Kayaking
Love! Data
Like Joomla!
DATA
I LOVE DATA!
WHO AM I ?
Eli Aschkenasy
I LOVE JOOMLA!
WHO AM I ?
Eli Aschkenasy
Lived in 7 countries
Live for Skiing/Kayaking
Like Data
Love Joomla!
JOOMLA
Root Node
Example tree (18 nodes), one node per line as lft rgt, indented by depth:
lft rgt
1 36
  2 15
    3 8
      4 5
      6 7
    9 14
      10 11
      12 13
  16 21
    17 18
    19 20
  22 35
    23 28
      24 25
      26 27
    29 34
      30 31
      32 33
ROOT NODE
root - lft = 1
TOTAL NODES
(rgt - lft + 1) / 2 = total
(36 - 1 + 1) / 2 = 18
IS LEAF NODE?
leaf = (rgt - lft == 1) ? true : false;
true = (5 - 4 == 1)
false = (8 - 3 == 1)
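A minimal Python sketch of the two formulas above, assuming integer lft/rgt values as in the example tree. The function names are illustrative and not from the talk, and applying the count formula to a non-root node (to count that node plus its descendants) goes one step beyond what the slide shows.

```python
def total_nodes(lft: int, rgt: int) -> int:
    """Nodes in the subtree spanning lft..rgt, the node itself included."""
    return (rgt - lft + 1) // 2


def is_leaf(lft: int, rgt: int) -> bool:
    """A leaf has no room between its lft and rgt values."""
    return rgt - lft == 1


print(total_nodes(1, 36))   # 18 (the whole example tree)
print(total_nodes(22, 35))  # 7  (node 22/35 plus its six descendants)
print(is_leaf(4, 5))        # True
print(is_leaf(3, 8))        # False
```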
LEAF NODE OPTIMIZATION
Naïve implementation
SELECT *
FROM #__
WHERE rgt - lft = 1
LEAF NODE OPTIMIZATION
Normal implementation
SELECT *
FROM #__
WHERE lft = (rgt - 1)
LEAF NODE OPTIMIZATION
Optimized implementation
SELECT x, x, x
FROM #__
WHERE lft = (rgt - 1)
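The leaf query can be tried against the example tree with a sketch like the following. It uses Python's built-in sqlite3 rather than the MySQL/Joomla setup the slides assume, and the table name tree stands in for the elided #__ table; both are assumptions made for illustration. It demonstrates the result of the query only, not its speed.

```python
import sqlite3

# (lft, rgt) pairs of the 18-node example tree from the slides.
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11),
         (12, 13), (16, 21), (17, 18), (19, 20), (22, 35), (23, 28),
         (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]

db = sqlite3.connect(":memory:")
# "tree" is a hypothetical table name; the slides elide it as #__.
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# Name the needed columns instead of SELECT *, and compare lft to (rgt - 1).
leaves = db.execute(
    "SELECT id, lft, rgt FROM tree WHERE lft = (rgt - 1) ORDER BY lft"
).fetchall()

print([lft for _id, lft, _rgt in leaves])
# [4, 6, 10, 12, 17, 19, 24, 26, 30, 32]
```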
SUBTREE SELECTION
Schoolbook implementation
SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN b.lft AND b.rgt
AND c.rgt BETWEEN b.lft AND b.rgt;
Anything between 22 and 35
SUBTREE SELECTION OPTIMIZATION
Naïve implementation
SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN b.lft AND b.rgt;
Anything between 22 and 35
SUBTREE SELECTION ONLY SUBTREE
Naïve implementation
SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN b.lft AND b.rgt
AND c.lft <> b.lft;
Anything between 22 and 35 and isn’t 22
SUBTREE SELECTION ONLY SUBTREE
Optimized implementation
SELECT c.type AS choices, b.type AS bottom
FROM #__ AS c, #__ AS b
WHERE c.lft BETWEEN (b.lft + 1) AND b.rgt;
Anything between 23 and 35
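The subtree queries above can be sketched against the example tree as follows. This is an illustration under assumptions: sqlite3 replaces the MySQL backend, tree is a hypothetical table name for #__, and the extra b.lft = ? filter (not on the slides) restricts the self-join to one parent so the result is easy to read.

```python
import sqlite3

# (lft, rgt) pairs of the 18-node example tree from the slides.
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11),
         (12, 13), (16, 21), (17, 18), (19, 20), (22, 35), (23, 28),
         (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)


def subtree(parent_lft, include_parent=False):
    """Return (lft, rgt) of the nodes under the node whose lft is parent_lft."""
    # Starting the range at b.lft + 1 skips the parent without a second predicate.
    low = "b.lft" if include_parent else "b.lft + 1"
    sql = (
        "SELECT c.lft, c.rgt FROM tree AS c, tree AS b "
        "WHERE b.lft = ? "  # added filter: look at a single parent only
        f"AND c.lft BETWEEN {low} AND b.rgt "
        "ORDER BY c.lft"
    )
    return db.execute(sql, (parent_lft,)).fetchall()


print(subtree(22))
# [(23, 28), (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]
print(len(subtree(22, include_parent=True)))  # 7
```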
PATH TO A NODE
SELECT alias
FROM #__
WHERE lft < 4 AND rgt > 5
ORDER BY lft ASC;
PATH TO A NODE
Including leaf node
SELECT alias
FROM #__
WHERE lft < 5 AND rgt > 4
ORDER BY lft ASC;
PATH TO A NODE
SELECT alias
FROM #__
WHERE lft < 27 AND rgt > 26
ORDER BY lft ASC;
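A small sketch of the path queries, again using sqlite3 and a hypothetical tree table in place of #__. The alias values (n1, n2, ...) are made up from each node's lft so the output is readable; the slides do not show the real aliases. Swapping the two bound values is what switches between "ancestors only" and "ancestors plus the node itself".

```python
import sqlite3

# (lft, rgt) pairs of the 18-node example tree from the slides.
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11),
         (12, 13), (16, 21), (17, 18), (19, 20), (22, 35), (23, 28),
         (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (alias TEXT, lft INTEGER, rgt INTEGER)")
# Hypothetical aliases: "n" followed by the node's lft value.
db.executemany("INSERT INTO tree VALUES (?, ?, ?)",
               [(f"n{lft}", lft, rgt) for lft, rgt in PAIRS])

PATH_SQL = "SELECT alias FROM tree WHERE lft < ? AND rgt > ? ORDER BY lft ASC"


def path(lft, rgt, include_self=True):
    """Aliases from the root down to the node spanning lft..rgt."""
    # lft < rgt_of_node AND rgt > lft_of_node also matches the node itself;
    # lft < lft_of_node AND rgt > rgt_of_node matches strict ancestors only.
    args = (rgt, lft) if include_self else (lft, rgt)
    return [row[0] for row in db.execute(PATH_SQL, args)]


print(path(4, 5, include_self=False))  # ['n1', 'n2', 'n3']
print(path(4, 5))                      # ['n1', 'n2', 'n3', 'n4']
print(path(26, 27))                    # ['n1', 'n22', 'n23', 'n26']
```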
LEVEL OF NODE
SELECT b.id, COUNT(a.id) AS level
FROM #__ AS a, #__ AS b
WHERE b.lft BETWEEN a.lft AND a.rgt
GROUP BY b.id
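To see what the level query returns, here is a sketch on the example tree (sqlite3 and the table name tree are assumptions; ids are simply assigned 1 to 18 in lft order). Because each node also falls inside its own lft..rgt range, the count includes the node itself, so the root comes out at level 1.

```python
import sqlite3

# (lft, rgt) pairs of the 18-node example tree from the slides.
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11),
         (12, 13), (16, 21), (17, 18), (19, 20), (22, 35), (23, 28),
         (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# For each node b, count every node a whose range contains b.lft.
levels = dict(db.execute(
    "SELECT b.id, COUNT(a.id) AS level "
    "FROM tree AS a, tree AS b "
    "WHERE b.lft BETWEEN a.lft AND a.rgt "
    "GROUP BY b.id"
))

print(levels[1])   # 1 (root, lft 1)
print(levels[12])  # 2 (node 22/35)
print(levels[4])   # 4 (leaf 4/5)
```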
MAXIMUM DEPTH
SELECT MAX(level) AS height
FROM (
SELECT b.id, (COUNT(a.id) - 1) AS level
FROM #__ AS a, #__ AS b
WHERE b.lft BETWEEN a.lft AND a.rgt
GROUP BY b.id) AS L1
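The maximum-depth query can be checked the same way. In this sketch (sqlite3, hypothetical table name tree), subtracting 1 from the count makes the root level 0, so the deepest leaves of the example tree sit at height 3.

```python
import sqlite3

# (lft, rgt) pairs of the 18-node example tree from the slides.
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11),
         (12, 13), (16, 21), (17, 18), (19, 20), (22, 35), (23, 28),
         (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# Inner query: zero-based level per node. Outer query: the largest of them.
height = db.execute(
    "SELECT MAX(level) AS height FROM ("
    "  SELECT b.id, (COUNT(a.id) - 1) AS level"
    "  FROM tree AS a, tree AS b"
    "  WHERE b.lft BETWEEN a.lft AND a.rgt"
    "  GROUP BY b.id) AS L1"
).fetchone()[0]

print(height)  # 3
```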
PARENT ID OF A NODE
SELECT id, (SELECT id FROM #__ t2 WHERE t2.lft < t1.lft AND t2.rgt > t1.rgt ORDER BY t2.rgt-t1.rgt ASC LIMIT 1) AS parent FROM #__ t1 ORDER BY (rgt-lft) DESC
INSERT A NODE
UPDATE #__ SET rgt=rgt+2 WHERE rgt >= 25;
UPDATE #__ SET lft=lft+2 WHERE lft >= 24;
INSERT INTO #__ SET lft=24, rgt=25, ….;
After the insert: the new node sits at lft 24, rgt 25, every lft or rgt value from 24 upward in the old tree has moved up by 2, and the root now spans 1 to 38.
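The three insert statements can be replayed on the example tree with a sketch like this one. It keeps the literal values 24 and 25 from the slide, uses sqlite3 and a hypothetical table name tree, and wraps the statements in one transaction, which is an addition of this sketch (the slides do not mention transactions) so that a failure cannot leave the tree half-shifted. SQLite has no INSERT ... SET form, so the insert is written with VALUES.

```python
import sqlite3

# (lft, rgt) pairs of the 18-node example tree from the slides.
PAIRS = [(1, 36), (2, 15), (3, 8), (4, 5), (6, 7), (9, 14), (10, 11),
         (12, 13), (16, 21), (17, 18), (19, 20), (22, 35), (23, 28),
         (24, 25), (26, 27), (29, 34), (30, 31), (32, 33)]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)")
db.executemany("INSERT INTO tree (lft, rgt) VALUES (?, ?)", PAIRS)

# "with db" commits on success and rolls back if any statement raises.
with db:
    db.execute("UPDATE tree SET rgt = rgt + 2 WHERE rgt >= 25")  # open a gap
    db.execute("UPDATE tree SET lft = lft + 2 WHERE lft >= 24")
    db.execute("INSERT INTO tree (lft, rgt) VALUES (24, 25)")    # fill the gap

after = db.execute("SELECT lft, rgt FROM tree ORDER BY lft").fetchall()
print(after[0])    # (1, 38)  the root grew by 2
print(after[12:16])
# [(23, 30), (24, 25), (26, 27), (28, 29)]
```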
Common INSERT
INSERT INTO #__ (part_number, unit_price, eau, ….)
VALUES (….);
Common SELECT
SELECT unit_price FROM #__
WHERE part_number = $part_number;
PROBLEMS:
1. Speed
2. Data accuracy (inconsistent white spacing)
Amended INSERT
php: $concat = preg_replace('/\s+/', '', $input);
INSERT INTO #__ (part_number, unit_price, eau, part_number_concat, …)
VALUES (…, $concat);
Amended SELECT
SELECT unit_price FROM #__
WHERE part_number_concat = $concat;
Optimized INSERT
php: $concat = preg_replace('/\s+/', '', $input);
php: $crc = crc32($concat);
INSERT INTO #__ (part_number, unit_price, eau, part_number_concat, crc_partnumberconcat…)
VALUES (…, $concat, $crc);
Optimized SELECT
SELECT unit_price FROM #__
WHERE crc_partnumberconcat = $crc
AND part_number_concat = $concat; (singularity)
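The whitespace-stripping and CRC lookup idea can be sketched end to end as follows. This is an illustration under assumptions, not the talk's code: Python's re.sub and zlib.crc32 stand in for PHP's preg_replace and crc32, sqlite3 stands in for MySQL, the table name parts and the sample part number are invented, and the column names follow the slides.

```python
import re
import sqlite3
import zlib


def normalize(part_number: str) -> str:
    """Strip all whitespace so differently spaced inputs compare equal."""
    return re.sub(r"\s+", "", part_number)


def crc(text: str) -> int:
    """Integer checksum of the normalized string; cheap to index and compare."""
    return zlib.crc32(text.encode("utf-8"))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parts (part_number TEXT, unit_price REAL, "
           "part_number_concat TEXT, crc_partnumberconcat INTEGER)")
db.execute("CREATE INDEX idx_crc_pnc "
           "ON parts (crc_partnumberconcat, part_number_concat)")


def add_part(part_number: str, unit_price: float) -> None:
    concat = normalize(part_number)
    db.execute("INSERT INTO parts VALUES (?, ?, ?, ?)",
               (part_number, unit_price, concat, crc(concat)))


def price_of(user_input: str):
    concat = normalize(user_input)
    # The CRC narrows the search; the string comparison rules out collisions.
    row = db.execute(
        "SELECT unit_price FROM parts "
        "WHERE crc_partnumberconcat = ? AND part_number_concat = ?",
        (crc(concat), concat)).fetchone()
    return row[0] if row else None


add_part("AB 123 - X", 9.5)        # hypothetical part number
print(price_of("AB123-X"))         # 9.5
print(price_of(" A B 1 2 3 - X"))  # 9.5
print(price_of("ZZ-999"))          # None
```

Keeping the string comparison next to the CRC comparison matters because two different strings can share a CRC-32 value.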
Optimized INSERT (* BATCH)
php: $concat = preg_replace('/\s+/', '', $input);
php: $crc = crc32($concat);
ALTER TABLE #__ DROP INDEX x;
INSERT INTO #__ (part_number, unit_price, eau, part_number_concat, crc_partnumberconcat…)
VALUES (…, $concat, $crc);
ALTER TABLE #__ ADD INDEX x (`column`);
Optimized SELECT
SELECT unit_price FROM #__
WHERE crc_partnumberconcat = $crc
AND part_number_concat = $concat;
Indexing Options
ALWAYS BENCHMARK (against COLD DB)
USE EXPLAIN (EXTENSIVELY!!!)
Assumption:
I might search part_number_concat directly:
idx_pnc_crc (part_number_concat, crc_partnumberconcat)
Alternative:
idx_crc_pnc (crc_partnumberconcat, part_number_concat)
idx_pnc (part_number_concat)
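The habit of checking the query plan can be practiced even without a MySQL server. The sketch below swaps in SQLite's EXPLAIN QUERY PLAN for MySQL's EXPLAIN (the output format differs, but the question asked is the same: which index, if any, does this query use?). The table name parts is hypothetical; the index and column names follow the slide.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE parts (part_number TEXT, unit_price REAL, "
           "part_number_concat TEXT, crc_partnumberconcat INTEGER)")
# Only the composite index from the "Alternative" option, CRC column first.
db.execute("CREATE INDEX idx_crc_pnc "
           "ON parts (crc_partnumberconcat, part_number_concat)")


def plan(sql, args):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return [row[-1] for row in rows]


# Lookup on both columns: matches the index from its leading column on.
both = plan("SELECT unit_price FROM parts "
            "WHERE crc_partnumberconcat = ? AND part_number_concat = ?",
            (123, "AB123-X"))

# Lookup on part_number_concat alone: it is not the leading index column.
direct = plan("SELECT unit_price FROM parts WHERE part_number_concat = ?",
              ("AB123-X",))

print(both)
print(direct)
```

The first plan should name idx_crc_pnc. The second typically falls back to a table scan, which is the case the separate idx_pnc index in the "Alternative" option is meant to cover; exact wording of the plan text varies between SQLite versions.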
SUMMARY
• DETACH “LEAF CONTENT” FROM TREE
• UTILIZE LIVE QUERIES INSTEAD OF SAVING IN DB
• ALTERNATIVELY - CREATE SUMMARY TABLES
• CREATE SIMPLIFICATION COLUMNS FOR INDEXING
• CHOOSE INDEXING STRATEGY ACCORDING TO USAGE AND SPECIFICITY