Dual Bitmap Index: Space-Time Efficient Bitmap Index for Equality and Membership Queries
Niwan Wattanakitrungroj and Sirirut Vanichayobon
Artificial Intelligence Research Laboratory, Department of Computer Science, Prince of Songkla University
ISCIT 2006, Artificial Intelligence Research Group
Outline
Introduction
Variations of Bitmap Index
- Simple Bitmap Index
- Interval Bitmap Index
- Scatter Bitmap Index
- Encoded Bitmap Index
- Dual Bitmap Index
Performance Study
Conclusion
[Figure: Data Warehousing Architecture. Components shown: data sources (operational DBs, external sources); Extract, Transform, Load, Refresh; data warehouse and data marts; metadata repository; monitoring and administration; OLAP server; tools for analysis, query/reporting, and data mining.]
Introduction
- A data warehouse is a large repository of information accessed through OLAP applications.
- A majority of requests for information from a data warehouse involve dynamic ad hoc queries.
- The ability to answer these queries quickly is a critical issue in the data warehouse environment.
Introduction
To speed up query processing:
- Summary tables
- Indexes
- Parallel machines
Introduction: Bitmap Index
Characteristics:
- simple to represent
- uses less space
- more CPU-efficient
- low-cost Boolean operations
Introduction: Bitmap Index

Employee Table
RID  Name    Gender  Education
1    Suda    F       BS
2    Wichai  M       BS
3    John    M       MS
4    Marry   F       PhD
5    Somsak  M       BS
…    …       …       …

Bitmap vectors for Gender (F, M) and Education (BS, MS, PhD)
RID  F  M  BS  MS  PhD
1    1  0  1   0   0
2    0  1  1   0   0
3    0  1  0   1   0
4    1  0  0   0   1
5    0  1  1   0   0
…    …  …  …   …   …

Equality Query
Select Count(*)
From Employee
Where Gender = 'F';
Answer: 2

Select Name
From Employee
Where Gender = 'M' and Education = 'MS';
Answer: John

Membership Query
Select Name
From Employee
Where Education in ('MS', 'PhD');
Answer: John, Marry
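The three queries on the Employee table can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: each bitmap vector is held in a Python integer (bit i stands for row i), and the helper names `build_simple_index` and `matching_names` are made up for this example.

```python
# Employee rows from the slide: (name, gender, education)
rows = [
    ("Suda", "F", "BS"),
    ("Wichai", "M", "BS"),
    ("John", "M", "MS"),
    ("Marry", "F", "PhD"),
    ("Somsak", "M", "BS"),
]


def build_simple_index(values):
    """Return {distinct value: bitmap}; bit i is set if row i holds that value."""
    index = {}
    for i, v in enumerate(values):
        index[v] = index.get(v, 0) | (1 << i)
    return index


def matching_names(bitmap):
    """Translate a result bitmap back into the names of the matching rows."""
    return [name for i, (name, _, _) in enumerate(rows) if (bitmap >> i) & 1]


gender = build_simple_index([r[1] for r in rows])
education = build_simple_index([r[2] for r in rows])

# Equality query: count the set bits of one vector.
count_f = bin(gender["F"]).count("1")
# Equality on two attributes: AND the two vectors.
male_ms = matching_names(gender["M"] & education["MS"])
# Membership query: OR the vectors of the listed values.
ms_or_phd = matching_names(education["MS"] | education["PhD"])
```

With a simple bitmap index, an equality query reads one vector per attribute and a membership query reads one vector per listed value, which is why the number of vectors (space) and the number of vectors touched per query (time) are the two quantities compared in the rest of the talk.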
Variations of Bitmap Index: Related Work
Simple Bitmap Index
Let C be the number of distinct values of the indexed attribute (its cardinality).
C = 15: 15 bitmap vectors
Bitmap vectors: $S_0, S_1, S_2, \ldots, S_{C-1}$
"A = v" = $S_v$
Query: "A = 2" = $S_2$
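Variations of Bitmap Index: Related Work
Interval Bitmap Index
C = 15: 8 bitmap vectors
Bitmap vectors: $I_0, I_1, I_2, \ldots, I_{\lceil C/2 \rceil - 1}$, where $I_j = [j, j+m]$ and $m = \lfloor C/2 \rfloor - 1$
$$
\text{``}A = v\text{''} =
\begin{cases}
I_0 & \text{if } v = 0,\ m = 0 \\
\overline{I_0} & \text{if } v = 1,\ C = 2 \\
I_1 & \text{if } v = 1,\ C = 3 \\
I_v \wedge \overline{I_{v+1}} & \text{if } v < m \\
I_v \wedge I_0 & \text{if } v = m,\ m > 0 \\
I_{v-m} \wedge \overline{I_{v-m-1}} & \text{if } m < v < C-1,\ m > 0 \\
\overline{(I_{\lceil C/2 \rceil - 1} \vee I_0)} & \text{if } v = C-1
\end{cases}
$$
Query: "A = 2" = $I_2 \wedge \overline{I_3}$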
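The interval scheme is easier to follow when run on a small column. The sketch below assumes the usual interval encoding, in which vector $I_j$ marks the rows whose value lies in $[j, j+m]$ with $m = \lfloor C/2 \rfloor - 1$, and handles only the general case $C \ge 4$ (so $m \ge 1$). Bitmaps are stored in Python integers, and the function names are illustrative, not taken from the talk.

```python
def build_interval_index(values, C):
    """Interval encoding: vector j marks rows whose value lies in [j, j + m]."""
    m = C // 2 - 1
    n_vectors = (C + 1) // 2  # ceil(C / 2)
    index = []
    for j in range(n_vectors):
        bits = 0
        for i, v in enumerate(values):
            if j <= v <= j + m:
                bits |= 1 << i
        index.append(bits)
    return index, m


def equals(index, m, C, v, n_rows):
    """Evaluate "A = v" with two bitmap vectors (general case, C >= 4)."""
    full = (1 << n_rows) - 1  # mask so that NOT stays within n_rows bits
    if v < m:
        return index[v] & ~index[v + 1] & full
    if v == m:
        return index[v] & index[0]
    if v < C - 1:
        return index[v - m] & ~index[v - m - 1] & full
    # v == C - 1: the only value covered by neither the last vector nor I_0
    return ~(index[-1] | index[0]) & full


# Usage: a column with C = 15 distinct values, one row per value.
column = list(range(15))
index, m = build_interval_index(column, 15)
rows_with_2 = equals(index, m, 15, 2, len(column))
```

Every equality query touches exactly two vectors, while the index stores about half as many vectors as the simple bitmap index.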
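Variations of Bitmap Index: Related Work
Scatter Bitmap Index
C = 15: 8 bitmap vectors, m = 5
Bitmap vectors: two groups, $L_1, L_2, \ldots$ and $Z_0, Z_1, \ldots$, whose sizes depend on C
"A = v" is defined by a two-case rule that combines one Z vector and one L vector, with subscripts computed from $(v - 1)$ and $(m - 1)$ using division and mod. [The exact subscripts are not recoverable from this transcript.]
Query: "A = 2" = $Z_1 \wedge L_2$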
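Variations of Bitmap Index: Related Work
Encoded Bitmap Index
C = 15: 4 bitmap vectors
Bitmap vectors: $E_0, E_1, E_2, \ldots, E_{\lceil \log_2 C \rceil - 1}$
[Figure: mapping table that assigns each attribute value a code over all bitmap vectors]
Query: "A = 2" is answered by a Boolean expression over all bitmap vectors, as given by the mapping table.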
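A minimal sketch of an encoded bitmap index follows. It assumes that the mapping table is simply the binary representation of each value's sequence number; the mapping actually used on the slide is not recoverable, so treat this choice, and the function names, as assumptions. Bitmaps are stored in Python integers.

```python
def build_encoded_index(values, C):
    """Vector b holds bit b of each row's value code (plain binary code assumed)."""
    k = max(1, (C - 1).bit_length())  # ceil(log2(C)) vectors
    index = [0] * k
    for i, v in enumerate(values):
        for b in range(k):
            if (v >> b) & 1:
                index[b] |= 1 << i
    return index


def equals(index, v, n_rows):
    """Evaluate "A = v": AND over ALL vectors, negating those whose code bit is 0."""
    full = (1 << n_rows) - 1
    result = full
    for b, bits in enumerate(index):
        result &= bits if (v >> b) & 1 else (~bits & full)
    return result


# Usage: C = 15 needs only 4 vectors, but every query reads all 4.
column = [2, 7, 2, 14, 0]
index = build_encoded_index(column, 15)
rows_with_2 = equals(index, 2, len(column))
```

The space saving is large, but each equality query has to read every vector, which is the cost the talk attributes to this scheme.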
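Variations of Bitmap Index: Dual Bitmap Index
Encoding scheme of the five bitmap indices
- Simple Bitmap Index: needs $C$ bitmap vectors
- Interval Bitmap Index: needs $\lceil C/2 \rceil$ bitmap vectors
- Scatter Bitmap Index: needs $\lceil 2\sqrt{C} \rceil$ bitmap vectors
- Encoded Bitmap Index: needs $\lceil \log_2 C \rceil$ bitmap vectors
- Dual Bitmap Index: needs $\lceil \sqrt{2C + 0.25} + 0.5 \rceil$ bitmap vectors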
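As a quick check of these counts, the sketch below evaluates the five expressions for a given cardinality. The rounding (ceilings) and the pairing of each expression with an index name are assumptions, chosen because they reproduce the vector counts quoted for C = 15 on the earlier slides; `vector_counts` is a hypothetical helper.

```python
from math import ceil, sqrt


def vector_counts(C):
    """Number of bitmap vectors each scheme needs for cardinality C (assumed rounding)."""
    return {
        "simple": C,
        "interval": ceil(C / 2),
        "scatter": ceil(2 * sqrt(C)),
        "encoded": max(1, (C - 1).bit_length()),  # ceil(log2(C))
        "dual": ceil(sqrt(2 * C + 0.25) + 0.5),
    }


counts_15 = vector_counts(15)
counts_50 = vector_counts(50)
```

For C = 15 this gives 15, 8, 8, 4, and 6 vectors respectively, so the dual index sits between the encoded index and the interval and scatter indices in space.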
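Variations of Bitmap Index: Creation of Dual Bitmap Index
Example: C = 15, A = {0, 1, 2, …, 14}
1. Assign an increasing sequence of numbers to each of the distinct values of A (i.e., 0, 1, …, C-1).
2. Calculate n, the total number of bitmap vectors created: $n = \lceil \sqrt{2C + 0.25} + 0.5 \rceil$. For C = 15, n = 6.
3. Calculate $C_{hi}$, the highest value of C that can be represented by n bitmap vectors: $C_{hi} = \binom{n}{2}$. For n = 6, $C_{hi} = 15$.
4. For each value v on the record at position i in A, set bit i to 1 in exactly two bitmap vectors, $D_r$ and $D_s$, and to 0 in all other vectors. Here r is computed from $C_{hi} - v$ with the same square-root form used for n, and s is computed from v, r, and n using a mod operation; v is the value of the indexed attribute for the record. [The exact formulas for r and s are not recoverable from this transcript.]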
Variations of Bitmap Index: Proposed Bitmap Index
Equality and Membership Queries
1. Find the sequence number of the search value.
2. "A = v" = $D_r \wedge D_s$, where r and s are computed from v, n, and $C_{hi}$ exactly as in the creation step, and v is the value of the indexed attribute being searched for.
Example: "A = 2" = $D_5 \wedge D_2$
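The idea of representing every value by two of the n vectors can be sketched as follows. Note the assumption: this example assigns value v the v-th pair of vector numbers in the order produced by `itertools.combinations`, whereas the slides compute the pair (r, s) with closed-form formulas, so the specific pairs here will generally differ from the ones in the talk (for instance, "A = 2" does not map to $D_5$ and $D_2$ in this sketch). Bitmaps are stored in Python integers and all function names are illustrative.

```python
from itertools import combinations
from math import ceil, sqrt


def build_dual_index(values, C):
    """Build n vectors; each row sets its bit in exactly two of them."""
    n = ceil(sqrt(2 * C + 0.25) + 0.5)
    # Hypothetical assignment: value v gets the v-th pair in lexicographic order.
    pairs = list(combinations(range(n), 2))[:C]
    vectors = [0] * n
    for i, v in enumerate(values):
        r, s = pairs[v]
        vectors[r] |= 1 << i
        vectors[s] |= 1 << i
    return vectors, pairs


def equals(vectors, pairs, v):
    """Equality query: a single AND of the two vectors assigned to v."""
    r, s = pairs[v]
    return vectors[r] & vectors[s]


def member_of(vectors, pairs, wanted):
    """Membership query: OR together the equality results of the listed values."""
    result = 0
    for v in wanted:
        result |= equals(vectors, pairs, v)
    return result


# Usage: C = 15 values fit in n = 6 vectors, since 6 * 5 / 2 = 15 pairs exist.
column = [2, 0, 14, 2, 7, 9]
vectors, pairs = build_dual_index(column, 15)
rows_with_2 = equals(vectors, pairs, 2)
rows_with_0_or_14 = member_of(vectors, pairs, [0, 14])
```

The AND is exact because each row sets bits in exactly two vectors: a row appears in both $D_r$ and $D_s$ only if its own pair is {r, s}, and distinct values get distinct pairs.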
Performance Study
[Figure: Number of bitmap vectors used to represent an attribute with cardinality C (space), for the Simple, Interval, Scatter, Encoded, and Dual bitmap indices.]
[Figure: Space-time trade-off for the five bitmap indices (Simple, Interval, Scatter, Encoded, Dual); C = 50, N = 1,000,000, data sets from the TPC-H benchmark.]
Conclusion
The Dual Bitmap Index uses less space while maintaining query processing time for equality and membership queries.
It achieves this by representing each attribute value with only two bitmap vectors, so only the low-cost Boolean AND operation is needed to answer an equality query.
The Dual Bitmap Index has better space-time performance than the other bitmap indexing techniques.
The Simple Bitmap Index requires the most space.
The Encoded Bitmap Index has the worst processing time.
Thank You
Question & answer