System Concepts and Architecture Rose-Hulman Institute of Technology Curt Clifton.
Indexes Rose-Hulman Institute of Technology Curt Clifton.
-
date post
21-Dec-2015 -
Category
Documents
-
view
214 -
download
0
Transcript of Indexes Rose-Hulman Institute of Technology Curt Clifton.
Indexes
Rose-Hulman Institute of Technology
Curt Clifton
Overview Introduction to Indexes Index Architecture How SQL Server Retrieves Stored Data How SQL Server Maintains Index and Heap
Structures Deciding Which Columns to Index
Data Pages
Page 7 Page 8 Page 9AkhtarAkhtarFunkFunkSmithSmithMartinMartin......
Page 4 Page 5 Page 6 ..............................
ConConFunkFunkWhiteWhite............
RuddRuddWhiteWhiteBarrBarr............
SmithSmithOtaOtaJonesJones............
MartinMartinPhuaPhuaJonesJonesSmithSmith......
GanioGanioJonesJonesHallHall............
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
......
Storing and Accessing Data How Data Is Stored
Rows are stored in data pages Heaps are a collection of data pages for a table
Whether to Create Indexes Why to Create an Index
Speeds up data access Enforces uniqueness of rows
Why Not to Create an Index Consumes disk space Incurs overhead
Using Heaps SQL Server: Uses Index Allocation Map Pages That:
Contain information on where the extents of a heap are stored
Navigate through the heap and find available space for new rows being inserted
Connect data pages Reclaims Space for New Rows in the Heap When a
Row Is Deleted
Using Clustered Indexes Each Table Can Have Only One Clustered
Index The Physical Row Order of the Table and the
Order of Rows in the Index Are the Same Key Value Uniqueness Is Maintained
Explicitly or Implicitly
Using Nonclustered Indexes Nonclustered Indexes Are the SQL Server
Default Existing Nonclustered Indexes Are
Automatically Rebuilt When: An existing clustered index is dropped A new clustered index is created The DROP_EXISTING option is used to change
which columns define the clustered index
Maintaining Index and Heap Structures
Page Splits in an Index
Index Pages
Non-LeafLevel
INSERT member (last name)VALUES lastname = ‘Jackson'
INSERT member (last name)VALUES lastname = ‘Jackson'
LangLangSmithSmith
……
AkhtarAkhtarGanioGanio
……
AkhtarAkhtar……
MartinMartin
Jackson
Leaf Level(Key Value)
AkhtarAkhtarBarrBarrBarrBarrBormBormBuhlBuhl
…………………………
LangLangMartinMartinMartinMartinMartinMartinMorisMoris
…………………………
SmithSmithSmithSmithSmithSmithSmithSmithSmithSmith
…………………………
… ……GanioGanioHallHallHartHart
JonesJonesJonesJones
…………………………
…
Leaf Level(Key Value)
AkhtarAkhtarBarrBarrBarrBarrBormBormBuhlBuhl
…………………………
JacksonJacksonJonesJonesJonesJones
………………
LangLangMartinMartinMartinMartinMartinMartinMorisMoris
…………………………
SmithSmithSmithSmithSmithSmithSmithSmithSmithSmith
…………………………
… ……GanioGanioHallHallHartHart
……………… …
Jackson …
Forwarding Pointer in a Heap
Non-LeafLevel
Page 12 - RootPage 37 Page 28
Leaf Level(Key Value)
Page 41 Page 51 Page 61 Page 71
AkhtarAkhtar......MartinMartin
AkhtarAkhtarBarrBarrConConFunkFunkFunkFunk
4:706:014:706:014:705:034:705:034:704:014:704:014:706:024:706:024:704:024:704:02
MartinMartinSmithSmith......
SmithSmithSmithSmithSmithSmithWhiteWhiteWhiteWhite
4:706:034:706:034:708:044:708:044:707:014:707:014:704:034:704:034:705:024:705:02
AkhtarAkhtarGanioGanio......
GanioGanioHallHall
JonesJonesJonesJonesJonesJones
4:709:014:709:014:709:044:709:044:709:024:709:024:708:034:708:034:707:034:707:03
HeapPage 707 Page 708 Page 709
0101020203030404......
......
......
......
......
......
AkhtarAkhtarFunkFunkSmithSmithMartinMartin......
Page 704 Page 705 Page 706 010102020303............
......
......
......
......
......
ConnConnFunkFunkWhiteWhite............
010102020303............
......
......
......
......
......
RuddRuddWhiteWhiteBarrBarr
......
......
01010202030304040505
......
......
......
......
......
SmithSmithOtaOta
JonesJonesCoretsCoretsNashNash
0101020203030404......
......
......
......
......
......
MartinMartinPhuaPhuaJonesJonesSmithSmith......
010102020303............
......
......
......
......
......
GanioGanioJonesJonesHallHall
......
......
MartinMartinMartinMartin
OtaOtaPhuaPhuaRuddRudd
4:708:014:708:014:706:044:706:044:707:024:707:024:708:024:708:024:705:014:705:01
NonClustered
Index
File ID #4
idid indid = 2indid = 2 rootrootsysindexes
Non-Leaf
Level
Page 12 - RootPage 37 Page 28
Leaf Level(Key Value)
Page 41 Page 51 Page 61 Page 71
AkhtarAkhtar......MartinMartin
AkhtarAkhtarBarrBarrConConFunkFunkFunkFunk
4:706:014:706:014:705:034:705:034:704:014:704:014:706:024:706:024:704:024:704:02
MartinMartinSmithSmith......
SmithSmithSmithSmithSmithSmithWhiteWhiteWhiteWhite
4:706:034:706:034:708:044:708:044:707:014:707:014:704:034:704:034:705:024:705:02
AkhtarAkhtarGanioGanio......
GanioGanioHallHall
JonesJonesJonesJonesJonesJones
4:709:014:709:014:709:044:709:044:709:024:709:024:708:034:708:034:707:034:707:03
HeapPage 707 Page 708 Page 709
0101020203030404......
......
......
......
......
......
AkhtarAkhtarFunkFunkSmithSmithMartinMartin......
Page 704 Page 705 Page 706 010102020303............
......
......
......
......
......
ConnConnFunkFunkWhiteWhite............
010102020303............
......
......
......
......
......
RuddRuddWhiteWhiteBarrBarr
......
......
01010202030304040505
......
......
......
......
......
SmithSmithOtaOta
JonesJonesCoretsCoretsNashNash
0101020203030404......
......
......
......
......
......
MartinMartinPhuaPhuaJonesJonesSmithSmith......
010102020303............
......
......
......
......
......
GanioGanioJonesJonesHallHall
......
......
MartinMartinMartinMartin
OtaOtaPhuaPhuaRuddRudd
4:708:014:708:014:706:044:706:044:707:024:707:024:708:024:708:024:705:014:705:01
Nonclustered
Index
File ID #4
0202 ...... OtaOta0202 ...... OtaOta
MartinMartin
MartinMartin
0202 OtaOta
0404 ...... OtaOta
OtaOta 4:707:024:707:02
UPDATE memberSET Address = <really long>WHERE lastname = 'Ota'
UPDATE memberSET Address = <really long>WHERE lastname = 'Ota'
Row Updates Generally do not cause rows to move Like a delete followed by an insert
Logically Sometimes practically
Batch updates touch each index once
Deletion Deletion creates “ghost records” Reclaiming space
Free pages when empty For indexed table:
Can overwrite ghost records immediately For non-indexed table:
Compact records when more space is needed for insert
Deciding What to Index
What You Need to Know Logical and Physical Database Design Data Characteristics How Data Is Used
The types of queries performed The frequency of queries that are typically
performed
Indexing Guidelines Columns to Index
Primary and foreign keys Those frequently searched in ranges Those frequently accessed in sorted order Those frequently grouped together during aggregation
Columns Not to Index Those seldom referenced in queries Those that contain few unique values Those defined with text, ntext, or image data types
Choosing the Clustered Index Heavily Updated Tables
A clustered index with an identity column keeps updated pages in memory
Sorting A clustered index keeps the data pre-sorted
Column Length and Data Type Limit the number of columns Reduce the number of characters Use the smallest data type possible
Data Characteristics – Densitylast_namelast_namelast_namelast_name first_namefirst_namefirst_namefirst_name
RandallRandall
..
..
..
JoshuaJoshua
RandallRandall CynthiaCynthia
RandallRandall TristanTristan
..
..
..
OtaOta LaniLani
..
..
..
SELECT *FROM memberWHERE last_name = ‘Ota’
SELECT *FROM memberWHERE last_name = ‘Ota’
Low Density
SELECT *FROM memberWHERE last_name = ‘Randall’
SELECT *FROM memberWHERE last_name = ‘Randall’
High Density
Data Characteristics – Selectivity How effective is a column at selecting a
subset of the data A property of a given query:
Rows matching property / Total number of rows
Determining Selectivity High selectivitymember_nomember_nomember_nomember_no
11
22
..
..
last_namelast_namelast_namelast_name first_namefirst_namefirst_namefirst_name
RandallRandall
FloodFloodJoshuaJoshua
KathieKathie
..
1000010000 AndersonAnderson BillBill
SELECT *FROM memberWHERE member_no > 8999
SELECT *FROM memberWHERE member_no > 8999
100010000 = 10%
Number of rows meeting criteriaTotal number of rows in table =
Low selectivitymember_nomember_nomember_nomember_no
11
22
..
..
last_namelast_namelast_namelast_name first_namefirst_namefirst_namefirst_name
RandallRandall
FloodFloodJoshuaJoshua
KathieKathie
..
1000010000 AndersonAnderson BillBill
SELECT *FROM memberWHERE member_no < 9001
SELECT *FROM memberWHERE member_no < 9001
900010000 = 90%
Number of rows meeting criteriaTotal number of rows in table =
Indexing to Support Queries Writing Good Search Arguments
Specify a WHERE clause in the query Verify that the WHERE clause limits the number
of rows Avoid using leading wildcards
Introduction to Statistics
Standard Distribution of Values
F - JA - E K - O P - U V - ZLast Name
Number ofLast Names
Even Distribution of Values
Last Name
Number ofLast Names
C - FA - B G - K L - N O - Z
How Statistics Are Gathered DMBS reads/samples column values
Produces an evenly distributed sorted list of values
Performs a full scan or sampling of rows Depending on size of table and granularity
wanted Selects samplings if necessary
Picks rows to be sampled Includes all rows on the page of selected rows
Creating Statistics Automatically Creating Statistics
Indexed columns that contain data Non-indexed columns that are used in a join
predicate or a WHERE clause Manually Creating Statistics
Columns that are not indexed All columns other than the first column of a
composite index
Viewing Statistics The DBCC SHOW_STATISTICS Statement
Returns Statistical Information in the Distribution Page for an Index or Column
Statistical Information Includes: The time when the statistics were last updated The number of rows sampled to produce the
histogram Density information Average key length Histogram step information
Performance Considerations Create Indexes on Foreign Keys Create the Clustered Index Before
Nonclustered Indexes Consider Creating Composite Indexes Create Multiple Indexes for a Table That Is
Read Frequently