Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. ·...
Transcript of Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. ·...
Views and Indexing
CSCI 4380 Database Systems
Sunday, October 24, 2010
Views• A view is a query.
• Views can be anonymous
SELECT *
FROM
(SELECT id, name, username, password
FROM student
WHERE password is null) R
WHERE R.id = 5 ;
• The relation R above is an anonymous view (it is not known outside of this query).
• This query is combined with the remaining query to find the optimal query plan.
Sunday, October 24, 2010
Views optimization• For example, after optimization the query below
SELECT *
FROM
(SELECT id, name, username, password
FROM student
WHERE password is null) R
WHERE R.id = 5 ;
• may become:
SELECT *
FROM student
WHERE password is NULL and id = 5 ;
Sunday, October 24, 2010
Views optimization
• It is best to use an anonymous view if the query cannot be written without it.
• Otherwise, the optimizer may loose some optimizations and rewritings of the query when views are used.
Sunday, October 24, 2010
(non anonymous) Views
• Views can also be given a name. This allows them to be used in many different queries.CREATE VIEW nullpwd(id, name, username, password)
AS
SELECT id, name, username, password
FROM student
WHERE password is null ;
Sunday, October 24, 2010
Views
• Attributes can be renamed in views.CREATE VIEW facultyWCourse(faculty_id, name)
AS
SELECT DISTINCT f.id, f.name
FROM faculty f, class c
WHERE f.id = c.instructor_id ;
Sunday, October 24, 2010
Using views in queries• Views can be used in any query as if they were a table.
• Remember, views are just queries. No tuples are stored for them. SELECT * FROM nullpwdWHERE id = 5 ;
• When executing this query, the query processor first takes the query definition and replaces the query name with its definition (getting the expression in slide 2). Then, the query is optimized.
Sunday, October 24, 2010
Why use views?• Creating views allows the system designer to customize the
application code so that:
• The functionality for different users can be built on top of views.
• For example, faculty cannot access financial information of students and can only the information about the students who are currently taking a course from them.
• Solution: Create a view for the students in a specific class which only includes the relevant attributes. The application code will be built on top of this view.
Sunday, October 24, 2010
Other uses of views• Views can also be used to insert/update/delete tuples
instead of the table they are based on.
• This builds on the philosophy of building functionality based on views.
• However, this is only possible for a very restricted subset of views, called updatable views.
• Updatable views are such that each tuple in the view maps to one and only one tuple in the table it is based on.
Sunday, October 24, 2010
Updatable views
• A view is updatable if:
• It has only one table T in its from clause
• It contains all attributes in that cannot be null
• It does not have any distinct, group by statements
Sunday, October 24, 2010
Updatable views
• Example:
CREATE VIEW nullpwd(id, name, username, password)
AS
SELECT id, name, username, password
FROM student
WHERE password is null ;
UPDATE nullpwd SET password = ‘abc’ WHERE id = 5 ;
Sunday, October 24, 2010
Updatable views
• Note: nullpwd does not store any tuples. This expression allows only those tuples of student that are accessible through view to be updated.
• Furthermore, after the update, the resulting tuple may not even be in the view (unless the view is created with the CHECK OPTION).
UPDATE nullpwd SET password = ‘abc’ WHERE id = 5 ;
Sunday, October 24, 2010
Why use views?
• Using views to create functionality hides data complexity from developers.
• Also, if the data model changes, the application code does not have to change as long as the new model can be mapped to the same view.
Sunday, October 24, 2010
Why not use views?
• Writing a query using views may hide some optimizations from the database, creating sub optimal query plans.
Sunday, October 24, 2010
INDEXING• Views do not improve performance,
sometimes they may even cause a loss of performance.
• One way to improve performance is store (cache) the result of some queries in the database.
• Indexes are exactly that, cached results of queries.
Sunday, October 24, 2010
Indexing exampleSELECT CrsCode, Grade
FROM Transcript
WHERE studId = 5 AND semester = ‘Fall’ AND year = 2010
Answering this query requires reading all of transcript from disk to find the matching tuples to return. Note that the matching tuples will be few, 4-5 in this case. However, transcript is large.
Sunday, October 24, 2010
Cost Analysis• Let us some basic cost analysis.• Suppose transcript is stored on disk in 100 disk pages.
Then, answering this query requires reading 100 disk pages.
• Suppose instead we had an index that allowed us to find all the matching tuples. Example:• Index on Transcript (studid, semester, year)• Then, answering this query will cost:
• Finding the tuples from index• Reading them from disk, at most 4-5 pages.
Sunday, October 24, 2010
Indexing• Indexes are just query results stored explicitly.• They are also stored on disk, but can be cheaper to
use because:• They have fewer disk pages as they store only a
subset of the attributes in the relation.• They are stored in a way to make it easy to find
queries on specific values in the index (we will see how).
• For now, we can assume that querying the index is small for most queries.
Sunday, October 24, 2010
Index cost/benefit analysis• Indices are good if
• they reduce the cost of frequently asked queries
• the reduction is considerable
• Indices must be kept up to date when the tables change
• Indices increase the cost insert/update/delete operations (at least one extra disk page access for each index created)
Sunday, October 24, 2010
What are good indices
• A good index will reduce the total number of matching tuples to 1 or a few
• Examples: attributes in a key
• An index on student(id) will greatly improve queries like
• SELECT * FROM student WHERE id = 5;
Sunday, October 24, 2010
What are good indices• If the underlying relation is sorted with respect to some
attribute, then an index on that attribute will help performance.
• Suppose, transcript tuples are sorted by semester, year.
• Create an index on Transcript(semester, year)
• Given the query:
SELECT CrsCode, Grade FROM Transcript
WHERE semester = ‘Fall’ AND year = 2010
use the index to find the first tuple for Fall/2010, and then scan the Transcript relation starting from that point.
Sunday, October 24, 2010