Tech Jam 01 - Database Querying

49
Database Querying Tech Jam -25 november 2015 Rodger oates

Transcript of Tech Jam 01 - Database Querying

Page 1: Tech Jam 01 - Database Querying

Database QueryingTech Jam -25 november 2015Rodger oates

Page 2: Tech Jam 01 - Database Querying

Database Querying

• STUFF

• COALESCE vs ISNULL

• The LIKE Predicate

• Combining Predicates

• Joining Queries

• Table Expressions

• Window Functions

Page 3: Tech Jam 01 - Database Querying

A Quick Note on Query Processing1. FROM

2. JOINS

3. WHERE

4. GROUP BY

5. CUBE | ROLLUP

6. HAVING

7. SELECT

8. DISTINCT

9. ORDER BY

10. TOP

Page 4: Tech Jam 01 - Database Querying

STUFF

Page 5: Tech Jam 01 - Database Querying

REPLACE

• Used to replace parts of a string

• Will perform action through whole of original string

• Format

• REPLACE(<original>, <string to replace>, <replacement>)

• Note: <string to replace> and <replacement> do not need to be equal in length

Page 6: Tech Jam 01 - Database Querying

STUFF

• Used to insert one string into another

• Replaces characters in range specified with given string

• Format:

• STUFF(<original>, <start index>, <length>, <string to stuff>)

• Examples of use:

• www.sql-server-helper.com/

Page 7: Tech Jam 01 - Database Querying

For XML PATH

• Can be used with STUFF to translate a table of results into a single result.

• Real world example:

• Comma delimited list of names

• E.g. in names of users assigned to a given site

• See: Usage #6 in www.sql-server-helper.com

Page 8: Tech Jam 01 - Database Querying

COALESCE vs ISNULL

Page 9: Tech Jam 01 - Database Querying

ISNULL

• T-SQL specific

• Takes two inputs

• Format:

• ISNULL(<expression 1>, <expression 2>)

• Returns <expression 1> if not null, else <expression 2>

• Note: Data type of result is same as <expression 1> regardless of which expression is returned

Page 10: Tech Jam 01 - Database Querying

COALESCE

• Standard SQL

• Takes any number of inputs

• Format:

• COALESCE(<expression 1>, <expression 2>, …, <expression n>)

• Returns first input object that is not null

• Note: Data type of result is the same as the returned expression

Page 11: Tech Jam 01 - Database Querying

The LIKE Predicate

Page 12: Tech Jam 01 - Database Querying

Why LIKE?

• Used to find strings that are similar to a search term

• Without wildcards, predicate would only match with the search term in its entirety, case insensitive

• With wildcards, we can find matches where only part of the string matches our term

• Search terms that commence with wildcard(s) do not allow SQL to use index ordering to speed up the query

• E.g UserFullName LIKE ‘%oates%’

Page 14: Tech Jam 01 - Database Querying

Combining Predicates

Page 15: Tech Jam 01 - Database Querying

Types of combinations

• AND

• Predicates on both sides must be true

• OR

• Either predicate must be true

• NOT

• Negation of succeeding predicate

• Note: when predicate equates to NULL, the NOT of this predicate will still be NULL

Page 16: Tech Jam 01 - Database Querying

Natural Precedence Rules for Combinations

1. NOT

2. AND

3. OR

e.g.

WHERE col1 = ‘w’ AND col2 = ‘x’ OR col3 = ‘y’ AND col4 = ‘z’

Page 17: Tech Jam 01 - Database Querying

Use of parentheses

• Can use parentheses if different precedence must be given to predicates

• Previous example would equate to:

• WHERE (col1 = ‘w’ AND col2 = ‘x’) OR (col3 = ‘y’ AND col4 = ‘z’)

• However, we could use:

• WHERE col1 = ‘w’ AND (col2 = ‘x’ OR col3 = ‘y’) AND col4 = ‘z’

Page 18: Tech Jam 01 - Database Querying

Joining Queries

Page 19: Tech Jam 01 - Database Querying

Joining Queries

• Types of Join

• Multi-Join Queries

• EXISTS

• Set Operators

• UNION vs UNION ALL

• INTERSECT

• EXCEPT

Page 20: Tech Jam 01 - Database Querying

Types of Join

• CROSS JOIN

• Cartesian product of two tables

• Does not require a link between the tables

• INNER JOIN

• Requires a predicate between the tables (specified using the ON keyword)

• Filters out any rows that do not match the ON predicate

• SELF-JOIN

• Is simply an INNER JOIN

• Joins records in a table with other records in the same table

Page 21: Tech Jam 01 - Database Querying

OUTER JOINS

• Outer joins preserve at least one side of the join if there is no match

• LEFT OUTER JOIN preserves the left table

• E.g. <table A> LEFT OUTER JOIN <table B> preserves <table A>

• RIGHT OUTER JOIN preserves the right table

• E.g. <table A> RIGHT OUTER JOIN <table B> preserves <table B>

• FULL OUTER JOIN preserves both tables where there is no match

• Further information on joins at TechNet

Page 22: Tech Jam 01 - Database Querying

A Note on NULLs within JOINs

• Note: When determining row inclusion via predicate, a NULL value on either side of the comparison will be discarded

• NULL is not a value, it is unknown

Page 23: Tech Jam 01 - Database Querying

Multi-Join Queries

• Often need to look at multiple tables within the same query

• Use joins to link each table together

• Joins are evaluated in order of definition

• Need to be careful when combining outer joins with others, as the results may not be what you expect

Page 25: Tech Jam 01 - Database Querying

Why was a supplier not returned?

Source: Exam 70-461 Training Kit

Page 27: Tech Jam 01 - Database Querying

EXISTS

• Accepts a subquery as input:

• WHERE EXISTS (SELECT … FROM <table>)

• Returns true if at least one row returned

• EXISTS doesn’t need to return the result of the subquery

• Rather, true or false depending on whether rows would have been returned by the subquery

• Can be negated:

• WHERE NOT EXISTS (SELECT … FROM <table>)

Page 28: Tech Jam 01 - Database Querying

Set Operator Guidelines

• Complete rows are matched between the input sets

• Number of columns in all queries must be the same

• Column types for corresponding columns must be compatible

• Considers two NULLs to be compatible

• This is different to JOINs

• Result column names are defined in the first query

• Result ordering can only be defined after the last input query

Page 29: Tech Jam 01 - Database Querying

UNION vs UNION ALL

• UNION unifies the results of two input queries

• Both input queries must have

• UNION has an implicit DISTINCT property

• No duplicates in result

• UNION ALL returns all results, including any duplicates

Page 30: Tech Jam 01 - Database Querying

UNIO N / UNIO N A L L UNIO N A L L

Page 31: Tech Jam 01 - Database Querying

INTERSECT

• Only rows that exist in all input queries are returned

• Has an implicit DISTINCT operator

Page 32: Tech Jam 01 - Database Querying

EXCEPT

• Works with two input queries

• Returns rows that exist in the first query, but not the second

Page 33: Tech Jam 01 - Database Querying

Table Expressions

Page 34: Tech Jam 01 - Database Querying

Types of Table Expression

1. Derived Table

2. Common Table Expression (CTE)

3. View

4. Inline Table-Valued Function

• 1 and 2 are only visible within the scope of the current statement and are not reusable

• 3 and 4 have their definitions stored in the database and can be reused

Page 35: Tech Jam 01 - Database Querying

Optimisation Notes

• Table expressions are not tables, but definitions

• They do not hold any data

• They interact directly with the underlying tables

• They do not have a performance impact in themselves

• The above points must be noted when using a table expression within a query operating on a large data set

Page 36: Tech Jam 01 - Database Querying

Derived Tables

• Closely resemble a subquery

• Defined in the FROM clause of the outer query

• E.g. I want the two lowest priced products per category

Source: Exam 70-461 Training Kit

Page 37: Tech Jam 01 - Database Querying

Common Table Expressions (CTEs)

• Derived tables cannot be nested, if this is required, the statement must define multiple instances of the same query

• Higher risk of mistake

• This is a CTE version of the previous query

Note: we could reuse C as many times as we want, but only within the scope of the current statement

Source: Exam 70-461 Training Kit

Page 38: Tech Jam 01 - Database Querying

A Note on Scoped Tables

• If you need to refer to the same data in multiple statements for a given query it would be better to retrieve the data into a table. You have two options:

• Create a temporary table

• CREATE TABLE #<table name>

• Remember to use a DROP TABLE #<table name> command

• Declare a table variable

• DECLARE @<table name> TABLE

• This way, you are not having to go back to the raw tables to retrieve the same data

Page 39: Tech Jam 01 - Database Querying

Views and Inline Table-Valued Functions

• Define queries in the database, which can then be reused

• Do not store any data

• Performance issues arise from how it is used

• Views do not allow input parameters

• Functions do allow input parameters

Page 40: Tech Jam 01 - Database Querying

Window Functions

Page 41: Tech Jam 01 - Database Querying

Window Functions

• Window Aggregate Functions

• Window Ranking Functions

Page 42: Tech Jam 01 - Database Querying

Window Aggregate Functions

• Same as group aggregate functions

• SUM, COUNT, AVG, MIN, MAX

• Are applied to a window of rows defined by the OVER clause

• Do not hide row detail

• Can mix detail and aggregated elements in the same query

• Without having to define a load of columns in the group section

Page 43: Tech Jam 01 - Database Querying

Example

Page 44: Tech Jam 01 - Database Querying

Window Ranking Functions

• ROW_NUMBER

• Unique number based on ORDER BY clause

• RANK

• Number based on ORDER BY clause

• Assigns same number when ordering values are tied

• Numbers may therefore not be sequential

Page 45: Tech Jam 01 - Database Querying

Window Ranking Functions

• DENSE_RANK

• Similar to RANK

• Numbers are sequential

• NTILE

• Arranges rows within a partition of a number of equally sizes tiles

• Number of rows per partition is calculated by

• <total number of rows> / <partition size>

• If this calculation produces a remainder, additional row(s) are assigned to tiles in their order defined by the ORDER BY clause (see example below)

Page 46: Tech Jam 01 - Database Querying

Example

• Assuming 800 rows of data, and the following query

Source: Exam 70-461 Training Kit

Page 47: Tech Jam 01 - Database Querying

Example

custid orderid Val rownum rnk densernk ntile100

12 10782 12.50 1 1 1 1

27 10807 18.40 2 2 2 1

66 10586 23.80 3 3 3 1

76 10767 28.00 4 4 4 1

54 10898 30.00 5 5 5 1

88 10900 33.75 6 6 6 1

48 10883 36.00 7 7 7 1

41 11051 36.00 8 7 7 1

71 10815 40.00 9 9 8 2

38 10674 45.00 10 10 9 2

53 11057 45.00 11 10 9 2

75 10271 48.00 12 12 10 2

Source: Exam 70-461 Training Kit

Page 48: Tech Jam 01 - Database Querying

PARTITION

• Can be applied to window ranking functions

• Each partition has its own numbering

• E.g. to get the latest child event for each parent