T-SQL Essentials: Chapter 12 Aggregation
description
Transcript of T-SQL Essentials: Chapter 12 Aggregation
T-SQL Essentials: Chapter 12Aggregation
Syed Rizvi
Objectives
• Aggregations– Count / Count_Big– Sum– Max/Min– Avg
• Grouping Data– GROUP BY– HAVING– Distinct Values
Aggregations
– Performing a function on a set of data to return one aggregated value per grouping of data.
– This will vary from counting the number of rows returned from a SELECT statement to figuring out maximum and minimum values.
AggregationsCOUNT/COUNT_BIG
– COUNT/COUNT_BIG is probably the most commonly used aggregation
– finds out the number of rows returned from a query. – You use this for checking the total number of rows in a table or,
more likely, the number of rows returned from a particular set of filtering criteria.
– Quite often this is used to cross-check the number of rows from a query in SSE with the number of rows an application is showing to a user.
– The syntax is COUNT(*) or COUNT_BIG(*). There are no columns defined, because it is rows that are being counted.
– The difference in these two functions is that COUNT returns an integer data type, and COUNT_BIGreturns a bigint data type
AggregationsCOUNT/COUNT_BIG
Example
SELECT COUNT(*) AS ' Number of Rows‘FROM ShareDetails. Shares
AggregationsCOUNT/COUNT_BIG
Example
SELECT COUNT(*) AS ' Number of Rows‘FROM ShareDetails. SharesWHERE CurrentPrice > 10
AggregationsSUM
– If you have numeric values in a column, it is possible to aggregate them as a summation.
– The ideal scenario for this is to aggregate the number of transactions in a bank account to see how much the balance has changed by. This could be daily, weekly, monthly, or over any time period required.
– A negative amount would show that more has been taken out of the account than put in, for example.
AggregationsSUM
– The syntax :SUM(column1| @variable| Mathematical function).– The summation does not have to be of a column
but could include a math function. – One example would be to sum up the cost of
purchasing shares, so you would multiply the number of shares bought by the cost paid.
AggregationsSUM Example
SELECT SUM(Amount) AS ' Amount Deposited‘FROM TransactionDetails. TransactionsWHERE CustomerId = 1 AND TransactionType = 1
AggregationsSUM Example
AggregationsMAX/MIN
– Performed on a set of data to get the minimum and maximum values of a column of data.
– This is useful if you want to see values such as the smallest share price or the greatest portfolio value
– Or the maximum number of sales of each product in a period of time, or the minimum sold, so that you can see whether some days are quieter than others.
AggregationsMAX/MIN Example
SELECT MAX(Price) MaxPrice, MIN(Price) MinPriceFROM ShareDetails. Sh
Grouping Data using Group ByExample
• The GROUP BY statement is used in conjunction with the aggregate functions to group the result-set by one or more columns.
• Groups the data to return and provide a summary value for each grouping of data. • The basic syntax for grouping:
SELECT column_name, aggregate_function(column_name)FROM table_nameWHERE column_name operator valueGROUP BY column_name
• GROUP BY [ ALL] (column1[ , column2, . . . ])• The option ALL is a bit like an OUTER JOIN. If you have a WHERE statement as part of your SELECT
statement, any grouping filtered out will still return a row in the results, but instead of aggregating the column, a value of NULL will be returned.
• You can use this as a checking mechanism. You can see the rows with values and the rows without values, and visually this will tell you that your filtering is correct.
• Any column defined in the SELECT statement that does not form part of the aggregation must be contained within the GROUP BY clause and
• be in the same order as the SELECT statement. • Failure to do this will mean that the query will give erroneous results and in many cases use a lot of
resources in giving these results.
Grouping Data using Group ByExample
Grouping Data using Group ByExample
Now we want to find the total sum (total order) of each customer.We will have to use the GROUP BY statement to group the customers.We use the following SQL statement:
SELECT Customer, SUM(OrderPrice) FROM OrdersGROUP BY Customer
Grouping Data using Group ByExample
Problem:Now we want to find the total sum (total order) of each customer.We will have to use the GROUP BY statement to group the customers.We use the following SQL statement:
SELECT Customer, SUM(OrderPrice) FROM OrdersGROUP BY Customer
Grouping Data using Group ByExample
Grouping Data using Group ByExample
• find maximum and minimum values for every share that has a row in the ShareDetails. SharePrices table where the share ID is less than 9999.
• The row that has no Share record will be excluded.
SELECT ShareId, MIN(Price) MinPrice, Max(Price) MaxPrice
FROM ShareDetails. SharePricesWHERE ShareId < 9999GROUP BY ShareId
Grouping Data using Group ByExample
Having
• Is used in conjunction with Group By• Behaves like a filter• Work on Aggregation of Data– Also include aggregation conditionSyntaxGROUP BY column1[, column2. . . ]HAVING [ aggregation_condition]
HavingExample
SELECT CustomerId, COUNT(*)FROM CustomerBankTransactionsWHERE TransactionDate BETWEEN ' 1 Aug 2010
' AND ' 31 Aug 2010 ‘GROUP BY CustomerIdHAVING COUNT(*) > 20
HavingExample
Retrieve Rows Where Min Share Price is greater than $10:
SELECT sp. ShareId, s. ShareDesc, MIN(Price) MinPrice, Max(Price) Max
FROM ShareDetails. SharePrices spLEFT J OIN ShareDetails. Shares s ON s. ShareId = sp. ShareIdWHERE sp. ShareId < 9999GROUP BY ALL sp. ShareId, s. ShareDescHAVING MIN(Price) > 10
Distinct Values
• Some tables might have multiple entries having same values
• You might want to filter only one row• Place the keyword DISTINCT after the SELECT
statement and before the list of columns.
Distinct ValuesExample
This query may return multiple rows for a share price
SELECT s. ShareDesc, sp. Price, sp. PriceDateFROM ShareDetails. Shares sJOIN ShareDetails. SharePrices sp ON sp. ShareId = s. ShareId
Distinct ValuesExample
Placing DISTINCT at the start of the column list doesn’t make any difference, because there are different prices and different price dates.
SELECT DISTINCT s. ShareDesc, sp. Price, sp. PriceDateFROM ShareDetails. Shares sJOIN ShareDetails. SharePrices sp ON sp. ShareId = s. ShareId
Distinct ValuesExample
What about this:SELECT DISTINCT s. ShareDescFROM ShareDetails. Shares sJOIN ShareDetails. SharePrices sp sp. ShareId = s. ShareId