Instructor: Craig Duckett Lecture 03: Tuesday, April 14, 2015 SQL Sorting, Aggregates and Joining...

Instructor: Craig Duckett

Lecture 03: Tuesday, April 14, 2015 SQL Sorting, Aggregates and Joining Tables

• Assignment 1 is due LECTURE 4, Thursday, April 16th, in StudentTracker by MIDNIGHT
• MID-TERM EXAM is LECTURE 10, Tuesday, May 12th
• Assignment 2 is due LECTURE 11, Thursday, May 14th, in StudentTracker by MIDNIGHT

What? Tinnitus! That’s What!

Tuesday (LECTURE 3)
• Database Design for Mere Mortals: Chapter 2

Thursday (LECTURE 4)
• The Language of SQL:
• Chapter 3: Calculations and Aliases
• Chapter 4: Using Functions

• SQL Sorting (Sort Query Results)
• SQL Aggregate Functions
• SQL Joining Tables
• In-Class Exercises
• Working with XAMPP
• SQL Queries

SQL Sorting (Sort Query Results)

The way that the database returns the results of your query is not always what you want, so let's see how to sort those results. In this simple example I have a Product table in my database, and I've got a very straightforward SELECT statement that's going to select three of the columns (Description, ListPrice, and Color) from the Product table, meaning: give me all the rows.

I haven't even applied a WHERE clause on this. So, get everything returned whether that's a thousand rows or a million rows, but there will be no inherent order in these. The way that they are going to be returned is currently more to do with the internal structure of the database. Now, it doesn't necessarily match what I might find most useful.

Let's say what I want to do is find out the most expensive products that I have. I would get that data in my results, but not presented to me in an easy-to-scan way. I'd like to see the most expensive first and the cheapest last. I can do this with another optional keyword in an SQL query, which I'll put right at the end of this statement, after the FROM clause.

So, I'm going to add the keyword ORDER BY, and it is written as two separate words. The question is: ORDER BY what?

In this case, I'd like to order by the values in the column called ListPrice, whatever those values are. So, I'll use the name of that column: ORDER BY ListPrice. Now by default, ordering is ascending, which means the row with the smallest ListPrice comes first.

If I want to order by the most expensive first, I need to make this descending order and to do that, I just type in the word DESC afterwards. There is an ASC keyword for ascending, but ascending is the default. The results of these will come back, and this time around, it's going to bring back the same number of rows, but it will order them by ListPrice descending.
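If you want to try this away from the slides, here is a minimal sketch using Python's built-in sqlite3 module. The Product table matches the column names used above, but the three rows are made up purely for illustration, and SQLite stands in for whatever database you are actually running.

```python
import sqlite3

# Hypothetical in-memory Product table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (Description TEXT, ListPrice REAL, Color TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        ("Road Helmet", 34.99, "Red"),
        ("Mountain Frame", 699.00, "Black"),
        ("Water Bottle", 4.99, "Silver"),
    ],
)

# ORDER BY ListPrice DESC puts the most expensive product first.
rows_desc = conn.execute(
    "SELECT Description, ListPrice, Color FROM Product ORDER BY ListPrice DESC"
).fetchall()

# Ascending is the default, so ASC can be left out.
rows_asc = conn.execute(
    "SELECT Description, ListPrice, Color FROM Product ORDER BY ListPrice"
).fetchall()

for row in rows_desc:
    print(row)

conn.close()
```

Both queries return the same three rows; only the order changes.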

Now you can also pick multiple columns to ORDER BY. In this example, I'm writing a simple SELECT statement to select all the rows from the Employee table where Salary is greater than 50,000, because we can use a WHERE clause as well as an ORDER BY. Then I'll do an ORDER BY LastName, FirstName. I didn't use the word DESC or ASC, so they're both going to be ascending. When the results come back, and again it doesn't matter how many rows there are, they will first be ordered by LastName all the way through. Then, wherever the LastName is the same, we do a sub-ordering within it, and in this case we'll order by FirstName. So as you can see, it's a very simple format to start imposing some kind of structure on the results you're getting back from your query.
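Here is a small sketch of that WHERE plus two-column ORDER BY, again using Python's sqlite3 module. The Employee rows are hypothetical; two people share the last name Smith so that the sub-ordering by FirstName is visible.

```python
import sqlite3

# Hypothetical Employee table; names and salaries are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (FirstName TEXT, LastName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [
        ("Bob", "Smith", 55000),
        ("Zoe", "Adams", 60000),
        ("Carl", "Jones", 40000),   # filtered out by the WHERE clause
        ("Alice", "Smith", 72000),
    ],
)

# WHERE filters first; ORDER BY then sorts by LastName,
# and by FirstName wherever the LastName is the same.
sorted_rows = conn.execute(
    "SELECT * FROM Employee "
    "WHERE Salary > 50000 "
    "ORDER BY LastName, FirstName"
).fetchall()

for row in sorted_rows:
    print(row)

conn.close()
```

Adams comes first, and the two Smith rows come back with Alice ahead of Bob.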

Next up, we have a few more SQL keywords to work with and these are all grouped under the term aggregate functions, which doesn't really suggest how useful they can be. An aggregate or grouping function means it will perform some kind of calculation on a set of data that we describe, but return a single value. Now, what that single value is depends on which aggregate function we use.

Aggregate Functions

We've seen already how we can do a simple SELECT statement like this one, and this will return everything, all our columns, all our rows, whether that's five employee rows or 5,000 or 500,000. But what if that number itself was the piece of information I wanted? What if I just wanted to know how many rows are in this table? I don't need anything else. I just want to know how many rows. Well, I can do that by using an aggregate function in SQL called COUNT.

And what I'm going to do is just change the SELECT * to SELECT COUNT(*). Count everything in the Employee table. If I execute this, it will count all the rows and return just a single value, in this case 547.

Now, you could of course use a WHERE clause. If you wanted to restrict the results to just counting the number of employee rows that have a Salary greater than 5,000, we get a different result. That is all COUNT will do for you: it just counts the number of rows that meet this particular condition.
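As a rough sketch of both forms of COUNT, here is a tiny example with Python's sqlite3 module. The five Employee rows and the 50,000 threshold are assumptions chosen only so that the two counts differ; they are not the figures from the slides.

```python
import sqlite3

# Hypothetical Employee table with five invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (LastName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [
        ("Adams", 42000),
        ("Bailey", 51000),
        ("Clark", 67000),
        ("Diaz", 38000),
        ("Evans", 90000),
    ],
)

# COUNT(*) returns one row with one column: the number of rows.
total_rows = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]

# With a WHERE clause, only the rows that meet the condition are counted.
# The 50000 threshold is illustrative.
high_salary_rows = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE Salary > 50000"
).fetchone()[0]

print(total_rows, high_salary_rows)
conn.close()
```

Either way the query hands back a single value rather than a list of employee rows.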

Now sometimes, however, the COUNT isn't what you want. In the previous topic on Sorting, I used the example of selecting Products and using the Order By clause to sort them by, in this case ListPrice descending. But what if the only reason I was doing that was to find out what the maximum ListPrice was? Well, instead of using this Order By descending and looking at the top row, I could instead just do something like this. SELECT MAX. Now instead of saying SELECT MAX with the asterisk, saying SELECT MAX of everything, I'm just focused on one particular column. What's the maximum value of ListPrice in the entire Product table? In this case, we'll bring back $699.

If we have MAX for maximum, it's a pretty good guess that we're also going to have MIN for minimum, and we do.

We also have AVG for average. It adds up all the values in the ListPrice column, divides by the number of rows, and returns that single value.
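To see MAX, MIN and AVG side by side, here is a minimal sketch with Python's sqlite3 module. The Product rows are hypothetical, with prices picked so that the average comes out to a round number.

```python
import sqlite3

# Hypothetical Product table; prices are invented so the average is a round number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (Description TEXT, ListPrice REAL)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [
        ("Mountain Frame", 699.0),
        ("Pedal Set", 101.0),
        ("Saddle", 150.0),
        ("Tire", 50.0),
    ],
)

# Each aggregate collapses the whole ListPrice column into a single value.
max_price = conn.execute("SELECT MAX(ListPrice) FROM Product").fetchone()[0]
min_price = conn.execute("SELECT MIN(ListPrice) FROM Product").fetchone()[0]
avg_price = conn.execute("SELECT AVG(ListPrice) FROM Product").fetchone()[0]

print(max_price, min_price, avg_price)
conn.close()
```

Notice there is no ORDER BY anywhere: the database finds the highest and lowest values without you having to sort and read off the top row.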

While we've seen COUNT, there's also an option of SUM, which instead of counting the rows will total all the values up.

So, with this statement, we're looking for a CustomerID equal to 854. That will find all the rows for that customer and then add together all the values in the TotalDue column.
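A quick sketch of SUM with a WHERE clause, using Python's sqlite3 module. The table name SalesOrder is an assumption (the table is not named above), and the order amounts are invented; only CustomerID 854 and the TotalDue column come from the example.

```python
import sqlite3

# Hypothetical SalesOrder table; the table name and amounts are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesOrder (CustomerID INTEGER, TotalDue REAL)")
conn.executemany(
    "INSERT INTO SalesOrder VALUES (?, ?)",
    [
        (854, 100.0),
        (854, 250.5),
        (901, 75.0),   # a different customer, excluded by the WHERE clause
    ],
)

# SUM totals the TotalDue values, but only for rows where CustomerID is 854.
total_due = conn.execute(
    "SELECT SUM(TotalDue) FROM SalesOrder WHERE CustomerID = 854"
).fetchone()[0]

print(total_due)
conn.close()
```

COUNT on the same rows would give 2; SUM gives the total of the amounts instead.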

And that's one of the great things about working with aggregate functions: they really don't allow an awful lot of complexity. They are going to return a single value, so they stay very straightforward to use. However, we can take them one step further.

We've seen how we can use the idea of SELECT COUNT (*) to get the number of rows. We can use it in conjunction with a WHERE clause. In this case, COUNT up the number of rows where the color is equal to 'Red', and we'll get some result back.

But if I wanted to know not just the COUNT of Red Products or All Products, but how many Red Products we have, how many Black Products we have, Silver, Gold and so on, well, I could create multiple statements like this, just changing the WHERE clause every time. But that's not only tedious, it assumes I know ahead of time what all the colors will be, and we might be adding new colors all the time.

What I can do instead is add another SQL keyword, which is great for use with the aggregate functions, and that is GROUP BY. So, here I'm choosing to GROUP BY Color. It will count up all the products for each particular color, because we've told it to group by color. GROUP BY is something that really only makes sense with aggregate functions; you don't typically use it otherwise. Conversely, if you are using an aggregate function, you are pretty much always going to use it by itself unless you use GROUP BY, because the most you would ever expect to return from your query is one single value, unless you are using GROUP BY to categorize those results.
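Here is a minimal sketch of COUNT with GROUP BY, using Python's sqlite3 module. The Product rows are hypothetical. The ORDER BY Color at the end is my own addition, only there to make the output order predictable.

```python
import sqlite3

# Hypothetical Product table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (Description TEXT, Color TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [
        ("Road Helmet", "Red"),
        ("Jersey", "Red"),
        ("Mountain Frame", "Black"),
        ("Saddle", "Black"),
        ("Tire", "Black"),
        ("Water Bottle", "Silver"),
    ],
)

# One query returns one row per color, each with its own count.
# No list of colors has to be known ahead of time.
counts_by_color = conn.execute(
    "SELECT Color, COUNT(*) FROM Product GROUP BY Color ORDER BY Color"
).fetchall()

for color, how_many in counts_by_color:
    print(color, how_many)

conn.close()
```

If a Gold product is added later, the same query picks it up as a new row with no changes to the SQL.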

After a while, you're going to find that very straightforward SQL statements, where you're just selecting a few pieces of data from a single table, end up being a little limiting. If we've gone to all the trouble of defining multiple tables to store our data, we do that with the understanding that it will be possible to get it all back using the relationships that we've described. So, I don't want one SQL query to be limited to selecting from one table; I want to be able to select from two different tables, or even three or more, and the phrase we're going to use is to join our tables together.

Joining Tables

So, in the example I'm going to go through, we have an Employee table and a Department table, and there's a one-to-many relationship between department and employee using the Department ID column.

Department ID is a foreign key in Employee and a primary key in Department, and it's there just so we don't store redundant department details for each employee row. Now the question is how would we start to join these together in SQL?

Let's begin just by doing a fairly regular SQL Statement. So, I'm selecting a few columns, First Name, Last Name, Hire Date, and Department ID just from the employee table. So, we've seen this one before. No surprises there in the results that we would expect.

To start involving the other table, the magic word here is JOIN. After the Employee table, I'm going to use the word JOIN and then say Department. So, from Employee join Department. This by itself isn't doing very much but we're starting to add the necessary pieces bit by bit.

If I say I want to join these two tables together, I can then start adding columns from the Department table to the SELECT clause. So, after Department ID, I'm going to add Name and Location, which are both columns in my Department table. But immediately, we have a problem. If I'm now selecting from two different tables, it's really common to have a name conflict. In this case, Department ID is going to give us a problem because it exists as a column in the Employee table and as a column in the Department table. So, the SQL query would be ambiguous: which one am I talking about?

What I could do is just prefix that column with the name of the table. It wouldn't really matter which one we picked, but we need to be explicit so that SQL doesn't get confused. And if I wanted my SQL statement here to be very explicit, I could do this for every column. So, select Employee.FirstName, Employee.LastName, Employee.HireDate, Department.Name, Department.Location and so on. But I still have an issue. I still couldn't run this query, because I need to describe exactly how these tables are to be joined together.

The way that we do that is to use the word ON, and we'll use that in conjunction with JOIN. So, it's Employee JOIN Department ON, and I name the columns in each table and how they link together.

If I run that statement, we'll get results back that combine, or join, these two tables together. Now, one important idea here is the kind of join that we're doing right now. It's only going to bring back rows where there is a match between the two tables. So, if you notice in the actual Employee table at the top, the third row is Alice Bailey. Well, Alice has a Department ID column value of null. She's not linked to a Department. What that means is that when we do the join, we will not get a row coming back for Alice Bailey, because it has to have a match, and that's because what we're doing here is called an inner join. So, I am using the JOIN keyword here, and if I was to be good about this, and I usually would be, I should use the words INNER JOIN rather than JOIN, even though INNER JOIN is the default kind of JOIN.
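Here is a small sketch of that inner join using Python's sqlite3 module. Alice Bailey with a null Department ID and the Production department in CA come from the example; every other name, date and department is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT, Location TEXT)"
)
conn.execute(
    "CREATE TABLE Employee (FirstName TEXT, LastName TEXT, HireDate TEXT, "
    "DepartmentID INTEGER REFERENCES Department(DepartmentID))"
)

# Hypothetical rows: department 1 has no employees, and Alice has no department.
conn.executemany(
    "INSERT INTO Department VALUES (?, ?, ?)",
    [(1, "Production", "CA"), (2, "Sales", "NV"), (3, "Finance", "WA")],
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        ("James", "Black", "2010-03-01", 2),
        ("Mary", "Stone", "2012-07-15", 3),
        ("Alice", "Bailey", "2011-01-10", None),
    ],
)

# Table-name prefixes remove the ambiguity; ON says how the tables link together.
inner_rows = conn.execute(
    "SELECT Employee.FirstName, Employee.LastName, Department.Name, Department.Location "
    "FROM Employee INNER JOIN Department "
    "ON Employee.DepartmentID = Department.DepartmentID "
    "ORDER BY Employee.LastName"
).fetchall()

for row in inner_rows:
    print(row)

conn.close()
```

Only two rows come back: Alice is missing because she has no matching department, and Production is missing because it has no matching employee.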

Using an INNER JOIN means only bring back the rows where there is a match in both tables. So, we won't see a row like Alice Bailey, because she has a Department ID of null, and on the other hand we won't see any rows from the Department table that don't have matching employees either. In this simple example here, I don't have anybody with the Department ID of one, so I'm never seeing the row that says Production, CA, with the budget code of A4. But sometimes you might want to start involving these other rows that don't exactly match, and you would do that by creating something called an OUTER JOIN.

An OUTER JOIN means we're going to pick one of the tables and say this one takes precedence. We want to see all of the rows returned from that particular table and still show the matching data where possible. Now, that might sound a little weird, so let me demonstrate what the difference would be. Instead of using the INNER JOIN keywords, I'm going to use OUTER JOIN, but I can't just write OUTER JOIN; I have to be explicit.

With an OUTER JOIN, you are typically saying one of these tables takes precedence over the other. We are interested in where they match, but we still want to get the results where they don't. So, I would typically use the word left or right: a LEFT OUTER JOIN or a RIGHT OUTER JOIN. Left and right here simply mean either the table on the left-hand side of the word JOIN, which for us is Employee, or the one on the right-hand side of the word JOIN, which is Department. So, in this one, I'm going to do a LEFT OUTER JOIN, and it's going to look to the left of the word JOIN, see Employee, and then say that Employee will take precedence.
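To make the difference concrete, here is a minimal sketch of a LEFT OUTER JOIN using Python's sqlite3 module. As before, Alice Bailey and the Production department come from the example, and the remaining rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Name TEXT, Location TEXT)"
)
conn.execute(
    "CREATE TABLE Employee (FirstName TEXT, LastName TEXT, DepartmentID INTEGER)"
)

# Hypothetical rows: Alice has no department, and department 1 has no employees.
conn.executemany(
    "INSERT INTO Department VALUES (?, ?, ?)",
    [(1, "Production", "CA"), (2, "Sales", "NV"), (3, "Finance", "WA")],
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [("James", "Black", 2), ("Mary", "Stone", 3), ("Alice", "Bailey", None)],
)

# Employee is on the left of JOIN, so every Employee row is returned.
# Where there is no matching Department, its columns come back as NULL (None in Python).
left_rows = conn.execute(
    "SELECT Employee.FirstName, Employee.LastName, Department.Name, Department.Location "
    "FROM Employee LEFT OUTER JOIN Department "
    "ON Employee.DepartmentID = Department.DepartmentID "
    "ORDER BY Employee.LastName"
).fetchall()

for row in left_rows:
    print(row)

conn.close()
```

This time Alice Bailey is in the results, with empty department columns. Production still does not appear, because Department is on the right-hand side and does not take precedence here.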

BIT 275 ICE 03