Database Management Systems: Relational Databases & SQL · 1 Database Management Systems:...

88
Database Management Systems: Relational Databases & SQL Caitlyn Wolf UW Chemical Engineering March 5 th , 2019 Data Science Methods for Clean Energy Research (DSMCER)

Transcript of Database Management Systems: Relational Databases & SQL · 1 Database Management Systems:...

1

DatabaseManagementSystems:RelationalDatabases&SQLCaitlynWolfUWChemicalEngineering

March5th,2019

DataScienceMethodsforCleanEnergyResearch(DSMCER)

2

Announcements

• Studentstand-upsarethisafternoon!

3

Outline

• Introductiontodatabasemanagementsystems(DBMS’s)• WhataretheyandwhyshouldIcare?

• Datamodels• Hierarchical,Network,Relational,Entity-Relationship

• Databasekeys• Databasenormalization• SQL

4

WhatisadatabasemanagementsystemorDBMS?

AdatabasemanagementsystemorDBMSallowsausertoefficientlybuild,access,modifyandupdateadatabaseofinformation.

DBMS

Database

data

User1

Application

User2

schema

storage

5

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing

DBMS

Database

data

schema

storage

6

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity

DBMS

Database

data

schema

storage

7

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity• Dataintegrity(accuracyandconsistency)

8

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity• Dataintegrity(accuracyandconsistency)• Physicalandlogicaldataindependence

9

DatabaseLevelsofAbstraction

ExternalLevel

LogicalorConceptualLevel

Reference:http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm

Application 1 Application 2 User1 User2

10

DatabaseLevelsofAbstraction

ExternalLevel

LogicalorConceptualLevel

LogicalDataIndependence

Reference:http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm

Application 1 Application 2 User1 User2

11

DatabaseLevelsofAbstraction

ExternalLevel

LogicalorConceptualLevel

PhysicalorInternalLevel

LogicalDataIndependence

Reference:http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm

Application 1 Application 2 User1 User2

12

DatabaseLevelsofAbstraction

ExternalLevel

LogicalorConceptualLevel

DiskStorage

LogicalDataIndependence

Reference:http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm

PhysicalorInternalLevel

Application 1 Application 2 User1 User2

13

DatabaseLevelsofAbstraction

ExternalLevel

LogicalorConceptualLevel

PhysicalorInternalLevel

DiskStorage

PhysicalDataIndependence

LogicalDataIndependence

Reference:http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm

Application 1 Application 2 User1 User2

14

DatabaseLevelsofAbstraction

ExternalLevel

LogicalorConceptualLevel

PhysicalorInternalLevel

DiskStorage

PhysicalDataIndependence

LogicalDataIndependence

DBMS

OperatingSystem

Reference:http://jcsites.juniata.edu/faculty/rhodes/dbms/dbarch.htm

Application 1 Application 2 User1 User2

15

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity• Dataintegrity(accuracyandconsistency)• Physicalandlogicaldataindependence• Datalossprotection

16

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity• Dataintegrity(accuracyandconsistency)• Physicalandlogicaldataindependence• Datalossprotection

• Highcost

17

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity• Dataintegrity(accuracyandconsistency)• Physicalandlogicaldataindependence• Datalossprotection

• Highcost• Requiresadditionalmanagementandsecurity

18

Advantages(andDisadvantages)ofaDBMS

• Improveddataaccessandsharing• Dataadministrationandsecurity• Dataintegrity(accuracyandconsistency)• Physicalandlogicaldataindependence• Datalossprotection

• Highcost• Requiresadditionalmanagementandsecurity• Morecomplexthangeneralfilesystems

19

DataModels

20

DataModels:Hierarchical

Additional Reading: Stonebraker, Michael, and Joey Hellerstein. "What goes around comes around." Readings in Database Systems 4 (2005): 1724-1735

• Simplelanguagetointeractwithandretrievedatafromthesystem

• Somelogicaldataindependenceisallowed

• Each“child”canonlyhaveone“parent”but“parents”canhavemultiplechildren.

• Leadstoredundantdatapronetoinconsistenciesupon updates

• Doesnotofferflexibility(e.g.instructorcannotexistiftheyarenotteachingacourse)

• Lacksphysicaldataindependence (i.e.changesinphysicalstoragewouldbreakapplications)

21

DataModels:Network

Additional Reading: Stonebraker, Michael, and Joey Hellerstein. "What goes around comes around." Readings in Database Systems 4 (2005): 1724-1735

• Moreflexiblethanahierarchicalmodel

• Limitsredundantdatabyallowingmany-manyrelationships

• Complextoworkthroughandretrievedatafrom

• Nophysicaldataindependence

• Nologicaldataindependence

22

DataModels:Relational

Additional Reading: Stonebraker, Michael, and Joey Hellerstein. "What goes around comes around." Readings in Database Systems 4 (2005): 1724-1735

• OffersphysicalANDlogicaldataindependence

• Flexibledesign

• Simplisticcomparedtothenetworkmodel

• Languageisdifficulttounderstand(relationalalgebra&calculus)

• Inefficientdataretrieval

Inthe1980’s, IBMhadthefinalsayinthedebatebetweenrelationalandnetworkmodels….relationalcameoutontop(seeadditionalreading).

ClassID Number Name Credits

1 ChemE 545 DSMCER 3

2 ChemE 546 SEDS 3

InstructorID Name Email

1 DaveBeck [email protected]

2 ChadCurtis [email protected]

ClassID InstructorID

1 1

1 2

2 1

2 2

23

DataModels:Entity-Relationship(E/R– Diagrams)

Additional Reading: Stonebraker, Michael, and Joey Hellerstein. "What goes around comes around." Readings in Database Systems 4 (2005): 1724-1735

• Missingalanguageforinteractionswiththesystem

• Overshadowedbyrelationalmodels

• Becamesuccessfulasaschemadesigntoolforrelationaldatabases

24

DataModels:Entity-Relationship(E/R– Diagrams)

ClassID Number Name Credits

1 ChemE 545 DSMCER 3

2 ChemE 546 SEDS 3

InstructorID Name Email

1 DaveBeck [email protected]

2 ChadCurtis [email protected]

ClassID InstructorID

1 1

1 2

2 1

2 2

Schema:roadmapofdataorganizationinthedatabase.

25

DatabaseKeys

26

DatabaseKeys

ClassID Number Name Credits

1 ChemE 545 DSMCER 3

2 ChemE 546 SEDS 3

InstructorID Name Email

1 DaveBeck [email protected]

2 ChadCurtis [email protected]

ClassID InstructorID

1 1

1 2

2 1

2 2

27

Super,Candidate,PrimaryandForeignKeys

• Akeyisusedtouniquelyidentifyarow,ortuple,ineachtablewithinadatabase.

• Keysareespeciallyimportantfordefiningrelationshipsbetweendata.

ClassID Number Name Credits

1 ChemE 545 DSMCER 3

2 ChemE 546 SEDS 3

InstructorID Name Email

1 DaveBeck [email protected]

2 ChadCurtis [email protected]

ClassID InstructorID

1 1

1 2

2 1

2 2

28

Super,Candidate,PrimaryandForeignKeys

• Asuperkey isanattribute,orsetofattributes,thatcandetermineallotherattributesinatuple.

29

Whatarethesuperkeys?

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

• EmployeeID• SocialSecurityNumber• EmployeeID,SocialSecurityNumber• EmployeeID,Name• SocialSecurityNumber,Name• EmployeeID,SocialSecurityNumber,Name

30

Super,Candidate,PrimaryandForeignKeys

• Asuperkey isanattribute,orsetofattributes,thatcandetermineallotherattributesinatuple.

• Acandidatekey,alsocalledaminimalsuperkey,isakeyinwhichallattributesarerequiredforuniquelyidentifyingthetupleandremainingakey.Thisdoesnothavetobeasingleattribute.

31

Whatarethecandidatekeys?

• EmployeeID• SocialSecurityNumber• EmployeeID,SocialSecurityNumber• EmployeeID,Name• SocialSecurityNumber,Name• EmployeeID,SocialSecurityNumber,Name

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

32

Whatarethecandidatekeys?

• EmployeeID• SocialSecurityNumber• EmployeeID,SocialSecurityNumber• EmployeeID,Name• SocialSecurityNumber,Name• EmployeeID,SocialSecurityNumber,Name

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

33

Super,Candidate,PrimaryandForeignKeys

• Asuperkey isanattribute,orsetofattributes,thatcandetermineallotherattributesinatuple.

• Acandidatekey,alsocalledaminimalsuperkey,isakeyinwhichallattributesofthekeyarerequiredforuniquelyidentifyingthetupleandremainingakey.Thisdoesnothavetobeasingleattribute.

• Aprimarykeyisonespecificcandidatekeychoseninthedatabasedesigntoserveastheuniqueidentifierforeachtuple.

34

Whatistheprimarykey?

• EmployeeID• SocialSecurityNumber• EmployeeID,SocialSecurityNumber• EmployeeID,Name• SocialSecurityNumber,Name• EmployeeID,SocialSecurityNumber,Name

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

35

Whatistheprimarykey?

• EmployeeID• SocialSecurityNumber• EmployeeID,SocialSecurityNumber• EmployeeID,Name• SocialSecurityNumber,Name• EmployeeID,SocialSecurityNumber,Name

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

36

Super,Candidate,PrimaryandForeignKeys

• Asuperkey isanattribute,orsetofattributes,thatcandetermineallotherattributesinatuple.

• Acandidatekey,alsocalledaminimalsuperkey,isakeyinwhichallattributesofthekeyarerequiredforuniquelyidentifyingthetupleandremainingakey.Thisdoesnothavetobeasingleattribute.

• Aprimarykeyisonespecificcandidatekeychoseninthedatabasedesigntoserveastheuniqueidentifierforeachtuple.

• Aforeignkey referencesaprimarykeyinanothertableandisusedtodefinerelationshipsbetweentuplesinseparatetables.

37

Whataretheprimaryandforeignkeys?

DepartmentID DepartmentName FloorLocation

100 Sales 1

200 Marketing 2

300 Payroll 2

DepartmentID EmployeeID

100 2

100 5

300 6

200 1

200 6

300 4

300 3

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

38

Whataretheprimaryandforeignkeys?

DepartmentID DepartmentName FloorLocation

100 Sales 1

200 Marketing 2

300 Payroll 2

DepartmentID EmployeeID

100 2

100 5

300 6

200 1

200 6

300 4

300 3

Primarykeys

EmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

39

Whataretheprimaryandforeignkeys?

DepartmentID DepartmentName FloorLocation

100 Sales 1

200 Marketing 2

300 Payroll 2

DepartmentID EmployeeID

100 2

100 5

300 6

200 1

200 6

300 4

300 3

Primarykeys

ForeignkeysEmployeeID SocialSecurityNo. Name

1 123-45-ABCD Caitlyn

2 123-45-DCBA Dave

3 123-54-BACD Torin

4 456-12-BCAD Chad

5 456-98-BCDA Ted

6 123-54-ABDC Dave

40

In-classactivity:Findthesuperandcandidatekeys

ClassName ClassNumber Instructor Department

DataScienceMethodsforCleanEnergyResearch 545 DaveBeck Chem E

DataScienceMethodsforCleanEnergyResearch 545 DaveBeck MSE

MachineLearning 546 KevinJamieson CSE

ColloidalSystems 556 JohnBerg Chem E

PrinciplesofDataManagement 544 DanSuciu CSE

SEDS 546 DaveBeck Chem E

41

ClassName ClassNumber Instructor Department

DataScienceMethodsforCleanEnergyResearch 545 DaveBeck Chem E

DataScienceMethodsforCleanEnergyResearch 545 DaveBeck MSE

MachineLearning 546 KevinJamieson CSE

ColloidalSystems 556 JohnBerg Chem E

PrinciplesofDataManagement 544 DanSuciu CSE

SEDS 546 DaveBeck Chem E

• ClassName,Department• ClassNumber,Department• ClassName,Instructor,Department• ClassName,ClassNumber,Department• ClassNumber,Instructor,Department• ClassName,ClassNumber,Instructor,Department

SuperkeysCandidatekeys

In-classactivity:Findthesuperandcandidatekeys

42

Normalization

43

DatabaseNormalization

• Normalizationhelpswithschemadesignbyprovidingasetofrulestolimitdataredundancyandsupportdataintegrity

• Rulesgetmorestrictasmoveforwardthrough:• 1st NormalForm• 2nd NormalForm• 3rd NormalForm• BoyceCoddNormalForm(BCNF)

• Formoreinformationbeyondwhatwewillcovertoday,checkoutthisreferenceguide:

• https://www.studytonight.com/dbms/database-normalization.php

44

1st NormalForm

• Alltablesshouldbe2-dimensional,i.e.notmorethanonevalueineachcell.

45

1st NormalForm

• Alltablesshouldbe2-dimensional,i.e.notmorethanonevalueineachcell.

• Whatdoyouseeasaprobleminthisdatabase:

Airline FlightNumber Path Plane

Model Pilot Passenger PassengerCity

PassengerState

Alaska 36 SEAàMSP A Dave Caitlyn Seattle Washington

AmericanAirlines 36 SEAà

MSP B Chad Ted Portland Oregon

Alaska 7 MSPàJFK C Dave Ted,Torin Portland,

SanDiegoOregon,California

Alaska 3367 MSPàJFK C Dave Caitlyn,

TorinSeattle,SanDiego

Washington,California

AmericanAirlines 3367 JFKà

SEA D Chad Torin SanDiego California

46

1st NormalForm

• Alltablesshouldbe2-dimensional,i.e.notmorethanonevalueineachcell.

• Whatdoyouseeasaprobleminthisdatabase:

Multiplepassengersaregroupedtogetherincommoncells.

Airline FlightNumber Path Plane

Model Pilot Passenger PassengerCity

PassengerState

Alaska 36 SEAàMSP A Dave Caitlyn Seattle Washington

AmericanAirlines 36 SEAà

MSP B Chad Ted Portland Oregon

Alaska 7 MSPàJFK C Dave Ted,Torin Portland,

SanDiegoOregon,California

Alaska 3367 MSPàJFK C Dave Caitlyn,

TorinSeattle,SanDiego

Washington,California

AmericanAirlines 3367 JFKà

SEA D Chad Torin SanDiego California

47

1st NormalForm

Airline FlightNumber

Passenger PassengerCity

PassengerState

Alaska 36 Caitlyn Seattle Washington

AmericanAirlines

36 Ted Portland Oregon

Alaska 7 Ted Portland Oregon

Alaska 7 Torin SanDiego California

Alaska 3367 Caitlyn Seattle Washington

Alaska 3367 Torin SanDiego California

AmericanAirlines

3367 Torin SanDiego California

Airline FlightNumber

Path PlaneModel

Pilot

Alaska 36 SEAà MSP A Dave

AmericanAirlines

36 SEAà MSP B Chad

Alaska 7 MSPà JFK C Dave

Alaska 3367 MSPà JFK C Dave

AmericanAirlines

3367 JFKà SEA D Chad

• Alltablesshouldbe2-dimensional,i.e.notmorethanonevalueineachcell.

48

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàE

49

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàE

Airline FlightNumber Path Pilot PlaneModel

Alaska 36 SEAàMSP Dave A

AmericanAirlines 36 SEAàMSP Chad B

Alaska 7 MSPà JFK Dave C

Alaska 3367 MSPà JFK Dave C

AmericanAirlines 3367 JFKà SEA Chad D

50

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàE

Airline,FlightNumberà Path,Pilot,PlaneModelAirlineà Pilot

Airline FlightNumber Path Pilot PlaneModel

Alaska 36 SEAàMSP Dave A

AmericanAirlines 36 SEAàMSP Chad B

Alaska 7 MSPà JFK Dave C

Alaska 3367 MSPà JFK Dave C

AmericanAirlines 3367 JFKà SEA Chad D

51

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàE

Airline FlightNumber Path PlaneModel

Alaska 36 SEAàMSP A

AmericanAirlines 36 SEAàMSP B

Alaska 7 MSPà JFK C

Alaska 3367 MSPà JFK C

AmericanAirlines 3367 JFKà SEA D

Airline Pilot

Alaska Dave

AmericanAirlines Chad

52

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàE

53

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàEAirline FlightNumber Passenger PassengerCity PassengerState

Alaska 36 Caitlyn Seattle Washington

AmericanAirlines 36 Ted Portland Oregon

Alaska 7 Ted Portland Oregon

Alaska 7 Torin Seattle Washington

Alaska 3367 Caitlyn Seattle Washington

Alaska 3367 Torin Seattle Washington

AmericanAirlines 3367 Torin Seattle Washington

54

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàEAirline FlightNumber Passenger PassengerCity PassengerState

Alaska 36 Caitlyn Seattle Washington

AmericanAirlines 36 Ted Portland Oregon

Alaska 7 Ted Portland Oregon

Alaska 7 Torin Seattle Washington

Alaska 3367 Caitlyn Seattle Washington

Alaska 3367 Torin Seattle Washington

AmericanAirlines 3367 Torin Seattle Washington

Airline,FlightNumber,Passengerà City,StatePassengerà City,State

55

2nd NormalForm

• 1st normalformmustbemetandthereshouldbenopartialfunctionaldependencies.

• ABàCDE,AàE

Airline FlightNumber Passenger

Alaska 36 Caitlyn

AmericanAirlines 36 Ted

Alaska 7 Ted

Alaska 7 Torin

Alaska 3367 Caitlyn

Alaska 3367 Torin

AmericanAirlines 3367 Torin

Passenger PassengerCity PassengerState

Caitlyn Seattle Washington

Ted Portland Oregon

Torin Seattle Washington

56

3rd NormalForm

• 2nd normalformmustbemetandthereshouldbenotransitivefunctionaldependencies.

• Aà B,Bà C

57

3rd NormalForm

• 2nd normalformmustbemetandthereshouldbenotransitivefunctionaldependencies.

• Aà B,Bà C

Passenger PassengerCity PassengerState

Caitlyn Seattle Washington

Ted Portland Oregon

Torin Seattle Washington

58

3rd NormalForm

• 2nd normalformmustbemetandthereshouldbenotransitivefunctionaldependencies.

• Aà B,Bà C

Passenger PassengerCity PassengerState

Caitlyn Seattle Washington

Ted Portland Oregon

Torin Seattle Washington

Passengerà CityCityà State

59

3rd NormalForm

• 2nd normalformmustbemetandthereshouldbenotransitivefunctionaldependencies.

• Aà B,Bà C

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

PassengerCity PassengerState

Seattle Washington

Portland Oregon

60

BoyceCoddNormalForm(BCNF)

• Mustmeet3rd normalform,andeveryfunctionaldependencymustbetrivialorrepresentasuperkey.

• ABà C,Cà B

61

BoyceCoddNormalForm(BCNF)

• Mustmeet3rd normalform,andeveryfunctionaldependencymustbetrivialorrepresentasuperkey.

• ABà C,Cà B

Airline FlightNumber Path PlaneModel

Alaska 36 SEAàMSP A

AmericanAirlines 36 SEAàMSP B

Alaska 7 MSPà JFK C

Alaska 3367 MSPà JFK C

AmericanAirlines 3367 JFKà SEA D

62

BoyceCoddNormalForm(BCNF)

• Mustmeet3rd normalform,andeveryfunctionaldependencymustbetrivialorrepresentasuperkey.

• ABà C,Cà B

Airline FlightNumber Path PlaneModel

Alaska 36 SEAàMSP A

AmericanAirlines 36 SEAàMSP B

Alaska 7 MSPà JFK C

Alaska 3367 MSPà JFK C

AmericanAirlines 3367 JFKà SEA D

Airline,FlightNumber,Pathà PlaneModelPlaneModelà Path

63

BoyceCoddNormalForm(BCNF)

• Mustmeet3rd normalform,andeveryfunctionaldependencymustbetrivialorrepresentasuperkey.

• ABà C,Cà B

Airline FlightNumber Path

Alaska 36 SEAàMSP

AmericanAirlines 36 SEAàMSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Path PlaneModel

SEAàMSP A

SEAàMSP B

MSPà JFK C

JFKà SEA D

64

Wherewestarted

Airline FlightNumber Path Plane

Model Pilot Passenger PassengerCity

PassengerState

Alaska 36 SEAàMSP A Dave Caitlyn Seattle Washington

AmericanAirlines 36 SEAà

MSP B Chad Ted Portland Oregon

Alaska 7 MSPàJFK C Dave Ted,Torin Portland,

SanDiegoOregon,California

Alaska 3367 MSPàJFK C Dave Caitlyn,

TorinSeattle,SanDiego

Washington,California

AmericanAirlines 3367 JFKà

SEA D Chad Torin SanDiego California

65

FinaldesigninBCNF

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Path PlaneModel

SEAà MSP A

SEAà MSP B

MSPà JFK C

JFKà SEA D

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

PassengerCity PassengerState

Seattle Washington

Portland Oregon

Airline FlightNumber Passenger

Alaska 36 Caitlyn

AmericanAirlines 36 Ted

Alaska 7 Ted

Alaska 7 Torin

Alaska 3367 Caitlyn

Alaska 3367 Torin

AmericanAirlines 3367 Torin

Airline Pilot

Alaska Dave

AmericanAirlines Chad

66

SQL:StructuredQueryLanguage

67

SQL:StructuredQueryLanguage

• SQLcannotonlyallowyoutoretrievespecificdatafromarelationaldatabase,itenablesyoutocreate,delete,and/orupdateinformationinthedatabase.

• Today,wewillfocusonthebasicstructureforretrievingspecificdatafromadatabase.

• ThesewillstartwiththekeywordSELECT

• Formoreinformation,andacomprehensiveSQLtutorial,checkoutthisreference:

• https://www.w3schools.com/sql/sql_intro.asp

68

SELECT

• YouwillalwaysstartyourSQLquerieswiththefollowingstructure:

SELECTfeature1,feature2,feature3…FROMtable;

69

SELECT

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECTAirline,FlightNumberFROMFlights;

70

SELECT

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECTAirline,FlightNumberFROMFlights;

Airline FlightNumber

Alaska 36

AmericanAirlines 36

Alaska 7

Alaska 3367

AmericanAirlines 3367

Results

71

SELECT

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECT*FROMFlights;

72

SELECT

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECT*FROMFlights;

ResultsAirline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

73

SELECTDISTINCT

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECTDISTINCTAirlineFROMFlights;

74

SELECTDISTINCT

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECTDISTINCTAirlineFROMFlights;

Airline

Alaska

AmericanAirlines

Results

75

WHERE

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECT*FROMFlightsWHEREAirline=‘Alaska’;

76

WHERE

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECT*FROMFlightsWHEREAirline=‘Alaska’;

ResultsAirline FlightNumber Path

Alaska 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

77

WHERE

• YoucanusemanydifferentoperatorswiththeWHEREkeywordtodeclareacondition:

• =• <>or!=• >• <• >=• <=

• YoucanalsostringtogethermanyconditionalstatementsusingAND,OR,NOT

78

WHERE

Airline FlightNumber Path

Alaska 36 SEAà MSP

AmericanAirlines 36 SEAà MSP

Alaska 7 MSPà JFK

Alaska 3367 MSPà JFK

AmericanAirlines 3367 JFKà SEA

Flights

SELECT*FROMFlightsWHEREAirline=‘Alaska’ANDPath=‘SEAàMSP’;

ResultsAirline FlightNumber Path

Alaska 36 SEAà MSP

79

JOINS

Imagesource:https://commons.wikimedia.org/wiki/File:SQL_Joins.svg

80

INNERJOIN

Results

SELECTPassengers.Passenger,Homes.PassengerStateFROMPassengersINNERJOINHomesONPassengers.PassengerCity=Homes.PassengerCity;

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Chad NewYork

PassengerCity PassengerState

Seattle Washington

Portland Oregon

SanDiego California

PassengersHomes

Passenger PassengerState

Caitlyn Washington

Ted Oregon

Torin Washington

81

LEFTJOIN

SELECTPassengers.Passenger,Homes.PassengerStateFROMPassengersLEFTJOINHomesONPassengers.PassengerCity=Homes.PassengerCity;

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Chad NewYork

PassengerCity PassengerState

Seattle Washington

Portland Oregon

SanDiego California

PassengersHomes

ResultsPassenger PassengerState

Caitlyn Washington

Ted Oregon

Torin Washington

Chad NULL

82

AggregateFunctions

• COUNT,MAX,MIN,SUM,AVGareavailabletoperformoperationsonyourdataset

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Dave Seattle

Chad NewYork

Passengers

SELECTCOUNT(Passenger)FROMPassengersWHEREPassengerCity=‘Seattle’;

COUNT(Passenger)

3

Results

83

GROUPBY

• YoucanalsoaGROUPBYkeywordwithanaggregatefunction

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Dave Seattle

Chad NewYork

Passengers

SELECTCOUNT(Passenger),PassengerCityFROMPassengersGROUPBYPassengerCity

COUNT(Passenger) PassengerCity

3 Seattle

1 Portland

1 NewYork

Results

84

Aliases

• Youcangivecolumnsortablestemporarynamesusinganalias

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Dave Seattle

Chad NewYork

Passengers

SELECTCOUNT(Passenger)asTotal,PassengerCityFROMPassengersGROUPBYPassengerCity

Total PassengerCity

3 Seattle

1 Portland

1 NewYork

Results

85

NestingwithSELECT

• Youcanselectfromatablethatistheresultofanotherquery

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Dave Seattle

Chad NewYork

Passengers

SELECTCOUNT(Passenger)asTotal,PassengerCityFROMPassengersGROUPBYPassengerCity

Total PassengerCity

3 Seattle

1 Portland

1 NewYork

Results

86

NestingwithSELECT

• Youcanselectfromatablethatistheresultofanotherquery

SELECTMAX(Total),PassengerCityFROM(SELECTCOUNT(Passenger)asTotal,PassengerCityFROMPassengersGROUPBYPassengerCity);

Total PassengerCity

3 Seattle

Results

Passenger PassengerCity

Caitlyn Seattle

Ted Portland

Torin Seattle

Dave Seattle

Chad NewYork

Passengers

SELECTCOUNT(Passenger)asTotal,PassengerCityFROMPassengersGROUPBYPassengerCity

Total PassengerCity

3 Seattle

1 Portland

1 NewYork

Results

87

PracticewritingSQLqueries

ThereferenceIprovidedalsoincludesapracticerelationaldatabase:https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_all

Trywritingeachofthequeriesbelowandwritedowntheanswers:1. Howmanycustomersarethereinthedatabase?2. Howmanydifferentfirstnamesarethereamongtheemployees?3. Howmanydifferentcountriesarethesupplierslocatedin?4. HowmanysuppliersarelocatedinFrance?5. Whatisthemostofoneproductitemthatwaseverpurchasedonasingle

order?6. Whichcountryhasthemostsuppliers?(Hint:exploreanestedquery)7. CreateatableofProductNameandCategoryName forproductswitha

pricelessthan50.

88

PracticewritingSQLqueries

ThereferenceIprovidedalsoincludesapracticerelationaldatabase:https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_all

Trywritingeachofthequeriesbelowandwritedowntheanswers:1. Howmanycustomersarethereinthedatabase?912. Howmanydifferentfirstnamesarethereamongtheemployees?103. Howmanydifferentcountriesarethesupplierslocatedin?174. HowmanysuppliersarelocatedinFrance?35. Whatisthemostofoneproductitemthatwaseverpurchasedonasingle

order?1206. Whichcountryhasthemostsuppliers?(Hint:exploreanestedquery) USA7. CreateatableofProductNameandCategoryName forproductswitha

pricelessthan50.70rows