International Student College Experience Enhancement Program

Team Members: Alice Zhang, Florence Liao, Huan Guo, Jake Magner, Li Shubin, Viraj Mohan, Zahin Ali


Project Background

Project Objective
To design a database for a website that helps international students with various aspects of “settling in” by providing a platform for interaction between students, local communities, cultural organizations, and employers.

Client
XiYiRen, a start-up social utility website, will use a small part of our expansive project, focusing on Chinese students.


DP I Summary

Progress
• Project background: objective and client description
• Summary of entities involved
• Database capabilities
• Simplified EER diagram with 10 entities, 3 weak entities/relationships, and superclass/subclass division


DP II Summary

Progress
• Revised simplified EER diagram, including more entities and 30 relationships
• Implementation of queries in relational algebra
• Realized the need for more complex queries utilizing IEOR methods: forecasting, optimal event location, etc.


DP III Summary

Progress
• Revised simplified EER diagram
• Relational schema
• Five queries implemented in SQL and Access
• Focused on client-centric queries


EER


Relational Schema

(A superscript number marks a foreign key referencing the relation with that number.)

1. Person(Pid, Fname, Lname, MI, Birth_date, Profile⁵)
2. Student(Pid¹, Housing⁷, University¹⁴, Pickup_Person³, Flight, Country¹¹, price_preference, year, sleep, wakeup, study, friends, outgoing)
3. Community_Member(Pid¹, occupation)
4. Alumni(Pid¹, Class, Occupation, Donation_Amount)
5. Profile(Profile_id, Pic, Email, Phone)
6. Location(Street, City, State, Apt_Suite, Zip, x, y)
7. Housing(Hid, offered_by_person¹, Street⁶, Apt_Suite⁶, Zip⁶, offered_by_org⁸, org_profile⁵, price, availability_date, furnished, number_rooms, number_bathrooms, water, electricity, garbage, gas, internet, move_in_special)
8. Organization(OrgName, Profile_id⁵, Street⁶, Apt_Suite⁶, Zip⁶, type, description)
9. Department(DepName, University¹⁴)
10. Event(EventName, Profile_id⁵, Street⁶, Apt_Suite⁶, Zip⁶, description, attendance, date, time)
11. Country(Name, Capital, Population)
12. Language(Name, Countries_spoken_in)
13. Resource(Rid, Owner¹, Price, Quantity)
14. University(Name, student_population, ranking)
15. Donation(Did, Amount, Time, Date, Pid¹)


Relational Schema (contd)

16. Mentors(Mentor¹, Mentee²)
17. Student_University(Student², University¹⁴)
18. Person_in_Org(Person¹, OrgName⁸, OrgProfile⁵)
19. RSVP(Person¹, EventName¹⁰, EventProfile⁵, SurveyScore)
20. Student_in_Department(Student², DepName⁹, UniName¹⁴)
21. Person_speaks_language(Person¹, Language¹²)
22. Housing_near_Uni(Housing⁷, UniName¹⁴)
23. Organization_University(OrgName⁸, OrgProfile⁵, UniName¹⁴)
24. Org_holds_event(OrgName⁸, OrgProfile⁵, EventName¹⁰, EventProfile⁵)
25. Org_speaks_Language(OrgName⁸, OrgProfile⁵, Language¹²)
26. Org_Country(OrgName⁸, OrgProfile⁵, Country¹¹)
27. Dep_sponsors_event(DepName⁹, UniName¹⁴, EventName¹⁰, EventProfile⁵)
28. Event_speaks_language(EventName¹⁰, EventProfile⁵, Language¹²)
29. Event_country(EventName¹⁰, EventProfile⁵, Country¹¹)
30. Country_Language(Country¹¹, Language¹²)
31. Alumni_Uni(Pid⁴, UniName¹⁴, class_of)
32. Alumni_Dept(Pid⁴, DepName⁹)
33. Person_gives_donation(Pid¹, Did¹⁵)
34. Roommates(Pid1¹, Pid2¹)


Relational Design


Query 1: Roommate Matching

Description
• Shows all possible roommate combinations ordered by MatchRating.
• A dorm or off-campus housing facility can use it to pair up students interested in its housing.

Description of Attributes
Sleep: sleep time, early (1) to late (5)
Wakeup: wake-up time, early (1) to late (5)
Outgoing: outgoingness level (1-5)
Study: study location, in room (1) to library (5)
Friends: having friends in the room, never (1) to always (5)


SQL Code

SELECT P.Fname, P.Lname, Q.Fname, Q.Lname,
       Min(0.2*Abs(S.sleep-R.sleep) + 0.2*Abs(S.wakeup-R.wakeup) + 0.2*Abs(S.outgoing-R.outgoing) + 0.2*Abs(S.study-R.study) + 0.2*Abs(S.friends-R.friends)) AS Matchrating
FROM Student AS S, Student AS R, Person AS P, Person AS Q
WHERE S.pid = P.pid AND Q.pid = R.pid AND Q.pid < P.pid
GROUP BY P.Fname, P.Lname, Q.Fname, Q.Lname
HAVING NOT (P.Fname = Q.Fname AND P.Lname = Q.Lname)
ORDER BY Min(0.2*Abs(S.sleep-R.sleep) + 0.2*Abs(S.wakeup-R.wakeup) + 0.2*Abs(S.outgoing-R.outgoing) + 0.2*Abs(S.study-R.study) + 0.2*Abs(S.friends-R.friends));
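The rating in the SELECT clause is an equally weighted sum of absolute differences over the five habit attributes, so a smaller value means more similar habits. A minimal Python sketch of that calculation follows; the student records and the function names are hypothetical and only illustrate the arithmetic, not the team's Access implementation.

```python
from itertools import combinations

# Hypothetical sample data: pid -> habit scores on the 1-5 scales.
students = {
    1: {"sleep": 2, "wakeup": 2, "outgoing": 4, "study": 1, "friends": 3},
    2: {"sleep": 2, "wakeup": 3, "outgoing": 4, "study": 1, "friends": 3},
    3: {"sleep": 5, "wakeup": 5, "outgoing": 1, "study": 5, "friends": 1},
}

ATTRIBUTES = ("sleep", "wakeup", "outgoing", "study", "friends")
WEIGHT = 0.2  # equal weight per attribute, as in the query


def match_rating(a, b):
    """Weighted sum of absolute differences; lower means more similar."""
    return sum(WEIGHT * abs(a[k] - b[k]) for k in ATTRIBUTES)


def ranked_pairs(records):
    """All unordered student pairs, most similar pair first."""
    pairs = [
        (match_rating(records[p], records[q]), p, q)
        for p, q in combinations(sorted(records), 2)
    ]
    return sorted(pairs)


for rating, p, q in ranked_pairs(students):
    print(f"students {p} and {q}: rating {rating:.1f}")
```

Using combinations plays the same role as the Q.pid < P.pid condition: each pair appears once and nobody is paired with themselves.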


Query 2: New Student Forecasting

Description
• Extracts how many new students arrive each year; these counts can then be used to forecast the future number of students.
• The year table is a one-attribute table containing a list of years.
• Uses the regression equation y = a + bx with slope b = (N∑XY − (∑X)(∑Y)) / (N∑X² − (∑X)²) and intercept a = (∑Y − b∑X) / N, where N = number of tuples, X = year, and Y = number of students.

SQL Code

SELECT y.year AS [Year], Count(s.pid) AS Number_Of_Students, u.name AS University
FROM [year] AS y, student AS s, university AS u
WHERE s.year = y.year AND s.university = u.name
GROUP BY y.year, u.name
ORDER BY y.year;
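The query only produces the yearly counts; the regression itself is applied to its output. As a rough sketch of that step, the following Python fits y = a + b·x (slope b, intercept a) with the least-squares sums described for this query and extrapolates one year ahead. The history rows are made-up values standing in for the query result of one university.

```python
def fit_line(points):
    """Return (a, b) for y = a + b*x from the least-squares sums."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)  # slope
    a = (sum_y - b * sum_x) / n                                   # intercept
    return a, b


def forecast(points, year):
    """Predicted number of students for the given year."""
    a, b = fit_line(points)
    return a + b * year


# Hypothetical (year, number_of_students) rows for one university.
history = [(2006, 100), (2007, 120), (2008, 140), (2009, 160)]
print(forecast(history, 2010))
```

At least two distinct years are needed; with a single year the denominator of the slope is zero.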


Query 3: Event Interest

Description
• Outputs a list of all events along with their computed attendance rate, the average level of student interest, and a metric combining surveyed interest with actual attendance.
• Organizations throwing events with low attendance but high survey scores may need to look into changing venues or increasing advertising.

SQL Code

SELECT e.EventName,
       e.Attendance/Count(r.person) AS Attendance_Rate,
       Avg(r.SurveyScore) AS Surveyed_Interest,
       Avg(r.SurveyScore)*e.Attendance/Count(r.person) AS Interest_Metric
FROM Event AS e, RSVP AS r
WHERE r.EventProfile = e.Profile_id
GROUP BY e.EventName, e.Profile_id, e.Attendance
ORDER BY Avg(r.SurveyScore)*e.Attendance/Count(r.person) DESC;
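To make the three output columns concrete, here is a small Python sketch that computes them for one event. The function name and the sample numbers are hypothetical; the arithmetic mirrors the expressions in the query (attendance divided by the number of RSVPs, the average survey score, and their product).

```python
def event_metrics(attendance, survey_scores):
    """Return (attendance_rate, surveyed_interest, interest_metric).

    attendance: actual head count recorded for the event.
    survey_scores: one survey score per RSVP.
    """
    rsvps = len(survey_scores)
    if rsvps == 0:
        raise ValueError("an event needs at least one RSVP")
    attendance_rate = attendance / rsvps
    surveyed_interest = sum(survey_scores) / rsvps
    return attendance_rate, surveyed_interest, attendance_rate * surveyed_interest


# Hypothetical event: four RSVPs, three people actually attended.
rate, interest, metric = event_metrics(3, [4, 5, 3, 4])
print(rate, interest, metric)
```

An event with high surveyed interest but a low attendance rate is the case the description flags for a venue or advertising review.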


Query 4: Optimal Event Location

Description
• Selects the optimal potential event location on the UC Berkeley campus in relation to attendee housing locations.
• Uses the P-Median approach to event location, which minimizes the total demand-weighted distance.
• Assumes P = 1 and calculates Dij with the Euclidean distance formula: Dij = √((xi − xj)² + (yi − yj)²)


SQL Code

SELECT e.EventName, l2.street AS Potential_Location,
       Sum((((l.x-l2.x)^2) + ((l.y-l2.y)^2))^0.5) AS distance,
       Avg(s.EventInterest) AS Demand
FROM Student AS s, RSVP AS p, Housing AS h, location AS l, location AS l2, Event AS e
WHERE s.PID = p.person AND p.EventName = e.EventName AND s.housing = h.hid AND h.street = l.street AND h.state = l.state AND h.city = l.city AND h.apt_suite = l.apt_suite AND h.zip = l.zip
GROUP BY e.EventName, l2.street
ORDER BY e.EventName, Sum((((l.x-l2.x)^2) + ((l.y-l2.y)^2))^0.5);
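With P = 1, the P-Median problem reduces to evaluating every candidate location and keeping the one with the smallest total distance to the attendees. The Python sketch below shows that selection under stated assumptions: the candidate names, coordinates, and demand weights are invented for illustration, and the weight per attendee is an assumption (use 1.0 everywhere for a plain sum of distances like the one in the query).

```python
from math import hypot


def best_location(candidates, attendees):
    """Pick the candidate minimizing total weighted Euclidean distance.

    candidates: dict mapping a location name to its (x, y) coordinates.
    attendees: list of ((x, y), weight) pairs, one per attendee residence.
    """
    def total_distance(site):
        sx, sy = site
        return sum(w * hypot(x - sx, y - sy) for (x, y), w in attendees)

    costs = {name: total_distance(xy) for name, xy in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical candidate sites and attendee housing coordinates.
candidates = {"Hall A": (0.0, 0.0), "Hall B": (3.0, 4.0)}
attendees = [((0.0, 0.0), 1.0), ((3.0, 4.0), 1.0), ((3.0, 4.0), 2.0)]

site, cost = best_location(candidates, attendees)
print(site, cost)
```

Exhaustive evaluation is adequate here because the candidate set is a short list of campus locations; larger P would need a proper facility-location solver.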


Query 5: Min Airport Pick-up Cost

Assumptions:
(1) Only students who arrive at the airport between 8:00 am and 7:59 pm are taken into account.
(2) Buses leave the airport on the hour.
(3) The opportunity cost of each student waiting an hour for a bus is $10.
(4) Each type I bus has a total of 5 seats and each type II bus has a total of 10 seats.
(5) Only the arrival hour of each student is considered (a student arriving at 1:01 pm is treated the same as a student arriving at 1:59 pm in this query implementation).
(6) Sending a five-seat vehicle and a ten-seat vehicle to the airport and back costs $50 and $100, respectively.

Description
• For a given date and airport, extract the number of students Ci arriving in each time interval i.
• A ≤ i ≤ L; Ci is interpreted as the number of students arriving at the airport no earlier than (i−1) o’clock but prior to i o’clock.


Formulation
Decision variables: tij = 1 if a type j bus is arranged to pick up students at i o’clock; tij = 0 otherwise (for A ≤ i ≤ L, 1 ≤ j ≤ 2).
Objective Function (Cost Min.):
Subject to: People_constrain {Z in A,B,C,D,E,F,G,H,I,J,K,L}:

SQL Code

SELECT s.airport AS Airport, s.arr_date AS Arr_Date, s.flight_arr_hour AS Arr_Time, Count(*) AS Number_of_Students
FROM student AS s
GROUP BY s.flight_arr_hour, s.arr_date, s.airport;
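The objective function and constraints are not reproduced in this transcript, so the following Python sketch is only one possible reading of the model, built from the stated assumptions: at each hour at most one bus of each type may be sent (tij is binary), a 5-seat bus costs $50 and a 10-seat bus $100, every student left behind at an hour pays a $10 waiting cost, and all students must be picked up by the last hour. The function name and the carry-over rule for waiting students are assumptions, not the team's formulation.

```python
from functools import lru_cache

# (seats, cost) for each combination of the binary choices t_i1 and t_i2.
BUS_OPTIONS = ((0, 0), (5, 50), (10, 100), (15, 150))
WAIT_COST = 10  # dollars per student per hour of waiting


def min_pickup_cost(arrivals):
    """Minimum total bus plus waiting cost for hourly arrival counts.

    arrivals[i] is the number of students arriving during interval i.
    Returns float('inf') if some students can never be picked up.
    """
    arrivals = tuple(arrivals)

    @lru_cache(maxsize=None)
    def best(hour, waiting):
        if hour == len(arrivals):
            # Assumed constraint: nobody may be left at the airport.
            return 0 if waiting == 0 else float("inf")
        at_stop = waiting + arrivals[hour]
        options = []
        for seats, cost in BUS_OPTIONS:
            left_behind = max(0, at_stop - seats)
            options.append(
                cost + WAIT_COST * left_behind + best(hour + 1, left_behind)
            )
        return min(options)

    return best(0, 0)


# Hypothetical counts, as the SQL query would return them for one date and airport.
print(min_pickup_cost([3, 2]))
```

In the two-hour example it is cheaper to let the first three students wait an hour and send a single 5-seat bus than to send a bus every hour.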


Normalization Analysis: 1NF

R is in 1NF if the domain of every attribute includes only atomic (simple, indivisible) values and the value of any attribute in a tuple is a single value from the domain of that attribute.

Profile (Profile_id, Pic, Emails, Phones)

In 1NF:
Pic (Profile_id, Pic)
Email (Profile_id, Email)
Phone (Profile_id, Phone)


Normalization Analysis: 2NF

R is in 2NF if R is in 1NF and every nonprime attribute A in R is fully functionally dependent on the primary key of R.

Location (Street, City, State, Apt_Suite, Zip, x, y)
Assumption: Zip determines City and State.

In 2NF:
Location1 (Street, Apt_Suite, Zip, x, y)
Zip (Zip, City, State)

Organization (OrgName, Profile_id⁵, Street⁶, Apt_Suite⁶, Zip⁶, type, description)
Assumption: The name of an organization determines its type.

In 2NF:
OrgName (OrgName, Type)
Organization1 (OrgName, Profile_id⁵, Street⁶, Apt_Suite⁶, Zip⁶, description)
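The Location decomposition rests on the assumed dependency Zip → City, State. As an illustration, the Python sketch below checks whether such a dependency holds in a set of rows and then projects the rows onto the two decomposed relations. The helper names and the sample addresses are hypothetical, and the x, y coordinates are left out of the sample for brevity.

```python
def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Duplicate-free projection onto attrs, in first-seen order."""
    return list(dict.fromkeys(tuple(row[a] for a in attrs) for row in rows))


# Hypothetical Location rows.
location = [
    {"Street": "1 Oak St", "Apt_Suite": "A", "Zip": "94704", "City": "Berkeley", "State": "CA"},
    {"Street": "2 Elm St", "Apt_Suite": "B", "Zip": "94704", "City": "Berkeley", "State": "CA"},
    {"Street": "3 Bay Rd", "Apt_Suite": "C", "Zip": "94607", "City": "Oakland", "State": "CA"},
]

if holds(location, ["Zip"], ["City", "State"]):
    location1 = project(location, ["Street", "Apt_Suite", "Zip"])
    zip_table = project(location, ["Zip", "City", "State"])
    print(location1)
    print(zip_table)
```

City and State are stored once per Zip after the split, which removes the repetition visible in the first two sample rows.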


Normalization Analysis: 3NF

R is in 3NF if R is in 2NF and no nonprime attribute of R is transitively dependent on the primary key.

Housing (Hid, offered_by_person¹, Street⁶, Apt_Suite⁶, Zip⁶, offered_by_org⁸, org_profile⁵, price, availability_date, furnished, number_rooms, number_bathrooms, water, electricity, garbage, gas, internet, move_in_special, ready_to_move_in)
Assumption: For a housing place to be “ready to move in”, it must have Internet, water, electricity, gas, and garbage service.

In 3NF:
Housing1 (Hid, offered_by_person¹, Street⁶, Apt_Suite⁶, Zip⁶, offered_by_org⁸, org_profile⁵, price, availability_date, furnished, number_rooms, number_bathrooms, move_in_special, Water, Electricity, Garbage, Gas, Internet)
Ready_to_move_in (ready_to_move_in, Water, Electricity, Garbage, Gas, Internet)


Normalization Analysis: BCNF

R is in BCNF if, whenever a nontrivial functional dependency X → A holds in R, X is a superkey of R.

Student (Pid¹, Housing⁷, University¹⁴, Pickup_Person³, Flight, Country¹¹, price_preference, year, sleep, wakeup, study, friends, outgoing)

BCNF


Organization Form


Person Form


Student Report


Questions?