IEOR 115 Final Presentation (2)
DP Final Presentation
Silicon Valley Youth Bridge
Team 6: Abbey Chaver, Catherine Darmawan, Desmond Chan, Isha Thapa, Jason Mao, Jessica Wijaya
Meet the Client
Non-profit
Teaches grade-school students how to play bridge
Goal: to inspire the next generation of bridge players
Fully run by ~100 volunteers
~400 registered youth members
SiVY Bridge Background
Events and Programs
Youth Tournaments (Pizza Party, Casual Friday)
Summer Camp
Parent-Child Games
External Tournament
About Bridge
Partnership Game
Complex game involving strategy and logic
Two parts - bidding and playing
Duplicate bridge at tournaments
Win Masterpoints
EER Diagram + Relational Schema
Previous Approach DP 1 DP 2
Final EER Diagram
Relational Schema
A number trailing an attribute name (a superscript on the original slides) is the number of the relation that the foreign key references.
1. Person (PersonID, Fname, Lname, gender, Start_date, Branch36, DOB, email, points)
2. Volunteer (Volunteer_ID, PersonID1, certification)
3. Teacher (Teacher_ID, Volunteer_ID2)
4. Board_Member (BM_ID, Volunteer_ID2, Position, Year)
5. Mentor (Mentor_ID, Volunteer_ID2)
6. Alumnus (Alumnus_ID, PersonID1, End_date_as_student)
7. Student (StudentID, PersonID1, waiver, School, Parent19, Parent29, Parent39, Phone_number, Emergency_contact_phone, how_did_you_hear, Prior_experience, ACBL_member, ACBL_number, Date_joined_ACBL, Training_program_participant)
8. Donor (Donor_ID, PersonID1)
9. Parent (Parent_ID, PersonID1)
10. Event (Event_ID, start_date, end_date)
11. External Tournament (Tournament_ID, Event_ID10, Type, City)
12. Internal Event (InternalE_ID, Event_ID10, RoomID22, BranchID36, Date, Time)
13. Pizza_Party_Tournament (PP_ID, InternalE_ID10, Food_Ordered, Food_Consumed)
14. Parent_Child_Tournament (PC_ID, InternalE_ID10)
15. Casual_Friday (CF_ID, InternalE_ID10)
16. Class (Class_ID, InternalE_ID10, Class_Name, Term, Teacher_ID3, School_hosting24, Weekly_hour, Weekly_day)
17. Class_Session (Session_ID, Date, Class_ID16, No_Attendees)
18. Summer_Camp (Camp_ID, InternalE_ID10, Year)
19. Fundraiser (Fundraiser_ID, InternalE_ID10)
20. IntEvent_Performance (InternalEvent_ID12, Person_ID1, Partner_ID1, point_type, points_achieved)
21. Time_interval (Time_ID, RoomID22, Start_time, End_time, Date)
22. Room (RID, BID23, capacity, projector)
23. Building (BID, Street_address, City, ZIP_code)
24. School (School_name, level)
25. Donation (Donation_ID, Donor_ID8, Amount, Associated_Fundraiser19)
26. Camp_Tuition (Tuition_ID, Camp_ID18, Student_ID7, Expense_ID)
27. Transaction (Transaction_ID, Amount, date, type)
28. Revenue (Rev_ID, Transaction_ID27)
29. Expense (Exp_ID, Transaction_ID27)
30. Supply Order (Order_ID, InternalE_ID10, items_description, VolunteerID2, date, Exp_ID29)
31. Skill (Skill_Name)
32. Skill_Teaching (Skill_Name31, Teacher_ID3)
33. Skill_Student (Skill_Name31, Student_ID7, level, Test_ID35)
34. Sponsorship (Sponsorship_ID, Student_ID7, Exp_ID29, UsedOn)
35. Test (Test_ID, Time_ID21, Skill_Name31, date)
36. Branch (Branch_ID, City, Country)
37. ExtTournament_Performance (StudentID7, TournamentID11, point_type, points_achieved)
38. IntEvent_RSVP_and_Attendance (PersonID1, EventID12, Attended, RSVP, Partner1)
39. StudentParent (StudentID7, ParentID9)
Queries/Analysis
1. Optimizing Food Purchases for Events
2. Assessing Skill Levels
3. Partner Matching
4. Donation Trend Analytics
5. Forecasting Event Participation Level
1. Optimizing Food Purchases for Events
● Optimizing the amount of food purchased
○ We can forecast the number of students likely to participate in an event from attendance data for previous events
○ Based on this forecast, we can buy an amount of food that minimizes leftovers
● Benefit:
○ Helps the organization reduce expenses for internal events
○ Indirectly improves the quality of the organization, since the money saved can be allocated to other areas (e.g. sponsoring students for tournaments, holding extra sessions for underperforming students, marketing)
Step 1: Use SQL query to extract data
Creating the Query
Step 1: Retrieve data on the number of attendees, amount of food, and amount left over
SQL > SELECT S.InternalE_ID, Count(IA.Attended), Count(IA.RSVP), Sum(S.quantity), PPT.Pizza_Remaining
FROM Pizza_Party_Tournament AS PPT, Supply_Order AS S, IntEvent_RSVP_and_Attendance AS IA, Internal_Event AS IE
WHERE PPT.order_id = S.order_id
AND S.product_type = "pizza"
AND IA.event_ID = IE.Event_ID
AND IE.InternalE_ID = PPT.InternalE_ID
AND IA.Attended = 1
AND IA.RSVP = 1
GROUP BY S.InternalE_ID, PPT.Pizza_Remaining;
Step 2: Use linear regression to predict the number of attendees
Predict Number of Attendees from RSVPs
- Regress number of attendees against number of RSVPs
- Verify linear model
- Use linear model function in R
- Check significance level
Example: Attendance = .27 + .8*RSVP
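The regression step above can be sketched in a few lines. The slides use R's linear model function; the Python below is only an illustrative equivalent of ordinary least squares, and the RSVP/attendance pairs are made-up values, not SiVY data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical (RSVP count, attendance count) pairs for past events.
rsvps = [10, 20, 30, 40]
attended = [9, 15, 26, 32]

intercept, slope = fit_line(rsvps, attended)
predicted_attendance = intercept + slope * 25  # prediction for an event with 25 RSVPs
print(f"Attendance = {intercept:.2f} + {slope:.2f} * RSVP")
```

Checking significance (the last bullet) still needs the standard errors that R's summary output reports; this sketch only recovers the coefficients.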
Step 3: Use linear regression to predict amount of pizza consumed
Predict Number of Pizzas Consumed
- Children may eat less than the standard serving size
- Regress pizza consumption against number of attendees
- Assume the distribution of ages is the same
- Verify linear model
- Set intercept to zero
- ex: Pizza = .26 * Attendees
Use Both Models to Predict Consumption
- Obtain the predicted number of attendees from the first model
- Plug that value into the second model to estimate the amount of pizza
- Don't extrapolate beyond the range of the data!
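As a rough sketch of how the two models chain together: the coefficients below are the example values from the slides (Attendance = .27 + .8*RSVP and Pizza = .26*Attendees), while the function names, the sample points, and the RSVP range used for the extrapolation guard are assumptions made for illustration.

```python
def fit_through_origin(xs, ys):
    """Least squares slope for ys = slope * xs (intercept fixed at zero)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def predict_pizzas(rsvp, attendance_model, pizzas_per_attendee, rsvp_range):
    """Chain the RSVP -> attendance model into the attendance -> pizza model."""
    low, high = rsvp_range
    if not low <= rsvp <= high:
        # Guard against extrapolating outside the data the models were fit on.
        raise ValueError("RSVP count is outside the range of the fitted data")
    intercept, slope = attendance_model
    attendees = intercept + slope * rsvp
    return pizzas_per_attendee * attendees


# Zero-intercept fit on hypothetical (attendees, pizzas eaten) points.
example_slope = fit_through_origin([10, 20, 30], [2.5, 5.0, 7.5])

# Example coefficients from the slides; the RSVP range (5 to 60) is hypothetical.
pizzas = predict_pizzas(30, (0.27, 0.8), 0.26, (5, 60))
print(round(pizzas, 2))
```

In practice the estimate would be rounded up to whole pizzas before ordering.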
2. Assessing Skill Levels
● Identify underperforming students so that mentors/teachers can provide extra support and attention
● Identify the best-performing or "most improved" students to reward with recognition and prizes such as sponsorships for external tournaments
● Evaluation is based on points, years playing bridge, participation in classes, attendance at events other than classes, and test scores
● Benefit:
○ When more attention is given to underperforming students, they are more likely to improve. This in turn improves the quality of the organization and makes students and parents prouder of the progress made.
○ If the right student is picked for sponsorship to an external tournament, the organization has a greater chance of having a member win the tournament. This improves SiVY Bridge's reputation as well.
Step 1: Use SQL to display all names, points, attendance and skill level
Step 2: Normalize data and graph in MS Excel
Step 3: Analyze data with reservations
Step 1: Use SQL to Gather Relevant Info
SQL > SELECT P.PersonID, P.points, Count(EA.Attended) AS Total_attendance, Avg(SS.level) AS Average_skill
FROM Person P, IntEvent_RSVP_and_Attendance AS EA, Skill_Student SS, Student S
WHERE P.PersonID = S.PersonID AND SS.StudentID = S.StudentID AND EA.PersonID = S.PersonID AND EA.Attended = 1
GROUP BY P.PersonID, P.points;
Step 2: Graph Data with MS Excel
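The slides do not say which normalization is applied before graphing. One common choice, shown here purely as an assumption, is min-max scaling, which puts points, attendance and skill level on the same 0 to 1 scale so they can share a chart. The sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        # All values identical: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical per-student totals, one list per metric.
points = [10, 20, 30]
attendance = [2, 8, 5]

normalized_points = min_max_normalize(points)
normalized_attendance = min_max_normalize(attendance)
print(normalized_points, normalized_attendance)
```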
Step 3: Analyze Data with Reservations
- Do not take the data at face value
- Picking out outliers may be easy, but deciding which students teachers should focus on may be a completely different matter
- Jack Ma and Isabel Wong seem to be the most underperforming students, but Frank Liu actually is
- On closer inspection, Frank Liu attends events but doesn't perform on par with his skill level
- Jack and Isabel have a high skill level but a lower overall score because they didn't attend events. Why? Perhaps they cannot learn any more from events aimed at the majority of the organization, which is at a lower skill level than their current level
3. Partner Matching
● Maximize the total value of partnerships for students at a tournament
● Partnership "value" is weighted by skill level, point accumulation, age, and prior partnership
○ Using linear programming, we minimize differences in skill level, age, and personal points, and maximize games played together and points achieved together.
● Benefit:
○ Better compatibility improves teamwork at the tournament and therefore increases the chances of winning.
○ Creates strong relationships between students, improving the experience of playing and their commitment to the game.
Step 1: Extract Data
Query: Relevant Data for each possible match
SQL > CREATE VIEW Skill_rank AS
SELECT StudentID, Avg(level) AS Average_level
FROM Skill_Student
GROUP BY StudentID;
SQL > SELECT P1.PersonID, P2.PersonID, P1.points - P2.points, SR1.Average_level - SR2.Average_level, P1.DOB - P2.DOB, Count(IEP.InternalEvent_ID), Sum(IEP.points_achieved)
FROM Person P1, Person P2, Skill_rank SR1, Skill_rank SR2, Student S1, Student S2, IntEvent_Performance IEP
WHERE P1.PersonID < P2.PersonID
AND IEP.PersonID = P1.PersonID AND IEP.PartnerID = P2.PersonID
AND SR1.StudentID = S1.StudentID AND SR2.StudentID = S2.StudentID
AND S1.PersonID = P1.PersonID AND S2.PersonID = P2.PersonID
AND P1.PersonID IN (SELECT PersonID FROM IntEvent_Performance WHERE InternalEvent_ID = 15)
AND P2.PersonID IN (SELECT PersonID FROM IntEvent_Performance WHERE InternalEvent_ID = 15)
GROUP BY P1.PersonID, P2.PersonID, P1.points, P2.points, SR1.Average_level, SR2.Average_level, P1.DOB, P2.DOB;
Step 2: Optimize matches with AMPL
Optimize matches with AMPL formulation
param n; # number of students attending event
param m = n*(n-1)/2; # possible matches
set attributes;
param match {1..m, attributes};
var y {1..m} binary; # indicates match i is selected
var x {1..n, 1..n} binary; # x[j,k] = 1 if persons j and k (j < k) are paired
minimize MatchValue:
sum {i in 1..m} (abs(match[i,"PointDiff"]) + abs(match[i,"SkillDiff"]) + abs(match[i,"BDiff"])
- match[i,"IEP"] - match[i,"JointP"]) * y[i];
subject to
# Condition 1: A person can only have one partner, whether listed first or second in a pair
C1 {i in 1..n}: sum {j in 1..n} (x[i,j] + x[j,i]) <= 1;
# Condition 2: All students should be matched unless n is odd, in which case only one should be unmatched
C2a: sum {j in 1..n, k in 1..n} x[j,k] <= n/2;
C2b: sum {j in 1..n, k in 1..n} x[j,k] >= (n-1)/2;
# Condition 3: Student cannot be paired with him/herself
C3 {j in 1..n}: x[j,j] <= 0;
# Condition 4: Eliminate identical pairings with different order
C4 {j in 1..n, k in 1..j}: x[j,k] <= 0;
# Condition 5: Relate x[j,k] to y[i] through a numerical transformation based on the ordering of the match matrix
# (only pairs with j < k correspond to a row of the match table)
C5 {j in 1..n, k in j+1..n}: x[j,k] = y[(j-1)*n - (j-1)*j/2 + (k - j)];
data; #####################
param n := 5;
set attributes := PID1 PID2 PointDiff SkillDiff BDiff IEP JointP;
param match:
    PID1 PID2 PointDiff SkillDiff BDiff IEP JointP :=
1   1    9    4   0.5    1   3  3
2   1    16   3   1     -3   0  0
3   1    4   -2  -0.25   2   2  5
4   1    5    6   0.5   -4   1  4
5   9    16  -1   0      4   1  1
6   9    4   -6   0      1   4  8
7   9    11   2   0.2   -5   1  0
8   16   4   -5   0      5   1  2
9   16   11   3  -1     -1   3  1
10  4    11   8   0.5   -6   0  0;
Output:
y1 y2 y3 y4 y5 y6 y7 y8 y9 y10
0 0 0 0 0 1 0 0 1 0
In this case, the optimal matching is to select match 6 and 9, resulting in the pairs (9, 4) and (16, 11). Person 1 is unmatched and will play with a volunteer.
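With only five students, the reported result can be cross-checked by enumerating every valid set of two pairs. The sketch below is not the AMPL model; it is a brute-force check in Python using the same unweighted cost (absolute point, skill and birth-date differences minus games and points achieved together). One assumption: row 4 of the data lists the second person as 5, which is not among the five IDs used elsewhere, so it is read here as person 11, the only pairing of those IDs that is otherwise missing.

```python
from itertools import combinations

# (PID1, PID2, PointDiff, SkillDiff, BDiff, IEP, JointP), in match order 1..10.
# Row 4 is assumed to be the pair (1, 11); see the note above.
MATCHES = [
    (1, 9, 4, 0.5, 1, 3, 3),
    (1, 16, 3, 1, -3, 0, 0),
    (1, 4, -2, -0.25, 2, 2, 5),
    (1, 11, 6, 0.5, -4, 1, 4),
    (9, 16, -1, 0, 4, 1, 1),
    (9, 4, -6, 0, 1, 4, 8),
    (9, 11, 2, 0.2, -5, 1, 0),
    (16, 4, -5, 0, 5, 1, 2),
    (16, 11, 3, -1, -1, 3, 1),
    (4, 11, 8, 0.5, -6, 0, 0),
]


def match_cost(row):
    """Cost of one candidate pair under the unweighted objective."""
    _, _, point_diff, skill_diff, birth_diff, games, joint_points = row
    return abs(point_diff) + abs(skill_diff) + abs(birth_diff) - games - joint_points


def best_matching(matches, pairs_needed):
    """Try every combination of pairs and keep the cheapest one with no repeated person."""
    best = None
    for combo in combinations(range(len(matches)), pairs_needed):
        people = [person for i in combo for person in matches[i][:2]]
        if len(people) != len(set(people)):
            continue  # someone would have two partners
        cost = sum(match_cost(matches[i]) for i in combo)
        if best is None or cost < best[0]:
            best = (cost, combo)
    return best


cost, chosen = best_matching(MATCHES, pairs_needed=2)
selected = [i + 1 for i in chosen]  # 1-based match numbers, as on the slides
print(selected, cost)
```

Enumeration only works for very small events; the number of candidate pairings grows too quickly for a full tournament, which is where the optimization model earns its keep.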
Step 3: Tune Objective Weights
minimize MatchValue:
sum {i in 1..m} (Z1*abs(match[i,"PointDiff"]) + Z2*abs(match[i,"SkillDiff"])
+ Z3*abs(match[i,"BDiff"]) - Z4*match[i,"IEP"] - Z5*match[i,"JointP"]) * y[i];
Using IntEvent_Performance.points_achieved as the outcome, we can evaluate the success of our matching. By adding weights Z1 to Z5 to the components of the objective function, we can tune the coefficients so that the most accurate predictors of partnership success carry the most weight.
4. Donation Trend Analytics
Is the amount of money received from donations consistent over months and years?
Business Justification:
● Find trends in monetary donations
● Analyze whether time affects the amount of donations
● Predict financials to foresee the future of the organization and to note whether fundraising efforts will be needed
Step 1: Microsoft Access: Create a query using SQL
Step 2: Microsoft Excel: ANOVA Test
Step 3: Microsoft Access: Chi-Squared Goodness of Fit Test
Step 1: Creating a Query
Find the total amount per month using SQL in MS Access
SQL Code
SELECT DISTINCTROW Format$([Donation].[Date],'yyyy/mm') AS [Year and Month], Sum(Donation.Amount) AS [Sum Of Amount]
FROM Donation
GROUP BY Format$([Donation].[Date],'yyyy/mm'), Year([Donation].[Date])*12+DatePart('m',[Donation].[Date])-1
ORDER BY Format$([Donation].[Date],'yyyy/mm'), Year([Donation].[Date])*12+DatePart('m',[Donation].[Date])-1;
Step 2: Consistency of donation amount over years
Export the data to MS Excel and use ANOVA: Single Factor Data Analysis.
Since F < F critical, we fail to reject H0: µ2012 = µ2013 = µ2014.
Step 2: Consistency of donation amount over months
Find the average of donations per month and run ANOVA.
Since F > F critical, we reject H0: µ1 = µ2 = ... = µ12.
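For readers who want to see what Excel's single-factor ANOVA computes, here is a minimal sketch of the F statistic. The function name and the three sample groups are hypothetical; the resulting F would still be compared against the critical value reported by Excel or an F table for the same degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean - grand_mean) ** 2 for g, mean in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean) ** 2 for g, mean in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical monthly donation totals (in hundreds of dollars), one list per year.
donations_by_year = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_between, df_within = one_way_anova_f(donations_by_year)
print(f_stat, df_between, df_within)
```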
Step 3: Summary
Since the total amount of donations is consistent over the years, SiVY can use this data to plan long-term goals and expansions, and to determine whether additional fundraising is needed to fund future missions and cover expenditures.
Since donations are not consistent across months, SiVY needs to plan the use of donations for event and competition expenditures ahead of time (e.g. create financial plans for 2016 activities and expenditures and make sure that the money is available before the start of 2016).
5. Forecasting Event Participation
Summary:
● We find the seasonal trend in students' participation at internal events and forecast participation at future events from previous attendance data
○ There may be periods when students are more or less interested in attending events (e.g. holiday season, beginning of the school year)
● Using this forecast, we can plan more events during the high season, when they will be more effective, and fewer events during the low season (low number of attendees).
Step 1: Filter & retrieve data from Access database to create a query using SQL
Step 2: Export the data to Excel and apply Holt-Winters method to do forecasting
Step 3: Generate analysis from the result to create recommendation for future events
Step 1: Retrieve & filter data from Access using SQL Query
For each event, we can filter the data using SQL to find the total number of people who attended the event held on a particular date.
SELECT IntEvent_RSVP_and_Attendance.EventID, Internal_Event.Date, Count(IntEvent_RSVP_and_Attendance.Attended) AS TotalAttendance
FROM IntEvent_RSVP_and_Attendance, Internal_Event
WHERE Internal_Event.InternalE_ID = IntEvent_RSVP_and_Attendance.EventID
AND IntEvent_RSVP_and_Attendance.Attended = 1
GROUP BY IntEvent_RSVP_and_Attendance.EventID, Internal_Event.Date;
Step 2: Export data & forecast with Holt-Winters method
● Using the Holt-Winters method, we forecast future attendance for events held at different times.
○ We use the Holt-Winters model because a seasonal factor may affect attendance.
● Therefore, we set the seasonal period to 12 (monthly seasons).
● We also use the multiplicative seasonal method.
Holt-Winters Formula
yt = forecast at time t
lt = level at time t
bt = trend at time t
st = seasonal factor at time t
α = smoothing parameter for the level
β = smoothing parameter for the trend
γ = smoothing parameter for the seasonal factor
m = number of periods in a season
The first year's data are used only to get the calculation started.
We averaged the attendance level and used the average as the initial value for the Holt-Winters formula.
Chart legend: 1st year (actual data), 2nd and 3rd years (actual data), 4th and 5th years (forecast)
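The forecasting step was done in Excel; the sketch below shows the same idea in Python using the standard multiplicative Holt-Winters recursions. The initialization follows the description above (level set to the first season's average), while starting the trend at zero and taking each initial seasonal factor as observation divided by that average are assumptions. The sample series uses a 4-period season only to keep the example short.

```python
def holt_winters_multiplicative(y, m, alpha, beta, gamma, horizon):
    """Forecast `horizon` periods ahead; the first m observations only initialize the state."""
    level = sum(y[:m]) / m                       # initial level: first-season average
    trend = 0.0                                  # assumed initial trend
    seasonal = [y[i] / level for i in range(m)]  # assumed initial seasonal factors
    for t in range(m, len(y)):
        s_prev = seasonal[t - m]
        prev_level, prev_trend = level, trend
        # l_t = alpha * (y_t / s_{t-m}) + (1 - alpha) * (l_{t-1} + b_{t-1})
        level = alpha * (y[t] / s_prev) + (1 - alpha) * (prev_level + prev_trend)
        # b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # s_t = gamma * y_t / (l_{t-1} + b_{t-1}) + (1 - gamma) * s_{t-m}
        seasonal.append(gamma * (y[t] / (prev_level + prev_trend)) + (1 - gamma) * s_prev)
    n = len(y)
    # Forecast h steps ahead: (l_n + h * b_n) times the latest factor for that season slot.
    return [(level + h * trend) * seasonal[n - m + (h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical attendance with a repeating 4-period pattern and no trend.
attendance = [10, 20, 30, 40] * 3
forecast = holt_winters_multiplicative(attendance, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
print([round(f, 2) for f in forecast])
```

For the monthly data described in the slides, `m` would be 12, and the smoothing parameters would be chosen to minimize forecast error on the actual years.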
Step 3: Analysis and recommendation for future events
Notice that the seasonal pattern in attendance is maintained when the forecast for future periods is made with the Holt-Winters method.
● In this case, we assume that children are more likely to go to events in the middle of the semester, and less likely to go during summer and winter breaks, as they might already have plans with family and friends.
Attendance would be higher for events in the middle of the semester (Feb-May) and slightly lower for events during school breaks (June-August & Dec-Jan).
● Focus on creating more events during the peak period, since they will be more effective with more members (students) participating
● Hold fewer events during the low-season period (school break), or modify the event planning to fit the lower number of attendees (e.g. ordering less food, booking smaller rooms) to reduce cost.
Normalization
1NF
2NF
3NF
BCNF
First Normal Form (1NF)
Before:
1. Person (PersonID, Fname, Lname, gender, Start_date, Branch36, DOB, email, points)
A person can be part of more than one branch of the organization, therefore "Branch" is a multivalued attribute => not in 1NF
After (to normalize it, we break it into 2 tables):
1.1. Person (PersonID, Fname, Lname, gender, Start_date, DOB, email, points)
1.2. Person_of_Branch (PersonID, Branch36)
Second Normal Form (2NF)
Before:
16. Class (Class_ID, InternalE_ID10, Class_Name, Term, Teacher_ID3, School_hosting24, Weekly_hour, Weekly_day)
Class_ID alone can determine Class_Name => not fully FD on every CK
After (to normalize it, we break it into 2 tables):
16.1. Class (Class_ID, InternalE_ID10, Term, Teacher_ID3, School_hosting24, Weekly_hour, Weekly_day)
16.2. Class_Name (Class_ID, Class_Name)
Third Normal Form (3NF)
Before:
23. Building (BID, Street_address, City, ZIP_code)
{Street_address, City} alone can determine ZIP_code => Not in 3NF
After (to normalize it, we break it into 2 tables):
23.1. Building (BID, Street_address, City)
23.2. Address_ZIP (Street_address, City, ZIP_code)
Assumption: Same street address can exist in multiple cities, so it has to be combined with city to be unique!
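A functional dependency such as {Street_address, City} → ZIP_code can be sanity-checked against sample rows. The helper below is a hypothetical illustration, and the addresses are placeholder values. Note that sample data can only refute a dependency; whether it truly holds is a design assumption, as stated above.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on the lhs attributes but differ on the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs key; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Placeholder Building rows: the same street address appears in two different cities.
buildings = [
    {"BID": 1, "Street_address": "100 Main St", "City": "City A", "ZIP_code": "00001"},
    {"BID": 2, "Street_address": "100 Main St", "City": "City B", "ZIP_code": "00002"},
    {"BID": 3, "Street_address": "100 Main St", "City": "City A", "ZIP_code": "00001"},
]

street_and_city_determine_zip = fd_holds(buildings, ["Street_address", "City"], ["ZIP_code"])
street_alone_determines_zip = fd_holds(buildings, ["Street_address"], ["ZIP_code"])
print(street_and_city_determine_zip, street_alone_determines_zip)
```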
Boyce-Codd Normal Form (BCNF)
Before:
33. Skill_Student (Skill_Name31, Student_ID7, level, Test_ID35)
Test_ID → Skill_Name because tests are administered on a single skill (dependency captured in the relation Test).
To normalize this into BCNF:
33. Skill_Student (Test_ID35, Student_ID7, level)
However, this defeats the purpose of easily identifying which skills a student possesses, so it's not very sensible.
Fully normalized Boyce-Codd Normal Form (BCNF)
10. Event (Event_ID, start_date, end_date)
is in 3NF because the two non-prime attributes are fully dependent on the primary key, and because there is no functional dependency between the two non-prime attributes.
It is further in BCNF because every functional dependency is of the form superkey → non-prime attribute:
Event_ID → start_date
Event_ID → end_date
start_date ↛ Event_ID
end_date ↛ Event_ID
start_date ↛ end_date
end_date ↛ start_date
Assumption: some events span more than one day (otherwise we would not track this as two separate attributes).
Questions?