Group A9 - Big Data Garrett, Priyanka, Rohith, Anupama, Leena and Shweta


Page 1:

Group A9 - Big Data

Garrett, Priyanka, Rohith, Anupama, Leena and Shweta

Page 2:

Business Goal - Reduce the cost incurred by the company due to customer cancellations

Assumptions:

Cost of a customer cancellation: 100 INR

Cost of calling a customer: 10 INR

Customers won’t be annoyed by an additional confirmation call

Opportunity – 10% of bookings are cancelled by the customer

Benefits to YourCabs - Cost Reduction, Increased Vendor Satisfaction

Page 3:

Supervised task: Find a relationship between the input (independent) variables and customer cancellations.

Predictive: Predict customer cancellations for future bookings.

Implementation: The model flags the customers it predicts are most likely to cancel.

Action by YourCabs: 1. Contact the flagged customers one hour before pickup. 2. Send the cab after confirmation.
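The stated cost assumptions (100 INR per cancellation, 10 INR per call) imply a break-even point for flagging a customer. The sketch below works that out. It rests on a simplifying assumption that is not in the slides: a confirmation call always avoids the cancellation cost when the customer would have cancelled. The function names are hypothetical.

```python
# Figures taken from the slide's assumptions.
COST_OF_CANCELLATION_INR = 100
COST_OF_CALL_INR = 10


def break_even_probability(call_cost, cancellation_cost):
    """Cancellation probability at which a call costs as much as the expected loss."""
    return call_cost / cancellation_cost


def should_flag(p_cancel,
                call_cost=COST_OF_CALL_INR,
                cancellation_cost=COST_OF_CANCELLATION_INR):
    # Assumption (not from the slides): the call fully avoids the
    # cancellation cost, so flag when the expected loss exceeds the call cost.
    return p_cancel * cancellation_cost > call_cost


threshold = break_even_probability(COST_OF_CALL_INR, COST_OF_CANCELLATION_INR)
print(threshold)          # break-even cancellation probability
print(should_flag(0.25))  # a booking with a 25% predicted cancellation risk
```

Under these assumptions, a call pays for itself whenever the predicted cancellation probability is above one in ten.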

Page 4:

Data Preparation - Data cleanup and missing-value handling.

Data Analysis - Visual representation of the data through bar charts, cross tables and pivot tables.

Data Binning - Binned the data into the useful variables listed on the following slide.

Data Partitioning - Training – 23%, Validation – 43% and Test – 31%.
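A partitioning step like this can be sketched in a few lines. The slides do not say which tool or method was used, so this is only an illustration: the `partition` function is hypothetical, the records are stand-in integers, and the slide's percentages are treated as relative weights that are normalised before splitting.

```python
import random


def partition(records, shares, seed=0):
    """Shuffle records, then cut them into consecutive partitions by relative share."""
    total = sum(shares.values())
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible

    parts, start = {}, 0
    names = list(shares)
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(shuffled)  # the last partition takes whatever remains
        else:
            end = start + round(len(shuffled) * shares[name] / total)
        parts[name] = shuffled[start:end]
        start = end
    return parts


# Shares as stated on the slide, used as relative weights.
parts = partition(range(1000), {"training": 23, "validation": 43, "test": 31})
print({name: len(rows) for name, rows in parts.items()})
```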

Page 5:

Is_Customer_Cancelled (output) – Binary variable (0, 1)

Booking_Day_Of_The_Week (input) – Categorical variable (1-7)

Pickup_Day_Of_The_Week (input) – Categorical variable (1-7)

Binned_Booking_Time_Of_The_Day (input) – Categorical variable (6 bins)

Binned_Pickup_Time_Of_Day (input) – Categorical variable (6 bins)

Binned_Pickup_Month (input) – Categorical variable (1-12)

Binned_Booking_Month (input) – Categorical variable (1-12)

Is_Booking_Within_a_day (input) – Binary variable (1 for <=24 hours, 0 for >24 hours)
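The input variables above could be derived from a booking timestamp and a pickup timestamp roughly as follows. The slides do not give the bin boundaries or the day numbering, so two assumptions are made here: the six time-of-day bins are equal four-hour blocks numbered from 1, and days run from 1 (Monday) to 7 (Sunday). The helper names and the example timestamps are hypothetical.

```python
from datetime import datetime


def time_of_day_bin(ts, bins=6):
    """Map the hour of day to one of `bins` equal-width bins, numbered from 1."""
    # Assumption: equal-width bins (4 hours each when bins=6).
    return ts.hour // (24 // bins) + 1


def derive_features(booking_time, pickup_time):
    lead_hours = (pickup_time - booking_time).total_seconds() / 3600
    return {
        # Assumption: ISO numbering, 1 = Monday ... 7 = Sunday.
        "Booking_Day_Of_The_Week": booking_time.isoweekday(),
        "Pickup_Day_Of_The_Week": pickup_time.isoweekday(),
        "Binned_Booking_Time_Of_The_Day": time_of_day_bin(booking_time),
        "Binned_Pickup_Time_Of_Day": time_of_day_bin(pickup_time),
        "Binned_Pickup_Month": pickup_time.month,
        "Binned_Booking_Month": booking_time.month,
        "Is_Booking_Within_a_day": 1 if lead_hours <= 24 else 0,
    }


# Hypothetical booking made at 22:30 for a pickup at 09:00 the next morning.
features = derive_features(datetime(2013, 1, 1, 22, 30), datetime(2013, 1, 2, 9, 0))
print(features)
```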

Page 6:

KNN Algorithm

Naïve Bayes
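The slide names the two algorithms without further detail. As an illustration of how Naïve Bayes handles categorical inputs like those listed earlier, here is a minimal textbook sketch with Laplace smoothing. It is not the group's model: the class, the single feature used and the six toy rows are all hypothetical.

```python
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (feature, label) -> value counts
        self.values = defaultdict(set)            # feature -> distinct values seen
        for row, label in zip(rows, labels):
            for feature, value in row.items():
                self.value_counts[(feature, label)][value] += 1
                self.values[feature].add(value)
        return self

    def predict_proba(self, row):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = count / total  # class prior
            for feature, value in row.items():
                seen = self.value_counts[(feature, label)][value]
                n_values = len(self.values[feature])
                # Smoothed estimate of P(value | label).
                score *= (seen + self.alpha) / (count + self.alpha * n_values)
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}


# Toy data: one input, label 1 means the customer cancelled.
rows = [{"Is_Booking_Within_a_day": v} for v in (1, 1, 1, 0, 0, 0)]
labels = [1, 1, 0, 0, 0, 0]
model = CategoricalNaiveBayes().fit(rows, labels)
p_short_notice = model.predict_proba({"Is_Booking_Within_a_day": 1})
p_long_notice = model.predict_proba({"Is_Booking_Within_a_day": 0})
print(p_short_notice[1], p_long_notice[1])
```

The predicted probability for class 1 is what a flagging rule would compare against a threshold.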

Page 7:

Cost of error

Naïve Rule - Cost: 43500 * 0.10 * 100 = Rs. 4,35,000

After Implementation - Cost: 13950 * (1/(13366/43500)) = Rs. 3,90,000

Savings per year: 4,35,000 – 3,90,000 = Rs. 45,000
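The expressions on this slide can be evaluated directly. The slide does not define 13950 or 13366. Since 13366 is about 31% of 43500, one reading (an assumption) is that they are the cost observed on the test partition and the size of that partition, scaled up to all 43,500 bookings. The variable names below are hypothetical.

```python
bookings = 43500
cancel_rate = 0.10
cost_per_cancellation = 100  # INR, from the stated assumptions

# Naive rule: every cancellation incurs the full cost.
naive_cost = bookings * cancel_rate * cost_per_cancellation

# The "after implementation" expression, evaluated exactly as written.
scaled_cost = 13950 * (1 / (13366 / 43500))

print(round(naive_cost))              # Rs. 4,35,000 as on the slide
print(round(scaled_cost))
print(round(naive_cost - scaled_cost))
```

Evaluated as written, the scaling expression comes to roughly Rs. 45,400, which is close to the slide's savings figure, and the difference from the naive cost is roughly Rs. 3,90,000. The two labels on the slide may therefore be swapped, though the transcript alone does not settle this.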

Page 8:

• Run the model every hour

• Call the customers who are flagged by the model

• Ask each customer whether the booking will be cancelled

• Send the cab only after confirmation from the customer
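One hourly run of this process might look like the sketch below. Everything in it is hypothetical: the booking records, the `predict_cancel_probability` callback standing in for the trained model, the 0.1 threshold, and the choice to look at pickups between one and two hours ahead so that each call happens at least an hour before pickup.

```python
from datetime import datetime, timedelta


def bookings_to_call(bookings, now, predict_cancel_probability, threshold=0.1):
    """Return upcoming bookings that the model flags for a confirmation call."""
    # Assumption: an hourly run covers pickups one to two hours from now.
    window_start = now + timedelta(hours=1)
    window_end = now + timedelta(hours=2)
    upcoming = [b for b in bookings if window_start <= b["pickup_time"] < window_end]
    return [b for b in upcoming if predict_cancel_probability(b) > threshold]


# Toy bookings with a precomputed score standing in for the model output.
bookings = [
    {"id": "A", "pickup_time": datetime(2013, 1, 1, 9, 30), "score": 0.40},
    {"id": "B", "pickup_time": datetime(2013, 1, 1, 9, 15), "score": 0.05},
    {"id": "C", "pickup_time": datetime(2013, 1, 1, 12, 0), "score": 0.90},
]
to_call = bookings_to_call(bookings, datetime(2013, 1, 1, 8, 0), lambda b: b["score"])
print([b["id"] for b in to_call])
```

Booking C has a high score but falls outside the current window, so a later run would pick it up.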

Page 9:

Thank You