PhillyCycle Case - Team 90

17
DECEMBER 2015 “Weather” to Rent a Bike TEAM 90 ALLYSON ANDERSON ERIN BILLSTROM JOHN SCHMUTZER


Executive Summary

PhillyCycle is a bike-sharing company based in Philadelphia that began operation in 2010. Customers rent bikes at kiosk locations throughout the city and can return them to a different location on an as-needed basis. The company has two main user segments: registered and casual. Registered customers pay a monthly membership fee in exchange for a discounted hourly rate. Casual customers pay an hourly rate with no registration.

The CEO of PhillyCycle is planning for the company’s growth. Our analysis focuses on the following: the key drivers of demand variation, the environmental drivers of demand, and the difference in demand patterns between user segments. In addition, we developed recommendations to assist the CEO in her expansion plan.

First, we found that for all users the number of rentals follows a pattern by season, day of week, and time of day. The largest demand for bike rentals occurs during the summer. Casual users’ rentals are concentrated on the weekend, while registered users’ rentals are concentrated on weekdays. Peak usage times for registered users on weekdays are 8 AM and 5 PM. This suggests that registered users rent bikes for transportation to and from work, while casual users rent bikes for leisure.

Second, users mainly rent bikes when the temperature is between 60 and 85 degrees Fahrenheit, the wind speed is less than 20 MPH, and the sky is clear to partly cloudy. Our data suggests that weather is a factor in the number of rentals.

Finally, the two user categories differ greatly in volume. Registered users account for 81% of rentals, while casual users account for only 19%.

When the CEO considers expanding operations, the candidate cities most likely to succeed are those with year-round moderate-to-warm weather. However, weather should not be the exclusive factor in expansion decisions. Our data suggests that registered users rent bikes for work transportation, so the average distance that residents commute to work should also be taken into account. Finally, we recommend focusing on registered users in future marketing campaigns. The campaign can focus on using the bikes for work transportation.


Analysis

Data Summary

Summary Table by User Segment

                      Casual User       Registered User
Median                16.00 rentals     115.00 rentals
Interquartile Range   44.00 rentals     186.00 rentals
Range                 367.00 rentals    876.00 rentals

PhillyCycle could benefit by looking at the median, the interquartile range, and the range. The median for casual users in our data set is 16 rentals, and the median for registered users is 115 rentals. The data is not normally distributed, so the mean would not be an appropriate indicator of the central tendency of the data. The median is a more appropriate measure because outliers affect it less.

The interquartile range (IQR) for casual users is 44 rentals. This means that the middle 50% of observations recorded between 4 and 48 casual rentals between 2011 and 2012. The IQR for registered users is 186 rentals: the middle 50% of observations recorded between 34 and 220 registered rentals in the same 2-year period. The smaller the IQR compared to the range, the stronger or more distinct the central tendency of rentals is. Since the IQR for both casual and registered users is much smaller than the range, we conclude that there is a strong central tendency.

Another useful measure of the spread of our data is the range. The range for casual users is 367 rentals and the range for registered users is 876 rentals. Our data examines the number of bike rentals, which can only be zero or a positive whole number. Since the minimum number of rentals for both user segments is zero, the range is also the maximum number of rentals per user segment. The range shows the difference in magnitude between casual and registered users.

Based on these three measures, it is evident that registered users have significantly more rentals than casual users. Therefore, the CEO of PhillyCycle should focus more on the registered user segment in upcoming marketing campaigns.
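As a minimal sketch of how the three measures in the summary table are computed, the snippet below uses only the Python standard library. The five sample values are hypothetical, chosen so that the quartiles match the casual-user figures quoted above (4 and 48); they are not the PhillyCycle data, and the "inclusive" quartile method is an assumption about how the quartiles were calculated.

```python
from statistics import median, quantiles


def summarize(rentals):
    """Return the median, interquartile range, and range of a list of counts."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(rentals, n=4, method="inclusive")
    return {
        "median": median(rentals),
        "iqr": q3 - q1,
        "range": max(rentals) - min(rentals),
    }


# Hypothetical sample for illustration only, not the real data set.
sample = [0, 4, 16, 48, 367]
summary = summarize(sample)
print(summary)
```

Because the minimum in the sample is zero, the range equals the maximum, which mirrors the observation made in the text.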


On average for all users, PhillyCycle has more rentals in the summer than in any other season. The fewest rentals occur during winter. Since biking is weather dependent, our data is consistent with people being more likely to rent bikes during warmer weather. Registered users account for 81% of PhillyCycle’s total rentals. This means that roughly 4 out of 5 rentals are made by customers with a PhillyCycle membership. By contrast, only 19% of rentals come from casual users.

[Figure: Average Rentals by Season. x-axis: Season (Winter, Spring, Summer, Fall); y-axis: Average Rentals.]

[Figure: Proportion of Rentals by User Category. Casual 19%, Registered 81%.]


PhillyCycle began operations in 2010, which explains why there were fewer rentals in 2011 than in 2012. For both 2011 and 2012, the highest number of rentals occurred between May and October.

Casual users have a higher average number of rentals on the weekend than on weekdays. From Monday to Friday, the average number of casual rentals per day is consistently around 20. On Saturday and Sunday, casual rentals increase dramatically to an average of 60 per day. Registered users, on the other hand, have a higher average number of rentals on weekdays than on the weekend. Between Monday and Friday, the average number of registered rentals is around 165. During the weekend, the average is 126. The reason for the difference between user segments is explained in the following two graphs.

[Figure: Rentals Per Month by Year. x-axis: Month; y-axis: Rentals; series: 2011, 2012.]

[Figure: Average Rentals by Day of Week per Customer Category. x-axis: Day of Week (Monday to Sunday); y-axis: Average Rentals; series: Casual, Registered, All Users.]


As noted above, casual users have a higher number of rentals on the weekend than on weekdays. The graph shows that there is no trend in weekday rentals for casual users. For registered users, the peak periods for rentals occur at 8 AM and 5 PM, with an average of 426 rentals at 8 AM and 465 rentals at 5 PM. Both peaks coincide with the normal commute to and from work. This indicates that registered users may be using PhillyCycle’s bikes as their transportation to work.

During the weekend, however, casual and registered users follow the same rental trend. Registered users account for a greater number of rentals overall, which explains the difference in magnitude between the two groups. The peak period of rentals for both groups is between noon and 3 PM. People tend to be out and about in the early to mid afternoon, which could explain this rental pattern.

[Figure: Average Weekday Rentals by User Category. x-axis: Time of Day (0:00 to 23:00); y-axis: Rentals; series: Casual, Registered.]

[Figure: Average Weekend Rentals by User Category. x-axis: Time of Day (0:00 to 23:00); y-axis: Rentals; series: Casual, Registered.]


Hypothesis Tests & Confidence Intervals

Average Rentals – Confidence Intervals

              Lower     Upper
All Users     185.23    194.94
Casual        34.23     36.85
Registered    150.49    158.60

The above table shows 95% confidence intervals for average rentals in each user category and for all users combined. We can therefore say with 95% confidence that the population average falls within the stated range for each category.

We will now determine whether there is evidence of different rental volumes on weekends versus weekdays for each user category. We operate under the null hypothesis that there is no difference in volume between weekdays and weekends.

95% confidence intervals for the difference between weekday and weekend rentals

The above table shows the 95% confidence intervals for the difference between weekday and weekend rentals for each user category. For All Users, we FAIL TO REJECT the null hypothesis because 0 is within the confidence interval. This means that there could be no difference at all between weekday and weekend rentals for all users combined. For casual and registered users, we REJECT the null hypothesis: both confidence intervals show a clear difference between weekday and weekend rentals (neither contains 0).
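The interval logic above can be sketched in a few lines of Python. This is a large-sample (normal approximation) version using the critical value 1.96; the report does not say which formula was used, so treat that as an assumption. The weekday and weekend counts are hypothetical illustration values, not the PhillyCycle data.

```python
from math import sqrt
from statistics import mean, variance

Z_95 = 1.96  # normal critical value for a 95% interval (large-sample assumption)


def mean_ci(values, z=Z_95):
    """Approximate confidence interval for a population mean."""
    m = mean(values)
    half_width = z * sqrt(variance(values) / len(values))
    return m - half_width, m + half_width


def diff_ci(a, b, z=Z_95):
    """Approximate confidence interval for mean(a) - mean(b), unequal variances."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff - z * se, diff + z * se


def contains_zero(interval):
    """True when 0 lies inside the interval, i.e. no clear difference."""
    lower, upper = interval
    return lower <= 0 <= upper


# Hypothetical casual-user counts, for illustration only.
weekday = [20, 22, 19, 21, 18, 20]
weekend = [58, 61, 60, 62, 59, 60]

interval = diff_ci(weekday, weekend)
verdict = "no clear difference" if contains_zero(interval) else "clear difference"
print(interval, verdict)
```

An interval that contains 0 is consistent with no weekday/weekend difference; an interval that excludes 0 indicates a difference at the chosen confidence level.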

Hypothesis tests for C-Ratio

Lastly, the above table shows confidence intervals for the proportion of casual users out of all users. The weekday proportions are calculated with a 99% confidence level and the weekend proportions with a 95% confidence level. Given the null hypotheses indicated above by ‘H0’, we REJECT all null hypotheses as the hypothesized proportions do not fall within the confidence intervals.
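The proportion test described above can be sketched as follows. The counts and the hypothesized proportion are hypothetical (the report's own H0 values are in a table that is not reproduced here), and the normal-approximation interval is an assumption about the method used.

```python
from math import sqrt

# Normal critical values for the two confidence levels mentioned in the text.
Z_VALUES = {0.95: 1.96, 0.99: 2.576}


def proportion_ci(successes, total, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / total
    half_width = Z_VALUES[level] * sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width


def reject_null(hypothesized_p, interval):
    """Reject H0 when the hypothesized proportion falls outside the interval."""
    lower, upper = interval
    return not (lower <= hypothesized_p <= upper)


# Hypothetical example: 190 casual rentals out of 1000 total rentals.
ci = proportion_ci(190, 1000, level=0.95)
print(ci, reject_null(0.25, ci))
```

Raising the confidence level from 95% to 99% widens the interval, which makes it harder to reject a given hypothesized proportion.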


Exploratory Analysis

[Figure: Rentals by Weather Conditions. x-axis: Weather (Clear/PC, Misty, Precipitation); y-axis: Rentals.]

The chart above shows the relationship between total rentals and three weather conditions: clear to partly cloudy, misty, and precipitation. The results show that 71.38% of rentals occurred when conditions were clear to partly cloudy, while only 23.55% occurred when it was misty and 5.05% when there was precipitation. A large majority of all rentals therefore occur when it is clear to partly cloudy.

[Figure: Total Rentals by Temperature. x-axis: Temperature (Far.); y-axis: Rentals.]

The chart above shows the relationship between total rentals and the outdoor temperature. The results show a lower number of rentals when the temperature is 40 degrees or below, as well as when the temperature is 90 degrees or above. The highest number of rentals occurs between 60 and 90 degrees.


[Figure: Rentals by Wind Speed. x-axis: Windspeed (mph); y-axis: Rentals.]

The chart above looks at the relationship between total rentals and wind speed. The results show that rentals are relatively high at wind speeds of 0 to 20 mph. When the wind reaches speeds greater than 20 mph, there is a significant decrease in the number of rentals.


Regression Analysis

Effects of Temperature on Total Rentals

Given the significance of weather in rental activity, our first regression analysis shows the effect of temperature on rentals.

               Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept      -77.41659132    8.1922685        -9.449957764    4.89958E-21    -93.47669424    -61.35648839
Temp (Far.)    4.424960246     0.130296333      33.96074274     1.4188E-229    4.169527637     4.680392856

All Users with independent variable Temperature
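The columns of the regression table above are related by simple arithmetic, which the following sketch illustrates: the t-statistic is the coefficient divided by its standard error, and the 95% bounds are approximately the coefficient plus or minus 1.96 standard errors. The 1.96 critical value is a large-sample approximation and an assumption here, so the bounds agree with the table only to about three decimal places. The 70-degree input to the fitted line is an arbitrary illustration value.

```python
# Values copied from the temperature regression table.
INTERCEPT = -77.41659132
TEMP_COEF = 4.424960246
TEMP_SE = 0.130296333


def t_stat(coef, std_error):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_error


def approx_ci(coef, std_error, z=1.96):
    """Approximate 95% confidence interval using the normal critical value."""
    return coef - z * std_error, coef + z * std_error


def predict_rentals(temp_f):
    """Fitted total rentals at a given temperature in degrees Fahrenheit."""
    return INTERCEPT + TEMP_COEF * temp_f


print(round(t_stat(TEMP_COEF, TEMP_SE), 2))
print(approx_ci(TEMP_COEF, TEMP_SE))
print(round(predict_rentals(70), 2))
```

Because the interval excludes 0 and the t-statistic is far above 1.96, the same arithmetic supports the significance of the temperature coefficient.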

The above regression table shows that for every one-degree (Far.) increase in temperature, total rentals increase by 4.42. The Temperature variable is also statistically significant: the 95% confidence interval does not include 0, the p-value is less than .05, and the t-stat is much larger than 1.96. Temperature accounts for approximately 17.39% of the variance in total rentals, as indicated by the adjusted R-squared value (the square of the correlation between All Rentals and Temperature, adjusted for the number of predictors).

Effects of Weekends/Weekdays on Rental Volume

It has also been noted that each user category uses PhillyCycle’s bikes for different reasons. To understand the difference between leisure use and work transportation, a regression is run for casual, registered, and all users with temperature and weekend as independent variables.

               Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept      -77.20572619    8.373615361      -9.220118533    4.14645E-20    -93.62134171    -60.79011066
Weekend        -0.60804181     4.987878811      -0.121903886    0.902979614    -10.3862675     9.17018388
Temp (Far.)    4.42434721      0.130405062      33.92772601     3.6148E-229    4.168701439     4.679992981

All Users with independent variables Temperature and Weekend

               Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept      -19.78428916    7.17295555       -2.758178135    0.005831754    -33.84613406    -5.72244426
Weekend        -35.24559816    4.272686463      -8.249048571    1.98145E-16    -43.62176248    -26.86943384
Temp (Far.)    3.050408712     0.111706792      27.30728049     4.9186E-154    2.831418983     3.26939844

Registered Users with independent variables Temperature and Weekend


               Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept      -57.42143702    2.046707821      -28.05551258    6.1868E-162    -61.43379814    -53.40907591
Weekend        34.63755635     1.219154467      28.41113024     9.5657E-166    32.24752885     37.02758385
Temp (Far.)    1.373938499     0.031874053      43.1052335      0              1.311452681     1.436424316

Casual Users with independent variables Temperature and Weekend

Weekend clearly affects the two user categories in opposite ways. Casual user rentals increase on weekends, and registered user rentals decrease on weekends. The Weekend variable is statistically significant for casual users and for registered users when each group is modeled alone. However, because the two groups are affected in opposite directions, the Weekend variable is not statistically significant for all users combined (0 is within the 95% confidence interval).

Multiple Regression of All Variables

With the trends of usage and temperature affirmed, we now look at the effects of all variables on total rentals. The below regression table summarizes total rentals as described by Temperature, Wind Speed, Humidity, Season, Weather, Year, and Weekend.

                 Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept        22.64484868     13.68350248      1.654901492     0.098001914    -4.180265545    49.4699629
Temp (Far.)      4.719272988     0.192816163      24.47550512     1.2647E-125    4.341276521     5.097269455
Wind Speed       0.472977819     0.26949382       1.75505998      0.079305087    -0.055337393    1.001293031
Humidity         -2.761685582    0.126892704      -21.76394308    8.7422E-101    -3.010445816    -2.512925349
Spring           5.434096586     7.229957067      0.751608417     0.452318952    -8.739498545    19.60769172
Summer           -22.72542203    9.299997018      -2.443594551    0.014573182    -40.95711985    -4.493724221
Fall             57.34216178     6.400744181      8.958671079     4.43189E-19    44.79415413     69.89016943
2012             74.01171663     4.109348175      18.01057333     1.70146E-70    65.95575769     82.06767557
Weekend          -2.408165173    4.513708228      -0.533522561    0.593693599    -11.25683085    6.440500506
Precipitation    -2.303412336    8.137552608      -0.283059594    0.777141878    -18.25625617    13.6494315
Mist             12.12551319     5.021457938      2.414739572     0.0157791      2.281455875     21.96957051

Multiple regression of All Users described by all independent variables

The multiple regression equation is as follows:

All Rentals = 22.64 + (4.72)*Temp + (0.47)*Wind Speed - (2.76)*Humidity + (5.43)*Spring - (22.73)*Summer + (57.34)*Fall + (74.01)*Year - (2.41)*Weekend - (2.30)*Precipitation + (12.13)*Mist

 


Notes on Multiple Regression

• This regression equation indicates that summer days have 22.73 fewer rentals than winter days, all else equal.

• Fall is statistically significant because zero is not within the 95% confidence interval, the p-value is less than .05, and the t-stat is greater than 1.96. All else equal, a fall day has 57.34 more rentals than a winter day.

• Temperature is statistically significant because zero is not within the 95% confidence interval, the p-value is less than .05, and the t-stat is greater than 1.96. A one-degree increase in temperature increases total rentals by approximately 4.7.

• Given a partly cloudy summer weekend in 2012 with a temperature of 70, humidity of 60, and wind speed of 5, the regression model estimates there would be 238.54 rentals.

• All else equal, a partly cloudy or clear day has 2.3 more rentals than a day with precipitation.

• According to the regression model, 32.53% of the variation in bike rentals is described by the variation of all independent variables (Adjusted R-Squared = .3253).
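The worked estimate in the notes above can be reproduced directly from the coefficient table. The sketch below is illustrative only; the dictionary keys are hypothetical names introduced here. Note that the published coefficients give the 238.54 figure when wind speed is taken as 5, which is the value assumed in this example; clear or partly cloudy weather and winter are the base categories, so their dummy variables are simply left at zero.

```python
# Coefficients copied from the multiple regression table (key names are hypothetical).
COEFFICIENTS = {
    "intercept": 22.64484868,
    "temp_f": 4.719272988,
    "wind_speed": 0.472977819,
    "humidity": -2.761685582,
    "spring": 5.434096586,
    "summer": -22.72542203,
    "fall": 57.34216178,
    "year_2012": 74.01171663,
    "weekend": -2.408165173,
    "precipitation": -2.303412336,
    "mist": 12.12551319,
}


def predict_total_rentals(**inputs):
    """Evaluate the fitted equation; omitted variables default to zero."""
    total = COEFFICIENTS["intercept"]
    for name, value in inputs.items():
        total += COEFFICIENTS[name] * value
    return total


# Partly cloudy summer weekend in 2012, temperature 70, humidity 60.
# wind_speed=5 is an assumption that reproduces the estimate quoted in the notes.
estimate = predict_total_rentals(
    temp_f=70, humidity=60, wind_speed=5, summer=1, year_2012=1, weekend=1
)
print(round(estimate, 2))
```

Passing dummy variables as 0 or 1 keeps the function a literal transcription of the regression equation, so each "all else equal" statement in the notes corresponds to toggling one argument.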

Wind Speed, Humidity, and Cost Saving

Given the high costs associated with collecting wind speed and humidity data, it is preferable to collect humidity data if only one can be chosen. Table 1 shows the regression statistics for a model including all independent variables except wind speed, and Table 2 shows regression statistics for a model including all independent variables except humidity.

Table 1 (all independent variables except wind speed)

Regression Statistics
Multiple R           0.571134679
R Square             0.326194821
Adjusted R Square    0.325085169
Standard Error       150.5968234
Observations         5475

Table 2 (all independent variables except humidity)

Regression Statistics
Multiple R           0.517876242
R Square             0.268195802
Adjusted R Square    0.266990635
Standard Error       156.9444952
Observations         5475

The model including humidity describes a greater share of the variance in bike rentals than the model including wind speed, as indicated by their respective R-Squared values (.33 vs .27). The standard error for the model including humidity is also slightly smaller (151 vs 157).
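As a quick check on the regression statistics above, the adjusted R-squared values can be recomputed from R-squared and the number of observations. The sketch assumes each reduced model has 9 predictors (the 10 predictors of the full model minus the one left out); that count is inferred here, not stated in the report.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_OBS = 5475
N_PREDICTORS = 9  # assumption: full model's 10 predictors minus the omitted one

with_humidity = adjusted_r_squared(0.326194821, N_OBS, N_PREDICTORS)
with_wind_speed = adjusted_r_squared(0.268195802, N_OBS, N_PREDICTORS)
print(round(with_humidity, 6), round(with_wind_speed, 6))
```

With more than five thousand observations the adjustment is small, so comparing the two models on R-squared or adjusted R-squared leads to the same preference for humidity.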


Exploratory Analysis of Regression

Effects on Holidays

One factor worth considering is the impact of holidays on total rentals. Because most businesses close on major holidays, this should affect each user category in a similar manner. The below regression table shows a model describing total rentals with the independent variable Holiday.

             Coefficients    Standard Error   t Stat         P-value        Lower 95%       Upper 95%
Intercept    191.2469484     2.510500291      76.17881942    0              186.3253698     196.1685269
Holiday      -42.47361502    15.16724522      -2.80035131    0.005122605    -72.20744507    -12.73978498

Regression of All Users with independent variable Holiday

All Rentals = 191.25 - (42.47)*Holiday

Holiday proves to be statistically significant and has a clear impact on rentals. The mean for All Users is 190, so a holiday experiences a 22% (42 out of 190) drop in rentals compared to a non-holiday, all else equal.

User Sensitivity to Changes in Season

A final factor to consider is the impact of season on the different user categories. Given the leisure and transportation trends of each category, it is useful to see which group is more sensitive to changes in season (Philadelphia experiences very distinct seasons, so the data is meaningful for this purpose). The below regression tables summarize models of casual and registered rentals with independent variables for seasons.

             Coefficients    Standard Error   t Stat         P-value        Lower 95%      Upper 95%
Intercept    14.18134328     1.291788632      10.97806788    9.47393E-28    11.64892384    16.71376273
Spring       31.44462708     1.801876064      17.45104877    2.06816E-66    27.91223341    34.97702074
Summer       36.66224832     1.813255026      20.21902479    9.98099E-88    33.10754736    40.21694928
Fall         16.25844729     1.827889509      8.894655402    7.83491E-19    12.67505693    19.84183766

Regression of Casual Users with independent variables Spring, Summer, and Fall (Winter base)

Casual Rentals = 14.18 + (31.44)*Spring + (36.66)*Summer + (16.26)*Fall

             Coefficients    Standard Error   t Stat         P-value        Lower 95%      Upper 95%
Intercept    95.35522388     4.069529538      23.43151045    8.7167E-116    87.37732759    103.3331202
Spring       66.09572884     5.67646106       11.6438267     5.70592E-31    54.9676077     77.22384997
Summer       97.95904114     5.712308272      17.14876657    2.9926E-64     86.76064522    109.1574371
Fall         71.15113364     5.758411373      12.3560352     1.30358E-34    59.8623573     82.43990997

Regression of Registered Users with independent variables Spring, Summer, and Fall (Winter base)

Registered Rentals = 95.36 + (66.10)*Spring + (97.96)*Summer + (71.15)*Fall


The magnitudes of casual and registered rentals are very different; therefore, we will consider the coefficients as a percentage of the average winter rentals (base) per user category.

Coefficients as % of Average Winter Rentals

          Casual Users    Registered Users
Spring    222%            69%
Summer    259%            103%
Fall      115%            75%

Coefficients as a percentage of average winter rentals by user category

Because the intercept in the regression models is the average winter rentals, taking the other coefficients as a percentage of the intercept shows the percentage change in average rentals from the base season. The above table summarizes these findings and shows that casual users are far more sensitive to changes in season than registered users in a city with distinct climate changes per season.
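The percentages in the table above follow from dividing each seasonal coefficient by the intercept of its model. A minimal sketch, using the coefficients from the two seasonal regression tables (the dictionary layout and function name are hypothetical):

```python
# Coefficients copied from the seasonal regression tables (Winter is the base season).
CASUAL = {
    "intercept": 14.18134328,
    "Spring": 31.44462708,
    "Summer": 36.66224832,
    "Fall": 16.25844729,
}
REGISTERED = {
    "intercept": 95.35522388,
    "Spring": 66.09572884,
    "Summer": 97.95904114,
    "Fall": 71.15113364,
}


def seasonal_lift(model):
    """Express each seasonal coefficient as a whole-number % of average winter rentals."""
    base = model["intercept"]
    return {
        season: round(100 * coef / base)
        for season, coef in model.items()
        if season != "intercept"
    }


print(seasonal_lift(CASUAL))
print(seasonal_lift(REGISTERED))
```

Normalizing by the winter baseline is what makes the two segments comparable despite their very different rental volumes.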


Conclusion

In conclusion, our analysis gives insight into the key drivers of demand variation, the environmental drivers of demand, and the difference in demand patterns between users. The largest demand for bike rentals occurs during summer, at temperatures between 60 and 85 degrees Fahrenheit, when it is clear to partly cloudy. Our data suggests that casual users rent bikes more during the weekend for leisure, while registered users rent more during weekday commuting hours. Average rentals by registered users are over four times those of casual users.

In order to make a more informed decision about expansion within Philadelphia as well as in other cities, we suggest that PhillyCycle further explore factors such as distance traveled by user segment, the age distribution within each user segment, and the popularity of each kiosk. These suggestions could help PhillyCycle gain a deeper understanding of its customers’ needs and better determine future marketing and pricing strategies.

We cannot make a recommendation for expansion until further data is collected, specifically the aforementioned customer data and revenue data. Given that expansion carries such a significant cost, an understanding of the company’s revenue-generating ability and operating costs is crucial to making an informed decision. When entering a new city, factors other than weather could affect bike rental demand, such as traffic and the effect of the city’s infrastructure on bikers. We acknowledge the importance of weather and temperature to bike rentals but stress that these are not the only relevant factors.


Appendix

Elevator Charts

[Figure: Average Rentals by Day of Week per Customer Category. x-axis: Day of Week (Monday to Sunday); y-axis: Average Rentals; series: Casual, Registered, All Users.]

We chose the chart above because it shows the difference in average rentals between weekdays and weekends for casual, registered, and all users. We felt this chart is important to emphasize because it clearly shows that average rentals increase on the weekends for casual users and decrease on the weekends for registered users. We decided to arrange this graph first because it shows the broadest and most significant difference between user rental demands.

[Figure: Average Weekday Rentals by User Category. x-axis: Time of Day (0:00 to 23:00); y-axis: Rentals; series: Casual, Registered.]

We chose the chart above because it supports the idea that registered users use the bikes as work transportation: the peak times fall during the most typical commuting hours. This graph gives insight into the needs of users in order to better market PhillyCycle’s services. We arranged this graph second because it shows the underlying cause of the difference between user rental demands.


[Figure: Proportion of Rentals by User Category. Pie chart; Casual 19%, Registered 81%.]

We chose this pie chart because it breaks down the number of rentals by user category, casual and registered. We felt this graph was important because 81% of rentals come from registered users, which is significantly more than come from casual users. This is important information for the company in understanding where its business is coming from. We arranged this graph last because it explains the difference in magnitude between registered and casual users.

Notes on Data Preparation

Before analyzing the data set, we ensured that the data was free of error, repetition, and significant outliers. All entries were complete and were not missing any data points. The data was then filtered for duplicates: 6 were found among the original 5489 entries. Lastly, significant and abnormal outliers were removed to prevent skewing of the central tendencies. Entries with temperatures above 50 degrees Celsius (122 degrees Fahrenheit) were considered erroneous and were removed (8 more entries). The original data contained very few errors, and those errors were adjusted for. We have no concern regarding the accuracy or quality of the data.
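The cleaning steps described under Notes on Data Preparation can be sketched as follows. The record layout and field names (`hour`, `temp_c`, `rentals`) are hypothetical, since the report does not describe the file format; the four sample rows are invented for illustration. Only the entry counts at the end are taken from the report.

```python
def clean_records(records, max_temp_c=50.0):
    """Drop exact duplicate rows, then drop rows with implausible temperatures."""
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        if record["temp_c"] > max_temp_c:
            continue  # treated as an erroneous temperature reading
        cleaned.append(record)
    return cleaned


def celsius_to_fahrenheit(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


# Hypothetical rows, for illustration only.
raw = [
    {"hour": "2011-01-01 08:00", "temp_c": 4.0, "rentals": 12},
    {"hour": "2011-01-01 08:00", "temp_c": 4.0, "rentals": 12},   # duplicate
    {"hour": "2011-07-04 14:00", "temp_c": 61.0, "rentals": 90},  # implausible
    {"hour": "2011-07-04 15:00", "temp_c": 31.0, "rentals": 95},
]
cleaned = clean_records(raw)

# Entry counts reported in the notes: original, duplicates, outliers.
ORIGINAL, DUPLICATES, OUTLIERS = 5489, 6, 8
remaining = ORIGINAL - DUPLICATES - OUTLIERS
print(len(cleaned), celsius_to_fahrenheit(50), remaining)
```

The remaining count of 5475 agrees with the number of observations reported in the regression statistics.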