DECEMBER 2015
“Weather” to Rent a Bike
TEAM 90 ALLYSON ANDERSON ERIN BILLSTROM JOHN SCHMUTZER
Page 1 of 17
Executive Summary
PhillyCycle is a bike-sharing company based in Philadelphia that began operation in 2010. Customers rent bikes at kiosks throughout the city and can return them to a different kiosk on an as-needed basis. The company has two main user segments: registered and casual. Registered customers pay a monthly membership fee in exchange for a discounted hourly rate. Casual customers pay an hourly rate with no registration.
The CEO of PhillyCycle is planning for the company’s growth. Our analysis focuses on the key drivers of demand variation, the environmental drivers of demand, and the difference in demand patterns between user segments. We also provide recommendations to assist the CEO in her expansion plan.
First, for all users, the number of rentals follows a pattern by season, day of week, and time of day. The largest demand for bike rentals occurs during the summer. Casual users’ rentals are concentrated on the weekend, while registered users’ rentals are concentrated on weekdays. Peak weekday usage times for registered users are 8 AM and 5 PM. This suggests that registered users rent bikes for transportation to and from work, while casual users rent bikes for leisure.
Second, users mainly rent bikes when the temperature is between 60 and 85 degrees Fahrenheit, the wind speed is below 20 MPH, and the sky is clear to partly cloudy. Our data suggests that weather is a factor in the number of rentals.
Finally, the two user categories differ substantially in volume. Registered users account for 81% of rentals, while casual users account for only 19%.
When the CEO considers expanding operations to other cities, those with year-round moderate-to-warm weather have a greater likelihood of success. However, weather should not be the exclusive factor in expansion decisions. Our data suggests that registered users rent bikes for work transportation, so the average distance that residents commute to work should also be taken into account. Finally, we recommend focusing future marketing campaigns on registered users. These campaigns can emphasize using the bikes for work transportation.
Analysis
Data Summary
Summary Table by User Segment
Measure               Casual User       Registered User
Median                16.00 rentals     115.00 rentals
Interquartile Range   44.00 rentals     186.00 rentals
Range                 367.00 rentals    876.00 rentals
PhillyCycle could benefit by looking at the median, interquartile range, and range of rentals. The median for casual users in our data set is 16 rentals, and the median for registered users is 115 rentals. The data is not normally distributed, so the mean would not be an appropriate indicator of the central tendency of the data. The median is a more appropriate measure because outliers affect it less.
The interquartile range (IQR) of casual users is 44 rentals: the middle 50% of casual rental counts in the 2011-2012 data fall between 4 and 48. The IQR of registered users is 186 rentals, with the middle 50% of registered rental counts falling between 34 and 220 over the same 2-year period. The smaller the IQR compared to the range, the stronger or more distinct the central tendency of rentals is. Since the IQR for both casual and registered users is significantly smaller than the range, we know that there is a strong central tendency.
Another useful measure of the spread of our data is the range. The range of casual users is 367 rentals and the range of registered users is 876 rentals. Our data examines the number of bike rentals, which can only be zero or a positive whole number. Since the minimum number of rentals for both user segments is zero, the range is also the maximum number of rentals per user segment. The range shows the difference in magnitude between casual and registered users.
Based on these three measurements, it is evident that registered users account for significantly more rentals than casual users. Therefore, the CEO of PhillyCycle should focus more on the registered user segment for upcoming marketing campaigns.
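As a minimal sketch of how these three measures can be computed, the snippet below uses Python's standard `statistics` module. The sample values are made up for illustration and are not taken from the PhillyCycle data set; the function name `summarize` and the choice of the "inclusive" quartile method are also assumptions, since the report does not state how its quartiles were calculated.

```python
import statistics

def summarize(counts):
    """Return (median, interquartile range, range) for a list of rental counts."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2 (median), Q3.
    q1, q2, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return q2, q3 - q1, max(counts) - min(counts)

# Hypothetical rental counts (illustration only, not the report's data).
casual_sample = [0, 2, 4, 9, 16, 30, 48, 75, 120]

median, iqr, value_range = summarize(casual_sample)
print(f"median={median}, IQR={iqr}, range={value_range}")
```

Because the smallest value in the sample is zero, the range equals the maximum, which mirrors the observation made in the text about the rental data.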
On average for all users, PhillyCycle has more rentals in the summer than in other seasons. The fewest rentals occur during winter. Since biking is weather dependent, our data is consistent with the idea that people are more likely to rent bikes in warmer weather.
Registered users make up 81% of PhillyCycle’s total rentals. This means that about 4 out of 5 rentals come from members. Only 19% of rentals come from casual users.
[Figure: Average Rentals by Season. X-axis: Season (Winter, Spring, Summer, Fall); Y-axis: Average Rentals]
[Figure: Proportion of Rentals by User Category. Casual 19%, Registered 81%]
PhillyCycle began operations in 2010, which explains why there were fewer rentals in 2011 than in 2012. In both 2011 and 2012, the highest number of rentals occurred between May and October.
Casual users have a higher number of average rentals on weekends than on weekdays. From Monday through Friday, average casual rentals are consistently around 20 per day. On Saturday and Sunday, casual rentals increase dramatically to an average of 60. Registered users, on the other hand, have a higher number of average rentals on weekdays than on weekends. Between Monday and Friday, the average number of registered rentals is around 165. During the weekend, the average is 126. The reason for the difference between user segments is explained by the following two graphs.
[Figure: Rentals Per Month by Year. X-axis: Month; Y-axis: Rentals; Series: 2011, 2012]
[Figure: Average Rentals by Day of Week per Customer Category. X-axis: Day of Week (Monday through Sunday); Y-axis: Average Rentals; Series: Casual, Registered, All Users]
As noted above, casual users have a higher number of rentals on weekends than on weekdays. The weekday graph shows no clear trend in rentals for casual users. For registered users, weekday rentals peak at 8 AM and 5 PM, with an average of 426 rentals at 8 AM and 465 rentals at 5 PM. Both peaks occur during the normal commute to and from work. This indicates that registered users may be using PhillyCycle’s bikes as their transportation to work.
During the weekend, however, casual and registered users follow the same rental trend. Registered users account for a greater number of rentals overall, which explains the difference in magnitude between the two segments. The peak period of rentals for both segments is between noon and 3 PM. People tend to be out and about in the early to mid afternoon, which could explain this rental pattern.
[Figure: Average Weekday Rentals by User Category. X-axis: Time of Day (0:00 to 23:00); Y-axis: Rentals; Series: Casual, Registered]
[Figure: Average Weekend Rentals by User Category. X-axis: Time of Day (0:00 to 23:00); Y-axis: Rentals; Series: Casual, Registered]
Hypothesis Tests & Confidence Intervals
Average Rentals – Confidence Intervals
User Category   Lower     Upper
All Users       185.23    194.94
Casual          34.23     36.85
Registered      150.49    158.60
The above table shows 95% confidence intervals for average rentals for each user category and for all users combined. We can say with 95% confidence that the population average for each category falls within its interval. We will now determine whether there is evidence of different rental volumes on weekends versus weekdays for each user category. We operate under the null hypothesis that there is no difference in rental volume between weekdays and weekends.
95% confidence intervals for the difference between weekday and weekend rentals
The above table shows the 95% confidence intervals for the difference between weekday and weekend rentals for each user category. For All Users, we FAIL TO REJECT the null hypothesis because 0 is within the confidence interval. This means that there could be no difference at all between weekday and weekend rentals for all users combined. For casual and registered users, we REJECT the null hypothesis. Both confidence intervals exclude 0 and show that there is a clear difference between weekday and weekend rentals.
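The decision rule used here (checking whether 0 lies inside the interval for the weekday-minus-weekend difference) can be sketched as follows. The sample values, the function names `diff_ci` and `excludes_zero`, and the use of the normal critical value 1.96 with unequal-variance standard errors are all assumptions for illustration; the report does not state which formula produced its intervals.

```python
import math
import statistics

def diff_ci(a, b, z=1.96):
    """Approximate 95% confidence interval for mean(a) - mean(b)."""
    diff = statistics.mean(a) - statistics.mean(b)
    # Standard error of the difference between two independent sample means.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff - z * se, diff + z * se

def excludes_zero(ci):
    """True when 0 is outside the interval, i.e. the data show a difference."""
    lower, upper = ci
    return lower > 0 or upper < 0

# Hypothetical average rentals (illustration only, not the report's data).
weekday = [20, 22, 19, 21, 18, 23, 20, 21]
weekend = [58, 62, 60, 57, 63, 61, 59, 60]

ci = diff_ci(weekday, weekend)
print(ci, "difference detected" if excludes_zero(ci) else "no difference detected")
```

With samples this small, a t critical value would be more appropriate than 1.96; the normal value is used only to keep the sketch short.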
Hypothesis tests for C-Ratio
Lastly, the above table shows confidence intervals for the proportion of casual users out of all users. The weekday proportions are calculated with a 99% confidence level and the weekend proportions with a 95% confidence level. Given the null hypotheses indicated above by ‘H0’, we REJECT all null hypotheses as the hypothesized proportions do not fall within the confidence intervals.
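A confidence interval for a proportion at the two confidence levels mentioned above might be computed along these lines. This is a sketch under stated assumptions: the counts are invented, the hypothesized proportions are placeholders, and the normal-approximation (Wald) interval is assumed because the report does not name the method it used.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, total, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / total
    # Two-sided critical value, e.g. about 1.96 for 95% and 2.576 for 99%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / total)
    return p_hat - margin, p_hat + margin

def rejects(h0_proportion, ci):
    """Reject H0 when the hypothesized proportion falls outside the interval."""
    lower, upper = ci
    return not (lower <= h0_proportion <= upper)

# Hypothetical counts (illustration only): 1,900 casual rentals out of 10,000.
ci_95 = proportion_ci(1900, 10000, 0.95)
ci_99 = proportion_ci(1900, 10000, 0.99)

print(ci_95, rejects(0.25, ci_95))
print(ci_99, rejects(0.25, ci_99))
```

The 99% interval is wider than the 95% interval for the same data, so a hypothesized proportion has to be further from the sample proportion to be rejected at the higher confidence level.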
Exploratory Analysis
The Rentals by Weather Conditions chart shows the relationship between total rentals and three weather conditions: clear to partly cloudy, misty, and precipitation. The results show that 71.38% of rentals occurred when conditions were clear to partly cloudy, while only 23.55% occurred when it was misty and 5.05% when there was precipitation. A clear majority of all rentals therefore occur when it is clear to partly cloudy.
The Total Rentals by Temperature chart shows the relationship between total rentals and the outdoor temperature. The results show a lower number of rentals when the temperature is 40 degrees or below, as well as when it is 90 degrees or above. The highest number of rentals occurs between 60 and 90 degrees.
[Figure: Rentals by Weather Conditions. X-axis: Weather (Clear/PC, Misty, Precipitation); Y-axis: Rentals]
[Figure: Total Rentals by Temperature. X-axis: Temperature (Far.); Y-axis: Rentals]
The chart above looks at the relationship between total rentals and wind speed. The results show that rentals are relatively high between wind speeds of 0 to 20 mph. When the wind reaches speeds greater than 20 mph there is a significant decrease in the number of rentals.
[Figure: Rentals by Wind Speed. X-axis: Windspeed (mph); Y-axis: Rentals]
Regression Analysis
Effects of Temperature on Total Rentals
Given the significance of weather in rental activity, our first regression analysis shows the effect of temperature on rentals.
              Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept     -77.41659132    8.1922685        -9.449957764    4.89958E-21    -93.47669424    -61.35648839
Temp (Far.)   4.424960246     0.130296333      33.96074274     1.4188E-229    4.169527637     4.680392856
All Users with independent variable Temperature
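As a rough illustration, the fitted line from the table above can be evaluated directly. The coefficients are copied from the table; the function name `predict_total_rentals` and the example temperature of 70 degrees are assumptions introduced only for this sketch.

```python
# Coefficients copied from the "All Users with independent variable Temperature" table.
INTERCEPT = -77.41659132
TEMP_SLOPE = 4.424960246  # change in total rentals per one-degree (Far.) increase

def predict_total_rentals(temp_f):
    """Estimate total rentals from temperature using the simple regression line."""
    return INTERCEPT + TEMP_SLOPE * temp_f

# Example temperature chosen for illustration only.
print(round(predict_total_rentals(70), 2))

# A linear fit should not be extrapolated far outside the observed data:
# below roughly 17.5 degrees this line would predict negative rentals.
print(predict_total_rentals(10) < 0)
```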
The above regression table shows that for every one-degree (Far.) increase in temperature, total rentals increase by 4.42. The Temperature variable is also statistically significant: the 95% confidence interval does not include 0, the p-value is less than .05, and the t-stat is much larger than 1.96. Temperature accounts for approximately 17.39% of the variance in total rentals, as indicated by the R-squared value. (This value is the adjusted square of the correlation between All Rentals and Temperature.)
Effects of Weekends/Weekdays on Rental Volume
It has also been noted that each user category uses PhillyCycle’s bikes for different reasons. To understand the difference between leisure use and work transportation, a regression is run for casual, registered, and all users with temperature and weekend as independent variables.
              Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept     -77.20572619    8.373615361      -9.220118533    4.14645E-20    -93.62134171    -60.79011066
Weekend       -0.60804181     4.987878811      -0.121903886    0.902979614    -10.3862675     9.17018388
Temp (Far.)   4.42434721      0.130405062      33.92772601     3.6148E-229    4.168701439     4.679992981
All Users with independent variables Temperature and Weekend
              Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept     -19.78428916    7.17295555       -2.758178135    0.005831754    -33.84613406    -5.72244426
Weekend       -35.24559816    4.272686463      -8.249048571    1.98145E-16    -43.62176248    -26.86943384
Temp (Far.)   3.050408712     0.111706792      27.30728049     4.9186E-154    2.831418983     3.26939844
Registered Users with independent variables Temperature and Weekend
              Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept     -57.42143702    2.046707821      -28.05551258    6.1868E-162    -61.43379814    -53.40907591
Weekend       34.63755635     1.219154467      28.41113024     9.5657E-166    32.24752885     37.02758385
Temp (Far.)   1.373938499     0.031874053      43.1052335      0              1.311452681     1.436424316
Casual Users with independent variables Temperature and Weekend
Weekend clearly affects each user category in opposite ways. Casual user rentals increase on weekends, and registered user rentals decrease on weekends. The Weekend variable is statistically significant for casual users and for registered users when each is modeled alone. However, because the two groups are affected in opposite directions, the Weekend variable is not statistically significant for all users combined (0 is within the 95% confidence interval).
Multiple Regression of All Variables
With the trends of usage and temperature affirmed, we now look at the effects of all variables on total rentals. The regression table below summarizes total rentals as described by Temperature, Wind Speed, Humidity, Season, Weather, Year, and Weekend.
                Coefficients    Standard Error   t Stat          P-value        Lower 95%       Upper 95%
Intercept       22.64484868     13.68350248      1.654901492     0.098001914    -4.180265545    49.4699629
Temp (Far.)     4.719272988     0.192816163      24.47550512     1.2647E-125    4.341276521     5.097269455
Wind Speed      0.472977819     0.26949382       1.75505998      0.079305087    -0.055337393    1.001293031
Humidity        -2.761685582    0.126892704      -21.76394308    8.7422E-101    -3.010445816    -2.512925349
Spring          5.434096586     7.229957067      0.751608417     0.452318952    -8.739498545    19.60769172
Summer          -22.72542203    9.299997018      -2.443594551    0.014573182    -40.95711985    -4.493724221
Fall            57.34216178     6.400744181      8.958671079     4.43189E-19    44.79415413     69.89016943
2012            74.01171663     4.109348175      18.01057333     1.70146E-70    65.95575769     82.06767557
Weekend         -2.408165173    4.513708228      -0.533522561    0.593693599    -11.25683085    6.440500506
Precipitation   -2.303412336    8.137552608      -0.283059594    0.777141878    -18.25625617    13.6494315
Mist            12.12551319     5.021457938      2.414739572     0.0157791      2.281455875     21.96957051
Multiple regression of All Users described by all independent variables
The multiple regression equation is as follows:
All Rentals = 22.64 + (4.72)*Temp + (.47)*Wind Speed - (2.76)*Humidity + (5.43)*Spring - (22.73)*Summer + (57.34)*Fall + (74.01)*Year - (2.41)*Weekend - (2.30)*Precipitation + (12.13)*Mist
Notes on Multiple Regression
• This regression equation indicates that summer days have 22.73 fewer rentals than winter days, all else equal.
• Fall is statistically significant because zero is not within the confidence interval (95%), the p-value is less than .05, and the t-stat is greater than 1.96. All else equal, a fall day has 57.34 more rentals than a winter day.
• Temperature is statistically significant because zero is not within the confidence interval (95%), the p-value is less than .05, and the t-stat is greater than 1.96. A one-degree increase in temperature increases total rentals by approximately 4.7.
• Given a partly cloudy summer weekend in 2012 with a temperature of 70, humidity of 60, and wind speed of 5, the regression model estimates there would be 238.54 rentals.
• All else equal, a partly cloudy or clear day has 2.3 more rentals than a day with precipitation.
• According to the regression model, 32.53% of the variation in bike rentals is described by the variation of all independent variables (Adjusted R-Squared = .3253).
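The worked estimate in the notes can be checked by plugging a scenario into the regression equation. The sketch below copies the coefficients from the multiple regression table; the dictionary keys and the function name `predict` are hypothetical labels. The wind speed of 5 is an assumption: it is the value that reproduces the reported estimate of 238.54 with the tabulated coefficients.

```python
# Coefficients copied from the multiple regression table (winter, clear/partly
# cloudy weather, 2011, and weekdays are the base categories).
COEFFICIENTS = {
    "intercept": 22.64484868,
    "temp_f": 4.719272988,
    "wind_speed": 0.472977819,
    "humidity": -2.761685582,
    "spring": 5.434096586,
    "summer": -22.72542203,
    "fall": 57.34216178,
    "year_2012": 74.01171663,
    "weekend": -2.408165173,
    "precipitation": -2.303412336,
    "mist": 12.12551319,
}

def predict(features):
    """Sum the intercept and each coefficient times its feature value."""
    total = COEFFICIENTS["intercept"]
    for name, value in features.items():
        total += COEFFICIENTS[name] * value
    return total

# Partly cloudy summer weekend in 2012: indicator variables are 1 when they apply,
# and omitted (treated as 0) otherwise. Wind speed of 5 is an assumed input.
scenario = {"temp_f": 70, "wind_speed": 5, "humidity": 60,
            "summer": 1, "year_2012": 1, "weekend": 1}

estimate = predict(scenario)
print(round(estimate, 2))
```

Leaving out a key is equivalent to setting that variable to zero, which is how the base categories are represented in a model built on indicator variables.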
Wind Speed, Humidity, and Cost Saving
Given the high costs associated with collecting wind speed and humidity data, it is preferable to collect humidity data if only one can be chosen. Table 1 shows the regression statistics for a model including all independent variables except wind speed, and Table 2 shows the regression statistics for a model including all independent variables except humidity.
Table 1: Regression Statistics (all independent variables except wind speed)
Multiple R          0.571134679
R Square            0.326194821
Adjusted R Square   0.325085169
Standard Error      150.5968234
Observations        5475
Table 2: Regression Statistics (all independent variables except humidity)
Multiple R          0.517876242
R Square            0.268195802
Adjusted R Square   0.266990635
Standard Error      156.9444952
Observations        5475
The model including humidity describes a greater share of the variance in bike rentals than the model including wind speed, as indicated by their respective R-Squared values (.33 vs .27). The standard error for the model including humidity is also slightly smaller (151 vs 157).
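The relationship between R Square and Adjusted R Square in the two tables can be reproduced with the standard adjustment formula. In the sketch below, the predictor count of 9 is an assumption (the ten variables of the full model minus the one dropped); it is not stated in the report, but it is the count that matches the tabulated values. The function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

N_OBS = 5475       # "Observations" from both tables
N_PREDICTORS = 9   # assumed: full model's ten variables minus the one dropped

with_humidity = adjusted_r_squared(0.326194821, N_OBS, N_PREDICTORS)
with_wind_speed = adjusted_r_squared(0.268195802, N_OBS, N_PREDICTORS)

print(round(with_humidity, 4), round(with_wind_speed, 4))
print("keep humidity" if with_humidity > with_wind_speed else "keep wind speed")
```

With more than five thousand observations the adjustment is tiny, so the comparison between the two models is driven almost entirely by the raw R Square values.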
Exploratory Analysis of Regression
Effects on Holidays
One factor worth considering is the impact of holidays on total rentals. Because most businesses close on major holidays, this should affect each user category in a similar manner. The regression table below shows a model describing total rentals with the independent variable Holiday.
            Coefficients    Standard Error   t Stat         P-value        Lower 95%       Upper 95%
Intercept   191.2469484     2.510500291      76.17881942    0              186.3253698     196.1685269
Holiday     -42.47361502    15.16724522      -2.80035131    0.005122605    -72.20744507    -12.73978498
Regression of All Users with independent variable Holiday
All Rentals = 191.25 - (42.47)*Holiday
Holiday proves to be statistically significant and has an obvious impact on rentals. The mean for All Users is 190, so a holiday experiences a 22% (42 out of 190) drop in rentals compared to a non-holiday (all else equal).
User Sensitivity to Changes in Season
A final factor to consider is the impact of season on different user categories. Given the leisure and transportation trends of each category, it would be useful to see which group is more sensitive to changes in season (Philadelphia experiences very distinct seasons, so the data will prove meaningful). The regression tables below summarize models of casual and registered rentals with independent variables for seasons.
            Coefficients    Standard Error   t Stat         P-value        Lower 95%      Upper 95%
Intercept   14.18134328     1.291788632      10.97806788    9.47393E-28    11.64892384    16.71376273
Spring      31.44462708     1.801876064      17.45104877    2.06816E-66    27.91223341    34.97702074
Summer      36.66224832     1.813255026      20.21902479    9.98099E-88    33.10754736    40.21694928
Fall        16.25844729     1.827889509      8.894655402    7.83491E-19    12.67505693    19.84183766
Regression of Casual Users with independent variables Spring, Summer, and Fall (Winter base)
Casual Rentals = 14.18 + (31.44)*Spring + (36.66)*Summer + (16.26)*Fall
            Coefficients    Standard Error   t Stat         P-value        Lower 95%      Upper 95%
Intercept   95.35522388     4.069529538      23.43151045    8.7167E-116    87.37732759    103.3331202
Spring      66.09572884     5.67646106       11.6438267     5.70592E-31    54.9676077     77.22384997
Summer      97.95904114     5.712308272      17.14876657    2.9926E-64     86.76064522    109.1574371
Fall        71.15113364     5.758411373      12.3560352     1.30358E-34    59.8623573     82.43990997
Regression of Registered Users with independent variables Spring, Summer, and Fall (Winter base)
Registered Rentals = 95.36 + (66.10)*Spring + (97.96)*Summer + (71.15)*Fall
The magnitudes of casual and registered rentals are very different; therefore, we will consider the coefficients as a percentage of the average winter rentals (base) per user category.
Coefficients as % of Average Winter Rentals
Season   Casual Users   Registered Users
Spring   222%           69%
Summer   259%           103%
Fall     115%           75%
Coefficients as a percentage of average winter rentals by user category
Because the intercept in the regression models is the average winter rentals, taking the other coefficients as a percentage of the intercept shows the percentage change in average rentals from the base season. The above table summarizes these findings and shows that casual users are far more sensitive to changes in season than registered users in a city with distinct climate changes per season.
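The percentages in the table can be reproduced by dividing each seasonal coefficient by its model's intercept. The sketch below copies the coefficients from the two seasonal regression tables; the dictionary layout and the function name `percent_of_winter` are hypothetical.

```python
# Intercepts are the average winter rentals (winter is the base season).
WINTER_BASE = {"casual": 14.18134328, "registered": 95.35522388}

# Seasonal coefficients copied from the casual and registered regression tables.
SEASON_COEFFICIENTS = {
    "casual": {"Spring": 31.44462708, "Summer": 36.66224832, "Fall": 16.25844729},
    "registered": {"Spring": 66.09572884, "Summer": 97.95904114, "Fall": 71.15113364},
}

def percent_of_winter(segment):
    """Express each seasonal coefficient as a whole-number % of the winter average."""
    base = WINTER_BASE[segment]
    return {season: round(100 * coefficient / base)
            for season, coefficient in SEASON_COEFFICIENTS[segment].items()}

for segment in ("casual", "registered"):
    print(segment, percent_of_winter(segment))
```

Normalizing by the winter average puts the two segments on the same scale, which is what allows their seasonal sensitivity to be compared despite the large difference in rental volume.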
Conclusion
Our analysis gives insight into the key drivers of demand variation, the environmental drivers of demand, and the difference in demand patterns between users. The largest demand for bike rentals occurs during summer, at temperatures between 60 and 85 degrees Fahrenheit, when it is clear to partly cloudy. Our data suggests that casual users rent bikes more during the weekend for leisure, while registered users rent more during weekday commuting hours. Average rentals by registered users are over four times those of casual users.
In order to make a more informed decision about expansion within Philadelphia as well as in other cities, we suggest that PhillyCycle further explore factors such as distance traveled by user segment, the age distribution within each user segment, and the popularity of each kiosk. These suggestions could help PhillyCycle gain a deeper understanding of its customers’ needs and better determine future marketing and pricing strategies.
We cannot make a recommendation for expansion until further data is collected, specifically the aforementioned customer data and revenue data. Given that expanding is such a significant cost, an understanding of the company’s revenue-generating ability and operating costs is crucial to making an informed decision. When entering a new city, factors other than weather could affect bike rental demand, such as traffic and the effect of the city’s infrastructure on bikers. We acknowledge the importance of weather and temperature to bike rentals but stress that these are not the only relevant factors.
Appendix Elevator Charts
We chose the Average Rentals by Day of Week chart because it shows the difference in average rentals between weekdays and weekends for casual, registered, and all users. We felt this chart is important to emphasize because it clearly shows that average rentals increase on the weekends for casual users and decrease on the weekends for registered users. We decided to arrange this graph first because it shows the broadest and most significant difference between user rental demands.
We chose the Average Weekday Rentals chart because it supports the idea that registered users use the bikes as work transportation: the peak times coincide with the most typical commuting times. This graph gives insight into the needs of users in order to better market PhillyCycle’s services. We arranged this graph second because it shows the underlying cause of the difference between user rental demands.
[Figure: Average Weekday Rentals by User Category. X-axis: Time of Day (0:00 to 23:00); Y-axis: Rentals; Series: Casual, Registered]
[Figure: Average Rentals by Day of Week per Customer Category. X-axis: Day of Week (Monday through Sunday); Y-axis: Average Rentals; Series: Casual, Registered, All Users]
We chose this pie chart because it breaks down the number of rentals by the user categories casual and registered. We felt this graph was important because 81% of rentals come from registered users, which is significantly greater than the share from casual users. This is important information for the company to understand where its business is coming from. We arranged this graph last because it explains the difference in magnitude between registered and casual users.
Notes on Data Preparation
Before analyzing the data set, we ensured that the data was free of errors, repetition, and significant outliers. All entries were complete and were not missing any data points. The data was then filtered for duplicates (6 were found out of the original 5489 entries). Lastly, significant and abnormal outliers were removed to prevent skewing of the central tendencies. Data with temperature points above 50 degrees Celsius (120 Fahrenheit) were considered erroneous and were removed (8 more entries were removed). The original data contained very few errors, and those errors were adjusted for. We have no concern regarding the data accuracy or quality.
[Figure: Proportion of Rentals by User Category. Casual 19%, Registered 81%]