Movie sales analysis.pptx

Movie Madness Applied Regression Analysis Flora Amores, Kristen Bierfeldt, Ben Bronfman, Jena Deng, Jennie Goldstein, Christiana Kwon, Jonathan Ragins

description

This presentation analyzes movie ticket sales for films released over a particular period and uses the results to predict ticket sales for future releases.

Transcript of Movie sales analysis.pptx

Page 1: Movie sales analysis.pptx

Movie Madness

Applied Regression Analysis

Flora Amores, Kristen Bierfeldt, Ben Bronfman, Jena Deng, Jennie Goldstein, Christiana Kwon, Jonathan Ragins

Page 2: Movie sales analysis.pptx

Objective & Approach

Objective

Identify factors that influence ticket sales

Create a model to help studios predict gross ticket sales

Approach

Descriptive Statistics: description of independent and dependent variables; histogram analysis; distribution test; correlation analysis

Regression Analysis: best subset regression analysis to identify relevant independent variables; residual analysis

Forecasting: forecasting gross domestic ticket sales for 2014 releases

Page 3: Movie sales analysis.pptx

Dataset

Data Field | Description | Source
Movie Title | Name of the film | Box Office Mojo
Total Gross | Total amount of money the film made while it was in theaters | Box Office Mojo
Awareness Score | Public’s awareness of top-billed celebrity; 0-100 rating | IMDB; E-Score
Appeal Score | Appeal of top-billed celebrity; 0-100 rating | IMDB; E-Score
Rotten Tomatoes Rating | Film critic and writer reviews | Rotten Tomatoes
Production Budget | Amount the studio spent on production of the film (excl. marketing) | IMDB
Studio | CBS, Disney, FilmDistrict, Focus, Fox, Lionsgate, Open Road, Paramount, Relativity, Screen Gems, Sony, Summit, Universal, Warner Bros, Weinstein | IMDB
Season | Season the film premiered in: Winter, Spring, Summer, Fall | Box Office Mojo
Genre | Action/Adventure, Animated, Comedy, Drama, Horror, Sci-Fi/Fantasy, Suspense/Thriller, Romance, Documentary | Movies.com
Rating | G, PG, PG-13, R | Movies.com
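The categorical fields (Studio, Season, Genre, Rating) enter a regression as 0/1 indicator columns, with one level per field left out as the baseline. The sketch below shows that coding for Season and Rating. The function name `dummy_code` is hypothetical, and the choice of Winter and R as baselines is an assumption, inferred from the fact that those levels do not appear as terms in the regression output.

```python
# Levels are taken from the Dataset slide. The baselines are an assumption.
SEASONS = ["Winter", "Spring", "Summer", "Fall"]
RATINGS = ["G", "PG", "PG-13", "R"]


def dummy_code(value, levels, baseline):
    """Return {level: 0 or 1} for every level except the baseline."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {level: int(level == value) for level in levels if level != baseline}


# A hypothetical PG-13 film released in the summer.
summer_pg13 = {
    **dummy_code("Summer", SEASONS, baseline="Winter"),
    **dummy_code("PG-13", RATINGS, baseline="R"),
}
print(summer_pg13)
```

A film at the baseline level gets zeros in every indicator column for that field, so its effect is absorbed by the intercept.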

Page 4: Movie sales analysis.pptx

Histograms – Revenue Minus Cost

Skewed right with long right tail. Even among the top films, very few achieve extreme success (>$300M). Most of the top 100 earned closer to $75M, compared to the mean production budget of about $74M.

Skewed right with long right tail. Many production budgets for top grossing films are over $40M (long right tail). However, about 30% are made with a <$40M budget, indicating a massive budget is not necessary for commercial success.

Page 5: Movie sales analysis.pptx

Histogram – Independent Variables

Fairly normal, but skewed left. Top grossing films tend to star actors with higher appeal scores.

Does not appear normal. Current stars or rising stars tend to be more common in top films. Lower awareness is more common in top grossing films.

Page 6: Movie sales analysis.pptx

Distribution Test

Skewed right with long right tail. Even among the top films, only very few are extremely successful (>$300M).

Actual blockbusters are rare: most of the top 100 earn closer to $75M. This compares to about $75M average production budget, indicating limited profitability even among the most successful films.

Page 7: Movie sales analysis.pptx

Full Model Output

Coefficients

Term | Coef | SE Coef | T-Value | P-Value | VIF
Constant | -96476280 | 92212802 | -1.05 | 0.299 |
Awareness Score | 31609 | 347267 | 0.09 | 0.928 | 1.76
Appeal Score | 1499880 | 842243 | 1.78 | 0.079 | 2.13
Rotten Tomatoes Rating (Tomato | 114213804 | 33548802 | 3.40 | 0.001 | 1.92
Production Budget | 0.887 | 0.184 | 4.82 | 0.000 | 3.51
CBS | -20713796 | 74706088 | -0.28 | 0.782 | 1.34
Disney | 37249040 | 40706931 | 0.92 | 0.363 | 3.29
FilmDistrict | 40710067 | 58480417 | 0.70 | 0.489 | 1.62
Focus | -80902310 | 57291818 | -1.41 | 0.163 | 1.56
Fox | -29008644 | 38769488 | -0.75 | 0.457 | 3.56
Lionsgate | 68249084 | 40197434 | 1.70 | 0.094 | 2.21
Open Road | 25477257 | 49761753 | 0.51 | 0.610 | 1.74
Paramount | 25497332 | 40556061 | 0.63 | 0.532 | 2.93
Relativity | 15707538 | 48630710 | 0.32 | 0.748 | 2.20
Screen Gems | 16569478 | 59756962 | 0.28 | 0.782 | 1.69
Sony | -14683522 | 37111355 | -0.40 | 0.694 | 3.77
Summit | -22185861 | 44280947 | -0.50 | 0.618 | 2.25
Universal | 24118387 | 37529997 | 0.64 | 0.523 | 4.35
Warner Bros. | 4413709 | 37927330 | 0.12 | 0.908 | 4.19
Fall | 9031477 | 24175518 | 0.37 | 0.710 | 2.35
Spring | 28732675 | 21196246 | 1.36 | 0.180 | 1.93
Summer | 10441133 | 20942975 | 0.50 | 0.620 | 2.27
Action/Adventure | -68845716 | 86491210 | -0.80 | 0.429 | 36.51
Animated | -14257945 | 77891882 | -0.18 | 0.855 | 14.38
Comedy | -16598316 | 84045022 | -0.20 | 0.844 | 29.34
Drama | -45802940 | 84873963 | -0.54 | 0.591 | 23.44
Horror | -16445679 | 85217610 | -0.19 | 0.848 | 11.44
Sci-Fi/Fantasy | -88908859 | 87255620 | -1.02 | 0.312 | 16.59
Suspense/Thriller | -35039393 | 92033894 | -0.38 | 0.705 | 5.97
Romance | -26475377 | 97313741 | -0.27 | 0.786 | 4.49
PG-13 | -4388273 | 17825515 | -0.25 | 0.806 | 1.92
PG | 14749424 | 41440261 | 0.36 | 0.723 | 5.30
G | -97902869 | 88703709 | -1.10 | 0.274 | 1.89

R2 = 63.4%

Adj. R2 = 45.9%

Several variables with extremely high VIF scores

Multiple insignificant variables
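To see what the high VIF scores mean, recall that a variance inflation factor is defined as VIF = 1 / (1 - R^2), where R^2 comes from regressing that one predictor on all the others. The sketch below inverts the formula for a few of the VIFs reported in the full model output. The function name is hypothetical.

```python
def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover the auxiliary-regression R^2."""
    if vif < 1:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


# VIF values copied from the full model output.
reported_vifs = {
    "Action/Adventure": 36.51,
    "Comedy": 29.34,
    "Drama": 23.44,
    "Production Budget": 3.51,
}
for term, vif in reported_vifs.items():
    share = r_squared_from_vif(vif)
    print(f"{term}: R^2 against the other predictors = {share:.3f}")
```

A VIF of 36.51 corresponds to an R^2 of roughly 0.97, meaning the Action/Adventure indicator is almost fully explained by the other predictors, which inflates its standard error.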

Page 8: Movie sales analysis.pptx

Correlation Matrix

Production Budget, Rotten Tomatoes Rating, and the Disney studio have the highest correlations with the dependent variable.

Action/Adventure and budget are highly correlated, likely due to special effects costs. Viewers also benefit the most from this genre’s in-theater experience.

Most pairs of independent variables have low correlation with each other (<40%).

Column order: Total Gross, Awareness Score, Appeal Score, Rotten Tomatoes Rating (Tomato Meter), Production Budget, CBS, Disney, FilmDistrict, Focus, Fox, Lionsgate, Open Road, Paramount, Relativity, Screen Gems, Sony, Summit, Universal, Warner Bros., Weinstein, Fall, Spring, Summer, Winter, Action/Adventure, Animated, Comedy, Drama, Horror, Sci-Fi/Fantasy, Suspense/Thriller, Romance, Documentary, PG-13, PG, G

Total Gross

Awareness Score 12.75%

Appeal Score 30.39% 35.33%

Rotten Tomatoes Rating (Tomato Meter) 36.15% -2.26% 26.98%

Production Budget 60.26% 20.61% 21.12% 6.98%

CBS -4.24% 9.12% 11.91% -2.69% -7.04%

Disney 36.46% 4.45% 8.05% 7.22% 35.52% -3.16%

FilmDistrict -1.53% -13.55% -2.34% -5.98% -7.93% -1.44% -4.49%

Focus -12.15% -3.93% 10.50% 20.10% -12.46% -1.44% -4.49% -2.04%

Fox -3.86% 10.40% 8.61% 0.08% 7.33% -3.53% -11.06% -5.02% -5.02%

Lionsgate 2.30% -7.81% -5.65% -11.21% -15.93% -2.54% -7.95% -3.61% -3.61% -8.88%

Open Road -13.68% -18.47% -7.09% -13.54% -13.39% -1.77% -5.53% -2.51% -2.51% -6.18% -4.44%

Paramount 8.42% 11.73% 0.48% 3.47% 9.00% -2.96% -9.27% -4.21% -4.21% -10.37% -7.45% -5.19%

Relativity -12.48% -4.58% -1.50% -24.87% -13.21% -2.05% -6.42% -2.92% -2.92% -7.18% -5.16% -3.59% -6.02%

Screen Gems -11.07% -19.95% -8.12% -12.70% -6.30% -1.44% -4.49% -2.04% -2.04% -5.02% -3.61% -2.51% -4.21% -2.92%

Sony -7.85% 9.13% -7.66% 4.97% -4.33% -3.89% -12.16% -5.52% -5.52% -13.59% -9.77% -6.80% -11.40% -7.89% -5.52%

Summit -9.06% 0.05% 2.85% 4.21% -4.41% -2.31% -7.21% -3.28% -3.28% -8.07% -5.80% -4.03% -6.77% -4.68% -3.28% -8.87%

Universal -1.80% 6.14% 0.43% 1.08% -6.94% -4.22% -13.21% -6.00% -6.00% -14.77% -10.61% -7.39% -12.39% -8.57% -6.00% -16.24% -9.64%

Warner Bros. 12.74% -9.81% -4.27% 11.97% 19.56% -4.06% -12.69% -5.76% -5.76% -14.18% -10.19% -7.10% -11.90% -8.24% -5.76% -15.60% -9.26% -16.95%

Weinstein -11.71% 3.04% -0.87% -1.14% -16.98% -2.31% -7.21% -3.28% -3.28% -8.07% -5.80% -4.03% -6.77% -4.68% -3.28% -8.87% -5.26% -9.64% -9.26%

Fall 5.97% 3.23% 26.65% 26.38% -13.91% 19.49% 9.52% 10.17% 10.17% -10.28% -2.69% -9.07% -6.15% 14.53% 10.17% -5.33% -0.56% -1.03% -6.65% -0.56%

Spring 9.19% -3.71% -21.54% -11.25% 15.15% -5.49% -0.58% 9.17% -7.81% -4.03% -3.80% 4.32% 10.16% 0.97% -7.81% 0.07% -1.64% -2.99% 5.34% -1.64% -28.18%

Summer 2.74% -3.11% 11.00% -0.95% 12.23% -6.74% 1.59% -9.58% 5.87% 10.99% -7.83% -11.79% -11.80% -13.68% 5.87% 19.10% 4.46% 2.12% -2.12% -5.46% -34.56% -36.63%

Winter -17.48% 3.89% -15.88% -12.86% -14.70% -5.80% -10.09% -8.25% -8.25% 1.85% 14.59% 16.92% 8.51% 0.00% -8.25% -15.45% -2.65% 1.62% 3.33% 7.95% -29.77% -31.55% -38.70%

Action/Adventure 16.57% 27.07% 32.24% -3.22% 37.74% -6.27% 11.52% 7.00% -8.91% -0.57% -6.38% -10.97% 6.24% -1.36% -8.91% 2.38% 6.13% 23.70% -18.74% -14.31% 0.66% 8.26% 6.36% -15.43%

Animated 25.80% 3.23% 0.57% 0.80% 19.67% -3.53% 22.45% -5.02% -5.02% 18.28% -8.88% -6.18% -10.37% 9.13% -5.02% 5.42% -8.07% -5.82% -14.18% 6.60% 5.41% -4.03% 10.99% -12.92% -21.92%

Comedy -16.02% -1.72% -2.18% -18.14% -30.97% 18.92% -8.27% -7.59% 9.66% 4.47% 6.91% 4.81% 2.14% 1.48% -7.59% -6.17% 9.97% -8.79% -0.56% -1.11% -3.67% -11.82% 6.16% 8.36% -33.12% -18.67%

Drama -8.25% 7.34% 0.22% 31.11% -22.34% -4.39% -4.19% -6.23% 13.25% -6.63% 11.95% 8.31% -2.82% -8.91% -6.23% -0.65% -10.01% -18.33% 13.84% 27.53% 4.29% -4.41% -17.46% 18.90% -27.22% -15.34% -23.18%

Horror -10.06% -29.61% -23.51% -3.22% -24.96% -2.76% -8.63% 24.08% -3.92% -9.65% 9.57% -4.82% -8.09% -5.60% 24.08% 1.05% -6.29% 10.43% 0.23% -6.29% 5.10% -5.68% -1.44% 2.26% -17.11% -9.65% -14.57% -11.97%

Sci-Fi/Fantasy -1.48% -23.31% -8.75% -8.56% 26.92% -3.35% -10.48% -4.76% -4.76% -1.07% -8.42% 13.68% 14.74% -6.80% 19.05% -12.89% -7.65% 4.67% 15.37% -7.65% -9.00% 13.47% -0.72% -3.85% -20.79% -11.72% -17.70% -14.55% -9.15%

Suspense/Thriller -4.80% 8.55% -0.77% 2.79% -7.84% -1.77% -5.53% -2.51% -2.51% -6.18% -4.44% -3.09% -5.19% -3.59% -2.51% 10.63% 22.86% -7.39% 9.80% -4.03% 5.33% 18.25% -11.79% -10.15% -10.97% -6.18% -9.34% -7.68% -4.82% -5.86%

Romance -8.57% -8.30% -11.33% -0.60% -10.55% -1.44% -4.49% -2.04% -2.04% -5.02% -3.61% -2.51% -4.21% 33.53% -2.04% -5.52% -3.28% -6.00% 14.82% -3.28% -7.37% -7.81% -9.58% 24.74% -8.91% -5.02% -7.59% -6.23% -3.92% -4.76% -2.51%

Documentary -8.29% -8.92% -35.07% 2.98% -9.80% -1.01% -3.16% -1.44% -1.44% -3.53% -2.54% -1.77% -2.96% -2.05% -1.44% 26.00% -2.31% -4.22% -4.06% -2.31% -5.18% -5.49% 14.99% -5.80% -6.27% -3.53% -5.34% -4.39% -2.76% -3.35% -1.77% -1.44%

PG-13 10.84% 10.61% 1.73% -3.37% 25.33% 10.46% 4.76% 0.57% -13.73% -20.98% 17.87% -5.16% 1.18% -9.40% 0.57% -7.38% 14.69% -6.73% 13.15% 5.51% -0.39% 14.08% -3.81% -9.25% 15.87% -33.78% -2.71% 7.21% -10.67% 14.68% 6.57% 0.57% -9.66%

PG 14.74% -3.57% -13.42% -5.35% 13.80% -4.22% 16.15% -6.00% -6.00% 38.94% -10.61% -7.39% -12.39% 5.72% -6.00% 8.74% -9.64% -9.80% -16.95% 3.21% -1.03% -2.99% 8.17% -4.85% -13.72% 74.74% -15.55% -18.33% -11.53% -4.67% -7.39% -6.00% 23.92% -40.36%

G 19.41% 7.07% 3.78% 9.03% 30.09% -1.01% 31.96% -1.44% -1.44% -3.53% -2.54% -1.77% -2.96% -2.05% -1.44% -3.89% -2.31% -4.22% -4.06% -2.31% -5.18% -5.49% 14.99% -5.80% -6.27% 28.59% -5.34% -4.39% -2.76% -3.35% -1.77% -1.44% -1.01% -9.66% -4.22%

R -26.27% -9.86% 7.39% 5.62% -42.86% -7.54% -23.59% 4.17% 19.05% -6.39% -10.18% 11.24% 8.60% 5.95% 4.17% 1.98% -7.65% 15.17% -0.24% -7.65% 2.25% -11.29% -5.23% 14.43% -5.01% -26.37% 15.49% 7.05% 20.25% -11.11% -0.98% 4.17% -7.54% -72.06% -31.51% -7.54%
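Each entry in a matrix like this is a pairwise correlation coefficient. The sketch below assumes the Pearson coefficient was used (the slides do not say) and runs it on hypothetical toy columns, not on the film data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical indicator columns for four films, each from a different studio.
studio_a = [1, 0, 0, 0]
studio_b = [0, 1, 0, 0]
print(f"{pearson(studio_a, studio_b):.2%}")
```

Two mutually exclusive indicator columns are always negatively correlated, which is consistent with the negative values between every pair of studios in the matrix.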

Page 9: Movie sales analysis.pptx

Reduced Model - Method

Since Minitab handles only 31 independent variables, we ran a Best Subsets analysis after excluding the studios that had the lowest correlation with the dependent variable. The Best Subsets model had 13 variables, which we then used to run a regression.

After running the regression, we continued to remove insignificant variables from the model until we felt the best reduced model was reached.
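The removal step can be sketched as backward elimination: refit, drop the weakest variable, and repeat until every remaining variable clears a threshold. This is a stand-in for the manual process described above, not the team's actual procedure. The threshold of |t| >= 1.66 (roughly the two-sided 90% level at about 93 degrees of freedom), the function names, and the toy data are all assumptions.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_stats(columns, y):
    """Fit y on an intercept plus the named columns; return {name: t statistic}."""
    names = list(columns)
    n, k = len(y), len(names) + 1
    X = [[1.0] + [columns[name][i] for name in names] for i in range(n)]
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(X[r][a] * y[r] for r in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[r] - sum(X[r][a] * beta[a] for a in range(k)) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    return {name: beta[j + 1] / math.sqrt(sigma2 * inv[j + 1][j + 1])
            for j, name in enumerate(names)}


def backward_eliminate(columns, y, t_min=1.66):
    """Drop the variable with the smallest |t| until all remaining clear t_min."""
    kept = dict(columns)
    while kept:
        t = ols_t_stats(kept, y)
        weakest = min(t, key=lambda name: abs(t[name]))
        if abs(t[weakest]) >= t_min:
            break
        del kept[weakest]
    return list(kept)


# Hypothetical toy data: y depends on "budget" only; "noise" is unrelated.
budget = [float(i) for i in range(20)]
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]
wiggle = [(0.5, -0.5, -0.5, 0.5)[i % 4] for i in range(20)]
y = [3.0 * b + w for b, w in zip(budget, wiggle)]
print(backward_eliminate({"budget": budget, "noise": noise}, y))
```

A statistics package would report p-values directly; the fixed t threshold here only avoids needing a t-distribution function from outside the standard library.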

Page 10: Movie sales analysis.pptx

Reduced Model - Output

R2 = 53%

Adj. R2 = 50%

All variables statistically significant at the ~90% level

SUMMARY OUTPUT

Regression Statistics

Multiple R 0.73

R Square 0.53

Adjusted R Square 0.50

Standard Error 61,965,779.77

Observations 100.00

ANOVA

Source | df | SS | MS | F | Significance F
Regression | 6.00 | 398,577,185,982,704,000 | 66,429,530,997,117,400 | 17.30 | 0.00
Residual | 93.00 | 357,097,481,180,966,000 | 3,839,757,862,160,930 | |
Total | 99.00 | 755,674,667,163,671,000 | | |
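The summary statistics follow directly from the ANOVA sums of squares, which makes for a quick consistency check. The sketch below recomputes R Square, Adjusted R Square, F, and the Standard Error from the values in the ANOVA table, using the standard definitions.

```python
import math

# Values copied from the ANOVA table in the reduced model output.
ss_regression = 398_577_185_982_704_000
ss_residual = 357_097_481_180_966_000
ss_total = 755_674_667_163_671_000
df_regression, df_residual, df_total = 6, 93, 99

# R^2 is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total
# Adjusted R^2 compares mean squares, penalizing the number of predictors.
adj_r_squared = 1 - (ss_residual / df_residual) / (ss_total / df_total)
# F is the ratio of the regression mean square to the residual mean square.
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)
# The standard error of the regression is the root of the residual mean square.
standard_error = math.sqrt(ss_residual / df_residual)

print(f"R^2 = {r_squared:.2f}, adjusted R^2 = {adj_r_squared:.2f}")
print(f"F = {f_statistic:.2f}, standard error = {standard_error:,.0f}")
```

The recomputed values agree with the reported 0.53, 0.50, 17.30, and a standard error of about 61.97 million, to rounding.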

Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0%
Intercept | (65,694,307.46) | 35,415,935.25 | (1.85) | 0.07 | (136,023,336) | 4,634,721 | (136,023,336) | 4,634,721
Appeal Score | 964,102.15 | 592,906.32 | 1.63 | 0.11 | (213,292) | 2,141,497 | (213,292) | 2,141,497
Rotten Tomatoes Rating (Tomato Meter) | 101,543,955.22 | 24,678,272.79 | 4.11 | 0.00 | 52,537,796 | 150,550,114 | 52,537,796 | 150,550,114
Production Budget | 0.62 | 0.11 | 5.90 | 0.00 | 0.41 | 0.83 | 0.41 | 0.83
Animated | 35,240,004.48 | 20,513,077.46 | 1.72 | 0.09 | (5,494,902) | 75,974,911 | (5,494,902) | 75,974,911
Disney | 40,129,621.38 | 23,533,412.34 | 1.71 | 0.09 | (6,603,072) | 86,862,314 | (6,603,072) | 86,862,314
Focus | (78,441,939.30) | 45,822,939.55 | (1.71) | 0.09 | (169,437,216) | 12,553,337 | (169,437,216) | 12,553,337

Predictive Equation: Total Gross = -65,694,307 + 964,102 × Appeal Score + 101,543,955 × Rotten Tomatoes Rating + 0.62 × Production Budget + 35,240,004 × Animated + 40,129,621 × Disney - 78,441,939 × Focus
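The predictive equation can be applied as a small function. The sketch below uses the coefficients from the reduced model's coefficient table (which lists Animated at 35,240,004.48). Two things are assumptions: the Rotten Tomatoes Rating is taken as a fraction between 0 and 1, which the size of its coefficient suggests but the slides do not state, and the example film is hypothetical.

```python
# Coefficients copied from the reduced model's coefficient table.
COEFFICIENTS = {
    "Intercept": -65_694_307.46,
    "Appeal Score": 964_102.15,
    "Rotten Tomatoes Rating": 101_543_955.22,
    "Production Budget": 0.62,
    "Animated": 35_240_004.48,
    "Disney": 40_129_621.38,
    "Focus": -78_441_939.30,
}


def predict_total_gross(appeal, tomato_rating, budget, animated=0, disney=0, focus=0):
    """Predicted total gross in dollars.

    appeal: 0-100 appeal score of the top-billed celebrity.
    tomato_rating: Rotten Tomatoes rating as a fraction in [0, 1] (assumed scale).
    budget: production budget in dollars.
    animated, disney, focus: 0/1 indicator variables.
    """
    c = COEFFICIENTS
    return (
        c["Intercept"]
        + c["Appeal Score"] * appeal
        + c["Rotten Tomatoes Rating"] * tomato_rating
        + c["Production Budget"] * budget
        + c["Animated"] * animated
        + c["Disney"] * disney
        + c["Focus"] * focus
    )


# Hypothetical film: appeal 70, 80% rating, $100M budget, no indicators set.
print(f"${predict_total_gross(70, 0.80, 100_000_000):,.0f}")
```

With a residual standard error of about $62M, any single prediction from this equation carries wide uncertainty.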

Page 11: Movie sales analysis.pptx

Residuals

Our assumption of linearity does not hold

Heteroscedasticity present
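Heteroscedasticity is usually judged from a plot of residuals against fitted values, but it can also be checked numerically. The sketch below is in the spirit of a Breusch-Pagan test: it regresses the squared residuals on the fitted values and reports n × R^2, which is compared with a chi-square critical value. It is a simplified illustration on hypothetical toy data, not the check used in the slides, and the function name is hypothetical.

```python
def lm_statistic(fitted, residuals):
    """n * R^2 from regressing squared residuals on the fitted values."""
    n = len(fitted)
    squared = [e * e for e in residuals]
    mean_f = sum(fitted) / n
    mean_s = sum(squared) / n
    sxx = sum((f - mean_f) ** 2 for f in fitted)
    syy = sum((s - mean_s) ** 2 for s in squared)
    sxy = sum((f - mean_f) * (s - mean_s) for f, s in zip(fitted, squared))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation to explain
    # With one regressor, R^2 equals the squared correlation.
    return n * (sxy * sxy) / (sxx * syy)


CHI2_1DF_5PCT = 3.84  # 5% critical value of chi-square with 1 degree of freedom

# Hypothetical toy data: 20 fitted values with two residual patterns.
fitted = [float(i) for i in range(1, 21)]
fanning = [f if i % 2 == 0 else -f for i, f in enumerate(fitted)]  # spread grows
steady = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]          # constant spread

for label, resid in (("fanning", fanning), ("steady", steady)):
    stat = lm_statistic(fitted, resid)
    verdict = "heteroscedastic" if stat > CHI2_1DF_5PCT else "no evidence"
    print(f"{label}: LM = {stat:.2f} ({verdict})")
```

A statistic above the critical value suggests the residual variance changes with the fitted value, which is the fan-shaped pattern that heteroscedasticity produces in a residual plot.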

Page 12: Movie sales analysis.pptx

How Did We Do?

Using our final model, we predicted ticket sales for movies released and closed in 2014

Page 13: Movie sales analysis.pptx

Conclusions

Model indicates that Rotten Tomatoes Rating and Production Budget are significant contributors to Total Gross variability

Studio has little impact on the variability of Total Gross, with the exception of Disney (positive impact) and Focus (negative impact)

- Disney: People go to see a movie because it’s a Disney movie, which drives up ticket sales

- Focus: This indie film studio has smaller distribution and awareness

Ticket sales are difficult to predict

Movie goers are irrational

Action/Adventure films have high production budgets but do not consistently generate high ticket sales. The genre is more prone to outliers in the form of blockbusters

Page 14: Movie sales analysis.pptx

QUESTIONS?

Page 15: Movie sales analysis.pptx

Appendix

Page 16: Movie sales analysis.pptx

Additional Data

In addition to the previous data, we also aggregated the following data that we did not include in our analysis:

Data Field | Description | Reason Not Included
Theaters (Opening) | Total number of theaters that showed the movie on its opening weekend | Several outliers that had limited openings
Opening | Film’s gross on its opening weekend | Opening gross is not necessarily representative of total movie performance, especially for limited openings
Theaters | Total number of theaters that ever showed the movie | Minimal variance within the data
Open | Date of opening | Opening date reflected within “Seasons” variable
Close | Date of closing | Closing date inherently reflected within total gross numbers