Quiz 1. Name: No books, notes, or electronic...
Quiz 1. Name:___________________________________________________________________ No books, notes, or electronic devices.
1.(10) What is the slope of the linear approximation to a function f(x)? A. The integral of f(x) B. The derivative of f(x) C. Always 0 D. Always 1/10
2.(10) In the classic regression model, what distribution is assumed for p(y|x)? A. Normal B. Generic C. Bernoulli D. Poisson
3.(10) What happens when you select a smaller "smoothing" parameter (f) in LOWESS? A. The curve is more smooth B. The curve is less smooth C. The curve is flat D. The curve is exponential
4.(30) In the taxpayer document, Y = charitable contributions and X = income. State the correct meaning of the expression E( Y | X=100,000) in terms of charitable contributions and income. One sentence is enough; no more than two.
5. (10) What is another name for the "R-squared" statistic? A. Correlation coefficient B. Residual variance C. t-statistic D. Coefficient of determination
6. (10)The Gauss-Markov Theorem states that certain estimators of the regression coefficients are "good." Which ones are "good"? A. Least squares B. Maximum likelihood C. Method of moments D. Bayesian posterior means
Quiz 2. Name:_____________________________________________ Closed books, notes, and no electronic devices.
1. (20) Suppose you test H0: β1 = 0 and get a p-value p = .30. Which is the confidence interval for β1? A. (.10, .50) B. (.20, .40) C. (.30, .30) D. (-3.4, 5.6)
2. (20) Why is it wrong to report p ≤ 0.034 rather than p = 0.034? A. because the p-value is not 0.001 B. because p ≤ 0.034 indicates a one-sided test C. because p ≤ 0.034 indicates significance while p = 0.034 indicates insignificance D. because the results are not easily explained by chance alone
3. (40) Suppose that, in the Toluca case, E(Workhours|Lotsize = 80) = 200. Explain this result in terms of the production process and the Law of Large Numbers.
Quiz 3. Name:_____________________________________________ Closed books, notes, and no electronic devices.
4. (40) I gave four concerns (problems) with using testing methods to evaluate regression assumptions. State two of them.
5. (40) I gave five reasons that you might want to use transformations. State two of them.
Quiz 4. Name:_____________________________________________ Closed books, notes, and no electronic devices.
6. (40) The first equation was Price = 24723 – 0.17 Mileage. Does the fact that b1 is close to zero (here b1 = -0.17) mean mileage is not very important? Do not refer to p-values, tests, correlations, or R-squared statistics in your answer.
7. (40) Draw a typical scatterplot of residuals versus fitted values that clearly shows heteroscedasticity. (It’s not necessary to reproduce the specific graph shown in the reading.)
Quiz 5. Name:_________________________________________ Closed notes, books, and no electronic devices.
1. (40) Perform the following matrix multiplication:
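$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} =$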
2.(20) Which matrix function can tell you that a matrix is not invertible? A. cofactor B. eigenvector C. determinant D. linear combination
3. (20) The partial regression plot is a scatterplot showing __________ on one axis and ___________ on the other axis. A. residuals, residuals B. residuals, fitted values C. fitted values, slopes D. slopes, intercepts
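As a way to check a hand calculation like the one in question 1, here is a minimal Python sketch of matrix multiplication using only the standard library. The helper name `matmul` and the entries of `A` are illustrative assumptions, not taken from the quiz; the point is only to show the row-by-column rule and the dimension check.

```python
def matmul(left, right):
    """Multiply two matrices given as lists of rows (row-by-column rule)."""
    inner = len(right)
    # Each row of `left` must have as many entries as `right` has rows.
    assert all(len(row) == inner for row in left), "dimensions do not conform"
    n_cols = len(right[0])
    return [
        [sum(row[k] * right[k][j] for k in range(inner)) for j in range(n_cols)]
        for row in left
    ]

identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[1, 4], [2, 5], [3, 6]]       # illustrative 3x2 matrix (assumed entries)

product = matmul(identity3, A)     # (3x3)(3x2) gives a 3x2 result
```

Multiplying in the other order, `matmul(A, identity3)`, would fail the dimension check here, because a 3x2 matrix cannot be followed by a 3x3 one.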
Quiz 6. Name:_________________________________________ Closed notes, books, and no electronic devices.
In the Carvalho document, he reported the regression equation Sales = 116 – 97.7P1 + 109P2, where
Sales = Your sales of a product
P1 = Your price for the product
P2 = Competitors’ price for the same product
Give the correct interpretation of the coefficient
b1 = -97.7
in a sentence or two, as was done in the document. Make sure your interpretation correctly addresses the fact that this coefficient is a negative number.
Quiz 7. Name: _______________________________________ Closed notes, no electronic devices.
1.(20) Inference between observational units is called _______________; inference within observational units is called _______________. A. predictive, causal B. correlational, causal C. causal, correlational D. correlational, predictive
2.(20) In causal inference, you need to consider outcomes Y that would have happened, had you set the treatment variable to some different number. These outcomes are called A. predictions B. residuals C. slopes D. counterfactuals
3.(20) The true regression model is Y = β0 + β1T + β2X + ε. The confounder X is related to the treatment via X = γ0 + γ1T + ν. Give the slope of the regression of Y on T alone, in terms of these models’ parameters.
4.(20) The book describes three methods for estimating causal effects. Name one of these methods. (Two or three words only. Do not describe the method).
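The two equations in question 3 can be explored by simulation. The sketch below is only an illustration under stated assumptions: the parameter values, the sample size, the standard normal errors, and the helper `ols_slope` are all hypothetical, and the errors ε and ν are assumed independent of T. It substitutes the confounder equation into the outcome equation and compares the implied slope on T with the slope actually estimated when X is left out.

```python
import random

def ols_slope(x, y):
    """Least-squares slope from regressing y on x alone."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

random.seed(0)

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1, beta2 = 1.0, 2.0, 3.0
gamma0, gamma1 = 0.5, 0.8
n = 20000

T = [random.gauss(0, 1) for _ in range(n)]
X = [gamma0 + gamma1 * t + random.gauss(0, 1) for t in T]            # X = γ0 + γ1·T + ν
Y = [beta0 + beta1 * t + beta2 * x + random.gauss(0, 1)               # Y = β0 + β1·T + β2·X + ε
     for t, x in zip(T, X)]

slope_y_on_t = ols_slope(T, Y)

# Substituting X into the Y equation gives
# Y = (β0 + β2·γ0) + (β1 + β2·γ1)·T + (β2·ν + ε).
implied_slope = beta1 + beta2 * gamma1   # about 4.4 with these values
```

With a large sample, `slope_y_on_t` should land close to `implied_slope` rather than close to `beta1`, which is the sense in which omitting the confounder changes what the slope on T measures.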
Quiz 8. Name: _______________________________________ Closed notes, no electronic devices.
Define multicollinearity in one or two sentences. (Don’t tell me what it does, just tell me what it is.)
Quiz 9. Name:_______________________________
Here is a model equation:
$Y = \alpha + \beta_1 * \text{dose} + \beta_2 * \text{sex} + \beta_3 * \text{dose} * \text{sex}$
Assume Males are coded as 1 and Females as 0.
Using the model equation, show why the effect of dose on Y among males is $\beta_1 + \beta_3$.
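A numerical check of this kind of claim can be sketched in a few lines of Python. The sketch assumes the model reads Y = α + β1·dose + β2·sex + β3·dose·sex, with sex = 1 for males and 0 for females; the coefficient values and the function name `mean_y` are hypothetical, chosen only for illustration.

```python
def mean_y(dose, sex, alpha, b1, b2, b3):
    """Model value for a given dose and sex code (1 = male, 0 = female)."""
    return alpha + b1 * dose + b2 * sex + b3 * dose * sex

# Hypothetical coefficient values, for illustration only.
alpha, b1, b2, b3 = 10.0, 1.5, -2.0, 0.7

# Effect of dose = change in Y for a one-unit increase in dose, holding sex fixed.
effect_males = mean_y(6, 1, alpha, b1, b2, b3) - mean_y(5, 1, alpha, b1, b2, b3)
effect_females = mean_y(6, 0, alpha, b1, b2, b3) - mean_y(5, 0, alpha, b1, b2, b3)
```

Setting sex = 1 turns the interaction term into β3·dose, so dose is multiplied by β1 + β3; setting sex = 0 removes the interaction term and leaves β1 alone.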
Quiz 10. Name:_______________________________
Closed notes, no electronic devices.
The variance bias trade-off refers to the estimated regression function $\hat{f}(x)$.
Variance refers to the variability (i.e., multiple possible values) of $\hat{f}(x)$.
Why is there variability (i.e., multiple possible values) of $\hat{f}(x)$?
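One way to see this variability is to simulate it. The following sketch is an illustration under assumptions, not the course's own example: the data-generating process (a line with intercept 2.0, slope 0.5, and standard normal noise), the sample size, and the helper names `fit_line` and `fhat_at` are all hypothetical. It draws many data sets from the same process, fits a least-squares line to each, and records the fitted value at one fixed x.

```python
import random
import statistics

def fit_line(x, y):
    """Least-squares intercept and slope for a straight-line fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fhat_at(x0, n=30):
    """Draw one new data set from the assumed process and return the fitted value at x0."""
    x = [random.uniform(0, 10) for _ in range(n)]
    y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]   # assumed true process
    intercept, slope = fit_line(x, y)
    return intercept + slope * x0

random.seed(1)
estimates = [fhat_at(5.0) for _ in range(200)]   # 200 independent data sets

spread = statistics.stdev(estimates)    # variability of the fitted value at x = 5
center = statistics.mean(estimates)     # should sit near the true value 2.0 + 0.5*5 = 4.5
```

Each data set yields a different fitted value at the same x, even though the underlying process never changes.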
Quiz 11. Name:_______________________________
Closed notes, no electronic devices.
True regression functions are never straight lines. Instead they are always curved, to some degree. Explain why this is true in an example from your field of study as follows:
1. Name your Y and X variables from your field of study. Y = ____________________________________________________
X = _____________________________________________________
2. Define the true regression function in terms of your Y and X variables.
3. Explain, in terms of your Y and X variables, why the true regression function is not precisely linear.
Quiz 12. Name: _______________________________________________
Here is a model.
Price_i = β0 + β1 dn1_i + β2 dn2_i + ε_i
Recall: There are three neighborhoods, and i = 1,…, 128 identifies a house in the data set.
dn1 = the dummy variable for neighborhood 1
dn2 = the dummy variable for neighborhood 2.
What does the model say about the distribution of house prices in neighborhood 2? (One or two sentences maximum.)
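To make the role of the dummy variables concrete, here is a small Python sketch. The coefficient values and the function name `mean_price` are hypothetical; the only structural assumption, which follows from having three neighborhoods and two dummies, is that neighborhood 3 is the one with dn1 = 0 and dn2 = 0.

```python
def mean_price(neighborhood, b0, b1, b2):
    """Model mean of Price for a house in the given neighborhood (1, 2, or 3)."""
    dn1 = 1 if neighborhood == 1 else 0
    dn2 = 1 if neighborhood == 2 else 0
    return b0 + b1 * dn1 + b2 * dn2

# Hypothetical coefficient values, for illustration only.
b0, b1, b2 = 150000.0, -20000.0, 35000.0

means = {k: mean_price(k, b0, b1, b2) for k in (1, 2, 3)}
```

In this sketch the three neighborhoods differ only in their means; the error term ε_i, and therefore the spread around each mean, is described by the same distribution in every neighborhood.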
Quiz 14. Name: ____________________________________
Suppose there are outliers in Y|X space. What does this tell you about p(y|x)?
Quiz 15. Name: ___________________________________________________ Closed notes, no electronic devices.
1. Which assumptions are needed for typical quantile regression? (Select all that apply, 8 points per correct selection/non-selection) A. A model for the data-generating process B. Linearity C. Constant variance D. Normality E. Ordinary least squares
2. In the reading, the variable τ was used to indicate a (pick one) A. mean B. probability C. variance D. slope
3. In the food expenditure / income example, the slope of the 0.1 quantile regression function was ________________ the slope of the 0.9 quantile regression function. (pick one) A. Greater than B. Less than C. Approximately equal to
Quiz 16. Name:__________________________________________________
1. If a process is stationary, then the means and variances are identical for all time points t. What else is identical, for all time points t, when the process is stationary?
2. Several stationary processes were discussed in the article. Name one.
Quiz 17. Name:__________________________________________________
The document states that the covariance matrix of the error terms (the εi terms) is Σ.
1. What do different diagonal elements of Σ tell you?
2. What do non-zero off-diagonal elements of Σ tell you?
Quiz 18. Name:__________________________________________________
The document discusses “pitch.” What is “pitch”?
Quiz 19. Name:__________________________________________________
Answer on the lines provided only. (Be brief).
Give an example of panel data as follows, either from the reading or of your own choosing.
Y = ____________________________________________________________________________
X = _____________________________________________________________________________
The variables Y and X are subscripted by “i” and “t” as follows: Yit, Xit.
In your chosen example,
i refers to ____________________________________________________________________________
t refers to ____________________________________________________________________________
Quiz 20. Name:__________________________________________________
Suppose you can observe data in five groups as follows:
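Group 1: data Y11, Y12, …, Y1n1; true (“population”) mean µ1
Group 2: data Y21, Y22, …, Y2n2; true (“population”) mean µ2
Group 3: data Y31, Y32, …, Y3n3; true (“population”) mean µ3
Group 4: data Y41, Y42, …, Y4n4; true (“population”) mean µ4
Group 5: data Y51, Y52, …, Y5n5; true (“population”) mean µ5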
According to the article, you should not use the ordinary average of the data in Group 1 to estimate µ1, you should instead use a “shrinkage” estimate to estimate µ1. Describe briefly what is the “shrinkage estimate” of µ1. Do not use formulas.
Quiz 21. Name:__________________________________________________
Write down the equation of a simple multilevel model. It should be clear from your equation why it is called “multilevel.”
Quiz 22. Name:__________________________________________________
Suppose your level-one model is
Yij = β0j + β1jXij + εij,
for j = 1, …, J level-2 observational units, and i = 1, …, nj level-1 observation units within level-2 unit j.
Your theory states that the level-two terms β0j and β1j have the following models:
β0j = γ00 + u0j and β1j = γ10 + γ11 Zj + u1j
where Zj is a level-2 predictor variable.
Write the single equation that is your multilevel model. Identify which terms are fixed effects and which terms are random effects in that single-equation model.
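The substitution that the question asks for can be written out as a short worked derivation. This is a sketch that uses only the symbols defined above; the grouping of terms into a fixed part and a random part is the conventional reading and is stated here as an assumption about what the course expects.

```latex
\begin{align*}
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + \varepsilon_{ij} \\
       &= (\gamma_{00} + u_{0j}) + (\gamma_{10} + \gamma_{11} Z_j + u_{1j}) X_{ij} + \varepsilon_{ij} \\
       &= \underbrace{\gamma_{00} + \gamma_{10} X_{ij} + \gamma_{11} Z_j X_{ij}}_{\text{fixed part}}
        + \underbrace{u_{0j} + u_{1j} X_{ij} + \varepsilon_{ij}}_{\text{random part}}
\end{align*}
```

Under this reading, the coefficients γ00, γ10, and γ11 are the fixed effects, u0j and u1j are the level-2 random effects, and εij is the level-1 error; the product Zj·Xij is a cross-level interaction that appears only after the substitution.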
Quiz 24. Name:__________________________________________
Regression models assume that Y is randomly sampled from (or produced by) distributions p(y|x).
Classic regression models assume that these distributions p(y|x) are normal distributions.
What are the distributions p(y|x) when Y is a binary random variable?
Quiz 25. Name:__________________________________________
Regression models assume that Y is randomly sampled from (or produced by) distributions p(y|x).
Classic regression models assume that these distributions p(y|x) are normal distributions.
What are the distributions p(y|x) when Y is a nominal random variable?
Quiz 26. Name:__________________________________________
Regression models assume that Y is randomly sampled from (or produced by) distributions p(y|x).
Classic regression models assume that these distributions p(y|x) are normal distributions.
Suppose your dependent variable is Y = count of financial advisors a person has used in their life. The first 10 Y observations in your data set are: 0 0 0 3 0 1 1 1 0 1.
What distributions p(y|x) will you assume to produce Y in this case?
Quiz 27. Name:__________________________________________
Regression models assume that Y is randomly sampled from (or produced by) distributions p(y|x).
Classic regression models assume that these distributions p(y|x) are normal distributions.
Suppose your dependent variable is
Y = time waiting on the phone for customer service.
You will analyze these data using parametric survival analysis methods.
What distributions p(y|x) will you assume to produce Y in this case?
Quiz 28. Name:__________________________________________
Regression models assume that Y is randomly sampled from (or produced by) distributions p(y|x).
Classic regression models assume that these distributions p(y|x) are normal distributions.
What distributions p(y|x) will you assume to produce Y when you use a sample selection model?