Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

13
Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Transcript of Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Page 1: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Performance of Resampling Variance

Estimation Techniques with Imputed Survey data.

Page 2: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

• The Jackknife variance estimation based on adjusted imputed values proposed by Rao and Shao (1992).

• The Bootstrap procedure proposed by Shao and Sitter (1996)

Page 3: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Performance

We carry out a Montecarlo study.

For each replication, we compute:

• Relative bias

• Relative mean square error

• The 95% confidence interval based on the normal distribution

Page 4: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Imputation methods

• Ratio and mean imputation

• For each method we consider several fractions of missing data, with and without covariates

Page 5: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

1 case: Structural Business Survey

• Population : Annual Industrial Business Survey (completely enumerates enterprises with 20 or more employees) of size N=16,438

• The variable to impute: Turnover

• Auxiliary variable: total expense

Page 6: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

2 case:Retail Trade Index Survey

• Population : sample of businesses from the Retail Trade Index Survey of size N=9,414

• The variable to impute: Turnover

• Auxiliary variable: the same month year ago turnover

Page 7: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

1 case:Montecarlo study

• Simple random samples without replacement of sizes n=100, 500, 1000 and 5000

• Non-response in the turnover variable is randomly generated (response mechanism uniform)

• A loss of about 30% is simulated

Page 8: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

2 case:Montecarlo study• Stratified random samples without

replacement of sizes n=800, 1500, 2200 and 3000

• Non-response in the turnover variable is randomly generated (response mechanism uniform)

• Missing data are generated following a distribution similar to the true missing value pattern observed in the survey.

Page 9: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Montecarlo study

Number of replications is 200,000 for each auxiliary variable, imputation method and sample size

Page 10: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Results (I)

• The performance of the jackknife variance estimator is better for larger sample sizes and for ratio imputation.

• The jackknife variance performs poorly. This shows that strong skewness and kurtosys of imputed variable can influence considerably the results.

Page 11: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Results (II)

• The relative bias is large for small sizes, then decreases and increases again when the sampling fraction becomes non-negligible

• The coverage rate is not close to the nominal one even for large samples. (Due to the skewed and heavy-tailed distributions of the variables)

Page 12: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Conclusions (I)

• Ratio imputation should be used instead of mean whenever auxiliary variable are avalaible.

• In these examples, the stratification of the sample doesn’t improve the quality of the the jackknife variance estimator

Page 13: Performance of Resampling Variance Estimation Techniques with Imputed Survey data.

Conclusions (II)

• The percentile bootstrap performs better than the jackknife for coverage rate of the confidence intervals and the reverse is true for mean square errors and bias of the variance.