Post on 29-Mar-2015
Survey of Electronic Commerce and Technology: Past, Present and Future Challenges
Jason Raymond
Third International Conference on Establishment Surveys
June 2007
Outline
Description of the survey
Methodology
Improvements to the sample design
Weighted Outliers
Future challenges
Description of the survey
Annual survey in place since 1999
Cross-economy survey
Some exceptions at sub-industry level
Domains of interest: NAICS, size (number of employees)
Description of the survey
Two-page questionnaire with questions on:
Use of information and communications technologies (Internet, intranet, web site, …)
Use of electronic commerce for the purchase and sale of goods and services
Barriers to electronic commerce
Types of questions:
Mostly categorical
Some numerical: total sales over the Internet, percentages
Methodology
Sampling
Universe:
Statistics Canada’s Business Register
List of public units
Target population
Fixed thresholds of exclusion:
$100,000 or $250,000 in gross business income, depending on industry
Covers approximately 95% of income in each industry
Around 700,000 businesses
Methodology
Sampling
Stratification:
NAICS3, NAICS4
Size:
0 to 19 employees (take-some stratum)
20 to 99 employees (take-some stratum)
100 to 499 employees (take-some stratum)
500 employees and more (take-all stratum)
Public/private sector
Methodology
Sampling
Neyman allocation
Sample selection
Sample size: around 19,000 enterprises
Maximum overlap between two consecutive years: Kish and Scott method (1971)
Approximately 70% overlap
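A minimal sketch of Neyman allocation for the take-some strata, assuming the textbook form n_h = n · N_h S_h / Σ N_h S_h. The stratum labels, population counts and standard deviations below are made-up illustration values, not survey figures, and the function name is hypothetical. Rounding, capping at N_h and the take-all stratum are left out.

```python
def neyman_allocation(n_total, strata):
    """Split n_total across strata in proportion to N_h * S_h.

    strata maps a stratum label to (N_h, S_h): population count and
    standard deviation of the allocation variable (illustrative inputs).
    Returns real-valued sample sizes; rounding is left to the caller.
    """
    products = {h: N_h * S_h for h, (N_h, S_h) in strata.items()}
    denom = sum(products.values())
    return {h: n_total * p / denom for h, p in products.items()}


# Hypothetical take-some strata for one industry
strata = {"0-19": (5000, 10.0), "20-99": (1000, 50.0), "100-499": (200, 250.0)}
alloc = neyman_allocation(300, strata)
```

In this toy case every stratum has the same N_h · S_h, so each receives the same share of the sample even though the population counts differ widely.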
Outlier detection
Variables:
Sales over the Internet
Year-over-year difference in sales over the Internet
Method: variant of the sigma-gap method
Distance measure between observations
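A rough sketch of a basic sigma-gap rule, to illustrate the idea of a distance measure between ordered observations. The survey uses a variant whose details are not given here, so the starting point (the median), the multiplier k and the use of the population standard deviation are all assumptions.

```python
import statistics


def sigma_gap_outliers(values, k=1.0):
    """Flag the upper tail that follows the first large gap.

    Values are sorted; above the median, the first gap between
    consecutive values exceeding k * sigma marks that value and
    everything larger as outliers. k and the median start point
    are assumed parameters, not taken from the survey.
    """
    ordered = sorted(values)
    sigma = statistics.pstdev(ordered)
    median = statistics.median(ordered)
    for i in range(1, len(ordered)):
        gap = ordered[i] - ordered[i - 1]
        if ordered[i] > median and gap > k * sigma:
            return ordered[i:]
    return []


flagged = sigma_gap_outliers([10, 12, 11, 13, 12, 14, 500])
```

The same function could be applied to weighted values (weight × sales) instead of raw values.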
Methodology
Partial nonresponse (8.3%): imputation
Deductive (1%)
Historical (0.1%)
Administrative (0.02%)
Donor (7.2%)
Total nonresponse (31%): reweighting
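A minimal sketch of reweighting for total nonresponse, assuming a single adjustment class in which respondent weights are inflated so that they carry the weight of the nonrespondents. The adjustment classes actually used are not stated here; the field names and weights are illustrative.

```python
def reweight_for_nonresponse(units):
    """Return adjusted weights for the respondents in one adjustment class.

    units is a list of dicts with 'weight' (design weight) and
    'responded' (bool). Each respondent weight is multiplied by
    (sum of all weights) / (sum of respondent weights).
    """
    total_w = sum(u["weight"] for u in units)
    resp_w = sum(u["weight"] for u in units if u["responded"])
    factor = total_w / resp_w
    return [u["weight"] * factor for u in units if u["responded"]]


# Hypothetical class: four sampled units, one total nonrespondent
units = [
    {"weight": 10.0, "responded": True},
    {"weight": 10.0, "responded": True},
    {"weight": 10.0, "responded": True},
    {"weight": 10.0, "responded": False},
]
adjusted = reweight_for_nonresponse(units)
```

The adjusted weights sum to the original total of the design weights, so the class still represents the same number of population units.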
Methodology
Estimation using Statistics Canada’s Generalized Estimation System (GES)
Types of estimates:
Means
Totals
Proportions
Ratios
Data quality measures based on CVs and imputation rates
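As an illustration of a CV-based quality measure, the sketch below estimates a total and its coefficient of variation under stratified simple random sampling without replacement. This is the textbook estimator, not a description of what GES computes internally. It ignores nonresponse and imputation, and the data are invented.

```python
import math
import statistics


def stratified_total_and_cv(strata):
    """Estimate a population total and its CV under stratified SRS.

    strata is a list of (N_h, sample_values). Take-all strata
    (n_h == N_h) add to the total but not to the sampling variance.
    """
    total = 0.0
    variance = 0.0
    for N_h, sample in strata:
        n_h = len(sample)
        total += N_h * statistics.mean(sample)
        if n_h < N_h:
            s2 = statistics.variance(sample)  # sample variance, needs n_h >= 2
            variance += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    return total, math.sqrt(variance) / total


# Hypothetical data: one take-some stratum and one take-all stratum
total, cv = stratified_total_and_cv([(100, [1, 2, 3, 4]), (3, [50, 60, 70])])
```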
Improvements to the sample design
When?
Current sample design tested in 2004 in parallel with the original design and adopted in 2005
Why?
Improve the comparability of estimates over time
Need for estimates by size of enterprise
Target population
Original sampling design:
Units accounting for 95% of the total income
Drawback: unstable population over time
New sampling design:
Fixed thresholds of exclusion: $100,000 or $250,000 depending on the industry
Improvements to the sample design
Stratification and allocation
Original sampling design:
NAICS3, NAICS4
Lavallée-Hidiroglou: 2 take-some strata and 1 take-all stratum
Auxiliary variable: gross business income
Drawback: not efficient for estimates by size (number of employees)
Improvements to the sample design
New sampling design
Stratification:
NAICS3, NAICS4
Size:
0 to 19 employees (take-some stratum)
20 to 99 employees (take-some stratum)
100 to 499 employees (take-some stratum)
500 employees and more (take-all stratum)
Public/private
Neyman allocation
Weighted Outliers
Only a small proportion of firms sell over the Internet (8% of the private sector and 16% of the public sector)
Moderate values with large weights sometimes significantly influence estimates
Previously, outlier detection was applied only to unweighted values of sales over the Internet
Weighted Outliers
Weighted outlier detection and treatment implemented in 2006
Same detection method as for unweighted values (variant of the sigma-gap method)
Treatment methods studied:
Hidiroglou/Srinath
Winsorization
Dalén and Tambay
Promotion to own stratum
Hidiroglou/Srinath (1981)
Weight reduction method
Minimizes MSE of the estimator for the total
Requires population characteristics that are unknown and may not be reliably estimable
Weighted Outliers
Winsorization
Reduces values larger than a certain cutoff to the cutoff itself (the cutoff depends on the outlier detection method)
Modified to a weight reduction method
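A small sketch of Winsorization and of one way it could be recast as a weight reduction. The recasting is an assumption: the reported value is kept and the weight is shrunk so that weight × value equals what the Winsorized value would have contributed. The modification actually applied is not spelled out here, and the function names are illustrative.

```python
def winsorize(values, cutoff):
    """Replace every value above the cutoff with the cutoff itself."""
    return [min(v, cutoff) for v in values]


def winsorized_weight(weight, value, cutoff):
    """Weight-reduction form (assumed): keep the value, shrink the weight.

    The reduced weight satisfies reduced_weight * value == weight * cutoff
    for values above the cutoff; other units keep their weight.
    """
    if value <= cutoff:
        return weight
    return weight * cutoff / value


capped = winsorize([1, 5, 100], cutoff=10)
reduced = winsorized_weight(weight=20, value=100, cutoff=10)
```

In this example the unit's weighted contribution drops from 20 × 100 = 2000 to 200 under either form.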
Weighted Outliers
Dalén (1987) and Tambay (1988)
Cross between Winsorization and weight reduction
The cutoff for weighted outlier detection is determined for each stratum
Outlier value is split into two parts:
Portion less than the cutoff, which receives the same new weight as the non-outliers
Portion greater than the cutoff, which is allocated a weight of 1
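The split described above can be sketched as follows. The sketch assumes the non-outlier weight is simply passed in as `weight` (any adjustment to that weight is not modelled) and that the stratum cutoff is already known. Names and numbers are illustrative.

```python
def dalen_tambay_contribution(weight, value, cutoff):
    """Weighted contribution of one unit to an estimated total.

    Non-outliers contribute weight * value. For an outlier, the part
    up to the cutoff keeps the non-outlier weight, and the excess
    over the cutoff is counted with a weight of 1.
    """
    if value <= cutoff:
        return weight * value
    return weight * cutoff + 1.0 * (value - cutoff)


contribution = dalen_tambay_contribution(weight=20, value=100, cutoff=10)
```

With these numbers the unit contributes 290, compared with 2000 untreated, 200 under plain Winsorization and 100 if it were given a weight of 1.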
Weighted Outliers
Promotion to own stratum
Outliers assigned a weight of 1
Remaining units in stratum have their weights adjusted
Outlier represents only itself during estimation
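A minimal sketch of promotion to own stratum, assuming simple random sampling with equal design weights N/n within the stratum. The text only says the remaining weights are adjusted, so the adjustment (N - k)/(n - k) for k promoted outliers is an assumption, as are the function name and numbers.

```python
def promote_outliers(population_size, outlier_flags):
    """Return new weights after promoting flagged units to a take-all stratum.

    outlier_flags has one bool per sampled unit. Promoted outliers get
    weight 1; the other sampled units share the rest of the population,
    (N - k) / (n - k).
    """
    n = len(outlier_flags)
    k = sum(outlier_flags)
    if k >= n:
        raise ValueError("at least one non-outlier must remain in the stratum")
    rest = (population_size - k) / (n - k)
    return [1.0 if flag else rest for flag in outlier_flags]


# Hypothetical stratum: N = 100, five sampled units, one outlier
weights = promote_outliers(100, [False, False, False, True, False])
```

The new weights still sum to the stratum population size, but the outlier now represents only itself.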
Implemented method: Dalén and Tambay
Fewer assumptions
Nice compromise:
Impact on the estimates is reduced
Not as drastic as promotion to own stratum
Method performed well using 2005 data
Additional empirical studies needed to confirm effectiveness of the method (simulations?)
Future challenges
Response burden
Maximising overlap = increased response burden?
Minimal effect on response rates
Conditioning effect?
Sample rotation:
Ease response burden
Control sample overlap for longitudinal analysis
Statistics Canada’s Business Register redesign
Sampling elements based on operating structure vs. statistical structure
Certain modeled variables replaced by administrative data
For more information, please contact
www.statcan.ca
Jason Raymond
613-951-1917
Jason.Raymond@statcan.ca