Annika Lindblom
Alex Teterukovsky
Statistics Sweden
On coordination of stratified Pareto πps and simple random samples
The paper focuses on:
• Presentation of the sampling designs in the SAMU: stratified SRS and Pareto πps
• Sample co-ordination, in particular the implementation of the Pareto πps design
• Overlap between πps and SRS samples:
- theoretical findings for SRS
- empirical findings based on surveys in practice
The SAMU
• A system for co-ordination of frame populations and samples from the Business Register at Statistics Sweden since 1972
• Three main objectives:
- obtain comparable statistics
- ensure high precision in estimates of change over time
- spread the response burden
Inclusion probabilities
The frame population is divided into H disjoint strata $U_h$, h = 1, ..., H, where $U_h$ contains $N_h$ units. A sample of fixed size $n_h$ is to be drawn from each $U_h$. Inclusion probabilities are:
Stratified SRS: $\pi_k = \dfrac{n_h}{N_h}$, the same for all units within a stratum
(Pareto) πps: $\lambda_k = \dfrac{n_h x_k}{\sum_{j=1}^{N_h} x_j}$, unique for each unit, where $x_k$ is the size measure for unit k
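As a small illustration of the two formulas, the sketch below computes the SRS probability and the desired πps probabilities for one stratum. The size measures are made up and the function names are hypothetical; nothing here is taken from the SAMU implementation.

```python
def srs_inclusion_probability(n_h, N_h):
    """Stratified SRS: every unit in the stratum gets n_h / N_h."""
    return n_h / N_h


def pips_desired_probabilities(sizes, n_h):
    """pi-ps: n_h * x_k / sum of x_j over the stratum, one value per unit."""
    total = sum(sizes)
    return [n_h * x / total for x in sizes]


# Made-up size measures for a stratum of five units, sample size 2.
sizes = [120.0, 40.0, 25.0, 10.0, 5.0]
n_h = 2

srs = srs_inclusion_probability(n_h, len(sizes))
lam = pips_desired_probabilities(sizes, n_h)

print(srs)   # same value for every unit
print(lam)   # one value per unit; the values sum to n_h
```

In this toy stratum the largest unit gets a desired probability above 1, which is the situation the Pareto procedure handles by sampling the unit with certainty.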
Permanent random numbers
• A permanent random number $u_k$, uniformly distributed over the interval (0,1), is attached to each unit k in the Business Register
• For SRS we choose the starting point and the direction, and sample the necessary number of units
[Figure: different blocks]
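The SRS selection from a starting point can be sketched as follows. This is a minimal illustration, not the SAMU code: the function name `prn_srs` is hypothetical, the random numbers are generated here only for the example, and wrapping around at the end of the interval is an assumption.

```python
import random


def prn_srs(prns, n, start, direction="right"):
    """Return the n units met first when walking from `start` in `direction`.

    prns maps a unit id to its permanent random number in (0, 1).
    Wrapping around the end of the interval is assumed.
    """
    def distance(u):
        # Distance travelled from the starting point to u.
        if direction == "right":
            return (u - start) % 1.0
        return (start - u) % 1.0

    ordered = sorted(prns, key=lambda unit: distance(prns[unit]))
    return set(ordered[:n])


# Illustrative random numbers; in a register they are permanent, not redrawn.
rng = random.Random(1)
prns = {f"unit{k}": rng.random() for k in range(20)}

sample_a = prn_srs(prns, 5, start=0.0, direction="right")
sample_b = prn_srs(prns, 5, start=0.2, direction="right")
print(sorted(sample_a & sample_b))  # units selected in both blocks
```

Because both samples reuse the same random numbers, the distance between starting points controls how many units they share.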
Pareto πps sampling procedure
1. Compute the desired inclusion probabilities within each stratum: $\lambda_k = \dfrac{n_h x_k}{\sum_{j=1}^{N_h} x_j}$
2. If $\lambda_k > 1$ then the unit is sampled with probability 1
3. For other units calculate the ranking variable: $q_k = \dfrac{u_k (1 - \lambda_k)}{\lambda_k (1 - u_k)}$
4. The sample consists of the units with the $n_h$ smallest q-values within stratum h
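The four steps above can be sketched for a single stratum as below. The function name and the toy data are hypothetical. Two simplifications are assumptions of this sketch: certainty units are given the ranking value 0 so that they always fall among the smallest, and the probabilities of the remaining units are not recomputed after certainty units are identified, which a production implementation may do.

```python
def pareto_pips_sample(sizes, prns, n_h):
    """Select n_h units from one stratum by Pareto pi-ps ranking.

    sizes: unit id -> positive size measure x_k
    prns:  unit id -> permanent random number u_k in (0, 1)
    """
    total = sum(sizes.values())
    q = {}
    for unit, x in sizes.items():
        lam = n_h * x / total                 # step 1: desired probability
        if lam > 1:
            q[unit] = 0.0                     # step 2: certainty unit (assumed rank 0)
        else:
            u = prns[unit]
            q[unit] = (u * (1 - lam)) / (lam * (1 - u))   # step 3: ranking variable
    return sorted(q, key=q.get)[:n_h]         # step 4: n_h smallest q-values


# Made-up stratum of five units.
sizes = {"A": 120.0, "B": 40.0, "C": 25.0, "D": 10.0, "E": 5.0}
prns = {"A": 0.62, "B": 0.15, "C": 0.48, "D": 0.91, "E": 0.03}

sample = pareto_pips_sample(sizes, prns, n_h=2)
print(sample)
```

Unit E has a very small random number but also a very small size, so its ranking value ends up larger than that of the bigger unit B.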
Random number transformation is necessary.
The objective of the transformation is to select the $n_h$ units with the smallest q-values within a stratum h independently of what starting point S is chosen.
Transform $u_k$ into $z_k$ as follows.
Sampling direction right: $z_k = \mathrm{Mod}(u_k - S,\ 1)$
Sampling direction left: $z_k = \mathrm{Mod}(S - u_k,\ 1)$
Pareto and starting points
$q_k = \dfrac{z_k (1 - \lambda_k)}{\lambda_k (1 - z_k)}$
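A minimal sketch of the transformation combined with the ranking variable is given below. The names `transform_prn` and `pareto_sample` are hypothetical, the desired probabilities and random numbers are made up, and all desired probabilities are assumed to lie strictly between 0 and 1 (certainty units already set aside).

```python
def transform_prn(u, start, direction):
    """Shift the random number u so that distances are measured from `start`."""
    if direction == "right":
        return (u - start) % 1.0
    if direction == "left":
        return (start - u) % 1.0
    raise ValueError("direction must be 'right' or 'left'")


def pareto_sample(lams, prns, n_h, start, direction):
    """Rank units by q_k computed from the transformed numbers z_k."""
    q = {}
    for unit, lam in lams.items():
        z = transform_prn(prns[unit], start, direction)
        q[unit] = (z * (1 - lam)) / (lam * (1 - z))
    return sorted(q, key=q.get)[:n_h]


# Made-up desired probabilities (summing to n_h = 2) and random numbers.
lams = {"B": 0.8, "C": 0.6, "D": 0.4, "E": 0.2}
prns = {"B": 0.15, "C": 0.48, "D": 0.91, "E": 0.03}

from_zero = pareto_sample(lams, prns, 2, start=0.0, direction="right")
from_07 = pareto_sample(lams, prns, 2, start=0.7, direction="right")
print(from_zero, from_07)
```

With starting point 0.0 and direction right the transformed numbers equal the original ones, so the ranking reduces to the untransformed Pareto ranking. Moving the starting point to 0.7 brings unit D, whose random number lies just to the right of 0.7, into the sample.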
Coordination and overlap
• Theoretical. SRS/SRS.
• Empirical. SRS/SRS.
• Empirical. Pareto/Pareto.
• Empirical. SRS/Pareto. Same surveys.
• Empirical. SRS/Pareto. Different surveys.
• Empirical. SRS/SRS and Pareto/Pareto over time.
Theoretical overlap. SRS/SRS.
Coordinate 2 equal SRS samples (h is stratum):
sample sizes $n_h$
frame population $N_h$
completely enumerated units $m_h$
What is the expected overlap for all a’s and b’s?
GEOMETRY!
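One way to read the geometric argument is sketched below for a single case. It rests on assumptions that the slides do not spell out: a and b (with a ≤ b) are taken to be the starting points of the two samples, both samples are drawn to the right, the completely enumerated units belong to both samples, and each sample covers approximately an arc of length $R_h$ of the random number interval among the remaining units.

```latex
\begin{align*}
R_h &= \frac{n_h - m_h}{N_h - m_h}, \\
\text{sample 1} &\approx [a,\ a + R_h], \qquad \text{sample 2} \approx [b,\ b + R_h], \\
\ell &= R_h - (b - a) \quad \text{when } b - a < R_h < 1 - (b - a), \\
\mathrm{E}[\text{overlap}] &\approx m_h + \ell\,(N_h - m_h) = n_h - (b - a)(N_h - m_h).
\end{align*}
```

The other cases follow in the same way by working out the common arc length when the arcs wrap around the end of the interval or do not meet at all.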
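Theoretical overlap. SRS/SRS.
| Type | Condition | Expected overlap |
|---|---|---|
| Ia | $b-a < R_h < 1-(b-a)$ | $n_h - (b-a)(N_h - m_h)$ |
| IIa | $1-(b-a) < R_h < b-a$ | $n_h + (b-a-1)(N_h - m_h)$ |
| IIIa | $\max(b-a,\ 1-(b-a)) < R_h$ | $2n_h - N_h$ |
| IVa | $R_h < \min(b-a,\ 1-(b-a))$ | $m_h$ |
where $R_h = \dfrac{n_h - m_h}{N_h - m_h}$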
Theoretical overlap. SRS/SRS.
| Type | Condition | Expected overlap |
|---|---|---|
| Ib | $0.5(b-a) < R_h < b-a$ | $2n_h - (b-a)N_h - (1-b+a)m_h$ |
| IIb | $b-a < R_h < 0.5(1+(b-a))$ | $m_h + (b-a)(N_h - m_h)$ |
| IIIb | $\max(b-a,\ 0.5(1+(b-a))) < R_h$ | $2n_h - N_h$ |
| IVb | $R_h < 0.5(b-a)$ | $m_h$ |
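The two tables can be collected into one function, sketched below. The reading behind it is an assumption: a and b are the starting points with a ≤ b, the "a" types apply when both samples are drawn in the same direction, and the "b" types apply when the first sample is drawn to the right from a and the second to the left from b. This reading agrees with the formulas but is not stated on the slides. The function name is hypothetical, and boundary cases (equalities) are resolved arbitrarily because the tables use strict inequalities.

```python
def expected_overlap(n, N, m, a, b, same_direction):
    """Expected overlap of two equal SRS samples in one stratum.

    n: sample size, N: frame size, m: completely enumerated units,
    a <= b: starting points of the two samples (assumed interpretation).
    """
    d = b - a
    R = (n - m) / (N - m)
    if same_direction:
        if R > max(d, 1 - d):
            return 2 * n - N                      # type IIIa
        if R > d:
            return n - d * (N - m)                # type Ia
        if R > 1 - d:
            return n + (d - 1) * (N - m)          # type IIa
        return m                                  # type IVa
    if R > max(d, 0.5 * (1 + d)):
        return 2 * n - N                          # type IIIb
    if R > d:
        return m + d * (N - m)                    # type IIb
    if R > 0.5 * d:
        return 2 * n - d * N - (1 - d) * m        # type Ib
    return m                                      # type IVb


# Made-up stratum: 100 units, 10 completely enumerated, sample size 50.
print(expected_overlap(50, 100, 10, 0.0, 0.2, same_direction=True))
print(expected_overlap(50, 100, 10, 0.0, 1.0, same_direction=False))
```

For a stratified sample the function would be applied stratum by stratum and the results summed, since $R_h$ differs between strata.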
Empirical overlap. SRS/SRS.
Wages and salaries, private sector (stratified by number of employees)
N = 66 083, n = 7 497; SRS: 1 732 completely enumerated (23% of the sample)
| SAMU block 1 (start, direction) | SAMU block 2 (start, direction) | Actual units | Actual % | Expected units | Expected % |
|---|---|---|---|---|---|
| 0.0 Right | 1.0 Left | 2 258 | 30 | 2 258 | 30 |
| 0.0 Right | 0.2 Right | 3 111 | 41 | 3 103 | 41 |
| 0.0 Right | 0.7 Right | 2 613 | 35 | 2 616 | 35 |
| 0.0 Right | 0.7 Left | 2 956 | 39 | 2 927 | 39 |
| 0.7 Right | 0.7 Left | 2 258 | 30 | 2 258 | 30 |
Turnover in retail trade (stratified by turnover)
N = 31 732, n = 2 405; SRS: 246 completely enumerated (10% of the sample)
| SAMU block 1 (start, direction) | SAMU block 2 (start, direction) | Actual units | Actual % | Expected units | Expected % |
|---|---|---|---|---|---|
| 0.0 Right | 1.0 Left | 444 | 18 | 444 | 18 |
| 0.0 Right | 0.2 Right | 657 | 27 | 639 | 27 |
| 0.0 Right | 0.7 Right | 549 | 23 | 521 | 22 |
| 0.0 Right | 0.7 Left | 611 | 25 | 606 | 25 |
| 0.7 Right | 0.7 Left | 444 | 18 | 444 | 18 |
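The comparison of actual and expected overlap can be imitated in miniature with synthetic data. The sketch below is only an illustration and does not reproduce the survey figures: it assumes a single stratum with no completely enumerated units, two blocks drawn to the right from made-up starting points 0.0 and 0.2, and freshly generated random numbers in each replication. The expected value uses the type Ia formula with $m_h = 0$.

```python
import random


def prn_srs(prns, n, start):
    """Indices of the first n units met when walking right from `start`."""
    order = sorted(range(len(prns)), key=lambda k: (prns[k] - start) % 1.0)
    return set(order[:n])


def mean_overlap(N, n, a, b, reps, seed=0):
    """Average number of units shared by the two samples over `reps` replications."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        prns = [rng.random() for _ in range(N)]   # synthetic random numbers
        total += len(prn_srs(prns, n, a) & prn_srs(prns, n, b))
    return total / reps


N, n, a, b = 1000, 300, 0.0, 0.2          # made-up stratum and starting points
simulated = mean_overlap(N, n, a, b, reps=200)
expected = n - (b - a) * N                # type Ia with no completely enumerated units
print(round(simulated, 1), expected)
```

Individual replications scatter around the expected value, which is in line with the small differences between the actual and expected columns in the tables.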
Empirical overlap. Pareto/Pareto.
Wages and salaries, private sector (stratified by number of employees)
N = 66 083, n = 7 497; completely enumerated: Pareto 1 659, SRS 1 732
| SAMU block 1 (start, direction) | SAMU block 2 (start, direction) | Pareto units | Pareto % | SRS units | SRS % |
|---|---|---|---|---|---|
| 0.0 Right | 1.0 Left | 2 265 | 30 | 2 258 | 30 |
| 0.0 Right | 0.2 Right | 3 005 | 40 | 3 111 | 41 |
| 0.0 Right | 0.7 Right | 2 594 | 35 | 2 613 | 35 |
| 0.0 Right | 0.7 Left | 2 814 | 38 | 2 956 | 39 |
| 0.7 Right | 0.7 Left | 2 299 | 31 | 2 258 | 30 |
Turnover in retail trade (stratified by turnover)
N = 31 732, n = 2 405; completely enumerated: Pareto 287, SRS 246
| SAMU block 1 (start, direction) | SAMU block 2 (start, direction) | Pareto units | Pareto % | SRS units | SRS % |
|---|---|---|---|---|---|
| 0.0 Right | 1.0 Left | 466 | 19 | 444 | 18 |
| 0.0 Right | 0.2 Right | 705 | 29 | 657 | 27 |
| 0.0 Right | 0.7 Right | 566 | 24 | 549 | 23 |
| 0.0 Right | 0.7 Left | 613 | 25 | 611 | 25 |
| 0.7 Right | 0.7 Left | 461 | 19 | 444 | 18 |
Empirical overlap. SRS/Pareto. Same surveys.
Use of Information and Communication in Enterprises (stratified by number of employees)
N = 29 124, n = 4 355; completely enumerated: SRS 1 214, Pareto 1 116
| SAMU block start | Direction | Overlap SRS/Pareto, units | Overlap SRS/Pareto, % |
|---|---|---|---|
| 0.0 | Right | 3 593 | 83 |
| 0.2 | Right | 3 595 | 83 |
| 0.7 | Left | 3 605 | 83 |
| 0.7 | Right | 3 601 | 83 |
| 1.0 | Left | 3 559 | 82 |
Stock of goods in the Wholesale and Retail Trade (optimally stratified by turnover)
N = 9 322, n = 1 753; completely enumerated: SRS 382, Pareto 402
| SAMU block start | Direction | Overlap SRS/Pareto, units | Overlap SRS/Pareto, % |
|---|---|---|---|
| 0.0 | Right | 1 418 | 81 |
| 0.2 | Right | 1 406 | 80 |
| 0.7 | Left | 1 397 | 80 |
| 0.7 | Right | 1 383 | 79 |
| 1.0 | Left | 1 405 | 80 |
Empirical overlap. SRS/Pareto. Different surveys.
Overlap between two different surveys (Inf and Sto) placed in different blocks
N (Inf) = 29 124, N (Sto) = 9 322
n (Inf) = 4 355, n (Sto) = 1 753
Completely enumerated: SRS (Inf) 1 214, Pareto (Inf) 1 116, SRS (Sto) 382, Pareto (Sto) 402
| SAMU block 1 (survey, start, direction) | SAMU block 2 (survey, start, direction) | Pareto units | Pareto % | SRS units | SRS % |
|---|---|---|---|---|---|
| Inf 0.0 Right | Sto 1.0 Left | 241 | 14 | 200 | 11 |
| Inf 0.0 Right | Sto 0.2 Right | 359 | 21 | 348 | 20 |
| Inf 0.0 Right | Sto 0.7 Right | 263 | 15 | 215 | 12 |
| Inf 0.0 Right | Sto 0.7 Left | 275 | 16 | 244 | 14 |
| Inf 0.7 Right | Sto 0.7 Left | 235 | 13 | 188 | 11 |
Overlap between two different surveys (Inf and Sto) placed in different blocks
N (Inf) = 29 124, N (Sto) = 9 322
n (Inf) = 4 355, n (Sto) = 1 753
Completely enumerated: SRS (Inf) 1 214, Pareto (Inf) 1 116, SRS (Sto) 382, Pareto (Sto) 402
| SAMU block 1 (survey, start, direction) | SAMU block 2 (survey, start, direction) | Pareto units | Pareto % | SRS units | SRS % |
|---|---|---|---|---|---|
| Sto 0.0 Right | Inf 1.0 Left | 222 | 13 | 177 | 10 |
| Sto 0.0 Right | Inf 0.2 Right | 250 | 14 | 202 | 12 |
| Sto 0.0 Right | Inf 0.7 Right | 290 | 17 | 284 | 16 |
| Sto 0.0 Right | Inf 0.7 Left | 262 | 15 | 214 | 12 |
| Sto 0.7 Right | Inf 0.7 Left | 240 | 14 | 196 | 11 |
Empirical. Overlap over time.
Use of Information and Communication in Enterprises (stratified by number of employees)
| SAMU block start | Direction | Pareto units | Pareto % | SRS units | SRS % |
|---|---|---|---|---|---|
| 0.0 | Right | 3 560 | 82 | 3 428 | 79 |
| 0.2 | Right | 3 564 | 82 | 3 432 | 79 |
| 0.7 | Left | 3 584 | 82 | 3 483 | 80 |
| 0.7 | Right | 3 590 | 82 | 3 455 | 79 |
| 1.0 | Left | 3 553 | 82 | 3 411 | 78 |
Stock of goods in the Wholesale and Retail Trade (optimally stratified by turnover)
| SAMU block start | Direction | Pareto units | Pareto % | SRS units | SRS % |
|---|---|---|---|---|---|
| 0.0 | Right | 1 352 | 77 | 1 305 | 74 |
| 0.2 | Right | 1 322 | 75 | 1 288 | 74 |
| 0.7 | Left | 1 337 | 76 | 1 313 | 75 |
| 0.7 | Right | 1 342 | 77 | 1 327 | 76 |
| 1.0 | Left | 1 316 | 75 | 1 310 | 75 |