Sampling bias in multi-agent simulation (MAS) models
Buysse, J. (1), Frija, A. (1), Van der Straeten, B. (1), Nolte, S. (1), Lauwers, L. (1,2), Claeys, D. (2) and Van Huylenbroeck, G.
(1) Ghent University, Department of Agricultural Economics
(2) Institute for Agricultural and Fisheries Research, Merelbeke, Belgium
122nd EAAE Seminar – Ancona, 16-18 February 2011
UGent – Faculty of Bioscience Engineering – Department Agricultural Economics
Overview of the presentation
- Why MAS models?
- Problem statement: the sampling bias in MAS
- Objectives
- Methodology
- Results
- Perspectives
Why MAS models?
- Heterogeneity of opportunities and constraints at the individual level
- Accurate estimation of the distributional effects of policies
- Accurate estimation of agent interactions (spatial effects, TC, propensity to innovate, etc.)
- But…
Problem statement
- MAS models need full population data
- With sampling, a farm in the sample cannot interact with the real-world farms that fall outside the sample
Problem statement
[Figure: full population, showing Farmer i and Farmer j]
Problem statement
[Figure: sample, showing Farmer i and Farmer k]
Problem statement
- Systematic bias when TC between agents are simulated in MAS
- Most empirical MAS models rely on sample data
- Future large-scale MAS models on sampled data!
Problem statement
Illustration of the sampling bias in a real model (Van der Straeten et al. 2010)
- A MAS model used to simulate PR exchange between 30,000 farmers in Flanders
- The bias is correlated with the sample size

Bootstrap sample size     Number of      Average cost   SD         Average cost simulated /
                          repetitions    simulated                 average cost of population
S = 100 (0.26%)           100            3210.88        1200.77    +59%
S = 200 (0.52%)           100            2698.58        666.24     +33%
S = 500 (1.31%)           100            2581.09        755.91     +28%
S = 750 (2%)              100            2571.13        536.30     +27%
Full population (100%)    -              2016.49        -          -
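The last column of the table can be recomputed from the average costs. The short sketch below does this with the figures copied from the table; the function name `relative_bias` is only illustrative, and the recomputed percentages may differ from the slide by up to about one percentage point because of rounding.

```python
# Average simulated costs copied from the table above.
full_population_cost = 2016.49
sample_costs = {100: 3210.88, 200: 2698.58, 500: 2581.09, 750: 2571.13}


def relative_bias(sample_cost, reference_cost):
    """Overestimation of the sample result relative to the reference, in percent."""
    return (sample_cost / reference_cost - 1.0) * 100.0


bias = {size: relative_bias(cost, full_population_cost)
        for size, cost in sample_costs.items()}

for size in sorted(bias):
    print(f"S = {size}: {bias[size]:+.1f} %")
```

The bias shrinks as the sample grows, but even the largest sample in the table still overestimates the population cost by more than a quarter.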
Objectives
‣ To test, illustrate, and quantify the sampling bias that arises when TC exist
‣ To develop and discuss mechanisms that can remove such sampling bias
Methodology
- Simplified MAS model
  - minimizes the transport costs of emissions and the cost of emission abatement:

$$\min \sum_n \Big( \sum_m c_{nm}\,\tau_{nm} + \omega_n\,p \Big)$$

s.t.

$$e_n + \sum_m \tau_{mn} - \sum_m \tau_{nm} \le r_n + \omega_n$$

where
‣ $n$ and $m$ are farm indices,
‣ $\tau_{nm}$ is the amount of emission transported from farm $n$ to farm $m$,
‣ $\omega_n$ is the amount of emission abatement of agent $n$,
‣ $e_n$ is the amount of emission of farm $n$,
‣ $r_n$ is the amount of emission rights of farm $n$,
‣ $c_{nm}$ is the transport cost per unit of emission transported from farm $n$ to farm $m$,
‣ $p$ is the penalty per overused emission right.
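To make the objective and the balance constraint concrete, the sketch below evaluates them for a hypothetical two-farm case. It does not solve the optimisation; it only computes, for a given transport plan, the smallest abatement that satisfies each farm's constraint and the resulting total cost. All numbers and function names are illustrative assumptions.

```python
def min_abatement(e, r, tau):
    """Smallest omega_n satisfying e_n + inflow_n - outflow_n <= r_n + omega_n."""
    n = len(e)
    omega = []
    for i in range(n):
        inflow = sum(tau[m][i] for m in range(n))
        outflow = sum(tau[i][m] for m in range(n))
        omega.append(max(0.0, e[i] + inflow - outflow - r[i]))
    return omega


def total_cost(c, tau, omega, p):
    """Objective value: transport costs plus penalised abatement."""
    n = len(omega)
    transport = sum(c[i][m] * tau[i][m] for i in range(n) for m in range(n))
    return transport + p * sum(omega)


# Hypothetical data: farm 0 emits 4 units more than its rights, farm 1 has 6 spare.
e = [10.0, 2.0]
r = [6.0, 8.0]
c = [[0.0, 1.0], [1.0, 0.0]]   # assumed transport cost per unit
p = 5.0                        # assumed penalty per unit

no_transport = [[0.0, 0.0], [0.0, 0.0]]
with_transport = [[0.0, 4.0], [0.0, 0.0]]   # farm 0 ships 4 units to farm 1

cost_without = total_cost(c, no_transport, min_abatement(e, r, no_transport), p)
cost_with = total_cost(c, with_transport, min_abatement(e, r, with_transport), p)
print(cost_without, cost_with)
```

In this example transport is cheaper than abatement, so removing farm 1 from the data (as sampling might do) would force farm 0 onto the more expensive option.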
Methodology
- Applied to synthetic population data (500 farmers); average cost per farm: 27.74
  ‣ Emissions ($e_n$) are randomly sampled from a normal distribution,
  ‣ Emission rights ($r_n$) are randomly sampled from a normal distribution,
  ‣ Transport costs ($c_{nm}$) are randomly sampled from a uniform distribution
- We bootstrap with different sample sizes: 20 to 500
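The data generation and resampling loop can be sketched as follows. The distribution parameters, the seed, the number of repetitions and drawing subsamples without replacement are all assumptions, since the slides do not state them. Instead of solving the full model, the sketch tracks a simple proxy: the cheapest transport partner each farm can reach when it only sees the farms in its own subsample.

```python
import random
from statistics import mean


def make_population(n_farms, rng):
    """Synthetic farms: emissions, emission rights and pairwise transport costs."""
    e = [rng.gauss(100.0, 20.0) for _ in range(n_farms)]    # assumed mean and SD
    r = [rng.gauss(105.0, 20.0) for _ in range(n_farms)]    # assumed mean and SD
    c = [[0.0 if i == j else rng.uniform(1.0, 10.0)         # assumed range
          for j in range(n_farms)] for i in range(n_farms)]
    return e, r, c


def cheapest_partner_cost(c, farm, candidates):
    """Lowest transport cost from `farm` to any other farm in `candidates`."""
    return min(c[farm][m] for m in candidates if m != farm)


def bootstrap_proxy(c, sample_size, repetitions, rng):
    """Mean cheapest-partner cost when farms only see their own subsample."""
    results = []
    for _ in range(repetitions):
        sample = rng.sample(range(len(c)), sample_size)     # without replacement
        results.append(mean(cheapest_partner_cost(c, i, sample) for i in sample))
    return mean(results)


rng = random.Random(2011)
# e and r are not used by the proxy; a full model run would need them.
e, r, c = make_population(500, rng)
population_value = mean(cheapest_partner_cost(c, i, range(500)) for i in range(500))
proxy_by_size = {s: bootstrap_proxy(c, s, 20, rng) for s in (20, 100, 300)}
print(population_value, proxy_by_size)
```

Because a subsample can never offer a cheaper partner than the full population does, the proxy is pushed upwards in small samples, which is the same direction as the bias discussed in the slides.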
Results
[Figure: average costs (y-axis, 0 to 2000) against sample size (x-axis, 100 to 400), showing the variations across samples and the population average cost of 27.7]
Results
- The order of magnitude of the sampling bias can be very large
- Nonlinear effect of the sample size on the bias
Cause:
- In the population, the total amount of emission is smaller than the total amount of emission rights ($\sum_n e_n < \sum_n r_n$)
- Subsamples do not always satisfy this real population balance
- This motivates a sampling bias correction via macrobalance coefficients
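The sketch below illustrates why subsamples can violate the population balance and what a macrobalance correction might look like. The slides do not define the macrobalance coefficients, so the rescaling used here (multiplying the sampled emission rights by one factor so that the ratio of total rights to total emissions matches the population) is an assumption, as are the distribution parameters and all function names.

```python
import random


def rights_to_emissions_ratio(e, r):
    """Total emission rights divided by total emissions."""
    return sum(r) / sum(e)


def macrobalance_rescale(e_sample, r_sample, population_ratio):
    """Assumed correction: scale sampled rights to reproduce the population ratio."""
    factor = population_ratio * sum(e_sample) / sum(r_sample)
    return [factor * value for value in r_sample]


def share_of_flipped_samples(e, r, sample_size, repetitions, rng):
    """Share of subsamples in which total emissions exceed total rights."""
    flipped = 0
    for _ in range(repetitions):
        idx = rng.sample(range(len(e)), sample_size)
        if sum(e[i] for i in idx) > sum(r[i] for i in idx):
            flipped += 1
    return flipped / repetitions


rng = random.Random(42)
e = [rng.gauss(100.0, 20.0) for _ in range(500)]   # assumed emission distribution
r = [rng.gauss(105.0, 20.0) for _ in range(500)]   # assumed rights distribution
population_ratio = rights_to_emissions_ratio(e, r)
flipped = {s: share_of_flipped_samples(e, r, s, 200, rng) for s in (20, 100, 300)}

# Apply the assumed correction to one small subsample.
idx = rng.sample(range(500), 20)
e_s = [e[i] for i in idx]
r_s = macrobalance_rescale(e_s, [r[i] for i in idx], population_ratio)
print(flipped, rights_to_emissions_ratio(e_s, r_s), population_ratio)
```

Small subsamples reverse the sign of the balance far more often than large ones, which forces abatement that the full population would never need; after rescaling, the subsample reproduces the population ratio by construction.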
Results
[Figure: average costs (y-axis, 100 to 500) against sample size (x-axis, 100 to 400), comparing the average costs of samples with the population average cost of 27.7]
Remove remaining bias with calibration
- Calibration is the comparison of two measurements:
  - the measurement of a device with known correctness (here: the full population model)
  - is used to correct another measurement made by another device (here: the sample model)
- Once calibrated, the second device can make correct measurements: the sample model can be used for simulations
- Resampling data is used to estimate the calibration function: a prediction of the bias as a function of the sample size
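A calibration function of this kind can be sketched as a least-squares polynomial of the bias on the sample size. As input, the sketch reuses the four bootstrap averages and the full-population average from the Flanders illustration earlier in the deck; the degree-2 polynomial, the rescaling of sample sizes by 1000 and the function names are assumptions made to keep the example small, not the specification used by the authors.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first), normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= factor * a[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial with coefficients lowest order first."""
    result = 0.0
    for coef in reversed(coeffs):
        result = result * x + coef
    return result


sizes = [100, 200, 500, 750]
simulated = [3210.88, 2698.58, 2581.09, 2571.13]   # bootstrap averages from the table
reference = 2016.49                                 # full-population average cost

xs = [s / 1000.0 for s in sizes]                    # rescaled for numerical stability
biases = [cost - reference for cost in simulated]
coeffs = fit_polynomial(xs, biases, 2)              # degree 2 is an arbitrary choice


def corrected_cost(simulated_cost, sample_size):
    """Subtract the bias predicted by the calibration function."""
    return simulated_cost - evaluate(coeffs, sample_size / 1000.0)


print([round(corrected_cost(c, s), 1) for c, s in zip(simulated, sizes)])
```

The correction can only be estimated here because the full-population result is known; without that reference the bias values fed into the fit would not be available.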
Results
Coefficients of the polynomial regression of the simulated average costs on the sample size

Term                 Estimate     Std. Error   t value   Pr(>|t|)
(sample size)^0      6.11E+05     2.68E+04     22.754    < 2e-16    ***
(sample size)^1      -2.53E+04    2.62E+03     -9.657    < 2e-16    ***
(sample size)^2      5.64E+02     9.35E+01     6.032     1.88e-09   ***
(sample size)^3      -7.46E+00    1.69E+00     -4.403    1.11e-05   ***
(sample size)^4      6.24E-02     1.78E-02     3.505     0.000464   ***
(sample size)^5      -3.42E-04    1.16E-04     -2.944    0.003267   **
(sample size)^6      1.24E-06     4.84E-07     2.563     0.010450   *
(sample size)^7      -2.95E-09    1.29E-09     -2.286    0.022337   *
(sample size)^8      4.42E-12     2.13E-12     2.076     0.037970   *
(sample size)^9      -3.78E-15    1.98E-15     -1.911    0.056064   .
(sample size)^10     1.41E-18     7.91E-19     1.778     0.075497   .
Results
[Figure: average costs (y-axis, 10 to 50) against sample size (x-axis, 100 to 400), with the population average cost of 27.7 as reference]
Conclusion
- Macrobalance correction is very useful
  - Only macrobalance is necessary
  - Also useful in models without heterogeneous interactions
- Calibration correction is promising
  - such corrections are not possible if we do not have full population data
  - necessity to assign correction factors based on information available in sample datasets
- Corrected sampling in MAS is important
  - more complex analyses become possible
  - more datasets at sample level could be used
  - MAS can be applied in large-scale empirical models
Further research
- Check for:
  - Impact on variance
  - Impact of changes in model structure
  - Impact of using synthetic full population as calibration reference
- Search for:
  - Calibration correction without availability of full population (see first attempts in paper)