DEI/CISUC Evonet Summer School - Parma © 2003 Ernesto Costa 1 How To Conduct Good Experiments?...
Evonet Summer School - Parma© 2003 Ernesto Costa
How To Conduct Good Experiments?
Ernesto Costa
DEI/CISUC
http://www.dei.uc.pt/~ernesto
Summary
What is the goal of this talk?
Background
Probabilities
Random Variables and Probability distributions
Inferential Statistics
Applying the Theory
Conclusions
What is the goal of this talk?
I don’t know! I have been asked to give a talk on that subject…
I do know!
EC is (very much) an experimental discipline
Most of our work is to compare things:
Algorithms
Parameter settings
What is a fair comparison?
What is the goal of this talk?
Looking at EC papers, what do we find?
One problem
One run
Several runs: 10, 20, 30?
Use average values
Use average of the bests
Use the mean
Use the mean and the standard deviation
Use Confidence Levels / Intervals
What is the goal of this talk?
What is a good experiment?
Identify independent and dependent variables:
Mutation rate → fitness
Different crossover operators → fitness
Evolution and Learning → # of survivors
Identify the conditions of the experiment:
Initial conditions
Number of runs
Parameter settings
Identify the kind of statistics you will need:
Descriptive
Inferential
Non-parametric
Background
Probabilities
Experiment: a procedure whose result varies and cannot be predicted ahead of time.
Tossing a coin, rolling a die
Sample Space: the set of possible outcomes of an experiment.
{Heads, Tails}
{1,2,3,4,5,6}
Event: a subset of the sample space.
{Heads}
{1,3,5}
Background
Probabilities
Probability of an Event
Measures the likelihood that the event will occur
Tossing a (fair) coin: probability(outcome=heads) = 1/2
Axioms
$P(E) \ge 0$
$P(S) = 1$
For mutually exclusive events $E_1, E_2, \dots$:
$P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i)$
Background
Probabilities
Working Methodology
Experiment → Sample Space → Event → Prob. Assign.
Example
What is the probability that, when rolling two dice, the sum of the two outcomes equals 7?
1/6
[Figure: "Two Dice Experiment", bar chart of the number of outcomes for each sum (x-axis: Sum, 1 to 12; y-axis: Number)]
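The working methodology for the two-dice example can be sketched by brute-force enumeration. This is only an illustration, assuming two fair six-sided dice with equally likely outcomes; the variable names are not from the talk.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of faces from two fair six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

# Event: the two faces add up to 7.
event = [outcome for outcome in sample_space if sum(outcome) == 7]

# With equally likely outcomes, P(event) = |event| / |sample space|.
p_sum_is_7 = Fraction(len(event), len(sample_space))
print(p_sum_is_7)  # prints 1/6
```

Six of the 36 ordered pairs sum to 7, which gives the 1/6 quoted on the slide.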
Probabilities
Definition: Let E and F be two events, with p(F)>0. The conditional probability of E given F, p(E|F), is defined as:
$p(E \mid F) = \dfrac{p(E \cap F)}{p(F)}$
Example: A family has two children. Knowing that at least one of them is a boy, what is the probability that they have two boys?
1/3
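The 1/3 can be checked by listing the sample space. The sketch below assumes boys and girls are equally likely and reads the condition as "at least one child is a boy"; both are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: (first child, second child), each B or G with equal probability.
families = list(product("BG", repeat=2))

# Event F: at least one child is a boy.
at_least_one_boy = [f for f in families if "B" in f]

# Event E and F: both children are boys.
two_boys = [f for f in at_least_one_boy if f == ("B", "B")]

# p(E | F) = p(E and F) / p(F); the common denominator |S| cancels out.
p_two_boys_given_boy = Fraction(len(two_boys), len(at_least_one_boy))
print(p_two_boys_given_boy)  # prints 1/3
```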
Probabilities
Theorem of Bayes:
$p(A_1 \mid B) = \dfrac{p(B \mid A_1)\,p(A_1)}{p(B)} = \dfrac{p(B \mid A_1)\,p(A_1)}{p(B \mid A_1)\,p(A_1) + p(B \mid A_2)\,p(A_2)}$
Example: A building has two lifts. One is used by 45% of the residents and the other by 55%. The first has problems 5% of the time, while the second has problems 8% of the time. Knowing that a lift had a problem, what is the probability that it was lift number 1?
33.8%
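The lift example is a direct application of the theorem. A minimal sketch, using only the percentages given on the slide (the dictionary names are illustrative):

```python
# Prior: which lift a resident uses, p(A_k).
p_lift = {1: 0.45, 2: 0.55}

# Likelihood: probability of a problem given the lift, p(B | A_k).
p_problem_given_lift = {1: 0.05, 2: 0.08}

# Total probability of a problem, p(B).
p_problem = sum(p_lift[k] * p_problem_given_lift[k] for k in p_lift)

# Bayes: p(A_1 | B) = p(B | A_1) p(A_1) / p(B).
p_lift1_given_problem = p_lift[1] * p_problem_given_lift[1] / p_problem
print(f"{p_lift1_given_problem:.1%}")  # prints 33.8%
```

Here p(B) = 0.0225 + 0.044 = 0.0665, so the posterior is 0.0225 / 0.0665.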
Random Variables and Probability Distributions
Random Variables
Definition: A random variable, X, is a function from the sample space of an experiment to the set of real numbers.
[Figure: X maps each outcome s of the sample space S to a real number X(s), here one of 0, 1, 2, 3]
A RV is a function … and is not random!!!
Random Variables and Probability Distributions
Working Methodology
Experiment → Sample Space → Event → Prob. Assign.
Random Variable → Prob. Distribution
Example
Experiment: toss a coin three times
Sample Space: 8 possibilities
Event: # Heads
Random Variable: X(HHT)=2
Prob. Assign.: f(xi)=p(X=xi)
Prob. Distrib.: X → f(xi)
Random Variables and Probability Distributions
Probability Distribution
Example: Suppose you toss a coin three times. Let X(t) denote the number of heads that appear when t is the result. Then X(t) is:
X(HHH) = 3
X(HHT) = X(HTH) = X(THH) = 2
X(TTH) = X(THT) = X(HTT) = 1
X(TTT) = 0
[Figure: "Probability Distribution", bar chart of f(xi) against X = 0, 1, 2, 3]
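The distribution plotted for this example can be derived by enumerating the eight outcomes. The sketch assumes a fair coin; X and f follow the slide's notation, the remaining names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: the 8 equally likely results of three tosses, e.g. "HHT".
outcomes = ["".join(t) for t in product("HT", repeat=3)]

def X(t):
    """Random variable: number of heads in the result t."""
    return t.count("H")

# Probability distribution f(x) = p(X = x).
counts = Counter(X(t) for t in outcomes)
f = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, fx in f.items():
    print(x, fx)
```

The values are 1/8, 3/8, 3/8 and 1/8 for X = 0, 1, 2, 3.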
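Random Variables and Probability Distributions
Types of Random Variables
Discrete: Probability Mass Function
$P(X = x) = p(x) \ge 0$
$\sum_x p(x) = 1$
Continuous: Probability Density Function (pdf)
$f(x) \ge 0, \ \forall x$
$\int_{-\infty}^{+\infty} f(x)\,dx = 1$
$P(a \le X \le b) = \int_a^b f(x)\,dx$
[Figure: density curve f(x) with the area between x1 and x2 marked]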
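Random Variables and Probability Distributions
Measures of Random Variables
Location
Mean
$\mu = E(X) = \sum_x x\,p(x)$ (discrete)
$\mu = E(X) = \int_{-\infty}^{+\infty} x\,f(x)\,dx$ (continuous)
Dispersion
Variance
$\sigma^2 = V(X) = \sum_x (x-\mu)^2\,p(x)$ (discrete)
$\sigma^2 = V(X) = \int_{-\infty}^{+\infty} (x-\mu)^2 f(x)\,dx$ (continuous)
Standard Deviation
$\sigma = \sqrt{V(X)}$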
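As a worked instance of the discrete formulas, the sketch below computes the mean, variance and standard deviation of the number of heads in three tosses of a fair coin, the distribution from the earlier example. Exact fractions are used so the results can be read off directly.

```python
import math
from fractions import Fraction

# p(x) for X = number of heads in three fair coin tosses.
p = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Location: E(X) = sum over x of x * p(x).
mean = sum(x * px for x, px in p.items())

# Dispersion: V(X) = sum over x of (x - mean)^2 * p(x).
variance = sum((x - mean) ** 2 * px for x, px in p.items())
std_dev = math.sqrt(variance)

print(mean, variance)  # prints 3/2 3/4
```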
Random Variables and Probability Distributions
Independence of Random Variables
Two random variables, X and Y, over the same sample space S, are said to be independent iff, for all $r_1$ and $r_2$:
$p(X(s) = r_1 \wedge Y(s) = r_2) = p(X(s) = r_1) \cdot p(Y(s) = r_2)$
For independent X and Y:
Theorem of the Product
$E(X \cdot Y) = E(X) \cdot E(Y)$
Theorem of Sum
$V(X + Y) = V(X) + V(Y)$
Random Variables and Probability Distributions
Discrete Probability Distributions
Binomial Distribution
Domain {0,1,2,…,n}
Probability mass function
$p(X = x_i) = p_i = C_i^n\, p^i q^{n-i}$, with $q = 1 - p$
Mean np
$E(X) = \sum_{i=0}^{n} i\, C_i^n\, p^i q^{n-i} = np$
Variance npq
$V(X) = E(X^2) - E(X)^2 = npq$
[Figure: two bar charts titled "Binomial Distribution", probability against values x (1 to 12), one for p=0.3 and one for p=0.5]
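The closed forms for the mean and variance can be verified numerically from the mass function. In this sketch n = 12 is an assumed value chosen for illustration, p = 0.3 is one of the values shown in the charts, and `binomial_pmf` is an illustrative helper name.

```python
from math import comb, isclose

def binomial_pmf(i, n, p):
    """p(X = i) = C(n, i) * p^i * q^(n - i), with q = 1 - p."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 12, 0.3  # n is an assumed value for the illustration
pmf = [binomial_pmf(i, n, p) for i in range(n + 1)]

# E(X) = sum of i * p_i and V(X) = E(X^2) - E(X)^2, straight from the definitions.
mean = sum(i * pi for i, pi in enumerate(pmf))
variance = sum(i * i * pi for i, pi in enumerate(pmf)) - mean**2

print(isclose(mean, n * p), isclose(variance, n * p * (1 - p)))
```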
Random Variables and Probability Distributions
Discrete Probability Distributions
Poisson Distribution
Approximates the Binomial Distribution (large n, small p), with $\lambda = np$
Domain {0,1,2,3,...}
Probability mass function
$p(X = i) = p_i = \dfrac{\lambda^i e^{-\lambda}}{i!}$
Mean: $\lambda$
Variance: $\lambda$
[Figure: two bar charts titled "Poisson distribution", probability against values (1 to 12), one for λ=6 and one for λ=8.4]
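How closely the Poisson distribution tracks the binomial can be seen by comparing the two mass functions. The values n = 1000 and p = 0.006 are assumptions chosen so that λ = np = 6, matching one of the charts; the helper names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(i, n, p):
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def poisson_pmf(i, lam):
    return lam**i * exp(-lam) / factorial(i)

n, p = 1000, 0.006  # assumed: many trials, small success probability
lam = n * p         # lambda = np

# Largest pointwise gap between the two mass functions for i = 0..30.
max_gap = max(abs(binomial_pmf(i, n, p) - poisson_pmf(i, lam)) for i in range(31))
print(f"lambda = {lam:.1f}, largest gap = {max_gap:.5f}")
```

The gap shrinks further as n grows with np held fixed.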
Random Variables and Probability Distributions
Continuous Probability Distributions
Normal (Gaussian) Distribution
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
Standard Normal Distribution
$N(0,1)$
[Figure: density curves of N(0,1) and N(3,2)]
Random Variables and Probability Distributions
Continuous Probability Distributions
Converting a normal distribution to a standard normal distribution
X is a random variable with
Mean μ
Standard Deviation σ
Using a translation and a change of scale
Defining a new random variable
$Z = \dfrac{X - \mu}{\sigma}$
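Random Variables and Probability Distributions
Continuous Probability Distributions
Student’s t-Distribution
Approximates the standard normal distribution N(0,1) as the degrees of freedom increase
Degrees of freedom (df), $\nu$
$f(x) = \dfrac{1}{\sqrt{\nu}\, B\!\left(\frac{1}{2}, \frac{\nu}{2}\right)} \left(1 + \dfrac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}$
Mean 0, Variance $\dfrac{\nu}{\nu-2}$ (for $\nu > 2$)
[Figure: t density curves for ν=1, ν=5 and ν=10 compared with N(0,1)]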
Background
Statistics
Goal: to apply probability theory to data analysis
How?
Model the data (population) by means of a probability distribution
Use a sample of the data instead of the whole population
Estimate the population parameters (μ, σ, p) using the corresponding sample statistics ($\bar{x}$, s, $\hat{p}$)
population (parameters): μ, σ, p
sample (statistics): $\bar{x}$, s, $\hat{p}$
Background
Statistics
Unbiased estimator
A statistic whose mean value equals the population parameter being estimated
Point Estimators
Interval Estimators
Background
Sample distribution of the sample mean and the Central Limit Theorem
Consider a population with mean μ and standard deviation σ. Let $\bar{X}$ denote the mean of the observations in random samples of size n. Then:
$\mu_{\bar{x}} = E(\bar{X}) = \mu$
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
When the population distribution is normal, the sampling distribution of $\bar{X}$ is also normal for any sample size n
(Central Limit Theorem) When n is sufficiently large (n>30) the sampling distribution of $\bar{X}$ is well approximated by a normal curve, even if the population distribution is not itself normal
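A small simulation can make the two formulas for the sampling distribution concrete. The sketch below is illustrative only: the uniform population, the sample size n = 36, the number of repetitions and the seed are all assumptions, not part of the talk.

```python
import math
import random
import statistics

random.seed(2003)  # fixed seed so the sketch is repeatable

n = 36              # sample size (assumed; the rule of thumb above is n > 30)
num_samples = 5000  # number of repeated samples (assumed)

# Population: uniform on [0, 1], clearly not normal; mu = 0.5, sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

sample_means = [
    statistics.fmean([random.random() for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.4f} (mu = {mu})")
print(f"sd of sample means:   {sd_of_means:.4f} (sigma/sqrt(n) = {sigma / math.sqrt(n):.4f})")
```

A histogram of `sample_means` would also look bell-shaped, even though the population itself is flat.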
Background
Sample distribution of the sample mean
Unbiased estimators
Mean
$\mu_{\bar{x}} = E(\bar{X}) = \mu$
Standard Deviation
$\hat{\sigma}_x = s = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n-1}}$
(n-1) are the degrees of freedom (df)
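The (n-1) divisor is what Python's `statistics.stdev` uses as well. A minimal sketch on made-up data (the five values are purely hypothetical):

```python
import math
import statistics

data = [12.0, 15.0, 11.0, 14.0, 13.0]  # hypothetical measurements
n = len(data)

x_bar = sum(data) / n

# Sample standard deviation with n - 1 degrees of freedom.
s_manual = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

# statistics.stdev divides by n - 1; statistics.pstdev divides by n.
s_library = statistics.stdev(data)

print(x_bar, s_manual, s_library)
```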
Background
Sample distribution of the sample mean and the Central Limit Theorem
Consequence
For a large sample, or a population whose distribution is normal:
$Z = \dfrac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$
has (approximately) a standard normal (Z) distribution.
Background
Confidence Intervals – one sample
Estimate the mean μ when:
The population standard deviation, σ, is known;
The sample mean from a random sample, $\bar{X}$, is known;
The sample size is large (>30)
The one sample Z confidence interval is
$\bar{x} \pm Z_{\text{critical value}} \cdot \dfrac{\sigma}{\sqrt{n}}$
Example: for a 95% confidence interval Z=1.96.
Background
Confidence Intervals – one sample
Example: we want a confidence level of 90%
Look into a N(0,1)
For a CL of 90%, we have to isolate an area of 5% (0.1/2) in each tail, to the left and to the right, of the bell-shaped normal distribution.
Looking in a table for the value of Z we obtain Z=1.65
The confidence interval will be given by
$\bar{x} \pm Z_{0.1/2} \cdot \dfrac{\sigma}{\sqrt{n}}$
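Instead of a table, the critical value can be read from the inverse CDF of N(0,1). The sketch below uses `statistics.NormalDist` from the standard library; the function names and the sample figures (mean 50, σ = 10, n = 100) are hypothetical.

```python
import math
from statistics import NormalDist

def z_critical(confidence_level):
    """Z value leaving (1 - confidence_level) / 2 in each tail of N(0,1)."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_confidence_interval(xbar, sigma, n, confidence_level):
    """One sample Z interval: xbar +/- Z * sigma / sqrt(n)."""
    half_width = z_critical(confidence_level) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

print(round(z_critical(0.90), 3))  # 1.645, the table value 1.65 above
print(round(z_critical(0.95), 3))  # 1.96

# Hypothetical sample: mean 50, known sigma 10, n = 100.
low, high = z_confidence_interval(xbar=50.0, sigma=10.0, n=100, confidence_level=0.90)
print(round(low, 3), round(high, 3))
```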
Background
Confidence Intervals – one sample
What does it mean to have a confidence interval of 95%?
That there is a probability of 95% that the true (population) mean is in the interval? NO!!
It means that 95% of all possible samples result in an interval that includes the true mean!
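The "95% of all possible samples" reading can be illustrated by simulation: draw many samples from a population whose mean is known, build an interval from each, and count how many intervals contain the true mean. The population N(100, 15), the sample size, the number of trials and the seed are all assumptions for this sketch.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(42)  # fixed seed so the sketch is repeatable

mu, sigma, n = 100.0, 15.0, 36  # assumed population and sample size
trials = 2000                   # number of repeated samples (assumed)

z = NormalDist().inv_cdf(0.975)  # critical value for 95%
half_width = z * sigma / math.sqrt(n)

covered = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(sample)
    # The interval changes from sample to sample; the true mean does not.
    if xbar - half_width <= mu <= xbar + half_width:
        covered += 1

coverage = covered / trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The fraction should land close to 0.95; any single interval either contains μ or it does not.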
Background
Confidence Intervals – one sample
Estimate the mean μ when:
The population standard deviation, σ, is NOT known;
The sample mean from a random sample, $\bar{X}$, is known;
The sample size is large (>30) OR the population distribution is normal
The one sample t confidence interval is
$\bar{x} \pm t_{\text{critical value}} \cdot \dfrac{s}{\sqrt{n}}$
where the t critical value is based on (n-1) degrees of freedom (df).
Example: for a 95% confidence interval and 19 df, t=2.09.
The Student t distribution can be used for small samples assuming that the population distribution is approximately normal
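A sketch of the t interval for a sample of 20 runs, which gives the 19 df of the example. The data are hypothetical, and the critical value 2.09 is taken from the slide rather than computed, since the Python standard library has no inverse t distribution.

```python
import math
import statistics

# Hypothetical best-fitness values from 20 independent runs.
runs = [0.82, 0.79, 0.91, 0.85, 0.88, 0.76, 0.93, 0.81, 0.87, 0.84,
        0.90, 0.78, 0.86, 0.83, 0.89, 0.80, 0.92, 0.77, 0.85, 0.84]

n = len(runs)
x_bar = statistics.fmean(runs)
s = statistics.stdev(runs)  # sample standard deviation, n - 1 in the divisor

t_critical = 2.09           # 95% confidence, n - 1 = 19 df (tabulated value)
half_width = t_critical * s / math.sqrt(n)

lower, upper = x_bar - half_width, x_bar + half_width
print(f"95% CI for the mean: [{lower:.4f}, {upper:.4f}]")
```

For a different sample size or confidence level the critical value has to be looked up again.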
Background
Hypothesis Testing – one sample
A hypothesis is a claim about the value of one or more population characteristics.
A test procedure is a method for using sample data to decide between two competing claims about population characteristics (μ = 100 or μ ≠ 100).
Method by contradiction: we assume a particular hypothesis. Using the sample data we try to find out if there is convincing evidence to reject this hypothesis in favor of a competing one.
Background
Hypothesis Testing – one sample
The null hypothesis, H0, is a claim about a population characteristic that is initially assumed to be true.
Ha is the alternative hypothesis or competing claim.
Testing H0 versus Ha can lead to the conclusion that H0 must be rejected, or that we fail to reject H0. In that last case we cannot say that H0 is accepted!
Background
Hypothesis Testing – one sample
Errors
Type I error
Rejecting H0 when H0 is true
The probability of a type I error, α, is called the Level of Significance of the test.
Type II error
Failing to reject H0 when H0 is false
The probability of a Type II error is denoted by β.
There is a tradeoff between α and β: making the probability of a type I error very small increases the probability of a type II error.
Background
Hypothesis Testing – one sample
Test Statistic (Z, t): function of the sample data on which the decision to reject or fail to reject H0 is based;
p-value (observed significance level): the probability, assuming that H0 is true, of obtaining a test statistic at least as inconsistent with H0 as what actually resulted.
Decision about H0: compare the p-value with the chosen α.
Reject H0 if p-value ≤ α
Background
Hypothesis Testing – one sample
Hypothesis Testing – principles
What is the population parameter (mean,…)
State the H0 and Ha
Define the significance level α
Check that the assumptions for the test are reasonable (big sample,…)
Calculate the test statistic (Z,…)
Calculate the associated p-value
State the conclusion (reject if p-value ≤ α,…)
Background
Hypothesis Testing – one sample
Example
Population parameter: the mean, μ
H0: μ = 100, Ha: μ ≠ 100
Significance level α=0.01
n=40 is large
From the sample: $\bar{x} = 105.3$, σ=8.4
$z = \dfrac{105.3 - 100}{8.4 / \sqrt{40}} = 3.99$
From the z-curve we know that the p-value ≈ 0
Therefore the null hypothesis, H0, is rejected with a significance level of 0.01.
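The numbers in this example can be reproduced in a few lines. The sketch computes the test statistic and a two-sided p-value from the standard normal CDF; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0 = 100.0         # value claimed by H0
x_bar = 105.3        # sample mean
sigma = 8.4          # standard deviation, as given in the example
n = 40
alpha = 0.01

# Test statistic: z = (x_bar - mu_0) / (sigma / sqrt(n)).
z = (x_bar - mu_0) / (sigma / math.sqrt(n))

# Two-sided p-value, because Ha is mu != 100.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}")  # prints z = 3.99
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The p-value is far below 0.01, which is what "p-value ≈ 0" on the slide expresses.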
Background
Comparing Two Populations based on independent samples
Use the sample distribution of the difference of the sample means: $\bar{x}_1 - \bar{x}_2$
Properties
The mean of the difference is equal to the difference of the means:
$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
The variance of the difference is equal to the sum of the individual variances. Thus, the standard deviation is:
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
The sampling distribution of the difference of the sample means can be considered approximately normal (each n is large, or each sample mean comes from an (approximately) normal population)
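Background
Assumptions
The two samples are independent random samples
Sample sizes are both large (n>30) OR the population distributions are (approximately) normal.
Formulas
Confidence interval for $\mu_1 - \mu_2$, the mean of $\bar{x}_1 - \bar{x}_2$:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\text{critical value}} \cdot \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$df = \dfrac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$, where $V_1 = \dfrac{s_1^2}{n_1}$ and $V_2 = \dfrac{s_2^2}{n_2}$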
Background
Hypothesis Test
Same procedure, only the formulas are different!
Z Test
Large samples OR
Population distributions are (at least approximately) normal
$z = \dfrac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
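Background
Hypothesis Test
t test
Large samples OR
Population distributions normal AND the random samples are independent
$t = \dfrac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
$df = \dfrac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$, where $V_1 = \dfrac{s_1^2}{n_1}$ and $V_2 = \dfrac{s_2^2}{n_2}$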
Applying the Theory
The Busy Beaver Problem
Two algorithms
A standard GA
A standard GA + local learning (Baldwin Effect)
Goal: good quality machines
Who is better? Comparing the means!
H0: μ1 = μ2 (no improvement!!!), Ha: μ1 ≠ μ2
Significance level, α=0.01
Assuming that the population distributions are normal
Number of (independent) runs = 30 for each case
Use t test
Applying the Theory
The Busy Beaver Problem
From the samples (# good machines)
$\bar{x}_{sga} = 0.1$
$\bar{x}_{be} = 0.23$
$s_{sga}^2 = 0.093$
$s_{be}^2 = 0.185$
From the formulas
df=53
t=1.35
p-value ≈ 2*0.1 = 0.2
Conclusion
With α=0.01 and p-value=0.2, the null hypothesis H0 cannot be rejected
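The t statistic and the degrees of freedom can be recomputed from the summary values with the two-sample formulas given earlier. This is a sketch under the stated setup of 30 runs per algorithm; `two_sample_t` is an illustrative helper name. Because it works from the rounded figures on the slide, the df it yields (a little above 52) differs slightly from the reported 53.

```python
import math

def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """t statistic and df for H0: mu1 = mu2, from sample means and variances."""
    v1, v2 = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Summary values from the Busy Beaver experiment (Baldwin Effect vs. standard GA).
t, df = two_sample_t(mean1=0.23, var1=0.185, n1=30,
                     mean2=0.10, var2=0.093, n2=30)

print(f"t = {t:.2f}, df = {df:.1f}")
```

With a two-sided p-value of about 0.2 against α = 0.01, the observed difference is not enough to reject H0.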
Applying the Theory
Function Optimization
Two different GAs applied to function optimization
A standard GA using a 2-point crossover
A modified GA using transformation
Goal: find the minimum
The Schwefel Function
Minimum = 0
[Figure: 3D surface plot of the Schwefel function over [-500, 500] × [-500, 500], with values from 0 to about 1500]
Applying the Theory
Function Optimization
Who is better? Two-point Crossover or Transformation?
Comparing the means of the best fit!
H0: μ1 = μ2 (no improvement!!!), Ha: μ1 ≠ μ2
Significance level, α=0.05
Assuming the population distributions are normal
Number of (independent) runs = 30 for each case
Use t test
Applying the Theory
Function Optimization
From the samples (fitness of the best individuals)
$\bar{x}_{sga} = 5.4838$
$\bar{x}_{tr} = 0.0768$
$s_{sga}^2 = 149.788$
$s_{tr}^2 = 0.02958$
From the formulas
df=29
t=2.42
p-value ≈ 2*0.012 = 0.024
Conclusion
With α=0.05 and p-value=0.024, the null hypothesis H0 is rejected.
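When no t table is at hand, a two-sided p-value can be approximated by numerically integrating the Student t density given earlier in the talk. The sketch below uses Simpson's rule with the standard library only; the function names and the number of integration steps are assumptions, and the result may differ in the third decimal from a value read off a table.

```python
import math

def t_pdf(x, df):
    """Student t density with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return 1 - 2 * central_area

alpha = 0.05
p_value = two_sided_p_value(t=2.42, df=29)
print(f"p-value = {p_value:.3f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

The same function applied to the Busy Beaver result (t=1.35, df=53) gives a p-value a little below 0.2, in line with the conclusion reached there.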
Conclusions
This is a very simple presentation
Assuming Normal distributions
There are many others
In many situations we cannot assume a normal distribution
Many things left unmentioned
More than two populations
Analysis of Variance (ANOVA)
Regression and Correlation
Non-parametric methods
Want to know more?
Paul Cohen, Empirical Methods for Artificial Intelligence, MIT Press, Cambridge, MA, 1995.
James Kennedy and Russell Eberhart, Swarm Intelligence (Appendix A), Morgan Kaufmann, 2001.
Roxy Peck, Chris Olsen and Jay Devore, Introduction to Statistics and Data Analysis, Duxbury, 2001.
Mark Wineberg and Steffen Christensen, Using Appropriate Statistics, GECCO’2003 Tutorial.