CS1512 jhunter/teaching/CS1512/lectures/ 1 CS1512 Foundations of Computing Science 2 Lecture 24...

www.csd.abdn.ac.uk/~jhunter/teaching/CS1512/lectures/ CS1512 CS1512 1 CS1512 Foundations of Computing Science 2 Lecture 24 Probability and statistics (5) Random number generators © J R W Hunter, 2006


CS1512 Foundations of Computing Science 2

Lecture 24

Probability and statistics (5): Random number generators

© J R W Hunter, 2006


After Easter

Lectures

• Dr Kees van Deemter will take over

• Logic and HCI

• Same times and places

Tutorials

• Logic and HCI

• Same times and places

Practicals

• Java programming simulation – Robocode ‘take-home assessment’

• Same times and places


Continuous Assessment

Week 9 test

• week 9 = week after Easter vacation;

• worth 10% of the marks of the course;

• as for week 5 – test under practical exam conditions;

• will test your knowledge of inheritance.

‘Practical exam’

• completed in your own time;

• worth 30% of the marks of the course;

• handed out in week 10; hand in by the end of week 12.

Both

• conditional on AUT dispute being resolved;

• safest course is to assume that they will go ahead;

• if you are worried about the possible effects of the AUT action, write to the Principal and express those concerns.


Remember: Continuous data

• Divide range of observations into non-overlapping intervals (bins)

• Count number of observations in each bin

• Enzyme concentration data:

121 25 83 110 60 101

95 81 123 67 113 78

85 145 100 70 93 118

119 57 64 151 48 92

62 104 139 201 68 95

• Range: 25 to 201

• 10 bins of width 20


Remember: Enzyme concentrations

Concentration          Freq.   Rel. Freq.
 19.5 ≤ c <  39.5        1       0.033
 39.5 ≤ c <  59.5        2       0.067
 59.5 ≤ c <  79.5        7       0.233
 79.5 ≤ c <  99.5        7       0.233
 99.5 ≤ c < 119.5        7       0.233
119.5 ≤ c < 139.5        3       0.100
139.5 ≤ c < 159.5        2       0.067
159.5 ≤ c < 179.5        0       0.000
179.5 ≤ c < 199.5        0       0.000
199.5 ≤ c < 219.5        1       0.033

Totals                  30       1.000
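
The binning step can be sketched in Java as follows. This is only an illustration: the class and method names (EnzymeHistogram, binCounts) are made up for this example, and the bins are assumed to start at 19.5 with width 20, as in the frequency table.

```java
class EnzymeHistogram {
    // Enzyme concentration data from the earlier slide.
    static final int[] DATA = {
        121, 25, 83, 110, 60, 101,
        95, 81, 123, 67, 113, 78,
        85, 145, 100, 70, 93, 118,
        119, 57, 64, 151, 48, 92,
        62, 104, 139, 201, 68, 95
    };

    // Counts the observations in each bin [lower + k*width, lower + (k+1)*width).
    // Values outside all bins are ignored.
    static int[] binCounts(int[] data, double lower, double width, int numberOfBins) {
        int[] counts = new int[numberOfBins];
        for (int value : data) {
            int bin = (int) Math.floor((value - lower) / width);
            if (bin >= 0 && bin < numberOfBins) {
                counts[bin]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = binCounts(DATA, 19.5, 20.0, 10);
        for (int k = 0; k < counts.length; k++) {
            double lo = 19.5 + 20.0 * k;
            double relativeFrequency = (double) counts[k] / DATA.length;
            System.out.printf("%5.1f <= c < %5.1f  %2d  %.3f%n",
                    lo, lo + 20.0, counts[k], relativeFrequency);
        }
    }
}
```

Half-integer bin boundaries such as 19.5 mean that no whole-number observation can fall exactly on a boundary.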


Relative Frequency Histogram

[Figure: relative frequency histogram; vertical axis "relative frequency", 0.00 to 0.25.]

The height of the bar gives the relative frequency.


Density Histograms

Plot relative frequency / width of the column (bin width) so that the area of the bar now gives the relative frequency

[Figure: density histogram; vertical axis "relative frequency density", 0.000 to 0.014; bin boundaries 19.5, 39.5, ..., 199.5.]

relative frequency = relative frequency density × bin width = 0.01165 × 20 = 0.233
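
As a rough check of the density idea, the sketch below divides each relative frequency by the bin width and confirms that the bar areas add up to 1. The counts are taken from the enzyme frequency table; the class and method names are invented for this example.

```java
class DensityHistogram {
    // Relative frequency density of each bin: (count / n) / binWidth.
    static double[] densities(int[] counts, double binWidth) {
        int n = 0;
        for (int c : counts) {
            n += c;
        }
        double[] density = new double[counts.length];
        for (int k = 0; k < counts.length; k++) {
            density[k] = ((double) counts[k] / n) / binWidth;
        }
        return density;
    }

    // Total area of the bars: sum of (density * binWidth) over all bins.
    static double totalArea(double[] density, double binWidth) {
        double area = 0.0;
        for (double d : density) {
            area += d * binWidth;
        }
        return area;
    }

    public static void main(String[] args) {
        int[] counts = {1, 2, 7, 7, 7, 3, 2, 0, 0, 1}; // enzyme frequency table
        double[] density = densities(counts, 20.0);
        for (double d : density) {
            System.out.println(d);
        }
        System.out.println("total area = " + totalArea(density, 20.0));
    }
}
```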


Addition of areas

[Figure: density histogram with two points marked on the horizontal axis; bin boundaries 19.5, 39.5, ..., 199.5.]

Relative frequency of values between the two marked points = area of the bars between them.


Increase the number of samples

... and decrease the width of the bin ...


Relative frequency as area under the curve

relative frequency of values between a and b = area


Continuous random variable

Consider a large population of individuals e.g. all males in the UK over 16

Consider a continuous attribute e.g. Height: X

Select an individual at random so that any individual is as likely to be selected as any other

X is said to be a continuous random variable


Probability density function

The probability distribution of X is described by its probability density function, defined such that:

P(a ≤ X < b) = area under the curve between a and b

NB total area under curve must be 1.0


The ‘normal’ distribution

Very common distribution:

• often called a Gaussian distribution

• variable measured for large number of nominally identical objects;

• variation assumed to be caused by a large number of factors;

• each factor exerts a small random positive or negative influence;

• e.g. height: age, diet, bone structure, genetic influences, etc.

Symmetric about mean

Unimodal

[Figure: normal density curve; vertical axis 0 to 0.25.]


Mean

Mean determines the centre of the curve:

[Figure: two normal curves, one with Mean = 10 and one with Mean = 30.]


Remember: Variance

Measure of spread: variance

[Figure: two bar charts over the values 1 to 18, vertical axes 0 to 45, showing different amounts of spread.]


Remember: Variance

sample variance = s²

sample standard deviation = s = √(variance)
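
The formula for the variance did not survive in this transcript. The sketch below assumes the usual sample variance with divisor n − 1 (sum of squared deviations from the mean, divided by n − 1); the class name and the data values are made up for illustration.

```java
class SampleVariance {
    // Arithmetic mean of the observations.
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    // Sample variance s^2 with divisor n - 1 (assumed; the slide's own formula is missing).
    static double variance(double[] x) {
        double m = mean(x);
        double sumOfSquares = 0.0;
        for (double v : x) {
            sumOfSquares += (v - m) * (v - m);
        }
        return sumOfSquares / (x.length - 1);
    }

    // Sample standard deviation s = square root of the variance.
    static double standardDeviation(double[] x) {
        return Math.sqrt(variance(x));
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9}; // hypothetical observations
        System.out.println("variance = " + variance(data));
        System.out.println("std devn = " + standardDeviation(data));
    }
}
```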


Standard deviation

Standard deviation determines the ‘width’ of the curve:

[Figure: two normal curves, one with Std. Devn. = 2 and one with Std. Devn. = 1.]


Remember: Cumulative frequencies

Number of piglets in a litter (discrete data):

Litter size   Frequency   Cum. Freq.
     5            1            1
     6            0            1
     7            2            3
     8            3            6
     9            3            9
    10            9           18
    11            8           26
    12            5           31
    13            3           34
    14            2           36

Total            36

c_K = n (the last cumulative frequency equals the total number of observations)
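
The cumulative column is just a running total of the frequency column. A minimal sketch, using the piglet frequencies from the table (the class and method names are invented for this example):

```java
class CumulativeFrequency {
    // Returns the running totals of the given frequencies.
    static int[] cumulative(int[] frequency) {
        int[] cum = new int[frequency.length];
        int running = 0;
        for (int k = 0; k < frequency.length; k++) {
            running += frequency[k];
            cum[k] = running;
        }
        return cum;
    }

    public static void main(String[] args) {
        int firstLitterSize = 5;
        int[] frequency = {1, 0, 2, 3, 3, 9, 8, 5, 3, 2}; // litter sizes 5 to 14
        int[] cum = cumulative(frequency);
        for (int k = 0; k < cum.length; k++) {
            System.out.println((firstLitterSize + k) + "  " + frequency[k] + "  " + cum[k]);
        }
    }
}
```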


Remember: Plotting

[Figure: two plots of the litter data, one of frequency and one of cumulative frequency.]


Cumulative normal distribution

[Figure: cumulative normal distribution curve; vertical axis 0 to 1.2.]

For good demo, go to: http://www.vertex42.com/ExcelArticles/mc/NormalDistribution-Excel.html and download the Excel file


Relationship between the two distributions

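[Figure: the normal density curve (vertical axis 0 to 0.25) and the cumulative normal curve (vertical axis 0 to 1.2).]

The area under the density curve to the left of a point equals the height of the cumulative curve at that point; in the figure, area under curve = 0.84.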


Probability of sample lying within mean ± 2 standard deviations

   x     Prob Dist      Cum Prob Dist
  2.0    0.00013383        0.003%
  2.4    0.000291947       0.007%
  2.8    0.000611902       0.016%
  3.2    0.001232219       0.034%
  3.6    0.002384088       0.069%
  4.0    0.004431848       0.135%
  4.4    0.007915452       0.256%
  4.8    0.013582969       0.466%
  5.2    0.02239453        0.820%
  5.6    0.035474593       1.390%
  6.0    0.053990967       2.275%
  6.4    0.078950158       3.593%
  6.8    0.110920835       5.480%
  7.2    0.149727466       8.076%
  7.6    0.194186055      11.507%
  8.0    0.241970725      15.866%
  8.4    0.289691553      21.186%
  8.8    0.333224603      27.425%
  9.2    0.36827014       34.458%
  9.6    0.391042694      42.074%
 10.0    0.39894228       50.000%

Mean (μ) = 10.0

Std. devn (σ) = 2.0

P(X < μ – σ) = 15.866%

P(X < μ – 2σ) = 2.275%

P(μ – 2σ < X < μ + 2σ) = (100 – 2 × 2.275)% = 95.45%
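
The slide reads these probabilities from a table. As a cross-check, the area under the normal density between μ – 2σ and μ + 2σ can be approximated numerically; it should come out close to 100% – 2 × 2.275% = 95.45%. The sketch below uses the standard normal density formula and the trapezium rule; the class and method names are invented for this example, and the integration method is a choice made here, not something taken from the lecture.

```java
class NormalArea {
    // Normal probability density with mean mu and standard deviation sigma.
    static double pdf(double x, double mu, double sigma) {
        double z = (x - mu) / sigma;
        return Math.exp(-0.5 * z * z) / (sigma * Math.sqrt(2.0 * Math.PI));
    }

    // Approximates the area under the density between a and b
    // using the trapezium rule with the given number of steps.
    static double area(double a, double b, double mu, double sigma, int steps) {
        double h = (b - a) / steps;
        double sum = 0.5 * (pdf(a, mu, sigma) + pdf(b, mu, sigma));
        for (int i = 1; i < steps; i++) {
            sum += pdf(a + i * h, mu, sigma);
        }
        return sum * h;
    }

    public static void main(String[] args) {
        double mu = 10.0;
        double sigma = 2.0;
        double p = area(mu - 2 * sigma, mu + 2 * sigma, mu, sigma, 10000);
        System.out.println("P(mu - 2 sigma < X < mu + 2 sigma) is approximately " + p);
    }
}
```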


Probability of sample lying within mean ± 2 standard deviations

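[Figure: normal curve with μ – 2σ, μ and μ + 2σ marked on the horizontal axis; 2.275% of the area lies in each tail and 95.45% lies between μ – 2σ and μ + 2σ.]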


Uniform probability distribution

Also called rectangular distribution

[Figure: density P(X) of height 1.0 for x between 0.0 and 1.0, with a point y marked on the x axis.]

P(X < y) = 1.0 × y = y


Uniform probability distribution

[Figure: density P(X) of height 1.0 for x between 0.0 and 1.0, with points a and b marked on the x axis.]

P(X < a) = a

P(X < b) = b

P(a ≤ X < b) = b – a
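
The relation P(a ≤ X < b) = b – a can be checked empirically by drawing many uniform samples and counting how many land in the interval. The sketch below uses the standard class java.util.Random with a fixed seed so that runs are repeatable; the class name, method name and the values of a and b are chosen only for illustration.

```java
import java.util.Random;

class UniformInterval {
    // Estimates P(a <= X < b) for X uniform on [0.0, 1.0)
    // as the fraction of samples that land in [a, b).
    static double estimate(double a, double b, int samples, long seed) {
        Random random = new Random(seed);
        int hits = 0;
        for (int i = 0; i < samples; i++) {
            double x = random.nextDouble();
            if (x >= a && x < b) {
                hits++;
            }
        }
        return (double) hits / samples;
    }

    public static void main(String[] args) {
        // Should print a value close to b - a = 0.3.
        System.out.println(estimate(0.2, 0.5, 1_000_000, 42L));
    }
}
```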


Sampling from a distribution

Suppose we want a stream of numbers sampled from a given distribution.

Previously we had the sample data and wanted the distribution.

Now we have the distribution and want sample data.

Simplest to sample from the uniform distribution between 0.0 and 1.0:

0.6282666143546787 0.1874836450093842 0.1345094277951323
0.0720166704579284 0.5892161544310359 0.9375335692470778
0.6377396244822982 0.6832029056956863 0.8196076287840244
0.3689553414430091 0.6597233555218959 0.9969146442986877
0.0867381632942044 0.4262198006313059 0.3064954363270612
0.7706191731891433 0.7327364126731544 0.6184114671445469
0.4400410508617185 0.7270704022602184 ...

Use a random number generator


Random Number Generators

From a given starting number (the seed) there are algorithms which will generate a series of pseudo-random numbers which are uniformly distributed:

• linear congruential pseudorandom number generator

• but you didn’t want to know this!

Computers are deterministic:

• from a given starting point they always do the same thing;

• how do we get different series?

• start from different seeds:

    – choose the seed yourself

    – derive it from the computer clock (date and time of day)
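
For the curious, a linear congruential generator repeatedly applies state = (a × state + c) mod m. The sketch below is a toy version: the class name is invented, and the constants are a commonly quoted 32-bit parameter set used here purely for illustration, not the ones used by any particular Java class.

```java
class SimpleLcg {
    // Illustrative constants only (assumed for this example).
    private static final long MULTIPLIER = 1664525L;
    private static final long INCREMENT = 1013904223L;
    private static final long MODULUS = 1L << 32;

    private long state;

    // The seed fixes the starting state, and therefore the whole series.
    SimpleLcg(long seed) {
        state = Math.floorMod(seed, MODULUS);
    }

    // Advances the state and returns a value x with 0.0 <= x < 1.0.
    double next() {
        state = (MULTIPLIER * state + INCREMENT) % MODULUS;
        return (double) state / MODULUS;
    }

    public static void main(String[] args) {
        SimpleLcg fixed = new SimpleLcg(12345L);                     // seed chosen by hand
        SimpleLcg clock = new SimpleLcg(System.currentTimeMillis()); // seed from the clock
        for (int i = 0; i < 5; i++) {
            System.out.println(fixed.next() + "  " + clock.next());
        }
    }
}
```

Running the program twice gives the same first column each time (fixed seed) but a different second column (clock seed).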


Java support – class Math

static double random()

• returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0 (0.0 ≤ x < 1.0);

• when this method is first called, it creates a single new pseudo-random-number generator (seed derived automatically) which is used thereafter for all calls to this method and is used nowhere else.

public void randGen()   // demo of Math.random()
{
    for (int i = 0; i < 20; i++)
        System.out.println(Math.random());
}


Java support – class java.util.Random

Construct a random number generator:

• public Random()
  Creates a new random number generator; this constructor sets the seed of the random number generator to a value very likely to be distinct from any other invocation of this constructor.

• public Random(long seed)
  Creates a new random number generator using a single long seed.


Java support – class java.util.Random

Get the (next) random number:

• public double nextDouble()
  just like Math.random()

• public int nextInt(int n)
  returns a pseudo-random, uniformly distributed int value between 0 (inclusive) and the specified value (exclusive)

• public double nextGaussian()
  returns the next pseudo-random, Gaussian ("normally") distributed double value with mean 0.0 and standard deviation 1.0 from this random number generator's sequence.
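
Since nextGaussian() always has mean 0.0 and standard deviation 1.0, a sample with another mean and standard deviation can be obtained as mean + stdDev × nextGaussian(). The sketch below does this for mean 10.0 and standard deviation 2.0 and counts how many samples fall within two standard deviations of the mean, which should be roughly 95%. The class and method names, the seed and the sample size are assumptions made for this example.

```java
import java.util.Random;

class GaussianSampling {
    // Draws one sample from a normal distribution with the given mean and standard deviation.
    static double sample(Random random, double mean, double stdDev) {
        return mean + stdDev * random.nextGaussian();
    }

    // Fraction of samples lying strictly within mean +/- 2 standard deviations.
    static double fractionWithinTwoSigma(int samples, double mean, double stdDev, long seed) {
        Random random = new Random(seed); // fixed seed: the same series on every run
        int inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = sample(random, mean, stdDev);
            if (x > mean - 2 * stdDev && x < mean + 2 * stdDev) {
                inside++;
            }
        }
        return (double) inside / samples;
    }

    public static void main(String[] args) {
        System.out.println(fractionWithinTwoSigma(100000, 10.0, 2.0, 1L));
    }
}
```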


Testing the uniformity

public void testUniform(int numberOfSamples, int numberOfBins){
    double[] hist = new double[numberOfBins];          // number of samples in each bin
    double sample;
    int binNumber;
    for (int i = 0; i < numberOfSamples; i++){
        sample = Math.random();                        // 0.0 <= sample < 1.0
        binNumber = (int) (sample * numberOfBins);     // 0 .. numberOfBins - 1
        hist[binNumber]++;
    }
    double relativeFrequency;
    for (int k = 0; k < numberOfBins; k++){
        relativeFrequency = hist[k]/numberOfSamples;   // should be close to 1.0/numberOfBins
        System.out.println(relativeFrequency);
    }
    System.out.println();
}


Simulating coin toss

import java.util.Random;

public String coinToss(int n){
    String s = "";
    Random random = new Random();
    for (int i = 0; i < n; i++) {
        int t = random.nextInt(2);   // i.e. t = 0 or 1
        if (t == 0)                  // we want this to happen with probability 0.5
            s = s + "T ";
        else
            s = s + "H ";
    }
    return s;
}


Simulating dice throw

public String diceThrow(int n){
    String s = "";
    Random random = new Random();
    for (int i = 0; i < n; i++)
        s = s + (random.nextInt(6) + 1) + " ";   // nextInt(6) gives 0..5, so add 1 to get 1..6
    return s;
}


Simulating picking balls

public String pickABall(int n){
    String s = "";
    Random random = new Random();
    for (int i = 0; i < n; i++) {
        double b = random.nextDouble();
        if (b < 0.3)   // we want this to happen with probability 0.3
            s = s + "R ";
        else
            s = s + "W ";
    }
    return s;
}


‘Foxes and rabbits’ simulation

Rabbits and foxes in an enclosed field:

• example of a “predator-prey” simulation;

• see Barnes and Kölling, Objects first with Java, Chapter 10.

The field:

• has a fixed number of square cells arranged in a square grid;

• each cell can be occupied by only one animal;

• animals can’t leave the field.


Animals

All animals have:

• a state (alive or dead!)

• an age

• a location in the field

All animals do:

• get older

• breed

• try to move to a new location

• die of old age

• die of overcrowding

Foxes:

• die of hunger

Rabbits:

• die from being eaten by a fox


Breeding

Rabbits:

BREEDING_PROBABILITY = 0.15;

MAX_LITTER_SIZE = 5;

private int breed()   // returns size of litter (if any)
{
    int births = 0;
    if (rand.nextDouble() <= BREEDING_PROBABILITY) {
        births = rand.nextInt(MAX_LITTER_SIZE) + 1;
    }
    return births;
}

Foxes:

BREEDING_PROBABILITY = 0.09;

MAX_LITTER_SIZE = 3;


Breeding probability

[Figure: uniform density P(X) of height 1.0 for x between 0.0 and 1.0.]

if (rand.nextDouble() <= BREEDING_PROBABILITY) ...


Number of births with each litter size equally likely

[Figure: probabilities P(X) for litter sizes 1, 2, 3, ... MAX_LITTER_SIZE, each of height 1 / MAX_LITTER_SIZE.]

births = rand.nextInt(MAX_LITTER_SIZE) + 1


Number of births with different probabilities of litter sizes

[Figure: probabilities P(X) for litter sizes 1, 2, 3, ... MAX_LITTER_SIZE, with differing heights p1, p2, p3, ... pMAX.]

Given p1, p2, ... pMAX, how do you use a random number generator to generate a litter size?
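
One common way to approach this question (not necessarily the answer the lecture had in mind) is to draw a uniform number u between 0.0 and 1.0 and walk along the cumulative probabilities p1, p1 + p2, ... until the running total exceeds u. In the sketch below the class name, the method name and the probability values are all hypothetical.

```java
import java.util.Random;

class LitterSize {
    // Returns a litter size in 1..p.length, where p[k] is the probability of size k + 1
    // and u is a uniform random number with 0.0 <= u < 1.0.
    // Assumes the probabilities are non-negative and sum to 1.
    static int pick(double[] p, double u) {
        double cumulative = 0.0;
        for (int k = 0; k < p.length; k++) {
            cumulative += p[k];
            if (u < cumulative) {
                return k + 1;
            }
        }
        return p.length; // only reached if rounding leaves the total slightly below u
    }

    public static void main(String[] args) {
        double[] p = {0.1, 0.2, 0.4, 0.2, 0.1}; // hypothetical p1 .. pMAX
        Random rand = new Random();
        int[] tally = new int[p.length];
        for (int i = 0; i < 100000; i++) {
            tally[pick(p, rand.nextDouble()) - 1]++;
        }
        for (int k = 0; k < tally.length; k++) {
            System.out.println((k + 1) + ": " + tally[k] / 100000.0);
        }
    }
}
```

The printed relative frequencies should be close to the probabilities in p, in the same way that testUniform checks the uniform generator.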


Have a good Easter!