Agenda: Information – Informal evaluation – 2 questionnaires + eval. Recap from last time...

Agenda

• Information
– Informal evaluation
– 2 questionnaires + eval.

• Recap from last time
– Variable: definition and types
– Distribution
– Measures of center

• Probability
A. Definitions
B. Definitions and basic rules
C. Calculation rule and independence

• Today's exercises

Mean (gennemsnittet)

• The mean is the sum of the observations divided by the number of observations
– n denotes the number of observations (the sample size)
– y_1, y_2, y_3, …, y_i, …, y_n denote the n observations
– \bar{y} denotes the mean

• It is the center of mass

$$\bar{y} = \frac{y_1 + y_2 + \dots + y_n}{n} = \frac{\sum_{i} y_i}{n}$$
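
As a minimal sketch of the definition above, the mean can be computed directly from a list of observations. The sample values and the function name `mean` are hypothetical and chosen only for illustration.

```python
def mean(observations):
    """Return the sum of the observations divided by their count n."""
    n = len(observations)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(observations) / n


# Hypothetical observations y_1, ..., y_n (not taken from the slides).
sample = [3, 5, 7]
y_bar = mean(sample)
print(y_bar)
```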

Standard Deviation (standardafvigelsen)

• Gives a measure of variation by summarizing the deviations of each observation from the mean and calculating an adjusted average of these deviations.

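$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

| Site | Obs. 1 | Obs. 2 | Obs. 3 | Sum | n | Mean | Std. dev. |
|---|---|---|---|---|---|---|---|
| A | 5 | 5 | 5 | 15 | 3 | 5 | 0.0 |
| B | 4 | 5 | 6 | 15 | 3 | 5 | 1.0 |
| C | 3 | 5 | 7 | 15 | 3 | 5 | 2.0 |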
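
The table values can be checked with a short sketch of the sample standard deviation (dividing by n - 1, as in the formula). The function name `sample_std` is an illustrative choice; the observations are the three sites from the table.

```python
import math


def sample_std(xs):
    """Sample standard deviation: sqrt of squared deviations summed over n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    x_bar = sum(xs) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in xs)
    return math.sqrt(squared_deviations / (n - 1))


# Observations for sites A, B and C from the table.
sites = {"A": [5, 5, 5], "B": [4, 5, 6], "C": [3, 5, 7]}
std_by_site = {name: sample_std(obs) for name, obs in sites.items()}
print(std_by_site)
```

All three sites share the same mean of 5, so only the spread around it differs.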

A. Learning Objectives

1. Random Phenomena

2. Law of Large Numbers

3. Probability

4. Independent Trials (trial = forsøg / eksperiment in Danish)

5. Finding probabilities

Learning Objective 1: Random Phenomena

• For a random phenomenon, the outcome is uncertain
– In the short run, the proportion of times that something happens is highly random
– In the long run, the proportion of times that something happens becomes very predictable

Probability quantifies long-run randomness

Learning Objective 2: Law of Large Numbers

• As the number of trials increases, the proportion of occurrences of any given outcome approaches a particular number “in the long run”

• For example, as one tosses a die, in the long run 1/6 of the observations will be a 6.

• What do we get in the long run if we roll a die and compute the proportion of rolls that are greater than 3?
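
The long-run behaviour can be explored with a small simulation. This is only a sketch: the function name `running_proportion`, the seed, and the chosen numbers of rolls are assumptions made for illustration, and a pseudo-random generator stands in for a physical die.

```python
import random


def running_proportion(n_rolls, event, seed=0):
    """Roll a simulated fair die n_rolls times; return the share of rolls where event holds."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_rolls):
        if event(rng.randint(1, 6)):
            hits += 1
    return hits / n_rolls


def is_six(roll):
    return roll == 6


# The proportion of sixes for a short, a medium and a long run of rolls.
proportions = {n: running_proportion(n, is_six) for n in (10, 1000, 100000)}
print(proportions)
```

Passing a different event function (for example one that tests `roll > 3`) estimates the proportion asked about in the exercise.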

Learning Objective 3: Probability

• With random phenomena, the probability of a particular outcome is the proportion of times that the outcome would occur in a long run of observations

• Example:
– When rolling a die, the outcome of “6” has probability = 1/6. In other words, the proportion of times that a 6 would occur in a long run of observations is 1/6.

• Exercise:
– We draw a card from a deck of 4 x 13 = 52 playing cards (and put it back again). In what proportion of draws do we get a card above 10 (in the long run)?

Learning Objective 4: Independent Trials

• Different trials of a random phenomenon are independent if the outcome of any one trial is not affected by the outcome of any other trial.

• Example:
– If you have 20 flips of a coin in a row that are “heads”, you are not “due” a “tail”: the probability of a tail on your next flip is still 1/2. The trial of flipping a coin is independent of previous flips.

Learning Objective 5: How can we find Probabilities?

• Observe many trials of the random phenomenon and use the sample proportion of the number of times the outcome occurs as its probability. This is merely an estimate of the actual probability.

• Calculate theoretical probabilities based on assumptions about the random phenomenon. For example, it is often reasonable to assume that outcomes are equally likely, such as when flipping a coin or rolling a die.

A. Learning Objectives

1. Random Phenomena

2. Law of Large Numbers

3. Probability

4. Independent Trials (trial = forsøg / eksperiment in Danish)

5. Finding probabilities

B. Learning Objectives

1. Sample Space (Udfaldsrum) for a Trial (forsøg)

2. Event (Hændelse)

3. Probabilities for a sample space

4. Probability of an event

5. Basic rules for finding probabilities about a pair of events

6. Probability of the union of two events

7. Probability of the intersection of two events

Learning Objective 1: Sample Space (udfaldsrum) for a Trial (forsøg)

• The sample space (udfaldsrummet) is the set of all possible outcomes.

• The sample space for a test consisting of 3 questions, each of which can be answered correctly, C (correct), or incorrectly, I (incorrect), is shown in the figure.

• What is the sample space?
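
One way to list the sample space for the 3-question test is to enumerate every sequence of C and I. This is a sketch; the variable name `sample_space` is an illustrative choice.

```python
from itertools import product

# Every sequence of three answers, each either C (correct) or I (incorrect).
sample_space = ["".join(outcome) for outcome in product("CI", repeat=3)]
print(sample_space)
print(len(sample_space))
```

With 2 possible answers on each of 3 questions there are 2 x 2 x 2 = 8 outcomes.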

Learning Objective 2: Event (hændelse)

• An event (hændelse) is a subset of the sample space

• An event corresponds to a particular outcome or a group of possible outcomes.

• For example:
– Event A = student answers all 3 questions correctly = (CCC)
– Event B = student passes (at least 2 correct) = (CCI, CIC, ICC, CCC)

Learning Objective 3: Probabilities for a sample space

Each outcome, e.g. CCC, in a sample space has a probability

• The probability of each individual outcome is between 0 and 1.

• The total (the sum) of all the individual probabilities equals 1.

Learning Objective 4: Probability of an Event

• The probability of an event A is denoted by P(A)

• The probability is obtained by adding the probabilities of the individual outcomes in the event.

• When all the possible outcomes are equally likely:

$$P(A) = \frac{\text{number of outcomes in event } A}{\text{number of outcomes in the sample space}}$$
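
The counting formula can be sketched for the 3-question test, assuming (purely for illustration) that all 8 outcomes are equally likely, i.e. each answer is as likely to be correct as incorrect. The helper name `probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space of the 3-question test; assumed equally likely for this sketch.
sample_space = {"".join(o) for o in product("CI", repeat=3)}

event_a = {"CCC"}                                          # all 3 correct
event_b = {o for o in sample_space if o.count("C") >= 2}   # at least 2 correct


def probability(event, space):
    """Number of outcomes in the event divided by the number in the sample space."""
    return Fraction(len(event & space), len(space))


p_a = probability(event_a, sample_space)
p_b = probability(event_b, sample_space)
print(p_a, p_b)
```

If the outcomes are not equally likely, the individual outcome probabilities have to be added instead of counted.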

Learning Objective 4: Example: Requests to a website

• List 3 events in the sample space shown in the table.

• What is the probability that a randomly selected person ...
– has accessed a website with their mobile phone?
– has visited a website with more than 100,000 visitors?

• Which website size has the highest probability of being visited by a mobile phone user?

| Number of pages | Mobile | PC | Total |
|---|---|---|---|
| Under 25,000 | 90 | 14,010 | |
| 25,000-49,999 | 71 | 30,629 | |
| 50,000-99,999 | 69 | 24,631 | |
| 100,000 + | 80 | 10,620 | |
| Total | | | |

Learning Objective 5: Basic rules for finding probabilities about a pair of events

• Some events are expressed as the outcomes (udfald) that

1. Are not in some other event (complement of the event)

2. Are in one event and in another event (intersection of two events)

3. Are in one event or in another event (union of two events)

Learning Objective 5: Complement of an event

• The complement of an event A, written A’, consists of all outcomes in the sample space that are not in A.

• The probabilities of A and of A’ add to 1

• P(A’) = 1 – P(A)

Learning Objective 5: Disjoint events

• Two events, A and B, are disjoint if they do not have any common outcomes (udfald)

Learning Objective 5: Intersection of two events (fællesmængde)

• The intersection of A and B consists of outcomes that are in both A and B.

Learning Objective 5: Union of two events (foreningsmængde)

• The union of A and B consists of outcomes that are in A or B or in both A and B.
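
Complement, intersection, union and disjointness map directly onto set operations. The sketch below uses a single roll of a die with hypothetical events A, B and C, which are not taken from the slides.

```python
# Sample space for one roll of a die (hypothetical example).
sample_space = {1, 2, 3, 4, 5, 6}

a = {2, 4, 6}   # hypothetical event A: the roll is even
b = {4, 5, 6}   # hypothetical event B: the roll is greater than 3
c = {1}         # hypothetical event C: the roll is 1

complement_a = sample_space - a      # outcomes not in A
intersection_ab = a & b              # outcomes in both A and B
union_ab = a | b                     # outcomes in A or B or both
a_and_c_disjoint = a.isdisjoint(c)   # True when A and C share no outcomes

print(complement_a, intersection_ab, union_ab, a_and_c_disjoint)
```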

Learning Objective 6: Probability of the Union of Two Events

Addition Rule: For the union of two events, P(A or B) = P(A) + P(B) – P(A and B)

If the events are disjoint, P(A and B) = 0, so P(A or B) = P(A) + P(B)

Learning Objective 6: Example

• Event A = Mobile

• Event B = Site with more than 100,000 pages

• How do we compute P(A and B) to be 0.001?

| Number of pages | Mobile | PC | Total |
|---|---|---|---|
| Under 25,000 | 90 | 14,010 | 14,100 |
| 25,000-49,999 | 71 | 30,629 | 30,700 |
| 50,000-99,999 | 69 | 24,631 | 24,700 |
| 100,000 + | 80 | 10,620 | 10,700 |
| Total | 310 | 79,890 | 80,200 |
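
A sketch of the addition rule using the counts in the table. The dictionary layout and variable names are illustrative assumptions; only the counts come from the slide.

```python
# Visit counts from the table, keyed by site size and device.
counts = {
    "Under 25,000": {"mobile": 90, "pc": 14010},
    "25,000-49,999": {"mobile": 71, "pc": 30629},
    "50,000-99,999": {"mobile": 69, "pc": 24631},
    "100,000 +": {"mobile": 80, "pc": 10620},
}

total = sum(row["mobile"] + row["pc"] for row in counts.values())
large = counts["100,000 +"]

p_a = sum(row["mobile"] for row in counts.values()) / total   # A: mobile
p_b = (large["mobile"] + large["pc"]) / total                 # B: 100,000 + pages
p_a_and_b = large["mobile"] / total                           # mobile and 100,000 +

# Addition rule: subtract the overlap so it is not counted twice.
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_and_b, 3), round(p_a_or_b, 4))
```

P(A and B) is the single cell for mobile visits to the largest sites divided by the grand total, which rounds to 0.001.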

Learning Objective 7: Probability of the Intersection of Two Events

• Multiplication Rule: For the intersection of two independent events, A and B, P(A and B) = P(A) x P(B)

• Exercise: Roll two dice. What is the probability of getting two 6s?
– Define the events A and B.
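
One possible way to check the multiplication rule for the two-dice exercise is to enumerate all 36 equally likely outcomes. The definitions of A and B below are one reasonable choice, not necessarily the one intended on the slide.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

event_a = {o for o in outcomes if o[0] == 6}   # assumed A: first die shows 6
event_b = {o for o in outcomes if o[1] == 6}   # assumed B: second die shows 6


def probability(event):
    """Share of the equally likely outcomes that belong to the event."""
    return Fraction(len(event), len(outcomes))


p_a = probability(event_a)
p_b = probability(event_b)
p_both = probability(event_a & event_b)

print(p_a, p_b, p_both, p_both == p_a * p_b)
```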

Learning Objective 7: Example

• What is the probability of getting 3 questions correct by guessing (i.e., by chance)? A = correct.

• Probability of guessing correctly, P(A) = 0.2

What is the probability that a student answers at least 2 questions correctly?

P( ) + P( ) + P( ) + P( ) =

0.… + … = 0.104
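
The 0.104 can be reproduced with a sketch that multiplies the per-question probabilities for each passing outcome and then adds them up. It assumes the three guesses are independent, each correct with probability 0.2; the names are illustrative.

```python
from itertools import product

p_correct = 0.2  # probability of guessing one question correctly (from the slide)


def outcome_probability(outcome):
    """Multiply per-question probabilities, assuming the guesses are independent."""
    prob = 1.0
    for answer in outcome:
        prob *= p_correct if answer == "C" else 1 - p_correct
    return prob


# Outcomes with at least 2 correct answers out of 3.
passing = ["".join(o) for o in product("CI", repeat=3) if o.count("C") >= 2]

p_all_correct = outcome_probability("CCC")
p_pass = sum(outcome_probability(o) for o in passing)
print(passing, round(p_all_correct, 3), round(p_pass, 3))
```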

Learning Objective 7: Events Often Are Not Independent

Don’t assume that events are independent unless you have given this assumption careful thought and it seems plausible.

| Coin toss | Die shows 1-2 | Die shows 3-4 | Die shows 5-6 |
|---|---|---|---|
| Tails | ½ x ⅓ | ½ x ⅓ | ½ x ⅓ |
| Heads | ½ x ⅓ | ½ x ⅓ | ½ x ⅓ |

Learning Objective 7: Events Often Are Not Independent

• Example: A Pop Quiz with 2 Multiple Choice Questions
– Data giving the proportions for the actual responses
– Events: II, IC, CI, CC
– Probability: 0.26, 0.11, 0.05, 0.58
– P(CC) = 0.58

| Question 1 \ Question 2 | Correct | Incorrect | Total |
|---|---|---|---|
| Correct | 0.58 | 0.05 | 0.63 |
| Incorrect | 0.11 | 0.26 | 0.37 |
| Total | 0.69 | 0.31 | 1.00 |

Learning Objective 7: Events Often Are Not Independent

• Define the events A and B as follows:
– A: {first question is answered correctly}
– B: {second question is answered correctly}

• P(A) = P{(CC), (CI)} = 0.58 + 0.05 = 0.63

• P(B) = P{(CC), (IC)} = 0.58 + 0.11 = 0.69

• P(A and B) = P{(CC)} = 0.58

• If A and B were independent, P(A and B) = P(A) x P(B) = 0.63 x 0.69 = 0.43

• Thus, in this case, A and B are not independent!

| Question 1 \ Question 2 | Correct | Incorrect | Total |
|---|---|---|---|
| Correct | 0.58 | 0.05 | 0.63 |
| Incorrect | 0.11 | 0.26 | 0.37 |
| Total | 0.69 | 0.31 | 1.00 |
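
The comparison can be sketched from the four outcome proportions of the pop quiz. The rounding tolerance of 0.005 is an assumption made here to allow for the two-decimal rounding of the data; it is not part of the slides.

```python
# Observed proportions for the four outcomes (first answer, second answer).
joint = {"CC": 0.58, "CI": 0.05, "IC": 0.11, "II": 0.26}

p_a = joint["CC"] + joint["CI"]   # A: first question correct
p_b = joint["CC"] + joint["IC"]   # B: second question correct
p_a_and_b = joint["CC"]           # both questions correct

# What P(A and B) would be if A and B were independent.
product_if_independent = p_a * p_b

# Assumed tolerance for rounding in the published proportions.
looks_independent = abs(p_a_and_b - product_if_independent) < 0.005

print(round(p_a, 2), round(p_b, 2), round(product_if_independent, 2), looks_independent)
```

The observed 0.58 is well above the 0.43 expected under independence, so the sketch reports that the events do not look independent.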

B. Learning Objectives

1. Sample Space (Udfaldsrum) for a Trial (forsøg)

2. Event (Hændelse)

3. Probabilities for a sample space

4. Probability of an event

5. Basic rules for finding probabilities about a pair of events

6. Probability of the union of two events

7. Probability of the intersection of two events

C. Learning Objectives

1. Conditional probability

2. Multiplication rule for finding P(A and B)

3. Independent events defined using conditional probability

Learning Objective 1: Conditional Probability

• For events A and B, the conditional probability of event A, given that event B has occurred, is:

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

• P(A|B) is read as “the probability of event A, given event B.” The vertical bar represents the word “given”.

• Of the times that B occurs, P(A|B) is the proportion of times that A also occurs

Learning Objective 6: Example: 1) Converting counts to probabilities

| Number of pages | Mobile | PC | Total |
|---|---|---|---|
| Under 25,000 | 90 | 14,010 | 14,100 |
| 25,000-49,999 | 71 | 30,629 | 30,700 |
| 50,000-99,999 | 69 | 24,631 | 24,700 |
| 100,000 + | 80 | 10,620 | 10,700 |
| Total | 310 | 79,890 | 80,200 |

| Number of pages | Mobile | PC | Total |
|---|---|---|---|
| Under 25,000 | 0.0011 | 0.1747 | 0.1758 |
| 25,000-49,999 | 0.0009 | 0.3819 | 0.3828 |
| 50,000-99,999 | 0.0009 | 0.3071 | 0.3080 |
| 100,000 + | 0.0010 | 0.1324 | 0.1334 |
| Total | 0.0039 | 0.9961 | 1.0000 |
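
The conversion from counts to probabilities amounts to dividing every cell by the grand total. A minimal sketch, with the data layout chosen here for illustration and rounding to 4 decimals as in the second table:

```python
# Visit counts from the first table, keyed by site size and device.
counts = {
    "Under 25,000": {"mobile": 90, "pc": 14010},
    "25,000-49,999": {"mobile": 71, "pc": 30629},
    "50,000-99,999": {"mobile": 69, "pc": 24631},
    "100,000 +": {"mobile": 80, "pc": 10620},
}

grand_total = sum(n for row in counts.values() for n in row.values())

# Divide each cell by the grand total and round to 4 decimals.
proportions = {
    size: {device: round(n / grand_total, 4) for device, n in row.items()}
    for size, row in counts.items()
}

for size, row in proportions.items():
    print(size, row)
```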

Learning Objective 1: Example 1

• What was the probability of a cell phone visit, given that the site has ≥ 100,000 pages?
– Event A: Cell phone is used
– Event B: Site has ≥ 100,000 pages

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{0.0010}{0.1334} = 0.007$$

Learning Objective 1: Example 1

• What is the probability of a cell phone visit given that the site has < 25,000 pages?

• A = Cell phone is used

• B = Pages < 25,000

• P(A and B) =

• P(B) =

• P(A|B) = 0.0063

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

| Number of pages | Mobile | PC | Total |
|---|---|---|---|
| Under 25,000 | 0.0011 | 0.1747 | 0.1758 |
| 25,000-49,999 | 0.0009 | 0.3819 | 0.3828 |
| 50,000-99,999 | 0.0009 | 0.3071 | 0.3080 |
| 100,000 + | 0.0010 | 0.1324 | 0.1334 |
| Total | 0.0039 | 0.9961 | 1.0000 |
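
Both conditional probabilities can be sketched from the rounded table values. The function name `conditional` is hypothetical; because the inputs are the rounded proportions from the table, the results differ slightly from what the raw counts would give.

```python
def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


# A = cell phone is used. Inputs are the rounded proportions from the table.
p_mobile_given_large = conditional(0.0010, 0.1334)   # B: site has 100,000 + pages
p_mobile_given_small = conditional(0.0011, 0.1758)   # B: site has under 25,000 pages

print(round(p_mobile_given_large, 4), round(p_mobile_given_small, 4))
```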

Learning Objective 3: Independent Events Defined Using Conditional Probabilities

• Two events A and B are independent if the probability that one occurs is not affected by whether or not the other event occurs

• Events A and B are independent if: P(A|B) = P(A), or equivalently, P(B|A) = P(B)

• If events A and B are independent, P(A and B) = P(A) x P(B)

Learning Objective 3: Checking for Independence

• To determine whether events A and B are independent:
– Is P(A|B) = P(A)?
– Is P(B|A) = P(B)?
– Is P(A and B) = P(A) x P(B)?

• If any of these is true, the others are also true and the events A and B are independent

| Coin toss | Die shows 1-2 | Die shows 3-4 | Die shows 5-6 |
|---|---|---|---|
| Tails | | | |
| Heads | | | |
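
The three checks can be sketched for the coin-and-die table, assuming each cell has probability ½ x ⅓ (a fair coin and a fair die). That cell value, and the choice of A as heads and B as a roll of 5-6, are assumptions made for illustration.

```python
from fractions import Fraction

coins = ("tails", "heads")
die_groups = ("1-2", "3-4", "5-6")

# Assumed joint probabilities: every cell equals 1/2 x 1/3.
joint = {(coin, group): Fraction(1, 2) * Fraction(1, 3)
         for coin in coins for group in die_groups}

# Assumed events: A = the coin shows heads, B = the die shows 5 or 6.
p_a = sum(p for (coin, _), p in joint.items() if coin == "heads")
p_b = sum(p for (_, group), p in joint.items() if group == "5-6")
p_a_and_b = joint[("heads", "5-6")]

checks = {
    "P(A|B) = P(A)": p_a_and_b / p_b == p_a,
    "P(B|A) = P(B)": p_a_and_b / p_a == p_b,
    "P(A and B) = P(A) x P(B)": p_a_and_b == p_a * p_b,
}
print(checks)
```

Exact fractions are used so that the equality tests are not affected by floating-point rounding.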