Lecture 14 Statistical Example Chapters 10 and 12.

Lecture 14

Statistical Example Chapters 10 and 12

Outline

• 10.1 Solving Simple Problems
• 10.2 Assembling Solution Steps
• 10.3 Summary of Operations
• 10.4 Solving Larger Problems
• 12.1 Behavioral Abstraction
• 12.2 Matrix Operations
• 12.3 MATLAB Implementation

“Simple” Problems

• Basic Character of the Data and Operations
  – Define the input data
  – Define the output data
  – Extract the transformations upon the input that produce the output
  – Write the transformations as code operations
• Debugging as necessary

“Not That Simple” Problems

• Build solutions to problems keeping in mind operations we know how to perform.

• Think how applying one operation might make the problem easier.

• Keep doing this, until the problem is broken down into parts that we can do.

• Build modular solutions, so that the building blocks for future problems are larger than the operations supplied by the language.

ANOVA

• Planned tests are determined before looking at the data, and post hoc tests are performed after looking at the data. Post hoc tests such as Tukey's test most commonly compare every group mean with every other group mean and typically incorporate some method of controlling for Type I errors.

• *from http://en.wikipedia.org/wiki/Analysis_of_variance

Example, Apply ANOVA

• Measure B/C preference
• Three groups of data
  – Tiz
  – Dax
  – Zup
• Is the B/C preference influenced by the use of Tiz, Dax, or Zup, or are the groups all the same?

Getting to Numbers

• Some number of tests, n, will be used to count the number of times, b, that the outcome B occurs.
• Some fraction of the time, outcome B will occur. Let this be b/n.
• If n = 1, then b/n can only be 1 or 0.
• We want to obtain a certain number of test results, so that we can calculate their mean and variance.
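
As a minimal sketch of this step, assuming each test outcome is coded 1 when B occurs and 0 otherwise, the fraction b/n could be computed as follows. The variable names and the data values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical outcomes of n = 8 tests: 1 means outcome B occurred, 0 means it did not
outcomes = [1 0 1 1 0 1 0 1];

n = numel(outcomes);   % number of tests
b = sum(outcomes);     % number of times outcome B occurred
fractionB = b / n;     % fraction of tests in which outcome B occurred
```

With a single test (n = 1), `fractionB` could only be 0 or 1, which is why several tests are collected before computing a mean and variance.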

Formulae for Mean, Variance

• Mean = sum(the results)/number of results

• Square of deviation = (one result – Mean)^2

• Variance = sum(squares of deviation)/number of results

• Standard deviation = positive square root of variance
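
These formulae translate almost directly into MATLAB. The sketch below uses made-up results and hypothetical variable names; it divides by the number of results, as the slide does.

```matlab
% Hypothetical results (for example, fractions b/n from four test sessions)
results = [0.50 0.75 0.25 0.50];

nResults = numel(results);
theMean = sum(results) / nResults;                 % mean
squaredDeviations = (results - theMean).^2;        % square of deviation, one per result
theVariance = sum(squaredDeviations) / nResults;   % variance, dividing by the number of results
theStd = sqrt(theVariance);                        % positive square root of the variance
```

Note that the built-in `var` and `std` divide by N-1 by default; passing 1 as the second argument, as in `var(results, 1)`, gives the divide-by-N form used here.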

Information from Data

• We can produce these sums over all subjects holding the shape constant.
  – We can try to find out whether shape matters.
• We can produce these sums over all shapes holding the subject constant.
  – We can try to find out whether the subjects are different from one another.

Arranging Information in an Array

• Suppose we have several subjects (A, F, G, J) and several types of test (T, D, Z) and a result (number of b choices per total choices) for each.
• We could use the subject as an index on an array.
• We could use the type of test as an index on an array.
• We could use the index of the test (A’s 12th test session with D, so, 12) as an index on an array.
• We would store the ratio b/c as the value in the location given by the indices:
  biasMeasure(subject, type, index) = # of b choices per total choices
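
A minimal sketch of such an array is shown below. It assumes the subjects and test types have been coded as integers (for example A, F, G, J as 1 to 4 and T, D, Z as 1 to 3); the sizes and the stored value are made up for illustration.

```matlab
% Hypothetical sizes: 4 subjects, 3 types of test, 2 test sessions per combination
nSubjects = 4;
nTypes = 3;
nIndices = 2;

% One element per (subject, type, index) combination
biasMeasure = zeros(nSubjects, nTypes, nIndices);

% Store a made-up result: subject 1, type 2, session 2 had 3 b choices out of 4
biasMeasure(1, 2, 2) = 3 / 4;

% All of subject 1's results form a 1-by-nTypes-by-nIndices slice
subjectSlice = biasMeasure(1, :, :);
```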

Generalizing

• The example had three dimensions, subject, type, index.

• The number of dimensions could be more or fewer.

Mean, Variance of Some Particular Thing

• Suppose we wanted the mean and standard deviation of Jack’s data, averaged over all values of index and type.
• Suppose Jack’s subject identifier is “7”.
• With biasMeasure(subject, type, index):
• Mean = (1/(nTypes*nIndices)) * sum(sum(biasMeasure(7,:,:)))
• Variance = (1/(nTypes*nIndices)) * sum(sum((biasMeasure(7,:,:)-Mean).^2))
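
The calculation for one subject can be sketched as runnable MATLAB. The array sizes and Jack's values below are made up for illustration, and the names `jackData`, `jackMean`, `jackVariance`, and `jackStd` are hypothetical.

```matlab
% Hypothetical sizes and data; only subject 7 (Jack) is filled in
nSubjects = 7;
nTypes = 3;
nIndices = 2;
biasMeasure = zeros(nSubjects, nTypes, nIndices);
biasMeasure(7, :, :) = reshape([0.2 0.4 0.6 0.4 0.6 0.8], 1, nTypes, nIndices);

jackData = biasMeasure(7, :, :);   % 1-by-nTypes-by-nIndices slice

% The inner sum runs over types, the outer sum over indices
jackMean = (1/(nTypes*nIndices)) * sum(sum(jackData));
jackVariance = (1/(nTypes*nIndices)) * sum(sum((jackData - jackMean).^2));
jackStd = sqrt(jackVariance);      % standard deviation
```

The squaring uses the element-wise operator `.^` so that each deviation is squared separately before the sums are taken.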

Mean, Variance of Something Else

• Suppose we wanted the mean and standard deviation of one type’s data, averaged over all values of index and subject.
• Suppose the type’s identifier is “3”.
• With biasMeasure(subject, type, index):
• Mean = (1/(nSubjects*nIndices)) * sum(sum(biasMeasure(:,3,:)))
• Variance = (1/(nSubjects*nIndices)) * sum(sum((biasMeasure(:,3,:)-Mean).^2))
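
When the type is held fixed, the number of values being averaged is the number of subjects times the number of indices. The sketch below, with made-up data and hypothetical variable names, flattens the slice with `(:)` so that this count can be read off with `numel` instead of being written by hand.

```matlab
% Hypothetical sizes and data; only type 3 is filled in
nSubjects = 4;
nTypes = 3;
nIndices = 2;
biasMeasure = zeros(nSubjects, nTypes, nIndices);
biasMeasure(:, 3, :) = reshape([0 1 0 1 1 1 0 0], nSubjects, 1, nIndices);

typeData = biasMeasure(:, 3, :);   % nSubjects-by-1-by-nIndices slice
typeValues = typeData(:);          % flatten to a column vector
nValues = numel(typeValues);       % equals nSubjects*nIndices

typeMean = sum(typeValues) / nValues;
typeVariance = sum((typeValues - typeMean).^2) / nValues;
```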

Analysis of Variance

The ANOVA tests the null hypothesis that samples in two or more groups are drawn from the same population. To do this, two estimates are made of the population variance. These estimates rely on various assumptions. The ANOVA produces an F statistic, the ratio of the variance calculated among the means to the variance within the samples. If the group means are drawn from the same population, the variance between the group means should be lower than the variance of the samples, following the central limit theorem. A higher ratio therefore implies that the samples were drawn from different populations.

See http://en.wikipedia.org/wiki/One-way_ANOVA and Howell, David (2002). Statistical Methods for Psychology. Duxbury. pp. 324-325
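
The F statistic described above can be sketched by hand for a one-way layout with equal group sizes. The data below are made up, the variable names are hypothetical, and this is an illustration of the ratio rather than the lecture's own code. The between-groups estimate divides by the number of groups minus one, and the within-groups estimate divides by the total number of measurements minus the number of groups.

```matlab
% Hypothetical data: each column is one group (Tiz, Dax, Zup), each row one measurement
data = [0.6 0.4 0.2;
        0.8 0.4 0.3;
        0.7 0.5 0.1;
        0.7 0.3 0.2];

[nPerGroup, nGroups] = size(data);
nTotal = nPerGroup * nGroups;

groupMeans = mean(data);      % 1-by-nGroups, mean of each column
grandMean = mean(data(:));    % mean of all measurements

% Variance estimate from the group means (between-groups mean square)
ssBetween = nPerGroup * sum((groupMeans - grandMean).^2);
msBetween = ssBetween / (nGroups - 1);

% Variance estimate from within the samples (within-groups mean square)
ssWithin = sum(sum((data - repmat(groupMeans, nPerGroup, 1)).^2));
msWithin = ssWithin / (nTotal - nGroups);

F = msBetween / msWithin;     % large values suggest the groups differ
```

To decide whether F is large enough, it is compared against the F distribution with (nGroups - 1, nTotal - nGroups) degrees of freedom. If the Statistics and Machine Learning Toolbox is available, `anova1(data)` performs the whole one-way analysis on column-grouped data.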

Knowing Array, Choosing File Design

• We would like a multidimensional array, so that we can easily calculate the variances we want.

• Let’s look at some sample code for reading a file into a multidimensional array.

The File

All entries are coded as numbers.

Sample Code - 1

[nums, text, raw] = xlsread('exmple4ANOVA.xls')

nums =

     1     1     1
     1     2     0
     1     3     1
     2     1     1
     2     2     0
     2     3     0
     3     1     1
     3     2     0
     3     3     0
     4     1     1
     4     2     1
     4     3     0

The subjects are numbered 1-4 (first column).
The types are numbered 1-3 (second column).
The outcome (b or s) is coded 1 for a b (third column).

Sample Code - 2

function outArray = fillMDArrayFrom2DNums(nums)
%fillMDArrayFrom2DNums(nums) takes an array of numbers that has been
%read in from a file
%and extracts the values (dependent variables) associated with setting of
%independent variables
%for example, subject, type and index might be independent variables
%the number of 'b' choices in n trials might be the dependent variable
%the returned array has a dimension for each of the independent variables
%the file has a column for each independent variable (except index),
%plus one column for the dependent variable
%for example, a column for subject, a column for type, a column for b
%the number of trials with the same independent variables
%is obtained by counting the number of repetitions
%the index of the trial is obtained by the value of the counter

[nRows, nColumns] = size(nums);
%now we can see how many independent variables there are
nIndependentVariables = nColumns - 1;
dependentVariables = nums(:, end); %the last column
numsExceptLast = nums(:, 1:nIndependentVariables);
allMins = min(numsExceptLast);
allMaxs = max(numsExceptLast);
allRanges = allMaxs - allMins + 1;
outArray = zeros(allRanges);
columnValue = ones(1, nIndependentVariables);
for rowIndex = 1:nRows
    if dependentVariables(rowIndex) == 1 %if there is a b to be added on
        %shift each independent variable so its smallest value maps to index 1
        for independentVariableIndex = 1:nIndependentVariables
            columnValue(independentVariableIndex) = ...
                nums(rowIndex, independentVariableIndex) ...
                - allMins(independentVariableIndex) + 1;
        end
        %note: this line indexes with exactly two independent variables
        outArray(columnValue(1), columnValue(2)) = ...
            outArray(columnValue(1), columnValue(2)) + 1;
    end
end
end

File Output

>> outArray = fillMDArrayFrom2DNums(nums)

outArray =

     1     0     1
     1     0     0
     1     0     0
     1     1     0

>> xlswrite('theOutputFile.xls', outArray);
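
fillMDArrayFrom2DNums indexes outArray with exactly two subscripts, so it handles two independent variables. As a hedged alternative sketch (not the lecture's code), the built-in `accumarray` can build the same counts for two or more independent variables without an explicit loop. The variable names `subscripts`, `outcomes`, and `shifted` are hypothetical; the numbers are the ones read from the example file.

```matlab
% The same numbers as in the example file: columns are subject, type, outcome
nums = [1 1 1; 1 2 0; 1 3 1;
        2 1 1; 2 2 0; 2 3 0;
        3 1 1; 3 2 0; 3 3 0;
        4 1 1; 4 2 1; 4 3 0];

subscripts = nums(:, 1:end-1);   % independent variables, one column each
outcomes = nums(:, end);         % dependent variable (1 for a b)

allMins = min(subscripts);
allRanges = max(subscripts) - allMins + 1;

% Shift every column so that its smallest value becomes index 1
shifted = subscripts - repmat(allMins, size(subscripts, 1), 1) + 1;

% Sum the outcomes that share the same combination of independent variables
% (assumes at least two independent variables, so allRanges has two or more entries)
outArray = accumarray(shifted, outcomes, allRanges);
```

For the example data this produces the same 4-by-3 array of counts as the loop-based function.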

The File