Biserial Correlation

Dr. Meenakshi Shukla, Assistant Professor, Department of Psychology, Magadh University, Bodh Gaya

Transcript of Biserial Correlation

Page 1: Biserial Correlation


Page 2: Biserial Correlation

What is Biserial Correlation?

❑ Suppose you have a set of bivariate data from a bivariate normal distribution. The two variables have a correlation, sometimes called the product-moment correlation coefficient. Now suppose one of the variables is dichotomized by creating a binary variable that is zero if the original variable is less than a certain value and one otherwise.

❑ For example, you may want to calculate the correlation between IQ and the score on a certain test, but the only measurement available is whether the test was passed or failed. You could then use the biserial correlation to estimate the more meaningful product-moment correlation.

Page 3: Biserial Correlation

• The biserial correlation is a correlation between a continuous variable and a binary variable, where the binary variable is not a true binary variable but a continuous variable that has been dichotomized.

• Biserial correlation (rbis or rb) is a correlational index that estimates the strength of the relationship between an artificially dichotomous variable and a true continuous variable. Both variables are assumed to be normally distributed in their underlying populations.

Page 4: Biserial Correlation

Assumptions of Biserial Correlation

• Assumption #1: Both variables should be measured on a continuous scale.

• Assumption #2: One of the variables should be made dichotomous. Examples of such artificially dichotomous variables include Pass or Fail, above or below 75 attendance, Happy or Sad, and so forth.

• Assumption #3: There should be no outliers in either continuous variable. You can test for outliers using boxplots.

• Assumption #4: Your continuous variables should be approximately normally distributed. You can test this using the Shapiro-Wilk test of normality.

• Assumption #5: Your continuous variables should have equal variances. You can test this using Levene's test of equality of variances.

Page 5: Biserial Correlation

Formula:

$$ r_b = \frac{M_1 - M_0}{SD_t} \times \frac{pq}{y} $$

Where,

𝑀0 = mean score for data pairs for x=0,

𝑀1 = mean score for data pairs for x=1,

q = proportion of data pairs for x=0,

p = proportion of data pairs for x=1,

SDt = standard deviation of the continuous scores for the whole sample (both groups combined),

y = ordinate or the height of the standard normal distribution at the point which divides the

proportions of p and q
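The formula above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `biserial_correlation` is hypothetical, the ordinate y is computed with the standard library's `statistics.NormalDist` instead of being read from a printed table, and SDt is taken as the sample standard deviation (dividing by n − 1), an assumption that matches the worked example later in these slides.

```python
from statistics import NormalDist, mean, stdev


def biserial_correlation(scores, groups):
    """Biserial correlation between continuous `scores` and 0/1 `groups`.

    Hypothetical helper for illustration. `groups` must hold 1 for the
    "upper" category (e.g. Pass) and 0 for the other (e.g. Fail).
    """
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    if not ones or not zeros:
        raise ValueError("both categories must contain at least one score")

    p = len(ones) / len(scores)   # proportion coded 1
    q = 1 - p                     # proportion coded 0
    sd_t = stdev(scores)          # assumption: n - 1 in the denominator

    # Height of the standard normal curve at the point that splits
    # the area into q (below) and p (above).
    std_normal = NormalDist()
    y = std_normal.pdf(std_normal.inv_cdf(q))

    return (mean(ones) - mean(zeros)) / sd_t * (p * q / y)
```

Because p, q and y are not rounded here, the result can differ in the third decimal place from a hand calculation that uses two-decimal proportions and a table value for y.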

Page 6: Biserial Correlation

A teacher wants to determine whether there is a relationship between the results of the

students (Pass or Fail) and the number of hours per week that they devoted to their studies.

The data of 14 students is given below. Calculate biserial correlation from the data given

below:

Result Study hours

Pass 2

Pass 3

Pass 3

Pass 4

Pass 5

Pass 3

Pass 3

Pass 2

Pass 1

Fail 0

Fail 3

Fail 5

Fail 0

Fail 1

Page 7: Biserial Correlation

Result Study hours

Pass (p) 2

Pass (p) 3

Pass (p) 3

Pass (p) 4

Pass (p) 5

Pass (p) 3

Pass (p) 3

Pass (p) 2

Pass (p) 1

Fail (q) 0

Fail (q) 3

Fail (q) 5

Fail (q) 0

Fail (q) 1

Let’s code Pass as 1 and Fail as 0. Then the proportion of students who passed is denoted by p and the proportion who failed is denoted by q.
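The coding step can be sketched in Python as below. The variable names are illustrative only; the Pass/Fail labels are the 14 results from the table above.

```python
# Results for the 14 students, in the order listed on the slide.
results = ["Pass"] * 9 + ["Fail"] * 5

# Code Pass as 1 and Fail as 0.
codes = [1 if r == "Pass" else 0 for r in results]

p = sum(codes) / len(codes)  # proportion coded 1 (Pass)
q = 1 - p                    # proportion coded 0 (Fail)

print(f"p = {p:.2f}, q = {q:.2f}")
```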

Page 8: Biserial Correlation

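$$ r_b = \frac{M_1 - M_0}{SD_t} \times \frac{pq}{y} $$

$$ M_1 = \frac{2 + 3 + 3 + 4 + 5 + 3 + 3 + 2 + 1}{9} = \frac{26}{9} = 2.89 $$

$$ M_0 = \frac{0 + 3 + 5 + 0 + 1}{5} = \frac{9}{5} = 1.80 $$

$$ SD_t = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}} = \sqrt{\frac{33.50}{13}} = \sqrt{2.5769} = 1.605 $$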
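The two group means and the overall standard deviation can be checked with Python's standard library. This is a small sketch with illustrative variable names; `statistics.stdev` divides by n − 1, which is what the 13 in the denominator above corresponds to.

```python
from statistics import mean, stdev

pass_hours = [2, 3, 3, 4, 5, 3, 3, 2, 1]  # study hours, Pass group
fail_hours = [0, 3, 5, 0, 1]              # study hours, Fail group

m1 = mean(pass_hours)                  # mean for x = 1 (Pass)
m0 = mean(fail_hours)                  # mean for x = 0 (Fail)
sd_t = stdev(pass_hours + fail_hours)  # all 14 scores, n - 1 denominator

print(f"M1 = {m1:.2f}, M0 = {m0:.2f}, SDt = {sd_t:.3f}")
```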

Page 9: Biserial Correlation

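$$ r_b = \frac{M_1 - M_0}{SD_t} \times \frac{pq}{y} $$

p = 9/14 = .64

q = 5/14 = .36

Area from mean = .50 − .36 = .14

• In the ordinate table, look up .14 under ‘Area from mean’ and read off the corresponding value of ‘y’, which is the ordinate.

[Figure: normal curve divided 50% / 50% at the mean, with the areas p and q marked]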
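If no ordinate table is at hand, the value of y can be approximated in Python. The sketch below uses `statistics.NormalDist` from the standard library; the variable names are illustrative. It computes y two ways: from the unrounded proportion q, and at z = .36, which is assumed here to be the table row nearest to an area of .14 from the mean.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

q = 5 / 14                  # proportion of Fail
area_from_mean = 0.5 - q    # about .14

# Exact route: find the z that cuts off area q, then take the curve height.
z_cut = std_normal.inv_cdf(q)
y_exact = std_normal.pdf(z_cut)

# Table route: an area of about .14 from the mean sits near z = .36.
y_table = std_normal.pdf(0.36)

print(f"area from mean = {area_from_mean:.2f}")
print(f"y (exact q) = {y_exact:.4f}, y (z = .36) = {y_table:.4f}")
```

The small gap between the two values comes only from rounding the area to two decimals before the lookup.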

Page 10: Biserial Correlation
Page 11: Biserial Correlation

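$$
\begin{aligned}
r_b &= \frac{M_1 - M_0}{SD_t} \times \frac{pq}{y} \\
    &= \frac{2.89 - 1.80}{1.605} \times \frac{.64 \times .36}{.3739} \\
    &= \frac{1.09}{1.605} \times \frac{.2304}{.3739} \\
    &= .6791 \times .6162 \\
    &= .42
\end{aligned}
$$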

Calculating biserial correlation from point-biserial correlation, and vice-versa

$$ r_b = r_{pb} \times \frac{\sqrt{pq}}{y} $$
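The conversion in both directions can be sketched in Python as below. The function names are hypothetical, and the example call reuses the rounded values from the worked example (r_b = .42, p = .64, y = .3739) purely for illustration.

```python
from math import sqrt


def biserial_from_point_biserial(r_pb, p, y):
    """r_b = r_pb * sqrt(p * q) / y, with q = 1 - p."""
    return r_pb * sqrt(p * (1 - p)) / y


def point_biserial_from_biserial(r_b, p, y):
    """Inverse of the conversion above: r_pb = r_b * y / sqrt(p * q)."""
    return r_b * y / sqrt(p * (1 - p))


# Rounded values taken from the worked example.
r_pb = point_biserial_from_biserial(0.42, p=0.64, y=0.3739)
r_b_again = biserial_from_point_biserial(r_pb, p=0.64, y=0.3739)

print(f"r_pb = {r_pb:.4f}, back to r_b = {r_b_again:.2f}")
```

Since sqrt(pq)/y is greater than 1 for any split, the biserial coefficient is always larger in magnitude than the point-biserial coefficient computed from the same data.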

Page 12: Biserial Correlation

Significance testing

2 × .42

5

12

514

=

= -.27

• Using the z-table, check the p-value to determine the significance of the biserial correlation.

• Remember to multiply the table value by 2 to get a two-tailed p-value in case of a two-tailed hypothesis.

Page 13: Biserial Correlation
Page 14: Biserial Correlation

• The p-value obtained from the table is .39358. Since it is the p-value for a one-tailed test, multiply it by 2 to get the p-value for a two-tailed test.

• If you have a specific one-tailed hypothesis, then you can use the one-tailed value from the table and will not need to multiply it by 2.

• To recap the concept of one-tailed and two-tailed tests, see the next two slides.

p-value for two-tailed test = .39358 × 2 = .78716 ≈ .79

Since the p-value is .79, the biserial correlation is non-significant. This means that there is not a significant relationship between result and study hours.
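The z-table lookup can be cross-checked in Python. This is a minimal sketch that takes the test statistic z = −.27 from the significance-testing slide as given and only reproduces the conversion from z to one-tailed and two-tailed p-values; the variable names are illustrative.

```python
from statistics import NormalDist

z = -0.27  # test statistic as reported on the significance-testing slide

# One-tailed p-value: area in the tail beyond |z| of the standard normal curve.
p_one_tailed = NormalDist().cdf(-abs(z))

# Two-tailed p-value: double the one-tailed area.
p_two_tailed = 2 * p_one_tailed

print(f"one-tailed p = {p_one_tailed:.5f}, two-tailed p = {p_two_tailed:.2f}")
```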

Page 15: Biserial Correlation
Page 16: Biserial Correlation
Page 17: Biserial Correlation

Practice question 1:

Question: From the following data, obtain biserial correlation

and interpret the result.

Negative affectivity    Scores on Beck Depression Inventory

High 0

Low 12

High 14

High 54

Low 12

High 60

Low 43

Low 36

Low 9

High 58

Page 18: Biserial Correlation

Practice question 2:

Question: From the following data, obtain biserial correlation

and interpret the result.

Results IQ

Above average 80

Above average 85

Above average 90

Above average 104

Above average 88

Above average 110

Below average 100

Below average 110

Below average 98

Below average 88

Page 19: Biserial Correlation

Practice question 3:

Question: From the following data, obtain biserial correlation and interpret the result. (Hint: Use the Assumed Mean method to calculate the mean and then follow the regular process of calculating biserial correlation.)

Class Interval X (Trained) Y (Untrained)

46-50 2 3

41-45 1 4

36-40 3 5

31-35 4 5

26-30 2 2

21-25 5 5

16-20 2 4

11-15 2 1

6-10 2 2

0-5 1 1

Help: https://www.youtube.com/watch?v=RwqkiTDCgnc&t=699s

Page 20: Biserial Correlation

Biserial Correlation using SPSS

• A teacher wants to determine whether there is a relationship between the results of the students (Pass or Fail) and the number of hours per week that they devoted to their studies. The data of 40 students is available.

• Therefore, two variables were created in the Variable View of SPSS Statistics: Result, which had two categories (“Pass” and “Fail”), and StudyHours (i.e., a variable denoting the number of hours per week that a student devoted to studies).

Page 21: Biserial Correlation

Click Analyze > Correlate > Bivariate... on the top menu, as shown below:

Page 22: Biserial Correlation

You will be presented with the following Bivariate Correlations screen:

Page 23: Biserial Correlation
Page 24: Biserial Correlation
Page 25: Biserial Correlation
Page 26: Biserial Correlation

SPSS Output

Page 27: Biserial Correlation

Thank you…