Data Analysis of Coded Chats: Study of correlation and regression between different dimension variables. Progress Report, VMT Meeting, Jan. 19th 2005. Fatos Xhafa, VMT Project

Page 1

Data Analysis of Coded Chats
Study of correlation and regression between different dimension variables

Progress Report, VMT Meeting, Jan. 19th 2005
Fatos Xhafa
VMT Project

Page 2

January 19th, 2005. VMT Meeting

Outline

- The variables under study
- Test for Normal distribution of variables
- Correlation between different variables
- Regression between different variables
- Discussion
  - From statistical perspective
  - From interaction based / CA perspective

Page 3

The variables under study

- Social Reference
- Pbm Solving
- Math Move

- Still at the first level of analysis
- The same sample of six powwows

Page 4

Test for Normal distributions (I)

- In correlation and regression, the variables under study are assumed to approximate a Normal distribution
- We tested the normality of the distribution of the dimension variables:
  - Social reference
  - Problem Solving
  - Math Move
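A Normal P-P plot of the kind used on the following pages can be sketched as below. This is a minimal illustration, not the procedure actually run for the study: the data values are hypothetical, and the plotting-position convention (i - 0.5)/n is an assumption (statistical packages offer several).

```python
from statistics import NormalDist, mean, stdev

def pp_points(values):
    """Return (observed, expected) cumulative probabilities for a Normal P-P plot."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    # Observed cumulative probability: plotting position of each rank.
    # The (i - 0.5) / n convention is an assumption; other conventions exist.
    observed = [(i - 0.5) / n for i in range(1, n + 1)]
    # Expected cumulative probability under the fitted Normal distribution.
    expected = [fitted.cdf(x) for x in ordered]
    return list(zip(observed, expected))

# Hypothetical percentages for six powwows (NOT the study's data).
social_ref_pct = [27.0, 29.5, 31.0, 32.5, 34.0, 16.0]
points = pp_points(social_ref_pct)
for obs, exp in points:
    print(f"observed={obs:.3f}  expected={exp:.3f}")

# Largest vertical distance from the diagonal: a rough, informal indicator only.
max_gap = max(abs(obs - exp) for obs, exp in points)
print(f"largest gap from the diagonal: {max_gap:.3f}")
```

Points lying close to the diagonal suggest a reasonable Normal approximation; a single point far from it is one informal sign of a possible outlier.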

Page 5

Test for Normal distributions (II)

Social reference dimension variable:
- Not a good approximation to Normal distribution
- Could be indicating outlier/s

[Figure: Normal P-P Plot of Percentage Social reference postings. Horizontal axis: Observed Cum Prob (0.0 to 1.0); vertical axis: Expected Cum Prob (0.0 to 1.0).]

Page 6

Test for Normal distributions (III)

Social reference dimension variable:
- Pow18 appears to be an outlier
- After removing it from the sample, a “perfect” approximation to Normal distribution is obtained

[Figure: Normal P-P Plot of Percentage Social reference postings. Horizontal axis: Observed Cum Prob (0.0 to 1.0); vertical axis: Expected Cum Prob (0.0 to 1.0).]

Page 7

Test for Normal distributions (IV)

- Pbm Solving and Math Move show good approximations to Normal distribution
- Correlation and regression between:
  - Social reference and Pbm Solving
  - Social reference and Math Move
  can be studied (pow18 excluded)
- Correlation and regression between:
  - Pbm Solving and Math Move
  can be studied for the whole sample

Page 8

Correlations (all variables are percentages of postings)

                                        Social reference   Pbm Solving   Math Move
Social reference  Pearson Correlation   1                  -.970(**)     -.942(*)
                  Sig. (2-tailed)       .                  .006          .017
                  N                     5                  5             5
Pbm Solving       Pearson Correlation   -.970(**)          1             .967(**)
                  Sig. (2-tailed)       .006               .             .007
                  N                     5                  5             5
Math Move         Pearson Correlation   -.942(*)           .967(**)      1
                  Sig. (2-tailed)       .017               .007          .
                  N                     5                  5             5

** Correlation is significant at the 0.01 level (2-tailed).
* Correlation is significant at the 0.05 level (2-tailed).
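The significance values in the Correlations table can be re-derived from the reported coefficients alone. The sketch below assumes the usual t test for a Pearson correlation with N - 2 degrees of freedom and uses the closed-form Student's t distribution for 3 degrees of freedom, so it only applies to N = 5; the function names are illustrative.

```python
import math

def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r from n pairs (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def two_tailed_p_df3(t):
    """Two-tailed p-value for Student's t with exactly 3 degrees of freedom.

    Uses the closed-form CDF for df = 3, so it is valid only when n = 5.
    """
    x = abs(t) / math.sqrt(3)
    cdf = 0.5 + (x / (1 + x * x) + math.atan(x)) / math.pi
    return 2 * (1 - cdf)

# Coefficients as reported in the Correlations table (N = 5).
reported = {
    "Social reference vs. Pbm Solving": -0.970,
    "Social reference vs. Math Move": -0.942,
    "Pbm Solving vs. Math Move": 0.967,
}
p_values = {}
for pair, r in reported.items():
    p_values[pair] = two_tailed_p_df3(t_from_r(r, 5))
    print(f"{pair}: r = {r:+.3f}, p = {p_values[pair]:.3f}")
```

To three decimals this gives .006, .017 and .007, in line with the Sig. (2-tailed) rows of the table.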

Page 9

Regression: Social reference vs. Pbm Solving

- The two variables are strongly and negatively correlated (-.970)
- What type of correlation? How are they correlated?

Page 10

Regression: Social reference vs. Pbm Solving

[Figure: Linear Regression scatter plot. Horizontal axis: Percentage Social reference postings (26.00 to 34.00); vertical axis: Percentage Pbm Solving postings (10.00 to 40.00). Points: pow1, pow2_1, pow2_2, pow9, pow10.]

Percentage Pbm Solving postings = 126.01 + -3.36 * PercSocR
R-Square = 0.94

Page 11

Analytically…

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .970(a)   .941       .921                3.17911

a Predictors: (Constant), Percentage Social reference postings

ANOVA(b)

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   480.208          1    480.208       47.514   .006(a)
Residual     30.320           3    10.107
Total        510.528          4

a Predictors: (Constant), Percentage Social reference postings
b Dependent Variable: Percentage Pbm Solving postings
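The Model Summary figures follow from the ANOVA sums of squares. The short check below uses only the values reported in the two tables, with the standard formulas for a regression with one predictor; variable names are illustrative.

```python
import math

# Values as reported in the ANOVA table (dependent: Pbm Solving, n = 5).
ss_regression = 480.208
ss_residual = 30.320
n, k = 5, 1  # number of observations and number of predictors

ss_total = ss_regression + ss_residual
df_residual = n - k - 1
ms_residual = ss_residual / df_residual

r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
std_error = math.sqrt(ms_residual)           # Std. Error of the Estimate
f_stat = (ss_regression / k) / ms_residual   # F = MS regression / MS residual

print(f"R Square          = {r_square:.3f}")
print(f"Adjusted R Square = {adj_r_square:.3f}")
print(f"Std. Error        = {std_error:.3f}")
print(f"F                 = {f_stat:.2f}")
```

Up to rounding, this reproduces the reported .941, .921, 3.179 and 47.51.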

Page 12

Analytically…

Coefficients(a)

Model 1                                B         Std. Error   Beta    t        Sig.
(Constant)                             126.014   14.683               8.582    .003
Percentage Social reference postings   -3.356    .487         -.970   -6.893   .006

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a Dependent Variable: Percentage Pbm Solving postings
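As a quick arithmetic check on the Coefficients table (a sketch using only the reported values, so small rounding differences are expected): the t statistic of the slope is its estimate divided by its standard error, and in a one-predictor regression its square should match the F statistic of the ANOVA table on page 11. The fitted line also yields predicted values; the 30% input in the second line is a hypothetical illustration, not a data point from the sample.

```latex
t = \frac{B}{\mathrm{SE}(B)} = \frac{-3.356}{0.487} \approx -6.89,
\qquad t^2 \approx 47.5 \approx F = 47.514

\widehat{\mathrm{PbmSolving}} = 126.014 - 3.356 \times 30 \approx 25.33
```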

Page 13

Regression: Social reference vs. Math Move

- The two variables are strongly and negatively correlated (-.942)
- What type of correlation? How are they correlated?

Page 14

Regression: Social reference vs. Math Move

[Figure: Linear Regression scatter plot. Horizontal axis: Percentage Social reference postings (26.00 to 34.00); vertical axis: Percentage Math Move postings (5.00 to 25.00). Points: pow1, pow2_1, pow2_2, pow9, pow10.]

Percentage Math Move postings = 89.50 + -2.44 * PercSocR
R-Square = 0.89

Page 15

Analytically…

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .942(a)   .887       .850                3.27875

a Predictors: (Constant), Percentage Social reference postings

ANOVA(b)

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   254.157          1    254.157       23.642   .017(a)
Residual     32.251           3    10.750
Total        286.408          4

a Predictors: (Constant), Percentage Social reference postings
b Dependent Variable: Percentage Math Move postings

Page 16

Analytically…

Coefficients(a)

Model 1                                B         Std. Error   Beta    t        Sig.
(Constant)                             89.505    15.143               5.911    .010
Percentage Social reference postings   -2.441    .502         -.942   -4.862   .017

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a Dependent Variable: Percentage Math Move postings

Page 17

Regression: Pbm Solving vs. Math Move

- The two variables are strongly and positively correlated (.967)
- What type of correlation? How are they correlated?

Page 18

Regression: Pbm Solving vs. Math Move

[Figure: Linear Regression scatter plot. Horizontal axis: Percentage Pbm Solving postings (20.00 to 40.00); vertical axis: Percentage Math Move postings (10.00 to 25.00). Points: pow1, pow2_1, pow2_2, pow9, pow10, pow18.]

Percentage Math Move postings = -2.63 + 0.73 * PercPbmS
R-Square = 0.91

Page 19

Regression: Math Move vs. Pbm Solving

[Figure: Linear Regression scatter plot. Horizontal axis: Percentage Math Move postings (10.00 to 25.00); vertical axis: Percentage Pbm Solving postings (20.00 to 40.00). Points: pow1, pow2_1, pow2_2, pow9, pow10, pow18.]

Percentage Pbm Solving postings = 5.50 + 1.26 * PerMathM
R-Square = 0.91
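The fitted lines on pages 18 and 19 regress each of the two variables on the other. For simple linear regression the product of the two slopes equals the squared correlation, which gives a consistency check on the reported equations (the subscripts are illustrative notation; the values are the rounded ones from the slides):

```latex
b_{\mathrm{Math}\mid\mathrm{Pbm}} \cdot b_{\mathrm{Pbm}\mid\mathrm{Math}} = r^2,
\qquad 0.73 \times 1.26 \approx 0.92
```

This is close to the reported R-Square of 0.91; the small gap is within the rounding of the two slopes.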

Page 20

Discussion: correlations (I)

- Social reference is strongly and negatively correlated with Pbm Solving (-.970) and Math Move (-.942)
- The degree of the correlation may vary when the sample size is enlarged
- The strong correlation indicates that such a tendency is to be expected:
  - when the sample size is enlarged (the sample was ‘randomly’ chosen)
  - even if coders might have influenced the strong correlation
- Pow18 appears to be an outlier and requires careful examination

Page 21

Discussion: correlations (II)

- Question 1: Why does the “production” of Social reference negatively influence the “production” of Pbm Solving and Math Move?
  - A first interpretation: the math pbm solving activity takes place during a fixed amount of time (roughly an hour). The more effort goes into “production” of Social Reference, the less “production” of Math there is.
- Question 2: Does this have anything to do with “exploratory” vs. “expository” mode?
  - e.g. pow2-1 vs. pow2-2: we see that there is a considerable “distance” between the two (cf. regression)

Page 22

Discussion: correlations (III)

Study at the second level (subcategories)

- Two codes from the Social Ref. dimension seem particularly interesting: references to individual actions vs. group actions seem to be a key point!
  - Code: Individual reference = Any utterance with a reference to the self or another member. This refers to the collaboration in a broader sense (an activity that has been done or will be done by the self or another group member)
  - Code: Group reference = Any utterance with a reference to the group. This refers to the collaboration in a broader sense (an activity that has been done or is assumed to be done or will be done by the group)
- Let’s look at pow2-1 vs. pow2-2

Page 23

Individual vs. group references in Pbm Solving

[Figure: two panels of pie charts, POWWOW2-1 and POWWOW2-2, with pies labelled by Pbm Solving code (Check, Orientation, Perform, Result, Restate, Reflect, Strategy, Tactic). Pies show percents. Legend (social): Group Ref., Individual Ref., Identify other, Risk-taking.]

Examples:
- “I thought of factoring (n + 2)^2 and n(n + 5)”: Pbm Solving (Tactic) & Individual Ref.
- “we could find a range”: Pbm Solving (Tactic) & Group Ref.

Page 24

This leads to…

- Hypothesis:
  - in “expository” powwows there is more Individual Ref. than Group Ref., and
  - in “exploratory” powwows there is more Group Ref. than Individual Ref.
- that we will study from:
  - Statistical approach (second level of analysis)
    - distribution of freqs of individual vs. group refs
    - distribution of freqs of other subcategories
  - Thread analysis
    - computing and visualizing individual-like threads and group-like threads and combinations of them
  - CA approach

Page 25

Discussion: from CA perspective

How does the “social activity” unfold sequentially during the pbm solving?

And, specifically, how do the individual vs. group references unfold?

Page 26

Discussion: from CA perspective (I)

Handle  Soc. Ref  Pbm Solving  Math Move  Posting
AVR     -         -            -          it's okay
PIN     Ss        -            -          hahaa
SUP     -         -            -          my internet explorer wouldnt open
PIN     Ci        -            -          ena you gotta hurtet!
PIN     Ss        -            -          haha jk
PIN     -         -            -          hurry*
AVR     Cg        P            Geo        so now for the new triangle we have: 194.79 = 1/2bh
AVR     Cg        -            -          do you follow me?
PIN     -         Ch           Nc         hey its 124.708
PIN     -         -            -          cuz look
AVR     Rs        -            -          http://www.math.com/students/calculators/source/scientific.htm
AVR     -         -            -          and do the calculation
PIN     Cg        Ch           Nc         we agree it is 10.392
SUP     Io        -            -          then einstein over here was confusing me
PIN     -         -            -          or no?
AVR     Cg        Ch           -          yes we do

Powwow2-1

Page 27

Discussion: from CA perspective (II)

Handle  Soc. Ref  Pbm Solving  Math Move  Posting
REA     -         R            -          I got 15
MCP     -         Ch           Nc         I'm getting 15 also
REA     Ci        -            -          I'll explain
AH3     Ci        Ch           Nc         Yep, that's right– I got 15 also
REA     -         -            -          now
AH3     -         -            -          For the extra, let
REA     Ci        T            Geo        first i got the area to both triangles
REA     -         -            -          With the first one with edgelengths of 9
REA     Ci        P            Geo        I used the 30-60-90 fourmla

Powwow2-2
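A first step toward the planned frequency study of individual vs. group references can be sketched by tallying the Soc. Ref. codes of the two excerpts on pages 26 and 27. The sketch assumes that Ci and Cg abbreviate the individual and group collaboration references, and the code lists are a transcription of the Soc. Ref. column of those slides; two short excerpts illustrate the counting only and are not a test of the hypothesis.

```python
from collections import Counter

# Soc. Ref. codes in posting order, transcribed from the excerpts on
# pages 26 and 27 ("" marks a posting without a Soc. Ref. code).
# Reading Ci / Cg as individual / group references is an assumption.
excerpts = {
    "Powwow2-1": ["", "Ss", "", "Ci", "Ss", "", "Cg", "Cg", "", "",
                  "Rs", "", "Cg", "Io", "", "Cg"],
    "Powwow2-2": ["", "", "Ci", "Ci", "", "", "Ci", "", "Ci"],
}

def reference_counts(codes):
    """Count individual (Ci) and group (Cg) collaboration references."""
    counts = Counter(code for code in codes if code)
    return {"individual": counts["Ci"], "group": counts["Cg"]}

summary = {name: reference_counts(codes) for name, codes in excerpts.items()}
for name, counts in summary.items():
    print(name, counts)
```

For these excerpts the tally is 1 individual vs. 4 group references in Powwow2-1, and 4 individual vs. 0 group references in Powwow2-2.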

Page 28

Discussion: from CA perspective (III)

Handle  Soc. Ref  Pbm Solving  Math Move  Posting
AVR     Cg        T            -          so now we add the two areas
SUP     -         -            -          just a little
PIN     -         R            Nc         its 194.852
AVR     -         Ch           -          exactly
AVR     Ci        Re           -          or 194.85 as I got it :-)
AVR     -         P            Nc         multiply it by two
AVR     Ci        P            Geo        and you get 389.704 = bh
PIN     Cg        Ch           Nc         we should get the exact measurement

Powwow2-1

Page 29

Discussion: from CA perspective (IV)

Handle  Soc. Ref  Pbm Solving  Math Move  Posting
OFF     Gr        -            -          hey...
SUP     Ss        -            -          what do we fdomwith the area
AVR     -         -            -          off spring do SO not rule!
OFF     Ss        -            -          lol
AVR     -         -            -          especially if you are a woman!
AVR     Ss        -            -          no jk jk
OFF     Ss        -            -          lol
OFF     -         -            -          im no woman
PIN     Ss        -            -          lol
AVR     -         -            -          well I am
SUP     -         -            -          hey hey
SUP     Ss        -            -          women are great
GER     -         -            -          why don't the three old timers explain what you have figured out
OFF     -         -            -          oh
AVR     Ss        -            -          women are great...
SUP     -         -            -          ok
AVR     -         -            -          but pain-enduring
SUP     Cg        -            -          they wont explain it to me
AVR     Cg        -            -          okay, let's explain

Powwow2-1

Page 30

Discussion: regression

- Significant linear regressions between:
  - Social reference and Pbm Solving
  - Social reference and Math Move
  - Pbm Solving and Math Move
- Coefficients in each equation show the estimation for each case.

Page 31

Annex

Page 32

LLR Smoother (for the whole sample)

[Figure: LLR Smoother. Horizontal axis: Percentage Social reference postings (15.00 to 35.00); vertical axis: Percentage Pbm Solving postings (20.00 to 40.00). Points: pow1, pow2_1, pow2_2, pow9, pow10, pow18.]

A smoother is a trend line that shows how the two variables (X and Y) are related to one another.

It is NOT a statistical test of the relationship of X and Y, although in most cases it is possible to infer the practical significance of the relationship.
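A smoother of this kind can be sketched as a kernel-weighted straight-line fit evaluated point by point, assuming that LLR stands for local linear regression. The Gaussian kernel, the bandwidth and the data below are assumptions chosen for illustration; they are not the settings or the values behind the plot.

```python
import math

def llr_smooth(xs, ys, x0, bandwidth):
    """Local linear regression estimate at x0 using Gaussian kernel weights."""
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw  # weighted mean of x
    my = sum(wi * y for wi, y in zip(w, ys)) / sw  # weighted mean of y
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    # Weighted least-squares line through the neighbourhood, evaluated at x0.
    return my + slope * (x0 - mx)

# Hypothetical (x, y) percentages for six powwows (NOT the study's data).
xs = [16.0, 27.0, 29.0, 30.5, 32.0, 34.0]
ys = [41.0, 36.0, 29.0, 24.0, 18.0, 12.0]

grid = [16.0 + 2.0 * i for i in range(10)]  # 16, 18, ..., 34
trend = [(x0, llr_smooth(xs, ys, x0, bandwidth=4.0)) for x0 in grid]
for x0, y0 in trend:
    print(f"x = {x0:5.1f}  smoothed y = {y0:6.2f}")
```

A smaller bandwidth lets the trend line follow individual points more closely, while a larger one approaches the ordinary regression line; in either case the curve is descriptive, as noted above.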

Page 33

Correlation Pbm Solving vs. Math Move (without removing pow18)

Correlations (both variables are percentages of postings)

                                   Pbm Solving   Math Move
Pbm Solving  Pearson Correlation   1             .956(**)
             Sig. (2-tailed)       .             .003
             N                     6             6
Math Move    Pearson Correlation   .956(**)      1
             Sig. (2-tailed)       .003          .
             N                     6             6

** Correlation is significant at the 0.01 level (2-tailed).

Page 34

Individual vs. group action references in Social Activity (count; for percents look at slide 23)

[Figure: two bar charts, POWWOW2-1 and POWWOW2-2, titled "Composition of Pbm Solving in terms of Social Reference". Horizontal axis: Pbm Solving codes (Check, Orientation, Perform, Result, Restate, Reflect, Strategy, Tactic); vertical axis: Count. Legend (Social Ref.), first chart: Collaboration group, Collaboration individual, Identify other, Risk-taking; second chart: Collaboration group, Collaboration individual, Identify self, Resource.]

Examples:
- “I thought of factoring (n + 2)^2 and n(n + 5)”: Pbm Solving (Tactic) & Individual Ref.
- “we could find a range”: Pbm Solving (Tactic) & Group Ref.