PIAAC 2013 results: Care needed in reading reports of international surveys

Jeff Evans, j.evans@mdx.ac.uk

ALM Webinar, 18 March 2014

Plan

1. Introducing PIAAC (Programme for the International Assessment of Adult Competencies, aka Survey of Adult Skills), including its concept of adult numeracy

2. Social surveys, and several key issues of survey validity

3. Findings for the UK sample and international comparisons; consideration of various interpretations currently circulating

PIAAC (Programme for the International Assessment of Adult Competencies, aka Survey of Adult Skills)

Fieldwork in 2011-12, results available in Oct. 2013
• Measures: Literacy, Numeracy, and Problem solving in technology-rich environments (PSTRE)
• Samples: adults usually 16-65: 5000 [or more*] per country

Builds on earlier IALS (1990s) and ALLS (2002-06), BUT …
• larger sample of 24 “industrial” countries, in 1st round
• uses computer administration, allows ‘adaptive routing’, to find appropriate “level” of respondent
• methodological & fieldwork improvements, e.g. regulation of sampling and fieldwork standards.

Some affinity to PISA (15 year-olds), BUT different concepts; PIAAC uses household survey methodology + educational testing

PIAAC aims

Education Directorate at OECD (PIAAC sponsor): helping countries to:
• Identify and measure differences between individuals and across countries in key “competencies”
• Relate measures of skills based on these competencies to: individual outcomes, e.g. labour market participation / earnings / further learning; or to aggregate outcomes, e.g. economic growth, or social equity in the labour market
• Assess performance of education / training systems, to enhance competencies through formal educational system – or in the work-place, through incentives (Schleicher, 2008)

PIAAC concepts and measures (1)

OECD: competencies: […] abilities, capacities or dispositions embedded in the individual […] cognitive skills & knowledge base are critical elements, [but] important […] to include other aspects such as motivation and value orientation.

Numeracy: the ability to access, use, interpret and communicate mathematical information & ideas, in order to engage in / manage the mathematical demands of a range of situations in adult life.

Conceptualisation (PIAAC Numeracy Expert Group, 2009)

 

Social Surveys – a distinctive method

• Standardised measure for every respondent allows comparison of “like with like”
• Emphasises representativeness: ‘random’ sampling, BUT this ALSO produces sampling variation (‘error’)

So … need statistical inference, using the SAMPLE (n=5000):
• significance testing of a hypothesis about the value in the POPULATION, e.g. average numeracy score in the UK … or
• ‘confidence interval’ estimation of the value in the POPULATION: sample estimate ± margin of error

Thus uses probability to reduce uncertainty: illustrations below
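A minimal sketch of the 'sample estimate ± margin of error' idea, assuming a simple random sample and a normal approximation. The scores are simulated, not PIAAC data, and the function name is hypothetical; the real survey uses a complex sample design, so its published standard errors are computed differently.

```python
import math
import random
import statistics

def mean_confidence_interval(scores, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% confidence interval.

    Assumes a simple random sample; z=1.96 is the usual normal multiplier.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    margin = z * standard_error
    return mean, mean - margin, mean + margin

# Hypothetical data: 5000 simulated scores, NOT real PIAAC results.
rng = random.Random(0)
sample = [rng.gauss(260, 50) for _ in range(5000)]

mean, lower, upper = mean_confidence_interval(sample)
print(f"sample mean {mean:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

With n = 5000 and a spread of about 50 points, the margin of error comes out at roughly 1.4 points either side of the sample mean.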

Surveys (non-experiments): issues of validity

Several concerns:
• appropriateness of indicators for concepts to be measured … Construct Validity
• comparability across countries, or across groups, where one wishes to assess the effect of other differences, such as gender or amount of formal schooling … Internal Validity
[Campbell & Stanley (1966), arguing that controlled experiments (now aka RCTs) do not solve everything]
• representativeness and generalisability of findings outside the research context … External Validity

PIAAC concepts and measures (2)

To produce measures, must characterise numerate behaviour; dimensions used in construction / validation of the set of items:

• context (4 types): everyday life, work, societal, further learning
• response (or ‘cognitive strategy’ – 3 main types): identify / locate / access (information); act on / use; interpret / evaluate.
• mathematical content (4 main types): quantity & number, dimension & shape, pattern & relationships, data & chance.
• representations (of mathematical / statistical information): e.g. text, tables, graphs.

Also Background Questionnaire: demographic & attitudinal information, e.g. level of trust, political efficacy, health

+ Job-Related Assessment: use of / need for skills at work

 

Methodology (1)

• the content validity of the definitions of numeracy and numerate behaviour [‘types’ of items]

• the measurement validity of the items presented, including the administration and scoring procedures [‘qualities’ of items]

• the reliability of the measurement procedures

• the internal validity, or validity of (‘effective’) relationships claimed (within the sample), e.g. between skill scores and desirable life outcomes, e.g. wages, employment, health

• the external validity, or representativeness, for the national population of interest, of the results produced from the sample.

… Similar dilemmas for most educational assessment, and for both Qual. and Quant. educational research

Methodology (2)

Content validity: the extent to which a measure represents all facets of a given concept: … Here definition of numeracy based on 4 dimensions of numerate behaviour stipulated: context, content, response, representation.

Each item can be categorised on these four dimensions, and the proportion of items falling into each category can be controlled over the scale, so as to enhance the transparency of the operational definition.

However, this is a standard definition …(generalising) ... How well does it “fit” adults’ lives in any particular country?

Further, the four types of context (everyday, work, society and community, further learning) are under-specified: too general to refer to any actual specific social practice or social context, in any particular respondent’s everyday life.

(Evans, Wedege & Yasukawa, 2013)

Methodology (3)

Measurement validity: extent to which a person’s responses to the set of items actually capture what the conceptualisation of numeracy specifies
• Depends on the actual range of items used: see 3 illustrative items presented by OECD (2013) / on websites (e.g. CSO Ireland, PIAAC 2012 Results) … and next slide
• Requires design of procedures for administration of the survey to be standardised across all countries, e.g. training of interviewers / testers; design specs. of the laptops (& software) to be used, and rules for access to calculators and other aids.
• Full appreciation of the validity of procedures requires assurance of how these procedures are followed in the field … even more crucial when results are compared across countries using different fieldwork teams (see PIAAC Technical Report).

Numeracy – Sample Item 3

This sample item (of difficulty level 4) focuses on the following aspects of the numeracy construct:

Content: Quantity and number
Process: Act upon, use (compute)
Context: Community and society

Correct Response: One of the three values (no values between): 595, 596 or 600.

Methodology (4)

External validity: includes representativeness of sample for the “population” … check a country’s sample design + other fieldwork aspects, e.g. incentives for completing interview … & judgments depend on knowing about actual field practices.

SO any summaries, e.g. mean scores, or gender differences, are sample-based estimates for the population value (of the mean score or size of gender difference ...) for country x

These interval estimates are not exact, but show a margin of error [say, 2* standard errors on either side; * depends on the level of confidence desired in the estimate] … which can produce surprises

e.g. PIAAC numeracy: overall country results 2013

Japan = 288 (286 to 290)
Finland = 282 (280 to 284)
NL / Belg. = 280 (278 to 282) (overlap!)
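The overlap check can be sketched as follows. The means are those quoted on the slide; the standard error of 1.0 is an assumption inferred from the quoted margins of 2 points on either side, and the function names are hypothetical. Overlapping intervals are a rough screen rather than a formal significance test.

```python
import itertools

def interval(mean, standard_error, multiplier=2.0):
    """Interval estimate: mean plus or minus multiplier * standard error."""
    margin = multiplier * standard_error
    return (mean - margin, mean + margin)

def overlaps(first, second):
    """True if two (lower, upper) intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]

# Means as quoted on the slide; the standard error of 1.0 is an assumption
# inferred from the quoted margins of 2 points on either side.
means = {"Japan": 288, "Finland": 282, "NL / Belg.": 280}
intervals = {name: interval(mean, 1.0) for name, mean in means.items()}

for first, second in itertools.combinations(intervals, 2):
    verdict = "overlap" if overlaps(intervals[first], intervals[second]) else "no overlap"
    print(f"{first} vs {second}: {verdict}")
```

On these figures, the ranking of Finland above NL / Belg. is not secure, whereas Japan's interval stands clear of both.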

Methodology (5)

Reliability of test administration across countries and across interviewers, especially assuring same standards / practices in marking (problem with past international surveys) …

Computer presentation and marking will help greatly.

But it may tend to undermine construct validity, if it reduces the range of types of question that can be asked (example) …

And, increasing the reliability may lead to concerns about ecological validity, whether the setting of the research is representative of those to which one wishes to generalise the results. For example, on-screen presentation may limit this?

Presentation of Results (1)

Adults’ performance is not expressed as ‘proportion correct’ since, with adaptive routing, some respondents are presented with ‘harder’ items

So Item Response Theory (IRT) used to (‘psychometrically’) estimate a standardised score (e.g. mean 250, std dev 50) (e.g. Tout, 2013)

Then, to make numerical scores meaningful, they are commonly related to one of 5 general ‘levels’ of literacy or numeracy …
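A minimal sketch of why ‘proportion correct’ is not comparable under adaptive routing, assuming a one-parameter logistic (Rasch-type) model and a crude grid-search estimate. The item difficulties and responses are hypothetical, and the 250 / 50 scale is only the illustrative one mentioned above; PIAAC's actual scaling procedure is considerably more elaborate.

```python
import math

def p_correct(theta, difficulty):
    """Probability of a correct answer under a one-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_theta(responses, difficulties):
    """Grid-search maximum-likelihood estimate of ability (theta)."""
    grid = [step / 100 for step in range(-400, 401)]

    def log_likelihood(theta):
        total = 0.0
        for answered_correctly, difficulty in zip(responses, difficulties):
            p = p_correct(theta, difficulty)
            total += math.log(p if answered_correctly else 1.0 - p)
        return total

    return max(grid, key=log_likelihood)

def to_reporting_scale(theta, mean=250.0, sd=50.0):
    """Linear transformation to an illustrative reporting scale."""
    return mean + sd * theta

# Hypothetical example: both respondents answer 3 of 4 items correctly,
# but one was routed to easier items and the other to harder ones.
responses = [1, 1, 0, 1]
easier_items = [-1.5, -1.0, -0.5, 0.0]
harder_items = [0.0, 0.5, 1.0, 1.5]

theta_easy = estimate_theta(responses, easier_items)
theta_hard = estimate_theta(responses, harder_items)
print(to_reporting_scale(theta_easy), to_reporting_scale(theta_hard))
```

The same 75% correct yields a higher estimated score for the respondent who faced the harder items, which is the reason a model-based score is reported instead.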

PIAAC Proficiency levels: numeracy

Below Level 1 (score lower than 176): Tasks at this level require the respondents to carry out simple processes such as counting, sorting, performing basic arithmetic operations with whole numbers or money, or recognising common spatial representations in concrete, familiar contexts where the mathematical content is explicit with little or no text or distractors.

Level 1 (176-225): Tasks at this level require the respondent to carry out basic mathematical processes in common, concrete contexts where the mathematical content is explicit with little text and minimal distractors. Tasks usually require one-step or simple processes involving counting; sorting; performing basic arithmetic operations; understanding simple percents such as 50%; and locating and identifying elements of simple or common graphical or spatial representations.

Level 2 (226-275): Tasks at this level require the respondent to identify and act on mathematical information and ideas embedded in a range of common contexts where the mathematical content is fairly explicit or visual with relatively few distractors. Tasks tend to require the application of two or more steps or processes involving calculation with whole numbers and common decimals, percents and fractions; simple measurement and spatial representation; estimation; and interpretation of relatively simple data and statistics in texts, tables and graphs.

Level 3 (276-325): Tasks at this level require the respondent to understand mathematical information that may be less explicit, embedded in contexts that are not always familiar and represented in more complex ways. Tasks require several steps and may involve the choice of problem-solving strategies and relevant processes. Tasks tend to require the application of number sense and spatial sense; recognising and working with mathematical relationships, patterns, and proportions expressed in verbal or numerical form; and interpretation and basic analysis of data and statistics in texts, tables and graphs.

Level 4 (326-375): Tasks at this level require the respondent to understand a broad range of mathematical information that may be complex, abstract or embedded in unfamiliar contexts. These tasks involve undertaking multiple steps and choosing relevant problem-solving strategies and processes. Tasks tend to require analysis and more complex reasoning about quantities and data; statistics and chance; spatial relationships; and change, proportions and formulas. Tasks at this level may also require understanding arguments or communicating well-reasoned explanations for answers or choices.

Level 5 (376 or higher): Tasks at this level require the respondent to understand complex representations and abstract and formal mathematical and statistical ideas, possibly embedded in complex texts. Respondents may have to integrate multiple types of mathematical information where considerable translation or interpretation is required; draw inferences; develop or work with mathematical arguments or models; and justify, evaluate and critically reflect upon solutions or choices.
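The score bands can be expressed as a simple lookup. This is a sketch with hypothetical names; it assumes that a score of exactly 376 counts as Level 5 and that fractional scores such as 225.5 fall in the lower band, neither of which the table states explicitly.

```python
import bisect

# Lower bounds of Levels 1 to 5, taken from the score ranges in the table.
LEVEL_LOWER_BOUNDS = [176, 226, 276, 326, 376]
LEVEL_LABELS = [
    "Below Level 1", "Level 1", "Level 2", "Level 3", "Level 4", "Level 5",
]

def numeracy_level(score):
    """Map a numeracy score to its proficiency level label.

    Assumption: each level starts at its lower bound (so 376 is Level 5).
    """
    return LEVEL_LABELS[bisect.bisect_right(LEVEL_LOWER_BOUNDS, score)]

for score in (150, 176, 262, 288, 376):
    print(score, numeracy_level(score))
```

Note that a whole country mean is a single point on this scale; the levels describe individual respondents, so a country average in one band says little about the spread of adults across bands.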

 

Presentation of Results (2)

BUT this is a simple, one-dimensional sense … e.g. “levels embody predetermined assumptions about progression and relative difficulty” (Gillespie (2004) referring to UK Skills for Life)

• Partly because many adults have different “spiky profiles”, distinctive life experiences: some find type A items (e.g. “data & chance”) more difficult; others type B items (e.g. “dimension & shape”).

... Some policy makers attempt to stipulate “minimum level of numeracy needed to cope with the demands of adult life” in a particular country - BUT not supported by OECD [cf. IALS] … or by Canada (Bussière, Centre for Literacy Webinar, 17 Feb. 2014)

… in Australia, debate (see Tout, 2013; Black & Yasukawa, 2014)
• tends to assume ‘demands’ are the same across countries
• conflates adults with different work, family, social situations

Some interpretations of PIAAC results (1)

In each of 24 countries reporting PIAAC results in 2013, the media seem to focus on “prominent results”:

You can check them out in your country (cf. Hamilton, Yasukawa & Evans, ESREA 2014) …

For example, in the UK …

“the UK (England and Northern Ireland) performed significantly below average in numeracy” …

Results (1)

Results (1a)

Not only Means … look at the Spreads

Some interpretations of PIAAC results (2)

Prominent in the UK:

the UK (England and Northern Ireland) performed significantly below average in numeracy – with particular problems among the 16-24 age group where the UK came 21st out of 24 industrialised countries.

… “UK faces a shrinking pool of skills, with England the only country where the skills of young people are below those of older people.”

Results (2)

Some interpretations of PIAAC results (3)

OECD (Country Note: UK, 2013, p. 2): “The median hourly wage of workers who score at Level 4 or 5 in literacy is 94% higher than that of workers who score at or below Level 1.”

Results (3)

Results (3a): another correlation

Other results: early impressions

1. Within-country results complex: much “fun” for media, politicians, spin-doctors, since

1a. ... Some praiseworthy and some regrettable findings for almost everyone

2. Between-country results ‘striking’ …

2a. e.g. much discussion of age / generation differences – patterns vary widely

2b. but need to allow for sampling variation – and even harder to control for wide range of cultural differences between countries or groups

Other results: methodological tools

3. In interpretation of results, beware:

A. Is the ‘numeracy’ (literacy, PSTRE) measured an appropriate indicator for the ‘numeracy’ referred to in research, policy and pedagogical discussions? [Construct Validity – several dims.]

B. Many of the interesting findings are correlations, but not necessarily causal [Internal Validity ]

C. All scores for countries and subgroups are double estimates: sample estimates and “psychometric” (IRT) estimates [External Validity]
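The ‘double estimate’ in point C can be illustrated with the standard rule for combining several ‘plausible value’ estimates (Rubin's rules): total variance is the average sampling variance plus an inflated between-estimate variance. This is a sketch under that assumption; all numbers are hypothetical, not PIAAC results, and the function name is invented for illustration.

```python
import statistics

def combine_plausible_values(estimates, sampling_variances):
    """Combine per-plausible-value estimates into one estimate and standard error.

    Total variance = mean sampling variance
                     + (1 + 1/M) * variance between the M estimates.
    """
    m = len(estimates)
    point_estimate = statistics.fmean(estimates)
    sampling_part = statistics.fmean(sampling_variances)   # from the sample design
    measurement_part = statistics.variance(estimates)      # from the psychometric model
    total_variance = sampling_part + (1 + 1 / m) * measurement_part
    return point_estimate, total_variance ** 0.5

# Hypothetical inputs: five mean-score estimates for one country, one per
# plausible value, each with a sampling variance of 1.0 (standard error 1.0).
estimates = [261.8, 262.4, 262.1, 261.5, 262.7]
sampling_variances = [1.0, 1.0, 1.0, 1.0, 1.0]

estimate, standard_error = combine_plausible_values(estimates, sampling_variances)
print(f"estimate {estimate:.1f}, standard error {standard_error:.2f}")
```

The combined standard error is larger than the sampling standard error alone, which is why margins of error that ignore the psychometric component understate the uncertainty.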

What is to be done? (1)

… by researchers and tutors / practitioners working together

** Generalising (Evans, Wedege & Yasukawa): bring research evidence / practitioner experience to argue (remind) that
• Adult numeracy is distinct from School Maths
• Adult numeracy is distinctive in different settings
• Adult numeracy is distinctive across different cultures, i.e. different subgroups AND different countries

** One-dimensionality: Adult numeracy is multi-dimensional

** Need other kinds of research: local surveys, case studies, incl. life histories (cf. Barton & Hamilton, 2012 on local literacies)

What is to be done? (2)

Examples of possible research topics:

a. Are numeracy levels higher in England than in NI; and, if so, why? E.g. higher educational qualifications – or higher levels of numerate experience at work?

b. Why does a higher proportion of males (17%) than of females (9%) attain scores at Level 4/5 on the numeracy scale in Australia?

c. Why are the proportions of people at Level 1 (& below) generally highest in the oldest age groups (people aged 60+)? Does this indicate, as sometimes claimed, that “a person’s skills deteriorate over the life-course”?

References

OECD (2013). OECD Skills Outlook 2013: First Results from the Survey of Adult Skills. Paris: OECD. Online: http://www.oecd.org/site/piaac/#d.en.221854

OECD (2013). The Survey of Adult Skills: Reader’s Companion. Paris: OECD. Online: http://www.oecd.org/site/piaac/#d.en.221854

OECD (2013). Survey of Adult Skills First Results: Country Note - England and Northern Ireland. Paris: OECD. Online: http://www.oecd.org/site/piaac/#d.en.221854

Evans, J. (2013/14). What to look for in PIAAC results: reading reports from international surveys; paper given at ALM-20; revised to appear in ALM-IJ.

Evans, J., Wedege, T. & Yasukawa, K. (2013). Critical Perspectives on Adults’ Mathematics Education; in M. A. Clements, A. Bishop, C. Keitel, J. Kilpatrick and F. Leung (Eds.), Third International Handbook of Mathematics Education, New York: Springer.

Tout, D. (2013). Lessons Learned from International Assessments, Fine Print, 36, 2.

Black, S. & Yasukawa, K. (2014). Level 3: another single measure of adult literacy and numeracy, Australian Educational Researcher, 41, 2, April, 125-138. 

Barton, D. & Hamilton, M. (2012). Local Literacies: Reading and Writing in One Community. London: Routledge.

Hamilton, M. (2011). Literacy and the Politics of Representation. London: Routledge.

Appendices

The current 24 participating countries in PIAAC include: 17 EU members, plus USA, Canada, Australia, Japan, Korea, possibly Russian Federation. Developing countries, including the BRIC countries (except Russia), are not involved in Round 1.

Round 2 includes: Chile, Greece, Indonesia, Israel, Lithuania, New Zealand, Slovenia, Singapore, Turkey. Results expected in 2016.

Illustrative items (3 slides): taken from OECD (2013), The Survey of Adult Skills: Reader’s Companion, pp. 28-30. Available online.

Claimed “equivalences” among different qualifications in Engl.

Numeracy – Sample Item 1

This sample item (of difficulty level 3) focuses on the following aspects of the numeracy construct:

Content: Data and chance
Process: Interpret, evaluate
Context: Community and society

Correct Response: 1957 - 1967 and 1967 - 1977

Numeracy – Sample Item 2

This sample item (of difficulty level 1) focuses on the following aspects of the numeracy construct:

Content: Dimension and shape
Process: Act upon, use (measure)
Context: Everyday life or work

Correct Response: Any value between -4 and -5

Equivalences ?

Notice: columns 1, 2, and final – neat equivalences claimed between different tests and age groups